Community
Participate
Working Groups
While running some tests using the new benchmark [1] I reproducibly hit this exception [2] when creating >10k refs in RefTable. [1] https://git.eclipse.org/r/c/jgit/jgit/+/185683 [2] # JMH version: 1.21 # VM version: JDK 17, OpenJDK 64-Bit Server VM, 17+35-LTS-sapmachine # VM invoker: /Users/d029788/.sdkman/candidates/java/17-sapmchn/bin/java # VM options: -ea # Warmup: 2 iterations, 10 ms each # Measurement: 2 iterations, 10 s each # Timeout: 10 min per iteration # Threads: 1 thread, will synchronize iterations # Benchmark mode: Average time, time/op # Benchmark: org.eclipse.jgit.benchmarks.GetRefsBenchmark.testGetExactRef # Parameters: (doGc = true, numBranches = 50000, trustFolderStat = true) # Run progress: 0,00% complete, ETA 00:00:40 # Fork: 1 of 1 # Warmup Iteration 1: SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Preparing test - repository: /var/folders/16/nqvdxf5s45309nxkv4k4r20r0000gq/T/dir601914784298366167/testrepo/.git - refDatabase: reftable - trustFolderStat: true - doGc: true - branches: 50000 10% of branches created 20% of branches created<failure> java.lang.ArrayIndexOutOfBoundsException: Index -16258 out of bounds for length 4096 at org.eclipse.jgit.util.NB.decodeUInt24(NB.java:97) at org.eclipse.jgit.internal.storage.reftable.BlockReader.seekKey(BlockReader.java:406) at org.eclipse.jgit.internal.storage.reftable.ReftableReader.seek(ReftableReader.java:216) at org.eclipse.jgit.internal.storage.reftable.ReftableReader.seekLog(ReftableReader.java:204) at org.eclipse.jgit.internal.storage.reftable.MergedReftable.seekLog(MergedReftable.java:159) at org.eclipse.jgit.internal.storage.reftable.Reftable.seekLog(Reftable.java:165) at org.eclipse.jgit.internal.storage.reftable.ReftableReflogReader.getReverseEntries(ReftableReflogReader.java:80) at org.eclipse.jgit.internal.storage.reftable.ReftableReflogReader.getReverseEntries(ReftableReflogReader.java:52) at org.eclipse.jgit.internal.storage.file.GC.listRefLogObjects(GC.java:1035) at org.eclipse.jgit.internal.storage.file.GC.repack(GC.java:809) at org.eclipse.jgit.internal.storage.file.GC.doGc(GC.java:269) at org.eclipse.jgit.internal.storage.file.GC.gc(GC.java:220) at org.eclipse.jgit.api.GarbageCollectCommand.call(GarbageCollectCommand.java:179) at org.eclipse.jgit.benchmarks.GetRefsBenchmark$BenchmarkState.setupBenchmark(GetRefsBenchmark.java:109) at org.eclipse.jgit.benchmarks.generated.GetRefsBenchmark_testGetExactRef_jmhTest._jmh_tryInit_f_benchmarkstate1_G(GetRefsBenchmark_testGetExactRef_jmhTest.java:457) at org.eclipse.jgit.benchmarks.generated.GetRefsBenchmark_testGetExactRef_jmhTest.testGetExactRef_AverageTime(GetRefsBenchmark_testGetExactRef_jmhTest.java:166) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:453) at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:437) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at java.base/java.lang.Thread.run(Thread.java:833)
What is the recommended way to run these benchmarks? I just get complaints about slf4j. this is with commit 677910252bcd5b59d4b2a71106deaa0ef9ae7ee3 and running bazelisk bazelisk build org.eclipse.jgit.benchmarks/... bazel-bin/org.eclipse.jgit.benchmarks/benchmarks (I removed all other benchmarks, but the other benchmarks fail as well.) # Run progress: 90.00% complete, ETA 00:00:00 # Fork: 5 of 5 # Warmup Iteration 1: <failure> java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory at org.eclipse.jgit.util.SystemReader.<clinit>(SystemReader.java:55) at org.eclipse.jgit.lib.BaseRepositoryBuilder.readEnvironment(BaseRepositoryBuilder.java:363) at org.eclipse.jgit.api.InitCommand.call(InitCommand.java:58) at org.eclipse.jgit.benchmarks.GetRefsBenchmark$BenchmarkState.setupBenchmark(GetRefsBenchmark.java:85) at org.eclipse.jgit.benchmarks.generated.GetRefsBenchmark_testGetRefsByPrefix_jmhTest._jmh_tryInit_f_benchmarkstate1_G(GetRefsBenchmark_testGetRefsByPrefix_jmhTest.java:457) at org.eclipse.jgit.benchmarks.generated.GetRefsBenchmark_testGetRefsByPrefix_jmhTest.testGetRefsByPrefix_AverageTime(GetRefsBenchmark_testGetRefsByPrefix_jmhTest.java:166) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:453) at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:437) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:829) Caused by: java.lang.ClassNotFoundException: org.slf4j.LoggerFactory at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581) at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522) ... 18 more
yes, running benchmarks from bazel seems broken: bazelisk run //org.eclipse.jgit.benchmarks:benchmarks GetRefsBenchmark INFO: Invocation ID: e7b8ae84-89fe-444a-ab0f-dfbac783f2d2 DEBUG: Rule 'com_googlesource_gerrit_bazlets' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1581093008 -0500" DEBUG: Repository com_googlesource_gerrit_bazlets instantiated at: no stack (--record_rule_instantiation_callstack not enabled) Repository rule git_repository defined at: /Users/d029788/.cache/bazel/b645215973ac043930fa5900e337e822/external/bazel_tools/tools/build_defs/repo/git.bzl:195:33: in <toplevel> INFO: Analyzed target //org.eclipse.jgit.benchmarks:benchmarks (8 packages loaded, 67 targets configured). INFO: Found 1 target... Target //org.eclipse.jgit.benchmarks:benchmarks up-to-date: bazel-bin/org.eclipse.jgit.benchmarks/benchmarks.jar bazel-bin/org.eclipse.jgit.benchmarks/benchmarks INFO: Elapsed time: 2.931s, Critical Path: 1.42s INFO: 5 processes: 4 darwin-sandbox, 1 worker. INFO: Build completed successfully, 9 total actions INFO: Build completed successfully, 9 total actions # JMH version: 1.21 # VM version: JDK 1.8.0_302, OpenJDK 64-Bit Server VM, 25.302-b08 # VM invoker: /Users/d029788/.sdkman/candidates/java/8.0.302-tem/jre/bin/java # VM options: <none> # Warmup: 3 iterations, 100 ms each # Measurement: 5 iterations, 10 s each # Timeout: 10 min per iteration # Threads: 1 thread, will synchronize iterations # Benchmark mode: Average time, time/op # Benchmark: org.eclipse.jgit.benchmarks.GetRefsBenchmark.testGetExactRef # Parameters: (doGc = false, numBranches = 10000, trustFolderStat = true, useRefTable = false) # Run progress: 0.00% complete, ETA 00:08:23 # Fork: 1 of 5 # Warmup Iteration 1: <failure> java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory at org.eclipse.jgit.util.SystemReader.<clinit>(SystemReader.java:55) at org.eclipse.jgit.lib.BaseRepositoryBuilder.readEnvironment(BaseRepositoryBuilder.java:363) at org.eclipse.jgit.api.InitCommand.call(InitCommand.java:58) at org.eclipse.jgit.benchmarks.GetRefsBenchmark$BenchmarkState.setupBenchmark(GetRefsBenchmark.java:86) at org.eclipse.jgit.benchmarks.generated.GetRefsBenchmark_testGetExactRef_jmhTest._jmh_tryInit_f_benchmarkstate1_G(GetRefsBenchmark_testGetExactRef_jmhTest.java:460) at org.eclipse.jgit.benchmarks.generated.GetRefsBenchmark_testGetExactRef_jmhTest.testGetExactRef_AverageTime(GetRefsBenchmark_testGetExactRef_jmhTest.java:166) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:453) at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:437) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.ClassNotFoundException: org.slf4j.LoggerFactory at java.net.URLClassLoader.findClass(URLClassLoader.java:382) at java.lang.ClassLoader.loadClass(ClassLoader.java:418) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ... 18 more Using Maven it works: $ mvn -Dmaven.javadoc.skip=true -DskipTests package $ java -jar org.eclipse.jgit.benchmarks/target/benchmarks.jar GetRefsBenchmark
I also did like Matthias mentioned: $ java -jar org.eclipse.jgit.benchmarks/target/benchmarks.jar GetRefsBenchmark
fix for running benchmarks from bazel https://git.eclipse.org/r/c/jgit/jgit/+/185855
run only GetRefsBenchmark using bazel: $ bazelisk run //org.eclipse.jgit.benchmarks:benchmarks GetRefsBenchmark
With this diff: ``` diff --git a/org.eclipse.jgit.benchmarks/BUILD b/org.eclipse.jgit.benchmarks/BUILD index 66a2482f0..45daefec9 100644 --- a/org.eclipse.jgit.benchmarks/BUILD +++ b/org.eclipse.jgit.benchmarks/BUILD @@ -9,6 +9,7 @@ jmh_java_benchmarks( srcs = SRCS, deps = [ "//lib:slf4j-api", + "//lib:javaewah", "//org.eclipse.jgit:jgit", ], ) diff --git a/org.eclipse.jgit.benchmarks/src/org/eclipse/jgit/benchmarks/GetRefsBenchmark.java b/org.eclipse.jgit.benchmarks/src/org/eclipse/jgit/benchmarks/GetRefsBenchmark.java index eab01a150..df96ea23c 100644 --- a/org.eclipse.jgit.benchmarks/src/org/eclipse/jgit/benchmarks/GetRefsBenchmark.java +++ b/org.eclipse.jgit.benchmarks/src/org/eclipse/jgit/benchmarks/GetRefsBenchmark.java @@ -57,9 +57,9 @@ public class GetRefsBenchmark { @State(Scope.Benchmark) public static class BenchmarkState { - boolean useRefTable = false; + boolean useRefTable = true; - @Param({ "100", "2500", "10000", "50000" }) + @Param({ "50000" }) int numBranches; @Param({ "true", "false" }) hanwen@hanwen1:~/vc/jgit$ git rev-parse HEAD 47fc86ba3999e20803a222dab1151248902289bb ``` I get the following: ``` $ bazelisk run //org.eclipse.jgit.benchmarks:benchmarks GetRefsBenchmark # JMH version: 1.21 # VM version: JDK 11.0.12, OpenJDK 64-Bit Server VM, 11.0.12+7-post-Debian-2 # VM invoker: /usr/lib/jvm/java-11-openjdk-amd64/bin/java # VM options: <none> # Warmup: 2 iterations, 100 ms each # Measurement: 2 iterations, 10 s each # Timeout: 10 min per iteration # Threads: 1 thread, will synchronize iterations # Benchmark mode: Average time, time/op # Benchmark: org.eclipse.jgit.benchmarks.GetRefsBenchmark.testGetExactRef # Parameters: (doGc = true, numBranches = 50000, trustFolderStat = true) # Run progress: 0.00% complete, ETA 00:13:28 # Fork: 1 of 5 # Warmup Iteration 1: SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Preparing test - repository: /tmp/dir14905737446309329634/testrepo/.git - refDatabase: reftable - trustFolderStat: true - doGc: true - branches: 50000 100% of branches created Pushing 50000 branches ... DONE Running GC ... DONE 5.788 us/op # Warmup Iteration 2: 4.545 us/op Iteration 1: 2.461 us/op Iteration 2: 2.264 us/op # Run progress: 2.50% complete, ETA 02:17:00 # Fork: 2 of 5 # Warmup Iteration 1: SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Preparing test - repository: /tmp/dir15815808046509383323/testrepo/.git - refDatabase: reftable - trustFolderStat: true - doGc: true - branches: 50000 100% of branches created Pushing 50000 branches ... DONE Running GC ... DONE 9.032 us/op # Warmup Iteration 2: 6.676 us/op Iteration 1: 2.598 us/op Iteration 2: 2.671 us/op ``` note that the test setup is e x t r e m e l y s l o w, compared to the operation being benchmarked.
Is this a problem with JDK 11 vs JDK 17 ?
could you provide me with a .tar.gz of the testrepo just before you start the GC operation?
(In reply to Han-Wen NIenhuys from comment #6) ... > note that the test setup is e x t r e m e l y s l o w, compared to the > operation being benchmarked. I think I hit this using patchset 9 of the benchmark which was e v e n m u c h s l o w er (didn't sit in front of the computer while it was running)
thanks, I can see the problem with PS 9.
it's trying to find the reflog for "HEAD". This somehow reads an incorrect offset, which then reads garbage, leading to negative indices. I cross-checked with my own reftable code in Go, which can do the seek correctly, so it's likely a problem on the read side.
reftable.BlockReader has a `truncated` boolean field, described as // Its OK during sequential scan for an index block to have been // partially read and be truncated in-memory. This happens when // the index block is larger than the file's blockSize. Caller // will break out of its scan loop once it sees the blockType. this is bogus as: - index blocks are never sequentially scanned - no caller is checking the `truncated` field reflog blocks can be variably sized. In the error scenario, the block is 8188 bytes long (default blocksize is 4096). A few lines below, the code gets the restart counts from the end of the buffer. However, if the buffer is truncated (like here), that location does not contain the restart count. The Go/C code I wrote correctly issues a second read if the actual blocksize is more than the initial guess. This has not surfaced at Google, as we currently don't store the reflogs in reftable. since the `truncated` bit seems unused, maybe it's best to drop this and issue a second read if necessary.
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/185902
Gerrit change https://git.eclipse.org/r/c/jgit/jgit/+/185902 was merged to [stable-5.6]. Commit: http://git.eclipse.org/c/jgit/jgit.git/commit/?id=5f8c48413623ea4d1685063582c74a216207ef51