Bug 582292 - jgit GC: ArrayIndexOutOfBoundsException during repacking
Status: NEW
Alias: None
Product: JGit
Classification: Technology
Component: JGit
Version: 5.13
Hardware: PC Mac OS X
Importance: P3 normal
Target Milestone: ---
Assignee: Project Inbox
 
Reported: 2023-08-10 08:18 EDT by Antonio Barone CLA
Modified: 2023-08-16 10:04 EDT

Attachments
A repository exposing the problem (674.64 KB, application/zip)
2023-08-16 10:04 EDT, Antonio Barone CLA

Description Antonio Barone CLA 2023-08-10 08:18:05 EDT
1. Keep stacking changes on top of each other for multiple branches. Run each of the following loops in its own terminal:

 while true; do date > date.txt; git add date.txt; git commit -m "Testing"; git push origin HEAD:refs/for/master; done
 while true; do date > date.txt; git add date.txt; git commit -m "Testing"; git push origin HEAD:refs/for/branch_1; done
 while true; do date > date.txt; git add date.txt; git commit -m "Testing"; git push origin HEAD:refs/for/branch_2; done
 while true; do date > date.txt; git add date.txt; git commit -m "Testing"; git push origin HEAD:refs/for/branch_3; done

2. In a different terminal, run GC while the writes keep coming in:

jgit.sh gc

*What's the expected output?*

GC completes successfully


*What do you see instead?*

An exception is thrown:

java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length xxxx
	at org.eclipse.jgit.internal.storage.file.BitSet.set(BitSet.java:40)
	at org.eclipse.jgit.internal.storage.file.PackBitmapIndexRemapper.getBitmap(PackBitmapIndexRemapper.java:161)
	at org.eclipse.jgit.internal.storage.file.BitmapIndexImpl.getBitmap(BitmapIndexImpl.java:60)
	at org.eclipse.jgit.internal.storage.file.BitmapIndexImpl.getBitmap(BitmapIndexImpl.java:32)
	at org.eclipse.jgit.revwalk.BitmapWalker.findObjectsWalk(BitmapWalker.java:180)
	at org.eclipse.jgit.revwalk.BitmapWalker.findObjects(BitmapWalker.java:128)
	at org.eclipse.jgit.internal.storage.pack.PackWriter.prepareBitmapIndex(PackWriter.java:2416)
	at org.eclipse.jgit.internal.storage.file.GC.writePack(GC.java:1219)
	at org.eclipse.jgit.internal.storage.file.GC.repack(GC.java:864)
	at org.eclipse.jgit.internal.storage.file.GC.doGc(GC.java:285)
	at org.eclipse.jgit.internal.storage.file.GC.gc(GC.java:232)


* Notes *

My analysis suggests this is related to the recent bug fix [1] merged in 5.13.

The error happens during repacking, specifically in the bitmap remapping phase.

Some objects (blobs, commits, trees) cannot be remapped from the previous bitmap to the new one because they were added to the exclusion list: they were part of the packfile associated with the keep file found during GC.

This is where the packfile associated with the keep file is added to the exclusion list [2].

This is where the objects in the exclusion list are skipped during repacking [3].

This is where the remapping happens [4]. Note that objects in the old bitmap index are always expected to be found in the new one.
This assumption does not hold when the new pack index lacks those objects because they were excluded.
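
To make the failure mode concrete, here is a minimal, self-contained sketch of the remapping problem as I understand it. It is not JGit code: the class and method names are hypothetical, pack indexes are modelled as plain lists of object names, and java.util.BitSet stands in for JGit's internal BitSet (it throws IndexOutOfBoundsException for a negative index, whereas the stack trace above shows ArrayIndexOutOfBoundsException). The checked variant only reports the objects that are missing from the new index; whether skipping them is a valid fix for JGit is not established here.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Simplified model of bitmap remapping. Hypothetical names, not the JGit implementation. */
final class BitmapRemapSketch {

    /** Returns the position of objectId in the pack index, or -1 if it is absent. */
    static int findPosition(List<String> packIndex, String objectId) {
        return packIndex.indexOf(objectId);
    }

    /** Unguarded remap: assumes every object in the old bitmap exists in the new index. */
    static BitSet remapUnchecked(List<String> oldIndex, BitSet oldBitmap,
            List<String> newIndex) {
        BitSet remapped = new BitSet(newIndex.size());
        for (int pos = oldBitmap.nextSetBit(0); pos >= 0;
                pos = oldBitmap.nextSetBit(pos + 1)) {
            int newPos = findPosition(newIndex, oldIndex.get(pos));
            // Throws IndexOutOfBoundsException when newPos is -1.
            remapped.set(newPos);
        }
        return remapped;
    }

    /** Guarded remap: records objects missing from the new index instead of failing. */
    static BitSet remapChecked(List<String> oldIndex, BitSet oldBitmap,
            List<String> newIndex, List<String> missing) {
        BitSet remapped = new BitSet(newIndex.size());
        for (int pos = oldBitmap.nextSetBit(0); pos >= 0;
                pos = oldBitmap.nextSetBit(pos + 1)) {
            String objectId = oldIndex.get(pos);
            int newPos = findPosition(newIndex, objectId);
            if (newPos < 0) {
                missing.add(objectId); // excluded from the new pack
                continue;
            }
            remapped.set(newPos);
        }
        return remapped;
    }

    public static void main(String[] args) {
        List<String> oldIndex = List.of("commitA", "treeA", "blobA", "commitB");
        // Assumption for the demo: "blobA" lived in a pack with a keep file,
        // so it was excluded from the newly written pack.
        List<String> newIndex = List.of("commitA", "treeA", "commitB");

        BitSet oldBitmap = new BitSet();
        oldBitmap.set(0, oldIndex.size()); // old bitmap covers all four objects

        try {
            remapUnchecked(oldIndex, oldBitmap, newIndex);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unchecked remap failed: " + e.getMessage());
        }

        List<String> missing = new ArrayList<>();
        BitSet remapped = remapChecked(oldIndex, oldBitmap, newIndex, missing);
        System.out.println("remapped bits: " + remapped + ", missing: " + missing);
    }
}
```

In this model the unchecked remap fails as soon as it reaches the excluded object, which matches the "Index -1" failure seen in BitSet.set during GC.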

[1] https://bugs.eclipse.org/bugs/show_bug.cgi?id=582039

[2] https://git.eclipse.org/r/plugins/gitiles/jgit/jgit/+/refs/heads/stable-5.13/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/GC.java#843

[3] https://git.eclipse.org/r/plugins/gitiles/jgit/jgit/+/refs/heads/stable-5.13/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/pack/PackWriter.java#2130

[4] https://git.eclipse.org/r/plugins/gitiles/jgit/jgit/+/refs/heads/stable-5.13/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/PackBitmapIndexRemapper.java#78
Comment 1 Antonio Barone CLA 2023-08-10 08:40:57 EDT
Uploaded a WIP change that demonstrates the issue: https://git.eclipse.org/r/c/jgit/jgit/+/203633
Comment 2 Antonio Barone CLA 2023-08-16 10:04:48 EDT
Created attachment 289161
A repository exposing the problem

Running GC with https://git.eclipse.org/r/plugins/gitiles/jgit/jgit/+/3a6eec9bb697a599a32ccb08ee176e6d4982f90f