Community
Participate
Working Groups
Created attachment 288454 [details] e-mail between Thomas and myself. Hello, I've been working with Thomas Wolf on a performance issue when cloning to a Network Samba share. Thomas asked me to open this bug. Regards, Jim See also https://www.eclipse.org/forums/index.php?t=msg&th=1110388&goto=1851740&#msg_1851740
Created attachment 288455 [details] Eclipse run output when cloning to a network Samba drive Thomas reviewed this log already; it's part of an attachment e-mail thread.
Yes. This looks similar to the problems we had in EGit when the user home (and thus the user config) was on a shared drive on Windows: far too many accesses to the git repo config, each time getting the file attributes of the repo config, the user config, and the system config to check whether they were modified. Getting file attributes on Windows is a costly operation. It looks like we have about 2-3 git config accesses *per file* that is checked out. Also a lot of checks for the pack directory. This will need need some detailed tests to first find out where these Filesnapshot tests are coming from exactly, and then making them artificially slower (a short Thread.sleep() should do the trick). That way, this could be analyzed and fixed even on a fast filesystem.
Hello, Any chance to get a few cycles on this issue? Thanks, Jim
Improving this is a lot of work. There are several reasons for the many git config accesses. 1. During check-out: these come from DirCacheCheckout.checkoutEntry(), which calls repo.getConfig().get(WorkingTreeOptions.KEY) for every single file. repo.getConfig() checks the repo, user, and system config for being modified. This could be avoided in several ways: * Pass the config into DirCacheCheckout.checkoutEntry() and make sure it is loaded once outside in DirCacheCheckout.doCheckout(). There are other check-out paths though, also in merge, that might need to be adapted. (Or alternatively pass in the WorkingTreeOptions directly.) * Incorporate EGit's CachingRepository into FileRepository, port EGit's UnitOfWork, and modify commands that perform check-outs to run the actual check-out in a UnitOfWork. (A UnitOfWork caches the git config and does not re-check for on-disk modifications for the duration of the UnitOfWork.) 2. Writing refs (i.e., branches and tags received). There are many RefUpdates that call repo.getConfig() to get the core.trustFolderStat setting, and if false, would also check the packed refs file for having been modified. (See RefDirectory.getPackedRefs().) Similar remedies as above might be applied. Either read that trustFolderStat setting only once and pass it around, or see if a UnitOfWork could be used. Or handle the initial writing of refs specially when cloning. 3. (Not a git config access, but a costly file attributes access) Checks for the pack directory being modified: these come from the fetch phase when receiving pack files and parsing them. There is a check against duplicate objects (i.e., objects with the same object ID but different content) in there, which scans all packs in the directory, and if not found, checks whether new packs have been added to the pack directory and if so re-scans again. Unclear how this could be improved. This duplicate check is done for every object received. Maybe switch off this duplicate check? Or just don't re-check for new pack files if cloning and the first pass over the pack files didn't find a duplicate?
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/195181
(In reply to Eclipse Genie from comment #5) > New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/195181 This fixes (1) from comment 4 by loading the WorkingTreeOptions only once instead of for every file to be checked out. (2) and (3) are still open. Note that (3) has the biggest effect.
Gerrit change https://git.eclipse.org/r/c/jgit/jgit/+/195181 was merged to [master]. Commit: http://git.eclipse.org/c/jgit/jgit.git/commit/?id=b255eb0fb6ab9c7487423083e8df9b1e373bc37e
Thanks, Tomas, for the work on this issue. Do you know when this change will make it into Eclipse and what version that will be? Thanks, Jim
(In reply to James Poli from comment #8) > Thanks, Tomas, for the work on this issue. Do you know when this change > will make it into Eclipse and what version that will be? > Thanks, > Jim This patch will be included in JGit 6.3.0 which is planned to be released with Eclipse 2022-09 (4.25) on Sep 14.