Bug 574368 - commit-graph feature support
Summary: commit-graph feature support
Status: NEW
Alias: None
Product: JGit
Classification: Technology
Component: JGit
Version: unspecified
Hardware: PC All
Importance: P3 enhancement
Target Milestone: ---
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 577948
Reported: 2021-06-21 22:21 EDT by Kyle Zhao CLA
Modified: 2023-04-13 08:31 EDT

Description Kyle Zhao CLA 2021-06-21 22:21:54 EDT
In a large repository, it may take a few seconds to list and filter the commit history.

We inspected the stack traces for the command git log --before=xxxxx with a CPU profiler. Here are a few of the stack traces with some hot spots identified; the number at the left of each line is the percentage of total CPU time.

|--99.1% o.e.j.Mytest.testLinuxLog
   `--97.1% o.e.j.revwalk.RevWalk.iterator
      `--97.1% o.e.j.revwalk.RevWalk.nextForIterator
         `--97.1% o.e.j.revwalk.RevWalk.next
            `--97.1% o.e.j.revwalk.PendingGenerator.next
               `--94.9% o.e.j.revwalk.RevCommit.parseHeaders
                  |--86.1% o.e.j.revwalk.RevWalk.getCachedBytes
                  `--8.6% o.e.j.revwalk.RevCommit.parseCanonical

When filtering the commit history, JGit always invokes the method o.e.j.revwalk.RevWalk.getCachedBytes to load the entire content of each commit (it accounts for over 85% of the CPU time in the stack traces above). For every commit, it has to find where the commit is stored, decompress the commit object, and parse it as plain text. This is expensive, especially when it is done thousands of times.
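
To make the per-commit cost concrete, here is a minimal sketch of the plain-text parsing step, assuming the commit object has already been located and decompressed. The class name RawCommitHeaders is hypothetical and this is not JGit's RevCommit.parseCanonical; it only relies on the canonical commit layout (tree, parent, author and committer header lines, a blank line, then the message).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not JGit's RevCommit.parseCanonical. */
final class RawCommitHeaders {
    /** Hex ids of the parent commits, in header order. */
    final List<String> parents = new ArrayList<>();
    /** Seconds since the epoch taken from the "committer" line, or -1 if absent. */
    long commitTime = -1;

    /** Parses the header part of an already inflated commit object. */
    static RawCommitHeaders parse(byte[] raw) {
        RawCommitHeaders h = new RawCommitHeaders();
        String text = new String(raw, StandardCharsets.UTF_8);
        for (String line : text.split("\n")) {
            if (line.isEmpty()) {
                break; // a blank line separates the headers from the message
            }
            if (line.startsWith("parent ")) {
                h.parents.add(line.substring("parent ".length()));
            } else if (line.startsWith("committer ")) {
                // Layout: committer Name <email> <epoch-seconds> <timezone>
                String[] parts = line.split(" ");
                h.commitTime = Long.parseLong(parts[parts.length - 2]);
            }
        }
        return h;
    }
}
```

A date filter only needs commitTime and a walk only needs parents, yet both become available only after the whole object has been read and scanned like this.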

The commit-graph can help with this problem. It is a supplemental data structure that accelerates commit graph walks (e.g. listing and filtering commit history, or computing merge bases).

See:
https://git-scm.com/docs/commit-graph
https://devblogs.microsoft.com/devops/supercharging-the-git-commit-graph/
https://devblogs.microsoft.com/devops/supercharging-the-git-commit-graph-ii-file-format/
https://devblogs.microsoft.com/devops/supercharging-the-git-commit-graph-iii-generations/
https://devblogs.microsoft.com/devops/super-charging-the-git-commit-graph-iv-bloom-filters/
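
As a rough illustration of why such a structure helps, the sketch below keeps one small record per commit (id, commit time, parent positions) in an in-memory table and answers a date-filtered log without reading any commit object. The class CommitGraphSketch and its methods are hypothetical; this is neither the on-disk commit-graph format nor a JGit API. The real file additionally stores each commit's root tree and a generation number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical in-memory stand-in for a commit-graph; not a JGit API. */
final class CommitGraphSketch {
    /** One small record per commit, addressed by its position in the table. */
    static final class Entry {
        final String id;
        final long commitTime;
        final int[] parents; // positions of the parents in the same table

        Entry(String id, long commitTime, int[] parents) {
            this.id = id;
            this.commitTime = commitTime;
            this.parents = parents;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Adds a commit whose parents are already in the table; returns its position. */
    int add(String id, long commitTime, int... parents) {
        entries.add(new Entry(id, commitTime, parents));
        return entries.size() - 1;
    }

    /** Lists commits reachable from start with commitTime at or before the limit, newest first. */
    List<String> logBefore(int start, long before) {
        boolean[] seen = new boolean[entries.size()];
        // Visit the newest pending commit first, ordered by commit time.
        PriorityQueue<Integer> pending = new PriorityQueue<>(
                (a, b) -> Long.compare(entries.get(b).commitTime, entries.get(a).commitTime));
        List<String> result = new ArrayList<>();
        seen[start] = true;
        pending.add(start);
        while (!pending.isEmpty()) {
            Entry e = entries.get(pending.poll());
            if (e.commitTime <= before) {
                result.add(e.id);
            }
            // Parents come straight from the table; no object is inflated or parsed.
            for (int p : e.parents) {
                if (!seen[p]) {
                    seen[p] = true;
                    pending.add(p);
                }
            }
        }
        return result;
    }
}
```

The walk touches only table lookups, which is the kind of work that replaces the getCachedBytes and parse steps in the profile above.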
Comment 1 Kyle Zhao CLA 2021-06-21 22:33:21 EDT
Our company has been implementing this feature on top of JGit for a long time to improve the performance of commit graph walks, and it is already used in some of our Git repositories.

We tested it on https://github.com/torvalds/linux, and the following are the performance numbers with this change.

command                       |  before  |  after  |  change
git log --before=2009-1-1     |  4790ms  |  589ms  |  -87%
git merge-base master topic   |  483ms   |  58ms   |  -88%

master: 032b4cc8ff84490c4bc7c4ef8c91e6d83a637538 
topic:  62d18ecfa64137349fac9c5817784fbd48b54f48

We are willing to contribute our code to the community, and hope someone can review it.

Regards,
Kyle Zhao
Comment 2 Matthias Sohn CLA 2021-06-22 03:14:12 EDT
(In reply to Kyle Zhao from comment #1)
> Our company has been implementing this feature on top of JGit for a long
> time to improve the performance of commit graph walks, and it is already
> used in some of our Git repositories.
> 
> We tested it on https://github.com/torvalds/linux, and the following are the
> performance numbers with this change.
> 
> command                       |  before  |  after  |  change
> git log --before=2009-1-1     |  4790ms  |  589ms  |  -87%
> git merge-base master topic   |  483ms   |  58ms   |  -88%

That's a nice speedup.

> master: 032b4cc8ff84490c4bc7c4ef8c91e6d83a637538 
> topic:  62d18ecfa64137349fac9c5817784fbd48b54f48
> 
> We are willing to contribute our code to the community, and hope someone can
> review it.

Sure, please contribute it.
Looking forward to reviewing your implementation.
Comment 3 Eclipse Genie CLA 2021-07-06 23:41:51 EDT
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/182832
Comment 4 Eclipse Genie CLA 2021-07-08 09:06:19 EDT
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/182892
Comment 5 Eclipse Genie CLA 2021-07-12 08:02:46 EDT
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/182976
Comment 6 Eclipse Genie CLA 2021-07-15 00:49:06 EDT
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/183078
Comment 7 Eclipse Genie CLA 2021-07-15 00:49:08 EDT
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/183079
Comment 8 Eclipse Genie CLA 2021-11-09 07:48:31 EST
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/187541
Comment 9 Eclipse Genie CLA 2022-11-18 02:28:32 EST
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/197097