Bug 578707 - checkout of branch with lfs files uses .gitattributes from old branch instead of target branch
Summary: checkout of branch with lfs files uses .gitattributes from old branch instead...
Status: RESOLVED FIXED
Alias: None
Product: JGit
Classification: Technology
Component: JGit (show other bugs)
Version: 6.0   Edit
Hardware: PC Windows 10
: P3 normal (vote)
Target Milestone: 6.1   Edit
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-02-11 22:52 EST by Alex Lowe CLA
Modified: 2022-03-08 04:45 EST (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Alex Lowe CLA 2022-02-11 22:52:20 EST
Clone a respository like this:

Git.cloneRepository()
    .setURI(remoteUrl)
    .setDirectory(local)
    .setBranch("branchA")
    .call();

Suppose branchA has an LFS file a.bin and .gitattributes in branchA looks like this:

a.bin filter=lfs diff=lfs merge=lfs -text

Then all is dandy up to this point.  Now checkout branchB

Git git = Git.open(local);
git.branchCreate()
    .setName("branchB")
    .setStartPoint("origin/branchB")
    .setUpstreamMode(SetupUpstreamMode.SET_UPSTREAM)
    .call();
git.checkout()
    .setName(branchOrTag)
    .call();

Suppose branchB has an LFS file b.bin and .gitattributes in branchB looks like this:

b.bin filter=lfs diff=lfs merge=lfs -text

Then all is NOT dandy.  b.bin is just the LFS pointer, not the actual file.  checkout branchA again and again all is NOT dandy. a.bin is now just the LFS pointer too.

However if .gittattributes looks like this in both branches:

a.bin filter=lfs diff=lfs merge=lfs -text
b.bin filter=lfs diff=lfs merge=lfs -text

Then all is dandy.

(or maybe results will vary depending on the order in which .gitattributes is checked out?)
Comment 1 Thomas Wolf CLA 2022-02-12 13:15:50 EST
Looks like JGit always uses the .gitattributes as given by <global, info, <working tree, index, HEAD>>. Which is fine for determining whether index of working tree are dirty.

But for then doing the actual checkout from a commit X, it should probably take the .gitattributes defined by <global, info, <X>>.

If this is really wrong in JGit we're looking at fairly complicated fixes in a complicated area. (Not just checkout, but also merge/diff.)

What happens when you perform the same operations with C git?
Comment 2 Thomas Wolf CLA 2022-02-12 16:02:34 EST
I gave this a try with C git, using explicit (and stupid) clean and smudge filters. Easier than LFS, and easier reproducible than attempting similar
things with different CRLF settings. 

$ git init
$ git config filter.foo.clean 'echo "cleaned"'
$ git config filter.foo.smudge 'echo "foo"'
$ git config filter.foo2.clean 'echo "cleaned2"'
$ git config filter.foo2.smudge 'echo "foo2"'
$ git config filter.bar.clean 'echo "cleaned"'
$ git config filter.bar.smudge 'echo "bar"'
$ git config filter.a.clean 'echo "aclean"'
$ git config filter.a.smudge 'echo "aaa"'
$ git config filter.b.clean 'echo "bclean"'
$ git config filter.b.smudge 'echo "bbb"'
$ echo "test" > test.txt
$ git add .
$ git commit -m "initial commit"

End of test setup.

$ git checkout -b a
$ echo "foo" > foo.txt
$ echo "foo2" > foo2.txt
$ echo "aaa" > a.txt
$ echo "foo.txt filter=foo" > .gitattributes
$ echo "foo2.txt filter=foo2" >> .gitattributes
$ echo "a.txt filter=a" >> .gitattributes
$ git add .
$ git commit -m "commit a"
$ git checkout master
$ git checkout -b b
$ echo "bar" > foo.txt
$ echo "bar" > foo2.txt
$ echo "bbb" > b.txt
$ echo "foo.txt filter=bar" > .gitattributes
$ echo "foo2.txt filter=bar" >> .gitattributes
$ echo "b.txt filter=b" >> .gitattributes
$ git add .
$ git commit -m "commit b"
$ git checkout master
Switched to branch 'master'
$ git checkout a
Switched to branch 'a'
$ cat a.txt 
aaa
$ cat foo.txt 
foo
$ cat foo2.txt 
foo2
$ git checkout b
Switched to branch 'b'
$ cat b.txt 
bbb
$ cat foo.txt 
foo
$ cat foo2.txt 
bar
$

C git does apply the .gitattributes from commit b for the new file b.txt.
Since the "cleaned" versions of foo.txt are identical in both branches, it
leaves foo.txt untouched on disk. foo2.txt is different when cleaned in the
two branches, git can detect a modification, and applies the .gitattributes
from commit b on checkout.

JGit indeed seems to have this one wrong.

$ rm foo.txt
$ git checkout -- foo.txt
$ cat foo.txt
bar
$

Now it did apply the .gitattributes. But which one?

$ cat .gitattributes 
foo.txt filter=bar
foo2.txt filter=bar
b.txt filter=b
$ sed -i -e s/=bar/=foo/ .gitattributes 
$ cat .gitattributes 
foo.txt filter=foo
foo2.txt filter=foo
b.txt filter=b
$ rm foo.txt
$ git checkout -- foo.txt
$ cat foo.txt 
foo
$

Hmmm... it applies the working tree .gitattributes now?

$ sed -i -e s/foo.txt/#foo.txt/ .gitattributes 
$ cat .gitattributes 
#foo.txt filter=foo
foo2.txt filter=foo
b.txt filter=b
$ rm foo.txt
$ git checkout -- foo.txt
$ cat foo.txt 
cleaned
$ 

And _only_ the working tree .gitattributes now?

$ rm foo.txt
$ git checkout a -- foo.txt
$ cat foo.txt
cleaned
$

So not a special case when checking out from HEAD... looks like for checking
out specific paths, C git does *not* take the .gitattributes of the commit
we're checking out from into account. And even though we're doing a checkout,
the working tree change overrides the .gitattributes in the commit.
Comment 3 Thomas Wolf CLA 2022-02-12 16:05:53 EST
And this is a simple JUnit test (for LfsGitTest) to reproduce the problem in JGit:

  @Test
  public void testBranchSwitch() throws Exception {
    git.branchCreate().setName("abranch").call();
    git.checkout().setName("abranch").call();
    File aFile = writeTrashFile("a.bin", "aaa");
    writeTrashFile(".gitattributes", "a.bin filter=lfs");
    git.add().addFilepattern(".").call();
    git.commit().setMessage("acommit").call();
    git.checkout().setName("master").call();
    git.branchCreate().setName("bbranch").call();
    git.checkout().setName("bbranch").call();
    File bFile = writeTrashFile("b.bin", "bbb");
    writeTrashFile(".gitattributes", "b.bin filter=lfs");
    git.add().addFilepattern(".").call();
    git.commit().setMessage("bcommit").call();
    git.checkout().setName("abranch").call();
    checkFile(aFile, "aaa");
    git.checkout().setName("bbranch").call();
    checkFile(bFile, "bbb");
  }

It does indeed check out aFile as raw LFS pointer.
Comment 4 Eclipse Genie CLA 2022-02-14 06:36:29 EST
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/190766