Bug 573328 - convert-ref-storage --format refdir fails with "Too many open files"
Summary: convert-ref-storage --format refdir fails with "Too many open files"
Status: RESOLVED FIXED
Alias: None
Product: JGit
Classification: Technology
Component: JGit (show other bugs)
Version: 5.12   Edit
Hardware: PC Mac OS X
: P3 normal (vote)
Target Milestone: 5.12   Edit
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-05-03 17:02 EDT by Matthias Sohn CLA
Modified: 2021-05-11 04:16 EDT (History)
4 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Matthias Sohn CLA 2021-05-03 17:02:49 EDT
I tried convert-ref-storage command which was added to gerrit stable-3.4 recently [1] with a mirror of the egit repository, converting to reftable works nicely, but converting back to refdir storage fails with the error

$ jgit convert-ref-storage --format refdir
2021-05-03 22:54:26 ERROR LockFile:134 - Creating lock file /Users/xxxxxx/src/git/gerrit-site-3.4/git/egit.git/refs/changes/44/151444/meta.lock failed
java.nio.file.FileSystemException: /Users/xxxxxx/src/git/gerrit-site-3.4/git/egit.git/refs/changes/44/151444/meta.lock: Too many open files
	at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
	at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219)
	at java.base/java.nio.file.Files.newByteChannel(Files.java:370)
	at java.base/java.nio.file.Files.createFile(Files.java:647)
	at org.eclipse.jgit.util.FS_POSIX.createNewFileAtomic(FS_POSIX.java:427)
	at org.eclipse.jgit.internal.storage.file.LockFile.lock(LockFile.java:132)
	at org.eclipse.jgit.internal.storage.file.PackedBatchRefUpdate.lockLooseRefs(PackedBatchRefUpdate.java:316)
	at org.eclipse.jgit.internal.storage.file.PackedBatchRefUpdate.execute(PackedBatchRefUpdate.java:159)
	at org.eclipse.jgit.lib.BatchRefUpdate.execute(BatchRefUpdate.java:602)
	at org.eclipse.jgit.internal.storage.file.FileRepository.convertToPackedRefs(FileRepository.java:671)
	at org.eclipse.jgit.internal.storage.file.FileRepository.convertRefStorage(FileRepository.java:814)
	at org.eclipse.jgit.pgm.ConvertRefStorage.run(ConvertRefStorage.java:34)
	at org.eclipse.jgit.pgm.TextBuiltin.execute(TextBuiltin.java:235)
	at org.eclipse.jgit.pgm.Main.execute(Main.java:245)
	at org.eclipse.jgit.pgm.Main.run(Main.java:133)
	at org.eclipse.jgit.pgm.Main.main(Main.java:105)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49)
	at org.springframework.boot.loader.Launcher.launch(Launcher.java:107)
	at org.springframework.boot.loader.Launcher.launch(Launcher.java:58)
	at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:88)
java.nio.file.FileSystemException: /Users/xxxxxx/src/git/gerrit-site-3.4/git/egit.git/refs/changes/44/151444/meta.lock: Too many open files
	at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
	at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219)
	at java.base/java.nio.file.Files.newByteChannel(Files.java:370)
	at java.base/java.nio.file.Files.createFile(Files.java:647)
	at org.eclipse.jgit.util.FS_POSIX.createNewFileAtomic(FS_POSIX.java:427)
	at org.eclipse.jgit.internal.storage.file.LockFile.lock(LockFile.java:132)
	at org.eclipse.jgit.internal.storage.file.PackedBatchRefUpdate.lockLooseRefs(PackedBatchRefUpdate.java:316)
	at org.eclipse.jgit.internal.storage.file.PackedBatchRefUpdate.execute(PackedBatchRefUpdate.java:159)
	at org.eclipse.jgit.lib.BatchRefUpdate.execute(BatchRefUpdate.java:602)
	at org.eclipse.jgit.internal.storage.file.FileRepository.convertToPackedRefs(FileRepository.java:671)
	at org.eclipse.jgit.internal.storage.file.FileRepository.convertRefStorage(FileRepository.java:814)
	at org.eclipse.jgit.pgm.ConvertRefStorage.run(ConvertRefStorage.java:34)
	at org.eclipse.jgit.pgm.TextBuiltin.execute(TextBuiltin.java:235)
	at org.eclipse.jgit.pgm.Main.execute(Main.java:245)
	at org.eclipse.jgit.pgm.Main.run(Main.java:133)
	at org.eclipse.jgit.pgm.Main.main(Main.java:105)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49)
	at org.springframework.boot.loader.Launcher.launch(Launcher.java:107)
	at org.springframework.boot.loader.Launcher.launch(Launcher.java:58)
	at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:88)

[1] https://gerrit-review.googlesource.com/c/gerrit/+/303748
Comment 1 Matthias Sohn CLA 2021-05-03 17:04:07 EDT
PackedBatchRefUpdate creates too many open file handles and the command crashes.

PackedBatchRefUpdate is trying to lock packed-refs and also tries to lock each ref as a loose ref individually which creates a ton of lock files which is slow and prone to hit this exception. It has an optimization to skip the locking of loose refs in case of a clone but doesn't use it for conversion from reftable to refdir.
Comment 2 Matthias Sohn CLA 2021-05-03 17:05:49 EDT
I tried to increase ulimit -n to 1000000 (on Mac) but it still crashed.
Comment 3 Luca Milanesio CLA 2021-05-03 18:57:14 EDT
Thanks @Matthias for trying this out.
Comment 4 Thomas Wolf CLA 2021-05-04 03:43:56 EDT
Same problem as in bug 552173. What does C git do when using packed-refs? Does it also run into this problem? If not, how does C git avoid this?
Comment 5 Thomas Wolf CLA 2021-05-04 06:24:37 EDT
Looks to me that the problem is the eager creation of the FileOutputStream in LockFile.lock(). That must not happen; only later when getOutputStream() is called.
Comment 6 Matthias Sohn CLA 2021-05-04 17:11:48 EDT
(In reply to Thomas Wolf from comment #5)
> Looks to me that the problem is the eager creation of the FileOutputStream
> in LockFile.lock(). That must not happen; only later when getOutputStream()
> is called.

I think this is done intentionally to ensure an exclusive write lock. 

C git opens a file descriptor for the lock file, see lock_file() [1] calling 
create_tempfile_mode() [2] which opens a file descriptor for the lock file
with flags O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC. If the kernel doesn't support
O_CLOEXEC it omits that. If locking succeeds lock_file() returns the file descriptor of the open lock file to the caller.

I guess opening a FileOutputStream on the lock file is the equivalent method to do this in Java.

[1] https://github.com/git/git/blob/master/lockfile.c#L73
[2] https://github.com/git/git/blob/master/tempfile.c#L133
Comment 7 Thomas Wolf CLA 2021-05-04 17:30:20 EDT
I'm not sure opening a FileOutputStream sets O_EXCL.

Anyway, see C git: files_transaction_prepare() [1] takes care to not keep all these file descriptors open.

[1] https://github.com/git/git/blob/master/refs/files-backend.c#L2659
Comment 8 Thomas Wolf CLA 2021-05-04 17:45:14 EDT
Looking at [1], I don't think opening a FileOutputStream set O_EXCL.

[1] https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/master/src/java.base/unix/native/libjava/FileOutputStream_md.c#L56
Comment 9 Thomas Wolf CLA 2021-05-05 01:48:49 EDT
I got confused by the mention of "exclusive write lock". There is no such thing in the C git code.

O_CREAT | O_EXCL is the equivalent of "atomic" file creation. O_RDWR is just normal read/write access. O_CLOEXEC atomically sets a flag to close the file descriptor when a child process is spawned, so that the child doesn't "inherit" the fd.

In C it is natural and makes sense to return the fd opened. In Java using java.io, this makes no sense at all; the way it's implemented currently it's two operations anyway: first creating the file, then opening a FileOutputStream on it. So one can delay the latter until one really wants to write.

If we wanted to create the file and get an open OutputStream in one go, we would have to rewrite the whole way of creating these lock files anyway using java.nio, not using Files.createFile() in FS.createFileAtomically() but using File.newOutputStream(path, StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW) directly. (Which is what Files.createFile() uses internally, immediately closing the byte channel.)
Comment 10 Eclipse Genie CLA 2021-05-05 02:34:55 EDT
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/180198
Comment 11 Matthias Sohn CLA 2021-05-07 06:33:28 EDT
(In reply to Thomas Wolf from comment #9)
> I got confused by the mention of "exclusive write lock". There is no such
> thing in the C git code.

yes, you are right

> O_CREAT | O_EXCL is the equivalent of "atomic" file creation. O_RDWR is just
> normal read/write access. O_CLOEXEC atomically sets a flag to close the file
> descriptor when a child process is spawned, so that the child doesn't
> "inherit" the fd.
> 
> In C it is natural and makes sense to return the fd opened. In Java using
> java.io, this makes no sense at all; the way it's implemented currently it's
> two operations anyway: first creating the file, then opening a
> FileOutputStream on it. So one can delay the latter until one really wants
> to write.

I agree

> If we wanted to create the file and get an open OutputStream in one go, we
> would have to rewrite the whole way of creating these lock files anyway
> using java.nio, not using Files.createFile() in FS.createFileAtomically()
> but using File.newOutputStream(path, StandardOpenOption.WRITE,
> StandardOpenOption.CREATE_NEW) directly. (Which is what Files.createFile()
> uses internally, immediately closing the byte channel.)
Comment 13 Eclipse Genie CLA 2021-05-10 18:17:32 EDT
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/180432