Hi! Gerrit 2.16 was just released <https://groups.google.com/forum/#!topic/repo-discuss/wngI5mi6PtY>. \o/ The same code has been working well on gerrit-review.googlesource.com for a while now. Looking at https://www.gerritcodereview.com/2.14.html#bugfix-releases, I see a number of bug fixes we are missing. More importantly to me :), Gerrit 2.16 includes some very nice UI improvements (e.g. ability to see all the unaddressed comments on a change in one place) that I've gotten used to and would love to see at git.eclipse.org/r/. If there's anything I can do to help, just ask.
+1 I think it would be really nice to update to 2.16. The new UI is awesome. Let me know if you need help.
+1 Gerrit upgrades are always easy and seamless.
While we are at it, we should also upgrade the foundation Gerrit instance. :)
I typically start with our internal instance :)
Let's schedule this for Tuesday, May 14 14:00 EDT
How is the upgrade going? Any ETA on when the service might be back up?
Seems the step "Migrating data to schema 146 ..." is taking a really long time. No ETA, at this rate.
I aborted about 2h ago to try a different approach, but I'm still at "Migrating data to schema 146 ...". If I don't have any feedback from the install process in the next hour, I will roll back and try a different approach during the weekend.
I've rolled back the upgrade, and Gerrit is up once again. I'll investigate in the morning and update the sandbox, so I can stage a migration there.
> Denis Roy 2018-11-16 10:37:16 EST
> +1 Gerrit upgrades are always easy and seamless.

Of course, I had to jinx myself :/
Looks like the Genie is not working, e.g. the following gerrit is not linked to bugzilla after almost an hour: https://git.eclipse.org/r/#/c/142149/
(In reply to Till Brychcy from comment #11)
> Looks like the Genie is not working, e.g. the following gerrit is not
> linked to bugzilla after almost an hour:
> https://git.eclipse.org/r/#/c/142149/

Also, bugzilla doesn't comment on merged commits anymore, see https://bugs.eclipse.org/bugs/show_bug.cgi?id=547148 and https://git.eclipse.org/r/#/c/141316/.
(In reply to Till Brychcy from comment #11) (In reply to Andrey Loskutov from comment #12) I don't think that is directly related to this upgrade. The process responsible for linking gerrit and bugzilla seems to have taken an unexpected nap. I've restarted it so events should start flowing again. -M.
I've sandboxed the entirety of gerrit and all our Git repos, and I'm running 2.15.13 on it.
Migrating data to schema 146 ... has been running for almost 8h now. In a few hours I'll jstack it to see if I can determine what's going on under the hood.
I uploaded a patch [1] for this migration which you can use to get progress logged; this should help to estimate how long it will take.

In the gerrit site run

$ cd git/All-Users.git/
[All-Users.git (BARE:refs/meta/config)]$ git show-ref | grep 'refs/users' | wc -l

to get the number of users you have. They have been stored in noteDb since 2.14, but 2.15 added some fields, hence another migration is needed.

[1] https://gerrit-review.googlesource.com/c/gerrit/+/224648
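For anyone repeating this count, here is a minimal sketch of a stricter variant of the pipeline above. It assumes account refs in All-Users follow the sharded layout refs/users/NN/ACCOUNT_ID, so anything else under refs/users/ is not counted. The function name count_user_refs is made up for illustration.

```shell
#!/bin/sh
# Count account refs in `git show-ref` output read from stdin.
# Assumption: one account = one ref shaped like refs/users/NN/ACCOUNT_ID.
# count_user_refs is a hypothetical helper, not part of Gerrit or git.
count_user_refs() {
    # show-ref prints "<sha> <refname>"; the ref name is field 2.
    awk '$2 ~ /^refs\/users\/[0-9]+\/[0-9]+$/ { n++ } END { print n + 0 }'
}

# Intended usage, run from the Gerrit site directory:
#   git --git-dir=git/All-Users.git show-ref | count_user_refs
```

The plain grep from the comment above is fine for a quick estimate; the stricter pattern only matters if other refs live under refs/users/.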
(In reply to Matthias Sohn from comment #16) For some reason, I figured you'd chime in and help :) Thank you. 128,000 users. I'll try your patched build. All-Users is definitely increasing in size. Will report back soon.
(In reply to Matthias Sohn from comment #16)
> I uploaded a patch [1] for this migration which you can use to get progress

The patch was helpful. Schema 146 starts like a rabbit -- several hundred users/sec, and it gradually slows down to about 200 users/sec. Things hit a hard wall at exactly 124,000 users migrated. I see very slow progress after that, several minutes for each slice of 100 users, yet CPU usage fluctuates between 20% and 80% of one core. I tried max heap of 8g then 12g, didn't seem to change much.

2:58pm... migrated 124100 users
3:06pm... migrated 124200 users
3:16pm... migrated 124300 users
3:24pm... migrated 124400 users
3:36pm... migrated 124500 users

One other thing: It may be losing its connection to MySQL from time to time. It would be great if it could reconnect automatically.
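The log lines above are enough for a rough estimate of the remaining time. The following is only a sketch: eta_minutes is a hypothetical helper, it assumes the rate between two log lines stays constant, and it takes the 128,000 total mentioned earlier in this thread at face value.

```shell
#!/bin/sh
# eta_minutes START_COUNT END_COUNT ELAPSED_MINUTES TOTAL
# Prints the estimated minutes left, extrapolating linearly from two
# progress log lines. Hypothetical helper, for illustration only.
eta_minutes() {
    awk -v s="$1" -v e="$2" -v m="$3" -v t="$4" 'BEGIN {
        migrated = e - s
        # No progress between the two samples: no meaningful estimate.
        if (migrated <= 0) { print "unknown"; exit }
        printf "%.1f\n", (t - e) * m / migrated
    }'
}

# 124100 users at 2:58pm, 124500 users at 3:36pm (38 minutes apart).
eta_minutes 124100 124500 38 128000   # prints 332.5
```

At that rate the last 3,500 users alone would take about five and a half hours, which matches the "hard wall" impression.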
Interestingly, I stopped the process at 125,300, after 2h of runtime. I started it up, it zipped to 125,300 then stopped, and resumed its crawl. Is it safe for me to run the 2.15 init process on the 2.14 installation _while it's running_ ?
I am working on a parallel implementation of this migration, and it is much faster. I just tested on an empty Gerrit server on which I created 128k users using a script, and it looks like this can migrate these 128k users in 25min on my 4 core MacBook running the migration using 16 threads.

By experimenting I found that gc'ing the All-Users.git repo before the upgrade improves performance:

$ cd git/All-Users.git
$ git gc --prune=now

and so does repacking refs during the (parallel) migration. I added this ref repacking to the migration.
Strangely, at this time: ... migrated 143600 users

> By experimenting I found that gc'ing the All-Users.git repo before the
> upgrade improves performance:
>
> $ cd git/All-Users.git
> $ git gc --prune=now

Great minds... I ran gc on the production All-Users last Friday. Thanks for the assistance.
I uploaded another version of this patch, the previous one didn't migrate all users :-(. Patchset 3 does. See https://gerrit-review.googlesource.com/c/gerrit/+/224833/3
When do you intend to continue with this upgrade ? Did the fix I provided help to get the users migrated or are you stuck somewhere ?
The upgrade is still stuck as it loses its connection to the database, regardless of the autoreconnect setting. I'd like to resume the upgrade process at some point soon though.
(In reply to Denis Roy from comment #24)
> The upgrade is still stuck as it loses its connection to the database,
> regardless of the autoreconnect setting. I'd like to resume the upgrade
> process at some point soon though.

OK, meanwhile my patch was merged and is included in the Gerrit 2.15.14 release: https://www.gerritcodereview.com/2.15.html
Any progress here? Or any further show stopper(s)?

JGit and Gitiles plugin versions are totally outdated in Gerrit 2.14.6, which prevents us from consuming JGit-Archive using the JGit archive command because of this bug: [1]. See also the discussion in this CL on why this fix is needed on git.eclipse.org: [2].

Also note that Gerrit 2.14.6 is a quite outdated minor release with tons of bugs. You should consider upgrading ASAP to 2.14.20: [3]. At least this upgrade should be "easy and seamless" ;-).

[1] https://bugs.eclipse.org/bugs/show_bug.cgi?id=548312
[2] https://gerrit-review.googlesource.com/c/gerrit/+/227897
[3] https://repo1.maven.org/maven2/com/google/gerrit/gerrit-war/2.14.20/gerrit-war-2.14.20.war
I'll try to throw this back on my radar.
(In reply to David Ostrovsky from comment #26)
> Also note, that Gerrit 2.14.6 is quite outdated minor release with tons of
> bugs. You should consider to upgrade ASAP to: 2.14.20: [3]. At least this
> upgrade should be "easy and seamless" ;-).
>
> [3] https://repo1.maven.org/maven2/com/google/gerrit/gerrit-war/2.14.20/gerrit-war-2.14.20.war

Here is an excerpt of the issues fixed in the 14 minor releases that are missing on eclipse.org:

v2.14.15 (https://www.gerritcodereview.com/2.14.html#21415)
 o Update JGit to 4.7.5.201810051826-r to fix CVE-2018-17456
 o Issue 9823:[a] Fix force push permission check for administrators and project owners over SSH.

v2.14.16
 o Issue 9836:[b] Fix database connection leaks.

v2.14.17
 o Issue 9952:[c] Upgrade dependencies to newer versions to fix CVEs.

v2.14.18
 o Major Git protocol security issue fixed. Issue 10262:[d] Upgrade JGit to 4.7.7.201812240805-r to fix validation of wants in git-upload-pack for protocol v0 stateless transports.

v2.14.20
 o Issue 10695:[e] Upgrade JGit to 4.7.9.201904161809-r to fix regression in packfile list handling.

[a] https://bugs.chromium.org/p/gerrit/issues/detail?id=9823
[b] https://bugs.chromium.org/p/gerrit/issues/detail?id=9836
[c] https://bugs.chromium.org/p/gerrit/issues/detail?id=9952
[d] https://bugs.chromium.org/p/gerrit/issues/detail?id=10262
[e] https://bugs.chromium.org/p/gerrit/issues/detail?id=10695
As an occasional contributor to several of the Foundation's projects, I'm very keen on seeing this happen as well! Gerrit 2.14 is becoming increasingly outdated compared to all the features and conveniences that platforms such as GitHub have to offer nowadays. I'm also seeing an increasing number of issues, mainly on mobile devices: it's a big challenge to post any review comment using Firefox for Android, and I even managed to lose an entire review comment I was preparing earlier this morning. Upgrading will hopefully make things much smoother.
Denis, are there plans for this?
BTW it is time to think about moving to Gerrit 3.x now.
(In reply to Alexander Kurtakov from comment #31)
> BTW it is time to think about moving to Gerrit 3.x now.

+1 My offer to help with this upgrade is still available :-)
Many thanks -- our hands are tied at the moment, no ETA on this yet.
We'll start by upgrading to 2.14.20 Friday, Feb 28.
2.14.20 upgraded cleanly; running the reindexers for good measure.
Thanks, looks all good. When can we do the next step ?
Do you have any suggestions on what our next step should be?
(In reply to Denis Roy from comment #37)
> Do you have any suggestions on what our next step should be?

I'd target the following steps (for any upgrade always use the latest available service release of the given minor release):

1. upgrade to 2.15 [1] then 2.16 [2] in one step, stay on reviewDB (MySQL)
2. on 2.16 migrate to noteDB [3]
3. upgrade to 3.0 then 3.1 in one step [4]

Upgrade all plugins you use to the respective target release of each step. If you have plugins of your own, you need to compile and test them against Gerrit API 2.16 for step 1 and against Gerrit API 3.1 for step 3. There is no need to install the plugins for the intermediate versions 2.15 and 3.0 if you follow this plan, since you won't run Gerrit on these intermediate versions.

We should also review your configuration for each step. After step 2, git gc needs to be scheduled to run more frequently for the All-Users repository (something like every 30min instead of once a day).

[1] https://www.gerritcodereview.com/2.15.html
$ java -jar gerrit-2.15.18.war init -d site_path

[2] https://www.gerritcodereview.com/2.16.html
$ java -jar gerrit-2.16.16.war init -d site_path
$ java -jar gerrit-2.16.16.war reindex --index projects -d site_path
$ java -jar gerrit-2.16.16.war reindex --index groups -d site_path

You may also consider reindexing all indexes (including the largest one, the changes index) offline. That is faster than online migration, but your downtime will be longer. In that case, instead of the 2 reindex commands above, run
$ java -jar gerrit-2.16.16.war reindex -d site_path

[3] If you can afford the downtime, run the offline migration
https://gerrit-review.googlesource.com/Documentation/note-db.html#offline-migration
otherwise use the online migration
https://gerrit-review.googlesource.com/Documentation/note-db.html#online-migration

[4] https://www.gerritcodereview.com/3.0.html
$ java -jar gerrit-3.0.7.war init -d site_path
https://www.gerritcodereview.com/3.1.html
$ java -jar gerrit-3.1.3.war init -d site_path
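As a way to review the order of operations before touching a real site, the plan in this comment could be written down as a dry run. This is only a sketch: print_upgrade_plan is a hypothetical helper, it prints the commands quoted in the comment and executes nothing, and the war versions are the ones named there, to be swapped for whatever the latest service releases are.

```shell
#!/bin/sh
# Dry run of the three-step plan: prints commands, runs none of them.
# print_upgrade_plan is a hypothetical helper, for illustration only.
print_upgrade_plan() {
    site=$1

    # Step 1: 2.15 then 2.16 in one go, still on reviewDB (MySQL).
    echo "java -jar gerrit-2.15.18.war init -d $site"
    echo "java -jar gerrit-2.16.16.war init -d $site"
    echo "java -jar gerrit-2.16.16.war reindex --index projects -d $site"
    echo "java -jar gerrit-2.16.16.war reindex --index groups -d $site"

    # Step 2: noteDB migration; the exact procedure is in the docs at [3],
    # so only a reminder is printed here.
    echo "# migrate to noteDB on 2.16 (offline or online, see [3])"

    # Step 3: 3.0 then 3.1 in one go.
    echo "java -jar gerrit-3.0.7.war init -d $site"
    echo "java -jar gerrit-3.1.3.war init -d $site"
}

print_upgrade_plan site_path
```

Printing first and running by hand keeps a failed init from silently cascading into the next step.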
I am running a full reindex on a sandbox instance, since 2.15 won't upgrade:

Migrating data to schema 144 ...
Couldn't upgrade schema. Expected if slave and read-only database
ERROR com.google.gerrit.pgm.init.BaseInit : Couldn't upgrade schema. Expected if slave and read-only database
com.google.gwtorm.server.OrmException: Failed to migrate external IDs to NoteDb
(In reply to Denis Roy from comment #39)
> I am running a full reindex on a sandbox instance, since 2.15 won't upgrade:
>
> Migrating data to schema 144 ...
> Couldn't upgrade schema. Expected if slave and read-only database
> ERROR com.google.gerrit.pgm.init.BaseInit : Couldn't upgrade schema.
> Expected if slave and read-only database
> com.google.gwtorm.server.OrmException: Failed to migrate external IDs to
> NoteDb

Any additional errors in the error log ?
How was the sandbox instance created ?
Which steps did you run and what was their outcome / errors logged ?
Can you grant me access to the sandbox instance ?
> Any additional errors in the error log ?

Nothing.

> How was the sandbox instance created ?

mysqldump > mysql to import the db
rsync production git to sandbox area
rsync production gerrit app directory to staging area (minus the staging config file)

> Which steps did you run and what was their outcome / errors logged ?

java -jar gerrit-2.15.18.war reindex -d gerrit-sandbox/
java -jar gerrit-2.15.18.war init -d gerrit-sandbox/

> Can you grant me access to the sandbox instance ?

SSH access? We could use something like "screen" to give you access temporarily. Would that work?
(In reply to Denis Roy from comment #41)
> > Any additional errors in the error log ?
>
> Nothing.
>
> > How was the sandbox instance created ?
>
> mysqldump > mysql to import the db
> rsync production git to sandbox area
> rsync production gerrit app directory to staging area (minus the staging
> config file)
>
> > Which steps did you run and what was their outcome / errors logged ?
>
> java -jar gerrit-2.15.18.war reindex -d gerrit-sandbox/
> java -jar gerrit-2.15.18.war init -d gerrit-sandbox/
>
> > Can you grant me access to the sandbox instance ?
>
> SSH access? We could use something like "screen" to give you access
> temporarily. Would that work?

Yes, let me know how I can connect, then I will try later this evening. I am going to commute home now.
I've sent you an email with details. Thanks!
(In reply to Matthias Sohn from comment #38)
> (In reply to Denis Roy from comment #37)
> > Do you have any suggestions on what our next step should be?
>
> I'd target the following steps (for any upgrade always use the latest
> available service release of the given minor release):
>
> 1. upgrade to 2.15 [1] then 2.16 [2] in one step, stay on reviewDB (MySQL)

This week 2.16.17 was released, which enables a direct upgrade from 2.14 to 2.16.17. I.e. using that version, the intermediate upgrade from 2.14 to 2.15 can be skipped since [5] was fixed.

> 2. on 2.16 migrate to noteDB [3]
> 3. upgrade to 3.0 then 3.1 in one step [4]

[5] https://bugs.chromium.org/p/gerrit/issues/detail?id=10248
When can we continue ?
> I.e. using that version the intermediate upgrade from 2.14 to 2.15 can be
> skipped since

Awesome, I will look at this tomorrow.
(In reply to Denis Roy from comment #39)
> I am running a full reindex on a sandbox instance, since 2.15 won't upgrade:
>
> Migrating data to schema 144 ...
> Couldn't upgrade schema. Expected if slave and read-only database
> ERROR com.google.gerrit.pgm.init.BaseInit : Couldn't upgrade schema.
> Expected if slave and read-only database
> com.google.gwtorm.server.OrmException: Failed to migrate external IDs to
> NoteDb

Still the error I get with 2.16.17
(In reply to Denis Roy from comment #47)
> (In reply to Denis Roy from comment #39)
> > I am running a full reindex on a sandbox instance, since 2.15 won't upgrade:
> >
> > Migrating data to schema 144 ...
> > Couldn't upgrade schema. Expected if slave and read-only database
> > ERROR com.google.gerrit.pgm.init.BaseInit : Couldn't upgrade schema.
> > Expected if slave and read-only database
> > com.google.gwtorm.server.OrmException: Failed to migrate external IDs to
> > NoteDb
>
> Still the error I get with 2.16.17

The details you sent via email:

gerrit@gerrit-vm1:~> /usr/lib64/jvm/jre-1.8.0-openjdk/bin/java -jar gerrit-2.16.17.war init -d gerrit-sandbox/
Using secure store: com.google.gerrit.server.securestore.DefaultSecureStore

*** Gerrit Code Review 2.16.17
***

*** Git Repositories
***
Location of Git repositories [/home/data/git_stg/]:

*** SQL Database
***
Database server type [jdbc]:
URL [jdbc:mysql://dbmaster/gerrit_stg]:
Driver class name [com.mysql.jdbc.Driver]:
Database username [gerrit_stg]:
Change gerrit_stg's password [y/N]?

*** Index
***
Type [lucene/?]:
The index must be rebuilt before starting Gerrit:
java -jar gerrit.war reindex -d site_path

*** User Authentication
***
Authentication method [ldap/?]:
Git/HTTP authentication [http/?]:
LDAP server [ldap://ldapmaster]:
LDAP username :
Account BaseDN [dc=eclipse,dc=org]:
Group BaseDN [ou=group,dc=eclipse,dc=org]:
Enable signed push support [Y/n]? n

*** Email Delivery
***
SMTP server hostname [localhost]:
SMTP server port [(default)]:
SMTP encryption [none/?]:
SMTP username :

*** Container Process
***
Run as [gerrit]:
Java runtime [/usr/lib64/jvm/jre-1.8.0-openjdk/bin]:
Upgrade gerrit-sandbox/bin/gerrit.war [Y/n]?
Copying gerrit-2.16.17.war to gerrit-sandbox/bin/gerrit.war

*** SSH Daemon
***
Listen on address [*]:
Listen on port [29419]:

*** HTTP Daemon
***
Behind reverse proxy [Y/n]?
Proxy uses SSL (https://) [Y/n]?
Subdirectory on proxy server [/stg/]:
Listen on address [*]:
Listen on port [8081]:
Canonical URL [https://git.eclipse.org/staging/]:

*** Cache
***
Delete cache file /home/data/users/gerrit/gerrit-sandbox/cache/change_kind.h2.db [y/N]? y
Delete cache file /home/data/users/gerrit/gerrit-sandbox/cache/diff.h2.db [y/N]? y
Delete cache file /home/data/users/gerrit/gerrit-sandbox/cache/diff_intraline.h2.db [y/N]? y
Delete cache file /home/data/users/gerrit/gerrit-sandbox/cache/diff_summary.h2.db [y/N]? y
Delete cache file /home/data/users/gerrit/gerrit-sandbox/cache/git_tags.h2.db [y/N]? y
Delete cache file /home/data/users/gerrit/gerrit-sandbox/cache/mergeability.h2.db [y/N]? y

*** Plugins
***
Installing plugins.
Install plugin codemirror-editor version v2.16.17 [y/N]?
Install plugin commit-message-length-validator version v2.16.17 [Y/n]?
commit-message-length-validator v2.15.18 is already installed, overwrite it [Y/n]?
Updated commit-message-length-validator to v2.16.17
Install plugin download-commands version v2.16.17 [Y/n]?
download-commands v2.15.18 is already installed, overwrite it [Y/n]?
Updated download-commands to v2.16.17
Install plugin hooks version v2.16.17 [y/N]?
Install plugin replication version v2.16.17 [Y/n]?
replication v2.15.18 is already installed, overwrite it [Y/n]?
Updated replication to v2.16.17
Install plugin reviewnotes version v2.16.17 [Y/n]?
reviewnotes v2.15.18 is already installed, overwrite it [Y/n]?
Updated reviewnotes to v2.16.17
Install plugin singleusergroup version v2.16.17 [Y/n]?
singleusergroup v2.15.18 is already installed, overwrite it [Y/n]?
Updated singleusergroup to v2.16.17
Initializing plugins.

Upgrading schema to 143 ...
Upgrading schema to 144 ...
Upgrading schema to 145 ...
Upgrading schema to 146 ...
Upgrading schema to 147 ...
Upgrading schema to 148 ...
Upgrading schema to 149 ...
Upgrading schema to 150 ...
Upgrading schema to 151 ...
Upgrading schema to 152 ...
Upgrading schema to 153 ...
Upgrading schema to 154 ...
Upgrading schema to 155 ...
Upgrading schema to 156 ...
Upgrading schema to 157 ...
Upgrading schema to 158 ...
Upgrading schema to 159 ...
Upgrading schema to 160 ...
Upgrading schema to 161 ...
Upgrading schema to 162 ...
Upgrading schema to 163 ...
Upgrading schema to 164 ...
Upgrading schema to 165 ...
Upgrading schema to 166 ...
Upgrading schema to 167 ...
Upgrading schema to 168 ...
Upgrading schema to 169 ...
Upgrading schema to 170 ...
Migrating data to schema 143 ...
    > Done (0.001 s)
Migrating data to schema 144 ...
Couldn't upgrade schema. Expected if slave and read-only database
ERROR com.google.gerrit.pgm.init.BaseInit : Couldn't upgrade schema. Expected if slave and read-only database
com.google.gwtorm.server.OrmException: update failure on schema_version
    at com.google.gwtorm.schema.sql.SqlDialect.convertError(SqlDialect.java:162)
    at com.google.gwtorm.schema.sql.DialectMySQL.convertError(DialectMySQL.java:232)
    at com.google.gwtorm.jdbc.JdbcAccess.convertError(JdbcAccess.java:489)
    at com.google.gwtorm.jdbc.JdbcAccess.update(JdbcAccess.java:232)
    at com.google.gerrit.server.schema.SchemaVersion.finish(SchemaVersion.java:175)
    at com.google.gerrit.server.schema.SchemaVersion.migrateData(SchemaVersion.java:156)
    at com.google.gerrit.server.schema.SchemaVersion.upgradeFrom(SchemaVersion.java:94)
    at com.google.gerrit.server.schema.SchemaVersion.check(SchemaVersion.java:85)
    at com.google.gerrit.server.schema.SchemaUpdater.update(SchemaUpdater.java:111)
    at com.google.gerrit.pgm.init.BaseInit$SiteRun.upgradeSchema(BaseInit.java:389)
    at com.google.gerrit.pgm.init.BaseInit.run(BaseInit.java:145)
    at com.google.gerrit.pgm.util.AbstractProgram.main(AbstractProgram.java:61)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.google.gerrit.launcher.GerritLauncher.invokeProgram(GerritLauncher.java:225)
    at com.google.gerrit.launcher.GerritLauncher.mainImpl(GerritLauncher.java:121)
    at com.google.gerrit.launcher.GerritLauncher.main(GerritLauncher.java:65)
    at Main.main(Main.java:28)
Caused by: java.sql.SQLException: No operations allowed after statement closed.
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:964)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:897)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:886)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:860)
    at com.mysql.jdbc.StatementImpl.checkClosed(StatementImpl.java:442)
    at com.mysql.jdbc.PreparedStatement.clearBatch(PreparedStatement.java:1003)
    at com.mysql.jdbc.PreparedStatement.executeBatchInternal(PreparedStatement.java:1266)
    at com.mysql.jdbc.StatementImpl.executeBatch(StatementImpl.java:970)
    at com.google.gwtorm.schema.sql.SqlDialect.executeBatch(SqlDialect.java:448)
    at com.google.gwtorm.jdbc.JdbcAccess.execute(JdbcAccess.java:460)
    at com.google.gwtorm.jdbc.JdbcAccess.updateAsBatch(JdbcAccess.java:276)
    at com.google.gwtorm.jdbc.JdbcAccess.update(JdbcAccess.java:227)
    ... 16 more
Initialized /home/data/users/gerrit/gerrit-sandbox
Reindexing projects: 11% ( 189/1714)
ERROR com.google.gerrit.index.SiteIndexer : Failed to index project elogbook/elogbook
java.util.concurrent.ExecutionException: java.lang.NullPointerException
    at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:531)
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:492)
    at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:83)
    at com.google.gerrit.index.SiteIndexer$ErrorListener.run(SiteIndexer.java:112)
    at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:398)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1029)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:871)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:716)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.afterRanInterruptibly(TrustedListenableFutureTask.java:133)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:80)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
    at com.google.gerrit.server.logging.LoggingContextAwareRunnable.run(LoggingContextAwareRunnable.java:83)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:646)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
    at com.google.gerrit.server.index.project.AllProjectsIndexer.lambda$reindexProjects$0(AllProjectsIndexer.java:77)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
    ... 10 more
Reindexing projects: 100% (1714/1714)
Reindexed 1714 documents in projects index in 70.3s (24.4/s)
Is there any more context to the stack trace? The message says it failed on schema 144, so I would expect to see com.google.gerrit.server.schema.Schema_144 in there somewhere.
(In reply to David Pursehouse from comment #49)
> Is there any more context to the stack trace?

I'm not sure what you're asking. The complete output is in comment 48. Should I be looking elsewhere?
(In reply to Denis Roy from comment #50)
> (In reply to David Pursehouse from comment #49)
> > Is there any more context to the stack trace?
>
> I'm not sure what you're asking. The complete output is in comment 48.
> Should I be looking elsewhere?

The stack trace is complete. Schema_144 doesn't show up in the stack trace because it apparently succeeded; the problem occurs during the attempt to bump the schema version in the finish() method, which updates the schema version table in batch operation mode:

    for (SchemaVersion v : pending) {
      Stopwatch sw = Stopwatch.createStarted();
      ui.message(String.format("Migrating data to schema %d ...", v.getVersionNbr()));
      v.migrateData(db, ui);
=>    v.finish(curr, db);
      ui.message(String.format("\t> Done (%.3f s)", sw.elapsed(TimeUnit.MILLISECONDS) / 1000d));
    }

This part of the stack trace is relevant:

Caused by: java.sql.SQLException: No operations allowed after statement closed.
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:964)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:897)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:886)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:860)
    at com.mysql.jdbc.StatementImpl.checkClosed(StatementImpl.java:442)
    at com.mysql.jdbc.PreparedStatement.clearBatch(PreparedStatement.java:1003)
    at com.mysql.jdbc.PreparedStatement.executeBatchInternal(PreparedStatement.java:1266)
    at com.mysql.jdbc.StatementImpl.executeBatch(StatementImpl.java:970)
    at com.google.gwtorm.schema.sql.SqlDialect.executeBatch(SqlDialect.java:448)

It can be seen that the MySQL driver tries to re-use a statement in batch mode but fails because this statement is closed.

Can this be a problem with your MySQL driver? What driver version are you using? Are you using database pooling?

What I would also try is to disable batch mode for the MySQL dialect and see if that helps in your case. I uploaded this CL for the gwtorm project: [1] and made gwtorm release 1.21, with that change included, for you to try: [2]. To use it, just unzip release.war and replace WEB-INF/lib/gwtorm-1.18.jar with the new gwtorm.jar.

It is worth noting that exactly the same stack trace was reported by another site running the MySQL dialect: [3]. They worked around the problem by re-running the init program after the failed attempt, and this solved the problem.

[1] https://gerrit-review.googlesource.com/c/gwtorm/+/263932
[2] https://github.com/davido/gwtorm/releases/tag/v1.21
[3] https://bugs.chromium.org/p/gerrit/issues/detail?id=9734
> Can this be a problem with your MySQL driver? What driver version are you
> using?

How would I determine that?

> Are you using database pooling?

poollimit = 400
poolMaxIdle = 16
poolMaxWait = 60s
connectionPool = true

I'll try the upgrade with pooling disabled.

> What I would also try to do is to disable batch mode for MySQL dialect and
> to see if that helps in your case. I uploaded this CL for gwtorm project: [1]
> and conducted gwtorm release 1.21, with that change included, for you to
> try: [2]. To use, just unzip release.war and replace gwtorm.jar in
> WEB-INF/lib/gwtorm-1.18.jar.

Will try this next if disabling pooling doesn't resolve the issue.

> It worth noting, that there is exactly the same stack trace that was
> reported by another site running MySQL dialect: [3]. They worked around the
> problem by re-running init program again after the failed attempt and this
> solved the problem.

That issue gives me Permission Denied. Regardless, I've re-run init countless times.

Many thanks for your help
(In reply to Denis Roy from comment #52)
> > Can this be a problem with your MySQL driver? What driver version are you
> > using?
>
> How would I determine that?

I think the MySQL driver is named mysql-connector-java-<version>.jar. If you un-jar it, there should be a META-INF/MANIFEST.MF file which has all the details.
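One way to read the version without unpacking the whole jar is sketched below. It assumes the driver's manifest carries an Implementation-Version attribute, which is common but not guaranteed; manifest_version is a hypothetical helper name.

```shell
#!/bin/sh
# Print the Implementation-Version attribute from a jar manifest on stdin.
# Assumption: the manifest has that attribute; if not, nothing is printed.
# manifest_version is a hypothetical helper, for illustration only.
manifest_version() {
    # Manifests may use CRLF line endings, so strip carriage returns first.
    tr -d '\r' | awk -F': ' '$1 == "Implementation-Version" { print $2; exit }'
}

# Intended usage (unzip -p writes a single archive member to stdout):
#   unzip -p mysql-connector-java-<version>.jar META-INF/MANIFEST.MF | manifest_version
```

If the attribute is missing, printing the whole MANIFEST.MF and reading it by eye is the fallback.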
Disabling pooling did nothing, but replacing gwtorm-1.18.jar with 1.21 is getting me somewhere. The upgrade to 2.16.17 is still running!

Upgrading schema to 144 ...
Upgrading schema to 145 ...
Upgrading schema to 146 ...
Upgrading schema to 147 ...
Upgrading schema to 148 ...
Upgrading schema to 149 ...
Upgrading schema to 150 ...
Upgrading schema to 151 ...
Upgrading schema to 152 ...
Upgrading schema to 153 ...
Upgrading schema to 154 ...
Upgrading schema to 155 ...
Upgrading schema to 156 ...
Upgrading schema to 157 ...
Upgrading schema to 158 ...
Upgrading schema to 159 ...
Upgrading schema to 160 ...
Upgrading schema to 161 ...
Upgrading schema to 162 ...
Upgrading schema to 163 ...
Upgrading schema to 164 ...
Upgrading schema to 165 ...
Upgrading schema to 166 ...
Upgrading schema to 167 ...
Upgrading schema to 168 ...
Upgrading schema to 169 ...
Upgrading schema to 170 ...
Migrating data to schema 144 ...
    > Done (269.570 s)
Migrating data to schema 145 ...
    > Done (18.109 s)
Migrating data to schema 146 ...
Migrating accounts
... (287.851 s) scan accounts
... (288.390 s) gc --prune=now
Pack refs: 100% (127734/127734)
Counting objects: 1849397
Finding sources: 100% (1849397/1849397)
Getting sizes: 100% (1059302/1059302)
Compressing objects: 99% (1001264/1001270)
Writing objects: 100% (1849397/1849397)
Selecting commits: 263340
Selecting commits: 37%

<fingers crossed>
(In reply to Denis Roy from comment #54)
> Disabling pooling did nothing, but replacing gwtorm-1.18.jar with 1.21 is
> getting me somewhere. The upgrade to 2.16.17 is still running!

Great news! To assist you even further, we have uploaded this work-in-progress change to the gerrit project, which replaces gwtorm with the custom version that disables batch operation mode for the MySQL dialect driver: [1]. With that change in place and verified on GerritForge-CI, you can now fetch a release.war that already contains the custom gwtorm version, so that you don't have to patch release.war yourself: [2]:

$ wget https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/84365/artifact/gerrit/bazel-bin/release.war
2020-04-24 07:50:43 (6.26 MB/s) - ‘release.war’ saved [76479488/76479488]

$ unzip -t release.war | grep gwtorm
testing: WEB-INF/lib/gwtorm-client.jar OK
testing: WEB-INF/lib/gwtorm-client-src.jar OK

We are preparing a new gerrit release with batch operation mode disabled in gwtorm for the MySQL dialect driver, and are waiting for your final confirmation that this in fact solved the migration.

[1] https://gerrit-review.googlesource.com/c/gerrit/+/178051
[2] https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/84365/artifact/gerrit/bazel-bin/release.war
That was a red herring, since I had patched the wrong war file. Simply running init over and over again apparently got me further along.

But with the unpatched file, it stops here:

... (14258.322 s) Migrated all 276602 accounts to schema 146
Couldn't upgrade schema. Expected if slave and read-only database
ERROR com.google.gerrit.pgm.init.BaseInit : Couldn't upgrade schema. Expected if slave and read-only database
com.google.gwtorm.server.OrmException: update failure on schema_version
    at com.google.gwtorm.schema.sql.SqlDialect.convertError(SqlDialect.java:162)

I'll patch the gerrit.war file and re-run init.
> I'll patch the gerrit.war file and re run init Patched version is init'ing now. It will take a while; I'll report back later.
(In reply to Denis Roy from comment #56)
> That was a red herring, since I had patched the wrong war file. Simply
> running init over and over again apparently got me further along.

That's exactly what another Gerrit user (Typo3) reported on the issue that I linked in my previous comment: just re-running init fixed the MySQL driver breakage for them too. But that sounds like a MySQL driver issue to me, not a Gerrit issue.

Just to let you know, this morning we bumped MySQL Connector/J from version 5.1.43 to 5.1.48 on the stable-2.16 branch [1]. This change has already been merged. The corresponding final artifact for change [1] on GerritForge CI is here: [2]. Can you try that release.war as well? Note, though, that gwtorm is not patched in [2]; only the MySQL driver was upgraded, because the CL that deactivates batch operation mode in the MySQL dialect hasn't been merged yet.

[1] https://gerrit-review.googlesource.com/c/gerrit/+/264252
[2] https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/84439/artifact/gerrit/bazel-bin/release.war
The previous attempt failed with: ... (15029.988 s) Migrated all 276602 accounts to schema 146 Couldn't upgrade schema. Expected if slave and read-only database ERROR com.google.gerrit.pgm.init.BaseInit : Couldn't upgrade schema. Expected if slave and read-only database com.google.gwtorm.server.OrmException: update failure on schema_version at com.google.gwtorm.schema.sql.SqlDialect.convertError(SqlDialect.java:159) at com.google.gwtorm.schema.sql.DialectMySQL.convertError(DialectMySQL.java:242) at com.google.gwtorm.jdbc.JdbcAccess.convertError(JdbcAccess.java:484) at com.google.gwtorm.jdbc.JdbcAccess.update(JdbcAccess.java:230) at com.google.gerrit.server.schema.SchemaVersion.finish(SchemaVersion.java:175) at com.google.gerrit.server.schema.SchemaVersion.migrateData(SchemaVersion.java:156) at com.google.gerrit.server.schema.SchemaVersion.upgradeFrom(SchemaVersion.java:94) at com.google.gerrit.server.schema.SchemaVersion.check(SchemaVersion.java:85) at com.google.gerrit.server.schema.SchemaUpdater.update(SchemaUpdater.java:111) at com.google.gerrit.pgm.init.BaseInit$SiteRun.upgradeSchema(BaseInit.java:389) at com.google.gerrit.pgm.init.BaseInit.run(BaseInit.java:145) at com.google.gerrit.pgm.util.AbstractProgram.main(AbstractProgram.java:61) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.google.gerrit.launcher.GerritLauncher.invokeProgram(GerritLauncher.java:225) at com.google.gerrit.launcher.GerritLauncher.mainImpl(GerritLauncher.java:121) at com.google.gerrit.launcher.GerritLauncher.main(GerritLauncher.java:65) at Main.main(Main.java:28) Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet successfully received from the server was 15,029,843 milliseconds ago. 
The last packet sent successfully to the server was 15,029,843 milliseconds ago. is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem. at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) I'm running now using your release.war file, and jdbc with auto_reconnect=true
No dice. ... (14342.220 s) Migrated all 276602 accounts to schema 146 Couldn't upgrade schema. Expected if slave and read-only database ERROR com.google.gerrit.pgm.init.BaseInit : Couldn't upgrade schema. Expected if slave and read-only database com.google.gwtorm.server.OrmException: update failure on schema_version at com.google.gwtorm.schema.sql.SqlDialect.convertError(SqlDialect.java:159) at com.google.gwtorm.schema.sql.DialectMySQL.convertError(DialectMySQL.java:242) at com.google.gwtorm.jdbc.JdbcAccess.convertError(JdbcAccess.java:484) at com.google.gwtorm.jdbc.JdbcAccess.update(JdbcAccess.java:230) at com.google.gerrit.server.schema.SchemaVersion.finish(SchemaVersion.java:175) at com.google.gerrit.server.schema.SchemaVersion.migrateData(SchemaVersion.java:156) at com.google.gerrit.server.schema.SchemaVersion.upgradeFrom(SchemaVersion.java:94) at com.google.gerrit.server.schema.SchemaVersion.check(SchemaVersion.java:85) at com.google.gerrit.server.schema.SchemaUpdater.update(SchemaUpdater.java:111) at com.google.gerrit.pgm.init.BaseInit$SiteRun.upgradeSchema(BaseInit.java:389) at com.google.gerrit.pgm.init.BaseInit.run(BaseInit.java:145) at com.google.gerrit.pgm.util.AbstractProgram.main(AbstractProgram.java:61) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.google.gerrit.launcher.GerritLauncher.invokeProgram(GerritLauncher.java:225) at com.google.gerrit.launcher.GerritLauncher.mainImpl(GerritLauncher.java:121) at com.google.gerrit.launcher.GerritLauncher.main(GerritLauncher.java:65) at Main.main(Main.java:28) Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet successfully received from the server was 14,342,073 milliseconds ago. 
The last packet sent successfully to the server was 14,342,073 milliseconds ago. is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem. at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at com.mysql.jdbc.Util.handleNewInstance(Util.java:425) at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:989) at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3746) at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2509) at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2680) at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2494) at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1858) at com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2079) at com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2013) at com.mysql.jdbc.PreparedStatement.executeLargeUpdate(PreparedStatement.java:5104) at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1998) at com.google.gwtorm.jdbc.JdbcAccess.updateIndividually(JdbcAccess.java:246) at com.google.gwtorm.jdbc.JdbcAccess.update(JdbcAccess.java:227) ... 
16 more Caused by: java.net.SocketException: Broken pipe (Write failed) at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111) at java.net.SocketOutputStream.write(SocketOutputStream.java:155) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3728) ... 26 more
(In reply to Denis Roy from comment #60)
> No dice.

We hit the MySQL wait_timeout because there is no database activity for hours during schema migration 146, while accounts are migrated from ReviewDb to NoteDb. The database server closes the stale connection because its wait timeout is shorter than the time schema migration 146 needs to complete. Hopefully fixed in [1]. Can you retry the migration with the release.war that GerritForge CI built for this CL [2]? Sorry for the trouble.

[1] https://gerrit-review.googlesource.com/c/gerrit/+/264472
[2] https://gerrit-ci.gerritforge.com/job/Gerrit-verifier-bazel/84542/artifact/gerrit/bazel-bin/release.war
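The condition described above can be checked before starting a long migration. A minimal sketch, assuming the `mysql` command-line client can reach the ReviewDb server; the helper name `idle_exceeds_timeout` is hypothetical, and the 3600-second timeout in the example is an illustrative value, not the actual setting of the server in this thread:

```shell
#!/bin/sh
# Succeed (exit status 0) when a connection that has been idle for IDLE_MS
# milliseconds would already have been closed by a server whose
# wait_timeout is TIMEOUT_S seconds.
# The function name is illustrative; it is not part of Gerrit or MySQL.
idle_exceeds_timeout() {
  idle_ms=$1
  timeout_s=$2
  [ "$idle_ms" -gt $((timeout_s * 1000)) ]
}

# Reading the server-side value needs database access, so it is only shown
# as a comment here:
#   mysql -N -e "SHOW VARIABLES LIKE 'wait_timeout'" | awk '{print $2}'

# 14342073 ms is the idle time from the stack trace; 3600 s is a made-up timeout.
if idle_exceeds_timeout 14342073 3600; then
  echo "connection would be dropped before the migration step finishes"
fi
```

The comparison only says whether a single idle connection survives; it does not replace the fix in the CL linked above.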
I had mentioned that in comment 18 :) Running the latest release.war now.
Seems to be getting further: ... (14420.614 s) migrated 100% (276600/276602) accounts ... (14420.622 s) Migrated all 276602 accounts to schema 146 > Done (14420.524 s) Migrating data to schema 147 ... > Done (64.629 s) Migrating data to schema 148 ... > Done (11.100 s) Migrating data to schema 149 ... > Done (0.007 s) Migrating data to schema 150 ... > Done (0.008 s) Migrating data to schema 151 ... > Done (0.031 s) Migrating data to schema 152 ... > Done (1.444 s) Migrating data to schema 153 ... > Done (0.675 s) Migrating data to schema 154 ... Collecting accounts: 276602 Migrating accounts to NoteDb: 100% (276602/276602) > Done (5700.368 s) Migrating data to schema 155 ... > Done (429.224 s) Migrating data to schema 156 ... > Done (0.011 s) Migrating data to schema 157 ... > Done (0.357 s) Migrating data to schema 158 ... > Done (0.018 s) Migrating data to schema 159 ... Migrate draft changes to private changes (default is work-in-progress) [y/N]? Replace draft changes with work_in_progress changes ... done > Done (0.250 s) Migrating data to schema 160 ... Removing "My Drafts" menu items: 202 > Done (1042.199 s) Migrating data to schema 161 ... > Done (0.635 s) Migrating data to schema 162 ... > Done (0.037 s) Migrating data to schema 163 ... > Done (48.575 s) Migrating data to schema 164 ... > Done (0.030 s) Migrating data to schema 165 ... > Done (0.012 s) Migrating data to schema 166 ... > Done (66.250 s) Migrating data to schema 167 ... > Done (68.363 s) Migrating data to schema 168 ... > Done (0.009 s) Migrating data to schema 169 ... Migrating projects: 100% (1714/1714) Skipped 1713 projects with no legacy comments Couldn't upgrade schema. Expected if slave and read-only database ERROR com.google.gerrit.pgm.init.BaseInit : Couldn't upgrade schema. 
Expected if slave and read-only database com.google.gwtorm.server.OrmException: Migration failed at com.google.gerrit.server.schema.Schema_169.migrateData(Schema_169.java:89) at com.google.gerrit.server.schema.Schema_169.migrateData(Schema_169.java:58) at com.google.gerrit.server.schema.SchemaVersion.migrateData(SchemaVersion.java:167) at com.google.gerrit.server.schema.SchemaVersion.upgradeFrom(SchemaVersion.java:100) at com.google.gerrit.server.schema.SchemaVersion.check(SchemaVersion.java:87) at com.google.gerrit.server.schema.SchemaUpdater.update(SchemaUpdater.java:114) at com.google.gerrit.pgm.init.BaseInit$SiteRun.upgradeSchema(BaseInit.java:389) at com.google.gerrit.pgm.init.BaseInit.run(BaseInit.java:145) at com.google.gerrit.pgm.util.AbstractProgram.main(AbstractProgram.java:61) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.google.gerrit.launcher.GerritLauncher.invokeProgram(GerritLauncher.java:225) at com.google.gerrit.launcher.GerritLauncher.mainImpl(GerritLauncher.java:121) at com.google.gerrit.launcher.GerritLauncher.main(GerritLauncher.java:65) at Main.main(Main.java:28)
Re-running init is now really fast but fails at the same place. Initializing plugins. Upgrading schema to 169 ... Upgrading schema to 170 ... Migrating data to schema 169 ... Migrating projects: 100% (1714/1714) Skipped 1713 projects with no legacy comments Couldn't upgrade schema. Expected if slave and read-only database ERROR com.google.gerrit.pgm.init.BaseInit : Couldn't upgrade schema. Expected if slave and read-only database com.google.gwtorm.server.OrmException: Migration failed at com.google.gerrit.server.schema.Schema_169.migrateData(Schema_169.java:89) at com.google.gerrit.server.schema.Schema_169.migrateData(Schema_169.java:58) at com.google.gerrit.server.schema.SchemaVersion.migrateData(SchemaVersion.java:167) at com.google.gerrit.server.schema.SchemaVersion.upgradeFrom(SchemaVersion.java:100) at com.google.gerrit.server.schema.SchemaVersion.check(SchemaVersion.java:87) at com.google.gerrit.server.schema.SchemaUpdater.update(SchemaUpdater.java:114) at com.google.gerrit.pgm.init.BaseInit$SiteRun.upgradeSchema(BaseInit.java:389) at com.google.gerrit.pgm.init.BaseInit.run(BaseInit.java:145) at com.google.gerrit.pgm.util.AbstractProgram.main(AbstractProgram.java:61) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.google.gerrit.launcher.GerritLauncher.invokeProgram(GerritLauncher.java:225) at com.google.gerrit.launcher.GerritLauncher.mainImpl(GerritLauncher.java:121) at com.google.gerrit.launcher.GerritLauncher.main(GerritLauncher.java:65) at Main.main(Main.java:28) There is nothing in the error log.
(In reply to Denis Roy from comment #64)
> Re-running init is now really fast but fails at the same place.
>
> Initializing plugins.
>
> Upgrading schema to 169 ...
> Upgrading schema to 170 ...
> Migrating data to schema 169 ...
> Migrating projects: 100% (1714/1714)
> Skipped 1713 projects with no legacy comments

This looks good.

Schema migration 169 migrates NoteDb inline comments to JSON format. Given that you have not migrated to NoteDb yet, it should be a no-op for your Gerrit installation. Unfortunately, something went wrong for one project: 1713/1714. The relevant part of the code is here:

for (Project.NameKey project : projects) {
  try (Repository repo = repoManager.openRepository(project)) {
    ProjectMigrationResult progress = migrator.migrateProject(project, repo, false);
    skipped += progress.skipped;
  } catch (IOException e) {
    ok = false;
    logger.atWarning().log("Error migrating project " + project, e);
  }
  pm.update(1);
}
pm.endTask();
ui.message(
    "Skipped " + skipped + " project" + (skipped == 1 ? "" : "s") + " with no legacy comments");
if (!ok) {
  // => we got that exception
  throw new OrmException("Migration failed");
}

Let me raise the severity level to error so that we better understand the breakage. I will also turn off the piece of code that fails the migration in this case. I will upload a patch in a few minutes.
(In reply to David Ostrovsky from comment #65) > (In reply to Denis Roy from comment #64) > > Re-running init is now really fast but fails at the same place. > > > > Initializing plugins. > > > > Upgrading schema to 169 ... > > Upgrading schema to 170 ... > > Migrating data to schema 169 ... > > Migrating projects: 100% (1714/1714) > > Skipped 1713 projects with no legacy comments > > This looks good. > > Schema migration 169 should be a no op for your gerrit installation. > It is migrating NoteDb inline comments to JSON format. > [...] > Will upload a patch in few minutes. The patch is here: [1] and the new release with release-v2.16.18-20-ga95a2b7f53.war is here: [2]. [1] https://gerrit-review.googlesource.com/c/gerrit/+/264480 [2] https://github.com/davido/gerrit/releases/tag/v2.16.18-20-ga95a2b7f53
Many thanks for all your efforts. I think we've found the problem: Migrating projects: 11% ( 189/1714)ERROR com.google.gerrit.server.schema.Schema_169 : Error migrating project (ignoring) elogbook/elogbook [ERROR: UNUSED LOG ARGUMENTS] Migrating projects: 100% (1714/1714) I'll look at fixing this Monday.
I see "ERROR: UNUSED LOG ARGUMENTS" in the log. The schema migration is misusing the logger, and as a result we're not getting the exception trace. This needs to be fixed separately.
(In reply to Denis Roy from comment #67)
> Many thanks for all your efforts. I think we've found the problem:
>
> Migrating projects: 11% ( 189/1714)ERROR
> com.google.gerrit.server.schema.Schema_169 : Error migrating project
> (ignoring) elogbook/elogbook [ERROR: UNUSED LOG ARGUMENTS]

David Pursehouse fixed the logging issue [1], and here is a new release to try: [2].

But looking into the project "elogbook/elogbook", it is corrupt in some way. This project doesn't exist; instead, I've found another project, www.eclipse.org/elogbook, without any changes. I can clone that project:

$ git clone git://git.eclipse.org/gitroot/www.eclipse.org/elogbook
Cloning into 'elogbook'...
remote: Enumerating objects: 8, done.
remote: Total 8 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (8/8), done.

But it's empty and has only a single commit:

$ git log
commit 3e9a9811fbf50f001617b7118288d29b0e5b03a5 (HEAD -> master, origin/master, origin/HEAD)
Author: Webmaster [...]
Date: Mon Jun 26 15:43:39 2017 -0400

    Initial commit by Webmaster

[1] https://gerrit-review.googlesource.com/c/gerrit/+/264493
[2] https://github.com/davido/gerrit/releases/tag/v2.16.18-23-g7360297a66
(In reply to David Ostrovsky from comment #69) > (In reply to Denis Roy from comment #67) > > Many thanks for all your efforts. I think we've found the problem: > > > > Migrating projects: 11% ( 189/1714)ERROR > > com.google.gerrit.server.schema.Schema_169 : Error migrating project > > (ignoring) elogbook/elogbook [ERROR: UNUSED LOG ARGUMENTS] > > David Pursehouse fixed the logging issue: [1] and here is new release to > try: [2]. > > But looking into the project: "elogbook/elogbook", it is in some way > corrupt. > > This project doesn't exist, instead, I've found another project: > > www.eclipse.org/elogbook > > without any changes I found a project restructuring review [1] for technology.elogbook stating: "With this review, we will merge "eLogbook@openK" into "Eclipse openK User Modules"; all content from eLogBook@openK will be moved to Eclipse openK User Modules, and the eLogBook@openK project will terminated." [2] looks like the new repository for this project [1] https://projects.eclipse.org/projects/technology.elogbook/reviews/restructuring-review [2] https://git.eclipse.org/r/#/admin/projects/openk-usermodules/org.eclipse.openk-usermodules.elogbook
Yep, the new project is here: https://git.eclipse.org/r/#/admin/projects/openk-usermodules/org.eclipse.openk-usermodules.elogbook

I found a single change in the MySQL database assigned to the old project elogbook/elogbook. I changed that:

update changes set dest_project_name = "openk-usermodules/org.eclipse.openk-usermodules.elogbook" where dest_project_name = "elogbook/elogbook";

But that has not fixed the indexer issue. I don't know where it's finding a reference to elogbook/elogbook.
Is there anything under $gerrit_site/git/elogbook/elogbook ? e.g. an empty repository or maybe some corrupt remainders from a former empty repository
That did it... There was garbage left over from the move. init is running all the necessary indexers, looks good from here. Once it's done, I will re-sync all the data to staging, restart the upgrade from scratch with https://github.com/davido/gerrit/releases/tag/v2.16.18-23-g7360297a66 and report back. If that is successful, and we test it to be clean, I'll schedule an upgrade of the production site. Again, many thanks for the assistance.
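Leftovers like this can be looked for ahead of an upgrade. A rough sketch, assuming repositories live as `*.git` directories below the site's `git/` directory; the thread does not say what the garbage actually looked like, so the "HEAD plus objects/ plus refs/" test and both function names are illustrative assumptions:

```shell
#!/bin/sh
# Succeed when DIR has the minimal layout of a bare Git repository:
# a HEAD file plus objects/ and refs/ directories.
looks_like_bare_repo() {
  [ -f "$1/HEAD" ] && [ -d "$1/objects" ] && [ -d "$1/refs" ]
}

# Print every "*.git" directory below BASE that does not pass the check.
# -prune keeps find from descending into the repositories themselves.
list_suspect_repos() {
  find "$1" -type d -name '*.git' -prune | while IFS= read -r dir; do
    looks_like_bare_repo "$dir" || printf '%s\n' "$dir"
  done
}

# Example (the path is a placeholder for the real site directory):
#   list_suspect_repos /path/to/gerrit_site/git
```

A directory that shows up in the output is only a candidate for inspection; it should be compared against the project list before anything is removed.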
Yeah, great news :-) Do you have a 2.16 version of all the plugins you are using? Or do you have plugins of your own that need to be upgraded to 2.16?
The only non-stock plugin we're using is our ECA validator. I believe we have it running on an internal 2.16 already, but I will confirm.
(In reply to Matthias Sohn from comment #74) > yeah, great news :-) +1 ;-)
Our plugin is incompatible with 2.16 because it uses ProjectControl and RefControl, which have been moved to another package and made package-private (via https://gerrit-review.googlesource.com/c/gerrit/+/153212). Any hint about how to work around this? The code is here: https://github.com/EclipseFdn/gerrit-eca-plugin/blob/master/eclipse-cla/src/main/java/org/eclipse/foundation/gerrit/validation/EclipseCommitValidationListener.java#L355 I've fixed the other (minor) issues / deletion.
(In reply to Mikaël Barbero from comment #77) > I've fixed the other (minor) issues / deletion. See draft PR https://github.com/EclipseFdn/gerrit-eca-plugin/pull/15
(In reply to Mikaël Barbero from comment #77) > Our plugin is incompatible with 2.16 because it uses Projectcontrol and > RefControl which has been moved to another package and made package private > (via https://gerrit-review.googlesource.com/c/gerrit/+/153212). Any hint > about how to workaround this? I migrated your plugin to permission backend in this PR: [1], to be compatible with 2.16 branch. I have not tested it, though. [1] https://github.com/EclipseFdn/gerrit-eca-plugin/pull/16
As a data point, the init still fails when I start over from scratch: Migrating data to schema 144 ... Couldn't upgrade schema. Expected if slave and read-only database ERROR com.google.gerrit.pgm.init.BaseInit : Couldn't upgrade schema. Expected if slave and read-only database com.google.gwtorm.server.OrmException: update failure on schema_version at com.google.gwtorm.schema.sql.SqlDialect.convertError(SqlDialect.java:162) at com.google.gwtorm.schema.sql.DialectMySQL.convertError(DialectMySQL.java:232) at com.google.gwtorm.jdbc.JdbcAccess.convertError(JdbcAccess.java:489) at com.google.gwtorm.jdbc.JdbcAccess.update(JdbcAccess.java:232) at com.google.gerrit.server.schema.SchemaVersion.finish(SchemaVersion.java:175) at com.google.gerrit.server.schema.SchemaVersion.migrateData(SchemaVersion.java:156) at com.google.gerrit.server.schema.SchemaVersion.upgradeFrom(SchemaVersion.java:94) at com.google.gerrit.server.schema.SchemaVersion.check(SchemaVersion.java:85) at com.google.gerrit.server.schema.SchemaUpdater.update(SchemaUpdater.java:111) at com.google.gerrit.pgm.init.BaseInit$SiteRun.upgradeSchema(BaseInit.java:389) at com.google.gerrit.pgm.init.BaseInit.run(BaseInit.java:145) at com.google.gerrit.pgm.util.AbstractProgram.main(AbstractProgram.java:61) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.google.gerrit.launcher.GerritLauncher.invokeProgram(GerritLauncher.java:225) at com.google.gerrit.launcher.GerritLauncher.mainImpl(GerritLauncher.java:121) at com.google.gerrit.launcher.GerritLauncher.main(GerritLauncher.java:65) at Main.main(Main.java:28) Caused by: java.sql.SQLException: No operations allowed after statement closed. 
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:964) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:897) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:886) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:860) at com.mysql.jdbc.StatementImpl.checkClosed(StatementImpl.java:442) at com.mysql.jdbc.PreparedStatement.clearBatch(PreparedStatement.java:1003) at com.mysql.jdbc.PreparedStatement.executeBatchInternal(PreparedStatement.java:1266) at com.mysql.jdbc.StatementImpl.executeBatch(StatementImpl.java:970) at com.google.gwtorm.schema.sql.SqlDialect.executeBatch(SqlDialect.java:448) at com.google.gwtorm.jdbc.JdbcAccess.execute(JdbcAccess.java:460) at com.google.gwtorm.jdbc.JdbcAccess.updateAsBatch(JdbcAccess.java:276) at com.google.gwtorm.jdbc.JdbcAccess.update(JdbcAccess.java:227) I'll keep running init to see how many times it takes for success.
(In reply to Denis Roy from comment #80)
> As a data point, the init still fails when I start over from scratch:
>
> Migrating data to schema 144 ...
> Couldn't upgrade schema. Expected if slave and read-only database
> [...]
> Caused by: java.sql.SQLException: No operations allowed after statement
> closed.
> at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:964)

During the review discussion on the wait timeout fix CL [1], reviewers decided to abandon the global multiple-connection approach and preferred a less intrusive approach that only fixed the Schema_146 migration. Apparently that's not enough for your site (or you have a small wait timeout value). Anyway, I restored [1], rebased it on top of the stable-2.16 branch and built yet another release for you to try: [2]. With the release [2], you shouldn't see any wait timeout exceptions again.

[1] https://gerrit-review.googlesource.com/c/gerrit/+/264472
[2] https://github.com/davido/gerrit/releases/tag/v2.16.18-28-g6897a73143
> With the release: [2] you shouldn't see any wait timeout exceptions again. > > [1] https://gerrit-review.googlesource.com/c/gerrit/+/264472 > [2] https://github.com/davido/gerrit/releases/tag/v2.16.18-28-g6897a73143 Confirmed! With a fresh data snapshot, init is running smoothly, albeit with this step taking a long time: Migrating accounts to NoteDb: 46% (130619/279351)
> Migrating accounts to NoteDb: 46% (130619/279351) Migrating accounts to NoteDb: 61% (172281/279351) Is it normally that slow?
From your output, it looks like you have around 280k accounts, is that correct? The migration of all accounts to NoteDb would create *a lot* of refs on the All-Users.git repository, at least one per account. Is the All-Users.git repository on a fast SSD disk? Can you check the system load, CPU and memory utilisation?
281k accounts, correct. When init starts the migration of accounts to NoteDb, it does so at about 100 accounts/sec At about 65% (193,000 accounts) it's down to one account every few seconds. Methinks they're going into some array in RAM... I threw more Xms and Xmx memory at init and that sped things up considerably to about 60%. Then it crawls. The java process is using 50% of one core, and there's no I/O to speak of. The java process is using 3.1G of RAM
> 281k accounts, correct. 279,351
(In reply to Denis Roy from comment #85)
> 281k accounts, correct.
>
> When init starts the migration of accounts to NoteDb, it does so at about
> 100 accounts/sec
>
> At about 65% (193,000 accounts) it's down to one account every few seconds.
> Methinks they're going into some array in RAM...
>
> I threw more Xms and Xmx memory at init and that sped things up considerably
> to about 60%. Then it crawls. The java process is using 50% of one core, and
> there's no I/O to speak of. The java process is using 3.1G of RAM

How big is Xmx? Just give it as much as you can. It could be the garbage collector running all the time if it can't free up enough memory.

You can run jstack <javapid> to show where the process is when it crawls.
You can run jmap -histo <javapid> to show heap usage statistics.
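Several dumps taken a few seconds apart are usually more telling than a single one, because a frame that appears in every dump is where the time goes. A small sketch of that routine; the function name, the output file naming and the `JSTACK` override variable are assumptions made for illustration, only `jstack <pid>` itself comes from the advice above:

```shell
#!/bin/sh
# Take COUNT thread dumps of process PID, INTERVAL seconds apart, and write
# each one to OUTDIR/jstack-<n>.txt.
# JSTACK may be set to another command, which makes the function testable
# on a machine without a JDK; by default the real jstack is used.
collect_thread_dumps() {
  pid=$1 count=$2 interval=$3 outdir=$4
  mkdir -p "$outdir" || return 1
  n=1
  while [ "$n" -le "$count" ]; do
    "${JSTACK:-jstack}" "$pid" > "$outdir/jstack-$n.txt" 2>&1
    # Sleep between dumps, but not after the last one.
    [ "$n" -lt "$count" ] && sleep "$interval"
    n=$((n + 1))
  done
}

# Example: five dumps, ten seconds apart (the pid is a placeholder):
#   collect_thread_dumps 12345 5 10 /tmp/gerrit-init-dumps
```

The same loop can be pointed at `jmap -histo` by setting `JSTACK="jmap"` is not enough, since jmap needs the extra `-histo` argument; a second small function would be the simpler route.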
(In reply to Denis Roy from comment #83)
> > Migrating accounts to NoteDb: 46% (130619/279351)
>
> Migrating accounts to NoteDb: 61% (172281/279351)
>
> Is it normally that slow?

I cannot reproduce the performance problem. I set up a vanilla 2.14 site with auth.type = DEVELOPMENT_BECOME_ANY_ACCOUNT. The admin user is also created by the init program. I wrote this script to generate 300,000 accounts:

drop procedure if exists genAccounts;
DELIMITER //
CREATE PROCEDURE genAccounts()
BEGIN
  DECLARE i INT DEFAULT 2;
  WHILE (i <= 300000) DO
    insert into accounts (registered_on, full_name, preferred_email, inactive, account_id)
    values (CURRENT_TIMESTAMP(), concat('Administrator', i), concat('john', i, '@doe.com'), 'N', i);
    SET i = i+1;
  END WHILE;
END;
//
DELIMITER ;
CALL genAccounts();

It took ca. 31 min to run the database script:

MariaDB [gerrit]> CALL genAccounts();
Query OK, 299999 rows affected (31 min 58.245 sec)

I migrated the Gerrit site using a patched release from the stable-2.15 branch (the patched 2.16 release you are using is similar). On my laptop (32GB RAM) schema 146 migration took 31 minutes:

[...]
Migrating data to schema 146 ...
Migrating accounts
... (0.357 s) scan accounts
... (0.803 s) gc --prune=now
Pack refs: 100% (3/3)
Counting objects: 11
Finding sources: 100% (11/11)
Getting sizes: 100% (8/8)
Compressing objects: 100% (1838/1838)
Writing objects: 100% (11/11)
Selecting commits: 100% (3/3)
Building bitmaps: 100% (3/3)
Prune loose objects also found in pack files: 100% (13/13)
Prune loose, unreferenced objects: 100% (13/13)
... using 8 threads ...
... (3.449 s) migrated 0% (100/300000) accounts
... (3.590 s) migrated 0% (200/300000) accounts
... (3.823 s) migrated 0% (300/300000) accounts
... (4.028 s) migrated 0% (400/300000) accounts
... (4.323 s) migrated 0% (500/300000) accounts
... (4.624 s) migrated 0% (600/300000) accounts
... (4.888 s) migrated 0% (700/300000) accounts
... (5.296 s) migrated 0% (800/300000) accounts
... (5.634 s) migrated 0% (900/300000) accounts
... (6.121 s) migrated 0% (1000/300000) accounts
... (6.121 s) pack refs
Pack refs: 100% (1008/1008)
... (6.609 s) migrated 0% (1100/300000) accounts
[...]
Pack refs: 100% (299002/299002)
... (1855.449 s) migrated 100% (299100/300000) accounts
Pack refs: 100% (299002/299002)
... (1858.968 s) migrated 100% (299200/300000) accounts
... (1859.168 s) migrated 100% (299300/300000) accounts
... (1859.424 s) migrated 100% (299400/300000) accounts
... (1859.763 s) migrated 100% (299500/300000) accounts
... (1860.094 s) migrated 100% (299600/300000) accounts
... (1860.530 s) migrated 100% (299700/300000) accounts
... (1861.251 s) migrated 100% (299800/300000) accounts
... (1861.949 s) migrated 100% (299900/300000) accounts
... (1863.030 s) Migrated all 299994 accounts to schema 146
> Done (1862.693 s)
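To put the runs reported in this thread side by side, the account counts and the "Done" timings from the logs can be turned into average rates. A small sketch; the helper name is made up, while the numbers are copied from the init output quoted earlier in the thread:

```shell
#!/bin/sh
# Print the average migration rate in accounts per second, rounded to one
# decimal place. awk is used because POSIX shell arithmetic is integer-only.
accounts_per_second() {
  awk -v n="$1" -v s="$2" 'BEGIN { printf "%.1f\n", n / s }'
}

accounts_per_second 276602 14420.524   # schema 146 on the Eclipse staging data
accounts_per_second 276602 5700.368    # schema 154 on the same data
accounts_per_second 299994 1862.693    # schema 146 on the laptop test above
```

This works out to roughly 19, 49 and 161 accounts per second respectively. These are averages over the whole step, so they hide the slowdown towards the end that was reported for the real data.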
Another suggestion is to look at All-Users.git and check:
- which storage it lies on
- whether it has been GCed or not

In my experience, that repo is under high pressure with NoteDb, and if it is on slow storage or has not been GCed for weeks, months or years, then any migration will take ages.
> - Is it GCed or not

More on this point: when migrating from v2.13 to v2.14, Gerrit already migrated the accounts to All-Users.git, but that was a "half-migration" that created the refs without migrating all the information.

However, after you migrated to v2.14, All-Users.git was, I believe, left in very bad shape, with lots of loose refs.

Just count the number of files under the All-Users.git repo:
find All-Users.git -type f | wc -l

If the above 'find' returns anything > 100, then run 'git gc' on All-Users.git *before* running the migration again.

HTH

Luca.
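The check above can be wrapped so it is easy to repeat before each migration attempt. A minimal sketch built around the same `find ... | wc -l` pipeline; the function names are hypothetical and the default threshold of 100 is simply the figure suggested above:

```shell
#!/bin/sh
# Count regular files below a repository directory, using the same
# find | wc pipeline as the command above (tr strips wc's padding).
count_repo_files() {
  find "$1" -type f | wc -l | tr -d ' '
}

# Succeed when the file count exceeds THRESHOLD (default 100), i.e. when
# running gc before the migration looks worthwhile.
needs_gc() {
  [ "$(count_repo_files "$1")" -gt "${2:-100}" ]
}

# Example (the path is a placeholder for the real site directory):
#   repo=/path/to/gerrit_site/git/All-Users.git
#   needs_gc "$repo" && git -C "$repo" gc
```

The file count is only a heuristic for "many loose objects and refs"; a freshly packed repository with a handful of pack files stays well under the threshold.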
(In reply to Luca Milanesio from comment #90)
> > - Is it GCed or not
> Again on ^^^^^^^^^^^^
>
> When migrating from v2.13 to v2.14, Gerrit already migrated the accounts to
> All-Users.git, but that was a "half-migration" that created the refs but not
> migrated all the information.
>
> However, after you migrated to v2.14, the All-Users.git was left in a very
> bad shape I believe, with lots of loose refs.
>
> Just count the number of files under the All-Users.git repo:
> find All-Users.git -type f | wc -l
>
> If the above 'find' returns anything > 100, then run the 'git gc' of
> All-Users.git *before* running the migration again.

Since https://gerrit-review.googlesource.com/c/gerrit/+/224833/ schema migration 146 first runs a full gc, then packs refs after every 1000 migrated accounts, and it runs parallelized. By default it uses one thread per processor; you can tune the thread count via a system property, e.g. -Dthreadcount=42
Denis mentioned:
> At about 65% (193,000 accounts) it's down to one account every few seconds.

If the GC is done once every 1000 accounts, then it would give some relief only once an hour. Possibly the GC interval needs to be configurable?

Denis also mentioned:
> The java process is using 50% of one core, and there's no I/O to speak of. The java process is using 3.1G of RAM

That means he is not configuring any parallelism. If the bottleneck is the repo, though, would running in parallel actually make it any better?

@Matthias what do you think?

Luca.
(In reply to Luca Milanesio from comment #92)
> Denis mentioned:
> > At about 65% (193,000 accounts) it's down to one account every few seconds.
>
> If the GC is done once every 1000 accounts, then it would give some relief
> only once an hour. Possibly the GC interval needs to be configurable?

No, in my tests on a MacBook with an SSD I could migrate accounts at a constant rate of ~100 accounts/second. I tested this with up to 128k generated accounts (see the commit message). If the storage is not an SSD but some network-attached storage, this is probably slower.

> Denis also mentioned:
> > The java process is using 50% of one core, and there's no I/O to speak of. The java process is using 3.1G of RAM
>
> That means he is not configuring any parallelism. If the bottleneck is the
> repo though, running in parallel would actually make any better?
>
> @Matthias what do you think?

Create a couple of thread dumps while it is progressing only slowly, to get some data on what's going on.

You may try moving All-Users.git to a RAM disk to speed up the migration. Make sure you create a backup copy first, so there's a way back in case there's a problem with the RAM disk or the machine crashes. I guess the copy on the RAM disk could be symlinked into the Gerrit site.
gerrit@gerrit-vm1:~> find /home/data/git_stg/All-Users.git -type f | wc -l 1028283 The All-Users.git repo is on NFS. As mentioned, after about 160,000 accounts, things crawl and there's no real I/O to speak of. And also, as mentioned, I thought that was gc'd during init? I'm running git gc --aggressive, will re-run init afterwards.
> The All-Users.git repo is on NFS

NFS isn't the best place to do this type of operation: it is notoriously slow and should be used only when you have an HA setup.

Can you follow Matthias' suggestion: copy it to a local SSD or, best, to a ramdisk and then:
- GC the repo
- Run the migration again

HTH
Luca.
> NFS isn't the best place to do this type of operation: it is
> notoriously slow and should be used only when you have an HA setup.

Well, we have a standby.

> Can you follow Matthias' suggestion: copy it to a local SSD or, best, to a
> ramdisk and then:
> - GC the repo
> - Run the migration again

Made an 8G tmpfs, moved All-Users.git onto it, and ran git gc --aggressive.

find All-Users.git -type f | wc -l reported 12.

Ran init with Xms12g and Xmx16g. It starts off blazing fast, 500/sec until about 123,000 accounts, then it slows down very quickly. Right now, at 166,000, it's about 3 accounts/sec and still getting slower.

Could there be something wrong with our All-Users repo?
> Made an 8G tmpfs, moved All-Users.git onto it, git gc --aggressive.
> find All-Users.git -type f | wc -l reported 12.

That looks good. By the way, did you check how bad it was *before* the GC?

> Ran init with Xms12g and Xmx16g. It starts off blazing fast -- 500/sec until about 123,000 accounts, then it slows down very fast. Right now, at 166,000, it's about 3 accounts/sec and slowly slowing down, if that makes sense.
> Could there be something wrong with our All-Users repo?

I would suggest two more things:
1. Get a few thread dumps to understand where the process is stuck (jstack the pid)
2. Run a `find All-Users.git -type f | wc -l` when it starts getting slower

Once we have 1. and 2. we will have more ideas about what's wrong. As DavidO said, it should complete in around 30 minutes, not take hours or days!
> NFS isn't the best place were to do this type of operations: it is > notoriously very slow and should be used only when you have an HA setup. > Well, we have a standby. I would recommend to use replication to the standby node rather than NFS. NFS is useful if you have them active at the same time, or active failure. For active / standby, replication is typically better. You won't have to pay the price of a slower and more problematic access to a shared NFS. Relying on replication rather than NFS would also eliminate the SPOF of the shared disk. HTH Luca.
(In reply to Denis Roy from comment #96)
> > NFS isn't the best place were to do this type of operations: it is
> > notoriously very slow and should be used only when you have an HA setup.
>
> Well, we have a standby.
>
> > Can you follow Matthias' suggestion: copy it to a local SSD or, best, to a
> > ramdisk and then:
> > - GC the repo
> > - Run the migration again
>
> Made an 8G tmpfs, moved All-Users.git onto it, git gc --aggressive.
>
> find All-Users.git -type f | wc -l reported 12.
>
> Ran init with Xms12g and Xmx16g. It starts of blazing fast -- 500/sec until
> about 123,000 accounts, then it slows down very fast. Right now, at 166,000,
> it's about 3 accounts/sec and slowly slowing down, if that makes sense.
>
> Could there be something wrong with our All-Users repo?

cd to All-Users.git and run

$ git count-objects -v

On a fully packed repository this looks like

count: 0
size: 0
in-pack: 12794617
packs: 1
size-pack: 3718534
prune-packable: 0
garbage: 0
size-garbage: 0

If count > 10k it's time to run gc to pack the loose objects into pack files. Another reason to run gc is a large number of packs; performance typically degrades at around 150-200 packs.

If that's the case, run git gc on the repo. That should also work while the migration is running, though it is slower than when gc runs alone.
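The rule of thumb above is easy to automate. A small sketch in Python that parses the text printed by git count-objects -v and applies the thresholds mentioned in this comment (more than 10k loose objects, or a pack count in the 150-200 range; 150 is picked here as an assumption). The function names are invented for this example.

```python
def parse_count_objects(output):
    """Parse `git count-objects -v` output into a dict of integer fields.

    Lines without a plain integer value (for example a leading
    "warning: garbage found: ..." line) are skipped.
    """
    stats = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[key.strip()] = int(value)
    return stats


def needs_gc(stats, max_loose=10_000, max_packs=150):
    """Apply the rule of thumb: too many loose objects or too many packs."""
    return stats.get("count", 0) > max_loose or stats.get("packs", 0) > max_packs
```

The text to parse could be obtained with something like subprocess.run(["git", "count-objects", "-v"], cwd=repo, capture_output=True, text=True).stdout. Note that this only works for the plain -v output; with -H the sizes are printed in human-readable units and would be skipped by the integer check.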
Created attachment 282648 [details] jstack Here is a series of jstacks, with a second or so between.
> cd to the All-Users.git and run
> $ git count-objects -v
>
> on a fully packed repository this looks like
>
> count: 0
> size: 0
> in-pack: 12794617
> packs: 1
> size-pack: 3718534
> prune-packable: 0
> garbage: 0
> size-garbage: 0
>
> If count > 10k it's time to run gc to pack the loose objects into pack files.
> Another reason to run gc is when the number of packs is large, typically
> around 150-200 packs performance degrades
>
> If that's the case run git gc on the repo, that should also work while the
> migration is running though it's slower than if it's running alone.

warning: garbage found: ./objects/pack/pack-7b25f23399cd45ed34bbe3c322a0f9539075e57c.bitmap
count: 303668
size: 1214672
in-pack: 2144092
packs: 1
size-pack: 258415
prune-packable: 0
garbage: 1
size-garbage: 6960

I ran git gc --aggressive on the repo prior to the upgrade, and nothing other than init is writing to it.

Running git gc again...
(In reply to Denis Roy from comment #101) > > cd to the All-Users.git and run > > $ git count-objects -v > > > > on a fully packed repository this looks like > > > > count: 0 > > size: 0 > > in-pack: 12794617 > > packs: 1 > > size-pack: 3718534 > > prune-packable: 0 > > garbage: 0 > > size-garbage: 0 > > > > If count > 10k it's time to run gc to pack the loose objects into pack files. > > Another reason to run gc is when the number of packs is large, typically > > around 150-200 packs performance degrades > > > > If that's the case run git gc on the repo, that should also work while the > > migration is running though it's slower than if it's running alone. > > > warning: garbage found: > ./objects/pack/pack-7b25f23399cd45ed34bbe3c322a0f9539075e57c.bitmap > count: 303668 this means you have 303668 loose objects (files under the objects directory) and this is causing the slowness. These objects have been created by init so far > size: 1214672 > in-pack: 2144092 > packs: 1 > size-pack: 258415 > prune-packable: 0 > garbage: 1 > size-garbage: 6960 there is 7MB of garbage which can be pruned by gc, that's not causing the slowness > > > > I ran git dc --aggressive on the repo prior to the upgrade, and nothing else > other than init is writing to it. > > Running git gc again...
(In reply to Denis Roy from comment #100) > Created attachment 282648 [details] > jstack > > Here is a series of jstacks, with a second or so between. schema migration Schema_154 is updating account information in notedb, this is probably slow since there are a ton of loose objects
Yep, see in your stack trace: "main" #1 prio=5 os_prio=0 tid=0x00007fd00000a000 nid=0x79e4 runnable [0x00007fd009bb3000] java.lang.Thread.State: RUNNABLE at java.io.UnixFileSystem.getBooleanAttributes0(Native Method) at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242) at java.io.File.isDirectory(File.java:849) at org.eclipse.jgit.internal.storage.file.RefDirectory$LooseScanner.scanTree(RefDirectory.java:483) at org.eclipse.jgit.internal.storage.file.RefDirectory$LooseScanner.scanTree(RefDirectory.java:489) at org.eclipse.jgit.internal.storage.file.RefDirectory$LooseScanner.scanTree(RefDirectory.java:489) at org.eclipse.jgit.internal.storage.file.RefDirectory$LooseScanner.scan(RefDirectory.java:445) at org.eclipse.jgit.internal.storage.file.RefDirectory.getLooseRefs(RefDirectory.java:310) As you move further with the upgrade, it will get worse, because the number of loose refs would increase further. @Matthias developed a fix that every 1000 users would automatically do a GC: are you running the latest version of v2.16.x with the fix? Luca.
> @Matthias developed a fix that every 1000 users would automatically do a GC: > are you running the latest version of v2.16.x with the fix? Running git gc --aggressive seems to have helped. It's at 91% now. I'm using DavidO's https://github.com/davido/gerrit/releases/tag/v2.16.18-28-g6897a73143
https://git.eclipse.org/gerrit-staging/q/status:open
Wow! Congrats!
Many thanks to everyone that chipped in. I am getting one 500 error: [2020-05-01 11:10:23,894] [HTTP GET /gerrit-staging/config/server/top-menus (droy from 24.202.134.63)] ERROR com.google.gerrit.httpd.restapi.RestApiServlet : Error in GET /gerrit-staging/config/server/top-menus com.google.inject.ProvisionException: Unable to provision, see the following errors: 1) Error injecting constructor, java.lang.NoSuchMethodError: com.google.gerrit.server.CurrentUser.getCapabilities()Lcom/google/gerrit/server/account/CapabilityControl; at com.googlesource.gerrit.plugins.javamelody.MonitoringTopMenu.<init>(Unknown Source) while locating com.googlesource.gerrit.plugins.javamelody.MonitoringTopMenu while locating com.google.gerrit.extensions.webui.TopMenu annotated with @com.google.inject.internal.UniqueAnnotations$Internal(value=184) 1 error at com.google.inject.internal.InternalProvisionException.toProvisionException(InternalProvisionException.java:226) at com.google.inject.internal.InjectorImpl$1.get(InjectorImpl.java:1097) at com.google.gerrit.extensions.registration.DynamicSet$1.next(DynamicSet.java:165) at com.google.gerrit.server.restapi.config.ListTopMenus.apply(ListTopMenus.java:38) at com.google.gerrit.server.restapi.config.ListTopMenus.apply(ListTopMenus.java:26) at com.google.gerrit.httpd.restapi.RestApiServlet.service(RestApiServlet.java:458) at javax.servlet.http.HttpServlet.service(HttpServlet.java:742) at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:290) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:280) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:184) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:89) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) at com.google.gerrit.httpd.raw.StaticModule$PolyGerritFilter.doFilter(StaticModule.java:485) at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) at com.google.gerrit.httpd.GetUserFilter.doFilter(GetUserFilter.java:92) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) at com.google.gerrit.httpd.RequireSslFilter.doFilter(RequireSslFilter.java:72) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) at com.google.gerrit.httpd.RunAsFilter.doFilter(RunAsFilter.java:121) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) at com.google.gerrit.httpd.GwtCacheControlFilter.doFilter(GwtCacheControlFilter.java:72) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) at com.google.gerrit.httpd.SetThreadNameFilter.doFilter(SetThreadNameFilter.java:62) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) at com.google.gerrit.httpd.AllRequestFilter$FilterProxy$1.doFilter(AllRequestFilter.java:133) at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:239) at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:215) at com.googlesource.gerrit.plugins.javamelody.GerritMonitoringFilter.doFilter(GerritMonitoringFilter.java:67) at com.google.gerrit.httpd.AllRequestFilter$FilterProxy$1.doFilter(AllRequestFilter.java:129) at com.google.gerrit.httpd.AllRequestFilter$FilterProxy.doFilter(AllRequestFilter.java:135) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) at com.google.gerrit.httpd.RequestCleanupFilter.doFilter(RequestCleanupFilter.java:60) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) at com.google.gerrit.httpd.RequestMetricsFilter.doFilter(RequestMetricsFilter.java:57) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) at 
com.google.gerrit.httpd.RequestContextFilter.doFilter(RequestContextFilter.java:64) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:121) at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:133) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1604) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1607) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1577) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at 
org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.NoSuchMethodError: com.google.gerrit.server.CurrentUser.getCapabilities()Lcom/google/gerrit/server/account/CapabilityControl; at com.googlesource.gerrit.plugins.javamelody.CapabilityChecker.canMonitor(CapabilityChecker.java:35) at com.googlesource.gerrit.plugins.javamelody.MonitoringTopMenu.<init>(MonitoringTopMenu.java:29) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at com.google.inject.internal.DefaultConstructionProxyFactory$ReflectiveProxy.newInstance(DefaultConstructionProxyFactory.java:126) at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:114) at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91) at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:306) at com.google.inject.internal.FactoryProxy.get(FactoryProxy.java:62) at com.google.inject.internal.InjectorImpl$1.get(InjectorImpl.java:1094) ... 66 more
Have you upgraded the plugins as well? Get the latest versions of the plugins for v2.16 from the GerritForge's CI: https://gerrit-ci.gerritforge.com/view/Plugins-stable-2.16/
Duh, thanks. Upgraded all of them, looks like we're in good shape.
We know that the upgrade of plugins is one of the "missing pieces" of the process. From v3.0 onwards, the plugin-manager has become a core plugin and has the option to upgrade plugins by fetching them from our CI. I also need to add the option to integrate that part with the 'init' steps, so that the overall operation becomes transparent for whoever is upgrading (it's on my TODO list).

One question: are you on v2.16/ReviewDb or have you also migrated to NoteDb?

Luca.
(In reply to Denis Roy from comment #110)
> Duh, thanks. Upgraded all of them, looks like we're in good shape.

Have you also installed the new version of the gerrit-eca plugin? I migrated it to the permission backend in this PR: [1]. I have uploaded this release for you to test: [2].

Note, though, that I have not tested it, but maybe Matthias can help with the tests on the staging Gerrit instance?

[1] https://github.com/EclipseFdn/gerrit-eca-plugin/pull/16
[2] https://github.com/davido/gerrit-eca-plugin/releases/tag/1.10-SNAPSHOT
(In reply to David Ostrovsky from comment #112) > (In reply to Denis Roy from comment #110) > > Duh, thanks. Upgraded all of them, looks like we're in good shape. > > Have you also installed new version of gerrit-eca plugin? I did; thanks. I installed it yesterday on an internal Gerrit (which was already on 2.16) and from my testing, it did everything it was supposed to. It's now running on the sandbox, so we can test further. Again, thanks to everyone for stepping up and helping out!
(In reply to Matthias Sohn from comment #38)
> (In reply to Denis Roy from comment #37)
> > Do you have any suggestions on what our next step should be?
>
> I'd target the following steps (for any upgrade always use the latest
> available service release of the given minor release):
>
> 1. upgrade to 2.15 [1] then 2.16 [2] in one step, stay on reviewDB (MySQL)
> 2. on 2.16 migrate to noteDB [3]
> 3. upgrade to 3.0 then 3.1 in one step [4]

With the migration to 2.16.18, the migration marathon is not over, unfortunately. You have only reached step 1 of Matthias's plan.

Release 2.16.18 is already outdated and will be flagged as EOL only 3 weeks from today, when release 3.2 comes out.

That's why I would suggest proceeding with steps 2 and 3 as pointed out by Matthias. We could create a follow-up issue if you prefer:

"Migrate gerrit backend from ReviewDb to NoteDb and upgrade to 3.x release"
(In reply to Denis Roy from comment #105)
> > @Matthias developed a fix that every 1000 users would automatically do a GC:
> > are you running the latest version of v2.16.x with the fix?
>
> Running git gc --aggressive seems to have helped. It's at 91% now.

You should try to get rid of the habit of using the gc --aggressive option. In most cases it is unnecessary, and if you stick to running plain gc regularly you typically end up with better delta chains.

https://stackoverflow.com/questions/28720151/git-gc-aggressive-vs-git-repack/
(In reply to Luca Milanesio from comment #104) > Yep, see in your stack trace: > > "main" #1 prio=5 os_prio=0 tid=0x00007fd00000a000 nid=0x79e4 runnable > [0x00007fd009bb3000] > java.lang.Thread.State: RUNNABLE > at java.io.UnixFileSystem.getBooleanAttributes0(Native Method) > at > java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242) > at java.io.File.isDirectory(File.java:849) > at > org.eclipse.jgit.internal.storage.file.RefDirectory$LooseScanner. > scanTree(RefDirectory.java:483) > at > org.eclipse.jgit.internal.storage.file.RefDirectory$LooseScanner. > scanTree(RefDirectory.java:489) > at > org.eclipse.jgit.internal.storage.file.RefDirectory$LooseScanner. > scanTree(RefDirectory.java:489) > at > org.eclipse.jgit.internal.storage.file.RefDirectory$LooseScanner. > scan(RefDirectory.java:445) > at > org.eclipse.jgit.internal.storage.file.RefDirectory. > getLooseRefs(RefDirectory.java:310) > > As you move further with the upgrade, it will get worse, because the number > of loose refs would increase further. > > @Matthias developed a fix that every 1000 users would automatically do a GC: > are you running the latest version of v2.16.x with the fix? this fix is included since 2.15.14
> this fix is included since 2.15.14 I am wondering why it did not work transparently then: it should have kept the repo in good state, avoiding this accumulation of loose refs. Something worth checking and investigating. @Matthias thanks for the feedback. Luca.
(In reply to Luca Milanesio from comment #117)
> > this fix is included since 2.15.14
>
> I am wondering why it did not work transparently then: it should have kept
> the repo in good state, avoiding this accumulation of loose refs.
>
> Something worth checking and investigating.

I think this was not caused by an excessive number of loose refs but by the >300k loose objects Denis saw when migration 154 went slow.

The fix I did runs gc once before migration 146 and then packs all refs after every 1000 migrated accounts during migration 146, but it does not run a full gc again. It looks like the migration created 300k new loose objects after the gc at the beginning of migration 146. That's probably due to the extraordinarily large number of accounts on this site.

I guess this could be fixed by running gc every 100k accounts during the migration. I didn't see this when I tested the fix for migration 146 since I only tested up to 128k accounts.
> I think this was not caused by an excessive number of loose refs but by the >300k loose objects Denis saw when migration 154 went slow. The stack traces though were showing JGit always stuck at scanning loose refs. It would be worth reproducing the issue again and, before releasing another v2.15, having another fix with further GC cycles. Luca.
(In reply to Matthias Sohn from comment #118)
> (In reply to Luca Milanesio from comment #117)
> > > this fix is included since 2.15.14
> >
> > I am wondering why it did not work transparently then: it should have kept
> > the repo in good state, avoiding this accumulation of loose refs.
> >
> > Something worth checking and investigating.
>
> I think this was not caused by an excessive number of loose refs but by the
> >300k loose objects Denis saw when migration 154 went slow.

Yes. Running count-objects on my test site, where I migrated 300,000 accounts in my previous comment:

$ time git count-objects -v -H

count: 901923
size: 3.44 GiB
in-pack: 11
packs: 1
size-pack: 3.11 KiB
prune-packable: 0
garbage: 0
size-garbage: 0 bytes

real 0m1.378s
user 0m0.245s
sys 0m1.133s

We already run a full garbage collection at the start of the Schema_146 migration; during the migration we only pack the refs:

  if (refsOnly) {
    ui.message(String.format("... (%.3f s) pack refs", elapsed()));
    gc.packRefs();
  } else {
    ui.message(String.format("... (%.3f s) gc --prune=now", elapsed()));
    gc.setExpire(new Date());
    gc.gc();
  }

I think the solution would be to run the full gc every 10k or 100k accounts, say, and in any event at the end of the migration. I will send a patch in a moment.
There is another key factor here: the NFS. Having a repo with lots of loose objects on a local filesystem on SSD, isn't much of an issue. On the other side, NFS is notoriously very slow in accessing files and, also, JGit cannot rely on the file/folder stats to avoid reading them over and over again. Should we just detect the NFS at the start of the migration and block if the number of accounts is too high? Also, running GC on NFS would take equally a very long time :-( Luca.
(In reply to David Ostrovsky from comment #120) > (In reply to Matthias Sohn from comment #118) > > (In reply to Luca Milanesio from comment #117) > > > > this fix is included since 2.15.14 > > > > > > I am wondering why it did not work transparently then: it should have kept > > > the repo in good state, avoiding this accumulation of loose refs. > > > > > > Something worth checking and investigating. > > > > I think this was not caused by an excessive number of loose refs but by the > > >300k loose objects Denis saw when migration 154 went slow. > > Yes. Running count-objects on my test site, where I have migrated 300,000 > accounts in my previous comment: > > $ time git count-objects -v -H > > count: 901923 > size: 3.44 GiB > in-pack: 11 > packs: 1 > size-pack: 3.11 KiB > prune-packable: 0 > garbage: 0 > size-garbage: 0 bytes > > real 0m1.378s > user 0m0.245s > sys 0m1.133s > > We run full garbage collection already at the start of Schema_146 migration, > during the mogration we are only packing the refs: > > > if (refsOnly) { > ui.message(String.format("... (%.3f s) pack refs", elapsed())); > gc.packRefs(); > } else { > ui.message(String.format("... (%.3f s) gc --prune=now", > elapsed())); > gc.setExpire(new Date()); > gc.gc(); > } > > I think the solution would be, to run the full gc say all 10k/100k accounts > and in any event at the end of the migration. I will send a patch in a > moment. It took longer as I did extensive tests, with 300,000 accounts site. As Matthias pointed out in the review, there is a tradeoff between running full gc frequently and the time full gc itself takes. Moreover, it turns out, that JGit full gc also unconditionally generates bitmap index and this takes very long time on huge All-Users repository. In fact, it took almost 1 hour after 100k accounts were migrated to NoteDb and bitmap generation even crashed with OOME after 200k accounts were migrated. I filed this issue upstream: [1] and provided all the details. 
So, the workaround for now is: run a full JGit gc for every 100k accounts during schema migrations 146 and 154, but disable bitmap index generation. We can consider re-enabling the bitmap index once [1] is fixed.

I built yet another release with the above fixes: [2]. With this release, the Eclipse Gerrit site should be migrated from 2.14.20 to 2.16.18 within 1 hour.

[1] https://bugs.eclipse.org/bugs/show_bug.cgi?id=562740
[2] https://github.com/davido/gerrit/releases/tag/v2.16.18-53-g08b5e2e519
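To make the resulting maintenance schedule concrete, here is a toy model in Python. It is not the actual Gerrit/JGit code (that lives in the Java schema migrations); it only illustrates the intervals discussed in this thread: pack refs after every 1000 migrated accounts, and a full gc after every 100k. The function name and the string return values are made up.

```python
def maintenance_action(migrated, pack_refs_every=1_000, full_gc_every=100_000):
    """Return the maintenance step due after `migrated` accounts, if any.

    "gc"        -> full gc (in the workaround, without rebuilding the bitmap index)
    "pack-refs" -> only pack loose refs
    None        -> nothing to do yet
    """
    if migrated <= 0:
        return None
    if migrated % full_gc_every == 0:
        return "gc"  # a full gc takes precedence over a plain pack-refs
    if migrated % pack_refs_every == 0:
        return "pack-refs"
    return None
```

For a site with 300,000 accounts this schedule yields 3 full gc runs and 297 pack-refs runs, which shows the tradeoff mentioned above: the full gc is expensive, so it runs rarely, while the cheap ref packing keeps the loose ref count bounded in between.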
@Denis, I've just released v1.0.10 of the ECA plugin https://github.com/EclipseFdn/gerrit-eca-plugin/releases/tag/1.0.10 You'll need to install the file "eclipse-cla-1.0.10-jar-with-dependencies.jar"
Thx. I'm in the process of staging the upgrade again with DavidO's latest, and will deploy the plugin there. If that works, I'll schedule an upgrade on production and continue the plan from there...
(In reply to Denis Roy from comment #124) > Thx. I'm in the process of staging the upgrade again with DavidO's latest, > and will deploy the plugin there. If that works, I'll schedule an upgrade on > production and continue the plan from there... The upgrade to 2.16 was clean with : https://github.com/davido/gerrit/releases/tag/v2.16.18-53-g08b5e2e519
(In reply to Denis Roy from comment #125) > > The upgrade to 2.16 was clean with : > https://github.com/davido/gerrit/releases/tag/v2.16.18-53-g08b5e2e519 Thanks for confirming. Meantime all pending changes were merged and you can use the artifact from GerritForge CI: [1] o Revert "Keep alive database connection to prevent exceeding wait timeout" o Schema_146: Periodically run full gc o Schema_154: Periodically run full gc o Schema_154: Disable bitmap index re-build during full gc o Schema_146: Disable bitmap index re-build during full gc Release war is here: [2]. ETA for official gerrit 2.16.19 release is end of May. [1] https://gerrit-ci.gerritforge.com/view/Gerrit/job/Gerrit-bazel-stable-2.16/895 [2] https://gerrit-ci.gerritforge.com/view/Gerrit/job/Gerrit-bazel-stable-2.16/895/artifact/gerrit/bazel-bin/release.war
I've got to 3.1.4 on staging: https://git.eclipse.org/gerrit-staging/q/status:open The upgrade from 2.16 was clean (albeit takes a while). I'll schedule the upgrade for the end of June, to not interfere with the Eclipse release cycles.
That's awesome news.
(In reply to Denis Roy from comment #127) > I've got to 3.1.4 on staging: > > https://git.eclipse.org/gerrit-staging/q/status:open > > The upgrade from 2.16 was clean (albeit takes a while). > > I'll schedule the upgrade for the end of June, to not interfere with the > Eclipse release cycles. great news
(In reply to Denis Roy from comment #127)
> I've got to 3.1.4 on staging:
>
> https://git.eclipse.org/gerrit-staging/q/status:open

One exciting feature we added in Gerrit 3.1 is support for Git wire protocol v2, so that git fetch in Gerrit is as fast as at Google [1] and GitHub [2]. Unfortunately you have not activated it yet. Would you mind fixing that?

$ GIT_TRACE_PACKET=1 git ls-remote https://git.eclipse.org/gerrit-staging/jgit/jgit
22:41:39.776096 pkt-line.c:80 packet: git< # service=git-upload-pack
22:41:39.776120 pkt-line.c:80 packet: git< 0000
22:41:39.776126 pkt-line.c:80 packet: git< 0a2a094feaac41966f3de22748d80d4d29a7ba30 HEAD\0 include-tag multi_ack_detailed multi_ack ofs-delta side-band side-band-64k thin-pack no-progress shallow no-done agent=JGit/unknown symref=HEAD:refs/heads/master

Compare this to gerrit.googlesource.com:

$ GIT_TRACE_PACKET=1 git ls-remote https://gerrit.googlesource.com/plugins/javamelody
22:44:33.962175 pkt-line.c:80 packet: git< version 2

Given that Gerrit 3.1 unconditionally activated Git wire protocol v2, all you need to do is add these lines to the jgit.config of your Gerrit site:

[protocol]
  version = 2

Note that the client must enable it as well. On my box I have these lines:

$ grep -1 protocol ~/.gitconfig
[protocol]
  version = 2

And last but not least, the client git version must be at least 2.18.0:

$ git version
git version 2.26.2

[1] https://opensource.googleblog.com/2018/05/introducing-git-protocol-version-2.html
[2] https://github.blog/changelog/2018-11-08-git-protocol-v2-support
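If you want to check this from a script rather than by eye, the trace can be scanned for the "version 2" packet. A minimal sketch in Python, assuming the trace line format shown above ("... packet: git< payload"); the helper name server_speaks_v2 is hypothetical.

```python
def server_speaks_v2(trace_output):
    """Return True if a GIT_TRACE_PACKET trace shows the server answering "version 2".

    Only packets received from the server ("git<") are considered.
    """
    for line in trace_output.splitlines():
        _prefix, sep, payload = line.partition("packet:")
        if not sep or "git<" not in payload:
            continue
        if payload.split("git<", 1)[1].strip() == "version 2":
            return True
    return False
```

Git writes the packet trace to stderr when GIT_TRACE_PACKET=1 is set, so the text to pass in would be the captured stderr of the ls-remote call, for example via subprocess.run(..., capture_output=True, text=True).stderr with the variable set in the environment.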
3.1.4 sure is fast :)
> Given that Gerrit 3.1 unconditionally activated Git wire protocol v2, all > you need to do is to add these lines in your jgit.config for gerrit site: > > [protocol] > version = 2 Done. I didn't have etc/jgit.config but created it.
(In reply to Denis Roy from comment #132)
> > Given that Gerrit 3.1 unconditionally activated Git wire protocol v2, all
> > you need to do is to add these lines in your jgit.config for gerrit site:
> >
> > [protocol]
> > version = 2
>
> Done. I didn't have etc/jgit.config but created it.

Thanks, confirmed! Git wire protocol v2 is up and running in gerrit@EclipseFdn:

$ davido@wizball:~$ GIT_TRACE_PACKET=1 git ls-remote https://git.eclipse.org/gerrit-staging/jgit/jgit
06:57:55.050010 pkt-line.c:80 packet: git< version 2
06:57:55.050058 pkt-line.c:80 packet: git< version 2
06:57:55.050071 pkt-line.c:80 packet: git< ls-refs
(In reply to David Ostrovsky from comment #133) > (In reply to Denis Roy from comment #132) > > > Given that Gerrit 3.1 unconditionally activated Git wire protocol v2, all > > > you need to do is to add these lines in your jgit.config for gerrit site: > > > > > > [protocol] > > > version = 2 > > > > Done. I didn't have etc/jgit.config but created it. > > Thanks, confirmed! Git wire protocol v2 is up and running in > gerrit@EclipseFdn: > > $ davido@wizball:~$ GIT_TRACE_PACKET=1 git ls-remote > https://git.eclipse.org/gerrit-staging/jgit/jgit > 06:57:55.050010 pkt-line.c:80 packet: git< version 2 > 06:57:55.050058 pkt-line.c:80 packet: git< version 2 > 06:57:55.050071 pkt-line.c:80 packet: git< ls-refs BTW, there is a dedicated feature request to enable git wire protocol v2 on gerrit@EclipseFdn: [1], that was implemented during gerrit upgrade ;-) [1] https://bugs.eclipse.org/bugs/show_bug.cgi?id=552048
To be clear, wire protocol v2 is only enabled on the sandbox Gerrit; we'll enable it in production in June, with the 2.14 -> 3.1 upgrade.
Scheduled for Saturday, June 27 09:00 EDT
Can you give an update when https://git.eclipse.org/r/ is working again? Despite what https://status.eclipse.org/ tells me it seems to be out of order.
(In reply to Carsten Hammer from comment #137) > Can you give an update when https://git.eclipse.org/r/ is working again? > Despite what https://status.eclipse.org/ tells me it seems to be out of > order. Probably another few hours. Apologies for the status page -- I wrote "down for 24-36 hours" but it automatically cleared after 24h. Thanks for your patience.
Production Gerrit is running 3.2.2, with wire protocol v2 enabled.
As a side-note, the production VM only had 4 CPU cores assigned to it; I've now bound 16 to it, so we should get a decent performance boost under heavy load.
Thanks for working on the upgrade! Already loving the new version! :)
Yes, thanks for working on it, looks good! However, keep in mind that assigning more CPUs to a virtual machine can increase the delay before it gets CPU time, because the VM host has to wait until it has 16 real CPU cores free. It is more often possible to have 4 or 8 free CPUs than 16. Not that I know exactly how the system is built. Just from my experience, over-provisioning VMs with a high number of CPUs can be counterproductive. The VMs then feel jerky.
Congratulations from the Gerrit community for the successful upgrade starting from 2.14 all the way up to the latest release 3.2.2: https://groups.google.com/d/msg/repo-discuss/5DUUU8SOP74/VJOIo5YhBwAJ
I noticed that the URLs are different now. E.g., in this bug: https://bugs.eclipse.org/bugs/show_bug.cgi?id=563042 There is a link to this review: https://git.eclipse.org/r/163612 That link redirects to https://git.eclipse.org/r/c/emf/org.eclipse.emf/+/163612/ I don't know if that's related to the problems I had, but I had a hard time getting a ready-to-push Gerrit commit actually pushed. In any case, that did finally work, but on the review itself, EMF's Gerrit-triggered build is no longer triggered: https://ci.eclipse.org/emf/job/gerrit/ Do I need to reconfigure that job or the ci instance somehow?
(In reply to Ed Merks from comment #144)
> I noticed that the URLs are different now.
>
> E.g., in this bug:
>
> https://bugs.eclipse.org/bugs/show_bug.cgi?id=563042
>
> There is a link to this review:
>
> https://git.eclipse.org/r/163612
>
> That link redirects to
>
> https://git.eclipse.org/r/c/emf/org.eclipse.emf/+/163612/

Since Gerrit 2.15 [1], change URLs use this format, which includes the project name. This helps the readability of e.g. log files, improves performance, and prepares for advanced load balancing in multi-master Gerrit setups.

> I don't know if that's related to the problems I had, but I had a hard time
> getting a ready-to-push Gerrit commit actually pushed.

Which difficulties did you face when pushing a commit for review?

> In any case, that
> did finally work, but on the review itself, EMF's Gerrit-triggered build is
> no longer triggered:
>
> https://ci.eclipse.org/emf/job/gerrit/
>
> Do I need to reconfigure that job or the ci instance somehow?

[1] https://www.gerritcodereview.com/2.15.html#new-url-scheme
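For scripts or integrations that have to cope with both URL forms, a small parser along these lines might help. This is a minimal sketch based only on the example URLs in this bug; it is not how Gerrit routes URLs, and the names BASE, ChangeRef and parse_change_url are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed base URL of the Gerrit instance, taken from the links in this bug.
BASE = "https://git.eclipse.org/r"


class ChangeRef(NamedTuple):
    project: Optional[str]  # None when the URL does not carry a project name
    number: int


# New scheme (Gerrit 2.15 and later): /c/<project>/+/<number>[/...]
_NEW = re.compile(r"/c/(?P<project>.+)/\+/(?P<number>\d+)(?:/.*)?")
# Older forms seen in this bug: /<number> and /#/c/<number>/
_LEGACY = re.compile(r"/(?:#/c/)?(?P<number>\d+)(?:/\d+)?/?")


def parse_change_url(url: str, base: str = BASE) -> Optional[ChangeRef]:
    """Extract project (if present) and change number from a change URL."""
    if not url.startswith(base):
        return None
    rest = url[len(base):]
    match = _NEW.fullmatch(rest)
    if match:
        return ChangeRef(match.group("project"), int(match.group("number")))
    match = _LEGACY.fullmatch(rest)
    if match:
        return ChangeRef(None, int(match.group("number")))
    return None
```

Note that the old form carries no project name, so going from an old link to the new form still needs a lookup (or the server-side redirect described above).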
(In reply to Matthias Sohn from comment #145)
> (In reply to Ed Merks from comment #144)
> > I don't know if that's related to the problems I had, but I had a hard time
> > getting a ready-to-push Gerrit commit actually pushed.
>
> Which difficulties did you face with pushing a commit for review ?
>

Push was disabled and pull complained about something not being advertised... In any case, I did manage to push it eventually, so probably just me being stupid... I was able to fetch from Gerrit in another IDE, so I don't think there is any serious problem...

> > In any case, that
> > did finally work, but on the review itself, EMF's Gerrit-triggered build is
> > no longer triggered:
> >
> > https://ci.eclipse.org/emf/job/gerrit/
> >
> > Do I need to reconfigure that job or the ci instance somehow?
>
> [1] https://www.gerritcodereview.com/2.15.html#new-url-scheme

I've done nothing in my job configuration that specifies URLs using either scheme. There is a "Gerrit Trigger" thing, which is just dark magic to me, and it stopped working, so I have no clue what is needed to make it work again. The configuration is publicly readable:

https://ci.eclipse.org/emf/job/gerrit/configure

Maybe someone will see something obvious that needs to change now...
Note: since the update, Gerrit change set links are not accepted by Bugzilla (and so new Gerrit changes aren't automatically linked to Bugzilla bugs). Trying to add the link manually results in an error. I just tried it with bug 564634 and the related change https://git.eclipse.org/r/c/platform/eclipse.platform.runtime/+/165536, and Bugzilla reports:

https://git.eclipse.org/r/c/platform/eclipse.platform.runtime/+/165536 is not a valid URL to a bug. See Also URLs should point to one of:
  show_bug.cgi in a Bugzilla installation.
  A bug on launchpad.net
  An issue on code.google.com.
  A change on Gerrit.
  A bug on bugs.debian.org.
  An issue in a JIRA installation.
  A ticket in a Trac installation.
  A bug in a MantisBT installation.
  A bug on sourceforge.net.
  An issue/pull request on github.com.
  A cGit commit link.
  A Gerrit change.
Bug 564727 also related here?
I'll close this as fixed. Many thanks to Matthias, Luca and David and others for stepping in to assist. There are bugs for the follow-up issues. I'll comment there.
Comparing the current Gerrit to the old one, I can no longer see the build state of a change on a mobile phone or an iPad. Now we only have the name of the contributor, the size of the push, and the title of the change. Can I configure it somehow so that it looks similar to the old one on small displays?
(In reply to David Ostrovsky from comment #79) > I migrated your plugin to permission backend in this PR: [1], to be > compatible with 2.16 branch. I have not tested it, though. > > [1] https://github.com/EclipseFdn/gerrit-eca-plugin/pull/16 @David Ostrovsky we need your help: the ECA plugin that you fixed for us worked on 3.1, but somehow does not work for 3.2. The plugin loads cleanly but prevents all pushes. Please see: bug 565080 and bug 565077
(In reply to Denis Roy from comment #151)
> (In reply to David Ostrovsky from comment #79)
>
> > I migrated your plugin to permission backend in this PR: [1], to be
> > compatible with 2.16 branch. I have not tested it, though.
> >
> > [1] https://github.com/EclipseFdn/gerrit-eca-plugin/pull/16
>
>
> @David Ostrovsky we need your help: the ECA plugin that you fixed for us
> worked on 3.1, but somehow does not work for 3.2. The plugin loads cleanly
> but prevents all pushes. Please see: bug 565080 and bug 565077

FYI, I've already migrated the plugin to 3.2.2:

https://github.com/EclipseFdn/gerrit-eca-plugin/commit/9d17e93013f65d75feee01cc6560b7b38c432a76

(The major change is that imports were renamed from com.google.gerrit.reviewdb.client.* to com.google.gerrit.entities.*)
We are all done here. Many thanks to all for helping us get up-to-date.