Bug 499399 - Eclipse Infrastructure - uptime and speed
Summary: Eclipse Infrastructure - uptime and speed
Status: RESOLVED FIXED
Alias: None
Product: Community
Classification: Eclipse Foundation
Component: Cross-Project
Version: unspecified
Hardware: PC Linux
Priority: P3 normal
Target Milestone: ---
Assignee: Denis Roy CLA
QA Contact:
URL:
Whiteboard: stalebug
Keywords:
Depends on: 499400
Blocks:
Reported: 2016-08-09 02:47 EDT by Jonas Helming CLA
Modified: 2019-11-26 02:27 EST (History)
40 users

See Also:


Description Jonas Helming CLA 2016-08-09 02:47:37 EDT
This BR is not about the selection of tools (Bugzilla vs. Foobar or Hudson vs. Jenkins).
This BR is not about a single issue.
This BR is about a general perception that I hear quite often:

"The Eclipse Infrastructure is often slow and unreliable"

It is the essence of conversations I have had with many people inside and outside of the Eclipse ecosystem.
Based on that perception, the most affected areas are: the website (including eclipsecon.org), mirrors, Gerrit, the wiki, Bugzilla, the build servers, and the signing service. With the introduction of the Eclipse Installer, the mirrors now also immediately affect our end users.

First of all, this statement is of course unfair, as it is completely generic. Whenever a specific issue is reported, the EF team works hard and fast to find a solution. It might not even be true; I do not have numbers on downtimes or speed.
However, the perception is definitely there:
All people I have talked to either:
- claim it is "not optimal" (usually in harsher terms)
- say that they want to move away (e.g. to GitHub or their own mirrors), or that they have done so already
- do not like the Eclipse Installer, because it is slow and sometimes does not work

Therefore, I think this perception harms the ecosystem, and I wanted to bring it up, even though I know it is too generic to be solved with a commit or something simple.

I am interested in opinions about this. Some questions for me are:

1. Do we really have an issue here, or is it just perception? If the latter, what can we do about it?

2. Is our infrastructure just too complicated to be maintained with the given budget?

3. Is it possible to rely on external services rather than self-hosting most services (e.g. GitHub, cloud services, etc.)?

4. Can we change the policy from must-be-entirely-self-hosted-to-own-the-data to must-be-able-to-export-data-to-pull-out-of-a-service-in-case-we-need-to?
Comment 1 Andrey Loskutov CLA 2016-08-09 03:06:03 EDT
I can confirm the perception.

We can use this bug as "umbrella" bug to collect and discuss concrete improvements to be done.

One area I've noticed recently is the permanently unstable interaction between Bugzilla, Gerrit, Hudson, and the mail server (in all possible combinations). Contributing to Eclipse becomes "Russian roulette", and most contributors aren't even aware of what goes wrong and why.

If we stick to the current ecosystem, we should *at least* think about increasing the transparency of the services' state:

1) Committers should be able to see the services' state.
2) There should be a way to trigger some basic infra checks (*not* by posting on a mailing list "I'm alone with XYZ issue?").
3) Web masters should be *automatically* notified if some service doesn't work as expected.

I've entered bug 499400 for that.
Comment 2 Andreas Sewe CLA 2016-08-09 03:46:05 EDT
*If* projects not setting up p2 download mirrors correctly is a problem, then a tool to check a project's p2 repositories for correct mirror configuration (similar to the Projects Download Scanner) would be useful.

I've entered Bug 499403 for this.
Comment 3 Doug Schaefer CLA 2016-08-09 10:13:26 EDT
The only thing that really bugs me is the performance of our (CDT's) HIPP machine. Builds often hang at random spots, even when collecting test logs.

I've long complained about the HIPPs and how they don't really give us insight into what's happening on the underlying machine. Is someone else stealing all the CPU? Are there too many things trying to hook up to the virtual framebuffer?

At any rate, one solution could be to make it easier for contributors to contribute external slaves to the Hudson farm and take some of the load off, at least for the Gerrit verify jobs that don't need secure access into the Eclipse download space.
Comment 4 Denis Roy CLA 2016-08-15 13:29:10 EDT
I'll take ownership of this.  If you want to follow along, please CC.

The infra has seen much unreliability in recent months. We are well aware of that. I will post some plans we have soon.
Comment 5 Denis Roy CLA 2016-09-20 15:10:50 EDT
Here's an update:

Causes for outages:
===================

www.eclipse.org, Bugzilla, Marketplace: master database overloaded.

www.eclipse.org, Marketplace, Wiki: PHP session files on Networked File System (NFS). With more logged-in users than ever, NFS was too slow to scale.

Git/Gerrit: Oomph Installer setup files, if unavailable, are fetched from cGit, which cannot handle the load.




Fixes done so far:
==================
Master database overloaded: A new two-server database cluster was put into production in August 2016. Multiple databases were migrated to the new cluster, removing load from the original cluster.

PHP session files on NFS: After identifying the problem, our content load-balancer was reconfigured and those files were moved to local storage.

cGit: The Oomph team limited its usage of cGit for installation files. We also enabled caching on cGit for better scalability.




Coming up:
==========

New 4-node download cluster to replace the current 7-year-old download servers (ETA Q4 2016)


Thanks for everyone's patience here.
Comment 6 Ed Merks CLA 2016-09-21 03:05:29 EDT
With regard to "Oomph Installer setup files, if unavailable...", unfortunately it doesn't help when the problem is self-induced:

https://bugs.eclipse.org/bugs/show_bug.cgi?id=463967#c22

In any case, once the latest changes roll out, it will no longer be possible for an Oomph-based application, other than the archiver application, to download from git.eclipse.org.

But we're still fundamentally hosed by performance problems like this one:

https://bugs.eclipse.org/bugs/show_bug.cgi?id=500481#c24

I.e., downloading the following installable unit from my browser currently takes 18 minutes:

http://download.eclipse.org/eclipse/updates/4.7-I-builds/I20160914-0430/plugins/org.eclipse.platform.doc.isv_4.7.0.v20160830-1657.jar.pack.gz

But what users will see is that Oomph is unable to update a target platform at any reasonable speed.  And of course everyone sees that download.eclipse.org doesn't function at all every Tuesday morning at 10:00 AM.  That is another self-induced problem, though one the webmasters can do nothing about, and unfortunately also one that no one is doing anything about.

Arguably, the problem with https://bugs.eclipse.org/bugs/show_bug.cgi?id=500481#c24 is also self-induced, because too many people are using repositories that aren't mirrored, because too many projects provide repositories that aren't mirrored.
Comment 7 Mickael Istria CLA 2016-09-21 10:49:12 EDT
(In reply to Ed Merks from comment #6)
> Arguably the problem with
> https://bugs.eclipse.org/bugs/show_bug.cgi?id=500481#c24 is also self
> induced because too many people are using repositories that aren't mirrored
> because too many projects provide repositories that aren't mirrored.

Currently, the mirror logic is in the clients (p2, the Eclipse download page...), isn't it? Could this logic live on the download.eclipse.org side, which would send HTTP redirects to the most appropriate mirror when the artifact is mirrored? If we had this, it wouldn't require any effort from clients or projects to mirror repositories.
Comment 8 Denis Roy CLA 2016-09-21 10:54:12 EDT
(In reply to Mickael Istria from comment #7)
> (In reply to Ed Merks from comment #6)
> > Arguably the problem with
> > https://bugs.eclipse.org/bugs/show_bug.cgi?id=500481#c24 is also self
> > induced because too many people are using repositories that aren't mirrored
> > because too many projects provide repositories that aren't mirrored.
> 
> Currently, the mirror logic is in the clients (p2, eclipse download
> page...), isn't it? Could this logic be on download.eclipse.org side that


That is exactly what this, below, is for:

(In reply to Denis Roy from comment #5)
> Coming up:
> ==========
> 
> New 4-node download cluster to replace the current 7-year-old download
> servers (ETA Q4 2016)


We've been testing a transparent mirroring system on Polarsys.org (all links to download.polarsys.org, such as http://download.polarsys.org/capella/core/platform/releases/1.0.2/juno/capella-1.0.2.2016-04-20_12-28-48-win32-win32-x86-juno.zip).

But we need lots of CPU power on download.e.o for that.
Comment 9 Ed Merks CLA 2016-09-21 10:56:39 EDT
(In reply to Mickael Istria from comment #7)

> Currently, the mirror logic is in the clients (p2, eclipse download
> page...), isn't it? Could this logic be on download.eclipse.org side that
> would send some HTTP redirects to the most appropriate mirror when the
> artifact is mirrored? If we have this, it wouldn't require any effort from
> clients nor projects to mirror repositories.

This was discussed a while back, but such an approach remains less than ideal, because with client-side mirror logic the eclipse server is only contacted once, for the list of mirrors, and after that everything is offloaded.  If each artifact download request still requires the eclipse server to respond, with not exactly cheap logic to figure out where to delegate to, it will remain a heavy target.  Not only that, only on the client side can mirror performance be measured, so that the client makes most requests directly to the fastest mirrors.  The eclipse server can't know which mirror will be fast for me, or which is fast at the moment.  There's really no band-aid solution for avoiding client-side logic.

And of course if repositories aren't mirrored at all, e.g., Eclipse I-Builds but lots of people want to use them for target platform resolution or bleeding-edge installation, nothing on the server nor on the client will help...
Comment 10 Mickael Istria CLA 2016-09-21 10:57:52 EDT
Ok, thanks for the explanation ;) And sorry for asking questions that were already answered (I'm not very skilled at infra stuff, so I often fail at understanding some comments ;)
Comment 11 Mickael Istria CLA 2016-09-21 11:00:40 EDT
(In reply to Ed Merks from comment #9)
> This was discussed a while back, but such an approach remains less than
> ideal because with client side mirror logic, the eclipse server is only
> contacted once for the list of mirrors, and after that, everything is
> offloaded.

Both logics could be kept in parallel. There is no reason to drop the client-side logic for this kind of use-case.

> And of course if repositories aren't mirrored at all, e.g., Eclipse I-Builds
> but lots of people want to use them for target platform resolution or
> bleeding-edge installation, nothing on the server nor on the client will
> help...

Why aren't they mirrored? Wasn't it because it requires some Tycho/p2 tweaks to make mirroring useful?
If we had server-side mirror resolution, then it would become essentially free to take advantage of mirrors, and we could imagine I-builds being mirrored then.
Comment 12 Ed Merks CLA 2016-09-21 11:01:30 EDT
The Polarsys example would certainly help for large downloads, but there better web authoring can solve the problem for download links.  With p2, however, there are hundreds of requests involved, and, as I mentioned, the server can't know which mirrors are fast for me at the moment, or where I happen to be located now.  The initial ordering of the mirror list is generally quite poor compared to the actual measured speed at the client side.  This is why we've spent more than a week optimizing how Oomph exploits mirrors, e.g., probing all the mirrors with the smallest artifacts first to figure out which are actually good to use for more requests and for bigger artifacts...
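The probing strategy described above (time a small download from each mirror, then prefer the fastest ones for subsequent requests) can be sketched roughly as follows. This is a simplified illustration, not Oomph's actual implementation; the probe path and any mirror URLs are hypothetical:

```python
import time
import urllib.request


def probe(mirror_url, probe_path, timeout=5):
    """Time one small download from a mirror; return seconds, or None on failure."""
    start = time.monotonic()
    try:
        url = mirror_url.rstrip("/") + "/" + probe_path.lstrip("/")
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    except OSError:
        return None  # unreachable or failing mirrors drop out of the ranking
    return time.monotonic() - start


def rank_mirrors(timings):
    """Order mirrors fastest-first, discarding any that failed the probe."""
    reachable = {url: t for url, t in timings.items() if t is not None}
    return sorted(reachable, key=reachable.get)
```

Larger artifact requests would then go mostly to the head of the ranked list, with the measurement done where it is actually meaningful: on the client.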
Comment 13 Denis Roy CLA 2016-09-21 11:02:11 EDT
> This was discussed a while back, but such an approach remains less than
> ideal because with client side mirror logic, the eclipse server is only
> contacted once for the list of mirrors, and after that, everything is
> offloaded.  If each artifact download request still requires the eclipse
> server to respond, with not exactly cheep logic to figure out where to
> delegate too, it will remain a heavy target.  Not only that, only on the



That's a good point, and I feel I should add -- whatever transparent mirroring system we implement is intended to _complement_ the existing client-side mirroring, not replace it.
Comment 14 Ed Merks CLA 2016-09-21 11:06:47 EDT
(In reply to Mickael Istria from comment #11)

> Why aren't they mirrored? Wasn't it because it requires some Tycho/p2 tweaks
> to make mirroring useful?

I'm not sure if Tycho uses mirrors, but that's separate from the results of the build being mirrored and the update site containing the mirror request URL.

> If we have server-side mirror resolution, then it becomes really free to
> take advantage of mirrors and we could imagine I-Builds being mirrored then.

The server side mirror resolution won't help with making sure that mirrors actually exist.
Comment 15 Eike Stepper CLA 2016-09-21 11:20:54 EDT
(In reply to Mickael Istria from comment #11)
> > And of course if repositories aren't mirrored at all, e.g., Eclipse I-Builds
> > but lots of people want to use them for target platform resolution or
> > bleeding-edge installation, nothing on the server nor on the client will
> > help...
> 
> Why aren't they mirrored? 

https://wiki.eclipse.org/IT_Infrastructure_Doc#Use_mirror_sites.2Fsee_which_mirrors_are_mirroring_my_files.3F shows a list of exclusion filters (not sure they're up-to-date). Filters like */I-* and *integration*/ probably catch most of the I-builds.

> Wasn't it because it requires some Tycho/p2 tweaks
> to make mirroring useful?

https://wiki.eclipse.org/IT_Infrastructure_Doc#Enable_mirrors_.2F_use_mirrorsURL_for_my_p2_repo.3F explains what's needed to let a p2 artifact repository participate in mirroring. I'm not sure if there's a Tycho thingy to automate that. In Oomph we've automated that as a promotion step with a Java class: http://git.eclipse.org/c/oomph/org.eclipse.oomph.git/tree/releng/org.eclipse.oomph.releng/src/ArtifactRepositoryAdjuster.java
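The promotion-time adjustment mentioned above essentially amounts to adding a p2.mirrorsURL property to the repository's artifacts.xml. A minimal sketch of that step (the mirror-list URL below is a placeholder; the real ArtifactRepositoryAdjuster does more than this):

```python
import xml.etree.ElementTree as ET


def inject_mirrors_url(artifacts_xml_path, mirrors_url):
    """Add (or update) the p2.mirrorsURL property in a p2 artifacts.xml."""
    tree = ET.parse(artifacts_xml_path)
    repo = tree.getroot()  # the <repository> element
    props = repo.find("properties")
    existing = None
    for prop in props.findall("property"):
        if prop.get("name") == "p2.mirrorsURL":
            existing = prop
            break
    if existing is None:
        existing = ET.SubElement(props, "property", name="p2.mirrorsURL")
        # p2 keeps an explicit count of properties in the size attribute
        props.set("size", str(len(props.findall("property"))))
    existing.set("value", mirrors_url)
    tree.write(artifacts_xml_path, xml_declaration=True, encoding="UTF-8")
```

The value typically points at eclipse.org's mirror selector, e.g. something like http://www.eclipse.org/downloads/download.php?format=xml&file=/your/repo/path (the path part here is illustrative).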
Comment 16 Gunnar Wagenknecht CLA 2016-09-21 11:23:39 EDT
Tycho does not support injecting the mirror setting into a generated repository. I wonder if this should be covered by a CBI plug-in.
Comment 17 Mikaël Barbero CLA 2016-09-21 11:34:18 EDT
(In reply to Gunnar Wagenknecht from comment #16)
> Tycho does not support injecting the mirror setting into a generated
> repository. I wonder if this should be covered by a CBI plug-in.

There is bug 498360 that asks for this exact tool ;)
Comment 18 Andreas Sewe CLA 2016-09-21 11:35:05 EDT
(In reply to Gunnar Wagenknecht from comment #16)
> Tycho does not support injecting the mirror setting into a generated
> repository. I wonder if this should be covered by a CBI plug-in.

You can do this with the tycho-eclipserun-plugin, *but* injecting mirror settings should IMHO *not* be done during the build. (And David Williams agrees. ;-)

Instead, injecting these settings should be done when you move the built repository to its place on download.eclipse.org, archive.eclipse.org, or wherever.

See CBI Bug 498360 for such a tool -- which nobody has written yet, I'm afraid.
Comment 19 Eike Stepper CLA 2016-09-22 11:49:44 EDT
Ed and I have discussed this again. As far as p2 is involved there could be two types of things that are downloaded a lot: p2 metadata (e.g., content.jar and artifact.jar) and actual IU artifacts. We have the impression that downloading metadata is not the cause of the problem. But is there any data available about what files are downloaded? Are there statistics about the number of HEAD requests?

In Europe we've downloaded a 300 MB package from both Eclipse.org and the Waterloo mirror (also in Ontario). The download from the mirror was roughly ten times faster. What causes this? Is it the server load? Or the network connection speed?
Comment 20 Denis Roy CLA 2016-09-22 12:56:28 EDT
We limit our Gigabit channel to about 250 Mbps for cost purposes.
Comment 21 Ed Merks CLA 2016-09-22 23:23:39 EDT
Denis,

Are there statistics for what is keeping the download server saturated? If we have statistics, we can figure out what might be improved or changed on the client side.  E.g., are lots of people downloading large packages even though there are mirrors?   Are lots of people downloading p2 artifacts?  Which artifacts: metadata or installable units themselves?  If IUs, which ones? Are there specific update sites that are the cause of a significant portion of the traffic?  (If so we could encourage projects to mirror them).


The link from https://bugs.eclipse.org/bugs/show_bug.cgi?id=499399#c6 took less than a minute to download this morning, so that's a huge improvement over yesterday, but it's around 5:00AM here.  I'll see how this progresses over the day.  But this improvement coincides with users finally being able to install from the Eclipse IBuild:

https://bugs.eclipse.org/bugs/show_bug.cgi?id=500481#c35

We'd really really appreciate getting some statistics or insights that we might be able to use to improve the situation by doing things differently on the client side.
Comment 22 Denis Roy CLA 2016-09-23 09:32:47 EDT
Ed,

This should help provide clues:
https://dev.eclipse.org/committers/webstats/download.eclipse.org/usage_201609.php

I got it from the My Account page (https://dev.eclipse.org/site_login) under Committer Tools > Website Stats > download.eclipse.org.

Nothing really stands out in the KBytes % column, and most of those files are very small.

The analogy here is that we are shipping smarties one by one, each packaged in its own brown box with a label. It's not the payload that is filling the delivery truck, it's the packaging.
Comment 23 Denis Roy CLA 2016-09-23 09:36:54 EDT
I went looking for some outliers (Over 0.5%) and found some interesting nuggets.

7	2449305	0.53%	507111135	0.79%	/releases/neon/201606221000/content.xml.xz

183	359670	0.08%	518905607	0.81%	/recommenders/models/neon/org/eclipse/recommenders/index/0.0.0-SNAPSHOT/index-0.0.0-20160630.190154-1.zip

202	315594	0.07%	1777030338	2.78%	/recommenders/models/neon/jre/jre/1.0.0-SNAPSHOT/jre-1.0.0-20160630.185625-2-call.zip

235	252220	0.06%	247849311	0.39%	/recommenders/models/mars/org/eclipse/recommenders/index/0.0.0-SNAPSHOT/index-0.0.0-20150617.073902-1.zip

242	234427	0.05%	2771507191	4.33%	/technology/epp/logging/problems.zip

246	227211	0.05%	1120567812	1.75%	/recommenders/models/mars/jre/jre/1.0.0-SNAPSHOT/jre-1.0.0-20150615.175205-1-call.zip

Seems recommenders is probably not using mirrors.
Comment 24 Andreas Sewe CLA 2016-09-23 09:58:55 EDT
(In reply to Denis Roy from comment #23)
> I went looking for some outliers (Over 0.5%) and found some interesting
> nuggets.

> 183	359670	0.08%	518905607	0.81%
> /recommenders/models/neon/org/eclipse/recommenders/index/0.0.0-SNAPSHOT/
> index-0.0.0-20160630.190154-1.zip
> 
> 202	315594	0.07%	1777030338	2.78%
> /recommenders/models/neon/jre/jre/1.0.0-SNAPSHOT/jre-1.0.0-20160630.185625-2-
> call.zip
> 
> 235	252220	0.06%	247849311	0.39%
> /recommenders/models/mars/org/eclipse/recommenders/index/0.0.0-SNAPSHOT/
> index-0.0.0-20150617.073902-1.zip
> 
> 246	227211	0.05%	1120567812	1.75%
> /recommenders/models/mars/jre/jre/1.0.0-SNAPSHOT/jre-1.0.0-20150615.175205-1-
> call.zip
> 
> Seems recommenders is probably not using mirrors.

Yes, Code Recommenders uses Aether rather than p2 to download its models (and then caches them locally below the user's home -- see Bug 464504); hence, it cannot easily make use of the mirrors list. See Bug 427772 comment 2 for a bit more information about what's going on behind the scenes.
Comment 25 Andreas Sewe CLA 2016-09-23 10:08:51 EDT
(In reply to Denis Roy from comment #23)
> 183	359670	0.08%	518905607	0.81%
> /recommenders/models/neon/org/eclipse/recommenders/index/0.0.0-SNAPSHOT/
> index-0.0.0-20160630.190154-1.zip
> 
> 202	315594	0.07%	1777030338	2.78%
> /recommenders/models/neon/jre/jre/1.0.0-SNAPSHOT/jre-1.0.0-20160630.185625-2-
> call.zip

A bit more info on what these two files are about:

When you use an Eclipse package with Code Recommenders *installed* (currently the Java, JEE, and Committers packages) for the first time, it downloads the "index-*.zip".

If you then choose to *activate* Code Recommenders, you are very likely to download the call-recommendation model for the JRE, "jre-*.zip".

Over time, you may also download the much smaller override-recommendation models for the JRE, but for the average Java user that should be it.

If you are working on eclipse.org code, however, you are likely to also download recommendation models for SWT, JFace, and a whole lot of other org.eclipse.* bundles. But those users should be the minority.

Anyway, once you have downloaded the initial index and the JRE model, all Code Recommenders does is check a very small "maven-metadata.xml" file at regular intervals to see whether any updates to the index or models are available -- which normally is not the case, as we update them about once per simultaneous release.

Hope that sheds some light onto what those files are and how they are used.
Comment 26 Andreas Sewe CLA 2016-09-23 10:30:02 EDT
(In reply to Andreas Sewe from comment #25)
> Hope that sheds some light onto what those files are and how they are used.

There's also some good news: over time, this problem is going to get better.

Have a look at the download stats for the JRE model for Mars and Neon, respectively:

77273482 /recommenders/models/mars/jre/jre/1.0.0-SNAPSHOT/maven-metadata.xml	
 3204791 /recommenders/models/mars/jre/jre/1.0.0-SNAPSHOT/jre-1.0.0-20150615.175205-1-call.zip

 9819619 /recommenders/models/neon/jre/jre/1.0.0-SNAPSHOT/maven-metadata.xml	
  856652 /recommenders/models/neon/jre/jre/1.0.0-SNAPSHOT/jre-1.0.0-20160630.185625-2-call.zip

Overall, Mars saw 3.2 million JRE call-recommendation model downloads, with a maven-metadata.xml/.zip ratio of ~24.1. For Neon we haven't reached this level of saturation yet; the "update check/download" ratio is just ~11.5, but will grow over time as more and more users have the 5.5 MB JRE models sitting in their ~/.eclipse/org.eclipse.recommenders.models.rcp folder.
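The quoted ratios follow directly from the raw counts above; a quick check:

```python
# Raw counts from the download stats quoted above.
mars_checks, mars_downloads = 77273482, 3204791   # maven-metadata.xml hits vs. .zip hits (Mars)
neon_checks, neon_downloads = 9819619, 856652     # same for Neon

# Update checks per actual model download.
mars_ratio = mars_checks / mars_downloads
neon_ratio = neon_checks / neon_downloads

print(round(mars_ratio, 1))  # 24.1
print(round(neon_ratio, 1))  # 11.5
```

As more users have the model cached locally, the cheap metadata checks dominate the expensive zip downloads, which is why the ratio grows with a release's age.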
Comment 27 Denis Roy CLA 2016-09-23 14:42:43 EDT
Thank you for the explanation, Andreas, but to be clear: you don't need to defend the usage of these resources, and no one is being accusatory.  But I appreciate the background.

We'll be adding an extra 100 Mbps of bandwidth shortly, bringing our bandwidth cap to 350 Mbps.  That should help a lot.
Comment 28 Andreas Sewe CLA 2016-09-26 03:16:35 EDT
(In reply to Denis Roy from comment #27)
> Thank you for the explanation, Andreas, but to be clear: you don't need to
> defend the usage of these resources, and no one is being accusatory.  But I
> appreciate the background.

Don't worry, Denis. I didn't take this as accusations; I just wanted to provide some more background.

Speaking of which, can you please explain what the columns in the following mean?

(In reply to Denis Roy from comment #23)
> I went looking for some outliers (Over 0.5%) and found some interesting
> nuggets.
> 242	234427	0.05%	2771507191	4.33%	/technology/epp/logging/problems.zip
> 
> Seems recommenders is probably not using mirrors.

Unlike Code Recommenders' recommendation models, AERI should nowadays download its problem index from a mirror, retrieving it from download.php?file=... (Bug 470479). Does the above data tell you otherwise?

> We'll be adding an extra 100 Mbps of bandwidth shortly, bringing our
> bandwidth cap to 350 Mbps.  That should help a lot.

Great. Looking forward to it.
Comment 29 Martin Oberhuber CLA 2016-09-28 10:30:54 EDT
Our Hudson Jobs fail with timeouts today:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=502433
Would it make sense to link that bug to this one, since it's about "server uptime"?
Comment 30 Denis Roy CLA 2016-10-24 08:08:51 EDT
Coming up:
==========

- New 4-node download cluster to replace the current 7-year-old download servers (ETA Q4 2016)

- Revise HIPP network access/eliminate problematic use of Proxy (ETA Q4 2016)

- New database master to replace the 5-year-old unit (ETA Q1 2017)

- New vserver hosts to replace >5-year-old units (internally known as mirage) (ETA Q1 2017)

- New networking gear to replace 8-year-old units (ETA Q2/Q3 2017)
Comment 31 Jonas Helming CLA 2016-11-14 06:58:06 EST
Last week has been very bad again:
- Gerrit (https://bugs.eclipse.org/bugs/show_bug.cgi?id=507374)
- Signing service
- Bugzilla
Comment 32 Denis Roy CLA 2016-11-23 11:35:40 EST
(In reply to Denis Roy from comment #30)
> Coming up:
> ==========
> 
> - New 4-node download cluster to replace the current 7-year-old download
> servers (ETA Q4 2016)

Nodes are being racked and will be in service probably next week.

> - New database master to replace 5 year old unit (ETA Q1 2017)

Server is ordered.


> - New vserver hosts to replace >5 year units (internally known as mirage)
> (ETA Q1 2017)

Will be ordering soon.
Comment 33 Carsten Pfeiffer CLA 2017-06-19 08:38:03 EDT
It might be helpful if there were a mechanism to specify a dedicated mirror configuration for p2, so that companies can host and use their own. The distributed nature of p2 repositories makes that a little hard, though.

Our current solution is using an http-proxy which rewrites all requests going to www.eclipse.org to our own mirror.

Then we configure p2 to *disable mirrors*, so that all requests are really made to eclipse.org, so that the proxy can rewrite them.

Providing the same functionality with an easier mechanism should help:
- getting more reliable Tycho builds, Oomph setups, etc.
- reducing the load on eclipse.org
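The proxy rewriting described above boils down to a host substitution. A toy sketch of the rule such a proxy applies (the internal mirror host name is a made-up example, and a real deployment would do this inside the proxy itself, e.g. via its rewrite rules):

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts whose traffic should be served from the in-house mirror instead.
UPSTREAM_HOSTS = {"www.eclipse.org", "download.eclipse.org"}
INTERNAL_MIRROR = "mirror.example.corp"  # hypothetical internal mirror


def rewrite(url):
    """Redirect eclipse.org download requests to the internal mirror; pass others through."""
    parts = urlsplit(url)
    if parts.hostname in UPSTREAM_HOSTS:
        return urlunsplit((parts.scheme, INTERNAL_MIRROR,
                           parts.path, parts.query, parts.fragment))
    return url
```

Note that this only works because p2's own mirror selection is disabled, so every request predictably targets eclipse.org and can be intercepted.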
Comment 34 Mickael Istria CLA 2017-06-19 08:49:51 EDT
(In reply to Carsten Pfeiffer from comment #33)
> It might be helpful if there was a mechanism to specify a dedicated
> mirror-configuration for p2 so that companies can host and use their own.
> The distributedness of p2 repositories makes it a little hard, though. 
> Our current solution is using an http-proxy which rewrites all requests
> going to www.eclipse.org to our own mirror.
> Then we configure p2 to *disable mirrors*, so that all requests are really
> made to eclipse.org, so that the proxy can rewrite them.
> Providing the same functionality with an easier mechanism should help
> - getting more reliable tycho builds, oomph setups, etc.
> - reducing the load of eclipse.org

I have the impression that using HTTP proxies is actually the best way to handle local mirrors in general, and I don't think p2 could do something better without somehow duplicating the purpose of a proxy.
However, if it's about builds, Tycho has a good way to define mirrors via settings.xml. Some (like JBoss Tools) also prefer to reference their mirrors directly in the p2 repository references and .target files to reduce the dependency on the Eclipse.org infra.
Also, I don't think the Foundation infrastructure is aimed at dealing with enterprise-specific issues more than it already does: there are decent download servers with multiple good mirrors, and anyone is free to mirror the p2 repos locally and to set up proxies (as you did) for an excellent result.
If you have more details about how p2 could be improved to handle such cases, please report them to p2 rather than on this bug, which is about infrastructure.
Comment 35 Denis Roy CLA 2017-11-15 09:31:20 EST
(In reply to Denis Roy from comment #32)
> > Coming up:
> > ==========
> > 
> > - New 4-node download cluster to replace the current 7-year-old download
> > servers (ETA Q4 2016)
> 
> Nodes are being racked and will be in service probably next week.

All done and tuned.

> 
> > - New database master to replace 5 year old unit (ETA Q1 2017)
> 
> Server is ordered

All done.

> > - New vserver hosts to replace >5 year units (internally known as mirage)
> > (ETA Q1 2017)
> 
> Will be ordering soon.

Those are in production.

Jonas, as the original opener of this issue, how do you feel about the state of the Eclipse infra?


To quote you:

1. Do we really have an issue here, or is it just perception? If the latter, what can we do about it?

How is the perception now?


2. Is our infrastructure just too complicated to be maintained with the given budget?

We've been working on improving performance and stability while reducing services and complexity.


3. Is it possible to rely on external services rather than self-hosting most services (e.g. GitHub, Cloud Services, etc)

We now host build slaves and the Mac signing service on hosted solutions, and we continue to investigate hosted solutions where it makes sense to do so.



4. Can we change the policy from must-be-entirely-self-hosted-to-own-the-data to must-be-able-to-export-data-to-pull-out-of-a-service-in-case-we-need-to?

That policy already exists -- we do allow GitHub and GitHub issues, as an example.  But more work can be done there.
Comment 36 Jonas Helming CLA 2017-11-29 12:07:05 EST
Thanks for the update!

As this BR was about perception, I have talked to quite a few people who regularly use the Eclipse infrastructure. Nobody stated that the situation got worse. About 1/3 had the perception that it did not change, and 2/3 had the impression that things got better.

It was frequently mentioned that the build servers and especially the signing service are more stable. 

It was mentioned that downloads got better, but are still not good enough (especially update sites).

eclipsecon.org was perceived as still being very unreliable and slow, and as I watched this closely myself, I can confirm this one.

So in general, I think you definitely achieved improvements, and there is probably still some way to go.

Thank you a lot for all your work!
Comment 37 Dani Megert CLA 2017-12-01 10:04:57 EST
From my POV it got much better! The only thing that frequently blocks me is very slow downloads.
Comment 38 Ed Merks CLA 2017-12-01 10:35:52 EST
I agree, things generally seem much more reliable.  Just update sites (ones that aren't mirrored) are slow and can and do have timeouts, but if you're patient, it eventually works.
Comment 39 Eclipse Genie CLA 2019-11-22 02:04:33 EST
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.
Comment 40 Denis Roy CLA 2019-11-22 09:03:38 EST
(In reply to Dani Megert from comment #37)
> From my POV it got much better! The only thing that frequently blocks me is
> very slow downloads.

This is being tracked in bug 547776. I think for everything else, we can close as FIXED. Thanks for your patience.
Comment 41 Dani Megert CLA 2019-11-22 10:52:25 EST
(In reply to Denis Roy from comment #40)
> I think for everything else, we can close as FIXED. Thanks for your patience.
+1. Thanks for your efforts!
Comment 42 Jonas Helming CLA 2019-11-22 10:57:48 EST
+1 thank you for your great work!