---
layout: post
status: PUBLISHED
published: true
title: apache.org incident report for 8/28/2009
id: 2c32a2ce-df67-4c29-8300-8b46700cdb7f
date: '2009-09-02 08:56:09 -0400'
categories: infra
tags:
- downtime
- security
permalink: infra/entry/apache_org_downtime_report
---

Last week we posted about the security breach that caused us to temporarily suspend some services. All services
have now been restored. We have analyzed the events that led to the breach, and have continued to work on improving the security of our systems.

NOTE: At
no time were any Apache Software Foundation code repositories, downloads, or users put at risk by this intrusion. However, we believe that providing a detailed account
of what happened will make the internet a better place, by allowing others to learn from our mistakes.

What Happened?

Our initial running theory was correct: the server that hosted
the apachecon.com website (dv35.apachecon.com) had been compromised. The machine was running CentOS, and we
suspect the attackers may have used the recent local root exploits patched in RHSA-2009-1222 to escalate their privileges on this machine. The attackers fully compromised
this machine, including gaining root privileges, and destroyed most of
the logs, making it difficult for us to confirm the details of
everything that happened on the machine.

This machine is owned by the ApacheCon conference production company,
not by
the Apache Software Foundation. However, members of the ASF
infrastructure team had accounts on this machine, including one used to
create backups.

The attackers attempted unsuccessfully to use passwords from the compromised ApacheCon
host to log on to our production webservers. Later, using the SSH key of the backup account, they were able to access
people.apache.org (minotaur.apache.org). This was an unprivileged account, used
to create backups from the ApacheCon host.

minotaur.apache.org runs FreeBSD 7-STABLE, and acts as the staging machine for our mirror
network. It is
our primary shell account server, and provides many other services for Apache developers. None of our Subversion (version control) data is kept on this machine, and there was never any risk to any Apache source code.

Once the attackers had gained shell access, they added CGI scripts to the document root folders of
several of our websites. A regular, scheduled rsync process copied these scripts to our
production web server, eos.apache.org, where they became externally
visible. The CGI scripts were used to obtain remote shells, with information sent using HTTP POST requests.
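
One way to catch this kind of change before a scheduled sync publishes it is to compare the staging document root against a known-good manifest. The following is a minimal sketch under stated assumptions, not a description of our actual tooling: the function name `find_unexpected_executables`, the manifest format (a set of paths relative to the document root), and the suffix list are all hypothetical and chosen for illustration.

```python
import os
import stat

# Assumption: suffixes treated as CGI-like. Adjust for the real site.
CGI_SUFFIXES = (".cgi", ".pl", ".py", ".sh")


def find_unexpected_executables(docroot, known_good):
    """Return sorted relative paths under docroot that look like CGI scripts
    (CGI-like suffix or any execute bit set) and are not listed in known_good.

    known_good is a set of paths relative to docroot (hypothetical manifest).
    """
    unexpected = []
    for dirpath, _dirnames, filenames in os.walk(docroot):
        for name in filenames:
            full = os.path.join(dirpath, name)
            mode = os.lstat(full).st_mode
            if not stat.S_ISREG(mode):
                continue  # skip symlinks, sockets, and other non-regular files
            is_exec = bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
            looks_cgi = name.lower().endswith(CGI_SUFFIXES)
            rel = os.path.relpath(full, docroot)
            if (is_exec or looks_cgi) and rel not in known_good:
                unexpected.append(rel)
    return sorted(unexpected)
```

A sync job could call this first and refuse to run if the returned list is non-empty. The check only flags new or unlisted files; detecting modified copies of legitimate scripts would additionally need content hashes in the manifest.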

Our download pages are
dynamically generated, to enable us to present users with a local mirror of our software. This means that all of our domains have ExecCGI enabled, making it harder for us to protect against an attack of this nature.
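
To see how widely CGI execution is enabled, a configuration file can be scanned for `Options` directives that turn on `ExecCGI`. This is only a rough sketch: the function name `lines_enabling_execcgi` is hypothetical, and the scan looks at each line in isolation, so it does not model how Apache httpd merges or inherits `Options` across `<Directory>` sections, virtual hosts, or `.htaccess` files.

```python
import re

# Matches an Options directive and captures its arguments.
_OPTIONS_RE = re.compile(r"^\s*Options\s+(.*)$", re.IGNORECASE)

# Arguments that switch CGI execution on. "All" includes ExecCGI.
_ENABLING = {"execcgi", "+execcgi", "all"}


def lines_enabling_execcgi(config_text):
    """Return 1-based line numbers of Options directives that enable ExecCGI."""
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # comment line
        match = _OPTIONS_RE.match(line)
        if not match:
            continue
        tokens = match.group(1).split()
        if any(token.lower() in _ENABLING for token in tokens):
            hits.append(lineno)
    return hits
```

The output is a starting point for manual review, not a verdict: a directive reported here may be overridden elsewhere in the configuration.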

After discovering the CGI scripts, the infrastructure team decided to shut down
any servers that could potentially have been affected. This included people.apache.org, and both the EU
and US website servers. All website traffic was redirected to a known-good
server, and a temporary security message was put in place to let people
know we were aware of an issue.

One by one, we brought the potentially affected servers up in single-user mode, using our out-of-band access. It quickly became clear that aurora.apache.org, the EU website server, had not been affected. Although the CGI scripts had been rsync'd to that machine, they had never been run. This machine was not included in the DNS rotation at the time of the attack.

aurora.apache.org runs Solaris 10, and we were
able to restore the box to a known-good configuration by cloning
and promoting a ZFS snapshot from a day before the CGI scripts were synced
over. Doing so enabled us to bring the EU server back online, and to rapidly restore our main websites. Thereafter, we continued to analyze the cause of the breach, the method of access, and which, if any, other machines had been compromised.
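
The clone-and-promote step maps onto two standard ZFS subcommands, `zfs clone` and `zfs promote`. The sketch below only builds the command lines and does not run them. The function name and the dataset, snapshot, and clone names in the usage note are hypothetical, and the exact commands used during the incident are not recorded here.

```python
def zfs_rollback_commands(dataset, snapshot, clone_name):
    """Build argv lists that clone dataset@snapshot and then promote the clone.

    All three names are supplied by the caller. The clone must live in the
    same pool as the source dataset.
    """
    for value in (dataset, snapshot, clone_name):
        # Reject empty names, option-like names, and stray '@' separators.
        if not value or value.startswith("-") or "@" in value:
            raise ValueError(f"invalid ZFS name component: {value!r}")
    source = f"{dataset}@{snapshot}"
    return [
        ["zfs", "clone", source, clone_name],  # writable copy of the snapshot
        ["zfs", "promote", clone_name],        # clone no longer depends on its origin
    ]
```

For example, `zfs_rollback_commands("rpool/www", "known-good", "rpool/www-restored")` returns the two commands in order; each list could be passed to `subprocess.run(cmd, check=True)` by a privileged user. Renaming datasets and adjusting mountpoints afterwards is site-specific and is not shown.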

Shortly after bringing up
aurora.apache.org we determined that the most likely route of the breach was
the backup routine from dv35.apachecon.com. We grabbed all the
available logs from dv35.apachecon.com, and promptly shut it down.

Analysis continued on minotaur.apache.org and eos.apache.org (our US
server), until we were confident that all remnants of the attackers had been removed. As each server was declared clean, it was brought back online.

What worked?

What didn't work?

What changes are we making now?

As a result of
this intrusion we are making several changes, to help further secure our
infrastructure from such issues in the future. These changes include the following: