If you support any web site long enough you will suffer a break-in. If you support lots of web sites you will suffer them more often than you’ll want to admit in public. A few weeks ago my number came up again in the attack lottery when we discovered a client’s web site was being used as a proxy and redirect to a fake shoe site.
It wasn’t the first time I’d suffered a break-in, and unfortunately I don’t expect it to be the last. My last experience with a major break-in was shortly after Drupalgeddon (I patched all the clients I was supporting before they were breached but had to clean up sites that weren’t patched by other vendors), and the attackers had learned a few new tricks in the meantime.
If you are responding to a break-in on a Drupal site, there are directions on drupal.org to help guide you through an attack response, but I thought it might be helpful to talk through a version of what that response can look like in practice. I think it’s also useful for us all to admit our weaknesses from time to time, to make sure we’re making new mistakes instead of repeating old ones.
Overview
At the outset I’m going to admit we never found the initial source of the attack; what we did find were the tools they placed after the break-in. The most likely cause was poor server patching practices by the client’s host, but there were also some Drupal security patches that had been slow to get installed. During the attack I worked with members of the Drupal Security team (particularly Greg Knaddison, who generously provided feedback on this article as well – of course any remaining mistakes are mine), who were helpful in giving me suggestions and were clearly interested in helping us make sure we resolved the problem.
The site was being used as part of a scam advertising network. The attacker was leveraging the reputation of the site to create records in search engine indexes that redirected to a fake shoe sales site. There were also a number of tools placed on the server that gave them full access to the Drupal database and the ability to run arbitrary PHP scripts. And it was clear by the end that they had placed additional backdoors we never found – they may have had full control over the OS as well.
How we found out
Google told us.
We got an alert from Google reporting spam content on the site. At first we couldn’t find the content they were talking about, which unfortunately slowed our response, because the content was only being served to search engines. The junior developer who was initially assigned to review the message from Google eventually figured out how to find the listing on Google (a Google site search for Nikes and some hash codes the attacker was using), but couldn’t figure out how it got there, and escalated the task to me.
Once I saw what she’d figured out, my stomach sank. At first I was still hoping there might be some other explanation, or some simple matter of a single user account getting exploited, but that seemed unlikely (since we couldn’t find the content on the server) and I quickly knew it was going to be a mess.
Initial Response
The first thing I did was make sure we had a copy of the exploited site: code, files, and database. I would have rolled the site back to a recent backup, but our five-day rolling database snapshots did not reach back to before the attack began. We spun up new virtual machines for me and another senior developer to start reviewing copies in environments isolated from other work.
Since the URLs we had for testing followed a fairly unique pattern, we started to Google them – and got lots of hits. As soon as we knew the problem was larger than our site, we opened an issue with the Drupal security team and started to feed them all the information we had gathered. While their practice is not to get involved in resolving attacks directly (their role is to ensure the security of Drupal core and contributed modules), they were supportive and helpful in suggesting places to look for problems and resolution strategies.
Attacks we found
By the time I was alerted to the problem there were already several malicious tools installed, some of which I’d seen versions of before and some of which were new to me – all designed to be hidden from sight through simple but effective obfuscation. Over the course of the next couple of days I found several backdoors manually, wrote tools to help me find more, and played entirely too much whack-a-mole (more on that in a bit).
There were two main categories of attack I was chasing: PHP scripts scattered around the public files directory, and records added to Drupal’s database tables.
Database table exploits
If you dealt with sites in the aftermath of Drupalgeddon, or other hacked Drupal sites, you have probably seen what happens when an attacker inserts PHP into carefully targeted parts of a Drupal database. In the ones I’d seen before, attackers replaced the callback functions in Drupal’s menu_router table with PHP of their own. In this case the attacker used the Block module’s ability to use PHP to control block placement, giving themselves a way to execute arbitrary PHP by sending a POST request to the server. They leveraged the fact that the main system block is always available and therefore is a reliable place to insert a backdoor. By posting a form with a specific form element, they were able to execute arbitrary PHP and use that to place additional malicious code.
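If you are checking a Drupal 7 site for this kind of backdoor, the block table is a good place to start: any block that uses PHP-based visibility, or that has PHP in its visibility rules at all, deserves a close look. A rough sketch of that kind of check – it assumes a stock Drupal 7 schema and the PyMySQL driver, and the connection details are placeholders:

```python
# Flag Drupal 7 blocks that rely on PHP-based visibility rules, or whose
# visibility rules contain PHP at all -- the hiding place used in this attack.
# Assumes a stock Drupal 7 schema and PyMySQL; credentials are placeholders.
import pymysql

conn = pymysql.connect(host="localhost", user="drupal", password="secret", database="drupal")
try:
    with conn.cursor() as cursor:
        # visibility = 2 is Drupal 7's "show this block if the following PHP code returns TRUE"
        cursor.execute(
            "SELECT module, delta, theme, pages FROM block "
            "WHERE visibility = 2 OR pages LIKE %s",
            ("%<?php%",),
        )
        for module, delta, theme, pages in cursor.fetchall():
            print(f"Suspicious block {module}/{delta} in theme {theme}:")
            print(pages)
finally:
    conn.close()
```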
The attacker also leveraged Drupal’s system table to get more complex attack code loaded. They created a record for a file to be loaded as a module and then uploaded that file to the site’s files directory where they were guaranteed Drupal had write access.
filename: sites/default/files/styles/medium/public/57h3d21.jpg
name: overly
type: module
owner:
status: 1
bootstrap: 0
schema_version: 0
weight: 0
info: a:11:{s:4:"name";s:6:"overly";s:11:"description";s:58:"Displays the Drupal administration interface in an overly.";s:7:"package";s:4:"Core";s:7:"version";s:4:"7.32";s:4:"core";s:3:"7.x";s:7:"project";s:6:"drupal";s:9:"datestamp";s:10:"1413387510";s:12:"dependencies";a:0:{}s:3:"php";s:5:"5.2.4";s:5:"files";a:0:{}s:9:"bootstrap";i:0;}
This was the script doing the redirects and filtering traffic so that the spam pages only appeared to search engines. Usually these records have filenames that are .module, .php, or .inc files, but in this case it was a .jpg file named similarly to actual files on the site to make it hard to spot.
The content of that file was a PHP script, not an image. The script did several things, and was the main tool the attacker was actively using during the time we were trying to stop them. It served as a simple proxy, presenting content to the search engines while redirecting everyone else who hit those same pages to the scam site. It also included code to send the contents of user login forms to the attacker, and a backup backdoor in case some of their others were lost.
We actually had to remove this particular attack more than once (it always used the misspelled “overly” module), and each time it came back with a new file, using a different but similar disguise to make the code blend in with legitimate files.
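Records like this are also fairly easy to surface with a query: legitimate rows in Drupal 7’s system table point at .module, .info, .engine, or .profile files under the code directories, never at anything in the public files directory. A rough sketch of that kind of check, again assuming a stock Drupal 7 schema and PyMySQL, with placeholder credentials and paths:

```python
# Flag system table records that load code from unexpected places, like the
# fake "overly" module above. Legitimate rows point at .module, .info, .engine,
# or .profile files under Drupal's code directories, never under the public
# files directory. Credentials and the files path are placeholders.
import pymysql

ALLOWED_EXTENSIONS = (".module", ".info", ".engine", ".profile")
PUBLIC_FILES_PATH = "sites/default/files/"  # adjust for multi-site installs

conn = pymysql.connect(host="localhost", user="drupal", password="secret", database="drupal")
try:
    with conn.cursor() as cursor:
        cursor.execute("SELECT filename, name, type, status FROM system")
        for filename, name, type_, status in cursor.fetchall():
            if filename.startswith(PUBLIC_FILES_PATH) or not filename.endswith(ALLOWED_EXTENSIONS):
                print(f"Suspicious system record: {name} ({type_}, status={status}) -> {filename}")
finally:
    conn.close()
```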
.htaccess files in public files
The other trick that was new to me (a more aggressive stance by Drupal core on this approach is being discussed) was to take advantage of Apache’s .htaccess handling to re-enable PHP execution within the public files directory. Drupal’s default .htaccess file disables PHP at the root of the public files directory and, in theory, in all subdirectories, but a malicious .htaccess file placed deeper in the tree can simply undo that (unless you block overrides in Apache’s main configuration – which in my opinion defeats the purpose of using .htaccess in the first place).
The attacker used this technique to place a number of basic PHP-based exploits on the server and make them runnable. The tools themselves were not Drupal-specific, and the .htaccess trick would likely work just as well on a number of other PHP-based CMS platforms.
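Spotting the trick mostly comes down to knowing that Drupal writes a single .htaccess file at the top of the public files tree; any .htaccess file deeper in that tree deserves scrutiny. A quick sketch of that kind of check (the files path is a placeholder):

```python
# List every .htaccess file below the public files directory. Drupal writes one
# at the top of that tree; anything deeper can be used to re-enable PHP
# execution and deserves a close look. The path is a placeholder.
import os

FILES_ROOT = "sites/default/files"

for dirpath, dirnames, filenames in os.walk(FILES_ROOT):
    if ".htaccess" in filenames:
        label = "expected" if dirpath == FILES_ROOT else "SUSPICIOUS"
        print(f"{label}: {os.path.join(dirpath, '.htaccess')}")
```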
Since the files directory gets deep and complicated, there is no reasonable way to scan the whole thing by hand – particularly since several of the files used inaccurate file extensions (like .jpg, or no extension at all) and file names meant to blend into the background. So in addition to checking for any .htaccess files below the files directory root, I wrote a simple Python script to scan a directory for anything that includes the string <?php.
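A minimal sketch of that kind of scanner – the path handling and reporting here are illustrative rather than the exact script – looks something like this:

```python
# Walk a directory tree and report any file containing the string "<?php",
# regardless of its extension -- the attacker hid scripts behind .jpg and
# extension-less names. Files are read in binary mode so unusual encodings
# don't stop the scan.
import os
import sys

def scan(root):
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "rb") as handle:
                    if b"<?php" in handle.read():
                        print(path)
            except OSError as error:
                print(f"Could not read {path}: {error}", file=sys.stderr)

if __name__ == "__main__":
    scan(sys.argv[1] if len(sys.argv) > 1 else ".")
```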
How we fixed it
We immediately made sure all code on the site was up to date, and I removed every exploit I could find. And for a couple of days I played whack-a-mole with the attacker. Every day I would remove a series of exploits and disable their ability to redirect users to their scam, and every night they would break back in through a backdoor I’d failed to find.
The final solution was to replace the server, deploy a version of the code known to be good, and deploy copies of the database and files that had been scanned for any PHP in places it shouldn’t be – which involved a combination of the scanner above and hand checking every place the database stores PHP, not a fast process.
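For the database side, a useful first pass is to flag every text-like column that contains the string <?php and then review those rows by hand – many hits will be innocent (caches, serialized data), but it narrows the search considerably. A rough sketch, again assuming PyMySQL and placeholder credentials:

```python
# First pass at finding PHP stored in the database: count the rows in every
# text-like column that contain "<?php" so those tables can be reviewed by
# hand. Assumes PyMySQL; credentials and database name are placeholders.
import pymysql

DATABASE = "drupal"

conn = pymysql.connect(host="localhost", user="drupal", password="secret", database=DATABASE)
try:
    with conn.cursor() as cursor:
        cursor.execute(
            "SELECT table_name, column_name FROM information_schema.columns "
            "WHERE table_schema = %s AND data_type IN "
            "('char', 'varchar', 'tinytext', 'text', 'mediumtext', 'longtext', 'blob', 'longblob')",
            (DATABASE,),
        )
        for table, column in cursor.fetchall():
            cursor.execute(
                f"SELECT COUNT(*) FROM `{table}` WHERE `{column}` LIKE %s",
                ("%<?php%",),
            )
            count = cursor.fetchone()[0]
            if count:
                print(f"{table}.{column}: {count} row(s) containing <?php")
finally:
    conn.close()
```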
What we will do better next time
Part of responding to any security event of this nature needs to be a full review of your internal processes and controls, to make sure you reduce your risk and improve your response the next time something occurs (because unfortunately there will be a next time for all of us).
One of our first areas of improvement is a shared understanding that it’s more important to resolve the attack than to determine the cause. This goes against the question that developers are constantly asked during and after an attack: “How did this happen?” While you need to know something about what happened, in the end it’s more important to make it stop. I still don’t know what started the attack, but I do know I stopped it by blocking every attack vector I could think of and replacing every part of the stack with a version known to be fully up-to-date. It would have been faster and cheaper if I’d just started there: yes, there is a risk I would have missed some of the code in the database if I hadn’t taken the time to review what I was finding, but frankly I doubt that risk is as high as the risk that new exploits will appear while I’m working to understand the previous one.
Beyond that basic shift in approach we developed a three-part list of improvements:
- Things we needed to improve right away.
- Things we needed to improve soon.
- Things that should be part of ongoing improvement.
The highest priority items were coming up with a better internal process for initial response, and making sure we deploy all security updates in a timely but still careful manner, including monitoring our hosting partners to ensure servers stay up to date as well. These are basics that are easy to let slip over time – particularly monitoring that your partners are doing their job correctly.
The second category of fixes is filled with workflow and procedure improvements. We were already in the process of improving our code handling (migrating from SVN to Git, better production monitoring, more internal code review, etc.), and we accelerated our plans to complete that work. This category also includes a complete review of our existing backup procedures to make sure they provide the level of coverage our clients need.
The final category of longer-term adjustments includes ensuring all developers are given (and expected to take) professional development opportunities around security best practices, doing more internal sharing about emerging ideas and trends, and encouraging more community engagement so we are better able to leverage community resources in a crisis.