This Week’s Drupal Fire Drill

This week many in the Drupal community lost a lot of sleep Tuesday night because the security team treated us to a warning about major security updates due out on Wednesday. Fortunately for many it wasn’t a crisis in the end, but it gave us all a chance to practice for the worst. Basically, it was like a fire drill in a elementary school: we got to prepare like there was a disaster, but since they wasn’t one we don’t really know how it would have gone if there was actually a fire. We haven’t had a stop-drop-and-roll type of emergency in a while, so it was a good refresher on how to handle a crisis.

For those who don’t know what I’m talking about here’s a quick review. At Cyberwoven, like many Drupal shops, we follow the Drupal Security twitter feed on one of our Slack channels so we saw this mid-afternoon Tuesday:

Slack posting of tweet from the security team.

I read the PSA with images of Drupalgeddon dancing in my head:

There will be multiple releases of Drupal contributed modules on Wednesday July 13th 2016 16:00 UTC that will fix highly critical remote code execution vulnerabilities (risk scores up to 22/25).These contributed modules are used on between 1,000 and 10,000 sites. The Drupal Security Team urges you to reserve time for module updates at that time because exploits are expected to be developed within hours/days. Release announcements will appear at the standard announcement locations.

Drupal core is not affected. Not all sites will be affected. You should review the published advisories on July 13th 2016 to see if any modules you use are affected.

Oh that bold line up there wasn’t part of the original announcement. On Tuesday we didn’t have a sense of scale, were we talking about modules that everyone uses on almost every site (ctools came up more than once). It’s the one thing I wish the security team had done differently: given us that sense of scale.

I read all security postings, and make sure we take prompt steps to address them for clients as needed, but the potential here was that we’d have to update all 70+ sites in a few hours or less which is very different from your run-of-the-mill security update that often aren’t related to use cases and threat profiles for the majority of sites.

Here’s what we did next:

Tuesday

  1. Took a minute to panic, complain, and joke about pending illnesses. This is actually a useful step because it allowed me to burn off some nervous energy and then to focus on the real work.
  2. Pulled out the list of all active clients with Drupal sites, and doubled checked it for accuracy.
  3. Made sure a developer had a working repo for all sites (70+). Since we had a couple people out of the office, and some projects had been reassigned recently to different developers, this was an important step to make sure no sites fell through the cracks during a rush to update them all.
  4. Made sure we knew which were 6 sites, and which were 7 in case we are able to determine that 6 is also affected. Since I knew the announcement would likely skip D6, we needed to accept that we might have to take those sites offline for a time.
  5. Made sure leadership knew that all developers may be busy start at 16:00 UTC on Wednesday. We didn’t actually cancel anything right away, but I didn’t want anyone surprised if we were all too busy posting and testing updates to worry about things like meetings.
  6. Made sure complex projects were thought about ahead of time: sites with unusual setups or ongoing dev work that make sudden updates complex. For example we have one client that has 16 sites that all have an unusual set up, so we agreed who would handle those and made sure she was prepared.

Wednesday

  1. First thing in the morning I saw the update to the notice that gave us a sense of scale and relaxed a little, but still made sure we were fully prepped.
  2. Noon: the announcement of what modules were affected was released and a couple other developers and I immediately reviewed the releases. We relaxed once we determined none of our clients were using any of the modules listed.
  3. I reviewed code from each of the three modules to see what the change was to look for ways to improve my own code to avoid similar errors.
  4. Looked for ways to improve our response for the next time it’s not a drill.

Things would have been more exciting if we’d had to update our sites. Since we were prepared it was a matter of minutes for us to check that all our sites were secure. Each developer checked all the sites in their sandbox, and since I knew all sites were in someone’s sandbox that gave us 100% coverage without having to do lots of double checks.

I think it is too easy to look past doing the code review of modules we weren’t using but I find this kind of follow up really useful. Looking back on Drupalgeddon it’s amazing how much pain was caused by such a small error (16 characters are all that were needed to fix it). And by seeking to understand what went wrong you can look for places that you make similarly invalid assumptions.

If you read my post on making new mistakes, you also know I believe that looking for improvement is the most important detail (particularly when it turned out to be a drill, not a fire). Here’s my initial list of things to improve:

  • Have a system to automatically check every site for specific modules (this was already under development but will take a little while longer to complete).
  • Make sure at least two developers have a working sandbox for all projects at all times in case something comes out during a vacation.
  • Improve internal messaging about what to expect – template message and process.  I tossed together some disjointed thoughts for account managers. But disjointed developer thinking does not make people feel like you’re on top of things.
  • Have better tracking of outliers:  I completely missed that I had a demo site on Pantheon that did have the Coder module running.  Since it was Pantheon they alerted me to this problem and had taken steps until I could do the update myself. But it would have been bad if that site had been someplace else and/or in production.
  • Make sure everyone one knows when the actual release is coming out, and what the outcome was. Several developers were hoping to get lunch before the updates, but hadn’t done so when the announcement came out (which could have been a problem if we’d ended up busy).  And I spent the rest of the day answering one-off questions from the account team who wanted to know if the announcement had been bad news.

Ideally we’d come up with a method to automate security updates (maybe all updates), but that’s not totally straightforward.  We have to worry about required patches, non-standard setups, automated testing, and other details. There has been discussion on the Pantheon power user’s mailing list, but every shop has a slightly different workflow (like the fact that we don’t use Pantheon much at all) so we’ll need to come up with a system that accommodates our system.

When Should I Update my Drupal Site to Drupal 8?

Last year Drupal 8 finally arrived, and brought the question that comes with every new release of Drupal: when should I update? New releases of Drupal mean two things: new features and cool new tools, and the retirement of an old version. We got the power and flexibility of Symfony and Drupal 6 sites are no longer getting community support. Unlike WordPress, which has well defined upgrade paths, each version of Drupal is a new adventure in upgrade pain. The more I watch people suffer with this pain, and the more I watch them try to find a way to do upgrades that preserve their site’s fundamental structure, the more I come to the conclusion that this pain is telling us something: we’re doing it wrong. Not because Drupal’s strategy is wrong, but because keeping all your content in the same structures is usually wrong. Drupal 8 should not make it easy for you to continue to use an old strategy, it should encourage us to update old assumptions.

Here is how I encourage everyone to view their choices:

If you have a Drupal 6 (or older) site you should update right now. Drupal 6 is no longer getting security updates so you are on borrowed time. But more importantly Drupal 8 is a better tool for the current state of the web than your Drupal 6 site. Most sites running on D6 reflect an online communications strategy that’s at least 4 or 5 years old. Those sites probably aren’t responsive, aren’t prepared to support apps, don’t have the right focus on social media and user engagement, and make assumptions about user behaviors that have evolved. Skip to Drupal 8: do not migrate these sites to Drupal 7. If there is a tool that is missing from Drupal 8 that your current site uses make sure you need it before complaining (or paying to have someone port it for you). Maybe that tool hasn’t been ported because it doesn’t make sense anymore. Some things are still missing, but lots of things are being rewritten differently because we have a better platform. The community is smarter than it was 5 or 10 years ago, and the platform is better, take the time to figure out why something hasn’t been ported: is it just no one has bothered, or has something better been built instead?

If you have a Drupal 7 site you should update when your web site no longer supports your work.  This is actually the same advice I just gave, but without a few assumptions like “you need a secure site.” Many Drupal 7 sites have a lot of life left in them. A site you built today will be designed to meet the needs you have now, and the ones you foresee in the near future. Three years from now (when Drupal 7 is scheduled to lose support from the community) you will be operating on assumptions that have probably been wrong for at least two years. Every six months you should ask yourself: does my site reflect my online strategy, and is my strategy still working? If the answer is yes to both of those questions you are fine, if the answer is no to either – particularly the second – you should engage someone to help you update your strategy and rebuild your site.

I’ve been part of projects that failed in part because we tried to port a stale strategy and stale content to a fresh site. We broke the new site before it even launched. Don’t try to make Drupal 8 behave like your old site: embrace the change.