Salesforce Id Iteration Attacks

Please stop using Salesforce Ids in URLs or other user accessible parts of your solution.

Why not use Salesforce Ids on public web sites? I run into this question from time to time and figured some more public commentary would be useful.

The super short version: they are not a security or access control mechanism.

During a recent Salesforce Open Source Commons sprint a question came up about why it’s important to use a random value, instead of a Salesforce Id, in public facing use cases. We were talking about the Unsubscribe Link package, and it makes a good use case for discussion.

What is Unsubscribe Link?

Here is some quick context on Unsubscribe Link and why I’ll be using it as an example in this post. Jessie Rymph created the Unsubscribe Link package for Salesforce Labs. Community volunteers are taking over the future of that package, and those volunteers need to get up to speed on the great work she already did.

The package provides links you can embed in email messages sent by Salesforce that allow recipients to unsubscribe from future messages. These links need to allow the recipient, and just the recipient, to make this update. The package needs to update the data in Salesforce in a secure manner, based on a link provided in an insecure channel (an email message). Links need to be reliable and consistently formatted, but not guessable. The package, rightly, uses UUIDs in those links to match the link opened back to the correct Salesforce record.

The Attack on Salesforce Ids

If you have a public facing web page or service that allows for display or update of data that filters access based purely on the Salesforce Id, you are at risk. An attacker can manipulate those Ids and either steal, or mass update, your data. This is called a “bad thing”.

The technical terms here are: Insecure Direct Object Reference (IDOR) leading to a Web Parameter Tampering attack.

In the case of Unsubscribe Link, if the link used the Salesforce Id, it would be possible to unsubscribe all, or nearly all, Contacts in the org. The attacker would start with the Id in the email they got, and then start scanning up and down the Id range (see next section for how). Unsubscribe Link is protected against this attack, but I see the problem elsewhere all the time.

Here are a couple of quick examples (both of which I saw and helped fix, and which I still hear people describe recreating):

  • An org that allowed members to confirm their details by clicking a link in an email. They could then see their contact and membership information on a web page. That web page had no authentication, and the link’s only parameter was the Salesforce Id. Once you had one link, you could change Id and view other people’s data.
  • An event check-in solution that used a QR code volunteers scanned at the door. The QR code contained a link that led to a form which showed the attendee’s registration information (including name and email) and updated their Campaign Member status – again the only link parameter was the Salesforce Id. Anyone with one QR code could have pulled the complete email list of attendees with a few hours of scripting.

Salesforce’s Id Structure

To understand how an attacker can do this it is useful to understand Salesforce Ids. They often look random but are not. They are defined by a predictable, and published, pattern. If you want really gritty detail have a look at this post. But I’ll give you the short version here.

They start with a 3 character prefix that encodes the object type. Salesforce provides documentation for decoding the id prefix for all standard objects. Custom objects have a prefix that’s specific to your org.

The next three characters are an instance reference. This reference indicates the Salesforce instance that was hosting your org when the record was created. It’s highly stable, but can change from time to time.

Then there is a zero that’s reserved for future use.

Next comes an 8-digit base-62 number. This provides a huge range, but it is just an incrementer stored in a convenient fashion.

That covers the first 15 characters. Salesforce Ids come in two flavors, 15 or 18 characters. The 15 character Ids are case-sensitive base-62 strings. In most cases you will be using the 18 character flavor. The 18 character version is case-safe and adds three more characters that encode the case of the preceding 15 characters. So those last three also often look random, but are entirely calculable based on the first 15.

From an attacker’s perspective a Salesforce Id looks like this:
[7-digits I can copy and paste][a very big number][three digits I calculate]
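
As a rough illustration of how little of an Id is actually secret, here is a minimal Python sketch that splits a 15 character Id using the 3 + 3 + 1 + 8 layout described above and computes the three case-safe characters. The function names are mine, and the example Id is illustrative rather than taken from a real org.

```python
# Sketch: split a 15-character Salesforce Id and compute its case-safe suffix.
# Function names are hypothetical; the example Id is illustrative only.

# 32 symbols, one for each possible 5-bit value.
SUFFIX_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"


def case_suffix(id15: str) -> str:
    """Return the 3 characters that turn a 15-character Id into an 18-character Id."""
    if len(id15) != 15:
        raise ValueError("expected a 15-character Id")
    suffix = []
    for start in (0, 5, 10):
        chunk = id15[start:start + 5]
        # Bit i is set when the i-th character of the chunk is an uppercase letter.
        bits = sum(1 << i for i, ch in enumerate(chunk) if "A" <= ch <= "Z")
        suffix.append(SUFFIX_ALPHABET[bits])
    return "".join(suffix)


def split_id(id15: str) -> dict:
    """Split an Id using the 3 + 3 + 1 + 8 layout described in this post."""
    return {
        "object_prefix": id15[0:3],
        "instance": id15[3:6],
        "reserved": id15[6],
        "counter": id15[7:15],
    }


example = "001D000000IRt53"
print(split_id(example))
print(example + case_suffix(example))  # 001D000000IRt53IAD
```

Nothing here requires access to the org: the first seven characters are copied, and the last three are pure arithmetic on the first fifteen.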

How Do They Know What Salesforce Ids to Scan

For an attacker the only hard bit is figuring out what very big numbers to use.

These Ids are not meant to be random or secure. You should think of an 18-character Id as a base-62 number. Worse yet, an attacker only needs to scan a small portion of the space to find data. Only 8 digits are actually incrementing (still a huge range – 62^8 is about 218 trillion, which is large by even astrophysics standards).

If an attacker had to check that whole range they’d never find you. But your Ids are clustered together relatively tightly. Worse yet, when they start their scan, they already know where to look – you told them.
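
To put numbers on that, here is a small Python sketch that treats the 8-digit portion of an Id as a base-62 integer. It assumes the digits are ordered 0-9, then A-Z, then a-z; that ordering is my assumption for illustration, and the counter values are made up.

```python
# Sketch: treat the 8-character counter as a base-62 integer.
# Assumption: digit order is 0-9, A-Z, a-z. Counter values are illustrative.

BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def decode_counter(counter: str) -> int:
    """Convert a base-62 counter string to an integer."""
    value = 0
    for ch in counter:
        value = value * 62 + BASE62.index(ch)
    return value


def encode_counter(value: int, width: int = 8) -> str:
    """Convert an integer back to a fixed-width base-62 counter string."""
    if not 0 <= value < 62 ** width:
        raise ValueError("value does not fit in the requested width")
    digits = []
    for _ in range(width):
        value, remainder = divmod(value, 62)
        digits.append(BASE62[remainder])
    return "".join(reversed(digits))


print(62 ** 8)  # 218340105584896 possible counter values

known = decode_counter("000IRt53")
other = decode_counter("000IRt5A")
print(other - known)              # 7: these two Ids are only seven steps apart
print(encode_counter(known + 1))  # 000IRt54: the neighbor is simple arithmetic
```

The full range is enormous, but the distance between two records created around the same time is tiny, and that distance is the only thing that matters once one Id is known.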

But I Read This Isn’t Important In Practice!

There is commentary around suggesting that these numbers are too big to guess so they therefore protect you from attack. If those numbers were random that might be true, but they aren’t randomly assigned.

Salesforce generates Ids sequentially within a node, which is often hosting multiple customers. The node started at zero once upon a time. Every time a record was created Salesforce incremented that number. Whenever someone runs tests, it increments the number and then dumps the data (and those Ids). When someone on another org on that instance adds data, up goes the number. During bulk data loads there are multiple processes running. Each process reserves a transaction’s worth of Ids at a time, and the processes may be working on different parts of the file, which makes the Ids look out of order. If one of those transactions fails, it’ll create a gap in the sequence (typically 200 records in length).

All that means that your attacker usually cannot reliably guess the Id of a specific record, but if they scan large numbers of records they can find lots of data.

Those behaviors do spread the data out over a slightly larger area than a purely sequential Id in a traditional database. But your data all starts at some number, with every later Id above that number. The larger your org, and the more active you are, the more likely it is that your data has dominated Id generation since you joined the node.

Yeah, They Know How to Attack You

You have also done things in the course of business that clustered your Ids together in sub-ranges that help your attacker. For example when you migrated data into your org, or any other bulk load, you created a lot of records with very tightly clustered Ids. So while searching they may find large gaps between records, and long runs of perfectly sequential Ids.

If the attacker knows one Id, then they know what area to start looking around in for more. How do they get that first Id? I already told you: you gave it to them.

In my experience an attacker is going to find your vulnerable service by getting one of the links. They will quickly see that it doesn’t require authentication, and that the Salesforce Id is the primary parameter. That link has an Id in it. That means they have an Id in their hands from whatever email or other solution you used. They don’t have to start at zero and count up until they find your range. They aren’t picking a random starting point. Instead they will start with a number they know is in your range. After that they can just search up and down the number line to find a lot more.

Sure they will get a lot of misses – but who cares? Attackers don’t do this by hand, they use software that works fast and never gets bored. They also know things about the likely size of gaps, can spot common patterns, and hop around in smart ways to find likely valid Ids. Attackers spend a lot of time thinking about these problems. Always assume they know patterns that you did not know existed.

Solutions

There are two main ways to prevent these attacks:

  1. Use authentication for all web services – even when that’s hard or annoying.
  2. Use a truly random value to do the lookup and limit the attacker to one record.

If you put some form of reliable authentication in place you can make this problem largely go away. But it limits what you can do. That can work for event registration or member directories – but it doesn’t work for Unsubscribe Link or other unauthenticated use cases.

A properly generated UUID creates a random value that cannot be guessed or iterated. Unsubscribe Link leverages this approach to make it impossible to unsubscribe another person. The package assigns each Contact or Lead a UUID generated with Apex’s UUID class. The package then uses that UUID in the link included in the email. When someone clicks that link, it opens a page which searches for the UUID and updates the matching record. It doesn’t have to be a UUID. There are other ways to get truly random data, but UUIDs are a great solution in lots of use cases.
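
The pattern is easy to sketch outside of Salesforce. The Python below is not the package's code; it is a minimal stand-in that uses an in-memory dictionary in place of Contact records, with hypothetical names throughout, to show a lookup keyed only by a random token.

```python
# Sketch of the random-token lookup pattern. This is NOT the Unsubscribe Link
# package's implementation; records, field names, and functions are hypothetical.
import uuid

# Stand-in for Contact records; the keys play the role of record Ids.
contacts = {
    "contact-1": {"email": "ada@example.org", "opted_out": False},
    "contact-2": {"email": "grace@example.org", "opted_out": False},
}

# Maps each random token to the record it belongs to.
token_index = {}


def assign_tokens() -> None:
    """Give every record its own random (version 4) UUID."""
    for record_id, record in contacts.items():
        token = str(uuid.uuid4())
        record["unsubscribe_token"] = token
        token_index[token] = record_id


def unsubscribe(token: str) -> bool:
    """Opt out the single record matching the token; unknown tokens do nothing."""
    record_id = token_index.get(token)
    if record_id is None:
        return False
    contacts[record_id]["opted_out"] = True
    return True


assign_tokens()
link_token = contacts["contact-1"]["unsubscribe_token"]  # what the email link carries
print(unsubscribe(link_token))         # True: the recipient's own record is updated
print(unsubscribe(str(uuid.uuid4())))  # False: a guessed token matches nothing
```

The record Id never appears in the link, and knowing one token tells an attacker nothing about any other token.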

Depending on the setup, you can also create or use services to detect these kinds of attacks. That kind of detection is not uncommon in general, but it is not something Salesforce provides directly, nor is it common with form providers and other services.
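
One simple form of detection is to count failed lookups per client over a sliding time window, since a scan produces far more misses than normal traffic. The Python sketch below assumes you can see a client identifier and a timestamp for each failed lookup; the class name and thresholds are hypothetical and would need tuning for a real service.

```python
# Sketch: flag clients that rack up many failed lookups in a short window.
# Class name, thresholds, and the client identifier are all hypothetical.
from collections import defaultdict, deque


class MissTracker:
    def __init__(self, max_misses: int = 20, window_seconds: float = 60.0):
        self.max_misses = max_misses
        self.window_seconds = window_seconds
        self._misses = defaultdict(deque)  # client -> timestamps of recent misses

    def record_miss(self, client: str, now: float) -> bool:
        """Record a failed lookup; return True when the client should be flagged."""
        misses = self._misses[client]
        misses.append(now)
        # Drop misses that have fallen out of the window.
        while misses and now - misses[0] > self.window_seconds:
            misses.popleft()
        return len(misses) > self.max_misses


tracker = MissTracker()
# Simulate one client missing once per second for 25 seconds.
flags = [tracker.record_miss("203.0.113.7", float(t)) for t in range(25)]
print(flags.index(True))  # 20: the 21st miss is the first one flagged
```

A check like this does not fix the underlying design problem, but it can slow a scan down and tell you that one is happening.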

In the end, secure design is your responsibility.