Tuesday, September 11, 2012

Apple's UDID Leak and the LIMA Solution


A million UDIDs (Unique Device Identifiers) that identified Apple devices were leaked this week, and it was hinted that this was just part of a set of 12 million UDIDs that had been stolen.

From an Identity Management perspective, the questions that arise are:
  • Is this serious?
  • Could this have been prevented?
  • And (if anyone is perspicacious enough to ask), is there an architectural approach that will render such leaks harmless?

The answer to the first question so far appears to be no, because the UDIDs themselves are (hopefully) meaningless. What makes them sensitive is that they tend to be associated with meaningful data, and it is these associations that are privacy-sensitive. As long as the privacy-sensitive data indexed by UDIDs is not also leaked, this leakage of IDs need not be a disaster.

Being cynical, I believe that the answer to the second question is always no. To quote Dr Ian Malcolm of Jurassic Park, "Life finds a way". I don't believe any secret is perpetually or provably safe. Think WikiLeaks.

Which brings us to the third question, which hardly anyone even thinks of asking. Note that there is a big difference between asking the obvious but naive question (which was our second one), "Is there a way to prevent such leaks from taking place?" and asking the smarter one, "Is there a way to prevent (inevitable) leaks from having an impact?"

I believe there is such a way, based on my experience of designing loosely-coupled IAM systems. Loose coupling not only makes housekeeping tasks such as splitting and merging identities easier, it can also help an organisation recover relatively painlessly from events like the leakage of Apple's UDIDs. Now, I'm not an Apple fanboy and have studiously avoided being sucked into that closed ecosystem, so I'm not quite sure how exactly it operates. They may already be using a variant of the scheme I am about to describe, and if so, good for them.

LIMA (Lightweight/Low-cost/Loosely-coupled Identity Management Architecture) has the notion of multiple identifiers for an entity. There is a meaning-free internal identifier that is private to a system and not meant to be shared with third parties or systems. And then there could be multiple external identifiers, which may be either meaningful or meaning-free, that may be shared with third parties. The privacy of the internal identifier is not so much for confidentiality as it is for flexibility, as we will see. (Confidentiality is an orthogonal concern that is ensured through other means.)

Under the LIMA scheme, Apple's UDID would be classified as an external identifier because it is shared with third parties such as app developers, and with customers themselves. If Apple and its partners followed LIMA, they would each use their own internal identifiers in their systems, distinct from the UDID. They would each map these internal identifiers to the shared UDID and associate their data elements with these internal identifiers, not with the UDID itself.

UDID <===> Internal Identifier <===> All other data elements
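As a rough illustration of the mapping above, here is a minimal Python sketch of one organisation's store. Everything in it is an assumption made for illustration: the `IdentityStore` class, its method names, the sample UDID and data fields are hypothetical, and the internal identifier is simply a random UUID. It is not a description of how Apple or any partner actually stores data.

```python
import uuid


class IdentityStore:
    """Hypothetical store: external ID -> internal ID -> data elements."""

    def __init__(self):
        self.external_to_internal = {}  # e.g. UDID -> internal identifier
        self.records = {}               # internal identifier -> data elements

    def register(self, udid, **data):
        # The internal identifier is meaning-free and never leaves this system.
        internal_id = uuid.uuid4().hex
        self.external_to_internal[udid] = internal_id
        self.records[internal_id] = dict(data)
        return internal_id

    def lookup(self, udid):
        # Data is reached only via the internal identifier, never keyed by UDID.
        internal_id = self.external_to_internal[udid]
        return self.records[internal_id]


store = IdentityStore()
store.register("udid-0001", owner="alice", push_token="tok-123")
print(store.lookup("udid-0001")["owner"])
```

The point of the design is that the UDID appears in exactly one place, the mapping table, so changing it never touches the data records.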

Once a set of UDIDs was (inevitably) leaked, it could be replaced with a minimum of fuss. Anyone concerned about the leakage of their UDID could log into Apple and request another one. They would first need to be authenticated, of course. Since we are not talking about a leakage of passwords here, the authentication would still be relatively foolproof. Once the identity of the user (device) was established (i.e., the internal identifier was determined from the UDID), another UDID could be easily generated and associated with the internal identifier. The new UDID could then be shared with all authorised partners that used the old one. The old UDID would be marked "retired" and de-linked immediately from the internal identifier, and thereby from all privacy-sensitive data. Thus the leakage of UDIDs could be rendered harmless.
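The replacement step could look something like the following Python sketch. It assumes the request has already been authenticated, uses plain dictionaries as stand-ins for real tables, and generates the new identifier as a random UUID; the function name `rotate_external_id` and the sample values are hypothetical.

```python
import uuid

# Hypothetical state: one external ID mapped to one internal ID with its data.
external_to_internal = {"udid-old": "int-42"}
records = {"int-42": {"owner": "alice"}}
retired = set()


def rotate_external_id(old_udid):
    """Replace a leaked external ID (assumes the caller is authenticated)."""
    # De-link the old UDID from the internal identifier and mark it retired.
    internal_id = external_to_internal.pop(old_udid)
    retired.add(old_udid)
    # Issue a fresh external ID and attach it to the same internal identifier.
    new_udid = uuid.uuid4().hex
    external_to_internal[new_udid] = internal_id
    return new_udid


new_udid = rotate_external_id("udid-old")
print(records[external_to_internal[new_udid]]["owner"])
```

Note that `records` is never modified: the leaked value simply stops resolving to anything, while the data stays where it was.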

What if the internal identifiers used by Apple or one of its partners were themselves leaked? Well, though this appears at first glance to be much more serious, we did say that the private nature of the internal identifier was not for confidentiality but for flexibility. It's in fact even simpler to replace internal identifiers because they're not shared with any external party. Simply generate new meaning-free identifiers to replace the old ones wherever they are used within the organisation. There should be systems to do this as a matter of routine, because any dependence upon the specific value of an identifier is a source of rigidity and brittleness.
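Re-keying an internal identifier might be sketched as below, again in Python and again with assumed names and in-memory dictionaries standing in for whatever systems an organisation really has. The function `rekey_internal_id` is hypothetical; a real organisation would need to apply the same change everywhere the internal identifier is used.

```python
import uuid

# Hypothetical state: two external IDs pointing at one internal identifier.
external_to_internal = {"udid-a": "int-1", "udid-b": "int-1"}
records = {"int-1": {"owner": "alice"}}


def rekey_internal_id(old_internal):
    """Swap an internal identifier for a fresh meaning-free one."""
    new_internal = uuid.uuid4().hex
    # Move the data elements under the new key.
    records[new_internal] = records.pop(old_internal)
    # Repoint every external identifier that referenced the old key.
    for external_id, internal_id in external_to_internal.items():
        if internal_id == old_internal:
            external_to_internal[external_id] = new_internal
    return new_internal


new_internal = rekey_internal_id("int-1")
print(records[external_to_internal["udid-a"]]["owner"])
```

External parties see no change at all, since the UDIDs they hold still resolve to the same data.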

Loose coupling through two levels of identifiers provides flexibility. If the UDID is the only identifier (internal as well as external) used in the Apple ecosystem, then they're ninnies.