Thursday, April 26, 2012

An Optimistic Approach to Identity


I've been working in the Identity Management area for a few years now, and I've seen three different industries up close (banking, insurance and telecom). What strikes me about all three is that none of them has historically been customer-centric in its business approach. For decades, banks have looked at their customers through the prism of accounts, insurance companies through policies, and telecom companies through billing accounts and sometimes carriage services (broadband or mobile services). And everywhere, the holy grail is the same - "single view of customer". Identity and Access Management (IAM) is the way these organisations aim to achieve single view of customer as well as other benefits.

However, IAM initiatives at organisations in all these industries have generally floundered. Why?

I believe that IAM is simple but subtle. That's why, although it's not hard to design and deliver an IAM system, it's also treacherously easy to get it wrong.

Some of the major reasons why organisations struggle with IAM are these:

1. Rather than bite the bullet and create a top-level data entity called "customer" with its own unique identifier, organisations choose what they consider a cheaper compromise because of a misplaced belief that using a surrogate for customer (i.e., account, policy, billing account) would somehow do the job. Reality check: it doesn't, and it's more expensive in the long run.

2. Even where identifiers are created for customers, these are not carefully designed. The result is that many identifiers that are chosen have business meaning. It's quite funny at one level to see a system designed with a person's email address as their identifier, and where the major business pain point is that it's very hard to handle the situation where a customer changes their email address. (Why are we not surprised?) Quite often in such cases, there is no other way around the problem but to delete and re-create the customer record.

3. Even where organisations avoid the first two mistakes and embark on an IAM initiative to tie customer data across multiple systems to a new, unique and meaning-free customer ID, they run into logistical problems relating to the existing user base. They struggle to "marry" records across systems to the appropriate customer entity because of the sheer volume of data involved, the cost of changing existing systems, the unreliability of matching algorithms and the need to replace engines while the plane is flying, so to speak. The two problems with matching algorithms are false positives (two or more customers being assigned the same identifier) and false negatives (a customer being assigned two or more identifiers).
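
To make the second point concrete, here is a minimal Python sketch. The record layout and the names `customers_by_email`, `customers_by_id`, `accounts` and `change_email` are purely illustrative and not taken from any real system. When the email address is the key, a change of email amounts to deleting and re-creating the record, and references held elsewhere go stale. When the key is meaning-free, the email is just another attribute.

```python
import uuid

# Design 1: the email address doubles as the identifier (hypothetical layout).
customers_by_email = {"jane@old.example": {"name": "Jane"}}
accounts = {"ACC-1": "jane@old.example"}  # another system refers to the customer by key

# Changing the email means removing the record and re-creating it under a new key...
customers_by_email["jane@new.example"] = customers_by_email.pop("jane@old.example")
# ...and the reference held by the other system now points at nothing.
dangling = accounts["ACC-1"] not in customers_by_email

# Design 2: a meaning-free identifier; the email is an ordinary attribute.
jane_id = str(uuid.uuid4())
customers_by_id = {jane_id: {"name": "Jane", "email": "jane@old.example"}}
accounts_v2 = {"ACC-1": jane_id}


def change_email(customer_id: str, new_email: str) -> None:
    """Update the attribute in place; the key and every reference to it are untouched."""
    customers_by_id[customer_id]["email"] = new_email


change_email(jane_id, "jane@new.example")
still_linked = accounts_v2["ACC-1"] in customers_by_id

print(dangling, still_linked)  # True True
```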

I have some suggestions that can make life easier.

1. Create a database external to all existing systems that will maintain mappings. [Resist the temptation to migrate customer attributes from other systems to this one. This is just a mapping database, not a customer master. Use Master Data Management (MDM) principles instead to keep data in source systems in sync.]

2. Use a universally unique and meaning-free identifier for customers. Version 4 (random) UUIDs are a great scheme to use.

3. Adopt an optimistic model of "eventual consistency". That is, generate a new customer UUID corresponding to every system record, in effect assuming (in the case of a bank) that each account belongs to a different customer, then pare them down to reflect known relationships.

a) You can generate UUIDs for a system in an optimistic way because the probability of two UUIDs conflicting is infinitesimally low, even if you have hundreds of millions of customers. You can check for duplicates out of band if you're paranoid.

b) Similarly, you can optimistically generate UUIDs in a federated way (i.e., each system generates its own UUIDs corresponding to its surrogate records). The probability of conflict is so low it's worth doing this and checking for duplicates out of band.

c) You can afford to start with a system with a large number of false negatives (but no false positives) because this simply corresponds to the siloed organisation you already have, with no "single view of customer". False positives are the greater danger, and this scheme avoids them.

d) You can use the existing intelligence in your systems (i.e., the knowledge of which records belong to the same customer) to merge customer UUIDs relating to the same physical customer by eliminating all but one of them at random. Since UUIDs are meaningless, it doesn't matter which one you keep and which ones you remove.
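
Steps 2 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mapping database is reduced to an in-memory dict keyed by hypothetical `(system, record)` pairs, and the function names `assign_optimistic_ids`, `duplicate_ids` and `merge_known_relationships` are mine, not part of any product. The only library call used is the standard `uuid.uuid4()`, which produces a version 4 (random) UUID.

```python
import uuid
from collections import Counter


def assign_optimistic_ids(records):
    """Step 3: give every (system, record) pair its own fresh customer UUID."""
    return {record: str(uuid.uuid4()) for record in records}


def duplicate_ids(issued_ids):
    """Out-of-band check for 3(a)/(b): UUIDs that were issued more than once."""
    counts = Counter(issued_ids)
    return {cid for cid, n in counts.items() if n > 1}


def merge_known_relationships(mapping, known_groups):
    """Step 3(d): for each group of records known to belong to one customer,
    keep one UUID and re-point the rest to it. Which UUID survives is
    arbitrary, because the values carry no meaning."""
    for group in known_groups:
        survivor = mapping[group[0]]
        for record in group[1:]:
            retired = mapping[record]
            # Re-point every record on the retired UUID, so that merges
            # made earlier are carried along as well.
            for key in list(mapping):
                if mapping[key] == retired:
                    mapping[key] = survivor
    return mapping


records = [("savings", "ACC-1"), ("cards", "CARD-9"), ("loans", "LN-4")]
mapping = assign_optimistic_ids(records)
duplicates_before_merge = duplicate_ids(mapping.values())

# The bank already knows that ACC-1 and CARD-9 belong to the same person.
merge_known_relationships(mapping, [[("savings", "ACC-1"), ("cards", "CARD-9")]])
distinct_customers = len(set(mapping.values()))
```

The starting state has three records and three presumed customers, which is a false negative at worst. After the merge, `distinct_customers` drops to two, and no record has been attached to a customer it was not already known to belong to.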

Now you're no worse off than you were before in terms of data quality (i.e., your data is just as clean in terms of known relationships). But structurally, you're far better off because you now have a customer data entity for the first time. As your data quality improves with more reliable mappings, the siloes effectively disappear and you get to a "single view of customer" with no more changes to data structures or processes.
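
To put a rough number on the claim in 3(a) and 3(b) that conflicts are vanishingly unlikely: a version 4 UUID carries 122 random bits (the other 6 of its 128 bits are fixed by the version and variant fields). Assuming the generators draw those bits uniformly and independently, which is an assumption about the random source rather than something the scheme guarantees, the standard birthday approximation estimates the chance of at least one conflict among n identifiers. The figure of 5 × 10^8 below is only an illustrative stand-in for "hundreds of millions".

```latex
P(\text{conflict}) \;\approx\; \frac{n^2}{2 \cdot 2^{122}} \;=\; \frac{n^2}{2^{123}},
\qquad
n = 5 \times 10^{8}
\;\Rightarrow\;
P \;\approx\; \frac{2.5 \times 10^{17}}{1.06 \times 10^{37}} \;\approx\; 2.4 \times 10^{-20}.
```

The estimate depends only on the total number of identifiers issued, not on how many systems issue them, which is why federated generation is just as safe provided each system uses a sound random source.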

In the case of a telecom company, your mapping database will now consist of three parts. The first part will map customer UUIDs to billing accounts. The second part will map customer UUIDs to product holdings (mobile, broadband and other carriage services, media products, etc.). The third part will map customer UUIDs to other customer UUIDs to reflect corporate organisational structures and household relationships. With this model, the many problems that telecom companies currently face will simply melt away.

- We can see all the product holdings of a customer to determine what else to sell them. We can see this at an individual customer level as well as at the level of a household or organisational unit.
- We can sell media products even to customers who haven't purchased an underlying carriage service.
- We can group billing accounts independently of product holdings. In a household, the kids use various products but mum or dad alone may pay the bill.
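
The three-part mapping database and the queries above can be sketched as follows. Everything here is illustrative: the identifiers are shortened stand-ins for real UUIDs, the product and account names are made up, and I am assuming that a household (or organisational unit) is itself given a customer UUID, which is one possible reading of "customer UUIDs to other customer UUIDs".

```python
# Stand-ins for real version 4 UUIDs, shortened for readability (hypothetical values).
MUM, KID, HOUSEHOLD = "uuid-mum", "uuid-kid", "uuid-household"

# Part 1: customer UUID -> billing accounts that customer pays.
billing_accounts = {MUM: {"BILL-100"}}

# Part 2: customer UUID -> product holdings (carriage services, media products, ...).
product_holdings = {
    MUM: {"broadband"},
    KID: {"music-streaming"},  # a media product with no carriage service and no bill
}

# Part 3: customer UUID -> UUID of the household or organisational unit it belongs to.
# Assumption: the household is itself given a customer UUID.
member_of = {MUM: HOUSEHOLD, KID: HOUSEHOLD}


def members(group_id):
    """All customers mapped to the given household or organisational unit."""
    return {cust for cust, group in member_of.items() if group == group_id}


def holdings(group_id):
    """Every product held by any member of the group."""
    return set().union(*(product_holdings.get(c, set()) for c in members(group_id)))


def payers(group_id):
    """Members of the group who have at least one billing account."""
    return {c for c in members(group_id) if billing_accounts.get(c)}


print(sorted(holdings(HOUSEHOLD)))  # ['broadband', 'music-streaming']
print(payers(HOUSEHOLD))            # {'uuid-mum'}
```

Because billing accounts and product holdings hang off the customer independently, the household view shows both products while only one member pays the bill.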

As you can see, this kind of design isn't hard. But it requires conceptual clarity around the nature of Identity. As I said before, IAM is simple but subtle. It isn't hard to design and deliver an IAM system, but it's treacherously easy to get it wrong.

Monday, April 23, 2012

Dimensions of Decoupling


I was in a meeting at work discussing deployment strategies for various SOA components. Let me take a subset of the problem as today's topic.

The issue was how to deploy a bunch of ESB instances and App Server instances onto server boxes. One of the infrastructure guys said he preferred to have one ESB instance and one App Server instance on each box for ease of administration. Now, since we were going to run the instances on virtual servers, I suggested that we not worry about it at a SOA topology level. We would only talk about virtual servers running ESB instances and virtual servers running App Server instances. If the infrastructure guy wanted, he could always run a virtual server of each type on each physical box and would then get what he wanted. He had the ability to tune the configuration, allocating 0.5 CPUs to each virtual server, etc.

He objected to that idea, saying the performance of the virtual servers was bad if he deployed them like that. He wanted to deploy one instance of the ESB and one instance of the App Server on each virtual server.

I said, "OK, then deploy the two instances on the same virtual server, but don't assume that they're on the same virtual server."

This statement was a bit too cryptic for the others in the room, and they asked me to explain. This is how I explained it:

Assume that you have to write a deployment script that installs one instance of the ESB and one instance of the App Server on a virtual server. There are at least two ways you could do it.

Script option 1:
DEPLOY_ADDR=192.168.1.2
# Deploy ESB to $DEPLOY_ADDR
# Deploy App Server to $DEPLOY_ADDR

Script option 2:
ESB_DEPLOY_ADDR=192.168.1.2
AS_DEPLOY_ADDR=192.168.1.2
# Deploy ESB to $ESB_DEPLOY_ADDR
# Deploy App Server to $AS_DEPLOY_ADDR

In both cases, the scripts will deploy one instance of the ESB and one instance of the App Server onto a single server.

However, in the first script, the two deployment targets are assumed to be the same. In the second, they merely happen to be the same. That is the difference between tight coupling and loose coupling.

In the second script, a simple change in the value of (say) AS_DEPLOY_ADDR to 192.168.1.3 will see the two instances running on different servers. This is not possible with the first script. Changing DEPLOY_ADDR to 192.168.1.3 will move *both* instances to the new server address but will not separate the two instances.

I was therefore recommending the approach exemplified by the second script - deploy the two instances on the same server if you must, but don't hardcode the assumption that they are on the same server into your scripts.

It took a while for this concept to sink in, but the idea was finally accepted.

I guess a decade of SOA experience has sensitised me to look out for needless dependencies, but most people in IT still don't think this way. I wonder how much rigidity, and consequent operational inefficiency, exists in IT systems all over the world because people are not sensitised to the elimination of needless dependencies.