Tuesday, December 22, 2009

The Coming Overthrow of XML - Orderly Makes Further Strides


My feeling that XML is due to be dethroned grows stronger by the month.

A quick recap of recent history:

First, JSON offered a simpler data structure than XML's angle-bracketed format. But JSON still lacked rigour around data definition, so even though XML suffered the confusion of having at least three competing schema definition languages (XML Schema, RELAX NG and Schematron), the world did have a way to specify data types, formats and constraints with XML that JSON could not match.

Then, thanks to Kris Zyp, JSON Schema appeared on the scene and plugged the rigour gap. JSON Schema parsers now exist for a number of languages including Java. One of the design decisions of JSON Schema was for a schema document itself to be valid JSON, much as XML Schema is itself valid XML. Unfortunately, this meant that brevity wasn't JSON Schema's strong point because the JSON way of expressing properties is necessarily long-winded.

The third shoe has dropped now (never mind the grotesque image that conjures up of the wearer). Orderly is a new schema language developed by Lloyd Hilaiel that is far more compact than JSON Schema and yet round-trips to JSON Schema quite effectively. [There's a really cool Ajax-y screen that converts back and forth between Orderly and JSON Schema before your eyes, so you can tweak either code to see how it looks in the other representation.]
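To give a flavour of the difference (a hand-written sketch from memory, so treat the exact syntax as indicative rather than authoritative, and the property names are made up), a small Orderly definition such as

object {
  string name;
  integer age;
};

round-trips to a JSON Schema along these lines:

{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "integer" }
  }
}

Both say the same thing; Orderly just says it in far fewer characters.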

Bottom line: SOA architects can recommend the use of the simpler JSON data format instead of XML without having to worry about the lack of rigour in data definition. Data architects, designers and developers can use Orderly to design schemas without bothering about JSON Schema's cumbersome syntax. JSON parsers can work with the equivalent JSON Schema to validate a piece of JSON data without the need to understand two different syntaxes.

A great solution, and it's all come together quite nicely in time for Christmas. Thanks to Lloyd (and Kris before him) for a wonderful Christmas present to all SOA practitioners, and ultimately, everyone wrestling with XML in any capacity.

The Meaning of Open - By Google


A friend sent me this link to a piece written by Jonathan Rosenberg of Google on the meaning of the term "open". This is old hat to those of us who have already seen the light, of course ;-). [Rosenberg's calisthenics when he then tries to justify the closed bits of the Google ecosystem are quite amusing.] But to people who have not given much thought to openness and tend to follow the herd on technology (the bigger the brand name, the better), this open letter may hold many eye-opening insights (all puns intended).

I would perhaps have said what Rosenberg did in half the length, but brevity isn't a necessary quality of openness, so I'll forgive him :-). The main danger of the unnecessary length is it may just cause some of the audience to stop reading before the end, when Rosenberg delivers his most inspired paragraph:

Open will win. It will win on the Internet and will then cascade across many walks of life: The future of government is transparency. The future of commerce is information symmetry. The future of culture is freedom. The future of science and medicine is collaboration. The future of entertainment is participation. Each of these futures depends on an open Internet.

Amen to that. In fact, that's worth repeating in a more structured form:

The future of government is transparency.
The future of commerce is information symmetry.
The future of culture is freedom.
The future of science and medicine is collaboration.
The future of entertainment is participation.

I would like to analyse these in greater detail and add to/modify the list, because I'm sure this is incomplete.

For now, this is a document that is worth circulating to our brand name-dazzled colleagues. After all, Google is one of the biggest brands out there, so if Google is endorsing openness, there must be something in it ;-).

Now if only IBM would come out with an Open™ line of products, we would be willing to write a cheque...

Friday, December 11, 2009

Is Canonical Trying to Purge Ubuntu of the L-word?


I don't think much of revisionist history, and biting the hand that feeds isn't an endearing trait.

Visiting the Ubuntu site today after a while, I was unpleasantly surprised that I couldn't see the word "Linux" anywhere. After trawling the site exhaustively, I did find two or three references, and I leave it as an exercise for the reader to find more. Warning: You'll have to search really hard.

Under "What is Ubuntu?", the site says, "Ubuntu is a community developed operating system that is perfect for laptops, desktops and servers".

What's up, guys? Does it hurt a lot to use the phrase "based on Linux" somewhere in that sentence?

Canonical and Ubuntu, great as their contributions have been, would be nowhere without Linux, especially the Debian distribution. So why not acknowledge that debt? Why try to pass Ubuntu off to newbies as a completely original operating system with no ties to Linux?

On the same page, right at the bottom, there's a section titled "What does Ubuntu mean?" and it goes on to explain, "Ubuntu is an African word meaning 'Humanity to others', or 'I am what I am because of who we all are'."

How apt. Dear Canonical, why not show some Ubuntu (humanity to others) and acknowledge that you are who you are because of what Linux is?

Thursday, December 03, 2009

Advice To Organisations Embarking On SOA Today


I have been involved with SOA in various roles over the last two to three years, and my thinking has evolved a fair bit over this period. If I were asked to advise an organisation embarking on a major SOA initiative today, I would probably say this to them:

1. The End Goal: Remember that SOA is not about integration but about inherent interoperability. Think health, not medication. SOA is about raising the capability of your systems such that they can easily and inexpensively integrate with others, not about introducing a new, slick technology that will connect your systems together more easily. Simplifying the components of your enterprise and making them easy to understand and connect to will give you SOA. So keep simplicity at the back of your mind all the time, and don't confuse it with expediency, which is the path of least resistance. Simplification could take some effort.

2. Domain Models: Don't waste too much time searching for "the" canonical data model. Most off-the-shelf ones are too high-level and abstract to be useful. And building your own comprehensive dictionary is wasteful and time-consuming. Instead, identify logical owners of different elements of data and let them own the data dictionary for those elements. All services exposed out of these logical domains should use these definitions and it is the responsibility of service consumers from other domains to understand them. Processes that combine services crossing such domains should perform their own mapping between similar data elements. It's necessarily messy and plenty of out-of-band communication will be required at design time. After all, even similarly-named elements may suffer from subtle interpretation errors, so manual discussion and clarification will always be part of a service or process design.

This is not as bad as it sounds because only a subset of data elements managed by a domain is exposed through its service interface, and it's only these that may need translation to the context of their consumers. Don't look to do away with this manual effort through a single canonical data model. That's a wild goose chase, so don't even start.
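As a trivial illustration of the kind of mapping I mean (all the names here are invented, and the language is incidental - the point is only that the translation lives inside the cross-domain process, not in some enterprise-wide canonical model):

// The Sales domain exposes a customer in its own vocabulary
var salesCustomer = { custId: "C-1001", custName: "Jane Citizen", postcode: "2000" };

// The cross-domain process translates it into the Billing domain's vocabulary
function toBillingAccountHolder(c) {
    return {
        accountHolderRef: c.custId,      // same value, different element name
        displayName: c.custName,
        billingPostcode: c.postcode
    };
}

var accountHolder = toBillingAccountHolder(salesCustomer);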

3. Infrastructure and Connectivity: Try and avoid using message queues unless you're looking at low latency situations like a trading floor, or if there's simply no other way to interface to a legacy system. The queuing paradigm introduces various coordination issues into application design, and implementing message queues requires establishing more complex patterns to solve these gratuitous problems. [I have a larger, philosophical argument about the need to innovate an application protocol on top of an asynchronous, peer-to-peer transport, but let's not confuse the current set of recommendations with that idea.]

In today's world, HTTP-based communication patterns backed up by databases will often do the trick more simply than expensive message queues. Look beyond the apparent need for reliable message delivery. Often, an idempotent operation will suffice to meet the real requirement, and this is quite a standard pattern to implement. Often, queues are used in synchronous (blocking) patterns anyway (to avoid the coordination problems I talked about), so nothing is being gained in an architectural sense by the use of queues. And even asynchronous communications, where required, can be implemented in standard ways over HTTP, so HTTP is quite a universal protocol to use as the logical infrastructural element for your SOA.
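As a minimal sketch of what I mean by an idempotent operation standing in for guaranteed delivery (the host, URL and identifiers are made up): let the client generate the unique identifier and PUT the resource into place.

PUT /payments/7d9f2c1e-55a3-4a0c-9a1f-0b8e6c2d4f10 HTTP/1.1
Host: api.mycompany.com
Content-Type: application/json

{ "from": "ACC-123", "to": "ACC-456", "amount": 100.00 }

The first attempt returns 201 Created. If the response is lost and the client resends the identical request, the server simply returns 200 OK for the resource it already holds - no duplicate payment, and no message queue required.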

ESBs, Service Directories and other "governance" components are often only required to manage the complexity that they themselves introduce. It's amazing what you can achieve with a farm of simple web servers and a database, and still keep things simple and understandable.

4. Service Enablement: Try and avoid the entire SOAP/WS-* stack of technologies if you can. There is a significant complexity overhead associated with this set of technologies, and you will need an expensive toolset to (partially) simplify its use. Look seriously at REST instead. Even though REST advocates don't make the case strongly enough (and sometimes see SOA as an antithetical philosophy), REST is in fact a valid way to do SOA and can usually help to deliver solutions at much lower cost and complexity. The hard part about doing REST is finding good people who can think that way. REST is subtly different from the SOAP/WS-* approach, even though they may just look like different kinds of plumbing to move XML documents around (and I confess that's the way I initially sell REST to corporate skeptics brought up on a diet of vendor-provided Web Services propaganda).

5. Data Contract: Consider alternatives to XML for the data contract. Though this sounds like heresy, XML is heavyweight and cumbersome, and XML manipulation tools in high-level languages (with the possible exception of E4X in JavaScript) are clumsy to use and suffer major impedance mismatches. You will spend more time wrestling with XML than on the service itself. Although many in the web world will immediately recommend JSON, raw JSON is not sufficient to ensure data integrity, because it has hitherto lacked a strong schema definition capability. Maintain a watching brief on the JSON Schema proposal, submitted for approval as an IETF standard. Already, there are JSON Schema libraries in many high-level languages such as Java. It should be possible to define data contracts with as much rigour as with XML, but at a much lower level of complexity. A newer and more compact JSON Schema representation called Orderly is also maturing, which makes this approach simple as well as easy.

So instead of going down the XML rabbit-hole, start with JSON anyway, and incorporate JSON Schema/Orderly as it matures. You will find this works especially well in combination with REST. A quick Proof-of-Concept may convince the skeptics in your organisation (although the opposite result may also occur, with many going away convinced by the speed of this approach that it's either simplistic or too good to be true!)

6. Web Service Implementation: If you're trapped by circumstances into an XML-and-SOAP/WS-* approach, look at the WSO2 suite of commercially-supported Open Source products. Especially look at the WSO2 Mashup Server. Don't be fooled by the name. It's more than just a mashup server. It's a service orchestration engine that (curiously) uses server-side JavaScript as its programming language. The major advantage of JavaScript is the ability to use the E4X library to perform extremely straightforward XML manipulation. Once you use E4X, you will never go back to JAXB or any other XML-processing library. WSO2 Mashup Server allows SOAP or REST services to be consumed, combined and orchestrated, and in turn exposes SOAP or REST services. It's a good way to hedge your bets if you're only half-convinced about REST. The WSO2 suite is also much less expensive than its proprietary rivals, although the real expense is in the heavyweight approach that it unfortunately shares with them.
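For anyone who hasn't seen E4X, here is the kind of thing I mean (a toy snippet with made-up element names; in E4X, XML is a native type and navigation reads like property access):

var order = <order>
              <item sku="W1" qty="2">Widget</item>
            </order>;

var qty = Number(order.item.@qty);              // read an attribute directly
order.item.@qty = qty + 1;                      // update it in place
order.appendChild(<note>Ship urgently</note>);  // add a child element
print(order.toXMLString());                     // serialise back to XML

Compare that with the ceremony of JAXB bindings or DOM traversal and you'll see why I'm a convert.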

7. The Paradox: SOA is really all about simplicity, but it's hard to find SOA architects who seek to simplify systems. Conventional SOA practice seems to be about making integration complex through heavyweight approaches, then introducing tools to manage that complexity, tools that require specialist skills to use properly. If done the conventional way as most SOA consultants seem to agree, your SOA initiative will only leave you with additional complexity to manage.

Of course, if you're politically inclined, you will bask in the prestige of a hefty budget and a large team, and can declare victory anyway on the basis of the number of services and processes you have delivered. But if you want to be really successful at delivering SOA (i.e., making your business more agile and able to operate on a sustainably lower cost basis) while keeping your burn rate low along that journey, you would do well to look at boring, unimpressive and even anticlimactic approaches and technologies such as the ones I've listed above. Give the big vendors a wide berth. You don't need to buy technology (beyond the web servers and databases you already have). You certainly don't need to buy complex technology, which is what the vendors all want to sell you.

And don't let the lack of grandeur in this approach worry you. Complexity impresses the novice, but results are what ultimately impress all.

Postscript: Vendor REST is coming. Beware.

Monday, November 16, 2009

Spring Roo Makes a Quiet Debut

This has always been the Holy Grail (pun definitely intended!): To design an application using just object-orientation as the paradigm, and to have persistence, a service interface and a sensible user interface all generated for you automatically. In other words, just define the core behaviour of the application's objects, and you have a working web application with non-visual service interfaces to boot. This is the Domain-Driven Design dream (DDDD?).

Spring Roo has been talked about for a while, but Grails beat it to market. However, it's better late than never for the product from the SpringSource stables, and Release Candidate 3 is now available.

For a teaser on what Roo can do, see this video. It's a third-party video based on RC2 (and you'll see some syntax differences when you start to play with RC3), but the spirit is the same.

Exciting stuff!

Update 17/11/2009: I sent the video link to a friend and colleague for his feedback. He's a senior architect, and based on his experience, he believes Roo needs to provide an Eclipse-based graphical interface rather than (or in addition to?) a command-line interface. With that, he believes it's a winner.

Wednesday, November 11, 2009

Cisco Buys Jabber - What Does It Mean?

Cisco has bought Jabber (another of those purchases of Open Source companies that make no sense to me - what are they really buying when the IP is public?)

What it does indicate to me is that the Presence and Instant Messaging protocol pioneered by Jabber (XMPP) is now becoming a more widely adopted standard. GoogleTalk and Avaya have been using XMPP for a while now, and Cisco has just joined the bandwagon.

My idea of a Unified Communications architecture is something that leverages SIP/SIMPLE, XMPP, HTTP and the open email protocols, rather than a vendor-proprietary stack. Microsoft and IBM have interesting stories in the UC space, but I would argue that this would be precisely the wrong time for an organisation to go with a proprietary suite of products. There's no such thing as a free pair of handcuffs.

I had drawn up this architecture diagram on UC some time ago but didn't publicise it because it seemed a bit theoretical at the time. But now it seems to be more relevant with the strengthening of XMPP and SIP/SIMPLE.

Thursday, November 05, 2009

FST Media Annual Technology and Innovation Seminar Day One (05/11/2009)

I attended FST Media's 4th annual seminar on technology and innovation themed "The Future of Banking and Financial Services" at the Hilton in Sydney.

Some quick impressions:

Don Koch, CEO of ING Direct, delivered the opening keynote and spoke about what customers will want next (answer: who knows?). (Incidentally, every single talk was billed as a "keynote", which leads one to question if FST Media knows what a keynote really is.) He made a number of points but none stuck in my head as anything particularly insightful.

Rocky Scopelliti, General Manager, Financial Services, Industry Development, Telstra Enterprise & Government, spoke next about Gen Y and made some interesting points. Essentially, he was reporting on research by noted Australian sociologist Hugh Mackay. Although Gen Y-ers have a reputation for being difficult to please and notoriously disloyal, the research suggests that Gen Y-ers are in fact influenced by their parents, the Baby Boomers, who in turn grew up influenced by two opposing forces - on the one hand, abundant and growing prosperity that promised a better tomorrow, and on the other, the threat of the Cold War and the possibility that there might be no tomorrow. The unique solution to this was the demand for instant gratification - enjoy life like there's no tomorrow. So Gen Y is not really demanding and disloyal; they're just keeping their options open for as long as possible.

I forget whether Koch or Scopelliti made this point, but it was interesting that nowadays, all you need to establish a person's identity is their date of birth and mobile phone number. People tend not to change their mobile numbers, even if they switch providers. More on this later.

David Boyle, CIO, International Financial Services, Commonwealth Bank next spoke about cloud computing at CBA. The interesting aspect of his talk was his call for fellow customers to get together with him and his team to define some basic standards for cloud computing, because he's afraid of the same thing that I am, i.e., that some vendor-proprietary technology will become the de facto standard for cloud computing which will make it unnecessarily pricey.

There were a few eminently forgettable panel discussions (with the only exception being the one on which Dhiren Kulkarni, joint CIO of St George Bank, was being needlessly competitive and obnoxious towards his fellow panelists. He earned a rejoinder at one stage from Pravir Vohra, CTO of ICICI Bank, which was roundly applauded.)

The next talk was billed as an "International Keynote" and was delivered by the aforementioned Pravir Vohra, who provided a few interesting insights into the growth of ICICI Bank. It's been almost 15 years since I left India, and I didn't know ICICI Bank had become the second largest Indian Bank. I guess it would still be a distant second to State Bank of India, though. SBI has more branches than some banks have customers :-).

Vohra would be the envy of CTOs everywhere, because he says he has 25% of the technology budget to play with, i.e., he's free to invest in various technologies and initiatives. If they work out, he "transfer prices" the benefits to the business units that he can serve with them. I have always felt that a similar system is required for Enterprise Utilities. Neither the "first project pays" model nor a Big Bang enterprise rollout in a single financial year is the right one. We need an accounting treatment for Utilities akin to R&D, with carryovers possible across financial years.

I was very pleased to hear that ICICI Bank only uses OpenOffice, not Microsoft Office. Vohra estimates that this choice saves his bank more than $15 million a year, even more than the projected savings from cloud computing! Who said the Third World lags behind the First? Sometimes they lead the way.

One thing Vohra said that didn't seem quite right was his contrast of consumer behaviour in India with that in Australia - he said Indians change their mobile numbers every six months! Apart from the fact that the Indians I know haven't changed their mobile numbers in quite a while, it seems very counter-intuitive behaviour. Why would anyone make life more difficult for themselves in this manner?

The three sessions after lunch and before tea were uniformly missable.

David Cartwright, Group Managing Director, Operations, Technology and Shared Services, ANZ was probably the best of the lot. He spoke about ANZ Bank's "Journey to a Super-Regional Bank". An interesting aspect was the emphasis ANZ Bank places on its new corporate headquarters in Melbourne, which has a 6 star green rating. It resembles a skyscraper on its side and is deliberately designed to have a huge floor area on a limited number of floors so as to maximise interaction between people. Another interesting technology they have is "telepresence" - very high quality videoconferencing that gives the impression that remote participants are actually in the room on the opposite side of the table.

The aforementioned Dhiren Kulkarni spoke next, and parroted his boss Gail Kelly's favourite phrase "delighting the customer" ad nauseam. You don't get more patronising, defensive, competitive and humourless than this bloke. Not very good PR for Westpac-St George to trot him out at industry seminars.

Rounding out the boring lot was Darren Covington, Vice President, Asia Pacific Japan, Enterprise Software Applications, Hewlett Packard. He spoke about churn and the desirability of retaining customers because it's far cheaper than acquiring new ones (a well-worn concept), but added no new insights. My cynical reaction was that with the Australian banking sector being as oligopolistic as it is, a customer who walks out the door is readily replaced by another bank's customer who walks in. There are only four banks in the game, so where will customers go anyway? It's not so much churn as a nice merry-go-round.

After tea, we had Randy Fennel, General Manager, Engineering and Sustainability, Technology, Westpac, talking about sustainability, which again was a bit ho-hum.

The last session of the day was by far the best. Phil Hopper, founder of iGrin, spoke about peer-to-peer lending. This is an extremely interesting concept that I hadn't heard of before. There have been players like Zopa, LendingClub, etc., who have all fallen on hard times now, both because of the Global Financial Crisis and because of some increased regulatory requirements.

Peer-to-peer lending exploits the big spread enjoyed by the large institutional lenders. Banks borrow from individuals at 5% (the deposit interest rate) and offer personal loans to individuals at anywhere over 12%, sometimes going up to 18% in the case of some credit cards. This means that there is a spread of at least 7% here to be exploited if individuals lend direct to each other. A P2P lending marketplace can take 2%, leaving 5% to be split between lender and borrower, with both consequently better off than by going through banks. Hopper described P2P lending as "eBay for money".
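To make the arithmetic concrete (assuming, purely for illustration, that the remaining spread is split evenly):

Bank model: the depositor earns 5%, the borrower pays 12%, and the bank keeps the 7% spread.
P2P model: the marketplace takes 2%, leaving 5% to share, so the lender might earn 7.5% while the borrower pays 9.5%.
Both individuals end up 2.5% better off than they would be dealing with the bank.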

He also spoke about Australia currently operating on a model of negative credit data only, which makes it hard to assess the creditworthiness of most people. But there is apparently some privacy-related legislation due shortly that will remove impediments to acquiring (positive) credit data.

Other points were the subtle differences between microfinance, lending to the underbanked, and prime lending. Hopper pointed out that by ceding a degree of control to customers, P2P lending agencies allowed innovation to happen, such as "blenders" (arbitrage players who use their superior credit rating to borrow at lower rates and lend to others at a slightly higher rate), and people who run Self-managed Super Funds who invest (lend) from the fund into the P2P market and then borrow the money right back. Some of this sounds dodgy to me (and to the Tax Office, I'm sure), but it certainly is innovative.

In spite of the temporary problems faced by P2P players including iGrin, Hopper sees a good future for the concept. There are niches (after the Online Auction itself) that can be independently filled by various players, such as Identity Verification, Credit Assessment, Fulfillment/Settlement (both contracts and statements) and Collections. A bill likely to be passed by parliament soon (the Personal Property Securities Bill) will also allow people to securitise their own assets like their property, enabling lending to become more "secured". Once a secondary market develops, it will be possible for the P2P segment to expand from its personal lending base (with a typical term between 12 months and 5 years) into the mortgage market (with terms of up to 30 years). This is because although the typical small lender would not be willing to enter into 30-year loans, the emergence of a secondary market will make this possible, since debt obligations like these can themselves be traded. Wholesale funds are another step along the development of P2P lending.

All in all, a most fascinating concept, and Phil Hopper did an excellent job of presenting it. It seemed a little too slick, though. As moderator and host Michael Pascoe asked him, I wonder if this is a case of amateurs attempting to take on the professionals. To a more pointed question, Hopper defended the eventual viability of the "amateur" P2P market even against the eventual entry of the "professionals", and Pascoe insightfully remarked that he would rather have had Hopper express a desire to be bought out by a bank.

That was the end of the first day.


Friday, October 09, 2009

Rare Home Truths about Windows

I never expected to actually read such news. A senior police officer spoke to members of parliament and candidly told them that they should use Linux if they expect secure Internet banking.

I guess the truth always comes out in the end. Microsoft has lies, hush money and non-tech savvy users on its side, but as Lincoln said, you can't fool all the people all the time.

Of course, I could also tell that we still have a long way to go.

The MPs listening to van der Graaf were very enthusiastic about his suggestion but didn't understand what he meant, and asked for clarification.

"You may need to explain further for us," said one MP, while another responded, "yes, we need to understand that".

On the brighter side, knowledge comes from asking questions and finding out more. When lay users find out more about Ubuntu Linux, I doubt if many will stay with Windows. The success of Firefox proves that people are not wedded to Microsoft's products.

Tuesday, October 06, 2009

JSON Schema becomes more Orderly


I have been convinced for a while that just as REST will gradually displace its more heavyweight SOAP/WS-* equivalent, JSON will slowly displace the mighty XML in its various strongholds (today Web Services, tomorrow the world :-). But to do that, JSON first needs to incorporate some rigour into its definition, using an equivalent to XML Schema, RELAX NG or Schematron.

The JSON Schema proposal seemed to fit the bill quite nicely, but I was always vaguely uneasy that it was so verbose. There was probably no escape from that, since one of the requirements was that JSON Schema should itself be valid JSON (otherwise two parsers would be needed to consume a snippet of schema-compliant JSON).

Now along comes another schema syntax for JSON called Orderly, which has the twin advantages of being succinct and being able to round-trip to JSON Schema. The syntax has already been revised with inputs from commenters, and is looking much better in its second version.

Orderly's main advantage is its human-readability and -composability. Its simplicity (with no loss of rigour) will give JSON (and JSON Schema) the impetus they need to challenge XML. If Orderly catches fire, I believe it will accelerate the adoption of JSON for serious service-oriented work.

It's overdue.

Tuesday, September 22, 2009

REST is Polymorphic CRUD

I've been trying to evangelise REST a bit lately and have had mixed success. There is cautious interest. But there are also huge conceptual hurdles to be overcome. Pete Lacey said it best about enterprise IT folk when it comes to REST: They Can't Hear You.

One architect looked at the definition of a service interface I proposed and thought it a bit "bland". Perhaps it just needed a big WSDL file, lots of XML and SOAP faults!

Another common reaction to REST when it's presented is that "it's just CRUD", with the implication that it's just too fine-grained to be used to create good business services. I've been struggling to explain that just because REST uses four HTTP verbs that correspond roughly to CRUD operations, it doesn't necessarily mean that REST is a CRUD approach to manage data at a very low and detailed level. The resources on which the verbs operate can be arbitrarily coarse-grained. But what has eluded me so far is a succinct term that can drive the point home.

I think I've finally found it - "Polymorphic CRUD".

IT folk in the enterprise understand both polymorphism and CRUD, so the combined term should make sense. I want to drive home the point that a verb by itself is neither coarse- nor fine-grained; what matters is how each resource interprets it. Fine-grained resources will interpret the REST verbs as CRUD operations. But more coarse-grained resources can interpret the verbs as any arbitrary business operation.

Accompanied by the appropriate payload, POSTing to the resource "/applications" is nothing but submitting an application. There's no need for a specific "submitApplication" method.

I've also realised that one can clear a process inbox by DELETEing a "/pending" resource, with a standard WebDAV status code in response (207 Multi-Status), indicating that different items encountered different status codes during the batch process.
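To put the two examples above on the wire (the host, resource names and payloads are illustrative only):

POST /applications HTTP/1.1
Host: api.mycompany.com
Content-Type: application/json

{ "applicant": "Jane Citizen", "product": "home-loan" }

HTTP/1.1 201 Created
Location: /applications/1729

DELETE /pending HTTP/1.1
Host: api.mycompany.com

HTTP/1.1 207 Multi-Status
(body lists the individual outcome of each pending item)

The first is a coarse-grained business operation ("submit an application"), the second a batch operation ("clear my inbox"), yet both are expressed with nothing more than the standard verbs.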

It's the way the verb is interpreted by each resource that gives it its meaning in that context. Therefore the REST approach is to manipulate business objects of arbitrary size and complexity through polymorphic CRUD operations.

I hope that gets people to go "Aha!"

Monday, September 14, 2009

The Answer to Complexity

This sounds trite, but long arguments with colleagues in IT have convinced me that this is anything but obvious and therefore needs to be explicitly stated.

Guys, the answer to complexity is not better tooling.

It's simplicity.

Tuesday, August 11, 2009

An Enterprise Technology Stack for Business Agility and Sustainably Low Cost

I have been musing about this for a few years now, but some recent discussions with colleagues caused my thinking to finally crystallise. I realise that the two enterprise IT objectives of delivering business agility and sustainably low cost are both realisable through a fairly simple, though disciplined, architectural approach. There is no real trade-off between flexibility and cost-effectiveness, although hard choices need to be made and problematic technologies need to be fearlessly called out.

I've attempted to build this logic out into an argument in this presentation. I'm sure there will be many who will object vigorously to various aspects of this argument, but I'm confident that this logic will appear obvious in hindsight within a decade.

For those who don't have the patience to go through the entire presentation, the final technology stack is illustrated below.


Wednesday, July 22, 2009

Modelling Resources from First Principles

I've been providing architectural advice to a group of colleagues who are building a set of services. Without going into too much detail, they need to uniquely identify some entities. Clients of the services use these identifiers as references when they return to make related queries on these entities. They've proposed using UUIDs as the unique identifiers, and while I liked the idea, I thought it was too simplistic. There was more to the requirement than just unique identifiers.

They're actually dealing with two types of entities - widgets (say) and requests for widgets. These are different because a request can pertain to a set of widgets, so it may be necessary to model them distinctly. The service interfaces dealing with requests and widgets may need to be distinct as well.

Mind you, these services are not going to be REST services. But having been exposed to the RESTian way of thinking, I immediately thought of resource representations for the two types of entities. Rather than plain old UUIDs, I thought there should be a degree of structure around them (but not so much detail as to make the scheme brittle and inflexible).

Something like these, in other words:

http://www.mycompany.com/widgets/4f138ff2-362f-4e35-8f9e-173290fe86d7

http://www.mycompany.com/widget-requests/eabf5bdb-9800-4af5-9ad7-32c3b95fc48a


However, this suggestion proved to be a surprisingly hard sell. The cross-examination was withering.

Why all the extra information?
Why http://?
Why www.mycompany.com?

Why not a simpler scheme, like these examples:

widget:4f138ff2-362f-4e35-8f9e-173290fe86d7


widget-request:eabf5bdb-9800-4af5-9ad7-32c3b95fc48a


I found I had to retrace my steps and work through my reasoning from first principles. In the process, I learnt a great deal about naming.

My initial response was to point my colleagues to points 7 and 8 of "Common REST Mistakes", where we are admonished not to invent our own proprietary object identifiers, and not to strive for "protocol independence" by avoiding HTTP URIs. But this wasn't too convincing.

I made a bit of progress by getting agreement on the following:

1. It probably made sense to distinguish between widget identifiers and widget-request identifiers, so some sort of prefix to distinguish between them was necessary. UUIDs alone were probably not enough.
2. It also probably made sense to specify the "domain" within which these resources were being identified, so the "mycompany" string probably belonged somewhere as well.

But then, why not just these:

mycompany:widget:4f138ff2-362f-4e35-8f9e-173290fe86d7


mycompany:widget-request:eabf5bdb-9800-4af5-9ad7-32c3b95fc48a


Frankly, I hated this. My point was that such a format, even though "simple", would have to be explained to anyone looking at it. The structure wasn't immediately obvious. Worse, it was ambiguous and could be extended by later designers in ways that violated the original designers' intent. To this, the counterargument was that the knowledge of the format was only required on the server side. To the client, the whole name was just going to be an opaque string - a reference ID.

I wrestled with this objection for a while. Then I proposed a guiding principle that given a choice between two naming conventions, a universally understood one was preferable to one that we made up ourselves, provided it wasn't unnecessarily complex.

My research led me to the definition of a URI (Uniform Resource Identifier). What I learnt from this was that in order to name something, we first need to specify a "scheme" that then defines what the rest of the name denotes according to the predefined format for that scheme. The name of the scheme is followed by a colon, then the rest of the name is something that can only be interpreted according to the rules specified by that scheme.

In other words, a standard name (a URI) looks like this:

<scheme name>:<some scheme-specific format>

A common example is

"http://www.mycompany.com/widgets/4f138ff2-362f-4e35-8f9e-173290fe86d7"
(Heh!)

It's important to point out that "http" in the string above does not refer to the HTTP protocol! It's the name of a "scheme". What does this mean?

Well, in the URI "file:///home/ganesh", the string "file" is not a protocol, because more than one protocol may be used to get to the file.

Similarly, in the URI "mailto:ganesh@mycompany.com", the string "mailto" is not a protocol. SMTP is the actual mail protocol.

[For those familiar with XML namespaces, when we say "xmlns='http://www.example.com/schema'", the URI being referred to here is not necessarily a web page that one can point a browser at. It just needs to be a unique string.]

So we're not necessarily modelling our resources as web resources. All that the "http" scheme defines is that after the colon (":"), there is a scheme-specific structure that specifies a few things.

There are two slashes, then there's a dot-separated domain name, then a slash, then a "resource path", which is itself slash-separated. So that's what a URI conforming to the "http" scheme looks like:

"http" (the name of the scheme)
":" (the colon separating the name of the scheme from the scheme-specific structure. This is from the basic definition of a URI)
"//" (the "http" scheme just specifies this, OK?)
"www.mycompany.com" (this is the dot-separated domain name)
"/" (this is the first slash that signifies that the domain name is terminated)
"widgets/4f138ff2-362f-4e35-8f9e-173290fe86d7" (all of this is the "resource path", and internal slashes are possible, as we can see)

So now going back to our guiding principle (using a well-understood format is preferable to rolling our own) as well as the two points on which there was agreement (i.e., that we may need to qualify the resource's UUID with the type of resource as well as the organisational domain), it looks like the "http" scheme of the URI naming standard fits the bill. This is a well-understood way to include both a domain and a resource path to provide some structure around an already unique ID.

I concede that the "www" prefix of the domain could confuse. All we really need to identify the domain is "mycompany.com".

And so, a unique, standards-based and minimal way to name resources in this business domain would be

http://mycompany.com/widgets/4f138ff2-362f-4e35-8f9e-173290fe86d7

http://mycompany.com/widget-requests/eabf5bdb-9800-4af5-9ad7-32c3b95fc48a


Tuesday, June 23, 2009

What's Wrong With Vendor Lock-In?

A colleague asked me this question today, and I'm really glad it came out in the open, because it's so much easier to deal with ideas when they're plainly stated.

I believe there are at least two strong reasons to actively avoid vendor lock-in.

The first reason, paradoxically, lies in the very justification for agreeing to be locked in - "ease of integration". If an organisation already has a product from a certain vendor and is looking for another product that needs to integrate with this one, then conventional thinking is to go back to the same vendor and buy their offering rather than a competitor's. After all, they're far more likely to be "well integrated". There are typically so many problems integrating products from different vendors that it doesn't seem worth the effort. The best-of-breed approach leads to integration problems, so customer organisations often throw in the towel and go for an "integrated stack" of products from one company.

This approach is antithetical to SOA thinking, however. What they're really saying here is that they don't mind implicit contracts between systems as long as they work. But implicit contracts are a form of tight coupling, and as we know, tight coupling is brittle. Upgrade one product and you'll very likely need to upgrade the other as well. In production systems, we have seen product upgrades delayed for years because the implicit dependencies between "well-integrated" components could cause some of them to break on an upgrade, which is unacceptable in mission-critical systems. As a result, many useful features of later versions are forfeited. That's one of the unseen, downstream costs of the tight coupling that comes from vendor lock-in.

SOA shows us the solution. For robust integration between systems, we need loose coupling between them, which seems a bit paradoxical. Shouldn't tight coupling be more robust? Hard experience has taught us otherwise. But what is loose coupling? It's a hard concept to visualise, so let's define it in terms of tight coupling, which is easier to understand. In practical terms, loose coupling between two systems can be thought of as tight coupling to a mutually agreed, rigid contract rather than to each other. Such contracts are very often nothing more than open standards.

Even though people generally nod their heads when asked about their preference for open standards, the contradiction between that stated preference and their practical choices in favour of "integrated stacks" is hard to understand. If pressed, they might say that creating contracts external to both systems and insisting on adherence to them seems like a waste of time. Why not something that just works "out of the box"? The answer is that this is not an either-or choice. A browser and a web server work together "out of the box", but they do so not because they come from the same company but because they both adhere to the same rigid contract, which is the set of IETF and W3C standards (HTTP, HTML, CSS and JavaScript).

The key is to hold vendors' feet to the fire on adherence to open standards. This isn't that hard. With Open Source products available in the market to keep vendors honest, customers have only themselves to blame if they let themselves get locked in.

The second reason why vendor lock-in is a bad idea is that it often implies vendor lock-out. Very often, customers are held hostage to the politics of vendor competition. Once you start buying into a particular vendor's stack, it will become increasingly hard to look elsewhere. Any third-party component you procure will be less likely to play well with the ones you already have, causing you more integration headaches. My favourite example from just a few years ago is Oracle Portal Server, which claimed to support authentication against LDAP. It turned out that this didn't mean just any LDAP server, only Oracle's own OID. This meant that the corporate directory server, which happened to be IBM Tivoli LDAP, couldn't be used. The data in it had to be painfully replicated to OID to allow Oracle Portal Server to work.

My solution to incipient vendor lock-in would be to aggressively seek standard interfaces even between products from the same company. I remember asking an IBM rep why they weren't enabling SMTP, IMAP and vCalendar as the mail protocols between Notes client and server. The rep sneered at me as if I was mad. Why would you use these protocols, he wanted to know, when Notes client and server are so "tightly integrated"? Others on my side of the fence agreed with him. Well, the answer came back to bite the company many years later, when they wanted to migrate to Outlook on the client side. Their "tight integration" had resulted in vendor lock-out, preventing them from connecting to Notes server using Outlook (which standard protocols would have allowed) and they were stuck with Notes client indefinitely. By that time, there were too many other dependencies that had been allowed to work their way in, so enabling open protocols at that stage was no longer an option. That was a great outcome for IBM, of course, but to this day, there are people in the customer organisation who don't see that what happened to them was a direct consequence of their neglect of open interfaces in favour of closed, "tight" integration.

Ultimately, vendor lock-in has implications of cost, time and effort, which basically boils down to cost, cost and cost.

As users of technology, we simply cannot afford it.

Monday, June 22, 2009

Microsoft May Lose Some Fair-Weather Friends

I read the news of the release of Microsoft Security Essentials with some amusement.

One reason for my amusement is the notion that an operating system should require a separate product to ensure its security instead of having security built into its design.

The other reason is the anticipation of the impact this will have on that group of parasites in the Windows ecosystem. I'm talking about the makers of anti-virus software.

For a long time now, these companies have been less-than-honest players in the industry, revelling in the fact that the inherent vulnerabilities in Windows have given them a steady income stream and acting like they have the best interests of the customer at heart, when in fact they have always fought true advances in computer security that would have put them out of business.

The FUD from these players against Linux has been astounding. A common refrain is that "Linux is only secure because no one uses it. When its profile rises, hackers and malware writers will turn their attention to it." Really? How come IIS used to attract a disproportionate share of web server attacks in spite of Open Source Apache having twice its market share at the time? Surely it's badly-designed systems that invite attack.

These folk even try and sell anti-virus software for Linux! This would certainly fool people who don't realise that Linux itself is the best anti-virus software you can install on your computer. I haven't been hit by malware since 1997, when I first installed Slackware Linux on my PC.

So what will Microsoft's announcement of free anti-virus protection do to the likes of McAfee and Symantec? While users will probably be going, "It's about time," I can imagine a very different reaction at these companies.

I'm not shedding any tears, though.

Monday, June 08, 2009

JavaOne 2009 Day Five (05/06/2009)

The final day of JavaOne 2009 began with a General Session titled "James Gosling's Toy Show". This featured the creator of Java playing host to a long line of people representing organisations that were using Java technology in highly innovative and useful ways. Many of them got Duke's Choice awards.

First up was Ari Zilka (CEO, Terracotta) who was given the award for a tool that makes distributed JVMs appear like one, thereby providing a new and different model of clustering and scalability. Terracotta also allows new servers and JVMs to be added on the fly to scale running Java applications.

Brendan Humphreys ("Chief Code Poet", Atlassian) received the award for Atlassian's tool Clover. [Brendan is from Atlassian's Sydney office, and I've met him there on occasion when Atlassian hosts the Sydney Java User Group meetings.] Clover is about making testing easier by identifying which tests apply to which part of an application. When changing part of an application, Clover helps to run only the tests that apply to that part of the code.

Ian Utting, Poul Henriksen and Darin McCall of BlueJ were recognised (though not with a Duke's Choice award) for their work on Greenfoot, a teaching tool for children. Most of their young users are in India and China, but it's not clear how many there are, because they only interact with the Greenfoot team through their teachers.

Mark Gerhard (CEO of Jagex) was called up to talk about RuneScape. [Mark Gerhard received the Duke's Choice Award on Day 2, as diligent readers of this blog would remember.] This time, the focus was not on the game itself, but on the infrastructure it took to run it. According to Gerhard, Jagex runs the world's second-biggest online forum. There is no firewall in front of the RuneScape servers (!), so they get the full brunt of their user load. The servers haven't been rebooted in years. Jagex had to build their own toolchain, because a toolchain needs to understand how to package an application for streaming, which off-the-shelf equivalents don't know to do. Jagex runs commodity servers (about 20?) and their support team has just 3 people. Considering that their user base numbers 175 million (10 million of whom are active at any time), this is a stupendous ratio. Of course, Jagex has about 400 other staff, mainly the game developers. Jagex builds their libraries and frameworks in-house, and they maintain feature parity with their commercial competitor, Maya. I found it curious that Gerhard was cagey when asked which version of Java they used. Why would that need to be a secret? All he would say was that we could guess the version from the fact that their servers hadn't been rebooted in 5 years.

By way of vision, Gerhard said that OpenGL would be standard on cell phones in a year, and Jagex's philosophy is that "There's no place a game shouldn't be".

The next people on stage were two researchers from Sun itself: Simon Ritter and Angela Caicedo. [Caicedo had been at the Sydney Developer Day only a couple of weeks earlier.] Ritter demonstrated a cool modification of the Wii remote control, especially its infra-red camera. The remote talks Bluetooth to a controller. I didn't grasp the details of how the system was built, although I heard the terms Wii RemoteJ and JavaFX being mentioned. Ritter held a screen in front of him, and a playing card was projected onto it. Nothing hi-tech there. When he rotated the screen by 90 degrees, the projected image rotated as well, which was interesting. But what brought applause was when he flipped the screen around, and the projected image switched to the back of the card! He also showed how a new card could be projected onto the screen by just shaking the screen a bit (shades of the iPod Shuffle there).

Caicedo demonstrated a cool technology that she thought might be useful to people like herself who had young children with a fondness for writing on walls. With a special glove that had an embedded infra-red chip, she could "draw" on a screen with her finger, because a projector would trace out whatever she was drawing based on a detection of the position of her finger at any given time. The application was a regular paint application, allowing the user to select colours from a toolbar and even mix them on a palette.

Tor Norbye (Principal Researcher, Sun) then gave the audience a sneak preview of the JavaFX authoring tool that has not yet been released. Very neat animations can be designed. It's possible to drag an image to various positions and map them to points in time. Then the tool interpolates all positions between them and shows an animation that creates an effect of smooth motion, bouncing images, etc. There are several controls available, like buttons and sliders, and it's possible to visually map between the actions of controls and behaviours of objects. It reminded me of the BeanBox that came with Java 1.1, which showed how JavaBeans could be designed to map events and controls. The lists of events and actions appear in dropdowns through introspection.

There's no edit-compile cycle, which speeds up development. Norbye showed how the same animation could be repurposed to different devices and form factors. There's a master-slave relationship between the main screen and the screens for various devices, such that any change made to the main screen is reflected in the device-specific screens, but any specific overrides made on a device-specific screen remain restricted to that screen alone.

Fritjof Boger-Engelhardtsen of Telenor gave us a demo on a technology I don't pretend to understand. In the mobile world, the SIM card platform is very interesting to operators. The next generation of SIM cards will be TCP/IP connected nodes, with motion sensors, WiFi, etc., embedded within the card. It will be a JavaCard 3 platform. It's possible to use one's own SIM card to authenticate to the network. FB-E gave us a demo of a SunSpot sensor device connected to a mobile phone and being able to control the phone's menu by moving the SunSpot. The phone network itself is oblivious to this manipulation. More details are available at http://playsim.dev.java.net.

Brad Miller (Associate Professor, Worcester Polytechnic Institute Robotics Research Center) and Derek White (Sun Labs) showed some videos of the work done by High School students. Given a kit of parts, the students have to put together robots to participate in the "US First", an annual robotics competition. A large part of the code has been ported across from C/C++ to Java, and the project is always on the lookout for volunteer programmers. Interested people can go to www.usfirst.org. WPI got a Duke's Choice award for this.

Sven Reimers (System Engineer and Software Architect, ND Satcom) received a Duke's Choice award for the use of Java in analysing the input of satellites.

Christopher Boone (President and CEO, Visuvi, Inc) showed off the Visuvi visual search engine. Upload an image and the software analyses it on a variety of aspects and can provide deep insights. Simple examples are uploading images of paintings and finding who the painter was. More useful and complex uses are in the area of cancer diagnosis. Visuvi improves the quality and reduces the cost of diagnosis of prostate cancer. The concordance rate (probability of agreement) between pathologists is only about 60%, and the software is expected to achieve much better results. The Visuvi software performs colour analysis, feature detection and derives spatial relationships. There's some relationship to Moscow State University that I didn't quite get. At any rate, Visuvi is busy scanning in 400 images a second (at 3000 megapixels and 10 MB each)!



Sam Birney and Van Mikkel-Henkel spoke about Mifos, a web application for institutions that deal in providing microfinance to poor communities. Microfinance is inspired by the work done by Mohammed Yunus of Grameen Bank. This is an Open Source application meant to reduce the barriers to operation of cash-strapped NGOs. The challenge is to scale. Once again, volunteers are wanted: http://www.mifos.org/developer/ , and not just for development but also for translation into many relatively unknown languages. Mifos won a Duke's Choice Award.



Manuel Tijerino (CEO, Check1TWO) told of how many of his musician friends were struggling to find work at diners. So he created a JavaFX based application that allows artistes to upload their work to the Check1TWO site, and it's automatically available on any Check1TWO "jukebox" at any bar or disco. Regular jukeboxes are normally tied up in studio contracts, so the Check1TWO jukeboxes provide a means for struggling artistes to reach over the heads of the studios and connect directly with their potential audiences.

Zoltan Szabo and Balazs Lajer (students at the University of Pannonia, Hungary) showed off their project that won the first prize at RICOH's student competition. Theirs is a Java application that runs on a printer/scanner and is capable of scoring answer sheets.

Marciel Hernandez (Senior Engineer, Volkswagen Electronics Research Lab and Stanford University) and Greg Follella (Distinguished Engineer, Sun) talked about Project Bixby. This is about automating the testing of high-speed vehicles through "drive-by-wire". The core of the system is Java RTS (Real-Time System). The primary focus is improving safety. Stanford University is building the control algorithms. It should be possible to control the car when unexpected things happen, which is especially likely on dirt racetracks. There's no need to put a test driver at risk. Project Bixby leads to next-generation systems that are faster, such as more advanced ABS (Anti-lock Braking System) and newer stability control systems.

Finally, there was a video clip of the LincVolt car, which turns a classic Lincoln Continental into a green car like the Prius, but with some differences. The Prius has parallel electrical and petrol engines. The LincVolt has batteries driving the wheels all the time, with the petrol engine only serving to top up the battery pack when it starts to run down. What's the connection with Java? The control systems and visual dashboard are all Java.

This concluded the General Session.

I then attended a session titled "Real-World Processes with WS-BPEL" by Murali Pottlapelli and Ron Ten-Hove. The thrust of the whole session was that WS-BPEL as a standard was incomplete and that real-world applications need more capabilities than WS-BPEL 2.0 delivers. A secondary theme of the session was that an extension called BPEL-SE developed for OpenESB is able to address the weaknesses of WS-BPEL.

[My cynical take on this is that every vendor uses the excuse of an anaemic spec to push their own proprietary extensions. If there is consensus that WS-BPEL 2.0 doesn't cut the mustard, why don't the vendors sit down together and produce (say) a WS-BPEL 3.0 spec that addresses the areas missing in 2.0? I'm not holding my breath.]

The structure was fairly innovative. Ten-Hove would talk about the failings of the standard and Pottlapelli would then describe the solution implemented in BPEL-SE. [The session would have been far better if Pottlapelli had been able to communicate more effectively. I'm forced to the painful realisation that many Indians, while being technically competent, fail to impress because their communication skills are lacking.]

These are the shortcomings of WS-BPEL 2.0 that are addressed by BPEL-SE:

- Correlation is hard to use, with multiple artifacts to define and edit. BPEL-SE provides a correlation wizard that hides much of this complexity.
- Reliability is not defined in the context of long-running business processes susceptible to crashes, maintenance downtime, etc. BPEL-SE defines a state machine where the state of an instance's execution is persisted and can be recovered on restart without duplication.
- High Availability is not defined. BPEL-SE on GlassFish ESB can be configured to take advantage of the GlassFish cluster, where instances migrate to available nodes.
- Scalability is an inherent limitation when dealing with long-running processes that hold state even when idle. BPEL-SE defines "dehydrate" and "hydrate" operations to persist and restore state. As the dehydrate/hydrate operation is expensive, two thresholds are defined (the first to dehydrate variables alone, and the next to dehydrate entire instances).
- Retry is clumsy in WS-BPEL, because the <invoke> operation doesn't support retries. Building retry logic into the code obscures the real business logic. BPEL-SE features a configurable number of retries and a configurable delay between retries. It also supports numerous actions on error.
- Throttling is difficult to achieve in WS-BPEL, which makes the system vulnerable to denial of service attacks and bugs that result in runaway processes. BPEL-SE can limit the number of accesses to a "partnerlink" service node, even across processes. This helps to achieve required SLAs.
- Protocol headers are ignored in WS-BPEL by design, as it is meant to be protocol-agnostic. However, many applications place business information in protocol headers such as HTTP and JMS. BPEL-SE provides access to protocol-specific headers as an extension to the <from> and <to> elements, covering HTTP, SOAP, JMS (strictly speaking an API rather than a protocol) and file headers.
- Attachments are non-standard in WS-BPEL, with some partners sending documents inline and others as attachments. BPEL-SE adds the extensions "inlined" and "attachment" to the <copy> element.
- Expressions are fragile in WS-BPEL because it relies on XPath, and XPath 1.0 has the quirk that a boolean variable holding false evaluates to true (boolVar is a non-empty node). BPEL-SE uses JXPath, enhanced to be schema-aware, so false is not interpreted as true and integers do not end in ".0".
- XPath limitations hamper WS-BPEL: XPath has a limited type system, it is impossible to set a unique id on the payload, and it's a challenge to extract (say) the expensive items in a purchase order document if the processing spans multiple activities. Any kind of iteration across an XML structure is difficult, and this is partly due to the limitations of XPath and partly due to those of BPEL. With BPEL-SE, Java calls can be embedded in any XPath expression, with syntax adapted from Xalan. It also supports embedded JavaScript (Rhino with E4X), which makes XML parsing much easier.
- Fault handling is problematic in WS-BPEL because standard faults carry no error messages, which makes them hard to debug, and standard and non-standard faults can be hard to distinguish. This complicates fault handlers, requiring multiple catches for standard faults. The WSDL 1.1 fault model has been carried too far. BPEL-SE's fault handler extensions propagate errors and faults not defined in WSDL as system faults, with the associated errors wrapped in the fault message. Standard faults are associated with a fault message, and failure details like the activity name are carried along. BPEL-SE catches all standard and system faults.

I asked the presenters if they were making the case that BPEL-SE had addressed all the limitations of WS-BPEL or if they felt there were some shortcomings that still remained. Ten-Hove's answer was that subprocesses were a lacuna that hadn't yet been addressed.

My next session was "Cleaning with Ajax: Building Great Apps that Users Will Love" by Clint Oram of SugarCRM. I regret to say that this session was short on subject matter and ended in just half an hour. The talk seemed to be more a sales pitch for SugarCRM than a discussion of Ajax principles. Even the few design principles that were discussed were fairly commonsensical and did not add much insight.

For what it is worth, let me relate the points that sort of made sense.

Quite often, designers are asked to "make the UI Ajax". This doesn't say what is really expected. Fewer page refreshes? Cool drag-and-drop? Animation?

There are also some downsides of Ajax:
- Users get lost (browser back and forward buttons and bookmarks are broken, the user doesn't always see what has changed on the page, and screen elements sometimes keep resizing constantly, irritating the user)
- There are connection issues (it can be slow, the data can't be viewed offline, and applications often break in IE6)
- There are developer headaches (mysterious server errors (500), inconsistent error handling, PHP errors break Ajax responses)

Some design questions need to be asked before embarking on an Ajax application:
- Does the user really want to see the info?
- Will loading this info be heavy on the server?
- Does the info change rapidly?
- Is there too much info?

The "lessons learned", according to Oram, are:

- Use a library like YUI, Dojo or Prototype instead of working with raw JavaScript.
- Handle your errors, use progress bars to indicate what is happening, don't let the application seem like it's working when there has been an error.
- Ajax is a hammer, and not every problem is a nail.
- Load what you need, when you need it.

As one can see, there was nothing particularly insightful in this talk, and it was a waste of half an hour.

One interesting piece of information from this session was that IBM WebSphere sMash (previously Project Zero) is a PHP runtime that runs on the JVM and provides full access to Java libraries.

SugarCRM 5.5 Beta is out now, and adds a REST API to the existing SOAP service API.

I'm not a fan of SugarCRM. The company's "Open Source" strategy is basically to provide crippleware as an Open Source download, and to sell three commercial versions that do the really useful stuff. I don't know how many people are still fooled by that strategy.

The next session I attended was quite possibly the best one I have attended over these five days. It was called "Resource-Oriented Architecture and REST" by Scott Davis (DavisWorld Consulting, Inc).

Unfortunately, the talk was so engrossing that I forgot to take notes in many places, so I cannot do full justice to it here :-(. Plus, Davis whizzed through examples so fast I didn't have time to transcribe them fully, and lost crucial pieces of code.

Davis is the author of the columns "Mastering Grails" and "Practically Groovy" on IBM Developerworks. He's also a fan of David Weinberger's book "Small Pieces Loosely Joined", which describes the architecture of the web. Although ten years old, the book is still relevant today. [David Weinberger is also the author of The Cluetrain Manifesto.]

I knew Groovy was quite powerful and cool, but in Davis's hands it was pure magic. In a couple of unassuming lines, he was addressing URLs on the web and extracting information that would have taken tens of lines with SOAP-based Web Services. I must have transcribed them incorrectly, because I can't get them to work on my computer. I'll post those examples when I manage to get them working.

A really neat example was when he found a website that provided rhyming words to any word entered on a form. He defined a "metaclass" method on the String class to enable it to provide rhymes by invoking the website. Then a simple statement like "print 'java'.rhyme()" resulted in "lava".

As Davis said, REST is the SQL of the Web.

Davis then talked about syndication with the examples of RSS and Atom. Three things draw one in: chronology (newest first), syndication (the data comes to you) and permalinks (which allow you to "pass it on"). He also mentioned the Rome API and called it the Hibernate of syndication.
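I didn't capture Davis's Rome code, but fetching and printing a feed takes only a few lines. Here's a minimal sketch of my own (the feed URL is just a placeholder):

    import java.net.URL;
    import com.sun.syndication.feed.synd.SyndEntry;
    import com.sun.syndication.feed.synd.SyndFeed;
    import com.sun.syndication.io.SyndFeedInput;
    import com.sun.syndication.io.XmlReader;

    public class FeedReader {
        public static void main(String[] args) throws Exception {
            // Placeholder URL - substitute any RSS or Atom feed
            URL url = new URL("http://example.com/feed.xml");

            // Rome parses both RSS and Atom behind the single SyndFeed abstraction
            SyndFeed feed = new SyndFeedInput().build(new XmlReader(url));

            System.out.println(feed.getTitle());
            for (Object o : feed.getEntries()) {
                SyndEntry entry = (SyndEntry) o;
                System.out.println(entry.getTitle() + " -> " + entry.getLink());
            }
        }
    }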

What's the difference between RSS and Atom? I hadn't heard it put this way before. Davis called RSS "GETful" and Atom "RESTful".

He then did a live grab of a Twitter feed and identified the person in the room who had sent it.

In sum, this session was more a magic show than a talk. It made me determined to learn Groovy and how to use it with RESTful services.

The last session I attended at JavaOne 2009 was "Building Enterprise Java Technology-based Web Apps with Google Open Source Technology" by Dhanji Prasanna of Google.

Prasanna covered Google Guice, GWT (pronounced "gwit") and Web Driver, and provided a sneak preview of SiteBricks.

I'm somewhat familiar with GWT but not with any of the others, so my descriptions below may make no sense. They're a fairly verbatim transcription of what Prasanna talked about.

Guice helps with testing and testability. Its advantages are:
- Simple, idiomatic AOP
- Modularity
- Separation of Concerns
- Reduction of state-aware code
- Reduction of boilerplate code

It enables applications to scale horizontally. Important precepts (I've added a small sketch of my own after this list) are:

- Type safety, leading to the ability to reason about programs
- Good citizenship (modules behave well)
- Focus on core competency
- Modularity
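To make the dependency-injection part concrete, here's a tiny sketch of my own (not from the talk; the Greeter interface, implementation and module are made-up names). The point is that client code declares what it needs and a module decides what it gets, which is what makes swapping in test doubles so easy:

    import com.google.inject.AbstractModule;
    import com.google.inject.Guice;
    import com.google.inject.Inject;
    import com.google.inject.Injector;

    public class GuiceDemo {
        // Hypothetical service interface and implementation
        interface Greeter { String greet(String name); }
        static class SimpleGreeter implements Greeter {
            public String greet(String name) { return "Hello, " + name; }
        }

        // A module declares bindings; swapping this binding is all that's
        // needed to substitute a mock implementation in tests
        static class GreeterModule extends AbstractModule {
            @Override protected void configure() {
                bind(Greeter.class).to(SimpleGreeter.class);
            }
        }

        // Client code declares its dependency; it never constructs it
        static class GreetingService {
            private final Greeter greeter;
            @Inject GreetingService(Greeter greeter) { this.greeter = greeter; }
            String welcome(String name) { return greeter.greet(name); }
        }

        public static void main(String[] args) {
            Injector injector = Guice.createInjector(new GreeterModule());
            GreetingService service = injector.getInstance(GreetingService.class);
            System.out.println(service.welcome("JavaOne"));
        }
    }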

GWT is a Java-to-JavaScript compiler. It supports a hosted mode and a compiled mode. Core Java libraries are emulated. It's also typesafe.
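As an illustration (again my own minimal sketch, not Prasanna's code), a GWT module's entry point is ordinary Java that the compiler translates to JavaScript. A module descriptor (.gwt.xml) is also needed, which I've omitted here:

    import com.google.gwt.core.client.EntryPoint;
    import com.google.gwt.event.dom.client.ClickEvent;
    import com.google.gwt.event.dom.client.ClickHandler;
    import com.google.gwt.user.client.Window;
    import com.google.gwt.user.client.ui.Button;
    import com.google.gwt.user.client.ui.RootPanel;

    // onModuleLoad() is the "main" of a GWT application
    public class HelloGwt implements EntryPoint {
        public void onModuleLoad() {
            Button button = new Button("Say hello");
            button.addClickHandler(new ClickHandler() {
                public void onClick(ClickEvent event) {
                    Window.alert("Hello from compiled Java");
                }
            });
            RootPanel.get().add(button);
        }
    }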

The iGoogle page is an example of what look like "portlet windows", but which are independent modules that are "good citizens".

Unfortunately, social platforms like OpenSocial and Google Wave have different contracts, so modules may not be portable across them.

Google Gin is Guice for GWT. It runs as Guice in hosted mode and compiles directly to JavaScript. There is no additional overhead of reflection.

Types are Java's natural currency. Guice and GWT catch errors early, facilitate broad refactorings, prevent unsafe API usage and make it easier to reason about programs. They're essential to projects with upwards of ten developers, because these benefits are impossible to get with raw JavaScript.

Web Driver is an alternative to the Selenium acceptance testing tool, and apparently the codebases are now merging. Web Driver has a simpler blocking API that is pure Java. It uses a browser plugin instead of JavaScript, features fast DOM interaction and a flexible API and supports native keyboard and mouse emulation. Web Driver supports clustering.
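I didn't note down any code, but the "simpler blocking API" looks roughly like this (my own sketch; the search page is only an example):

    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.WebElement;
    import org.openqa.selenium.firefox.FirefoxDriver;

    public class SearchTest {
        public static void main(String[] args) {
            // Drives a real Firefox instance; each call blocks until it completes
            WebDriver driver = new FirefoxDriver();
            driver.get("http://www.google.com");

            WebElement query = driver.findElement(By.name("q"));
            query.sendKeys("JavaOne 2009");
            query.submit();

            System.out.println("Page title: " + driver.getTitle());
            driver.quit();
        }
    }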

Then Prasanna provided a preview of SiteBricks, which is a RESTful web framework. The focus is on HTTP, and lessons have been learned from JAX-RS. There are statically typed templates with rigorous error checking. It's concise and uses type inference algorithms. It's also fast and raises compilation errors if anything is wrong.

SiteBricks modules are Guice servlet modules. One can ship any module as a widget library. Any page is injectable. It supports the standard web scopes (request, session) and also a "conversation" scope.

SiteBricks has planned Comet ("reverse Ajax") support. The preview release is available on Google Code Blog.

That concludes my notes on JavaOne 2009. Admittedly, a lot of this has been mindless regurgitation in the interests of reporting. If these make sense to readers (even if they make no sense to me), well and good.

In the days to come, I'll ruminate on what I've learned and post my thoughts.

Thursday, June 04, 2009

JavaOne 2009 Day Four (04/06/2009)

The third day of JavaOne proper (the fourth day if including CommunityOne on Monday) started with a General Session on interoperability hosted by (of all companies) Microsoft. It shouldn't be too surprising, actually, because Sun and Microsoft buried the hatchet about 5 years ago and started to work on interoperability. Time will tell if that was an unholy alliance or not.

Dan'l Lewin (Corporate VP, Strategy and Emerging Business Development, Microsoft) took the stage for some opening remarks. What he said resonated quite well, i.e., that users expect things to just work. Data belongs to users and should move freely across systems. The key themes are interoperability, customer choice and collaboration.

Lewin pointed to TCP/IP as the quintessential standard. Antennae and wall plugs may change from country to country, but TCP/IP is the universal standard for connectivity, which is why the Internet just works. [I could add many other standards to this list, which would also have "just worked" but for Microsoft!]

Lewin added that the significant partners that Microsoft is working with in the Java world are Sun, the Eclipse Foundation and the Apache Software Foundation. The key areas where Sun and Microsoft work together are:

- Identity Management, Web SSO and Security Services
- Centralised Systems Management
- Sun/Microsoft Interoperability Center
- Desktop and System Virtualisation
- Java, Windows, .NET

Identity Management interoperability has progressed a great deal with the near-universal adoption of SAML. On virtualisation, where host and guest systems are involved, Lewin put it very well when he said Sun and Microsoft control each other's environment "in a respectful way."

A website on his slide pack was www.interoperabilitybridges.com .

Steven Martin (Senior Director, Developer Platform Productivity Management, Microsoft) took over from Lewin and started off with "we come in peace and want to talk about interoperability".

He introduced Project Stonehenge, a project under Apache, with code available under the Apache Licence. This uses IBM's stock trading application to demonstrate component-level interoperability between the Microsoft and Sun stacks.

Greg Leake of Microsoft and Harold Carr of Sun then provided a live demo of this interoperability.

The stock trading application has four tiers: the UI, a business logic tier, a further business logic tier housing the order processing service, and the database tier. The reason for splitting the business logic tier into two was to demonstrate not just UI-tier-to-business-logic-tier connectivity but also service-to-service interop. The Microsoft stack was built on .NET 3.5, with ASP.NET as the UI technology and WCF as the services platform. The Sun stack was based on JSP for the UI and the Metro stack for services, running on GlassFish. Both stacks pointed back to a SQL Server database.



The first phase of the demo showed the .NET stack running alone, with Visual Studio breakpoints to show the progress of a transaction through the various tiers. Then the ASP.NET tier was reconfigured to talk to the Metro business logic layer, and the control remained with the Java stack thereafter. In the third phase of the demo, the Metro business service layer called the order processing service in the .NET stack. The application worked identically in all three cases, effectively demonstrating bidirectional service interoperability between .NET and Metro Web Services.
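I obviously don't have the demo's source, but on the Java side the "reconfiguration" step amounts to pointing a generated JAX-WS client proxy at a different endpoint URL. A rough sketch with made-up names (in the real demo the client stubs would be generated from the Stonehenge stock-trading WSDL):

    import javax.xml.ws.BindingProvider;

    public class EndpointSwitcher {
        // Point an existing JAX-WS client proxy at a different service endpoint.
        // The same proxy can then talk to the WCF (.NET) implementation or the
        // Metro (GlassFish) one, as long as both honour the same WSDL contract
        // and WS-* policies.
        public static void rewire(Object port, String endpointUrl) {
            BindingProvider bp = (BindingProvider) port;
            bp.getRequestContext().put(
                    BindingProvider.ENDPOINT_ADDRESS_PROPERTY, endpointUrl);
        }
    }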

Martin also mentioned a useful principle for interoperability, "Assume that all applications are blended, and that all client requests are non-native". This is analogous, I guess, to that other principle, "Be conservative in what you send, and liberal in what you accept".

He also referred to "the power of choice with the freedom to change your mind", which I thought was a neat summarisation of user benefits.

Aisling MacRunnels (Senior VP, Software Marketing, Sun) joined Steven Martin on stage to talk about the Sun-Microsoft collaboration, which isn't just limited to getting the JRE to run on Windows. Microsoft also cooperates with Sun to get other "Sun products" like MySQL, VirtualBox and OpenOffice to work on the Microsoft platform. The last item must be particularly galling to the monopoly. Microsoft is also working to get Sharepoint authentication happening against OpenSSO using SAML2. Likewise, WebDAV is being supported in Sun's Storage Cloud. In other words, when both parties support open standards, their interoperability improves.

I think it speaks more of the quality and tightness of a standard than of vendor cooperation when systems work together. Sun and Microsoft shouldn't need to talk to each other or have a cozy relationship. Their systems need to just work together in the first place.

The next session I attended was "Metro Web Services Security Usage Scenarios" by Harold Carr and Jiandong Guo of Sun. Carr is Metro Architect and Guo is Metro Security Architect, so we pretty much had the very brains of the project talking to us.

There wasn't very much in the lecture that was specific to Metro. Most of the security usage patterns were general PKI knowledge, but I must say the diagrams that illustrated the logic flow in each pattern were top class. I have seen many documents on PKI, but these are some of the best. My only quibble with them is that they tend to use the same terms "encrypt/decrypt" for two separate operations - encryption/decryption proper and signing/verification.

Some of the interesting points they made were:

- The list of security profiles bundled with Metro will be refactored soon. Some will be dropped and new ones will be added.
- SSL performance is still better than WS-Security, even with optimisations such as derived keys and WS-SecureConversation. [WS-Security uses a fresh ephemeral key for every message, while WS-SecureConversation caches and reuses the same ephemeral key for the whole session.]



- Metro 2.0 is due in September 2009.

I then attended a session called "Pragmatic Identity 2.0: Simple, Open Identity Services using REST" by Pat Patterson and Ron Ten-Hove of Sun. Ten-Hove is also known for his work on Integration and JBI. He was the spec lead for JSR 208.

[As part of their demo, I realised that NetBeans has at least a couple of useful features I didn't know about earlier. There's an option to "create entity classes from tables" (using JPA, I presume), and another one to "create RESTful web services from entity classes".]





It's a bit difficult to describe the demo that Patterson gave. On the surface, it was a standard application that implemented user-based access control. One user saw more functions than another. The trick was in making it RESTful. Nothing had to be explicitly coded for, which was the cool part. The accesses were intercepted and authenticated/authorised without the business application being aware of it. As I said, it's hard to describe without the accompanying code.
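Still, the shape of the idea can be sketched: a plain JAX-RS resource with no security code in it, and a servlet filter that intercepts every request and validates a token before the resource is ever reached. This is my own illustration, not Patterson's code, and all the names (the resource, the header) are hypothetical:

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.Produces;

    // The business resource: no security code anywhere in it
    @Path("/accounts")
    public class AccountsResource {
        @GET
        @Produces("application/xml")
        public String list() {
            return "<accounts/>";  // placeholder payload
        }
    }

    // A filter registered in web.xml intercepts every request and validates
    // a token (say, an SSO cookie or header) before the resource is invoked
    class TokenFilter implements Filter {
        public void init(FilterConfig config) { }
        public void destroy() { }

        public void doFilter(ServletRequest req, ServletResponse res,
                             FilterChain chain) throws IOException, ServletException {
            HttpServletRequest request = (HttpServletRequest) req;
            String token = request.getHeader("X-Auth-Token");  // hypothetical header
            if (token == null || !isValid(token)) {
                ((HttpServletResponse) res).sendError(HttpServletResponse.SC_UNAUTHORIZED);
                return;
            }
            chain.doFilter(req, res);
        }

        private boolean isValid(String token) {
            return token.length() > 0;  // stand-in for a real token check
        }
    }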

The next session was on "The Web on OSGi: Here's How" by Don Brown of Atlassian. Brown is a very polished and articulate speaker and he kept his audience chuckling.

OSGi is something that has fascinated me for a while but I haven't got my head around it completely yet. At a high level, OSGi is a framework that allows applications to be composed dynamically from components and services that may appear and disappear at run-time. Dependency conflicts are reduced or eliminated by giving each component "bundle" its own classloader, so different bundles within the same application can use different versions of a library without clashing.
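To make the "bundle" idea concrete, here's a minimal activator of my own (not from Brown's talk). Each bundle declares lifecycle hooks and can register services for other bundles to consume; a Bundle-Activator header in the bundle's manifest points the framework at this class:

    import org.osgi.framework.BundleActivator;
    import org.osgi.framework.BundleContext;
    import org.osgi.framework.ServiceRegistration;

    public class GreeterActivator implements BundleActivator {
        private ServiceRegistration registration;

        // Called by the OSGi framework when the bundle is started
        public void start(BundleContext context) {
            registration = context.registerService(
                    Runnable.class.getName(),
                    new Runnable() {
                        public void run() { System.out.println("Hello from a bundle"); }
                    },
                    null);
        }

        // Called when the bundle is stopped; the service disappears and any
        // consumers must cope with that (the dynamism Brown warned about)
        public void stop(BundleContext context) {
            registration.unregister();
        }
    }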



OSGi is cool but complex. As Brown repeatedly pointed out, while it can solve many integration and dependency problems, it is not trivial to learn. Those who want to use OSGi must be prepared to learn many fundamental concepts, especially around the classloader. Also, components may appear and disappear at will in a dynamic manner. How must the application behave when a dependency suddenly disappears?

There are 3 basic architectures that one can follow with OSGi:



1. OSGi Server
Examples: Spring DM Server, Apache Felix, Equinox
Advantages: Fewer deployment hassles, consistent OSGi environment
Disadvantages: Can't use any existing JEE server

2. Embedded OSGi server via bridge (OSGi container runs within a JEE container using a servlet bridge)
Examples: Equinox servlet bridge
Advantages: Can use JEE server, application is still OSGi
Disadvantages: (I didn't have time to note this down)

3. Embedded OSGi via plugin
Example: Atlassian plugin
Advantages: Can use JEE server, easier migration, fewer library hassles
Disadvantages: More complicated, susceptible to deployment issues

I learnt a number of new terms and names of software products in this session.
- Spring DM (Dynamic Modules)
- Peaberry using Guice
- Declarative Services (part of the OSGi specification)
- iPOJO
- BND and Spring Bundlor are tools used to create bundles
- Felix OSGi is embedded as part of Atlassian's existing plugin framework.
- Spring DM is used to manage services.
- Automatic bundle transformation is a cool feature that Brown mentioned but did not describe.

There are three types of plugins:
Simple - no OSGi
Moderate - OSGi via a plugin descriptor
Complex - OSGi via Spring XML directly



Brown gave us a demo using JForum and showed that even if a legacy application isn't sophisticated enough to know about new features, modules with such features can be incorporated into it.

I had been under the impression that OSGi was only used by rich client applications on the desktop. This session showed me that it's perhaps even more useful for web applications on the server side.

My last session of the day was a hands-on lab (Bring Your Own Laptop) called "Java Technology Strikes Back on the Client: Easier Development and Deployment", conducted by Joey Shen and a number of others who cruised around and lent a helping hand whenever one got stuck. It looks like Linux support for JavaFX has just landed (Nigel Eke, if you're reading this, you were right after all), and a very quiet landing it has been, too. But it's still only for NetBeans 6.5.1, not NetBeans 6.7 beta. At any rate, I was more interested in just checking to see if it worked. It turned out that there were a couple of syntax errors in the sample app that had to be corrected before the application could run. I was very keen to try the drag-and-drop feature with which one could pull an applet out of the browser window and install it as a desktop icon (Java Web Start application). Unfortunately, this feature requires a workaround on all X11 systems (Unix systems using the X Window System), because the window manager intercepts drag commands and applies them to the window as a whole. There was a workaround described for applets but none for a JavaFX application. As time was up, I had to leave without being able to see drag-and-drop in action. Never mind, I'm sure samples and documentation will only become more widely available as time goes on, and Sun will undoubtedly make JavaFX even easier to use in future.

Wednesday, June 03, 2009

JavaOne 2009 Day Three (03/06/2009)

The second day of JavaOne proper (the third day if including CommunityOne) started with a General Session on mobility conducted by Sony Ericsson.

The main host was Christopher David (Head of Developer and Partner Engagement). At about the same time he started his talk, Erik Hellman (Senior Java Developer) got started on a challenge - to develop a mobile application by the end of the session that would display all Tweets originating within a certain radius of the Moscone Center that contained the word 'Java'.

Rikko Sakaguchi (Corporate VP, Head of Creation and Development) and Patrik Olsson (VP, Head of Software Creation and Development) joined David on stage, and between the three of them, kept the Sony Ericsson story going.

One of the demos they attempted failed (controlling a Playstation 3 with a mobile phone), but then it isn't a demo if it doesn't fail.

One of the points made was about the difference between a mobile application and a traditional web application. A traditional web application has its UI on the client device, with business logic and data on a server across the network. A mobile application has the UI, parts of business logic and parts of data and platform access on the device, and the remaining data and business logic across the network. I don't quite buy this distinction. I don't necessarily see a difference between traditional distributed applications and mobile applications. So the device form factor is a bit different and the network is wireless, but that's hardly a paradigm shift. Application architectures like SOFEA are meant to unify all such environments.

The history of Sony Ericsson's technology journey is somewhat familiar. In 2005, they switched from C/C++ to Java. Java became an integral part of the Sony Ericsson platform rather than an add-on. In 2007, they created a unique API on top of the base Java platform. In 2009, the focus is on reducing fragmentation of platforms. The bulk of the APIs are standard, while a few (especially at the top of the stack) are proprietary to SE.

As expected from a company that boasts 200 million customers worldwide and 200 million downloads a year, SE has a marketplace called PlayNow Arena. SE has been selling ringtones, games, music, wallpapers, themes, movies and lately, applications. I'm frankly surprised that it's taken them so long to get to selling applications.

Since time-to-market is important, SE promises software developers a turnaround time of 30 days from submission to appearance in the virtual marketplace, with assistance provided throughout the process.

And yes, Erik Hellman had completed his application with 10 minutes to spare by the time the session ended.

The next session I attended was something completely new to me. This was called "Taking a SIP of Java Technology: Building voice mashups with SIP servlets" by RJ Auburn of Voxeo Corp. The Session Initiation Protocol (SIP) is mainly used in telephony, but can apparently be used to bootstrap any other interaction between systems. SIP has more handshaking than HTTP, with many more exception cases, so it's a more chatty protocol than HTTP. It's also typically implemented on top of UDP rather than TCP, so SIP itself needs to do much more exception handling than HTTP.

RFC 3261 that describes SIP is reportedly a rather dry document to read. Auburn recommended The Hitchhiker's Guide to SIP, and also some Open Source info at www.voip-info.org and some industry sites (www.sipforum.org and www.sipfoundry.org).

There seem to be two main ways to develop applications with SIP. One uses XML, the other uses programming APIs. The XML approach is a 90% solution, while the API approach provides more options but is more complex.

There are two sister specifications in the XML approach - VoiceXML and CCXML (Call Control XML). VoiceXML supports speech and touchtone input and a form-filling model called the FIA (Form Interpretation Algorithm), but offers very limited call control. CCXML, in contrast, manages things like call switching, teleconferencing, etc. The two work in a complementary fashion, with CCXML defining the overall "flow logic", and VoiceXML defining the parameters of a particular "message" (for want of better terms).

The Java API is based on the SIP Servlet API (www.sipservlet.com). JSR 116 was SIP Servlet 1.0, and JSR 289 is SIP Servlet 1.1 (just released). JSR 309 (the Java Media Server API) is based on the CCXML model, but is still in draft.
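For flavour, a SIP servlet looks very much like an HTTP servlet, except that the callbacks correspond to SIP methods such as INVITE. Here's a minimal sketch of my own (not from the talk) that simply accepts every incoming call:

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.sip.SipServlet;
    import javax.servlet.sip.SipServletRequest;

    public class AcceptAllCallsServlet extends SipServlet {
        // Invoked by the container when a SIP INVITE arrives,
        // analogous to doGet()/doPost() in an HTTP servlet
        protected void doInvite(SipServletRequest request)
                throws ServletException, IOException {
            // 180 Ringing, then 200 OK to accept the call
            request.createResponse(180).send();
            request.createResponse(200).send();
        }
    }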

SIP is complex, so Voxeo has a simpler API called Tropo, available in a number of scripting languages. This is not Open Source, but is free for developers, and the hosting costs about 3 cents a minute. There are also traditional software licensing models available.

There are some phone-enabled games available, and Vooices is a good example.

More information is available on tropo.com and www.voxeo.com/free.

The next session I attended was "What's New in Groovy 1.6?" by Guillaume Laforge himself. The talk was based on an InfoQ article written by him earlier.

In brief, the improvements in Groovy 1.6 are:

- Performance - the compiler is 3x to 5x faster than in 1.5, and the runtime is between 150% and 460% faster.
- Syntax changes - multiple assignment (the ability to set more than one variable in a statement), optional return statements
- Ability to use Java 5 annotations
- Ability to specify dependencies
- Swing support (griffon.codehaus.org)
- JSR 223 (scripting engine) support is built in and can invoke scripting in any language (see the sketch after this list)
- OSGi readiness
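One practical consequence of the JSR 223 support (my own sketch, not Laforge's): plain Java can evaluate Groovy through the standard javax.script API, assuming groovy-all is on the classpath and the engine is registered under the name "groovy":

    import javax.script.ScriptEngine;
    import javax.script.ScriptEngineManager;
    import javax.script.ScriptException;

    public class GroovyFromJava {
        public static void main(String[] args) throws ScriptException {
            // Looks up the Groovy engine bundled via JSR 223
            ScriptEngine groovy = new ScriptEngineManager().getEngineByName("groovy");

            // Evaluate an arbitrary Groovy snippet and capture the result
            Object result = groovy.eval("(1..10).findAll { it % 2 == 0 }.sum()");
            System.out.println(result);  // prints 30
        }
    }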

My next session was on providing REST security. The topic was called "Designing and Building Security into REST Applications" by Paul Bryan, Sean Brydon and Aravindan Ranganathan of Sun. The bulk of the talk focused on OpenSSO and its REST-like interface. But as the presenters confessed, the OpenSSO "REST" API is just a facade over a SOAP API. In OpenSSO 2.0, they envisage a truer RESTian API.



The other term I had never heard of before was the OAuth protocol. Apparently, OAuth is an open protocol for delegated authorisation that plays a role similar to HTTP's Basic Authentication and Digest Authentication.



The last session that I attended today was on "Coding REST and SOAP together" by Martin Grebac and Jakub Podlesak of Sun. Although the topic was entirely serious, it felt like cheating at a certain level.

The premise is that we implement SOAP and REST on top of POJOs using annotations defined by JAX-WS and JAX-RS, respectively. So can't we just add both sets of annotations to the same POJO and enable SOAP and REST simultaneously?
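Roughly sketched with a made-up greeting service, the idea is that the JAX-WS and JAX-RS annotations co-exist on one class: the container exposes a SOAP endpoint from the @WebService annotations and a REST resource from the @Path ones:

    import javax.jws.WebMethod;
    import javax.jws.WebService;
    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;

    // One POJO, two personalities: a SOAP port type (JAX-WS)
    // and a REST resource (JAX-RS)
    @WebService
    @Path("/greetings")
    public class GreetingService {

        @WebMethod                       // exposed as a SOAP operation
        @GET
        @Path("{name}")                  // exposed as GET /greetings/{name}
        @Produces("text/plain")
        public String greet(@PathParam("name") String name) {
            return "Hello, " + name;
        }
    }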

I can see one very obvious problem with this approach. REST forces one to think Service-Oriented because the verbs refer to what the service consumer does to resources. It's what I've earlier called the Viewpoint Flip, and I believe it's an essential part of Service-Oriented thinking. But SOAP doesn't enforce any such viewpoint. It's possible to have an RMI flavour to the JAX-WS SOAP methods. So there's no substitute for proper design.