Tuesday, October 23, 2007

SOA Technology Best Practice

Let me formalise what's on my mind regarding SOA best practice.

  1. Build your core application using the principles of Domain-Driven Design. This is your service implementation. Hint: If using Java, Spring/JPA is the way to go. EJBsaurus is dead.
  2. Enlist the help of an off-the-shelf domain model like IFW for Banking or IAA for Insurance. This will let you build incrementally without suffering refactoring pain as your model grows.
  3. If the way your clients refer to your services emphasises verbs over nouns ("Jelly this eel"), you may want to look at SOAP-based Web Services, which are operation-oriented services. If your clients emphasise nouns over verbs ("Buy a ticket to the Spice Girls concert"), consider REST-based Web Services, which are resource-oriented services.
  4. Model your interface contract using XML schema in either case. Layer SOAP or REST Data Interchange patterns over this basic XML document foundation.
  5. Follow Contract-First Design. Do not generate your service interface from your implementation classes, even if your vendor gives you cool tools to do so painlessly (Hint: the pain comes later, when every change to your implementation breaks your service contract).
  6. Use a meet-in-the-middle mapping approach to connect the XML documents in the contract to the domain model. Do not generate implementation classes from the service interface. The self-important JAXB library is a bad bug that's going around. Use JiBX or TopLink instead (see the runtime sketch after this list).
  7. When building the Presentation Tier of an application that consumes services, use the SOFEA model.
  8. If you want to do content aggregation using 2007 thinking, consider a mashup. If still stuck in 2003, use a portal server and jump through the associated hoops.
  9. You don't need an ESB product, not now, not ever. Learn to see the "bus" in the way services are built and invoked. Remember that the web with its smart endpoints has comprehensively beaten the telco network with its smart network approach, even in voice communications. If you must use an ESB, deploy it in a decentralised, "smart endpoints" configuration. Obviously, a decentralised ESB is economically viable only with an Open Source product. A commercial one will force you to deploy it in a centralised architecture to save on licence costs (blech!).
  10. You don't need an ESB product to do process orchestration either. Use BPEL engines at endpoints. If you need human workflow, use BPEL4People.
  11. Open Source is the way of the future. Wake up and smell the coffee. For SOAP-based Web Services, consider Sun's Metro SOAP stack (which incorporates Project Tango). For REST, consider Sun's Jersey (Java-based) or SnapLogic (Python-based). For an ESB (decentralised, of course), consider ServiceMix/CeltiXFire/FUSE. For process orchestration, including human workflow, check out Intalio.
  12. Management of your services can be centralised even with decentralised services. That's not an argument for a centralised ESB either.
  13. Take vendor education with a pinch of salt. Talk to fellow practitioners, especially the bold and the irreverent.
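
To make point 6 concrete, here's a minimal sketch of what mapped data binding looks like at runtime with JiBX. The Customer class is hypothetical, and this assumes the JiBX binding compiler has already been run against it with an external binding file (see the post below for what that file looks like):

```java
import java.io.InputStream;
import java.io.StringWriter;

import org.jibx.runtime.BindingDirectory;
import org.jibx.runtime.IBindingFactory;
import org.jibx.runtime.IMarshallingContext;
import org.jibx.runtime.IUnmarshallingContext;
import org.jibx.runtime.JiBXException;

public class CustomerContractMapper {

    // Domain object out to the XML document defined by the contract.
    public String toContractXml(Customer customer) throws JiBXException {
        IBindingFactory factory = BindingDirectory.getFactory(Customer.class);
        IMarshallingContext ctx = factory.createMarshallingContext();
        StringWriter out = new StringWriter();
        ctx.marshalDocument(customer, "UTF-8", null, out);
        return out.toString();
    }

    // Incoming contract document back into the domain model.
    public Customer fromContractXml(InputStream in) throws JiBXException {
        IBindingFactory factory = BindingDirectory.getFactory(Customer.class);
        IUnmarshallingContext ctx = factory.createUnmarshallingContext();
        return (Customer) ctx.unmarshalDocument(in, null);
    }
}
```

Note that neither the class nor the schema references the other; the mapping lives entirely in the binding file.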

Thursday, October 18, 2007

A Java-filled Day in Sydney

Wednesday the 17th October, 2007.

Sydney had a surfeit of Java today. First, there was the all-day Sun Developer Day at the Wesley Theatre on Pitt Street. And then after that, the Sydney Java Users Group had a meeting at the same place where Mike Keith (of Oracle and the EJB3 spec committee) spoke about JPA.

Where do I begin? I guess my takeaways were that we shouldn't write off NetBeans or Glassfish yet. There was a time, I admit, when I was one of those who thought Sun should just give up on building its own IDE and app server. I may have been wrong. The upcoming NetBeans 6.0 seems quite cool, not least because they seem to have shamelessly stolen a lot of IntelliJ IDEA's cool features.

Glassfish doesn't seem bad either. It's no longer a toy. It's clusterable, for one. And Sun's Project Tango (interoperability with Microsoft's implementation of Web Services in .NET 3) is now part of Glassfish. That means the Java world gets an advanced Web Services platform absolutely free. That should count for something. And Glassfish v3 is supposed to start up in under a second, so it should really wow people when it debuts.

There was a great talk on building RESTful Web Services by Sun's Lee Chuk Munn, which has inspired me to try and download Jersey and give it a spin. Should I look at Phobos as well - a JavaScript engine for a RESTful server? Decisions, decisions.
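
For the record, here's roughly what a Jersey resource looks like. This is a hedged sketch based on the JAX-RS (JSR 311) annotations (early Jersey builds spell some of these differently), with a made-up OrderResource:

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;

// Resource-oriented: the URI names the noun, HTTP supplies the verbs.
@Path("/orders/{orderId}")
public class OrderResource {

    // GET /orders/42 returns an XML representation of order 42.
    @GET
    @Produces("application/xml")
    public String getOrder(@PathParam("orderId") String orderId) {
        // A real resource would look up the order and marshal it properly.
        return "<order id=\"" + orderId + "\"/>";
    }
}
```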

Angela Caicedo was another Sun evangelist who spoke very well about new features in Java 6 and 7, and also about building mobile applications using Java, AJAX and SVG. I'm just a bit disappointed that a mechanism I thought of but didn't widely publicise is now being used for exposing Web Services. At its heart are the Java 5 concepts of Responses and Futures: a way to turn verbs into nouns, so that what would otherwise be standalone "methods" become classes (which are naturally first-class objects).
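
To illustrate the mechanism with a made-up TransferFunds operation: the "verb" is reified as a class (a noun that can be queued, passed around or exposed remotely), and the Future plays the role of the response:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// A standalone "method" turned into a first-class object.
public class TransferFunds implements Callable<String> {

    private final String from;
    private final String to;
    private final long cents;

    public TransferFunds(String from, String to, long cents) {
        this.from = from;
        this.to = to;
        this.cents = cents;
    }

    public String call() {
        // A real implementation would invoke the domain model here.
        return "Transferred " + cents + " cents from " + from + " to " + to;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> response = executor.submit(new TransferFunds("A", "B", 500));
        System.out.println(response.get()); // block until the response arrives
        executor.shutdown();
    }
}
```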

Glassfish is built on top of the java.nio.* package, which makes it very fast, I believe.

In the evening, I attended Mike Keith's talk on JPA. I'm pleasantly surprised to see that I'm not the only one advocating a mapped-data binding approach. Mike told us about Eclipse MOXy, which provides a "meet-in-the-middle" approach. I wonder if they're aware of JiBX.

Wednesday, October 10, 2007

Services, Persistence and a Rich Domain Model

I see so many wrong-headed ideas about SOA and Web Services that I want to scream.

Here are just some:

1. Using the RPC style of SOAP Web Services (Wake up, Rip Van Winkle, it's 2007!)
2. Thinking that buying an Integration Broker or ESB will itself solve your integration problems (Sure, your shiny telephone lets you dial any international phone number, but does the person at the other end speak your language?)
3. Generating WSDL files and Web Service endpoints automatically from Java classes (So what happens to your service contract when you next change some aspect of your implementation?)

It's the last of these points that I want to talk about in this post.

On the face of it, the problem can be stated in a deceptively simple manner, but there are significant subtleties to it.

The application we are building uses (say) Java (it could be C# and .NET for all we care). The way we have decided to interact with other systems is through Web Services. That means SOAP messages will be exchanged between our systems. Since the document/literal style of SOAP-based messaging requires an XML document to be embedded within the SOAP body, we can see that the service contract is basically a set of XML documents.

So our problem can be simplistically stated as follows:
How do we convert between Java objects used within our system and XML documents exchanged with other systems?

There are two "obvious" answers to this question - either generate the XML document(s) from your Java classes, or generate Java classes from the XML documents. Java IDEs tend to take the former approach. In fact, many of them go a step further by not only generating XML documents but SOAP endpoints as well, and WSDL definitions of the SOAP endpoints that you can conveniently distribute to your client systems. As I suggested earlier, this approach is wrong because it breaks the service contract (by generating a new WSDL file) whenever the Java classes or method signatures change.

There are also a bunch of tools that take the latter approach. Apache's ADB, XMLBeans and the ubiquitous JAXB library that comes bundled with your JDK all try to generate their own versions of a set of Java classes that they believe represent your XML document. Looking at the resultant code can make you lose your appetite very quickly. I'm surprised that so many people gamely continue even after looking at the results of what is very patently a wrong approach.

(This is a blog. Bloggers are meant to be opinionated.)

I believe I know how the problem should really be tackled. Both answers above are wrong. Neither the XML document nor the Java classes must be generated from the other. The Service Contract (represented by the XML document) and the Domain Model (represented by the Java classes) are both First Class Entities. By this, I mean that both are concepts that stand independently. The Service Contract is something that is negotiated between service provider and consumer. It must have no implied dependencies upon how the service is implemented. The Domain Model, on the other hand, represents the designer's deep understanding of the business application and how it really works. This should be independent of how other systems may want to interact with it.

And so my answer to the problem is that we need a third entity, a mapping entity, that sits outside both the Service Contract and the Domain Model and maps them to each other in as flexible and bi-directional a manner as possible. We have such tools today. Their names are Castor and JiBX.
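
For illustration, a JiBX binding file for a hypothetical Customer class might look something like this sketch (element names and fields invented; see the JiBX documentation for the full format):

```xml
<!-- binding.xml: the third entity, sitting outside both schema and classes.
     It maps the contract's element names onto the domain model's fields,
     so either side can evolve and only this file changes. -->
<binding>
  <mapping name="customer" class="com.example.domain.Customer">
    <value name="customer-id" field="id"/>
    <value name="family-name" field="surname"/>
    <structure name="postal-address" field="address">
      <value name="city" field="city"/>
      <value name="post-code" field="postcode"/>
    </structure>
  </mapping>
</binding>
```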

The tragedy of the Java-XML translation space today is not that most practitioners think it is a solved problem (it is). The tragedy is that they think the solution is JAXB. The correct answer is JiBX, and they get no points for getting the letters a bit jumbled. (Think mapping, not code generation.)

Mapped data binding that respects the independence of the two entities it maps is the correct solution approach to Java-XML translation.

There is a second reason why mapped data binding works so well, and it has to do with what is called an impedance mismatch. Impedance mismatch is a term borrowed from Electrical Engineering, and refers to the difficulty of translating concepts from one paradigm to another. Classes in Object-Oriented systems and tables in Relational Databases have an impedance mismatch. Classes and XML schemas have an impedance mismatch.

The Object-Relational impedance mismatch has been satisfactorily overcome for a few years now by ORM (Object-Relational Mapping) tools. Hibernate, TopLink JPA and OpenJPA (formerly Kodo) come to mind. You will notice a similar approach with these tools to what I suggested was the solution to Java-XML translation. They respect both paradigms and can map existing instances of each using mapping files that sit outside both.
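
For comparison, a minimal JPA sketch for the same hypothetical Customer. The annotations (or, closer in spirit to what I'm describing, an external orm.xml mapping file) connect an existing class to an existing table without generating either from the other:

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

// An existing domain class mapped onto an existing relational table.
@Entity
@Table(name = "CUSTOMER")
public class Customer {

    @Id
    @Column(name = "CUSTOMER_ID")
    private Long id;

    @Column(name = "SURNAME")
    private String surname;

    // getters and setters omitted for brevity
}
```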

So overcoming the Java-XML impedance mismatch is a second good reason to use a mapped data binding approach like JiBX or Castor.

[I'm told that JiBX makes the use of XML as a data interchange mechanism as fast as native RMI. I just love it when the right way to do something is also made attractive.]

My third and final way of looking at the problem is from the traditional Java view of Domain Objects and Data Transfer Objects. Data Transfer Objects have never been considered true objects because they only encapsulate state, not behaviour. However, they serve a useful purpose as a data interchange mechanism. It's considered bad practice (except by the Hibernate crowd and their queer notion of detached entities) to directly expose your Domain Objects to the outside world. You should decouple your implementation from your service interface. This decoupling is done by using Data Transfer Objects (DTOs) as very specialised data structures visible to other systems through the service interface. DTOs are marshalled from Domain Objects and unmarshalled back to Domain Objects during service interactions. This is a paradigm well known to most J2EE developers. Looked at in this way, XML documents used in Web Services are just DTOs.
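
A sketch of the Dozer side, assuming hypothetical Customer and CustomerDTO classes with matching property names (the package name has moved between Dozer versions, so treat the import as indicative):

```java
import org.dozer.DozerBeanMapper;

public class CustomerAssembler {

    private final DozerBeanMapper mapper = new DozerBeanMapper();

    // Marshal: flatten the rich domain object into a dumb data structure
    // fit for exposure through the service interface.
    public CustomerDTO toDto(Customer customer) {
        return mapper.map(customer, CustomerDTO.class);
    }

    // Unmarshal: rebuild domain state from an incoming DTO.
    public Customer fromDto(CustomerDTO dto) {
        return mapper.map(dto, Customer.class);
    }
}
```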

The best technology to marshal and unmarshal Java DTOs is Dozer, just as the best technology to marshal and unmarshal XML DTOs is JiBX.

I see a beautiful symmetry in the way all these tools and technologies inter-relate. At the centre is the rich Domain Model that implements your application. To persist the application's state, you use Hibernate or another ORM tool to map your Domain Objects to relational tables. To talk to remote Java clients, you map your Domain Objects to DTOs using Dozer. To expose services to clients that require a Web Services interface, you use JiBX or Castor to map your Domain Objects to XML documents.

This picture should be worth the thousand words above.

Sunday, October 07, 2007

Who is the best Vendor of Collaboration Technology?

Collaboration is a hot topic in the industry today, and every organisation wants to set up an advanced collaboration capability for their employees and customers. My personal view is that collaboration has more to do with culture than technology. So if an organisation's culture is not inherently collaborative, merely buying collaboration technology will leave a hole in their budgets but not actually achieve very much.

But catch vendors telling customers that.

Listening to them tell it, the key to success in collaboration capability seems to be in implementing the most integrated "stack" of products you can find. The theory goes that if you find a vendor with a rich and integrated stack of "collaboration" products, buying and deploying that stack will lead to collaboration nirvana (canned laughter here).

It's funny enough seeing stodgy, hierarchical organisations trying to acquire collaboration capability without becoming any less stodgy or hierarchical, but just look at who the vendors are!

You shouldn't be surprised to find that the "leaders" in the collaboration space (self-styled, or anointed as such by analyst firms with backward-facing crystal balls) are large, stodgy and hierarchical software vendors. The usual suspects, IBM and Microsoft, lead the pack. Blind leading the blind is the expression that leaps to mind.

Face it, buying collaboration technology from large software vendors is like importing voting machines from dictatorships. What do these guys know about this stuff anyway? They wouldn't recognise it if it bit them in a sensitive area. I bet they wouldn't be prepared to face its implications in their own organisations/countries, but they don't seem to mind pushing it onto others. (OK, that's a bit unfair, because many technical people in companies like IBM and Microsoft do collaborate quite effectively, but I'm talking about a larger organisation-wide culture.)

I believe that organisations that want to make a success of collaboration must (1) ensure first that their cultures are collaborative and (2) source their collaboration technology from collaborative communities, e.g., the Open Source community.

Collaboration is what Open Source communities do all the time. Newsgroups and mailing lists, IRC, blogs and wikis, RSS and Atom, mashups: all born of the necessity of collaboration across far-flung communities of peers. These products and technologies are truly collaborative because the people that built them use them for true collaboration, not the sanitised, managed and controlled collaboration that corporate bosses have in mind.

And the "stack" theory is my pet hate. In my evolved view (and I say this without vanity), I believe that the best integrated products are those that were not built as a "family of products" by a single entity but those that were built by independent groups that operated in a decoupled way but had no hidden agendas about locking out their competitors. That's why you can run a Linux desktop with a Firefox browser, a MySQL database and your own local Apache web server running PHP, and build applications using this suite of products without having to think about where they came from. The Linux community, the Mozilla Foundation, the Apache Software Foundation, the PHP community, the MySQL community and MySQL AB have all collaborated (yes, that's the word) to deliver to you, the developer, a seamlessly integrated set of products. Is it a monolithic "stack"? Nonsense. The myth that software products need a common brand to be interoperable is just that - a myth. Open Source software groups don't have ulterior objectives of locking each other out. That's why interoperability happens, - because users want it.

Will decision makers get this? I believe it's an ongoing process of learning and maturity. Hopefully the presence of Generation Y in the workforce will teach the rest of us how to really collaborate. And put away those chequebooks, because the best things in life really are free.

Life above the Service Tier

Or "How to Build Application Front-ends in a Service-Oriented World"

(This is a paper I wrote with two co-authors, Rajat Taneja and Vikrant Todankar, the same team that critiqued the EJB 3.0 spec in 2004.

You can download and read the paper and an accompanying presentation.

Update Sep 2008: How SOFEA fits into a SOA Ecosystem)

But let me talk a bit about it here.

When we look at the Business Logic Tier, although the industry hasn't reached nirvana yet, the story seems to be coming together rather well. The Service-Oriented Architecture (SOA) message is being heard, with its concepts of loose coupling between systems, removal of implicit dependencies and specification of all legitimate dependencies as explicit contracts. The SOA model yields flexibility and uniformity through the specification of a definite Service Interface, which hides domain models and implementation technologies.

Unfortunately, no such clear architectural blueprint exists for the Presentation Tier that lies just above the Service Interface. There are a dozen different technologies available here. There are also the thin-client and rich-client models to choose from. Which of these is the "right" one to use? What does industry best practice say about developing the front-end of an application? What is the best way to connect the front-end neatly into the Service Interface? Developers are confused and left without guidance.

And while on the subject of connecting the front-end to the Service tier, an additional source of confusion is the fact that the Service Tier may itself follow one of two different models - SOAP-based Web Services or REST. Which one should we target? We believe that both these models will continue to co-exist, and organisations will have legacy business services in both these models. Therefore a good Presentation Tier technology will be capable of interfacing with both SOAP and REST, not just one of them. It is important to note that both SOAP and REST use (or are capable of using) XML documents for data interchange.

When we put on our architects' hats and look at the Presentation Tier, we see that regardless of whether it is a thin client or a rich client, three essential processes always take place.

The first is what we call Application Download. In order to be able to run an application on a client device, it needs to be delivered to the device in some fashion.

When the application is in use, a sequence of screens is typically seen. We call this Presentation Flow.

Finally, any non-trivial application ultimately exchanges data with some back-end resource, a process we call Data Interchange. This is usually the crux of the application's functionality.

To reiterate, the three processes occurring in the Presentation Tier of any application are Application Download (AD), Presentation Flow (PF) and Data Interchange (DI).

In thin-client applications, the web server plays a part in all three processes. Application Download occurs piecemeal, in the form of individual web pages served up by the web server. The web server also drives the application's Presentation Flow, usually with the help of a web framework like Struts or Spring MVC. Finally, the web server acts as an intermediary for Data Interchange between browser and application server.

We believe that the thin-client model as described above suffers from at least three major architectural flaws. Unless these flaws are addressed, we cannot arrive at a clean architectural blueprint for the Presentation Tier.

The first flaw is that the thin-client model does not respect data. Look at the data that goes from the browser to the web server. There is no data structure. It's just a set of name-value pairs. There are no data types. Everything is a string. And there are no data constraints. We can put any values into those name-value pairs.

Look at the data that comes back from the web server to the browser. It's highly presentation-oriented, marked-up data. The thin-client model does not respect data as data. We believe we can do much better than this in this day and age.

The second flaw is that the thin-client model tightly couples the mutually orthogonal processes of Presentation Flow and Data Interchange. It is not possible to move through the steps of the Presentation Flow without initiating what amounts to Data Interchange operations. Web pages are displayed in response to GET and POST requests sent by the browser. Even worse, every Data Interchange operation initiated by the browser willy-nilly forces a Presentation Flow. An infamous result of this tight coupling is the "browser back-button problem". When a user tries to step backwards page-by-page through the Presentation Flow, the browser re-does the Data Interchange operations that resulted in each of those pages. If any of those operations is a POST, it spells trouble, because POST operations are not safe or "idempotent". They have side-effects if re-done.

True, there are patterns such as POST-Redirect-GET that are used to work around this problem, but the inescapable fact is that the fundamental architecture is broken because of this tight coupling.
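
For readers who haven't met it, POST-Redirect-GET looks something like this hypothetical servlet: the POST performs the side-effecting operation, then redirects, so the page the user lands on (and can safely revisit with the back button) is the result of an idempotent GET:

```java
import java.io.IOException;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class PlaceOrderServlet extends HttpServlet {

    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String orderId = placeOrder(req.getParameter("item")); // the side effect
        resp.sendRedirect("orderConfirmation?orderId=" + orderId); // then a GET
    }

    private String placeOrder(String item) {
        return "12345"; // stand-in for the real Data Interchange operation
    }
}
```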

The third flaw is that the web model is request/response. It does not support peer-to-peer interaction styles that are required for server event notification.

At this juncture, one would be tempted to conclude that AJAX is the answer to these problems. Unfortunately, AJAX itself is just a raw capability and not a prescriptive model. It is possible to use AJAX and still come up with a horrible hybrid model where the web server continues to drive Presentation Flow in response to Data Interchange operations, and the AJAX interaction just hangs off to one side, so to speak.

Clearly, what we need is a prescriptive architectural model that explicitly delineates the dos and don'ts of application design.

We call this model SOFEA (Service-Oriented Front-End Architecture).

In the SOFEA model, there is an entity called a Download Server. This role may be performed by a standard web server, but it is strictly restricted to enabling Application Download. The Download Server stays out of the loop thereafter and plays no part in Presentation Flow or Data Interchange. This is the most significant difference between the SOFEA model and the traditional thin-client model.

The application is downloaded onto an Application Container, which could be a browser, a Java Virtual Machine, a Flash runtime or even the native operating system itself. The nature of the Application Container does not matter. What matters is that there is an environment within which the application can run.

The application is built using a proper MVC architecture, not the Front Controller pattern used by traditional thin-client applications. The Controller drives Presentation Flow.

The Controller also asynchronously initiates Data Interchange with the Service Interface as required.

We stipulate that a SOFEA-conforming application will support both SOAP and REST-based Data Interchange through XML documents. This is the seamless interface that we envisage between the Presentation Tier and the Service Tier. These XML documents, being specified through a schema definition language, will enforce data structures, data types and data constraints at the point of capture and ensure data integrity end-to-end. The XML documents of the service interface correspond to a representation of the domain, or (to a Java programmer) Data Transfer Objects (DTOs). They're ideal for transferring data marshalled from domain objects or unmarshalled into domain objects. Thus, while XML documents are a connection point between the Presentation and Service Tiers, they also decouple the client from the domain model, which is as it should be.
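
A fragment of such a contract schema might look like this (a hypothetical ticketOrder document, purely for illustration):

```xml
<!-- Structure, types and constraints are explicit, so bad data is rejected
     at the point of capture instead of travelling as untyped name-value
     strings. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="ticketOrder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="concertId" type="xs:positiveInteger"/>
        <xs:element name="seats">
          <xs:simpleType>
            <xs:restriction base="xs:integer">
              <xs:minInclusive value="1"/>
              <xs:maxInclusive value="10"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```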

For maximum flexibility, Data Interchange should support Peer-to-Peer interaction. A strict client/server delineation as in the case of traditional thin-clients is likely to be limiting.

The SOFEA model unifies the "thin" and "rich" client models. Indeed, these labels are meaningless when systems conform to the SOFEA model. The differences between technologies form a spectrum based on the application's footprint and its startup time. There is no sharp demarcation between thin and rich clients anymore.

The SOFEA model also highlights why there is a plethora of web frameworks. SOFEA explicitly repudiates Front Controller as an anti-pattern. Driving Presentation Flow from the web server is tied to a fundamental architectural flaw (the coupling to Data Interchange). Therefore no web framework can ever be satisfactory. Continued innovation in search of a "better" web framework is, we believe, completely misguided.

The SOFEA model can be implemented through any of the following modern technologies:

I DHTML/AJAX frameworks for Current Browsers
1. Handcoded with third party JavaScript libraries
2. Google Web Toolkit (GWT, GWT-Ext)
3. TIBCO General Interface Builder
II XML Dialects for Advanced Browsers
4. XForms and XHTML 2.0
5. Mozilla XUL
6. Microsoft Silverlight/XAML
III Java frameworks
7. Java WebStart (with/without Spring Rich Client)
8. JavaFX
IV Adobe Flash-based frameworks
9. Adobe Flex
10. OpenLaszlo

Obviously, different technologies will support, out of the box, different subsets of the SOFEA model. It's up to designers to consciously apply SOFEA principles to their application design, making the appropriate decisions with regard to their technology of choice.

Our contribution through the SOFEA model is the following:
  1. A renewed emphasis on data integrity, which traditional thin-client technology does not and cannot enforce.
  2. A cleaner architectural model that decouples the orthogonal concerns of Presentation Flow and Data Interchange.
  3. Affirmation of MVC rather than Front Controller as the natural pattern to control Presentation Flow.
  4. Unification of the thin-client and rich-client models, now seen as an artificial distinction.
  5. Support for SOAP- and REST-based business services, and a natural integration point between the Presentation and Service Tiers.
  6. Positioning the web server as a Download Server alone. The evils of web server involvement in Presentation Flow and Data Interchange are avoided.

We believe we have provided an architectural blueprint for the Presentation Tier, with clear dos and don'ts, while still supporting considerable diversity in technologies. Life above the Service Tier now lives by much the same rules as life within it.

Tell us what you think.