Session E-DIST

Building Distributed Web
Applications Using XML

Rick Strahl
http://www.west-wind.com

Resources:

wwxml Converter:
http://www.west-wind.com/wwxml.asp

The next wave of application development is upon us in the form of distributed applications that communicate data rather than presentation between client and server sides. XML is becoming the messaging standard of choice and the primary mechanism for providing the protocol of communication. In this article Rick describes why XML is a great fit for this architecture and introduces some free tools you can use to easily take advantage of XML to build distributed applications that work over the Web.

XML is shaping up as a key technology in distributed applications. You couldn't have missed the hype in the trade papers. But unlike other over-hyped technologies, XML has been rapidly accepted because it solves very specific technical problems by using a standard protocol and data representation. The simple act of agreeing on a common format for data is making data exchange drastically easier than it ever was before.

In this article I want to discuss XML from the perspective of a messaging and data representation mechanism in distributed applications. For these applications XML is the ideal transport mechanism as it can open up applications to all sorts of clients including Fat Client, Thin Client, Browser based and even Palm Top applications. I'll introduce some concepts for generically persisting data and objects to XML. These concepts make it possible to build flexible solutions that work well both in standalone applications and in distributed applications that share data over the Web: you build smart components and objects, and use XML to represent their data as it travels over the Web.

In the process I'll also introduce some free Visual FoxPro and COM tools that help you persist your objects and data into XML quickly and efficiently with a few lines of code. I'll also introduce the client side tools you can use to consume the XML data transparently in a browser and in a Visual FoxPro Fat Client application. But before we dig into code, let's review the premise of XML as it relates to a message mechanism used to represent and transfer data.

XML as a Data Representation and Messaging Standard

XML as a whole can seem rather complex to the first-time user because of the many related technologies such as XSL, schemas and data definitions. Those offshoots provide powerful functionality to XML, but they don't exactly make learning it easy. However, at the core XML is a straightforward standard that's easy to understand and work with thanks to a standard XML parser object model which is consistent on different development platforms (client/server, Thin Client/Fat Client, Windows/Unix/Mac).

Even in its simplest form XML as a standard has lots of potential as a data representation and messaging mechanism. Data representation typically involves translating the data from a native format into XML, and then back into the same format or even a completely different one on the other end of a connection. A good example may be a Visual FoxPro/VB application publishing some of its data via XML and another application, possibly a Java applet running on Unix, picking up that XML data and using it internally. Now consider for a second that this data is dynamically generated on a Web server, with the Java applet asking for specific data and receiving the result back as XML.

XML as a standard data exchange format

As you might expect, this is particularly useful in Web applications where XML can provide a standard, agreed upon format to transfer data over the Internet. In this respect XML is similar to data formats of the past such as Comma Delimited or SDF files. However, as a data representation format XML is also much more flexible than these old formats, because it can carry just about any kind of data. Most other text formats have been hampered by their limited ability to transport complex data like memo fields or binary data. XML is very flexible in what kind of data it can carry because of its tag based language definition where every XML data element is marked with a start and end tag (<tag>value</tag>).
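To illustrate the point about memo fields and binary data, here is a minimal sketch in Python using the standard xml.etree.ElementTree module. The record, notes and logo element names and the sample values are hypothetical, and base64 is only one common way to carry binary data as text, not something the article prescribes.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical record: a multi-line memo field and a few raw bytes.
memo_text = "Line one, with a comma\nLine two with <angle brackets> & an ampersand"
binary_blob = bytes([0, 255, 16, 32])

record = ET.Element("record")
ET.SubElement(record, "notes").text = memo_text
# Raw bytes are not valid XML text, so they are base64 encoded first.
logo = ET.SubElement(record, "logo", encoding="base64")
logo.text = base64.b64encode(binary_blob).decode("ascii")

# Serializing escapes the markup characters in the memo automatically.
xml_text = ET.tostring(record, encoding="unicode")

# Round trip: parse the XML again and recover both values unchanged.
parsed = ET.fromstring(xml_text)
restored_memo = parsed.findtext("notes")
restored_blob = base64.b64decode(parsed.findtext("logo"))
```

A comma delimited file would need its own quoting rules for the embedded comma and line break, whereas here the start and end tags delimit the value and the parser handles the escaping.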

Support for multiple sub-documents

Another huge benefit of XML as a data representation mechanism is that XML can combine multiple pieces of data into a single document. The markup language has support for stacked and hierarchical data representation. XML documents can combine several separate entities (be it tables, objects, messages or metadata) into a single XML document. For example, you can send the actual data of, say, a table, as well as a message header that describes the data or contains any error conditions that might have occurred in obtaining that data. You could also combine multiple tables into a single document, or a table and an object both rendered as XML.

A stacked document may look like this:
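As a minimal sketch, the Python snippet below embeds one possible stacked document and lists its second-level fragments. Only the docroot, customer, invoices, jobinfo and errors element names come from the text. The fields inside them and the sample values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stacked document: four unrelated fragments under one root.
STACKED_DOC = """<?xml version="1.0"?>
<docroot>
  <jobinfo>
    <requested>customer and invoices</requested>
  </jobinfo>
  <errors>
    <error>
      <message>Invoice 1003 could not be read</message>
    </error>
  </errors>
  <customer>
    <row>
      <custid>C001</custid>
      <company>Hypothetical Widgets Inc.</company>
    </row>
  </customer>
  <invoices>
    <row>
      <invno>1001</invno>
      <custid>C001</custid>
      <total>125.50</total>
    </row>
    <row>
      <invno>1002</invno>
      <custid>C001</custid>
      <total>80.00</total>
    </row>
  </invoices>
</docroot>
"""

root = ET.fromstring(STACKED_DOC)

# Each direct child of the root is an independent fragment.
section_names = [child.tag for child in root]
invoice_count = len(root.findall("invoices/row"))
first_error = root.findtext("errors/error/message")
```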

There are multiple data representations in this single document such as customer, invoices and even the jobinfo and errors XML fragments. Note that you can only have a single root element (<docroot> in this case), but you can nest multiple items on the second level and down. The XML fragments may be totally unrelated to each other or they may all be related - it's entirely up to your implementation.

XML can also represent hierarchical data. Hierarchical data is extremely useful for packaging related data in a logical fashion that is easy to read and group without relational concepts. Instead of representing say an invoice as a set of related tables, you can actually have an invoice XML fragment, which nests inside of it the invoice header, the customer information, and a set of lineitems:
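A nested invoice of this kind might be sketched as follows. The snippet is Python only so the XML can be parsed and checked. The header, customer and lineitems grouping follows the description above, while the individual field names and values are hypothetical.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET

# Hypothetical hierarchical invoice: header, customer and line items
# are nested inside a single invoice element.
INVOICE_DOC = """<invoice>
  <header>
    <invno>1001</invno>
    <invdate>2000-01-15</invdate>
  </header>
  <customer>
    <custid>C001</custid>
    <company>Hypothetical Widgets Inc.</company>
  </customer>
  <lineitems>
    <item>
      <sku>WIDGET-1</sku>
      <qty>2</qty>
      <price>10.00</price>
    </item>
    <item>
      <sku>GADGET-7</sku>
      <qty>1</qty>
      <price>5.50</price>
    </item>
  </lineitems>
</invoice>
"""

invoice = ET.fromstring(INVOICE_DOC)

# Related data is reached by walking the hierarchy, with no joins needed.
company = invoice.findtext("customer/company")
total = sum(
    Decimal(item.findtext("qty")) * Decimal(item.findtext("price"))
    for item in invoice.findall("lineitems/item")
)
```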

XML provides flexibility and the ability to do things in a single pass that may otherwise require multiple passes. With XML I'm able to send multiple files at once, where with an encoded file I'd have to make multiple requests to the server. Also, what happens if there's a problem? With file encoding there's no standard form to report errors. With XML, an error is immediately obvious from an error header that is part of the document. You wouldn't have to explain the error format to anybody - one look at the result XML would tell the story.

All of this provides for a lot of flexibility in how the data is packaged for using XML in messaging between applications either locally or remotely over the Internet. The cool thing is that you as the developer can determine the level of complexity you want to implement. You can go the simple route and simply dump data into XML or you can build a whole framework that deals with error handling and processing instructions implemented through special XML fragments that are part of the complete XML document as I hinted at in the first XML example. You can use the parser and build complex hierarchical objects, or you can use plain application development code to generate the XML as a string yourself, or you can use the wwXML generating class I'll describe shortly.

Version Independence

If you use XML you're also not relying on a specific binary mechanism like COM, which requires a binary contract between the client and server. With XML you're not bound to a fixed property interface: the data structures themselves can change by way of the XML structure without affecting the binary data representation. The XML may eventually map to a binary object, but there's an intermediary layer that pulls the data out and knows what to do with it.

This makes for easier maintenance as you're not relying on binary binding and recompiling your code to make changes to data structures. This makes it a snap to add new functionality without breaking compatibility with existing clients. Old and new functionality can coexist with the same data structures without breaking either version of the app. Older clients simply don't include the new data and your application can handle this accordingly, while newer clients can use the additional data as needed.
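The coexistence of old and new clients can be sketched in a few lines of Python. This is an illustration under assumptions, not the wwXML implementation: the customer element, its fields and the later-added email field are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: an older client omits <email>, a newer one sends it.
OLD_CLIENT_XML = (
    "<customer><custid>C001</custid>"
    "<company>Hypothetical Widgets Inc.</company></customer>"
)
NEW_CLIENT_XML = (
    "<customer><custid>C001</custid>"
    "<company>Hypothetical Widgets Inc.</company>"
    "<email>info@example.com</email></customer>"
)


def read_customer(xml_text):
    """Current reader: uses <email> when present, falls back when absent."""
    node = ET.fromstring(xml_text)
    return {
        "custid": node.findtext("custid"),
        "company": node.findtext("company"),
        "email": node.findtext("email", default=""),
    }


def read_customer_v1(xml_text):
    """Older reader: picks the fields it knows and ignores everything else."""
    node = ET.fromstring(xml_text)
    return {
        "custid": node.findtext("custid"),
        "company": node.findtext("company"),
    }


from_old_client = read_customer(OLD_CLIENT_XML)     # email defaults to ""
from_new_client = read_customer(NEW_CLIENT_XML)     # email is filled in
old_reader_view = read_customer_v1(NEW_CLIENT_XML)  # extra element ignored
```

Because each reader looks elements up by name rather than by position or binary layout, adding a field breaks neither side.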

Object Persistence

XML can also provide a good way to persist data from an abstract element such as an object that exists in memory. XML's hierarchical structure makes a very good fit for mapping complex nested objects into XML structures. Typical real world objects tend to be hierarchical and a good programmatic implementation of these objects can closely map those relationships. The invoice example above is a simple case: you'd have an oInvoice object with sub objects for oCustomer, oLineItems and so on.

These types of objects are great for passing data around locally, but when the time comes to persist this data, for example to send it over a Web connection, things get trickier. XML can provide a great way to take a snapshot of the object and store the contents in a persistable form - typically a file or a database. Or you can take the result and send it over the Web, essentially marshalling an object from a client to the server.
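The snapshot and restore cycle described above can be sketched with Python dataclasses standing in for the oInvoice, oCustomer and oLineItems objects. The class and field names are hypothetical and the generic walker is a simplified assumption, not the tool the article introduces later.

```python
from dataclasses import dataclass, field, fields, is_dataclass
import xml.etree.ElementTree as ET


@dataclass
class Customer:
    custid: str
    company: str


@dataclass
class LineItem:
    sku: str
    qty: int


@dataclass
class Invoice:
    invno: str
    customer: Customer
    lineitems: list = field(default_factory=list)


def to_element(name, value):
    """Generically map an object, list or simple value to an XML element."""
    node = ET.Element(name)
    if is_dataclass(value):
        for f in fields(value):
            node.append(to_element(f.name, getattr(value, f.name)))
    elif isinstance(value, list):
        for item in value:
            node.append(to_element("item", item))
    else:
        node.text = str(value)
    return node


def invoice_from_element(node):
    """Rebuild the object graph from its XML snapshot."""
    customer = Customer(
        custid=node.findtext("customer/custid"),
        company=node.findtext("customer/company"),
    )
    items = [
        LineItem(sku=i.findtext("sku"), qty=int(i.findtext("qty")))
        for i in node.findall("lineitems/item")
    ]
    return Invoice(invno=node.findtext("invno"), customer=customer, lineitems=items)


original = Invoice(
    invno="1001",
    customer=Customer(custid="C001", company="Hypothetical Widgets Inc."),
    lineitems=[LineItem(sku="WIDGET-1", qty=2), LineItem(sku="GADGET-7", qty=1)],
)

# Persist to a string (which could go to a file, a database or over HTTP) ...
xml_text = ET.tostring(to_element("invoice", original), encoding="unicode")
# ... and restore an equal object on the other end.
restored = invoice_from_element(ET.fromstring(xml_text))
```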

This has a large number of applications. Transferring data over the Web is an obvious choice. For example, a client application can build a purchase order object and send that object persisted as XML to the server. The server validates the PO and then either saves it into the database (unwinding the hierarchical relationship into a relational DBMS structure) or sends back a note that says the PO could not be accepted.

Another extremely useful application of persisted objects in XML is server session state. Servers, especially Web servers, should be stateless, but most Web applications need to track state for the users of a site. Persisting real binary (COM) objects across multiple requests is problematic in terms of resources, scalability and use across multiple machines. But persisting state as strings in a database table via XML is fairly lightweight and flexible. As long as the data can be written somewhere persistent it can be easily retrieved using XML.
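As a rough sketch of session state kept as XML strings in a database table, the following Python example uses an in-memory SQLite database. The sessions table, its columns and the state keys are assumptions for illustration. A real server would use its own data store.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical session table; an in-memory database keeps the sketch runnable.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (session_id TEXT PRIMARY KEY, state_xml TEXT)")


def save_state(session_id, state):
    """Persist a flat dict of session values as one XML string.

    Keys must be valid XML element names in this simplified version.
    """
    root = ET.Element("session")
    for key, value in state.items():
        ET.SubElement(root, key).text = str(value)
    db.execute(
        "INSERT OR REPLACE INTO sessions VALUES (?, ?)",
        (session_id, ET.tostring(root, encoding="unicode")),
    )
    db.commit()


def load_state(session_id):
    """Read the XML string back into a dict; values come back as strings."""
    row = db.execute(
        "SELECT state_xml FROM sessions WHERE session_id = ?", (session_id,)
    ).fetchone()
    if row is None:
        return {}
    return {child.tag: child.text or "" for child in ET.fromstring(row[0])}


# One request stores the state, a later request (possibly on another
# machine sharing the same database) picks it up again.
save_state("abc123", {"userid": "rick", "cartitems": 3})
state = load_state("abc123")
```

The server process itself holds nothing between requests. Everything it needs is rebuilt from the stored string.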

For example you could have a COM object that can accept an XML input parameter, and generate XML output. By doing so, you can use that same object on a Web backend to process incoming Web requests directly over the Web using exactly the same code base. This is one of the biggest points for XML in my experience - the ability to apply logic directly without change in multiple layers of applications, just by using a common XML interface to pass messages around. Once such a mechanism is in place it doesn't matter much whether you use COM, HTTP or any other client to access your code, either directly or indirectly over the Internet.
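The XML in, XML out idea can be sketched as a plain function. In the article's setting this would be a method on a COM object. Here a hypothetical Python function process_order stands in for it, and the order, response, status and error element names are assumptions.

```python
import xml.etree.ElementTree as ET


def process_order(request_xml):
    """Accept an order as XML text and return the result as XML text."""
    response = ET.Element("response")
    try:
        order = ET.fromstring(request_xml)
        qty = int(order.findtext("qty", default="0"))
    except (ET.ParseError, ValueError):
        # Malformed XML or a non-numeric quantity is reported in the reply.
        ET.SubElement(response, "error").text = "Request could not be read"
        return ET.tostring(response, encoding="unicode")

    if qty <= 0:
        ET.SubElement(response, "error").text = "Quantity must be greater than zero"
    else:
        ET.SubElement(response, "status").text = "accepted"
        ET.SubElement(response, "qty").text = str(qty)
    return ET.tostring(response, encoding="unicode")


# The same function can be called directly by local code or from a Web
# request handler that passes along the posted XML unchanged.
accepted = process_order("<order><sku>WIDGET-1</sku><qty>2</qty></order>")
rejected = process_order("<order><sku>WIDGET-1</sku><qty>0</qty></order>")
garbled = process_order("this is not xml")
```

Since both the input and the output are strings, the caller needs no knowledge of how the function is hosted.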

Distributed Applications and XML

XML is a near perfect fit for many distributed applications precisely because it is so flexible in representing existing types of data. Database tables, objects, business objects - you name it and it can be represented fairly easily in XML format.

Distributed applications are the next major evolution in application development. The current form of the Web has mainly focused on HTML based applications that mix content with presentation. XML promises to separate these layers once and for all, with XML providing the data and standalone GUI applications, browser hosted applications or even PDA and phone applications making up the front end. In this scenario the Internet, or the Web in particular, really becomes just a very wide area network with data being served from a Web server rather than a standard 'file server'.

The difference is that a Web server can hook a full backend application, so you can cause the server to execute code and return data as a result of that code. Note that this is very different from the client server model where some service on a remote machine such as a SQL Server just spits back data. In the distributed model the front and back end are both smart and communicate with each other, with logic running on both ends. The server in particular is smart and executes complex business logic rather than just serving and accepting data. The front end can work with local copies of data in offline scenarios, and the backend can apply business rules and other related logic when servicing data and data update requests.

A distributed application consists of several key components:

Smart Client

In a distributed application client machines tend to run sophisticated applications that communicate with the Web server. Unlike typical browser only applications these apps are smart and can present a rich user experience using a full Windows (or other) GUI. Clients can also persist and work with data locally, so for example a business object could be downloaded from a server, manipulated locally and then sent back up to the server later. All of this requires that the client be able to run at least some amount of code, which gets away from the total thin client/dumb terminal model that most HTML applications exhibit today.

Smart client code can be browser based scripting (DHTML, JavaScript, VBScript) or it can be in a full standalone application written in Visual FoxPro, VB or other GUI development tool. The point is that you don't have to be limited by an HTML interface, but if you want to run in that environment and use scripting you can. The client should be able to generate and consume XML and be able to communicate over HTTP.

Fat Server

In distributed applications servers are the providers of the permanent data store as well as the final stop for application business rules. On the server side tools like Active Server Pages, Web Connection, Cold Fusion etc. are used to build the backend interfaces that communicate either with the database directly or go through intermediary business objects that perform the business logic in typical n-Tier fashion.

Servers need to serve lots of users simultaneously in a typical Web server format that uses the HTTP model, hence the term Fat Server.

Communication over HTTP

HTTP is the key to a distributed application's communication. All communication occurs over HTTP in these applications for a number of reasons. HTTP can take advantage of the existing infrastructure for Web applications that is already in place. The same tools used to build HTML based applications with ASP, WWC and CF can also be used for distributed applications. Applications can even be retrofitted relatively easily to provide XML based interfaces instead of HTML (one example is described later). The other issue is that the HTTP infrastructure provides for many security needs both at the protocol and at the application level. HTTP can travel through most firewalls without special configuration (if you've ever built any kind of raw TCP/IP application you can really appreciate this).
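On the client side, sending XML over HTTP amounts to an ordinary POST with the XML as the request body. The sketch below uses Python's standard urllib.request module. The function names, the example URL and the text/xml content type are assumptions for illustration, not part of any tool mentioned in this article.

```python
import urllib.request


def build_xml_request(url, xml_text):
    """Prepare an HTTP POST that carries an XML document as its body."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def send_xml(url, xml_text, timeout=10):
    """Post the XML and return the server's reply, assumed to be XML text."""
    request = build_xml_request(url, xml_text)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return reply.read().decode("utf-8")


# Building the request does not touch the network; the URL is hypothetical.
request = build_xml_request(
    "http://localhost/orders.xml",
    "<order><sku>WIDGET-1</sku><qty>2</qty></order>",
)
```

Calling send_xml with the address of a real backend would perform the round trip. Since the exchange is plain HTTP, it passes through the same proxies and firewalls as a browser request.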

How it works

If we take a look at how a distributed application works we come up with an image like this:

Figure 1 - Distributed XML applications use XML to persist data from the client to server and back. The standard HTTP architecture is used so building this architecture requires no new technologies.
