COM: A Model Problem Solver

Binary standard lays foundation for component software

Paul Stafford
Joel Powell

March 10, 1995

Are you curious about what the Component Object Model (COM) is and what it can do for your applications? Knowing what types of problems you can address and solve using COM is a key part of deciding if it is for you. Before discussing that, a brief look at the basics is in order.

COM is a component software architecture that allows binary components supplied by different software vendors to interoperate in a reliable, controlled manner. In the past, independent software vendors (ISVs) developed applications that each operated in a world of their own, so the resulting products interacted with one another little, if at all. Even when ISVs made a sincere effort to get their programs to interact, there was no clear standard, and getting different applications to work together was difficult at best.

Now, the programming mind-set is changing. Terms such as "document-centric" and "component software" are taking hold. The shift has opened a new world—and challenge—to today's programmers. COM helps jump-start this process by defining a binary standard that extends component capabilities at the system level.

But why should developers care about COM?

Put simply, COM lays the groundwork on which we can build component software.

COM represents a binary interface standard that allows developers to build specialized software components that interface in a common way with other software components. OLE is built on the foundation that COM provides.

Software vendors don't need to communicate with each other to exchange specifications or in any way coordinate the design and assembly of their specialized software components. They just need to build components that adhere to the OLE interface standards.

The idea is to enable the software industry to achieve greater rates of innovation, more specialized customer solutions, higher quality applications, and a faster, less expensive development process. (See "The benefits of component software" by Gregory Leake, Developer Network News, November 1994.)

Four problems that COM solves

Here are four basic problems that arise in a component-based system. We'll discuss how COM solves these problems:

Interoperability: How can components supplied by different software vendors interoperate safely?

Versioning: How can one system component be updated without replacing the other components in the system? How does COM let you provide new versions of components while maintaining reliable backward compatibility?

Language independence: How can function calls be made between components written in different programming languages?

Transparent remoting: How can clients communicate with component objects without being concerned whether those components are running in the same process, in a different process, or even on another computer?

Basic COM terminology

Before we look at how COM solves these problems, let's define some essential terms.

V-table

When you have a pointer to a COM component, you actually have a pointer to a pointer to a virtual function table, or v-table. This v-table contains pointers to the functions within the component (Figure 1):


Figure 1. Representation of a v-table in a component object
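To make the layout in Figure 1 concrete, here is a minimal sketch that builds a v-table by hand from plain structs and function pointers. It is portable C++ and does not use any Windows header; the names Calculator, CalculatorVtbl, MakeCalculator, and ClientAdd are hypothetical and exist only for illustration.

```cpp
#include <cassert>

struct Calculator;  // forward declaration of the (hypothetical) component instance

// The v-table: a block of pointers to the component's functions.
// Each function receives the instance as its first argument.
struct CalculatorVtbl {
    int (*Add)(Calculator* self, int a, int b);
    int (*Negate)(Calculator* self, int a);
};

// The instance begins with a pointer to the v-table. A pointer to the
// instance is therefore a pointer to a pointer to the v-table.
struct Calculator {
    const CalculatorVtbl* vtbl;
    int callCount;  // internal state, for illustration only
};

int CalcAdd(Calculator* self, int a, int b) { ++self->callCount; return a + b; }
int CalcNegate(Calculator* self, int a)     { ++self->callCount; return -a; }

// One shared table per component class.
const CalculatorVtbl kCalculatorVtbl = { CalcAdd, CalcNegate };

Calculator MakeCalculator() { return Calculator{ &kCalculatorVtbl, 0 }; }

// The client never names CalcAdd directly; it calls indirectly through the table.
int ClientAdd(Calculator* p, int a, int b) { return p->vtbl->Add(p, a, b); }
```

A C++ compiler typically generates an equivalent layout for a class with virtual functions, which is why a C++ abstract class is a convenient way to declare an interface.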

Interface

An interface is a group of semantically related functions, or "methods." Taken together, the methods in an interface define a logical group of services that a component object can provide to the system.

The definition of an interface forms a contract between COM components and component users. This contract is a definition of expected behavior and responsibilities. Interfaces are immutable—methods cannot be added to or removed from an interface. Once defined, the syntax and semantics of interface methods can never change.

An interface itself has no implementation; therefore, it cannot be instantiated. It is the programmer's responsibility to provide an implementation for the interface. Different COM components may implement the same interface differently; however, the contract states that the behavior must be the same.

Microsoft already defines many standard interfaces. For example, there are general interfaces (such as IDataObject and IDropTarget), OLE interfaces (such as IOleObject and IOleContainer), and many others supporting OLE Automation and OLE Controls. If you cannot find an interface that defines the services your component wants to provide, you can create your own custom interface.

Interfaces are strongly typed; each interface has its own unique interface ID, or IID. An IID is actually a globally unique identifier (GUID). A GUID is a unique 128-bit value that allows COM to properly identify a specific interface. You can generate your own GUIDs with GUIDGEN, a tool provided with Microsoft Visual C++ 2.0.
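As a rough illustration of what a 128-bit identifier looks like in code, the sketch below declares a struct with the same field sizes as a GUID and compares two values. The type name Guid, the helper SameGuid, and the constant kIID_IHypothetical (whose value is arbitrary and was not produced by GUIDGEN) are assumptions made for this example; the value shown for IUnknown is its well-known IID.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// 32 + 16 + 16 + (8 x 8) bits = 128 bits.
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};
static_assert(sizeof(Guid) == 16, "a GUID occupies 128 bits");

// Two interface IDs name the same interface only if all 128 bits match.
bool SameGuid(const Guid& a, const Guid& b) {
    return std::memcmp(&a, &b, sizeof(Guid)) == 0;
}

// IID of IUnknown: {00000000-0000-0000-C000-000000000046}
const Guid kIID_IUnknown = { 0x00000000, 0x0000, 0x0000,
                             { 0xC0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x46 } };

// Hypothetical custom interface ID with an arbitrary, made-up value.
const Guid kIID_IHypothetical = { 0x12345678, 0x9ABC, 0xDEF0,
                                  { 0x01, 0x23, 0x45, 0x67, 0x89, 0xAB, 0xCD, 0xEF } };
```

Because the identifier is so large and is generated rather than chosen by hand, two vendors defining interfaces independently are extremely unlikely to collide.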

COM components and the component object library

Basically, a COM component consists of compiled code that provides services to the system. It differs from a C++ object in that its data is always private; the only way to access a COM component's data is through one of its interfaces.

In C++ you instantiate an object by using the new operator, which returns a pointer to the new object (whose first member is typically a pointer to the class's v-table). With COM, you must communicate with the Component Object Library to obtain an initial interface pointer on a component. This library is provided as part of the operating system (to date on Windows, Windows NT, Windows 95, and the Macintosh) and contains the basic "plumbing" that enables components to find, connect to, and communicate with other components in the system.

All COM component interfaces must derive from a base interface named IUnknown. IUnknown has three methods: QueryInterface, AddRef, and Release. QueryInterface lets a client obtain a pointer to another interface supported by the component (if it exists). AddRef and Release are reference-counting methods that allow a component to control its own lifetime: the component destroys itself when its last outstanding interface pointer is released.
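The following portable sketch shows the shape of these three methods. It is not the real IUnknown declaration: the names IUnknownLike, IGreeter, Greeter, and CreateGreeter are hypothetical, a small integer stands in for the 128-bit IID, and QueryInterface returns bool rather than a real result code.

```cpp
#include <cassert>

using InterfaceId = unsigned;          // stand-in for a 128-bit IID
const InterfaceId kIdUnknown = 0;
const InterfaceId kIdGreeter = 1;

struct IUnknownLike {
    virtual bool QueryInterface(InterfaceId iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownLike() = default;         // clients call Release, never delete
};

struct IGreeter : IUnknownLike {
    virtual int GreetingLength() = 0;
};

int g_liveGreeters = 0;                // lets us observe the component's lifetime

class Greeter final : public IGreeter {
    unsigned long refs_ = 1;           // the creator holds the first reference
    ~Greeter() { --g_liveGreeters; }
public:
    Greeter() { ++g_liveGreeters; }

    bool QueryInterface(InterfaceId iid, void** out) override {
        if (iid == kIdUnknown)      *out = static_cast<IUnknownLike*>(this);
        else if (iid == kIdGreeter) *out = static_cast<IGreeter*>(this);
        else { *out = nullptr; return false; }
        AddRef();                      // every pointer handed out is counted
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;    // the component ends its own lifetime
        return left;
    }
    int GreetingLength() override { return 5; }   // length of "Hello"
};

IUnknownLike* CreateGreeter() { return new Greeter; }
```

A client would call CreateGreeter, use QueryInterface with kIdGreeter to reach IGreeter, and call Release once on each pointer it received; the object is deleted when the count reaches zero.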

COM components are housed in either a dynamic-link library (DLL) or an EXE; this housing is the component server. DLL component servers are labeled in-process servers or in-proc servers. EXE component servers are labeled local servers if the component is running in a different process but on the same computer. EXE component servers are labeled remote servers if they are running on a different computer.

COM components are often referred to as "Windows objects," "component objects," or "components." All these terms mean the same thing. Using such terms separates them from C++ objects, which, hopefully, reduces confusion between the two. We'll use the term "client" to refer to any piece of code that calls a component's interface methods.

Armed with these terms and a brief definition of a COM component, let's look at the solutions COM provides for the four problems we outlined earlier.

Solving the problem of interoperability: The binary standard

A client obtains the services of a COM component by calling interface methods provided by the component. When one component makes a call to another, an important question arises: How can such a call be made safely when the other component may have been written at a different time, by developers in a different software company, and in a different programming language?

COM enables calls to be made between components by specifying a binary standard for such calls. The standard specifies how a component constructs the v-tables for its interfaces, and how the client calls interface functions indirectly, through the v-tables. COM's binary standard allows clients to safely call a component's interface methods, without concern for the component's implementation language.

Interface "contracts"

Once the client is successfully making calls to the component server, it needs to know exactly what services it will receive. COM interfaces specify the semantics of interactions between components.

The specification of a COM interface serves as a contract between the client calling the interface's methods and the component providing an implementation of the interface. This contract specifies the services that the client can expect to receive from that component. The contract is good for all time, because COM interfaces are immutable. Once an interface is defined, the semantics and syntax of its methods can never change.

Encapsulation

For a component object system to be robust, components must always be able to protect themselves from direct access by other components. The use of interfaces allows COM to ensure complete encapsulation of a component's internal data and processing.

Clients never have direct access to a component's internals. Instead, clients have interface pointers on the component, and obtain all services by calling interface methods indirectly through that pointer. The interface pointer is opaque; all implementation details and internal data are hidden from the caller.

The Component Object Library

The Component Object Model is a specification, but it also has an implementation—the Component Object Library.

A client wanting to use the services provided by some particular component object class starts by calling the Component Object Library, asking for an interface pointer on a component of that class. The Component Object Library then does the "legwork" necessary to hook up the client and component. First it finds and runs the server executable (DLL or EXE) associated with the component class and asks the server to create an instance of the component. It then obtains an interface pointer on the component, and finally returns that pointer to the client.

The client can now make calls to the component through the interface pointer. It can also obtain additional services from the component by obtaining additional interface pointers (via IUnknown::QueryInterface).
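The lookup-and-create sequence described above can be sketched with a toy registry. This is only an illustration of the idea, not the Component Object Library's actual implementation: ToyLibrary, RegisterServer, CreateInstance, IShape, and Triangle are hypothetical names, a string stands in for the class identifier, and the "server" is just a factory function in the same program.

```cpp
#include <cassert>
#include <map>
#include <string>

struct IShape {                        // hypothetical interface
    virtual int Sides() = 0;
    virtual void Release() = 0;
protected:
    ~IShape() = default;
};

using Factory = IShape* (*)();         // what a "server" exposes in this sketch

class ToyLibrary {
    std::map<std::string, Factory> servers_;   // class identifier -> server
public:
    void RegisterServer(const std::string& classId, Factory f) {
        servers_[classId] = f;
    }
    // Find the server for the class, ask it for an instance, and hand the
    // resulting interface pointer back to the client (nullptr if unknown).
    IShape* CreateInstance(const std::string& classId) const {
        auto it = servers_.find(classId);
        return it == servers_.end() ? nullptr : it->second();
    }
};

class Triangle final : public IShape {
public:
    int Sides() override { return 3; }
    void Release() override { delete this; }   // no sharing in this sketch
};

IShape* MakeTriangle() { return new Triangle; }
```

The client names only a class identifier and an interface; it never links against Triangle or learns where the server code lives.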

Versioning and backward compatibility

COM provides a versioning mechanism that allows seamless evolution of components. When one system component is updated, it is not necessary to replace the other components in the system. Existing clients of the component do not need to be recompiled, rebuilt, or even notified. The new version of a component can easily maintain backward compatibility with old clients, while at the same time adding new features to be used by new clients.

The first key to versioning in COM is that interfaces are immutable. Because it is not possible to add new functionality to an old interface, the only way that a component can add new features is to expose those features via a new interface. This interface is not a new version of the old interface—interfaces are never versioned. Instead, it is a completely new interface, with its own unique IID.

The second key is IUnknown::QueryInterface. This call allows clients to determine the capabilities of a component at run time. Clients that want to use the new features will simply use QueryInterface for the new interface.

Maintaining backward compatibility with old clients is simple—just continue to support the old interface. Because interfaces are immutable, existing client code is not broken.

Note that as long as the syntax and semantics of the interface methods are preserved, the methods can be reimplemented internally—for example, to improve performance. Existing clients will call the methods just as they always have, and benefit from the increased performance "for free."
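Here is a small sketch of how that plays out, under simplifying assumptions: reference counting is omitted, a small integer stands in for the IID, and every name (IQueryable, ISpell, ISpellSuggest, SpellerV1, SpellerV2, and the two client functions) is hypothetical. Version 2 of the component keeps the original interface untouched and exposes its new feature through a second interface.

```cpp
#include <cassert>
#include <string>

using InterfaceId = unsigned;               // stand-in for a 128-bit IID
const InterfaceId kIdSpell = 1;             // the original interface
const InterfaceId kIdSpellSuggest = 2;      // a separate interface added later

struct IQueryable {
    virtual void* QueryInterface(InterfaceId iid) = 0;  // nullptr if unsupported
protected:
    ~IQueryable() = default;
};
struct ISpell : IQueryable {
    virtual bool IsCorrect(const std::string& word) = 0;
};
struct ISpellSuggest : IQueryable {
    virtual std::string Suggest(const std::string& word) = 0;
};

class SpellerV1 final : public ISpell {
public:
    void* QueryInterface(InterfaceId iid) override {
        return iid == kIdSpell ? static_cast<ISpell*>(this) : nullptr;
    }
    bool IsCorrect(const std::string& word) override { return word == "component"; }
};

class SpellerV2 final : public ISpell, public ISpellSuggest {
public:
    void* QueryInterface(InterfaceId iid) override {
        if (iid == kIdSpell)        return static_cast<ISpell*>(this);
        if (iid == kIdSpellSuggest) return static_cast<ISpellSuggest*>(this);
        return nullptr;
    }
    bool IsCorrect(const std::string& word) override { return word == "component"; }
    std::string Suggest(const std::string& word) override {
        return word == "componant" ? "component" : word;
    }
};

// An old client, written before ISpellSuggest existed. It works with both versions.
bool OldClientCheck(ISpell* c, const std::string& word) {
    auto* spell = static_cast<ISpell*>(c->QueryInterface(kIdSpell));
    return spell != nullptr && spell->IsCorrect(word);
}

// A new client asks for the new interface at run time and degrades gracefully.
std::string NewClientFix(ISpell* c, const std::string& word) {
    auto* suggest = static_cast<ISpellSuggest*>(c->QueryInterface(kIdSpellSuggest));
    return suggest != nullptr ? suggest->Suggest(word) : word;
}
```

The old client is never recompiled, and the new client still runs against version 1; it simply finds that the new interface is not available.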

Language independence

COM is language independent. By specifying a binary object standard (as opposed to a source code standard, such as C++), COM enables calls to be made between components written in different programming languages (C, C++, Smalltalk, and so on).

Components can be implemented in any language that supports the creation of v-tables. They can be called from any language that supports indirect calls through function pointers. Typically, the various components in the system have been written in many different languages.

Transparent cross-process interoperability

As described earlier, COM component servers can be implemented in three flavors: in-process, local, and remote. Fortunately, clients do not need to know (or care) which kind of server they are calling. COM provides "transparent cross-process interoperability." This means that clients use the same simple programming model when calling any component. The Component Object Library insulates the client from the specifics of where the component is actually running.

The three possible scenarios are shown in Figure 2. Calls to in-proc components are simply direct calls to the component. Overhead is minimal, allowing in-proc components to provide the best performance.

Figure 2. In-process, local, and remote server interoperability

Local components cannot be called directly, because they are running in a different process space than the client. Calls to local components are intercepted by a "proxy" object provided by the Component Object Library. The proxy generates a call through the LRPC (lightweight remote procedure call) channel to a "stub" object (also provided by the Component Object Library) that runs in the process space of the local component. The stub object then completes the procedure by making an in-proc call to the local component.

The key point is that any differences between the in-process and cross-process cases are transparent to both clients and components. Clients always make calls through an in-process pointer, and components are always called via some in-process pointer. Any extra work needed when making calls across process boundaries is handled "magically" by the Component Object Library.
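The proxy and stub roles can be sketched in a single program. This is an illustration only: the "channel" here is a direct function call on a byte buffer rather than LRPC, both sides run in one process, and the names IAdder, RealAdder, AdderStub, AdderProxy, Packet, and ClientCode are hypothetical. What it shows is that the client code is identical whether it holds a pointer to the real component or to a proxy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct IAdder {                                   // hypothetical interface
    virtual std::int32_t Add(std::int32_t a, std::int32_t b) = 0;
protected:
    ~IAdder() = default;
};

class RealAdder final : public IAdder {           // the actual component
public:
    std::int32_t Add(std::int32_t a, std::int32_t b) override { return a + b; }
};

using Packet = std::vector<std::uint8_t>;         // bytes that cross the boundary

void PutInt(Packet& p, std::int32_t v) {
    std::uint8_t raw[sizeof v];
    std::memcpy(raw, &v, sizeof v);
    p.insert(p.end(), raw, raw + sizeof v);
}
std::int32_t GetInt(const Packet& p, std::size_t offset) {
    std::int32_t v;
    std::memcpy(&v, p.data() + offset, sizeof v);
    return v;
}

// Stub: sits beside the component, unpacks the arguments, and completes the
// call with an ordinary in-process call.
class AdderStub {
    IAdder* target_;
public:
    explicit AdderStub(IAdder* target) : target_(target) {}
    Packet Dispatch(const Packet& request) {
        std::int32_t sum = target_->Add(GetInt(request, 0), GetInt(request, 4));
        Packet reply;
        PutInt(reply, sum);
        return reply;
    }
};

// Proxy: sits beside the client and exposes the same interface as the component.
class AdderProxy final : public IAdder {
    AdderStub* channel_;                          // stands in for the LRPC channel
public:
    explicit AdderProxy(AdderStub* channel) : channel_(channel) {}
    std::int32_t Add(std::int32_t a, std::int32_t b) override {
        Packet request;
        PutInt(request, a);                       // pack the arguments
        PutInt(request, b);
        Packet reply = channel_->Dispatch(request);
        return GetInt(reply, 0);                  // unpack the result
    }
};

// The client sees only an in-process interface pointer in either case.
std::int32_t ClientCode(IAdder* adder) { return adder->Add(40, 2); }
```

Swapping the direct call in AdderProxy for a real transport would change nothing in ClientCode or RealAdder, which is the sense in which the remoting is transparent.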

In 1996, Microsoft plans to release a version of the Component Object Library that will extend transparent remoting to include calls between two machines on a network. Existing components will not need to be rebuilt then; components that can make cross-process calls today will gain the ability to make cross-network calls "for free."

Getting started

A host of information is at your disposal. To name just a few sources: Inside OLE 2 by Kraig Brockschmidt (available in the Developer Network Development Library) and the OLE Programmer's Reference Volume 1 and Volume 2 provide a good head start; they are all available from Microsoft Press. Also, check out the full COM specification provided in both the Development Library and with the Visual C++ 2.0 CD. If you have any problems, concerns, or ideas, feel free to contact Microsoft on CompuServe (GO WINOBJ).

Joel Powell is a support engineer specializing in COM and OLE. He is also a published author of books ranging from Win32 programming to computer gaming.

Paul Stafford is an engineer in the Windows Developer Support Group, specializing in all things OLE. Joel's books occupy the place of honor on Paul's bookshelf, right next to the OLE Programmer's Reference.