In this article I demonstrate one way of working with sets of business objects.
The Needs Of The Many
In this article we return to the world of a general purpose object framework and complete relationship handling by considering how to work with sets or collections of business objects. So far we have seen how to express simple 1-1 or many-1 relationships by simply exposing the related object as a property, such as Order.Customer. This is a simple and natural way to expose related objects, with the appropriate handling occurring within the accessor functions of the property, but how can we expose the other variants of relationships, namely 1-many or many-many? A simple approach, such as Customer.Order, does not serve our purpose as it implies that there is a single Order for a given Customer (which is generally not the case), and prevents access to any other Orders that might exist. What we need is syntax, and an efficient implementation, for handling a set of business objects.
If we think about this in a different way then we need a general purpose means for handling such a set of objects. In practice, as well as a small collection of objects (such as the list of Orders for a given Customer that would generally consist of at most a few hundred entries) we would also want to support large collections of objects, such as the list of all items of Stock. As we are operating in an object oriented world, the logical conclusion is that we will need a new class that exposes a set of business objects. Let us call this class TPDList (a list of Problem Domain objects) to go with our TPDObject class. The name is arbitrary and some may think that this one implies a particular implementation. Unfortunately alternatives such as "set" and "collection" already have similar implications within the Delphi world.
A syntactically pleasing and natural way of dealing with a set of objects would be to expose them using some kind of array, as shown in Listing 1. Such a class definition would allow us to access the objects using familiar constructs in our code such as Customer.Orders[Index]. However, there are two major disadvantages to this approach: it implies that we know how many objects are in the list, and that they are all available in a random-access fashion. The only way of ensuring these prerequisites hold is to fully populate the list, or at least to fully populate an internal list of object IDs and load each object individually as it is accessed. The downside to the first approach is excessive memory consumption and a possibly noticeable delay while the many objects are instantiated. The downside to the second approach is the inefficiency of issuing a record load for each object accessed (stepping through a set of 100 objects would require 100 database queries to be issued).
So, despite the syntactical benefits of the array approach it is not optimal for the general case, although of course a framework could be extended to support two (or more) different ways of handling sets of objects, one for small and one for large. Personally I am a big fan of consistency and would prefer there to be a single syntax for working with sets of objects. As we shall see in a later column, this does not prevent us from having a number of implementations for this syntax, tuned for different circumstances.
At the most fundamental level our objects are going to be populated from a persistent storage mechanism, usually a database. Most data these days is accessed through a query (SQL), and if we consider the properties of such queries, they can usually provide us with the conceptual "next" record and tell us when the list is exhausted. Typically, information such as the number of records in the list, or random access to them, is not available. Our framework should be designed with performance and client application resource requirements in mind, so a suggested interface for our TPDList is shown in Listing 2. This provides us with a First method (re-initialising the list back to the beginning), a Next method (moving to the next object in the list) and an IsLast property (indicating when the list is exhausted). The "current" object in the list is available through the CurrentObject property. This interface may seem unnecessarily simple; where are the methods to support backwards navigation? Experience shows that reverse navigation through a list is very rarely required and, where it is, it can easily be accomplished through other constructs. For the time being, we will keep our list management class simple.
How is this class to be implemented? As before, our problem domain (business object) layer should have no concept of how data is stored or managed. Our TPDList manages a collection of business objects and so cannot be a thin wrapper around something as database-specific as a database query. Instead, we will design another class to handle the database support for lists of objects, and provide an object-based database-independent interface between our TPDList and this class. In this way we have kept the strict separation between the business layer and data management layer, allowing us full flexibility in how the latter is managed.
Sweetening the pill
In the same way as our TPDObject had a TDMObject data management counterpart, it may seem that our TPDList should have a TDMList equivalent. In practice, there is a large amount of commonality between the two data management classes and they can be subsumed into a single class. It may seem logical to have a class hierarchy with a shared ancestor for the common functionality and descendants to handle single instances and lists, but this imposes a code-writing overhead when it comes to actually implementing our application, with little benefit. We will therefore extend our TDMObject to support list operations, as well as the existing single instance Load and Save.
Each TPDList will therefore own a private TDMObject that is responsible only for managing data access for the list of objects. The "current" TPDObject exposed by the list will mimic the "current" record in the database cursor. We will need a unique (and private) data management object for each TPDList because it needs to maintain state (query information, current cursor record and so on). By contrast, problem domain objects of the same class can share a single data management object because the Load and Save methods are stateless operations.
Our TDMObject can now be extended to support the operations required. It will have FirstRecord and NextRecord methods (named to indicate its record-based nature), and an IsLastRecord property to indicate when the cursor is exhausted (this cursor will be private to the data management object and based around some database-dependent query mechanism). The methods in our TPDList will simply delegate work to the data management object, calling these similarly named methods. The TPDList asks the private data management object to provide it with an instantiated problem domain object as the client application steps through it. The code for populating this object from the query cursor should be shared with that for populating a single object in the Load routine.
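The delegation described above can be sketched as a small console program. This is only an illustration under stated assumptions, not the framework's actual code: the in-memory Rows array stands in for a database query, and the names InstantiateCurrent, Refresh and FPos are hypothetical. It also assumes that the list positions itself on the first record when it is constructed.

```pascal
program PDListSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

const
  // Hypothetical in-memory rows standing in for a database query result.
  Rows: array[0..1] of string = ('SMITH', 'JONES');

type
  TPDObject = class   // minimal stand-in for the business object base
    Name: string;
  end;

  // Data management object: owns the private, stateful "cursor".
  TDMObject = class
  private
    FPos: Integer;
    function GetIsLastRecord: Boolean;
  public
    procedure FirstRecord;
    procedure NextRecord;
    function InstantiateCurrent: TPDObject;  // nil once exhausted
    property IsLastRecord: Boolean read GetIsLastRecord;
  end;

  // Business-layer list: no storage knowledge, only delegation.
  TPDList = class
  private
    FDM: TDMObject;
    FCurrent: TPDObject;
    procedure Refresh;
    function GetIsLast: Boolean;
  public
    constructor Create;
    destructor Destroy; override;
    procedure First;
    procedure Next;
    property CurrentObject: TPDObject read FCurrent;
    property IsLast: Boolean read GetIsLast;
  end;

procedure TDMObject.FirstRecord;
begin FPos := 0; end;
procedure TDMObject.NextRecord;
begin if FPos <= High(Rows) then Inc(FPos); end;
function TDMObject.GetIsLastRecord: Boolean;
begin Result := FPos > High(Rows); end;

function TDMObject.InstantiateCurrent: TPDObject;
begin
  Result := nil;
  if IsLastRecord then Exit;
  Result := TPDObject.Create;
  Result.Name := Rows[FPos];
end;

constructor TPDList.Create;
begin
  inherited Create;
  FDM := TDMObject.Create;
  First;
end;

destructor TPDList.Destroy;
begin
  FCurrent.Free;
  FDM.Free;
  inherited Destroy;
end;

procedure TPDList.Refresh;
begin
  FCurrent.Free;  // the list owns the object for the current record
  FCurrent := FDM.InstantiateCurrent;
end;

procedure TPDList.First;
begin FDM.FirstRecord; Refresh; end;
procedure TPDList.Next;
begin FDM.NextRecord; Refresh; end;
function TPDList.GetIsLast: Boolean;
begin Result := FDM.IsLastRecord; end;

var
  List: TPDList;
begin
  List := TPDList.Create;
  try
    while not List.IsLast do begin
      WriteLn(List.CurrentObject.Name);
      List.Next;
    end;
  finally
    List.Free;
  end;
end.
```

The list never touches Rows itself; swapping the array for a real query would change only TDMObject. Creating a fresh object for each record is one possible choice here; reusing a single instance would be an equally valid variation.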
We now have a class that allows us to navigate through a set of problem domain objects, and we have extended our data management class to support these operations. What we have not yet done is specify how the different sets of objects are to be selected. After all, there are many different sets of objects our application might require: we might need the set of customers who have ordered a particular stock item, we might need the set of customers called "SMITH", or we might need the set of all customers (for reporting purposes). An obvious way to define these different sets is to provide custom constructors. Listing 3 shows the public interface for an example TCustomerList that manages a set of customers. You will also see that it exposes a Customer property; this simply returns the CurrentObject statically typecast to a TCustomer (we know and expect all objects in the list to be of this type, so it is a safe operation and avoids placing the typecast within the main application logic). Some ultra-purist OO advocates might claim that rather than use custom constructors a class hierarchy should be created, with a new class for each type of list required. I personally can see little benefit in this approach unless the particular set of objects requires some particularly involved handling that needs to remain private to a specific list implementation. It also has the downside of requiring a considerable amount of code to be written, with consequent class proliferation.
We can now deal with lists of customers simply by constructing a TCustomerList in the most appropriate way. Our application code might look something like this:
CustomerList := TCustomerList.CreateByName ('SMITH');
while not CustomerList.IsLast do begin
  // Do something with CustomerList.Customer object
  CustomerList.Next;  // advance, otherwise the loop never terminates
end;
Having this kind of logic within the application is much clearer than the equivalent procedural means of constructing database-specific queries and using fields directly. In particular, we have encapsulated all of these details within our data management class and this code centralisation permits us the luxury of knowing that the opportunities for referencing an invalid table or field name are greatly reduced.
So far we have not specified the actual implementation of the custom constructors in our list management objects. As already noted, these classes sit resolutely within our application business logic and therefore must be entirely database-independent. Specifically, these constructors cannot be involved with building up SQL queries or the like. Such details are within the remit of our data management class, and so the corresponding TDMObject for each class will acquire custom constructors, each exactly matching in name and parameters those of the TPDList object for which it is responsible. Next month we'll round off this handling by looking at some of the implementation details, and showing how refinements in our class design, and the power of polymorphism, greatly reduce the amount of code we need to write for our custom application classes.
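As a rough illustration of the matching constructors idea, the sketch below pairs a business-layer list with a data management class whose constructors have the same names and parameters. It is a guess at the shape rather than the framework's implementation: TDMCustomer, CreateAll, the Customer table and Name column, and the WriteLn tracing are all hypothetical, and the base classes are reduced to stubs.

```pascal
program MatchingConstructors;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
  TDMObject = class end;  // stub for the framework base class

  // Data management side: the only place that knows about queries.
  TDMCustomer = class(TDMObject)
  private
    FQueryText: string;
    FNameParam: string;
  public
    constructor CreateByName(const Name: string);
    constructor CreateAll;
  end;

  // Business side: same constructor names and parameters, no SQL.
  TCustomerList = class
  private
    FDM: TDMObject;
  public
    constructor CreateByName(const Name: string);
    constructor CreateAll;
    destructor Destroy; override;
  end;

constructor TDMCustomer.CreateByName(const Name: string);
begin
  inherited Create;
  // Hypothetical table and column names; Name is kept as a parameter
  // value rather than concatenated into the query text.
  FQueryText := 'SELECT * FROM Customer WHERE Name = :Name';
  FNameParam := Name;
  WriteLn('Prepared: ', FQueryText, ' [Name=', FNameParam, ']');
end;

constructor TDMCustomer.CreateAll;
begin
  inherited Create;
  FQueryText := 'SELECT * FROM Customer';
  WriteLn('Prepared: ', FQueryText);
end;

constructor TCustomerList.CreateByName(const Name: string);
begin
  inherited Create;
  FDM := TDMCustomer.CreateByName(Name);  // pure delegation
end;

constructor TCustomerList.CreateAll;
begin
  inherited Create;
  FDM := TDMCustomer.CreateAll;
end;

destructor TCustomerList.Destroy;
begin
  FDM.Free;
  inherited Destroy;
end;

var
  List: TCustomerList;
begin
  List := TCustomerList.CreateByName('SMITH');
  List.Free;
  List := TCustomerList.CreateAll;
  List.Free;
end.
```

The business class contains no query text at all; each of its constructors only forwards its parameters to the identically named constructor on the data management side.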
This article's question
Being able to handle sets of objects is the building block for handling object relationships. How can we use them to support 1-many relationships such as Customer.Orders, and can we support these constructs generically in the same way that we did for 1-1 and many-1 relationships?
((( Listing 1 - Array-based TPDList interface)))
TPDList = class
  property ObjectAtIndex[Index: Integer]: TPDObject; default;
  property Count: Integer;
end;
((( End Listing 1 )))
((( Listing 2 - Navigational TPDList interface)))
TPDList = class
  procedure First;
  procedure Next;
  property CurrentObject: TPDObject;
  property IsLast: Boolean;
end;
((( End Listing 2 )))
((( Listing 3 - Example TCustomerList public interface)))
TCustomerList = class (TPDList)
private
  // Result := TCustomer (CurrentObject);
  function GetCustomer: TCustomer;
public
  constructor CreateByName (const Name: String);
  constructor CreateByStockOrder (Item: TStockItem);
  property Customer: TCustomer read GetCustomer;
end;
((( End Listing 3 )))