The problems of using and reusing many varying size objects in an application can cause fragmentation of the heap that can slow down processing speed. This article uses a factory object, to make and recycle objects with minimal fragmentation effects.
One of the patterns in the classic book "Design Patterns; Elements of Reusable Object-Oriented Software" is the factory method, where objects are created and I recently had to use it. The reason? A memory fragmentation problem in a simulation system using a large number of objects to hold numeric arrays. There were several numeric array types, in fixed sizes varying from 1500 x 1 to 5000 x 10 numbers either integers or doubles. These were used in a simulation of a financial calculation over a two year period, creating data on each of 472 simulated days from historical data. On each day, data was read in from a database, processed, then saved out.
Despite extensive leak checking, the software would only run for a number of simulated days before eating up the Windows swap file. Memory leaks were carefully tracked down and eliminated but my hair was still in jeopardy. The app would not run for longer than about 30 simulated days before it had committed over half of the NT Paging file- this on a 512 Mb ram system! Close investigation showed that on each day it ran, the allocated memory grew from 5M to about 80Mb then shrink back to about 5M again. On the face of it, not a problem with 512Mb to play with, but watching the Win NT Task Manager showed an increasing amount of memory committed.
The problem was simple, it was heap fragmentation to blame. This occurs when many different objects are created and then destroyed repeatedly. As each object is created, it consumes memory from the heap. If objects were created and then destroyed in reverse order it probably wouldn't happen as all the freed memory could be merged into one large block. But in any system with a large number of objects the order of creation and destruction will never be symmetrical- my app can easily have up to 50,000 objects in memory at the same time. So when an object is destroyed, a pointer to the freed memory block is added to a free-block list.
When other requests for memory are made, Windows tries to allocate those requests out of the freed list first. Fragmentation happens where a large block like 1Mb has been requested and subsequently freed up, followed by a request for a smaller block. This is taken from the first block on the free-list that may well be the 1Mb, which leaves only 900Kb free. Then another request for a large 1Mb block comes along, it cannot be satisfied from the free list and so it is taken from the heap. If the creation/destruction cycle happens enough times the large blocks on the heap are chopped into smaller bits; the physical ram is exhausted and replaced with virtual ram. Windows heap memory management (and Delphi) is quite clever- it takes a lot to get it to fragment. But under the relentless pressure of a large number of objects being created and destroyed, the memory manager will gradually cave in.
When Windows runs out of free ram, it starts swapping pages of ram to disk and performance takes a nose dive. Your app could be galloping happily along at 100% CPU until swapping starts. It then becomes a funeral march crawling along at maybe 7-10% of CPU, taking forever to run. Disaster!
Microsoft have put a lot of effort making the memory manager as efficient as possible. For example under NT, there is a two stage process of reserving and committing memory. If your app requires 100Mb, it is reserved when the app is loaded. But only when the memory is accessed are the reserve pages committed. If you want to learn more than you will ever possibly need to know about this and other topics, I recommend the book Inside NT, published by Microsoft Press but get the David Solomon version which is the later edition, not the Helen Custer first edition.
So I needed a non fragmenting way of creating many objects, using them, throwing them away and then doing it all again without windows running out of virtual memory. Having recently read the Pattern book I thought why not use a factory, i.e. a factory object that creates objects of a specified class. I then went one better and made it environmentally friendly, so it gets to recycle all its manufactured objects instead of destroying them and without the associated fragmenting problems. The icing on the cake was to make the factory able to expand its capacity without loss of access speed.
Rather than have one factory class for all class types, I took the simpler approach of passing the object class into the factory as a factory creation parameter. When the factory is created you specify both the class of objects it can make and the initial storage capacity of the factory. This size can be changed upwards by calling the GrowFactory method. I suggest you only call this in exceptional (!) circumstances.
The link back to the factory from each object is needed, so all "factory made objects" (FMOs) must descend from a TFactoryObject class instead of Tobject. This adds a factory reference which is "stamped" on all FMOs so that the object knows which factory to use to recycle itself.
Instead of creating an object your app requests one from the suitable factory by calling its RequestObj method which returns a tclass object back and you convert it to the right class using 'as'. Finally when you have finished using the object you just call its RecycleSelf method. No creation or destruction except of the factories themselves.
HOW IT WORKS
When the factory is created, all of the objects are physically created in one contiguous block of ram. A tlist (fblocklist) object holds the address of each of these blocks. Each time you grow the factory, a new block is created and added to this list. The method AddObjects creates the specified number of objects using the ram from the block. If you write code like this be aware that just doing a tobject(address) is not enough to create the object. You must always call ObjectClass.InitInstance(address) to transform it into a 'proper' object. InitInstance clears everything to zero, nil etc but more importantly it sets up the VMT.
The factory also contains another tlist (ffreelist) which holds the address of every unused object.
All of the donkey work of filling the factory is done in the private method AddObjects. For each object created in the block, this converts the pointer ptr into an object, using FactoryObject as the class of object created. This must always be a descendant of TfactoryObject. Obj holds the object reference and links the factory to the manufactured object. ptr is then incremented to point to the next object in the block by adding fsize.
To get an object you r code calls Request_Obj which pops the reference off the end of the ffreelist and returns it as the first requested object. Recycling is the reverse; it pushes the recycled object reference onto the end of ffreelist. One thing to bear in mind. When a factory made object is in use, the factory does not have any reference to it, although the memory occupied by the object is contained within the factory!
REPLACING CREATE AND DESTROY
Using the factory requires you to use manufactured objects slightly differently from normal. You no longer create or free them explicitly, instead you just request the relevant factory for the object. Unless the factory is empty this will always work. You have to modify the initialisation code in the constructor and termination code in the destructor routines. There are two approaches.
1) If the object has a simple create constructor with no parameters, you can rename it to procedure init;override and remove any inherited create calls. The factory always calls an Init method when any object is requested. By default this does nothing but you can override this so your Init will be called automatically on every object requested from the factory.
2) If your original create has parameters, rename it to something like Initialise and remove inherited etc calls. After the object is requested call the initialise routine, eg MyRoutine.Initialise(...)
If the object has destructor code, rename it to procedure Done;override so that it's called automatically when the object is recycled. Init and Done are similar to create/destroy but without the baggage of the construction or destruction mechanism.
As a slight digression, I understand that there are arguments in the Delphi world in favour on one part creation or two part One part is where the constructor has parameters and completely sets up the object. In the two part approach, the constructor just creates a blank object which is then initialised by a later method. The factory approach I think is firmly in the two-part camp..
When the factory is destroyed, all the allocated memory blocks are freed up. Before this, the factory checks that the number of objects in the freelist matches the capacity. If you have forgotten to recycle all your remaining objects, it will raise an exception.
This shows the benefits of factories.. At work my app had many different sizes of objects in differing quantities, but a 15,000 line program would not exactly be publishable. After a bit of trial and error I came up with a short program that can show fragmentation. However this depends on the size of the different objects, how many there are, available ram and for how long it is run. Also, fragmentation seems to occur more quickly on NT4.0 than 98 which suggests that maybe 98 has a better memory manager. When run it creates and destroys a large number of objects repeatedly for the specified number of days. It also does exactly the same thing using a factory. Each day is timed and both sets of times plotted using Tchart.
I used three types of objects, all descended from tTestobj which descends from Tfactoryobject. Small objects are 2K in size, medium are 40K and large are 800K, but these sizes are defined by constants and can easily changed. The demo program allocates 100Mb for the normal objects and another 100Mb for the factory. Both contain the same number of objects- divided equally by size between the three object types so there are 17406 small objects, 873 medium and 43 large. In the object creation procedure TimeOneDay these are randomly created and added to the list. At the end of the day they are freed up and the process repeated next day.
The same is done using three factories with all manufactured objects added to a factorydata list. On a P2 400 with 256Mb running for 100 days, there was a modest increase in the time for both the factory and the normal object creation. Commenting out the normal creation confirmed this suspicion- the increase was much less over a 1000 day run using factories against 100 days with both normal and factory testing. I guessed that this was caused by increased page swapping due to fragmentation affecting both processes. Some other combinations of sizes and number of objects under Windows 98 showed no fragmentation
I implemented this originally using a tlist for both testdata and factorydata and changed it later when I realised that repeatedly adding and freeing up 18000 pointers was also adding to heap fragmentation. I'm not advocating ditching tstringlists completely in your code- they are very useful (as are tlists), but if you have to manipulate a large number of items, it might be better to use your own list structures. If you want to use Tlist or Tstringlist for this purpose, its probably better to completely fill the structure with nil pointers, so that count = capacity and use an integer to hold the index of the last pointer.
This app also confirmed that the factory code is very quick at allocation and deletion- typically 30 ms for 18000 objects instead of the 870 ms that normal creation/deletion took.
Tgis article originally appeared in Delphi Developer magazine. Sources.zip