Advanced Run Time Type Identification in C++

Part II

Property Library

An Implementation of RTTI in C++

Peter Barczikay (bpeter@rcs.hu)

Andras Tantos (tantos@rcs.hu)

© 2003 by Robot Control Software Ltd. (http://www.rcs.hu). All rights reserved.

December 5, 2003

Abstract

Run-Time Type Identification (RTTI) takes many forms and has many implementations in different programming languages. Standard C++ also contains RTTI support, but RTTI should provide much more than a unique identifier of types available at run time. This article presents a C++ library (Oops) that provides advanced type information. It is scalable and easy to use, but its main advantage is that it introduces properties for providing a detailed, hierarchical description of types, classes and containers. The objects of an application can be investigated using property iterators. Property iterators traverse the tree hierarchy of classes, structures and containers, give information about the objects they point to, describe their type, and get and set their value. The Property Interface gives access to objects through their properties without knowing the actual type of the objects at compile time.

The Oops Library opens new dimensions in component-based software development. Designing classes with a Property Interface helps to produce better and more logical software components, while the development, testing and fine-tuning of applications become quicker and easier with the help of property streams (persistency) and property editors (application generators). The library contains many interesting programming techniques, it has a clear design, and its source code can be downloaded together with plenty of test and example programs that demonstrate its quality.

Introduction

The first part of this article (http://www.rcs.hu/Articles/RTTI_Part1.htm) describes the requirements of an advanced RTTI system that provides detailed type information at run time and fulfils the requirements of Persistency and Application Generators. The Oops Library of Robot Control Software Ltd. (http://www.rcs.hu) provides all the required features and services. This article discusses and presents that library. The authors currently do not know of any other C++ library offering comparable services.

The Oops (Object Oriented Property Stream) Library has two parts, the Property Library and the Stream Library.

The Property Library provides the advanced RTTI system and the so-called Property Interface for investigating and accessing the type description and the object hierarchy of applications. Consequently, using the Oops Library in an application takes two steps. First, the RTTI description of every type, structure and class has to be prepared. Oops provides tools, macros and example programs to make this step as easy as possible. The second step is using the objects with the help of the Property Interface. The object hierarchy can be accessed and investigated with the iterators provided by the Property Interface.

The Stream Library makes it possible to save and load objects. It uses the Property Interface for iterating through the object hierarchy and getting or setting the values of properties. The third part of the article will discuss the Stream Library in detail, but after reading this article you will probably have some idea of how it works.

A simple Application Generator program called Property Editor is available as well. It also uses the Property Interface for accessing the application’s data structure and displaying it for the User. The graphical user interface provides a tree representation of the object hierarchy, where the User can view and modify the objects, add new objects and delete existing objects.

The Basic Idea

In C++ every structure, class, template and array or container is a new and different type. We need an automatic method for creating the RTTI description, because writing it by hand is almost impossible. The most critical part of the RTTI description is how to convert a variable to another format used in object streams. The basic idea of the Oops Library is to introduce the well-known concept of properties into C++. If we can view the significant members of objects as properties, and we are able to investigate the name, type and value of these properties, we can easily write programs for printing, saving or loading the objects. These programs will be independent of the source of the accessed objects, and we do not need to write conversion functions by hand.

First we need some base types that we can use as elementary property types. The type description and the conversion functions of Base Types are implemented by hand, but fortunately the number of Base Types is limited, and the Oops Library provides a default type description for all built-in types and for many other types such as strings. (The type description is called a Type Info Record in Oops.)

Classes and structures should be handled in a much simpler way. The programmer developing the class has to state somehow that a given class has properties, and the members that are properties must be marked somehow. It should look something like this:

class MyClass : public PropertyBase
{
public:
     property int A;
     property float B;
     property int Size();
};

This solution requires minimal effort for creating the type description, and it is very flexible, because the programmer has full control over which variables are available as properties. The solution provided by the Oops Library is slightly different, but this example represents the main point very well.

Life becomes more difficult if member functions, arrays, containers, enumerations and compound types are also used as properties. A correct implementation of this idea makes no difference at the user level: in the Oops Library any member, even a standard container, can be used as a property.

Defining the properties is only the first step; to use them we need a standard interface. The best choice is the well-known iterator concept. All information about objects can be accessed and investigated with Property Iterators. This means that a library knowing nothing about the application is able to access any property if the appropriate Property Iterator is provided. The third part of this article will present such a library for saving and loading objects using only the Property Iterators.

Parts of Property Library

The implementation of Property Library contains the following parts:

The implementation cannot be described here in full detail, but the following chapters demonstrate how the most important features are implemented, and describe the critical parts and the programming techniques used to solve the problems.

The Implementation of Property Library

Type Info Record

The Type Info Record is the key element of the Property Library. Every type having an RTTI description must have its own Type Info Class, and every Type Info Class has one and only one instance, called the Type Info Record. The Type Info Record stores all type information of the given type, and the member functions of the Type Info Class provide some basic operations.

Every Type Info Class is a descendant of a common base class, and the records are linked to each other so that it is possible to iterate through all Type Info Records. The Type Info Class is the only part of the Property Library that knows anything about the actual types. All other parts must use the services of Type Info Records.

The base class of all Type Info Classes provides the following services.

Constructor

It has only one constructor, which gets the name of the type as its first parameter. The virtual destructor is defined, but it does nothing.

Identification

All types need a unique identifier. Oops Type Info Records provide string and binary identifiers. The type name and the type identifier are returned by the TypeName() and TypeId() member functions. The type name is given in the constructor, but the type identifier has to be created somehow. For example, it can be computed from the relative address of the Type Info Record, but this is application dependent; therefore other identifiers may be required to ensure that TypeId-s are global identifiers.

Creating and Destroying Objects

There is a set of functions called CreateObj() for creating objects and others called DestroyObj() for deleting objects. Every Type Info Class has to override these member functions and call the appropriate new and delete operators. The CreateObj() functions may call the default constructor or the copy constructor. The second case makes it possible to create objects similar to a prototype, which can be useful when a large set of similar objects is handled.
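As a rough illustration of this idea (a sketch under our own assumptions, not the actual Oops source), a type descriptor could expose creation and destruction through void pointers as follows. The names TypeInfoSketch, TypeInfoFor and CloneViaTypeInfo are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a Type Info base class; not the Oops API.
class TypeInfoSketch {
public:
    explicit TypeInfoSketch(const char* name) : name_(name) {}
    virtual ~TypeInfoSketch() {}
    const char* TypeName() const { return name_; }
    virtual void* CreateObj() const = 0;                   // default construction
    virtual void* CreateObj(const void* proto) const = 0;  // copy of a prototype
    virtual void DestroyObj(void* obj) const = 0;
private:
    const char* name_;
};

// One descriptor class per concrete type T; only this class knows T.
template <typename T>
class TypeInfoFor : public TypeInfoSketch {
public:
    explicit TypeInfoFor(const char* name) : TypeInfoSketch(name) {}
    void* CreateObj() const override { return new T(); }
    void* CreateObj(const void* proto) const override {
        assert(proto != nullptr);
        return new T(*static_cast<const T*>(proto));
    }
    void DestroyObj(void* obj) const override { delete static_cast<T*>(obj); }
};

// Copies a prototype string using only the type-erased interface.
inline std::string CloneViaTypeInfo(const std::string& proto) {
    static const TypeInfoFor<std::string> info("std::string");
    const TypeInfoSketch& erased = info;
    void* obj = erased.CreateObj(&proto);
    std::string result = *static_cast<std::string*>(obj);
    erased.DestroyObj(obj);
    return result;
}
```

The caller never names the concrete type; it only passes void pointers to the descriptor, which is the division of knowledge described above.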

Information of Type

C++ types belong to different groups like base types (integer, float, string), container types (array, vector, list), compound types (classes and structures) or abstract types. The Type Info Class has some functions providing information about classification of the described type:

IsCntnr() returns true if the type is a compound type or a container, that is, if it contains a list of properties. The function name means that the type is a container of properties.

IsPropClass() returns true if the type is a class or structure having a Property Interface. A class has a Property Interface if it is a descendant of the common base class rProp_BaseClass_i and implements its abstract functions. See details later.

IsAbstract() returns true if the type is an abstract class, that is, if it has pure virtual functions. Abstract classes cannot be instantiated; therefore their CreateObj() and DestroyObj() functions must not be called and cannot create an instance of the class.

IsSTLCntnr() returns the address of a special descriptor for containers, or NULL if the type is not a container type. The type descriptors of standard containers provide some additional services for adding new elements, deleting elements, and iterating through the elements. IsSTLCntnr() actually returns a pointer to an interface class derived from the Type Info base class.

Getting and Setting Value

The GetValXxx() and SetValXxx() functions convert the value of the described type to and from a common format. A text and a binary format are supported (Xxx stands for Str or Bin). These functions get the address of the object to convert and append the text or binary representation to the given buffer. The caller of the functions must ensure that the pointers point to the correct place. These functions are called from Property Iterators; therefore the user of the Property Library does not need to use them directly.

The text format ensures a platform-independent and human-readable representation, while the binary format is quick and efficient. The binary conversion generally has nothing to do, but in some cases (e.g. integer types) it performs a simple conversion to ensure a platform-independent format.

There is a special case, when a member variable is used to describe the number of elements in an array. The Property Descriptor of the array must know the actual size of the array; therefore such a Property Descriptor contains a reference to the property storing the size information. In this case the GetValSize() function is used to get the value of that property.
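The kind of simple integer conversion meant here can be sketched as follows. The article does not state which byte order Oops uses; little-endian is chosen here purely as an assumption, and the function names AppendU32 and ReadU32 are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Appends a 32-bit value to the buffer in little-endian byte order
// (an assumed format), whatever the byte order of the host machine is.
inline void AppendU32(std::string& buf, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        buf.push_back(static_cast<char>((v >> (8 * i)) & 0xFFu));
}

// Reads the value back from byte position pos of the buffer.
inline std::uint32_t ReadU32(const std::string& buf, std::size_t pos) {
    assert(pos + 4 <= buf.size());
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(
                 static_cast<unsigned char>(buf[pos + i])) << (8 * i);
    return v;
}
```

Because the bytes are produced by shifting rather than by copying memory, the stored representation does not depend on the platform that wrote it.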

Void2PropBase()

This function is implemented only for compound types and it is used to get an rProp_BaseClass_i type pointer of the object.

The C++ language supports multiple inheritance and type casting. An up- or downcast operation on a pointer may change the address. This is necessary because different sub-classes are placed at different addresses in the object’s data block. Casting to a void pointer always returns the address of the beginning of the data block. Therefore the Property Library uses void pointers for passing the address of objects, but these pointers cannot be directly converted back to a pointer to one of the sub-classes. The void pointer has to be cast to its real type first, and then it can be cast to the requested sub-class.

The Void2PropBase() virtual function is implemented for every Type Info Class of a compound type, and it contains these type casts. First it casts the void pointer to the type of the pointed object, and then it casts the result up to an rProp_BaseClass_i pointer.
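The two-step cast can be sketched as below. The classes PropBaseSketch, OtherBaseSketch and WidgetSketch are hypothetical; PropBaseSketch merely stands in for rProp_BaseClass_i.

```cpp
#include <cassert>

// Hypothetical stand-in for rProp_BaseClass_i.
struct PropBaseSketch {
    virtual ~PropBaseSketch() {}
    virtual int Tag() const = 0;
};

// An unrelated first base class that occupies the start of the object.
struct OtherBaseSketch {
    virtual ~OtherBaseSketch() {}
    int pad = 0;
};

// PropBaseSketch is the second base, so on common compilers its
// sub-object does not start at the beginning of a WidgetSketch.
struct WidgetSketch : OtherBaseSketch, PropBaseSketch {
    int Tag() const override { return 42; }
};

// Restores the real type first, then lets the compiler adjust the address.
inline PropBaseSketch* WidgetVoid2PropBase(void* obj) {
    WidgetSketch* w = static_cast<WidgetSketch*>(obj);
    return static_cast<PropBaseSketch*>(w);
}
```

Casting the void pointer directly to PropBaseSketch* would skip the address adjustment and yield a pointer into the wrong sub-object, which is why the cast has to be written by code that knows the real type.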

Getting the Type Info Record

Type Info records are created as static variables, and their constructors link them into a list. The First() member function returns the address of the first Type Info record. The Next() member function can be used to iterate through the list of all types.

The GetTypeInfo() functions are used to find the Type Info record from the type name or the type identifier. It is recommended to use these functions instead of searching with Next(), because they return the address of the record directly without iterating through the list of Type Info records.
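A minimal sketch of such self-registering records follows. It is not the Oops implementation: the class name TypeRecordSketch and the use of std::map as the lookup index are our assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical self-registering record; not the Oops class.
class TypeRecordSketch {
public:
    explicit TypeRecordSketch(const char* name)
        : name_(name), next_(Head()) {
        Head() = this;           // link into the global list
        Index()[name_] = this;   // register for direct lookup by name
    }
    const std::string& TypeName() const { return name_; }
    static const TypeRecordSketch* First() { return Head(); }
    const TypeRecordSketch* Next() const { return next_; }
    // Finds a record by name without walking the linked list.
    static const TypeRecordSketch* GetTypeInfo(const std::string& name) {
        auto it = Index().find(name);
        return it == Index().end() ? nullptr : it->second;
    }
private:
    // Function-local statics avoid depending on the initialization
    // order of static objects in different translation units.
    static TypeRecordSketch*& Head() {
        static TypeRecordSketch* head = nullptr;
        return head;
    }
    static std::map<std::string, TypeRecordSketch*>& Index() {
        static std::map<std::string, TypeRecordSketch*> index;
        return index;
    }
    std::string name_;
    TypeRecordSketch* next_;
};

// Static records register themselves before main() starts.
static TypeRecordSketch g_intRecord("int");
static TypeRecordSketch g_floatRecord("float");
```

Each new record is pushed to the front of the list, so First() returns the record constructed last.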

Compare Type Info Records

Type name and type identifier must be unique; therefore the equality operators compare the content of the Type Info records. It is also possible to compare the addresses of the Type Info records, because every type has only one Type Info record. (In some cases it may happen that a DLL has its own data segment, and therefore both the program and the DLL build their own Type Info records. In this case, the addresses cannot be compared.)

Containers

Handling containers requires some additional services. Containers are special objects storing a list of elements, like arrays, lists, maps, vectors etc. Arrays do not require any support from Type Info classes (because simple pointers can be used to iterate over their elements), but containers represent a higher abstraction level, and it is better if the Property Library does not know anything about their internal structure. The Property Library uses the standard interface of containers and handles them similarly to Base Types instead of handling them as compound types.

Standard iterators can be used for iterating through STL containers, but they cannot add new elements to the container. An iterator actually points to an element, and it does not know the container itself. The Property Iterator has to know somehow the type of the container and it has to store somehow an iterator pointing to an element of the container.

As it will be discussed later, Property Iterators do not know the type of properties. In Oops the Type Info records are the only objects having any knowledge of data types. Therefore some special programming tricks are used and the services of Type Info Class are extended for accessing STL containers and using their iterators.

The Type Info Class provides an extended interface for STL container types. (The STL container type means the instantiated template type here, like vector<int>.) Every STL container type must have its own Type Info Class and static Type Info record. The IsSTLCntnr() function returns the address of the extended Type Info Class (rTD_STLCntnr_i). This class is a descendant of class rProp_TypeInfo_c and adds some functions operating on STL containers and their iterators.

STL containers work in different ways; therefore several Type Info Classes exist for handling each of them in the best way. Note that random access should be used for vector-like containers (vector, deque), but iterators for lists and associative containers (map). Some containers (e.g. vector) invalidate iterators when new items are added; therefore operator[] is used to access their elements. Others (list, map) do not invalidate iterators, but do not support random access to elements.

The extended interface of the Type Info Class provides services for accessing STL containers, but most operations work with iterators, and these iterators must be stored in the Property Iterator. An advanced programming technique, the ‘Type Destroyer is Type Restorer’ design pattern, is used to make this possible.

All functions operating on containers use void pointers for getting the address of the iterator. The iterator itself is created by the Type Info Class, but stored by the Property Iterator. The appropriate Type Info Class does all operations on the iterator, while every Property Iterator can store its own copy of the container iterator as an array of bytes, without actually knowing what is stored. This seems complicated, but it ensures the required independence of the Property Iterators and adds only a little overhead to the usage of STL iterators.
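The following sketch shows one way to keep a container iterator in an opaque byte buffer while a type-aware helper performs every operation on it. It is only an illustration of the technique; the names CntnrOpsSketch, CntnrOpsFor and SumInts, and the 64-byte buffer size, are our assumptions and not part of Oops.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>
#include <vector>

// Hypothetical type-erased container interface.
class CntnrOpsSketch {
public:
    virtual ~CntnrOpsSketch() {}
    virtual std::size_t IterSize() const = 0;
    virtual void Begin(void* cntnr, void* iterBuf) const = 0;  // construct in buffer
    virtual bool AtEnd(void* cntnr, void* iterBuf) const = 0;
    virtual void Step(void* iterBuf) const = 0;
    virtual void* Elem(void* iterBuf) const = 0;
    virtual void Destroy(void* iterBuf) const = 0;
};

// Only this template knows the container type C and its iterator type.
template <typename C>
class CntnrOpsFor : public CntnrOpsSketch {
    typedef typename C::iterator It;
    static It& AsIt(void* buf) { return *static_cast<It*>(buf); }  // type restorer
public:
    std::size_t IterSize() const override { return sizeof(It); }
    void Begin(void* c, void* buf) const override {
        new (buf) It(static_cast<C*>(c)->begin());
    }
    bool AtEnd(void* c, void* buf) const override {
        return AsIt(buf) == static_cast<C*>(c)->end();
    }
    void Step(void* buf) const override { ++AsIt(buf); }
    void* Elem(void* buf) const override { return &*AsIt(buf); }
    void Destroy(void* buf) const override { AsIt(buf).~It(); }
};

// Generic code: sums the ints of a container known only through the interface.
inline int SumInts(const CntnrOpsSketch& ops, void* cntnr) {
    alignas(std::max_align_t) unsigned char buf[64];  // opaque iterator storage
    assert(ops.IterSize() <= sizeof(buf));
    int sum = 0;
    for (ops.Begin(cntnr, buf); !ops.AtEnd(cntnr, buf); ops.Step(buf))
        sum += *static_cast<int*>(ops.Elem(buf));
    ops.Destroy(buf);
    return sum;
}
```

SumInts plays the role of the Property Iterator here: it owns the bytes, but every operation that needs the real iterator type is delegated to the descriptor object.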

Property Descriptors

Property Descriptors make it possible to handle compound types without manually writing the type description and the conversion functions. Instead, a description of the members is given and the type description is generated automatically.

Types are used at different places in C++ programs for declaring variables, members of compound types (structure or class), function arguments, return values, and elements of arrays or containers. The Type Info Class gives information about the type itself. Is this information enough to access a variable, an instance of the given type? Unfortunately it is not. If we want to access a variable, we need to know its address as well. The address belongs to the instance; therefore it cannot be stored in the type descriptor. If we want to access a simple variable, we have its address, but the address of a data member or an element of a container is not so trivial. We generally have the address of the class or container, and we should somehow compute or get the address of the member or element. This is the task of the Property Descriptors.

The Property Descriptors make it possible to access the members of classes and the elements of containers. They store a reference to the Type Info Record and some other information necessary to access the data. Different kinds of members require different methods. Member variables can be accessed by their offset, member functions have to be called, and elements of containers have to be accessed through the functions of the container class or by using iterators. Therefore the Property Descriptor has many forms, depending on the member it is designed for.

Every compound type has its own Property Descriptor Table for describing its properties. This Property Descriptor Table is a static array of rProp_Descriptor_c objects, and it mirrors the class declaration. The array has an item for every property, and the kind of the item depends on the kind of the member (data or member function, pointer, array, or container, etc.). Different Property Descriptors belong to base types, compound types and containers, which leads us to define a polymorphic hierarchy of different Property Descriptors.

However, the elements of a static array must have the same type. How can we store polymorphic classes in a static array, and how can we initialize it before the program starts running? The first possibility would be to use an array of pointers, but it cannot be initialized with a list of values. The better solution is that the rProp_Descriptor_c class is a wrapper class, and the Property Descriptor Table is an array of rProp_Descriptor_c. Its constructor creates the actual Property Descriptor, and the member functions only mirror the polymorphic functions of the internal representation. Generally only the Property Iterators use the Property Descriptors, and applications rarely need to access them directly.

The rProp_Descriptor_c class is very simple. It has only one data member (a pointer for storing the internal, polymorphic implementation), and many constructors for creating property descriptors for all possible kinds of property. Its member functions are simple gate functions for accessing the information stored by the Property Descriptor.

The real functionality of the Property Descriptors is implemented in the internal descriptor classes. They have an abstract base class (rProp_DescBase_i), but to understand the functionality of the Property Descriptors, we have to understand how the descendant classes work.
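The wrapper idea can be sketched as follows. This is not the rProp_Descriptor_c source: all class names are hypothetical, and std::shared_ptr is used here only as a convenient way to make the wrapper safely copyable during array initialization.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical polymorphic implementation hierarchy.
struct DescImplSketch {
    explicit DescImplSketch(const char* n) : name(n) {}
    virtual ~DescImplSketch() {}
    virtual std::string Kind() const = 0;
    const char* name;
};
struct MemberImplSketch : DescImplSketch {
    MemberImplSketch(const char* n, std::size_t o)
        : DescImplSketch(n), offset(o) {}
    std::string Kind() const override { return "member"; }
    std::size_t offset;
};
struct ArrayImplSketch : DescImplSketch {
    ArrayImplSketch(const char* n, std::size_t o, std::size_t c)
        : DescImplSketch(n), offset(o), count(c) {}
    std::string Kind() const override { return "array"; }
    std::size_t offset, count;
};

// Wrapper with a single pointer member: every array element has the same
// type, while the chosen constructor decides which implementation is built.
class DescriptorSketch {
public:
    DescriptorSketch(const char* name, std::size_t offs)
        : impl_(std::make_shared<MemberImplSketch>(name, offs)) {}
    DescriptorSketch(const char* name, std::size_t offs, std::size_t count)
        : impl_(std::make_shared<ArrayImplSketch>(name, offs, count)) {}
    const char* Name() const { return impl_->name; }
    std::string Kind() const { return impl_->Kind(); }
private:
    std::shared_ptr<const DescImplSketch> impl_;
};

struct SampleSketch { int A; float B; int Arr[4]; };

// Static table mirroring the declaration of SampleSketch.
static const DescriptorSketch g_sampleTable[] = {
    DescriptorSketch("A", offsetof(SampleSketch, A)),
    DescriptorSketch("B", offsetof(SampleSketch, B)),
    DescriptorSketch("Arr", offsetof(SampleSketch, Arr), 4),
};
```

The table is an ordinary array initialized with a list of values, yet each element forwards its calls to a different polymorphic object.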

The Abstract Base Class (rProp_DescBase_i)

There are 3 data members in rProp_DescBase_i, storing information common to all Property Descriptors: the name of the property, the address of the Type Info Record, and the property flags.

Property Name

The name of the property has the same purpose as the name of a variable. When properties are searched, or a single item of a complex data structure is referenced, the Property Name can be used in the same way as member names are used in C++ expressions. For example, you can make the following function call for setting the A.B.strName member variable:

class B_c {
     std::string strName;
};
class A_c {
     B_c B;
};
A.SetValStr( "B.strName", "Mr. Smith" );

The example assumes that the property names are the same as the variable names. The SetValStr() function searches for a property called “B”, then it searches for a property called “strName” in B. If the proper string variable is found, the SetValStr() function of the Type Info Record is called with the address of the string data member and the string representation of the new value (“Mr. Smith”).
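The name lookup described above can be sketched in a strongly simplified form. Here each object resolves its own children by name, whereas Oops uses Property Descriptors and Type Info Records for this; the NodeSketch and StrPropSketch types are hypothetical, and B_c and A_c only mirror the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical minimal property interface: a node either resolves a
// child by name or accepts a string value itself.
struct NodeSketch {
    virtual ~NodeSketch() {}
    virtual NodeSketch* Find(const std::string&) { return nullptr; }
    virtual bool Assign(const std::string&) { return false; }

    // Walks a path such as "B.strName" one segment at a time and
    // assigns the value when the end of the path is reached.
    bool SetValStr(const std::string& path, const std::string& value) {
        if (path.empty()) return Assign(value);
        const std::size_t dot = path.find('.');
        const std::string head = path.substr(0, dot);
        const std::string rest =
            (dot == std::string::npos) ? "" : path.substr(dot + 1);
        NodeSketch* child = Find(head);
        return child ? child->SetValStr(rest, value) : false;
    }
};

struct StrPropSketch : NodeSketch {
    std::string value;
    bool Assign(const std::string& v) override { value = v; return true; }
};
struct B_c : NodeSketch {
    StrPropSketch strName;
    NodeSketch* Find(const std::string& n) override {
        return n == "strName" ? &strName : nullptr;
    }
};
struct A_c : NodeSketch {
    B_c B;
    NodeSketch* Find(const std::string& n) override {
        return n == "B" ? &B : nullptr;
    }
};
```

With an object A of type A_c, the call A.SetValStr("B.strName", "Mr. Smith") first finds B, then strName inside B, and finally assigns the string; an unknown name makes the call return false.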

Property Flags

Property Flags give some additional information for the Property Iterator about the property. Some of the flags describe language dependent information, like pointer, access right (public, protected, private), while others describe additional information not provided by the C++ language, like streamable.

Reference to the Type Descriptor

The 3rd common data member is the address of the Type Info Record. It is the link to the type descriptor.

Member Descriptor

The simplest kind of property is a member variable, described by the rPropDescMember_c class. In addition to the information stored in rProp_DescBase_i, it has only one data member, which stores the offset of the variable. The real address of the member variable can be computed by adding this offset to the address of the object.
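The address computation can be sketched with the standard offsetof macro. The types PointSketch and MemberDescSketch are hypothetical and much simpler than the real descriptor class.

```cpp
#include <cassert>
#include <cstddef>

struct PointSketch { int x; int y; };  // hypothetical example class

// Hypothetical member descriptor: a name plus the byte offset of the
// member inside the object.
struct MemberDescSketch {
    const char* name;
    std::size_t offset;
    // Address of the member = address of the object + stored offset.
    void* Addr(void* obj) const { return static_cast<char*>(obj) + offset; }
};

static const MemberDescSketch g_pointProps[] = {
    { "x", offsetof(PointSketch, x) },
    { "y", offsetof(PointSketch, y) },
};
```

Given a PointSketch p, g_pointProps[1].Addr(&p) yields the address of p.y without the calling code ever naming the member.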

Member Function Descriptors

Properties sometimes cannot be set or read by directly accessing member variables. It is a common solution in C++ to use gate functions for reading and writing private or protected data members. These functions may have other functionality, like checking the values, or they may introduce ‘virtual’ properties, which are not represented by a single member variable. For example, a rectangle may store the coordinates of its corners, but it may have properties computed from these coordinates, like ‘Height’ and ‘Width’. There are special Property Descriptors that make it possible to use member functions as properties.

There are several problems using member functions as properties.

  1. Memory has to be allocated for storing the value of the arguments when the member functions are called. The data value is given in ASCII format when SetValStr() is called. It is converted to a binary value before it is passed to the appropriate gate function. The temporary variable cannot be in the Property Descriptor, because the same Property Descriptor may be used in parallel by several threads. Instead it is stored in the Property Iterator, as we will see later.
  2. Properties are generally read-write values. Therefore two gate functions are required for setting and getting the value of a given property. This seems to be simple, but sometimes these gate function pairs do not exist. For example, a rectangle class (generally used in graphical user interface classes) may have a function for setting the coordinates by providing 4 parameters: the x and y coordinates, the width and the height (like rect(x,y,w,h)), while it has 4 functions for getting the values, like x(), y(), w(), h(). These cases should be avoided by adding some further member functions.
  3. When the gate functions have several parameters, the situation becomes even more difficult. This can be handled if the property is treated as a compound property, and the parameters are collected into a structure having its own property description.

Gate functions are handled as callback functions. The technique used to implement gate functions is based on the idea published by [Jakubik], but it is simpler. The Property Descriptor class of gate functions has two members for storing the member function pointers of the gate functions. These members are polymorphic descriptors of gate functions, used for calling the set and get member functions. The base classes have a common interface, and several template classes are derived from them for describing different function argument lists. (Even the simplest gate functions may have 5 different kinds of parameter: the argument can be passed by value or by address, in which case a pointer or a reference can be used, either of which may be constant or not.)

There are 2 solutions when the gate function has several arguments. In the simplest case the arguments have the same type, and they can be described and stored as an array. When the types of the arguments differ, a structure has to be created for storing the temporary values of the parameters.
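A much reduced sketch of a gate function descriptor is shown below. It covers only one signature pair (a const getter returning T and a setter taking T by value), and the names GateSketch, GateFor and RectSketch are hypothetical. Unlike Oops, which keeps the temporary value in the Property Iterator, this sketch simply uses a local variable, which also keeps the descriptor free of per-call state.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical type-erased gate: reads or writes a property as text
// on an object passed as a void pointer.
struct GateSketch {
    virtual ~GateSketch() {}
    virtual std::string GetStr(void* obj) const = 0;
    virtual void SetStr(void* obj, const std::string& text) const = 0;
};

// Covers one signature pair: "T getter() const" and "void setter(T)".
template <typename C, typename T>
class GateFor : public GateSketch {
public:
    GateFor(T (C::*get)() const, void (C::*set)(T)) : get_(get), set_(set) {}
    std::string GetStr(void* obj) const override {
        std::ostringstream out;
        out << (static_cast<C*>(obj)->*get_)();
        return out.str();
    }
    void SetStr(void* obj, const std::string& text) const override {
        T tmp = T();                 // temporary lives outside the descriptor
        std::istringstream in(text);
        in >> tmp;                   // text to binary conversion
        (static_cast<C*>(obj)->*set_)(tmp);
    }
private:
    T (C::*get_)() const;
    void (C::*set_)(T);
};

// Example class with a 'virtual' property computed from its members.
class RectSketch {
public:
    int Width() const { return right_ - left_; }
    void SetWidth(int w) { right_ = left_ + w; }
    int Right() const { return right_; }
private:
    int left_ = 10, right_ = 30;
};
```

A descriptor built as GateFor<RectSketch, int>(&RectSketch::Width, &RectSketch::SetWidth) can then read and write the width of any RectSketch as text.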

Arrays

The simplest containers are the traditional arrays. The Property Descriptors have to know somehow the size of the array and the number of elements actually stored in the array. There are 3 possible ways of using C style arrays:

Both fixed size and dynamic arrays may use a null terminator, and dynamic and null-terminated arrays may be allocated dynamically (on the heap). The Property Descriptors of arrays support all these cases with 2 Property Descriptors, one for fixed size and one for dynamic arrays, both supporting a null terminator.

Fixed Size Arrays

This Property Descriptor has 3 additional data members:

The Property Descriptors work closely together with the appropriate Property Iterators. The Property Iterator uses the information stored in the Property Descriptor when the address of the next property is computed. Then it uses the services of the Type Info class for accessing the value of the property.

Dynamic Arrays

The Property Descriptor has to describe where the actual size of a dynamic array is stored. It is assumed that the actual number of elements is stored in another property of the class. Therefore the Property Descriptor stores the address of another Property Descriptor and uses it to get the actual size of the array by calling the GetSize() function of the Type Info record.
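The relation between the array and its size property can be sketched like this. The sketch stores plain offsets instead of a reference to another Property Descriptor, assumes the size member is an int, and uses the hypothetical names BufferSketch and DynArrayDescSketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct BufferSketch {   // hypothetical class with a counted array
    int count;          // actual number of elements
    double* data;       // dynamically sized array
};

// Hypothetical descriptor: where the array pointer lives and where the
// element count lives inside the owning object.
struct DynArrayDescSketch {
    std::size_t dataOffset;
    std::size_t sizeOffset;
    std::size_t elemSize;

    // Reads the element count from the size member (assumed to be int).
    std::size_t Size(const void* obj) const {
        const void* p = static_cast<const char*>(obj) + sizeOffset;
        return static_cast<std::size_t>(*static_cast<const int*>(p));
    }
    // Returns the address of element i.
    void* Elem(void* obj, std::size_t i) const {
        assert(i < Size(obj));
        void* base = nullptr;  // copy the stored pointer without knowing its type
        std::memcpy(&base, static_cast<char*>(obj) + dataOffset, sizeof base);
        return static_cast<char*>(base) + i * elemSize;
    }
};
```

Because the size is read from the object on every call, changing the count member immediately changes the range the descriptor reports.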

The additional data members are slightly different from fixed size array:

Standard Containers

Containers could be handled as classes by defining some properties. However, they provide an important abstraction, which should be reflected in the property description. The internal structure of containers is complicated and hidden, which makes it difficult to add and use a property description. For example, the standard set template class generally stores the elements in a binary tree, but this is of no interest to the user, who simply wants to see the list of the elements.

Here we are talking about the containers of the Standard Template Library. The Type Info Class makes it possible to use only a few functions of a container for accessing its content, and the Property Descriptors use the services of the Type Info Class. They do not access the container directly. It is possible to use any container having the appropriate interface, and it is also possible to extend the list of Property Descriptors for using containers with a different interface. However, the implementation discussed here fits most applications.

The Property Descriptors of STL containers do not need to store any additional data about the variable they belong to. The address of the container object is already stored in the base class, and all other information is stored in the container itself. Even the Property Descriptor of the elements stored in the container is obtained from the Type Info Record. However, different Property Descriptor classes work in different ways, and they vary depending on the type of the container. There are 3 different possibilities:

Services of Property Descriptors

Property Descriptors generally work with Property Iterators, but sometimes it is necessary to use them directly. For example, when a global container stores objects, the first Property Iterator has to be obtained by calling the CreatePropIterator() function of the container’s Property Descriptor.

Property Iterators use the internal polymorphic version of Property Descriptors; therefore the rProp_Descriptor_c wrapper class gives access to the internal representation and provides only a few functions, which directly call the appropriate function of the internal object.

Information of Property

Some member functions of Property Descriptors give information about the property. This information is used to decide what to do with the property, and how to handle it.

These functions are used by Property Iterators for getting the necessary information about the pointed properties.

Address of Object

The address of the property is necessary for setting or getting the value of the property. The main task of the Property Descriptor is to calculate this address. The simplest example is getting the address of a data member of a class. The Property Descriptor stores the offset of the variable in the object. The Property Iterator can get this offset from the Property Descriptor and can calculate the address of the variable by adding the offset to the base address of the object. The situation becomes more complicated in the case of polymorphic objects, arrays or containers, but the Property Descriptor hides this complexity.

The following functions are used to calculate the address of the property:

Offs() returns the offset of the property, if it makes sense.

PropAddr() returns the address of the property or the pointer to the property. This function is used, when the pointer has to be accessed.

RealAddr() returns the address of the property. This function takes into account pointers and polymorphism, and returns the real address of the property.

ApplyPtr() returns the address of the pointed object, if the property is pointer.

PropBaseClass() returns the address of the Property Interface when the type of the property is a descendant of class rProp_BaseClass_i.

Access the Value of Property

When the property already exists the GetValXXX() and SetValXXX() functions can be used to access the value (XXX stands for Str or Bin). These functions call the appropriate functions of the Type Info Class. The Property Descriptor’s version only transforms the given arguments using its knowledge.

When the property does not exist yet, for example because it is a pointer, the object has to be created first. The AddNewStr() and AddNewBin() functions create the object and set its value, while AddNewPtr() only creates the object and stores its address, so that the value or the content can be set later. This is the case when polymorphic objects are loaded from a stream. The objects are created and their addresses are stored in a container; then a new Property Iterator is used to iterate through the properties of the new object and set their values.

Iterating through Compound or Container Objects

When the property’s type is not a Base Type, the Property Descriptor has to give access to the properties of the property. This is the task of the CreatePropIterator() function, which creates a new Property Iterator pointing to the first or the last sub-property. Property Iterators will be discussed later in detail, but the CreatePropIterator() function is the key to understanding how Oops handles the tree structure of data. Property Iterators are used to step through one level of the tree. When the IsCntnr() function of a property returns true, the CreatePropIterator() function can open the branch by creating a new Property Iterator. This new iterator iterates through the properties of the branch.
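The level-by-level traversal can be sketched with a toy data model. PropNodeSketch, LevelIterSketch and Dump are hypothetical names, and the nodes own their children directly, whereas Oops derives the tree from Property Descriptors.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical property node: a leaf has a value, a container has
// sub-properties.
struct PropNodeSketch {
    std::string name;
    std::string value;
    std::vector<PropNodeSketch> children;
    bool IsCntnr() const { return !children.empty(); }
};

// Hypothetical iterator that steps through exactly one level of the tree.
class LevelIterSketch {
public:
    explicit LevelIterSketch(const PropNodeSketch& parent)
        : parent_(&parent), pos_(0) {}
    bool Valid() const { return pos_ < parent_->children.size(); }
    void Next() { ++pos_; }
    const PropNodeSketch& Prop() const { return parent_->children[pos_]; }
    // Opens the branch below the current property.
    LevelIterSketch CreatePropIterator() const { return LevelIterSketch(Prop()); }
private:
    const PropNodeSketch* parent_;
    std::size_t pos_;
};

// Depth-first dump: one iterator per level, a new one for each opened branch.
inline void Dump(LevelIterSketch it, const std::string& prefix, std::string& out) {
    for (; it.Valid(); it.Next()) {
        const std::string path = prefix + it.Prop().name;
        if (it.Prop().IsCntnr())
            Dump(it.CreatePropIterator(), path + ".", out);
        else
            out += path + "=" + it.Prop().value + "\n";
    }
}
```

A generic printing or saving routine has exactly this shape: it needs only the iterator interface and never the concrete types behind the properties.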

Inheritance

Inheritance is one of the most important features of every object-oriented language. C++ supports multiple and virtual inheritance, which should also be supported by every RTTI description or library providing persistency. Unfortunately only a few libraries support multiple inheritance (e.g. Microsoft’s MFC does not allow the use of multiple inheritance), because it is quite difficult to handle. Let us see how Oops handles inheritance.

Every class describes only its own properties in its Property Descriptor Table. The properties of ancestor classes are described in the Property Descriptor Tables of the ancestor classes; therefore, when the Property Iterator iterates through the properties of a class, it has to find the Property Descriptor Tables of the ancestor classes to reach the inherited properties.

The property description of the class has to contain the list of ancestor classes as well. For example, the Property Descriptor Table may have special items for the ancestor classes. Unfortunately the real world is not that simple; therefore Oops provides two different solutions for describing inheritance, each having its own advantages and disadvantages. The programmer provides this information when the property description of the class is created.

The original description of inheritance is too difficult to use and too slow for accessing properties at run time; therefore it is used only to initialize a faster internal representation. This internal representation is quite simple, and it is built when the RTTI description of the class is first used. The internal representation handles inheritance with a list of records describing all ancestor classes, not only the direct ancestors of the class.

The programmer defines only the direct ancestors in the property description; therefore the list has to be collected by iterating through all ancestor classes. The list contains the address of each ancestor class’ Property Descriptor Table and the offset of the ancestor class’ data inside the object. It is built on first use by the CreateIteratorTable() function of the class, which builds the list if it has not been built yet and returns its address.

The internal representation of ancestor classes has another major task. While the list of all sub-classes is being created, the CreateIteratorTable() function checks whether the next sub-class has already been added to the list. This way Oops can properly handle virtual inheritance, because the offset of a virtual sub-class does not depend on the route by which it was reached. How to obtain the offset of the virtual sub-class is another problem; the compiler-independent solution used in Oops is discussed later.
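The flattening step described above can be sketched as follows. This is only a minimal illustration, not the Oops implementation: ClassDesc and CollectAncestors() are hypothetical names, offsets are left out, and the sketch assumes that a base reached by several routes is a virtual base (so it may be listed only once).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the programmer-supplied description of a class:
// it names only the DIRECT ancestors.
struct ClassDesc {
    std::string name;
    std::vector<const ClassDesc*> directBases;
};

// Appends every direct and indirect ancestor of 'cls' to 'out'.
// An ancestor that is already in the list is skipped, so a base shared by
// several branches (assumed to be a virtual base here) appears only once.
void CollectAncestors(const ClassDesc& cls, std::vector<const ClassDesc*>& out) {
    for (const ClassDesc* base : cls.directBases) {
        if (std::find(out.begin(), out.end(), base) != out.end())
            continue;                    // already reached by another route
        CollectAncestors(*base, out);    // ancestors of the ancestor first
        out.push_back(base);
    }
}
```

For a diamond (D derived from B1 and B2, both derived from V) the resulting list is V, B1, B2: the shared ancestor V is recorded once, regardless of the route by which it is reached.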

Structures and Classes

There is only one small difference between structures and classes in C++: the default access right is public for structures and private for classes. This small difference is not important most of the time, but the philosophy is extended in Oops.

There are two ways of defining the property description of compound types, and both work for structures and classes alike. There are two important differences:

Both solutions have advantages; therefore Oops implements both of them, and the programmer writing the property description can decide which one he/she prefers in the given situation.

The structure-like description is not intrusive. It can be used even if the source code of the class is not available (only the object code) or cannot be modified for some reason (for example, when adding a property interface to a library). The Property Descriptor Table could have access to the private and protected members if it were made a friend of the class, but then the friend statement has to be added to the class definition.

Another advantage of the structure-like description is that the ancestor classes can simply be added to the Property Descriptor Table. This makes the property description simpler and easier to understand. The disadvantage of this solution is that, although it can properly handle multiple inheritance, it fails on virtual inheritance.

The class-like description handles inheritance in a different way that works properly even with virtual inheritance, but the price is that this method is intrusive. However, the base class also gives a standard interface, called the Property Interface, which makes it possible to pass the object to any program that knows only this interface.

The following examples illustrate the difference. B_s is a structure and B_c is a class, both having 2 ancestors and a member variable. Please note the difference in how the property description handles the inheritance.

struct B_s : public A1_s, public A2_s
{
  double Data;
};
rProp_DeclareGetTypeInfo_d(B_s);

rProp_BgnDescTbl_d(B_s,Struct,idB1)
  rPropDescInherit_d( B_s, A1_s )
  rPropDescInherit_d( B_s, A2_s )
  rPropDescSame_d( B_s, double, Data, rcProp_Default )
rProp_EndDescTbl_d

The ancestor classes are described at the beginning of the Property Descriptor Table, and the definition of the class is not modified. There is only one thing the programmer has to declare somewhere above the Property Descriptor Table: the rProp_GetTypeInfo() function for structure B_s.

class B_c: virtual public rProp_BaseClass_i, protected A2_c
{
  double _Data;
public:
  B_c()
    : rProp_BaseClass_i(), A2_c(), _Data(4)
    {}
  rProp_DeclareInterface_d(B_c);
};
rProp_DeclareGetTypeInfo_d(B_c);


rProp_BgnDescTbl_d(B_c,Class,idB2)
  rPropDesc_d( B_c, double, "Data", _Data, rcProp_Default )
rProp_EndDescTbl_d

rProp_ImplementInterface_d(
  B_c,
  rProp_Inherit_d(rProp_BaseClass_i);
  rProp_Inherit_d(A2_c);
);

The property description is more complicated in this case. Class B_c has to be derived from rProp_BaseClass_i, its definition has to contain an additional macro (rProp_DeclareInterface_d), and, in addition to the Property Descriptor Table, it must implement the property description with the rProp_ImplementInterface_d macro. The ancestor classes are passed to this macro, and they are not listed in the Property Descriptor Table.

These macros seem mysterious at first glance, but they are quite simple. There are 2 reasons for using them:

Users of Oops become familiar with the macros quickly. The documentation and the tutorials explain how they work.

Property Base Class

All compound types having a class-style type description are derived from the rProp_BaseClass_i interface class. (Classes derived from rProp_BaseClass_i are called Property Classes in this article.)

The Property Interface provides a good starting point when the application’s data structure has to be passed to a program dealing with properties, like the Save() function of the Stream Library. Such functions generally need a Property Iterator for working with the data structure, and Property Classes can create Property Iterators for themselves. It is simple and convenient if all classes have a common interface through which their properties can be accessed.

The rProp_BaseClass_i class has the following member functions:

In addition to these public services, Property Classes have a function (CreateIteratorTable()) for filling the list of ancestor classes.

The CreateIteratorTable() function

This function creates the list of ancestor classes from the information given by the programmer about inheritance. It is a recursive function. Every compound type knows its immediate ancestors; therefore CreateIteratorTable() calls the CreateIteratorTable() function of every immediate ancestor. The list of all ancestors is passed along as well, and records about the newly found ancestor classes are added to it.

The description of a sub-class contains references to its Property Descriptor Table and Type Info Record, and the offset of the sub-class from the address of the whole object. This offset depends on the compiler, and multiple and virtual inheritance make it difficult to determine. This is why CreateIteratorTable() is needed: it determines this offset and fills the list of ancestor classes properly.

CreateIteratorTable() has different implementations for structure and class type property description.

In the case of the structure-like property description, CreateIteratorTable() is a member of the Type Info Class, and the offset of the sub-class is stored in the Property Descriptor Table entry describing the ancestor class. This solution is simple and handles multiple inheritance properly; however, it cannot access private and protected ancestors and cannot calculate the offset of virtual sub-classes properly.

The class-style property description handles all cases of inheritance properly. This is why it is preferred even though the class receiving the property description has to be modified. CreateIteratorTable() is a virtual member function of the class, and every class has its own version of it. As a member of the class, CreateIteratorTable() knows the ‘this’ pointer and can cast it to the type of each ancestor class. This way the compiler calculates the offset properly even in the case of virtual multiple inheritance. CreateIteratorTable() also checks whether the sub-class has already been added to the list, so virtual sub-classes are added to the list only once.
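The idea of letting the compiler compute sub-object offsets from the ‘this’ pointer can be sketched as below. The names (OffsetRecord, CollectOffsets(), Add()) are hypothetical and the real CreateIteratorTable() stores more than a name and an offset; the sketch only shows that a member function sees the correctly adjusted ‘this’ for its own sub-object, even for a virtual base, and that a duplicate check keeps the virtual base in the list once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct OffsetRecord {
    const char* name;       // ancestor name (illustrative only)
    std::ptrdiff_t offset;  // distance of the sub-object from the whole object
};

struct Root {
    int rootData = 1;
    virtual ~Root() = default;
    // 'whole' is the address of the complete object; 'this' is the sub-object.
    virtual void CollectOffsets(const void* whole, std::vector<OffsetRecord>& out) const {
        Add(whole, this, "Root", out);
    }
protected:
    static void Add(const void* whole, const void* sub, const char* name,
                    std::vector<OffsetRecord>& out) {
        for (const OffsetRecord& r : out)
            if (std::strcmp(r.name, name) == 0) return;  // already listed
        out.push_back({name, static_cast<std::ptrdiff_t>(
            reinterpret_cast<std::uintptr_t>(sub) -
            reinterpret_cast<std::uintptr_t>(whole))});
    }
};

struct Left : virtual Root {
    int leftData = 2;
    void CollectOffsets(const void* whole, std::vector<OffsetRecord>& out) const override {
        Root::CollectOffsets(whole, out);  // the compiler adjusts 'this' to the Root part
        Add(whole, this, "Left", out);
    }
};

struct Right : virtual Root {
    int rightData = 3;
    void CollectOffsets(const void* whole, std::vector<OffsetRecord>& out) const override {
        Root::CollectOffsets(whole, out);
        Add(whole, this, "Right", out);
    }
};

struct Bottom : Left, Right {
    int bottomData = 4;
    void CollectOffsets(const void* whole, std::vector<OffsetRecord>& out) const override {
        Left::CollectOffsets(whole, out);   // direct calls to the ancestors' versions
        Right::CollectOffsets(whole, out);
        Add(whole, this, "Bottom", out);
    }
};
```

Calling `b.CollectOffsets(&b, list)` on a `Bottom b` yields four records (Root, Left, Right, Bottom). The actual offset values are compiler-specific, which is exactly why they are computed from live pointers instead of being written into a table by hand.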

Implementing this solution is not trivial, because the CreateIteratorTable() function has to be implemented for every class, and the implementation has to call the CreateIteratorTable() functions of the ancestor classes directly. The only way to automate this task is to provide a macro that generates the body of the CreateIteratorTable() function. The list of calls to the ancestors’ functions is passed to the macro as an argument. Macro arguments are separated by commas while C instructions are separated by semicolons; therefore a single macro argument may contain a list of C instructions. (This is a good example of using macros. We have not found any replacement for this macro.)
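The macro trick can be illustrated with a small, self-contained sketch. DEFINE_COLLECT and the Collect_* functions are hypothetical and much simpler than the Oops macros; they only demonstrate that a semicolon-separated list of statements travels through the preprocessor as one argument.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical macro: generates a function whose body first executes the
// statements passed in BaseCalls and then records the class's own name.
// BaseCalls is ONE macro argument, because semicolons do not separate
// macro arguments (only top-level commas do).
#define DEFINE_COLLECT(ClassName, BaseCalls)                    \
    void Collect_##ClassName(std::vector<std::string>& out) {   \
        BaseCalls                                               \
        out.push_back(#ClassName);                              \
    }

DEFINE_COLLECT(A1, ;)   // no ancestors: an empty statement
DEFINE_COLLECT(A2, ;)
DEFINE_COLLECT(B,
    Collect_A1(out);
    Collect_A2(out);
)
```

After `Collect_B(out)` the vector holds "A1", "A2", "B". One limitation of the technique is worth keeping in mind: a comma that is not enclosed in parentheses (for example inside a template argument list) would split the statement list into several macro arguments.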

Property Iterators

Using Type Info Records and Property Descriptor Tables is quite complicated. The property based RTTI system needs a well-known and easy to use interface for making the user’s life easier and the programs accessing properties simpler. Property Iterators were developed for this purpose based on the well-known iterator pattern and STL iterators.

Property Iterators can be treated as pointers to properties: not exactly like C pointers, but in the original sense of the word. A Property Iterator belongs to a Property Container (an instance of a compound type or a container) and points to one of its properties. It is a class having member functions for accessing the value of the property and getting information about it. Like any kind of iterator, Property Iterators can be incremented and decremented for stepping through the list of properties, and they can be compared with other Property Iterators as well.

However, Property Iterators are designed to traverse a hierarchical data structure. When a Property Iterator points to a property having sub-properties (a Property Container), the Begin() and End() functions of the iterator can be used to create a new Property Iterator for accessing the internal data structure.

What Does the Property Iterator Do?

The main task of Property Iterators is to maintain the variables required to access the selected property of a given object. This information contains the address of the object, the Property Descriptor Table, the Type Info Record and some kind of index information, which varies for different Property Containers. It may be an index into the Property Descriptor Table or into an array, or it may be an STL iterator.

This information is used to calculate the address of the property, which most of the member functions use for calling the appropriate function of the Property Descriptor. The calculated address is generally cached; therefore Property Iterators may become invalid if the object or the container is changed.

When the iterator is incremented or decremented, the internal logic of the Property Iterator updates the index information. This job is quite difficult. In the case of compound types, the inheritance and the internal structure of Property Descriptor Tables have to be handled, while the appropriate STL iterator has to be used for accessing elements of containers.

Encapsulation

Unfortunately a lot of different Property Iterators exist for different kinds of properties, similarly to Property Descriptors. In fact, Property Iterators and Descriptors work together, and every Property Descriptor has its Property Iterator pair.

To make life simpler, Property Iterators are encapsulated in a wrapper class. There is only one public Property Iterator, which hides the actual type of the underlying Property Iterator and makes it possible to use the same type everywhere in application programs.
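A wrapper of this kind can be sketched as follows. The class names (IterImpl, IntArrayIter, Iterator) and the tiny interface are assumptions made for the illustration; the real rProp_Iterator_c is far richer. The sketch shows the two essential points: the public type owns a pointer to a hidden polymorphic implementation, and copying the wrapper clones that implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical internal interface; each kind of container has its own subclass.
class IterImpl {
public:
    virtual ~IterImpl() = default;
    virtual std::unique_ptr<IterImpl> Clone() const = 0;
    virtual bool IsEnd() const = 0;
    virtual void Next() = 0;
    virtual int Value() const = 0;
};

// One concrete implementation: steps through a plain array of int.
class IntArrayIter : public IterImpl {
    const int* data_;
    std::size_t size_;
    std::size_t index_;
public:
    IntArrayIter(const int* data, std::size_t size) : data_(data), size_(size), index_(0) {}
    std::unique_ptr<IterImpl> Clone() const override {
        return std::unique_ptr<IterImpl>(new IntArrayIter(*this));
    }
    bool IsEnd() const override { return index_ >= size_; }
    void Next() override { if (index_ < size_) ++index_; }
    int Value() const override { assert(!IsEnd()); return data_[index_]; }
};

// The only type application code sees. An empty wrapper acts as "end".
class Iterator {
    std::unique_ptr<IterImpl> impl_;
public:
    Iterator() = default;
    explicit Iterator(std::unique_ptr<IterImpl> impl) : impl_(std::move(impl)) {}
    Iterator(const Iterator& other) {               // copying clones the hidden iterator
        if (other.impl_) impl_ = other.impl_->Clone();
    }
    Iterator& operator=(Iterator other) { impl_.swap(other.impl_); return *this; }
    bool IsEnd() const { return !impl_ || impl_->IsEnd(); }
    void Next() { if (impl_) impl_->Next(); }
    int Value() const { assert(impl_); return impl_->Value(); }
};
```

Because every copy allocates and clones the implementation object, returning such a wrapper by value is convenient but comparatively expensive.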

Property Iterator Groups

Every Property Iterator works in a different way, but there are 3 basic groups.

Iterating Compound Types

The Property Iterator has to access a member of an object, and for doing that it needs the address of the data member. The iterator knows the base address of the compound object, the offset of the data member (from the Property Descriptor Table) and the offset of the sub-class (from the list of ancestor classes), and it can calculate the required address by adding these offsets to the base address.
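The address calculation can be sketched in a few lines. MemberDesc, kPointMembers and MemberAddress() are hypothetical names; the member offsets are taken with the standard offsetof macro, which is guaranteed to work only for standard-layout types such as the Point structure used here.

```cpp
#include <cassert>
#include <cstddef>

struct Point { double x; double y; };   // a standard-layout example type

// Hypothetical descriptor entry: the name of a data member and its offset.
struct MemberDesc {
    const char* name;
    std::size_t offset;
};

const MemberDesc kPointMembers[] = {
    {"x", offsetof(Point, x)},
    {"y", offsetof(Point, y)},
};

// base address + offset of the ancestor sub-object + offset of the member
inline void* MemberAddress(void* object, std::size_t subObjectOffset,
                           const MemberDesc& member) {
    return static_cast<char*>(object) + subObjectOffset + member.offset;
}
```

For a member of the class itself the sub-object offset is 0, e.g. `MemberAddress(&p, 0, kPointMembers[1])` yields the address of `p.y`; for an inherited member the offset recorded in the list of ancestor classes would be passed instead.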

In practice, the index into the Property Descriptor Table is incremented (or decremented) when the iterator is incremented (or decremented). The iterator checks whether the next item in the Property Descriptor Table matches the given filtering options and keeps incrementing the index until an appropriate item or the end of the table is found. When the end of the Property Descriptor Table is reached, the iterator gets the Property Descriptor Table of the next ancestor class and continues searching for the next appropriate item.

Iterating Arrays

The array iterators are the simplest Property Iterators. They maintain an index into the array and calculate the address of an item by adding the index multiplied by the item size to the base address of the array. The number of elements can be determined with the help of the Property Descriptor of the array, and several array iterators exist for the different cases (fixed-size, variable-size or null-terminated arrays).

Iterating Containers

STL container iterators are the trickiest part of the Property Library. The basic task is simple: the Property Iterator has to store an iterator of the container, and it has to make it possible to insert new elements into the container. The problem is that Property Iterators know absolutely nothing about the STL container! (Remember that only Type Descriptors have any knowledge of the handled data types.)

There are 2 tricks for solving the problem:

  1. The Type Info Class of STL containers provides an extended interface. The member functions of this interface make it possible to get the number of elements, to insert elements, and to create and handle iterators.
  2. The Property Iterator stores an appropriate iterator pointing into the iterated container. The iterator is stored as an array of bytes, without any knowledge of their meaning. When the Property Iterator wants to use the STL iterator, it calls a function of the Type Info Record and passes the address of the memory storing the iterator. This technique is known as the ‘Type Destroyer is Type Restorer’ pattern.
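The two tricks above can be sketched together. ContainerOps, ContainerOpsFor and SumInts() are hypothetical names, and the 64-byte buffer and the fixed int element type are simplifications of this sketch only. The template knows the container type and erases it behind plain function pointers; the generic code stores the STL iterator as raw bytes and hands the buffer back to those functions, which restore the type.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>
#include <vector>

// Hypothetical "extended interface": all functions take untyped pointers.
struct ContainerOps {
    std::size_t iterSize;
    void  (*begin)(void* container, void* iterStorage);
    bool  (*atEnd)(void* container, void* iterStorage);
    void  (*next)(void* iterStorage);
    void* (*deref)(void* iterStorage);
    void  (*destroy)(void* iterStorage);
};

// The only place where the container type C is known ("type restorer").
template <typename C>
struct ContainerOpsFor {
    typedef typename C::iterator Iter;
    static void  Begin(void* c, void* s) { new (s) Iter(static_cast<C*>(c)->begin()); }
    static bool  AtEnd(void* c, void* s) { return *static_cast<Iter*>(s) == static_cast<C*>(c)->end(); }
    static void  Next(void* s)    { ++*static_cast<Iter*>(s); }
    static void* Deref(void* s)   { return &**static_cast<Iter*>(s); }
    static void  Destroy(void* s) { static_cast<Iter*>(s)->~Iter(); }
    static const ContainerOps* Get() {
        static const ContainerOps ops =
            { sizeof(Iter), &Begin, &AtEnd, &Next, &Deref, &Destroy };
        return &ops;
    }
};

// Generic code: it never names the container or its iterator type.
// (Assumption of this sketch: the elements are int.)
inline int SumInts(void* container, const ContainerOps& ops) {
    alignas(std::max_align_t) unsigned char storage[64];  // the iterator as bytes
    assert(ops.iterSize <= sizeof(storage));
    int sum = 0;
    ops.begin(container, storage);
    while (!ops.atEnd(container, storage)) {
        sum += *static_cast<int*>(ops.deref(storage));
        ops.next(storage);
    }
    ops.destroy(storage);
    return sum;
}
```

The same SumInts() works for a std::vector<int> and a std::list<int>, given the matching `ContainerOpsFor<...>::Get()` record.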

Property Iterators have little to do when new elements are inserted: they just call the appropriate function of the Property Descriptor. New elements can be inserted into containers either through the Property Iterator or through the Property Descriptor of the container.

Services of Property Iterators

The Property Iterator has many member functions. Most of them are gate functions to the Type Info Record and Property Descriptor of the pointed property, and some of them are used to create a new iterator for iterating through the properties of the pointed property. Property Iterators also make it possible to select certain properties and to step over all properties not matching a given filtering condition.

Creating Property Iterators

The well-known Begin() and End() functions create a new Property Iterator for the pointed property. This way the branches of the hierarchical data structure can be opened. The Begin() and RBegin() functions create an iterator pointing to the first and the last property, respectively. The End() member function, however, creates an invalid iterator, which can be used in both cases. It simply creates an empty Property Iterator wrapper class (storing a NULL pointer).

All these functions have 3 different forms. The functions of the first set have no arguments. They create a new iterator and return it by value. This is the most convenient way, but unfortunately it is slow, because the function has to make a copy of the created iterator when it returns, and the copy constructor clones the internal representation of the iterator.

  rProp_Iterator_c SubIter = Iter.Begin();

The second function set initializes the iterator from its argument. The argument is the Property Iterator of the parent object. The iterator on which the Begin() function is called will point to the first property of the object pointed to by the iterator passed as the argument. (Please consider this Begin() function an initialization function, doing something similar to a constructor.)

  rProp_Iterator_c SubIter;
  SubIter.Begin(Iter);

This solution is less common, but it works faster than the first one. It may also cause some confusion.

The third function set is less interesting. It makes it possible to initialize the Property Iterator from a Property Descriptor item, which is useful when the root object is a container or an array.

Filtering

Filtering is an advanced feature of Property Iterators. Sometimes it is necessary to distinguish between properties. For example, some properties may be redundant, so it may not be necessary or possible to save all properties to a stream. In that case Property Iterators can filter the properties according to a given mask, which is compared with the flags stored in the Property Descriptors. For example, the flag rcProp_Strm is used to signal that a property should be saved to the stream.

All functions and constructors creating Property Iterator have an additional argument (with a default value of 0) for defining the filtering condition. The default value means, that the filter flags of the parent iterator will be used.

Property Library does not define all fields of the filtering flags. Applications may introduce their own filtering flags for customizing the filtering feature.

The filtering flags are taken into account when the Property Iterator is incremented or decremented. If the next property does not match the filtering flags, the iterator is moved further until an appropriate property or the end of the properties is found.
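A minimal sketch of such a filtering step is shown below. The flag constants, PropDesc and FilteredIter are hypothetical, and the matching rule (every bit of the mask must be set in the property’s flags) is an assumption of this sketch, not a statement about the exact rule used by Oops.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical flag values; Oops defines its own (e.g. rcProp_Strm).
const unsigned kFlagStream = 1u << 0;
const unsigned kFlagEdit   = 1u << 1;

struct PropDesc {
    const char* name;
    unsigned flags;
};

class FilteredIter {
    const PropDesc* table_;
    std::size_t count_;
    std::size_t index_;
    unsigned mask_;

    // Assumed rule: a property matches when it has every bit of the mask.
    bool Matches(std::size_t i) const { return (table_[i].flags & mask_) == mask_; }
    void SkipNonMatching() {
        while (index_ < count_ && !Matches(index_)) ++index_;
    }
public:
    FilteredIter(const PropDesc* table, std::size_t count, unsigned mask)
        : table_(table), count_(count), index_(0), mask_(mask) {
        SkipNonMatching();               // the first position must match as well
    }
    bool IsEnd() const { return index_ >= count_; }
    const PropDesc& Current() const { assert(!IsEnd()); return table_[index_]; }
    void Next() {
        if (!IsEnd()) { ++index_; SkipNonMatching(); }
    }
};
```

With the mask kFlagStream, an iterator over a table whose entries carry the flags {Edit, Stream, none, Stream|Edit} visits only the second and the fourth entry.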

Accessing the Pointed Property

Member functions of Property Iterators provide access to all information stored by the RTTI system, like the name, type and size of the pointed property, or its classification (container, pointer, Property Class). It is also possible to set and get the value of the property. If the property is extendable, new elements can be inserted through the Property Iterator as well.

Some Special Cases

Template Classes

Template classes are handled in the same way as normal classes. The only difference comes from the fact that every template instance is a new type; therefore every template instance has to have its own type description.

Standard containers are handled in the same way. Every instance of an STL container gets its own Type Info Record and Property Descriptor.

Type Definition

In C++ you can introduce new types with the ‘typedef’ keyword. These are not really new types just alias names for other types. (For example the compiler uses the original type in the error message instead of the defined type name.) However type definition is very useful.

Fortunately the Property Description is also transparent to type definitions, because the global rProp_GetTypeInfo() function returns the Type Info Record. The argument of this function is a typed pointer. As long as the compiler is able to convert a variable to one of the types having a type description, this function will return the address of an appropriate Type Descriptor. (This may lead to problems: automatic type conversions make it possible to find a type descriptor even if the given type does not have a type description at all.)
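Both effects can be reproduced with ordinary overload resolution. In this sketch GetTypeInfo() and TypeInfo are simplified, hypothetical stand-ins for rProp_GetTypeInfo() and the Type Info Record; the derived-to-base pointer conversion is used as one example of an automatic conversion that silently selects a descriptor.

```cpp
#include <cassert>

struct TypeInfo { const char* name; };

// One overload per described type, selected by the static type of the pointer.
inline const TypeInfo* GetTypeInfo(const int*) {
    static const TypeInfo ti = {"int"};
    return &ti;
}
inline const TypeInfo* GetTypeInfo(const double*) {
    static const TypeInfo ti = {"double"};
    return &ti;
}

struct Shape { int id; };
struct Circle : Shape { double radius; };   // deliberately has NO overload of its own

inline const TypeInfo* GetTypeInfo(const Shape*) {
    static const TypeInfo ti = {"Shape"};
    return &ti;
}

typedef int Celsius;   // an alias, not a new type
```

A `Celsius` variable resolves to the descriptor of int, which is the desired transparency. A `Circle` object, however, resolves to the descriptor of Shape, because `Circle*` converts implicitly to `const Shape*`; the properties added by Circle would be invisible without any compiler diagnostic.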

Enumeration

Enumerations can be handled as integer numbers. This is a simple solution and does not require any explanation. It works similarly to type definitions.

However, it would be much better if we could use the names of the enumeration values instead of their numeric values. It would also be helpful to avoid out-of-range values. There are good reasons for using enumerations, and Oops should make it possible to use them correctly.

For example, a date record should look like “day=22; month=May; year=2003” instead of plain numbers (“day=22; month=5; year=2003”). Please note that the numeric form is ambiguous: depending on how the months are numbered, either 5 or 4 may mean May.

The solution is very simple from the Property Library’s point of view. A new Type Info Class has to be written for enumerations, which can convert the enumeration values between their binary and string representations. The SetValStr() and GetValStr() functions have to be able to do the conversion.

Oops provides a template Type Info Class for handling enumerations. The class stores a map of the value and string pairs, and the SetValStr() and GetValStr() functions use this map for doing the conversion. The constructor initializes the map.

Only one problem is left. C++ compilers do not keep any information about enumeration names at run time; therefore the program has to provide this information when the Type Info Record is created. Programmers generally write a function returning the string representation of an enumeration value, and they use this function when an enumeration value is printed or displayed. Similarly, the Oops Library uses such a function for initializing the map. It would also be possible to provide a string containing all enumeration values, but the conversion function has some advantages when the enumeration values do not form a simple series of numbers, i.e. when there are holes (invalid values) between them.
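The approach can be sketched as follows. EnumTypeInfo, Level and LevelName() are hypothetical, the signatures of GetValStr() and SetValStr() are simplified assumptions, and scanning a value range with the name function is just one possible way of filling the map. The enumeration has a hole on purpose.

```cpp
#include <cassert>
#include <map>
#include <string>

enum Level { Low = 1, Medium = 2, High = 10 };   // note the hole between 2 and 10

// User-supplied name function; returns a null pointer for invalid values.
inline const char* LevelName(int value) {
    switch (value) {
        case Low:    return "Low";
        case Medium: return "Medium";
        case High:   return "High";
        default:     return nullptr;
    }
}

// Hypothetical, simplified Type Info Class for enumerations.
template <typename E>
class EnumTypeInfo {
    std::map<int, std::string> names_;    // value -> name
    std::map<std::string, int> values_;   // name  -> value
public:
    // The constructor fills the maps by asking the name function about
    // every value in [first, last]; values without a name are skipped.
    EnumTypeInfo(const char* (*nameOf)(int), int first, int last) {
        for (int v = first; v <= last; ++v) {
            if (const char* n = nameOf(v)) {
                names_[v] = n;
                values_[n] = v;
            }
        }
    }
    bool GetValStr(E value, std::string& out) const {
        std::map<int, std::string>::const_iterator it = names_.find(static_cast<int>(value));
        if (it == names_.end()) return false;      // out-of-range value
        out = it->second;
        return true;
    }
    bool SetValStr(E& target, const std::string& text) const {
        std::map<std::string, int>::const_iterator it = values_.find(text);
        if (it == values_.end()) return false;     // unknown name: target unchanged
        target = static_cast<E>(it->second);
        return true;
    }
};
```

Constructed as `EnumTypeInfo<Level> info(&LevelName, 0, 16)`, the object converts High to "High" and "Medium" back to Medium, while unknown names and values falling into the hole are rejected instead of being stored.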

How to Use Oops

The implementation of the Oops Library is complicated and not easy to understand in detail. However, it is quite easy to use. Using Oops takes two steps:

It is easy to define the properties and type descriptors with the provided macros, and these macros also make it possible to remove the Property Interface in one application and use it in another one.

On the other hand, Property Iterators make it easy to use classes and objects having a property interface. For example, the following simple recursive function lists properties to the standard output:

void PrintProps(rProp_Iterator_c &arPropIter, int aIndent)
{
  rPropStr_t Buf(1024,256,"");
  rPropStr_t NameBuf(1024,256,"");
  // Iterate through all properties.
  printf( "{\n" );
  ++aIndent;
  while( !arPropIter.IsEnd() ) {
    Buf = ""; NameBuf = "";
    arPropIter.GetValStr( Buf );
    arPropIter.PropName( NameBuf );
    Indent( aIndent );
    printf( "%s(%d) %s = %s",
      arPropIter.TypeName(), arPropIter.TypeId(),
      NameBuf.SafeCStr(), Buf.SafeCStr()
    );
    if( arPropIter.IsCntnr() ) {
      printf( " " );
      rProp_Iterator_c iP;
      iP.Begin( arPropIter ); // Open branch.
      PrintProps( iP, aIndent );
    } else {
      printf( "\n" );
    } //if ... else
    arPropIter.Next();
  } //while
  --aIndent;
  Indent( aIndent );
  printf( "}\n" );
} //PrintProps()

Creating Type Info of Base Types

The users of the Oops Library rarely need to write Type Info Classes. The library implements them for all built-in types and strings. It also provides templates for STL containers and enumerations.

However, it is not difficult to write a new Type Info Class if your application has types that are better described as Base Types. The Type Info Classes provided by Oops are a good starting point. Just pick one of them and modify the member functions as needed for handling your data type. There are a few functions for converting data to string and binary formats; writing these functions is 90% of the job.

Adding Property Interface to Classes

Adding a property interface to your classes is very simple. You basically have to tell which classes should have a Property Interface and which members should be properties.

Using the appropriate macros the Type Info Record and the Property Descriptor Table of the class have to be declared and implemented. Oops Library has a lot of example programs and tutorials showing how to do it in different situations.

Using macros is not nice and may be confusing, because people do not see what the macros are actually doing. There are two reasons for using macros to define the property description:

However, you do not have to use the macros. All the code can be written by hand if you wish. Expanding the macros by hand is also a good way of debugging the property description.

Conclusion

The problem of persistency has existed ever since objects and data records were invented. Consequently, many solutions exist for saving and loading objects. The authors consider Oops one of the best of them, and to their knowledge it is the only one providing property-based persistency. Oops itself is not a new solution either: its development started in 1996, and the first application using property-based persistency (Speech Corrector, http://www.rcs.hu/boxoftricks) has been on the market for several years.

The competitor libraries generally use different solutions. They can be divided into 2 groups:

In the first case objects are saved as a block of binary data without knowing what it contains. When the raw binary data is loaded back, some pointers (such as the virtual method table pointer) have to be updated. There are several tricky solutions for doing that, but even if they work they are not elegant. Moreover, the data stream will be neither robust nor human readable.

The second case is somewhat similar to the solution described here. It also requires some kind of run-time type identification and functions for converting the objects to the format of the stream. The libraries provide little help to the programmer, who has to write one or two functions for reading and writing the object to the stream. The programmer could do anything in these functions, but in most cases the variables are simply written to the stream. It is the responsibility of the programmer to read and write the same variables in the same order. The stream might be human readable, but it is never robust.

Property-based persistency as implemented by Oops is somewhat similar to the second solution. The conversion functions of the Type Info Records have functionality similar to that of the stream object passed to the read/write functions. Defining the Property Description requires about the same amount of work from the programmer as implementing the read/write functions. You can consider Oops an automated, robust, modular and scalable implementation of that idea.

The Property Library handles the following features of C++ properly:

This list covers almost all features of the language; however, there are some restrictions:

The Oops Library has some important features not provided by any of the competitors:

Glossary

Application Generator – Application Generators are program development tools making the program development process quick and easy. They provide a set of components and a nice graphical user interface, where someone can build an application by adding and configuring components. The users of these programs do not have to be programmers for building an application and these tools provide a higher abstraction level than the traditional programming languages.

Base Type – The RTTI system provides an immediate description for base types. They are the leaves of the property tree, and their internal structure cannot be investigated through the Property Interface.

Class – this word is used for a type introduced by a C++ class. The word ‘class’ is used for the type, while the word ‘object’ is used for its instances.

Compound Types – C++ class or structure. The type is represented as a collection of members called properties.

Container Type – There are special types and classes in C++ designed for storing objects of different types. They are called containers. Static or dynamic arrays, the standard vector, list, map, and set are all containers.

Object – this word is used for an instance of C++ class or other types. The word ‘class’ is used for the type, while the word ‘object’ is used for its instances.

Property – is a common word. Here a ‘property’ is a member of a class or an item of a container made visible by Oops. All or just a few members (both member functions and data members) of a class may be properties. Any application can access properties without knowing the class itself.

Property Descriptor – is an object describing a property. It stores information like the offset of the data member, the address of the Type Info Record describing the type of the property, and the name of the property.

Property Descriptor Table – is an array of Property Descriptors describing the properties of a class. Every compound type has its own ‘Property Descriptor Table’. It is related to Type Information stored by Oops.

Property Iterator – acts like a pointer to a property. It can be incremented or decremented for getting the next or the previous property. Property Iterators give access to the property and make it possible to go into the details of the pointed property by creating a new Property Iterator.

Run Time Type Identification (RTTI) – is a common term for type information available at run time. Generally programming languages provide access to type information at compile time only, but sometimes this information is required at run time. Persistency is the best example of an RTTI application.

Type Information – is a collection of the properties of types. Generally it contains a string and a binary identifier of the type, some classification (base type, container, compound type), the size of the object, and a way to create and destroy objects of the given type.

Type Info Record – is the instance of the type descriptor of a given type. Every type must have its own Type Info Record. The ‘Type Info Record’ is given by the Oops library, or written by the user, or created automatically when the properties of a class are defined.

References

  1. Bjarne Stroustrup: The C++ Programming Language Special Edition, AT&T, 2000.
  2. Paul Jakubik: Callback Implementations in C++, http://www.primenet.com/~jakubik/callback.html
  3. Vladimir Batov: Persistency Made Easy, http://www.adtmag.com/joop/crarticle.asp?ID=849, Aug 12, 2002.
  4. Peter Barczikay, Andras Tantos: Advanced Run Time Type Identification in C++, Part I: Requirements, http://www.rcs.hu/Articles/RTTI_Part1.htm, May 3, 2003