Thursday, May 28, 2015

C++ inheritance explained (part III)

This is the third part in the series. If you haven't read them yet, have a look at part I and part II.

This part is limited to a single but rather complicated inheritance feature of C++.

Virtual inheritance

Now we've come to one of the most complicated features: virtual inheritance.

Let's start, as usual, with the most trivial example:

class Base
{
public:
  int m_base_member;
  void set_base_member(int x)
    {
      m_base_member = x;
    }
};

class Derived : public virtual Base
{
public:
  int m_derived_member;
  void set_derived_member(int x)
    {
      m_derived_member = x;
    }
};
This time I've added implementations to the methods, since they matter now. The Base class translates to this (nothing special here):

struct Base
{
  int m_base_member;
};

void Base_set_base_member(Base *_this, int x)
{
  _this->m_base_member = x;
}
Now let's see the layout for Derived:

struct Derived_SubObject
{
  int m_derived_member;
};
struct Derived
{
  void *_vtable;
  Base _parent;
  Derived_SubObject _derived_part;
};
One thing to note is that the Derived class has a VTable, which could have been guessed. As you can see, I've split the Derived-specific members into a separate struct and placed it inside Derived instead of inlining the members. I did this to make the code of the set_derived_member() method easier to explain; it looks like this:

void Derived_set_derived_member(Derived *_this, int x)
{
  Derived_SubObject *_derived_part = _get_sub_object(_this, Derived_part);
  _derived_part->m_derived_member = x;
}
The important part here is that the implementation of set_derived_member() makes no assumptions about the layout of the Derived object passed in, except for the VTable, which is expected to be at the start. The Derived sub-part can be located anywhere in the object; the method always looks it up via the VTable. This has two implications:
  1. a negative performance impact, because instead of accessing a variable at a fixed offset in the object, a VTable lookup is performed
  2. with multiple inheritance, duplication of the common base can be avoided in the diamond problem (see the diamond example that follows)
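
To make the lookup less abstract, here is a minimal sketch of what _get_sub_object might boil down to. It is an emulation built on assumptions, not what any particular compiler emits: the PartOffsets table, the DerivedLayout struct and the helper names are made up for this example, and real ABIs store such offsets in their own way.

```cpp
#include <cassert>
#include <cstddef>

struct Base { int m_base_member; };
struct Derived_SubObject { int m_derived_member; };

// Hypothetical stand-in for the VTable: it only records where each part lives.
struct PartOffsets {
    std::size_t base_part;     // bytes from object start to the Base part
    std::size_t derived_part;  // bytes from object start to the Derived part
};

// One possible layout; only the position of _vtable is relied upon.
struct DerivedLayout {
    const PartOffsets *_vtable;
    Base _parent;
    Derived_SubObject _derived_part;
};

const PartOffsets derived_offsets = {
    offsetof(DerivedLayout, _parent),
    offsetof(DerivedLayout, _derived_part),
};

// Emulated _get_sub_object(_this, Derived_part): read the offset, then add it.
Derived_SubObject *get_derived_part(DerivedLayout *_this) {
    char *start = reinterpret_cast<char *>(_this);
    return reinterpret_cast<Derived_SubObject *>(start + _this->_vtable->derived_part);
}

void Derived_set_derived_member(DerivedLayout *_this, int x) {
    Derived_SubObject *_derived_part = get_derived_part(_this);
    _derived_part->m_derived_member = x;
}

// Sets the member through the lookup and reads it back directly.
int set_and_read_back(int x) {
    DerivedLayout d = { &derived_offsets, {0}, {0} };
    Derived_set_derived_member(&d, x);
    assert(d._parent.m_base_member == 0);  // the Base part is left untouched
    return d._derived_part.m_derived_member;
}
```

The extra load of the offset before every member access is the cost mentioned in the first point; a class without virtual bases would use a compile-time constant instead.
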
class Base
{
public:
  int m_base_int;
};

class Derived1 : public virtual Base
{
public:
  int m_derived1_int;
  void foo(int x) { m_base_int = x; }
};

class Derived2 : public virtual Base
{
public:
  bool m_derived2_bool;
  void bar(int y) { m_base_int = y; }
};

class DerivedMultiple : public Derived1, public Derived2
{
public:
  bool m_multiple_bool;
};


/* DerivedMultiple object; */
Derived1 *d1 = &object;
d1->m_base_int = 3;
d1->foo(6);
Derived2 *d2 = &object;
d2->m_base_int = 4;
d2->bar(2);
Base *b = d2;
b->m_base_int = 0;
The resulting structs and functions are:

struct Base
{
  int m_base_int;
};

struct Derived1_SubObject
{
  int m_derived1_int;
};
struct Derived1
{
  void *_vtable;
  Base _parent;
  Derived1_SubObject _derived1_part;
};

void Derived1_foo(Derived1 *_this, int x)
{
  Base *base = _get_sub_object(_this, Base_part);
  base->m_base_int = x;
}

struct Derived2_SubObject
{
  bool m_derived2_bool;
};
struct Derived2
{
  void *_vtable;
  Base _parent;
  Derived2_SubObject _derived2_part;
};

void Derived2_bar(Derived2 *_this, int y)
{
  Base *base = _get_sub_object(_this, Base_part);
  base->m_base_int = y;
}

struct DerivedMultiple_SubObject
{
  bool m_multiple_bool;
};
struct DerivedMultiple
{
  void *_vtable;
  Base _parent;
  Derived1_SubObject _derived1_part;
  Derived2_SubObject _derived2_part;
  DerivedMultiple_SubObject _derivedmultiple_part;
};
As you can see, the final layout of DerivedMultiple contains only one Base part. The methods Derived1::foo() and Derived2::bar() have identical code; all they require is a VTable at the start of _this. Because DerivedMultiple satisfies this requirement, it can be passed directly to either of the two.
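
The idea that one function body serves several layouts can be tried out with a small emulation. This is only a sketch under assumptions: PartOffsets, AloneLayout and DiamondLayout are hypothetical names, the offset table stands in for whatever a compiler really keeps in the VTable, and the Base part is deliberately moved to the end of the diamond layout to make the point visible.

```cpp
#include <cassert>
#include <cstddef>

struct Base { int m_base_int; };

// Hypothetical stand-in for the VTable: just the offset of the Base part.
struct PartOffsets { std::size_t base_part; };

// Derived1 on its own: Base part right after the VTable pointer.
struct AloneLayout {
    const PartOffsets *_vtable;
    Base _parent;
    int m_derived1_int;
};

// A DerivedMultiple-like layout with the Base part placed last on purpose.
struct DiamondLayout {
    const PartOffsets *_vtable;
    int m_derived1_int;
    bool m_derived2_bool;
    bool m_multiple_bool;
    Base _parent;
};

const PartOffsets alone_offsets = { offsetof(AloneLayout, _parent) };
const PartOffsets diamond_offsets = { offsetof(DiamondLayout, _parent) };

// Emulated Derived1::foo(): assumes nothing but a VTable pointer at offset 0.
void Derived1_foo(void *_this, int x) {
    const PartOffsets *vt = *static_cast<const PartOffsets **>(_this);
    Base *base = reinterpret_cast<Base *>(static_cast<char *>(_this) + vt->base_part);
    base->m_base_int = x;
}

// Returns true if the same function updated the Base part in both layouts.
bool foo_works_on_both_layouts() {
    AloneLayout a = { &alone_offsets, {0}, 0 };
    DiamondLayout d = { &diamond_offsets, 0, false, false, {0} };
    assert(alone_offsets.base_part != diamond_offsets.base_part);  // layouts differ
    Derived1_foo(&a, 6);
    Derived1_foo(&d, 7);
    return a._parent.m_base_int == 6 && d._parent.m_base_int == 7;
}
```
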
Let's analyze the code part step by step:

/* DerivedMultiple object; */
Derived1 *d1 = &object;
_get_sub_object(d1, Base_part)->m_base_int = 3;   // d1->m_base_int = 3;
Derived1_foo(d1, 6);           // d1->foo(6);
When doing the pointer assignment, the compiler (in this simplified model) does not need to adjust the pointer, since all that is needed is a VTable at the start. When accessing anything from Base, a VTable lookup is done to obtain a pointer to the Base part inside the object. The foo() call is trivial.
The code where we use Derived2 is pretty much the same:

Derived2 *d2 = &object;
_get_sub_object(d2, Base_part)->m_base_int = 4;    //d2->m_base_int = 4;
Derived2_bar(d2, 2);   //d2->bar(2);
Last, let's cast to Base and use that:
Base *b = _get_sub_object(d2, Base_part);
b->m_base_int = 0;
The Base pointer is obtained the same way as inside the implementations of foo() and bar(). The important thing to note here is that making Base into a virtual class (one with virtual methods) would not change that, except that the pointer to the VTable could then be reused.
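
The walkthrough can be cross-checked against a real compiler. Below is a minimal sketch that reuses the classes from this post; the helper name diamond_shares_one_base is made up for the example. A real compiler may adjust the pointer value when converting to Derived2 *, which the simplified model here glosses over, but the observable behaviour (a single shared m_base_int) should be the same.

```cpp
#include <cassert>

class Base {
public:
    int m_base_int;
};

class Derived1 : public virtual Base {
public:
    int m_derived1_int;
    void foo(int x) { m_base_int = x; }
};

class Derived2 : public virtual Base {
public:
    bool m_derived2_bool;
    void bar(int y) { m_base_int = y; }
};

class DerivedMultiple : public Derived1, public Derived2 {
public:
    bool m_multiple_bool;
};

// Returns true if both paths through the diamond reach the same Base sub-object.
bool diamond_shares_one_base() {
    DerivedMultiple object;
    Derived1 *d1 = &object;
    Derived2 *d2 = &object;

    d1->foo(6);
    assert(d2->m_base_int == 6);   // written via Derived1, read via Derived2

    d2->bar(2);
    assert(d1->m_base_int == 2);   // and the other way round

    Base *from_d1 = d1;            // implicit conversion to the virtual base
    Base *from_d2 = d2;
    from_d2->m_base_int = 0;
    return from_d1 == from_d2 && object.m_base_int == 0;
}
```

Note that object.m_base_int compiles without any qualification; with non-virtual inheritance in Derived1 and Derived2 that access would be ambiguous.
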

This leads to a few conclusions:

  • Virtual inheritance has a negative performance impact because of the regular sub-object lookups
  • With multiple inheritance and a diamond hierarchy, two copies of the common base class can be avoided
  • The internal layout of a class is unpredictable; the compiler can rearrange sub-objects
  • Virtual inheritance may not solve the problem of a duplicated base class; this can happen in a complex mix of classes where some do not use virtual inheritance, so their layout has to be preserved
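
The last point is easy to reproduce. Here is a small sketch, assuming a hierarchy where only one of the two intermediate classes inherits virtually; the class names VirtualChild, PlainChild and Mixed are invented for the illustration.

```cpp
#include <cassert>

class Base {
public:
    int m_base_int;
};

class VirtualChild : public virtual Base {};  // refers to a shared Base
class PlainChild : public Base {};            // embeds its own Base

class Mixed : public VirtualChild, public PlainChild {};

// Returns true if Mixed ends up with two distinct Base sub-objects.
bool mixed_hierarchy_has_two_bases() {
    Mixed m;
    // Plain `m.m_base_int` would be ambiguous here, so each path is named.
    Base *via_virtual = static_cast<VirtualChild *>(&m);
    Base *via_plain = static_cast<PlainChild *>(&m);

    via_virtual->m_base_int = 1;
    via_plain->m_base_int = 2;
    assert(via_virtual->m_base_int == 1);  // not overwritten by the second write
    return via_virtual != via_plain;
}
```

Only the classes that say "virtual" take part in the sharing; PlainChild keeps its own embedded Base, so the duplicate remains.
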