Vtable Generation

Vtables are data structures generated by the compiler that hold pointers to a set of callable methods of an object. They are used by dynamic dispatch and templates.

Typing Systems

Structural Typing allows any class to be cast to an interface, as long as the API of the class matches the requirements of the interface. The alternative would be a ‘nominative’ system, where the class would have to list the interfaces it implements when it is defined.

In languages with a ‘nominative’ system, such as C++, the compiler simply generates an array of function pointers for the class, with each interface being an offset within the array (some function pointers could appear multiple times). This is possible because the compiler knows ahead-of-time that only a very limited set of casts can be performed, and so generates a vtable to cater only to these casts.

In contrast, a structural typing system allows a very large number of casts, which the compiler cannot necessarily forsee when it is generating the vtable (i.e. casts can occur in other modules).

Casting from a class to an interface

Every interface reference must originate from a reference to a class. The interface reference may then be cast any number of times to other interface references, i.e.:

Class -> Interface -> Interface -> Interface -> ...
             |
             \
              -> Interface -> Interface -> Interface -> ...
                                  |
                                  \
                                   -> Interface -> ...

Loci provides structural typing for all types, including primitives, so it would be costly to generate a vtable for every class.

Instead we generate a vtable whenever a class is cast to an interface. The vtable only needs to contain the methods specified by the interface, which is important for reducing vtable collisions (see below).

Note

C++ generates vtables for classes with virtual methods, which restricts only those classes to having polymorphic behaviour.

Casting from interfaces to other interfaces

While the compiler can generate a vtable for the initial cast from a class to an interface, it is more difficult to handle casts from interfaces to other (less restrictive) interfaces. To see the difficulty, consider:

interface RunnableType {
        void run();
}

interface WalkableType {
        void walk();
}

interface RunnableAndWalkableType {
        void run();
        void walk();
}

RunnableType& castRunnable(RunnableAndWalkableType& value) {
        return value;
}

WalkableType& castWalkable(RunnableAndWalkableType& value) {
        return value;
}

The reference passed to castRunnable() or castWalkable() will originally have been a class, but at some point, perhaps in another module, was cast from the class to the RunnableAndWalkableType interface and a vtable was generated.

The two functions allow casts to two different interface types. If we were going to implement an array of function pointers, we’d have to adjust the reference in some way to point to different offsets in the vtable.

However in general this isn’t feasible, because the vtable would need to be very large, to handle each interface to which it might be cast.

  • Vtable of 1 method - Needs 1 slot: [{1st}]
  • Vtable of 2 methods - Needs 2 slots: [{1st, 2nd}]
  • Vtable of 3 methods - Needs 5 slots: [{1st, 2nd, 3rd}, {1st, 3rd}]
  • Vtable of 4 methods - Needs 14 slots: [{1st, 2nd, 3rd, 4th}, {1st, 2nd, 4th, {1st, 4th}}, {1st, 3rd, 4th, {1st, 4th}}]
  • Vtable of 5 methods - Needs 47 slots: [{1st, 2nd, 3rd, 4th, 5th}, {1st, 3rd, 4th, 5th, ...}, ...]
  • Vtable of 6 methods - Needs 194 slots: [...]
  • Vtable of 7 methods - Needs 977 slots: [...]
  • Vtable of 8 methods - Needs 5870 slots: [...]

For any vtable of n methods, the number of slots required is determined by:

\[\begin{split}gaps(n) &= n - 2 \\ slots(1) &= 1 \\ slots(n) &= n + gaps(n) * slots(n-1) \\\end{split}\]

\(gaps(n)\) means the number of ways a single gap can be inserted into the complete array of function pointers. So a vtable of 3 methods could be cast to an interface with only the 1st and 3rd methods, so a copy must be generated for that combination. This is necessary because calls via the 2 method interface will expect offsets 0 and 1 for the two methods.

Hash Tables

Clearly a simple array of function pointers is not scalable to even small numbers of methods. Arrays of function pointers only work for nominative systems because the number of allowed casts is much smaller than the number of allowed casts in a structural typing system.

Note

While nominative systems can use arrays of function pointers, they can still generate large vtables due to inserting many copies of a function pointer, particularly where an interface is specified to extend many other interfaces.

The problem with the array of function pointers approach is that each dynamic dispatch call would expect to see its methods as a contiguous block of function pointers in the vtable. Hence a vtable with many methods must have lots of copies of the same function pointers to handle gaps.

To resolve this issue Loci uses hash tables for vtables, with each dynamic dispatch call using a hash value (based on the method name) to index into the vtable. For example, the vtable could look like:

[_, _, 3rd, _, 1st, _, _, 2nd, ... ]

A cast from one interface to another is now a NO-OP, since they all refer to the same vtable. The vtable does have some empty slots but it has a fixed size so there are no scalability issues.

Hash Table Collisions

An obvious problem with using hash tables is that two methods may hash to the same slot. This is handled by passing a ‘hidden parameter’, a much larger (64-bit) hash of the method name, to the method being called.

If there is no hash table collision, which should be the common case, then the hidden parameter is simply ignored and hence the only overhead is setting a register (usually negligible when compared to the cost of the call).

If there is a collision then the compiler will emit a ‘conflict resolution stub’, which is a short sequence of code that checks the hidden parameter and jumps to the appropriate method.

See the dynamic dispatch doc for more detail.

Vtable Pointer

A related issue for vtables is how to access them for a given interface reference. Since primitive types can be used polymorphically, we can’t store the vtable pointer inside the object (as C++ does).

Instead the vtable pointer is stored inside the interface reference, so while a class reference is just a simply pointer, an interface reference becomes:

struct interface_ref {
        void* object_ptr;
        void* vtable_ptr;
        struct template_generator {
                void* root_fn;
                path_t path;
        };
};

notemplate

Note

Feature not yet implemented.

The template generator is needed here in case the original class type is templated. This overhead can be avoided by specifying the interface reference as notemplate:

void f(notemplate Interface& value);

Vtables in Templates

Vtables are used in templates to provide a way to call methods of the templated type. These vtables may contain both static and non-static methods; from an implementation perspective this just concerns whether or not the this pointer must be passed to the method.

Vtable Generation

Just as there is an initial class to interface cast that causes a vtable to be generated, the instantiation of a template variable with an object type causes a vtable to be generated. The equivalent graph to the above is therefore:

Class -> Template Variable -> Template Variable -> ...
             |
             \
              -> Template Variable -> Template Variable -> ...
                        |
                        \
                         -> Template Variable -> ...

Template variables are very similar to interface references, since dynamic dispatch is required on all calls to methods via template variables (unless the templated code is inlined), because templates are instantiated at run-time via template generators.

Cast from Template Variable to Interface

Any templated-type object could be cast to an interface reference:

template <typename T: RunnableType>
RunnableType& f(T& object) {
        return object;
}

Even though the full type of T is unknown, the object reference received by f() is just a pointer; the vtable for T is available via the template generator passed to f(). However, the return type of f() is a polymorphic reference, which itself contains a pointer to the vtable.

Hence the effect of f() is to take the vtable from its template generator and return a polymorphic reference containing the vtable pointer. So we have a new kind of cast:

Template Variable -> Interface

However in practice there is nothing significant about this cast since the vtable is already generated prior to this cast. In fact, this cast is almost equivalent to an Interface -> Interface cast.

Note

Since template vtables are obtained via the template generator, they don’t incur the same space overhead as vtables in interface references. It may therefore be preferable to use template type references in suitable situations.

Effect of Const

const modifies the method set of a type, so needs to be handled by vtables. We must be assured that vtables generated in an initial cast remain suitable when const is applied.

Cast from Non-const to Const

A cast from non-const to const is benign because the const interface cannot have more methods than the non-const interface. So if we’ve generated a vtable for a non-const type, the vtable must also work for a const type.

const RunnableType& f(RunnableType& object) {
        return object;
}

Remove const from template type

template <movable T>
interface ReplicatableType {
        T replicate() const;
}

class Replicator {
        Replicator replicate() const;
}

template <typename T>
require(T : ReplicatableType<const<false> T> and movable<const<false> T>)
const<false> T f(T& object) {
        return object.replicate();
}

Replicator g(const Replicator& replicator) {
        return f<const Replicator>(replicator);
}

This code attempts to call the replicate() method of the Replicator class, for which it is using const<false>(T) to get a non-const type equivalent of T.

If it worked, this code would break vtables since the compiler is only required to generate a vtable for const Replicator. In particular, this means it doesn’t have to add the __move method, which may be called by f().

Fortunately the code is broken because const is cumulative, so const<false> T is actually equivalent to T. In fact const is also relative, so f() sees T as being non-const and hence it doesn’t make sense to ‘remove’ const from T.

The code can be fixed by passing the non-const type into f():

template <typename T>
require(T : ReplicatableType<T> and movable<T>)
T f(const T& object) {
        return object.replicate();
}

Replicator g(const Replicator& replicator) {
        return f<Replicator>(replicator);
}

In this case we generate a vtable for Replicator, which will contain the __move method.