6 Inheritance

Éric Tanter

In the presence of classes, we may want a mechanism similar to Delegation in order to be able to reuse and selectively refine existing classes. We therefore extend our object system with support for class inheritance. As we will see, many issues have to be dealt with. As usual, we will proceed step-by-step.

6.1 Class Hierarchy

We introduce the capability for a class to extend another class (called its superclass). We focus on single inheritance, where a class only extends a single class, as in Java or C#. As you surely know, some languages support multiple inheritance, such as C++ and Python.

What are the respective benefits and drawbacks of these different approaches to inheritance? What about other reuse and composition mechanisms such as Scala’s traits?

As a result, classes are organized in a hierarchy. The superclasses of a class (transitively) are called its ancestors; dually, the set of transitive subclasses of a class are called its descendants.

6.2 Method Lookup

When we send a message to an object, we look in its class for a method that implements the message, and then we apply it. The lookup part is defined in our CLASS macro definition as:

['-lookup
(let ([found (assoc (first args) methods)])
   (if found
       (cdr found)
       (error "message not understood:" (first args))))]

With inheritance, if an object is sent a message for which a method is not found in its class, we can look for a method in the superclass, instead of failing. Recursively, if the method is not found in the superclass, maybe it can be found in the super-superclass, etc. Until we reach "the end": we introduce a "Root" class at the top of the tree, in order to put an end to the method lookup process:

(define Root
  (λ (msg . args)
    (match msg
      ['-lookup (error "message not understood:" (first args))])))

Root is implemented directly as a dispatcher function, without using the CLASS form, so that we don’t need to specify its superclass (it has none). In case it is sent a -lookup message, it raises a message-not-understood error. Note that in this system, it is an error to send any message that is not -lookup to the root class.

The CLASS macro is extended to support the declaration of a superclass via an extends clause, and to delegate to the superclass the lookup of an unfound method.

Why is let-binding the superclass necessary?

(defmac (CLASS extends scls-expr
               ([field fname fval] ...)
               ([method mname (mparam ...) mbody ...] ...))
  #:keywords field method extends
  #:captures self ? !
  (let ([scls scls-expr]
        [methods
         (local [(defmac (? fd) #:captures self
                   (dict-ref (obj-values self) 'fd))
                 (defmac (! fd v) #:captures self
                   (dict-set! (obj-values self) 'fd v))]
           (list (cons 'mname (λ (self mparam ...) mbody ...)) ...))])
    (letrec ([class
       (λ (msg . args)
         (match msg
           ['-create
            (obj class (make-hash (list (cons 'fname fval) ...)))]
           ['-lookup
            (let ([found (assoc (first args) methods)])
              (if found
                  (cdr found)
                  (scls '-lookup (first args))))]))])
      class)))

Let us see an example of class inheritance at work, first with a very simple example:

(define A
  (CLASS extends Root ()
         ([method foo () "foo"]
          [method bar () "bar"])))
(define B
  (CLASS extends A ()
         ([method bar () "B bar"])))

> (define b (new B))
> (→ b foo)
"foo"
> (→ b bar)
"B bar"

This looks just fine: sending a message that is unknown in B works as expected, and sending bar also results in B’s refinement to be executed instead of A’s method. We say that method bar in B overrides the method of the same name defined in A.

Let us look at a slightly more complex example, building on our Counter class. In this chapter, we will progressively develop a ReactiveCounter, which extends Counter with the ability to trigger actions whenever the counter reaches certain values. But, let’s first try with an empty subclass:

(define Counter
(CLASS extends Root
  ([field count 0]
   [field step  1])
  ([method inc () (! count (+ (? count) (? step))) (? count)]
   [method dec () (! count (- (? count) (? step))) (? count)]
   [method reset () (! count 0)]
   [method step! (v) (! step v)]
   [method inc-by! (v) (→ self step! v) (→ self inc)])))
(define ReactiveCounter
  (CLASS extends Counter () ()))

> (define rc (new ReactiveCounter))
> (→ rc inc)
hash-ref: no value found for key
  key: 'count

What happened? It seems that we are not able to use field count of a reactive counter. Fair enough, we haven’t dealt at all about how fields have to be handled in the presence of inheritance.

6.3 Fields and Inheritance

Let us look again at how we handle object creation at the moment:

['-create
(obj class (make-hash (list (cons 'fname fval) ...)))]

That is it: we are only initializing the values dictionary for the fields that are declared in the current class! We ought to be initializing the dictionary with values for the fields of the ancestor classes as well.

6.3.1 Inheriting Fields

An object should have values for all the fields declared in any of its ancestors. Therefore, when we create a new class, we should determine all the fields that its instances will have. To do that, we have to extend classes such that they keep a list of all their fields, and are able to pass that information to any subclass.

(defmac (CLASS extends scls-expr
               ([field fname fval] ...)
               ([method mname (mparam ...) mbody ...] ...))
  #:keywords field method extends
  #:captures self ? !
  (let* ([scls scls-expr]
         [methods ....]
         [fields (append (scls '-all-fields)
                         (list (cons 'fname fval) ...))])
    (letrec
        ([class (λ (msg . args)
                  (match msg
                    ['-all-fields fields]
                    ['-create (obj class (make-hash fields))]
                    ....))]))))

Why are we suddenly using let*?

We introduce a new fields identifier in the lexical environment of the class. This identifier is bound to the complete list of fields that the class’ instances should have. All fields of the superclass are obtained by sending it the -all-fields message (whose implementation simply returns the list bound to fields). When creating an object, we just make a fresh dictionary with all the fields.

Because we have added a new message to the vocabulary of classes, we need to wonder what happens if this message is sent to the Root class: what are all the fields of that class? Well, it has to be the empty list, since we are using append blindly:

(define Root
  (λ (msg . args)
    (match msg
      ['-lookup     (error "message not understood:" (first args))]
      ['-all-fields '()])))

Let us see if this works:

> (define rc (new ReactiveCounter))
> (→ rc inc)
1
> (→ rc inc-by! 10)
11

Good! We can now define some more of the reactive counter:

(define ReactiveCounter
  (CLASS extends Counter
         ([field predicate (λ (n) #f)]
          [field action    (λ (n) #f)])
         ([method register (p a) (! predicate p) (! action a)]
          [method react () (when ((? predicate) (? count))
                             ((? action) (? count)))])))

> (define rc (new ReactiveCounter))
> (→ rc register even? (λ (v) (printf "reacting to ~a~n" v)))
> (→ rc react)
reacting to 0
> (→ rc inc)
1
> (→ rc react)
> (→ rc inc)
2
> (→ rc react)
reacting to 2

A reactive counter holds a predicate and an action. Both are functions of the current value of the counter, count. The react method implements the reactive logic: when the predicate matches the counter value, the action is triggered. The interactions below the class definition show the expected behavior: when react is called, if the counter value is even, then a message is printed, showing the value. We will see later how to automatically invoke react when the counter value is modified.

6.3.2 Binding of Fields

Actually, there is still one more issue that we haven’t considered: what happens if a subclass defines a field with a name that already exists in one of its ancestors?

(define A
(CLASS extends Root
        ([field x "A"])
        ([method ax () (? x)])))
(define B
  (CLASS extends A
         ([field x "B"])
         ([method bx () (? x)])))

> (define b (new B))
> (→ b ax)
"B"
> (→ b bx)
"B"

In both cases, we get the value bound to the x field of B. In other words, we have late binding of fields, exactly as we do for methods.

Is that what you expect? is it reasonable? What happens in Java? In JavaScript? In Python?

Let us see: an object is meant to encapsulate some (possibly mutable) state behind a proper procedural interface (methods). It is clear that late binding of methods is a desirable property, because methods are what makes an object’s external interface. What about fields? Fields are supposed to be hidden, internal state of the object—in other words, implementation details, not public interface. Actually, notice that in our language so far, it is not even possible to access a field of another object other than self! So at the very least, late binding of fields is doubtful.

Let us look at what happened with Delegation. How are fields handled there? Well, fields are just free variables of a function, so they are lexically scoped. This is a much more reasonable semantics for fields. When methods are defined in a class, they are defined in terms of the fields that are directly defined in that class, or one of its superclass. This makes sense, because all that is information known at the time one writes a class definition. Having late binding of fields means reintroducing dynamic scoping for all free variables of a method: an interesting source of errors and headaches! (Think of examples of subclasses mixing up their superclasses by accidentally introducing fields with existing names.)

Likewise, in languages that have visibility modifiers, are private methods late bound? Why (not)?

6.3.3 Field Shadowing

We now see how to define the semantics known as field shadowing, in which a field in a class shadows a field of the same name in a superclass, but a method always accesses a field as declared in the class or one of its ancestors.

Concretely, this means that an object can hold different values for fields of the same name; which one to use depends on the class in which the executing method is defined (this is know as the host class of the method). Because of this multiplicity, it is not possible to use a hash table anymore. Instead, we will keep in a class the list of the field names, and in the object, a vector of values, accessed by position. A field access will be done in two steps: first determining the position of the field according to the list of names, and then accessing the value in the vector held by the object.

For instance, for class A above, the list of names is '(x y) and the vector of values in a new instance of A is #(1 0). For class B, the list of names is '(x y x) and the vector of values in a new instance is #(1 0 1). The advantage of keeping the fields ordered this way is that, if not shadowed, a field is always at the same position within an object.

To respect the semantics of shadowing, we have (at least) two options. We can rename shadowed fields to a mangled name, for instance '(x0 y x) in B so that methods hosted in B and its descendants only see the latest definition of x, that is, the field introduced in B. An other alternative is to keep the field names unchanged, but to perform lookup starting from the end: in other words, we will want to find the last position of a field name in the list of names. We will go for the latter.

Defining find-last is left as an exercise to the reader

We update our definition of CLASS so as to introduce fields as a vector instead of a hashtable:

['-create
(let ([values (list->vector (map cdr fields))])
(obj class values))]

We obtain a vector (with the initial field values) and construct the object with it. Then, for field accesses, we must access the vector at the appropriate position, as returned by find-last:

(let* ([scls scls-expr]
       [fields (append (scls '-all-fields)
                       (list (cons 'fname fval) ...))]
       [methods
        (local [(defmac (? fd) #:captures self
                  (vector-ref (obj-values self)
                              (find-last 'fd fields)))
                (defmac (! fd v) #:captures self
                  (vector-set! (obj-values self)
                               (find-last 'fd fields)
                               v))]
          ....)]))

For this to work, the list of fields fields must be in scope of the method bodies, hence we moved its definition above that of methods. Recall that hygiene ensures that the fields identifier does not accidentally interfere with the (user-provided) method bodies.

This implementation is not so satisfactory because we call find-last (expensive/linear) for each and every field access. Can this be avoided? How?

Let us see if all this works as expected:

(define A
(CLASS extends Root
        ([field x "A"])
        ([method ax () (? x)])))
(define B
  (CLASS extends A
         ([field x "B"])
         ([method bx () (? x)])))

> (define b (new B))
> (→ b ax)
"A"
> (→ b bx)
"B"

Great!

6.4 Super Sends

When a method overrides a method in a superclass, it is sometimes useful to be able to invoke that definition. This allows many typical refinement patterns, such as adding something to do before or after a method is executed, like additional processing on its arguments and return values, among many others. This is achieved by doing what is called a super send.

Let us go back to our reactive counter example. We would like to actually make our counter reactive, in that it automatically reacts to each change in the counter value. We can do that by overriding inc in the ReactiveCounter subclass, and use a super send to reuse the original implementation.

Let us look at a first example (we use ↑ as the syntax for a super send).

The syntax of a super send is rather different from what you might be used to. For instance, in Java one writes super.m(), and here, (↑ m). The reason for this syntax will soon become clear.

(define ReactiveCounter
  (CLASS extends Counter
         ([field predicate (λ (n) #f)]
          [field action    (λ (n) #f)])
         ([method register (p a) (! predicate p) (! action a)]
          [method react () (when ((? predicate) (? count))
                             ((? action) (? count)))]
          [method inc ()
                  (let ([val (↑ inc)])
                    (→ self react)
                    val)])))

> (define rc (new ReactiveCounter))
> (→ rc register even? (λ (v) (printf "reacting to ~a~n" v)))
> (→ rc inc)
1
> (→ rc inc)
reacting to 2
2

Note how a super send allows us to reuse and extend the definition of inc in Counter in order to define the method in ReactiveCounter. What is the semantics of a super send?

A first thing to clarify is: what is the receiver of a super send? In the example above, to which object is inc sent using ↑? self! That’s right. In Java, the syntax super.m() suggests that super is an object, but it is not!

In Java, try returning super or passing super as an argument. What happens? Why?

The key lesson here is that a super send only really affects method lookup. So, in Java syntax, what is super in super.m() is rather the ., not the receiver. So this "super dot" is what we write ↑ here. The receiver is left implicit, because, as for field access in our language, the only valid receiver is self.

Beyond this syntactic rant, a common misunderstanding is that when doing a super send, method lookup starts from the superclass of the receiver, instead of its class. Let us see why this is incorrect in a small, artificial example:

(define A (CLASS extends Root ()
([method m () "A"])))
(define B (CLASS extends A ()
([method m () (string-append "B" (↑ m) "B")])))
(define C (CLASS extends B () ()))

(define c (new C))
(→ c m)

What does this program return? Let’s see. → expands into a send of -lookup to c’s class, which is C. There is no methods defined for m in C, so it sends -lookup to its superclass, B. B finds a method for m, and returns it. It is then applied to the current self (c), with no further arguments. The evaluation of the method implies the evaluation of the three arguments of string-append, the second of which is a super send. With the above definition of a super send, this means that m is looked up not in C (the actual class of the receiver) but in B (its superclass). Is there a method for m in B? Yes, and we are actually executing it.... In other words, with this understanding of super sends, the above program does NOT terminate.

What is the mistake? To consider that a super send implies looking up the method in the superclass of the receiver. In the example above, we should actually lookup m in A, not in B. To this end, we need to know the superclass of the host class of the method in which the super send occurs. Is this a value that should be bound statically in method bodies, or dynamically? Well, we’ve just said it: it is the superclass of the host class of the method, and that is not likely to change dynamically, at least in our language.

Some dynamic languages, like Ruby, allow the class inheritance relation to be changed at runtime. This is also common in prototype-based languages, like Self and JavaScript.

Luckily, we already have a binding in the lexical context of methods that refers to the superclass, scls. So we just need to introduce a new local macro for ↑, whose expansion asks the superclass scls to lookup the message. ↑ can be used by user code, so it is added to the list of #:captures identifiers:

(defmac (CLASS extends superclass
               ([field f init] ...)
               ([method m params body] ...))
  #:keywords field method extends
  #:captures self ? ! ↑
  (let* ([scls superclass]
         [fields (append (scls '-all-fields)
                         (list (cons 'f init) ...))]
         [methods
          (local [(defmac (? fd) ....)
                  (defmac (! fd v) ....)
                  (defmac (↑ md . args) #:captures self
                    ((scls '-lookup 'md) self . args))]
            ....)])))

Note how -lookup is now sent to the superclass of the host class of the currently-executing method, scls, instead of to the actual class of the current object.

(define A (CLASS extends Root ()
([method m () "A"])))
(define B (CLASS extends A ()
([method m () (string-append "B" (↑ m) "B")])))
(define C (CLASS extends B () ()))

> (define c (new C))
> (→ c m)
"BAB"

6.5 Inheritance and Initialization

We have previously seen how to address object initialization (Initialization), by introducing special methods called initializers. Once an object is created, and before it is returned to the creator, its initializer is called.

In the presence of inheritance, the process is a bit more subtle, because if initializers override each other, some necessary initialization work can be missed. The work of an initializer can be quite specific, and we want to avoid subclasses to have to deal with all the details. One can simply assume the normal semantics of method dispatch, whereby initialize in a subclass can call the super initializer if needed. This is the approach adopted by Python: if one does not call super () .__init__ (...) the initializer of the superclass(es) will not execute. The problem with this freedom is that the initializer in the subclass may start doing things with the object when inherited fields are not yet consistently initialized, or simply skip the initialization defined in the superclasses.

To avoid this issue, in Java, the first thing a constructor must do is to call a super constructor (it may have to compute arguments for that call, but that’s all it is allowed to do). Even if the call is not in the source code, the compiler adds it. Actually, this restriction is also enforced at the VM level by the bytecode verifier: low-level bytecode surgery can therefore not be used to avoid the super constructor call.

However, this protocol does not ensure by itself that uninitialized fields are never accessed. Can you think of how such a problem can occur? Think about the possibility to call any method from a constructor.

contents ← prev up next →

1	From Functions to Simple Objects
2	Looking for the Self
3	Benefits and Limits of Objects
4	Forwarding and Delegation
5	Classes
6	Inheritance
7	A World of Possibilities

6.1	Class Hierarchy
6.2	Method Lookup
6.3	Fields and Inheritance
6.4	Super Sends
6.5	Inheritance and Initialization