5 Classes

Éric Tanter

Let’s go back to the factory function (see Constructing Objects):

(define (make-counter [init-count 0] [init-step 1])
  (OBJECT
   ([field count init-count]
    [field step  init-step])
   ([method inc () (set! count (+ count step)) count]
    [method dec () (set! count (- count step)) count]
    [method reset () (set! count 0)]
    [method step! (v) (set! step v)]
    [method inc-by! (v) (→ self step! v) (→ self inc)])))

(define c1 (make-counter))
(define c2 (make-counter 10))

Each counter object gets its own version of the methods, even though they are the same. Well, at least their signatures and bodies are the same.

Are they completely the same, though?

Well, methods are functions, so as values, they are closures: function definitions together with their lexical environment. In particular, each method closes over the self of each object: i.e., self in a method in c1 refers to c1, while it refers to c2 in the method of c2. Likewise for fields, which are also let-bound variables in scope for the method closures. In other words, methods differ by the lexical environment they capture.
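To make the point concrete, here is a minimal sketch that does not rely on the OBJECT macro. The factory make-inc is a hypothetical, stripped-down stand-in for make-counter with a single method-like closure. Both closures run the same code, yet each one captures its own count.

```racket
#lang racket

;; Hypothetical stripped-down factory: each call builds a fresh closure
;; over its own `count` variable.
(define (make-inc init-count)
  (let ([count init-count])
    (λ () (set! count (+ count 1)) count)))

(define inc1 (make-inc 0))
(define inc2 (make-inc 10))

(inc1)            ; => 1, advances only the count captured by inc1
(inc1)            ; => 2
(inc2)            ; => 11, inc2 has its own, separate count
(eq? inc1 inc2)   ; => #f: same code, two distinct closures
```

The two closures differ only by the environment they capture, which is exactly the part we will want to turn into a parameter.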

5.1 Sharing Method Definitions

Instead of duplicating all method definitions just to be able to support different selves and field values, it makes much more sense to factor out the common part (the method bodies), and parameterize them by the variable part (the object bound to self).

Let us try first without macros. Recall that our definition of a plain counter object without macros is as follows (slightly simplified):

(define make-counter
  (λ ([init-count 0] [init-step 1])
    (letrec
        ([self
          (let ([count init-count]
                [step  init-step])
            (let ([methods
                   (list
                    (cons 'inc (λ () (set! count (+ count step))
                                     count))
                    (cons 'step! (λ (v) (set! step v)))
                    (cons 'inc-by! (λ (n) (self 'step! n)
                                          (self 'inc))))])
            (λ (msg . args)
              (let ([found (assoc msg methods)])
                (if found
                    (apply (cdr found) args)
                    (error "message not understood:" msg))))))])
    self)))

If we hoist the (let ([methods...])) out of the top-level λ (the factory), and parametrize them by self (as an additional first argument), we effectively achieve the sharing of method definitions at the factory level we are looking for:

(define make-counter
  (let ([methods
         (list
          (cons 'inc (λ (self) (set! count (+ count step)) count))
          (cons 'step! (λ (self v) (set! step v)))
          (cons 'inc-by! (λ (self n) (self 'step! n)
                                     (self 'inc))))])
    (λ ([init-count 0] [init-step 1])
      (letrec
          ([self
            (let ([count init-count]
                  [step  init-step])
              (λ (msg . args)
                (let ([found (assoc msg methods)])
                  (if found
                      (apply (cdr found) (cons self args))
                      (error "message not understood:" msg)))))])
        self))))

Note how each method takes an extra first parameter. In inc-by!, this allows us to invoke methods on self. Of course, when we apply the method in the dispatcher, we need to pass it the current self as an extra argument.

The problem is that field variables are now out of scope of method bodies. More concretely, here, it means that count and step are free in the method bodies. So we need to parameterize methods by state (field values) as well, in addition to self. But, fair enough, self can "hold" the state (it can capture field bindings in its lexical environment). We just need a way to extract (and potentially assign) field values through self.

5.2 Accessing Fields

If we only pass self as an extra argument to methods, we need a way to access an object’s fields by sending it messages (since self is an object, and an object is just a dispatcher function). To this end, we introduce a field access protocol consisting of two messages, -read and -write. The leading dash indicates that these messages are "meta" messages, not standard messages that need to be interpreted by a corresponding method.

These meta-messages take as first argument the name of the field to be accessed. The issue is then how to go from a field name (as a symbol) to actually reading/assigning the variable with the same name in the lexical environment of the object. A simple solution is to use a structure to hold field values. This is similar to the way we handle method definitions already: an association between method names and method definitions. However, unlike in a method table, field bindings are (at least potentially) mutable. Because pairs are immutable in Racket, an association list cannot be updated in place, so we will use a dictionary (more precisely, a mutable hash table), which is accessed with dict-ref and dict-set!.

(define make-counter
  (let ([methods
         (list
          (cons 'inc (λ (self)
                       (self '-write 'count
                             (+ (self '-read 'count) (self '-read 'step)))
                       (self '-read 'count)))
          (cons 'step! (λ (self v) (self '-write 'step v)))
          (cons 'inc-by! (λ (self n) (self 'step! n)
                                     (self 'inc))))])
    (λ ([init-count 0] [init-step 1])
      (letrec
          ([self
            (let ([fields (make-hash (list (cons 'count init-count)
                                           (cons 'step init-step)))])
              (λ (msg . args)
                (match msg
                  ['-read  (dict-ref  fields (first args))]
                  ['-write (dict-set! fields (first args) (second args))]
                  [_ (let ([found (assoc msg methods)])
                       (if found
                           (apply (cdr found) (cons self args))
                           (error "message not understood:" msg)))])))])
        self))))

> (let ([c (make-counter)])
    (c 'inc-by! 10))
10

Note how make-counter now holds the list of parametrized methods, and the created object captures a dictionary of fields, which is initialized prior to returning the object. In the method bodies, field accesses are now implemented by sending -read and -write meta-messages, interpreted accordingly in the dispatcher.

This interpretation of -write means that setting a non-existent field will add it to the object (because that’s how dict-set! works). This is the semantics adopted by Python, for instance. In contrast, in Java and Scala, setting a non-existent field is an error (detected statically thanks to typing). How would you modify the code above to raise an error when setting a non-existent field?
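As a starting point for exploring this question, here is a minimal sketch of the two behaviors on a bare dictionary. The helper checked-write! is hypothetical (it is not part of the code above), and it is only one of several possible designs.

```racket
#lang racket

;; Hypothetical helper: write to a field only if it was declared,
;; i.e. if it is already present in the dictionary.
(define (checked-write! fields name value)
  (if (dict-has-key? fields name)
      (dict-set! fields name value)
      (error "undeclared field:" name)))

(define fields (make-hash (list (cons 'count 0) (cons 'step 1))))

(dict-set! fields 'colour 'red)     ; permissive: silently adds a field
(checked-write! fields 'count 5)    ; declared field: updated
;; (checked-write! fields 'size 3)  ; undeclared field: raises an error
```

In the dispatcher, the -write clause would then call such a helper instead of calling dict-set! directly.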

5.3 Classes

While we did achieve the sharing of method definitions we were after, our solution is still not very satisfactory. Why? Well, observe the definition of an object (the body of the (λ (msg . args) ....) above). The logic that is implemented there is, again, repeated in all objects we create with make-counter: each object has its own copy of what to do when it is sent a -read message (lookup in the fields dictionary), a -write message (assign in the fields dictionary), or any other message (looking in the methods table and then applying the method).

So, all this logic could very well be shared amongst objects. The only free variables in the object body are fields and self. In other words, we could define an object as being just its self as well as its fields, and leave all the other logic to the make-counter function. In that case make-counter starts to have more than one responsibility: it is no longer only in charge of creating new objects, it is also in charge of handling accesses to fields and message handling. That is, make-counter is now evolving into what is called a class.

How are we going to represent a class? Well, for now it is just a function that we can apply (and it creates an object, an instance); if we need that function to have different behaviors, we can apply the same Object Pattern we saw at the beginning of this course. That is:

(define Point
  ....
  (λ (msg . args)
    (match msg
      ['-create create instance]
      ['-read read field]
      ['-write write field]
      ['-invoke invoke method])))

This pattern makes clear what the role of a class is: it produces objects, and invokes methods, reads and writes fields on its instances.

So, what is the role of an object now? Well, it is just to exist as a first-class value, know its class, and hold the values of its fields. It does not hold any behavior on its own, anymore. In other words, we can define an object as a plain data structure:

(struct obj (class values))

As simple as this! From now on, an object will just be such a structure.

It is the first time we use struct in these notes: it is a convenient Racket macro to define datatypes, which automatically generates a constructor (here, obj) and accessors for each of the fields of the structure (here, obj-class and obj-values).
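For readers new to struct, here is a small sketch of what the definition provides. The class slot holds a placeholder symbol here, purely for illustration, since we have not defined a class yet.

```racket
#lang racket

(struct obj (class values))

;; `obj` is the constructor; 'some-class is a placeholder, not a real class.
(define o (obj 'some-class (make-hash (list (cons 'count 0)))))

(obj? o)                           ; => #t, a predicate is generated as well
(obj-class o)                      ; => 'some-class
(dict-ref (obj-values o) 'count)   ; => 0
```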

Let us see exactly how we can define the class Counter now:

(define Counter
  (let ([methods ....])
    (letrec
        ([class
             (λ (msg . args)
               (match msg
                 ['-create (let ([values (make-hash '((count . 0) (step . 1)))])
                             (obj class values))]
                 ['-read (dict-ref (obj-values (first args))
                                   (second args))]
                 ['-write (dict-set! (obj-values (first args))
                                     (second args)
                                     (third args))]
                 ['-invoke
                   (let ([found (assoc (first args) methods)])
                     (if found
                         (apply (cdr found) (rest args))
                         (error "message not understood:" (first args))))]))])
      class)))

As we did for self before, the class identifier is also defined using letrec: can you see why?

We can instantiate the class Counter by sending it the -create message.

> (Counter '-create)
#<obj>

Now that an object is a structure, we need a different approach to sending messages to it, as well as to accessing its fields. To send a message to an object c, we must first retrieve its class, and then send the -invoke message to the class:

((obj-class c) '-invoke 'inc c)

And similarly for reading and writing fields.
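The following sketch puts these pieces together. It is a cut-down variant of the Counter class above with a single inc method; the method body is an assumption, since the method list is elided in the definition above.

```racket
#lang racket

(struct obj (class values))

;; Cut-down Counter: one method, written against the -read/-write protocol.
(define Counter
  (let ([methods
         (list (cons 'inc
                     (λ (self)
                       (let ([cls (obj-class self)])
                         (cls '-write self 'count
                              (+ (cls '-read self 'count)
                                 (cls '-read self 'step)))
                         (cls '-read self 'count)))))])
    (letrec ([class
              (λ (msg . args)
                (match msg
                  ['-create (obj class (make-hash (list (cons 'count 0)
                                                        (cons 'step 1))))]
                  ['-read (dict-ref (obj-values (first args)) (second args))]
                  ['-write (dict-set! (obj-values (first args))
                                      (second args)
                                      (third args))]
                  ['-invoke
                   (let ([found (assoc (first args) methods)])
                     (if found
                         (apply (cdr found) (rest args))
                         (error "message not understood:" (first args))))]))])
      class)))

(define c (Counter '-create))
((obj-class c) '-write c 'step 5)   ; write a field through the class
((obj-class c) '-invoke 'inc c)     ; => 5
((obj-class c) '-read c 'count)     ; => 5
```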

Note that representing classes as dispatcher functions following the Object Pattern is clearly not the only design alternative here. We could as well push the notion of classes-as-objects further, in a uniform manner (i.e., what is the class of a class?), as done in Smalltalk. We could also trim classes down to inert structures, using plain procedures to implement the mechanisms of object creation, field accesses, and method invocation. In fact, even within our choice of classes-as-dispatchers, there are different landing points, as we will see next.
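To give an idea of the "inert structures" alternative, here is a rough sketch. All the names in it (class-info, instance, create, read-field, write-field!, invoke) are hypothetical and are not used in the rest of these notes.

```racket
#lang racket

;; A class is plain data: default field values and a method table.
(struct class-info (defaults methods))
;; An instance knows its class and holds its own field values.
(struct instance (cls values))

;; The OO mechanisms live in ordinary procedures, outside the class.
(define (create cls)
  (instance cls (make-hash (class-info-defaults cls))))

(define (read-field o name) (dict-ref (instance-values o) name))
(define (write-field! o name v) (dict-set! (instance-values o) name v))

(define (invoke o name . args)
  (let ([found (assoc name (class-info-methods (instance-cls o)))])
    (if found
        (apply (cdr found) o args)
        (error "message not understood:" name))))

(define Counter
  (class-info
   (list (cons 'count 0) (cons 'step 1))
   (list (cons 'inc
               (λ (self)
                 (write-field! self 'count
                               (+ (read-field self 'count)
                                  (read-field self 'step)))
                 (read-field self 'count))))))

(define c (create Counter))
(invoke c 'inc)   ; => 1
```

Compared to the dispatcher design, a class here cannot customize how its instances are created or how messages are looked up, because that logic is fixed in the global procedures.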

5.4 Embedding Classes in Scheme

Let us now embed classes in Scheme using macros.

5.4.1 Macro for Classes, Take 1

We can define a CLASS syntactic abstraction for creating classes:

(defmac (CLASS ([field fname fval] ...)
               ([method mname (mparam ...) mbody ...] ...))
     #:keywords field method
     #:captures self
     (let ([methods
            (list (cons 'mname (λ (self mparam ...) mbody ...)) ...)])
       (letrec
           ([class
                (λ (msg . args)
                  (match msg
                    ['-create
                     (obj class
                          (make-hash (list (cons 'fname fval) ...)))]
                    ['-read
                     (dict-ref (obj-values (first args)) (second args))]
                    ['-write
                     (dict-set! (obj-values (first args)) (second args) (third args))]
                    ['-invoke
                     (let ([found (assoc (first args) methods)])
                       (if found
                           (apply (cdr found) (rest args))
                           (error "message not understood:" (first args))))]))])
         class)))

This definition follows the schema presented above, where a class is responsible for the interpretation of all the OO mechanisms (instance creation, field accesses, method invocation).

The syntactic macro for method invocation would then be:

(defmac (→ o m arg ...)
  (let ([obj o])
    ((obj-class obj) '-invoke 'm obj arg ...)))

Why is the let-binding necessary?
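One way to explore this question is to compare two variants of the macro on a receiver expression that has a visible side effect. The sketch below uses Racket's define-syntax-rule instead of defmac so that it runs on its own; the names send-twice, send-once, dummy-class and make-receiver are made up for the experiment.

```racket
#lang racket

(struct obj (class values))

;; Variant WITHOUT the let-binding: the receiver expression appears twice.
(define-syntax-rule (send-twice o m arg ...)
  ((obj-class o) '-invoke 'm o arg ...))

;; Variant WITH the let-binding: the receiver expression is evaluated once.
(define-syntax-rule (send-once o m arg ...)
  (let ([receiver o])
    ((obj-class receiver) '-invoke 'm receiver arg ...)))

;; A dummy class that ignores its arguments, and a receiver expression
;; that counts how many times it is evaluated.
(define evaluations 0)
(define (dummy-class . args) 'ok)
(define (make-receiver)
  (set! evaluations (+ evaluations 1))
  (obj dummy-class (make-hash)))

(send-twice (make-receiver) inc)
(define after-twice evaluations)   ; 2

(set! evaluations 0)
(send-once (make-receiver) inc)
(define after-once evaluations)    ; 1
```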

5.4.2 Macro for Classes, Take 2

In the definition of CLASS above, the interpretation of both -read and -write has nothing to do with the class itself. Their interpretation just consists in accessing the dictionary of field values of an object. Therefore, we could move that behavior out of classes themselves, and do it in the syntactic macros for field accesses.

Likewise, a method invocation consists of two steps: looking up the method in the method table of the class, and applying it. Of these steps, only the first one is specific to a given class; application per se is common to any class, and could therefore be handled by the syntactic macro for method invocation as well.

We can therefore use a more lightweight CLASS macro, which only handles the behavior specific to a class, and leaves the rest to the auxiliary macros.

(defmac (CLASS ([field fname fval] ...)
               ([method mname (mparam ...) mbody ...] ...))
     #:keywords field method
     #:captures self
     (let ([methods
            (list (cons 'mname (λ (self mparam ...) mbody ...)) ...)])
       (letrec
           ([class
                (λ (msg . args)
                  (match msg
                    ['-create
                     (obj class (make-hash (list (cons 'fname fval) ...)))]
                    ['-lookup
                     (let ([found (assoc (first args) methods)])
                       (if found
                           (cdr found)
                           (error "message not understood:" (first args))))]))])
         class)))

We now define the convenient syntax to invoke methods (→), and introduce similar syntax for accessing the fields of the current object (? and !).

(defmac (→ o m arg ...)
  (let ([obj o])
    (((obj-class obj) '-lookup 'm) obj arg ...)))

(defmac (? f) #:captures self
  (dict-ref (obj-values self) 'f))

(defmac (! f v) #:captures self
  (dict-set! (obj-values self) 'f v))

We also define an auxiliary function to create new instances:

(define (new c)
  (c '-create))

This simple function is conceptually very important: it helps to hide the fact that classes are internally implemented as functions, as well as the actual symbol used to ask a class to create an instance.

Why don’t we need to define new as a macro?

5.4.3 Example

Let us see classes at work:

(define Counter
  (CLASS
   ([field count 0]
    [field step  1])
   ([method inc () (! count (+ (? count) (? step))) (? count)]
    [method dec () (! count (- (? count) (? step))) (? count)]
    [method reset () (! count 0)]
    [method step! (v) (! step v)]
    [method inc-by! (v) (→ self step! v) (→ self inc)])))

> (define c1 (new Counter))
> (define c2 (new Counter))
> (→ c1 inc-by! 10)
10
> (→ c2 inc-by! 20)
20

What happens in this language if we read an undeclared field? If we assign to an undeclared field? Why? Explore variations out there (e.g., Java vs. JavaScript) and in your implementation.

5.4.4 Strong Encapsulation

We have made an important design decision with respect to field accesses: the field accessors ? and ! only apply to self, i.e., it is not possible in our language to access the fields of another object. Such a language is said to have strongly-encapsulated objects. Smalltalk follows this discipline (accessing a field of another object is actually a message send, which can therefore be controlled by the receiver object). Java does not: it is possible to access the field of any object (provided visibility allows it); JavaScript even less! Here, our syntax simply does not allow foreign field accesses.

Another consequence of our design choice is that field accesses should only occur within method bodies: because the receiver object is always self, self must be defined. For instance, look at what happens if we use the field read form ? outside of an object:

> (? count)
self: undefined;
cannot reference an identifier before its definition
in module: 'program

It would be much better if the above could yield an error saying that ? is undefined. In order to do this, we can simply introduce ? and ! as local syntactic forms, only defined within the confines of method bodies, instead of global ones. We do that by moving the definition of these field access forms from the top-level to a local scope, surrounding method definitions:

(defmac (CLASS ([field fname fval] ...)
               ([method mname (mparam ...) mbody ...] ...))
     #:keywords field method
     #:captures self ? !
     (let ([methods
            (local [(defmac (? f) #:captures self
                      (dict-ref (obj-values self) 'f))
                    (defmac (! f v) #:captures self
                      (dict-set! (obj-values self) 'f v))]
              (list (cons 'mname (λ (self mparam ...) mbody ...)) ...))])
       (letrec
           ([class (λ (msg . args) ....)])
         class)))

Defining the syntactic forms ? and ! locally, for the scope of the definition of the list of methods only, ensures that they are available to use within method bodies, but nowhere else.

Now, field accessors are no longer defined outside of methods:

> (? count)
?: undefined;
cannot reference an identifier before its definition
in module: 'program

From now on, we will use this local approach.

5.5 Initialization

As we have seen, the way to obtain an object from a class, i.e., to instantiate it, is to send the class the -create message. It is generally useful to be able to pass arguments to -create in order to specify the initial values of the fields of the object. For now, our class system only supports the specification of default field values at class-declaration time. It is not possible to pass initial field values at instantiation time.

There are many ways to do this. Here we implement a simple mechanism. First, the new function takes optional arguments:

(define (new class . init-vals)
  (apply class (cons '-create init-vals)))

On the class side, the -create handler first creates the new object using the default values declared for each field. Then, if extra arguments are passed to new, the initialize method is invoked on the new object with these arguments. This method can then set other values to fields, or do whatever initialization work is desired.

....
(λ (msg . args)
  (match msg
    ['-create
     (let ([o (obj class (make-hash (list (cons 'fname fval) ...)))])
       (when (not (empty? args))
         (let ([found (assoc 'initialize methods)])
           (if found
               (apply (cdr found) (cons o args))
               (error "initialize not implemented in:" class))))
       o)]
    ....)) ....

Let us see if it works as expected:

(define Counter
  (CLASS
   ([field count 0]
    [field step  1])
   ([method initialize ([cnt 0] [stp 1]) (! count cnt) (! step stp)]
    [method inc () (! count (+ (? count) (? step))) (? count)]
    [method dec () (! count (- (? count) (? step))) (? count)]
    [method reset () (! count 0)]
    [method step! (v) (! step v)]
    [method inc-by! (v) (→ self step! v) (→ self inc)])))

> (define c1 (new Counter))
> (define c2 (new Counter 5))
> (define c3 (new Counter 5 2))
> (→ c1 inc)
1
> (→ c2 inc)
6
> (→ c3 inc)
7

(Note that Racket’s optional argument mechanism works for initialize without any extra effort. Cute!)

This object initialization mechanism is somewhat limited. You can study the variety of mechanisms in languages out there, and implement your own take on the matter. In particular, with inheritance, initialization can become quite subtle to get right.

5.6 Anonymous, Local and Nested Classes

We have introduced classes in our extension of Scheme in such a way that classes are, like objects in our earlier systems, represented as first-class functions. This means that classes in our language are first-class entities, which can, for instance, be passed as parameters (see the definition of the new function above) and constructed dynamically. Other consequences are that our system also supports both anonymous and nested classes. Of course, all this is achieved while respecting the rules of lexical scoping.
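Here is a rough, macro-free sketch of what "passed as parameter" and "constructed dynamically" mean in practice. The procedure make-class is a hypothetical stand-in for the CLASS macro, and new-instance and send-msg play the roles of new and →; the other names are made up for the example.

```racket
#lang racket

(struct obj (class values))

;; Hypothetical stand-in for CLASS: builds a class from an association
;; list of field defaults and an association list of methods.
(define (make-class defaults methods)
  (letrec ([class
            (λ (msg . args)
              (match msg
                ['-create (obj class (make-hash defaults))]
                ['-lookup
                 (let ([found (assoc (first args) methods)])
                   (if found
                       (cdr found)
                       (error "message not understood:" (first args))))]))])
    class))

(define (new-instance c) (c '-create))
(define (send-msg o m . args)
  (apply ((obj-class o) '-lookup m) o args))

;; A class constructed dynamically: the default value of x is a parameter.
(define (make-cell-class init)
  (make-class (list (cons 'x init))
              (list (cons 'x? (λ (self) (dict-ref (obj-values self) 'x))))))

;; A class received as an argument, like any other value.
(define (instantiate-twice cls)
  (cons (new-instance cls) (new-instance cls)))

(define cells (instantiate-twice (make-cell-class 4)))
(+ (send-msg (car cells) 'x?) (send-msg (cdr cells) 'x?))   ; => 8
```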

For instance, we can introduce classes in a local scope. That is, as opposed to languages like Java where classes are first-order entities that are globally visible, we are able to define classes locally.

(define doubleton
  (let ([cls (CLASS ([field x 0])
                    ([method initialize (v) (! x v)]
                     [method x? () (? x)]))])
    (cons (new cls 4) (new cls 8))))

> (+ (→ (car doubleton) x?) (→ (cdr doubleton) x?))
12

In the above we introduce cls only for the sake of creating two instances and returning them in a pair. After that point, the class is not accessible anymore. In other words, it is impossible to create more instances of that class. However, of course, the two instances we created still refer to their class, so the objects can be used. Interestingly, once these objects are garbage collected, their class can be reclaimed as well.

Now try to come up with interesting examples of both anonymous classes and nested classes, and compare all these facilities with other languages out there.
