5 Classes

Éric Tanter

Let’s go back to the factory function (see Constructing Objects):

(define (make-point init-x)
  (OBJECT
   ([field x init-x])
   ([method x? () x]
    [method x! (new-x) (begin (set! x new-x) self)])))
 
(define p1 (make-point 0))
(define p2 (make-point 1))

All point objects get their own version of the methods, even though they are the same. Well, at least their signature and body are the same. Are they completely the same though? They are not, in fact. The only difference, in this version of the object system, is that each method closes over the self of each object: i.e., self in a method in p1 refers to p1, while it refers to p2 in the method of p2. In other words, the methods, which are functions, differ by the lexical environment they capture.

5.1 Sharing Method Definitions

Instead of duplicating all method definitions just to be able to support different selves, it makes much more sense to factor out the common part (the method bodies), and parameterize them by the variable part (the object bound to self).

Let us try first without macros. Recall that our definition of a plain point object without macros is as follows:
(define make-point
  (λ (init-x)
    (letrec ([self
              (let ([x init-x])
                (let ([methods (list (cons 'x? (λ () x))
                                     (cons 'x! (λ (nx)
                                                 (set! x nx)
                                                 self)))])
                  (λ (msg . args)
                    (apply (cdr (assoc msg methods)) args))))])
      self)))

If we hoist the (let ([methods ...]) ...) out of the (λ (init-x) ...), we effectively achieve the sharing of method definitions we are looking for. However, field variables are now out of scope of the method bodies: concretely, x is unbound in both methods. We therefore need to parameterize methods by state (field values) in addition to self. Fortunately, self can "hold" the state, since it can capture field bindings in its lexical environment. We just need a way to extract (and potentially assign) field values through self. For that, we are going to have objects support two specific messages, -read and -write:

(define make-point
  (let ([methods (list (cons 'x? (λ (self)
                                    (λ () (self '-read))))
                       (cons 'x! (λ (self)
                                    (λ (nx)
                                      (self '-write nx)
                                      self))))])
    (λ (init-x)
      (letrec ([self
                (let ([x init-x])
                  (λ (msg . args)
                    (case msg
                      [(-read) x]
                      [(-write) (set! x (first args))]
                      [else
                       (apply ((cdr (assoc msg methods)) self) args)])))])
        self))))
See how the two methods are now parameterized by self, and how, in order to read or assign a field, they send a special message to self. Now let us examine the definition of the object itself: when sent a message, it first checks if the message is either -read or -write, in which case it either returns x or assigns it. For any other message, it looks up the method in the shared table, applies it to self to obtain the actual function, and then applies that function to the arguments. Let us see if this works:
(define p1 (make-point 1))
(define p2 (make-point 2))

 

> ((p1 'x! 10) 'x?)

10

> (p2 'x?)

2
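
To make the sharing observable, here is a small sketch of the same factory with one addition: a counter, table-builds, that is incremented whenever the method table is constructed. The counter is a hypothetical instrumentation device and is not part of the object system itself.

```racket
#lang racket
;; Hypothetical counter, bumped each time the method table is constructed.
(define table-builds 0)

(define make-point
  (let ([methods (begin
                   (set! table-builds (add1 table-builds))
                   (list (cons 'x? (λ (self) (λ () (self '-read))))
                         (cons 'x! (λ (self)
                                     (λ (nx)
                                       (self '-write nx)
                                       self)))))])
    (λ (init-x)
      (letrec ([self
                (let ([x init-x])
                  (λ (msg . args)
                    (case msg
                      [(-read) x]
                      [(-write) (set! x (first args))]
                      [else
                       (apply ((cdr (assoc msg methods)) self) args)])))])
        self))))

(define p1 (make-point 1))
(define p2 (make-point 2))

table-builds        ; 1: the table was built once, not once per object
((p1 'x! 10) 'x?)   ; 10
(p2 'x?)            ; 2
```

The table is built when make-point is defined, so creating further points does not build it again, while each point still keeps its own x.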

5.2 Accessing Fields

Of course, our definition is not very general, because it only works for the one field x. We need to generalize: field names must be passed as arguments to the -read and -write messages. The issue is then how to go from a field name (as a symbol) to actually reading or assigning the variable with the same name in the lexical environment of the object. A simple solution is to use a structure to hold field values. This is similar to the way we already handle method definitions: an association between method names and method definitions. However, unlike in a method table, field bindings are (at least potentially) mutable. Because Racket pairs are immutable, an association list cannot be updated in place, so we will use a dictionary (more precisely, a mutable hash table), which is accessed with dict-ref and dict-set!.
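
As a quick standalone illustration of these dictionary operations, the sketch below builds a mutable hash table from an association list and then reads and updates it. The field names x and y, and the fallback value 'no-such-field, are chosen only for the example.

```racket
#lang racket
;; make-hash copies the association list into a fresh mutable hash table.
(define fields (make-hash (list (cons 'x 1) (cons 'y 2))))

(dict-ref fields 'x)                  ; 1
(dict-set! fields 'x 10)              ; updates the table in place
(dict-ref fields 'x)                  ; 10

;; An optional third argument gives the result for a missing key.
(dict-ref fields 'z 'no-such-field)   ; 'no-such-field
```

The point factory relies on exactly these two operations, applied to a table that each object captures privately.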

(define make-point
  (let ([methods (list (cons 'x? (λ (self)
                                    (λ () (self '-read 'x))))
                       (cons 'x! (λ (self)
                                    (λ (nx)
                                      (self '-write 'x nx)
                                      self))))])
    (λ (init-x)
      (letrec ([self
                (let ([fields (make-hash (list (cons 'x init-x)))])
                  (λ (msg . args)
                    (case msg
                      [(-read)  (dict-ref  fields (first args))]
                      [(-write) (dict-set! fields (first args)
                                                  (second args))]
                      [else
                       (apply ((cdr (assoc msg methods)) self) args)])))])
        self))))

 

> (let ((p1 (make-point 1))
        (p2 (make-point 2)))
    (+ ((p1 'x! 10) 'x?)
       (p2 'x?)))

12

Note how make-point now holds the list of method definitions, and the created object captures a dictionary of fields (which is initialized prior to returning the object).
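
Since field names are now plain symbols, the same object body should serve any number of fields. The following sketch checks this with a two-field variant; make-point-2d, the field y and the method move! are names invented for this illustration.

```racket
#lang racket
(define make-point-2d
  (let ([methods
         (list (cons 'x? (λ (self) (λ () (self '-read 'x))))
               (cons 'y? (λ (self) (λ () (self '-read 'y))))
               (cons 'move! (λ (self)
                              (λ (dx dy)
                                (self '-write 'x (+ (self '-read 'x) dx))
                                (self '-write 'y (+ (self '-read 'y) dy))
                                self))))])
    (λ (init-x init-y)
      (letrec ([self
                ;; Only the initial contents of the table mention the fields.
                (let ([fields (make-hash (list (cons 'x init-x)
                                               (cons 'y init-y)))])
                  (λ (msg . args)
                    (case msg
                      [(-read)  (dict-ref  fields (first args))]
                      [(-write) (dict-set! fields (first args) (second args))]
                      [else
                       (apply ((cdr (assoc msg methods)) self) args)])))])
        self))))

(define p (make-point-2d 1 2))
(p 'move! 10 20)
(list (p 'x?) (p 'y?))   ; '(11 22)
```

The dispatching λ is unchanged from the one-field version; only the initial table and the method list differ.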

5.3 Classes

While we did achieve the sharing of method definitions we were after, our solution is still not very satisfactory. Why? Well, observe the definition of an object (the body of the (λ (msg . args) ....) above). The logic that is implemented there is, again, repeated in all objects we create with make-point: each object has its own copy of what to do when it is sent a -read message (look up in the fields dictionary), a -write message (assign in the fields dictionary), or any other message (look up in the methods table and then apply the method).

So, all this logic could very well be shared amongst objects. The only free variables in the object body are fields and self. In other words, we could define an object as being just its self as well as its fields, and leave all the other logic to the make-point function. In that case make-point starts to have more than one responsibility: it is no longer only in charge of creating new objects, it is also in charge of field accesses and message handling. That is, make-point is now evolving into what is called a class.

How are we going to represent a class? Well, for now it is just a function that we can apply (and it creates an object, that is, an instance); if we need that function to have different behaviors, we can apply the same Object Pattern we saw at the beginning of this course.

In some languages, classes are objects in their own right. The paradigmatic example in this regard is Smalltalk. Definitely worth a detour!

That is:
(define Point
  ....
  (λ (msg . args)
    (case msg
      [(create) create instance]
      [(read) read field]
      [(write) write field]
      [(invoke) invoke method])))
This pattern makes clear what the role of a class is: it produces objects and, on its instances, it invokes methods and reads and writes fields.

What is the role of an object now? Well, it is just to have an identity, know its class, and hold the values of its fields. It no longer holds any behavior of its own. In other words, we can define an object as a plain data structure:

(define-struct obj (class values))

Let us see exactly how we can define the class Point now:
(define Point
  (let ([methods ....])
    (letrec
        ([class
             (λ (msg . vals)
               (case msg
                 [(create) (let ((values (make-hash '((x . 0)))))
                             (make-obj class values))]
                 [(read) (dict-ref (obj-values (first vals))
                                   (second vals))]
                 [(write) (dict-set! (obj-values (first vals))
                                     (second vals)
                                     (third vals))]
                 [(invoke)
                   (let ((found (assoc (second vals) methods)))
                     (if found
                         (apply ((cdr found) (first vals)) (cddr vals))
                         (error "message not understood")))]))])
      class)))

> (Point 'create)

#<obj>

We can instantiate the class Point by sending it the create message. Now that an object is a structure, we need a different approach to sending messages to it, as well as to access its fields. To send a message to an object p, we must first retrieve its class, and then send the invoke message to the class:

((obj-class p) 'invoke p 'x?)

And similarly for reading and writing fields.
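
For concreteness, here is a runnable sketch of the class above with the elided method table filled in. The two methods x? and x! are an assumption, written in the style of the earlier sections; they reach their fields by sending read and write to the class of self.

```racket
#lang racket
(define-struct obj (class values))

(define Point
  (let ([methods
         ;; Assumed method table: the text above leaves it elided.
         (list (cons 'x? (λ (self)
                           (λ () ((obj-class self) 'read self 'x))))
               (cons 'x! (λ (self)
                           (λ (nx) ((obj-class self) 'write self 'x nx)))))])
    (letrec ([class
              (λ (msg . vals)
                (case msg
                  [(create) (make-obj class (make-hash (list (cons 'x 0))))]
                  [(read)   (dict-ref (obj-values (first vals)) (second vals))]
                  [(write)  (dict-set! (obj-values (first vals))
                                       (second vals) (third vals))]
                  [(invoke)
                   (let ([found (assoc (second vals) methods)])
                     (if found
                         (apply ((cdr found) (first vals)) (cddr vals))
                         (error "message not understood")))]))])
      class)))

(define p (Point 'create))
((obj-class p) 'write p 'x 10)   ; write a field through the class
((obj-class p) 'read p 'x)       ; 10
((obj-class p) 'invoke p 'x?)    ; 10
```

In all three cases the object is passed explicitly, since the class function is shared by every instance.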

5.4 Embedding Classes in Scheme

Let us now embed classes in Scheme using macros.

5.4.1 Macro for Classes

We define a CLASS syntactic abstraction for creating classes:

(defmac (CLASS ([field f init] ...)
               ([method m params body] ...))
     #:keywords field method
     #:captures self
     (let ([methods (list (cons 'm (λ (self)
                                     (λ params body))) ...)])
       (letrec
           ([class
                (λ (msg . vals)
                  (case msg
                    [(create)
                     (make-obj class
                               (make-hash (list (cons 'f init) ...)))]
                    [(read)
                     (dict-ref (obj-values (first vals)) (second vals))]
                    [(write)
                     (dict-set! (obj-values (first vals)) (second vals) (third vals))]
                    [(invoke)
                     (if (assoc (second vals) methods)
                         (apply ((cdr (assoc (second vals) methods)) (first vals)) (cddr vals))
                         (error "message not understood"))]))])
         class)))

5.4.2 Auxiliary Syntax

We need to introduce a new definition for the convenient syntax to invoke methods (->), and introduce similar syntax for accessing the fields of the current object (? and !).

(defmac (-> o m arg ...)
  (let ((obj o))
    ((obj-class obj) 'invoke obj 'm arg ...)))
(defmac (? fd) #:captures self
  ((obj-class self) 'read self 'fd))
 
(defmac (! fd v) #:captures self
  ((obj-class self) 'write self 'fd v))
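
One detail of -> deserves a note: the receiver expression is bound with let before it is used, because it appears twice in the expansion (once to find the class, once as the argument). The sketch below illustrates this with define-syntax-rule as a plain-Racket stand-in for defmac, which is an assumption of the example; Tiny, fresh-object and evaluations are hypothetical names.

```racket
#lang racket
(define-struct obj (class values))

;; Stand-in for the defmac definition of ->, with the same expansion.
(define-syntax-rule (-> o m arg ...)
  (let ([receiver o])                 ; evaluate the receiver expression once
    ((obj-class receiver) 'invoke receiver 'm arg ...)))

;; A deliberately tiny class: only create and invoke, and a single method.
(define Tiny
  (letrec ([methods (list (cons 'id (λ (self) (λ () self))))]
           [class (λ (msg . vals)
                    (case msg
                      [(create) (make-obj class (make-hash))]
                      [(invoke) (apply ((cdr (assoc (second vals) methods))
                                        (first vals))
                                       (cddr vals))]))])
    class))

;; Counts how many times the receiver expression is evaluated.
(define evaluations 0)
(define (fresh-object)
  (set! evaluations (add1 evaluations))
  (Tiny 'create))

(define result (-> (fresh-object) id))
evaluations   ; 1
```

Had the macro spliced o directly into both positions, the expression (fresh-object) would run twice, and the method would be looked up in the class of one object but applied to another.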

We can also define an auxiliary function to create new instances:
(define (new c)
  (c 'create))
This simple function is conceptually very important: it helps to hide the fact that classes are internally implemented as functions, as well as the actual symbol used to ask a class to create an instance.

5.4.3 Example

Let us see classes at work:
(define Point
 (CLASS ([field x 0])
        ([method x? () (? x)]
         [method x! (new-x) (! x new-x)]
         [method move (n) (-> self x! (+ (-> self x?) n))])))
 
(define p1 (new Point))
(define p2 (new Point))

 

> (-> p1 move 10)
> (-> p1 x?)

10

> (-> p2 x?)

0

5.4.4 Strong Encapsulation

We have made an important design decision with respect to field accesses: the field accessors ? and ! only apply to self, i.e., it is not possible in our language to access fields of another object. This is called a language with strongly-encapsulated objects. Smalltalk follows this discipline (accessing a field of another object is actually a message send, which can therefore be controlled by the receiver object). Java does not: it is possible to access the field of any object (provided visibility allows it). Here, our syntax simply does not allow foreign field accesses.

Another consequence of our design choice is that field accesses should only occur within method bodies: because the receiver object is always self, self must be defined. For instance, look at what happens if we use the field read form ? outside of an object:

> (? f)

self: undefined;

 cannot reference undefined identifier

It would be much better if the above could yield an error saying that ? is undefined. In order to do this, we can simply introduce ? and ! as local syntactic forms, only defined within the confines of method bodies, instead of global ones. We do that by moving the definition of these field access forms from the top-level to a local scope, surrounding method definitions:
(defmac (CLASS ([field f init] ...)
               ([method m params body] ...))
     #:keywords field method
     #:captures self ? !
     (let ([methods
            (local [(defmac (? fd) #:captures self
                      ((obj-class self) 'read self 'fd))
                    (defmac (! fd v) #:captures self
                      ((obj-class self) 'write self 'fd v))]
              (list (cons 'm (λ (self)
                               (λ params body))) ...))])
       (letrec
           ([class (λ (msg . vals) ....)])
         class)))

Defining the syntactic forms ? and ! locally, for the scope of the definition of the list of methods only, ensures that they are available to use within method bodies, but nowhere else.

Now, field accessors are no longer defined outside of methods:
> (? f)

?: undefined;

 cannot reference undefined identifier

From now on, we will use this local approach.
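
For readers without defmac at hand, the following is a sketch of a comparable restriction in plain Racket. It swaps in a different technique, syntax parameters from racket/stxparam, rather than local macro definitions: self is an error by default and is rebound inside method bodies, and ? and ! expand to uses of self. The method same-x? is a hypothetical addition that shows the encapsulation discipline: a method reads its own field with ?, but must send a message to learn the field of another object.

```racket
#lang racket
(require racket/stxparam)

(define-struct obj (class values))

;; self is a syntax parameter: an error by default, rebound inside methods.
(define-syntax-parameter self
  (λ (stx) (raise-syntax-error #f "only valid inside a method body" stx)))

;; Field forms expand to uses of self, so they inherit its restriction.
(define-syntax-rule (? fd)   ((obj-class self) 'read self 'fd))
(define-syntax-rule (! fd v) ((obj-class self) 'write self 'fd v))

(define-syntax-rule (-> o m arg ...)
  (let ([receiver o])
    ((obj-class receiver) 'invoke receiver 'm arg ...)))

(define-syntax CLASS
  (syntax-rules (field method)
    [(_ ([field f init] ...) ([method m params body] ...))
     (let ([methods
            (list (cons 'm (λ (receiver)
                             ;; Within the method, self names the receiver.
                             (syntax-parameterize
                                 ([self (make-rename-transformer #'receiver)])
                               (λ params body))))
                  ...)])
       (letrec ([class
                 (λ (msg . vals)
                   (case msg
                     [(create) (make-obj class (make-hash (list (cons 'f init) ...)))]
                     [(read)   (dict-ref (obj-values (first vals)) (second vals))]
                     [(write)  (dict-set! (obj-values (first vals))
                                          (second vals) (third vals))]
                     [(invoke)
                      (let ([found (assoc (second vals) methods)])
                        (if found
                            (apply ((cdr found) (first vals)) (cddr vals))
                            (error "message not understood")))]))])
         class))]))

(define Point
  (CLASS ([field x 0])
         ([method x? () (? x)]
          [method x! (new-x) (! x new-x)]
          ;; Own field via ?, the other object's field via a message send.
          [method same-x? (other) (= (? x) (-> other x?))])))

(define p1 (Point 'create))
(define p2 (Point 'create))
(-> p1 x! 3)
(-> p1 same-x? p2)   ; #f
(-> p2 x! 3)
(-> p1 same-x? p2)   ; #t
```

With this variant, a use of ? or ! outside a method body is rejected when the program is expanded rather than when it runs. Unlike the local approach adopted here, the forms remain globally defined, so the error is raised by self rather than reported as an undefined ?.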

5.5 Initialization

As we have seen, the way to obtain an object from a class, i.e., to instantiate it, is to send the class the create message. It is generally useful to be able to pass arguments to create in order to specify the initial values of the fields of the object. For now, our class system only supports the specification of default field values at class-declaration time. It is not possible to pass initial field values at instantiation time.

Initializer methods are a typical programming idiom in Smalltalk. In Java, these are known as constructors. That is arguably a bad name because, as we can see, they are not in charge of the construction of the object, only of its initialization after the object is actually created.

There are several ways to do this. A simple way is to require objects to implement an initializer method, and have the class invoke this initializer method on each newly-created object. We will adopt the following convention: if no argument is passed with the create message, then we do not call the initializer (and therefore use the default values). If arguments are passed, we invoke the initializer (called initialize) with the arguments. Note that in the latter case the object starts with an empty field table, so the initializer is responsible for assigning every field that methods will later read:

....
(λ (msg . vals)
  (case msg
    [(create)
     (if (null? vals)
         (make-obj class
                   (make-hash (list (cons 'f init) ...)))
         (let ((object (make-obj class (make-hash))))
           (apply ((cdr (assoc 'initialize methods)) object) vals)
           object))]
    ....)) ....

We can refine the auxiliary function to instantiate classes such that it accepts a variable number of arguments:
(define (new class . init-vals)
    (apply class 'create init-vals))
Let us see if it works as expected:
(define Point
 (CLASS ([field x 0])
        ([method initialize (nx) (-> self x! nx)]
         [method x? () (? x)]
         [method x! (nx) (! x nx)]
         [method move (n) (-> self x! (+ (-> self x?) n))])))
 
(define p (new Point 5))

 

> (-> p move 10)
> (-> p x?)

15
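
With the convention above, passing arguments to create bypasses the default values entirely. One possible variation, shown here as a macro-free sketch and not as the approach adopted in these notes, is to always start from the defaults and then run the initializer only when arguments are given. The second field y is a hypothetical addition that makes the difference visible.

```racket
#lang racket
(define-struct obj (class values))

(define Point
  (let ([methods
         (list (cons 'initialize
                     (λ (self)
                       (λ (nx) ((obj-class self) 'write self 'x nx)))))])
    (letrec ([class
              (λ (msg . vals)
                (case msg
                  [(create)
                   ;; Variation: always start from the declared defaults ...
                   (let ([object (make-obj class
                                           (make-hash (list (cons 'x 0)
                                                            (cons 'y 0))))])
                     ;; ... and run the initializer only if arguments were given.
                     (unless (null? vals)
                       (apply ((cdr (assoc 'initialize methods)) object) vals))
                     object)]
                  [(read)  (dict-ref (obj-values (first vals)) (second vals))]
                  [(write) (dict-set! (obj-values (first vals))
                                      (second vals) (third vals))]))])
      class)))

(define (new class . init-vals)
  (apply class 'create init-vals))

(define p (new Point 5))
(list ((obj-class p) 'read p 'x)
      ((obj-class p) 'read p 'y))   ; '(5 0)
```

Here the initializer sets only x, and y keeps its default. Under the convention used in the text, reading y from an object created with arguments would fail, because the table would not contain it.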

5.6 Anonymous, Local and Nested Classes

We have introduced classes in our extension of Scheme in such a way that classes are, like objects in our earlier systems, represented as first-class functions. Classes in our language are therefore first-class entities, which can, for instance, be passed as parameters (see the definition of the new function above). Other consequences are that our system also supports both anonymous and nested classes. Of course, all this is achieved while respecting the rules of lexical scoping.

(define (cst-class-factory cst)
 (CLASS () ([method add (n) (+ n cst)]
            [method sub (n) (- n cst)]
            [method mul (n) (* n cst)])))
 
(define Ops10  (cst-class-factory 10))
(define Ops100 (cst-class-factory 100))

 

> (-> (new Ops10) add 10)

20

> (-> (new Ops100) mul 2)

200
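
The same properties can be seen without the CLASS macro. In the sketch below, a hand-written factory returns a class that is never given a name, and that class is passed as an argument to an ordinary function. The names make-counter-class and instances-of are hypothetical, and the class supports only create and read to keep the example short.

```racket
#lang racket
(define-struct obj (class values))

;; Hand-written class factory: each call returns a new class whose
;; instances start with the field x set to start.
(define (make-counter-class start)
  (letrec ([class
            (λ (msg . vals)
              (case msg
                [(create) (make-obj class (make-hash (list (cons 'x start))))]
                [(read)   (dict-ref (obj-values (first vals)) (second vals))]))])
    class))

;; A class is an ordinary value, so it can be passed as an argument.
(define (instances-of class n)
  (for/list ([i (in-range n)])
    (class 'create)))

;; The class created here is anonymous: it is never bound to a name.
(define objs (instances-of (make-counter-class 7) 3))

(map (λ (o) ((obj-class o) 'read o 'x)) objs)             ; '(7 7 7)
(eq? (obj-class (first objs)) (obj-class (second objs)))  ; #t
```

Each instance still reaches its class through obj-class, even though no other reference to the class remains in the program.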

We can also introduce classes in a local scope. That is, as opposed to languages where classes are first-order entities that are globally visible, we are able to define classes locally.

(define doubleton
  (let ([the-class (CLASS ([field x 0])
                          ([method initialize (x) (-> self x! x)]
                           [method x? () (? x)]
                           [method x! (new-x) (! x new-x)]))])
    (let ([obj1 (new the-class 1)]
          [obj2 (new the-class 2)])
      (cons obj1 obj2))))

 

> (-> (cdr doubleton) x?)

2

In the above we introduce the-class only for the sake of creating two instances and returning them in a pair. After that point, the class is no longer accessible by name. In other words, it is impossible to create more instances of that class. However, the two instances we created still refer to their class, so the objects can be used. Interestingly, once these objects are garbage collected, their class can be reclaimed as well.