Unexpected compilation

I went down an interesting rabbit hole a few days ago. I've been writing mostly Swift at work for over a year now, so my Objective-C has become somewhat rusty. That day was different, though, because I was working on something in the Objective-C part of our code. I had a few "lol" moments as I watched myself struggle with the syntax, but then I ran into something interesting as a result of my new Swift habits.

I wanted to add an extra argument to the existing initializer of a class:

- (instancetype)initWithPhotoItem: (PhotoItem *)photoItem
                            state: (AnnotationViewState *)state

I'm so used to Swift by now, I made a mistake:

- (instancetype)initWithPhotoItem: (PhotoItem *)photoItem
                            state: (AnnotationViewState *)state
          annotationFilterService: AnnotationFilterService

Instead of declaring a named, typed argument as (AnnotationFilterService *)annotationFilterService, I mistakenly wrote just the class name, AnnotationFilterService, exactly like I would declare a function argument in Swift. The code compiled, the app ran happily, and things only started falling apart once I actually tried to use the annotationFilterService argument, which didn't exist.

It came as a surprise to me that the code compiled and the initializer was running fine this way, so I wanted to understand why.

After giving it some thought, I concluded that the AnnotationFilterService class must be an object. I assumed that somewhere not so deep down, my method definition was being translated to something correct, and that's why the compiler and even the runtime were fine with it.


UPDATE: As it turns out in the end, the AnnotationFilterService in that declaration could've been anything, like asd. As Jeff Johnson pointed out on Twitter, it's just an untyped parameter name in this context.
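To make that concrete, here is a minimal sketch of a method with an untyped parameter. The Greeter class and its greet: method are hypothetical names made up for this illustration; the only thing carried over from my mistake is that the parameter has no type in parentheses.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class, used only to illustrate an untyped parameter.
@interface Greeter : NSObject
// No "(Type)" before `name`, so the compiler treats it as an `id`.
- (NSString *)greet:name;
@end

@implementation Greeter
- (NSString *)greet:name {
    // `name` is just the parameter's name here, whatever it happens to be called.
    return [NSString stringWithFormat:@"Hello, %@", name];
}
@end
```

Calling [[Greeter new] greet:@"world"] should give back "Hello, world", because name behaves like any other id argument.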

Nevertheless, following my intuition before knowing the answer, I figured I would take a closer look at classes, objects, and methods to understand more.

What is a Class?

From the (open source) runtime code:

/// An opaque type that represents an Objective-C class.
typedef struct objc_class *Class;

When looking at the objc_class struct, we can see that it inherits from the objc_object struct. (It's C++ code)

struct objc_class : objc_object {
    Class superclass;
    cache_t cache;
    class_data_bits_t bits;

...

Jumping over to the objc_object struct, we can see what an Objective-C object is, and the one thing objc_class inherits from it: the isa (a value, or a pointer), which is itself a Class.

/// Represents an instance of a class.
struct objc_object {
    Class _Nonnull isa  OBJC_ISA_AVAILABILITY;
};

/// A pointer to an instance of a class.
typedef struct objc_object *id;

Alright, so an Objective-C class is technically an object with more fields. Still, many questions arise. First of all: what does the compiler translate method definitions to?

Message sending

To find out, let's first look at how message sending works, that is, what happens when we send a message to an object in Objective-C. When the line [object message]; runs, the implementation of the message method on object gets resolved to a function at runtime. First, the compiler translates the expression to something that corresponds to objc_msgSend(object, @selector(message)). To resolve the instance method to the underlying function, the runtime needs to know which class the receiver is an instance of, because that's where it can look up the implementation for the selector passed in.

This is different from languages that dispatch through a vtable, for example, where a call is resolved through a fixed slot in a table of function pointers rather than by looking up a selector at runtime.
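As a rough sketch of that translation, the function below sends the length message to a string by calling objc_msgSend by hand. The function name lengthViaMsgSend is made up for this example, and it assumes a modern SDK, where objc_msgSend has to be cast to the exact function type before it can be called.

```objc
#import <Foundation/Foundation.h>
#import <objc/message.h>

// Hypothetical helper: does by hand what [string length] does.
NSUInteger lengthViaMsgSend(NSString *string) {
    // Cast objc_msgSend to the signature of the method being called:
    // receiver first, then the selector, then any further arguments.
    NSUInteger (*send)(id, SEL) = (NSUInteger (*)(id, SEL))objc_msgSend;
    return send(string, @selector(length));
}
```

For @"hello" this should return 5, the same value [@"hello" length] gives.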

isa field

The Class of an object is determined by looking at its very first field, the isa. Since every object has an isa, and every class has a superclass field, all objects have access to their Class and to all the classes up their hierarchy. To look up an instance method, the runtime follows the object's isa to its class and searches that class's method list for the selector; if it can't find it there, it moves on to the superclass, and so on, until it reaches the root.
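The walk up the hierarchy can be sketched with two public runtime functions, object_getClass and class_getSuperclass. The classChain helper below is a hypothetical name; it only collects class names and doesn't do any method lookup itself.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical helper: returns the names of the object's class and all of
// its superclasses, starting at the object's own class.
NSArray<NSString *> *classChain(id object) {
    NSMutableArray<NSString *> *names = [NSMutableArray array];
    // object_getClass reads the class the object's isa refers to;
    // class_getSuperclass returns Nil once we are past the root class.
    for (Class cls = object_getClass(object); cls != Nil; cls = class_getSuperclass(cls)) {
        [names addObject:NSStringFromClass(cls)];
    }
    return names;
}
```

For a plain [NSObject new] the chain is just NSObject. For something like a string, expect private subclass names in front, since the concrete class of a string is an implementation detail.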

Method lists

The method_lists field of the class is a compiler-generated list of the methods defined on that class. This is a list of Methods, where Method is just an alias for a pointer to objc_method (shown here as declared in the legacy runtime headers):

struct objc_method {
    SEL _Nonnull method_name                                 OBJC2_UNAVAILABLE;
    char * _Nullable method_types                            OBJC2_UNAVAILABLE;
    IMP _Nonnull method_imp                                  OBJC2_UNAVAILABLE;
}                                                            OBJC2_UNAVAILABLE;

While looking through these methods, the runtime searches for the one with the matching method_name. If there is no match in the method list of the object's class or of any of its superclasses, and nothing handles the message through forwarding, the app will crash with the probably familiar "unrecognized selector sent to instance" message.

There's a method caching mechanism involved in this process to make this more efficient. Look up the docs on objc_cache if you're interested in more details.

When a matching method is found, though, attention turns to its method_imp field. This IMP is a pointer to the start of the function that implements the method. The function can be called C-style, passing in the pointer to the object itself first, then the selector, with the method's arguments following. This is what gets called in the end.
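Here is a small sketch of calling an IMP C-style. It uses NSObject's methodForSelector: to get the function pointer; the helper name lengthViaIMP is made up for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: looks up the function behind -length and calls it directly.
NSUInteger lengthViaIMP(NSString *string) {
    SEL selector = @selector(length);
    // methodForSelector: returns the IMP the runtime would end up calling.
    IMP imp = [string methodForSelector:selector];
    // An IMP has to be cast to the method's real signature before calling it:
    // the object itself, the selector, then the declared arguments (none here).
    NSUInteger (*function)(id, SEL) = (NSUInteger (*)(id, SEL))imp;
    return function(string, selector);
}
```

The result should match what [string length] returns, because it's the same function being called, just without going through the lookup again.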

Method signature

When we developers look at Objective-C methods, on the other hand, all we see is a return type, a name, and arguments. What's actually there is a bit more.

Every method has an implicit self and a _cmd argument. They're hidden from our eyes, but we have access to them. The arguments defined by the developer in the Objective-C code follow after these two.
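Both hidden arguments can be used inside any method body. The Probe class below is a hypothetical example that simply reports what self and _cmd are.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class, used only to show the two implicit arguments.
@interface Probe : NSObject
- (NSString *)whoAmI;
@end

@implementation Probe
- (NSString *)whoAmI {
    // `self` is the receiver, `_cmd` is the selector this method was called with.
    return [NSString stringWithFormat:@"%@ %@",
            NSStringFromClass([self class]),
            NSStringFromSelector(_cmd)];
}
@end
```

[[Probe new] whoAmI] should return "Probe whoAmI", even though the method declares no arguments at all.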

Classes are objects

Circling back to the original finding, that a Class is an object, we can now imagine how sending messages to classes works, following the same principles as above. When we send the string message to the NSString class ([NSString string]), something very similar happens to what we were just looking at. The compiler translates the expression to an objc_msgSend function call first, and the runtime will try to resolve the class of the object through the isa field.

But wait, if the object itself is a Class (i.e. NSString), what will the class of the class be?

Meta classes

A meta-class. A meta-class is what holds the list of class methods of the class. The isa field of a Class is what points to the class' meta-class. When sending the string message to NSString, the runtime will find it in NSString's meta-class' method list.

Let's write a small function to find out what the class and meta-class of a certain class are. This can help us understand what's happening.

void findClass(Class class) {
    const char* className = class_getName(class);
    NSLog(@"%s's class: %p", className, class);
    NSLog(@"%s's meta-class: %p", className, objc_getMetaClass(className));
}

When passing in NSString.class, the results were the following:

NSString's class: 0x7fff87b450e8
NSString's meta-class: 0x7fff87b45110

I paused the execution right there, and asked lldb what kind of objects these memory addresses represent.

(lldb) po 0x7fff87b450e8
NSString

(lldb) po 0x7fff87b45110
NSString

Both print as NSString, but the second one is the meta-class. From this output alone, the only thing that distinguishes it from the plain class is that the two addresses point to different things in memory.
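The runtime itself can tell the two apart. The sketch below uses object_getClass and class_isMetaClass for that; the helper name metaClassOf is hypothetical.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Hypothetical helper: follows a class's isa to get its meta-class.
Class metaClassOf(Class cls) {
    // Passing a Class to object_getClass treats the class as an object
    // and returns what its isa refers to, which is the meta-class.
    return object_getClass(cls);
}

void describeClass(Class cls) {
    Class meta = metaClassOf(cls);
    NSLog(@"%s is a meta-class: %d", class_getName(cls), class_isMetaClass(cls));
    NSLog(@"its isa is a meta-class: %d", class_isMetaClass(meta));
    // Class methods such as +string show up as instance methods of the meta-class.
    NSLog(@"meta-class has string: %d",
          class_getInstanceMethod(meta, @selector(string)) != NULL);
}
```

For NSString.class I would expect 0, 1, and 1 to be logged: the class is not a meta-class, its isa is, and the string class method is found through the meta-class.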

There's more to learn about meta-classes, so if you're interested, I recommend reading Matt Gallagher's What is a meta-class in Objective-C?, and Greg Parker's Classes and metaclasses writing.

Type Encodings reveal more

Now that we know all this, we can go back to finding out what method definitions look like under the hood. The final step of this investigation was figuring out a way to see what the runtime sees when looking at my method.

Let's recreate the original confusion in a simpler example:

- (void)someMethod:justAString {

}

In order to get the type encoding information, we'll use the method_getTypeEncoding runtime function. It takes one argument, a Method. We already know what a Method is, and how to get to it! Remember that a class has a list of the Methods defined on it. What we want to do here is get the list of methods for my class, and look at the type encoding of the method in question.

In short, type encodings are compiler generated character strings associated with the method selector.

Let's see what they look like and how to get them!

Type encoding lookup and decoding

The following function takes a class and a selector, and returns the selector's type encoding. Let's use this to find out what type encodings my someMethod: method has right now.

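#import <objc/runtime.h>

// Returns the type encoding of `selector` if `class` itself defines it,
// or NULL if there is no such method on the class.
const char *typeEncodings(Class class, SEL selector) {
    unsigned int outCount = 0;
    Method *methodList = class_copyMethodList(class, &outCount);
    const char *encoding = NULL;

    for (unsigned int i = 0; i < outCount; i++) {
        Method m = methodList[i];

        if (method_getName(m) == selector) {
            encoding = method_getTypeEncoding(m);
            break;
        }
    }

    // The array returned by class_copyMethodList has to be freed by the caller.
    free(methodList);
    return encoding;
}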

In my view controller's viewDidLoad, I print out the results.

NSLog(@"%s", typeEncodings(ViewController.class, @selector(someMethod:)));

The following gets printed on the debug console:

v24@0:8@16

Decoding this, we get that the return type of the method is void, because the string starts with a v, and 24 is the size of the argument frame in bytes. @0 is an Objective-C object at byte offset 0, which is the implicit self argument. As far as I understand, these byte offsets aren't used by modern systems anymore. :8 is the selector (the implicit _cmd) at byte offset 8. @16 is the interesting part, because it indicates another Objective-C object at byte offset 16, and that has to be our untyped argument.

When removing the argument from the method and running again, we get:

v16@0:8

The last object argument is not there in this case.
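Instead of reading the string by eye, Foundation can do the splitting. This is a small sketch using NSMethodSignature; the decodeTypes helper is a hypothetical name, and the encoding passed in is assumed to come from method_getTypeEncoding.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits a type encoding into its parts.
// Element 0 is the return type, the rest are the argument types in order.
NSArray<NSString *> *decodeTypes(const char *encoding) {
    NSMethodSignature *signature = [NSMethodSignature signatureWithObjCTypes:encoding];
    NSMutableArray<NSString *> *types = [NSMutableArray array];

    [types addObject:@(signature.methodReturnType)];
    // Arguments 0 and 1 are the implicit self and _cmd.
    for (NSUInteger i = 0; i < signature.numberOfArguments; i++) {
        [types addObject:@([signature getArgumentTypeAtIndex:i])];
    }
    return types;
}
```

For the encoding of a void method that takes one object argument, the result should be v, @, :, @: a void return, then self, _cmd, and the extra object.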

This was enough evidence for me to see that my odd method definition indeed translates to something that makes sense to the compiler, and the runtime too. As it turns out, I wasn't completely right, because I believed that this object at the end of the method would be an AnnotationFilterService class, but as Jeff Johnson explained to me on Twitter, it's just a result of a mechanism where an untyped parameter is assumed to be id.
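One way to check that explanation is to compare the encoding of a method with an untyped parameter against one that spells out id. The EncodingProbe class and both of its methods are hypothetical and exist only for this comparison.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#include <string.h>

// Hypothetical class with two methods that differ only in how the
// parameter is declared.
@interface EncodingProbe : NSObject
- (void)untyped:value;
- (void)typed:(id)value;
@end

@implementation EncodingProbe
- (void)untyped:value {}
- (void)typed:(id)value {}
@end

// Returns YES if both methods have the same type encoding.
BOOL encodingsMatch(void) {
    Method untyped = class_getInstanceMethod([EncodingProbe class], @selector(untyped:));
    Method typed = class_getInstanceMethod([EncodingProbe class], @selector(typed:));
    return strcmp(method_getTypeEncoding(untyped), method_getTypeEncoding(typed)) == 0;
}
```

encodingsMatch() should return YES, which is what you'd expect if the untyped parameter is simply an id.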


Either way, it was a good journey down to the lower levels, and I learned some new and interesting things along the way 😄 Hope you enjoyed it, too! If you have any feedback, especially if you spot a mistake, I'd appreciate it if you could reach out to me here, or on Twitter @vasarhelyia.

Read more

If this seemed interesting to you, I recommend checking out these resources. I myself love browsing these blogs, code, and books for the endless great content and learning.
