kateinoigakukun's blog

Stream-of-thought Observer

Swift Type Metadata (en)

try! Swift 2019

I'm kateinoigakukun, working at Mercari as an intern. Today I'll talk about metadata, which is one of the most important things to know if you want to understand how Swift works.

Swift is well known as a statically typed language, but it actually does a lot of dynamic work at runtime.

let typeName = String(describing: Int.self)

I'm sure all of you have looked at Stack Overflow and written code like this to get a type name.

extension UITableView {
    func register<Cell>(nibWithCellClass: Cell.Type) where Cell: UITableViewCell {
        let typeName = String(describing: Cell.self)
        let nib = UINib(nibName: typeName, bundle: Bundle.main)
        register(nib, forCellReuseIdentifier: typeName)
    }
}

tableView.register(nibWithCellClass: TweetCell.self)

For example, when you register a UITableViewCell subclass, you can use this to match the xib name with the type name. It's a useful extension. But have you ever thought about how this code works at runtime? This is your first step towards thinking about memory representation in Swift. Let's dig into the world of metadata!
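To see the name lookup in isolation, here is a minimal sketch that runs without UIKit. `TweetCell` is redefined here as an empty stand-in class, and `reuseIdentifier(for:)` is a hypothetical helper name used only for illustration.

```swift
// Hypothetical stand-in for a cell class; no UIKit needed.
final class TweetCell {}

// Derive an identifier from a type, the same way the extension above does.
func reuseIdentifier<T>(for type: T.Type) -> String {
    return String(describing: type)
}

print(reuseIdentifier(for: TweetCell.self))  // TweetCell
print(reuseIdentifier(for: Int.self))        // Int
```

The generic parameter only carries the type; no instance is ever created, so the name has to come from somewhere other than a value.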

Agenda

  1. What is type metadata?
  2. Explore String(describing: Int.self)
  3. How to use metadata in Swift
  4. Use cases in OSS

First, I'll explain what type metadata is. Swift type metadata is not something most of us are familiar with, but we benefit from it all the time because core features of Swift use it for dynamic behavior. I'll walk through the String initializer as an example and show how type metadata is used inside Swift. Then I'll introduce how to use metadata from Swift code, along with some examples of hacking Swift.

What is type metadata?

  • Type information in Swift runtime
  • Used in Swift internal dynamic behavior
  • A metatype is a pointer to metadata

let metatype: Int.Type = Int.self

Let's start. Type metadata is Swift's internal information about a type, such as its instance size, the number of cases of an enum, and so on. This information is either stored statically in the binary or generated dynamically at runtime. A metatype value is written as the type name followed by .self, and its actual value is a pointer to the metadata.
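A small sketch of what "a metatype is a pointer" looks like from Swift code. It only uses standard library APIs. The size and alignment printed at the end are examples of the per-type facts that metadata describes, not values read from the metadata record itself.

```swift
let metatype: Any.Type = Int.self

// An `Any.Type` value is stored as a single pointer-sized word.
print(MemoryLayout<Any.Type>.size == MemoryLayout<UnsafeRawPointer>.size)  // true

// The dynamic type of a value is the same metatype as `Int.self`.
let valueType: Any.Type = type(of: 42)
print(valueType == metatype)  // true

// Examples of per-type facts of the kind that metadata describes.
print(MemoryLayout<Int>.size, MemoryLayout<Int>.alignment)
```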

extension String {

  public init<Subject: CustomStringConvertible>(describing instance: Subject) { ... }

  public init<Subject>(describing instance: Subject) { ... }

}

let typeName = String(describing: Int.self) // "Int"

Int.self is a metatype value, and it is passed to the String initializer. This initializer accepts a value of any type. It returns the description property if the value's type conforms to CustomStringConvertible, and it returns the type name if a metatype is passed in.
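The two behaviors can be compared side by side. `Point` is a hypothetical type introduced only for this sketch.

```swift
// Hypothetical type used only for this illustration.
struct Point: CustomStringConvertible {
    let x: Int
    let y: Int
    var description: String { return "(\(x), \(y))" }
}

// A value whose type conforms to CustomStringConvertible: `description` is used.
let valueText = String(describing: Point(x: 1, y: 2))
// A metatype: the type name is returned instead.
let typeText = String(describing: Point.self)

print(valueText)  // (1, 2)
print(typeText)   // Point
```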

extension Int.Type: CustomStringConvertible { // 🚫 Cannot extend a metatype 'Int.Type'
    var description: String {
        return "Int"
    }
}

The metatype cannot be implementing the description property of CustomStringConvertible itself, because there is no way to extend a metatype. It also seems impossible to implement this with pure Swift APIs, because Swift doesn't have a runtime API like Objective-C's. So there must be some magic.

SwiftCore

  • Swift standard library
  • Fundamental types and interfaces

SwiftRuntime

  • Swift runtime library
  • Dynamic behavior

The initializer is implemented in SwiftCore and the magic is implemented in SwiftRuntime. SwiftCore is the Swift standard library. It is written in Swift and contains fundamental data types like String and Int, as well as protocols. SwiftRuntime is the Swift runtime library. It is written in C++ and implements runtime behavior. Many dynamic features, such as dynamic casting and instance allocation, are implemented in this library.

stdlib/public/core/Mirror.swift

struct String {
  public init<Subject>(describing instance: Subject) {
    _print_unlocked(instance, &self)
  }
}

Swift is open source, so we can read this on GitHub. This initializer calls _print_unlocked, which is an internal function.

stdlib/public/core/Misc.swift

public func _typeName(_ type: Any.Type, qualified: Bool = true) -> String {
  let (stringPtr, count) = _getTypeName(type, qualified: qualified)
  return String._fromUTF8Repairing(
    UnsafeBufferPointer(start: stringPtr, count: count)).0
}

@_silgen_name("swift_getTypeName")
public func _getTypeName(_ type: Any.Type, qualified: Bool) -> (UnsafePointer<UInt8>, Int)

If we dig into the call stack for the case where the argument is a metatype, we find that the _typeName function is called, and it in turn calls another function, _getTypeName. Look at the _getTypeName definition. The first thing you notice is that the function has the @_silgen_name attribute and no body. What's this? The @_silgen_name attribute specifies the symbol name that the declaration will have at link time. Here, the declaration is used to link to a function in SwiftRuntime. I'll skip the details of the linked function, but it simply extracts the type name from the metadata. So how is metadata represented in memory?
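Because _typeName is declared public in the listing above, it can be called directly to observe what the runtime returns. This is a sketch for exploration only: underscored standard library functions are not a supported API and may change.

```swift
// `_typeName` is the public (underscored) function from the listing above.
let qualifiedName = _typeName(Int.self, qualified: true)
let shortName = _typeName(Int.self, qualified: false)

print(qualifiedName)  // Swift.Int
print(shortName)      // Int
```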

As you can see in the figure, metadata is broken down into the following parts:

  • a value witness table, which is a group of functions for manipulating instances,
  • a kind value, which represents the kind of type such as class, struct, protocol, etc.,
  • and a NominalTypeDescriptor, which records detailed information about the type.
  • In the case of a class, a VTable is also included,
  • and in the case of a generic type, type parameters are embedded dynamically.

So the type name we are looking for lives in the NominalTypeDescriptor. We can reach it from the metadata pointer by advancing to the descriptor field and following it. That doesn't look hard to implement, so let's reproduce in Swift what SwiftRuntime does for the String initializer. (docs/ABI/TypeMetadata.rst)

struct StructMetadata {
    let kind: Int
    let typeDescriptor: UnsafePointer<StructTypeDescriptor>
}

struct StructTypeDescriptor {
    let flags: Int32
    let parent: Int32
    let name: RelativePointer<CChar>
}

The first step is to reproduce the memory layout as structs. Most of the memory layout is documented, but parts of the documentation are already outdated, so we sometimes need to read the source code of the Swift compiler. To keep this example simple, I will only handle structs. To reproduce the layout, we also need to understand RelativePointer.

(include/swift/Basic/RelativePointer.h)

RelativePointer is not just a pointer. An ordinary absolute pointer holds the address of the referent, but a relative pointer holds the offset from its own address to the referent's address. Resolving it means reading the offset and advancing from the pointer's own address. Using RelativePointer instead of an absolute pointer reduces the number of relocations needed at load time.
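The resolution rule can be tried on a toy buffer without touching any real metadata. In this sketch, `resolveRelativePointer(at:)` and `demoRelativePointer()` are hypothetical names, and the buffer layout (a 32-bit offset at byte 0, the string at byte 8) is made up for the demonstration.

```swift
/// Resolves a 32-bit relative pointer: the stored value is an offset
/// measured from the address of the field itself.
func resolveRelativePointer(at field: UnsafeRawPointer) -> UnsafeRawPointer {
    let offset = field.load(as: Int32.self)
    return field + Int(offset)
}

func demoRelativePointer() -> String {
    // Toy layout: bytes 0..<4 hold the offset, bytes 8..<12 hold "Int\0".
    let buffer = UnsafeMutableRawPointer.allocate(byteCount: 16, alignment: 4)
    defer { buffer.deallocate() }

    buffer.storeBytes(of: Int32(8), as: Int32.self)
    let text: [UInt8] = Array("Int".utf8) + [0]
    (buffer + 8).copyMemory(from: text, byteCount: text.count)

    // Start at the field's own address and apply the stored offset.
    let target = resolveRelativePointer(at: UnsafeRawPointer(buffer))
    return String(cString: target.assumingMemoryBound(to: CChar.self))
}

print(demoRelativePointer())  // Int
```

Note that the offset is only meaningful relative to where the field lives in memory. Copying the field elsewhere and resolving it from the copy would point at the wrong address.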

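func getTypeName<Subject>(of type: Subject.Type) -> String {
    // Reinterpret the metatype as a pointer to the metadata record.
    let metadataPointer = unsafeBitCast(
        type, to: UnsafePointer<StructMetadata>.self
    )
    // Follow the type descriptor, then resolve the relative `name` pointer.
    // The offset must be measured from the field's address in memory, not from a copy.
    let namePointer: UnsafePointer<CChar> = metadataPointer.pointee
                        .typeDescriptor.pointee
                        .name.advancedPointer()
    return String(cString: namePointer)
}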

Now the preparation is done, so let's extract the type name. First, cast the metatype value into a pointer to the metadata. In Swift's type system a metatype has no subtyping relation with a metadata pointer, so we use unsafeBitCast for the cast. Then access the name pointer through the type descriptor and apply the offset to get an absolute CChar pointer. Finally, convert that to a Swift String. The implementation is complete!

let typeName = getTypeName(of: Int.self) // "Int"

Run this and you get the type name. This is the first step of metaprogramming with metadata!
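For readers who want to run the walk end to end, here is a self-contained sketch that uses raw pointer arithmetic instead of the struct definitions above. It rests on several assumptions that are not stated in the talk: a Swift 5 runtime where the struct kind value is 0x200, a platform without pointer authentication, a descriptor pointer stored in the second word of the metadata, and a name field that follows two 4-byte fields. `structTypeName(of:)` is a hypothetical name.

```swift
/// Returns the name of a struct type by walking its metadata, or nil
/// when the kind word does not match the assumed struct kind.
func structTypeName(of type: Any.Type) -> String? {
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)

    // Assumption: word 0 is the kind, and 0x200 means "struct" (Swift 5).
    guard metadata.load(as: Int.self) == 0x200 else { return nil }

    // Assumption: word 1 is an absolute pointer to the type descriptor.
    let descriptor = metadata.load(fromByteOffset: MemoryLayout<Int>.size,
                                   as: UnsafeRawPointer.self)

    // Assumption: flags (4 bytes) and parent (4 bytes) precede `name`.
    let nameField = descriptor + 8
    let offset = nameField.load(as: Int32.self)

    // Resolve the relative pointer against the field's own address.
    let namePointer = (nameField + Int(offset)).assumingMemoryBound(to: CChar.self)
    return String(cString: namePointer)
}

if let name = structTypeName(of: Int.self) {
    print(name)
}
```

Reading the offset in place avoids copying the descriptor, so the relative pointer is always resolved from its real address.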

Use cases inside of Swift

  • Allocate instance
  • Dynamic method dispatch
    • VTable
  • Reflection

Metadata is used for dynamic behavior in Swift. Many people use Swift without realizing this, but there are many use cases. Where? The most common use case is allocating an instance. And if you call a method through a protocol or a class, the method table stored in the metadata is consulted to find the method to call. In other cases, the Mirror API uses metadata to reflect properties.
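Two of these use cases are visible with ordinary Swift code. The types below are hypothetical and exist only for this sketch.

```swift
// Reflection: Mirror lists stored properties without any hand-written mapping.
struct User {
    let name: String
    let age: UInt
}
let labels = Mirror(reflecting: User(name: "Alice", age: 20))
    .children
    .compactMap { $0.label }
print(labels)  // ["name", "age"]

// Dynamic dispatch: the override is found at runtime through the static type `Pet`.
class Pet {
    func speak() -> String { return "..." }
}
final class Dog: Pet {
    override func speak() -> String { return "Woof" }
}
let pet: Pet = Dog()
print(pet.speak())  // Woof
```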

In this way, metadata is very useful inside of Swift, but we can abuse it.

Method swizzling

Next, I'll talk about the black magic that you may have used in Objective-C: method swizzling. If you understand metadata, you gain great power.

class Animal {
    func bar() { print("bar") }
    func foo() { print("foo") }
}

struct ClassMetadata {
    ...
    // VTable
    var barRef: FunctionRef
    var fooRef: FunctionRef
}

First, reproduce the memory layout, just as we did to get the type name. Class methods are called via the VTable, which is a table of pointers to functions. So it should work if we replace those pointers.

let metadata = unsafeBitCast(
    Animal.self, to: UnsafeMutablePointer<ClassMetadata>.self
)

// Overwrite bar's VTable slot with foo's function reference.
// (Assigning through `pointee` avoids letting a pointer escape its closure.)
metadata.pointee.barRef = metadata.pointee.fooRef

let animal = Animal()
animal.bar() // foo

Get the metadata from the metatype using unsafeBitCast, locate the two function references to swizzle, and replace one with the other. This is very simple, but it works well. In this way, metadata lets us achieve what seems impossible.
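The idea of swapping a table slot can be tried safely on a toy model. This sketch does not touch real class metadata: `ToyVTable` and `ToyAnimal` are hypothetical types that look up their implementations in an ordinary Swift object at call time.

```swift
// A toy "vtable": a table of function values, NOT real class metadata.
final class ToyVTable {
    var bar: () -> String = { "bar" }
    var foo: () -> String = { "foo" }
}

final class ToyAnimal {
    let vtable: ToyVTable
    init(vtable: ToyVTable) { self.vtable = vtable }

    // Each call looks the implementation up in the table at call time.
    func bar() -> String { return vtable.bar() }
    func foo() -> String { return vtable.foo() }
}

let table = ToyVTable()
let toy = ToyAnimal(vtable: table)

let before = toy.bar()  // "bar"
table.bar = table.foo   // overwrite one slot with the other
let after = toy.bar()   // "foo"
print(before, after)
```

The swap only changes behavior because every call goes through the table. A call that bypasses the table would keep the original implementation.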

Use cases

  • Zewo/Reflection
  • wickwirew/Runtime
  • alibaba/HandyJSON
  • kateinoigakukun/StubKit

Now I will introduce some use cases of metadata that I found in OSS. The top two provide a Swifty interface for accessing metadata information. They are very useful when working with metadata. The third one is a JSON serialization library that can encode and decode JSON without any mapping configuration.

alibaba/HandyJSON

struct Item: HandyJSON {
    var name: String = ""
    var price: Double?
    var description: String?
}

if let item = Item.deserialize(from: jsonString) {
    // ...
}

Since Swift 4, Codable has provided this feature through compiler code generation. HandyJSON was created before Codable, and it uses metadata to relate values to property names without the Objective-C API.
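For comparison, here is a sketch of the same model decoded with Codable and Foundation's JSONDecoder rather than HandyJSON. The JSON string and the `decodeItem(from:)` helper are made up for this illustration.

```swift
import Foundation

struct Item: Codable {
    var name: String = ""
    var price: Double?
    var description: String?
}

// Hypothetical helper: decode an Item from a JSON string.
func decodeItem(from json: String) throws -> Item {
    return try JSONDecoder().decode(Item.self, from: Data(json.utf8))
}

let jsonString = #"{"name": "Pen", "price": 1.5}"#
let item = try? decodeItem(from: jsonString)
print(item?.name ?? "decoding failed")
```

Here the compiler synthesizes the mapping between keys and properties at build time, so no metadata inspection is needed at runtime.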

The last use case is my library StubKit.

kateinoigakukun/StubKit

import StubKit

struct User: Codable {
    let name: String
    let age: UInt
}

let user = try Stub.make(User.self)
// User(name: "This is stub string", age: 12345)

This library lets you instantiate stubs without passing any arguments, which makes it easy to instantiate structs with many fields. Most of this is implemented with Codable, but some features are implemented using type metadata.

Before I introduce the metadata use case, let me share how this stub function works. First, a basic struct forms a tree structure, and you can traverse it using the Decoder protocol.

So if we prepare the stub of a leaf and inject it while traversing, we can instantiate any type of stub without arguments.

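protocol Stubbable {
    static var stub: Self { get }
}

func leafStub<T>(of type: T.Type) -> T? {
    // Only types that opted in to Stubbable can provide a leaf stub.
    guard let stubbable = type as? Stubbable.Type else { return nil }
    return stubbable.stub as? T
}

extension Int: Stubbable {
    static var stub: Int { return 12345 }
}

extension enum: Stubbable { // 🚫 Can't extend
    static var stub: Self {
        return enumStub()
    }
}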

For example, String, Int and enums can be leaf types. It's easy to prepare stubs for basic data types, but enums can be defined by users, so stubs for every custom enum would have to be prepared manually. That is very tedious, so I implemented enum instance generation using metadata.

func enumStub<T>(of type: T.Type) -> T? {
    guard isEnum(type: type) else { return nil }
    // Reinterpret a zeroed integer as T. The pointer is only valid inside the closure.
    let rawValue = 0
    return withUnsafePointer(to: rawValue) {
        UnsafeRawPointer($0).load(as: T.self)
    }
}

func isEnum<T>(type: T.Type) -> Bool {
    let metadata = unsafeBitCast(type, to: UnsafePointer<EnumMetadata>.self).pointee
    return metadata.kind == 1 // kind value of enum is 1
}

We know that we can cast an Int value to an enum because they have the same memory layout. However, this trick is only valid for enums, so it's necessary to check whether the type is an enum. Metadata is effective here. The word at the head of the metadata is the kind, which tells us whether the type is a class, a struct, or an enum. The kind value for enums is a fixed constant (1 in this example), so we can check whether the type is an enum by comparing against it.
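The kind check can be sketched without hard-coding a constant, by taking the kind of a known enum as the reference value. This matters because the numeric kind values may differ between Swift versions. The names `Direction`, `Suit`, `metadataKind(of:)` and `zeroCaseStub(of:)` are hypothetical, and the sketch assumes enums without associated values that fit in 8 bytes.

```swift
enum Direction { case north, south, east, west }
enum Suit { case hearts, spades }

/// Reads the first word of the metadata record, which the text above
/// describes as the kind value.
func metadataKind(of type: Any.Type) -> Int {
    return unsafeBitCast(type, to: UnsafePointer<Int>.self).pointee
}

// Use a known enum as the reference instead of a hard-coded number.
let enumKind = metadataKind(of: Direction.self)
print(enumKind == metadataKind(of: Suit.self))  // true: both are enums
print(enumKind == metadataKind(of: Int.self))   // false: Int is a struct

/// Builds a value of a payload-free enum from an all-zero bit pattern.
/// Assumption: T has no associated values and is at most 8 bytes wide.
func zeroCaseStub<T>(of type: T.Type) -> T? {
    guard metadataKind(of: type) == enumKind,
          MemoryLayout<T>.size <= MemoryLayout<UInt64>.size else { return nil }
    let zero: UInt64 = 0
    // The load happens inside the closure, while the bytes are still valid.
    return withUnsafeBytes(of: zero) { $0.baseAddress?.load(as: T.self) }
}

if let direction = zeroCaseStub(of: Direction.self) {
    print(direction)  // north
}
```

For enums with associated values, an all-zero bit pattern is not guaranteed to be a valid value, so a real implementation would need more checks than this sketch performs.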

Caution

  • ABI stability
  • Responsibility

Swift 4.2 has no ABI stability, so the metadata layout will break in Swift 5. The good news is that ABI stability is achieved in Swift 5, so building a library on metadata becomes easier. But if you want to support both Swift 4 and 5, maintaining it will take a lot of effort. We got great power, but please remember that using metadata is not an officially supported way of doing things in Swift. If you use metadata, and particularly if you rewrite it, you must understand it deeply. For example, method swizzling may be undone by optimization: swizzling only takes effect when the method is dispatched through the method table, but the "Devirtualize" optimization can turn dynamic dispatch into static dispatch. You must handle cases like this to use this magic. "With great power comes great responsibility."

Summary

  • Swift uses metadata for dynamic behavior
  • We can use metadata in Swift
  • Let's write meta programming libraries!

Let me wrap up the key points. First, Swift uses metadata for dynamic method dispatch, the reflection API, and so on. Second, we can use it from Swift by reproducing the memory layout. It brings big benefits, and it's just fun. So I'm looking forward to your great libraries using metadata. That's all, thank you all very much.