In my last post, I wrote a parser for a Smalltalk-like language. In this post, I'll show you how to write a compiler that translates that language to C++. I think that you will find it to be surprisingly easy.

First, let me say that I modified the language slightly since the last post because I did not like how the writing of the compiler was going. For chaining messages, I have now introduced the good old "|" character that may be familiar from shell scripting. This means that verbs no longer need to be prefixed with a period. Here is an example of the new message chaining syntax in action:

old style:
event1 : subject1 .verb1 argument1 argument2
event2 : subject2 .verb1 .verb2 argument1
event3 : subject3 .verb1 argument1 .verb2 argument2

new style:
event1 : subject1 verb1 argument1 argument2
event2 : subject2 verb1 | verb2 argument1
event3 : subject3 verb1 argument1 | verb2 argument2
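To make the new chaining rule concrete, here is a minimal sketch of how the sentence part of a handler (everything after the "event :" prefix) could be split into a subject and its phrases. This is not the BellParser from the previous post: the ParsedPhrase and ParsedSentence types and the parseSentence function are hypothetical names made up for illustration, and the sketch assumes that tokens are separated by spaces and that "|" always stands alone as a token.

```swift
// Hypothetical stand-ins for illustration; the real parser types come from the previous post.
struct ParsedPhrase: Equatable {
    let verb: String
    let arguments: [String]
}

struct ParsedSentence: Equatable {
    let subject: String
    let phrases: [ParsedPhrase]
}

// Splits "subject verb args | verb args" into a subject and a list of phrases.
// Returns nil if there is no subject or if any phrase is missing its verb.
func parseSentence(_ text: String) -> ParsedSentence? {
    // Assumption: tokens are space-separated and "|" is its own token.
    let tokens = text.split(separator: " ").map { String($0) }
    guard let subject = tokens.first else { return nil }

    // Everything after the subject is one or more phrases separated by "|".
    let groups = tokens.dropFirst().split(separator: "|", omittingEmptySubsequences: false)

    var phrases: [ParsedPhrase] = []
    for group in groups {
        // The first token of each group is the verb; the rest are its arguments.
        guard let verb = group.first else { return nil }
        phrases.append(ParsedPhrase(verb: verb, arguments: Array(group.dropFirst())))
    }
    return ParsedSentence(subject: subject, phrases: phrases)
}
```

Under these assumptions, the sentence from event3 yields the subject subject3 and two phrases, verb1 with argument1 and verb2 with argument2. Because "|" marks the phrase boundary, the verbs no longer need a prefix to be told apart from arguments.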

I also slightly tweaked the example that we're going to compile. Here is the new example:

codec : Audio:Control:SGTL5000
tonesweep : Audio:Synth:ToneSweep
i2s : Audio:Output:I2S

tonesweep -> i2s

setup : codec enable . codec volume 0.5 0.5 . tonesweep play 0.8 256 512 10

I added "codec enable" because this is required to turn on the codec.

So here's my compiler:

import Foundation

import Datable
import Text

public class BellCompiler
{
    let className: Text
    let output: URL
    let program: BellProgram

    public init(source: URL, output: URL) throws
    {
        self.className = Text(fromUTF8String: source.deletingPathExtension().lastPathComponent)
        self.output = output
        let parser = BellParser()
        self.program = try parser.generateBellProgram(url: source)
    }

    public init(className: Text, source: Text, output: URL) throws
    {
        self.className = className
        self.output = output.appendingPathComponent("\(className.toUTF8String()).ino")
        let parser = BellParser()
        self.program = try parser.generateBellProgram(source: source)
    }

    public func compile() throws
    {
        let inosource = try generateIno(self.className, self.program)
        print(inosource)
        print("Writing \(className.toUTF8String()).ino...")
        try inosource.write(to: self.output, atomically: true, encoding: .utf8)

        print("Done.")
    }

    public func generateIno(_ className: Text, _ program: BellProgram) throws -> String
    {
        let maybeSetupHandler = program.handlers.first
        {
            handler in

            handler.eventName == "setup"
        }

        let setupHandlerText: String
        if let setupHandler = maybeSetupHandler
        {
            setupHandlerText = self.generateHandlerText(setupHandler)
        }
        else
        {
            setupHandlerText = ""
        }

        let maybeLoopHandler = program.handlers.first
        {
            handler in

            handler.eventName == "loop"
        }

        let loopHandlerText: String
        if let loopHandler = maybeLoopHandler
        {
            loopHandlerText = self.generateHandlerText(loopHandler)
        }
        else
        {
            loopHandlerText = ""
        }

        return """
        #include <Arduino.h>
        #include "Audio.h"

        \(program.instances.map { self.generateInstance($0) }.joined(separator: "\n"))
        
        \(program.flow2s.enumerated().map { self.generateFlow2s($0) }.joined(separator: "\n"))

        void setup()
        {
        \(setupHandlerText)
        }

        void loop()
        {
        \(loopHandlerText)
        }
        """
    }

    public func generateInstance(_ instance: ModuleInstance) -> String
    {
        let moduleName = instance.module.toUTF8String().replacingOccurrences(of: ":", with: "")
        let instanceName = instance.instanceName.toUTF8String()
        return """
        \(moduleName) \(instanceName);
        \(moduleName)Module \(instanceName)Module(&\(instanceName));
        \(moduleName)Universe \(instanceName)Universe(&\(instanceName)Module);
        """
    }

    public func generateFlow2s(_ tuple: (Int, Flow2)) -> String
    {
        let (index, flow) = tuple

        return """
        AudioConnection connection\(index)a(\(flow.moduleA.toUTF8String()), 0, \(flow.moduleB.toUTF8String()), 0);
        AudioConnection connection\(index)b(\(flow.moduleA.toUTF8String()), 1, \(flow.moduleB.toUTF8String()), 1);
        """
    }

    public func generateHandlerText(_ handler: EventHandler) -> String
    {
        return handler.block.sentences.map { self.generateSentenceText($0) }.joined(separator: "\n")
    }

    public func generateSentenceText(_ sentence: Sentence) -> String
    {
        // The semicolon goes here, once per sentence, so that chained phrases form a single C++ statement.
        return "    \(sentence.subject.instanceName.toUTF8String())Universe.\(self.generatePhrases(sentence.phrases));"
    }

    public func generatePhrases(_ phrases: [Phrase]) -> String
    {
        if phrases.count == 0
        {
            return ""
        }
        else if phrases.count == 1
        {
            return self.generatePhrase(phrases[0])
        }
        else
        {
            return phrases.map { self.generatePhrase($0) }.joined(separator: ".")
        }
    }

    public func generatePhrase(_ phrase: Phrase) -> String
    {
        // No trailing semicolon: phrases are joined with "." to chain method calls.
        return "\(phrase.verb.name.toUTF8String())(\(self.generateArguments(phrase.arguments)))"
    }

    public func generateArguments(_ arguments: [Argument]) -> String
    {
        if arguments.count == 0
        {
            return ""
        }
        else if arguments.count == 1
        {
            return self.generateArgument(arguments[0])
        }
        else
        {
            return arguments.map { self.generateArgument($0) }.joined(separator: ", ")
        }
    }

    public func generateArgument(_ argument: Argument) -> String
    {
        switch argument
        {
            case .name(let value):
                return value.toUTF8String()

            case .literal(let literal):
                switch literal
                {
                    case .float(let value):
                        return "\(value)"

                    case .int(let value):
                        return "\(value)"
                }
        }
    }
}

There's not really a lot to say about this code. If you've been following the other posts, then you will be familiar with the style at this point. It's just string interpolation combined with functional operations like first(where:), map, and joined. We walk the data structure that represents a Bell program, generating text first for the leaves and then combining that into text for the branches. At the root, the top-level program, the largest string interpolation assembles the complete Arduino sketch. Altogether, we're under 200 lines.
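The leaves-to-branches idea can be shown in isolation with a stripped-down sketch. The SketchPhrase and SketchSentence types and the three functions here are simplified, hypothetical stand-ins that use plain String instead of Text; they are not the compiler's actual types. The sketch assumes arguments are already rendered as C++ text.

```swift
// Simplified, hypothetical stand-ins for the parser's data types.
struct SketchPhrase {
    let verb: String
    let arguments: [String]
}

struct SketchSentence {
    let subject: String
    let phrases: [SketchPhrase]
}

// Leaf: one phrase becomes one C++ method call, without a trailing semicolon.
func generateCall(_ phrase: SketchPhrase) -> String {
    let arguments = phrase.arguments.joined(separator: ", ")
    return "\(phrase.verb)(\(arguments))"
}

// Branch: a sentence becomes one statement on the subject's Universe object.
// Chained phrases are joined with "." and the statement is terminated once.
func generateStatement(_ sentence: SketchSentence) -> String {
    let calls = sentence.phrases.map { generateCall($0) }.joined(separator: ".")
    return "    \(sentence.subject)Universe.\(calls);"
}

// Root: a handler becomes a C++ function whose body is the joined statements.
func generateFunction(name: String, sentences: [SketchSentence]) -> String {
    let body = sentences.map { generateStatement($0) }.joined(separator: "\n")
    return "void \(name)()\n{\n\(body)\n}"
}
```

Each function only knows how to render its own level and delegates the rest to the level below it, so adding a new kind of node means adding one more small function rather than touching the others.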

Here is the Arduino code generated by our compiler for the example program above:

#include <Arduino.h>
#include "Audio.h"

AudioControlSGTL5000 codec;
AudioControlSGTL5000Module codecModule(&codec);
AudioControlSGTL5000Universe codecUniverse(&codecModule);
AudioSynthToneSweep tonesweep;
AudioSynthToneSweepModule tonesweepModule(&tonesweep);
AudioSynthToneSweepUniverse tonesweepUniverse(&tonesweepModule);
AudioOutputI2S i2s;
AudioOutputI2SModule i2sModule(&i2s);
AudioOutputI2SUniverse i2sUniverse(&i2sModule);

AudioConnection connection0a(tonesweep, 0, i2s, 0);
AudioConnection connection0b(tonesweep, 1, i2s, 1);

void setup()
{
    codecUniverse.enable();
    codecUniverse.volume(0.5, 0.5);
    tonesweepUniverse.play(0.8, 256, 512, 10);
}

void loop()
{

}

If I were you, dear reader, I might be skeptical about our progress and the claims of what we've accomplished. Surely this is just a toy? Well, there is a lot of missing functionality, which we will be adding over time. Right now we have just enough capabilities to do an Arduino Audio "hello world", which is the tone sweep demo. I believe that what we have constitutes a syntax, parser, compiler, and virtual machine, even if their construction is unfamiliar to experts. I think that we saved a lot of code by ignoring certain historical aspects of how those things are usually defined. The main limitation of our system right now is that our virtual machine can ONLY call effects. There are no computations. Also, you can currently only chain together existing effects; you can't write your own. These are significant limitations, to be sure! However, to expand our language we need motivating examples. We're not going to just add language features because they seem cool and hope that they end up being useful. So in the next post we'll expand the language with a slightly more complex example providing a tremolo effect.

Code Generation for Arduino Audio - Writing a Smalltalk to C++ compiler