A Simple Syntax for Data Structures

In my last post, I discussed how I've been working on a new type system, mostly for the purpose of serialization of data structures to send over the wire between applications written in different programming languages. In this post I'll go further into how we can use a similar syntax to actually make some user-defined data structures and write programs using those data structures.

First, we need a way to distinguish type definitions from function definitions. We'll use the "type" keyword. It's not really a reserved word; it's just a function, though one that is evaluated at compile time instead of run time. The type function needs a name for the type and then a definition. The definition is just like a function definition, except that the available functions are type constructors. There are just a few basic type constructors, and of course you can have user-defined ones as well.

type Number Varint.
type True Singleton.
type False Singleton.
type Boolean Enum True False.
type Flags Record verbose:Boolean turbo:Boolean.
type Coords Record Number Number.
type Configuration Record Flags Coords.

That's it! This is exactly the syntax we defined earlier for functions; it's just that now we start with the "type" value, which has a namespace of type constructor functions. We've added one new feature to the syntax here, which is the use of ":" as a higher-precedence separator. This allows us to group together tuples of names and types when making records with named fields, without the use of parentheses.
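
To make the role of ":" concrete, here is a minimal Python sketch of how one definition line could be split into a name, a type constructor, and a list of fields. The function name parse_type_definition is hypothetical, and the sketch assumes that a definition ends with a period and that there is no whitespace around ":". It is an illustration of the grouping rule, not the actual parser.

```python
def parse_type_definition(line):
    """Parse one definition such as 'type Coords Record Number Number.'"""
    # Assumption: a trailing period terminates the definition.
    words = line.strip().rstrip(".").split()
    keyword, name, constructor, *args = words
    if keyword != "type":
        raise ValueError("expected a definition starting with 'type'")
    fields = []
    for arg in args:
        # ':' binds tighter than whitespace, so 'verbose:Boolean' is one pair.
        label, sep, type_name = arg.partition(":")
        fields.append((label, type_name) if sep else (None, arg))
    return name, constructor, fields


flags = parse_type_definition("type Flags Record verbose:Boolean turbo:Boolean.")
coords = parse_type_definition("type Coords Record Number Number.")
```

Named fields come back as (label, type) pairs and unnamed fields as (None, type), so no parentheses are needed to tell the two apart.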

So now that we've defined these types, how do we use them? Type definitions just define run-time constructor functions.

y: Number.
z: Number 1.
flag: Boolean True.
position: Coords 0 2.
config: Configuration True False 0 2.

Note that constructors are just functions. If you don't provide a value, you end up with a function that is still waiting for a value to type. If you have a complex, nested type, you don't need to provide nested constructors, just the values. Nested types are just functions that take more values.
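
The idea that a constructor without a value is still a function can be sketched in Python. This is only an analogy: the helper constructor, its arity argument, and the tuple representation of values are all assumptions made for the example.

```python
def constructor(name, arity):
    """Make a hypothetical run-time constructor that takes `arity` values."""
    def construct(*values):
        if len(values) > arity:
            raise TypeError(f"{name} takes {arity} values, got {len(values)}")
        if len(values) < arity:
            # Not enough values yet: return a function waiting for the rest.
            return lambda *more: construct(*values, *more)
        # Fully applied: represent the typed value as a tagged tuple.
        return (name, *values)
    return construct


Number = constructor("Number", 1)
Coords = constructor("Coords", 2)

y = Number()             # like `y: Number.`, still waiting for a value
z = Number(1)            # like `z: Number 1.`
position = Coords(0, 2)  # like `position: Coords 0 2.`
```

Here y stays callable until it receives its value, while z and position are already complete.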

Before using a typed value in a function, we need to fill in any missing values so that it evaluates to a value.

return z -- 1

y: 1
return y -- 1

return position -- Coords 0 2

Note that we don't need nested expressions to make nested types or values that are instances of those types. We get nesting in the type definitions by naming each subtype. There are no anonymous types. We get nesting in values by flattening the constructor function to take a straight list of values instead of needing to build an expression tree.
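
Flattening can be sketched as a constructor that walks the type definitions and pulls values off a flat list as it goes. The TYPES table, the build and construct helpers, and the use of strings for True and False are assumptions made for this Python illustration, not part of the type system itself.

```python
# Hypothetical type table mirroring the definitions in this post.
TYPES = {
    "Number": ("varint",),
    "Boolean": ("enum", "True", "False"),
    "Flags": ("record", "Boolean", "Boolean"),
    "Coords": ("record", "Number", "Number"),
    "Configuration": ("record", "Flags", "Coords"),
}


def build(type_name, values):
    """Consume values from a flat iterator to build a nested value."""
    kind, *parts = TYPES[type_name]
    if kind == "record":
        # Each field pulls as many flat values as its own type needs.
        return (type_name, *[build(part, values) for part in parts])
    try:
        value = next(values)
    except StopIteration:
        raise TypeError(f"not enough values for {type_name}") from None
    if kind == "enum" and value not in parts:
        raise ValueError(f"{value!r} is not a member of {type_name}")
    return value


def construct(type_name, *flat_values):
    """Build a value of `type_name` from a straight list of values."""
    values = iter(flat_values)
    result = build(type_name, values)
    leftover = list(values)
    if leftover:
        raise TypeError(f"too many values for {type_name}: {leftover}")
    return result


# Like `config: Configuration True False 0 2.`
config = construct("Configuration", "True", "False", 0, 2)
```

The caller passes four flat values, and the nesting is recovered entirely from the named subtypes in the table.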

That's it: easy data structure definitions that support sum and product types. In my next post I'll be going into more detail about how you can write functions with user-defined data structures.