Effective Smalltalk, with Functional Combinators
In my last post, I talked about a lot of ideas I've been mulling over for the last year in programming language design. Now, I'm going to try to synthesize all of those ideas into something coherent, comprehensive, elegant, and simple. That's the idea, anyway!
To start with, we're going to need effects, because effects are how we do things. I'm going to skip effect sequencers for now, and just talk about code that has a single effect. The most straightforward effect is "return", which returns a value. Yes, return is an effect! It's not a side effect, but it's still an effect. Effects live outside of our program, so what we are actually creating in our program is an effect descriptor, which the runtime will translate into a real effect. The way we make an effect descriptor is with an effect constructor, which is just a data structure constructor function, like this:
return 1.
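For readers more at home in a mainstream language, here is a rough Python model of the line above. This is only a sketch: the names `Return` and `run` are made up for illustration and are not part of the proposed language. It only tries to show that the program builds a plain data structure and a separate runtime step turns it into the actual effect.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Return:
    """Effect descriptor: plain data that describes a 'return' effect."""
    value: Any


def run(effect: Any) -> Any:
    """Hypothetical runtime: translates a descriptor into the real effect."""
    if isinstance(effect, Return):
        return effect.value
    raise TypeError(f"unknown effect descriptor: {effect!r}")


descriptor = Return(1)    # models `return 1.`; nothing has happened yet
result = run(descriptor)  # the runtime performs the effect, giving 1
```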
"return" is our constructor, and it takes some value, the type of which depends on the constructor. return is pretty permissive as to what sort of values it will take. Here we are giving it a literal value, but we could give it any expression that evaluates to a value of an appropriate type.
Every value is an object, which just means that its type contains a namespace of functions which take the value as an implicit first parameter. The flavor of the language is going to be highly influenced by what sort of values are supported and the functions included in their namespaces. This post, however, is just about the overall syntax and general semantics, so we'll just kind of make up some reasonable functions as we go along.
return 2 squared. -- 4
Unlike in APL, in Smalltalk style we are going mostly left to right. The value is on the left and we select a function from its namespace on the right. Whether or not we keep going depends on the type of the result. The squared function doesn't need any additional arguments, so we're done, and return 2 squared, which is 4.
return 2 plus 3. -- 5
Here the plus function requires another parameter, so it consumes another value from the right.
return 4 divide 2. -- 2
return 4 divide swap 2. -- 0.5
Here we have two different paths we can go down after "4 divide", which gives us the divide function on the number type, with an implicit first argument of 4. Divide needs another parameter, so we can provide it, such as 2. Alternatively, functions are objects, so we can look in divide's namespace and pull out the swap function, which takes divide as an implicit first argument and yields a new function, which is divide with the order of its arguments swapped. We then supply the missing argument 2 to finish off the expression.
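One way to picture the two paths is the Python sketch below. It is only an approximation under my own assumptions: `Method` is a hypothetical class standing in for "a function from the number namespace with its implicit first argument already attached", and `divide` is an ordinary helper. The point is that `swap` lives on the function object itself and produces a new function with the argument order reversed.

```python
class Method:
    """Models `4 divide`: a two-argument function plus its implicit first argument."""

    def __init__(self, func, receiver):
        self.func = func
        self.receiver = receiver

    def __call__(self, other):
        # Supplying the missing argument completes the call.
        return self.func(self.receiver, other)

    def swap(self):
        # A new function whose two arguments trade places; the receiver stays attached.
        return Method(lambda a, b: self.func(b, a), self.receiver)


def divide(a, b):
    return a / b


four_divide = Method(divide, 4)   # `4 divide`
plain = four_divide(2)            # `4 divide 2`      -> 2.0
swapped = four_divide.swap()(2)   # `4 divide swap 2` -> 0.5
```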
Let's now introduce a couple of extra bits of syntax. The "." at the end of an expression means that we will be providing no more arguments. Additionally, if we want to disambiguate the precedence in an expression, we can use a separator with a higher precedence than space, such as dash.
return 4 divide. -- {x in 4 divide x}
return 4 divide-swap 2. -- 0.5
Separator precedence is a trick to avoid parentheses, because I don't like them. It also can act like a bind operator:
return 4-divide. -- {x in 4 divide x}
return 2 divide-swap. -- {x in x divide 2}
return 4-divide-2. -- 2
return 4 divide 2-plus-2. -- 1
What about arrays? Of course, arrays are objects and follow the same syntax, with some helpful additions.
return 1, 2, 3. -- [1, 2, 3]
return 1, 2, 3 map plus-1. -- [2, 3, 4]
return 1, 2, 3 reduce plus. -- 6
Here , is the array constructor (like catenate, the dyadic form of , in APL), which is a function in the namespace of the number type, and yields an array of the same number type. We also provide , on arrays to append additional numbers. This lets you chain , to build a literal array. Array has additional functions on it, such as map and reduce, which take as arguments functions drawn from the namespace of the number type for the specific array.
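As a loose illustration, the chained "," and the array functions could be modeled in Python like this. The classes `Num` and `Array` and the method name `comma` are hypothetical stand-ins, and the lambda plays the role of `plus-1`; none of this is meant as the actual implementation.

```python
from functools import reduce
import operator


class Array:
    """Hypothetical array type with its own namespace of functions."""

    def __init__(self, items):
        self.items = list(items)

    def comma(self, item):
        # `,` on an array appends one more element.
        return Array(self.items + [item])

    def map(self, fn):
        return Array(fn(x) for x in self.items)

    def reduce(self, fn):
        return reduce(fn, self.items)


class Num:
    """Hypothetical number type; `,` on a number starts an array."""

    def __init__(self, value):
        self.value = value

    def comma(self, item):
        return Array([self.value, item])


literal = Num(1).comma(2).comma(3)       # `1, 2, 3`
mapped = literal.map(lambda x: x + 1)    # `1, 2, 3 map plus-1`  -> [2, 3, 4]
total = literal.reduce(operator.add)     # `1, 2, 3 reduce plus` -> 6
```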
An issue with this syntax for list literals is that it doesn't provide an obvious way to, say, concatenate two list literals together. However, this can be achieved with the subexpression token ":".
return 1, 2, 3 concatenate: 1, 2, 3.
The : means to evaluate the expression on the right first and then use it as a parameter to the function on the left. We can also use ":" in a couple of other ways. It can be used to define named values (including functions) and for quoting.
x: 1.
return x. -- 1
return 1, 2, 3 map: plus 1 -- [2, 3, 4]
When we have a freestanding expression using ":", it is a definition. When we have a subexpression that is not terminated by ".", it is quoted. Defined terms can be any value, including functions. When a defined function name is used, that name overrides the lookup in the value's type's namespace, but it is generally bad form to shadow existing function names.
Let's get a bit more into arrays. The "," operator on numbers creates an array of that number type. However, what if we want to create some other sort of traversable data structure? Let's just say for the sake of the example that tuples and arrays are different types. You can reuse "," by starting with an empty tuple and then appending to it.
return tuple, 1, 2, 3. -- (1, 2, 3)
return set, 1, 2, 3. -- {1, 2, 3}
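In Python terms, the idea of one "," that keeps whatever container type sits on its left might look like the sketch below. The function name `comma` is hypothetical, and using Python's built-in `tuple` and `frozenset` as the empty starting values is my own assumption.

```python
from functools import reduce


def comma(container, item):
    """Hypothetical `,`: append to the container on the left, preserving its type."""
    if isinstance(container, tuple):
        return container + (item,)
    if isinstance(container, frozenset):
        return container | {item}
    raise TypeError(f"`,` is not defined for {type(container).__name__}")


# Start from an empty container and keep appending, left to right.
as_tuple = reduce(comma, [1, 2, 3], ())          # models `tuple, 1, 2, 3`
as_set = reduce(comma, [1, 2, 3], frozenset())   # models `set, 1, 2, 3`
```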
The last bit of housekeeping is that in my last post I talked about removing the primacy of function application from the syntax, but what happened here? Application seems to still be the primary operation implied by juxtaposition. Actually, this is not so. When a value is followed by a juxtaposed function name, it looks up the function in the value's type's namespace. When a function is followed by a value, it curries the function, removing one parameter and yielding a new function. So the meaning of an expression is either a value, if you started with a value and didn't chain any functions, or else it's a function. It is the "." operator that actually applies the function: if all of its parameters have been filled, it yields a value; otherwise it yields the partially filled function. Therefore, without "." the result of most expressions is going to actually be a function and not a value. This is what allows chaining of functions rightward. We are just building up function calling chains, without executing them, until we reach ".".
return 1 + 2 -- { 1 + 2 }
return 1 + 2. -- 3
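To make the difference between those two lines concrete, here is a small Python model. It rests on my own assumptions: `Pending`, `feed`, and `dot` are hypothetical names, and the arity is passed in by hand. Juxtaposing a value only fills a parameter slot, and nothing runs until `dot` is called.

```python
class Pending:
    """A function with some arguments filled in but not yet applied."""

    def __init__(self, func, arity, args=()):
        self.func = func
        self.arity = arity
        self.args = tuple(args)

    def feed(self, value):
        # Juxtaposing a value curries: one fewer parameter remains.
        return Pending(self.func, self.arity, self.args + (value,))

    def dot(self):
        # Models `.`: apply if every parameter is filled, otherwise stay a function.
        if len(self.args) == self.arity:
            return self.func(*self.args)
        return self


plus = Pending(lambda a, b: a + b, arity=2)

unapplied = plus.feed(1).feed(2)   # `1 + 2`  : still a function, nothing has run
applied = unapplied.dot()          # `1 + 2.` : applied, giving 3
partial = plus.feed(4).dot()       # like `4 divide.` : a parameter is missing, so still a function
```

Calling `partial.feed(1).dot()` then supplies the missing argument and finally produces a value.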
So that's my syntax for universal computation in a nutshell. In order to fill in the necessary parameters for a given effect, we start with a value and then alternate looking up functions for that type of value and filling in function parameters with more values. We finally end up with a function or a value. Most things are handled by the function namespaces on value types. There are a few bits of syntax. "," is actually just a function, so the syntactic bits are just "." for application and ":" for the three types of subexpressions.
This is just a first draft at putting together the ideas from my last post into a cohesive proposal. I'm sure that you will find issues, both obvious and subtle, with this approach. Some of these will need to be revealed through writing practical working examples of real code. Also, a lot of the language is missing because I have not provided a rich set of types and functions on those types.
In my next post, I'll talk about one of the big things missing in this one: effect sequencers, which will let us combine multiple effects in one function.