Golfrun tutorial: a basic yet long appetizer

Golfrun is a stack oriented esoteric language. If you know already GolfScript you are some steps forward but beware: Golfrun is not GolfScript.

In stack oriented languages, operands are pushed on the stack and popped by the operators. Since operators pop elements from the stack, operands must be pushed before, and so we obtain the postfix notation and not the more common and known infix notation.

Other worth-knowing stack oriented languages are Forth, PostScript, Factor.

The stack

The stack is the Last In First Out list where you put operands in. Of course operands on stack can be swapped, rotated, discarded, copied...

Stack operators
Stack beforeOperatorStack afterDescription
A ; drop the top most object on stack
A . A A duplicate top most object
A B C @ B C A rotate three elements
Nth ... N $ Nth ... Nth duplicate the Nth element deep in the stack; note that N must be an integer operand that is popped by $.
A B \ B A swap first two objects on top of the stack

Objects on stack

The stable language specification 1.1 has a new type (complex integer); the rest of this tutorial does not consider this type, since it was written before complex numbers were added

Golfrun has several kinds of objects that can be pushed on stack:

Operators behave differently according to the kind of type they operate on. We've already seen the $ operator; it acts like the duplicate the n-th element only when it pops an integer from the stack!

We can say that operators are overloaded.

What about a little bit of math?

Let's see what kind of math can be done with Golfrun. As said before, Golfrun handles only integers, but with arbitrary precision. In order to obtain, from the operators here listed, the described behaviour, the operands must be integers. We underline this using the letters I, J, K...

Arithmetics operators
Stack beforeOperatorStack afterDescription
I J + K sum two integers; K is I+J.
I J - K subtract two integers; K is I-J.
I J * K multiply two integers; K is I*J.
I J / K divide two integers; K is the integer part of I/J.
I J % K compute I mod J, i.e. K is the rest of the integer division I/J.
I J ? K K is IJ
I ) K increment I: K is I+1
I ( K decrement I: K is I-1

Since Golfrun uses arbitrary precision computation, we can do 11 256 ? (11256) and obtain the correct value!

The increment and decrement operators are shorter than 1+ and 1- which do the same thing: Golfrun inherits those shortcuts from GolfScript that, as said, was created to win code golf.

Integer literals can be decimal 42, hexadecimal 0xff, octal 0o144, binary 0b100100. The operator base converts an integer on the stack in any base you want, pushing as result an array having as items the digits in that base (expressed in base 10). E.g.

      987 16 base

gives the array [3 13 11]: if we use D as 13, and B as 11, we see that we have 0x3DB, and in fact 987 in hexadecimal is 3DB.

I can't live with and without strings

Strings are the usual way to keep text, though Golfrun strings can hold arbitrary binary data. When a GUI is missing, strings are likely what allows you to communicate something to the user.

Golfrun has two literal forms for a string.

The operator print does what you expect: print the string to the standard output. Indeed, it can print integers too, arrays and blocks.

Thus "Hello world!\n" print is the HelloWorld program in Golfrun.

It is worth remembering, sometimes, that strings are nothing but sequences of characters and characters are nothing but interpreted bytes. (Golfrun is encoding agnostic, more or less...)

Strings operators
Stack beforeOperatorStack afterDescription
S1 S2 + S the resulting string S is the concatenation of S1 and S2, in that order
S1 ) S I the operator tail extracts the last byte from the string, which is then shortened by 1 (S), and push its integer value on the stack (I)
S1 ( S I the operator head extracts the first byte from the string, which is then shortened by 1 (S), and push its integer value on the stack (I)
S print print the string
S puts print the string followed by a newline
S1 N < S if N is positive, S is the substring made of the first N characters; if N is negative, S is the substring made of all the characters but the last N.
S1 N > S if N is positive, S is the substring made of all the characters but the first N; if N is negative, S is the substring made of last N characters.
S1 N < S if N is positive, S is the substring made of the first N characters; if N is negative, S is the substring made of all the characters but the last N.
S1 N * S S is a string made of N repetition of S1
S , N N is the length of the string S

Let's see few examples (note everything written after # up to the newline, is a comment and it is ignored)

      "hello\n" print               # print "hello" on a line
      "hello" " World" + puts       # print "hello World" on a line
      "hello\n" ) puts puts         # print 10 and "hello" on a line
      'hello\n' ) puts puts         # print 110 and "hello\" on a line
      'hello'2<puts                 # print "he"
      'hello'-2<puts                # print "hel"
      'hello'2>puts                 # print "llo"
      'hello'-2>puts                # print "lo"
      '0'100*puts                   # print a string of 100 '0'
      '0'100*,puts                  # print the integer 100

In the last examples I've started to shrink the code a bit: Golfrun is not fooled by the absence of spaces when they are not necessary!

Let's interpret the last line of code:

And, we could wonder, what 100'0'* does? It's the same of '0'100* i.e. the order is not important: the operator * pops two objects, and when these objects are an integer and a string, irrespective of the order, the meaning of the operator becomes repeat the string a number of times specified by the integer.

Other operators behaves like this (i.e. their meaning is determined by the objects they must handle, no matter of the order), but there are also operators that checks the first popped object to know if they must pop another one or that behaves in a way rather than another only if the first popped object is of a certain type.

Many Golfrun's binary operators have a meaning depending on the types of the objects they pop, irrespective of their actual order

We've already seen stack operators; these operators, because of their own nature, can't consider the type of the objects they pop.

Arrays

An array is a collection of objects; array elements need no to be homogeneous; so you can put into an array strings, other arrays, blocks, integers.

Array begins with [ that is not just a syntactical character as e.g. quotes "" for strings: the [ pushs a non-object (a marker) on the stack. The marker is invisible to operators that pop objects. The ] makes Golfrun search for the nearest marker, collecting objects found so far, and popping them all and replacing them with the array object.

The marker is a real object (non-object) that can't be popped but by a matching ], and is invisible as already said for other operators; thus, consider as examples

Operations on arrays
Stack beforeOperatorStack afterDescription
[A] [B] + [A B] concatenate two arrays
[H1 T] ( [T] H1 take away the head (first item) of the array; T is the tail, that is zero, one or more items; if you are familiar with other languages, you could recognize this syntax: [H|T]
[H L] ) [H] L take away the tail (Last item) of the array; H is a list of zero, one or more items
A $ R sort the items of the array A
A N < A1 if N is positive, A1 is the subarray made of the first N items of the array A; if N is negative, A1 is the subarray made of all the items but the last N
A N > A1 if N is positive, A1 is the subarray made of all the items but the first N; if N is negative, A1 is the subarray made of last N items
A1 A2 * Ar put A2 between every items of A1 (note: this means here order matters!)
A N * Ar repeat the items of A N times (preserving the original order)
A N = E E is the Nth element (from the head when N≥0, from the bottom when N<0) in the array
A1 A2 = N compare two arrays; N=1 (true) means they are the same; otherwise, N is 0
A1 A2 - R remove from A1 all the items appearing in A2
A1 A2 & R setwise and, i.e. intersection: R contains all items that are in common between A1 and A2 (no repeated)
A1 A2 | R setwise or, i.e. union: R contains all the items from A1 and A2, without repetitions
A1 A2 ^ R symmetric set difference: R contains all the items from A1 and A2 which are not in common
A1 N % A select items for which the index mod N is zero; if N is negative, count index from the end

If you experiment (too much), you could discover that

To finish this section, let's see another useful operators, the comma applied to an integer:

Way(s) to generate an array
Stack beforeOperatorStack afterDescription
P , [0 .. P-1] create an array of P integer items ranging from 0 to P-1

I need a block to build

Blocks are intended to contain code. Blocks can be nested. Blocks are not evaluated when parsed, so if they fail, they will fail at runtime, when executed.

E.g. {getAnObject +} contains the symbol getAnObject that needs not to be defined when the block is built. If the block is executed before that definition occurs, then there's an error in your code...

Blocks are important in order to be able to loop or to execute code conditionally.

Looping and conditionals
Stack beforeOperatorStack afterDescription
C {T} {F} if ? if C is true, the block T will be executed; otherwise (else) the block F is executed; the final stack depends on what the blocks actually do.
... {B} do ? B is executed; an object is popped; if it evaluates as true B is executed again, otherwise the loop ends
... {C} {B} while ? C is executed; an object is popped, if it evaluates as true B is executed, otherwise the loop ends
... {C} {B} until ? C is executed; an object is popped, if it evaluates as false B is executed, otherwise the loop ends

Looping could seem not so clear, but it is easy to become accustomed to. Moreover, those are not the only way we can loop; often we need to iterate over the elements of an array; or to loop from 0 to N (the classical for loop); if N is not too high, we can use the for each operator. Now, let's see what a block can do with an array and the right operators...

Block over an array
Stack beforeOperatorStack afterDescription
[A] {B} / ? for each: the array is popped; then each item of the array is pushed on stack, and the block is executed
[A] {B} % [R] apply a block (map): the array is popped; then each item of the array is pushed on stack, the block is executed and the objects produced by the block are put into an array, which is finally pushed on stack
[A] {B} , [R] filter: the array is popped; each item of the array is pushed on stack, the block is executed and if it evaluates to true, the item is collected, otherwise discarded; the resulting array is pushed on stack

Before we go on we have to explain what evaluates as true and what to false.

The logical negation operator ! can be used to show this, as in the following examples, where we see what we know so far in action.

Coercion

Currently Golfrun coercion is a litte bit messy. Anyway, the basic idea is to be able to coerce an object type into another, to allow some kind of operation, or obtain a transformation. An example is if we try to add a number to a string: the number is coerced into a string first, and then the + (concatenation) is performed.

So, "hello"1+ is the same as "hello" "1"+.

GolfScript has precise known rules for coercion. Golfrun currently, even preserving the basic facts, could result sometimes a little bit less respectful of any rule. The matter will be rationalized in future, keeping more or less the features shown in this tutorial.

Variables, binding, system services, type inspection

An object on top of the stack can be assigned to a symbol which, when used, will push that object on stack or, if the object is a block, will execute it.

Variables are global and share the same namespace of the operators, so that it is possible to redefine builtin operators.

Assignment begins with a : followed by a symbol token or textual token, i.e. [A-Za-z][A-Za-z]*. As you can see, numbers can't be used. The : is not a stack operator, it's more like the PostScript /Symbol, but it does not push on stack an object of type symbol: it assigns that symbol to the object on top of the stack instead.

The stack is not consumed, that is the object is not popped. If you need to discard it, you have to add ;.

Usually an assignment looks like :Symbol;.

Numbers can't be part of a symbol and this allows, in Golfrun, to shrink spaces a little bit more; in fact Symbol0 is made of two tokens, the Symbol and the number, and not by one single symbol token Symbol0. A single digit is not treated like a symbol, thus it can't be a symbol token and you can't bind to a number anything (in GolfScript, you can).

Special care is needed for _: if it appears first, it's a single character symbol, like *, + and so on. E.g. "hey":_whatever will assign hey to _, and whatever, if not defined, will be ignored (with an unknown symbol warning). But "hey":hey_; will assign the string to hey_.

As said, : is not an operator; but :: is, and precisely it is the assignment operator.

Assignment operator
Stack beforeOperatorStack afterDescription
O S :: assign (bind) the object O to the symbol contained in the string S

So, we lied a bit: we can redefine even 0, e.g. {.&}'0'::. But in the parser mouth, digits are not symbols like * or others, so the symbol 0 will be always the digit 0. So, the definition/binding is lost. Is this true? Not exactly: a symbol named 0 but it is not accessible writing simply 0 (this will push the integer 0 on stack).

We can access symbols the syntax does not allow by using the symbol lookup system service. System services are available in two different ways (mainly for historical reason). The difference is that the operator :: can't be redefined; the system service number 0 restores the default, so that if we have messed up the system, 0:: resets everything to its default.

Few system services
Args on stackHow to invokenameDescription and result
0:: reset reset the system;
S 5:: lookup lookup the symbol contained in the string S; if it exists, it is pushed on stack, then a 1 (true/success) is pushed; if it does not exist, a 0 is pushed on stack
S -5:: builtin lookup lookup symbols but in the built-in tables
-1:: halt halt execution

Don't forget it: :: is an operator, it pops objects from stack and decides how to behave. Since symbols made of numbers or :: can't be accessed directly, you'll be always able to push integers on stack and invoke ::, and so you'll be always able to reset the system (save when you have the GolfScript compatibility mode on! In this case, you could be unable to push integers on stack!!)

      {type"S"=}:?       # now ? check whether the obj on stack
                         # is a String
      2 8?               # so this won't compute 2^8 anymore;
                         # instead, it will push on stack 0 (false), since
                         # 8 is not a string
      ; 0 ::             # let's drop the extra object and reset
      ?                  # now, this will compute 2^8 correctly
      puts               # just show it!

      "zero"'0'::        # symbol 0 bound to the string "zero"
      0 1 * puts         # writes 0 (without GolfScript compat mode)
      '0'5::             # look up symbol '0'
        {puts}
        {'?'puts}
        if               # if it exists, writes what's on stack, otherwise
                         # write ? to output stream
      0::                # reset
      '0'5::;puts        # will push "zero" still, but
      0 puts             # will push 0, as usual

We have used the operator type to inspect the type of the object on top of the stack.

Type inspection
Stack beforeOperatorStack afterDescription
O type O T push a string containing a single letter string describing the type of the object found on stack: S (String), B (Block), I (Integer), A (Array), b (built-in), U (Unknown)

Numbers and operator :: can't be messed up in Golfrun: they will be always accessible

Miscellanea examples and scattered details

This tutorial has shown you a lot of things (and many more are missing), and it is already too long for a single page tutorial. Though, it gives you, in my opinion, a lot of fun to play with. The usage examples that follow are educational, showing what you've learned until now in practice, but adding also some other details, like assignments, block execution, object to string...

You can try these example in the command line interpreter, if you succedeed in compiling Golfrun interpreter!!

   {}!!                    # res: 0 i.e. false
   ""!!			   # res: 0 i.e. false
   []!!			   # res: 0 i.e. false
   0!!			   # res: 0
   {}! ""! []! 0!+++	   # res: 4 (here we can remove all spaces!)
   5,{puts}/		   # out: 0 1 2 3 4 (on separate lines)
   5,{)}%		   # increment each item by 1: [1 2 3 4 5]
   5`			   # object to string: res is the string "5"
   "hello"`		   # res: "\"hello\""
   5.			   # res: 5 5
   5{.}~		   # res: 5 5: the op ~ executes the block
   {`puts}:x;		   # the block gets the string representation of the
			   # object and print it in a line; the colon operator
			   # followed by a symbol assign to that symbol (x) the 
			   # object on the top of the stack, here the block
			   # the assignment does not consume the object on stack,
			   # so we drop it explicitly
   5x			   # out: "5": when a symbol is used and that symbol
			   #	  is a block, the block is automatically executed
   5p			   # p is already defined in Golfrun with {`puts}:p;
			   # so we obtain the same output
   31415:pi;		   # assign 31415 to symbol pi
   pi			   # push on stack 31415
   [4 3 2 1]$		   # res: [1 2 3 4]
   {.-1%=}:isPalindrome;   # define a "function" which determine if a
			   # string or array is palindrome
   "12321"isPalindrome	   # res: 1
   [1 2 1]isPalindrome	   # res: 1
   [1 2 2]isPalindrome	   # res: 0
   [1 1 1].&puts	       # out: [1]: remove duplicates performing a setwise
			   # and with itself; doing this Golfrun transforms
			   # the array into a set
   [97]""+p		   # out: "a"; this is an example of "coercion": the array
			   # is coerced into a string in order to perform the
			   # string concatenation; since the string is void,
			   # we obtain just the result of the coercion.
   [256]""+p		   # out: "\0" (that is, a string containing the byte zero)
			   #	  since the conversion is done modulo 256.

Let's some maybe useful code:

# when we bind in the "usual way" and use the symbol,
# if the bound object is a block, it is executed;
# otherwise, the object is pushed on stack.
# The following code create the "function" lku which
# behaves similarly
{ 
   5::            # look up symbol
    {             # if it exists,
      type"B"=    # check if it is a block
      {
        ~         # if it's a block, execute it
      }
      {           # else, do nothing (the object is on the stack)
      }
      if
    }
    {             # else if the symbol is unknown,
    }             # do nothing special
    if
}:lku;            # bind to lku and drop the block

# let's write it shorter:
{5::{type"B"={~}{}if}{}if}:lku;

# USAGE
{"example"puts}:x;
3141592:y;
'y'lku            # pushs 3141592
'x'lku            # executes the block i.e. write "example" to output

# what's up with the following?
'*'lku            # when * is not rebound by user, it is bound to
                  # a builtin; a "non object" is pushed, which has type
                  # "b" (built-in); it can be executed only, or dropped
# BUT, it can't be used to get the builtin meaning once the op it's redefined;
# we can use sys service -5 instead

Small examples doing useful things:

  "http://www.address.net"    # we have an address
  .                           # preserve the string
  .'://'?                     # find the index of the substring '://'
  .3+                         # dup the index and add to it the length of '://'
  @@                          # stack: "addr" idx+3 "addr" idx
  <                           # substr of "addr" made of first idx chars
			      # i.e. everything before the ://
  [                           # -> "addr" idx+3 "proto"
    @@       # -> "proto" "addr" idx+3
    >        # remove proto: the string after ://
    '\.'/    # split around dots obtaining ["www" "address" "net"]
    ~        # "unroll" array: all the items in the array are pushed as if "bare"
  ]          # -> ["http" "www" "address" "net"]

# just for fun, let's crunch the code after the string
..'://'?.3+@@<[@@>'\.'/~]

# but there's even a shorter way...
['://'/~'\.'/~]
# reformat the string as "2012/04/20 16:29:00"
  "20120420162900"
  4/            # split in group of 4
  [             # ["2012" "0420" "1629" "00"]
    ~           # unroll array
    +           # concat "1629" and "00"
    @[.]\;      # enclose "2012" into []
                # -> "0420" "162900" ["2012"]
    @2/+        # -> "162900" ["2012" "04" "20"]
    '/'*        # -> "162900" "2012/04/20"
    \2/':'*     # -> "2012/04/20" "16:29:00"
  ]' '*         # -> "2012/04/20 16:29:00"

# a longer way, can be shorterned of course, but it
# shows an interesting usage of % operator
  [4[2]4*~]{.@.@<@@>}%3/~':'*\'/'*' '+\+

The following last example solves this code golf contest, but I suggest to avoid running it with big values... memoization or iterative solution are possible too, of course. The shortest code (not in Golfrun) is 36 char long! (Very likely this can be shortened a lot)

','%n*~{.{\.{.(2$_@(@(\_+}{;;0}if}{;;1}if}:_~
    

Conclusions

Ok, you've now a sight broad enough to be hungry or angry instead, and decide if you'll try some coding (maybe golfing) with Golfrun or not.

You can study other examples in the examples directory on the SVN repository (or the tar.bz2 package, which is not up to date).