One of my first and most memorable graphical programs was an implementation of John Conway’s Game of Life. At the time, it took the form of a Java applet. I’ve revisited it periodically in different programming languages, including several years ago when I started to implement the Game of Life in Factor – something I’ve always wanted to write about.
The Game of Life is a two-dimensional grid of square cells with fairly simple logic. Each cell can be either live or dead. Each cell interacts with its eight neighboring cells, with the following rules determining the next state of the game board:
- Any live cell with fewer than two live neighbors dies, as if by underpopulation.
- Any live cell with two or three live neighbors lives on to the next generation.
- Any live cell with more than three live neighbors dies, as if by overpopulation.
- Any dead cell with exactly three live neighbors becomes a live cell, as if by reproduction.
You can run this in any release since Factor 0.98:
IN: scratchpad "game-of-life" run
And it will look something like this:
Let’s go ahead and build it!
We will model our two-dimensional game board as an array of arrays. And in particular, since each cell has only two states, we will use bit-arrays to reduce the memory requirements by efficiently storing the state, one bit for each cell.
: <grid> ( rows cols -- grid )
'[ _ <bit-array> ] replicate ;
: grid-dim ( grid -- rows cols )
[ length ] [ first length ] bi ;
Making a random grid, which is useful in testing:
: random-grid ( rows cols -- grid )
'[ _ { t f } ?{ } randoms-as ] replicate ;
And a word we can use for debugging, to print a grid out:
: grid. ( grid -- )
[ [ CHAR: # CHAR: . ? ] "" map-as print ] each ;
Some implementations choose to make the game boards infinite, but we are instead going to build a wraparound game board. This allows, for example, a glider shape to fly off the bottom right and then re-appear on the top left of the board, which is a lot more fun to watch.
A useful word calculates adjacent indices for a cell, wrapping at a max value of rows or columns:
:: adjacent-indices ( n max -- n-1 n n+1 )
n [ max ] when-zero 1 -
n
n 1 + dup max = [ drop 0 ] when ;
Test it out, showing how it might work in a hypothetical 10 x 10 grid:
! in the middle
IN: scratchpad 3 10 adjacent-indices 3array .
{ 2 3 4 }
! at the start, wrapped around
IN: scratchpad 0 10 adjacent-indices 3array .
{ 9 0 1 }
! at the end, wrapped around
IN: scratchpad 9 10 adjacent-indices 3array .
{ 8 9 0 }
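For readers who don’t speak Factor, the same wraparound logic can be sketched in Python – the function name and the modulo formulation here are mine, not from the vocabulary:

```python
def adjacent_indices(n, size):
    """Return the previous, current, and next index, wrapping at the grid edges."""
    return ((n - 1) % size, n, (n + 1) % size)

# The same hypothetical 10 x 10 grid as above:
print(adjacent_indices(3, 10))  # (2, 3, 4) -- in the middle
print(adjacent_indices(0, 10))  # (9, 0, 1) -- at the start, wrapped around
print(adjacent_indices(9, 10))  # (8, 9, 0) -- at the end, wrapped around
```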
The main game logic requires counting neighbors for each cell. Since each cell can have 8 neighbors, we can store this count in a half-byte – a nibble – which can hold the values [0..15]. In the batteries-included standard library, we have a nibble-arrays vocabulary that makes this easy.
The simplest implementation would just iterate across the game board, and for each cell that is live, increment the count for the neighboring indices around it:
:: count-neighbors ( grid -- counts )
grid grid-dim :> ( rows cols )
rows [ cols <nibble-array> ] replicate :> neighbors
grid [| row j |
j rows adjacent-indices
[ neighbors nth ] tri@ :> ( above same below )
row [| cell i |
cell [
i cols adjacent-indices
[ [ above [ 1 + ] change-nth ] tri@ ]
[ nip [ same [ 1 + ] change-nth ] bi@ ]
[ [ below [ 1 + ] change-nth ] tri@ ]
3tri
] when
] each-index
] each-index neighbors ;
Then the last piece of game logic we need adjusts the grid cells according to the rules – making some cells transition from live to dead, and others from dead to live, based on their state and their neighbor counts.
:: next-step ( grid -- )
grid count-neighbors :> neighbors
grid [| row j |
j neighbors nth :> neighbor-row
row [| cell i |
i neighbor-row nth
cell [
2 3 between? i row set-nth
] [
3 = [ t i row set-nth ] when
] if
] each-index
] each-index ;
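If it helps to see the whole algorithm in one place, here is a rough Python sketch of both words – using nested lists of booleans instead of bit-arrays and nibble-arrays, with names of my own choosing:

```python
def count_neighbors(grid):
    """Scatter a +1 into the count of every neighbor of each live cell."""
    rows, cols = len(grid), len(grid[0])
    counts = [[0] * cols for _ in range(rows)]
    for j, row in enumerate(grid):
        for i, cell in enumerate(row):
            if cell:
                for dj in (-1, 0, 1):
                    for di in (-1, 0, 1):
                        if di or dj:  # skip the cell itself
                            counts[(j + dj) % rows][(i + di) % cols] += 1
    return counts

def next_step(grid):
    """Apply the Game of Life rules in place."""
    counts = count_neighbors(grid)
    for j, row in enumerate(grid):
        for i, cell in enumerate(row):
            n = counts[j][i]
            row[i] = (2 <= n <= 3) if cell else (n == 3)

# A horizontal "blinker" oscillates to vertical:
grid = [[False] * 5 for _ in range(5)]
for i in (1, 2, 3):
    grid[2][i] = True
next_step(grid)  # column 2, rows 1-3 are now the live cells
```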
Before we move on to creating a graphical user interface for the game, let’s try it out in the Factor listener:
! Create a random 10x10 grid
IN: scratchpad 10 10 random-grid
! Print it out
IN: scratchpad dup grid.
#..#..#.##
##....####
..###.####
.##...#..#
.##....###
..###..#.#
...###.#..
.###....##
#...###.##
.##..#.#..
! Compute the neighbors for each cell
IN: scratchpad dup count-neighbors .
{
N{ 5 5 4 1 2 3 4 6 5 5 }
N{ 5 3 4 4 3 4 4 7 7 7 }
N{ 6 5 4 3 1 4 4 6 6 5 }
N{ 5 4 5 5 2 3 3 6 7 4 }
N{ 5 4 5 5 2 2 3 3 5 3 }
N{ 3 3 4 5 4 3 4 3 6 2 }
N{ 3 3 6 6 5 2 3 2 5 3 }
N{ 4 2 3 4 6 5 4 4 4 4 }
N{ 4 5 5 4 3 3 3 4 4 4 }
N{ 5 3 2 3 4 4 5 4 5 6 }
}
! Compute the next generation
IN: scratchpad dup next-step
! Print it out
IN: scratchpad dup grid.
.....#....
.#..#.....
...#......
.....##...
......##.#
##...#.#.#
##...###.#
.##.......
....###...
.###......
It works!
In Factor, one of the ways we can build user interfaces is using gadgets and OpenGL rendering instructions. We start by modeling our game as a gadget with a grid object, a size that specifies the rendered pixels-per-cell, and a timer to control the speed of repainting new generations.
TUPLE: grid-gadget < gadget grid size timer ;
Our default gadget will have cells that are 20 pixels square, and repaint 10 times per second:
: <grid-gadget> ( grid -- gadget )
grid-gadget new
swap >>grid
20 >>size
dup '[ _ dup grid>> next-step relayout-1 ]
f 1/10 seconds <timer> >>timer ;
Gadgets are grafted onto the render hierarchy, and then later ungrafted when they are removed. We handle that state change by stopping the timer before delegating to the parent to clean up further:
M: grid-gadget ungraft*
[ timer>> stop-timer ] [ call-next-method ] bi ;
The default dimension for our gadget is the grid dimension times the pixel size:
M: grid-gadget pref-dim*
[ grid>> grid-dim swap ] [ size>> '[ _ * ] bi@ 2array ] bi ;
If the grid size
changes – for example, by using the mouse scroll wheel to
zoom in or out – we can create and store a new grid, keeping the cells that
are visible in the same state they were in:
:: update-grid ( gadget -- )
gadget dim>> first2 :> ( w h )
gadget size>> :> size
h w [ size /i ] bi@ :> ( new-rows new-cols )
gadget grid>> :> grid
grid grid-dim :> ( rows cols )
rows new-rows = not cols new-cols = not or [
new-rows new-cols <grid> :> new-grid
rows new-rows min [| j |
cols new-cols min [| i |
i j grid nth nth
i j new-grid nth set-nth
] each-integer
] each-integer
new-grid gadget grid<<
] when ;
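Stripped of the gadget plumbing, the resize logic is just “allocate a new grid and copy the overlapping region”. A Python sketch (names mine):

```python
def resize_grid(grid, new_rows, new_cols):
    """Return a new grid, preserving the cells visible in both sizes."""
    new_grid = [[False] * new_cols for _ in range(new_rows)]
    for j in range(min(len(grid), new_rows)):
        for i in range(min(len(grid[0]), new_cols)):
            new_grid[j][i] = grid[j][i]
    return new_grid
```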
We can draw the cells that are live as black
squares:
:: draw-cells ( gadget -- )
COLOR: black gl-color
gadget size>> :> size
gadget grid>> [| row j |
row [| cell i |
cell [
i j [ size * ] bi@ 2array { size size } gl-fill-rect
] when
] each-index
] each-index ;
And then draw the gray
lines that define the grid of cells:
:: draw-lines ( gadget -- )
gadget size>> :> size
gadget grid>> grid-dim :> ( rows cols )
COLOR: gray gl-color
cols rows [ size * ] bi@ :> ( w h )
rows 1 + [| j |
j size * :> y
{ 0 y } { w y } gl-line
] each-integer
cols 1 + [| i |
i size * :> x
{ x 0 } { x h } gl-line
] each-integer ;
Putting this together, we draw our gadget by updating the grid, drawing the cells, and drawing the lines:
M: grid-gadget draw-gadget*
[ update-grid ] [ draw-cells ] [ draw-lines ] tri ;
And, with the “visual REPL”, you can directly render the grid gadget, to see it work:
We now need to build the interactive parts. Let’s first start by handling a
click, to toggle the state of a cell, and storing which state it was toggled
to in the last-click
variable:
SYMBOL: last-click
:: on-click ( gadget -- )
gadget grid>> :> grid
gadget size>> :> size
grid grid-dim :> ( rows cols )
gadget hand-rel first2 [ size /i ] bi@ :> ( i j )
i 0 cols 1 - between?
j 0 rows 1 - between? and [
i j grid nth
[ not dup last-click set ] change-nth
] when gadget relayout-1 ;
That allows us to build a drag feature where, as we drag, we continue to either set cells to live or dead according to what the first click was doing:
:: on-drag ( gadget -- )
gadget grid>> :> grid
gadget size>> :> size
grid grid-dim :> ( rows cols )
gadget hand-rel first2 [ size /i ] bi@ :> ( i j )
i 0 cols 1 - between?
j 0 rows 1 - between? and [
last-click get i j
grid nth set-nth
gadget relayout-1
] when ;
We implement a scrolling feature to adjust the size
of the rendered cells:
: on-scroll ( gadget -- )
[
scroll-direction get second {
{ [ dup 0 > ] [ -2 ] }
{ [ dup 0 < ] [ 2 ] }
[ 0 ]
} cond nip + 4 30 clamp
] change-size relayout-1 ;
And we store these as "gestures"
that are supported by the gadget:
grid-gadget "gestures" [
{
{ T{ button-down { # 1 } } [ on-click ] }
{ T{ drag { # 1 } } [ on-drag ] }
{ mouse-scroll [ on-scroll ] }
} assoc-union
] change-word-prop
The last bit we need is to make the toolbar, which has a few commands we can run:
:: com-play ( gadget -- )
gadget timer>> restart-timer ;
:: com-stop ( gadget -- )
gadget timer>> stop-timer ;
:: com-clear ( gadget -- )
gadget dup grid>> [ clear-bits ] each relayout-1 ;
:: com-random ( gadget -- )
gadget dup grid>> [ [ drop { t f } random ] map! drop ] each relayout-1 ;
:: com-glider ( gadget -- )
gadget dup grid>> :> grid
{ { 2 1 } { 3 2 } { 1 3 } { 2 3 } { 3 3 } }
[ grid nth t -rot set-nth ] assoc-each relayout-1 ;
:: com-step ( gadget -- )
gadget dup grid>> next-step relayout-1 ;
And then store these as the "toolbar"
command map:
grid-gadget "toolbar" f {
{ T{ key-down { sym "1" } } com-play }
{ T{ key-down { sym "2" } } com-stop }
{ T{ key-down { sym "3" } } com-clear }
{ T{ key-down { sym "4" } } com-random }
{ T{ key-down { sym "5" } } com-glider }
{ T{ key-down { sym "6" } } com-step }
} define-command-map
And finally, we can wrap the grid gadget with something that makes a toolbar, and creates a main window when launched:
TUPLE: life-gadget < track ;
: <life-gadget> ( -- gadget )
vertical life-gadget new-track
20 20 <grid> <grid-gadget>
[ <toolbar> format-toolbar f track-add ]
[ 1 track-add ] bi ;
M: life-gadget focusable-child* children>> second ;
MAIN-WINDOW: life-window
{ { title "Game of Life" } }
<life-gadget> >>gadgets ;
As with anything, there are probably things we could continue to improve in our UI framework, but one of the biggest missing pieces is examples of working code, which is largely what motivated writing about this today.
Check it out!
And maybe think about how you might adjust it to use an infinite game board, increase performance when computing the next generation, improve the OpenGL rendering logic, persist the game board between launches, or do things like communicate the age of each cell through the color it is rendered with.
Recently, Chris Siebenmann was lamenting the lack of a good command line way to sort IPv6 addresses. This followed a post of his a few years ago about how sort -V can easily sort IPv4 addresses. Since I had some fun talking about sorting Roman numerals recently – and we have an extensive standard library – I thought I’d talk about how you might solve this problem with Factor.
As a reminder, IPv6 uses a 128-bit address space with more theoretical addresses than the older – but still quite commonly used – IPv4 32-bit address space.
The internal network address of your computer is sometimes referred to as localhost or a loopback address, and represented as 127.0.0.1 in IPv4, or ::1 in IPv6. We have an ip-parser vocabulary with words for parsing and manipulating IP addresses as well as IP network strings written in CIDR notation. We can use these words to show how to translate these addresses to their byte representation:
IN: scratchpad "127.0.0.1" parse-ipv4 .
B{ 127 0 0 1 }
IN: scratchpad "::1" parse-ipv6 .
{ 0 0 0 0 0 0 0 1 }
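If you want to double-check those representations outside of Factor, Python’s standard ipaddress module exposes the same information – though note that parse-ipv6 above yields eight 16-bit groups, while .packed yields sixteen bytes:

```python
import ipaddress

print(ipaddress.ip_address("127.0.0.1").packed)  # b'\x7f\x00\x00\x01'
print(ipaddress.ip_address("::1").packed.hex())  # sixteen bytes, ending in 01
```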
And, we could use that to sort a list of addresses pretty easily:
IN: scratchpad {
"127.0.0.1"
"1.1.1.1"
"8.8.8.8"
"192.168.10.40"
} [ parse-ipv4 ] sort-by .
{
"1.1.1.1"
"8.8.8.8"
"127.0.0.1"
"192.168.10.40"
}
IN: scratchpad {
"2620:0:1cfe:face:b00c::3"
"2001:4860:4860::8844"
"2620:0:ccc::2"
"::1"
"2001:4860:4860::8888"
} [ parse-ipv6 ] sort-by .
{
"::1"
"2001:4860:4860::8844"
"2001:4860:4860::8888"
"2620:0:ccc::2"
"2620:0:1cfe:face:b00c::3"
}
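For comparison, the same numeric sort is a one-liner in Python, using ipaddress.ip_address as the sort key:

```python
import ipaddress

addrs = [
    "2620:0:1cfe:face:b00c::3",
    "2001:4860:4860::8844",
    "2620:0:ccc::2",
    "::1",
    "2001:4860:4860::8888",
]
# ip_address objects compare numerically, so sorted() does the right thing.
for a in sorted(addrs, key=ipaddress.ip_address):
    print(a)
```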
And so, now that some great feedback encouraged us to do command-line eval with auto-use? enabled, we can run this easily as a one-line script:
# make a file full of unsorted IPv6 addresses
$ cat <<EOF > ips.txt
2620:0:1cfe:face:b00c::3
2001:4860:4860::8844
2620:0:ccc::2
::1
2001:4860:4860::8888
EOF
# show that you can parse the file as strings
$ cat ips.txt | ./factor -e="read-lines ."
{
"2620:0:1cfe:face:b00c::3"
"2001:4860:4860::8844"
"2620:0:ccc::2"
"::1"
"2001:4860:4860::8888"
}
# sort and print the sorted output
$ cat ips.txt | ./factor -e="read-lines [ parse-ipv6 ] sort-by [ print ] each"
::1
2001:4860:4860::8844
2001:4860:4860::8888
2620:0:ccc::2
2620:0:1cfe:face:b00c::3
Pretty cool!
Last week, Falk Hüffner wrote about making a leap year check in three instructions:
With the following code, we can check whether a year 0 ≤ y ≤ 102499 is a leap year with only about 3 CPU instructions:
bool is_leap_year_fast(uint32_t y) {
    return ((y * 1073750999) & 3221352463) <= 126976;
}
How does this work? The answer is surprisingly complex. This article explains it, mostly to have some fun with bit-twiddling; at the end, I’ll briefly discuss the practical use.
This is how a leap year check is typically implemented:
bool is_leap_year(uint32_t y) {
    if ((y % 4) != 0) return false;
    if ((y % 100) != 0) return true;
    if ((y % 400) == 0) return true;
    return false;
}
It would be fun to see how that works in Factor and compare the relative performance between a simple version and the new super-fast-highly-optimized 3 instruction version. To do that, we can use the benchmark word to record execution time by calling it repeatedly and returning an average time-per-call in nanoseconds:
TYPED: average-benchmark ( n: fixnum quot -- nanos-per-call )
over [ '[ _ _ times ] benchmark ] dip /f ; inline
Note: We are forcing the iteration loop above to be fixnum
to reduce its
overhead, and due to the design of the benchmark words below, are going to have
code blocks with predictable inputs. Testing your program with random inputs is
also important to see the impact of CPU optimizations such as cache and branch
predictions, or across multiple CPU architectures. Performance is also impacted
by use of code generation features such as
inline and
compiler steps such as dead-code elimination. Benchmarking is hard.
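The shape of average-benchmark – time a tight loop of n calls, then divide – translates directly to other languages. Here is a rough Python equivalent (names mine), benchmarking a leap-year check:

```python
import time

def average_benchmark(n, f):
    """Average nanoseconds per call of f, measured over n calls."""
    start = time.perf_counter_ns()
    for _ in range(n):
        f()
    return (time.perf_counter_ns() - start) / n

def is_leap_year(y):
    return y % (400 if y % 100 == 0 else 4) == 0

nanos = average_benchmark(1_000_000, lambda: is_leap_year(2028))
print(f"{nanos:.1f} ns/call")
```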
The simple – and typical – implementation can be easily written as:
: leap-year? ( year -- ? )
dup 100 divisor? 400 4 ? divisor? ;
And in fact, that’s how it is implemented in the standard library.
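If the stack shuffling is hard to read, the same conditional-divisor trick – check divisibility by 400 for century years, and by 4 otherwise – can be written in Python as:

```python
def is_leap_year(y):
    # Centuries must be divisible by 400; other years only by 4.
    return y % (400 if y % 100 == 0 else 4) == 0
```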
We can write a quick benchmarking word. This ensures we are using the optimizing compiler and also asserts that the result of the word is as expected:
: bench-leap-year ( n year ? -- nanos )
'[ _ leap-year? _ assert= ] average-benchmark ;
And then call it one hundred million times, to see how long it takes each call on average:
IN: scratchpad 100,000,000 2028 t bench-leap-year .
10.53904317
Just under 11 nanoseconds, including the loop and the assert…
The fast implementation suggested by Falk can be written directly as:
: fast-leap-year? ( year -- ? )
1073750999 * 3221352463 bitand 126976 <= ;
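It’s worth convincing yourself that the bit-twiddling version agrees with the straightforward one across its stated input range. Because the mask is below 2^32, masking the full-precision Python product gives the same bits as the wrapped uint32 multiply in C, so a brute-force check is easy:

```python
def is_leap_year(y):
    return y % (400 if y % 100 == 0 else 4) == 0

def fast_leap_year(y):
    # Masking keeps only the low 32 bits, matching C's uint32 wraparound.
    return (y * 1073750999) & 3221352463 <= 126976

# The formula is claimed to hold for 0 <= y <= 102499.
assert all(fast_leap_year(y) == is_leap_year(y) for y in range(102500))
```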
And then write a benchmarking word:
: bench-fast-leap-year ( n year ? -- nanos )
'[ _ fast-leap-year? _ assert= ] average-benchmark ;
And see how long it takes:
IN: scratchpad 100,000,000 2028 t bench-fast-leap-year .
4.74783302
Just under 5 nanoseconds…
Well, generally Factor supports arbitrarily large integers by allowing them to implicitly promote from word-sized fixnum to bignum on overflow. And, as they say, you can write the C programming language in any language.
A faster implementation might check the input is a fixnum
and then force
math without overflow:
TYPED: faster-leap-year? ( year: fixnum -- ? )
1073750999 fixnum*fast 3221352463 fixnum-bitand 126976 fixnum<= ;
And write a benchmark word:
: bench-faster-leap-year ( n year ? -- nanos )
'[ _ faster-leap-year? _ assert= ] average-benchmark ;
It’s a bit faster:
IN: scratchpad 100,000,000 2028 t bench-faster-leap-year .
3.24267145
Just over 3 nanoseconds…
But, to make sure we use the fewest instructions possible, we can make it slightly less safe by declaring the input to be a fixnum, avoiding the run-time type checks. This could cause issues if it is called with other types on the stack.
: fastest-leap-year? ( year -- ? )
{ fixnum } declare
1073750999 fixnum*fast 3221352463 fixnum-bitand 126976 fixnum<= ;
And write a benchmark word:
: bench-fastest-leap-year ( n year ? -- nanos )
'[ _ fastest-leap-year? _ assert= ] average-benchmark ;
And then you can see it gets quite fast indeed:
IN: scratchpad 100,000,000 2028 t bench-fastest-leap-year .
2.82150476
Just under 3 nanoseconds!
But, is it also just 3 instructions?
IN: scratchpad \ fastest-leap-year? disassemble
000075f0afa19490: 89056a5bd1fe mov [rip-0x12ea496], eax
000075f0afa19496: 498b06 mov rax, [r14]
000075f0afa19499: 48c1f804 sar rax, 0x4
000075f0afa1949d: 4869c0d7230040 imul rax, rax, 0x400023d7
000075f0afa194a4: bb0ff001c0 mov ebx, 0xc001f00f
000075f0afa194a9: 4821d8 and rax, rbx
000075f0afa194ac: 4881f800f00100 cmp rax, 0x1f000
000075f0afa194b3: b801000000 mov eax, 0x1
000075f0afa194b8: 48bb5c0e388cf0750000 mov rbx, 0x75f08c380e5c
000075f0afa194c2: 480f4ec3 cmovle rax, rbx
000075f0afa194c6: 498906 mov [r14], rax
000075f0afa194c9: 8905315bd1fe mov [rip-0x12ea4cf], eax
000075f0afa194cf: c3 ret
Pretty close!
There is an extra instruction near the beginning to untag our fixnum input. Due to the convention around handling booleans in Factor, there are a couple of extra instructions at the end for converting the result into a return value of either t or f.
And it could get even faster if either the assert= was removed, or the code was made inline so the function prologue and epilogue could be elided into the outer scope.
So much fun.
Raylib is a very neat C library that has become popular as a “simple and easy-to-use library to enjoy videogames programming”. Originally released in 2014, it has seen many updates, with the latest version 5.5 representing 11 years of development since the original version 1.0 release.
You can sense the love this library has received from the extensive feature list:
- NO external dependencies, all required libraries included with raylib
- Multiplatform: Windows, Linux, MacOS, RPI, Android, HTML5… and more!
- Written in plain C code (C99) using PascalCase/camelCase notation
- Hardware accelerated with OpenGL (1.1, 2.1, 3.3, 4.3 or ES 2.0)
- Unique OpenGL abstraction layer: rlgl
- Powerful Fonts module (SpriteFonts, BMfonts, TTF, SDF)
- Multiple texture formats support, including compressed formats (DXT, ETC, ASTC)
- Full 3d support for 3d Shapes, Models, Billboards, Heightmaps and more!
- Flexible Materials system, supporting classic maps and PBR maps
- Animated 3d models supported (skeletal bones animation)
- Shaders support, including Model shaders and Postprocessing shaders
- Powerful math module for Vector, Matrix and Quaternion operations: raymath
- Audio loading and playing with streaming support (WAV, OGG, MP3, FLAC, XM, MOD)
- VR stereo rendering support with configurable HMD device parameters
- Huge examples collection with +120 code examples!
- Bindings to +60 programming languages!
- Free and open source. Check [LICENSE].
In 2019, we first included support for version 2.5 of raylib. Over the years, we have updated our bindings to include new functions and features of new versions. Our most recent two releases, Factor 0.99 and Factor 0.100, included support for version 4.5. And in the current development cycle, we have updated to version 5.0 and then again to version 5.5.
It is possible to maintain support for multiple versions, but we have chosen for now to just target the most recent stable release as best we can. Generally, the raylib library has been quite stable, with only the occasional deprecation or breaking change, which have been easy to adjust for.
As a simple demonstration, we can start with the basic window example – an extremely simple program in the spirit of achieving a black triangle moment – to make sure everything works. We can directly translate this example into Factor using our Raylib bindings – which import the C functions using our convention for kebab-case word names.
USE: raylib
: basic-window ( -- )
800 450 "raylib [core] example - basic window" init-window
60 set-target-fps
[ window-should-close ] [
begin-drawing
RAYWHITE clear-background
"Congrats! You created your first window!"
190 200 20 LIGHTGRAY draw-text
end-drawing
] until close-window ;
And, if you run the example – you’ll end up with this:
Note: if you’re wondering why the background isn’t white, it’s because
RAYWHITE
is a special version of not-quite-white that matches the Raylib
logo.
Of course, your examples can become more complex. For example, Joseph Oziel released a fun game called BitGuessr written in Factor using Raylib. And, given the extensive feature list, your simple program written quickly for a game jam might end up including great audio, images, keyboard, mouse, gamepad, 3d rendering, and other features relatively easily!
I’d love to see more demos written in Factor and Raylib. Happy coding!
Some might describe shuffle words as one of the fundamental building blocks of Factor. Others might describe them as a code smell and seek to use dataflow combinators or other higher-level words to reduce code complexity.
Whatever your opinion is, they are useful concepts in a concatenative language. Besides the basic shuffle words – like dup, swap, and rot – we have had the shuffle vocabulary, which provides some “additional shuffle words”, for a while now, as well as a syntax word that can perform arbitrary shuffles:
IN: scratchpad USE: shuffle
IN: scratchpad { 1 2 3 4 5 } [
shuffle( a b c d e -- d e c a b )
] with-datastack .
{ 4 5 3 1 2 }
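Conceptually, an arbitrary shuffle is just a permutation keyed by the names on either side of the --. A Python sketch of that idea (this only illustrates the concept, not how Factor implements it):

```python
def shuffle(stack, spec):
    """Permute the top of the stack per an 'a b c -- c a' style spec."""
    inputs, outputs = (side.split() for side in spec.split("--"))
    top = dict(zip(inputs, stack[-len(inputs):]))
    return stack[:-len(inputs)] + [top[name] for name in outputs]

print(shuffle([1, 2, 3, 4, 5], "a b c d e -- d e c a b"))  # [4, 5, 3, 1, 2]
```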
This would be quite useful, except that it has had a fundamental issue – the way it is implemented uses a macro to curry the stack arguments into an array, and then pull the stack arguments back out of the array onto the stack in the requested order.
For example, we can look at a simple swap and a complex shuffle:
IN: scratchpad [ shuffle( x y -- y x ) ] optimized.
[
2 0 <array> dup >R >R R> 3 set-slot
R> >R R> dup >R >R R> 2 set-slot
R> >R R> dup >R >R R> >R R> 3 slot R> >R R> >R R> 2 slot
]
IN: scratchpad [ shuffle( a b c d e -- b a d c e ) ] optimized.
[
5 0 <array> dup >R >R R> 6 set-slot
R> >R R> dup >R >R R> 5 set-slot
R> >R R> dup >R >R R> 4 set-slot
R> >R R> dup >R >R R> 3 set-slot
R> >R R> dup >R >R R> 2 set-slot
R> >R R> dup >R >R R> >R R> 3 slot
R> dup >R >R R> >R R> 2 slot R> dup >R >R R> >R R> 5 slot
R> dup >R >R R> >R R> 4 slot R> >R R> >R R> 6 slot
]
And not only would this be less than efficient, it would also turn literal arguments that were on the stack into run-time arguments, potentially causing a Cannot apply 'call' to a run-time computed value error if one of the shuffled arguments is a quotation the caller hopes to use.
This bug was described on our issue tracker and I spent some time recently looking into it.
It turns out that we can use the stack
checker to
indicate that a shuffle is taking place, and use some "special"
machinery
to allow the optimizing compiler to generate efficient and correct code for
these arbitrary shuffles.
After applying a small fix, we can see that the earlier examples are now quite simple:
IN: scratchpad [ shuffle( x y -- y x ) ] optimized.
[ swap ]
IN: scratchpad [ shuffle( a b c d e -- b a d c e ) ] optimized.
[
( 10791205 10791206 10791207 10791208 10791209 -- 10791206 10791205 10791208 10791207 10791209 )
]
This is available in the latest development version.
I was distracted a little by some recent explorations building a Brainfuck interpreter in Factor and had a couple of follow-ups to add to the conversation.
First, I realized my initial quick-and-dirty Brainfuck interpreter didn’t
support nested loops. Specifically, the logic for beginning or ending a
loop just searched for the nearest [
or ]
character without
considering nesting. This was fixed
today
so that will no longer be an issue.
Second, despite the Brainfuck compiler implicitly making an AST (abstract syntax tree) for Brainfuck by virtue of generating a quotation, I thought it would be more fun to build and generate one intentionally.
We can model the Brainfuck commands as operations using the following tuples and singletons:
TUPLE: ptr n ;
TUPLE: mem n ;
SINGLETONS: output input debug ;
TUPLE: loop ops ;
Next, we can build a parser using EBNF to convert the textual commands into our Brainfuck AST:
EBNF: ast-brainfuck [=[
inc-ptr = (">")+ => [[ length ptr boa ]]
dec-ptr = ("<")+ => [[ length neg ptr boa ]]
inc-mem = ("+")+ => [[ length mem boa ]]
dec-mem = ("-")+ => [[ length neg mem boa ]]
output = "." => [[ output ]]
input = "," => [[ input ]]
debug = "#" => [[ debug ]]
space = [ \t\n\r]+ => [[ f ]]
unknown = (.) => [[ "Invalid input" throw ]]
ops = inc-ptr|dec-ptr|inc-mem|dec-mem|output|input|debug|space
loop = "[" {loop|ops}+ "]" => [[ second sift loop boa ]]
code = (loop|ops|unknown)* => [[ sift ]]
]=]
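The same coalescing parse is straightforward to sketch in Python, using plain tuples for the AST nodes. Everything here – names and node shapes – is my own invention for illustration, and it coalesces runs of +-<> slightly more aggressively than the EBNF above, which only merges runs of the same character:

```python
def parse_brainfuck(src, i=0, toplevel=True):
    """Parse Brainfuck into ('ptr', n), ('mem', n), 'output', 'input',
    and ('loop', body) nodes, coalescing adjacent pointer/memory ops."""
    ops = []
    steps = {">": ("ptr", 1), "<": ("ptr", -1),
             "+": ("mem", 1), "-": ("mem", -1)}
    while i < len(src):
        c = src[i]
        if c in steps:
            kind, d = steps[c]
            if ops and isinstance(ops[-1], tuple) and ops[-1][0] == kind:
                ops[-1] = (kind, ops[-1][1] + d)  # coalesce with previous op
            else:
                ops.append((kind, d))
            i += 1
        elif c == ".":
            ops.append("output"); i += 1
        elif c == ",":
            ops.append("input"); i += 1
        elif c == "[":
            body, i = parse_brainfuck(src, i + 1, False)
            ops.append(("loop", body))
        elif c == "]":
            if toplevel:
                raise ValueError("Invalid input")
            return ops, i + 1
        elif c.isspace():
            i += 1
        else:
            raise ValueError("Invalid input")
    if not toplevel:
        raise ValueError("Invalid input")  # unmatched "["
    return ops

print(parse_brainfuck("++[->+<]"))
```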
This is interesting, because now we can more easily analyze a piece of Brainfuck code, such as the Hello, World example that I have been frequently using:
IN: scratchpad "
++++++++++[>+++++++>++++++++++>+++>+<<<<-]
>++.>+.+++++++..+++.>++.<<+++++++++++++++
.>.+++.------.--------.>+.>.
" ast-brainfuck .
V{
T{ mem { n 10 } }
T{ loop
{ ops
V{
T{ ptr { n 1 } }
T{ mem { n 7 } }
T{ ptr { n 1 } }
T{ mem { n 10 } }
T{ ptr { n 1 } }
T{ mem { n 3 } }
T{ ptr { n 1 } }
T{ mem { n 1 } }
T{ ptr { n -4 } }
T{ mem { n -1 } }
}
}
}
T{ ptr { n 1 } }
T{ mem { n 2 } }
output
T{ ptr { n 1 } }
T{ mem { n 1 } }
output
T{ mem { n 7 } }
output
output
T{ mem { n 3 } }
output
T{ ptr { n 1 } }
T{ mem { n 2 } }
output
T{ ptr { n -2 } }
T{ mem { n 15 } }
output
T{ ptr { n 1 } }
output
T{ mem { n 3 } }
output
T{ mem { n -6 } }
output
T{ mem { n -8 } }
output
T{ ptr { n 1 } }
T{ mem { n 1 } }
output
T{ ptr { n 1 } }
output
}
And then we can implement those operations against a brainfuck
state
object, by deferring to words from our current
implementation:
GENERIC: op ( brainfuck op -- brainfuck )
M: ptr op n>> (>) ;
M: mem op n>> (+) ;
M: output op drop (.) ;
M: input op drop (,) ;
M: debug op drop (#) ;
M: loop op [ get-memory zero? ] swap ops>> '[ _ [ op ] each ] until ;
And now this Brainfuck AST represents a hybrid execution model somewhere between the compiled and interpreted versions:
: hybrid-brainfuck ( code -- )
[ <brainfuck> ] dip ast-brainfuck [ op ] each drop ;
And see that it works:
IN: scratchpad "
++++++++++[>+++++++>++++++++++>+++>+<<<<-]
>++.>+.+++++++..+++.>++.<<+++++++++++++++
.>.+++.------.--------.>+.>.
" hybrid-brainfuck
Hello World!
We also gain some potential for building code optimization techniques that operate on an AST as a step before actual compilation or execution – for example, coalescing adjacent increment and decrement operations or some other more complex analysis.
That, however, is likely to remain an exercise for the reader!
Almost 16 years ago, I wrote about implementing the Brainfuck programming language in Factor. It is a curious programming language, sometimes considered one of the most famous esoteric programming languages.
In any event – and encouraged by a question I was asked recently – I spent some time thinking about the current process of “compiling” the Brainfuck into quotations versus how an interpreter might work instead.
As a quick reminder, our current implementation expands a program written in Brainfuck into an equivalent form in Factor, and then allows it to be run:
IN: scratchpad USE: brainfuck
IN: scratchpad "
++++++++++[>+++++++>++++++++++>+++>+<<<<-]
>++.>+.+++++++..+++.>++.<<+++++++++++++++
.>.+++.------.--------.>+.>.
" run-brainfuck
Hello World!
IN: scratchpad [
"
++++++++++[>+++++++>++++++++++>+++>+<<<<-]
>++.>+.+++++++..+++.>++.<<+++++++++++++++
.>.+++.------.--------.>+.>.
" run-brainfuck
] expand-macros .
[
<brainfuck> 10 (+)
[ get-memory zero? ] [
1 (>) 7 (+) 1 (>) 10 (+) 1 (>) 3 (+) 1 (>) 1 (+) 4 (<)
1 (-)
] until 1 (>) 2 (+) (.) 1 (>) 1 (+) (.) 7 (+) (.) (.) 3 (+)
(.) 1 (>) 2 (+) (.) 2 (<) 15 (+) (.) 1 (>) (.) 3 (+)
(.) 6 (-) (.) 8 (-) (.) 1 (>) 1 (+) (.) 1 (>) (.) drop flush
]
Notice the coalescing that it performs to collapse multiple identical operators into a single action.
But, this got me curious about a couple of things:
- Could we use a parsing word to define named Brainfuck words directly in Factor?
- How would an interpreter compare in performance to the compiled approach?
Thankfully, the answer to the first question is simple using parsing words.
SYNTAX: BRAINFUCK:
scan-new-word ";" parse-tokens concat
'[ _ run-brainfuck ] ( -- ) define-declared ;
Now, we can define a Hello, World, complete with inline comments:
BRAINFUCK: hello
+++++ +++ ! Set Cell #0 to 8
[
>++++ ! Add 4 to Cell #1; this will always set Cell #1 to 4
[ ! as the cell will be cleared by the loop
>++ ! Add 4*2 to Cell #2
>+++ ! Add 4*3 to Cell #3
>+++ ! Add 4*3 to Cell #4
>+ ! Add 4 to Cell #5
<<<<- ! Decrement the loop counter in Cell #1
] ! Loop till Cell #1 is zero
>+ ! Add 1 to Cell #2
>+ ! Add 1 to Cell #3
>- ! Subtract 1 from Cell #4
>>+ ! Add 1 to Cell #6
[<] ! Move back to the first zero cell you find; this will
! be Cell #1 which was cleared by the previous loop
<- ! Decrement the loop Counter in Cell #0
] ! Loop till Cell #0 is zero
! The result of this is:
! Cell No : 0 1 2 3 4 5 6
! Contents: 0 0 72 104 88 32 8
! Pointer : ^
>>. ! Cell #2 has value 72 which is 'H'
>---. ! Subtract 3 from Cell #3 to get 101 which is 'e'
+++++ ++..+++. ! Likewise for 'llo' from Cell #3
>>. ! Cell #5 is 32 for the space
<-. ! Subtract 1 from Cell #4 for 87 to give a 'W'
<. ! Cell #3 was set to 'o' from the end of 'Hello'
+++.----- -.----- ---. ! Cell #3 for 'rl' and 'd'
>>+. ! Add 1 to Cell #5 gives us an exclamation point
>++. ! And finally a newline from Cell #6
;
And, it works!
IN: scratchpad hello
Hello World!
Note: we are using ! as a comment character, which is the convention in Factor. Some Brainfuck implementations use that character to indicate embedded program inputs.
That’s pretty cool, and a neat example of using the lexer, the parser, and macros.
The answer to the second question might be more complex and nuanced, but thankfully we can re-use some of the current implementation to make a quick-and-dirty interpreter:
: end-loop ( str i -- str j/f )
CHAR: ] swap pick index-from dup [ 1 + ] when ;
: start-loop ( str i -- str j/f )
1 - CHAR: [ swap pick last-index-from dup [ 1 + ] when ;
: interpret-brainfuck-from ( str i brainfuck -- str next/f brainfuck )
2over swap ?nth [ 1 + ] 2dip {
{ CHAR: > [ 1 (>) ] }
{ CHAR: < [ 1 (<) ] }
{ CHAR: + [ 1 (+) ] }
{ CHAR: - [ 1 (-) ] }
{ CHAR: . [ (.) ] }
{ CHAR: , [ (,) ] }
{ CHAR: # [ (#) ] }
{ CHAR: [ [ get-memory zero? [ [ end-loop ] dip ] when ] }
{ CHAR: ] [ get-memory zero? [ [ start-loop ] dip ] unless ] }
{ f [ [ drop f ] dip ] }
[ blank? [ "Invalid input" throw ] unless ]
} case ;
: interpret-brainfuck ( str -- )
0 <brainfuck> [ interpret-brainfuck-from over ] loop 3drop ;
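For comparison, the same scan-and-jump interpreter strategy might look like this in Python – a sketch of mine that precomputes matching brackets up front (which also handles nested loops) instead of searching on every jump:

```python
def interpret_brainfuck(src, input_bytes=b""):
    """Interpret Brainfuck source; returns the output as a string."""
    # Precompute matching brackets so nested loops jump correctly.
    matches, stack = {}, []
    for i, c in enumerate(src):
        if c == "[":
            stack.append(i)
        elif c == "]":
            j = stack.pop()
            matches[i], matches[j] = j, i
    tape, ptr, pc, inp, out = [0] * 30000, 0, 0, iter(input_bytes), []
    while pc < len(src):
        c = src[pc]
        if c == ">": ptr += 1
        elif c == "<": ptr -= 1
        elif c == "+": tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-": tape[ptr] = (tape[ptr] - 1) % 256
        elif c == ".": out.append(chr(tape[ptr]))
        elif c == ",": tape[ptr] = next(inp, 0)
        elif c == "[" and tape[ptr] == 0: pc = matches[pc]
        elif c == "]" and tape[ptr] != 0: pc = matches[pc]
        pc += 1
    return "".join(out)

hello = """
++++++++++[>+++++++>++++++++++>+++>+<<<<-]
>++.>+.+++++++..+++.>++.<<+++++++++++++++
.>.+++.------.--------.>+.>.
"""
print(interpret_brainfuck(hello), end="")  # Hello World!
```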
And give it a try:
IN: scratchpad "
++++++++++[>+++++++>++++++++++>+++>+<<<<-]
>++.>+.+++++++..+++.>++.<<+++++++++++++++
.>.+++.------.--------.>+.>.
" interpret-brainfuck
Hello World!
It works! But now I’m curious about relative performance. Let’s build a silly benchmark equivalent to cat to redirect the contents of an input-stream to an output-stream. We can compare our compiled run-brainfuck macro to a version that uses the interpreter we made above, and then to a native version implemented using stream-copy*.
: cat1 ( -- ) ",[.,]" run-brainfuck ;
: cat2 ( -- ) ",[.,]" interpret-brainfuck ;
: cat3 ( -- ) input-stream get output-stream get stream-copy* ;
First, we should make sure they all work:
IN: scratchpad "Compiled" [ cat1 ] with-string-reader
Compiled
IN: scratchpad "Interpreted" [ cat2 ] with-string-reader
Interpreted
IN: scratchpad "Factor" [ cat3 ] with-string-reader
Factor
Okay, so it seems to work!
For quick performance testing, let’s compare them outputting to a null stream:
IN: scratchpad [
1,000,000 CHAR: a <string> [
[ cat1 ] with-null-writer
] with-string-reader
] time
Running time: 0.059820291 seconds
IN: scratchpad [
1,000,000 CHAR: a <string> [
[ cat2 ] with-null-writer
] with-string-reader
] time
Running time: 0.13840325 seconds
IN: scratchpad [
1,000,000 CHAR: a <string> [
[ cat3 ] with-null-writer
] with-string-reader
] time
Running time: 0.015008417 seconds
The compiled one is a bit more than 2x faster than the interpreted version, but both are slower than the native version.
Let’s try comparing our “Hello, World” example – where the operator coalescing that the compiled version does might help:
: hello1 ( -- )
"
++++++++++[>+++++++>++++++++++>+++>+<<<<-]
>++.>+.+++++++..+++.>++.<<+++++++++++++++
.>.+++.------.--------.>+.>.
" run-brainfuck ;
: hello2 ( -- )
"
++++++++++[>+++++++>++++++++++>+++>+<<<<-]
>++.>+.+++++++..+++.>++.<<+++++++++++++++
.>.+++.------.--------.>+.>.
" interpret-brainfuck ;
We can see that the compiled one is now more like 7x faster:
IN: scratchpad [ [ 10,000 [ hello1 ] times ] with-null-writer ] time
Running time: 0.018075292 seconds
IN: scratchpad [ [ 10,000 [ hello2 ] times ] with-null-writer ] time
Running time: 0.133718416 seconds
Obviously, this is somewhat comparing apples and oranges because it’s ignoring compilation time in the comparison, and I didn’t spend any time on optimizing the interpreted version – for example, stripping blanks or validating inputs before doing the interpreter loop – but it’s a useful starting point for understanding tradeoffs.
How long does it currently take to compile?
IN: scratchpad [
[
gensym [
"
++++++++++[>+++++++>++++++++++>+++>+<<<<-]
>++.>+.+++++++..+++.>++.<<+++++++++++++++
.>.+++.------.--------.>+.>.
" run-brainfuck
] ( -- ) define-declared
] with-compilation-unit
] time
Running time: 0.02294075 seconds
…a bit more time than it takes to call it 10,000 times. Interesting.
Factor has included support for Roman numerals since at least 2008. Sometimes recent events – such as the election of Pope Leo XIV or Super Bowl LIX – revive modern interest in how they work and how to work with them computationally.
There was a blog post a few days ago about sorting Roman numerals which pointed out that sorting them alphabetically worked pretty well. Given that we have a pretty good roman vocabulary, I thought we could explore different methods of sorting a sequence of strings representing Roman numbers.
Let’s first try the mostly-but-not-entirely-correct method suggested in the blog post:
IN: scratchpad 10 [1..b) [ >ROMAN ] map
IN: scratchpad sort .
{ "I" "II" "III" "IV" "IX" "V" "VI" "VII" "VIII" }
Well, that’s almost correct, but IX – the number 9 – sorts in the middle, rather than at the end. Could we fix this by using sort-by to convert each string to a number before calling compare to produce a sorted output?
IN: scratchpad 10 [1..b) [ >ROMAN ] map
IN: scratchpad [ roman> ] sort-by .
{ "I" "II" "III" "IV" "V" "VI" "VII" "VIII" "IX" }
That fixes it, but how often do we end up calling the conversion function?
IN: scratchpad 10 [1..b) [ >ROMAN ] map
IN: scratchpad SYMBOL: calls
IN: scratchpad [ calls inc roman> ] sort-by .
{ "I" "II" "III" "IV" "V" "VI" "VII" "VIII" "IX" }
IN: scratchpad calls get .
40
Wow, 40 times – that seems like a lot!
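The count makes sense: sort-by converts both operands on every comparison, and a comparison sort performs on the order of n log n comparisons. Here is a hypothetical Python sketch of the same convert-per-comparison behavior, using functools.cmp_to_key; the exact count depends on the sort algorithm, but it is always more than one conversion per element:

```python
from functools import cmp_to_key

ROMAN = {"I": 1, "V": 5, "X": 10}

def roman_to_int(s):
    # subtractive rule: a smaller value before a larger one is subtracted
    total = 0
    for a, b in zip(s, s[1:] + " "):
        v = ROMAN[a]
        total += -v if ROMAN.get(b, 0) > v else v
    return total

calls = 0

def compare(x, y):
    global calls
    calls += 2                  # both operands converted, on every comparison
    return roman_to_int(x) - roman_to_int(y)

numerals = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]
result = sorted(numerals, key=cmp_to_key(compare))
print(result)   # already in value order, so unchanged
print(calls)    # more conversions than elements
```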
Perhaps we could try the Schwartzian transform – also known as decorate-sort-undecorate – at the expense of using intermediate storage by saving the keys for each element, then sorting, and then returning only the values:
IN: scratchpad 10 [1..b) [ >ROMAN ] map
IN: scratchpad [ [ roman> ] keep ] map>alist sort-keys values .
{ "I" "II" "III" "IV" "V" "VI" "VII" "VIII" "IX" }
That seems like a lot of code to do a simple thing. Instead, we can use the map-sort abstraction which implements the same method:
IN: scratchpad 10 [1..b) [ >ROMAN ] map
IN: scratchpad [ roman> ] map-sort .
{ "I" "II" "III" "IV" "V" "VI" "VII" "VIII" "IX" }
Does this make much of a difference? Let’s compare each method:
: roman-sort1 ( seq -- sorted-seq ) [ roman> ] sort-by ;
: roman-sort2 ( seq -- sorted-seq ) [ roman> ] map-sort ;
We can time sorting a list of 100,000 random Roman numbers under 1,000:
IN: scratchpad 100,000 1,000 [1..b] randoms [ >ROMAN ] map
IN: scratchpad [ [ roman-sort1 ] time ]
[ [ roman-sort2 ] time ] bi
Running time: 4.164076625 seconds
Running time: 0.154227583 seconds
You can see that the decorate-sort-undecorate pattern is quite a bit faster in this case – roughly 27x. This is not always true; the advantage depends on how much work the key function is doing.
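For comparison, the decorate-sort-undecorate pattern guarantees the key function runs exactly once per element. A hypothetical sketch in Python, whose sorted builtin implements the same decoration via its key argument:

```python
ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(s):
    # subtractive rule: a smaller value before a larger one is subtracted
    total = 0
    for a, b in zip(s, s[1:] + " "):
        v = ROMAN[a]
        total += -v if ROMAN.get(b, 0) > v else v
    return total

calls = 0

def key(s):
    global calls
    calls += 1                  # each element decorated exactly once
    return roman_to_int(s)

numerals = ["X", "IX", "IV", "XL", "C", "I"]
sorted_numerals = sorted(numerals, key=key)
print(sorted_numerals)  # ['I', 'IV', 'IX', 'X', 'XL', 'C']
print(calls)            # 6 -- exactly once per element
```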
planet-factor is an Atom/RSS aggregator that collects the contents of Factor-related blogs. It is inspired by Planet Lisp.