reflections on an apprenticeship

Written by Thomas Henderson
Published on 03 December 2017

I was an apprentice at 8th Light from July to November of 2017. It was a great experience, and I learned a lot. Here are a few notes on some takeaways.

Test-driven development (TDD)

One of the mentors told me, about the purpose of tests: “Tests are not there to ensure your program is correct. Tests are there to ensure your program does what you say it does.”

I like this. In the ongoing discussion over whether software is more like math, or more like literature, currently I’m on the literature side. (Surprised? I was!) But the way I see it right now: yes, software eventually gets compiled in one way or another to a sequence of symbols in a formal language, i.e., a whole bunch of ons and offs. But at the level most of us are working at — as the developers of software, or its end-users — we’re trying to represent sequences of actions, summaries of data, or states of systems, and our models have so much less context than the reality we’re representing. The words, sentences, gestures, and functions we use are necessarily encoding a lot of hidden ideas and assumptions.

I like writing tests first because it forces me to think about those assumptions. I have to consider how I want to use some fragment of code before I even have the code available. I might not insist on TDD if it’s not how the team I’m with does things; but the fearless refactoring that’s possible with a well-written slate of tests seems to justify the cost.


My poet and information architect tendencies often lead me to come up with a good number of short, potent words when I go to think about a system I’d like to have. I’ve long believed that complex problems can be broken into simple pieces. But now I’ve seen how breaking a problem up prematurely can lead to problems of a different kind. Sometimes it’s best to work with one unwieldy lump of code, develop the behaviors you want in that lumpy form, and look for the clean breaking points later in the process. Lots to learn here.


Story points are cool. I have a better sense now for how to give optimistic, realistic, and pessimistic estimates for a particular task. I feel like I was just barely starting to figure out the key properties of a good software spike as my apprenticeship ended; one place where I have room to grow is getting better at converting a quick, dirty, lumpy bit of spike code into an estimate of how long it will take to develop clean code for the story at hand.


I had a great time working with Tam on a ping-pong kata for scoring a bowling game. Pair programming is a great way to keep flexing different muscles as you go about solving a problem. I didn’t have a chance during this apprenticeship to write about Sarah Mei’s excellent talk, “Factory Workshop Stage”, but it gave me lots to chew over about the possible congruences between improvisational theater and software development. Since I have experience in both, I want to keep meditating on which practices from the former could serve as the basis for exercises in the latter.

Final thoughts

It was great to work in an environment that was so dedicated to learning, and to hewing to a principle of continuous improvement and quality. I’m really excited to bring what I learned to my next gig. (Speaking of which: if you, dear reader, might be the source of that next gig, come check out my homepage, where I’m expanding on what I have to offer for your data, software, or research organization. Or write to me, at tom[remove this]@[also remove this part]

how to not know what you're doing: the spike

Written by Thomas Henderson
Published on 23 October 2017

“Write short functions.” “Give functions expressive names.” “Classes and namespaces should have one responsibility.” “Don’t repeat yourself.”

It’s good advice when you know what you’re doing. But what if you don’t? What if you don’t know what commands and queries belong together in functions? What if you haven’t conceptualized the domain well enough to give names and responsibilities to its components, or to recognize duplication?

We can’t encode what we don’t understand. So when we’re faced with a problem we don’t know how to solve, we need to spend effort on understanding it, before we design a solution for it. The name I’ve been taught for code written for understanding, rather than for production, is a ‘spike.’

I read somewhere that there’s no such thing as ‘no design’ — either you think about your design, or you don’t, and the result will be a thoughtful design (hopefully good), or thoughtless design (probably bad). Last week I referred to the idea that designing means pulling things apart so they can be put back together. When we begin with an unfamiliar domain, we don’t know what the good seams in the domain are — where it makes good sense to pull things apart to consider them individually.

I propose that if there’s no such thing as no design, and if pulling things apart thoughtlessly leads to a mess, then our best design in an unfamiliar domain is the one in which we pull things apart as little as possible.

Example: Http Server Spike (Sockets)

Here’s some spike code I wrote in order to try and understand sockets, server sockets, and input/output streams.

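import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Objects;

public class Service {
    public static void main(String[] args) throws Exception {
        int port = 1337;
        String directory = null;

        System.out.println("Processing command line arguments");
        for (int i=0; i < args.length; i++) {
            String token = args[i];
            // Spike shortcut: assumes every flag is followed by a value.
            if (Objects.equals(token, "-p")) {
                port = Integer.parseInt(args[i+1]);
            } else if (token.equals("-d")) {
                directory = args[i+1];
            }
        }

        System.out.println("Listening on " + port);
        System.out.println("Serving resources at " + directory);
        ServerSocket listener = new ServerSocket(port);

        while(true) {
            System.out.println("Accepting connections");
            Socket io = listener.accept();

            BufferedReader reading = new BufferedReader(new InputStreamReader(io.getInputStream()));
            PrintWriter writing = new PrintWriter(io.getOutputStream(), true);

            // Echo the request head; it ends at the first empty line
            // (or at null, if the client hangs up early).
            String line = reading.readLine();
            while (line != null && !line.isEmpty()) {
                System.out.println("I heard: " + line);
                line = reading.readLine();
            }

            System.out.println("\nResponding with:");

            // Called requestLine here, though it is really the response's status line.
            String requestLine = "HTTP/1.1 200 OK";
            System.out.println(requestLine);
            writing.print(requestLine + "\r\n");

            String contentType = "Content-Type: text/html";
            System.out.println(contentType);
            writing.print(contentType + "\r\n");

            String contentLength = "Content-Length: 0";
            System.out.println(contentLength);
            writing.print(contentLength + "\r\n");

            // A blank line ends the headers; there is no body to send.
            writing.print("\r\n");
            writing.flush();

            System.out.println("\nClosing connection\n");
            io.close();
        }
    }
}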

I’d start the server, and then in another process act as a client, running

$ curl localhost:1337 -v

to hit my server with a basic GET request and be verbose about what I got. On the client side, the output of this command looked like

* Rebuilt URL to: localhost:1337/
*   Trying ::1...
* Connected to localhost (::1) port 1337 (#0)
> GET / HTTP/1.1
> Host: localhost:1337
> User-Agent: curl/7.56.0
> Accept: */*
< HTTP/1.1 200 OK
< Content-Type: text/html
< Content-Length: 0
* Connection #0 to host localhost left intact

And on the server side, all those System.out.println calls gave this:

Processing command line arguments
Listening on 1337
Serving resources at null
Accepting connections

I heard: GET / HTTP/1.1
I heard: Host: localhost:1337
I heard: User-Agent: curl/7.56.0
I heard: Accept: */*

Responding with:
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 0

Closing connection

Accepting connections

Everything bad is good for you

Many things I discovered to be helpful in this spike are things that you wouldn’t want in your production code.


a test bestiary, part 2

Written by Thomas Henderson
Published on 16 October 2017

In my last post on testing, I described testing in the small: the smoke test (checking if we’ve set up our testing framework correctly), and unit tests (testing components of our system). Now I’ll address testing in the large, starting with a correction to a misunderstanding I had about functional testing.

Functional tests: Testing from the outside, in

Last time, I confused the function in “functional testing” for mathematical functions: relations between inputs and outputs that are deterministic, i.e., given the same inputs you will get the same output. I thought this meant that functional tests are for stateless methods. Instead, functional testing refers to the practice of writing tests that touch your system in the same way it will be used in production. The “function” in the name refers to how this style of test treats your system as a “black box” that takes inputs to outputs, without your tests being aware of the details of how this transformation takes place.

As I learned from writing Game of Life wrong in three different ways, it’s entirely possible to have tests that pass at the component level (unit tests), and still have a failing application. My Cell and World classes passed the life and death rules, but when I would test well-known Game of Life patterns like the Blinker or the Toad, they wouldn’t behave correctly as the system evolved through time.
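To make the gap between unit-level and system-level checks concrete, here is a minimal sketch of a functional check for the Blinker. It is not my Cell and World code: the set-of-coordinates representation and the step method are assumptions made up for this example. The check only asks whether the whole pattern oscillates the way the well-known Blinker should.

```ruby
require "set"

# One generation of Conway's Game of Life over a set of live [x, y] cells.
# (Hypothetical representation, chosen to keep the example short.)
def step(live)
  neighbor_counts = Hash.new(0)
  live.each do |x, y|
    [-1, 0, 1].product([-1, 0, 1]).each do |dx, dy|
      next if dx.zero? && dy.zero?
      neighbor_counts[[x + dx, y + dy]] += 1
    end
  end

  # Birth on exactly three neighbors; survival on two or three.
  neighbor_counts.select do |cell, n|
    n == 3 || (n == 2 && live.include?(cell))
  end.keys.to_set
end

horizontal = Set[[0, 1], [1, 1], [2, 1]]
vertical   = Set[[1, 0], [1, 1], [1, 2]]

# Functional checks: the Blinker flips every generation, with period two.
raise "Blinker did not flip" unless step(horizontal) == vertical
raise "Blinker did not return" unless step(step(horizontal)) == horizontal
puts "Blinker oscillates as expected"
```

A check like this says nothing about how the rules are split across classes, which is exactly why it can catch a system that passes its unit tests and still evolves incorrectly.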

When I moved on to Tic-Tac-Toe, one of my mentors advised me to write “outside-in tests” instead. The idea is, write a test that calls an outermost message of your system’s interface, one that would pass if your application did the right thing. Since I was writing a game, my first test checked to see if my system accepted a play message. Then I could test things that ought to be true about a started game: there should be a current player, that player should be player one, the game’s board should be empty.
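Those first outside-in checks might look something like the following sketch in plain Ruby. The TicTacToe class, its play, current_player, and board methods, the :player_one symbol, and the little check helper are all hypothetical names invented for illustration, not the ones from the real project.

```ruby
# Hypothetical game class with just enough behavior to satisfy the checks.
class TicTacToe
  attr_reader :board, :current_player

  # Outermost message: start a game.
  def play
    @board = Array.new(9)          # nine empty squares
    @current_player = :player_one
    self
  end
end

# Tiny stand-in for a test framework's assertion.
def check(description, condition)
  raise "FAILED: #{description}" unless condition
  puts "ok: #{description}"
end

game = TicTacToe.new
check("accepts a play message", game.respond_to?(:play))

game.play
check("a started game has a current player", !game.current_player.nil?)
check("the current player is player one", game.current_player == :player_one)
check("the board starts out empty", game.board.all?(&:nil?))
```

None of these checks knows whether there is a Board class or a Player class behind the scenes, which leaves that decision open for later.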

Driving your code with functional tests is a bit like wish-driven development. You haven’t written any code yet, so these things won’t be true. But the behavior that you fill in with your code is behavior that’s correct from the point of view of the entire system. This keeps you from zooming in too closely and over-defining your components with overly fiddly tests, when you don’t know enough about the whole of the system.

You can read more about my experience using functional testing and delaying the breakdown of Tic-Tac-Toe into classes here. As I develop as a crafter, I want to keep using functional testing and the deferral of design decisions until necessary; but I want to get better at test-driving that breakdown of an overly large, but functional, block of code into components, so that design comes after functionality.

“Typically people think about design, and they say design is making this big, involved plan that’s going to solve every issue that the system is supposed to address. But I think that’s not what you do when you’re trying to get a design that’s going to survive and live over time. I think, instead, what you want to do is break things apart in such a way that they can be put back together.” — Rich Hickey, “Design, Composition and Performance”

Acceptance tests: Agreeing on behavior

A great solution to the wrong problem is no solution at all. One testing strategy for keeping customers and developers from misunderstanding one another so badly that the wrong product gets delivered is a suite of acceptance tests. These are tests that act as a kind of contract between the customer and the development team. They are very similar to functional tests, in that they have the same kind of outside-in, black-box shape. What distinguishes them from an ordinary functional test is the level of customer involvement. From “Customers are responsible for verifying the correctness of the acceptance tests and reviewing test scores to decide which failed tests are of highest priority.”

Last week, I worked with another apprentice on the bowling score kata. We had a lot of trouble when we tried to complete our algorithm to work for a perfect game. It wasn’t until an experienced bowler overheard us failing to understand a subtle distinction between our algorithm and the actuality of scoring a bowling game, and corrected us, that we realized the problem wasn’t our programming ability, but our understanding of bowling itself. Acceptance tests exist as a formal method to prevent this kind of domain misunderstanding from happening.
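As an illustration of how much domain knowledge hides in a kata this small, here is a sketch of the standard ten-pin scoring rules in Ruby. It is not the code from our pairing session, and bowling_score is a name made up for this example. It assumes a complete, valid game given as a flat list of pins per roll. One detail that is easy to miss: a perfect game takes twelve rolls, not ten, because the last two rolls exist only as bonuses for the tenth frame's strike.

```ruby
# Scores a complete game, given the pins knocked down on each roll.
def bowling_score(rolls)
  score = 0
  roll_index = 0

  10.times do
    if rolls[roll_index] == 10
      # Strike: ten plus the next two rolls; the frame uses one roll.
      score += 10 + rolls[roll_index + 1] + rolls[roll_index + 2]
      roll_index += 1
    elsif rolls[roll_index] + rolls[roll_index + 1] == 10
      # Spare: ten plus the next roll.
      score += 10 + rolls[roll_index + 2]
      roll_index += 2
    else
      # Open frame: just the pins knocked down.
      score += rolls[roll_index] + rolls[roll_index + 1]
      roll_index += 2
    end
  end

  score
end

puts bowling_score([10] * 12)   # perfect game: 300
puts bowling_score([0] * 20)    # gutter game: 0
```

Written as acceptance-style examples, "twelve strikes scores 300" and "twenty-one fives scores 150" are the kind of statements a bowler could confirm or correct without reading any code.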

The Chicago apprentices recently attended a zagaku on ubiquitous language and domain-driven design, which has similar aims to the notion of acceptance testing. This is the discipline of having developers work closely with the users of the software in order to learn their language for the domain. The theory goes that if this common language is pursued with sufficient rigor, the problem domain and the software that grows to model that domain will approach isomorphism — mathspeak for “the same in every way that matters.” The developers can use additional abstractions for code re-use and performance, but these abstractions will serve the true needs of the users. I am really looking forward to learning more about this, and the degree to which the practice is harmonious with the more formal, more automatable acceptance tests.

Characterization Tests

Just because software should have tests doesn’t mean that it will. There may be few or no tests. Tests may be numerous, but they might be low-value tests, ones that are highly coupled to the implementation and break as soon as someone tries to improve the design. A code base may be full of sinful hacks, done under pressure; they may be clever hacks, but cleverness can be obscure, hard to understand by a developer who’s just joining the project, or even to the original author, a few weeks or months later.

When code seems to work only by dark magic, when the prospect of changing it is terrifying, and the only comments present may as well read # This function is full of spiders, it might be time for characterization tests.

I learned about characterization tests from two excellent talks by Katrina Owen, and from sitting in on one of 8th Light’s Weyrich Institute sessions.

I first learned of characterization tests from Katrina Owen’s talk, 467 tests, 0 failures, 0 confidence. Owen considers the case of an open source project with abundant but low-value tests. These tests are freezing the design solid, coupling the code to a swarm of tiny tests. She deletes them and writes new outside-in tests in order to pin down the behavior of the project. Since they’re outside-in, they are a type of functional test. But the code already exists, and she seeks to pin down the behavior while freeing future maintainers to refactor and redesign more aggressively. It’s the testing-after that makes this approach into a characterization test approach: she is characterizing the current behavior. It’s a good talk, and Owen frequently references Sandi Metz’s great Magic Tricks of Testing talk. I’m going to have to watch this talk again, because I felt it really drove home the why of Metz’s strategic recommendations.

I liked Therapeutic Refactoring even more. Owen tackles a big block of heavy, obscure code. She knows that it “works” in the sense that it is being used in production without functional complaints, but it is definitely full of spiders and it cannot be understood, much less be changed easily. Her technique is to write a test that takes a reasonable domain-appropriate input and expects that the code will return the string "something". Of course, it doesn’t return "something", but the errors that her test suite returns guide her toward a genuine, functional, passing test. This puts her in the green, giving her the power to refactor furiously. She chops the overlong method into bite-sized pieces, improving the names as she learns how the different pieces contribute to the overall behavior of the function. When it makes sense, she can add unit tests of her broken-out components. In Weyrich, I learned that output you may use as a definition of “correct” behavior is sometimes called a “golden master,” and got a chance to try this technique of doing test-driven refactoring against known correct output.
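Here is a small sketch of that workflow in plain Ruby. The legacy_label method is a hypothetical stand-in for obscure legacy code, not anything from the talk, and the inputs are invented. The idea is to guess "something" on purpose, let the mismatch report what the code really does, and then record that observed output as the golden master.

```ruby
# Hypothetical stand-in for a block of obscure legacy code.
def legacy_label(name, count)
  n = name.to_s.strip
  n = n[0, 8] if n.length > 8
  s = count == 1 ? "" : "s"
  "#{n.upcase}-#{count}item#{s}"
end

# Step 1: guess badly on purpose; the mismatch tells us the real output.
actual = legacy_label("  wheelbarrow ", 3)
puts "expected \"something\", got #{actual.inspect}" unless actual == "something"

# Step 2: pin the observed behavior down as the golden master.
golden_master = {
  ["  wheelbarrow ", 3] => "WHEELBAR-3items",
  ["cart", 1]           => "CART-1item"
}

# Step 3: rerun this after every refactoring step.
golden_master.each do |args, expected|
  got = legacy_label(*args)
  raise "behavior changed for #{args.inspect}: #{got.inspect}" unless got == expected
end
puts "golden master still matches"
```

The recorded outputs are not a claim that the behavior is right, only that it is what production currently relies on.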

I’m interested in this kind of test because I like the idea of capturing behavior of unfamiliar code and refactoring it without fear. It sounds like a great technique for contributing to open source projects, raising the value of tests so that more contributors can make fearless changes. Also I hear that, with some frequency, consultants like 8th Light Crafters don’t get called in because everything is going great. One crafter went so far as to advise us, “Assume that everything is on fire.” Functional testing techniques like acceptance tests (ensuring we’re solving the right problem) and characterization tests (to rescue troublesome but functional legacy code from bankruptcy and total redesign) sound like effective tools for software firefighting.

a test bestiary, part 1

Written by Thomas Henderson
Published on 30 September 2017

We write tests around here. It gives us confidence to refactor fearlessly. I know that if I think I see a cleaner way to write some method, but I’m not sure, I can rely on my tests to tell me that my “refactoring” wasn’t a true refactoring — it must have changed some behavior in a way I didn’t see. If that happens, I can stop, back up, and decide: is it more valuable to stop and figure out the real way to clean this code up now? Or is it good enough for now? Good tests mean I can make this decision based on relative value, rather than out of fear.

Now, writing tests doesn’t exactly prove that the software is behaving correctly — all the tests I’ve written during this apprenticeship have tested a finite number of examples, rather than used formal logic — but they do provide evidence that it is. Writing tests before writing code also gives me a chance to think about how I intend to use a method or class, and what name to give it, before I think about the details of its implementation.

I’ve gotten used to writing tests as a tool for thinking. Now I need to get better at writing tests, and choosing what tests to write in what circumstances. In this post, I will address the kinds of ‘small tests’ that I know how to write, saving ‘large tests’ for my next post.

Smoke test: testing that we’re testing

When I start a new project, especially in a new language or with a new test framework, I like to write a test that should always pass, just to make sure I know how to write tests in that situation. Typically my first test is something like

it "does not violate the laws of mathematics" do
  expect(1+1).to eq 2
end

It’s not strictly necessary, but when I change to a new language, this kind of smoke test gives me confidence that I’ve set things up correctly.

Functional testing

[Correction: This section was written out of a misunderstanding of the ‘function’ in functional testing. See my next post on testing for a more correct explanation.]

A function is a relationship between a set of inputs and a set of outputs, such that each input has one and only one output. What this means practically is, a function called twice with the same argument(s) gives the same result each time.

Functions are easy to write tests for. At least, easier than writing tests for something that contains mutable state, where calling a method twice can yield two different results. If I’m testing a square function, I know from math that every time I feed a two to square I better get four back. Not all code is written in a functional style, but so far I’ve seen plenty of examples of code that contains pure functions that can be factored out into easy-to-test pieces.
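A tiny sketch of the difference, in Ruby. The Counter class is a hypothetical example of mutable state, included only for contrast with square.

```ruby
# A pure function: same argument, same result, every time.
def square(x)
  x * x
end

# A stateful object: the same message can give a different answer each time.
class Counter
  def initialize
    @count = 0
  end

  def next_value
    @count += 1
  end
end

puts square(2) == square(2)                     # true

counter = Counter.new
puts counter.next_value == counter.next_value   # false
```

A test for square needs only an input and an expected output; a test for Counter also has to know how many times next_value has already been called.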

Unit tests

Unit testing refers to testing one component of a system to make sure that it is operating correctly. If a class has good unit tests, then other code ought to be able to rely on the behavior of the class.

Sandi Metz is the author of Practical Object-Oriented Design in Ruby, an office favorite. Last week I watched one of her talks, entitled Magic Tricks of Testing. In this talk she defines two distinct classes of message: queries, which return something and change nothing, and commands, which return nothing and change something.

It’s possible for a message to be both a query and a command. For example, when you pop something off of an array, you get a value returned, and the state of the array changes: it gets shorter. However, if you have something more complicated than popping something off of an array, it is quite likely an anti-pattern to have an unseparated query/command pair. Better to split this apart and test each separately.

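Metz also notes a second dimension of categorization for messages: a message can be incoming (received from another object), sent to self, or outgoing (sent to another object).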

Pausing to multiply along the query-command dimension, and the message orientation dimension, we get six categories of message. Do we have to test all of them?

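Metz says: no! We actually need to write tests for only half of these: incoming queries (assert on the value they return), incoming commands (assert on their direct public side effects), and outgoing commands (expect that the message gets sent).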

Notice that the first is very like a functional test. And if you squint, the second is also a bit like a functional test, albeit split in two: sending the same message in identical contexts should have the same effect, even though technically you need to call another method in order to verify this fact.

Why don’t we need to test messages sent from an object to itself? Because such a message exists only in order to break the work of an object into smaller parts. But that work, if it matters at all to our application, will matter to the rest of the system. That is, the reason we don’t need to test messages sent to self is, we will discover errors in these messages by testing the public interface of the object, the incoming queries and outgoing commands that it participates in. Rules can be broken, of course, when it’s valuable to do so. If you don’t feel confident you can get these private messages right without testing, Metz says, go ahead and write them — she sometimes does that, too. But when she does, she adds a comment to those tests: “If these tests fail, delete them.”

And outgoing queries? Outgoing queries for one object are incoming queries for another object, so tests of this behavior belong with that receiving object.
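Here is how I would sketch the three kinds of test worth writing, in plain Ruby without a test framework. The Wheel, Gear, and RecordingObserver classes and all of their methods are names invented for this illustration, and the hand-rolled observer stands in for whatever mocking tool a real suite would use.

```ruby
class Wheel
  def initialize(rim, tire)
    @rim = rim
    @tire = tire
  end

  # Incoming query: returns a value, changes nothing.
  def diameter
    @rim + (@tire * 2)
  end
end

class Gear
  attr_reader :chainring

  def initialize(chainring, observer)
    @chainring = chainring
    @observer = observer
  end

  # Incoming command: changes state, and sends an outgoing command.
  def change_chainring(teeth)
    @chainring = teeth
    @observer.changed(teeth)
  end
end

# Hand-rolled spy that records the messages it receives.
class RecordingObserver
  attr_reader :messages

  def initialize
    @messages = []
  end

  def changed(value)
    @messages << [:changed, value]
  end
end

# 1. Incoming query: assert on the returned value.
raise "diameter" unless Wheel.new(26, 1.5).diameter == 29.0

# 2. Incoming command: assert on the direct public side effect.
observer = RecordingObserver.new
gear = Gear.new(48, observer)
gear.change_chainring(52)
raise "chainring" unless gear.chainring == 52

# 3. Outgoing command: assert that the message was sent, and nothing more.
raise "notification" unless observer.messages == [[:changed, 52]]
puts "all three kinds of test pass"
```

Nothing here tests a private helper or the return value of a collaborator, which is the half of the grid that gets left alone.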


Next: Outside-In Tests

less intelligent design

Written by Thomas Henderson
Published on 08 September 2017

Thanks to the ridiculous amount of mathematics I’ve read, I have a relatively easy time thinking about computation in terms of functions which take a given input and transform it deterministically into an output. This is a sort of ‘noun-y’ thinking, in which you declare what things are, and use those declarations to get your work done.

I have less experience with computers than I do with math. So I have a harder time using the ‘verb-y’ thinking of imperative programming, where you tell the computer what to do, turning some state-before-the-command into a state-after-the-command. But part of my apprenticeship is learning to stretch myself to think in unfamiliar ways; and so, I am doing my current work in imperative, object-oriented terms, where data can change from moment to moment and no one uses friendly, familiar words like ‘endomorphism.’

I’ve been doing my assigned reading, and trying to think in terms of objects. Unfortunately, I picked up the wrong lesson for a few weeks. I kept reading how important classes were in organization of code, and how each class should be responsible for one thing. “Well,” I thought, “I better come up with a bunch of classes, then, so that I have well-organized code that’s easy to change.” So I’d go to the whiteboard and sketch some swim lanes with some plausible names for classes (to help me remember what each one’s responsibility was meant to be). I’d make a sketch or snap a photo, get back to my keyboard, and start crunching away.

(Readers familiar with object-oriented style probably see the problem already.)

I would like to tell you how this went wrong, but it’s like trying to come up with a sensible narrative of a car accident you just survived. I don’t really know what happened; everything would seem to be going fine, and then there would be a small problem, and I would think I knew how to fix it, and after a couple of hours all of my tests would be failing, or else my tests would be passing at the unit level and yet fail at the application level, and moving from commit to commit wouldn’t be helping, and everything would be non-deterministic and basically on fire. So I’d start over from scratch. Again.

I finally got my “A-ha!” moment by stepping away from the keyboard, and getting into some un-assigned reading from the company library, on unconventional programming paradigms.


Unconventional Programming Paradigms is a collection of papers presented at a 2004 workshop on the topic. I saw the title and had to grab it; I have a special fondness for people who insist on doing things differently than their peers, whether out of curiosity about an untrodden research path, or just from pure contrarianism. The topics explored during this workshop included:

It was this last one that really caught my imagination. In particular, the paper on ‘Membrane Systems’ set me on a wild weekend research binge on the molecular dynamics of membranes and the variety of forms of tubulogenesis, i.e. tube formation, for a happy couple of hours. (I really know how to party.)

When I came back to my computing work on Monday, to get to work on yet another fresh start of object-oriented tic-tac-toe, I had biology on the brain. I thought to myself, Okay so, classes organize code. But organs are a relatively late invention! Remember: there were roughly a billion years of all life being unicellular before plants and animals showed up. Even if you’re just a bag of genetic information with very little structure, you can have a fulfilling life, eating tholins and splitting in two to reproduce, for a thousand million years.

So, I decided not to ‘develop organs’ — i.e., split my code into classes — until I saw a clear benefit to doing so. I still used TDD, but I used it on a single undifferentiated TicTacToe class. I got as far as having tests and code for the board, the win/draw conditions, random computer moves, turn-taking, and a simple board display before I broke any code into a new class.

But my biological inspiration didn’t stop with delaying decisions about class organization. I also had visions of the variety of ways that organisms form tubes in my mind. So when I did finally start breaking code into a new class, I imagined the methods as genetic material that was being wrapped in a new membrane, like a bud forming from the parent cell before breaking loose. I kept running my test suite to make sure that everything was okay, hitting the undo button when a change broke a test. Bit by bit, I moved methods into the new class. I pretended that cell division was occurring, and that there was a shrinking, fragile tube between the two classes, represented in code by having my original class continue to delegate method calls to the daughter class. It wasn’t until I was confident my new class could be sent the messages directly, and my confidence resulted in passing tests, that I deleted the now-redundant methods. You can see evidence of this bio-inspired thinking in some of my commit messages:

1b6a6b3 * Bud movement choice to Player; put Display to console
eee61dc * Line up referee/referee_io method chromosomes for transposing with Player
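The "shrinking tube" stage of that process might be sketched like this, using Ruby's standard Forwardable module. The class and method names are hypothetical, and the move-choosing strategy is a placeholder; the point is only that the original class keeps answering the old message while the new class does the work.

```ruby
require "forwardable"

# The "daughter" class that buds off from the original.
class Player
  # Placeholder strategy: pick the first empty square.
  def choose_move(board)
    board.index(nil)
  end
end

# The original, undifferentiated class. While the split is in progress it
# keeps its old public method but forwards the call to the new class.
class TicTacToe
  extend Forwardable

  attr_reader :board

  # The "tube": delete this line once callers talk to Player directly.
  def_delegator :@player, :choose_move

  def initialize
    @board = Array.new(9)
    @player = Player.new
  end
end

game = TicTacToe.new
puts game.choose_move(game.board)   # existing callers and tests still work
```

Because the old tests keep passing through the delegating method, each method can be moved and verified one at a time before the delegation is finally removed.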

It would be a stretch to consider my simple tic-tac-toe program to have any real principles of artificial life. But thinking in terms of membranes, amoebae, organs, and tubes helped me evolve a design, rather than trying to go about designing intelligently at the beginning, when I was guaranteed to know the least amount about my code.

inkscape is trying to break my heart

Written by Thomas Henderson
Published on 01 September 2017

I overcommitted to a method without sufficient prototyping, and it’s causing me problems. At issue is – I think – some default choices that Inkscape makes when handling SVG. But it could also be a problem with how my original artwork got rendered into a PDF, or, it could just be a failure of understanding on my part.

Let me back up. I was assigned to read a book called Understanding the Four Rules of Simple Design, and create a presentation and blog post on it. This seemed like a good time to try something I’ve been wanting to try for at least a year: creating a sort of DIY Prezi.

The obvious question: why? What’s wrong with Prezi? Nothing’s wrong with Prezi! But SVG is an infinitely scalable, very mathy format. I want to learn to use the format for data visualizations and for web art, not just presentations. What’s more, vinyl and laser cutters can take SVG as an input, as can printers, opening up a whole realm of possibility for doing off-the-screen graphics. I dream about making rubber stamps out of generative art, or taking a representation of some high-symmetry finite group, turning it into a mask, slapping it on a piece of glass or ceramic or metal, and sandblasting or acid-etching myself some tangible mathematical or math-inspired art. So, although yesterday I did accuse myself of once again doing everything in the most complicated way possible, I still think it’s been a good experience for learning how to work with the format directly.

While reading I created a giant artboard (27 MB pdf) with notes on the material and my response to it. My vision was of a viewport that would swoop around that image, changing size and position to juxtapose high-level concepts or zoom in on a detail. I did some reading on how to animate the viewport, crossed my fingers that my JavaScript skillz would be up to the task, installed Inkscape, opened up the artboard, and started editing.

After a little bit of tinkering I discovered that objects in Inkscape can be assigned IDs. A-ha! A strategy emerged:

I got pretty into this. I also added a few rectangles labeled, e.g., #transition-3-4, on the theory that it might be useful to have some midpoints for the animations. Finally I had thirty-ish rectangles. What I expected to happen now was,

My theory was that this would get me the ability to advance to the next slide, instantaneously. If everything worked to this point, then I could spend any remaining time on animating the transition over time, and maybe even inserting transitions.

HOWEVER. After hacking on the thing for a while, I finally understood that Inkscape’s coordinate system diverges from the browser’s coordinate system! All the y-coordinates for my rectangles were negative. Also my rectangles had parent <g> elements with a matrix transformation. I tried to write a function to try and do a change of coordinates, but I did not come up with the right one in time, and ended up giving my presentation by haphazardly sliding around the artboard in Inkscape itself. Not bueno.
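For reference, here is a sketch of the kind of change of coordinates involved, written in Ruby rather than the JavaScript I was attempting at the time. It applies a single SVG matrix(a b c d e f) transform, which maps (x, y) to (a*x + c*y + e, b*x + d*y + f), to the corners of a rectangle and reports the resulting bounding box as a viewBox string. The function names and all the numbers are invented for illustration, and it assumes only one parent transform; nested <g> transforms would have to be composed first. I can't claim this would have rescued my presentation.

```ruby
# Apply an SVG matrix(a b c d e f) transform to a point.
def apply_matrix(matrix, x, y)
  a, b, c, d, e, f = matrix
  [a * x + c * y + e, b * x + d * y + f]
end

# Bounding box of a transformed rectangle, as "min-x min-y width height".
def view_box_for(rect, matrix)
  corners = [
    [rect[:x],                rect[:y]],
    [rect[:x] + rect[:width], rect[:y]],
    [rect[:x],                rect[:y] + rect[:height]],
    [rect[:x] + rect[:width], rect[:y] + rect[:height]]
  ].map { |x, y| apply_matrix(matrix, x, y) }

  xs = corners.map(&:first)
  ys = corners.map(&:last)
  [xs.min, ys.min, xs.max - xs.min, ys.max - ys.min].join(" ")
end

# Hypothetical values: a rectangle with a negative y, inside a group whose
# matrix flips the y-axis and shifts everything down by 1000 units.
rect = { x: 100, y: -300, width: 400, height: 200 }
flip = [1, 0, 0, -1, 0, 1000]

puts view_box_for(rect, flip)   # 100 1100 400 200
```

Taking the bounding box of all four corners, rather than transforming just the top-left corner, keeps the width and height positive even when the matrix flips an axis.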

Here is my final artboard (42 MB svg) as exported by Inkscape. (Note there are a few options for exporting, including “plain svg” and “optimized svg” — but they didn’t seem to be better.) These questions remain:

simple design made complex

Written by Thomas Henderson
Published on 25 August 2017

The Feynman Problem-Solving Algorithm:

1. Write down the problem.
2. Think very hard.
3. Write down the answer.

Attributed to Murray Gell-Mann

It is tempting to think that programming works like this:

  1. Write down the behavior that you want;
  2. Write down the code for that behavior;
  3. There is no step three.

But, that’s not how it works. Why? And if it doesn’t work, what can we do instead?

Understanding the Four Rules of Simple Design

Understanding the Four Rules of Simple Design takes as a fundamental premise, “The only thing we truly know about software development is that we can expect changes to our system.” It presents programming as an iterative process:

Now, when I see a simple, iterative process, I get interested in seeing it through a lens of formal systems. If I can formalize a system, then I am confident I understand it, or at least some part of it; if I can’t formalize it, then I know that I still have learning to do, and/or that I need to accept a degree of uncertainty and ambiguity. Additionally, if I can formalize a system, I can frequently draw one or more diagrams representing that system, which I love doing.

The four rules from the book’s title, in compact form, are:

  1. Tests Pass
  2. Expresses Intent
  3. No Duplication
  4. Small

Today I will examine the first rule, Tests Pass, in appalling detail, through the lens of category theory, before discussing the other three rules in that light. Category theory is a mathematical framework for organizing complex information. It makes simple things difficult, in order to help make difficult things simple. (A modern classic on the distinctions among ‘simple,’ ‘easy,’ ‘complex,’ and ‘difficult’ is Rich Hickey’s 2011 talk, Simple Made Easy. It’s worth a listen.) Category theory is by no means the only way to think precisely about code that improves iteratively over time, but it’s a framework that works well for my algebraic and diagrammatic sensibilities. With that caveat, let’s jump in and try and hold two or three infinite spaces in our heads.

On Code and Behavior

I want you to imagine two very large ‘spaces’ of things. (A ‘space’ in this context just means a sort of collection, but a collection that might be too big to count, even in principle.)

The first space is the collection of all possible programs. Let’s make that a tiny bit more manageable, and consider only the collection of all possible programs in your favorite language. My favorite language is Ruby these days, so when I talk about “the space of all programs,” I mean the space of programs written in syntactically correct Ruby.

The second space is even more ridiculous than the first: it’s the collection of all possible behaviors of the computer you’re programming. Now we’re getting completely ridiculous. What makes your computer special is that it is a machine that can behave like many other machines. Your computer can compute prime numbers; it can encrypt messages so that no one on Earth can decode them without great effort; it can show you a picture of your dog; it can download the 1945 version of “The Big Sleep” (inferior to the more widely released 1946 version, where Lauren Bacall’s re-shot scenes are tighter and the chemistry between Bacall and Humphrey Bogart is better, but interesting from a film history point of view); it can order creamed corn for next-day delivery. In fact, you might argue that this space is so large as to be not well-defined at all. At least the space of all programs, that’s just a collection of finite sets of finite text files which follow the rules of Ruby; it’s an infinite collection, but at least it’s a relatively well-behaved one. So, how can we possibly get any kind of handle on this absurdly huge space with ill-defined boundaries?

We write a test.

The space of all behaviors of our computer is unfathomably large. But when we write a new test, we introduce a constraint on that space. That is to say, there are behaviors which include passing the new test as a sub-behavior, and there are behaviors which do not. “Write down the behaviors we want, then write the code that produces them” may be an impossible process. But when we write a test, we’re writing a function from our code to a set with two elements, { Pass, Fail }.

That function creates an induced constraint on the space of all programs: there’s code that passes, and there’s code that doesn’t.

I propose, then, that we introduce a new space, one that will protect us from the uncertain horrors of the poorly defined space of all behaviors: the space of all finite collections of tests. When we begin to write a system, we have the empty set as our test suite. It is no constraint at all. When we write our first test, we constrain both our computer’s behavior, and the programs which encode for that behavior. When we write our second, we constrain both spaces, more than they were constrained when we had one test. And so on, and so on. And since tests are just a kind of code, this space is just as nice as the space of all programs: infinite, sure, but at least it’s countably infinite and reasonably well-defined. Tests with certain side effects, like asking a web page outside of our control, might fall into a gray area, but not one I’ll take the time to examine.
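One way to make this concrete is a small Ruby sketch, under heavy simplifying assumptions: a “program” is stood in for by a lambda, a test is a lambda from a program to :pass or :fail, and the names `CANDIDATES`, `TESTS`, and `passing` are invented for illustration. It shows only that each added test can shrink the set of passing programs and never grow it.

```ruby
# Hypothetical stand-ins for "programs": four candidate implementations of doubling.
CANDIDATES = {
  identity: ->(n) { n },
  add_two:  ->(n) { n + 2 },
  square:   ->(n) { n * n },
  double:   ->(n) { n * 2 }
}

# A test is a function from a program to the two-element set { :pass, :fail }.
TESTS = [
  ->(program) { program.call(2) == 4 ? :pass : :fail },
  ->(program) { program.call(3) == 6 ? :pass : :fail },
  ->(program) { program.call(0) == 0 ? :pass : :fail }
]

# Names of the candidates that pass every test in the given suite.
def passing(suite)
  CANDIDATES.select { |_name, program| suite.all? { |test| test.call(program) == :pass } }.keys
end

# Grow the suite one test at a time and watch the passing set.
(0..TESTS.size).each do |count|
  suite = TESTS.first(count)
  puts "#{count} test(s): #{passing(suite).inspect}"
end
```

With the empty suite every candidate passes; the first test rules out one candidate, and the second leaves only `double`.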

Now, it’s nice that we’re not staring into the abyss of infinite behavior and uncountable side effects — we’re protected by our test suite, like a radiologist in a lead apron. But, what about actually writing code? What about passing those tests?


Let’s pause here and recognize a fact about text files, which you may have already thought about if you’ve worked with the git version control system. Suppose there are two programs in the space of all programs in Ruby, which I will name C and C'. Then it must be the case that there is some sequence of transformations I can do in my text editor that will transform C into C'. How do I know this? Not for any deep reason — I just know that I could very easily delete every file in C, and write anew the files of C'. And so it seems reasonable to say that, in ordinary circumstances, there are likely to be less drastic transformations to get from C to C'. Taking from the book’s running example of Conway’s Game of Life, one simple transformation would be to take every instance of the word location and replace it with the word position. From the point of view of the compiler, this would be “the same” code. But to a human reader, there might be a slightly different connotation between the two words. I’ll address this notion of “the same but different” when I address the rule Express Intent.
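The rename described above can be sketched as a text transformation in Ruby. The snippet of Game of Life code and the helper name `rename_identifier` are made up for illustration. A regular expression with word boundaries is only an approximation of a real rename, since it cannot tell identifiers from words inside strings or comments.

```ruby
# Replace whole-word occurrences of one identifier with another.
def rename_identifier(source, from, to)
  source.gsub(/\b#{Regexp.escape(from)}\b/, to)
end

# Hypothetical fragment of a Game of Life implementation.
before = <<~RUBY
  def alive?(location)
    living_cells.include?(location)
  end
RUBY

after = rename_identifier(before, "location", "position")
puts after
```

The word boundaries keep the helper from touching longer names such as `locations`.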

Recall that my goal here is to put a categorical structure on the space of all programs. If we only consider the transformation, “X can be turned into Y in a text editor”, then there is a transformation (a.k.a. an arrow, a.k.a. a morphism) from every program to every other program. It’s not a very interesting structure, because it is unconstrained.

But wait: we wrote a test! In attempting to wrestle with the impossibly ill-defined and huge space of behaviors, we wrote a test (okay, okay, we imagined writing a test), which is a function that takes a program C to the set { Pass, Fail }. So that function induces a constraint on the space of programs: programs that do pass, and programs that don’t pass. (At least, it should; if the empty program passes your test, you need to be a bit more aggressive about defining failure.)

This suggests two special kinds of program-to-program morphisms. One might be called writing code: given a program that fails a collection of tests, transform it into a program that passes that collection of tests. When we begin, we have the empty program, which had better fail your test. Through work and creativity, we struggle from the red zone to the subspace of code that runs green.

Another program-to-program morphism is called refactoring: given a program C that passes a collection of tests, transform it into a new program C' that also passes the tests. These programs are “the same” from the perspective of your tests. But from the perspective of you, the programmer, and your colleagues, they ought to be different. The new program should be “better.”
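As a rough sketch of these two kinds of arrow, the Ruby below names a transformation by whether the program passes the suite before and after. Programs are again modelled as lambdas, which is a simplification. The method names `green?` and `classify`, and the extra labels :regression and :still_red, are additions for this sketch and not terms from the book.

```ruby
# A suite is a list of tests; each test maps a program to true (pass) or false (fail).
def green?(program, suite)
  suite.all? { |test| test.call(program) }
end

# Name the kind of arrow from one program to another, relative to a suite.
# :regression and :still_red are labels added for this sketch.
def classify(before, after, suite)
  case [green?(before, suite), green?(after, suite)]
  when [false, true] then :writing_code
  when [true, true]  then :refactoring
  when [true, false] then :regression
  else :still_red
  end
end

suite = [->(program) { program.call(2) == 4 }]

empty_program = ->(_n) { nil }   # stands in for the empty program
first_draft   = ->(n) { n + n }
tidied_up     = ->(n) { n * 2 }

puts classify(empty_program, first_draft, suite) # => writing_code
puts classify(first_draft, tidied_up, suite)     # => refactoring
```

With an empty suite every program counts as green, so every transformation is classified as refactoring, which matches the idea that the empty suite is no constraint at all.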

≥, abstractly

What does “better” mean? It isn’t a notion that can be easily formalized and categorized — maybe it cannot be mathematically formalized at all. But the remaining rules give us guidance:

Does C' use an improved name for a concept (Express Intent)? Does it combine two similar ideas into one abstraction with variations (No Duplication)? Does it break out an inline method into a well-named helper method (Express Intent)? Does it move a helper method that is now used only once back inline (Small)? We have an enormous number of choices that we can make to rewrite C into C', but we do not have every choice, thanks to the first rule: we can only morph our code from passing to passing states.


The space of all behaviors of our computer is of innumerable size and indeterminate boundaries. If we try and write out a nice compact list of behaviors in advance, at the point in time when we know the least about what we’re trying to do, we are bound to fail.

But we still have hope. We can write tests, which are code that constrains those innumerable behaviors. We can use those tests to guide us as we write more code, by letting our test suite be the fixed point by which we judge failure and success. And we can use our experience at reading words, thinking abstractly, and trimming excess to steer our code toward flexibility and readability, within the constraints provided by our tests.

From unknowable; to constrained; to successful; to better; to more constrained. Trusting the 4 Rules requires one to be comfortable with ambiguity, uncertainty, and change, but it provides assurances that your work is heading in the right direction. I’ll conclude with one of my favorite quotes about writing and uncertainty, from author E.L. Doctorow:

Writing is like driving at night in the fog. You can only see as far as your headlights, but you can make the whole trip that way.

vector trouble, revisited

Written by Thomas Henderson
Published on 18 August 2017

Last time I tried to display SVG content I couldn’t get it to be visible. I tried inlining, for a quick-and-dirty method; linking, for keeping SVG code and blog text separate; and using Hexo’s syntactic sugar for referencing asset folders — not because I like extra sugar, but because everything else had failed.

After Joshua (@heyitry) kindly took a close look at my site and gave me some advice, I thought I should give rendering SVG in the browser another shot.

inline: line breaks

Joshua suggested that I take a closer look at exactly how my inline SVG got rendered. What I had typed into my buffer last time was

<svg height="100" width="100">
  <circle cx="50" cy="50" r="40" stroke="black" stroke-width="3" fill="red" />
</svg>

The HTML that Hexo rendered for my site was:

<svg height="100" width="100"><br/>
<circle cx="50" cy="50" r="40" stroke="black" stroke-width="3" fill="red"/><br/>
</svg>

It’s close. All the information I put in is there, and everything looks like it’s nested correctly. But there’s a little bit of code I did not put in: that pair of <br/> tags. These tell the browser to put in a line break, but they are just an artefact of the way I formatted my HTML for readability; they don’t actually have meaning in my document. (In fancy words, the <br/> tags are not ‘semantic HTML’, a term I picked up hanging around with indieweb kids.)

Is it those line breaks that are messing me up? What if I write my SVG in a single line, like this?

<svg height="100" width="100"><circle cx="50" cy="50" r="40" stroke="black" stroke-width="3" fill="red" /></svg>

When I write the above into my text file:

Hooray! We have <circle>!

So, supercool. Now I can draw pictures with my magic index card, and paste the resulting code into a blog post. So long as I remember to make it into a single line: voilà! Vector comics! And with the code in my editor, I can tweak it. I don’t know a lot about the SVG format, but I know you can group things together, and give them ids and classes. That way, when it’s time to get really fancy, I can grab pieces of the image to reuse, or even animate.
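Remembering to flatten the SVG by hand is easy to forget, so here is a minimal Ruby sketch of a helper that does it. The name `single_line_svg` is hypothetical, and the approach is deliberately naive: it trims each line and joins the pieces with no separator, which is fine for the circle example but would glue attributes together if a single tag were wrapped across several lines.

```ruby
# Collapse a multi-line SVG snippet into one line by trimming each line
# and joining the pieces with no separator.
def single_line_svg(svg)
  svg.lines.map(&:strip).reject(&:empty?).join
end

readable = <<~SVG
  <svg height="100" width="100">
    <circle cx="50" cy="50" r="40" stroke="black" stroke-width="3" fill="red" />
  </svg>
SVG

puts single_line_svg(readable)
```

The output is the same single-line markup as the version typed by hand above.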

But, sometimes it might be nice to separate the code for a picture from the text and code that make up a post. SVG code can get pretty big, after all. Thing is, I get confused about exactly how to write a relative link. Do you need to have a leading slash? What about a leading dot? Are there quotes or no?

I’m running my hexo server locally to develop this post, and when I navigate to localhost:4000/images/circle.svg I see the correct code for a red circle:

<svg height="100" width="100">
  <circle cx="50" cy="50" r="40" stroke="black" stroke-width="3" fill="red"/>
</svg>

…along with some complaint from my browser that “This XML file does not appear to have any style information associated with it.” I’m going to guess that this is because accessing a random XML file is different than viewing it embedded in a web page?

Well, if it’s there, that means I can reference it. I think. And I think to reference it, I use something like <img src="./images/circle.svg" alt="I'M MEANT TO BE A RED CIRCLE">. So I type that in…


Well, dang. A screen reader should be able to read the alt text to a user, but it’s not showing a picture to the rest of us.

Maybe the third debugging attempt will be the charm…

setting up TDD and continuous integration

Written by Thomas Henderson
Published on 07 August 2017

Today we’re going to set up a new project in Ruby. Me, I’m programming John Horton Conway’s Game of Life.

In this post, we’re going to go through the process of: starting a Ruby project from nothing; adding RSpec; adding just enough tests and code to see if we’re testing anything; synchronizing our project to Github; and, finally, authorizing Travis-CI to read our repository and run tests whenever we push our changes.

Creating the project

Create a directory where the project should live.

mkdir game-of-life
cd game-of-life

Open a file called Gemfile, to tell Ruby where to find gems and what gems we need.

source "https://rubygems.org" # assumed: the public RubyGems index
gem "rspec"

Make sure we have the Ruby program for fetching and installing gems…

gem install bundler

…and then do so.

bundle install --path .bundle

The --path flag installs the gems to a specific directory; that way, we don’t have to use sudo to install the bundle with root privileges, which could get us into some difficult-to-diagnose trouble.

Testing whether we’re testing

Now let’s write our first test. Make a specification directory:

mkdir spec

and in that directory, write your first test in a file called rspec_works_spec.rb. Right now we want to test whether we’re testing, so a trivial test that exists but has no assertions will do.

describe TrivialTest do
end

We can run the test with bundle exec rspec. The test will fail, because there is no TrivialTest class. To pass the test, create a directory for our source code in the project root directory:

mkdir lib

And inside that directory, create an rspec_works.rb file:

class TrivialTest
end

If you like you can run bundle exec rspec again. The test will fail again, for a different reason: the class has been defined, but the spec file doesn’t know about it. We need a require statement. Modify rspec_works_spec.rb like so:

require "rspec_works"

describe TrivialTest do
end

Run bundle exec rspec again, and the test will pass.
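Both failures above come down to whether Ruby has loaded a constant named TrivialTest by the time the spec refers to it. Here is a plain-Ruby sketch of that idea, without RSpec. The helper `constant_status` is hypothetical, and the class is defined inline here instead of being loaded with require.

```ruby
# Report :pass when the named constant is loaded, :fail when it is not.
def constant_status(name)
  Object.const_get(name)
  :pass
rescue NameError
  :fail
end

before = constant_status(:TrivialTest) # nothing has defined the class yet

# Stands in for `require "rspec_works"` loading lib/rspec_works.rb.
class TrivialTest
end

after = constant_status(:TrivialTest)

puts "before: #{before}, after: #{after}" # => before: fail, after: pass
```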

Synchronizing our project with Github

Next we commit our work. If you haven’t already, initialize the directory as a git repository with git init. Now we can start tracking our working files with git add Gemfile lib spec, followed by git commit. Or if you want to write a one-line commit message without git sending you to your editor, write something like git commit -m "Pass trivial test" after adding the files.

Now we can begin integrating with Github and Travis.

Start by opening your Github profile, signing in if you haven’t already, and clicking the ‘plus’ icon. Give your repository a name – you’ll probably want to match the name you gave to the project directory – and a description, then click “Create Repository.” Do NOT check the box to “Initialize this repository with a README.”

Once you’ve done this, a tutorial page will appear; follow the directions marked “…or push an existing repository from the command line.” If you don’t have an ssh key set up, you’ll have to enter your login credentials for Github. (If you don’t set up an ssh key, you will quickly get tired of entering this every time you want to push work, but that’s a tutorial for another time.)

Now, we’re ready to set up continuous integration with Travis.

Setting up continuous integration with Travis-CI

There is a Getting Started with Travis CI tutorial, but I found some discrepancies between it and reality, possibly just from my own misunderstanding. The following is what worked for me.

Go to and sign in with your Github account. Then go to the repository that you created, follow the Settings link, then the Integrations & services link. Or, just enter into your browser. Click the Add Service dropdown menu, and select Travis. (There are a lot of options, but you can filter them.) You will be prompted to confirm your password.

You’ll be presented with a page that has a few text fields which I ignored. Follow the link to If you’ve never done this before, I believe that this is when Travis-CI will attempt to load your Github repositories. (Note: As of this writing, having an unpaid balance with Github will keep Travis-CI from completing this action, and it will just keep trying to import with no success forever. If you have a lapsed paid account with Github, you’ll have to cough up some money before this step will work).

Assuming your repos have been successfully imported, Travis-CI will present you with a list of your repos, with a checkmark slider for each. Check the repo(s) you’d like Travis to watch.

Instructing Travis-CI what to do with your project

We’re almost there. Your project is on the remote server owned by Github, and Travis-CI has been authorized to read it from there. But Travis doesn’t have enough information to do anything with your code. It needs some directives to know how to behave when you push code to the repo.

In the root of your project directory, open .travis.yml. Here’s the simple version I used:

language: ruby
rvm:
  - 2.4.0
install: gem install rspec
script: bundle exec rspec

This lets Travis know that we’re using Ruby, that we’re using a particular version, that we’ll need to have rspec installed before we take action, and that the action to take is to run our test suite. (Note: As of this writing, there is something with Ruby 2.4.1 that interferes with Travis-CI’s ability to write gems, so I had to walk the version back to 2.4.0. I did not bother changing the version of Ruby I was using to write my project though. So far, so good, but your mileage may vary.)

Let’s find out if it works. Add .travis.yml to your repository, commit, and push. Now if you click the link to your repo, or go directly to

…well, probably nothing will appear to be happening. Be patient. Soon enough, there should be an entry with your most recent commit message, marked in yellow to signify that the build is still in progress. In about a minute, assuming that you only have that one passing test in your code, the build will be marked with green, and the message, “#1 passed.”

Hooray! Your tests all passed, and your integration is complete.

But… why?

This is probably 100 times slower than it was to run the tests locally. So why bother? There are several reasons. First, when your tests are numerous enough that they may take several minutes to run, this way you can keep developing code while your tests run off of your machine. Second, it’s easy enough to run unit tests on your own machine, but Travis could, for example, run tests that involve resources you don’t control. Maybe a change in a dependency breaks your project when it’s deployed even though it works fine locally, or there’s a difference in behaviors when you switch from your test database to your production database.

But perhaps most intriguing is, you can set up triggers for different behaviors for Travis to initiate based on the passing or failing of tests. For example, you might edit your blog, and when you push changes, set up Travis to notify you if there are broken links in the blog, and deploy the blog if and only if the tests pass. Me, I went to my company’s Slack channel, typed /apps travis, and got a link to instructions on how to create a bot that will post to the private apprenticeship channel I’m in with my mentors, so that they are notified when I’ve pushed changes for one of my assigned projects. It’s simple, but useful. (If you do this, make sure to look into using the Travis-CI command line tool to encrypt the necessary tokens – after all, they will be getting pushed to your repo, and you don’t want to put secrets up there.)

vector trouble

Written by Thomas Henderson
Published on 01 August 2017

This year I discovered I really like drawing in vector graphics. My phone has a large screen with a pop-out stylus; it’s sort of like having a magic index card to draw on. I can change its size, select things and copy or resize them… I really like it. And since I’ve kept a hand-written journal for many years, my handwriting is clear enough that I’d love to produce some technical zines, following in the footsteps of Julia Evans. Her work has taught me to ask: well, why not read about tcpdump and strace with stick figures?

However, I’ve got some things to learn about how to serve resources and work with the SVG file format. Here are some things I’ve tried so far.

Inline SVG

Supposedly markdown lets you just drop into the html language whenever you feel like it. At least, I thought it did. But when I type this right into the markdown file that gets turned into the post you’re reading right now:

<svg height="100" width="100">
<circle cx="50" cy="50" r="40" stroke="black" stroke-width="3" fill="red" />
</svg>

This is what I see:

Maybe you can see it, but I can’t.

Hexo Asset Folders

Per Hexo’s docs, there are at least two ways to serve non-post assets such as images.

If I make a source/images directory and put the previous SVG code as circle.svg, then refer to it using markdown’s syntax for an image link, like ![Alternate Title](/images/circle.svg), what happens?

Alternate Title

All I see is, “Alternate Title.” Dang. Well, that’s the “global asset folder” way. Maybe if I try the “post asset folder” method? It creates a new folder for every post, which seems excessive but would also mean that I could re-use names, I suppose.

Let’s try it. There’s a folder for this post, I’ll put a copy of that circle.svg code in it, and try ![Another Alternate Title](/vector-trouble/circle.svg):

Another Alternate Title

Well, heck. Last thing to try is this special syntax that Hexo 3 has for referencing assets. Supposedly it will make things render correctly in archive and index pages, whereas the other options would render correctly only in posts.

What you cannot see is four instances of asset_img something-i-hoped-would-work/circle.svg wrapped in curly braces and percentage signs. Alas, none of them draw a circle.


Thus my vector comics experiments end in failure. There are at least three things I don’t understand well:

I’m sad not to have diagramming as a communication tool, but I will get some help and try again.