We write tests around here. It gives us confidence to refactor fearlessly. I know that if I think I see a cleaner way to write some method, but I’m not sure, I can rely on my tests to tell me that my “refactoring” wasn’t a true refactoring — it must have changed some behavior in a way I didn’t see. If that happens, I can stop, back up, and decide: is it more valuable to stop and figure out the real way to clean this code up now? Or is it good enough for now? Good tests mean I can make this decision based on relative value, rather than out of fear.
Now, writing tests doesn’t exactly prove that the software is behaving correctly — all the tests I’ve written during this apprenticeship have tested a finite number of examples, rather than used formal logic — but they do provide evidence that it is. Writing tests before writing code also gives me a chance to think about how I intend to use a method or class, and what name to give it, before I think about the details of its implementation.
I’ve gotten used to writing tests as a tool for thinking. Now I need to get better at writing tests, and choosing what tests to write in what circumstances. In this post, I will address the kinds of ‘small tests’ that I know how to write, saving ‘large tests’ for my next post.
Smoke test: testing that we’re testing
When I start a new project, especially in a new language or with a new test framework, I like to write a test that should always pass, just to make sure I know how to write tests in that situation. Typically my first test is something like
    it "does not violate the laws of mathematics" do
      expect(1 + 1).to eq 2
    end
It’s not strictly necessary, but when I change to a new language, this kind of smoke test gives me confidence that I’ve set things up correctly.
[Correction: This section was written out of a misunderstanding of the ‘function’ in functional testing. See my next post on testing for a more correct explanation.]
A function is a relationship between a set of inputs and a set of outputs, such that each input has one and only one output. Practically, this means that a function called twice with the same argument(s) gives the same result each time.
Functions are easy to write tests for. At least, easier than writing tests for something that contains mutable state, where calling a method twice can yield two different results. If I’m testing a square function, I know from math that every time I feed a two to square I had better get four back. Not all code is written in a functional style, but so far I’ve seen plenty of examples of code that contains pure functions that can be factored out into easy-to-test pieces.
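As a minimal sketch of why a pure function is easy to test, here is a hypothetical `square` method with two tests. It is written with Minitest, which ships with Ruby, rather than the RSpec syntax shown earlier, only so that it runs without installing any gems. The method and test names are illustrative assumptions.

```ruby
require "minitest/autorun"

# A pure function: the result depends only on the argument,
# and nothing outside the method is read or changed.
def square(n)
  n * n
end

class SquareTest < Minitest::Test
  def test_two_squared_is_four
    assert_equal 4, square(2)
  end

  def test_same_input_gives_same_output
    # Calling twice with the same argument must give the same result.
    assert_equal square(3), square(3)
  end
end
```

There is no setup and no cleanup here, because there is no state to arrange before the call or to inspect after it.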
Unit testing refers to testing one component of a system to make sure that it is operating correctly. If a class has good unit tests, then other code ought to be able to rely on the behavior of the class.
Sandi Metz is the author of Practical Object-Oriented Design in Ruby, an office favorite. Last week I watched one of her talks, entitled Magic Tricks of Testing. In this talk she defines two distinct classes of message:
- queries: An object is sent a message, and returns a value. The object does not change state; if the object is sent the same message with the same arguments again, the value returned will be the same.
- commands: An object is sent a message, and returns nothing. Instead, something about the object changes internally, or the object interacts with the outside world somehow.
It’s possible for a message to be both a query and a command. For example, when you pop something off of an array, you get a value returned, and the state of the array changes: it gets shorter. However, if you have something more complicated than popping an element off of an array, an unseparated query/command pair is quite likely an anti-pattern. Better to split it apart and test each half separately.
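To make the distinction concrete, here is a small sketch. The first test shows `Array#pop` acting as both a query and a command. The second uses a hypothetical `Cart` class (my own illustration, not from Metz's talk) where the command `add` and the query `total` are kept separate. It uses Minitest, which ships with Ruby, so that it runs as-is.

```ruby
require "minitest/autorun"

# Hypothetical class with the two kinds of message kept apart.
class Cart
  def initialize
    @prices = []
  end

  # Command: changes state and deliberately returns nothing useful.
  def add(price)
    @prices << price
    nil
  end

  # Query: returns a value and leaves the state alone.
  def total
    @prices.sum
  end
end

class QueryCommandTest < Minitest::Test
  def test_pop_is_both_query_and_command
    items = [1, 2, 3]
    assert_equal 3, items.pop     # query part: a value comes back
    assert_equal [1, 2], items    # command part: the array got shorter
  end

  def test_query_does_not_change_state
    cart = Cart.new
    cart.add(5)
    assert_equal 5, cart.total
    assert_equal 5, cart.total    # asking again gives the same answer
  end
end
```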
Metz also notes a second dimension of categorization for messages:
- incoming: Is the object receiving a message?
- outgoing: Is the object sending a message?
- sent to self: Is the object sending a message to itself?
Multiplying the two categories on the query-command dimension by the three on the message-orientation dimension, we get six categories of message. Do we have to test all of them?
Metz says: no! We actually need to write tests for only half of these.
- Incoming Query: Assert that, given some message, the result is what it should be.
- Incoming Command: Assert that the ‘direct local’ side effects after a message are what they should be. (It occurred to me that this assumes the object under test has a public way to get the state of interest. If it doesn’t, perhaps the right thing to do is to subclass it in your test code, adding an attribute reader.)
- Outgoing Command: Expect that the object sends the right thing.
Notice that the first is very like a functional test. And if you squint, the second is also a bit like a functional test, albeit split in two: sending the same message in identical contexts should have the same effect, even though technically you need to call another method in order to verify this fact.
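The three kinds of test listed above can be sketched against one object. Everything here is a hypothetical illustration, not code from Metz's talk: `Account`, its `deposit` and `rich?` methods, and the hand-rolled `SpyLog` double that stands in for a collaborator. It uses Minitest, which ships with Ruby; in RSpec the outgoing-command check would more likely be written with a mock expectation.

```ruby
require "minitest/autorun"

# Hypothetical object under test.
class Account
  attr_reader :balance

  def initialize(audit_log)
    @audit_log = audit_log
    @balance = 0
  end

  # Incoming command: changes the balance, then sends an outgoing command.
  def deposit(amount)
    @balance += amount
    @audit_log.record(:deposit, amount)
    nil
  end

  # Incoming query: returns a value without changing anything.
  def rich?
    balance >= 1_000
  end
end

# Hand-rolled test double that remembers every message it was sent.
class SpyLog
  attr_reader :received

  def initialize
    @received = []
  end

  def record(*args)
    @received << args
  end
end

class AccountTest < Minitest::Test
  def setup
    @log = SpyLog.new
    @account = Account.new(@log)
  end

  # Incoming query: assert on the returned value.
  def test_new_account_is_not_rich
    refute @account.rich?
  end

  # Incoming command: assert on the direct, publicly visible side effect.
  def test_deposit_increases_balance
    @account.deposit(50)
    assert_equal 50, @account.balance
  end

  # Outgoing command: expect that the right message was sent.
  def test_deposit_is_recorded_in_the_log
    @account.deposit(50)
    assert_equal [[:deposit, 50]], @log.received
  end
end
```

Note that the incoming-command test only works because `balance` is public, which is the assumption raised in the parenthetical above.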
Why don’t we need to test messages sent from an object to itself? Because such a message exists only in order to break the work of an object into smaller parts. But that work, if it matters at all to our application, will matter to the rest of the system. That is, the reason we don’t need to test messages sent to self is that we will discover errors in these messages by testing the public interface of the object: the incoming queries and outgoing commands that it participates in. Rules can be broken, of course, when it’s valuable to do so. If you don’t feel confident you can get these private messages right without testing, Metz says, go ahead and write those tests; she sometimes does that, too. But when she does, she adds a comment to those tests: “If these tests fail, delete them.”
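Here is a sketch of what that looks like in practice, using a hypothetical `Receipt` class and an assumed flat 10% tax on integer cents. The private methods `subtotal` and `tax` are messages sent to self, and the only test goes through the public `total`. It uses Minitest, which ships with Ruby.

```ruby
require "minitest/autorun"

# Hypothetical class: one public method, two private helpers.
class Receipt
  def initialize(prices_in_cents)
    @prices = prices_in_cents
  end

  # Public interface: the only message the test sends.
  def total
    subtotal + tax
  end

  private

  # Messages sent to self; they exist only to split up the work of `total`.
  def subtotal
    @prices.sum
  end

  def tax
    subtotal / 10 # assumed flat 10% tax, integer division on cents
  end
end

class ReceiptTest < Minitest::Test
  # A bug in either private helper would make this test fail,
  # so neither helper needs a test of its own.
  def test_total_includes_tax
    assert_equal 330, Receipt.new([100, 200]).total
  end
end
```

Because the test never names `subtotal` or `tax`, those helpers can be renamed, merged, or inlined without touching the test.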
And outgoing queries? Outgoing queries for one object are incoming queries for another object, so tests of this behavior belong with that receiving object.
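One way to read this, sketched with hypothetical names: a `Greeter` sends the outgoing query `hour` to a clock. The test hands it a stub clock with a canned answer and asserts only on what `Greeter` returns. It never asserts that `hour` was called, and whether a real clock reports the right hour is a matter for the clock's own tests. It uses Minitest, which ships with Ruby.

```ruby
require "minitest/autorun"

# Hypothetical object under test; `@clock.hour` is its outgoing query.
class Greeter
  def initialize(clock)
    @clock = clock
  end

  def greeting
    @clock.hour < 12 ? "Good morning" : "Good afternoon"
  end
end

# Stub collaborator: answers `hour` with whatever value it was built with.
StubClock = Struct.new(:hour)

class GreeterTest < Minitest::Test
  def test_greets_in_the_morning
    assert_equal "Good morning", Greeter.new(StubClock.new(9)).greeting
  end

  def test_greets_in_the_afternoon
    assert_equal "Good afternoon", Greeter.new(StubClock.new(15)).greeting
  end
end
```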