Gödel and Testing

For some months now I’ve being playing around the idea of writing a testing framework for Clojure. It started as just a more extensible fork of the fact library but now I’m trying to explore some funny ideas in the testing semantics.

Although this project is progressing too slowly it already spawned some other pet projects. One of those projects is the struct-quack library.

This is just an extremely naïve implementation of duck-typed structs for Clojure. Although I am ok using maps for pretty much everything I often need “maps that only accept certain keys” – a register/struct pretty much.

As a TDD practitioner I need some testing support for it. As I already had a somewhat working version of my testing library by that time I used it.

After finishing struct-quack’s first iteration I started to rewrite the testing framework itself to use that and perform some smart multimethods magic.

And now I have a hard problem to solve: what am I going to use to test the struct-quack library in its next iteration? Given that neither Java nor Clojure have versioned packages, if I use the library in the testing tool I just put myself in a circular dependency chain.

Circular dependencies are bad in multiple contexts. In this case it’s not only bad but also dangerous. As I modify struct-quack I may introduce a bug. This bug may affect the testing framework itself and tests that should fail could pass. There is no way to assert that struct-quack has no known bugs if its tests rely (even indirectly) on itself.

Shoulder of Giants?

I’m reading The Annotated Turing (great book so far) and the author does a great job in covering some basic concepts behind Turing’s work, including explanations of those in a way non-mathematicians (like me) can understand.

One of those building blocks is the work of Kurt Gödel. This called my attention:

[Gödel] had used the arithmetic developed within the system to associate every formula and every proof with a number. He was then able to develop a formula that asserted its own unprovability. […] The formula asserts nothing about its own truth or falsehood, but instead it’s unprovable. If arithmetic is consistent, then this formula can’t be false, because that would lead to a contradiction. The formula must be true –but true only in a metamathematical sense because truth is not a concept of the logical system itself- which means that it really is unprovable.
[…]
One crucial premise for the Incompleteness Theorem is that arithmetic is consistent. As a corollary, Gödel also showed that a consistency proof for arithmetic within the system was impossible. Because certain formulas could not be proved or disproved, it was possible that these formulas were inconsistent.[…]

As I said before I’m no mathematician but this looks like the same problem we saw in the previous section. It is just impossible to prove –again, in the usual unit testing sense- that a system works if the test code verifying that depends on code from the system under test.

In the Real World

But we all know that tests are not formal proofs and most of us are no Dijkstra or Knuth to validate every single line of code we write. What’s in there for me then?

Whenever I’m in a consulting gig I often find people immersed in the problem described above. The rationale for this is simple: testing requires a lot of code so we must reuse code we already have.

At lower level testing -unit-testing classes or functions- probably all you need is a testing framework and some mocking library. Units are small and easy to test.

The problem arises in higher-level testing. When you start thinking about page models and story runners you often need to create data structures to use during tests. The issue here is that you already have those structures in your domain code.

It is really tempting to just reuse your production code with a thin mapper from HTML back to the class.

But if you think about this what you have is a circular dependency between code under test and test code, just like what I described in sections above:

When I find this scenario it is often common that a huge change will make no tests break as the bug has corrupted the tests themselves. The level of confidence in testing decreases and people stop writing and fixing tests.

Your tests are compromised. In this scenario a failed test says that there is something wrong but a successful test doesn’t mean that everything is ok. If the test pass does it mean that the code is correct? What if the user class has a bug and, therefore, your very test has a bug?

For me a correct test is a test that verifies the right behaviour in the right way. A correct test won’t necessarily be a passing test. A correct system (or code, class, function… whatever unit you use) is one that passes a correct test.

In a good design there must be a clear separation of test code and code under test. They have to be orthogonal, one must not depend on the other to be correct and bugs in one must not interfere in the other in any way.

The duplication of the user class in this example is really annoying and wasteful but the solution proposed above affects the effectiveness of the whole testing process.

I can’t think of an easy solution for this problem. Currently my suggestion for most cases is to avoid using a bureaucratic language like Java to write your testing model. Languages like Clojure, Ruby, JavaScript or Python make it so easy to create data structures that the amount of code required to create your test model is actually less than what it would be required to reuse production code.

12 Responses to “Gödel and Testing”


  1. 1 Ashutosh Mohan Mar 14th, 2009 at 8:40 pm

    The Godel aspect only seems to be a very rough analogy. Mostly forcefully twisting of Godel into an interpretation which is only very vaguely valid.

  2. 2 Phillip Calçado "Shoes" Mar 15th, 2009 at 1:03 am

    Ashutosh,

    It would be lovely if you’ve presented any argument. Any at all. Right now your comment says nothing.

  3. 3 beza1e1 Mar 15th, 2009 at 2:32 am

    Your circular dependency could be broken by versioning. Example: quack 0.1 (untested) <- fato 0.1 <- quack 0.2 <- fato 0.2 <- quack 0.3 <- …
    No circle here.

  4. 4 RobertJ Mar 15th, 2009 at 4:52 am

    While Gödel’s proof is fundamental, your unit test problem is really trivial. With enough resources, unit tests won’t be even necessary because code can be formally proven. So you’re basically comparing redundant code with one of the most important theorems of the last 100 years.

  5. 5 RobertJ Mar 15th, 2009 at 5:03 am

    Additionally, we won’t even need to bother with testing, if it was possible to build automated test machines. Fortunately, we already know that these machines will never exist, so we can move on other problems, like unit testing ;)

  6. 6 ifatree Mar 15th, 2009 at 5:12 am

    easy to answer, actually, harder to make happen, sometimes, but…

    mock up a static app (that won’t change much) to exhaustively work the testing framework, and make sure that one always works before you assume your framework changes are green.

    otherwise you’ve just promoted your test app to production status without leaving a test copy behind. struct-quack was essentially manually testing fato. now that you want fato to test struct-quack, you need to pull out the pieces that can test fato directly and make a third app underneath.

    i hope this concept maps to clojure. i’ve seen people with similar questions lately in every language trying to get their heads around testing. it’s not circular unless you make it circular (which is bad).

  7. 7 Titus Brown Mar 15th, 2009 at 6:12 am

    Automated test code isn’t a proof of any sort, so the analogy doesn’t hold.

    Another way to state Godel’s theorem is, I think, that within any sufficiently interesting formal system it is impossible to prove or disprove certain interesting theorems. Were your programming language a formal system, you could probably draw a stronger analogy; as it is, your language and platform are implementations of some (not necessarily well-specified) system and so long before you get to formal correctness you have to worry about implementation details.

    At least, that’s what I think. ;)

  8. 8 Phillip Calçado "Shoes" Mar 15th, 2009 at 9:46 am

    **anonymous comments are deleted**

    @beza1e1

    Your circular dependency could be broken by versioning.

    True, but as I say in the post neither Clojure nor Java have versioned packaes so I would have to either create a hacky version (i.e. changing namespaces to have their versions in their name) or implement my own versioning system.

    @RobertJ

    I’m not sure you got the idea. It’s an analogy:

    # S: (adj) analogous, correspondent (similar or equivalent in some respects though otherwise dissimilar) “brains and computers are often considered analogous”; “salmon roe is marketed as analogous to caviar”

    ( http://wordnetweb.princeton.edu/perl/webwn?s=analogous )

    @ifatree

    Yes, in the end that’s the approach I’m taking. Fato itself is tested by a simple application that uses it, if this app’s build pass then fato’s build is ok.

    @Titus

    Interesting. Why wouldn’t a test (forget about TDD) e a proof of *any* kind? I’d like to understand your comment better.

    Thanks all.

  9. 9 Damien Pollet Mar 15th, 2009 at 10:16 am

    Why not just make a copy of the whole thing and rename the package/namespace declarations? Basically an ad-hoc versioned package if I understood what that is…

  10. 10 Phillip Calçado "Shoes" Mar 15th, 2009 at 11:38 am

    I thought of that but gave up when thinking that I would have to use fully qualified calls for all functions. Then I thought about porting a thing I wrote for Ruby (that has the same versioning issues) to add version information to Clojure namespaces. That’s yet another pet project to be procrastinated :)

  11. 11 Christoph Johann Szczecina Mar 23rd, 2009 at 7:16 pm

    Hi Phil,

    On coincidence I found your Scrum pictures on flickr, read a line or two here and thought you might be a good person to talk with. I am currently writing my master thesis on Scrum and I am looking for creative people working in Scrums being interested in taking part in. Would be nice hearing from you, if you are :-)

    With best regards form Trondheim,

    Christoph

  1. 1 RTFSpec is now in Alpha at Fragmental.tw Pingback on Apr 17th, 2009 at 12:13 pm

Leave a Reply








Creative Commons License

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.