Thursday, August 28, 2008

GSoC Overview

My GSoC project was all about testing for PyGame:

  • I wrote lots of tests; almost every module in PyGame now has at least one test

  • Test modules can now be isolated in subprocesses; one segfault no longer brings down the whole test suite

  • Can now test for speed regressions; important for real time software such as games

  • PyGame Automated Build Page extended
    • Shows / Collects more info
    • Runs tests in subprocesses


  • Test Stubbing Utility: A Testing "Todo List"

  • Optional Interactive Tests / Test Tagging



For writing the tests I wrote a small utility that inspects the PyGame package, finds all the untested callables (functions, properties) and creates test stubs, including the documentation for each so you don't have to leave the editor. The stubber knows which functions have already been tested because all of the tests follow a naming scheme: essentially "test_$callable__$comment", namespaced by having a TestCase per class and a test module per module.

In this way I could create stubs for each module, essentially a TODO list, and cycle through all the modules looking for tests that were easy to write. The functions in PyGame are many and greatly varied, each requiring somewhat specialised knowledge to test. I wasn't able to write tests for all of them, but hopefully the test stubbing utility will help enable some testing sprints. I intend to develop a testing website where people can submit bugs/tests in the form of a unittest.

PyGame has a somewhat unique set of requirements compared to most Python libraries in that most of the framework is actually written in C. C code, when it goes awry, can do some very strange things. We had a test runner running all of the tests in one single process, so if one test failed hard (a segfault, say) it would bring down the whole suite. This can be a bit of a pain, so I developed a test runner that isolates each module in a subprocess.
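The isolation idea can be sketched with the standard library alone. This is a minimal illustration, not the PyGame runner itself: `run_isolated`, `run_suite` and the timeout value are assumptions, and the real runner collects much richer results than an exit code.

```python
import subprocess
import sys


def run_isolated(argv, timeout=60):
    """Run one test command in its own process.

    Returns (returncode, output). A crash in the child shows up as a
    non-zero return code instead of killing the parent; a hang is cut
    off by the timeout and reported as (None, b"").
    """
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None, b""
    return proc.returncode, proc.stdout


def run_suite(module_names):
    """Run each test module in a fresh interpreter; map name -> return code."""
    return {
        name: run_isolated([sys.executable, "-m", "unittest", name])[0]
        for name in module_names
    }
```

Calling `run_suite` with a list of test module names keeps going after a module crashes, so one bad C extension call costs a single entry in the results rather than the whole run.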

Some of the tests in PyGame have requirements that make them unsuitable for running as part of the main test suite. For example, some require a CD-ROM or a joystick, take way too long, or need interaction with a human. With the test runner script I extended unittest with the ability to exclude certain tests by tags. The tags can be set at the module, class or individual test level and are inheritable and overridable.
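One way tag-based exclusion could look is sketched below. The `tags` decorator, the `__tags__` attribute name and the helper functions are all assumptions made for the example; the post does not show how the real runner stores its tags, and module-level tags are left out to keep it short.

```python
import unittest


def tags(*names):
    """Attach a set of tags to a TestCase class or a test method."""
    def decorate(obj):
        obj.__tags__ = set(names)
        return obj
    return decorate


def tags_for(test):
    """Method-level tags override class-level ones; classes inherit tags."""
    method = getattr(test, test.id().rsplit(".", 1)[-1])
    method_tags = getattr(method, "__tags__", None)
    if method_tags is not None:
        return method_tags
    return getattr(type(test), "__tags__", set())


def iter_tests(suite):
    """Flatten nested TestSuites into individual test cases."""
    for item in suite:
        if isinstance(item, unittest.TestSuite):
            yield from iter_tests(item)
        else:
            yield item


def exclude_tagged(suite, excluded):
    """Return a new suite without tests carrying any excluded tag."""
    excluded = set(excluded)
    kept = unittest.TestSuite()
    for test in iter_tests(suite):
        if not (tags_for(test) & excluded):
            kept.addTest(test)
    return kept
```

Decorating a class with `@tags("interactive")` marks every test in it, while `@tags()` on a single method clears the inherited tags for that test only.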

Another extension to the test runner was the ability to randomize the run order of the tests; the seed is printed out along with the test results, so if there are failures you can seed the randomizer with the failure-inducing seed and reproduce the same order. We also wanted to be able to record the timings of each individual test so we could make comparisons between revisions and platforms. I again extended the test runner with that ability.
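Both ideas fit in a few lines of plain unittest. The sketch below is illustrative only; `shuffled` and `run_timed` are hypothetical names, and the real runner's output format is not shown in the post.

```python
import random
import time
import unittest


def shuffled(tests, seed=None):
    """Return (seed, tests in a random order reproducible from that seed)."""
    if seed is None:
        seed = random.randrange(2 ** 32)
    order = list(tests)
    # A private Random instance keeps the global RNG state untouched.
    random.Random(seed).shuffle(order)
    return seed, order


def run_timed(tests):
    """Run tests one by one, recording wall-clock seconds per test id."""
    result = unittest.TestResult()
    timings = {}
    for test in tests:
        start = time.perf_counter()
        test.run(result)
        timings[test.id()] = time.perf_counter() - start
    return result, timings
```

Replaying a failure means passing the printed seed back into `shuffled` with the tests in the same starting order, and the `timings` dict is the kind of data that can be compared across revisions.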

I worked with Brian Fisher to extend the PyGame automated build page to record the test results in a ZODB and utilize the new test runner to run tests in subprocesses. We will be able to use this information for detecting speed regressions amongst other things.

Saturday, August 23, 2008

Johnny, Kick A Hole Right In The Sky

Johnny, Kick a hole right in the sky! Won't somebody testify? Poke a lion in its eye!

I bought pygame-testify.net today, and set up a Python/CGI based form that takes a zip, enumerates the results, and adds the (safely eval'd) test results dict to a ZODB.
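The post does not say how the results dict is evaluated safely. One common way to do it in current Python is `ast.literal_eval`, which accepts only literals and never calls functions; the `parse_results` helper and the shape of the dict below are assumptions for illustration.

```python
import ast


def parse_results(text):
    """Parse an uploaded results dict without executing any code."""
    value = ast.literal_eval(text)  # literals only: dicts, lists, strings, numbers
    if not isinstance(value, dict):
        raise ValueError("expected a dict of test results")
    return value


# Hypothetical upload: module name -> summary of its run.
uploaded = "{'surface_test': {'failures': 0, 'errors': 0}}"
results = parse_results(uploaded)
```

Anything that is not a plain literal, such as a function call smuggled into the upload, is rejected with a `ValueError` instead of being run.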

I found a multi-part python snippet for POST[ing] of test results.

The test/build page is starting to come together.

I am using htpasswd for security.

Saturday, August 2, 2008

todo_xxxxxxx

I recently altered the "fail incomplete tests" mechanism we use in the pygame test runner. Before, we were doing assertions on test_utils.test_not_implemented(). This would check a module-level variable, test_utils.fail_incomplete_tests, which we would set depending on whether we wanted incomplete tests to fail.

This was a fairly non-invasive technique, but as I was already hijacking the test loading mechanism for filtering tests by tags, I realized I could alter the TestLoader class to pick up tests starting with the prefix "todo_" as well as "test_". Each stub would call TestCase.fail directly, which would only run when the loader was picking up todo_ tests.
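A minimal sketch of that loader change, assuming a subclass of `unittest.TestLoader` that overrides `getTestCaseNames`. The class name `TodoAwareLoader`, the `include_todo` flag and the example test case are hypothetical, not the actual PyGame code.

```python
import unittest


class TodoAwareLoader(unittest.TestLoader):
    """Loads test_* methods as usual, plus todo_* methods on request."""

    def __init__(self, include_todo=False):
        super().__init__()
        self.include_todo = include_todo

    def getTestCaseNames(self, testCaseClass):
        names = list(super().getTestCaseNames(testCaseClass))
        if self.include_todo:
            names += sorted(
                name
                for name in dir(testCaseClass)
                if name.startswith("todo_")
                and callable(getattr(testCaseClass, name))
            )
        return names


class SurfaceExample(unittest.TestCase):
    def test_fill(self):
        pass

    def todo_blit(self):
        # Unwritten stub: only fails when todo_ tests are being loaded.
        self.fail()


normal = TodoAwareLoader().loadTestsFromTestCase(SurfaceExample)
strict = TodoAwareLoader(include_todo=True).loadTestsFromTestCase(SurfaceExample)
```

With the default loader the stub is simply invisible, so there is no global flag to consult inside each test; the strict suite picks it up and reports it as a failure.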

This of course meant altering all the stubs. I briefly pondered doing a mass search and replace, completely automating it, but I don't really trust that for tests.

For the test stubs I have been including the documentation so it's really easy to walk through a test file writing tests without having to leave the editor. I was just using inspect.getdoc to get the __doc__ string.

It seems the documentation included in the .doc files is different to that contained in the __doc__ for each function. The __doc__ seems to be the function signature and a very brief, usually one-sentence, description. The .doc files contain much more detailed descriptions that can be very useful when writing tests.

I quickly added a docs_as_dict() function to makeref.py, then added it to the stub generator. The stub generator will add both the __doc__ and the .doc file documentation to each stub.

I went through semi-manually, updating all the unfilled stubs in each test file with the more complete docs and the new todo_xxxxx test naming. It took about an hour but I feel more confident than if I had just grep'd it.

Everything is pretty much now in place for the test site I wanted to create.

  • Test Timing
  • Test Tagging
  • Isolated Tests