Automated Testing of Software

The OSG Software team maintains over 200 software packages, and every monthly release includes changes to about 20 different packages. How does OSG ensure that its software will work at all sites? Manual testing of individual and integrated components is critical, especially for specific issues or for exploratory testing, where human judgment is important. Increasingly, however, automated testing is being used for fast, broad, and precise coverage of routine test cases. Typically, developers run the automated tests before promoting a package from development into formal testing, and the OSG Release team runs them again as a final check before production. In addition, a broad suite of tests runs every night to catch issues at any stage of the release cycle.

Each automated test run simulates tasks that a site administrator performs: it installs OSG software on a bare machine, configures and integrates the components, verifies basic integrated functionality, and finally cleans up and removes the OSG software. Every step can fail and hence is expressed as a test. The final output of a test run indicates which tests ran, which passed, and which failed. A run has up to about 300 tests, with light coverage of most components and a focus on core functionality such as running jobs, moving files, and handling security, accounting, and monitoring. The same tests can be run in many scenarios, varying by operating system, by the set of software packages installed, and by whether the run is a fresh install or an update (including OSG 3.1 to 3.2 updates).
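The sequence described above can be sketched in a few lines of Python. This is only an illustration of the idea that every step is itself a test, not the actual OSG test code: the function run_test_sequence, the four placeholder steps, and the rule that later steps are skipped after a failure while cleanup is always attempted are all assumptions made for this example.

```python
from typing import Callable, List, Tuple

# A step is a (name, action) pair; the action raises an exception on failure.
Step = Tuple[str, Callable[[], None]]


def run_test_sequence(steps: List[Step], cleanup: Step) -> List[Tuple[str, str]]:
    """Run steps in order and return (name, status) pairs.

    Assumption for this sketch: once a step fails, later steps are
    recorded as "skip", but the cleanup step is always attempted.
    """
    results: List[Tuple[str, str]] = []
    failed = False
    for name, action in steps:
        if failed:
            results.append((name, "skip"))
            continue
        try:
            action()
            results.append((name, "pass"))
        except Exception:
            results.append((name, "fail"))
            failed = True

    # Cleanup is a test too: it can pass or fail like any other step.
    cleanup_name, cleanup_action = cleanup
    try:
        cleanup_action()
        results.append((cleanup_name, "pass"))
    except Exception:
        results.append((cleanup_name, "fail"))
    return results


# Hypothetical placeholder steps standing in for the real work.
def install() -> None:
    pass


def configure() -> None:
    raise RuntimeError("simulated configuration failure")


def verify() -> None:
    pass


def remove() -> None:
    pass


report = run_test_sequence(
    [("install", install), ("configure", configure), ("verify", verify)],
    ("cleanup", remove),
)
for step_name, status in report:
    print(f"{step_name:10s} {status}")
```

The report lists every step with its outcome, which mirrors the kind of summary a test run produces: which tests ran, which passed, and which failed.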

A full suite of all permutations of test scenarios includes about 400 individual test runs, each taking 15–30 minutes to complete. On a single machine, running them one after another would take 100 to 200 hours, or about 150 hours on average; yet we want to run this suite every night. Fortunately, each test run is independent of the others, which makes this a perfect opportunity to use high-throughput computing. The automated suite runs at the University of Wisconsin–Madison’s Center for High Throughput Computing using HTCondor’s virtual machine universe. Each test run is a separate job and runs in a virtual machine on an execute node. DAGMan organizes the workflow: preparing the suite, setting up input disk images for each test scenario, running the tests, and analyzing the results. In addition to the fully automated nightly runs, any OSG developer can initiate a test run with a few simple commands, which makes this powerful testing tool easy to use.
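To make the workflow shape concrete, here is a minimal Python sketch that writes a DAGMan input file with the four stages named above. It uses only the standard DAGMan keywords JOB, VARS, and PARENT ... CHILD. Everything else is an assumption for illustration: the submit file names (prepare.sub, make_image.sub, run_test.sub, analyze.sub), the scenario labels, and the node names are hypothetical and are not taken from the real OSG setup, and the handling of failed test jobs is left out.

```python
from typing import List


def build_dag(scenarios: List[str]) -> str:
    """Return DAGMan input text: prepare -> (image -> test) per scenario -> analyze."""
    lines = ["JOB prepare prepare.sub"]  # hypothetical submit file
    test_nodes = []
    for index, scenario in enumerate(scenarios):
        image_node = f"image_{index}"
        test_node = f"test_{index}"
        test_nodes.append(test_node)
        # One disk-image job and one test job per scenario.
        lines.append(f"JOB {image_node} make_image.sub")
        lines.append(f'VARS {image_node} scenario="{scenario}"')
        lines.append(f"JOB {test_node} run_test.sub")
        lines.append(f'VARS {test_node} scenario="{scenario}"')
        # Each scenario only depends on the shared preparation step,
        # so all scenarios can run at the same time.
        lines.append(f"PARENT prepare CHILD {image_node}")
        lines.append(f"PARENT {image_node} CHILD {test_node}")
    lines.append("JOB analyze analyze.sub")
    # Analysis waits for every test job (or just for prepare if there are none).
    parents = " ".join(test_nodes) if test_nodes else "prepare"
    lines.append(f"PARENT {parents} CHILD analyze")
    return "\n".join(lines) + "\n"


# Hypothetical scenario labels; a real suite would have about 400 of them.
dag_text = build_dag(["platformA-fresh", "platformA-update", "platformB-fresh"])
print(dag_text)
```

Because the per-scenario chains share no dependencies with each other, a scheduler is free to run as many of them in parallel as there are available machines, which is what turns a suite of well over 100 serial hours into something that fits in a night.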

Latest test results are viewable here.

~ Tim Cartwright