Detecting Software Errors Using Genetic Algorithms

Saarland University

According to a recent study from the University of Cambridge, software developers spend about half of their time detecting and resolving errors. Projected onto the global software industry, the study estimates, this amounts to a bill of about 312 billion US dollars every year. “Of course, automated testing is cheaper,” explains Andreas Zeller, professor of Software Engineering at Saarland University, since a program can be run a thousand times at no extra cost. “But where do these necessary test cases come from?” asks Zeller. “Generating them automatically is tough, but thinking of them yourself is even tougher.”

In cooperation with the computer scientists Nikolas Havrikov and Matthias Höschele, he has now developed the software system “XMLMATE.” It generates test cases automatically and uses them to test the given program code. What is special about it is that the program under test has to meet only one requirement: its input must be structured in a defined way, because the researchers use this structure to generate the initial set of test cases. They feed these test cases to a so-called genetic algorithm, on which the testing is based. It works similarly to biological evolution, with the inputs playing the role of the chromosomes. Only inputs that cover a significant amount of code that has not yet been executed survive. Nikolas Havrikov, who implemented XMLMATE, explains the strategy: “It is not easy to detect a real error, and the more code we are covering, the more sure we can be that more errors will not occur.” Zeller adds: “As we use the real existing input interface, we make sure that there are no false alarms: Every error found can also happen during the execution of the program.”
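The survival rule described above can be illustrated with a minimal Python sketch. It is not XMLMATE's implementation (XMLMATE is written in Java, and its internals are not described here). The program under test (`toy_program`), its branch labels, and the helpers `make_input`, `mutate` and `evolve` are all hypothetical. The sketch also simplifies in two ways: it uses mutation only, and it keeps any input that covers at least one new branch rather than a "significant amount" of new code.

```python
import random
import xml.etree.ElementTree as ET


def toy_program(xml_text):
    """Hypothetical program under test; returns the branch labels it executed."""
    covered = set()
    root = ET.fromstring(xml_text)
    qty = int(root.get("qty", "0"))
    if qty > 100:
        covered.add("bulk")
    else:
        covered.add("retail")
    if root.get("express") == "true":
        covered.add("express")
    if qty == 0:
        covered.add("empty")
    return covered


def make_input(qty, express):
    """Build one structured (XML) input for the toy program."""
    attrs = {"qty": str(qty), "express": "true" if express else "false"}
    return ET.tostring(ET.Element("order", attrs), encoding="unicode")


def mutate(xml_text, rng):
    """Change one attribute while keeping the input well-formed."""
    root = ET.fromstring(xml_text)
    if rng.random() < 0.5:
        root.set("qty", str(rng.choice([0, 1, 50, 101, 1000])))
    else:
        root.set("express", rng.choice(["true", "false"]))
    return ET.tostring(root, encoding="unicode")


def evolve(seeds, generations=200, seed=0):
    """Keep only inputs that execute branches no earlier input has executed."""
    if not seeds:
        raise ValueError("at least one seed input is required")
    rng = random.Random(seed)
    survivors, total = [], set()
    for candidate in seeds:
        coverage = toy_program(candidate)
        if coverage - total:
            survivors.append(candidate)
            total |= coverage
    for _ in range(generations):
        child = mutate(rng.choice(survivors), rng)
        coverage = toy_program(child)
        if coverage - total:  # child reaches code not executed so far
            survivors.append(child)
            total |= coverage
    return survivors, total


if __name__ == "__main__":
    kept, branches = evolve([make_input(1, False)])
    print(len(kept), "surviving inputs cover", sorted(branches))
```

Because every candidate is fed through the program's ordinary input interface, any failure it triggers is one the program can also hit in real use, which is the point Zeller makes about false alarms.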

The researchers have unleashed their software on open-source programs that users already work with in daily life. With XMLMATE they detected almost twice as many fatal errors as similar test methods that work only with randomly generated input. “But the best thing is that we are completely independent from the application area. With our framework, we are not only able to test computer networks, the processing of datasets, websites or operating systems, but we can also examine software for sensors in cars,” says Zeller.

The computer scientists in Saarbrücken developed XMLMATE in the Java programming language. The input for the software under test is defined in the description language XML, so an existing XML schema is helpful. Since XML is standardized and regarded as a kind of lingua franca among input formats, most program input fits XMLMATE; where it does not, it can be quickly converted with the corresponding tools.
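To illustrate why a schema helps, the following Python sketch derives random but structurally valid XML inputs from a description of the allowed elements. The dictionary `SCHEMA` is a hypothetical stand-in for a real XML schema, and `generate` and `conforms` are illustrative helpers, not part of XMLMATE. Inputs produced this way could serve as an initial set of test cases.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical mini "schema": element name -> attribute types and allowed children.
SCHEMA = {
    "order": {"attrs": {"id": "int"}, "children": ["item"]},
    "item": {"attrs": {"qty": "int", "express": "bool"}, "children": []},
}


def random_value(kind, rng):
    """Produce a random attribute value of the given (assumed) type."""
    if kind == "int":
        return str(rng.randint(0, 1000))
    if kind == "bool":
        return rng.choice(["true", "false"])
    raise ValueError(f"unknown attribute type: {kind}")


def generate(name, rng, max_children=3):
    """Build a random element that follows the structure given in SCHEMA."""
    spec = SCHEMA[name]
    attrs = {attr: random_value(kind, rng) for attr, kind in spec["attrs"].items()}
    elem = ET.Element(name, attrs)
    for child_name in spec["children"]:
        for _ in range(rng.randint(0, max_children)):
            elem.append(generate(child_name, rng, max_children))
    return elem


def conforms(elem):
    """Check that an element tree uses only the structure allowed by SCHEMA."""
    spec = SCHEMA.get(elem.tag)
    if spec is None or set(elem.attrib) != set(spec["attrs"]):
        return False
    return all(c.tag in spec["children"] and conforms(c) for c in elem)


if __name__ == "__main__":
    doc = generate("order", random.Random(1))
    print(ET.tostring(doc, encoding="unicode"))
```

A real XML schema carries far more detail (types, value ranges, occurrence constraints), but the principle is the same: the structure description tells the generator which inputs the program can accept at all.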