Code reviews on steroids using mutation testing

I have always found reviewing tests a bit tedious, especially when they aren't written in a way that makes them easy to read. Today I reviewed a pull request containing a significant number of tests. Rather than scrutinizing each test case to make sure there were no gaps, I decided to fetch the branch and apply mutation testing by hand.

Mutation testing (or mutation analysis or program mutation) is used to design new software tests and evaluate the quality of existing software tests. Mutation testing involves modifying a program in small ways.[1] Each mutated version is called a mutant and tests detect and reject mutants by causing the behaviour of the original version to differ from the mutant. This is called killing the mutant. Test suites are measured by the percentage of mutants that they kill. New tests can be designed to kill additional mutants. Mutants are based on well-defined mutation operators that either mimic typical programming errors (such as using the wrong operator or variable name) or force the creation of valuable tests (such as dividing each expression by zero). The purpose is to help the tester develop effective tests or locate weaknesses in the test data used for the program or in sections of the code that are seldom or never accessed during execution.
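To make the vocabulary concrete, here is a minimal sketch in Python. Everything in it (the `is_adult` function, the two suites, the `kills` helper) is a hypothetical illustration and not code from the pull request I reviewed. It shows one mutant produced by a "wrong operator" mutation, a suite that lets it survive, and a suite that kills it.

```python
def is_adult(age):
    """Original implementation (hypothetical example)."""
    return age >= 18


def is_adult_mutant(age):
    """Mutant: the '>=' operator has been replaced with '>'."""
    return age > 18


def weak_suite(fn):
    """Never probes the boundary, so it passes for both versions."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn):
    """Same checks plus the boundary case age == 18."""
    return weak_suite(fn) and fn(18) is True


def kills(suite, original, mutant):
    """A suite kills a mutant if it passes on the original and fails on the mutant."""
    return suite(original) and not suite(mutant)


print(kills(weak_suite, is_adult, is_adult_mutant))    # the mutant survives
print(kills(strong_suite, is_adult, is_adult_mutant))  # the mutant is killed
```

The surviving mutant points straight at the missing test: nothing in the weak suite checks what happens at exactly 18.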

I started making small, random modifications to the new code that the tests covered. After each change I ran the tests: a failure was the expected behaviour, while a suite that still passed meant the change had gone undetected. This let me pinpoint a few missing test cases.
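The same loop can be sketched in code. The version below is an assumption-laden toy, not what I actually ran: it uses Python's standard `ast` module to flip one comparison operator at a time in a hypothetical `grade` function, runs a deliberately incomplete suite against each mutant, and reports which mutants survive. The `SWAPS` table, `grade`, and `suite_passes` are all made up for illustration.

```python
import ast

# Hypothetical code under review.
SOURCE = '''
def grade(score):
    if score >= 90:
        return "A"
    if score >= 50:
        return "pass"
    return "fail"
'''

# Assumed mutation operators: nudge each comparison across its boundary.
SWAPS = {ast.Lt: ast.LtE, ast.LtE: ast.Lt, ast.Gt: ast.GtE, ast.GtE: ast.Gt}


class SwapNthComparison(ast.NodeTransformer):
    """Swap the operator of the n-th eligible comparison, leave the rest alone."""

    def __init__(self, target):
        self.target = target
        self.seen = 0

    def visit_Compare(self, node):
        self.generic_visit(node)
        replacement = SWAPS.get(type(node.ops[0]))
        if replacement is not None:
            if self.seen == self.target:
                node.ops[0] = replacement()
            self.seen += 1
        return node


def load_grade(tree):
    """Compile a (possibly mutated) module tree and return its grade function."""
    namespace = {}
    exec(compile(tree, "<grade>", "exec"), namespace)
    return namespace["grade"]


def suite_passes(grade):
    """Deliberately incomplete suite: it never checks a score of exactly 90."""
    return (grade(95) == "A" and grade(70) == "pass"
            and grade(50) == "pass" and grade(10) == "fail")


def surviving_mutants(source):
    """Return (indexes of mutants the suite fails to kill, total mutants)."""
    total = sum(
        isinstance(node, ast.Compare) and type(node.ops[0]) in SWAPS
        for node in ast.walk(ast.parse(source))
    )
    survivors = []
    for index in range(total):
        tree = ast.parse(source)
        SwapNthComparison(index).visit(tree)
        ast.fix_missing_locations(tree)
        if suite_passes(load_grade(tree)):
            survivors.append(index)
    return survivors, total


survivors, total = surviving_mutants(SOURCE)
print(f"{total - len(survivors)} of {total} mutants killed, survivors: {survivors}")
```

Here the mutant that turns `score >= 90` into `score > 90` survives, because no test exercises that boundary. That is exactly the kind of gap I was finding by hand.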

It's a shame that I had to do this manually, but I view it as a cheap MVP for advocating mutation testing on our project. After a few more code reviews like this one, the natural next step would be to think about automating the process. I haven't yet checked the literature on mutation testing for Android development, so I am not sure whether that is realistic at this stage.