Coverage Test Metrics and Test Quality


Coverage test metrics are vital for measuring the efficacy of your software development team's testing efforts. The point of software testing is to exercise as much of the code as possible so that you ship a high-quality product, and coverage metrics help you evaluate how well you are doing this.

There is often confusion about coverage test metrics because different types of coverage metrics measure quite different things. The following article explains the main coverage metrics that development teams measure, how insightful coverage metrics are in terms of what they tell you about the quality of underlying applications, and what other metrics can measure overall test quality.


Types of Coverage Metrics

Line Coverage    

Line coverage (also known as statement coverage) measures the percentage of lines of code executed when you run a suite of tests. For example, consider the below code illustration:

[Image: coverage 1 — code illustration]

If a test runs four out of five of these lines, the line coverage is 80 percent. While line coverage is useful for highlighting obvious areas of untested code that need new test cases, it doesn't give a comprehensive picture of software quality; it is a rather primitive coverage metric.

Branch Coverage

Branch coverage measures whether each possible branch from each decision point in the code is executed at least once in software test cases. Branch coverage ensures tests cover what happens when both the true and false conditions in a statement are hit.

For an illustration of branch coverage, consider the below code in the C programming language:

[Image: coverage 2 — C code example]

A test case for this code that only covers what happens when the if conditions are met (x greater than 0 and y greater than 0) gives a branch coverage of only 50 percent. This is because there is another scenario that needs testing which our test case has missed: what happens when the if conditions are not met? To obtain 100 percent branch coverage, the test case would also need to exercise the code when either of the if conditions doesn't hold true.

It is more challenging to achieve a high percentage of branch coverage; however, high branch coverage implies a codebase that is more comprehensively tested than one with merely high statement coverage. Branch coverage is therefore more indicative of software quality than simple line coverage, because more of the application's behavior has been checked.

Functional Coverage

Functional coverage is a measure of how much of the software's functionality is covered by the tests you write. Functional coverage is difficult to achieve retroactively for an existing large codebase, because doing so requires writing many tests, a good number of which will end up being redundant.

To improve the chances of achieving full functional coverage, tests should be written for any new functionality at the time (or even before) the code is written. The latter practice is referred to as test-driven development (TDD), and it ensures all functionality is covered by tests while minimizing redundant testing.

How Effective Are Coverage Metrics?

Code coverage is one of the most common test metrics used by software QA teams.

As a broad overview metric, code coverage is useful in determining areas of code that don't execute in response to a stimulus (i.e. running a test).

But you can't glean any information about the quality of tests using code coverage—you might have 100% code coverage from badly written tests that don't actually test anything.

Aiming for a particular percentage of code coverage as a goal in itself is misguided. High code coverage tends to be the result of good testing practices, not the cause of them; low code coverage, meanwhile, should prompt investigation.

Functional coverage metrics are more effective because they tie testing directly to the functionality you need your software to provide for users. Striving for high functional coverage improves your chances of shipping high-quality software.

Potential Problems with Test Coverage

There are many types of coverage metrics, but arguably the most controversial one is test coverage—this is a measure of how much of the code is exercised when we run a test suite. The terms test coverage and code coverage are often used interchangeably to describe the same metric.

The previous comment on poor test quality is only the beginning of the potential issues with test coverage, which extend to:

  • Acceptable levels: how do you define what constitutes acceptable test coverage? 80 percent? 90? There is no real industry standard here; the informal consensus is that 80 percent is adequate, but this consensus shifts for “critical” systems. With no real definition of an acceptable test coverage level, confusion ensues as to what software development teams should aim for.
  • Fudging: consider this example of a testing requirement: “you can't go into production with less than 88% coverage.” In a project that demands a certain level of coverage, who is to say developers won't simply manipulate this metric by writing a host of easy, low-quality test cases that don't actually test to the level of detail required for a quality application?

A possible solution to the issues with test coverage is to improve the ratio of good tests in your test suites, for example by adopting a practice such as test-driven development (TDD). With TDD, you code the application in short increments driven by initially failing test cases, each of which defines a desired function or improvement in the application.

The below diagram shows the TDD process in motion:

[Image: coverage 3 — TDD process diagram]

The TDD approach works because it ensures all application requirements are covered by test cases. By definition, TDD should lead to good test cases because it focuses on designing tests around user requirements. Furthermore, TDD implies high coverage, since all application code is written to make a test pass.

Metrics That Measure Overall Test Quality

Since we've established that code coverage doesn't say much about test quality, it's natural to move on to some metrics that do measure test quality. The below metrics can better gauge the quality of any testing effort than code coverage.

Defect Removal Efficiency

Defect removal efficiency (DRE) tells you the percentage of defects removed during the testing phase out of the total defects found, including those found after release. Since tests are written to find bugs in the software, this metric tells you how well the testing team is writing its tests. You can compare DRE across releases to track and improve the testing effort.

[Image: coverage 4 — defect removal efficiency graphed over time]

The above diagram shows you how defect removal efficiency can be graphed over time, showing whether the testing efficiency is improving in terms of finding and fixing defects.

Number of Severe Defects Found In Production

While defect removal efficiency is useful for determining test quality, it makes no distinction between types of defects. For example, a low defect removal efficiency might be skewed by lots of minor defects found after a release, defects that don't necessarily affect the software's core functionality.

By categorizing defects and measuring the number of severe bugs found in production, you also get a measure of how well the testing team is doing its job. Ideally, no severe bugs should make it through to production.

Closing Thoughts

  • Test metrics that focus on code coverage, test coverage, and functional coverage are all useful in their own ways at improving software testing efforts.
  • Many organizations monitor their unit test code coverage with statement coverage and branch coverage, but it is essential to monitor functional coverage as well.
  • You should use other metrics alongside code coverage to get a complete picture of your testing effort and the quality of the tests your team runs.
  • The plethora of different strategies and techniques used to test both functional and non-functional software requirements leads to huge challenges for organizations in getting a holistic picture of the entire testing process and the quality of software they produce.
  • A potential solution is the Sealights continuous testing platform, which provides a unified dashboard across all types of tests and the ability to monitor functional test code coverage. This allows teams to identify effectiveness, duplications, and overlaps, and use the data to plan effective and efficient test automation.
  • It's important not to focus only on unit test coverage. Development teams also need greater regression test coverage, which better approximates software quality because it tests for dependencies between both new and old software features.
  • Additionally, teams need to properly test the user interface. UI tests are typically difficult to write and maintain, and they are slow to execute. Sealights' continuous testing platform simplifies and speeds up the UI testing process.
Last modified on Thursday, 09 November 2017 17:58