Debunking Cyclomatic Complexity
March 15, 2008
There can be little doubt that metrics are emerging as a new dimension in the management of code quality. Whereas five years ago, few people except for software engineering wonks cared to run metrics on their code, now many managers are starting to view metrics dashboards as a key tool for knowing where a project stands. I suspect, but don’t know for sure, that the door was opened by unit testing: the ability to have a visual display of test results and then of code coverage stimulated the desire to obtain additional quantitative data about codebases.
The current infatuation with metrics has led to the creation of many new metrics—an explosion of measures that perfectly parallels the explosion of sabermetrics in baseball. With so many emerging metrics, it’s hard to know which are useful. So, almost a year ago, Enerjy—a company that specializes in metrics dashboards—decided to undertake extensive research into which metrics most closely track the likelihood of defects. They examined more than 50 open-source projects. They combed through code release-by-release and matched bug reports back to the individual modules; from this, they built up a statistical model that would identify which metrics were the best predictors of problems in code. The No. 1 predictor—I doubt this will surprise anyone—is the amount of code in a given module. The more code, the greater the odds of a bug. This seems kind of obvious: if all code can contain bugs, then the more code a module has, the more likely it is to contain one. However, as unremarkable as this correlation is, it testifies powerfully to the benefit of small, discrete methods, which is a keystone of object-oriented programming.
What the survey did show, however, is that code complexity does not correlate directly with defect probability. Enerjy measured complexity via the cyclomatic complexity number (CCN), also known as the McCabe metric, which counts the number of independent paths through a given chunk of code. Even though CCN has limitations (for example, every case statement is treated as equal to a new if-statement), it’s relied on as a solid gauge. What Enerjy found was that routines with CCNs of 1 through 25 did not follow the expected pattern that a greater CCN correlates with a greater probability of defects. Rather, it found that for CCNs of 1 through 11, the higher the CCN, the lower the bug probability. It was not until CCN reached 25 that defect probability rose enough to equal that of routines with a CCN of 1. This is an important discovery, because it essentially states that there is no correlation between CCNs of 1 through 25 and bug expectancy: for half this range, a higher CCN indicates a lower chance of defects, while for the other half, it implies a higher likelihood.
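To make the metric itself concrete: CCN is conventionally computed as 1 plus the number of decision points in a routine. The sketch below is a deliberately simplified estimator I've written for illustration (it is not Enerjy's tool or McCabe's original implementation), using Python's standard `ast` module and counting only a few common branch constructs:

```python
import ast
import textwrap

def cyclomatic_complexity(source: str) -> int:
    """Rough CCN estimate for Python source: 1 + number of decision points.

    A simplified sketch: it counts if/elif, loops, exception handlers,
    and conditional expressions as decision points. As noted above,
    schemes like this treat each case-style branch (here, each elif)
    the same as a new if-statement; production tools refine the rules.
    """
    tree = ast.parse(textwrap.dedent(source))
    # Each of these node types adds one independent path through the code.
    decision_nodes = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)
    return 1 + sum(isinstance(node, decision_nodes) for node in ast.walk(tree))

src = """
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"
"""
print(cyclomatic_complexity(src))  # 3: the if and the elif each add a path
```

A straight-line routine with no branches scores the minimum CCN of 1; by the survey's findings, that routine is no safer a bet than one scoring 25.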