
Debunking Cyclomatic Complexity



Andrew Binstock
March 15, 2008
There can be little doubt that metrics are emerging as a new dimension in the management of code quality. Whereas five years ago, few people except for software engineering wonks cared to run metrics on their code, now many managers are starting to view metrics dashboards as a key tool for knowing where a project stands. I suspect, but don’t know for sure, that the door was opened by unit testing: the ability to have a visual display of test results and then of code coverage stimulated the desire to obtain additional quantitative data about codebases.

The current infatuation with metrics has led to the creation of many new metrics, an explosion of measures that perfectly parallels the explosion of sabermetrics in baseball. With so many emerging metrics, it’s hard to know what is useful. So, almost a year ago, Enerjy, a company that specializes in metrics dashboards, decided to undertake extensive research into which metrics best track the likelihood of defects. They examined more than 50 open-source projects. They combed through code release by release and matched bug reports back to the individual modules; from this, they built up a statistical model that would identify which metrics were the best predictors of problems in code. The No. 1 predictor (I doubt this will surprise anyone) is the amount of code in a given module. The more code, the greater the odds of a bug. This seems kind of obvious: If all code has bugs, the more code in a module, the more likely it will have bugs. However, as unremarkable as this correlation is, it testifies powerfully to the benefit of small, discrete methods, which is a keystone of object-oriented programming.

What the survey did not show, however, is that code complexity correlates directly to defect probability. Enerjy measured complexity via the cyclomatic complexity number (CCN), also known as the McCabe metric. It counts the number of independent paths through a given chunk of code. Even though CCN has limitations (for example, every case statement is treated as equal to a new if-statement), it’s relied on as a solid gauge. What Enerjy found was that routines with CCNs of 1 through 25 did not follow the expected pattern, in which greater CCN correlates to greater probability of defects. Rather, it found that for CCNs of 1 through 11, the higher the CCN, the lower the bug probability. It was not until CCN reached 25 that defect probability rose sufficiently to equal that of routines with a CCN of 1. This is an important discovery, because it essentially states that there is no correlation between CCNs of 1 through 25 and bug expectancy. By no correlation I mean here that for half this range, a higher CCN indicates a lower chance of defects, while for the other half of the span, it implies a higher likelihood.
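For readers who have not computed a CCN by hand, the following is a minimal sketch of the idea, not Enerjy's tooling or any particular product's algorithm. It assumes the common approximation that CCN equals one plus the number of decision points, and it uses Python's standard ast module to count them. The function name cyclomatic_complexity and the sample routines are hypothetical, chosen only for illustration. Note how each elif arm adds one to the count, which is the same treatment a case statement gets.

```python
import ast

# Node types that each add one decision point.
# Which constructs to count is an assumption of this sketch;
# real tools differ in the details.
_BRANCH_NODES = (ast.If, ast.IfExp, ast.For, ast.While, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate CCN as 1 + the number of decision points in `source`."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' has two short-circuit branch points.
            decisions += len(node.values) - 1
    return decisions + 1


# Hypothetical sample routines.
STRAIGHT = "def f(x):\n    return x + 1\n"

BRANCHY = '''
def grade(score):
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    return "F"
'''

if __name__ == "__main__":
    print(cyclomatic_complexity(STRAIGHT))  # 1: a single path
    print(cyclomatic_complexity(BRANCHY))   # 4: three decisions plus one
```

A routine like grade scores 4 even though most readers would call it trivial to follow, which hints at why a raw CCN in the low range may say little about where bugs actually live.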



