Introducing Skunk: Combine Code Quality and Coverage to Calculate Your Project's SkunkScore
Two weeks ago I had the opportunity to speak at Solidus Conf 2019. I presented Escaping the Tar Pit for the first time and I got to talk about a few metrics that we can use to quickly assess code quality in any Ruby project.
In this article I’d like to talk about Skunk: A SkunkScore Calculator! I’ll explain why we need it, how it works, and the roadmap for this new tool.
Every month we get contacted by leads (potential clients) who want to work with us on their Rails upgrade projects. Given that we have some basic requirements for all of our new client projects, we want to carefully analyze every project before we commit to working on it.
We analyze two very important aspects:
- Code Coverage
- Code Quality
For Code Coverage we like to use SimpleCov. For Code Quality we like to use RubyCritic. Both tools give us a few signals which tell us a story about the health of a Rails application. We want to answer these questions:
- Is it a dumpster fire?
- Are we going to get ourselves stuck in the tar pit?
- Is it a project that is easy to maintain?
Skunk is a Ruby gem that combines code quality metrics from Reek, Flay, and Flog with code coverage data from SimpleCov to calculate a SkunkScore for a file or set of files.
Skunk is a library built on top of RubyCritic. It uses the cost value for each module:
module RubyCritic
  class AnalysedModule
    def cost
      @cost ||= smells.map(&:cost).inject(0.0, :+) +
                (complexity / COMPLEXITY_FACTOR)
    end
  end
end
The cost is a combination of smells and complexity:
- Smells: They come from static code analysis performed by Flog, Flay, and Reek.
- Complexity: It comes from Flog’s total ABC metric.
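To make the formula concrete, here is a minimal standalone sketch of the same calculation. It does not load RubyCritic: the `Smell` struct, the smell names, and the `COMPLEXITY_FACTOR` value of 25.0 are assumptions made for illustration, so check RubyCritic's source for the real definitions.

```ruby
# Standalone sketch of the cost formula; it does not load RubyCritic.
# Smell is a hypothetical stand-in for RubyCritic's smell objects.
Smell = Struct.new(:type, :cost)

# Assumed value for illustration only.
COMPLEXITY_FACTOR = 25.0

# Cost = sum of the smell costs + scaled complexity.
def module_cost(smells, complexity)
  smells.sum(0.0, &:cost) + (complexity / COMPLEXITY_FACTOR)
end

# Illustrative smells and complexity, not taken from a real analysis.
smells = [Smell.new("DuplicateCode", 2), Smell.new("TooManyStatements", 1)]
puts module_cost(smells, 50.0) # (2 + 1) + (50.0 / 25.0)
```

With these made-up inputs the smells contribute 3.0 and the complexity contributes 2.0, so both signals weigh on the final cost.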
After determining the cost, Skunk penalizes modules which lack code coverage by multiplying their churn_times_cost (the cost multiplied by the churn) by a factor directly related to the lack of coverage:
module RubyCritic
  # Monkey-patches RubyCritic::AnalysedModule to add a skunk_score method
  class AnalysedModule
    PERFECT_COVERAGE = 100

    # Returns a numeric value that represents the skunk_score of a module:
    #
    # If module is perfectly covered, skunk score is the same as the
    # `churn_times_cost`
    #
    # If module has no coverage, skunk score is a penalized value of
    # `churn_times_cost`
    #
    # For now the skunk_score is calculated by multiplying `churn_times_cost`
    # times the lack of coverage.
    #
    # For example:
    #
    # When `churn_times_cost` is 100 and module is perfectly covered:
    # skunk_score => 100
    #
    # When `churn_times_cost` is 100 and module is not covered at all:
    # skunk_score => 100 * 100 = 10_000
    #
    # When `churn_times_cost` is 100 and module is covered at 75%:
    # skunk_score => 100 * 25 (percentage uncovered) = 2_500
    #
    # @return [Float]
    def skunk_score
      return churn_times_cost.round(2) if coverage == PERFECT_COVERAGE

      (churn_times_cost * (PERFECT_COVERAGE - coverage.to_i)).round(2)
    end
  end
end
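As a quick sanity check, the same formula can be written as a standalone method and fed a few values. This is only a sketch: the free-standing `skunk_score(churn_times_cost, coverage)` signature is an assumption for illustration, since the real method reads both values from the analysed module.

```ruby
PERFECT_COVERAGE = 100

# Same arithmetic as the monkey-patch above, with the inputs passed in
# explicitly so it can run without RubyCritic.
def skunk_score(churn_times_cost, coverage)
  return churn_times_cost.round(2) if coverage == PERFECT_COVERAGE

  (churn_times_cost * (PERFECT_COVERAGE - coverage.to_i)).round(2)
end

puts skunk_score(7.9956, 100.0)             # fully covered: no penalty
puts skunk_score(1.7184, 72.72727272727273) # 1.7184 * (100 - 72)
puts skunk_score(1.6644, 0)                 # no coverage: 1.6644 * 100
```

Note that `coverage.to_i` truncates, so a module covered at 72.7% is penalized as if it were covered at 72%.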
After doing all these calculations, we get a Skunk Score for the files we are evaluating:
$ skunk
running flay smells
.............
running flog smells
.............
running reek smells
.............
running complexity
.............
running attributes
.............
running churn
.............
running simple_cov
.............
New critique at file:////skunk/tmp/rubycritic/overview.html
+-----------------------------------------------------+----------------------------+----------------------------+----------------------------+----------------------------+----------------------------+
| file | skunk_score | churn_times_cost | churn | cost | coverage |
+-----------------------------------------------------+----------------------------+----------------------------+----------------------------+----------------------------+----------------------------+
| lib/skunk/cli/commands/default.rb | 166.44 | 1.6643999999999999 | 3 | 0.5548 | 0 |
| lib/skunk/cli/application.rb | 139.2 | 1.392 | 3 | 0.46399999999999997 | 0 |
| lib/skunk/cli/command_factory.rb | 97.6 | 0.976 | 2 | 0.488 | 0 |
| test/test_helper.rb | 75.2 | 0.752 | 2 | 0.376 | 0 |
| lib/skunk/rubycritic/analysed_module.rb | 48.12 | 1.7184 | 2 | 0.8592 | 72.72727272727273 |
| test/lib/skunk/cli/commands/status_reporter_test.rb | 45.6 | 0.456 | 1 | 0.456 | 0 |
| lib/skunk/cli/commands/base.rb | 29.52 | 0.2952 | 3 | 0.0984 | 0 |
| lib/skunk/cli/commands/status_reporter.rb | 8.0 | 7.9956 | 3 | 2.6652 | 100.0 |
| test/lib/skunk/rubycritic/analysed_module_test.rb | 2.63 | 2.6312 | 2 | 1.3156 | 100.0 |
| lib/skunk.rb | 0.0 | 0.0 | 2 | 0.0 | 0 |
| lib/skunk/cli/options.rb | 0.0 | 0.0 | 2 | 0.0 | 0 |
| lib/skunk/version.rb | 0.0 | 0.0 | 2 | 0.0 | 0 |
| lib/skunk/cli/commands/help.rb | 0.0 | 0.0 | 2 | 0.0 | 0 |
+-----------------------------------------------------+----------------------------+----------------------------+----------------------------+----------------------------+----------------------------+
Skunk Score Total: 612.31
Modules Analysed: 13
Skunk Score Average: 0.47100769230769230769230769231e2
Worst Skunk Score: 166.44 (lib/skunk/cli/commands/default.rb)
The most important signals here are:
- Average Skunk Score per module
- Most complex files with little to no code coverage
We now know where we stand. We can clearly see the state of the application in terms of code coverage and project complexity. We can now answer this question: “Which are the most complex files with the least coverage?”
We can use the Skunk Score to guide us in our refactoring efforts:
- How can I pay off technical debt and invest in the future of my application?
- If I were to write tests to decrease the Skunk Score, which files could I write tests for?
- If I were to refactor some of the most complex files, which files with good code coverage could I refactor?
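The last two questions can be approximated with a couple of filters over the rows of a report. The sketch below copies a few rows by hand from a Skunk report; the `Row` struct and the two coverage thresholds are hypothetical choices for illustration, not something Skunk provides.

```ruby
# Hypothetical row structure; values are copied by hand from a Skunk report.
Row = Struct.new(:file, :skunk_score, :cost, :coverage)

rows = [
  Row.new("lib/skunk/cli/commands/default.rb", 166.44, 0.5548, 0),
  Row.new("lib/skunk/cli/application.rb", 139.2, 0.464, 0),
  Row.new("lib/skunk/rubycritic/analysed_module.rb", 48.12, 0.8592, 72.72727272727273),
  Row.new("lib/skunk/cli/commands/status_reporter.rb", 8.0, 2.6652, 100.0),
  Row.new("test/lib/skunk/rubycritic/analysed_module_test.rb", 2.63, 1.3156, 100.0)
]

# Arbitrary thresholds chosen for this sketch.
LOW_COVERAGE = 50
HIGH_COVERAGE = 90

# Poorly covered files with the worst Skunk Score are candidates for new tests.
test_first = rows.select { |row| row.coverage < LOW_COVERAGE && row.skunk_score > 0 }
                 .sort_by { |row| -row.skunk_score }

# Well-covered files with the highest cost are safer candidates for refactoring.
refactor_first = rows.select { |row| row.coverage >= HIGH_COVERAGE }
                     .sort_by { |row| -row.cost }

puts "Write tests for: #{test_first.map(&:file).join(', ')}"
puts "Refactor: #{refactor_first.map(&:file).join(', ')}"
```

Sorting the refactoring candidates by cost rather than by Skunk Score is a design choice: once a file is well covered, its Skunk Score is low by construction, so the cost is the better signal of complexity.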
Caveats
Skunk expects you to have a .resultset.json file in the coverage directory within the directory that you are evaluating. It uses the data within that file to calculate the code coverage percentage for each module.
That means that you will have to run your test suite with SimpleCov enabled before you call skunk.
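To give an idea of what that file contains, here is a rough sketch that computes a per-file coverage percentage from a resultset document. The JSON layout shown is an assumption based on what SimpleCov typically writes (hit counts per line, with null for irrelevant lines, either as a bare array or nested under a "lines" key depending on the version). The helper names are hypothetical and this is not how Skunk itself reads the file.

```ruby
require "json"

# Percentage of relevant lines that were executed at least once.
# nil entries are lines that are not relevant for coverage.
def coverage_percentage(line_hits)
  relevant = line_hits.compact
  return 0.0 if relevant.empty? # arbitrary choice for this sketch

  covered = relevant.count(&:positive?)
  covered * 100.0 / relevant.size
end

# Returns { file_path => coverage percentage }.
# Results from several test suites are not merged: the last suite wins.
def coverage_by_file(resultset_json)
  JSON.parse(resultset_json).each_with_object({}) do |(_suite, result), acc|
    result.fetch("coverage", {}).each do |path, data|
      # Assumed formats: { "lines" => [...] } or a bare array of hit counts.
      hits = data.is_a?(Hash) ? data.fetch("lines", []) : data
      acc[path] = coverage_percentage(hits)
    end
  end
end

# Made-up sample document for illustration.
sample = <<~JSON
  {
    "Minitest": {
      "coverage": {
        "/app/lib/example.rb": { "lines": [1, 1, null, 0, 1] }
      },
      "timestamp": 1570000000
    }
  }
JSON

coverage_by_file(sample).each { |path, pct| puts "#{path}: #{pct}%" }
```

In a real project the file is produced by running the test suite with SimpleCov enabled, which usually means calling `SimpleCov.start` at the very top of the test helper.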
Total Skunk Score is not a useful metric within a single project, as the total will continue to grow as you add more features to your application. It is certainly a useful metric if you use it to compare two projects.
Known Issues
The calculation of the Skunk Score is not 100% accurate: it combines a module’s code coverage with a module’s complexity. It should be a method-based calculation instead: it should calculate the complexity of a method, the code coverage of the same method, and then the Skunk Score per method.
Finally, the Skunk Score of a module should be the sum of the Skunk Scores of all the methods in the module.
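A method-based calculation could look roughly like the sketch below. Everything here is hypothetical: Skunk does not work this way today, the `MethodStats` struct and its values are invented, and churn is left out because it is tracked per file rather than per method.

```ruby
PERFECT_COVERAGE = 100

# Hypothetical per-method data: a cost and a coverage percentage.
MethodStats = Struct.new(:name, :cost, :coverage)

# Applies the same coverage penalty as the module-level formula,
# but to a single method.
def method_skunk_score(stats)
  return stats.cost if stats.coverage == PERFECT_COVERAGE

  stats.cost * (PERFECT_COVERAGE - stats.coverage.to_i)
end

# The module's score is the sum of the scores of its methods.
def module_skunk_score(methods)
  methods.sum(0.0) { |stats| method_skunk_score(stats) }.round(2)
end

# Invented numbers for illustration.
methods = [
  MethodStats.new(:simple_and_tested, 1.5, 100),
  MethodStats.new(:complex_and_untested, 4.0, 0),
  MethodStats.new(:partially_tested, 2.0, 50)
]

puts module_skunk_score(methods) # 1.5 + (4.0 * 100) + (2.0 * 50)
```

The benefit of this approach is that a complex method with no tests cannot hide behind well-tested neighbours in the same file.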
Roadmap
Assessing code quality for an application shouldn’t stop at the application level. The overall Skunk Score of a project is composed of two Skunk Scores:
- Skunk Score of your application
- Skunk Score of your dependencies
Right now Skunk will only calculate Skunk Score for your application code. In the future it should consider your dependencies as well, generating a Skunk Score for each individual dependency.
The best way to assess progress in your project is to keep track of the Skunk Score average over time. Is that number going up? Is it going down? How much does your pull request change the Skunk Score average? Right now Skunk does not support this, so you will have to do it manually.
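Until that is supported, one manual approach is to copy the per-file scores from two runs (for example, the main branch and a pull request branch) and compare the averages yourself. The sketch below assumes you have already collected those scores by hand; the method names and the pull request numbers are made up for illustration.

```ruby
# Average Skunk Score for a list of per-file scores.
def skunk_score_average(scores)
  return 0.0 if scores.empty?

  (scores.sum / scores.size.to_f).round(2)
end

# Negative result: the average went down, which is the direction we want.
def average_delta(before_scores, after_scores)
  (skunk_score_average(after_scores) - skunk_score_average(before_scores)).round(2)
end

# Scores copied by hand from two runs of skunk.
main_branch  = [166.44, 139.2, 97.6]
pull_request = [83.22, 139.2, 97.6] # hypothetical: tests were added to one file

puts "Before: #{skunk_score_average(main_branch)}"
puts "After:  #{skunk_score_average(pull_request)}"
puts "Delta:  #{average_delta(main_branch, pull_request)}"
```

Recording the average after every merge gives you the trend line the paragraph above describes.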
Final Thoughts
I know that “stink” is a negative word to judge an application’s technical debt and it might lead you down a negative path. By no means do I want the Skunk Score to be used in a witch hunt, to point fingers at code authors, or in a negative way in your team.
I seriously hope that you can use the Skunk Score as a compass to move your team in the right direction and gradually pay off technical debt:
- Writing tests which increase code coverage will lower (improve) the Skunk Score
- Refactoring complex files will lower (improve) the Skunk Score
Skunk will show you your location in the map of technical debt. It will also show you a few paths to take to get to a better place. You will be able to prioritize the paths and pick one to pay off technical debt.
What do you think about this new metric for technical debt? Would you use it next time you need to evaluate legacy code?
Please let me know in the comments below. If you want to watch my talk at RubyConf 2019, here it is: