Cyclomatic Complexity (McCabe)¶
Cyclomatic complexity is the number of linearly independent paths through a function. It was developed by Thomas J. McCabe, Sr. in 1976. Read more on Wikipedia.
What it Measures¶
The number of linearly independent paths through a function: higher = more branching = more paths to test and reason about. It comes from the graph-theoretic formula \(M = E - N + 2P\), where \(E\) counts edges, \(N\) counts nodes, and \(P\) counts connected components of the control-flow graph. In practice, approximate it by counting decision points.
How it's Measured¶
Count 1 for the function itself, then add 1 for each decision point:
- Conditionals: if, else if
- match cases (each non-default case)
- Loops: while, do while, for (including guards)
- catch clauses
- Short-circuit booleans: &&, ||
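As a rough sketch of this counting (the function and its names below are hypothetical, not from a specific codebase):
// Base 1 for the function, then +1 per decision point.
def shouldRetry(attempt: Int, status: Int): Boolean =
  if (attempt >= 3) false                        // +1 (if)
  else if (status == 429 || status == 503) true  // +1 (else if), +1 (||)
  else false
// Total: base 1 + 3 decision points = complexity 4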
Why it Helps¶
- Identifies refactoring hotspots and code hardest to test/maintain.
- Enables quality gates (e.g., warn when \(M > 10\)).
Quick refactoring recipes¶
Refactoring for lower perceived complexity is mostly about making decisions obvious. The goal isn't to chase a smaller number at any cost; it's to shape the code so the decisions are easy to see, name, and test. Here are some small but effective moves to consider:
1. Small decision tree Complexity = 3¶
Counting: base 1 + if (1) + else if (1) ⇒ 3.
Why it’s fine: Three independent cases, three straightforward tests. Complexity mirrors the domain split.
Testing heuristic
At minimum, cover x > 0, x < 0, and x == 0.
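A minimal sketch of such a decision tree (the sign example is illustrative):
def sign(x: Int): String =
  if (x > 0) "positive"       // +1
  else if (x < 0) "negative"  // +1
  else "zero"
// base 1 + 2 decision points = 3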
2. Boolean soup vs. named predicates Complexity = 4¶
Dense boolean expressions often pack several decisions into a single line. Extracting named predicates doesn't reduce the count of decisions, but it reduces the readability cost. Names like isTransient and safeToRetry carry domain meaning and make it easier to see what scenarios are being permitted. This, in turn, improves test ergonomics: you can design cases that intentionally flip only one predicate at a time. The trade-off is a couple of extra lines. The gain is lower cognitive load and fewer mistakes when the rule inevitably evolves.
Same complexity, lower cognitive load
Naming intermediate predicates makes intent obvious and tests easier to write.
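A sketch of the before/after; the Failure shape and its fields are assumptions made for illustration:
final case class Failure(code: Int, attempts: Int, idempotent: Boolean)

// Boolean soup: base 1 + || + && + && = 4, all packed into one expression
def shouldRetryDense(f: Failure): Boolean =
  (f.code == 429 || f.code == 503) && f.idempotent && f.attempts < 3

// Same count (4), but each decision now carries a domain name
def shouldRetryNamed(f: Failure): Boolean = {
  val isTransient = f.code == 429 || f.code == 503   // the || still counts
  val safeToRetry = f.idempotent && f.attempts < 3   // the && still counts
  isTransient && safeToRetry                         // the && still counts
}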
3. Pattern matching: total & explicit Complexity = 5¶
Pattern matching puts branches next to the domain types, so behavior is easy to audit. Even if the complexity stays the same, extracting small helpers prevents any single case from ballooning into its own mini-program. This structure stays resilient as tokens grow: adding a new token becomes a compiler-guided change, not a scavenger hunt. If your analyzer flags rising complexity here, it's usually a nudge to either split handlers or promote behavior to the types themselves.
sealed trait Token
case class Number(n: Int) extends Token
case object Plus extends Token
case object Minus extends Token
case class Ident(name: String) extends Token
def describe(t: Token): String = t match {
  case Number(n) => s"number:$n"
  case Plus => "plus"
  case Minus => "minus"
  case Ident(name) if name.nonEmpty => s"id:$name"
  case _ => "other"
}
Counting: base 1 + four non-default cases ⇒ 5. Pattern guard doesn’t add beyond the case for this rule set.
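Extracting small helpers keeps the count at 5 while preventing any single case from growing into its own mini-program: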
private def handleNumber(n: Int): String = s"number:$n"
private def handleIdent(name: String): String =
  if (name.nonEmpty) s"id:$name" else "other"

def describe(t: Token): String = t match {
  case Number(n) => handleNumber(n)
  case Plus => "plus"
  case Minus => "minus"
  case Ident(name) => handleIdent(name)
  case _ => "other"
}
4. Loops & for-guards Complexity = 3¶
Guards inside for-comprehensions are real decision points, and mutation can distract from the core intent. Shifting to collection combinators doesn't necessarily lower complexity, but it does isolate it: the predicate is where the decision lives; sum is just aggregation. This makes future tweaks, like changing the predicate or accumulating additional stats, cleaner and safer. As a side effect, removing mutation cuts down on incidental state bugs and simplifies property-based testing.
Tip
Combinators don’t magically lower complexity, but they isolate it: the predicate owns the decision; the rest is plumbing.
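A minimal sketch of the move, using an invented "sum the even numbers" task:
// Guarded loop with mutation: base 1 + for + guard = 3
def sumEvens(xs: List[Int]): Int = {
  var total = 0
  for (x <- xs if x % 2 == 0) total += x
  total
}

// Combinator form: the predicate owns the decision; sum is just aggregation
def sumEvensCombinators(xs: List[Int]): Int =
  xs.filter(_ % 2 == 0).sum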
5. Exceptions vs. explicit control flow 3 → 4¶
Exceptions can keep the complexity number low while hiding control flow: the exception-based version looks smaller by the metric, but it couples normal behavior to failure mechanisms. The explicit match is one notch more complex, yet it is easier to refactor (e.g., to return richer errors), easier to test (no exception stubbing), and friendlier to readers who want all outcomes laid out. If performance matters, both versions behave similarly for valid input, but the explicit version avoids the cost of constructing stack traces on failure-heavy paths.
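A sketch of both shapes, using an invented port-parsing example (toIntOption assumes Scala 2.13+):
// Exception-based: base 1 + if + catch = 3; the failure path is invisible in the signature
def parsePort(s: String): Int =
  try {
    val n = s.trim.toInt
    if (n >= 0) n
    else throw new IllegalArgumentException(s"negative port: $n")
  } catch {
    case _: NumberFormatException =>
      throw new IllegalArgumentException(s"not a number: $s")
  }

// Explicit: base 1 + three cases = 4; every outcome is visible and easy to test
def parsePortEither(s: String): Either[String, Int] =
  s.trim.toIntOption match {
    case Some(n) if n >= 0 => Right(n)
    case Some(n)           => Left(s"negative port: $n")
    case None              => Left(s"not a number: $s")
  }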
6. Else-if tower → lookup table 5 → 1¶
Code with many symmetric branches is brittle: every new case expands the ladder and invites mistakes in operator precedence or copy/paste. Turning the branches into data collapses the complexity to 1 and moves the burden to a clearly reviewable structure. Extending behavior is a safe, localized edit, and you can even load the table from configuration or a resource file if it’s expected to change at runtime. The trade-off is that rule order (when relevant) must be encoded explicitly, but for pure lookups, maps are ideal.
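A sketch with an invented status-text example; the map contents are illustrative:
// Else-if tower: base 1 + if + three else-ifs = 5
def statusText(code: Int): String =
  if (code == 200) "ok"
  else if (code == 404) "not found"
  else if (code == 500) "server error"
  else if (code == 503) "unavailable"
  else "unknown"

// Lookup table: base 1, no branches = 1; adding a status is a data edit
val statusTexts: Map[Int, String] =
  Map(200 -> "ok", 404 -> "not found", 500 -> "server error", 503 -> "unavailable")

def statusTextLookup(code: Int): String =
  statusTexts.getOrElse(code, "unknown")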
7. Distribute decisions with ADTs/polymorphism 3 → per‑method 1¶
Centralizing branching at call sites makes those sites complex and fragile (every new case forces edits far from the type's definition). By moving behavior into the types, each implementation stays trivial (complexity 1), and the compiler forces you to implement the operation for every new subtype. The call sites also become self-documenting (s.area), which reduces the surface area where mistakes can creep in. If performance or allocation is sensitive, note that this approach adds no meaningful overhead in Scala; it mainly influences structure and readability.
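A sketch using shapes (the area operation echoes the s.area call site mentioned above; Circle and Rect are illustrative):
// A centralized match over shapes would score 3 (base 1 + two cases) at every call site.
// Moving the behavior onto the types leaves each implementation at complexity 1.
sealed trait Shape { def area: Double }
final case class Circle(r: Double) extends Shape {
  def area: Double = math.Pi * r * r   // complexity 1
}
final case class Rect(w: Double, h: Double) extends Shape {
  def area: Double = w * h             // complexity 1
}

// Call sites stop branching; a new subtype is a compiler-guided, local addition.
def describeShape(s: Shape): String = s"area=${s.area}"   // complexity 1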
8. Encapsulate necessary complexity Internal: 15–20 (encapsulated)¶
Some domains are naturally branchy (parsers, small interpreters). The goal isn’t to obliterate complexity, but to contain it. Keep the gnarly logic private, blast it with focused tests, and present a small, intention-revealing public API that callers can use without branching. Over time, this boundary lets you refactor the internals (or even swap algorithms) while preserving simple, stable contracts for users.
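A hedged sketch of the shape of this pattern, using a toy two-token calculator (a real parser's private core would be far branchier, in the 15–20 range the heading suggests):
object MiniCalc {
  // The branch-heavy core stays private: test it directly, rewrite it freely.
  private def evalTokens(tokens: List[String]): Either[String, Int] =
    tokens match {
      case a :: "+" :: b :: Nil => for { x <- num(a); y <- num(b) } yield x + y
      case a :: "-" :: b :: Nil => for { x <- num(a); y <- num(b) } yield x - y
      case n :: Nil             => num(n)
      case _                    => Left(s"cannot evaluate: ${tokens.mkString(" ")}")
    }

  private def num(s: String): Either[String, Int] =
    s.toIntOption.toRight(s"not a number: $s")

  // Small, intention-revealing public surface: callers never branch on internals.
  def eval(expression: String): Either[String, Int] =
    evalTokens(expression.trim.split("\\s+").toList)
}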