I once thought that code review was all about catching bugs, but I was wrong.
In this article I’ll discuss the additional value that teams get from code review beyond defect detection, and how Pull Requests (PRs) can get in the way of maximizing that value. Understanding the drawbacks of PR-based workflows, we’ll then discover a few simple practices which prevent pull requests from dragging down a team’s efficiency.
Almost every team I’ve worked with in the last few years has been using some variant of Github Flow - a code development branching workflow where developers do their work on relatively short-lived feature branches, and use a pull request to request a code review of the changes on that branch from other members of their team, before the changes are merged into a shared integration branch (usually
This methodology works pretty well for many teams - it’s a straight-forward branching strategy which provides frequent integration into a shared branch, with some control and rigor added via the pre-merge code review. However, Github Flow is not without its drawbacks, particularly when followed dogmatically.
Practices for better PR-based code review
We’ll dig deeper into some of the benefits and drawbacks of Github Flow - specifically around pull request-based code review - later in this article, with the goal of identifying practices that allow us to work most effectively within this model. Here’s a sneak peek at some of those ideas:
Don’t couple dev tasks 1:1 with pull requests. It’s OK to split development of a feature across multiple pull requests. It’s also OK to create pull requests that aren’t associated with a specific task.
Use a naming convention to explicitly categorize your code review feedback. The PR submitter should know which code review comments are calling out must-fix issues, and which are simply stylistic feedback, without ambiguity.
Use tooling and conventions to reduce tedious arguments. Back-and-forth discussion over whether there should be a space before or after a brace is an extremely low-value activity. Pick a convention and enforce it with tooling that resolves these stylistic issues before changes ever get to code review.
Don’t require code review. Shock! Horror! Perhaps a two-line copy change or a small update to some simple, well-understood logic can be reviewed after code has been merged?
Discuss the overall technical approach before coding, not after. No-one enjoys a code review where the conclusion is that the entire branch needs to be re-done because the approach was wrong.
Augment async code review with inline feedback. Practices like mob coding and pair programming provide an insanely tight feedback loop compared to pull requests - seconds vs. days.
With that preview aside, let’s start by looking at why the Pull Request model has become so popular.
The Point in Pull Requests
To understand the motivation and benefits for these practices, let’s first look in more detail at why pull requests are valuable in the first place.
Pull requests are used within Github Flow as the mechanism to orchestrate code review. But why do we need code review? What value does it provide?
Code review catches bugs
Additional brains looking at a set of code changes can spot errors or defects that the original author missed. At one time I considered this to be basically the point of code review - catching bugs - but over time I have come to realize that code review actually provides other benefits, which are as valuable as defect detection, if not more so.
Code review enforces norms
To illustrate these other benefits, let’s take a look at some code review comments:
“We should be using all caps for constant names”
“This feels like too much logic to be living in a controller class”
“I’m not sure we should be doing this work on the client”
The feedback in those comments varies widely in scope, from small coding style details - should a newline go before or after a method call - through to broad architectural decisions - should some logic live on the client or the server. However while the scope of these comments varies widely, all of them follow a common theme: calling out deviation from the team’s agreed norms. These issues don’t constitute bugs or defects per se, but they do represent important feedback that should be addressed. It turns out that keeping a codebase (and a team) aligned around coding and design conventions is a key benefit of code review.
Code review evolves norms
Here are some more code review examples, demonstrating another class of feedback:
“Oh, neat, this lookup table is much cleaner than the switch statement we had before.”
“Hmm, I wasn’t sure about doing the validation here, but now that I think about it this makes a lot more sense”
“We really need to decide whether we want constants like these to live in a central file or at the top of the file that references them”
These comments invite discussion. Rather than enforcing existing coding conventions, they are opening up the opportunity to evolve or adjust those conventions. This is the third benefit I see from code reviews - PR commentary can act as a sort of “town square”, a shared space for a team to discuss new approaches and new norms.
Drawbacks of Pull Requests
We’ve seen that pull requests can act as a conduit to bring in the value of code review, but they also bring some drawbacks along for the ride.
A PR suffers from two types of delay. After a PR is created it can sit waiting for engineers to simply find time to start reviewing the code. Then once review starts there can be a period of back-and-forth as review comments are issued, changes made, and code re-reviewed. If a team is distributed across timezones, or has a particularly stringent review culture, then this back-and-forth can drag on for hours, days or even weeks!
This delay is extraordinarily expensive. Let’s talk about why. Firstly, the changes in a PR grow stale as it waits for approval. The chance of expensive merge conflicts increases, as other changes continue to land on the integration branch in the meantime. More fundamentally, this code review delay represents a delay in completed work getting into production and actually delivering value to users. In the Lean way of thinking, this delay represents waste.
On a high-performing team this delay might add up to a few hours, but as I just mentioned I’ve seen teams where a PR can spend days or even weeks in review before it is finally accepted and merged. This is an extremely expensive overhead.
Focus on the wrong things
There’s a strange phenomenon that can occur with the type of offline code review that PRs encourage. A reviewer will spend a lot of time on feedback for relatively minor issues - whitespace, naming conventions, and so on - while at the same time failing to provide any feedback at all on much more serious design or architectural problems in the PR.
Why is this? I suspect that the line-oriented commenting mechanism that pull requests provide is partly to blame, in that it pushes reviewers towards low-level, line-oriented feedback rather than higher-level design feedback which doesn’t necessarily map to a specific set of lines in a PR’s changeset.
Beyond the review feedback mechanics, there’s also a lot of human psychology at play. A reviewer can be uncomfortable suggesting that the design approach that’s been used is wrong and the whole PR should be reworked. It’s easy to succumb to the Sunk Cost fallacy and feel reluctant to suggest “throwing away” the existing implementation, particularly if it’s functional as is. It’s also simply a tough thing to tell another human being that you work with that they got it wrong and need to go back to the drawing board. And so sometimes we punt, and don’t make give the critical feedback that we should.
The asynchronous, textual nature of PR-based code review also makes it hard for a submitter to gauge the relative weight of each comment. Does a comment saying “naming seems confusing” mean that the naming must be improved before the PR can be merged, or that we should fix it in a future PR, or the next time we’re in this area of the code, or just that the reviewer just commenting in passing that the naming didn’t seem particularly clear?
The nature of PR-based code review also makes certain issues hard to spot during review. Creeping code duplication is a good example. When performing a code review primarily through the lens of what’s changed, you might not spot that a new class is in fact 70% the same as four other classes that were copy-pasted from, or that a new unit test is extremely similar to some existing tests. This is because that existing code isn’t readily visible when reviewing a diff that focuses on what’s changed.
PRs are coupled to branches
With Github Flow we define the set of changes that we want to be code reviewed using branches. That’s how the pull request mechanism works - you mark a branch as ready for review. There is always a direct correspondence between the set of changes on a branch and the changes that will be reviewed, as a unit. This coupling means that if a feature branch contains a large set of changes then all the changes must be tackled in one large code review. Splitting up the changes into several smaller reviews is possible, but only if the author is able to tease the changes apart into a sequence of independent feature branches, which can be integrated one by one into the shared branch. As it happens, there are quite a few other benefits to this more trunk-based approach, but it takes thought and care, and the tools and practices required to do this are not readily available to every development team.
This means that for many teams a complex feature means a large feature branch, which means a large code review changeset. Research indicates that code reviews of large changesets are less effective, but unfortunately the branch-oriented model that pull requests are based on makes it hard to divide the review of a large feature into smaller chunks.
Techniques for better PR-based code review
We’ve seen that code review serves a valuable purpose, and that Github Flow uses pull requests as the mechanism for code review. But we’ve also seen that there are drawbacks to PR-based code review. So we wonder how we can maximize the benefits of PRs while minimizing the downsides, and we’re brought back to the practices for effective pull requests which I outlined at the start of this article.
Use fast-follow PRs
One drawback of PRs is the delay they introduce in getting changes merged. We can reduce that delay by not requiring every issue raised during code review to be addressed in the same PR. Instead, we can opt to fix just the critical, merge-blocking issues within the same PR, and fix other, non-critical issues in a fast-follow PR. This allows a PR submitter to merge their changes with less delay, while still addressing all issues raise in code review.
This approach requires a certain amount of discipline and trust within a team. PR submitters must have the discipline to deliver on their promise to address issues in a fast-follow PR, and reviewers must trust that they’ll do so. However, I’d say that this is the sort of discipline and trust that’s reasonable to expect from professional developers working in a team with a healthy dynamic. If engineers don’t trust other engineers to act like adults then a team has more pressing issues to focus on than their pull request conventions.
Use a notation to categorize code review comments
We saw earlier that it can be hard to understand whether a code review comment is identifying a must-fix defect, or simply a deviation from agreed conventions. If we plan to only fix critical issues within the originating PR then we need a clear way to distinguish between critical and non-critical defects.
To cut through this ambiguity, I encourage teams to adopt a simple notation in their code review comments:
“BLOCKER: this API is not available on mobile browsers and will cause the phone to implode”
“FAST-FOLLOW: We really shouldn’t have this business logic in a controller action - please pull it out into a domain class”
“NIT: I would probably have pulled this out into a little helper function”
The notation is hopefully fairly self-explanatory. A BLOCKER must be addressed before this PR can be merged. A FAST-FOLLOW must be addressed, but optionally via a follow-up PR. A NIT is feedback that the author can take or leave, or an opportunity to open up a discussion on team norms.
Use tools to enforce coding conventions
Engineers discussing the placement of punctuation within a code review is a very expensive way to enforce team norms. Tooling running within some sort of CI/CD infrastructure is a much more effective and consistent approach. Leverage static analysis tools such as linters, code formatters, etc. to automatically enforce team standards, and save the code review comments for meaningful discussion on more nuanced topics where machines can’t (yet!) detect deviation from the team’s agreed norms.
Break up dev tasks into multiple pull requests
Code review simply doesn’t deliver as much value when review changesets are too large. Breaking a task up into multiple smaller PRs results in higher quality review feedback. Additionally, by breaking a development task up into multiple PRs we reduce the cost of delay, as the required changes can be reviewed and merged incrementally rather than having to wait for everything to be fully completed before anything can start being reviewed.
There are two ways we can break up our changes across multiple PRs. First, we can use trunk-based development techniques like feature-flagging, which allow us to incrementally deliver code changes to a shared integration branch before a full feature is complete.
Secondly, we can address non-critical issues that are raised in a code review using fast-follow PRs, rather than requiring all changes to be made against the original PR.
Encourage “campsite” PRs
The scouts have a rule about leaving the campsite cleaner than you found it. This “campsite rule”, applied to software, is perhaps the single best trick to achieving an evergreen codebase; a codebase that does not degrade into a pile of unworkable spaghetti as it ages. Unfortunately when followed dogmatically, Github Flow tends to discourage the campsite rule, since developers are incentivized to keep the changes in their branch focused on the feature at hand, in order to reduce the size of the resulting code review, as well as to avoid the potential distraction of unrelated changes done as part of a “campsite cleanup”. One compromise is to build a team culture of tackling this type of opportunistic refactoring in small, separate PRs - “campsite” PRs. This takes a little discipline, and a willingness to learn the necessary git-fu required to safely put your feature branch to one side and create a new branch when you come across a cleanup opportunity, but it’s a team habit that will pay wonderful dividends over the long term life of your codebase.
Review the technical approach outside of PRs, early and often
We’ve seen that code review isn’t a great mechanism for ensuring that a change is suitable from a broader perspective than the technical implementation. The format doesn’t lend itself to this type of discussion, and fundamentally the feedback comes too late - after the implementation of those decisions are complete.
There are better forums for these broader technical discussions. Teams can institute “story kickoffs”, where the developer of a feature huddles with the tech lead and a product person to discuss the details just before development begins.
In addition, setting up various opportunities for lightweight touchpoints as feature development gets underway allows the technical direction to be refined in the light of new information - most good plans benefit from some adjustment once we encounter the enemy. I personally like to establish “tech huddles”1, a very brief tech-oriented chat straight after daily standup where team members can raise any interesting technical topics that have come up over the previous workday.
Even when a feature is complete and ready for review, teams shouldn’t rely on pull requests as the only way of doing code review. Tapping someone on the shoulder (or firing up a quick screenshare) and doing an in-person review provides for a much higher-bandwidth discussion than a text box in a web browser. You can take this to the next level and perform “mob reviews”, gathering all or part of the team in front of a monitor, projector, or screenshare to perform code review as a group. While this has value for critical changes that benefit from extra brains, it’s also a good way to maximize opportunities for that third benefit of code review - evolving team norms. Getting the team together to talk through the pros and cons of a new approach can be much more productive - and much more inclusive - than discussing ideas via pull request comments.
Diminishing the importance of PRs
I started off this article discussing the benefits of pull requests. However, what we’ve seen is that PRs don’t provide much intrinsic value at all - they mostly serve as a useful mechanism for code review. It’s code review that provides the true value. Moreover, we’ve seen that while PRs are one way to get the value of code review there are other mechanisms too, which teams who follow Github Flow dogmatically could be missing out on.
Practices like mob coding, pair programming and good old-fashioned team discussion can provide a much higher-bandwidth alternative to pull request-based code review.
Additionally, while we use PRs as the organizing structure for code review, we don’t need to force that structure to align 1:1 with our development tasks. We can opt to split development work across multiple PRs, reducing the expensive delay that asynchronous pre-merge code review can cause.
Thanks for reading. I hope you’re excited to try out some of these ideas on your team soon!
- I am a big fan of Tech Huddles, partly because I like to call them Tech Cuddles in a rushed voice and see how long it take for folks to catch on. A few teams have “embraced” this alternative nomenclature. [return]