Code reviews can feel daunting for many reasons (time pressure, skill disparities, amount to be reviewed, etc.), but they play a pivotal role in software projects. They ensure not only the quality and reliability of code, but also serve as a way to teach and learn from others on your team.
In my previous post, I went over how to enjoy code reviews instead of dreading them. Learning to enjoy reviewing code is the very first step to improving your code-reviewing skills. But what exactly should your code review process look like?
This post will provide an overview of how I approach code reviews. My approach is more thorough than most, but as I’ll argue and, hopefully, convince you – this level of detail is where the entire value of a code review lies.
Understanding Your Goals
As with any task, if you don’t have a clear picture of your end goal(s), you’re unlikely to happen across them by chance. Your team may have some unique goals for code reviews specific to your project, but the following are generally applicable across projects:
- Make sure any new feature works as intended
- Check whether any existing features were broken by the changes
- Determine whether the code is easily readable and understandable by others
- Guarantee the code is designed in a future-resistant way
I discussed these in more detail in the second section of my original post, so I won’t restate that here. I do, however, want to address the primary purpose of a code review is to catch errors in the above bullets at a level deeper than your QA team examines.
For some items like readability and future resistant code, it’s obvious this is a dev team only activity. When it comes to new features and regression testing, your job as a reviewer is to catch errors in code that will result in intermittent or non-deterministic bugs that QA could easily miss.
If your strategy for achieving the first two goals during code reviews is at the user level, just as QA will do, you’re only accomplishing the smallest – and most redundant – portion of the code review.
Identifying the Purpose of the Change
First, you need to know the exact purpose behind these code changes; otherwise, how can you possibly know whether the submitted code meets expectations or not?
It is not acceptable to have questions about the requirements of a work item before you start reviewing. Ideally, you should have a work tracking system in place, and your chosen review software will have a way to link to the relevant item of work. If it is not already expected practice for your team to link to the user story/work item/task that prompted the code change, you should put this in place.
In our projects, we typically use Azure DevOps, which has nice integration between the built-in work items/project backlog and Pull Requests (PRs) where the creator of a PR can link the work item they were addressing. My first step is always to open this work item and read it, making sure I understand what the purpose of the work is. I’ll keep a window open during my review so I can refer to it and confirm that it meets all its requirements.
Build and Run It
Imagine spending 30-60 minutes thoroughly reviewing some code that implements a new feature only to encounter a line or two of code that makes the new feature fail 100% of the time or perhaps that won’t even compile. Worse, what if the only way of working around this error is to totally change the design such that all the time you spent reviewing was wasted?
I’ve had this happen to me before when I failed to build the new code and run through the “happy path” of the new feature. Starting with these two simple steps doesn’t take much time and can save you from a potential headache, so don’t skip it. Most modern source control platforms can automate this check for you by building and testing when a Pull Request is created.
When I encounter a PR that fails this initial check, I will ask the author if they possibly forgot to push some changes, and unless that was the case, I will immediately mark it as “Rejected.”
There really is no excuse for submitting code that won’t compile or do the very thing the changes were supposed to do. Failing to be stern here will result in those less inclined towards thoroughness using your reviews as a crutch instead of checking their own code. Maintaining this discipline will result in an increase in the overall quality of your development team, so make sure you stick to your principles.
Divide and Conquer
For small PRs, this may not be necessary, but for larger changes you should break down the PR into a set of smaller pieces in a way that each piece is relatively well abstracted from the others. Without breaking the PR down, you risk overlooking changes that have been made – again, defeating the purpose of the code review.
To give an example, suppose you’re reviewing a backend change that adds two endpoints and makes alterations to a shared business logic class. In this case, I would mentally separate the code changes into these three chunks: Endpoint #1 and its call chain, Endpoint #2 and its call chain, and the business logic class.
If the code is reasonably well written, this should be easy. If you’re struggling to do this, it’s a warning sign that it may be poorly designed.
While this may sound like a bit of a logical leap, code is written for developers, not computers. It’s entirely possible that you, or someone else other than the author, will need to work on this same set of code in the future, and failing to understand how the code you’re working with functions is how bugs are introduced.
This is also the step where you assess whether the design of the code is appropriate for both the current functionality as well as easily extensible for foreseeable future functions. Are these abstracted flows you identified reasonable? Are there clearly defined purposes for each, or is there functionality overlap? If you find issues with the design, mention them now, and don’t proceed to the next step until they’re resolved.
Every Line, Every Time
Now that you have the problem subdivided, pick one of your chunks and move through the relevant source files, reviewing every line of code that was changed along with its context. You will spend most of your time on this step during the review.
The exact problems you’re looking for are language, framework, and project specific, but in general, I ask myself the following questions for every modified line:
- Do I understand why the author wrote/modified this line? Are there any pieces of it I can’t explain?
- If yes, figure it out or consider asking the author to explain it. If it’s something that wouldn’t be immediately obvious to other programmers (we all have our moments) then ask the author to write it in a way that’s more understandable or to add a comment explaining.
- Is this addition/modification necessary?
- Unnecessary code is bad code. This does not mean turn the program into a code golf exercise; code that makes a process more understandable to another developer is necessary.
- What explicit or implicit expectations does this line of code have? If you were trying to break this intentionally, how could you change state surrounding this line to cause it to fail in some way? Some common ways to break code:
- Make a variable used by the line null/empty
- Break a constraint of an external system like a duplicate key in a relational database
- Invalidating “captured” state (e.g. modifying a collection between where an index into it was captured and where it is used)
- Performing operations in parallel where one modifies an input used by the other
- Not covering all possible states of an input. For example, all the below are valid strings, but does a function expecting a path operate properly with all of them:
- What if a string path “/home/Documents/” uses ‘\’ instead?
- What if it doesn’t end in a ‘/’?
- What if it’s a relative path instead of absolute?
- What if it contains “../”?
- What if the path doesn’t exist?
- What if it points to a file instead of a directory? Vice-versa?
- Are there any potential security vulnerabilities in the code? In particular, handling of user input i.e. any content not directly generated by your code, needs special attention. For example, the file path checks listed in the above bullet are one of the common ways a vulnerability can sneak into the system.
- Does this line conform to established team coding standards? Be careful to avoid injecting personal preference here but be strict about code formatting/naming/etc. that the team as a whole has agreed on.
In addition to the above, there are some general non-technical rules of thumb that I use throughout the code review process. Keep these in mind at all stages of the process below.
Don’t Expect to Bench 400 Pounds the First Time in the Gym
Just as it would be ridiculous to expect to bench press 400 pounds the first time you set foot in a gym, don’t expect to be blazing through large code reviews in 5 minutes when you begin being more thorough in your approach.
Whenever you’re reviewing code written in a new language/framework or on a new project, you are guaranteed to spend longer on code reviews; gaining instinctive knowledge takes time. This is not a bug – it’s how you build your internal understanding of the language/framework/project.
This is equivalent to sore arms after the gym. If you try and avoid it, you’re never going to get better. Over time, however, you’ll notice that there are fewer things you need to look up, you’ll start noticing problems at a glance, and while your reviews probably won’t ever be as short as the cursory (non)-reviews you may have done previously, they’ll get much closer to that time – except now they’ll be providing far more value.
Treat Others as You Treat Yourself
For most, it’s tempting to give others breaks on things that you wouldn’t give yourself. This is fine for personal taste in code formatting if your team decided not to require specific styles, but for anything more substantial don’t let it pass unremarked if you wouldn’t write it yourself.
If they disagree, be open to changing your mind or letting it pass for more minor issues, but always write the comment. If you’re right, you saved the team some future frustration. If you’re wrong, you’ve learned something new.
I have always felt this need to figure out why someone wrote code in a certain way that I didn’t understand. A few years back I heard it succinctly phrased as “Attack Uncertainty.”
As an engineer, you should be uncomfortable with not understanding the bounds of what you don’t know when working within a technical system. This not only includes when you are writing code yourself, but also when reviewing code written by another person. Do not allow yourself to skip over code that you do not fully understand.
To demonstrate how subtle this uncertainty can be, consider a review in which something like the following was submitted:
Imagine you were reviewing the line in which an HTTP call line was added. Assume you’ve already tested the code that executes this, and it works correctly in all cases you can come up with. What would you do when you saw this line?
Perhaps you would just continue to look at other changes, after all it’s pretty clear that this is making a simple GET request and it appears to be working. But what about that ‘ConfigureAwait(false)’ call? Can you explain what it does and why it was called?
If you’re like me the first time I saw it and you don’t know what the purpose of it is, it might be tempting to skip past it as it’s obviously some library configuration thing and after all the code appears to work.
As it turns out, after I did some research to understand what was going on here, I realized this call effectively did nothing in our specific case and was unnecessary. When I asked the author why they added it, they told me they weren’t sure what it did and that they had found it in online examples, so they left it in.
Even though this particular example ended up being innocuous, imagine if it had done something we didn’t want under certain conditions. This is what I mean when I say to “attack uncertainty.”
Do not allow yourself to skip over even the smallest changes that you don’t understand. It’s easy to assume that the author had a reason for things, but sometimes they don’t, or their reason isn’t a good one.
It’s often the case during a code review that people will only give a cursory glance at the code and not bother trying to understand what the author was trying to do or how what they built actually works. This turns what should be a useful exercise into a worthless one. I have, on occasion, been the second reviewer on some code, seen an approval by another developer already on it, and still found a glaring issue in less than 30 seconds.
Approving a code review where you did not actually think in detail about the problem sends a signal to everyone on your team that the code is better than it actually is. Other developers who may have been more thorough, seeing your approval, will skip reviewing it themselves. The author will assume anything in it they were a bit unsure of (even though they shouldn’t be at this point) is good because of your approval.
Cursory reviews are therefore socially acceptable, indirect lies, and like all lies they make it harder to identify the source of a problem. They mislead the team into assuming that the amount of bugs or poor designs that make it into production are inevitable, or that perhaps the problem is somewhere other than in the code reviews.
If you’re pressed for time or for some other reason incapable of doing a thorough review of code, do yourself and your team a favor and leave the approval to someone else. You should leave any feedback you do have, but skip giving the review an official approval. If no one on your team is capable of conducting a thorough code review, then the most honest path is to skip it and push changes directly into the main branch without an explicit code review approval – relying on QA to catch everything they can.
To be clear, this is not a good thing. Far better to reduce your team’s workload until they do have the capacity to start being thorough in their reviews, but it’s at least honest. If your system is overly buggy or requires a lot of rework, the cause will be more obvious once you see that the code was introduced by a PR that didn’t have an approval.
Don’t Leave Code Worse Than You Found It
It’s unlikely that your codebase is perfect or will ever be. Designs that initially seem fine, even great, can end up being suboptimal because you found a better way of doing things, requirements changed in a way you didn’t foresee, or maybe your team simply had a bad week at one point and accepted a design they shouldn’t have.
My personal rule when coding is that I never leave code worse than I found it by building more code on top of bad designs. When reviewing code, I won’t let others do it either. This does not mean I expect developers to refactor every problem as soon as they identify them, but rather that they shouldn’t build new code on top of that bad design in a way that will require more work from the person who eventually refactors it.
If a problem is small enough you may refactor it as a part of other work that touches it, but I will commonly find ways of isolating my new code from the bad so that when it is refactored the new code won’t need to be altered much or at all.
Clearly Communicate to the Author
Once you’ve gone through all the code changes, make sure the author of the code you are reviewing knows you’re ready for them to address your comments and understands the level of importance of each suggestion.
In our projects, we tend to use Azure DevOps, which provides 4 different states you can tag the review with. When I add one to a review, this is what they mean:
- Approved – The code is good to merge immediately. There aren’t any remaining issues visible that are worth mentioning.
- Approved with Suggestions – There aren’t any major remaining issues visible, and the code may be merged, but there is at least one minor comment that I would prefer be addressed.
- Waiting for Author – The code should not be merged. There’s at least one major comment that must be addressed before merging.
- Rejected – Used relatively rarely, this means either that the whole Task/Work Item/Business Requirement/etc. never should have been pushed forward to active development because it’s fundamentally flawed or that the PR is so rife with fundamental errors that it never should have reached the review stage in its current state.
Once the author has completed their changes to the submitted code, return to step 2 and review their changes using the same process. Typically, it will take several passes back and forth to get everything ironed out, but the process doesn’t change – you’ll just find less each time.
It’s important to insist on being the one to resolve all comments you left. Sometimes I find the fix made based on my comment is inadequate in some fashion. If the author has already resolved the comment before I verify their changes, it becomes difficult to track which ones have already been checked and which haven’t.
Code reviews are an important but often neglected part of the development process. So much so that you can easily find joke videos or memes like this one online.
While conducting thorough code reviews takes more time and can feel tedious, it doesn’t have to be painful. After adopting the habit of reviewing code in-depth – and ideally getting your entire team on board with this practice – you’ll start seeing significant reductions in the number of bugs and a marked improvement in the quality of your code. The problem was never that you couldn’t see the issues, it was that you weren’t taking the time to.