Building Safe Algorithms in the Open, Part 2 - Implementation
Note the first: This is the second part of a three-part series. The first part can be found here. The third part hasn’t been written yet.
Note the second: This post is partially adapted from an Autoware meetup hosted by Apex.AI on December 10th, 2019. A link to the slides can be found here.
When I first started developing professionally, I didn’t understand open-source. I didn’t understand side projects either, for that matter. After all, why would you give valuable work away for free? In the years since, after spending a little bit more time working with the open-source community through Apex.AI’s work with ROS 2 and Autoware.Auto, I’ve reached a little bit of an understanding of open-source.
Engineers like to build things. People want acknowledgment and appreciation.
When you put those things together, you get a pathway to open-source. If I’m building something for the satisfaction of building something, then why not let everyone else appreciate and find use and value from it? I’m not doing it for the money, after all.
Similarly, it wasn’t until I started developing professionally with more rigor that I understood the allure of side projects. To build a solid product that people will pay for, you often have to introduce many more constraints on development. Design reviews. Code reviews. Coding standards, style guides, coverage metrics. So on and so forth. Don’t get me wrong. These are all good things that are very likely necessary to develop a high-quality product, but sometimes a developer just wants to be free. To build what they want to, how they want to, when they want to. No meetings, reviews, or business cases to be seen.
So when it comes to developing safe, or otherwise high-quality algorithms in the open, how can we make the two work? After all, part of the allure of open-source is freedom, whereas practices that help ensure the development of more reliable code infringe on that very same developmental freedom.
An answer I’ve found lies in applying a little bit of consistent discipline, a little bit more up-front discipline, and making liberal use of the many great tools open-source has given us.
Project planning in the open
To tackle the things we engineers like to build, we need a particular set of skills: focus, good problem-solving skills, the ability to decompose problems, and solid foundational knowledge to inform all of the above.
This particular set of skills can sometimes lead us engineers to get a little too big for our britches.
For all the technical ability we engineers have, we are ultimately finite in our capabilities. I would posit that most developers can’t keep the whole of a project in mind while they write individual lines of code. What’s more, I would say that most developers cannot code and keep the broader project in mind while also minding the overall business goals.
This is where the black art of project management comes into play.
While we developers may have a somewhat contentious relationship with managers of the people, technical, or project persuasion, it has to be admitted that these managers provide a valuable service. At their best, they make sure that we developers are pointed at the right problems and that nagging annoyances are out of the way, so that we can fire on all cylinders at the problem at hand.
Though we recognize the importance of project-manager types, people with these kinds of skills generally don’t show up on your average Joe’s open-source project, where the goal is usually developing for the pleasure of it.
So what are we then to do?
Well, we developers can get our hands a little dirty and spend a little bit of time planning upfront.
I won’t spend too much time talking about this since I’ve written at length about the design and planning part of development in the previous post. The gist of it is that given a design and architecture, for which there are generally many components, and some dependencies therein, you choose individual components in your design, and build them one at a time.
Returning to the project planning aspect, I like to start with the components that have the fewest dependencies (think min-cut!) and go from there, creating stub implementations where necessary to maintain development cadence. From this ordering, you can generally make a bunch of tickets (with dependencies between them matching your architectural dependencies, if your issue tracker supports it), each with some general high-level notes to keep in mind before you dive into the low level. The tickets should be as small and tightly scoped as possible because, let’s face it, we’re all ultimately limited in our capacity for context and attention. The more bite-sized the development tasks are, the easier they are to finish, so why not make hard problems as easy as possible?
Then, as you develop, you pop issues off the top of your queue and get to it.
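To make "fewest dependencies first" concrete, here’s a toy sketch of ordering components so that everything a component needs is built before it. This is purely illustrative: the `build_order` function and the component names in the usage example are made up, not taken from any actual project planning tool.

```cpp
#include <map>
#include <queue>
#include <string>
#include <vector>

// Toy sketch: given each component's list of dependencies, produce a build
// order in which dependencies always come before their dependents.
std::vector<std::string> build_order(
  const std::map<std::string, std::vector<std::string>> & deps)
{
  std::map<std::string, int> unmet;  // number of unbuilt dependencies
  std::map<std::string, std::vector<std::string>> dependents;
  for (const auto & [component, needs] : deps) {
    unmet[component];  // make sure every component has an entry
    for (const auto & dep : needs) {
      ++unmet[component];
      dependents[dep].push_back(component);
    }
  }
  // Components with no unmet dependencies are ready to be ticketed first
  std::queue<std::string> ready;
  for (const auto & [component, count] : unmet) {
    if (count == 0) { ready.push(component); }
  }
  std::vector<std::string> order;
  while (!ready.empty()) {
    const auto component = ready.front();
    ready.pop();
    order.push_back(component);
    // "Building" a component may unblock the components that depend on it
    for (const auto & dependent : dependents[component]) {
      if (--unmet[dependent] == 0) { ready.push(dependent); }
    }
  }
  return order;
}
```

Each entry in the resulting order maps naturally to a ticket, with the graph edges becoming ticket dependencies.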
Obviously, this is a vastly simplified version of project management. In true project management, there are many other things to consider, such as resources, scheduling, competing business cases, and so on. Project management in the open, being simpler, might be more liberating too. Maybe there’s a case for open-source project management as well.
Developing in the open
With a bundle of tickets in hand, the project planned, and everything broadly scoped out, we can just go ahead and get to developing.
That said, many of the liberating things about cowboy-style open-source development are things that shouldn’t be present when developing safe code. You can, however, avoid many of these pitfalls by liberally using open-source tools and applying a little bit of discipline (and having friends).
I’m a huge proponent of discipline as a means to improve the quality of pursuits (after all, discipline is number 6 on my StrengthsFinder assessment; take from that what you will). By being disciplined enough to use the plenitude of tools open-source provides, heed and act on the results, and stick to a process, we can overcome many of the shortcomings that creep in with the cowboy development we often get in open-source.
Briefly, I’ll argue that using the following tools and practices (which can easily be applied to any project, with caveats) goes a long way towards improving the quality of code:
Tests (or better yet, test-driven development)
Continuous Integration (CI/CD)
On top of this, some general principles which guide my line-by-line development are:
Making full use of the language and libraries
Code should be readable and blindingly obvious
I’ll try to tie this back to the actual implementation of the NDT localizer, which happened over the course of 10 merge requests by my good colleague Yunus. He’s too busy doing actual work, so I get to take some imaginary credit by writing about his work.
In addition, I’ll use the development of another open-source algorithm, the MPC controller, as an extra live example of some of these processes and practices. It was developed in a somewhat looser, or cowboy fashion over some 30-odd merge requests, not to mention several more merge requests and improvements after the main development.
Let’s first talk about testing.
I have had a long (for me) and complicated relationship with testing. On my first project in my first development role, I had absolutely no faith that my code worked, so I was the very first member of the team to start writing any substantive unit tests. Back-patting aside, I was entirely right in that my code didn’t work.
Since then, my tumultuous relationship with testing has gone through many twists and turns befitting a late-night drama. I liked it. I hated it. I wrote too many tests, did way too much copy-pasting, wrote too many redundant test cases. Then it became a chore tacked onto development: first I wrote the code, then I wrote the test, and it was just a slog.
Now testing and I are good. It’s an integral part of my development workflow and a first-class citizen, as much as (though not quite) any application code I write.
What changed for me was using test-driven development, which I started using in my mpc project.
I’ve talked briefly about test-driven development in the previous post, but just to reiterate, the process is:
Come up with a design specification (use cases, requirements, etc.)
Implement stub API/architecture
Write tests based on API and design specification; they should fail
Implement logic; tests should pass
There’s some iteration in this process (tests don’t fail on stubs, the implementation doesn’t pass tests, API is awkward, etc.), but as a whole, I think the process is extremely valuable.
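As a minimal illustration of that cycle, consider a hypothetical spec (this function is invented for the example; it’s not code from NDT or mpc): normalize an angle into the interval (-pi, pi]. You stub it, write a spec-driven test that fails against the stub, then implement until the test passes:

```cpp
#include <cmath>

// Hypothetical spec: normalize an angle (in radians) to (-pi, pi].
// Step 2's stub looked like: `double normalize_angle(double) { return 0.0; }`
// Step 3's tests fail against that stub, which proves the tests *can* fail.
// Step 4: fill in the logic so the tests pass.
double normalize_angle(double angle_rad)
{
  while (angle_rad > M_PI) { angle_rad -= 2.0 * M_PI; }
  while (angle_rad <= -M_PI) { angle_rad += 2.0 * M_PI; }
  return angle_rad;
}
```

The point of the failing-first step is subtle but important: it’s the only cheap proof that your test is actually connected to the behavior you care about.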
I’ve made a big hullabaloo about doing a little bit of planning before implementation, and TDD gives you space for this. Check. Then you think about the architecture and API, and match it against the use cases. This gives you an excellent opportunity to get closer to code but still think about the problem at a high-level. Check the second.
What comes next is writing the tests first. There are a couple of different reasons why writing tests before the implementation is valuable, and I think they’re all important:
The test is written as a first-class citizen, and not an afterthought
The test is written with only the spec in mind, and not any detailed knowledge of implementation--meaning the test will test how the implementation should work
The test allows you to play with your API and see how comfortable it is to use
You can ensure that your test is valid--that is--it will fail if your code doesn’t do what you want it to
Taken all together, I think the benefits of test-driven development are immense, and once again, I highly encourage everyone to at least give it a fair shot.
Turning back to Autoware.Auto: though he didn’t follow idiomatic test-driven development, Yunus wrote tests for every merge request in the NDT development, with the test code often equaling or outweighing the amount of new application code, which is a good thing. As a point of comparison, SQLite, which is probably top tier in terms of testing (in general, not just for an open-source project), has 662 times as much testing code as application code. We’re not quite at that point in Autoware.Auto, but if you go through the history of NDT-related merge requests, you’ll see that the test coverage slowly crept up until it exceeded 90% (though it’s since fallen due to other developments and external code).
Similarly, for my mpc project, there are tests for everything, including some of the testing code. What’s more, I’m always careful to put in a regression test to ensure that the bug is fixed and stays fixed.
Good job, me.
The funny thing about a lot of concepts is that you can stretch their definitions pretty far. On a similar tack, testing goes far beyond manually written functional tests. In fact, checking for style adherence or bugs can be considered a form of testing (inspection, really, but testing if we stretch the definition).
These kinds of “tests” are somewhat painful and laborious to run by hand. After all, checking whether the indentation is four spaces or a tab? No, thank you.
But one of the most beautiful and valuable things about coding is that you can automate things, painful and laborious things included. In doing so, you can accomplish things faster and more thoroughly with code than any human could possibly hope to. What if we could do the same with bugs and problematic or error-prone constructs in code?
Well, we can, with static analysis tools.
I’ve written at some length about static analysis in a previous blog post, so I won’t rehash too much here about the value or potential tools you can use.
In the Autoware.Auto project, we use a slightly pared-down version of the ament_lint suite of linters and tools from our close friends in the ROS 2 community. This gives us many nice things, but perhaps the most important is an automatic way to format our code, thus avoiding fights about style, since the impartial tools tell us what’s right or wrong. As a side note, clang-format is stricter and less ambiguous than uncrustify, if you care.
By contrast, in the mpc project, I went a bit farther for once. Here, I made use of Clang’s -Weverything flag, in addition to every possible warning in clang-tidy and the Clang static analyzer. Surprisingly, for my own code I only had to disable a few of the rules, due to inapplicable warnings or philosophical differences. When interfacing with external code, however, I had to disable many checks because of the amount of noise they introduce.
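For the curious, wiring something like this up might look like the following CMake sketch. The flags are real Clang options, but the particular suppressions and the check list are illustrative; this is not the mpc project’s actual configuration:

```cmake
# Sketch: opt into aggressive Clang diagnostics for our own code.
# -Weverything turns on every warning Clang has; we then subtract the
# handful that don't apply (these suppressions are only examples).
if(CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
  add_compile_options(
    -Weverything
    -Wno-c++98-compat   # we target a modern standard on purpose
    -Wno-padded         # struct padding noise, not a bug
  )
endif()

# Run clang-tidy alongside compilation (check list trimmed for brevity)
set(CMAKE_CXX_CLANG_TIDY clang-tidy -checks=*)
```

External dependencies can be excluded from the noise by marking their headers as SYSTEM includes, which most compilers then leave out of warning reports.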
In all, though, I’ve found that using extensive static analysis checks doesn’t impede normal development too much (for new code, and after an initial learning curve).
It’s hard to quantify the value of static analysis, especially when you use it from the get-go. After all, it’s hard to guess whether or not a bug would have existed when static analysis may or may not have prevented its existence in the first place.
Still, I believe liberal use of warnings and static analysis is one of those things where, if used correctly, you can’t be sure it’s done anything at all. In other words, you can’t be sure of its value when it’s there, but you’ll sure as heck miss it when it’s gone.
As much as I love thorough testing and code analyses of the static and dynamic nature, all the tests and analyses in the world are worthless if you don’t run them. This is one of the things CI can enforce with minimal developer overhead.
I think it’s generally accepted that having CI/CD (continuous integration/continuous deployment) infrastructure is an essential part of modern development, same as using version control and having development standards (or at the very least, a style guide). The value of a good CI/CD pipeline is something that bears repeating, however.
At a bare minimum, a CI/CD pipeline should build the code, and run the tests before allowing the code into the mainline of your repository. After all, no one wants to be that guy (or girl, or person) who breaks the build or some test, in which case you have to quickly and shamefully fix the issue. CI (and by extension, your dear DevOps engineers) protects you from that embarrassment.
But CI can do so much more for you.
With a robust CI pipeline, you can test any number of combinations of operating systems, compilers, or architectures (though not too many; consider combinatorial testing). You can also run builds and tests that are otherwise too costly or cumbersome for a developer to run locally, such as measuring code coverage. The sky’s the limit.
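As a sketch of what such a pipeline can look like in a GitLab-style config (the job names, image, and exact commands here are illustrative, not Autoware.Auto’s actual configuration):

```yaml
# Illustrative pipeline: build, then run tests and report results
stages: [build, test]

build:
  stage: build
  image: ros:dashing
  script:
    - colcon build

test:
  stage: test
  image: ros:dashing
  script:
    - colcon build
    - colcon test
    - colcon test-result --verbose
```

The important property is the gate: a merge request simply cannot land until these jobs pass, no matter how confident the author feels.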
Returning to my original point, having a CI/CD pipeline (which Autoware.Auto does) as a part of your open-source project goes a long way towards taming the unruliness of cowboy development. Code can’t go in if it doesn’t build or pass tests. If you combine this with a robust testing discipline, you can always be mostly sure your code works.
In Autoware.Auto, the CI:
Builds the code
Runs tests (style, linters, functional tests)
Makes sure code is documented
By contrast, my hastily cobbled-together CI in the mpc project:
Builds the code
Runs scan-build (Clang static analysis)
Runs tests (but doesn’t fail CI on test failure)
Tests, analyses, and CI are wonderful. You write the tests and analyses, make sure they’re run with CI, and then you’re done, right?
Echoing a previous refrain, all the tests in the world are worthless if they’re bad tests. So how do you make sure they’re good tests?
Here, unfortunately, I have no magic answers for you. Instead, I fall back on an old mainstay of the engineering discipline: reviews. Specifically, code reviews.
It’s generally understood that two heads are better than one. In fact, I would argue that not only literature but also theory backs up this notion.
Ensemble methods are one example of this. In fact, it’s generally understood that using ensemble methods is a quick and easy way to boost the performance of statistical models (the famous boosting technique is an example of this). Similarly, from a purely statistical perspective, the variance of an estimate goes down (with some assumptions) the more samples you have. In other words, you’re more likely to be closer to the true answer with more heads.
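In symbols: for n independent estimates, each with the same variance sigma squared, the variance of their average shrinks linearly with the number of heads:

```latex
\operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{\sigma^{2}}{n}
```

The independence assumption is doing real work here, which is also a decent argument for reviewers who bring genuinely different perspectives.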
You can try out a live example of this by doing a team-building exercise. A less fun version might involve guessing random statistics individually and as a group.
Theory and team-building aside, code review is an important and powerful tool. It’s no wonder then, that code review is an integral part of any professional development process, and is even recommended by the ISO 26262 standard.
All that said, there’s always the danger of having too many cooks in the kitchen. What’s more, code reviews can sometimes become contentious exercises.
Even so, I think code reviews can be made palatable and painless if both reviewer and reviewee keep the following in mind:
You are not your code
There’s a person on the other side
Everyone is working towards the same goal; code reviews aren’t competitive (but some coding is)
Many smarter and nicer people than myself have written about how to do code reviews right, and I suggest you have a look at that. If I can impress nothing else from this, it’s that you should be doing code reviews if you want your code to be more reliable than not.
I’ve talked at length about the proper processes and tools with which to surround development: checks and automated tools that run and make sure the code is good.
From that process-level view, I’d like to move on to talking briefly about the craftsmanship of coding and share some thoughts on the process and intention behind writing individual lines of code.
There are a couple of concepts that, once I understood them and took them to heart, resulted in huge improvements in my code. One was keeping in mind intention, semantics, and readability, which I’ll talk about in a bit. Another was understanding object-oriented programming and the separation-of-concerns principle. The last big one was DRY, or the principle of “Don’t Repeat Yourself.”
DRY is something they teach you in school, and as with many similar things, we file it away and don’t give it too much thought outside of exams (or at least, that was true for me). But as with many things in school, we don’t learn them for no reason at all. It’s actually good practice.
Simply put, if you find yourself copy and pasting code here and there, or otherwise writing very similar code frequently, that’s a very good smell that the repeated code should be a function, or otherwise a part of some abstraction.
But DRY goes much further than being a smell test for when to make something a function. It can also inform some architectural decisions.
Though this intersects with some architectural concepts, such as coupling, cohesion, and separation of concerns, an example of applying DRY to architecture can be seen in the mpc project. I noticed, during the development of the mpc controller, that I would have to duplicate some code if I were ever to write another controller: some boilerplate state tracking, publishing, subscribing, transforms, and so on. Put another way, it seemed to be a separate concern from the purely mpc-related controller code.
That was a good smell to break out common constructs and functionality into a separate class. The payoff was two-fold: the mpc controller is 100% focused on mpc-related code, and the associated node is only configuration boilerplate. In other words, because of the architectural abstractions, I won’t have to repeat myself when I write another controller.
The world is nothing more than shades of grey, so care and proper thought should go into these design decisions to avoid going too far and creating abstractions where none exist. Still, if the developer is mindful of the concepts they are modeling, DRY is a powerful tool for informing architectural decisions and, in my opinion, a core concept for keeping your code clean and tight.
After all, one of the key values of code lies in its ability to do repetitive tasks, so why not offload the repetition of code to well-crafted functions and classes?
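A hypothetical sketch of that kind of split might look like the following. The class names and the toy control law are invented for illustration; this is not the actual Autoware.Auto or mpc API (where the base would also own pub/sub plumbing and transforms):

```cpp
// The base class owns the repeated boilerplate, here reduced to state
// tracking, so each concrete controller holds only its algorithm.
struct State
{
  double position;
  double velocity;
};

class ControllerBase
{
public:
  virtual ~ControllerBase() = default;

  // Shared boilerplate: every controller tracks the latest state identically
  void update_state(const State & state) { state_ = state; }

  // Template method: boilerplate calls into the algorithm-specific part
  double compute() { return compute_command(state_); }

protected:
  virtual double compute_command(const State & state) = 0;

private:
  State state_{0.0, 0.0};
};

// A concrete controller is now 100% focused on its control law
class ProportionalController : public ControllerBase
{
public:
  explicit ProportionalController(double gain) : gain_{gain} {}

protected:
  double compute_command(const State & state) override
  {
    return -gain_ * state.position;  // toy control law for illustration
  }

private:
  double gain_;
};
```

Writing a second controller then means writing only a second `compute_command`, which is exactly the DRY payoff described above.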
Making full use of the language and library
DRY, in my opinion, is such an important and all-encompassing concept that this point is really just an extension of DRY.
If your language supports something, then you should generally use the built-in version of it, unless you have a very good reason not to. And C++ has very many things built into it.
Programming is a skill, and as such, there is a great variance in programming skill levels. I have caught a glimpse of how high the mountain goes, so I generally assume that the people who implement the standards implement these common patterns better than I can.
A similar argument can be made for library functionality (though perhaps less strongly). Someone else has done it, and probably to a good level, so there’s no reason to reinvent the wheel.
That said, this, like most things, is a recommendation and not a hard and fast rule. While you shouldn’t reinvent the wheel, and while standard implementations are generally very good, there’s also no sense in trying to squeeze a square peg into a round hole. Use your best judgment.
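For instance, the standard library already covers two loops that get hand-rolled constantly (the two functions below are toy examples made up for illustration; the point is the habit, not these particular calls):

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// std::accumulate replaces a hand-written summing loop
double mean_speed(const std::vector<double> & speeds)
{
  return std::accumulate(speeds.begin(), speeds.end(), 0.0) /
         static_cast<double>(speeds.size());
}

// std::clamp replaces the classic if/else-if saturation snippet
double limit_command(double command, double max_magnitude)
{
  return std::clamp(command, -max_magnitude, max_magnitude);
}
```

Beyond saving keystrokes, the named algorithm states intent: a reader sees `std::clamp` and knows instantly what the line does, with no off-by-one reasoning required.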
One last concept that helped to improve my programming was the notion that coding wasn’t so much about coding as it was about communicating. If not communicating with other developers, then communicating with your future self. You need to think about memory and math, and Big-O complexity, of course, but once that becomes easier, you then need to start thinking about intention, semantics, and clarity after that.
There’s a very famous and widely recommended book on this subject, Clean Code, so there’s not a whole lot I can add other than to list some general things I look for when I’m writing my own code and doing code reviews:
Keep classes tight and focused:
Minimize the public interface to well-defined and understood concepts (Example)
Be mindful of the state of objects
But also be mindful of the different kinds of exception guarantees
Functions should do one thing
If the documentation of a function needs the word “and,” that’s a smell that you can maybe break the function up
Avoid having too many function parameters (typically a smell for a missing abstraction) (Example)
Are you repeating yourself? (Example)
Maybe it could be a function (Example)
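To illustrate the "too many parameters" smell and one common fix (every name below is invented for the example):

```cpp
#include <string>

// Before: the call site is a soup of unlabeled values, easy to mix up:
//   make_node("lidar", 10.0, 0.5, true, false);
// The parameters were hiding an abstraction; give it a name instead.
struct NodeConfig
{
  std::string topic;
  double rate_hz{10.0};
  double timeout_s{0.5};
  bool use_sim_time{false};
};

// After: one self-describing parameter, with sensible defaults built in
std::string describe(const NodeConfig & config)
{
  // Trivial stand-in for real construction logic
  return config.topic + (config.use_sim_time ? " (sim time)" : "");
}
```

The struct also scales gracefully: adding a sixth setting later doesn’t break every existing call site.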
Another excellent resource regarding these kinds of concerns is the ISO C++ Core Guidelines.
I’ll reiterate that none of these are groundbreaking, new, or unique, but if spelling the principles out has value, or gives someone an “aha” moment, then I didn’t waste bits and bandwidth writing this.
Those were some of the tools, principles, and processes we used in the development and implementation of NDT localization, and MPC control. Lots of work, fun to do, but not so interesting to talk about.
In general, we made great use of the tools and practices I’ve talked about in the development of NDT and MPC, but we weren’t perfect.
As an example, NDT didn’t follow idiomatic test-driven development (but it’s still well tested!). And while MPC did follow test-driven development to a T, it did not benefit from the more powerful CI built into Autoware.Auto. Further, lacking friends, and being a secretive personal project at the outset, MPC did not have the benefit of code review.
While both algorithms could have benefited from more static analysis, more testing, and more reviews, for what are essentially one-person projects, I think the testing and review that has been done is quite thorough. As for static analysis, better and more thorough forms of static analysis generally fall into the purview of products, and as such, are not something readily attainable for the open-source community (although there might be some interesting things on the horizon).
Meanwhile, there’s not too much to say for the line development of the two algorithms--we did it to the best of our abilities, adhering to the principles I’ve outlined above.
Overall, for both projects, I think we did an excellent job of decomposing the larger problem into smaller, byte-sized (pun intended) chunks (though the MRs for NDT could have been a bit smaller) and of testing them thoroughly. I think the preliminary results speak for themselves.
What comes after implementation is integration. This means plugging it into the larger system with other complex components providing inputs and digesting the output of your algorithm. Integration is probably the hardest part of algorithm development because you need to maintain a high-level view while fixing low-level problems. Any one bug in any of your many lines of code can break the integration of your algorithm.
I’ll talk about our experiences with this in the third and final blog post in this series.
As a preview, I will claim, as a testament to the development process, that no huge bugs were found during the usage and integration of the mpc controller. True, there were some build scripting issues, flaky tests, missing input validation, and issues with incompatible QoS settings, but nothing was horribly wrong with the code.
In fact, it was able to run (sans QoS incompatibility and parameter tuning) pretty much out of the box.
Similarly, while NDT had some of its own minor issues, like unstable covariances, a line search bug from cargo-culting in existing code, and misaligned maps, it was also basically able to run out of the box as well.
Not bad for stuff developed in the open.