Porting Algorithms from ROS 1 to ROS 2
Note: Apex.AI is creating an automotive-grade ROS 2 called Apex.OS. This blog post refers to some proprietary Apex.OS constructs, which will be covered in upcoming blog posts.
Converting an application framework from ROS 1 to ROS 2 can be as easy as switching a library and changing some types. To properly convert an application and to ensure the ROS 2 implementation is production grade, the following tasks must be considered and implemented:
Ensure the algorithm implementation follows architectural and software engineering best practices
Ensure concerns are separated
Follow a safety-critical coding standard
Use warnings and static checkers
Ensure the algorithm implementation is fully tested
Ensure the implementation is well documented
Tune and optimize the implementation for the target platform
To more concretely understand the steps needed to port an algorithm implementation from ROS 1 to ROS 2, the ROS 1 Velodyne driver is examined in detail.
ROS 1 is an excellent framework for developing and quickly prototyping robotic applications, thanks to vast community support and a wide array of tools. For more demanding use cases with stricter requirements, however, it has become increasingly clear that ROS 1 is not sufficient for developing production-quality, safety-critical robotic applications.
ROS 2, by contrast, is a framework for safe, secure, and robust applications such as autonomous driving, which falls into the domain of safety-critical applications.
Converting an algorithm or application to use ROS 2 as a framework can be as easy as switching out constructs such as publishers, subscribers, and nodes, and modifying the build process to include the correct headers and link against the correct libraries. An application modified in this way nominally runs on top of ROS 2, but it cannot be called a ROS 2 application, as it is neither real-time nor robust. To convert a ROS 1 application into a ROS 2 application, a number of improvements must be implemented.
Oftentimes, ROS 1 is used for rapid prototyping and development of robotic applications. In engineering, an umbrella under which software development falls, prototyping is a necessary evil to get applications and systems working. A prototype affords the engineer a holistic view of a working application, which is not always available during low-level development.
This holistic view makes it possible to understand what the application does at the highest level, and to map each of its steps to the most appropriate software engineering and architectural patterns. When rewriting an application to run on ROS 2, the engineer also has the opportunity to fix the architecture of the application so that it best fits the target problem.
For a concrete example, consider the architecture and workflow of the ROS 1 Velodyne driver.
What the driver does
Before implementing architecture changes, it's important to analyze the high-level functionality of the existing solution. Schematically, the ROS 1 Velodyne lidar driver does the following:
Fundamentally, the operation of the driver can be broken down into four steps:
Wait for a packet
Deserialize the packet into points
Aggregate points into a point cloud
Publish the message
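The four steps above can be sketched as a single loop. This is only an illustration: the `Packet` and `Point` types, the three-bytes-per-point decoding, and the `run_driver` function are made up for this example and are not taken from the Velodyne driver or from ROS. Receiving and publishing are passed in as callbacks so that the sketch does not depend on any particular framework.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical types for illustration; not the real Velodyne packet layout.
struct Packet { std::vector<std::uint8_t> bytes; };
struct Point { float x; float y; float z; };
using PointCloud = std::vector<Point>;

// Step 2: turn one packet into points.
// Placeholder decoding: every three bytes become one point.
std::vector<Point> deserialize(const Packet & packet)
{
  std::vector<Point> points;
  for (std::size_t i = 0; i + 2 < packet.bytes.size(); i += 3) {
    points.push_back({
      static_cast<float>(packet.bytes[i]),
      static_cast<float>(packet.bytes[i + 1]),
      static_cast<float>(packet.bytes[i + 2])});
  }
  return points;
}

// Runs the four steps until wait_for_packet returns false.
// Returns the number of point clouds published.
std::size_t run_driver(
  const std::function<bool(Packet &)> & wait_for_packet,
  const std::function<void(const PointCloud &)> & publish,
  std::size_t points_per_cloud)
{
  PointCloud cloud;
  Packet packet;
  std::size_t published = 0;
  while (wait_for_packet(packet)) {                 // 1. wait for a packet
    for (const Point & point : deserialize(packet)) {  // 2. deserialize
      cloud.push_back(point);                       // 3. aggregate
      if (cloud.size() == points_per_cloud) {
        publish(cloud);                             // 4. publish
        ++published;
        cloud.clear();
      }
    }
  }
  return published;
}
```

In a real driver, `wait_for_packet` would block on the sensor's socket and `publish` would hand the cloud to the framework's publisher; here both are left to the caller.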
Applying best practice - multithreading
Architecturally, the existing Velodyne node can be thought of as having two worker threads, plus any number of threads that ROS 1 creates automatically. The first thing to review is the multi-threading in this implementation.
While there are many rules on how to implement multi-threading correctly, such as preferring locks to lock-free programming, the most widely agreed best practice for multi-threading is not to use it.
If multi-threading is not needed, then it should not be used. Multi-threading has complexity implications (multi-threaded programs are more complex and harder to debug) and performance implications (some performance is lost to context switching at the kernel level). That said, two threads in a sequential configuration can potentially offer some latency benefits and attenuate bursty effects from input data.
Conceptually, the input to this software component can be thought of as a steady stream of packets. If a packet cannot be handled before the next packet arrives (i.e. the packet cannot be deserialized in the time available), then no amount of threading will let the application keep up with the incoming data. As such, this application can be rearchitected to reside in a single thread.
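The timing argument can be made concrete with a little arithmetic: at a steady rate of R packets per second, each packet must be handled within 1/R seconds. The helpers below are a minimal sketch of that check; the names `packet_budget`, `keeps_up`, and `time_call` are invented for this example, and any packet rate passed in is an assumption the reader must replace with the figure for their sensor.

```cpp
#include <chrono>

// Time available to handle one packet when packets arrive at a steady rate.
std::chrono::nanoseconds packet_budget(double packets_per_second)
{
  return std::chrono::duration_cast<std::chrono::nanoseconds>(
    std::chrono::duration<double>(1.0 / packets_per_second));
}

// True if the measured per-packet handling time fits within the budget.
bool keeps_up(std::chrono::nanoseconds handling_time, double packets_per_second)
{
  return handling_time <= packet_budget(packets_per_second);
}

// Measures how long one call to fn takes, using a monotonic clock.
template<typename Fn>
std::chrono::nanoseconds time_call(Fn && fn)
{
  const auto start = std::chrono::steady_clock::now();
  fn();
  return std::chrono::duration_cast<std::chrono::nanoseconds>(
    std::chrono::steady_clock::now() - start);
}
```

For a made-up rate of 1000 packets per second the budget is about one millisecond, so a handler that `time_call` measures at 500 microseconds fits in a single thread, while one that takes two milliseconds falls behind no matter how the work is split across threads in sequence.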
Applying best practice - polling
In order to avoid blocking, the ROS 1 Velodyne driver polls its socket infinitely.
Polling is generally understood to increase jitter in systems.
To avoid jitter, a special UdpReceiver (Apex.OS IP) is used. It wraps POSIX's recvfrom along with select, a combination that allows the developer to wait for data up to a timeout. Using a waiting or event-driven pattern reduces CPU load and jitter in a system.
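The Apex.OS UdpReceiver itself is proprietary and its interface is not shown in this post. The general idea of combining select with recvfrom, however, can be sketched with plain POSIX calls. The function names `open_udp_socket` and `receive_with_timeout` below are made up for this example, and error handling (for instance retrying on EINTR) is deliberately left out.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#include <chrono>
#include <cstddef>
#include <cstdint>

// Opens a UDP socket bound to 127.0.0.1:port (port 0 lets the OS choose).
// Returns the file descriptor, or -1 on failure.
int open_udp_socket(std::uint16_t port)
{
  const int fd = ::socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) {
    return -1;
  }
  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = htons(port);
  if (::bind(fd, reinterpret_cast<const sockaddr *>(&addr), sizeof(addr)) != 0) {
    ::close(fd);
    return -1;
  }
  return fd;
}

// Sleeps in select() until a datagram arrives or the timeout expires, then
// reads it with recvfrom(). No busy polling is involved.
// Returns the number of bytes received, 0 on timeout, or -1 on error.
// (A zero-length datagram would look like a timeout; fine for a sketch.)
ssize_t receive_with_timeout(
  int fd, void * buffer, std::size_t capacity, std::chrono::microseconds timeout)
{
  fd_set read_set;
  FD_ZERO(&read_set);
  FD_SET(fd, &read_set);

  timeval tv{};
  tv.tv_sec = static_cast<time_t>(timeout.count() / 1000000);
  tv.tv_usec = static_cast<suseconds_t>(timeout.count() % 1000000);

  const int ready = ::select(fd + 1, &read_set, nullptr, nullptr, &tv);
  if (ready < 0) {
    return -1;  // select failed; errno holds the reason
  }
  if (ready == 0) {
    return 0;   // nothing arrived within the timeout
  }
  return ::recvfrom(fd, buffer, capacity, 0, nullptr, nullptr);
}
```

While the thread waits inside select, the kernel does not schedule it, which is where the reduction in CPU load comes from compared with repeatedly polling the socket.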
Separating concerns
A problem endemic to rapidly prototyped code, and common in ROS 1 code, is failure to obey the design principle of separation of concerns. Software engineering best practice suggests that concerns should be kept as separate as possible, and that a loosely coupled software architecture should be used whenever possible. Recent best practice suggests that in a large robotic system, it is important to keep the following five concerns separate:
A robotic framework, such as ROS 1 or ROS 2, typically has ownership over concerns 2-5. This implies that the remaining concern, computation (i.e. algorithms, control logic, etc.), falls in the developer's domain, and the operation of this concern should be separate from the other concerns.
Because of the rapidly developed nature of many ROS 1 applications, concerns are seldom kept well separated. A port to ROS 2 is an opportunity to properly separate concerns. Returning to the example of the ROS 1 Velodyne driver, it can be seen that concerns are not kept separate.
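One way to picture a separated computation concern is a class that includes no socket, publisher, or framework headers at all. The `CloudAssembler` class below is a hypothetical sketch written for this post's argument, not code from the Velodyne driver or Apex.OS; it only accumulates points and hands back a finished cloud, leaving it to the caller to decide where points come from and where clouds go.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical point type for illustration.
struct Point { float x; float y; float z; };

// Computation only: no sockets, no publishers, no framework headers.
class CloudAssembler
{
public:
  explicit CloudAssembler(std::size_t cloud_size)
  : cloud_size_(cloud_size) {}

  // Adds one point. Returns true once a full cloud is ready to be taken.
  bool add(const Point & point)
  {
    cloud_.push_back(point);
    return cloud_.size() >= cloud_size_;
  }

  // Hands the accumulated cloud to the caller and starts a new, empty one.
  std::vector<Point> take()
  {
    std::vector<Point> out;
    out.swap(cloud_);
    return out;
  }

private:
  std::size_t cloud_size_;
  std::vector<Point> cloud_;
};
```

Because the class knows nothing about communication, it can be unit tested without a sensor or a running framework, and the same code can sit behind a ROS 1 node, a ROS 2 node, or a plain test harness.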
Interleaved concerns in the ROS 1 Velodyne driver
Fundamentally, the driver does the following:
Wait for a packet