While working with the robots at DLR (the German aerospace center), I’ve been confronted with a type of data—alongside camera images—which I hadn’t come across before, namely point clouds. As it turns out, point clouds can be an extremely useful extension to the two-dimensional RGB camera images already commonly used in scene analysis, for example for object recognition and classification.

However, there are differences between the two data types which prevent us from directly transferring successful techniques from one to the other. In this post, I’d like to explore those properties, starting with a detailed look at point clouds themselves, and then see which ideas have been employed to extend the deep learning revolution to this promising data type. As usual, you can find the code used to generate all graphics in this post on GitHub and try it out directly on Binder.

What exactly is a point cloud?

As the name suggests, a point cloud is an agglomeration of points in three-dimensional space, often resembling a cloud depending on the angle and distance from which we view it. Below, you see such a specimen.
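Concretely, a point cloud is usually stored as an unordered set of XYZ coordinates, i.e. an array of shape (N, 3). As a minimal sketch (the sphere sampling here is a made-up illustration, not the data shown in the figure), we can generate such a cloud with NumPy:

```python
import numpy as np

# A point cloud is commonly represented as an (N, 3) array of XYZ coordinates.
# As an illustrative example, sample points uniformly on the unit sphere by
# normalizing Gaussian samples.
rng = np.random.default_rng(0)
n_points = 1000
xyz = rng.normal(size=(n_points, 3))
xyz /= np.linalg.norm(xyz, axis=1, keepdims=True)  # project onto the sphere

print(xyz.shape)  # (1000, 3)
```

Note that the rows carry no inherent order: shuffling them describes exactly the same cloud, a property that will matter later when we discuss applying deep learning to this data.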