Learning from Graphs

Welcome to the third, and still not final, episode of the series on learning from 3D data. We’ve already looked at point clouds and voxel grids, so now it’s time for graphs. I’ve previously motivated learning on 3D data as opposed to 2D data like images, so let’s skip that and move directly to a quick recap of point clouds and voxels to see why we might want, and need, yet another representation.

Previously on 3D deep learning

Point clouds are great because they are the raw output of 3D scanning hardware, so we don’t need any hand-crafted pre-processing. Apart from being computationally efficient, they are also efficient to store thanks to their natural sparsity: unoccupied space remains empty, and we simply store a three-tuple of xyz coordinates for each point. Extracting information from this format, i.e. learning, is difficult, however, partly because of this sparsity but also because the points are unordered and vary in density.

Voxel grids try to alleviate some of these problems by putting points into boxes, i.e. voxels, and stacking them into an ordered structure, the voxel grid. As with pixels in an image, each voxel now has pre-defined neighbors, and density is equalized through binning, as multiple nearby points are lumped together into a single voxel. This allows us to employ convolutions, the workhorse of deep learning on structured 2D data such as images.
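To make the binning step concrete, here is a minimal sketch in plain Python. The function name `voxelize`, the `voxel_size` parameter and the example coordinates are made up for illustration, and the result is kept as a sparse dictionary of occupied voxels rather than the dense array a convolution would typically consume.

```python
from collections import Counter
from math import floor


def voxelize(points, voxel_size):
    """Bin xyz points into voxels; return {(i, j, k): number of points}."""
    # Shift by the per-axis minimum so voxel indices start at zero.
    mins = [min(p[axis] for p in points) for axis in range(3)]
    counts = Counter()
    for p in points:
        # Integer voxel index of this point along each axis.
        index = tuple(floor((p[axis] - mins[axis]) / voxel_size) for axis in range(3))
        counts[index] += 1
    return counts


# A tiny, unordered point cloud: one xyz three-tuple per point (hypothetical values).
points = [
    (0.1, 0.1, 0.1),
    (0.2, 0.15, 0.05),  # close to the first point, so it shares its voxel
    (1.4, 0.1, 0.1),
    (0.1, 2.6, 0.1),
]

occupancy = voxelize(points, voxel_size=1.0)
```

Two things from the recap show up here. The two nearby points collapse into a single voxel, which is the density equalization, and every occupied voxel has an integer index, so its neighbors are simply the indices that differ by one along an axis. Shuffling the input points leaves the result unchanged, so the ordering problem of point clouds disappears as well.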

The holy graph

What do graphs bring to the table, then? Calling them the best of both worlds would be an exaggeration, but they certainly offer some of each. But what is a graph anyway?