Colin Wren

Reconstructing 3D objects from 2D images

Technology · 3 min read

While attending DroidCon in 2014 I saw an interesting talk from the guys at Seene about how the technology behind their app worked. It was a great talk and got me a little more than excited about computer vision again.

I then promptly got snowed under with work and forgot about it all until about two years later, when I got back into skateboarding (I’m still terrible, but I did manage to drop in once). I was talking to a friend about how cool it would be to take a few pictures of a place and have a game engine create a skate-able arena for a game like Touch Grind.

With that in mind I started looking at some of the things I’d learned from the Seene talk. Seene worked, at least to my understanding, by comparing images and using the shifts in features to detect depth, which allowed it to create a 3D point cloud and a subsequent mesh, with the images used for textures. So I started looking in that direction and found VisualSFM.
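To make that idea concrete, here’s a minimal sketch of matching features between two views using OpenCV and ORB features. I should stress this is my rough illustration of the general technique rather than how Seene actually implemented it, and the filenames are placeholders:

```python
import cv2

# Load two photos of the same object taken from slightly different positions
img_left = cv2.imread("view_left.jpg", cv2.IMREAD_GRAYSCALE)
img_right = cv2.imread("view_right.jpg", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute binary descriptors in both images
orb = cv2.ORB_create(nfeatures=2000)
kp_left, des_left = orb.detectAndCompute(img_left, None)
kp_right, des_right = orb.detectAndCompute(img_right, None)

# Match descriptors; Hamming distance suits ORB's binary descriptors
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_left, des_right), key=lambda m: m.distance)

# The shift (disparity) of each matched feature between the two views is
# what hints at depth: nearby points shift more than distant ones
for m in matches[:10]:
    x_l, _ = kp_left[m.queryIdx].pt
    x_r, _ = kp_right[m.trainIdx].pt
    print(f"disparity: {x_l - x_r:.1f}px")
```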

VisualSFM works in the same kind of way, but at a much bigger scale: it takes multiple picture sources and, by comparing features across them, can recreate objects in 3D space.

So I decided to give it a play. I found this tutorial a good resource for getting something up and running (although you’ll need to make your own image set to work from).

Reconstructing my Bat Gremlin statue

I have this amazing Bat Gremlin statue sitting on my desk (I love Gremlins), and as it was right next to me and had some interesting features I thought it might be a good object to reconstruct. I placed it atop a tripod I had and set about taking 142 images of it from all kinds of angles.

bat gremlin
Bat Gremlin statue during its ‘photoshoot’

Once I had the source imagery I added all 142 images into VisualSFM. This took a while (~17 minutes), as the software loops through each image matching it against all the others; with 142 images that works out to 142 × 141 / 2 = 10,011 pairwise comparisons.

Once this initial match was completed I ran the ‘compute missing matches’ step. After this was done the software had all the image-match data it needed to start reconstructing the statue.
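For the curious, here’s a rough sketch of why that matching stage takes as long as it does; `match_pair` is a hypothetical placeholder for the per-pair feature matching, so this is just to illustrate that the work grows quadratically with the number of images:

```python
from itertools import combinations

def match_pair(path_a, path_b):
    """Hypothetical placeholder for matching features between two images."""
    ...

# 142 source images, named here for illustration only
image_paths = [f"gremlin_{i:03d}.jpg" for i in range(142)]

# Every image gets compared against every other image exactly once
pairs = list(combinations(image_paths, 2))
print(len(pairs))  # 10011 pairs, hence the ~17 minute wait

for path_a, path_b in pairs:
    match_pair(path_a, path_b)
```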

The first reconstruction stage is the ‘Sparse 3D reconstruction’. The bit I love about this stage is watching the software take a source image and locate it in the relative coordinate system, then take another image, and another, gradually rebuilding the object.

Once the sparse reconstruction is done you’ll be able to see a series of points in 3D space that looks amazingly like the object you’re trying to reconstruct. You can right-click to rotate the object, which really demonstrates the 3D-ness of the point cloud.

sparse 3d reconstruction stage of visualsfm
Result of the Sparse 3D Reconstruction stage. The coloured triangles are the camera angles the software determined the photos were taken at
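The geometry behind this stage can be sketched with OpenCV, though I should be clear this is not what VisualSFM actually runs (it does incremental reconstruction with bundle adjustment across all views). Below is a minimal two-view version: `pts_left` and `pts_right` are assumed to be matched pixel coordinates as (N, 2) float32 arrays (e.g. from the ORB matching earlier), and `K` is a guessed camera intrinsics matrix; a real pipeline would estimate the focal length from EXIF data.

```python
import cv2
import numpy as np

# Assumed intrinsics: guessed focal length and image centre
K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])

def triangulate_two_views(pts_left, pts_right, K):
    # Estimate the essential matrix from the matches (RANSAC rejects outliers)
    E, mask = cv2.findEssentialMat(pts_left, pts_right, K, method=cv2.RANSAC)

    # Recover the second camera's rotation and translation relative to the first
    _, R, t, mask = cv2.recoverPose(E, pts_left, pts_right, K, mask=mask)

    # Build the two projection matrices and triangulate to homogeneous 3D points
    P0 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P1 = K @ np.hstack([R, t])
    pts_h = cv2.triangulatePoints(P0, P1, pts_left.T, pts_right.T)

    # Convert from homogeneous coordinates to an (N, 3) point cloud
    return (pts_h[:3] / pts_h[3]).T
```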

The last (and most intensive) step is to run the ‘Dense 3D reconstruction’. This step requires the CMVS and PMVS tools and took about 30 minutes on my setup. The output is a bundle.out file, a list.txt file and a 0000.ply file containing the 3D model generated during the process; you can use MeshLab to open the .ply file.
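If you want to poke at the point cloud programmatically rather than in MeshLab, the .ply file is easy to read. Here’s a minimal sketch, assuming the third-party plyfile package (pip install plyfile) is installed:

```python
from plyfile import PlyData

# 0000.ply is the dense point cloud produced by the CMVS/PMVS step
ply = PlyData.read("0000.ply")
vertices = ply["vertex"]
print(f"{vertices.count} points in the dense reconstruction")

# Each vertex carries at least a position; PMVS output typically
# includes normals and colours as well
x, y, z = vertices["x"][0], vertices["y"][0], vertices["z"][0]
print(f"first point: ({x:.3f}, {y:.3f}, {z:.3f})")
```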

So does it work?

Ish. The sparse reconstruction definitely managed to reconstruct the object into an eerily accurate point cloud; the dense reconstruction didn’t quite manage it.

dense 3d reconstruction rendered in meshlab
The mesh generated from the Dense 3D Reconstruction — a bit of a miss

The wings on the Bat Gremlin were completely lost in the dense reconstruction. That may just be down to them being ‘featureless’ compared to the rest of the statue, or lighting may have been a factor, as I shot everything under normal night-time room lighting conditions.

However, the fact that in an hour I was able to take 142 images, import them into VisualSFM, press a few buttons and generate a 3D mesh with the main features of the statue, all for free, is not something to dismiss.

Jamie Fuller on YouTube had much better luck with his reconstruction of a model on a bed (may be NSFW, depending on your workplace).

I think Seene had the right approach to making this technology ‘user friendly’: by reducing the number of degrees around the subject that would need to be reconstructed, they created the optimal experience for the app market.

These constraints meant you could download the app, create a short clip and have a 3D model to play with in a matter of seconds, which is ideal for getting people sharing their creations and coming back to the app to create more.

I’ve also used Autodesk’s 123D app, but I found it drained my battery and took forever to upload the imagery, and while it did successfully recreate the object I was trying to reconstruct, it also reconstructed the environment around it, something I didn’t really want.

So while my little experiment may not have worked that well, I have to give kudos to the developers working in this space. They have made it remarkably simple to reconstruct objects from images, and the entire workflow is completely free and open source.