Exploring Blender for synthetic data generation (SDG)
One of the bottlenecks of machine learning (ML) and artificial intelligence (AI) development is data availability and quality. In my case, I am working toward detecting the orientation of parts so that a robot can pick them up. Ideally, I would use existing, pre-trained models for the task. But, my parts are unique. They’ve never been seen by these pre-trained models. So, I’ll need to fine-tune these models a bit so that they do recognize my parts. This is where things get a bit complicated…
What if I only have a few samples of parts?
What if future parts have different finishes or defects?
How can I efficiently take photos of these parts in different orientations and lighting?
How many images do I really need? Tens? Hundreds? Thousands? And will I have to manually label all of them?
What if I want to use tables, bins, or fixtures to hold the parts?
Looking at those questions, I think physical data collection will be a project in and of itself. I would almost need to build a photo studio for it to work! What's more, delays can happen if I don’t have enough parts on hand (or if I think they’ll change). This is what got me to look at synthetic data generation (SDG).
The case for synthetic data
The idea of synthetic data is not a new one. The base idea is to artificially create images (or video, text, etc.) and use those to train a neural network (NN). From now on, I’ll focus on images because that is my main problem.
With SDG, images can be created programmatically. And, because they are generated by a computer, they can presumably be produced much faster than physical photos.
Conditions, lighting, and backgrounds can all be changed quickly too. So, in theory, you can get a ton of variance as long as you have access to a computer (ideally one with a GPU), which is still much cheaper than building a studio from scratch.
There are a number of platforms/software packages that generate this type of data. Most of the popular ones use video-game-like rendering to get realistic images. So, in effect, you’re using pictures from a video game to train your neural network.
Given that many video games are now difficult to distinguish from reality, this seems like a viable option. I mean, even professional Formula 1 racers use video games to train. If they do, why can’t a neural network learn from them as well?
The alternative: data augmentation
There is an alternative approach: data augmentation, a technique that has been around since the early days of computer vision. Data augmentation introduces variance by skewing, scaling, blurring, and applying many other 2D transforms to existing images. However, the underlying scene never changes. The lighting can be tweaked, but the result rarely looks like a real change in illumination.
Now, this may be good for many of the existing object detection methods. But, for seeing every angle of a product and telling a robot where to grab, it just doesn't cut it.
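To make that limitation concrete, here is a minimal sketch of a typical 2D augmentation pipeline (assuming torchvision is installed; "part.jpg" and the specific transforms and values are just placeholders for illustration). Every transform warps or recolors pixels that already exist in one photo; the viewpoint and the scene never actually change.

```python
# Minimal 2D augmentation sketch (assumes torchvision; "part.jpg" is a placeholder).
from PIL import Image
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomAffine(degrees=15, translate=(0.1, 0.1), scale=(0.8, 1.2)),  # skew, shift, scale
    transforms.ColorJitter(brightness=0.3, contrast=0.3),                         # crude "lighting" change
    transforms.GaussianBlur(kernel_size=5),                                       # blur
    transforms.RandomHorizontalFlip(),
])

image = Image.open("part.jpg")
variants = [augment(image) for _ in range(10)]  # ten warped copies of the same photo
```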
Existing platforms for SDG
After doing a bit of research, I found that a number of platforms can generate synthetic image data. Most require a steep financial commitment or are complex to use. The most prominent free option comes from NVIDIA: Omniverse (and its robotics simulator, Isaac Sim).
I’ve done a bit of exploring in Omniverse and Isaac Sim. From what I can tell, it’s very powerful. However, the learning curve is steep. I am good at software, but I am not a software developer. So, I’ve begun exploring other avenues that fit well with my computer-aided design (CAD) background. This is what led me to look at Blender.
A look at Blender
Blender is a modeling tool used primarily by artists, designers, and 3D printing hobbyists. It has powerful rendering capabilities and a plethora of models available on the internet.
This is somewhat of an unconventional approach. But I have a background in CAD, and Blender is easy for me to learn. It’s not exactly a CAD program, but the workflow is similar. Its rendering is very good, and it can be extended with Python scripting. So, although this approach is not well documented, I think I’ll get further along much faster.
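To give a flavor of that scripting, here is a minimal sketch of the kind of automation I have in mind, using Blender's bundled Python API (bpy). It assumes a .blend file that already contains an object named "Part" and a light named "KeyLight"; those names, the loop count, and the value ranges are all placeholders for illustration.

```python
# Sketch: render the same part in random orientations and lighting (run inside Blender).
import math
import random
import bpy

scene = bpy.context.scene
part = bpy.data.objects["Part"]        # the part to photograph (placeholder name)
light = bpy.data.objects["KeyLight"]   # a light to vary (placeholder name)

for i in range(20):
    # Give the part a new random orientation for every frame.
    part.rotation_euler = (
        random.uniform(0, 2 * math.pi),
        random.uniform(0, 2 * math.pi),
        random.uniform(0, 2 * math.pi),
    )
    # Vary the light's strength to simulate different studio conditions.
    light.data.energy = random.uniform(200, 1500)

    # Render straight to disk ("//" means relative to the .blend file).
    scene.render.filepath = f"//renders/part_{i:03d}.png"
    bpy.ops.render.render(write_still=True)
```

In a real pipeline I would also move the camera and write out the part's pose alongside each image so the labels come for free, but the loop above is the core idea.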
With Blender, I can import files directly from my other CAD programs, such as Fusion 360. I can also get models from free sources like 3D printing sites and sites hosting Blender models. High-quality models and scenes can also be purchased for a reasonable cost from various designers and artists. This gives me access to high-quality models right off the bat, so I can quickly build out scenes that are realistic and varied.
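As a small illustration of that import path, here is a sketch of pulling an STL exported from Fusion 360 into Blender via Python. It assumes Blender 3.x operator names (newer Blender releases renamed some importers) and a placeholder file path; renaming the mesh to "Part" simply matches the name used in the earlier sketch.

```python
# Sketch: import a CAD export and name it to match the render script above.
import bpy

bpy.ops.import_mesh.stl(filepath="/path/to/part.stl")  # placeholder path; Blender 3.x operator
imported = bpy.context.selected_objects[0]              # the freshly imported mesh is selected
imported.name = "Part"
```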
On top of that, Blender has plenty of tools to modify and clean up models. It’s also a familiar environment: given my background, I’m far more comfortable building scenes and models visually than programmatically. So, I think this will be an interesting path to explore.
In the next post, I'll explore how Blender can be used to generate realistic training data—covering the basics of setting up a scene, rendering, and scripting for automation.