This week, Nvidia will present a paper detailing a newly developed neural network at the Conference on Neural Information Processing Systems (NeurIPS). The firm’s researchers will explain how the system can create three-dimensional (3D) models from two-dimensional (2D) images. If it performs as described, the company’s differentiable interpolation-based renderer (DIB-R) could change a host of disciplines that involve computer imaging.
Understanding DIB-R
Nvidia’s developers created DIB-R by inverting a typical computer graphics workflow. Ordinarily, designers build pipelines that render 3D models onto 2D screens. The firm’s researchers wanted to find out what could be achieved by running that process in reverse.
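To make the forward direction concrete, here is a minimal sketch of the step a graphics pipeline performs when it places 3D points on a 2D screen. It assumes a simple pinhole camera at the origin looking down the negative z-axis. The function name `project_point` and the `focal_length` parameter are illustrative and do not come from Nvidia’s paper.

```python
def project_point(point, focal_length=1.0):
    """Project a 3D point (x, y, z) onto a 2D image plane.

    Hypothetical pinhole camera: it sits at the origin and looks down
    the -z axis, so visible points have z < 0.
    """
    x, y, z = point
    if z >= 0:
        raise ValueError("point must be in front of the camera (z < 0)")
    depth = -z
    # Perspective divide: farther points land closer to the image center.
    return (focal_length * x / depth, focal_length * y / depth)


# Project the three corners of a triangle in 3D space onto the screen.
triangle_3d = [(0.0, 1.0, -2.0), (-1.0, -1.0, -4.0), (1.0, -1.0, -4.0)]
triangle_2d = [project_point(p) for p in triangle_3d]
print(triangle_2d)

# Two different 3D points can share one 2D position: depth is lost.
print(project_point((1.0, 1.0, -2.0)) == project_point((2.0, 2.0, -4.0)))
```

The last line shows why running the process backward is hard: projection throws away depth, so a single image is consistent with many 3D shapes, and an inverse model has to learn which of them is plausible.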
In particular, Nvidia’s data scientists sought to build a model that could improve object tracking by inferring a 3D object from a 2D input. Accordingly, the firm’s researchers built a neural network that transforms an input image into a vector, which it then uses to predict the object’s shape, color, texture, and lighting.
The chipmaker’s researchers then trained their AI to deform a polygon sphere until it matched the shape shown in a 2D image. The firm’s developers trained the model on large datasets of single objects.
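The idea of molding a sphere to fit a 2D target can be illustrated with a toy example. The sketch below is not Nvidia’s method: DIB-R uses a neural network and a differentiable rasterizer that compares rendered pixels, whereas this version assumes each vertex’s 2D target position is already known, uses an orthographic camera that simply drops the z coordinate, and applies a hand-derived gradient. All names (`sphere_directions`, `ellipsoid_radius`, `fit_radii`) are hypothetical.

```python
import math


def sphere_directions(n_lat=6, n_lon=8):
    """Unit direction vectors spread over a sphere (poles excluded)."""
    dirs = []
    for i in range(1, n_lat):
        theta = math.pi * i / n_lat              # polar angle, 0 < theta < pi
        for j in range(n_lon):
            phi = 2.0 * math.pi * j / n_lon      # azimuth
            dirs.append((math.sin(theta) * math.cos(phi),
                         math.sin(theta) * math.sin(phi),
                         math.cos(theta)))
    return dirs


def ellipsoid_radius(d, axes):
    """Distance from the center to an ellipsoid's surface along direction d."""
    a, b, c = axes
    return 1.0 / math.sqrt((d[0] / a) ** 2 + (d[1] / b) ** 2 + (d[2] / c) ** 2)


def fit_radii(dirs, target_2d, steps=200, lr=0.4):
    """Gradient descent on per-vertex radii so the projection matches target_2d."""
    radii = [1.0] * len(dirs)                    # start from a unit sphere
    for _ in range(steps):
        for i, (d, t) in enumerate(zip(dirs, target_2d)):
            # Orthographic projection: keep x and y, drop z.
            px, py = radii[i] * d[0], radii[i] * d[1]
            # Derivative of the squared 2D error with respect to radii[i].
            grad = 2.0 * ((px - t[0]) * d[0] + (py - t[1]) * d[1])
            radii[i] -= lr * grad
    return radii


# Assumed target: the 2D projection of an ellipsoid with these semi-axes.
axes = (2.0, 1.0, 0.5)
dirs = sphere_directions()
true_radii = [ellipsoid_radius(d, axes) for d in dirs]
target_2d = [(r * d[0], r * d[1]) for r, d in zip(true_radii, dirs)]

fitted = fit_radii(dirs, target_2d)
max_error = max(abs(f - r) for f, r in zip(fitted, true_radii))
print("largest radius error:", max_error)
```

Each vertex moves only along its own direction from the center, which keeps the optimization to one number per vertex. A real system has no per-vertex targets to compare against, which is why a renderer that can pass gradients from pixels back to the mesh matters.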
For instance, Nvidia’s team spent two days training their program to extrapolate the contours, textures, and colors of birds using photos taken from a host of different angles. Afterward, the V100 GPU-powered DIB-R could render lifelike 3D bird models in less than a second. Because of its high-quality results and remarkable processing speed, the program has potential applications in many different fields.
A Wide Array of Potential Applications
In a blog post, Nvidia indicated that DIB-R could be used to improve how autonomous robots function. Specifically, the company’s researchers said it could help refine an AI’s depth perception.
As such, the program could be used in two industries that are working to bring self-directed systems into their operations. First, the DIB-R could optimize corporate logistics by teaching picking robots to identify and retrieve items in a warehouse. It could also help machines navigate environments filled with merchandise, equipment, other robots, and human workers more efficiently and safely.
Second, Nvidia’s model could help train self-driving systems to differentiate between the objects that occupy a freeway or busy intersection. Indeed, even high-grade autonomous vehicle programs have struggled to detect jaywalkers and to distinguish open sky from all-white tractor-trailer paneling.
With more research, it’s conceivable that DIB-R could help automakers and tech firms build fully autonomous driving programs.
Beyond robotics and transportation, Nvidia’s innovation could also have an impact on the construction of video game environments. Theoretically, the model could help developers speed up the process of creating richly textured and immersive worlds.
As an example, when Google debuted Stadia at GDC 2019, the firm said the cloud platform would allow for a new style of game design. Its engineers noted the system would let developers shape a level’s look by supplying a single image. Stadia’s machine learning tools would then model the entire environment around that graphical input.
When Google demonstrated the feature at the conference, it looked intriguing but unrefined. Nvidia’s DIB-R could help the tech giant’s environmental tool produce a more seamless final product.
Nvidia’s DIB-R paper is titled “Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer.” The firm also provides information about the neural network through its Kaolin PyTorch library.