Scientists Develop a Novel Multitask Learning Method for Surface Reconstruction and Image Synthesis



Surface reconstruction and photorealistic rendering of water have long been treated as two separate and difficult problems in computer graphics and computer vision. Traditional methods address each one separately using physical models. In fact, the two are inverse problems: one maps water images to surface geometry, and the other maps surface geometry back to images. Reconstructing a 3D water surface from a single image is a well-known ill-posed problem, and the computer vision community needs an efficient and stable algorithm for it.

Recently, Assoc. Prof. HOU Fei from the State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, and his collaborators from Beihang University and Stony Brook University proposed a joint deep learning architecture that exploits this inverse relationship to handle water surface reconstruction and water rendering together efficiently.


The researchers designed an end-to-end cyclic multitask learning framework whose two ends are RGB water images and surface normal maps. The model includes a forward surface estimation net and a backward surface rendering net, similar to work on cycle image-to-image translation with encoder-transformation-decoder structures. To reuse the lighting conditions of the source images during rendering, they added a subnetwork that encodes the imaging parameters in the forward part and applies those parameters in the backward part. The inverse relationship between the two parts is exploited so that each speeds up the learning of the other, which improves overall robustness.
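The data flow of such a cycle can be illustrated with a small toy sketch. It is not the researchers' network: the functions estimate and render below are hypothetical, hand-written stand-ins for the learned estimation and rendering nets, the "image" is a flat list of grayscale values, the "normal map" keeps only the z-component per pixel, and the single lighting parameter assumes an overhead light with Lambertian shading. The sketch only shows how the lighting estimated in the forward pass is reused in the backward pass, and how the two cycle losses are formed.

```python
from typing import List, Tuple

Image = List[float]      # toy grayscale image, flattened
NormalMap = List[float]  # toy normal map: z-component of the normal per pixel


def estimate(image: Image) -> Tuple[NormalMap, float]:
    """Toy stand-in for the forward net: returns (normals, lighting parameter)."""
    # Assumption: the brightest pixel faces the light head-on (n_z = 1),
    # so its intensity equals the light intensity.
    light = max(image)
    if light <= 0.0:
        return [0.0 for _ in image], 0.0
    normals = [pixel / light for pixel in image]
    return normals, light


def render(normals: NormalMap, light: float) -> Image:
    """Toy stand-in for the backward net: Lambertian shading, light overhead."""
    return [light * max(n_z, 0.0) for n_z in normals]


def l1_loss(a: List[float], b: List[float]) -> float:
    """Mean absolute difference between two equally sized lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_losses(image: Image, normals: NormalMap, light: float) -> Tuple[float, float]:
    """Return (image cycle loss, normal cycle loss)."""
    # Image -> normals + lighting -> image, reusing the estimated lighting.
    est_normals, est_light = estimate(image)
    image_cycle = l1_loss(render(est_normals, est_light), image)

    # Normals -> image -> normals.
    rendered = render(normals, light)
    rec_normals, _ = estimate(rendered)
    normal_cycle = l1_loss(rec_normals, normals)
    return image_cycle, normal_cycle
```

Because the two toy functions are exact inverses of each other under the stated assumptions, both cycle losses are zero up to rounding error; with learned networks, these losses would instead be minimized during training.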


They further improved performance with additional techniques, such as adding a universality loss to make the estimated parameters independent of geometry and applying a conditional GAN architecture to measure high-level consistency.
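The article does not give the formula for the universality loss, so the following is only one plausible reading, stated as an assumption: if several images show different surface geometries under the same lighting, the parameter vectors estimated from them should agree, and their spread can be penalized. The function name universality_loss and the variance-based form are hypothetical; the published formulation may differ.

```python
from typing import Sequence


def universality_loss(params: Sequence[Sequence[float]]) -> float:
    """Mean per-dimension variance of parameter vectors.

    Assumption: each entry of params was estimated from an image with a
    different geometry but the same lighting, so all entries should match.
    """
    n = len(params)
    dims = len(params[0])
    total = 0.0
    for d in range(dims):
        column = [p[d] for p in params]
        mean = sum(column) / n
        # Population variance of this parameter dimension across images.
        total += sum((v - mean) ** 2 for v in column) / n
    return total / dims
```

Under this reading, the loss is zero when every image yields the same parameters and grows as the estimates drift apart with geometry.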


Extensive experimental results show that the method is both visually plausible and computationally efficient, which makes it possible to estimate fluid geometries and edit fluid images and videos in real time. The work facilitates real-time modeling and rendering of water surfaces and brings them into a unified framework. It promotes the practical application of water rendering and modeling and could be applied in computer games, film production and simulation.

This work was presented at the 32nd International Conference on Computer Animation and Social Agents and published in Computer Animation and Virtual Worlds under the title "Multitask learning on monocular water images: Surface reconstruction and image synthesis".


Figure: Network architecture. Our network includes a forward estimating net E and a backward rendering net R, both based on the encoder–transformation–decoder structure. The estimating net E also includes a Sub-Net 2 for extracting imaging parameters to reuse the lighting conditions in the backward rendering pass of net R.


Image by HOU Fei