We learn a latent space for easy capture, semantic editing, consistent interpolation, and efficient reproduction of visual material appearance. When users provide a photo of a stationary natural material captured under flash light illumination, it is converted in milliseconds into a latent material code. In a second step, conditioned on the material code, our method, again in milliseconds, produces an infinite and diverse spatial field of BRDF model parameters (diffuse albedo, specular albedo, roughness, normals) that allows rendering in complex scenes and illuminations, matching the appearance of the input picture. Technically, we jointly embed all flash images into a latent space using a convolutional encoder, and – conditioned on these latent codes – convert random spatial fields into fields of BRDF parameters using a convolutional neural network (CNN). We condition these BRDF parameters to match the visual characteristics (statistics and spectra of visual features) of the input under matching light. A user study confirms that the semantics of the latent material space agree with user expectations and compares our approach favorably to previous work.
BRDF space. From a flash image, which contains sparse observations across material, space, and view/light (left), we map to a latent code (middle) so that changes in that code can be decoded to enable (right) material synthesis (holding material fixed and moving spatially), material morphing (holding space and view/light fixed and changing material), or classical shading and material generation (points in the latent space).
Starting from an exemplar (top-left), training encodes the image to a compact latent space variable $z$. Additionally, a crop with the same spatial dimensions as the flash input image is taken from a random infinite field. The noise crop is then reshaped by a convolutional U-Net architecture. Each convolution in the network is followed by an Adaptive Instance Normalization (AdaIN) layer reshaping the statistics (mean $\mu$ and standard deviation $\sigma$) of features. A learned affine transformation per layer maps $z$ to the desired $\mu$’s and $\sigma$’s. The outputs of the network are the diffuse, specular, roughness, and normal parameters of an svBRDF that, when rendered (using a flash light), looks the same as the input.
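The conditioning step described above can be illustrated with a minimal NumPy sketch. It is not the actual implementation: the layer sizes, the random weights, and the function names `affine_from_latent` and `adain` are assumptions made for illustration only. It shows one affine map from $z$ to per-channel $\mu$ and $\sigma$, followed by AdaIN applied to a single feature map.

```python
import numpy as np

def affine_from_latent(z, weight, bias):
    """Map a latent code z (D,) to per-channel (mu, sigma), each of shape (C,).

    weight has shape (2C, D) and bias has shape (2C,); both would be learned.
    """
    stats = weight @ z + bias
    mu, sigma = np.split(stats, 2)
    return mu, sigma

def adain(features, mu, sigma, eps=1e-5):
    """Adaptive Instance Normalization for features of shape (C, H, W).

    Each channel is normalized to zero mean and unit standard deviation,
    then rescaled by sigma and shifted by mu.
    """
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - mean) / (std + eps)
    return sigma[:, None, None] * normalized + mu[:, None, None]

# Hypothetical sizes and random stand-ins for learned parameters.
rng = np.random.default_rng(0)
latent_dim, channels = 8, 4
z = rng.normal(size=latent_dim)
weight = rng.normal(size=(2 * channels, latent_dim))
bias = rng.normal(size=2 * channels)
features = rng.normal(size=(channels, 16, 16))

mu, sigma = affine_from_latent(z, weight, bias)
out = adain(features, mu, sigma)
```

After this step, each output channel has mean `mu[c]` and standard deviation close to `abs(sigma[c])`, which is how a single code $z$ can steer the statistics of every layer.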
The visual quality is best inspected in our interactive WebGL demo. It allows exploring the space by relighting, changing the random seed, and visualizing individual BRDF model channels and their combinations. The same package contains all channels of all materials as images. See the accompanying video for a demonstration of our interactive interface.
For further results, please have a look here.
Our learned BRDF space can be sampled at any query (x,y) location without producing visible repetition artifacts. The network architecture, by construction, does not require any special boundary alignment to avoid tiling artifacts. All results are sampled from the learned BRDF space.
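One way to picture sampling at arbitrary (x,y) locations is a noise field whose value depends only on the integer coordinates and a seed, so that any crop can be generated on demand and overlapping crops agree. The sketch below is an assumption for illustration, not the method used here: the function `noise_crop` and the hash-based construction (a 64-bit integer mixer) are hypothetical stand-ins for the random infinite field that feeds the network.

```python
import numpy as np

MASK64 = 0xFFFFFFFFFFFFFFFF

def _mix(h):
    """64-bit integer mixing step applied elementwise to a uint64 array."""
    h = h ^ (h >> np.uint64(33))
    h = h * np.uint64(0xFF51AFD7ED558CCD)  # wraps modulo 2**64
    h = h ^ (h >> np.uint64(33))
    h = h * np.uint64(0xC4CEB9FE1A85EC53)
    h = h ^ (h >> np.uint64(33))
    return h

def noise_crop(x0, y0, height, width, seed=0):
    """Return a (height, width) crop of an unbounded noise field in [0, 1).

    The value at (x, y) depends only on x, y, and seed, so crops taken at
    different offsets are consistent wherever they overlap.
    """
    ys, xs = np.meshgrid(
        np.arange(y0, y0 + height, dtype=np.int64),
        np.arange(x0, x0 + width, dtype=np.int64),
        indexing="ij",
    )
    seed_u = np.uint64(((seed + 1) * 0x9E3779B97F4A7C15) & MASK64)
    h = _mix(xs.astype(np.uint64) + seed_u)
    h = _mix(h ^ ys.astype(np.uint64))
    # Keep the top 53 bits so the conversion to float64 is exact.
    return (h >> np.uint64(11)).astype(np.float64) / float(2**53)

a = noise_crop(0, 0, 8, 8, seed=0)
b = noise_crop(4, 4, 8, 8, seed=0)
c = noise_crop(0, 0, 8, 8, seed=1)
```

Here `a[4:, 4:]` and `b[:4, :4]` cover the same coordinates and hold identical values, while changing the seed (as in `c`) yields a different field.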
As any point in our space is a material, we can simply sample randomly to produce a coverage of all materials available in the space. The figure above shows random samples from this space applied to a set of 3D cubes. Note that no two materials are alike and all look plausible, with spatially varying appearance.
More generated BRDF maps can be found here.
In each row, a left and a right latent code $\mathbf z_1$ and $\mathbf z_2$ are obtained by encoding two flash images, respectively. The intermediate, continuous field of BRDF parameters is computed by interpolating, in the learned BRDF space, from $\mathbf z_1$ to $\mathbf z_2$ and conditioning the decoder CNN with the intermediate code. The result is lit with a fronto-parallel light source to demonstrate the changes in appearance. For comparison, the last row shows linear interpolation in image space; compare it against the second-to-last row, which shows latent space interpolation.
More interpolated results can be found here.
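The intermediate codes can be sketched as a simple blend between the two encoded codes. The sketch assumes linear interpolation in latent space, which the text does not state explicitly, and the helper name `interpolate_latents` and the latent dimension are hypothetical; each resulting code would then be passed to the decoder CNN.

```python
import numpy as np

def interpolate_latents(z1, z2, steps):
    """Return `steps` codes blending from z1 to z2, endpoints included."""
    ts = np.linspace(0.0, 1.0, steps)
    return np.stack([(1.0 - t) * z1 + t * z2 for t in ts])

# Random stand-ins for two encoded flash images (hypothetical dimension 8).
rng = np.random.default_rng(0)
z1 = rng.normal(size=8)
z2 = rng.normal(size=8)
codes = interpolate_latents(z1, z2, steps=5)
```

The first and last rows of `codes` equal `z1` and `z2`, and the middle row is their average.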
After a flash exemplar (first column) has been embedded into our learned BRDF space (second column), its latent material code can be manipulated such that certain semantic attributes are enhanced or suppressed (last three columns). Note that the semantic change maximizes the attribute while retaining properties not in conflict. A move towards fabric in the first and fourth examples makes both become more like fabric (e.g. less shiny) but retains color. A move towards rubber imposes some structure on the normal map, but is compatible with both shiny and rough.
For more semantic manipulation examples, please click here.