Learning with Cross-Task Consistency. The upper and lower rows show the results of consistency-based learning and the baseline (individual learning). The consistency-based model yields higher-quality (especially at hard-to-predict fine-grained details) and more consistent predictions.
3) How can we design a learning system that makes consistent predictions?
This paper proposes a fully computational method that, given an arbitrary dictionary of desired tasks to solve, augments the learning objective in such a way that the predictions are explicitly encouraged to be cross-task consistent. The consistency constraints are completely learned from the data, rather than from human supervision or known analytic relationships (e.g., it is not necessary to encode that surface normals are the 3D derivative of depth, though such derivations could be used if available). Please see the figure below.
Enforcing cross-task consistency. (a) shows the typical multitask setup, where predictors between domains are trained independently (or with a shared encoder and separate decoder heads). Neither variant explicitly enforces cross-task consistency among predictions, nor achieves it empirically. (b) demonstrates the consistency constraint over three domains. (c) shows how the triangle unit from (b) can be an element of a larger system of domains. Finally, (d) illustrates the generalized case: in such a larger system, consistency can be enforced along paths of arbitrary length, as long as they share the same start and end domains. When two different paths with the same endpoints are constrained to yield similar results, it implies that none of the intermediate domains introduced conflicting information, as far as the prediction from the input endpoint to the output endpoint is concerned. This is the general concept behind path independence, or conservativeness. The triangle in (b) is the smallest unit of such paths.
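As a concrete sketch of the triangle unit in (b): a minimal consistency objective combines the usual supervised term with a term penalizing disagreement between the two paths from domain X to domain Z. The predictor names (`f_xy`, `f_yz`, `f_xz`) are illustrative placeholders, not identifiers from the released code, and the frozen cross-task networks are stand-ins for pretrained mappings:

```python
import numpy as np

def triangle_consistency_loss(x, y, f_xy, f_yz, f_xz):
    """Triangle consistency objective (sketch, hypothetical names).

    f_xy: predictor being trained (domain X -> Y)
    f_yz: frozen cross-task network (Y -> Z)
    f_xz: frozen direct network (X -> Z)
    """
    y_hat = f_xy(x)
    direct = np.mean((y_hat - y) ** 2)                    # standard supervised term
    consistency = np.mean((f_yz(y_hat) - f_xz(x)) ** 2)   # path X->Y->Z vs. X->Z
    return direct + consistency
```

If the two paths agree everywhere, the consistency term vanishes; any disagreement between X→Y→Z and X→Z is penalized, which is what "path independence" asks for.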
Enforcing cross-task consistency using perceptual losses illustrated with an example.
Top row shows the results of standard training (MSE loss). When the predicted normals, after training has converged, are projected onto the other domains, various inaccuracies become apparent.
Middle row shows the results of training with consistency against the other domains enforced. The results are notably improved, especially in hard-to-predict fine-grained details.
Bottom row shows the ground truth. (The convention for visualizing normals: redder means the surface points more to the right, greener more downward, bluer more toward the camera.)
We quantify the amount of inconsistency in a prediction made for a query using an energy-based quantity called Consistency Energy. It is defined as the sum of squared pairwise inconsistencies, which is equivalent (up to a constant factor) to the sample variance of the predictions. The consistency energy is an intrinsic quantity of the system: it requires no ground truth and can be computed without any supervision. As shown below, it correlates strongly with estimation error and is informative about whether a sample comes from the training distribution.
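For intuition, the energy over k predictions of the same target (each reached via a different path) can be sketched as follows. The function name is illustrative, not from the released code; the variance equivalence noted in the docstring holds up to a constant factor:

```python
import numpy as np

def consistency_energy(preds):
    """Sum of squared pairwise inconsistencies among k predictions.

    preds has shape (k, ...): k predictions of the same target.
    Counting each unordered pair once, the result equals k times the
    summed squared deviation from the mean prediction, i.e. it is the
    sample variance up to a constant factor.
    """
    diffs = preds[:, None] - preds[None, :]   # all ordered pairs, via broadcasting
    return float(np.sum(diffs ** 2) / 2)      # each unordered pair counted once
```

No labels appear anywhere in this computation, which is why the energy can be evaluated on arbitrary unlabeled queries.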
Analysis of consistency energy. Left: The consistency energy over the course of training, demonstrating a successful optimization toward more consistent results, ending significantly lower than that of the independent-learning and multitask baselines. Middle: Energy is predictive of estimation error, with a Pearson correlation coefficient of 0.67. Right: Out-of-domain samples (blue, red) have an energy distribution significantly higher than in-distribution samples (grey), hence energy can be used as a strong unsupervised method for detecting domain shifts in the data (AUC=0.99).
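The AUC in the right panel can be computed in spirit with a rank-based statistic: given energies for in-distribution and shifted samples, the AUC of the rule "higher energy means out-of-domain" is the Mann-Whitney statistic. This is a generic sketch (the helper name and inputs are illustrative, not from the paper's code):

```python
import numpy as np

def energy_auc(in_dist_energy, ood_energy):
    """AUC of flagging high-energy samples as out-of-domain.

    Equals the fraction of (in-distribution, out-of-domain) pairs in
    which the out-of-domain sample has strictly higher energy, with
    ties counted as half (the Mann-Whitney U statistic, normalized).
    """
    in_e = np.asarray(in_dist_energy, dtype=float)
    ood_e = np.asarray(ood_energy, dtype=float)
    greater = (ood_e[:, None] > in_e[None, :]).mean()
    ties = (ood_e[:, None] == in_e[None, :]).mean()
    return float(greater + 0.5 * ties)
```

An AUC near 1.0, as reported above, means the two energy distributions are almost perfectly separable by a single threshold.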
In order to convey a more tangible understanding of consistency-based learning versus various baselines, we ran the models frame-by-frame on a YouTube video. Visit the visualization page to specify the comparison configuration of your choice and analyze the performance. You can compare the results to several baselines.
Robust Learning Through Cross-Task Consistency.
Amir Zamir*, Alexander Sax*, Teresa Yeo, Oğuzhan Kar, Nikhil Cheerla, Rohan Suri, Zhangjie Cao, Jitendra Malik, Leonidas Guibas.
CVPR 2020 [oral]