Dr. Yuval Bahat – STAR Sensing and Sensibility

Outgoing fellow, affiliation:

08/2021-07/2023: Computational Imaging Lab, Princeton University, USA;
2023-2024: Chair for Computergraphics, Chair for Media Systems, Chair for Computer Vision and Chair Technical Thermodynamics

Ambiguity-Aware System Design for Computer Vision Problems.

Dr. Yuval Bahat

Ambiguity is inherent to most computer vision problems. It arises in image reconstruction tasks, where the output of a visual sensor (e.g., a low-resolution image) may correspond to infinitely many different visual reconstructions (e.g., high-resolution images), as well as in high-level tasks, where low-level data (e.g., a high-resolution image of a scene) often has multiple valid interpretations (e.g., sentences describing the scene). We argue that accounting for this ambiguity is crucial when using computer vision algorithms for any practical purpose, and even more so in systems operating in sensitive domains such as health, forensics and transportation. Consider, for instance, a system from the medical domain that enhances the quality of CT images. A system that presents only a single enhanced output to a radiologist assessing whether a suspected tumor is malignant is problematic, regardless of how perceptually pleasing that output is. To be more useful, the system should instead allow the radiologist to explore the different possible appearances, corresponding to the different possible diagnoses (e.g., malignant vs. benign), that are consistent with the data recorded by the CT scanner.
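The one-to-many nature of reconstruction can be made concrete with a toy example. The sketch below is only an illustration and not the method used in the research described here: it assumes a deliberately simple sensor model (2x average pooling), and the helper names `downsample` and `make_consistent` are hypothetical. It builds two clearly different high-resolution images that both reproduce exactly the same low-resolution observation.

```python
import numpy as np

def downsample(hr, factor=2):
    """Average-pool a 2-D image by `factor` (an assumed, simplified sensor model)."""
    h, w = hr.shape
    return hr.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def make_consistent(candidate, lr, factor=2):
    """Shift each factor-by-factor block of `candidate` so that its mean
    equals the corresponding low-resolution pixel in `lr`."""
    residual = lr - downsample(candidate, factor)
    return candidate + np.kron(residual, np.ones((factor, factor)))

rng = np.random.default_rng(0)
lr = rng.random((4, 4))  # the recorded low-resolution image

# Two unrelated high-resolution candidates, each adjusted to agree with `lr`.
hr_a = make_consistent(rng.random((8, 8)), lr)
hr_b = make_consistent(rng.random((8, 8)), lr)

# Both candidates explain the observation equally well, yet they differ.
print(np.allclose(downsample(hr_a), lr), np.allclose(downsample(hr_b), lr))
print(np.abs(hr_a - hr_b).max())
```

Under this assumed model, any image whose blocks have the right averages is an equally valid explanation of the measurement, which is why presenting a single output hides the remaining alternatives.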

Recently, we have been conducting research on ambiguity in low-level vision tasks, which sit earlier in the “algorithmic pipeline” and involve capturing and initially processing the visual data, either in preparation for higher-level analysis or as a standalone task (e.g., image enhancement). We initiated a study of explorable image restoration and proposed frameworks that enable “explorable” super-resolution, as well as “explorable” decoding of compressed images (e.g., JPEG).
Recognizing the importance of this new ability to explore the space of solutions, we are excited to extend the paradigm to a variety of tasks that play a key role in systems affecting many aspects of modern life. We believe that introducing user exploration and guidance mechanisms into a wide range of tasks (from medicine to forensics) can have a significant positive impact, as it has the potential to close the gap between the state-of-the-art capabilities celebrated in the academic computer vision community and the ability to apply these capabilities to practical problems in the real world.