T1 – Understanding the In-Camera Image Processing Pipeline for Computer Vision
T2 – Capturing 3D Deformable Models from the Real World
Computer vision technologies have yet to make a great impact in performance-critical areas such as graphics production for the entertainment industry and bio-mechanical modelling for medicine and sports. For these purposes, it is necessary to build accurate and editable 3D deformation models. Computer graphics has traditionally approached this requirement from the other side, using detailed hand-crafted models suited to the specific purpose. But data-driven deformation models, inspired partly by advances in computer vision, are becoming increasingly popular due to their greater realism. With relatively cheap consumer-grade capture technologies, robust 3D deformation models can be built from large collections ("big data") of captured 3D deformations. This is an exciting opportunity for computer vision researchers to contribute to several new real-world applications.
Robust deformable models are also a powerful tool for solving challenging computer vision problems, as they provide more accurate priors than can be obtained from the images themselves. However, knowledge of 3D surface deformation methods and 3D geometry processing is not as widespread in the computer vision community as it is in computer graphics. In this tutorial, we aim to bridge this gap.
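As a concrete illustration of the data-driven approach described above, a simple linear deformation model can be learned by running PCA on a set of registered scans. The sketch below uses synthetic stand-in data; all names and dimensions are illustrative, not taken from any particular capture system or method in the tutorial:

```python
import numpy as np

# Toy "capture data": 200 scans of a mesh with V vertices, each
# flattened to a 3V-vector. In practice these would come from
# registered 3D scans of a deforming surface.
rng = np.random.default_rng(0)
V = 50                                     # vertices per mesh
mean_shape = rng.normal(size=3 * V)        # hypothetical rest shape
basis_true = rng.normal(size=(3 * V, 2))   # two hidden deformation modes
weights = rng.normal(size=(200, 2))        # per-scan mode activations
scans = mean_shape + weights @ basis_true.T

# Build a linear (PCA) deformation model: mean plus a few principal modes.
mu = scans.mean(axis=0)
U, S, Vt = np.linalg.svd(scans - mu, full_matrices=False)
k = 2
modes = Vt[:k]                             # learned deformation basis

# A new, editable shape is synthesized by choosing k coefficients.
coeffs = np.array([1.5, -0.5])
new_shape = mu + coeffs @ modes
print(new_shape.shape)  # (150,)
```

Editing the model then reduces to adjusting a handful of coefficients rather than thousands of vertex positions, which is what makes such data-driven models attractive for production use.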
T3 – Theory and Methods of Lightfield Photography
Computational photography focuses on capturing and processing discrete representations of all the light rays in the 3D space of a scene. Whereas conventional photography captures 2D images, computational photography captures the entire 4D "lightfield," i.e., the full 4D radiance. To multiplex the 4D radiance onto conventional 2D sensors, lightfield photography demands sophisticated optics and imaging technology. At the same time, 2D image creation amounts to creating 2D projections of the 4D radiance.
This course presents lightfield analysis in a rigorous, yet accessible, mathematical way, which often leads to surprisingly direct solutions. The mathematical foundations will be used to develop computational methods for lightfield processing and image rendering, including digital refocusing and perspective viewing. While emphasizing theoretical understanding, we also explain approaches and engineering solutions to practical problems in computational photography.
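Digital refocusing admits a particularly direct formulation: shift each sub-aperture image in proportion to its aperture offset and average the results. The sketch below assumes the lightfield has already been resampled into a (U, V, S, T) array of sub-aperture images and uses integer-pixel shifts for brevity; the array layout and the `shift` parameterization are illustrative assumptions, not the course's notation:

```python
import numpy as np

def refocus(lightfield, shift):
    """Shift-and-add digital refocusing sketch.

    lightfield: array of shape (U, V, S, T) -- sub-aperture images
    indexed by aperture position (u, v). `shift` selects the synthetic
    focal plane (shift = 0 reproduces the captured focus).
    """
    U, V, S, T = lightfield.shape
    out = np.zeros((S, T))
    for u in range(U):
        for v in range(V):
            # Integer-pixel shift proportional to the aperture offset;
            # real implementations interpolate fractional shifts.
            du = int(round(shift * (u - (U - 1) / 2)))
            dv = int(round(shift * (v - (V - 1) / 2)))
            out += np.roll(lightfield[u, v], (du, dv), axis=(0, 1))
    return out / (U * V)
```

Scene points on the chosen focal plane line up under the shift and add coherently; points off that plane are blurred, which is exactly the refocusing effect.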
T4 – Higher Order Models and Inference Approaches in Computer Vision
T5 – DIY Deep Learning for Vision: a Hands-On Tutorial
This is a hands-on tutorial intended to present state-of-the-art deep learning models and equip vision researchers with the tools and know-how to incorporate deep learning into their work. Deep learning models and deep features have recently achieved strong results in classification and recognition, detection, and segmentation, but a common framework and shared models are needed to advance further work and reduce the barrier to entry.
To this end we present Caffe (Convolutional Architecture for Fast Feature Embedding), a framework that offers an open-source library, public reference models, and worked examples for deep learning in vision. Demos will be given live, and the audience will be able to follow along with the examples (provided they complete the pre-tutorial installation instructions).
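Caffe itself ships the reference models and worked examples mentioned above. As a framework-agnostic illustration of the basic operation such models stack layer upon layer, here is a minimal numpy sketch of a single convolutional feature map followed by a ReLU nonlinearity; this is a conceptual sketch, not Caffe's API:

```python
import numpy as np

def conv_feature_map(image, kernel):
    """'Valid' 2D convolution followed by ReLU -- the basic building
    block that deep learning frameworks such as Caffe compose into
    multi-layer networks. Conceptual sketch, not a framework API."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Correlate the kernel with the local image patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0.0)  # ReLU nonlinearity
```

A deep model applies many such filters per layer, with learned kernels, and feeds each layer's feature maps into the next.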
T6 – Robust Optimization Techniques in Computer Vision
T7 – Domain Adaptation and Transfer Learning
A large part of the computer vision literature focuses on obtaining impressive results on large datasets under the main assumption that training and test samples are drawn from the same distribution. However, in several applications this assumption is grossly violated. Think about using algorithms trained on clean Amazon images to annotate objects acquired with a low-resolution cellphone camera, or using an organ detection and segmentation tool trained on CT images for MRI scans. Other challenging tasks appear across object classes: given the models of a giraffe and a zebra or some of their image patches, can we use them to detect and recognize an okapi?
Despite the wide availability of principled learning methods, they have been shown to often fail to generalize across domains, preventing reliable automatic labeling and forcing a return to error-prone and time-consuming human annotation of new images. Domain adaptation and transfer learning tackle these problems by proposing methods that bridge the gap between the source training domain and different but related target test domains.
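One simple family of unsupervised adaptation methods bridges this gap by aligning low-dimensional subspaces of the source and target feature distributions. The sketch below implements a subspace-alignment baseline in numpy; the data, dimensions, and function names are illustrative, and this is one possible approach rather than the specific methods covered in the tutorial:

```python
import numpy as np

def subspace_alignment(Xs, Xt, d):
    """Unsupervised domain adaptation by subspace alignment (sketch).

    Learn d-dimensional PCA subspaces for the source (Xs) and target
    (Xt) feature matrices, then map source features through the source
    basis aligned to the target basis.
    """
    def pca_basis(X, d):
        Xc = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Vt[:d].T                    # (D, d) orthonormal columns

    Ps, Pt = pca_basis(Xs, d), pca_basis(Xt, d)
    M = Ps.T @ Pt                          # alignment transform
    Xs_aligned = Xs @ Ps @ M               # source, aligned to target
    Xt_proj = Xt @ Pt                      # target, projected
    return Xs_aligned, Xt_proj
```

A classifier trained on the aligned source features can then be applied to the projected target features, e.g., Amazon-trained features aligned toward the cellphone-camera domain.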
T8 – 3D Scene Understanding
What does it mean to understand an image? The bounding-box or segment-level understanding produced by many current computer vision systems tells us little about where objects are located in 3D and how agents like humans could interact with them. However, recent work has focused on obtaining a complementary, geometric understanding of the scene in terms of the 3D volumes and surfaces that compose it and their interactions. This representation enables reasoning about objects as they exist in a 3D world, rather than simply in the image plane, and has been demonstrated to have myriad applications for object detection, human-centric understanding, and graphics. Additionally, recent dataset collection efforts with depth cameras have made large-scale learning of these geometric representations possible and have opened up exciting avenues for research on large-scale learning with RGB-D datasets.
The tutorial organizers will summarize the state of the art in 3D scene understanding in this half-day tutorial. Participants will learn the fundamentals of 3D scene understanding, with the aim of enabling its application to traditional 2D image tasks as well as research on the topic itself.
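A basic building block of the RGB-D pipelines mentioned above is back-projecting a depth image into a 3D point cloud using pinhole camera intrinsics, so that reasoning can happen in the 3D world rather than the image plane. A minimal sketch, with illustrative (not sensor-specific) intrinsics:

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image into a 3D point cloud with the
    pinhole model. fx, fy are focal lengths in pixels; (cx, cy) is
    the principal point. Values here are placeholders, not the
    calibration of any particular depth camera."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    z = depth
    x = (u - cx) * z / fx   # inverse of the pinhole projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```

Once in this form, the point cloud supports the volumetric and surface-level reasoning the tutorial describes, e.g., fitting planes or estimating free space for agent interaction.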