Professionals in film and TV production are very familiar with chroma keying, which consists of removing a specific color, often green, from a video and replacing it with content from other scenes. Chroma keying is used in a large number of applications, ranging from professional photography to special effects in movies and TV programs, where special screens are physically placed at the back of the studio. We have all seen a TV news presenter delivering the weather forecast in front of a virtual map of the area.
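As a rough illustration of the principle, the sketch below replaces green-dominant pixels of one frame with the corresponding pixels of another. It is a minimal example and not a production keyer: the function name `chroma_key`, the thresholds `dominance` and `min_green`, and the representation of a frame as nested lists of RGB tuples are all assumptions made for this example.

```python
def chroma_key(frame, background, dominance=1.3, min_green=80):
    """Replace green-screen pixels of `frame` with pixels from `background`.

    Both images are lists of rows, each row a list of (r, g, b) tuples
    with values in 0..255. The two thresholds are illustrative guesses.
    """
    result = []
    for frame_row, background_row in zip(frame, background):
        row = []
        for (r, g, b), replacement in zip(frame_row, background_row):
            # Treat a pixel as "screen" when green is bright enough and
            # clearly dominates both the red and the blue channel.
            is_screen = g >= min_green and g > dominance * r and g > dominance * b
            row.append(replacement if is_screen else (r, g, b))
        result.append(row)
    return result
```

A hard yes/no decision per pixel like this one is what makes studio lighting and a uniform screen so important: shadows, color spill and fine detail such as hair quickly break a simple threshold.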

More recently, with the increased reliance on video-based teleconferencing, several vendors have integrated foreground extraction tools into their systems, letting users choose among a plethora of static or dynamic backgrounds that replace their physical environment with a virtual one. The performance of such tools remains far from what chroma keying achieves in a controlled studio environment, and it often results in a poor quality of experience because of annoying artifacts: parts of the speaker in the foreground merge into the background, or vice versa. Foreground separation in an uncontrolled environment remains a challenge. If addressed well, foreground extraction techniques can open the door to a wide range of innovations in immersive applications.

The state of the art is divided into two main approaches to overcome this challenge. The first relies on more advanced foreground extraction algorithms such as those based on artificial intelligence. The second makes use of novel sensing technologies such as compound vision or depth plus texture sensors.
In the AdMiRe project, we rely on a technique based on deep neural networks to extract the foreground in the uncontrolled natural surroundings typical of consumers, with performance comparable to that obtained with the controlled environment and infrastructure of a professional studio. A key feature of the approach is that it does not require any chroma keying screen and can be deployed with a typical smartphone camera.
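Whatever method produces the foreground estimate, the final step is usually a blend of the foreground and the new background, weighted per pixel by an opacity value (often called an alpha matte). The sketch below shows only that generic blending step; it is not the AdMiRe method. It assumes the matte has already been produced elsewhere, for example by a neural network, and the function name `composite` and the nested-list image format are hypothetical.

```python
def composite(foreground, background, alpha):
    """Blend two images using a per-pixel alpha matte.

    foreground, background: lists of rows of (r, g, b) tuples, values 0..255.
    alpha: lists of rows of floats in [0, 1]; 1 means fully foreground.
    The matte is assumed to come from a separate extraction step.
    """
    result = []
    for fg_row, bg_row, alpha_row in zip(foreground, background, alpha):
        row = []
        for fg, bg, a in zip(fg_row, bg_row, alpha_row):
            # Standard compositing: a * foreground + (1 - a) * background,
            # applied independently to each color channel.
            row.append(tuple(round(a * f + (1 - a) * b) for f, b in zip(fg, bg)))
        result.append(row)
    return result
```

Fractional alpha values along the boundary of the speaker give a soft transition between foreground and background, which is where a hard per-pixel decision tends to produce visible artifacts.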

Prof. Dr. Touradj Ebrahimi / EPFL

Left image: typical consumer’s uncontrolled environment with green screen
Middle image: typical consumer’s uncontrolled environment without green screen
Right image: final result as displayed on the consumer’s smartphone