
How Apple's iPhone X TrueDepth Camera Works


Face ID, Animojis, and Portrait Lighting effects in the iPhone X are all made possible by its new TrueDepth camera…
One of the most intriguing features of Apple’s new flagship phone, the iPhone X, is Face ID. It replaces the fingerprint sensor and Touch ID with facial recognition. This is made possible by Apple’s new TrueDepth front-facing camera system. TrueDepth also enables Apple’s new Animojis and other special effects that require a 3D model of the user’s face and head. Let’s take a look at how the TrueDepth camera and sensors work.
TrueDepth starts with a traditional 7MP front-facing “selfie” camera. It adds an infrared emitter that projects over 30,000 dots in a known pattern onto the user’s face. Those dots are then photographed by a dedicated infrared camera for analysis. There is a proximity sensor, presumably so that the system knows when a user is close enough to activate. An ambient light sensor helps the system set output light levels.
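Developers never see the raw dot pattern, but the fused result of all this hardware is exposed through ARKit's face-tracking API, which ships alongside the iPhone X. A minimal sketch of subscribing to the TrueDepth-derived face mesh (rendering and error handling omitted):

```swift
import UIKit
import ARKit

class FaceTrackingViewController: UIViewController, ARSessionDelegate {
    let session = ARSession()

    override func viewDidLoad() {
        super.viewDidLoad()
        // Face tracking is only supported on devices with TrueDepth hardware.
        guard ARFaceTrackingConfiguration.isSupported else { return }
        session.delegate = self
        session.run(ARFaceTrackingConfiguration())
    }

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let faceAnchor as ARFaceAnchor in anchors {
            // faceAnchor.geometry is a live 3D mesh of the user's face,
            // reconstructed from the TrueDepth sensor data.
            print("Face mesh vertices:", faceAnchor.geometry.vertices.count)
        }
    }
}
```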
Apple also calls out a Flood Illuminator. It hasn't said explicitly what the component is for, but it would make sense that, in low light, flood-filling the scene with IR would help the system capture an image of the user's face to complement the depth map, which explains how Apple says Face ID will work in the dark. IR also does an excellent job of picking up sub-surface features of skin, which might help ensure that masks can't fool the system.
While depth estimation using two or more standard cameras gets better every year — and is enough to do some great special effects in dual-camera phones, including the Plus models of recent iPhones — it is far from perfect. In particular, when those systems are used to perform facial recognition, they have been criticized for being too easy to fool. Recently, for example, a researcher was able to fool the facial recognition system on Samsung’s new Galaxy Note 8 simply by showing it a photograph of his face displayed on a second Note 8.
Since Apple is relying on Face ID for unlocking the X and activating Apple Pay, it needs to do a lot better. It has created a more sophisticated system that uses structured light. Its depth estimation works by having an IR emitter send out 30,000 dots arranged in a regular pattern. They're invisible to people, but not to the IR camera, which reads how the pattern deforms as it reflects off surfaces at various depths.
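Apple hasn't published the math behind Face ID, but structured-light systems generally recover depth by triangulation: because the emitter sits a known distance from the IR camera, each dot's shift from its calibrated image position encodes how far away the surface is. The sketch below is purely illustrative; the function name, calibration values, and sign convention are assumptions, not anything Apple has disclosed.

```swift
import Foundation

/// Illustrative structured-light triangulation, not Apple's actual
/// algorithm or calibration. A dot projected from an emitter offset
/// by `baseline` meters from the IR camera shifts in the image in
/// proportion to the inverse depth of the surface it lands on.
func depthFromDotShift(observedX: Double,       // dot position in this frame (pixels)
                       referenceX: Double,      // dot position at the calibration plane
                       focalLength: Double,     // IR camera focal length (pixels)
                       baseline: Double,        // emitter-to-camera offset (meters)
                       referenceDepth: Double)  // calibration plane distance (meters)
                       -> Double {
    let disparity = observedX - referenceX
    // Similar triangles give 1/z = 1/z_ref + d / (f * b); the sign of the
    // disparity depends on which side of the camera the emitter sits.
    let inverseDepth = 1.0 / referenceDepth + disparity / (focalLength * baseline)
    return 1.0 / inverseDepth
}

// Example: a dot that shifted 4 pixels from its position at a 0.5 m
// calibration plane, with made-up camera parameters.
let z = depthFromDotShift(observedX: 324, referenceX: 320,
                          focalLength: 600, baseline: 0.025,
                          referenceDepth: 0.5)
print(String(format: "Estimated depth: %.3f m", z))  // ~0.441 m
```

Run that triangulation once per dot and you get a sparse depth map of 30,000 samples across the face, which the system can then densify and match against its enrolled model.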
This is the same type of system used by the original version of Microsoft's Kinect, which was widely praised for its accuracy at the time. That shouldn't be a surprise: Apple acquired PrimeSense, the company that developed the structured-light sensor for the original Kinect, in 2013. This type of system works well, but it has typically required large, power-hungry emitters and sensors, making it more suitable for the mains-powered Kinect, or for laptops, than for a battery-powered iPhone with a tiny area for sensors.
Apple appears to have delivered on the mobile-device promise Intel keeps making for its RealSense depth-aware cameras. Intel has shown them off on stage built into prototype mobile devices, but the units on the market are still too large and power hungry to find their way into a phone. While RealSense also has an IR emitter, it uses it to paint the entire scene, then relies on stereo disparity between two IR cameras to calculate depth. The result is a laptop module accurate enough to power facial recognition for Windows Hello and to do gesture recognition.
I'm sure Apple's TrueDepth camera will give Intel even more impetus to build a version of RealSense for phones. Intel's recent purchase of vision-processing chip startup Movidius will definitely help. Movidius has already been tapped by industry leaders like Google to provide low-power vision and generalized AI processing for mobile devices, and its chips could certainly replace the custom processor in the RealSense modules over time.
Getting a depth estimate for portions of a scene is only the beginning of what's required for Apple's implementation of secure facial recognition and Animojis. A mask, for example, could be used to hack a facial recognition system that relied solely on the shape of the face. So Apple is using processing power to learn and recognize 50 different facial motions that are much harder to forge. Those same motions provide the basis for making Animoji figures seem to mimic the phone's owner.
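Apple doesn't say exactly which 50 motions it tracks, but the developer-facing side of this is visible in ARKit, which reports a dictionary of roughly fifty per-expression coefficients (blendShapes) for each tracked face; whether Face ID's anti-spoofing checks use the same representation internally is not something Apple has disclosed. A brief sketch of reading a few of them; an Animoji-style renderer would feed the full set into a rigged 3D character every frame:

```swift
import ARKit

// Sample a few of ARKit's expression coefficients, each reported as a
// value from 0.0 (neutral) to 1.0 (fully expressed).
func logExpressions(for face: ARFaceAnchor) {
    let shapes = face.blendShapes
    let jawOpen   = shapes[.jawOpen]?.doubleValue ?? 0
    let smile     = shapes[.mouthSmileLeft]?.doubleValue ?? 0
    let leftBlink = shapes[.eyeBlinkLeft]?.doubleValue ?? 0
    print("jawOpen: \(jawOpen), smileLeft: \(smile), blinkLeft: \(leftBlink)")
}
```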
Early facial recognition systems got a bad name because they could be fooled with simple photographs. Even second-generation systems that added motion detection could be fooled by videos. Modern versions like Windows Hello go beyond that by building and recognizing 3D models of the user's face. They can also rely on certain properties of light and skin to ensure that whatever they are looking at is skin-like. But even 3D models can be tricked, as one researcher demonstrated by fooling Windows Hello with a plaster cast made from a material that behaves much like skin.
Given how willing Apple is to commit to using Face ID for financial transactions, I'm sure it has pushed the limits beyond either simple 3D models or 2D motion. It is likely relying on the phone's ability to recognize minute facial movements and feed them into a machine learning system on the A11 Bionic chip, adding another layer of security. That piece will also be key in helping the phone decide whether you're the same person when you put on a pair of glasses or a hat, or grow a beard, all of which Apple claims Face ID will handle.
A quick side note: those who watched Apple's keynote this week no doubt noticed that Face ID didn't work on the first try. It turns out this was not a problem with Face ID; it was a security feature. The phone had been handled by a number of other staff before the demo, who apparently had tried to test Face ID. After several unsuccessful match attempts, the phone locked itself down and required a passcode, the same way Touch ID does.
Overall, Apple says that Face ID is accurate enough that only one in a million people will have a face similar enough to fool it. That compares with one in 50,000 for Touch ID, a factor-of-20 improvement. Of more concern to many is the risk of users being coerced into looking at their phones to unlock them, whether by thieves or by law enforcement.
