5 ways Google Pixel 3 camera pushes the boundaries of computational photography
With the launch of the Google Pixel 3, smartphone cameras have taken yet another leap in capability. I had the opportunity to sit down with Isaac Reynolds, Product Manager for Camera on Pixel, and Marc Levoy, Distinguished Engineer and Computational Photography Lead at Google, to learn more about the technology behind the new camera in the Pixel 3.
One of the first things you might notice about the Pixel 3 is the single rear camera. At a time when we're seeing companies add dual, triple, even quad-camera setups, one main camera seems at first an odd choice.
But after speaking to Marc and Isaac I think that the Pixel camera team is taking the correct approach – at least for now. Any technology that makes a single camera better will make multiple cameras in future models that much better, and we've seen in the past that a single camera approach can outperform a dual camera approach in Portrait Mode, particularly when the telephoto camera module has a smaller sensor and slower lens, or lacks reliable autofocus.
Let's take a closer look at some of the Pixel 3's core technologies.
1. Super Res Zoom
Last year the Pixel 2 showed us what was possible with burst photography. HDR+ was its secret sauce, and it worked by constantly buffering nine frames in memory. When you press the shutter, the camera essentially goes back in time to those last nine frames [1], breaks each of them up into thousands of 'tiles', aligns them all, and then averages them.
Breaking each image into small tiles allows for advanced alignment even when the photographer or subject introduces movement. Blurred elements in some shots can be discarded, or subjects that have moved from frame to frame can be realigned. Averaging simulates the effects of shooting with a larger sensor by 'evening out' noise. And going back in time to the last 9 frames captured right before you hit the shutter button means there's zero shutter lag.
|As on the Pixel 2, HDR+ allows the Pixel 3 to render sharp, low noise images even in high contrast situations. Photo: Google|
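The noise benefit of averaging can be illustrated with a minimal sketch. This is not Google's pipeline: it assumes the frames are already aligned, models noise as Gaussian rather than true photon shot noise, and uses hypothetical helper names (`noisy_frame`, `average_frames`). It only shows that averaging nine frames of a flat grey patch cuts noise by roughly the square root of nine.

```python
import random
import statistics

def noisy_frame(signal, sigma, n_pixels, rng):
    """Simulate one frame of a flat patch: true signal plus Gaussian noise."""
    return [signal + rng.gauss(0.0, sigma) for _ in range(n_pixels)]

def average_frames(frames):
    """Average already-aligned frames pixel by pixel."""
    return [sum(px) / len(frames) for px in zip(*frames)]

rng = random.Random(0)
# Nine frames, as in the Pixel 2 buffer described above.
frames = [noisy_frame(100.0, 10.0, 20000, rng) for _ in range(9)]

single_noise = statistics.pstdev(frames[0])
merged_noise = statistics.pstdev(average_frames(frames))

print(f"single frame noise:    {single_noise:.2f}")
print(f"9-frame average noise: {merged_noise:.2f}")
print(f"improvement:           {single_noise / merged_noise:.2f}x")
```

The improvement should land close to 3x, which is the same gain you would expect from a sensor that collects nine times as much light in a single exposure.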
This year, the Pixel 3 pushes all this further. It uses HDR+ burst photography to buffer up to 15 images [2], and then employs super-resolution techniques to increase the resolution of the image beyond what the sensor and lens combination would traditionally achieve [3]. Subtle shifts from handheld shake and optical image stabilization (OIS) allow scene detail to be localized with sub-pixel precision, since shifts are unlikely to be exact multiples of a pixel.
In fact, I was told the shifts are carefully controlled by the optical image stabilization system. "We can demonstrate the way the optical image stabilization moves very slightly" remarked Marc Levoy. Precise sub-pixel shifts are not necessary at the sensor level though; instead, OIS is used to uniformly distribute a bunch of scene samples across a pixel, and then the images are aligned to sub-pixel precision in software.
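To make the idea of spreading samples across a pixel concrete, here is a simplified one-dimensional sketch. It assumes the per-frame shifts are already known (the real system estimates them in software), quantizes them to a fine grid, and uses hypothetical names (`capture`, `merge`, `factor`). It is an illustration of the principle, not the Super Res Zoom algorithm.

```python
import random

def capture(scene, shift, factor):
    """Point-sample a fine-grid 1D scene once per sensor pixel.

    `shift` is the frame's offset in fine-grid steps, so it is a
    sub-pixel shift whenever it is not a multiple of `factor`.
    """
    return [scene[i] for i in range(shift, len(scene), factor)]

def merge(frames, shifts, factor, length):
    """Put every sample back at its known position on the fine grid."""
    total = [0.0] * length
    count = [0] * length
    for frame, shift in zip(frames, shifts):
        for k, value in enumerate(frame):
            pos = shift + k * factor
            total[pos] += value
            count[pos] += 1
    # None marks fine-grid positions that no frame happened to sample.
    return [t / c if c else None for t, c in zip(total, count)]

factor = 4  # fine-grid steps per sensor pixel (an assumption)
scene = [float((i * 7) % 13) for i in range(64)]

rng = random.Random(1)
shifts = [rng.randrange(factor) for _ in range(15)]  # stand-in for hand shake
frames = [capture(scene, s, factor) for s in shifts]

result = merge(frames, shifts, factor, len(scene))
covered = sum(v is not None for v in result)
print(f"{covered} of {len(scene)} fine-grid positions recovered")
```

A single frame only ever fills one in four fine-grid positions here. The more distinct sub-pixel shifts the burst contains, the more of the finer grid gets filled with real measurements rather than interpolated values.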
But Google – and Peyman Milanfar's research team working on this particular feature – didn't stop there. "We get a red, green, and blue filter behind every pixel just because of the way we shake the lens, so there's no more need to demosaic" explains Marc. If you have enough samples, you can expect any scene element to have fallen on a red, green, and blue pixel. After alignment, then, you have R, G, and B information for any given scene element, which removes the need to demosaic. That itself leads to an increase in resolution (since you don't have to interpolate spatial data from neighboring pixels), and a decrease in noise since the math required for demosaicing is itself a source of noise. The benefits are essentially similar to what you get when shooting pixel shift modes on dedicated cameras.
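The reasoning about colour coverage can be checked with a short sketch. It assumes an RGGB Bayer layout and whole-pixel shifts, which is how pixel shift modes on dedicated cameras work; the Pixel 3's shifts are arbitrary and sub-pixel, so treat this as an idealized illustration. The function names are hypothetical.

```python
def bayer_color(x, y):
    """Colour filter over sensor pixel (x, y) in an RGGB Bayer mosaic."""
    if y % 2 == 0:
        return "R" if x % 2 == 0 else "G"
    return "G" if x % 2 == 0 else "B"

def colors_seen(x, y, shifts):
    """Colours that sample scene point (x, y) across shifted frames."""
    return {bayer_color(x + dx, y + dy) for dx, dy in shifts}

single = [(0, 0)]
burst = [(0, 0), (1, 0), (0, 1), (1, 1)]

for name, shifts in (("single frame", single), ("four shifted frames", burst)):
    complete = all(
        colors_seen(x, y, shifts) == {"R", "G", "B"}
        for x in range(8)
        for y in range(8)
    )
    print(f"{name}: every point has R, G and B -> {complete}")
```

With one frame, each scene point is measured through a single colour filter and the other two channels must be interpolated. With the four shifts, every point is measured through all three filters, so no interpolation is required.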
|Normal wide-angle (28mm equiv.)||Super Res Zoom|
There's a small catch to all this, at least for now: Super Res only activates at 1.2x zoom or more, not in the default 'zoomed out' 28mm equivalent mode. As expected, the lower your level of zoom, the more impressed you'll be with the resulting Super Res images, and naturally the resolving power of the lens will be a limitation. Still, according to Isaac Reynolds, you can get "digital zoom roughly competitive with a 2x optical zoom", and it all happens right on the phone.
The results I was shown at Google appeared to be more impressive than the example we were provided above, no doubt at least in part due to the extreme zoom of our example here. We'll reserve judgement until we've had a chance to test the feature for ourselves.
Would the Pixel 3 benefit from a second rear camera? For certain scenarios – still landscapes for example – probably. But having more cameras doesn't always mean better capabilities. Quite often 'second' cameras have worse low light performance due to a smaller sensor and slower lens, as well as poor autofocus due to the lack of, or fewer, phase-detect pixels. One huge advantage of Pixel's Portrait Mode is that its autofocus doesn't differ from normal wide-angle shooting: dual pixel AF combined with HDR+ and pixel-binning yields incredible low light performance, even with fast moving erratic subjects.
2. Computational Raw
The Pixel 3 introduces 'computational Raw' capture in the default camera app. Isaac stressed that when Google decided to enable Raw in its Pixel cameras, they wanted to do it right, taking advantage of the phone's computational power.
"There's one key difference relative to the rest of the industry. Our DNG is the result of aligning and merging [up to 15] multiple frames... which makes it look more like the result of a DSLR" explains Marc. There's no exaggeration here: we know very well that image quality tends to scale with sensor size thanks to a greater amount of total light collected per exposure, which reduces the impact of the most dominant source of noise in images: photon shot, or statistical, noise.
The Pixel cameras can effectively make up for their small sensor sizes by capturing more total light through multiple exposures, while aligning moving objects from frame to frame so they can still be averaged to decrease noise. That means better low light performance and higher dynamic range than what you'd expect from such a small sensor.
Shooting Raw allows you to take advantage of that extra range: by pulling back blown highlights and raising shadows otherwise clipped to black in the JPEG, and with full freedom over white balance in post thanks to the fact that there's no scaling of the color channels before the Raw file is written. Even better news? HDR+ independently merges red, green and blue channels, which means the Raws are true Raws - un-demosaiced.
|Pixel 3 introduces in-camera computational Raw capture.|
Such 'merged' Raw files represent a major threat to traditional cameras. The math alone suggests that, solely based on sensor size, 15 averaged frames from the Pixel 3 sensor should compete with APS-C sized sensors in terms of noise levels. There are more factors at play, including fill factor, quantum efficiency and microlens design, but needless to say we're very excited to get the Pixel 3 into our studio scene and compare it with dedicated cameras in Raw mode, where the effects of the JPEG engine can be decoupled from raw performance.
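The sensor-size arithmetic can be sketched as follows. The dimensions are assumptions, not figures supplied by Google: roughly 5.6 x 4.2mm for a 1/2.55"-type phone sensor and 23.5 x 15.6mm for a common APS-C sensor. The sketch also assumes every frame gets the same exposure per unit area and ignores the other factors mentioned above.

```python
# Assumed dimensions in millimetres; neither figure comes from Google.
PHONE_SENSOR = (5.6, 4.2)    # roughly a 1/2.55"-type sensor
APS_C_SENSOR = (23.5, 15.6)  # a common APS-C size

def area(dims):
    """Sensor area in square millimetres."""
    width, height = dims
    return width * height

frames = 15
area_ratio = area(APS_C_SENSOR) / area(PHONE_SENSOR)
# Total light in the merged burst relative to one APS-C exposure,
# assuming equal light per unit area in every frame.
light_ratio = frames * area(PHONE_SENSOR) / area(APS_C_SENSOR)

print(f"APS-C area / phone sensor area: {area_ratio:.1f}")
print(f"light in {frames} phone frames vs one APS-C frame: {light_ratio:.2f}")
```

Under these assumptions the APS-C sensor has a little under 16 times the area, so 15 merged frames collect nearly as much total light as one APS-C exposure. Since shot noise depends on total light collected, that is the basis for expecting comparable noise levels.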
While solutions do exist for combining multiple Raws from traditional cameras with alignment into a single output DNG, having an integrated solution in a smartphone that takes advantage of Google's frankly class-leading tile-based align and merge - with no ghosting artifacts even with moving objects in the frame - is incredibly exciting. This feature should prove highly beneficial to enthusiast photographers. And what's more - Raws are automatically uploaded to Google Photos, so you don't have to worry about transferring them as you do with traditional cameras.
3. Synthetic Fill Flash
|'Synthetic Fill Flash' adds a glow to human subjects, as if a reflector were held out in front of them. Photo: Google|
Often a photographer will use a reflector to light the faces of backlit subjects. Pixel 3 does this computationally. The same machine-learning based segmentation algorithm that the Pixel camera uses in Portrait Mode is used to identify human subjects and add a warm glow to them.
If you've used the front facing camera on the Pixel 2 for Portrait Mode selfies, you've probably noticed how well it detects and masks human subjects using only segmentation. By using that same segmentation method for synthetic fill flash, the Pixel 3 is able to relight human subjects very effectively, with believable results that don't confuse and relight other objects in the frame.
Interestingly, the same segmentation methods used to identify human subjects are also used for front-facing video image stabilization, which is great news for vloggers. If you're vlogging, you typically want yourself, not the background, to be stabilized. That's impossible with typical gyro-based optical image stabilization. The Pixel 3 analyzes each frame of the video feed and uses digital stabilization to steady you in the frame. There's a small crop penalty to enabling this mode, but it allows for very steady video of the person holding the camera.
4. Learning-based Portrait Mode
The Pixel 2 had one of the best Portrait Modes we've tested despite having only one lens. This was due to its clever use of split pixels to sample a stereo pair of images behind the lens, combined with machine-learning based segmentation to understand human vs. non-human objects in the scene (for an in-depth explanation, watch my video here). Furthermore, dual pixel AF meant robust performance even with moving subjects in low light - great for constantly moving toddlers. The Pixel 3 brings some significant improvements despite lacking a second lens.
According to computational lead Marc Levoy, "Where we used to compute stereo from the dual pixels, we now use a learning-based pipeline. It still utilizes the dual pixels, but it's not a conventional algorithm, it's learning based". What this means is improved results: more uniformly defocused backgrounds and fewer depth map errors. Have a look at the improved results with complex objects, where many approaches are unable to reliably blur backgrounds 'seen through' holes in foreground objects:
Interestingly, this learning-based approach also yields better results with mid-distance shots where a person is further away. Typically, the further away your subject is, the less difference in stereo disparity between your subject and background, making accurate depth maps difficult to compute given the small 1mm baseline of the split pixels. Take a look at the Portrait Mode comparison below, with the new algorithm on the left vs. the old on the right.
|Learned result. The background is uniformly defocused, and the ground shows a smooth, gradual blur.||Stereo-only result. Note the sharp railing in the background, and the harsh transition from in-focus to out-of-focus in the ground.|
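Why distance makes depth maps harder can be seen from a simple pinhole stereo model, where disparity is focal length times baseline divided by distance. The sketch below uses the 1mm baseline mentioned above; the 4.4mm focal length and 1.4 micron pixel pitch are assumptions, and real dual-pixel geometry is more complicated than this model, so the numbers are indicative only.

```python
FOCAL_MM = 4.4     # assumed lens focal length
BASELINE_MM = 1.0  # split-pixel baseline quoted in the article
PIXEL_MM = 0.0014  # assumed pixel pitch (1.4 microns)

def disparity_px(distance_m):
    """Stereo disparity, in pixels, of a point at the given distance."""
    return FOCAL_MM * BASELINE_MM / (distance_m * 1000.0 * PIXEL_MM)

def subject_background_gap(subject_m, background_m):
    """Difference in disparity between subject and background."""
    return disparity_px(subject_m) - disparity_px(background_m)

for subject_m in (1.0, 2.0, 4.0):
    gap = subject_background_gap(subject_m, subject_m + 4.0)
    print(f"subject at {subject_m:.0f} m, background 4 m behind: {gap:.2f} px")
```

In this model a subject at 1m is separated from its background by a couple of pixels of disparity, while at 4m the separation falls well below one pixel. That leaves very little signal for a conventional stereo algorithm to work with.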
5. Night Sight
Rather than simply rely on long exposures for low light photography, 'Night Sight' utilizes HDR+ burst mode photography to take usable photos in very dark situations. Previously, the Pixel 2 would never drop below 1/15s shutter speed, simply because it needed faster shutter speeds to maintain that 9-frame buffer with zero shutter lag. That does mean that even the Pixel 2 could, in very low light, effectively sample 0.6 seconds (9 x 1/15s), but sometimes that's not even enough to get a usable photo in extremely dark situations.
The Pixel 3 now has a 'Night Sight' mode which sacrifices the zero shutter lag and expects you to hold the camera steady after you've pressed the shutter button. When you do so, the camera will merge up to 15 frames, each with shutter speeds as slow as, say, 1/3s, to get you an image equivalent to a 5 second exposure, but without the motion blur that would inevitably result from such a long exposure.
Put simply: even though there might be subject or handheld movement over the entire 5s span of the 15 frame burst, many of the 1/3s 'snapshots' of that burst are likely to still be sharp, albeit possibly displaced relative to one another. The tile-based alignment of Google's 'robust merge' technology, however, can handle inter-frame movement by aligning objects that have moved and discarding tiles of any frame that have too much motion blur.
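The exposure arithmetic and the idea of rejecting blurred tiles can be sketched together. This is a toy illustration, not Google's robust merge: the tiles are assumed to be already aligned, and `blur_scores` and `max_blur` are hypothetical stand-ins for whatever blur measurement a real pipeline would use.

```python
from fractions import Fraction

def total_exposure(n_frames, shutter):
    """Combined light-gathering time of a burst, in seconds."""
    return n_frames * shutter

def robust_merge(tiles, blur_scores, max_blur):
    """Average one tile position across frames, skipping blurred copies.

    `tiles` holds the same tile from each frame, already aligned.
    """
    kept = [t for t, b in zip(tiles, blur_scores) if b <= max_blur]
    if not kept:
        raise ValueError("every copy of this tile was rejected")
    merged = [sum(px) / len(kept) for px in zip(*kept)]
    return merged, len(kept)

pixel2_span = float(total_exposure(9, Fraction(1, 15)))
night_sight_span = float(total_exposure(15, Fraction(1, 3)))
print(f"Pixel 2 buffer:    {pixel2_span} s")
print(f"Night Sight burst: {night_sight_span} s")

# Three copies of a two-pixel tile; the third is smeared by motion.
tiles = [[10.0, 20.0], [12.0, 22.0], [90.0, 90.0]]
merged, used = robust_merge(tiles, blur_scores=[0.1, 0.2, 0.9], max_blur=0.5)
print(f"merged tile {merged} from {used} of {len(tiles)} copies")
```

Discarding a tile costs some of the noise benefit for that part of the image, since fewer copies are averaged, but it keeps a smeared copy from contaminating the sharp ones.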
Have a look at the results below, which also show you the benefit of the wider-angle, second front-facing 'groupie' camera:
|Normal front-camera 'selfie'||Night Sight 'groupie' with wide-angle front-facing lens|
Furthermore, Night Sight mode takes a machine-learning based approach to auto white balance. It's often very difficult to determine the dominant light source in such dark environments, so Google has opted to use learning-based AWB to yield natural looking images.
Final thoughts: simpler photography
The philosophy behind the Pixel camera - and for that matter the philosophy behind many smartphone cameras today - is one-button photography. A seamless experience without the need to activate various modes or features.
This is possible thanks to the computational approaches these devices embrace. The Pixel camera and software are designed to give you pleasing results without requiring you to think much about camera settings. Synthetic fill flash activates automatically with backlit human subjects, and Super Resolution automatically kicks in as you zoom.
Motion Photos turns on automatically when the camera detects interesting activity, and Top Shot now uses AI to automatically suggest the best photo of the bunch, even if it's a moment that occurred before you pressed the shutter button. Autofocus typically locks onto human subjects very reliably, but when you need to specify your subject, just tap on it and 'Motion Autofocus' will continue to track it and keep it in focus. Perfect for your toddler or pet.
At their best, these technologies allow you to focus on the moment, perhaps even enjoy it, and sometimes even help you to capture memories you might have otherwise missed.
We'll be putting the Pixel 3 through its paces soon, so stay tuned. In the meantime, let us know in the comments below what your favorite features are, and what you'd like to see tested.
[1] In good light, these last 9 frames typically span the last 150ms before you pressed the shutter button. In very low light, it can span up to the last 0.6s.
[2] We were only told 'say, maybe 15 images' in conversation about the number of images in the buffer for Super Res Zoom and Night Sight. It may be more, it could be less, but we were at least told that it is more than 9 frames. One thing to keep in mind is that even if you have a 15-frame buffer, not all frames are guaranteed to be usable. For example, if in Night Sight one or more of these frames have too much subject motion blur, they're discarded.
[3] You can achieve a similar super-resolution effect manually with traditional cameras, and we describe the process here.