As part of our regular appearances on 'The New Screen Savers', a show on the TWiT Network (named after its flagship show, This Week in Tech), our Science Editor Rishi Sanyal joined host Leo Laporte and co-host Megan Morrone to talk about how smartphone cameras are revolutionizing photography. Watch the segment above, then catch the full episode here.

Below, Rishi expands on some of the topics covered in the segment, with detailed examples that clarify the main points. Have a read after the fold once you've watched the segment.

You can watch The New Screen Savers live every Saturday at 3pm Pacific Time (23:00 UTC), on demand through our articles, the TWiT website, or YouTube, as well as through most podcasting apps.


So who wins? iPhone X or Pixel 2?

Not so fast.

Each has its strengths, which we talk about in our video segment above and in our examples below. Google and Apple take different approaches with different trade-offs, but there are common practices and themes as well. And that's before we begin discussing video, where the iPhone's 4K/60p HEVC video borders on professional quality while Google's stabilization may make you chuck your gimbal.

Smartphones have to deal with the fact that their cameras, and therefore sensors, are tiny. And since we all (now) know that, generally speaking, it's the amount of light you capture that determines image quality, smartphones have a serious disadvantage to deal with: they don't capture enough light. But that's where computational photography comes in. By combining machine learning, computer vision, and computer graphics with traditional optical processes, computational photography aims to enhance what is achievable with traditional methods.

Intelligent exposure and processing? Press. Here.

One of the defining characteristics of smartphone photography is the idea that you can get a great image with one button press, and nothing more. No exposure decision, no tapping on the screen to set your exposure, no exposure compensation, and no post-processing. Just take a look at what the Google Pixel 2 XL did with this huge dynamic range sunrise at Banff National Park in Canada:

Sunrise at Banff, with Mt. Rundle in the background. Shot on Pixel 2 with one button press. I also shot this with my Sony a7R II full-frame camera, but that required a 4-stop reverse graduated neutral density ('Daryl Benson') filter, and a dynamic range compensation mode (DRO Lv5) to get a usable image. While the resulting image from the Sony was head-and-shoulders above this one at 100%, I got this image from a device in my pocket by just pointing and shooting.

Apple's iPhones try to achieve similar results by combining multiple exposures if the scene has enough contrast to warrant it. But iPhones can't achieve these results (yet) since they don't average as many 'samples' as the Google Pixel 2. Sometimes Apple's longer exposures can blur subjects, and iPhones tend to overexpose and blow highlights for the sake of exposing the subject properly. Apple is also still pretty reluctant to enable HDR in 'Auto HDR'.

The Pixel 2 was able to achieve the image above by first determining the correct focal plane exposure required to not blow large bright (non-specular) areas (an approach known as ETTR or 'expose-to-the-right'). When you press the shutter button, the Pixel 2 goes back in time 9 frames, aligning and averaging them to give you a final image with quality similar to what you might expect from a sensor with 9x as much surface area.

How does it do that? It's constantly keeping the last 9 frames it shot in memory, so when you press the shutter it can grab them, break each into many square 'tiles', align them all, and then average them. Breaking each image into small tiles allows for alignment despite photographer or subject movement by ignoring moving elements, discarding blurred elements in some shots, or re-aligning subjects that have moved from frame to frame. Averaging simulates the effects of shooting with a larger sensor by 'evening out' noise.
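To make the 'evening out' of noise concrete, here is a minimal sketch in Python. It is not Google's HDR+ code: it assumes perfectly aligned frames of a flat gray patch with simple Gaussian noise, and the function names and noise values are made up for illustration. Averaging 9 such frames should cut the noise by roughly the square root of 9, or about 3x.

```python
import random
import statistics

def simulate_frame(true_signal, noise_sigma, num_pixels, rng):
    """One noisy 'frame' of a flat gray patch: true value plus Gaussian noise."""
    return [true_signal + rng.gauss(0.0, noise_sigma) for _ in range(num_pixels)]

def average_frames(frames):
    """Average already-aligned frames pixel by pixel."""
    count = len(frames)
    return [sum(pixels) / count for pixels in zip(*frames)]

# Assumed values for illustration only: 9 frames, signal 100, noise sigma 10.
rng = random.Random(0)
frames = [simulate_frame(100.0, 10.0, 5000, rng) for _ in range(9)]

single_noise = statistics.pstdev(frames[0])
merged_noise = statistics.pstdev(average_frames(frames))
noise_reduction = single_noise / merged_noise

print(f"single frame noise: {single_noise:.2f}")
print(f"9-frame average noise: {merged_noise:.2f}")
print(f"noise reduced by about {noise_reduction:.1f}x")
```

A real pipeline has to align the tiles first, which this sketch skips entirely.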

That's what allows the Pixel 2 to capture such a wide dynamic range scene: expose for the bright regions, while reducing noise in static elements of the scene by image averaging, while not blurring moving (water) elements of the scene by making intelligent decisions about what to do with elements that shift from frame to frame. Sure, moving elements have more noise to them (since they couldn't have as many of the 9 frames dedicated to them for averaging), but overall, do you see anything but a pleasing image?
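The trade-off described above, where moving elements get fewer frames and therefore more noise, can be sketched as a per-tile merge that rejects frames that differ too much from a reference frame. This is a simplified illustration under my own assumptions, not the Pixel 2's actual algorithm: there is no alignment search, image dimensions are assumed to be multiples of the tile size, and `merge_tiles` and `reject_threshold` are hypothetical names.

```python
def tile_mean_abs_diff(ref, other, y0, x0, size):
    """Mean absolute difference between two frames over one square tile."""
    total = 0.0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            total += abs(ref[y][x] - other[y][x])
    return total / (size * size)

def merge_tiles(frames, tile_size, reject_threshold):
    """Average frames tile by tile, skipping tiles that moved too much.

    frames[0] is the reference. Returns the merged image and, per tile,
    how many frames contributed to it.
    """
    ref = frames[0]
    height, width = len(ref), len(ref[0])
    merged = [[0.0] * width for _ in range(height)]
    frames_used = {}
    for y0 in range(0, height, tile_size):
        for x0 in range(0, width, tile_size):
            keep = [ref] + [
                f for f in frames[1:]
                if tile_mean_abs_diff(ref, f, y0, x0, tile_size) <= reject_threshold
            ]
            frames_used[(y0, x0)] = len(keep)
            for y in range(y0, y0 + tile_size):
                for x in range(x0, x0 + tile_size):
                    merged[y][x] = sum(f[y][x] for f in keep) / len(keep)
    return merged, frames_used

def flat(value, size=4):
    """A size x size frame filled with one brightness value."""
    return [[float(value)] * size for _ in range(size)]

reference, second, third = flat(10), flat(12), flat(8)
for y in range(2):          # something 'moved' through the top-left tile
    for x in range(2):      # of the third frame
        third[y][x] = 50.0

merged, frames_used = merge_tiles([reference, second, third],
                                  tile_size=2, reject_threshold=5.0)
print(frames_used)
```

In this toy scene the static tiles are averaged from all three frames, while the top-left tile only gets two, which is why moving regions end up noisier than static ones.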

Autofocus

Who focuses better? Google Pixel 2, hands down. Its dual pixel AF uses nearly the entire sensor for autofocus (binning the high-resolution sensor into a low-resolution mode to decrease noise), while also using HDR+ and its 9-frame image averaging to further decrease noise and have a usable signal to make AF calculations from.

Google Pixel 2 can focus lightning fast even in indoor artificial light, which allowed me to snap this candid before it was over in a split second. The iPhone X captured a far less interesting moment seconds later when it finally achieved focus, missing the candid moment.

And even though the left and right perspectives 'seen' by the split pixels in the Pixel 2 sensor have less than 1mm of stereo disparity, an impressive depth map can be built, rendering an optically accurate lens blur. This isn't just a matter of masking the foreground and blurring the background; it's an actual progressive blur based on depth.

That's what allowed me to nail this candid image the instant after my wife and child whirled around to face the camera. Nearly all my iPhone X images of this scene were either out-of-focus or captured a less interesting, non-candid moment because of the shutter lag required to focus. The iPhone X only uses approximately 3% of its pixels for its 'Dual PDAF' autofocus, as opposed to the Pixel 2's use of its entire sensor combined with multi-frame noise reduction, not just for image capture but also for focus.

Portrait Lighting

While we've been praising the Pixel phones, Apple is leading smartphone photography in a number of ways. First and foremost: color accuracy. Apple displays are all calibrated and profiled to display accurate colors, so no matter what Apple or color-managed device (or print) you're viewing, colors look the same. Android devices are still the Wild West in this regard, but Google is trying to solve this via a proper color management system (CMS) under the hood. It'll be some time before all devices catch up, and even Google itself is struggling with its current display and CMS implementation.

But let's talk about Portrait Lighting. Look at the iPhone X 'Contour Lighting' shot below, left, vs. what the natural lighting looked like at the right (shot on a Google Pixel 2 with no special lighting features). While the Pixel 2 image is more natural, the iPhone X image is far more interesting, as if I'd lit my subject with a light on the spot.

Apple iPhone X, 'Contour Lighting' | Google Pixel 2

Apple builds a 3D map of a face using trained algorithms, then allows you to re-light your subject using modes such as 'natural', 'studio' and 'contour' lighting. The latter highlights points of the face like the nose, cheeks and chin that would've caught the light from an external light source aimed at the subject. This gives the image a dimensionality you could normally only achieve using external lighting solutions or a lot of post-processing.

Currently, the Pixel 2 has no such feature, so we get the flat lighting the scene actually had on the right. But, as you can imagine, it won't be long before we see other phones and software packages taking advantage of—and even improving on—these computational approaches.

HDR and wide-gamut photography

And then we have HDR. Not the HDR you're used to thinking about, the kind that creates flat images from large dynamic range scenes. No, we're talking about the ability of HDR displays (like bright, contrasty OLEDs) to display the wide range of tones and colors cameras can capture these days, rather than sacrificing global contrast just to increase and preserve local contrast, as traditional camera JPEGs do.

iPhone X is the first device ever to support the HDR display of HDR photos. That is: it can capture a wide dynamic range and color gamut but then also display them without clipping tones and colors on its class-leading OLED display, all in an effort to get closer to reproducing the range of tones and colors we see in the real world.

Have a look below at a Portrait Mode image I shot of my daughter that utilizes colors and luminances in the P3 color space. P3 is the color space Hollywood is now using for most of its movies (it's similar to Adobe RGB, though shifted). You'll only see the extra colors if you have a P3-capable display and a color-managed OS/browser (macOS + Google Chrome, or the newest iPads and iPhones). On a P3 display, switch between 'P3' and 'sRGB' to see the colors you're missing with sRGB-only capture.

Or, on any display, hover over 'Colors in P3 out-of-gamut of sRGB' to see (in grey) what you're missing with an sRGB-only capture/display workflow.

iPhone X Portrait Mode, image in P3 color space | iPhone X Portrait Mode, image in sRGB color space | Colors in P3 out-of-gamut of sRGB highlighted in grey
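For the curious, here is a rough sketch of how an 'out-of-gamut of sRGB' check like the one highlighted in grey could work. It is not the code behind our widget. It assumes Display P3 pixel values in the 0 to 1 range, assumes Display P3 shares the sRGB transfer curve, and uses a rounded, approximate P3-to-sRGB matrix; the function names are made up for illustration.

```python
def srgb_decode(value):
    """sRGB transfer curve (assumed shared by Display P3): encoded -> linear."""
    if value <= 0.04045:
        return value / 12.92
    return ((value + 0.055) / 1.055) ** 2.4

# Approximate linear Display P3 -> linear sRGB matrix (both D65), rounded.
P3_TO_SRGB = (
    (1.2249, -0.2249, 0.0),
    (-0.0421, 1.0421, 0.0),
    (-0.0196, -0.0786, 1.0983),
)

def p3_to_linear_srgb(rgb):
    """Convert an encoded Display P3 color to linear sRGB (may fall outside 0..1)."""
    linear = [srgb_decode(channel) for channel in rgb]
    return tuple(sum(m * c for m, c in zip(row, linear)) for row in P3_TO_SRGB)

def fits_in_srgb(rgb, tolerance=1e-3):
    """True if the P3 color can be shown in sRGB without clipping any channel."""
    return all(-tolerance <= c <= 1.0 + tolerance for c in p3_to_linear_srgb(rgb))

print(fits_in_srgb((0.5, 0.5, 0.5)))  # mid gray
print(fits_in_srgb((1.0, 0.0, 0.0)))  # the most saturated P3 red
```

Neutral grays exist in both color spaces, while the most saturated P3 red needs negative green and blue in sRGB, so it is flagged as out of gamut.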

Apple is not only taking advantage of the extra colors of the P3 color space, it's also encoding its images in the 'High Efficiency Image Format' (HEIF), an advanced format intended to replace JPEG. HEIF is more efficient, allows for 10-bit color encoding (to avoid banding while allowing for more colors), and supports HDR encoding to allow the display of a larger range of tones on HDR displays.

But will smartphones replace traditional cameras?

For many, yes, absolutely. You've seen the autofocus speeds of the Pixel 2, assisted by not only dual pixel AF but also laser AF. You've seen the results of HDR+ image stacking, which will only get better with time. We've seen dual lens units that give you the focal lengths of a camera body and two primes, and we've seen the ability to selectively blur backgrounds and isolate subjects like the pros do.

Below is a shot from the Pixel 2 vs. a shot from a $4,000 full-frame body and 55mm F1.8 lens combo—which is which?

Full Frame or Pixel 2? Pixel 2 or Full Frame?

Yes, the trained—myself included—can pick out which is the smartphone image. But when is the smartphone image good enough?

Smartphone cameras are not only catching up with traditional cameras, they're actually exceeding them in many ways. Take for example...

Creative control...

The image below exemplifies an interesting use of computational blur. The camera has chosen to keep much of the subject—like the front speaker cone, which has significant depth to it—in focus, while blurring the rest of the scene significantly. In fact, if you look at the upper right front of the speaker cabinet, you'll see a good portion of it in focus. After a certain point, the cabinet suddenly-yet-gradually blurs significantly.

The camera and software have chosen to keep a significant depth-of-focus around the focus plane before significantly blurring objects far enough away from it. That's the beauty of computational approaches: while F1.2 lenses can usually only keep one eye in focus, much less the nose or the ear, computational approaches allow you to choose how much you wish to keep in focus, even if you wish to blur the rest of the scene to a degree where traditional optics wouldn't allow much of your subject to remain in focus.

B&W speakers at sunrise. Take a look at the depth-of-focus vs. depth-of-field in this image. If you look closely, the entire speaker cone and a large front portion of the black cabinet are in focus. There is then a sudden, yet gradual blur to very shallow depth-of-field. That's the beauty of computational approaches: one can choose extended (say, F5.6 equivalent) depth-of-focus near the focus plane, but then gradually transition to far shallower (say, F2.0) depth-of-field outside of the focus plane. This allows one to keep much of the subject in focus, but achieve the subject isolation of a much faster lens.
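One way to picture that 'sudden, yet gradual' transition is as a function mapping each pixel's distance from the focus plane to a blur radius. The sketch below is purely illustrative and not how any particular phone does it: the function name, the parameters, the smoothstep ramp and the example numbers are all my own assumptions.

```python
def blur_radius(depth_m, focus_m, sharp_zone_m, ramp_m, max_radius_px):
    """Blur radius in pixels for a point at depth_m (all distances in meters).

    Points within sharp_zone_m of the focus plane stay fully sharp. Beyond
    that, blur ramps up smoothly over ramp_m and then holds at max_radius_px.
    """
    distance = abs(depth_m - focus_m)
    if distance <= sharp_zone_m:
        return 0.0
    t = min((distance - sharp_zone_m) / ramp_m, 1.0)
    # Smoothstep easing: starts and ends gently, so there is no hard edge.
    return max_radius_px * t * t * (3.0 - 2.0 * t)

# Hypothetical scene: focus at 2 m, keep +/- 0.3 m sharp, ramp over 1 m.
for depth in (2.0, 2.2, 2.8, 3.5, 5.0):
    radius = blur_radius(depth, focus_m=2.0, sharp_zone_m=0.3,
                         ramp_m=1.0, max_radius_px=20.0)
    print(f"{depth:.1f} m -> {radius:.1f} px blur")
```

A real lens ties the width of the sharp zone and the strength of the background blur to a single aperture setting. Here they are separate knobs, which is what allows an extended in-focus zone combined with strong blur elsewhere.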

Surprise and delight...

Digital assistants. Love them or hate them, they will be a part of your future, and they're another way in which smartphone photography augments and exceeds traditional photography approaches. My smartphone is always on me, and when I have my full-frame Sony a7R III with me, I often transfer JPEGs from it to my smartphone. Those images (and 720p video proxies) automatically upload to my Google Photos account. From there any image or video that has my or my daughter's face in it automatically gets shared with my wife without my so much as lifting a finger.

Better yet? Often I get a notification that Google Assistant has pulled a cute animated GIF from one of my movies that it thinks is interesting. And more often than not, the animations are adorable:

Splash splash! in Xcaret, Quintana Roo, Mexico. Animated GIF auto-generated from a movie shot on the Pixel 2.

Machine learning allowed Google Assistant to automatically guess that this clip from a much longer video was an interesting moment I might wish to revisit and preserve. And it was right. Just as it was right in picking the moment below, where my daughter is clapping in response to her cousin clapping at successfully feeding her... after which my wife claps as well.

Claps all around!

Google Assistant is impressive in its ability to pick out meaningful moments from photos and videos. Apple takes a similar approach in compiling 'Memories'.

But animated GIFs aren't the only way Google Assistant helps me curate and find the important moments in my life. It also auto-curates videos that pull together photos and clips from my videos—be it from my smartphone or media I've imported from my camera—into emotionally moving 'Auto Awesome' compilations:

At any time I can hand-select the photos and videos, down to the portions of each video, I want in a compilation—using an editing interface far simpler than Final Cut Pro or Adobe Premiere. I can even edit the auto-compilations Google Assistant generates, choosing my favorite photos, clips and music. And did you notice that the video clips and photos are cut down to the beat in the music?

This is a perfect example of where smartphone photography exceeds traditional cameras, especially for us time-starved souls who hardly have the time to download our assets to a hard drive (not to mention back up said assets). And it's a reminder that traditional cameras that don't play well with automated services like Google and Apple Photos will only be left behind by simpler services that surprise and delight a majority of us.

The future is bright

This is just the beginning. The computational approaches Apple, Google, Samsung and many others are taking are revolutionizing what we can expect from devices we have in our pockets, devices we always have on us.

Are they going to defy physics and replace traditional cameras tomorrow? Not necessarily, not yet, but for many purposes and people, they will offer pros that are well worth the cons. In some cases they offer more than we've come to expect of traditional cameras, which will have to continue to innovate (perhaps taking advantage of the very computational techniques smartphones and other innovative computational devices are leveraging) to stay ahead of the curve.

But as techniques like HDR+ and Portrait Mode and Portrait Lighting have shown us, we can't just look at past technologies to predict what's to come. Computational photography will make things you've never imagined a reality. And that's incredibly exciting.


Appendix: Studio Scene

We've added the Google Pixel 2 and Apple iPhone X to our studio scene widget. You can compare the Daylight and Low Light scenes below, keeping in mind that we shot the smartphones in their default camera apps without controlling exposure to see how they would perform in these light levels (10 and 3 EV, respectively, for Daylight and Low Light).

Note that we introduced some motion into the Low Light scene to simulate what the iPhone does when there's movement in the scene. Hence, the ISO 640, 1/30s iPhone X image is more reflective of low light image quality for scenes that can't be shot at the 1/4s shutter speed (ISO 125) the iPhone X will tend to drop to for completely static (tripod-based) low light scenes.

The Pixel 2 rarely drops to shutter speeds slower than 1/30s in low light, yet impressively almost matches the performance of a 1"-type sensor at these shutter speeds in low light (though the 'i' tab shows the RX100 shot at 1/6s F4, you'd get an equivalent exposure at 1/30s were you to shoot the Sony at F1.8 like the Pixel 2).
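The exposure equivalence in that last parenthetical is easy to check with a little arithmetic. The sketch below assumes the same ISO and scene brightness, and uses the standard relationship that light gathered per unit area scales with shutter time divided by the square of the F-number; the function names are just for illustration.

```python
import math

def equivalent_shutter(shutter_s, f_number_from, f_number_to):
    """Shutter time that gives the same exposure after an aperture change."""
    return shutter_s * (f_number_to / f_number_from) ** 2

def stops_between(f_number_from, f_number_to):
    """Stops of light gained by opening up from one F-number to another."""
    return 2.0 * math.log2(f_number_from / f_number_to)

shutter = equivalent_shutter(1 / 6, f_number_from=4.0, f_number_to=1.8)
stops = stops_between(4.0, 1.8)

print(f"F4 -> F1.8 gains about {stops:.1f} stops")
print(f"1/6s at F4 matches roughly 1/{1 / shutter:.0f}s at F1.8")
```

Opening up from F4 to F1.8 gains a little over two stops, which turns 1/6s into roughly 1/30s.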