Computational photography part II: Computational sensors and optics
1 Computational Sensors
Editor's note: This is the second article in a three-part series by guest contributor Vasily Zubarev. Series overview:
- Part I: What is computational photography?
- Part II: Computational sensors and optics
- Part III: Computational lighting, 3D scene and augmented reality
You can visit Vasily's website where he also demystifies other complex subjects. If you find this article useful we encourage you to give him a small donation so that he can write about other interesting topics.
The article has been lightly edited for clarity and to reflect a handful of industry updates since it first appeared on the author's own website.
Computational Sensor: Plenoptic and Light Fields
Well, our sensors are crap. We've simply gotten used to it and try to do our best with them. Their design hasn't changed much since the beginning of time; only the manufacturing process improved — we reduced the distance between pixels, fought read noise, increased readout speeds and added dedicated pixels for phase-detection autofocus. But take even the most expensive camera and try to photograph a running cat in indoor light, and the cat will win.
- Video link: The Science of Camera Sensors
We've been trying to invent a better sensor for a long time. You can google a lot of research in this field with queries like "computational sensor" or "non-Bayer sensor". Even the Pixel Shifting example can be seen as an attempt to improve sensors with calculations.
The most promising stories of the last twenty years, though, come to us from plenoptic cameras.
To calm your sense of impending boring math, I'll throw in an insider's note — the latest Google Pixel camera is a little bit plenoptic. With only two pixels per microlens, it still manages to calculate a fair optical depth map without a second camera like everyone else.
Plenoptics is a powerful weapon that hasn't fired yet.
Invented in 1994. First assembled at Stanford in 2004. The first consumer product, Lytro, was released in 2012. The VR industry is now actively experimenting with similar technologies.
A plenoptic camera differs from a normal one by only one modification: its sensor is covered with a grid of microlenses, each of which covers several real pixels. Something like this:
If we place the grid and sensor at the right distance, the final RAW image will show sharp pixel clusters, each containing a mini-version of the scene.
- Video link: Muted video showing RAW editing process
Clearly, if you take only the central pixel from each cluster and build an image from those alone, it won't differ from one taken with a standard camera. Yes, we lose a bit of resolution, but we'll just ask Sony to stuff more megapixels into the next sensor.
That's where the fun part begins. If you take another pixel from each cluster and build the image again, you get another standard photo, only as if it were taken with a camera shifted by one pixel in space. Thus, with 10x10-pixel clusters, we get 100 images from "slightly" different angles.
The larger the cluster, the more images we have — and the lower the resolution. In a world of smartphones with 41-megapixel sensors everything has a limit, though we can sacrifice a little resolution. We have to keep the balance.
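As a rough illustration, here's how those sub-aperture views can be pulled out of an idealized plenoptic RAW with plain array reshaping. The layout assumed here — each k×k block of sensor pixels being exactly one microlens cluster — is a simplification; a real camera needs per-lens calibration for alignment and vignetting.

```python
import numpy as np

def subaperture_views(raw, k):
    """Split an idealized plenoptic RAW of shape (H*k, W*k) into k*k views.

    Assumes each k-by-k block of pixels is one microlens cluster.
    views[u, v] collects pixel (u, v) from every cluster, giving a full
    image as seen from one point on the main lens aperture.
    """
    H, W = raw.shape[0] // k, raw.shape[1] // k
    clusters = raw.reshape(H, k, W, k)     # (cluster_row, u, cluster_col, v)
    return clusters.transpose(1, 3, 0, 2)  # (u, v, H, W): k*k shifted views
```

With 10×10 clusters this yields the 100 slightly shifted images mentioned above, each at 1/100th of the sensor's pixel count.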
Alright, we've got a plenoptic camera. What can we do with it?
The feature everyone was buzzing about in articles covering Lytro was the ability to fairly adjust focus after the shot was taken. "Fairly" means we don't use any deblurring algorithms, but rather use only the available pixels, picking or averaging them in the right order.
A RAW photo taken with a plenoptic camera looks weird. To get the usual sharp JPEG out of it, you have to assemble it first. The result will vary depending on how we select the pixels from the RAW.
The farther a cluster is from the point where the original ray struck, the more defocused that ray is — that's simply how the optics work. To shift the focus, we only need to choose pixels at the desired distance from the original point, either closer or farther.
|The picture should be read from right to left as we are sort of restoring the image, knowing the pixels on the sensor. We get a sharp original image on top, and below we calculate what was behind it. That is, we shift the focus computationally.|
Shifting the focus forward is a bit more complicated, as we have fewer pixels in those parts of the clusters. At first, Lytro's developers didn't even want to let the user focus manually because of this — the camera decided by itself in software. Users didn't like that, so the feature was added in later versions as "creative mode", but with very limited refocus for exactly that reason.
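The refocusing itself can be sketched as a shift-and-sum over the sub-aperture views: shift each view in proportion to its offset from the aperture center, then average everything. This is a minimal sketch with integer shifts via `np.roll`; a real pipeline interpolates sub-pixel shifts and handles the image borders properly.

```python
import numpy as np

def refocus(views, alpha):
    """Shift-and-sum refocusing over a (k, k, H, W) stack of sub-aperture views.

    alpha picks the virtual focal plane: each view is shifted by alpha times
    its offset from the central view, then all views are averaged.
    alpha = 0 reproduces the plain all-views average (the captured focal plane).
    """
    k = views.shape[0]
    c = (k - 1) / 2.0
    acc = np.zeros(views.shape[2:], dtype=float)
    for u in range(k):
        for v in range(k):
            du = int(round(alpha * (u - c)))
            dv = int(round(alpha * (v - c)))
            acc += np.roll(views[u, v], (du, dv), axis=(0, 1))
    return acc / (k * k)
```

Points whose disparity across views matches the chosen shift add up coherently and come out sharp; everything else averages into blur.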
Depth Map and 3D using a single lens
One of the simplest operations in plenoptics is getting a depth map. You just need to take two different sub-aperture images and calculate how the objects are shifted between them. The more an object shifts, the farther it is from the focal plane.
Google recently bought and killed Lytro, but used their technology for its VR and... the Pixel's camera. Starting with the Pixel 2, the camera became "a little bit" plenoptic, though with only two pixels per cluster. As a result, Google doesn't need to install a second camera like all the other cool kids; instead, it can calculate a depth map from one photo.
|Images which top and bottom subpixels of the Google Pixel camera see. The right one is animated for clarity (click to enlarge and see animation). Source: Google|
|The depth map is additionally processed with neural networks to make the background blur more even. Source: Google|
The depth map is built from two shots shifted by one sub-pixel. This is enough to calculate a rudimentary depth map and separate the foreground from the background to blur the latter with some fashionable bokeh. The result of this layering is then smoothed and "improved" by neural networks, which are trained to improve depth maps (rather than to estimate the depth themselves, as many people think).
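The core of that calculation can be sketched as a brute-force disparity search between the two views. `disparity_map` below is a hypothetical helper, not Google's actual pipeline: real implementations aggregate the matching cost over a window, refine to sub-pixel precision and then hand the result to those neural networks.

```python
import numpy as np

def disparity_map(left, right, max_shift=4):
    """Per-pixel disparity between two views by testing integer shifts.

    For every candidate shift d, slide `right` by d pixels and record the
    squared difference against `left`; each pixel keeps the shift with the
    lowest cost. The shift magnitude encodes how far a point is from the
    focal plane, which is all the background blur needs.
    """
    costs = np.stack([
        (left - np.roll(right, d, axis=1)) ** 2
        for d in range(max_shift + 1)
    ])
    return np.argmin(costs, axis=0)
```

For dual-pixel data the baseline is tiny (sub-pixel shifts between the two photodiode images), which is why the raw map is rudimentary and needs heavy post-processing.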
|The trick is that we got plenoptics in smartphones almost at no charge. We already put lenses on these tiny sensors to increase the luminous flux at least somehow. Some patents from Google suggest that future Pixel phones may go further and cover four photodiodes with a lens.|
Slicing layers and objects
You don't see your nose because your brain combines a final image from both of your eyes. Close one eye, and you'll see a huge Egyptian pyramid at the edge of your vision.
The same effect can be achieved with a plenoptic camera. By assembling shifted images from the pixels of different clusters, we can look at an object from several points, just as our eyes do. This gives us two cool opportunities. First, we can estimate the approximate distance to objects, which lets us easily separate foreground from background, as in real life. Second, if an object is small, we can remove it from the photo entirely, since we can effectively look around it — like your nose. Just clone it out. Optically, for real, with no Photoshop.
Using this, we can cut out trees between the camera and the object or remove the falling confetti, as in the video below.
"Optical" stabilization with no optics
From a plenoptic RAW, you can make a hundred photos, each shifted by a few pixels across the sensor area. Accordingly, we have a tube the diameter of the lens within which we can move the shooting point freely, thereby offsetting the shake of the image.
Technically, the stabilization is still optical, because we don't have to calculate anything — we just select pixels in the right places. On the other hand, any plenoptic camera sacrifices megapixels for its plenoptic capabilities, and any digital stabilizer works the same way. It's nice to have as a bonus, but buying one only for stabilization is costly.
The larger the sensor and lens, the bigger the window for movement. The more camera capabilities, the more ozone holes from supplying this circus with electricity and cooling. Yeah, technology!
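A toy version of this pixel-selection stabilization, assuming the shake has already been measured in units of view steps (the function and its parameters are illustrative, not from any real camera):

```python
import numpy as np

def stabilized_view(views, shake_u, shake_v):
    """Counteract a measured shake by picking the opposite sub-aperture view.

    views: (k, k, H, W) stack from a plenoptic RAW. Instead of moving lens
    elements, we select the view whose virtual shooting point offsets the
    shake, clipped to the edge of the 'tube' of available viewpoints.
    """
    k = views.shape[0]
    c = k // 2
    u = int(np.clip(c - shake_u, 0, k - 1))
    v = int(np.clip(c - shake_v, 0, k - 1))
    return views[u, v]
```

The clipping is exactly the limit mentioned above: once the shake exceeds the lens diameter's worth of viewpoints, there's nothing left to select.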
Fighting the Bayer filter
The Bayer filter is still necessary even in a plenoptic camera; we haven't come up with any other way of getting a color digital image. And using a plenoptic RAW, we can average the color not only over a group of nearby pixels, as in classic demosaicing, but also over dozens of copies of the same point in neighboring clusters.
Some articles call this "computable super-resolution", but I would question that. In effect, we first reduce the real resolution of the sensor by dozens of times only to proudly restore it again. You'd have to try hard to sell that to someone.
But technically it's still more interesting than shaking the sensor in a pixel shifting spasm.
Computational aperture (bokeh)
Those who like to shoot bokeh hearts will be thrilled. Since we know how to control refocus, we can go further and take some pixels from the unfocused image and others from the focused one. That way we can get an aperture of any shape. Yay! (No)
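A sketch of the idea: treat a binary mask over the k×k grid of sub-aperture views as the aperture stop, and average only the selected views. Out-of-focus highlights then take on the shape of the mask — hearts included. The helper below is illustrative, assuming the view stack from earlier.

```python
import numpy as np

def shaped_aperture(views, mask):
    """Average only the sub-aperture views selected by a binary mask.

    views: (k, k, H, W); mask: (k, k) of 0/1. The mask plays the role of
    a physical aperture stop cut into any shape: each selected view is a
    ray bundle through one part of the lens, so blur spots inherit the
    mask's outline.
    """
    w = mask.astype(float)
    return np.tensordot(w, views, axes=([0, 1], [0, 1])) / w.sum()
```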
Many more tricks for video
So as not to stray too far from the photography topic, everyone who's interested should check out the links above and below. They contain about half a dozen other interesting applications of plenoptic cameras.
- Video link: Watch Lytro Change Cinematography Forever
Light Field: More than a photo, less than VR
Usually, an explanation of plenoptics starts with light fields. And yes, from a scientific perspective, a plenoptic camera captures the light field, not just a photo. "Plenus" is Latin for "full" — the camera collects all the information about the rays of light. Just like a plenary session of Parliament.
Let's get to the bottom of this to understand what a light field is and why we need it.
Traditional photos are two-dimensional. When a ray hits the sensor, the corresponding pixel in the photo simply records its intensity. The camera doesn't care where the ray came from — whether it accidentally fell in from the side or was reflected off another object. The photo captures only the point where the ray intersects the surface of the sensor. So it's kinda 2D.
Light field images are similar, but with a new component — the origin and angle of each ray. The microlens array in front of the sensor is calibrated such that each lens samples a certain portion of the aperture of the main lens, and each pixel behind each lens samples a certain set of ray angles. And since light rays emanating from an object with different angles fall across different pixels on a light field camera's sensor, you can build an understanding of all the different incoming angles of light rays from this object. This means the camera effectively captures the ray vectors in 3D space. Like calculating the lighting of a video game, but the other way around — we're trying to catch the scene, not create it. The light field is the set of all the light rays in our scene — capturing both the intensity and angular information about each ray.
|There are a lot of mathematical models of light fields. Here's one of the most representative.|
The light field is essentially a visual model of the space around the camera, and we can mathematically compute any photo within this space. Point of view, depth of field, aperture — all of these are computable; however, the point of view can only be moved so far, limited by the entrance pupil of the main lens. That is, the freedom with which you can change the viewpoint depends on the breadth of perspectives you've captured, which is necessarily limited.
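In code, with the light field stored as a 4D array in the standard two-plane parameterization, a conventional photo is just an integral over the aperture plane — and stopping down becomes a matter of which (u, v) samples you include. A minimal sketch, assuming a discretely sampled field:

```python
import numpy as np

def render_photo(L):
    """Integrate a two-plane light field L(u, v, s, t) over the aperture.

    (u, v) indexes where a ray crosses the main-lens plane and (s, t)
    where it crosses the sensor plane. A conventional photo is the mean
    over all aperture positions for each sensor pixel.
    """
    return L.mean(axis=(0, 1))

def render_stopped_down(L, radius):
    """Simulate a smaller aperture by averaging only the (u, v) samples
    within `radius` of the aperture center — deepening the depth of field
    computationally instead of mechanically."""
    k = L.shape[0]
    c = (k - 1) / 2.0
    u, v = np.meshgrid(np.arange(k), np.arange(L.shape[1]), indexing="ij")
    inside = (u - c) ** 2 + (v - c) ** 2 <= radius ** 2
    return L[inside].mean(axis=0)
```

This is the sense in which "aperture is computable": the physical stop is replaced by a choice of which captured rays to sum.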
I love to draw an analogy with a city here. Photography is like your favorite path from your home to the bar you always remember, while the light field is a map of the whole town. Using the map, you can calculate any route from point A to B. In the same way, knowing the light field, we can calculate any photo.
For an ordinary photo it's overkill, I agree. But here comes VR, where light fields are one of the most promising areas of development.
Having a light field model of an object or a room allows you to see it from multiple perspectives, with motion parallax and other depth cues like realistic changes in texture and lighting as you move your head. You can even travel through the space, albeit to a limited degree. It feels like virtual reality, but it's no longer necessary to build a 3D model of the room. We can 'simply' capture all the rays inside it and calculate many different pictures from within that volume. Simply, yeah. That's what we're still fighting over.