Using PDAF pixels for depth mapping, bokeh simulation

Started Jun 14, 2018 | Discussions
JimKasson
JimKasson Forum Pro • Posts: 26,542
Using PDAF pixels for depth mapping, bokeh simulation
4

Interesting work here:

https://arxiv.org/pdf/1806.04171.pdf

Jim

mosswings Veteran Member • Posts: 9,323
Re: Using PDAF pixels for depth mapping, bokeh simulation
1

Well, now THAT was a thorough technical paper. Really quite amazing what's going on, and how the individual pieces of the computational art - some of which failed (Lytro), some relatively successful - are being integrated into an effective whole.

AiryDiscus Senior Member • Posts: 1,862
Re: Using PDAF pixels for depth mapping, bokeh simulation
2

JimKasson wrote:

Interesting work here:

https://arxiv.org/pdf/1806.04171.pdf

Jim

The paper is a tour de force.  Note that they use DPAF, not more general PDAF pixels.

mosswings Veteran Member • Posts: 9,323
Re: Using PDAF pixels for depth mapping, bokeh simulation

AiryDiscus wrote:

JimKasson wrote:

Interesting work here:

https://arxiv.org/pdf/1806.04171.pdf

Jim

The paper is a tour de force. Note that they use DPAF, not more general PDAF pixels.

Yes, that's the critical observation here. Without it you don't get the data you need for the best-quality fake bokeh.
It makes me wonder how things would improve with a QPAF arrangement (a recent Nikon patent permitting X-Y feature sensing). You'd think that X-Y light-field data would be better than X alone, but maybe not...
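For what it's worth, here's a toy numpy sketch (entirely my own construction, not from the paper) of the x-only disparity a dual-pixel patch gives you, and why a quad-pixel layout would add a second axis; the function name and search window are arbitrary:

import numpy as np

def dual_pixel_disparity(left, right, max_shift=4):
    """Estimate the horizontal (x-only) shift between the left and right
    dual-pixel sub-images of one patch by brute-force search; the sign and
    size of the shift track defocus, which is what the depth map is built from."""
    best_shift, best_cost = 0, np.inf
    for s in range(-max_shift, max_shift + 1):
        shifted = np.roll(right, s, axis=1)            # shift along x only
        valid = slice(max_shift, -max_shift)           # ignore wrapped-around columns
        cost = np.mean((left[:, valid] - shifted[:, valid]) ** 2)
        if cost < best_cost:
            best_shift, best_cost = s, cost
    return best_shift

# A QPAF layout would let you repeat the same search along axis=0 and get a
# 2-D disparity vector per patch instead of an x-only estimate.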

Wayne Larmon Forum Pro • Posts: 10,400
Google Pixel 2
1

JimKasson wrote:

Interesting work here:

https://arxiv.org/pdf/1806.04171.pdf

I'm not sure if this is needed, but I'm pretty sure the above paper is referring to the Google Pixel 2. Some older references:

Burst photography for high dynamic range and low-light imaging on mobile cameras
http://static.googleusercontent.com/media/hdrplusdata.org/en//hdrplus.pdf

Introducing the HDR+ Burst Photography Dataset (has links to other Google smartphone technology)
https://research.googleblog.com/2018/02/introducing-hdr-burst-photography.html

Google just made the tech behind its 'portrait mode' open source
https://www.dpreview.com/news/2915513901/google-just-made-the-tech-behind-its-portrait-mode-open-source

DPReview review: Google Pixel 2 is the best smartphone for stills photographers
https://www.dpreview.com/reviews/google-pixel-2

Why smartphone cameras are blowing our minds (Gets into depth mapping fake bokeh emulation)
https://www.dpreview.com/articles/8037960069/why-smartphone-cameras-are-blowing-our-minds

A Google employee discusses using HDR+ for DSLRs
HDR+ Pipeline Implementation
...same high level pipeline to raw images off a Canon 5D III DSLR, with modified algorithms.
http://timothybrooks.com/tech/hdr-plus/

"Burst photography for..." is the highest level reference and is comparable to the paper Jim linked to. IMO, the greatest advance of the Pixel 2 is the tiled stacking of multiple frames (HDR+) that is very robust, works very well and indeed produces image quality that is comparable to a sensor that has nine times the area. i.e., ~= M 4/3 sensor. This is described in depth in the "Burst..." paper. And is verified in the DPReview review of the Pixel 2.

Google Research shares a lot of information.  I'm still waiting for references to similar information from the manufacturers of other flagship phones (or for that matter ILCs.)  Compared to Google Research information, the other manufacturers just do a lot of hand waving.

Wayne

mosswings Veteran Member • Posts: 9,323
Re: Google Pixel 2
2

Wayne Larmon wrote:

I'm still waiting for references to similar information from the manufacturers of other flagship phones (or for that matter ILCs.) Compared to Google Research information, the other manufacturers just do a lot of hand waving.

Wayne

You may wait a long time.  It was very common at the beginning of my career for tech firms to be rather open with their intellectual property, because it was tied to a manufacturing and engineering capability that was hard to duplicate.  Thus, they could enjoy mindshare AND a predictable income stream from the instruments they made with that IP.  There wasn't this dependence on information asymmetry for market success - more a reliance on implementational competence.

The final company in my career had a completely different attitude towards publishing - no way, unless it was a press release for a new product - but the technology employed was an absolute company secret that was only worth what it brought to the company in terms of IC sales.  The lifecycle of IC products was simply too short, and the competition too intense, to disclose, or to devote any amount of an engineer's time to writing a disclosure.

Google is disclosing a lot, though if you study the papers it's only at a high level - the nuts and bolts are still secret, and that's OK...but it also says to me that they see value in mind share - that they're selling something more than widgets, and that they feel comfortable enough in their market position that disclosure won't kill their dominance.

AiryDiscus Senior Member • Posts: 1,862
Re: Google Pixel 2
3

mosswings wrote:

Wayne Larmon wrote:

I'm still waiting for references to similar information from the manufacturers of other flagship phones (or for that matter ILCs.) Compared to Google Research information, the other manufacturers just do a lot of hand waving.

Wayne

You may wait a long time. It was very common at the beginning of my career for tech firms to be rather open with their intellectual property, because it was tied to a manufacturing and engineering capability that was hard to duplicate. Thus, they could enjoy mindshare AND a predictable income stream from the instruments they made with that IP. There wasn't this dependence on information asymmetry for market success - more a reliance on implementational competence.

The final company in my career had a completely different attitude towards publishing - no way, unless it was a press release for a new product - but the technology employed was an absolute company secret that was only worth what it brought to the company in terms of IC sales. The lifecycle of IC products was simply too short, and the competition too intense, to disclose, or to devote any amount of an engineer's time to writing a disclosure.

Google is disclosing a lot, though if you study the papers it's only at a high level - the nuts and bolts are still secret, and that's OK...but it also says to me that they see value in mind share - that they're selling something more than widgets, and that they feel comfortable enough in their market position that disclosure won't kill their dominance.

A reasonable part of that is access to resources and expense.  The neural nets they use take more than a month of dedicated time on an entire datacenter built from the ground up for machine learning.  Since they operate their own datacenters, the cost to them is just the electricity bill plus amortization of the hardware.

I would imagine that is single-digit millions of dollars for them.  For a competitor to do it through standard service contracts, I would think the cost is high double- to triple-digit millions of dollars.

Few companies or institutions have reasonable-cost access to the computational resources needed.

Joofa Senior Member • Posts: 2,655
Re: Google Pixel 2
1

Wayne Larmon wrote:

Google Research shares a lot of information. I'm still waiting for references to similar information from the manufacturers of other flagship phones (or for that matter ILCs.) Compared to Google Research information, the other manufacturers just do a lot of hand waving.

Google has no philanthropy in mind when sharing information. The essential part is promoting TensorFlow, their deep-learning framework, which they have open-sourced. The paper on which this thread is based also uses TensorFlow. Amazon AWS and Microsoft Azure have competing offerings, which among other things use the open Apache project MXNet. And AWS has cleverly started offering both TensorFlow and MXNet in a distributed environment as part of its SageMaker platform.

Oh, and do check out the DeepLens camera from AWS.

(unknown member) Contributing Member • Posts: 975
Google's results can't compete with a dual camera

I haven't read the Google paper, but here's a shorter Google article about exactly the same topic:

https://ai.googleblog.com/2017/10/portrait-mode-on-pixel-2-and-pixel-2-xl.html

However, Google's results currently can't compete with a dual camera: https://youtu.be/DiTt4040TZw (or see https://support.google.com/pixelphone/forum/AAAAb4-OgUs7GdpZMw9lok and https://support.google.com/pixelphone/forum/AAAAb4-OgUs6-YPrbPkZXs ). One reason is that Google's method only works with very near objects; they admit this in the paper.

Google is expected to release a Pixel 3 in October, and everything seems to indicate that it will again avoid a dual rear camera, so maybe they found a way to improve their method. Interestingly, the Pixel 3 is expected to have a dual selfie camera, probably because the selfie camera doesn't have a dual-pixel sensor. This supports the assumption that Google found a way to improve their dual-pixel method, because I don't think Google would skip a dual rear camera if it produced worse results than the dual selfie camera.

Rishi Sanyal
Rishi Sanyal dpreview Admin • Posts: 875
Re: Google's results can't compete with a dual camera
3

noisephotographer wrote:

I haven't read the Google paper, but here's a shorter Google article about exactly the same topic:

https://ai.googleblog.com/2017/10/portrait-mode-on-pixel-2-and-pixel-2-xl.html

However, Google's results currently can't compete with a dual camera: https://youtu.be/DiTt4040TZw (or see https://support.google.com/pixelphone/forum/AAAAb4-OgUs7GdpZMw9lok and https://support.google.com/pixelphone/forum/AAAAb4-OgUs6-YPrbPkZXs ). One reason is that Google's method only works with very near objects; they admit this in the paper.

Google is expected to release a Pixel 3 in October, and everything seems to indicate that it will again avoid a dual rear camera, so maybe they found a way to improve their method. Interestingly, the Pixel 3 is expected to have a dual selfie camera, probably because the selfie camera doesn't have a dual-pixel sensor. This supports the assumption that Google found a way to improve their dual-pixel method, because I don't think Google would skip a dual rear camera if it produced worse results than the dual selfie camera.

First of all thanks Jim for posting. We'll repost on our home page shortly.

I am very perplexed as to the results in that YouTube video. We at DPReview have been using the Pixel 2, iPhone X, and Samsung S9+ in Portrait Mode for many months now. There is no question in our minds that:

Pixel 2 >> iPhone X >>> Samsung S9+

The S9+ isn't really a contender: we had a nearly 0% hit-rate of in-focus Portrait Mode images for anything but the most static subject. The iPhone X fares better, but is still considerably behind the Pixel 2 in hit-rate. The Samsung cannot focus reliably on subjects with its telephoto lens, and even when it (rarely) does, there is motion blur from far too slow shutter speeds even at light levels around 100 lux.

The iPhone X does at least simulate convincing blur and yields some usable in-focus images with non-static subjects, but its depth mask is not convincing - there are harsh transitions between in-focus and out-of-focus areas that undermine the effect.

The Pixel 2 has been the most robust across a wide variety of scenarios, helped by the high-resolution upsampling of its depth mask and the far better denoising of the mask itself, thanks to the 9-frame tile-based image averaging that no other smartphone we've tested to date comes close to. The Samsung supposedly takes 12 images that it groups into 3 sets of 4, aligned to temporally de-noise, but we often see alignment mismatches and subject movement caused by far too slow shutter speeds or poor alignment.

The Pixel 2 screws up every now and then, but the important message, I think, is this: many times we've shot images of everything from moving kids to landscapes, looked at the Pixel 2 result, and thought 'this was shot with an ILC', whereas there are few to no such examples from the Samsung that have made us pause like this, with the iPhone X a sort of middle ground. The iPhone does have other very positive attributes, though, like wide-gamut photography and Portrait Lighting.

I personally would pick the Pixel 2 over any other dual-camera smartphone for portraits without question. That said, I do wish the Pixel 2 would improve on other aspects of image quality where it currently suffers: white balance, skintones, blue skies, and far too center-weighted AF priority. When it comes to Portrait Mode, though, the Pixel 2 is the benchmark at this point in time.

And, agreed, a landmark paper. What they're doing is incredible, and the way in which these techniques can surpass optical approaches (like extended DOF for subject, shallower DOF for everything else) is inspiring.
----------------------
Rishi Sanyal, Ph.D
Science Editor | Digital Photography Review
dpreview.com (work) | rishi.photography (personal)

(unknown member) Contributing Member • Posts: 975
Re: Google's results can't compete with a dual camera

I don't have a Pixel 2, but I have seen a lot of reviews showing that the Pixel 2 generates worse depth maps (for objects more than a meter away) than the iPhone. In my opinion it looks awful, and I would prefer the iPhone in this regard. The other link that I posted confirms that multiple users have the same issue. Even DxOMark's sample images show this:

Pixel 2: https://cdn.dxomark.com/wp-content/uploads/2017/10/ref2_Bokeh-Outdoor_GooglePixel2.jpg

iPhone 8 Plus: https://cdn.dxomark.com/wp-content/uploads/2017/10/ref2_Bokeh-Outdoor_ip8Plus.jpg

Dpreview's opinion:

"The new Pixel 2 fares the worst in this comparison, with multiple aritfacts throughout the image." https://www.dpreview.com/news/7778846080/google-pixel-2-earns-highest-ever-dxomark-score-of-98-bests-apple

Or look at https://youtu.be/3b-VvmAZn14 at 6:23. Looks like exactly the same issue.

Google actually shows such issues in their paper, attributing them to lens aberrations and sensor defects, and also mentions that this dual-pixel method only works for "nearby scenes" or "macro-style photos". The authors also say that "The small stereo baseline of the dual-pixels causes these [disparity] estimates to be strongly affected by optical aberrations."

By the way, I also don't like Google's decision not to simulate depth of field realistically. In the paper they indicate that they ignore realistic depth of field in order to keep more of the people/dogs in the photo sharp. I don't like this, and it's an inconsistent philosophy, because Google adds noise to the blurred background for a more realistic photo. Apple isn't that different: Apple only applies blur to the background, yet even blurs the chest when it is in the focal plane.
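To be fair, the "blur with a disc, then add noise back" recipe is easy to picture; here's a generic OpenCV sketch (the kernel radius, grain level, and function name are my own made-up choices, not Google's or Apple's actual parameters):

import numpy as np
import cv2

def fake_bokeh(img, mask, radius=15, grain=2.0):
    """img: float32 BGR image; mask: float32 0..1 map (1 = in-focus subject).
    Blur the background with a disc (not Gaussian) kernel, then add a bit of
    noise so the synthetically smooth background matches the foreground grain."""
    # disc-shaped point-spread function, normalized to sum to 1
    k = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                  (2 * radius + 1, 2 * radius + 1)).astype(np.float32)
    k /= k.sum()
    background = cv2.filter2D(img, -1, k)
    # re-grain the blurred background (blurring also averaged away its noise)
    background += np.random.normal(0.0, grain, img.shape).astype(np.float32)
    mask3 = cv2.merge([mask, mask, mask])
    return mask3 * img + (1.0 - mask3) * background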

(unknown member) Contributing Member • Posts: 975
Re: Google's results can't compete with a dual camera

And here's another example:

https://youtu.be/-ldRVlQUWUI at 1:40: horrible edges (look at the top), and it gets even worse at 1:48

And here's another one:

Pixel 2: https://www.imaging-resource.com/PRODS/google-pixel-2/Y-GooglePixel2-PortraitMode-024.HTM

iPhone 8 Plus: https://www.imaging-resource.com/PRODS/apple-iphone-8-plus/Y-iPhone8Plus-PortaitMode-001.HTM

And here's another example (scene 5):

Pixel 2 vs Note 9: https://www.phonearena.com/news/Samsung-Galaxy-Note-9-vs-Pixel-2-XL-night-camera-comparison_id108172 (scene 5: Pixel 2: very unrealistic depth at the right side and at the top)

Another example:

https://www.anandtech.com/show/13392/the-iphone-xs-xs-max-review-unveiling-the-silicon-secrets/10 (see Portrait Mode)

And here's another one:

https://youtu.be/MRZ6LJtHQlY at 4:36 (see the right side)

Entropy512 Senior Member • Posts: 4,598
Re: Google Pixel 2

Wayne Larmon wrote:

JimKasson wrote:

Interesting work here:

https://arxiv.org/pdf/1806.04171.pdf

A Google employee discusses using HDR+ for DSLRs
HDR+ Pipeline Implementation
...same high level pipeline to raw images off a Canon 5D III DSLR, with modified algorithms.
http://timothybrooks.com/tech/hdr-plus/

"Burst photography for..." is the highest level reference and is comparable to the paper Jim linked to. IMO, the greatest advance of the Pixel 2 is the tiled stacking of multiple frames (HDR+) that is very robust, works very well and indeed produces image quality that is comparable to a sensor that has nine times the area. i.e., ~= M 4/3 sensor. This is described in depth in the "Burst..." paper. And is verified in the DPReview review of the Pixel 2.

Google Research shares a lot of information. I'm still waiting for references to similar information from the manufacturers of other flagship phones (or for that matter ILCs.) Compared to Google Research information, the other manufacturers just do a lot of hand waving.

Wayne

Interestingly, a lot of what's described in that article by Tim Brooks is pretty close to what can potentially be achieved using hugin's align_image_stack, hugin_stacker (new since the last time I did stacking), and enfuse. A few approaches involving burst image stacking made the rounds back in late December. Unfortunately, the post I made about a hugin-based workflow (from before hugin_stacker was released, which will likely perform better) has disappeared from DPR without a trace.

The one delta is that the alignment approach for merging moving objects within the frame appears to be significantly more sophisticated/robust than hugin's align_image_stack.

Edit: In fact, he cites the same Mertens paper from 2007 that enfuse implements - so his exposure fusion strategy is likely near-identical.
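If anyone wants to try that Mertens-style fusion without installing hugin/enfuse, OpenCV happens to ship an implementation; a minimal sketch (the filenames are placeholders, and it assumes the frames are already aligned):

import cv2

# already-aligned exposures of the same scene (placeholder filenames)
frames = [cv2.imread(p) for p in ("burst_0.jpg", "burst_1.jpg", "burst_2.jpg")]

# Mertens et al. 2007 exposure fusion: weights each pixel by contrast,
# saturation and well-exposedness, then blends - no HDR radiance map needed
fusion = cv2.createMergeMertens().process(frames)

cv2.imwrite("fused.jpg", (fusion * 255).clip(0, 255).astype("uint8"))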

One big difference between smartphones and most ILCs is that the small sensor size lets A9-style stacked sensors be pretty much standard, allowing much faster readout and hence much faster bursts. I'm pretty sure Qualcomm's newer ISPs offer much higher sustained demosaicer/scaler multi-frame bandwidth than the 500 MPixel/s we've seen in all recent Sonys.


Context is key. If I have quoted someone else's post when replying, please do not reply to something I say without reading text that I have quoted, and understanding the reason the quote function exists.

(unknown member) Contributing Member • Posts: 975
Pixel 3: Google's results still can't compete with a dual camera

The new Pixel 3 still has these issues sometimes, see

https://youtu.be/_dqFwgiTkSI at 5:00 (the right edge has much less blur than the left edge)

https://youtu.be/rSFNpJJeo4c at 8:56 - 9:09

Or https://youtu.be/S9qNHYtNsik at 7:40 (Pixel 3: the left edge has a slightly higher amount of blur than the right side, the iPhone Xs is more consistent)

AiryDiscus Senior Member • Posts: 1,862
Re: Pixel 3: Google's results still can't compete with a dual camera

noisephotographer wrote:

The new Pixel 3 still has these issues sometimes, see

https://youtu.be/_dqFwgiTkSI at 5:00 (the right edge has much less blur than the left edge)

https://youtu.be/rSFNpJJeo4c at 8:56 - 9:09

Or https://youtu.be/S9qNHYtNsik at 7:40 (Pixel 3: the left edge has a slightly higher amount of blur than the right side, the iPhone Xs is more consistent)

It's pretty in vogue at the moment to lump heaps of praise onto Google for what seems like natural or even obvious advancements in their depth mapping [enhancement] algorithm.  E.g. eliminating discontinuities in the map and smoothing it.  Seems pretty obvious to me.

Another "obvious" enhancement here.  Detect medium to large scale boxes in the background to identify a plane and skew of plane from.  Use that to constrain the depth map and avoid this sort of side-to-side error.

The edges are fairly in focus at the time of capture, so it should be trivial to detect them...
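Something like a least-squares plane fit over the background would be the cheap version of that; a toy numpy sketch (my own, assuming you already have a per-pixel disparity map and a background mask):

import numpy as np

def fit_background_plane(disparity, bg_mask):
    """Fit disparity ~ a*x + b*y + c over background pixels and return the
    fitted plane; substituting it for the raw background disparity removes
    the side-to-side wobble seen in those videos.
    A production version would use RANSAC or a robust loss to reject
    foreground pixels that leak into the mask."""
    ys, xs = np.nonzero(bg_mask)
    A = np.column_stack([xs, ys, np.ones_like(xs)])
    coeffs, *_ = np.linalg.lstsq(A, disparity[ys, xs], rcond=None)
    yy, xx = np.mgrid[0:disparity.shape[0], 0:disparity.shape[1]]
    return coeffs[0] * xx + coeffs[1] * yy + coeffs[2]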

Rishi Sanyal
Rishi Sanyal dpreview Admin • Posts: 875
Re: Pixel 3: Google's results still can't compete with a dual camera

AiryDiscus wrote:

noisephotographer wrote:

The new Pixel 3 still has these issues sometimes, see

https://youtu.be/_dqFwgiTkSI at 5:00 (the right edge has much less blur than the left edge)

https://youtu.be/rSFNpJJeo4c at 8:56 - 9:09

Or https://youtu.be/S9qNHYtNsik at 7:40 (Pixel 3: the left edge has a slightly higher amount of blur than the right side, the iPhone Xs is more consistent)

It's pretty in vogue at the moment to lump heaps of praise onto Google for what seems like natural or even obvious advancements in their depth mapping [enhancement] algorithm. E.g. eliminating discontinuities in the map and smoothing it. Seems pretty obvious to me.

Another "obvious" enhancement here. Detect medium to large scale boxes in the background to identify a plane and skew of plane from. Use that to constrain the depth map and avoid this sort of side-to-side error.

The edges are fairly in focus at the time of capture, so it should be trivial to detect them...

I think in some cases this is caused by Google's transparency and Apple's arguable lack thereof, though it was great to see Apple do a bit more of a technical dive into the camera in this year's presentation. Both companies are doing fascinating things, each with strengths and weaknesses. On that note, there will be some papers/Google Blog posts on the new technologies and enhancements soon.

Do you know how this smoothing is done? It was quite obvious in the initial depth maps I saw (published in my article here). I was told the depth maps come from a pipeline that combines the stereo depth map generated by the split pixels with a learning-based approach. I'm trying to get my head around what the datasets would be to train such a system. In the training dataset, what do they use as the ground truth? And is the input in that training set just the output of the stereo algorithm?

In initial testing of the 3 vs. the 2 in Portrait mode, the background is generally more uniformly blurred, but I still see less-defocused backgrounds near the sides (sometimes symmetrical, sometimes not).
-------------------------
Rishi Sanyal, Ph.D
Science Editor | Digital Photography Review

beatboxa Senior Member • Posts: 6,046
Re: Pixel 3: Google's results still can't compete with a dual camera

Rishi Sanyal wrote:

AiryDiscus wrote:

noisephotographer wrote:

The new Pixel 3 still has these issues sometimes, see

https://youtu.be/_dqFwgiTkSI at 5:00 (the right edge has much less blur than the left edge)

https://youtu.be/rSFNpJJeo4c at 8:56 - 9:09

Or https://youtu.be/S9qNHYtNsik at 7:40 (Pixel 3: the left edge has a slightly higher amount of blur than the right side, the iPhone Xs is more consistent)

It's pretty in vogue at the moment to lump heaps of praise onto Google for what seems like natural or even obvious advancements in their depth mapping [enhancement] algorithm. E.g. eliminating discontinuities in the map and smoothing it. Seems pretty obvious to me.

Another "obvious" enhancement here. Detect medium to large scale boxes in the background to identify a plane and skew of plane from. Use that to constrain the depth map and avoid this sort of side-to-side error.

The edges are fairly in focus at the time of capture, so it should be trivial to detect them...

I think in some cases this is caused by Google's transparency and Apple's arguable lack thereof, though it was great to see Apple do a bit more of a technical dive into the camera in this year's presentation. Both companies are doing fascinating things, each with strengths and weaknesses. On that note, there will be some papers/Google Blog posts on the new technologies and enhancements soon.

Do you know how this smoothing is done? It was quite obvious in the initial depth maps I saw (published in my article here). I was told the depth maps come from a pipeline that combines the stereo depth map generated by the split pixels with a learning-based approach. I'm trying to get my head around what the datasets would be to train such a system. In the training dataset, what do they use as the ground truth? And is the input in that training set just the output of the stereo algorithm?

In initial testing of the 3 vs. the 2 in Portrait mode, the background is generally more uniformly blurred, but I still see less-defocused backgrounds near the sides (sometimes symmetrical, sometimes not).
-------------------------
Rishi Sanyal, Ph.D
Science Editor | Digital Photography Review

I can't imagine depth maps being particularly precise with such small apertures at typical distances.

As far as training algorithms go, the obvious thing would be to start with parallax/DPAF. But that may or may not offer the precision or complexity they'd need to assess a complex scene--particularly one with varying levels of detail.

So they'd need to combine this with solid/edge object detection and probably lighting detection (shadows vs. predicted lighting).

But I think it's likely that they captured many different types of scenes using multiple cameras spaced far apart to do proper depth mapping, then used this information to train an ML algorithm--and then refined the algorithm by comparing its predicted mapping to the actual mapping from a single sensor.

So, not a single type of scene or dataset, but rather iterations comparing known values (actual depth maps from multiple cameras) to predicted values (from a single camera).
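If it really works that way, the core of it reduces to ordinary supervised regression; here's a deliberately generic PyTorch sketch (the model, loss choice, and data loader are all my assumptions, not Google's actual pipeline):

import torch
import torch.nn as nn

# `model` maps a stack of dual-pixel sub-images to a predicted depth map;
# `loader` yields (dual_pixel_input, multi_camera_ground_truth_depth) pairs
def train_depth_net(model, loader, epochs=10, lr=1e-4):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.L1Loss()                      # simple per-pixel loss
    for _ in range(epochs):
        for dp_input, gt_depth in loader:
            pred = model(dp_input)             # predicted depth from one camera
            loss = loss_fn(pred, gt_depth)     # compare to multi-camera "truth"
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model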

AiryDiscus Senior Member • Posts: 1,862
Re: Pixel 3: Google's results still can't compete with a dual camera
1

Rishi Sanyal wrote:

AiryDiscus wrote:

noisephotographer wrote:

The new Pixel 3 still has these issues sometimes, see

https://youtu.be/_dqFwgiTkSI at 5:00 (the right edge has much less blur than the left edge)

https://youtu.be/rSFNpJJeo4c at 8:56 - 9:09

Or https://youtu.be/S9qNHYtNsik at 7:40 (Pixel 3: the left edge has a slightly higher amount of blur than the right side, the iPhone Xs is more consistent)

It's pretty in vogue at the moment to lump heaps of praise onto Google for what seems like natural or even obvious advancements in their depth mapping [enhancement] algorithm. E.g. eliminating discontinuities in the map and smoothing it. Seems pretty obvious to me.

Another "obvious" enhancement here. Detect medium to large scale boxes in the background to identify a plane and skew of plane from. Use that to constrain the depth map and avoid this sort of side-to-side error.

The edges are fairly in focus at the time of capture, so it should be trivial to detect them...

I think in some cases this is caused by Google's transparency and Apple's arguable lack thereof, though it was great to see Apple do a bit more of a technical dive into the camera in this year's presentation. Both companies are doing fascinating things, each with strengths and weaknesses. On that note, there will be some papers/Google Blog posts on the new technologies and enhancements soon.

I think it is about public perception and which of the two is the bigger 800 lb gorilla.  Apple is a ~$229B/yr company that takes home about $61B (27%) of that.  Google is a $111B/yr company that takes home about $26B (23%) of that.  Apple is then ~15% more profitable, but it is also a much more secure company, since hardware trends move much more slowly than software ones, and Apple's core business is hardware, not SaaS.

Apple doesn't need tech PR about their fake bokeh.  They just need consumers to know their software magic and fancy A12 make their pictures blurrier, er, nicer than the previous phone's.  Plus they're riding relatively high on pro-privacy coverage right now, while Google is, well, the opposite.

Google's customer base is more techy in nature, so they gobble up highly diluted [computer] science to feed their sense of superiority, er, confidence in Google's products.  And Google needs the PR to distract from the news that all Android phones phone home to Google with your location to a precision of a meter 14 times an hour, or that a stock Pixel phone with no accounts activated chews through 10x the data of a stock iPhone with no accounts activated.

I'm very critical of Google in general, but also of all this ML image enhancement stuff.  Most of what they're doing seems obvious to me.  I think the (IMO) slow pace partly reflects the incomprehensibly vast computational resources needed to train these models, but also that much of the team is probably career software engineers with little background in optics or geometry, so the physics is all new to them, which slows development.

After all, how many phones come out making fake bokeh with Gaussian blurs? Because all signals are Gaussians, right, and optics is no different...

Do you know how this smoothing is done? It was quite obvious in the initial depth maps I saw (published in my article here). I was told the depth maps come from a pipeline that combines the stereo depth map generated by the split pixels with a learning-based approach. I'm trying to get my head around what the datasets would be to train such a system. In the training dataset, what do they use as the ground truth? And is the input in that training set just the output of the stereo algorithm?

I don't know how it's done, but it's either supervised learning, where they have ground-truth depth maps for scenes that are used to score the guesses the algorithm produces, or unsupervised learning, where they let it rip without knowing what the truth is.

This is probably a tile-based KNN, as Google is so fond of using in almost everything these days.  The image would be broken into m×n (often n×n) tiles and the algorithm run on them independently.  The edges are not computed (the image is cropped to an integer number of tiles) in that case.

I assume that to smooth they do what every ML approach does and throw Gaussians at it: suppress "noisy" tiles by using a superposition of Gaussians from the neighbors to guess whether a tile is on.  They may do this within the tile too, with an edge-preservation mask.

In this model the edge pixels would be assigned no blur, but then the neighboring ones would "spill over" (attenuated) and fill in a bit of blur.  How close that gets to the edge depends on the size of the tiles relative to the image dimensions.

Or, instead of tiles, they do feature detection and use a superposition of Gaussians to fill within the boundary of each feature, preserving the edges.

Or something else entirely.

(so in short, I don't know)
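For concreteness, the tile-smoothing guess above might look something like this toy scipy sketch (pure speculation on my part, not Google's method):

import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_tile_map(tile_disparity, edge_tiles, sigma=1.0):
    """tile_disparity: (m, n) array of per-tile disparity estimates.
    edge_tiles: boolean (m, n) mask of tiles containing strong edges,
    which are left untouched to preserve subject boundaries."""
    smoothed = gaussian_filter(tile_disparity, sigma=sigma)
    return np.where(edge_tiles, tile_disparity, smoothed)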

These sorts of models are phenomenally expensive to train, but Google can spin the wheels on their own datacenters at cost, and what's a few tera-kilowatt-hours of power in the name of blurry cell phone pictures.

Rishi Sanyal
Rishi Sanyal dpreview Admin • Posts: 875
Re: Pixel 3: Google's results still can't compete with a dual camera

AiryDiscus wrote:

After all, how many phones come out making fake bokeh with Gaussian blurs? Because all signals are Gaussians, right, and optics is no different...

I noticed this with the earlier iPhones, but most of the (admittedly higher end) devices we look at today use disc-shaped blur.

FWIW, now after shooting hundreds more Portrait Mode images with the Pixel 3, I have to say whatever they've done to improve it is really working - far fewer errors with complex objects / scenes, and generally cuts around hair very well, save for some nightmare scenarios. I am seeing overall considerably better performance than the Pixel 2.

Still need to do many more direct comparisons before coming to a more informed opinion of how it performs vs. iPhone XS.

-Rishi
