Why RGGB?

Great Bustard

Why not RYGB, especially given how much yellow comes up (skin tones and browns)? I know Sony tried RGTB (T being teal) a while back, but that seems the wrong direction.

Is there an advantage to using the same green filter twice? Or, more to the point, I suppose, is there no advantage to not using the same green filter twice?

Or have I missed the plot altogether? The two green filters are not the same, just closer together than yellow and green?
 
Why not RYGB, especially given how much yellow comes up (skin tones and browns)? I know Sony tried RGTB (T being teal) a while back, but that seems the wrong direction.

Is there an advantage to using the same green filter twice? Or, more to the point, I suppose, is there no advantage to not using the same green filter twice?

Or have I missed the plot altogether? The two green filters are not the same, just closer together than yellow and green?
The advantage of using the same filter twice is that you get twice the luminance information. This extra luminance info is also used to calculate and compensate for the color-aliased portion of the luminance info that is adequately sampled by the green pixels but not by the red and blue pixels. Check the details of my quincunx demosaic thread to see exactly what I mean.
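To make the bookkeeping concrete, here is a minimal sketch, assuming a plain RGGB layout with R at the top-left (not the quincunx case); the function name and the toy mosaic are just for illustration. It shows that the green plane carries twice as many samples as red or blue, which is what gives the denser luminance-like record:

```python
import numpy as np

def split_rggb(mosaic):
    """Split an RGGB Bayer mosaic (2-D array, even dims) into its four planes.

    Layout assumed per 2x2 quad:  R  G1
                                  G2 B
    """
    r  = mosaic[0::2, 0::2]
    g1 = mosaic[0::2, 1::2]
    g2 = mosaic[1::2, 0::2]
    b  = mosaic[1::2, 1::2]
    return r, g1, g2, b

# Toy 4x4 mosaic: the green plane has twice the samples of red or blue,
# so a crude luminance proxy can be formed at twice the density.
mosaic = np.arange(16, dtype=float).reshape(4, 4)
r, g1, g2, b = split_rggb(mosaic)
green_mean = 0.5 * (g1 + g2)              # per-quad average of the two greens
print(r.size, g1.size + g2.size, b.size)  # -> 4 8 4 samples in a 4x4 mosaic
```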
 
Why not RYGB, especially given how much yellow comes up (skin tones and browns)? I know Sony tried RGTB (T being teal) a while back, but that seems the wrong direction.

Is there an advantage to using the same green filter twice? Or, more to the point, I suppose, is there no advantage to not using the same green filter twice?

Or have I missed the plot altogether? The two green filters are not the same, just closer together
If the objective is to reduce capture metameric error, i.e. have the camera see the same way we do, there are large advantages to a fourth filter. Its placement is a complex decision, but having it not be too far away from the human luminance response curve (but in a different direction than the "green" filter) would have the advantage of allowing the parts of the spectrum that figure heavily into luminance to be sampled more often than those that don't (red, and, especially, blue). It's not clear that the best spectrum for maximizing luminance would be the best one for minimizing capture metameric error; in fact, that's not likely to be the case. Demosaicing would be trickier than with three colors, and to the degree that the two greenish filters diverged from the luminance response, it would produce less accurate demosaiced images at high spatial frequencies (see DSPographer's post).

Adding a fourth filter gives you the opportunity to rethink the other three, especially the green one. If capture metameric error were minimized, cameras could do better in lots of different lighting conditions, assuming that having the camera's illuminant metamerism match human illuminant metamerism is the relevant criterion.
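As a rough illustration of the capture-metameric-error idea, the sketch below fits a least-squares colour matrix from camera responses to tristimulus values and compares the residual for a 3-channel versus a 4-channel filter set. All of the spectral data here are made-up placeholders (not real filter or CMF curves), and the helper name is mine; the point is only that adding a channel can leave the fit the same or make it better:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder spectral data (not real measurements): 31 bands, e.g. 400-700 nm.
n_bands, n_patches = 31, 24
reflectances = rng.uniform(0.05, 0.95, size=(n_patches, n_bands))
cmfs = np.abs(rng.normal(size=(n_bands, 3)))        # stand-in for colour matching functions
cam3 = np.abs(rng.normal(size=(n_bands, 3)))        # 3-channel camera sensitivities
cam4 = np.hstack([cam3, np.abs(rng.normal(size=(n_bands, 1)))])  # add a 4th filter

xyz = reflectances @ cmfs            # "true" tristimulus values per patch

def fit_residual(sens):
    """Least-squares colour matrix from camera space to XYZ; return RMS residual."""
    resp = reflectances @ sens                       # camera responses per patch
    M, *_ = np.linalg.lstsq(resp, xyz, rcond=None)   # (channels x 3) matrix
    return np.sqrt(np.mean((resp @ M - xyz) ** 2))

print(fit_residual(cam3), fit_residual(cam4))  # 4 channels can only do as well or better
```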

Jim
 
Why not RYGB, especially given how much yellow comes up (skin tones and browns)? I know Sony tried RGTB (T being teal) a while back, but that seems the wrong direction.

Is there an advantage to using the same green filter twice? Or, more to the point, I suppose, is there no advantage to not using the same green filter twice?

Or have I missed the plot altogether? The two green filters are not the same, just closer together than yellow and green?
It's mostly a question of unknowns...

More accurately, how many unknowns you can have in a system before the solution space becomes unstable. When you have two identical greens per four pixels, your number of unknowns is halved for most intents and purposes. More than that in many cases.

You have to include the ultimate destabilizer here, noise.

Consider that you have four colours in a 2x2 repeating grid. If the optical resolution vs the digitizing resolution is kept low, so that the best the lens can do is a Rayleigh separation of about 2 pixel-pitch (p-p) distances, most scenarios can be said to contain stable solutions: solutions where you can compute the luminance part AND the chrominance part to a reasonable accuracy for each pixel.

Add in some noise, and that stable solution falls apart. It destabilizes the system so that you have to choose whether to attribute deviations in the estimate to "luminance errors" or "chrominance errors", and then compensate for that.

Now add in "better optical sharpness"... which is the condition most consumer cameras - except the smallest-sensor, high-res smartphone modules - work under today. Better sharpness than 2 p-p distances means that there are NO stable solutions for the chrominance / luminance parts. Even without noise... Even PERFECT, noiseless data will require the interpolation algorithm to make very unstable guesses about the conditions that caused the given data, since there will be many (countless, up to the value resolution limit...) solutions to the problem. This is why raw converters differ from each other in how they render small detail. There IS no correct solution, given that the raw converter can only know what the data tells it - it has no knowledge of what really was in front of the lens.

Now take the two worst cases (oversharp images AND noise) - and you get total chaos. At that point, you have no stable solutions no matter what you do. It's all guesswork, and you've lowered the resolution of the image by a factor of 2.
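A sketch of the bookkeeping behind the "unknowns" argument, for a single RGGB quad; this is only the counting, not a demosaic, and the luma/chroma split is one common simplifying assumption rather than anything any particular converter does:

```python
# Bookkeeping for one 2x2 RGGB quad (a counting sketch, not a demosaic):
measurements = 4                 # one sample per photosite: R, G, G, B
full_rgb_unknowns = 4 * 3        # R, G and B wanted at every one of the 4 sites
# Common simplification: chrominance held constant across the quad
# (two chroma values) while luminance varies per site (four values).
luma_chroma_unknowns = 4 + 2
print(measurements, full_rgb_unknowns, luma_chroma_unknowns)   # -> 4 12 6
# Even the simplified model is under-determined from one quad alone; the
# missing constraints have to come from neighbouring quads, i.e. from
# assuming the optics keep detail below roughly 2 pixel pitches.
```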
 
Why not RYGB, especially given how much yellow comes up (skin tones and browns)? I know Sony tried RGTB (T being teal) a while back, but that seems the wrong direction.

Is there an advantage to using the same green filter twice? Or, more to the point, I suppose, is there no advantage to not using the same green filter twice?

Or have I missed the plot altogether? The two green filters are not the same, just closer together than yellow and green?
It's mostly a question of unknowns...

More accurately, how many unknowns you can have in a system before the solution space becomes unstable. When you have two identical greens per four pixels, your number of unknowns is halved for most intents and purposes. More than that in many cases.

You have to include the ultimate destabilizer here, noise.

Consider that you have four colours in a 2x2 repeating grid. If the optical resolution vs the digitizing resolution is kept low, so that the best the lens can do is a Rayleigh separation of about 2 pixel-pitch (p-p) distances, most scenarios can be said to contain stable solutions: solutions where you can compute the luminance part AND the chrominance part to a reasonable accuracy for each pixel.

Add in some noise, and that stable solution falls apart. It destabilizes the system so that you have to choose whether to attribute deviations in the estimate to "luminance errors" or "chrominance errors", and then compensate for that.

Now add in "better optical sharpness"... which is the condition most consumer cameras - except the smallest-sensor, high-res smartphone modules - work under today. Better sharpness than 2 p-p distances means that there are NO stable solutions for the chrominance / luminance parts. Even without noise... Even PERFECT, noiseless data will require the interpolation algorithm to make very unstable guesses about the conditions that caused the given data, since there will be many (countless, up to the value resolution limit...) solutions to the problem. This is why raw converters differ from each other in how they render small detail. There IS no correct solution, given that the raw converter can only know what the data tells it - it has no knowledge of what really was in front of the lens.

Now take the two worst cases (oversharp images AND noise) - and you get total chaos. At that point, you have no stable solutions no matter what you do. It's all guesswork, and you've lowered the resolution of the image by a factor of 2.
This makes a great deal of sense to me and explains that bit about detail in RAW conversion which I always wondered about. From what you've said, it's amazing to me that Bayer works as well as it does.
 
I think the explanation is/can be a lot simpler than the previous replies...

The human eye only has three colour detectors, referred to as L, M, S (i.e. long, medium and short wavelength, roughly corresponding to the R, G and B parts of the visible spectrum).

The ideal colour image sensor would be able to detect and record signals with exactly the same response characteristics as the human eye's three colour sensors.

The simplest way to do this (certainly not the only way) is to replicate the way the eye works/records, i.e. with three sensor elements with responses closely matching the response characteristics of the eye.
Why not RYGB, especially given how much yellow comes up (skin tones and browns)? I know Sony tries RGTB (T being teal) a while back, but that seems the wrong direction.
Having an image sensor with more than 4 'colours' (colour filters) is, very often, either something of a technical/sales & marketing 'gimmick' - or perhaps, being more generous, might be a means of better reproducing the response characteristics of the eye by 'mixing' colour channel signals - or sometimes a means of extending dynamic range (paler, or 'white', pixels).
Is there an advantage to using the same green filter twice?
Yes...

Obviously a '3 colour' sensor doesn't fit well into a square pixel matrix.

A '4 element' pattern of 'RGGB' does fit well into a square pixel matrix.

But why 'RGGB' and not 'RGBR' or 'RBBG'...

Well, the human vision system (eye + brain) is centred on, most sensitive to, and gets most of its spatial information from the middle part of the visible spectrum, i.e. 'green~yellow' - so it makes sense to have more 'green' pixel sensors than sensors for the other two ends of the visible spectrum.

To quote from Wikipedia article 'Bayer filter' ... "He used twice as many green elements as red or blue to mimic the physiology of the human eye".
Or, more to the point, I suppose, is there no advantage to not using the same green filter twice?
There is no 'obvious' advantage that I can see.

Perhaps a single 'double size' green pixel could be read out with lower total read-noise (i.e. better SNR) at the cost of some loss of spatial resolution - but usually the green channel has the highest signal level anyway, and the fact that there are normally twice as many green pixels in itself reduces the aggregate green-channel read-noise relative to the R & B channels.
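A back-of-the-envelope sketch of that trade-off, with purely illustrative numbers (signal and read noise in electrons; nothing here is measured from a real sensor):

```python
import numpy as np

# Read-noise bookkeeping sketch (illustrative numbers, not measured data).
signal_e = 200.0      # electrons collected per normal-size green photosite
read_e   = 3.0        # read noise per readout, electrons RMS

# Two separate green pixels, summed after readout:
sig_two   = 2 * signal_e
noise_two = np.sqrt(2 * signal_e + 2 * read_e**2)   # shot noise + two reads

# One hypothetical double-area green pixel, read out once:
sig_one   = 2 * signal_e
noise_one = np.sqrt(2 * signal_e + read_e**2)       # shot noise + one read

print(sig_two / noise_two, sig_one / noise_one)     # the single read is slightly better
```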
Or have I missed the plot altogether? The two green filters are not the same, just closer together than yellow and green?
The green filter(s) are nearly always the same.

Having said that, it is quite common for the mean signal of the 'G(R-row)' pixels to differ from that of the 'G(B-row)' pixels, for two reasons: 1) the different 'G' rows often use different read-out channels, which can have gain/offset discrepancies; 2) where the 'G' read-out channels share/alternate reading with the 'R' or 'B' pixels in the same row, there can easily be signal 'crosstalk/bleeding'. Both problems can (but sometimes/often may not) be corrected/compensated for, to varying degrees.
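A crude sketch of a global correction for that G(R-row)/G(B-row) imbalance, assuming an RGGB mosaic held in a numpy array; the function name is mine, and real corrections are usually local and more careful than a single global gain:

```python
import numpy as np

def equalize_green_planes(mosaic):
    """Crude global green-balance fix for an RGGB mosaic (sketch only).

    Scales the G plane in the blue rows so its mean matches the G plane in
    the red rows. Assumes R at the top-left and even image dimensions.
    """
    g_r_rows = mosaic[0::2, 1::2]          # G sites sharing rows with R
    g_b_rows = mosaic[1::2, 0::2]          # G sites sharing rows with B
    gain = np.mean(g_r_rows) / np.mean(g_b_rows)
    out = mosaic.astype(float).copy()
    out[1::2, 0::2] *= gain                # bring the second green plane into line
    return out, gain
```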
 
I think the explanation is/can be a lot simpler than the previous replies...

The human eye only has three colour detectors, referred to as L, M, S (i.e. long, medium and short wavelength, roughly corresponding to the R, G and B parts of the visible spectrum).

The ideal colour image sensor would be able to detect and record signals with exactly the same response characteristics as the human eye's three colour sensors.

The simplest way to do this (certainly not the only way) is to replicate the way the eye works/records, i.e. with three sensor elements with responses closely matching the response characteristics of the eye.
This is a common fallacy.

It might be true if you were designing an eye for a robot, but a camera is not an eye.

The purpose of a photo or video system is to provide a copy of the light that would have arrived at your eye if you had been there. It is not to "see" the light but to freeze and store it.

The perfect camera would be like a sheet of glass with a "freeze" button. It would record for playback the exact spectrum curve of the light hitting it from each direction. As this needs a huge amount of data, ways have been found to simplify the image. Using only three colour records is one of these. It is a very crude simplification, but the human visual system can work quite well from very limited clues, as when you look at a pen-and-ink drawing.

The three-primary approach is the cheapest and worst arrangement that is just about acceptable. Skin colours suffer the most.
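To illustrate how drastic the three-number simplification is, here is a small sketch (with made-up sensitivity curves, and ignoring physical non-negativity of spectra) that constructs two different spectra which project to identical three-channel records, i.e. metamers for that sensor:

```python
import numpy as np

rng = np.random.default_rng(1)
n_bands = 31                                   # e.g. 400-700 nm in 10 nm steps
sens = np.abs(rng.normal(size=(n_bands, 3)))   # placeholder 3-channel sensitivities

spectrum_a = rng.uniform(size=n_bands)
# Build a different spectrum with the same 3-channel projection (a metamer):
# add a component from the null space of the sensitivities.
_, _, vt = np.linalg.svd(sens.T)
null_vec = vt[-1]                              # orthogonal to all three sensitivity curves
spectrum_b = spectrum_a + 0.2 * null_vec       # may go slightly negative; it's a toy

print(sens.T @ spectrum_a)                     # three numbers...
print(sens.T @ spectrum_b)                     # ...the same, though the spectra differ
```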
 
I think the explanation is/can be a lot simpler than the previous replies...

The human eye only has three colour detectors, referred to as L, M, S (i.e. long, medium and short wavelength, roughly corresponding to the R, G and B parts of the visible spectrum).

The ideal colour image sensor would be able to detect and record signals with exactly the same response characteristics as the human eye's three colour sensors.

The simplest way to do this (certainly not the only way) is to replicate the way the eye works/records, i.e. with three sensor elements with responses closely matching the response characteristics of the eye.
This is a common fallacy.

It might be true if you were designing an eye for a robot, but a camera is not an eye.

The purpose of a photo or video system is to provide a copy of the light that would have arrived at your eye if you had been there. It is not to "see" the light but to freeze and store it.

The perfect camera would be like a sheet of glass with a "freeze" button. It would record for playback the exact spectrum curve of the light hitting it from each direction. As this needs a huge amount of data, ways have been found to simplify the image. Using only three colour records is one of these. It is a very crude simplification, but the human visual system can work quite well from very limited clues, as when you look at a pen-and-ink drawing.

The three-primary approach is the cheapest and worst arrangement that is just about acceptable. Skin colours suffer the most.
Certainly Bayer has given the best results thus far. Many other schemes have been floated and tried, especially during the period when the Kodak patent was enforced. As soon as the patent expired, everyone reverted to Bayer.

One should not underestimate the momentum of the RGB color system. It is embedded everywhere and perhaps one way of thinking of Bayer CFA now is that it fits best into the existing infrastructure, even if that sounds a little circular.

We have the same problem with HDR. Many, many schemes have been proposed for HDR, but almost all displays are designed around just 24-bit RGB, along with all the infrastructure between capture and display. Displacing the incumbent system can only happen if you show compelling advantage, or if the government intervenes (e.g. HDTV).

It is really too bad when the technology readily exists to far exceed the performance of the "infrastructure".
 
I think the explanation is/can be a lot simpler than the previous replies...

The human eye only has three colour detectors, referred to as L, M, S (i.e. long, medium and short wavelength, roughly corresponding to the R, G and B parts of the visible spectrum).

The ideal colour image sensor would be able to detect and record signals with exactly the same response characteristics as the human eye's three colour sensors.

The simplest way to do this (certainly not the only way) is to replicate the way the eye works/records, i.e. with three sensor elements with responses closely matching the response characteristics of the eye.
You're suggesting a camera filter set that not only meets the Luther condition, it goes further by leaving out the 3x3 matrix multiplication to get to the cone responses. I know of no camera that even meets the Luther condition, much less does it that way. There are reasons for that which go beyond the mundane practicalities of what dye sets are available. The human eye has two greatly-overlapped channels (rho and gamma) and one that's offset towards short wavelengths and plays almost no part in luminance (beta). That's a good strategy if you're stuck with a simple lens that can't bring many wavelengths into simultaneous focus, like the one in our eyes. However, it's a technique that can cause chroma noise as you try to sort out the overlap.

Even attempts to meet the Luther criterion are rare. I do have some experience with one.

When I was working at the IBM Almaden Research Laboratory in the early 90s as a color scientist, I consulted with Fred Mintzer and his group in Yorktown Heights, who developed a scanning camera with the objective that the wavelength-by-wavelength product of the camera's RGB filters, the IR-blocking filter, and the CCD's spectral response would be close to a 3x3 matrix multiply away from human tristimulus response. The camera was used to digitize Andrew Wyeth's work, to capture artwork in the Vatican Library, and for some other projects where color fidelity was important.

Here is a paper that has some system-level RGB responses. You'll note a high degree of spectral overlap of the red and green channels, just as there is overlap in the medium and long cone channels. You'll also note an absence of the short-wavelength bump in the red channel; this camera didn't do violet. Because the illumination was part of the camera system, the camera did not have to deal with illuminant metamerism the same way as a human.
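For what it's worth, here's a sketch of how one might quantify closeness to the Luther condition: regress the camera's system-level spectral sensitivities onto the colour matching functions with a 3x3 matrix and look at the relative residual. The function name is mine, and the input arrays would have to come from real measurements:

```python
import numpy as np

def luther_residual(cam_sens, cmfs):
    """How far a camera's spectral sensitivities are from a 3x3 transform of
    the colour matching functions (0 would satisfy the Luther condition).

    cam_sens, cmfs: arrays of shape (n_bands, 3), sampled on the same wavelengths.
    """
    M, *_ = np.linalg.lstsq(cmfs, cam_sens, rcond=None)   # best-fit 3x3 matrix
    resid = cam_sens - cmfs @ M
    # Relative RMS residual: 0 means the sensitivities are exactly a linear
    # combination of the CMFs; larger values mean more capture metameric error risk.
    return np.sqrt(np.mean(resid ** 2)) / np.sqrt(np.mean(cam_sens ** 2))
```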
Having an image sensor with more than 4 'colours' (colour filters) is, very often, either something of a technical/sales & marketing 'gimmick' - or perhaps, being more generous, might be a means of better reproducing the response characteristics of the eye by 'mixing' colour channel signals - or sometimes a means of extending dynamic range (paler, or 'white', pixels).
In 1992, Michael Vrehl, a student of Joel Trussell at North Carolina State (I once saw a bumper sticker in Chapel Hill that read "Honk if you love Carolina, moo if Y'all fum State."), presented a paper at the SPIE Imaging Conference. I'm sorry I can't find a link to the paper itself, only one to the abstract:

"The quality of color correction is dependent upon the filters used to scan the image. This paper introduces a method of selecting the scanning filters using a priori information about the viewing illumination. Experimental results are presented. The addition of a fourth filter produces significantly improved color correction over that obtained by three filters."

I remember being quite impressed with the improvement in color accuracy afforded by the addition of the fourth filter.

Here is a paper that, although coming at the problem from another angle, discusses in general the tradeoff involved in selecting a filter set, and here is one that goes right at the problem.

Here is Peter Burns' (yes, that Peter Burns; the slanted-edge guy) RIT PhD thesis that talks about the relationship of noise and filter choices: http://www.losburns.com/imaging/pbpubs/pdburns1997.pdf

Jim

--
http://blog.kasson.com
 
Certainly Bayer has given the best results thus far.
Am I safe to assume that the Bayer array is simultaneously the most compact, highest resolution, most quantum efficient, and has the lowest potential noise of any regular square arrangement for a full-color sensor? Or at least close to it? It seems that improving any one factor by any degree will harm the other factors considerably, with the possible exception of using four kinds of filters instead of three.
It is really too bad when the technology readily exists to far exceed the performance of the "infrastructure".
I’ve been studying a bit of tiling theory, which is relevant to sensor design, and it appears as though a regular hexagonal tiling might have the square array beat at least in some of these factors, although it doesn’t seem to be particularly good with three-color sensors.

While hexagonal tiling has some attractive features, our “infrastructure” precludes using it: everything from file formats and editing software to computer displays is designed around the idea of a square array. Were there a move to this new tiling arrangement, interim solutions to display the new format on old media — such as some kind of resampling — would likely produce images that look rather poor on screen, which may not speed adoption of the new standard, but rather rejection of it.

On the other hand, tiling theory leads to some interesting conclusions and possibilities, especially if you consider using these in a full-color sensor. I’ve long been a fan of abstract geometric art that can be generated from tiling, and the kinds of regular coloring that you can give to these tilings could be useful in the design of sensors.

If you extend tiling to curved surfaces, this theory could be useful if folks decide to make uniform sensor arrays out of curved sensors.
 
Certainly Bayer has given the best results thus far.
Am I safe to assume that the Bayer array is simultaneously the most compact, highest resolution, most quantum efficient, and has the lowest potential noise of any regular square arrangement for a full-color sensor? Or at least close to it? It seems that improving any one factor by any degree will harm the other factors considerably, with the possible exception of using four kinds of filters instead of three.
In theory there are some other color arrangements, different from RGGB, that offer better aliasing response. But, then again that is just theory. Practically, for various reasons, the Bayer array has widespread distribution.
It is really too bad when the technology readily exists to far exceed the performance of the "infrastructure".
I’ve been studying a bit of tiling theory, which is relevant to sensor design, and it appears as though a regular hexagonal tiling might have the square array beat at least in some of these factors, although it doesn’t seem to be particularly good with three-color sensors.
Hexagonal sampling needs about 13% fewer sample points than the usual rectangular (or square) sampling for images, without losing any information. In higher dimensions the advantage is much more significant.
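The ~13% figure follows from comparing the sampling densities needed for a circularly band-limited image; a short sketch of the arithmetic:

```python
import math

# Samples per unit area needed for an image band-limited to a circle of radius W:
W = 1.0
square_density = (2 * W) ** 2              # Nyquist spacing 1/(2W) in x and in y
hex_density = 2 * math.sqrt(3) * W ** 2    # optimal hexagonal sampling lattice
print(f"hexagonal needs {1 - hex_density / square_density:.1%} fewer samples")  # ~13.4%
```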
While hexagonal tiling has some attractive features, our “infrastructure” precludes using it: everything from file formats and editing software to computer displays is designed around the idea of a square array. Were there a move to this new tiling arrangement, interim solutions to display the new format on old media — such as some kind of resampling — would likely produce images that look rather poor on screen, which may not speed adoption of the new standard, but rather rejection of it.
You can get closer to hexagonal sampling in a convoluted way. Remember the quincunx approach that DSPographer outlined in other posts, obtained by rotating the square sampling grid: after rotation, the sampling structure is close to the hexagonal one if you overlay the two!
On the other hand, tiling theory leads to some interesting conclusions and possibilities, especially if you consider using these in a full-color sensor. I’ve long been a fan of abstract geometric art that can be generated from tiling, and the kinds of regular coloring that you can give to these tilings could be useful in the design of sensors.
In the area of vector quantization, tiling structures have been studied from various perspectives, including sampling. You might want to look into that also.

--
Dj Joofa
http://www.djjoofa.com
 
I think the explanation is/can be a lot simpler than the previous replies...

The human eye only has three colour detectors, referred to as L, M, S (i.e. long, medium and short wavelength, roughly corresponding to the R, G and B parts of the visible spectrum).

The ideal colour image sensor would be able to detect and record signals with exactly the same response characteristics as the human eye's three colour sensors.

The simplest way to do this (certainly not the only way) is to replicate the way the eye works/records, i.e. with three sensor elements with responses closely matching the response characteristics of the eye.
This is a common fallacy.
LOL - it absolutely is not a "fallacy".

It is the way that the vast majority of colour imaging systems and sensors are designed and operate - all with very good reason and sound justification.
It might be true if you were designing an eye for a robot, but a camera is not an eye.
Nonsense.

A 'camera/camera sensor' is a close analogue of the human colour vision system.

Quite what difference you allude to re 'camera' versus 'robot' is anybody's guess - a 'robot' vision system may, or may not, be a completely different requirement set to that of human colour vision.
The purpose of a photo or video system is to provide a copy of the light that would have arrived at your eye if you had been there.
No - that is wrong.

It does not need to 'copy' the light at all (and indeed, of course, it doesn't).

It only needs to record, and later reproduce, the equivalent tri-stimulus values of the human vision system (which is, of course, what it actually does do).

It is not to "see" the light but to freeze and store it.
The purpose is to record/reproduce human vision equivalent 'tri-stimulus' values.
The perfect camera would be like a sheet of glass with a "freeze" button. It would record for playback the exact spectrum curve of the light hitting it from each direction. As this needs a huge amount of data, ways have been found to simplify the image.
Nobody ever had to set out to 'find ways to simplify the image' in that fashion at all.

The human eye has always performed the said 'simplification' - and consequently most 'man-made' colour vision systems have not needed to do anything more than closely replicate the same tri-stimulus sensing/recording/reproducing scheme.

Using only three colour records is one of these. It is a very crude simplification, but the human visual system can work quite well from very limited clues, as when you look at a pen-and-ink drawing.
It is not a 'crude simplification' - it is the fundamental way that human colour vision, the eye & brain, actually works.
The three-primary approach is the cheapest and worst arrangement that is just about acceptable.
Nonsense.

Whilst there are undoubtedly 'theoretically' better (and, by far, more complex/impractical) schemes - to say 'RGB' is the 'cheapest/worst' is nothing more than hyperbolic nonsense.

The tri-stimulus, 'RGB' etc, based colour imaging system, is an excellent colour image recording/reproduction system - certainly not perfect in every way, but for the vast majority of practical needs and purposes, works extremely well.

Skin colours suffer the most.
ROTFL.
 
I think the explanation is/can be a lot simpler than the previous replies...

The human eye only has three colour detectors, referred to as L, M, S (i.e. long, medium and short wavelength, roughly corresponding to the R, G and B parts of the visible spectrum).

The ideal colour image sensor would be able to detect and record signals with exactly the same response characteristics as the human eye's three colour sensors.

The simplest way to do this (certainly not the only way) is to replicate the way the eye works/records, i.e. with three sensor elements with responses closely matching the response characteristics of the eye.
You're suggesting a camera filter set that not only meets the Luther condition, it goes further by leaving out the 3x3 matrix multiplication to get to the cone responses. I know of no camera that even meets the Luther condition, much less does it that way....
I was not being that specific.

I was simply saying that a typical RGB (e.g. Bayer CFA etc) based camera sensor emulates approximately the same behaviour/characteristics/same tri-stimulus colour model as that of the human eye and vision.

Thanks for all the various links to papers etc - but to be honest, when I see that much mathematical formulae/tables etc, I rather quickly lose interest.
 
You can get closer to hexagonal sampling in a convoluted way. Remember the quincunx approach that DSPographer outlined in other posts, obtained by rotating the square sampling grid: after rotation, the sampling structure is close to the hexagonal one if you overlay the two!
That is true. Also notice that if you use a Bayer pattern rotated by 45 degrees, like the original Fujifilm SuperCCD, you could sample it in a square 4:2:2 Y:Cr:Cb way, where the Y samples correspond to the green pixel locations and the color samples correspond to the red and blue pixel locations.

This would lower the number of raw converter output pixels to match the number of green photosites on the sensor, instead of being equal to the total photosite count as with a conventional Bayer pattern; and definitely not twice the count like Fujifilm's raw converter created for the S1 Pro. This may be worthwhile when cameras are created with huge pixel counts, but marketing departments may not like the halving of the output pixel count.
 
Certainly Bayer has given the best results thus far.
Am I safe to assume that the Bayer array is simultaneously the most compact, highest resolution, most quantum efficient, and has the lowest potential noise of any regular square arrangement for a full-color sensor? Or at least close to it? It seems that improving any one factor by any degree will harm the other factors considerably, with the possible exception of using four kinds of filters instead of three.
You might like to take a look at Fujifilm's 'X-Trans' sensors' CFA pattern...

http://www.fujifilm.eu/uk/products/.../features/fujifilm-x-trans-sensor-technology/

The CFA pattern is less susceptible to colour moire (aliasing), which enables the anti-aliasing filter to be dispensed with.

Dispensing with the AA filter has two advantages...

1) Potentially sharper imaging/higher detail resolution.

2) Fractionally improved image noise (better SNR) due to less absorption light loss.

The ratio of green 'G' pixels to 'R' and 'B' pixels is also slightly different...

'X-Trans' = 2R : 5G : 2B = (approx') 22% : 56% : 22% (versus Bayer's 25% : 50% : 25%)

...although I don't think this is of great significance.
It is really too bad when the technology readily exists to far exceed the performance of the "infrastructure".
I’ve been studying a bit of tiling theory, which is relevant to sensor design, and it appears as though a regular hexagonal tiling might have the square array beat at least in some of these factors, although it doesn’t seem to be particularly good with three-color sensors.

While hexagonal tiling has some attractive features, our “infrastructure” precludes using it: everything from file formats and editing software to computer displays is designed around the idea of a square array.

Were there a move to this new tiling arrangement, interim solutions to display the new format on old media — such as some kind of resampling — would likely produce images that look rather poor on screen, which may not speed adoption of the new standard, but rather rejection of it.
I don't think that would in itself be any serious impediment.

Fujifilm (again) have been using diagonally orientated, 45° rotated arrays for many, many years - 'Super CCD' and 'EXR' for example - which require routine re-sampling.

Furthermore, much more complex re-sampling is routine these days, correcting in camera for a multitude of lens distortions/aberrations etc, all at very high speed and in good quality.
On the other hand, tiling theory leads to some interesting conclusions and possibilities, especially if you consider using these in a full-color sensor. I’ve long been a fan of abstract geometric art that can be generated from tiling, and the kinds of regular coloring that you can give to these tilings could be useful in the design of sensors.

If you extend tiling to curved surfaces, this theory could be useful if folks decide to make uniform sensor arrays out of curved sensors.
As resolutions get higher, in both sensing and displaying, I suspect that the varieties of tiling/patterns probably matter less and less.
 
Certainly Bayer has given the best results thus far.
Am I safe to assume that the Bayer array is simultaneously the most compact, highest resolution, most quantum efficient, and has the lowest potential noise of any regular square arrangement for a full-color sensor? Or at least close to it? It seems that improving any one factor by any degree will harm the other factors considerably, with the possible exception of using four kinds of filters instead of three.
You might like to take a look at Fujifilm's 'X-Trans' sensors' CFA pattern...

http://www.fujifilm.eu/uk/products/.../features/fujifilm-x-trans-sensor-technology/

The CFA pattern is less susceptible to colour moire (aliasing) which enables the anti-aliasing filter to be dispensed with.
The pattern is less susceptible to moire for most regular patterns, and different color pixels are edge adjacent instead of corner adjacent as with a Bayer array, but the maximum distance between same-color pixels for any angle is *increased*. This means that for some types of color details, such as small features that are mostly only sensed by one pixel color, the aliasing is actually worse for the x-trans sensor. So, while Fujifilm dispenses with the AA filter, that doesn't mean all aliasing problems are avoided.
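A brute-force sketch of that point: compute the largest gap (covering radius) in the red sampling grid for a Bayer tile versus an X-Trans-style 6x6 tile. The function name is mine, the 6x6 red layout below is one common representation of the X-Trans pattern (the exact phase may differ from Fujifilm's diagrams), and distances are in photosite pitches:

```python
import numpy as np

def covering_radius(sites, period, res=0.05):
    """Largest distance from any point in the plane to the nearest listed site,
    for a pattern that repeats with the given (square) period. Brute force."""
    offs = np.array([(dy * period, dx * period)
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)], dtype=float)
    pts = (sites[None, :, :] + offs[:, None, :]).reshape(-1, 2)   # tile 3x3 copies
    ys, xs = np.meshgrid(np.arange(0, period, res),
                         np.arange(0, period, res), indexing="ij")
    grid = np.stack([ys.ravel(), xs.ravel()], axis=1)
    dists = np.linalg.norm(grid[:, None, :] - pts[None, :, :], axis=2)
    return dists.min(axis=1).max()

# Red photosite positions within one repeating tile (units of photosite pitch).
bayer_red = np.array([(0, 0)], dtype=float)                   # 2x2 RGGB tile
xtrans_red = np.array([(0, 2), (1, 5), (2, 1), (2, 3),        # one 6x6 X-Trans-style
                       (3, 5), (4, 2), (5, 0), (5, 4)], dtype=float)
print(covering_radius(bayer_red, 2))    # ~1.41 pitches (the corner of a quad)
print(covering_radius(xtrans_red, 6))   # ~1.58 pitches: the gaps are bigger
```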
 
Certainly Bayer has given the best results thus far.
Am I safe to assume that the Bayer array is simultaneously the most compact, highest resolution, most quantum efficient, and has the lowest potential noise of any regular square arrangement for a full-color sensor? Or at least close to it? It seems that improving any one factor by any degree will harm the other factors considerably, with the possible exception of using four kinds of filters instead of three.
You might like to take a look at Fujifilm's 'X-Trans' sensors' CFA pattern...

http://www.fujifilm.eu/uk/products/.../features/fujifilm-x-trans-sensor-technology/

The CFA pattern is less susceptible to colour moire (aliasing) which enables the anti-aliasing filter to be dispensed with.
The pattern is less susceptible to moire for most regular patterns,...
Pretty much what I just stated.

Incidentally, it's not just about the slightly irregular pattern - it is also advantageous in having all three R-G-B pixels occur in every row and column, which virtually eliminates the issue of the Bayer pattern having the separate R-G-B channels effectively displaced, i.e. mis-registered in (x,y) by up to (1,1) pixels.
... and different color pixels are edge adjacent instead of corner adjacent as with a Bayer array,...
What do you mean by 'edge adjacent' and 'corner adjacent' i.e. what 'edge', what 'corner'?
... but the maximum distance between same-color pixels for any angle is *increased*.
Well, to some degree that is obvious - there are 11% fewer 'R' and 'B' pixels (than in a Bayer CFA), so the mean distance (and therefore also the max distance) is bound to increase for these 'colours'.

However, the pros/cons re the 'G' pixels is less clear - there are (unsurprisingly) 11% more green pixels, although they are distributed less evenly than the Bayer pattern.
This means that for some types of color details, such as small features that are mostly only sensed by one pixel color, the aliasing is actually worse for the x-trans sensor.
Quite possible, theoretically - but do you/anyone have any significant real world evidence to show this as being an actual real issue outweighing the benefits of dispensing with the AA filter?
So, while Fujifilm dispenses with the AA filter, that doesn't mean all aliasing problems are avoided.
Nobody here said it did.
 
You're suggesting a camera filter set that not only meets the Luther condition, it goes further by leaving out the 3x3 matrix multiplication to get to the cone responses. I know of no camera that even meets the Luther condition, much less does it that way....
I was not being that specific.

I was simply saying that a typical RGB (e.g. Bayer CFA etc) based camera sensor emulates approximately the same behaviour/characteristics/same tri-stimulus colour model as that of the human eye and vision.
If you're eliding the spectral differences and saying that cameras are trichromats because normal humans are trichromats, I can't argue with that. Trichromacy in humans means that at least three filters are necessary for any but the most elemental color images, Dr. Land's techniques aside. It doesn't mean that more channels wouldn't be better for some purposes, which I believe to be the case. So you're right, three channels is the simplest way to get the job done, since fewer wouldn't work at all, and more is unnecessary if you're happy with the color information you get with three.
Thanks for all the various links to papers etc - but to be honest, when I see that much mathematical formulae/tables etc, I rather quickly lose interest.
I usually skip the math myself on first reading, concentrating on the tables and graphs. But when I want to understand exactly what the authors are saying, I'm glad the equations are there, since they are less ambiguous and more compact than the text. And if I'm going to try to implement an algorithm from a paper, I really want the equations.

Jim
 
It turns out that there's been a lot published on this topic. I've collected a few relevant abstracts:

http://spie.org/Publications/Proceedings/Paper/10.1117/12.2005256

http://spie.org/Publications/Proceedings/Paper/10.1117/12.912073

http://spie.org/Publications/Proceedings/Paper/10.1117/12.872253

http://ieeexplore.ieee.org/xpl/logi...re.ieee.org/xpls/abs_all.jsp?arnumber=1421830

http://proceedings.spiedigitallibrary.org/proceeding.aspx?articleid=1348371

http://ieeexplore.ieee.org/xpl/logi...re.ieee.org/xpls/abs_all.jsp?arnumber=4623180

http://ieeexplore.ieee.org/xpl/logi...re.ieee.org/xpls/abs_all.jsp?arnumber=4106702

There are people working on arrangement of the CFA elements, number of filters in the CFA, and demosaicing techniques.

I think the key issue that would drive the addition of more color channels is whether or not camera buyers are happy with the color fidelity of today's cameras. If they are, then the disadvantages of adding complexity and perhaps sacrificing some spatial resolution will keep us going down the track we're on.

Jim
 
It's mostly a question of unknowns...

More accurately, how many unknowns you can have in a system before the solution space becomes unstable. When you have two identical greens per four pixels, your number of unknowns is halved for most intents and purposes. More than that in many cases.

You have to include the ultimate destabilizer here, noise.

Consider that you have four colours in a 2x2 repeating grid. If the optical resolution vs the digitizing resolution is kept low, so that the best the lens can do is a Rayleigh separation of about 2 pixel-pitch (p-p) distances, most scenarios can be said to contain stable solutions: solutions where you can compute the luminance part AND the chrominance part to a reasonable accuracy for each pixel.

Add in some noise, and that stable solution falls apart. It destabilizes the system so that you have to choose whether to attribute deviations in the estimate to "luminance errors" or "chrominance errors", and then compensate for that.

Now add in "better optical sharpness"... which is the condition most consumer cameras - except the smallest-sensor, high-res smartphone modules - work under today. Better sharpness than 2 p-p distances means that there are NO stable solutions for the chrominance / luminance parts. Even without noise... Even PERFECT, noiseless data will require the interpolation algorithm to make very unstable guesses about the conditions that caused the given data, since there will be many (countless, up to the value resolution limit...) solutions to the problem. This is why raw converters differ from each other in how they render small detail. There IS no correct solution, given that the raw converter can only know what the data tells it - it has no knowledge of what really was in front of the lens.

Now take the two worst cases (oversharp images AND noise) - and you get total chaos. At that point, you have no stable solutions no matter what you do. It's all guesswork, and you've lowered the resolution of the image by a factor of 2.
All good points. It's amazing that Bayer CFAs work as well as they do, isn't it? However, I think there could be a win here, if we concentrate on holding luminance detail and letting the high-frequency chroma changes get averaged out, on the theory that human response to high-frequency chroma changes is way down from human response to luminance changes within a couple of octaves of the luminance CSF cutoff. So we'd lose HF chroma changes that we can't see, and gain better color at low spatial frequencies, where we can see it.

As a crude concrete example, what if we assigned the same chrominance numbers to every pixel in an RGCB quad?
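A toy version of that idea, sketched here with a plain RGGB quad for simplicity (constant chroma per quad, per-pixel luminance); the function name is mine, and this is just the concept made concrete, not a production demosaicer:

```python
import numpy as np

def quad_chroma_demosaic(mosaic):
    """Toy reconstruction: chrominance held constant over each 2x2 RGGB quad,
    luminance allowed to vary per pixel. Assumes R at the top-left, even dims."""
    m = mosaic.astype(float)
    R = m[0::2, 0::2]
    G = 0.5 * (m[0::2, 1::2] + m[1::2, 0::2])
    B = m[1::2, 1::2]
    Y_quad = (R + 2.0 * G + B) / 4.0                 # one luma estimate per quad

    # Per-quad chroma offsets, shared by all four pixels of the quad.
    dR, dG, dB = R - Y_quad, G - Y_quad, B - Y_quad

    # Per-pixel luma: the CFA sample minus the chroma offset of its own colour.
    Y = np.empty_like(m)
    Y[0::2, 0::2] = m[0::2, 0::2] - dR
    Y[0::2, 1::2] = m[0::2, 1::2] - dG
    Y[1::2, 0::2] = m[1::2, 0::2] - dG
    Y[1::2, 1::2] = m[1::2, 1::2] - dB

    # Rebuild full-resolution RGB by adding the shared quad chroma back.
    up = lambda a: np.repeat(np.repeat(a, 2, axis=0), 2, axis=1)
    return np.stack([Y + up(dR), Y + up(dG), Y + up(dB)], axis=-1)
```

For a flat colour patch this reproduces R, G and B exactly; any chroma variation finer than the quad is deliberately thrown away, which is the trade being proposed.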

Jim
 
