First of all, let's go back to the part I quoted...
"There I have compared cameras with a Foveon (SD10) and a Bayer
(14n) sensor containing the same number of pixels - pixels, not
cells. Both have 3.4 million pixels (although the Bayer has 13.8
million cells)."
Either he counts pixels as spatially distinct locations on the sensor
surface, in which case he would be right about the SD10 having 3.4
million pixels but wrong about the 14n, which by that count has 13.8
million. Or he could count photosensors, which would put the SD10 at
10.2 million and the 14n at 13.8 million. But I see no way to put it
the way the author did; this is just deceptive rubbish.
Or, he could be defining pixels the way Foveon defines pixels:
"Accepted Definitions: Picture Element (pixel) - an RGB triple in a sampled image."
http://www.x3f.info/technotes/x3pixel/pixelpage.html
In which case he is exactly right: the SD10 has 3.4M and the 14n has 3.3M.
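For what it's worth, here is a quick back-of-the-envelope of the three ways of counting, in Python. The SD10 dimensions are the approximate published ones, and dividing the 14n cell count by four is my own simplification of the "RGB triples actually sampled" idea, so it only lands near the 3.3M figure rather than reproducing it exactly:

# Three ways of counting, using the figures quoted above.
sd10_locations = 2268 * 1512          # spatially distinct locations (~3.4M)
sd10_layers = 3                       # Foveon stacks R, G and B at every location
kodak_14n_cells = 13_800_000          # Bayer: one color sample per location

print("SD10 locations:", sd10_locations)                    # ~3.4M
print("SD10 photosensors:", sd10_locations * sd10_layers)   # ~10.2M
print("SD10 RGB triples sampled:", sd10_locations)          # ~3.4M

print("14n photosensors / locations:", kodak_14n_cells)     # 13.8M
# Rough proxy for triples actually sampled: one per 2x2 RGGB block.
print("14n RGB triples sampled (approx):", kodak_14n_cells // 4)   # ~3.45M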
Then we have his BS on the green sampling: just have a look at the
original Bayer patent, or try it yourself in Photoshop, and you will
see that he is just wrong. Or have a look here:
http://www.stanford.edu/~esetton/experiments.htm
In any case, the eye is most sensitive to high spatial frequencies in
the green region of the spectrum, so he is wrong, and his explanation
of why this should be a myth is not much more than hot air: no facts,
no proof that would really contradict what is known, just nothing.
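To put a number on that green sensitivity: the standard luma weights used for sRGB/Rec. 709 material (my choice of reference here, nothing specific to these cameras) show how much of perceived brightness rides on the green channel:

# Rec. 709 luma weights: Y = 0.2126*R + 0.7152*G + 0.0722*B
REC709 = {"R": 0.2126, "G": 0.7152, "B": 0.0722}
for channel, weight in REC709.items():
    print(channel, "contributes", f"{weight:.0%}", "of luma")
# Green alone carries ~72% of luminance, which is why Bayer samples it
# twice as densely as red or blue.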
The double-green hoopla is simply wrong; it was fabricated to rationalize to buyers the obvious inefficiency of a 2D sensor design (2x2 scalable pixel elements) when using a three-primary color model, so I agree with him there too. The reason there is double green is that you must double one primary color in a 2x2 mosaic if you use a three-color model; there is no choice.
If more green were always better, sensors would be all green.
As has already been pointed out, it really doesn't matter what the eye is most sensitive to, because a digital image is an emitter, not a receptor. Further, the emission has identical amounts of R, G, and B, because the output media (prints and monitors) require a complete RGB value at every pixel location. Neither medium can display anything but complete RGB triples; where there is not enough R and B to form a complete triple, the missing values are digitally interpolated. There is no objective way to know whether those digital placeholders/guesses were right or wrong, but in any case there is exactly the same amount of red, green, and blue in the final emission.
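A toy example of that raw-versus-emitted distinction (RGGB tile assumed; this is an illustration, not any camera's actual demosaic):

# A Bayer mosaic records one color sample per cell...
h, w = 4, 4
pattern = [["R", "G"], ["G", "B"]]          # one RGGB 2x2 tile, repeated

raw_counts = {"R": 0, "G": 0, "B": 0}
for y in range(h):
    for x in range(w):
        raw_counts[pattern[y % 2][x % 2]] += 1
print("raw samples captured:", raw_counts)              # {'R': 4, 'G': 8, 'B': 4}

# ...but the emitted image needs a full triple at every pixel.
print("samples in the emitted image:", {c: h * w for c in "RGB"})
# {'R': 16, 'G': 16, 'B': 16} -- the shortfall is filled by interpolation.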
Then we have the camera-shake stuff, which is again just wrong: if the
number of pixels and the field of view of the picture are equal, the
size of the sensor just does not matter for camera shake. If you still
disagree, you could just do the math on it (a rough sketch follows).
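Here is a rough version of that math in Python (pinhole model and small-angle approximation assumed, the numbers are made up for illustration):

import math

def blur_in_pixels(sensor_width_mm, pixels_across, fov_deg, shake_deg):
    # Focal length that gives the requested field of view on this sensor.
    f = (sensor_width_mm / 2) / math.tan(math.radians(fov_deg) / 2)
    blur_on_sensor = f * math.radians(shake_deg)        # mm of smear on the sensor
    pixel_pitch = sensor_width_mm / pixels_across
    return blur_on_sensor / pixel_pitch

# Same pixel count, same field of view, same angular shake, three sensor sizes:
for width_mm in (7.0, 24.0, 36.0):
    print(width_mm, "mm wide:", round(blur_in_pixels(width_mm, 3000, 40, 0.05), 2), "px of blur")
# All three print the same number -- the sensor size cancels out.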
It's really just common sense. If you cram 50M cells onto a sensor that is a millionth of an inch wide, I dare you to try to hand-hold it without blurring/overlapping two of them. If you spread 50M cells over a sensor the size of the solar system, it can jiggle a little without overlapping most of them.
Or how about this:
"Electronic sensors pick up random fluctuations in light that we
cannot see. These show up on enlargements like grain in film"
Nonsense again.
He's just oversimplifying an explanation of noise. He is essentially correct: it's due to random fluctuations in lots of things. Semantically, you can nitpick any explanation of noise endlessly.
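If you want to see what "random fluctuations in light" looks like in practice, photon shot noise is the simplest case: a perfectly uniform patch of light still delivers a randomly varying photon count to each cell (Poisson model assumed, numbers illustrative):

import numpy as np

rng = np.random.default_rng(0)
mean_photons = 100                       # average photons per cell this exposure
patch = rng.poisson(mean_photons, size=(4, 4))
print(patch)
# Every cell saw the same light, yet the readings scatter around 100 with a
# spread of roughly sqrt(100) = 10 -- that scatter is what shows up as grain.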