Yes, we see the word "Luminance" here a lot since the Quattro came out, but rarely do we see its definition. It has become a word like "resolution": one that means many things to many people.
So: in the world of imaging, luminance is Y = 0.2R + 0.7G + 0.07B, where R, G and B are linear values decoded from the camera (the full-precision Rec. 709 coefficients are 0.2126, 0.7152 and 0.0722). The coefficients are called weights or weightings. The weight of 0.7 for green tells us that green is by far the most important channel when determining the luminance of a pixel's-worth of a scene. And note that blue (decoded blue, not the blue layer) carries hardly any weight at all; 0.07 is not a typo. That, I reckon, is why JPEG compression carries its color information in the red-difference and blue-difference channels (Cr and Cb): they can be subsampled heavily without much visible damage.
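For concreteness, here is that weighting in code, using the full-precision Rec. 709 coefficients that 0.2, 0.7 and 0.07 round off:

```python
# Relative luminance from *linear* RGB (Rec. 709 weights).
def luminance(r, g, b):
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

print(luminance(0.0, 0.0, 1.0))  # pure blue  -> 0.0722 (barely registers)
print(luminance(0.0, 1.0, 0.0))  # pure green -> 0.7152 (dominates)
```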
Regarding luminosity: I have done a lot of monochrome film-based photography, often with filters. Those filters changed the spectral response, and so did the lighting/illuminant. Still, I would absolutely call all of those responses luminosity. I mean, with monochrome, what is there other than luminosity?
A difficulty in using only one channel to determine Y (as Foveon implies) is that the top layer's signal does not tell us what wavelength it detected. Go back to the diagram and you can see that many different incident spectra would give the same output: a narrow spectrum at, say, 450 nm could easily give the same output as a wider spectrum at, say, 630 nm (think "area under the curve").
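To put numbers on that "area under the curve" point, here is a toy calculation in which two very different spectra produce the same top-layer signal. The response curve below is invented purely for the illustration; it is not Foveon's actual spectral sensitivity.

```python
import numpy as np

wl = np.arange(400, 701)  # wavelengths in nm

# Hypothetical broad top-layer sensitivity, peaking in the blue (invented).
response = np.exp(-((wl - 470) / 120.0) ** 2)

# Spectrum A: narrow spike near 450 nm.
spec_a = np.exp(-((wl - 450) / 10.0) ** 2)

# Spectrum B: a wider hump near 630 nm, scaled so the area under the
# curve, as seen through the top-layer response, matches spectrum A.
spec_b = np.exp(-((wl - 630) / 60.0) ** 2)
spec_b *= np.trapz(spec_a * response, wl) / np.trapz(spec_b * response, wl)

# Both integrate to the identical top-layer output:
print(np.trapz(spec_a * response, wl))
print(np.trapz(spec_b * response, wl))
```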
So all this talk of glibly using the top layer as "the luminance channel" carries little weight, lol. Before anyone rushes to disagree: convincing anyone here would require Sigma's exact algorithm, and after all this time no such algorithm has been forthcoming.
Yes, we do not know the algorithm. We can guess at its nature, though.
For both Merrill and Quattro it has to put very high weight on the top layer, though for different reasons: Merrill has lousy lower layers, and the Quattro's lower layers are at lower resolution.
So here is my (possibly faulty) guess.
They use the top layer to make the initial monochrome image. That image has maximum quality and resolution, but the wrong luminosity: the top layer is too blue-sensitive.
Then they bring in the noisy information from the lower layers. Exactly how they do it, I do not know. They cannot simply denoise and smooth those layers and mix them in, because then you would get halos and also desaturation of small details.
My guess is they use some AI-like or adaptive method, both to correct the faulty luminance and to color the image without producing too many blotches.
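To make the guess concrete, here is a toy, pansharpening-style sketch in Python. To be clear, this is not Sigma's algorithm: the function name, the weights, the Gaussian filtering and the 2x upsampling factor (the Quattro's lower layers being half resolution) are all my own assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def reconstruct_luma(top, mid_lr, bot_lr, w=(0.3, 0.4, 0.3), sigma=2.0):
    """Toy sketch: correct the full-res top layer's luminance using the
    noisy, half-resolution lower layers. All parameters are invented.

    top    -- full-resolution top-layer image, shape (H, W), H and W even
    mid_lr -- middle-layer image, shape (H/2, W/2)
    bot_lr -- bottom-layer image, shape (H/2, W/2)
    """
    # Denoise the noisy lower layers, then upsample them to full resolution.
    mid = zoom(gaussian_filter(mid_lr, sigma), 2, order=1)
    bot = zoom(gaussian_filter(bot_lr, sigma), 2, order=1)

    # Low-frequency luminance estimate mixing all three layers, so the
    # tonality is no longer dominated by the blue-heavy top layer.
    y_low = w[0] * gaussian_filter(top, sigma) + w[1] * mid + w[2] * bot

    # High-frequency detail comes only from the sharp top layer.
    detail = top - gaussian_filter(top, sigma)

    return np.clip(y_low + detail, 0.0, None)
```

Note that this fixed global mix is exactly the naive approach dismissed above: the detail term still carries the top layer's blue bias, which is where the halos and the desaturated fine detail would come from. An adaptive method would presumably vary the weights and filter strength locally (based on edge content, noise estimates, etc.), which is where the "AI thinking" would come in.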
I could be wrong; maybe they just do some clever filtering, though I cannot see how.