he chooses to base his calculations on the unweighted sum of the four channels and then relates the resulting value to the max recordable level. I prefer instead to base my calculations on the weighted average of the channel averages, i.e., (R + B + 0.5G1 + 0.5G2) / 3, but this difference between us is of little practical consequence for the results.

To better match luminance as perceived by the human eye (and as encoded, for example, in JPEG's luma channel), the formula should be closer to 0.3R + 0.6G + 0.1B (or, if you will, 0.3R + 0.3G1 + 0.3G2 + 0.1B).
It should not change the result much, though.
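
To get a feel for how far apart the two weightings land, here is a minimal Python sketch that evaluates both formulas quoted above on the same channel averages and reports the gap in stops. The function names and the sample values are made up for illustration; they are not taken from either author's actual calculations.

```python
import math


def equal_weight_average(r, g1, g2, b):
    """(R + B + 0.5*G1 + 0.5*G2) / 3: the two greens count as one channel."""
    return (r + b + 0.5 * g1 + 0.5 * g2) / 3


def luma_like_average(r, g1, g2, b):
    """0.3R + 0.3G1 + 0.3G2 + 0.1B: approximate perceptual weighting."""
    return 0.3 * r + 0.3 * g1 + 0.3 * g2 + 0.1 * b


def difference_in_stops(a, b):
    """Difference between two linear levels, expressed in stops (log base 2)."""
    return math.log2(a / b)


if __name__ == "__main__":
    # Hypothetical channel averages as fractions of the maximum recordable
    # level (assumed values, chosen only to illustrate the comparison).
    r, g1, g2, b = 0.20, 0.40, 0.42, 0.15

    flat = equal_weight_average(r, g1, g2, b)
    luma = luma_like_average(r, g1, g2, b)

    print(f"equal-weight average: {flat:.4f}")
    print(f"luma-like average:    {luma:.4f}")
    print(f"difference:           {difference_in_stops(luma, flat):+.2f} stops")
```

When all four channel averages are equal, the two formulas return the same value, since the weights in each sum to 1. The gap only opens up as the channels diverge, and how large it gets depends on the actual channel averages of the exposure in question.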

