Now let's look at the new image as a whole. Each pixel has signal 4s and noise 2Sqrt(s). The size of the new image is m/2 x n/2, so the total signal is m/2 x n/2 x 4s = mns. Noise adds in quadrature, so the total noise is Sqrt(m/2 x n/2) x 2Sqrt(s) = Sqrt(mns), just the same as for the original image.
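The bookkeeping above can be checked with a quick simulation. This is only a sketch under assumptions that are not in the post: shot noise is approximated as Gaussian with standard deviation Sqrt(s), the downsampling is a plain 2x2 sum (real resamplers such as bicubic behave differently), and the image size, signal level and trial count are arbitrary illustration values.

```python
import math
import random
import statistics


def simulate(m=32, n=32, s=400.0, trials=300, seed=0):
    """Compare noise of an m x n image and its 2x2-summed version.

    Assumption: each pixel is drawn from a Gaussian with mean s and
    standard deviation sqrt(s), standing in for photon shot noise.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(s)
    totals_orig, totals_down, binned_pixels = [], [], []
    for _ in range(trials):
        img = [[rng.gauss(s, sigma) for _ in range(n)] for _ in range(m)]
        # Each new pixel is the sum of a 2x2 block of original pixels.
        down = [
            [img[2 * i][2 * j] + img[2 * i][2 * j + 1]
             + img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]
             for j in range(n // 2)]
            for i in range(m // 2)
        ]
        totals_orig.append(sum(map(sum, img)))
        totals_down.append(sum(map(sum, down)))
        for row in down:
            binned_pixels.extend(row)
    return {
        "pixel_signal_down": statistics.fmean(binned_pixels),  # expect about 4s
        "pixel_noise_down": statistics.pstdev(binned_pixels),  # expect about 2*sqrt(s)
        "total_noise_orig": statistics.pstdev(totals_orig),    # expect about sqrt(m*n*s)
        "total_noise_down": statistics.pstdev(totals_down),    # same total as the original
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Because the 2x2 sum only regroups the same pixel values, the totals of the two images agree by construction; the simulation mainly confirms the per-pixel figures of 4s and 2Sqrt(s).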
What is responsible for the image as a whole appearing less noisy if the relative noise is the same?
It doesn't; at least not when I do it and make the comparison as meaningful as possible.
Here is a 100% crop of an original image, followed by a 100% crop of the same region of the x2 downsampled version:

original

x2 downsampled
Whether you view them above or at their original sizes, the noise looks pretty much the same to me, particularly if, when viewing the "originals," you view the smaller image at half the distance you use to view the larger.
But one would really like to view them at the same size. There are, of course, problems in doing this: the downsampling itself changes the image, and the JPEG artifacts created when making and saving the downsampled JPEG version are quite different from those of the original. While the noise is the same, the structure of the noise is different, and when you blow the downsampled version up it need not compare well.
The best way to compare at the same size is to do the following:
• take a raw image with good noise, such as in the images above, and open in ACR -> PS.
• duplicate the image in PS
• downsample the duplicate by halving the pixel dimensions
• magnify the original to 100%
• magnify the downsampled duplicate to 200%
And now compare similar regions.
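For anyone who wants to try the same comparison outside Photoshop, the steps above can be roughly sketched in code. Everything here is an assumption for illustration: the "image" is synthetic Gaussian noise rather than a raw file, the halving uses a simple 2x2 average (Photoshop's resampling is more sophisticated), the 200% view is modelled as plain pixel replication, and the function names `downsample_half`, `magnify_200` and `region_stats` are made up for this sketch.

```python
import random
import statistics


def downsample_half(img):
    """Halve both pixel dimensions by averaging each 2x2 block (assumed resampler)."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [
        [(img[2 * i][2 * j] + img[2 * i][2 * j + 1]
          + img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]) / 4.0
         for j in range(cols)]
        for i in range(rows)
    ]


def magnify_200(img):
    """Model a 200% view: every pixel is shown as a 2x2 block of identical values."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def region_stats(img, top, left, size):
    """Mean and standard deviation of a square region, like a histogram readout."""
    vals = [img[r][c] for r in range(top, top + size) for c in range(left, left + size)]
    return statistics.fmean(vals), statistics.pstdev(vals)


# Synthetic stand-in for the original: flat grey patch with Gaussian noise.
rng = random.Random(1)
original = [[rng.gauss(100.0, 10.0) for _ in range(64)] for _ in range(64)]

# Original at 100% versus downsampled duplicate at 200%: same pixel dimensions.
view_original = original
view_down = magnify_200(downsample_half(original))

print("original at 100%:   ", region_stats(view_original, 16, 16, 32))
print("downsampled at 200%:", region_stats(view_down, 16, 16, 32))
```

The two views have the same pixel dimensions, so the same region coordinates can be used for both. The numbers from a synthetic patch like this say nothing about how a real resampler or a real raw file behaves; they only show how such a comparison could be set up.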
Here are screen shots of the original at 100% and downsampled duplicate at 200%. These have the same pixel dimensions and can be readily compared in their "originals" in the dpr image viewer:

screen shot of original at 100%

screen shot of downsampled at 200%
And here, just as a matter of interest, are the statistics for the above two shots:

statistics for original

statistics for downsampled