Then they just create a 16-bit per pixel per color image from the 12-bit raw data.
The highest 4 bits remain as zeros.
I don't believe this is the case. The conversion process from 12-bit RAW to 16-bit uses all of the available bits in the destination space.
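To make the difference between the two claims concrete, here's a rough sketch of one common way to stretch a 12-bit sample over the full 16-bit range (shift left by 4, then copy the top bits into the low bits). I'm not claiming any particular RAW converter does exactly this; the function name and the bit-replication trick are my own assumptions for illustration.

```python
def scale_12_to_16(value: int) -> int:
    """Map a 12-bit sample (0..4095) onto the full 16-bit range (0..65535).

    Hypothetical helper for illustration only, not taken from any RAW converter.
    """
    if not 0 <= value <= 0xFFF:
        raise ValueError("expected a 12-bit value (0..4095)")
    # Shift the sample into the top 12 bits, then copy its top 4 bits into
    # the low 4 bits so that 4095 maps to 65535 rather than 65520.
    return (value << 4) | (value >> 8)


# Zero-padding (the "highest 4 bits remain zero" idea) would leave 4095 as 4095,
# i.e. the brightest raw value would use only 1/16 of the 16-bit range.
print(scale_12_to_16(0), scale_12_to_16(2048), scale_12_to_16(4095))
```

With scaling like this, the brightest 12-bit value lands on the brightest 16-bit value, which is what I mean by using all of the available bits.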
I think you got it a bit wrong in both cases. 12-bit RAW means 12 bits per photosite, not 12 bits per pixel (a "pixel" in this case is composed of, as you say, green-red-green-blue photosites). Likewise, the 16 bits in the destination color space are 16 bits per channel, not per pixel.
So that means you have 48 bits per "RAW pixel" (12 bits/photosite * 4 photosites/pixel = 48 bits/pixel) that's getting processed into a 48-bit destination space (16 bits/channel * 3 channels/pixel = 48 bits/pixel).
So, basically, you're going from one 48-bit space to another.
You can see this is true by considering what an 8-bit JPG means. For each pixel in the JPG image, there are 256 different values that can be assigned to that pixel's red channel, 256 to the green, and 256 to the blue. That's 2^8 different values per channel. So an 8-bit/channel pixel actually has 24 bits, and it can display 2^24 different colors (16,777,216, about 16.8 million colors).
So 16-bit color is double the bits per channel, meaning each R, G, and B value can have 65,536 (2^16) different values. This means a single pixel can display about 281 trillion (2^48) different colors, which should be enough for anybody.
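If you want to check the numbers yourself, here's a quick sketch of the arithmetic. The function name is just something I made up for the example.

```python
def color_counts(bits_per_channel: int, channels: int = 3) -> tuple[int, int]:
    """Return (levels per channel, distinct colors per pixel)."""
    levels = 2 ** bits_per_channel   # e.g. 2^8 = 256
    colors = levels ** channels      # e.g. 256^3 = 2^24
    return levels, colors


for bits in (8, 16):
    levels, colors = color_counts(bits)
    print(f"{bits}-bit/channel: {levels:,} levels per channel, "
          f"{colors:,} colors per pixel")
# 8-bit/channel: 256 levels per channel, 16,777,216 colors per pixel
# 16-bit/channel: 65,536 levels per channel, 281,474,976,710,656 colors per pixel
```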
There's another concept in computer graphics when talking about color spaces called the "alpha" channel. This is often used for 3D rendering in games like Doom!. It's still 8 bit/channel color using the standard R, G, and B, but it adds a fourth 8-bit alpha channel, for 32 bits per pixel. The 8 bits in that channel specify 256 different levels of opacity: 0 means totally transparent (invisible) and 255 means 100% opaque (nothing behind it shows through). This is also why in PS you can add a new channel to the channels layer to do masking--in that case, you're adding an 8-bit greyscale mask that can be used to apply effects to different parts of the image differently.
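Here's a minimal sketch of how an 8-bit alpha value is typically used to blend a foreground pixel over a background one (plain "over" blending with integer math). The function names are my own, and I'm not saying this is how Doom or PS implements it; it's just the basic idea.

```python
def blend_channel(fg: int, bg: int, alpha: int) -> int:
    """Blend one 8-bit channel; alpha 0 = transparent, 255 = fully opaque."""
    if not all(0 <= v <= 255 for v in (fg, bg, alpha)):
        raise ValueError("all values must be in 0..255")
    # Weighted average of foreground and background; +127 rounds to nearest.
    return (fg * alpha + bg * (255 - alpha) + 127) // 255


def blend_pixel(fg_rgb, bg_rgb, alpha):
    """Blend an (R, G, B) foreground over an (R, G, B) background."""
    return tuple(blend_channel(f, b, alpha) for f, b in zip(fg_rgb, bg_rgb))


red, blue = (255, 0, 0), (0, 0, 255)
print(blend_pixel(red, blue, 0))    # (0, 0, 255): only the background shows
print(blend_pixel(red, blue, 255))  # (255, 0, 0): only the foreground shows
print(blend_pixel(red, blue, 128))  # (128, 0, 127): roughly half and half
```

A greyscale mask works the same way conceptually: the mask value at each pixel plays the role of alpha and decides how much of the effect gets applied there.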
(I don't honestly know if adding a channel layer in PS actually adds an extra channel of bits to each pixel, but that's the concept. It probably uses a different implementation of that concept, because it would be incredibly slow to physically insert 8 bits at the back of every pixel...it most likely just overlays an entirely different image.)