Relative Linear System Transfer Function
I feel like the stop system it uses must have some relation to actual nit values. It has an “SDR white” threshold, then 1, 2, 3, and 4 stops above that. So if SDR white is 120 nits, the max value would be 1920 nits, and (for example) the point where the visualization switches from purple (+2.x stops) to pink (+3.x stops) is 960 nits.
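As a quick check of that arithmetic, here is a minimal Python sketch. It assumes each stop is a doubling of luminance relative to SDR white and that the output follows the stops with no rolloff. The function name `nits_at_stops` and the 120 nit reference are purely illustrative, not anything ACR documents.

```python
def nits_at_stops(sdr_white_nits: float, stops: float) -> float:
    """Luminance at `stops` above SDR white, assuming one stop = 2x luminance."""
    return sdr_white_nits * 2.0 ** stops


if __name__ == "__main__":
    # Assumed reference: SDR white = 120 nits (an example value only).
    for stops in range(5):
        print(f"+{stops} stops: {nits_at_stops(120.0, stops):.0f} nits")
```

With a different reference white every boundary scales proportionally, which is why the unknown reference matters.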
Except I don’t know what reference white value they’re using for that SDR/HDR line. I’m also unclear whether the output image actually follows the stops. It’s possible those refer to the linear working space, and there’s some sort of output look curve/rolloff that gets applied prior to the output. If that’s the case, “4 stops over SDR white” values wouldn’t actually come out to 16x the reference SDR white value. I believe this is how the ACES HDR output transforms work: the input floating point value to the ACES RRT is always fixed (I believe it’s 16.3 or something like that), and it then gets mapped onto PQ or HLG at varying brightness. You can specify whether 16.3 is 1000 nits, 2000 nits, or 4000 nits.
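To make the "mapped onto PQ" step concrete, here is a rough Python sketch of the generic SMPTE ST 2084 (PQ) encoding applied to a relative linear value. It is not the ACES output transform and not whatever ACR does internally: there is no rolloff, and the helper `relative_to_pq` and its `white_nits` parameter are hypothetical names used only to show how the choice of reference white changes the encoded signal.

```python
# SMPTE ST 2084 (PQ) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_encode(nits: float) -> float:
    """Absolute luminance in cd/m^2 (clamped to 0..10000) -> PQ signal in 0..1."""
    y = min(max(nits, 0.0), 10000.0) / 10000.0
    ym = y ** M1
    return ((C1 + C2 * ym) / (1.0 + C3 * ym)) ** M2


def relative_to_pq(value: float, white_nits: float) -> float:
    """Hypothetical helper: pin relative 1.0 to `white_nits`, then PQ-encode.

    Assumes a plain linear mapping with no look curve or rolloff.
    """
    return pq_encode(value * white_nits)


if __name__ == "__main__":
    # Same relative value (+4 stops = 16.0), two assumed reference whites.
    for white in (65.0, 120.0):
        print(f"white = {white:.0f} nits -> PQ signal {relative_to_pq(16.0, white):.3f}")
```

The same "+4 stops" value lands at a different PQ code depending on the assumed white, so the stops alone do not determine output nits.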
Since the “whites” slider in the basic panel just drags the histogram up and down, you can theoretically use that to specify the brightness of a highlight. It’s just that we don’t know what nit level “+3.2 stops” actually is.
From a photographer's perspective, I would prefer the system OOTF to be assumed to be linear, then leave it up to me to introduce emphasis here or there if I so wished. Those, however, would be artistic choices, a different subject. As I understand it, one does not need to know absolute intensity values to achieve that. In fact, peak luminance on current HDR displays may be a bit higher than in the past, but displayed luminance is still relative to what it was in the field, almost never absolute.
Thinking aloud, and knowing that photographers view their images in a variety of environments and backgrounds, it seems to me that for ideal linear system rendering all that is needed is a relative intensity reference, which has historically been maximum diffuse white (with its derivatives L*100 and mid-gray L*50). In color science it is often shown as a linear intensity normalized at 1.0. If anyone wants to capture and display light sources and/or mixed/specular highlights, feel free to use the range above that: 4 stops above it = 16.0.
Then take the capture and display it. But but but, you say, assuming a linear transfer function, how do I display that? Feed it to your display as-is, I say, given its capabilities. I find it puzzling that ACR HDR thinks my HDR10 display with maximum luminance about 250 cd/m^2 has less than a stop available above maximum diffuse white because ... ??? How does it know how bright I like to show my images? For the record, white (1.0) is 65 cd/m^2 as I write this; that's how I like it with my setup. 250 cd/m^2 is about 2 stops above 1.0.
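The headroom figure is just a base-2 logarithm of peak over white. A minimal Python sketch follows; the function name `headroom_stops` is made up for illustration. The 203 cd/m^2 value in the usage lines is the HDR reference white from ITU-R BT.2408, included only to show how a higher assumed white shrinks the headroom. Whether ACR assumes anything like it is not known here.

```python
import math


def headroom_stops(peak_nits: float, white_nits: float) -> float:
    """Stops available between diffuse white (1.0) and display peak."""
    return math.log2(peak_nits / white_nits)


if __name__ == "__main__":
    # The setup described above: 250 cd/m^2 peak, white set to 65 cd/m^2.
    print(f"{headroom_stops(250.0, 65.0):.2f} stops")
    # Same display with an assumed (not confirmed) 203 cd/m^2 white.
    print(f"{headroom_stops(250.0, 203.0):.2f} stops")
```

With white at 65 cd/m^2 the result is a little under 2 stops; with the assumed 203 cd/m^2 white it drops to well under one stop.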
1.0, hence everything else, should be explicit in the encoding function. I generally Expose To The Right (ETTR) and have about 12 usable stops of linear dynamic range available from a single raw file. I want to be able to plug that into a 12-stop CR display and DTTR (D for Display), with a linear OOTF*. Let me decide how bright I want max diffuse white (or peak or mid-gray) luminance to be, depending on the medium and viewing conditions, which will be different on an HDR 27" monitor at a desk than on a 90" TV in the living room.
That's my current feeling for photography, don't know much about video/gaming.
* If it doesn't fit we have Tone Mapping Operators.
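For what "doesn't fit" could look like in practice, here is a small Python sketch of one well-known tone mapping operator, the extended Reinhard curve, rescaled so that the scene maximum lands exactly on the display peak. It is picked only as a familiar example, not as what ACR or any particular display pipeline uses. All values are relative (1.0 = max diffuse white), and the name `reinhard_extended` and its parameters are illustrative.

```python
def reinhard_extended(x: float, scene_max: float, display_peak: float) -> float:
    """Compress relative linear `x` so that `scene_max` maps to `display_peak`.

    All arguments are relative to max diffuse white = 1.0. Values far below
    the display peak stay nearly linear; highlights are rolled off.
    """
    u = x / display_peak              # input in units of display peak
    u_max = scene_max / display_peak  # scene maximum in the same units
    return display_peak * u * (1.0 + u / (u_max * u_max)) / (1.0 + u)


if __name__ == "__main__":
    # Assumed example: scene reaches +4 stops (16.0), display offers +2 stops (4.0).
    for x in (0.18, 1.0, 4.0, 16.0):
        print(f"in {x:5.2f} -> out {reinhard_extended(x, 16.0, 4.0):.3f}")
```

Note that this particular curve also darkens diffuse white somewhat, which is exactly the kind of trade-off a photographer would want to control.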