Modifying raw image data for experiments

Horshack

When experimenting with sensor analysis it can be useful to modify the raw data and then see how that modification translates visually by loading the modified raw data into your favorite demosaicing app, such as ACR/PS, LR, RawTherapee, etc.

The ideal conversion scenario would be simply replacing the raw data in its existing vendor-specific raw or DNG container. That way you retain all the metadata and profile info. I'm still working on finding an existing tool chain that will do that, or writing my own. In the meantime I found a tool chain that will at least get you to a working DNG file:

Tools needed
  1. Build an executable from the pgm2dng GitHub source. A pre-built Windows binary is included in the repository - direct link here
  2. Get dcptool. Windows executable here (free) and OSX here ($9.99)
  3. Matlab ($$$) or Octave
  4. LibRaw's unprocessed_raw.exe, available here
  5. exiftool
One-time steps per camera model
  1. Convert any original out-of-camera raw for the camera model you're working with to a DNG using Adobe's DNG Converter
  2. Build an XML representation of the Adobe Digital Camera Profile for that DNG by running dcpTool -d <raw image file> out.dcp.xml
  3. Convert the XML file into a binary DCP by running dcpTool -c out.dcp.xml out.dcp. Ignore any informational warnings displayed
Modifying a raw image
  1. Convert the raw file to a PGM by running unprocessed_raw <raw filename>. This builds a greyscale image with the original raw bayer data. The output filename will be the same as the input raw name plus PGM. For example, a73.ARW becomes a73.ARW.pgm
  2. Load the PGM file into Octave via imgData = imread("<pgm filename>") and make whatever modifications you want to the raw data contained in the imgData matrix. You may also choose to delete the masked pixels to match the actual dimensions of the image. For example, the A73 is 6048x4024 with masked pixels and 6024x4024 without
  3. Write the modified raw data using imwrite(imgData, "<modified pgm filename>")
  4. Convert the modified PGM file into a DNG via pgm2dng with the following parameters:
    1. --in=<modified pgm filename>, where the filename is the one used in imwrite
    2. --out=<modified DNG filename>, where the filename is of your choosing
    3. --dcp=out.dcp
    4. --pattern=RGGB (or GBRG for raws that start with a G2-B row)
    5. --wp=<R,G,B>, which specifies the white balance. One way to get this is to load the original raw into a demosaicing tool that doesn't apply WB and sample the RGB of a neutral patch. However, since WB is so easy to set after the fact, I just use any triplet, like --wp=128,128,128
    6. Optional: --white=<whitepoint>, which you can get via exiftool -whitelevel <raw filename>. Use only one of the three values. For example, --white=15360
    7. Optional: --black=<blackpoint>, which you can get via exiftool -blacklevel <raw filename>. Use only one of the three values. For example, --black=512
  5. Load the modified DNG into your favorite image processing app and set the white balance
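For anyone scripting step 4, here is a minimal Python sketch that only assembles the pgm2dng argument list from the options listed above. The helper name build_pgm2dng_args, the example filenames, and the assumption that the executable is reachable as "pgm2dng" are illustrative and not part of the tool itself.

```python
def build_pgm2dng_args(pgm_in, dng_out, dcp="out.dcp", pattern="RGGB",
                       wp=(128, 128, 128), white=None, black=None,
                       exe="pgm2dng"):
    """Assemble a pgm2dng command line from the options described in the post.

    exe is an assumption: adjust it to wherever your pgm2dng binary lives.
    """
    args = [
        exe,
        f"--in={pgm_in}",
        f"--out={dng_out}",
        f"--dcp={dcp}",
        f"--pattern={pattern}",
        "--wp=" + ",".join(str(v) for v in wp),
    ]
    if white is not None:          # optional white level
        args.append(f"--white={white}")
    if black is not None:          # optional black level
        args.append(f"--black={black}")
    return args


# Hypothetical filenames; white/black values are the examples from the post
args = build_pgm2dng_args("a73_mod.pgm", "a73_mod.dng", white=15360, black=512)
print(" ".join(args))
# To actually run the conversion: subprocess.run(args, check=True)
```

Keeping the arguments as a list (rather than one string) avoids shell quoting problems if a filename contains spaces.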
Here's a trivial example of dpreview's A73 ISO 100 studio sample with the lower 12 bits of every raw pixel cleared in Octave via bitand(imgData, 0xf000), as processed by ACR/PS:


A7 III ISO 100 Studio sample with lower 12-bits of raw data masked off
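The same bit-masking experiment can be sketched outside Octave. The Python below is a rough stdlib-only equivalent, assuming the PGM is a 16-bit binary (P5) file, which stores samples big-endian; the function names are made up for this illustration, and the demo runs on a tiny in-memory image rather than a real raw.

```python
import array
import sys


def read_pgm16(data):
    """Parse a 16-bit binary (P5) PGM held in a bytes object."""
    tokens, pos = [], 0
    while len(tokens) < 4:                      # magic, width, height, maxval
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":           # skip a header comment line
            pos = data.index(b"\n", pos) + 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    pos += 1                                    # one whitespace byte precedes the pixels
    width, height, maxval = (int(t) for t in tokens[1:])
    if tokens[0] != b"P5" or maxval < 256:
        raise ValueError("expected a 16-bit binary PGM")
    samples = array.array("H")
    samples.frombytes(data[pos:pos + 2 * width * height])
    if sys.byteorder == "little":               # file order is big-endian
        samples.byteswap()
    return width, height, maxval, samples


def write_pgm16(width, height, maxval, samples):
    """Serialize samples back into 16-bit binary PGM bytes."""
    out = array.array("H", samples)
    if sys.byteorder == "little":
        out.byteswap()
    return b"P5\n%d %d\n%d\n" % (width, height, maxval) + out.tobytes()


def mask_low_bits(samples, mask=0xF000):
    """Clear every bit not set in mask (0xF000 clears the lower 12 bits)."""
    return array.array("H", (v & mask for v in samples))


# Round trip on a tiny in-memory 2x2 image
src = write_pgm16(2, 2, 65535, [0x1234, 0xFFFF, 0x0FFF, 0x8001])
w, h, maxval, px = read_pgm16(src)
masked = mask_low_bits(px)
dst = write_pgm16(w, h, maxval, masked)
print([hex(v) for v in masked])
```

For a real file, read the bytes with open(name, "rb").read() and write dst back out with open(name, "wb").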
 

Here are some quick Matlab/Octave recipes for modifying individual color channels as part of the raw data analysis/modification workflow.

rawBayerToChannels.m - Separates color channels from raw bayer data
channelsToRawBayer.m - Interleaves color channels back into a bayer pattern
swapRedBlueChannels.m - Example code that uses the above two functions

Here's an animated PNG showing the dpreview A73 ISO 100 studio raw image with its R/B channels swapped, with the resulting raw processed by ACR/LR:
Animated PNG: A73 ISO 100 Studio Image, Before/After of Red/Blue raw channel swap
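To make the separate/interleave idea concrete without the .m files at hand, here is a small Python sketch on plain lists. It is not a port of rawBayerToChannels.m or channelsToRawBayer.m; the function names and the RGGB layout of the demo mosaic are assumptions for illustration.

```python
def bayer_to_channels(img):
    """Split a Bayer mosaic (list of rows) into four planes keyed by (row, col) phase."""
    return {
        (dy, dx): [row[dx::2] for row in img[dy::2]]
        for dy in (0, 1)
        for dx in (0, 1)
    }


def channels_to_bayer(planes):
    """Interleave the four phase planes back into a full mosaic."""
    h2 = len(planes[(0, 0)])
    w2 = len(planes[(0, 0)][0])
    img = [[0] * (2 * w2) for _ in range(2 * h2)]
    for (dy, dx), plane in planes.items():
        for y, row in enumerate(plane):
            img[2 * y + dy][dx::2] = row     # write every second column
    return img


# Demo mosaic, assumed RGGB: R=1, G1=2, G2=3, B=4
mosaic = [[1, 2, 1, 2],
          [3, 4, 3, 4]]
planes = bayer_to_channels(mosaic)
# For RGGB, red is phase (0, 0) and blue is phase (1, 1)
planes[(0, 0)], planes[(1, 1)] = planes[(1, 1)], planes[(0, 0)]
swapped = channels_to_bayer(planes)
print(swapped)
```

Keying the planes by phase rather than by color name keeps the helpers independent of the CFA pattern; only the swap step needs to know which phase is red and which is blue.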
 
Nice use of reshaping for the interleaving.

-h
 
Here's a more practical application of raw modification vs using it just for scientific tinkering - the ability to generate flat-field corrected raws. Lightroom actually supports this - Eric Chan's plugin was added to the mainline code in 2019. However, Lightroom's implementation doesn't let you configure the blur size so it's not useful for removing dust spots, which is what I needed for an RX1 I own that has sensor dust which can't be cleaned off the sensor.

Here's the before/after of applying a flat frame to a raw, as rendered by ACR/LR. I did a quick 'n dirty sky shot for the flat so the colors are a bit janky for this sample.

Animated PNG: Before/after flat-field dust removal

Source code: applyRawFlatFrame.m
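For readers unfamiliar with flat-field correction, the core arithmetic can be sketched in a few lines of Python. This is not applyRawFlatFrame.m: it assumes a single black level for all channels, applies no blur to the flat, and normalizes each Bayer phase by its own mean so the flat's color cast is not baked into the result. The function name and the demo values are illustrative.

```python
def apply_flat(img, flat, black=0, white=65535):
    """Divide img by the flat frame, one Bayer phase at a time.

    img and flat are lists of rows with identical dimensions.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for dy in (0, 1):
        for dx in (0, 1):
            # Mean of the black-subtracted flat for this phase
            vals = [flat[y][x] - black
                    for y in range(dy, h, 2) for x in range(dx, w, 2)]
            mean = sum(vals) / len(vals)
            for y in range(dy, h, 2):
                for x in range(dx, w, 2):
                    f = max(flat[y][x] - black, 1)       # avoid divide-by-zero
                    v = (img[y][x] - black) * mean / f + black
                    out[y][x] = min(max(int(round(v)), 0), white)
    return out


# Demo: a uniform scene where one pixel sits under a dust spot
flat = [[100] * 4 for _ in range(4)]
flat[0][2] = 50                    # dust blocks half the light here
img = [[1000] * 4 for _ in range(4)]
img[0][2] = 500                    # the same spot darkens the image
fixed = apply_flat(img, flat)
print(fixed[0])
```

After correction the dust pixel matches the other pixels of its phase. The overall level of that phase shifts slightly because the dust pixel lowers the flat's mean, which is harmless once white balance and exposure are set in the raw converter.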
 
Good one Horshack. It's not as elegant as your approach but for reference I've been using the following hack in Matlab via Adobe DNG Converter to achieve the same objective (don't know about Octave).

1) Convert the original raw file to DNG with Adobe's DNG Converter, Uncompressed option ticked, and place it in the working directory. Then read the raw data directly into Matlab

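t = Tiff(fileName,'r');           % open the DNG read-only
offsets = getTag(t,'SubIFD');     % offsets of the sub-IFDs in the file
setSubDirectory(t,offsets(1));    % select the first sub-IFD, taken here to hold the raw CFA image
cfa = read(t);                    % raw Bayer data as a matrix
close(t);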

2) Modify the raw data in array cfa as desired while ensuring that at the end it is still the same size and in its native uint16 format. Then write it back to the DNG file above

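t = Tiff(fileName,'r+');          % reopen the same DNG for in-place modification
setSubDirectory(t,offsets(1));    % select the same sub-IFD that was read above
t.write(cfa);                     % overwrite the raw data with the modified array
close(t);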

3) Now open the modified DNG file in the raw converter of choice. The simple code above does not update checksums etc. so for instance PS ACR complains that the file appears to be damaged but opens it fine anyways.

Jack
 
Thanks Jack. The next step I described in the OP was to find a way to do in-place replacement of the raw data in the DNG, and your examples set me on the right track. Octave doesn't have that TIFF library, but all the info I need to find the raw data inside the DNG is contained in the EXIF, so I'll invoke exiftool to extract the offsets/lengths and then do a straight binary read inside an Octave/Matlab script to pull the data in. I should be able to fix up the DNG checksum as well - it is available via the NewRawImageDigest EXIF tag and is an MD5 hash of the raw data. Or it might be easier to remove that EXIF tag entirely so the MD5 doesn't have to be recalculated. I'll post the resulting code when I have it done, probably within a few days as time permits.
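The "straight binary read" idea can be sketched in Python as follows. This is only an illustration under several assumptions: the raw data is uncompressed 16-bit, stored in one contiguous strip, little-endian, and its offset and byte count have already been obtained elsewhere (for example from the StripOffsets and StripByteCounts values that exiftool reports for the raw sub-IFD). Tiled DNGs would need more work. The function names are made up, and the demo uses a stand-in file rather than a real DNG.

```python
import array
import os
import struct
import sys
import tempfile


def read_raw_samples(path, offset, byte_count, little_endian=True):
    """Read uint16 samples from a known byte range of a file."""
    with open(path, "rb") as f:
        f.seek(offset)
        buf = f.read(byte_count)
    samples = array.array("H")
    samples.frombytes(buf)
    if (sys.byteorder == "little") != little_endian:
        samples.byteswap()                 # convert file order to native order
    return samples


def write_raw_samples(path, offset, samples, little_endian=True):
    """Overwrite the same byte range in place; the rest of the file is untouched."""
    out = array.array("H", samples)
    if (sys.byteorder == "little") != little_endian:
        out.byteswap()                     # convert native order to file order
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(out.tobytes())


# Stand-in file: 8 "header" bytes, four samples, 4 "trailer" bytes
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.write(b"HEADER__" + struct.pack("<4H", 1, 2, 3, 4) + b"TAIL")

raw = read_raw_samples(path, 8, 8)
write_raw_samples(path, 8, array.array("H", reversed(raw)))
with open(path, "rb") as f:
    after = f.read()
os.remove(path)
print(list(raw), after)
```

Because the write covers exactly the bytes that were read, the data must keep the same size and sample format, which is the same constraint Jack mentions for the Matlab approach.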
 
I have completed a set of Octave/Matlab function scripts for reading and writing the raw data inside a DNG. Unlike the OP workflow this works on an existing DNG, so it doesn't require converting back and forth between a raw and PGM file. You do have to use uncompressed DNGs though.

The GitHub repository is here. I've only tested these under Octave but will ping Jack to see if he can try them under Matlab as well.

loadDngRawData.m - loads the raw data for a DNG. It also loads the full EXIF data into a containers.Map, which allows the EXIF info to be accessed like an associative array. For example, exifMap("iso") returns the ISO value. Use exifMap.keys() to get a full list of tags available for the DNG loaded.

saveRawDataToDng.m - writes modified raw data back into the DNG. It also strips off the MD5 digest of the raw data, which will stop ACR/LR from complaining about the file being damaged.

swapRedBlueChannelsInDng.m - Trivial example showing how to use loadDngRawData and saveRawDataToDng.

applyFlatFrameToDng.m - Another simple example of usage, applies a raw flat frame to another raw file (for sensor dust removal and vignetting correction).
 
I have added automated median image stacking (source, documentation) to the repository. Here's how it works:
  1. Put all your raw files in a directory
  2. From the Octave console, run createMedianStackedDngs("source dir", "output dir")
Script does the following:
  1. Converts all raw images in "source dir" to uncompressed DNGs by spawning Adobe's DNG converter. Thanks to Jack for giving me the idea of using the DNG converter from the command-line.
  2. Scans all converted DNGs to automatically find related images to stack. Right now it considers images to be in the same stack if their EXIF creation time tags are within two seconds of each other.
  3. Reads the raw data from each collection of images for a detected sequence and calculates the median.
  4. Generates a new DNG with the calculated median by cloning the first image of the stack and overwriting its raw data with the calculated median. The resulting filename is that of the first image of the stack with "_x_Stacked" appended, where x is the number of images stacked to create the DNG.
  5. Loops back to step #2 to find the next related group of images to stack in "source dir".
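The grouping and median steps above can be sketched in Python as follows. This is not the repository's createMedianStackedDngs code: the function names are illustrative, timestamps are plain seconds rather than parsed EXIF tags, and "within two seconds of each other" is interpreted here as the gap between consecutive images, which is an assumption.

```python
from statistics import median


def group_by_time(stamped, max_gap=2.0):
    """Group (name, seconds) pairs; consecutive images within max_gap share a stack."""
    groups = []
    for name, t in sorted(stamped, key=lambda p: p[1]):
        if groups and t - groups[-1][-1][1] <= max_gap:
            groups[-1].append((name, t))      # close enough to the previous image
        else:
            groups.append([(name, t)])        # start a new stack
    return [[n for n, _ in g] for g in groups]


def median_stack(frames):
    """Per-pixel median across frames (each frame is a list of rows)."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[int(median(f[y][x] for f in frames)) for x in range(w)]
            for y in range(h)]


stacks = group_by_time([("a", 0.0), ("b", 1.5), ("c", 3.0), ("d", 10.0)])
print(stacks)

# Three 1x2 frames; the 500 is an outlier that the median rejects
result = median_stack([[[10, 5]], [[12, 500]], [[11, 6]]])
print(result)
```

The outlier rejection in the second demo is the reason to prefer a median over a mean when a few frames contain hot pixels or transient objects. With an even number of frames, statistics.median averages the two middle values, and int() then truncates any half.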
Here is the sample output of an underexposed GX85 ISO 6400 9-image stack, comparing the stacked image to one of the constituent images making up the stack

Animated PNG (9MB) - GX85 ISO 6400 9-image median stack

Notes:
  • This hasn't been tested under Matlab yet since I don't own a copy. Jack has generously offered his time to help fix any compatibility issues between my Octave scripts and Matlab.
  • I haven't tested this under the Mac OS yet. I plan to within the next few days.
  • Code is initial implementation - it can use some cleanup when I have the time.
 
I recently had to experiment with pixel binning for a project I'm working on. I thought I'd post in case anyone has interest. I added a module to my Octave Raw Tools framework that does 2x2 pixel binning on regular bayer DNG images. This isn't binning per se since I'm not combining electrical charges like a binning sensor would - I'm instead averaging the post-readout pixels.

The specific source modules are pixelBinDng2x2.m and pixelBin2x2.m. There's also support for unprocessed_raw.exe PGMs via pixelBinPgm2x2.m.
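One way to average 2x2 groups while keeping a valid Bayer mosaic is to combine the four same-color pixels inside each 4x4 block, so that every 4x4 input block becomes one 2x2 output cell. The Python below is a sketch of that interpretation only; it is not taken from pixelBin2x2.m, which may group or weight pixels differently, and the function name is made up.

```python
def bin_bayer_2x2(img):
    """Average same-color pixels so each 4x4 block becomes one 2x2 Bayer cell.

    img is a list of rows; any rows/columns beyond a multiple of 4 are dropped.
    """
    h = len(img) // 4 * 4
    w = len(img[0]) // 4 * 4
    out = [[0] * (w // 2) for _ in range(h // 2)]
    for oy in range(h // 2):
        for ox in range(w // 2):
            # Top-left same-color pixel of the source block for this output pixel
            y0 = 4 * (oy // 2) + (oy % 2)
            x0 = 4 * (ox // 2) + (ox % 2)
            total = (img[y0][x0] + img[y0][x0 + 2] +
                     img[y0 + 2][x0] + img[y0 + 2][x0 + 2])
            out[oy][ox] = (total + 2) // 4       # rounded integer average
    return out


block = [[10, 20, 14, 24],
         [30, 40, 34, 44],
         [12, 22, 16, 26],
         [32, 42, 36, 46]]
binned = bin_bayer_2x2(block)
print(binned)
```

Because each output pixel keeps the phase of the pixels it was averaged from, the CFA pattern of the binned image is the same as the original, and the width and height are halved.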

Here is a sample output, showing a 100% crop of a 2x2 pixel-binned A7rIV image vs the original image downsampled via Preserve Details 2.0.

Animated PNG: Pixel-binning vs Downsample, 100% Crop
 
I've been planning on working on something similar for a while using rawpy, and finally stopped procrastinating (it helps that tifffile added support for a bunch of DNG tags internally so I didn't have to mess with multiple metadata libraries - well at least I didn't prior to wanting to copy lens/model metadata to the destination...):

https://github.com/Entropy512/pyimageconvert/blob/master/libraw2dng.py - Basic converter that takes anything supported by rawpy and attempts to output a DNG. Some metadata is copied - the script currently is tuned to behave as a preprocessor for RawTherapee to handle unsupported raw formats until RT 6.0's libraw rework happens, but happens to allow you to process the raw data as you wish. I'm currently subtracting black levels and scaling the white level in the script and then saving as float16 - if this seems weird, it was because it was preparation for

https://github.com/Entropy512/pyimageconvert/blob/master/pystack.py - mean-stacking of a set of images. It's not as automated as Horshack's stacker so far (does not auto-identify stacks based on interval between images) but this could be added and I plan on it eventually.

These should serve as decent starting points for anyone who wants to do their own manipulation using numpy
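As a rough picture of why black subtraction and white scaling come before mean-stacking, here is a dependency-free Python sketch. It is not code from libraw2dng.py or pystack.py (which use rawpy and numpy); the function names are illustrative, and the black and white levels in the demo are just the example values quoted earlier in the thread.

```python
def normalize(frame, black, white):
    """Map raw values so black level -> 0.0 and white level -> 1.0."""
    span = white - black
    return [[(v - black) / span for v in row] for row in frame]


def mean_stack(frames):
    """Per-pixel mean across equally sized frames (lists of rows)."""
    n = len(frames)
    return [[sum(px) / n for px in zip(*rows)] for rows in zip(*frames)]


# Two 1x2 frames with black=512 and white=15360
a = normalize([[512, 15360]], 512, 15360)
b = normalize([[512, 7936]], 512, 15360)
stacked = mean_stack([a, b])
print(stacked)
```

Working in normalized floating point means the average keeps fractional precision instead of being rounded back to integer raw levels, which is one plausible reason for storing the result as float data.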
 
Hello everyone!

I know this thread is oldish, but kindly I need your help, please!

Since I am no programmer and I have a hard time with anything involving codes, I really struggle to do any of these steps in order to modify a specific spec in a DNG image. It is about a color target I shot for creating a custom DCP profile for my cameras in Lumariver Profile Designer.

My problem with this DNG is that its white balance deviates slightly from the ideal. I don't mean the scene: the scene is ~6500 Kelvin, an ideal condition for creating a good dual-illuminant profile with 6500-2800 Kelvin. But when I took the photo, I did not measure the white balance with the camera prior to the shot, so the image WB does not match the scene WB.

Why would I need the WB to match the 6500 K of the scene, when Lumariver can measure the white balance itself and use the measured value to create the profile instead of the image white balance? I could do that, but when I set Lumariver to use the image WB, the colors are more pleasing and more accurate, especially in the 2800 K file. I cannot have separate settings for each white point, so I can use either the image WB or the WB measured by Lumariver.

Now, I can modify the DCP file in Lumariver and combine the matrix to overcome this, but the colors are slightly nicer if I use the white balance of the RAW file, and I need it to match the scene in order to be accurate.

I could shoot the target again, but truth is, this shot is the best I have shot and it's no easy task to have spot on 6500 kelvin natural light, it usually comes in all temperatures...

So I am stuck with the option of modifying the WB in the DNG file itself to the 6500 values for my camera. I can read the data and provide you with the right values to replace, but I really cannot go over all the steps myself in order to accomplish this task.

Could anyone help me achieve this?

Thank you,

Ilarion
 
