"Exif metadata" - a mini tutorial

Doug Kerr

You have perhaps noted that I often speak of "Exif metadata" when describing the familiar information about the camera, exposure details, and the like embedded in our camera output files, often called "Exif data" by others. I thought that an explanation of the term I use might be in order.

First, what does "Exif" signify? It does not, as is often believed, refer just to the provision in an image file for the kind of information I mentioned above. Rather, Exif (short for "Exchangeable image file format for digital still cameras") designates a specification for an entire image file format. It is the format used for the JPG files generated by our cameras.

(To be a bit more precise, we are actually talking about the "Design rule for Camera File System" format, which is an adaptation and elaboration of the Exif format, Version 2.1 up, and which prescribes guidelines for folder and filename matters.)

Exif files come in two flavors, in one of which the actual image data is in TIFF form, and in the other, in JPEG form. It is of course the latter flavor that we most commonly encounter.

Now, all that having been said, one of the most distinctive aspects of the Exif file format is indeed its inclusion of provisions for carrying information identifying the camera, giving the technical details of the exposure, and so forth. This is formally called "Exif metadata".

What does "metadata" mean? In this context, metadata can be defined as "data about data". In our case, what is the actual "data" that the metadata is about? It is the image data itself.

Thus, to be really rigorous, "Exif data" refers to the JPEG image in our Exif files - its "payload".

Aren't you glad you asked!

Best regards,

Doug
 
Very interesting, thanks.

I've been wondering a bit how this Exif metadata is stored with JPGs and TIFFs, as I get all sorts of different results when saving with Photoshop.

Sometimes it seems to destroy the Exif data, other times it's intact. But here is the bizarre one. I usually use IrfanView to view my images and often, after saving with PS, all the Exif data seems to be gone - but the same images SHOW all the Exif data with other viewing packages.

Weird.

--
For those inclined, I can be found here:
http://www.pbase.com/jchambers
or
http://www.photosig.com/users.php?id=3042
 
Hi, Jo,
Very interesting, thanks.
Sometimes it seems to destroy the Exif data, other times it's
intact. But here is the bizarre one. I usually use IrfanView to
view my images and often, after saving with PS, all the Exif data
seems to be gone - but the same images SHOW all the Exif data with
other viewing packages.
The syntax for the Exif metadata is quite complex. Some Exif reading applications take shortcuts in locating the various data items that won't necessarily work with all "legitimate" Exif files.

It might be that PS follows some particular legitimate approach in formatting the data that some "shortcut" in IrfanView won't follow, or maybe PS does something illegal that is still tolerated by many apps, but not IrfanView.

I'd suggest that you send one of the affected files to Irfan and ask him to look into why IrfanView doesn't read the Exif metadata.

Best regards,

Doug
 
There must be a reason, since people rarely put in complexity for its own sake. Why not, for example, embed an XML document containing the metadata, and provide a schema that describes the syntax? Does Exif metadata predate the "XML revolution?"

Petteri
--
Me on photography: [ http://www.prime-junta.tk/ ]
Me on politics: [ http://p-on-p.blogspot.com/ ]
 
Why not, for example, embed an XML document containing the
metadata, and provide a schema that describes the syntax? Does Exif
metadata predate the "XML revolution?"
AFAIK the first Exif specification document dates back to 1996 and was sold by JEIDA (Japan Electronic Industry Development Association) for $60.
There must be a reason, since people rarely put in complexity for its own
sake.
Unfortunately sometimes people do put in avoidable complexity, usually not for its own sake but because they don't know any better. This is especially true for syntactic definitions -- just look at some of the most popular programming "languages". They are unnecessarily hard to parse for both humans and machines.
 
e.g. smugmug can make a good guess at the focal length of any photo. How does it know? Is this included in the Exif data?
--

jonclayton.smugmug.com
 
Hi, Petteri,
There must be a reason, since people rarely put in complexity for
its own sake. Why not, for example, embed an XML document
containing the metadata, and provide a schema that describes the
syntax? Does Exif metadata predate the "XML revolution?"
The Exif format specification certainly predates XML.

Probably more to the point, the "pointer-based" syntax prescribed was very much in favor for data structures in many IT areas during that era. Very similar approaches were used in, for example, many telecom signal formats developed during the same era (such as those involved in Signalling System No. 7, and in many wireless telephony system control protocols).

These formats appeal to programmers whose outlook is that of assembly language. Even though such standards are supposed to be "implementation-independent", that is not the reality of life in a standards committee. There are various things about the structure of ASCII, for example, that were ultimately decided based on somebody comparing the implications of one choice or another on the type of latches that would be required to screen certain parts of the character set in an electromechanical teletypewriter. (I have to confess that I did some of that myself.)

I suspect that, with the perspective of the era, the pointer syntax was seen as quite easy to parse with simple code. But that is no protection against people taking shortcuts. ("I don't need to let down no stinking steering column to take out the speedometer in this puppy.")
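
For anyone curious what "pointer-based" means in practice, here is a minimal Python sketch of following the pointers to one item, the focal length. It assumes the TIFF-structured metadata block has already been extracted from the file into a bytes object. The function names are hypothetical; the tag numbers (0x8769 for the pointer to the Exif sub-directory, 0x920A for focal length) are the ones I understand the Exif specification to assign. It is an illustration only, not how any particular application does it.

```python
import struct

EXIF_IFD_POINTER = 0x8769  # entry in the first directory pointing to the Exif sub-directory
FOCAL_LENGTH = 0x920A      # focal length, stored as a RATIONAL (two 32-bit integers)


def read_ifd(tiff, offset, endian):
    """Read one directory (IFD) and return {tag: 4-byte value-or-offset field}."""
    (count,) = struct.unpack_from(endian + "H", tiff, offset)
    entries = {}
    pos = offset + 2
    for _ in range(count):
        # Each entry is 12 bytes: tag (2), type (2), count (4), value or offset (4).
        (tag,) = struct.unpack_from(endian + "H", tiff, pos)
        entries[tag] = tiff[pos + 8:pos + 12]
        pos += 12
    return entries


def focal_length(tiff):
    """Follow header -> first IFD -> Exif IFD -> RATIONAL value; None if absent."""
    order = tiff[:2]
    if order == b"II":
        endian = "<"   # little-endian ("Intel") byte order
    elif order == b"MM":
        endian = ">"   # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF-structured block")
    magic, first_ifd = struct.unpack_from(endian + "HI", tiff, 2)
    if magic != 42:
        raise ValueError("bad TIFF magic number")

    ifd0 = read_ifd(tiff, first_ifd, endian)
    if EXIF_IFD_POINTER not in ifd0:
        return None
    (exif_offset,) = struct.unpack(endian + "I", ifd0[EXIF_IFD_POINTER])

    exif = read_ifd(tiff, exif_offset, endian)
    if FOCAL_LENGTH not in exif:
        return None
    # A RATIONAL is 8 bytes, too big for the 4-byte field, so the field is a pointer.
    (value_offset,) = struct.unpack(endian + "I", exif[FOCAL_LENGTH])
    numerator, denominator = struct.unpack_from(endian + "II", tiff, value_offset)
    return numerator / denominator if denominator else None
```

Note that three separate offsets have to be followed before the value is reached. A reader that instead assumes the item always sits at some fixed position is taking exactly the kind of shortcut that will fail on another writer's perfectly legitimate file.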

Recall the problem we had in the year 2000 CE regarding leap year calculations, as a result of the fact that many programmers had earlier not implemented the entirety of the long clearly-defined algorithm for whether a year was a leap year, saving a couple of lines of code because the third clause "would never come into play during the lifetime of these systems".

Many of those guilty of this then said, "well, who would have known that 2000 would be an exception to the rule?" It of course was no exception to the rule - there was just one more clause in the rule than they had bothered to implement. (I myself learned the rule - all of it - in 1940, when my mother looked up "leap year" in the encyclopedia for me on February 29.)
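
For the record, the whole rule fits in a few lines. The following is a small Python sketch (the function names are mine, purely for illustration) of the complete three-clause rule next to a two-clause shortcut of the kind described above.

```python
def is_leap_year(year):
    """Full Gregorian rule, all three clauses."""
    if year % 400 == 0:   # third clause: divisible by 400 -> leap year after all
        return True
    if year % 100 == 0:   # second clause: other century years are not leap years
        return False
    return year % 4 == 0  # first clause: divisible by 4 -> leap year


def is_leap_year_shortcut(year):
    """Two-clause shortcut that omits the 400-year clause (wrong for 2000)."""
    return year % 4 == 0 and year % 100 != 0
```

The two functions agree for every year from 1901 through 1999, which is presumably why the shortcut seemed safe; they part company at 2000, which the full rule calls a leap year and the shortcut does not.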

(You guys here who suffer from me can put part of the blame on Florence Louise Ledee Kerr, who stimulated and supported so much of my early technical learning, and gracefully tolerated the fact that I really didn't care for any other learning.)

Best regards,

Doug
 
It appears that subject distance, previously available for lenses that supported it, is no longer being output on the latest cameras (350D and 20D).

Now is this cockup* or conspiracy? (* quaint UK term meaning error!).

It does appear to be making life more difficult for using lens correction software, which plays into the conspiracy theory.

Canon may well have concerns about the increasing trend for lens correction in the digital domain to iron out some of the performance differences between low end/high end Canon lenses, and between Canon and third party lenses.

I certainly resent this data being missing - apart from lens correction it is useful for learning purposes and understanding what went wrong with some shots. Canon made a certain amount of marketing noise about some lenses providing this data, and customers may well have chosen such lenses in the expectation that this data would be available.

It would be different if it had never been output, but removing it, especially without giving explicit warning to this effect, means that customers upgrading from previous cameras had a legitimate and reasonable expectation of continuing to have the use of this information. Customers could even have grounds for legal remedy (if they have deep enough pockets!).

Personally, I am convinced that digital correction is the way forward, so much so that I would like to see a lens design which was simplified and optimised for digital correction, even if at the expense of some viewfinder image distortion.

Fred
 
Hi, Fred,
It appears that subject distance - previously available for lenses
that supported it, is no longer being output on the latest cameras
- 350D and 20D
I do know that in the past, when focus distance was supplied in the Exif metadata for certain Canon lenses, it was often very unreliable - even unbelievable.

I'm not sure why that was.

Possibly figuring into this whole story somehow is the rumor that Canon's ability to report this was hampered by some Nikon patent regarding derivation of distance information from AF lenses. I never heard any details, or even an authentic confirmation that there was such an issue.

But everything you hear means something!

Best regards,

Doug
 
Thanks for the explanation. That does explain it. The Lord knows we need to deal with a lot of historical legacy.

For starters, everyone should start using Unicode for everything. Now.

Petteri
--
Me on photography: [ http://www.prime-junta.tk/ ]
Me on politics: [ http://p-on-p.blogspot.com/ ]
 
Yes - I've seen some wacky values in the past too!

However there has been quite a lot of marketing hype about using distance information in E-TTL II (and before someone jumps down my throat I know it is not essential for that).

But if Canon have withdrawn this data, at least in the form of subject distance information in the EXIF, because they can't provide it accurately (with or without infringing Nikon patents), then what are we to infer about its use in fine tuning flash output!

In any case, given the previous history, I think the lack of information provided to current owners and potential purchasers, about this new policy - if such it is - is inexcusable.

On the other hand maybe it is all just a mistake and the data will re-appear in the next round of firmware upgrades!

Fred
 
It's nice to be a grandpa.

In that vein, my first wife (now deceased) was always a little irritated that I always used to remember the birth date of our youngest daughter as "two days after the first release of ASCII was approved" (which was July 17, 1963).

That daughter in fact later in life often gave seminars and training programs on various computer application topics. Sometimes, she would point out that in a certain situation, it would perhaps be easiest to cast the data as an ASCII file to transport it to a new context.

Once, a student said, "Linda, just how long has ASCII been around, anyway?"

"Two days longer than me", she replied.

Best regards,

Doug
 
