I do not mind if you get personal as long as you stay on the subject.

Like you, I'd like to think there is enough information out there to make image (and other media) tagging possible, but I KNOW that is not the case.

Here is the "proof" in the pudding. In the last 24 hours I had 6720 hits from Googlebot/2.1 (the text web page indexer). In the same period I had 296 hits from Googlebot-Image/1.0. For clarity, I have 100 times as many images as I have articles. Neither Google nor any other bot can cope with the volume of media out there. Also, the web servers would have a really tough time if Google continuously tried to fetch every image from pbase, SmugMug, Flickr and every other photo website out there. I have no doubt Google has the capacity to do so, but the sites it obtains the information from do not.
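
To put those log figures in per-item terms, here is a rough back-of-envelope sketch in Python. The hit counts and the 100:1 ratio are the ones from my logs; the article count is a made-up placeholder (it cancels out of the ratio), and it assumes every Googlebot/2.1 hit was for an article, which is only approximately true.

```python
# Back-of-envelope: crawler attention per item, using the log figures.
text_hits = 6720           # Googlebot/2.1 hits in 24 hours
image_hits = 296           # Googlebot-Image/1.0 hits in 24 hours
images_per_article = 100   # about 100 times as many images as articles

articles = 1000            # hypothetical count; it cancels out of the ratio
images = articles * images_per_article

hits_per_article = text_hits / articles
hits_per_image = image_hits / images

# How much more often an article gets fetched than an image does.
ratio = hits_per_article / hits_per_image

print(f"Text hits per image hit: {text_hits / image_hits:.1f}")
print(f"An article is fetched about {ratio:.0f} times as often as an image")
```

In other words, on these numbers each image gets roughly two thousand times less crawler attention than each article.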

Just think of the number of images and the volume of data these servers would have to deal with to stay current. Something else to think about: how long does it take to transfer an image over the net from one server to another, open it and analyse the binary data, index it and store it in a database? My typical image is only 800px on the long side and it can still end up 400 KB in size.
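
A minimal sketch of that volume, again in Python. Only the 400 KB per image comes from my own files; the one billion images and the 30-day refresh cycle are purely hypothetical numbers picked for illustration, and crawl_load is just a name I made up.

```python
def crawl_load(image_count, avg_kb=400, refresh_days=30):
    """Return (total terabytes, sustained megabytes per second)
    needed to re-fetch every image once per refresh cycle."""
    total_bytes = image_count * avg_kb * 1000   # decimal kilobytes
    seconds = refresh_days * 24 * 60 * 60       # length of one cycle
    return total_bytes / 1e12, total_bytes / seconds / 1e6


# Hypothetical: one billion images, re-fetched every 30 days.
total_tb, mb_per_s = crawl_load(1_000_000_000)
print(f"{total_tb:.0f} TB per cycle, {mb_per_s:.0f} MB/s sustained")
```

That is transfer only. It says nothing about the time to open, analyse, index and store each file, and the serving side of that load lands on the photo sites, not on Google.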

Here is the latest EXIF specification:

Please point me to the part that contains fields for semantic description of the image content. There is none, and thus why would Google process it for content?

Just to stress this one more time: if there is a human-readable semantic way of describing images (and other media), I'd like to know about it. And no, do not show me RDF and XMP - these are not designed for easy use.

As for changes - people do change HTML all the time. Just look at the HTML5 specification. It has some nice changes for content relevancy, but it is still focused 100% on text and not on ANY media. The IMG tag and OBJECT tag are an embarrassment to the W3C as far as a semantic approach to information goes. Even the TABLE tag has a CAPTION, but IMG does not.

Ponder on that for a while, please.

grcm wrote:

"Firstly, there is a very poor understanding of the Internet as media
platform in general amongst the creative community. Secondly, there
is even less comprehension when it comes to the Internet indexing by
search companies such as Google and Microsoft and subsequently how
the search results are or can be formulated by these companies. There
is also a third issue which confuses many people, namely proprietary
tags associated with specific media, such as EXIF generated by modern
digital cameras."

I don't want to get personal but I think you're under-estimating the
technical knowledge of many of the people here. Not all of them will
know how to write a web-server in C, but enough will.

It sounds as though you are thinking of search engines in the 1990s
era. Search engines now are rather more sophisticated. You've
ignored any sort of contextual information. e.g. your paragraph of
text surrounding the image will be used by Google as information
relating to that image. I believe they also use the EXIF information
available. We all know EXIF is a tricky standard and lots is
optional. However, for your simple example of knowing a picture is
of a cat... do you REALLY expect people to change IMG tags to satisfy
a craving for microformats? Just sticking the word "cat" into the
EXIF - anywhere - should make it clear a cat is involved.

People are not going to start changing the IMG tag. People who care
though (which is maybe 0.01% of the world) will put information into
the EXIF tag and this will be used.

Your previous post also seemed to skip the fact that EXIF is not
solely for JPEG images.

I think you're making this non-problem over-complicated.

