You get the focal length from the EXIF, then you measure the scene (in real life), measure the image, triangulate and compare.
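As a rough sketch of that comparison, assuming a simple pinhole model with no lens distortion and a subject far from the camera: work out the focal length the picture implies from a measured object, then compare it with the EXIF value. The function names are made up for illustration, and the sensor width is an assumption you would have to look up for the camera model, since it is usually not in the EXIF.

```python
def implied_focal_length_mm(object_px, image_width_px, sensor_width_mm,
                            object_size_m, distance_m):
    """Focal length implied by a measured object, under a pinhole model.

    Assumes the image spans the full sensor width (i.e. no crop) and that
    the object is measured along the same axis as the image width.
    """
    # Size the object would occupy on the sensor if the frame is uncropped.
    size_on_sensor_mm = object_px / image_width_px * sensor_width_mm
    # Similar triangles: size_on_sensor / f = object_size / distance.
    return size_on_sensor_mm * distance_m / object_size_m


def scale_vs_exif(exif_focal_mm, object_px, image_width_px, sensor_width_mm,
                  object_size_m, distance_m):
    """Ratio of implied focal length to the EXIF focal length.

    Around 1.0 is consistent with an uncropped frame; clearly above 1.0
    suggests the frame was cropped and enlarged back to the original size.
    """
    implied = implied_focal_length_mm(object_px, image_width_px,
                                      sensor_width_mm, object_size_m,
                                      distance_m)
    return implied / exif_focal_mm


if __name__ == "__main__":
    # Hypothetical numbers: a 2 m object at 10 m, 36 mm wide sensor,
    # 4000 px wide image, EXIF says 45 mm.
    print(scale_vs_exif(45.0, 1000, 4000, 36.0, 2.0, 10.0))
    print(scale_vs_exif(45.0, 1250, 4000, 36.0, 2.0, 10.0))
```

Any error in the distance, the object size or the sensor width goes straight into the ratio, so a small deviation from 1.0 proves nothing on its own.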
You are getting into major forensics here, and unless you know the EXACT lens and the EXACT focal length, you won't be able to do what you are thinking.
I'm also sure that there are enough variations in lenses to render this difficult.
Then what if you crop and then resize back? And what if the person doing this has also performed lens correction (or even deliberate distortion - this is a legal case after all)?
Way too many factors, and if it's only a small crop, practically impossible to work out.