The write-up I gave you in the body of this thread is about as good as you'll find anywhere.
Most internet 'comparisons' have only demonstrated that photographers looking to disprove something can take dimensionally flat images across multiple formats, even with really good equipment - which is brave of the photographers to admit, but there you go - maybe it makes the point for them.
None of them ever takes an MF image with clear depth and dimension and then demonstrates how to reproduce it on a smaller format.
If you can't/won't acknowledge the difference, it's a moot point - you saved yourself the cost of the upgrade.
Oh, I remember a video from a fabulous YouTube channel called mediadivision - it's focused on video, but I recommend it to everyone; the quality of the content is incredibly high. He talks about dimensionality too, and when it's taken to an extreme, as he does there, you really can see a little more true-to-life rendering.
The one fact, and a very important fact, that is often missing, which Chris points out and the video shows us, is that photographs do not obey the intuition of classical Euclidean geometry. In reality, when one takes a photograph, one is projecting a volume of three-dimensional space onto a plane. That is, projective geometry is the appropriate mathematics for describing photography. That was true from the photointerpreter analyzing reconnaissance photographs to the engineers working in computer vision today.
That is, the camera sits at the "point at infinity" in projective space, and the lens projects the space within its angle of view onto the image plane. If we move the image plane, we change the image. The concept of distance does not really exist in projective geometry the way our intuition perceives it: we cannot measure the true distance of a fixed object from its projection, and the perceived distance changes as we change the orientation of the plane or the point at infinity.
This concept is extremely important in computer vision, especially when one is trying to estimate the actual size of an object.
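To make the "projecting a volume onto a plane" idea concrete, here is a minimal pinhole-camera sketch in plain Python/NumPy (the focal length and scene points are invented for illustration). Two posts of identical physical height land on the image plane at very different sizes depending on depth, because the projection divides everything by Z:

```python
import numpy as np

def project(P, f=50.0):
    """Ideal pinhole projection: camera at the origin looking down +Z,
    image plane one focal length f away (f in mm, scene in metres).
    The divide by depth Z is where 'distance' gets thrown away."""
    X, Y, Z = P
    return np.array([f * X / Z, f * Y / Z])

# Two posts, both exactly 2 m tall, standing at 5 m and 20 m.
near_top, near_bot = (0.0, 2.0, 5.0), (0.0, 0.0, 5.0)
far_top,  far_bot  = (1.0, 2.0, 20.0), (1.0, 0.0, 20.0)

h_near = np.linalg.norm(project(near_top) - project(near_bot))
h_far  = np.linalg.norm(project(far_top) - project(far_bot))

print(h_near, h_far)   # 20.0 mm vs 5.0 mm: same object, 4x different on film
```

The depth is divided away and gone, which is why the image alone cannot tell you how big an object really is.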
"Who is taller?" Considering the link below - how tall is the woman in question.
https://dhoiem.web.engr.illinois.ed...2 - Projective Geometry and Camera Models.pdf
Euclidean geometry on the plane is insufficient to answer the question because length is not preserved.
The change of sensor size shown in this video means there is a projective transform relating the projective images produced by the two sensors, and it is obvious that size is not preserved. Our brains have no issue dealing with this, since they have been trained from birth to receive the world in projective space. But as computer vision becomes necessary to move forward in a more autonomous world where machines have to be able to see, the ability of computers to transform what cameras see into estimates of space, object locations, and sizes in a three-dimensional world becomes more and more important.
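If you want to see "length is not preserved" in numbers rather than words, here is a small sketch in plain Python/NumPy (the homography matrix is invented purely for illustration; changing the image plane induces exactly this kind of projective transform). Two segments of equal length come out unequal, while the cross-ratio, the fundamental projective invariant, survives:

```python
import numpy as np

# An invented homography (projective transform) in homogeneous
# coordinates. The non-trivial last row is what makes it more than an
# affine map: points get rescaled depending on where they sit.
H = np.array([[1.0,  0.2,  5.0],
              [0.1,  1.1,  2.0],
              [0.02, 0.01, 1.0]])

def apply_h(H, p):
    """Map a 2D point through a homography (homogeneous divide)."""
    x = H @ np.array([p[0], p[1], 1.0])
    return x[:2] / x[2]

def length(p, q):
    return np.linalg.norm(np.asarray(p) - np.asarray(q))

def cross_ratio(a, b, c, d):
    """Cross-ratio (AC/BC)/(AD/BD) of four collinear points."""
    return (length(a, c) / length(b, c)) / (length(a, d) / length(b, d))

# Four equally spaced collinear points: segments AB and CD are equal.
pts = [(0, 0), (1, 1), (2, 2), (3, 3)]
out = [apply_h(H, p) for p in pts]

print(length(pts[0], pts[1]), length(pts[2], pts[3]))  # equal going in
print(length(out[0], out[1]), length(out[2], out[3]))  # unequal coming out
print(cross_ratio(*pts), cross_ratio(*out))            # 1.333... both times
```

That surviving invariant, not Euclidean length, is what single-view measurement techniques in computer vision lean on to recover real-world sizes from a photograph.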
https://hal.inria.fr/inria-00548361/document
Yes, the world is viewed differently through different lenses, and the world is viewed differently as the sensor size changes. The enlargement applied to the image from a smaller format sensor to match that of a larger format sensor does not preserve distance. That is, the Euclidean transform does not preserve distance in the projective image. That is what the video shows, and that is exactly what Chris is speaking of.
Yes, the medium format camera will view the world in a different way than the small format. The medium format will give a perception of more depth (not to mention smoother tonal gradations), and large format will expand on that. There is a huge difference in viewing a print of an image taken using a 6x7 negative vs a 35 mm negative. There is a reason traditional fine-art landscape is the purview of large format cameras, 4x5 and up. An image of the same field of view taken by a 12x20 view camera is vastly different from one taken by a 35 mm camera: the LF image has more depth while the 35 mm seems flat.
Clyde Butcher, often called the Ansel Adams of the Everglades, is still alive and kicking. To quote Butcher, "I try to use the largest film possible for the particular subject I'm planning to photograph. So, if I have a huge, broad landscape, I use the 12x20 view camera. If I am photographing something like the Ghost Orchid I use a 4x5 inch view camera." There is a reason for that.
https://clydebutcher.com
The GFX will provide a different feel for an image. The reason comes from the physics of how an image is produced, and the appropriate mathematics for describing that physics does not match the concept of Euclidean distance in the image plane.
Whether one can see it or not doesn't really matter. Whether one cares or not doesn't really matter. If it doesn't matter enough for one to want to drop the coin on a GFX 100 or a larger format camera, that's their choice. If one is happy with how their APS-C camera renders the world - no problem. However, if we don't understand the fundamental projective geometry and teach our self-driving cars to work in projective space, then they are going to be running into each other. :-D
BTW, in radar imagery the raw data are equivalent to a "light field": the three-dimensional properties of the sector of space are preserved in the raw radar returns. The image is focused on an image plane after the fact through signal processing of the returns. The image plane can be changed and the returns refocused on a new plane; the image can be formed on any arbitrary plane that does not include the pointing vector from the radar to the scene center. However, any two different image planes will produce different-looking versions of the scene, and distances and lengths are not preserved.
That fact makes radar imagery more difficult for a human to interpret than optical imagery.
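For anyone curious what "form the image on an arbitrary plane after the fact" looks like, here is a toy sketch in Python/NumPy. It is a highly idealized single-frequency point-scatterer simulation, not real SAR processing, and the geometry, carrier, and scene are all invented. The same raw returns get focused onto two different planes by matched-filter backprojection, and the two images place the scatterers differently:

```python
import numpy as np

c, fc = 3e8, 10e9                  # speed of light; invented X-band carrier
lam = c / fc

# Synthetic aperture: radar positions along a straight 100 m track,
# standing off to one side of the scene and above it.
n_pulses = 128
track = np.stack([np.linspace(-50.0, 50.0, n_pulses),
                  np.full(n_pulses, -500.0),
                  np.full(n_pulses, 200.0)], axis=1)

# Three point scatterers (an invented scene).
targets = np.array([[2.0, 3.0, 0.0],
                    [10.0, 5.0, 0.0],
                    [-5.0, 10.0, 3.0]])

# Idealized raw returns: one complex sample per pulse, the coherent sum
# of the two-way phase delay to each scatterer (unit amplitude, no noise).
ranges = np.linalg.norm(track[:, None, :] - targets[None, :, :], axis=2)
returns = np.exp(-1j * 4 * np.pi / lam * ranges).sum(axis=1)

def backproject(returns, origin, u, v, extent=20.0, n=81):
    """Focus the raw returns onto the plane through `origin` spanned by
    unit vectors u and v, via matched-filter backprojection."""
    s = np.linspace(-extent, extent, n)
    aa, bb = np.meshgrid(s, s, indexing="ij")
    pix = origin + aa[..., None] * u + bb[..., None] * v   # (n, n, 3) grid
    img = np.zeros((n, n), dtype=complex)
    for pos, ret in zip(track, returns):
        r = np.linalg.norm(pix - pos, axis=-1)
        img += ret * np.exp(1j * 4 * np.pi / lam * r)      # undo the phase
    return np.abs(img)

origin = np.zeros(3)
u = np.array([1.0, 0.0, 0.0])
v_ground = np.array([0.0, 1.0, 0.0])                       # ground plane
v_tilted = np.array([0.0, np.cos(0.5), np.sin(0.5)])       # tilted ~29 deg

img_ground = backproject(returns, origin, u, v_ground)
img_tilted = backproject(returns, origin, u, v_tilted)

# Same returns, two focus planes: the bright responses generally land on
# different grid cells, and the spacings between them change - lengths
# are not preserved from one plane to the other.
print(np.unravel_index(img_ground.argmax(), img_ground.shape))
print(np.unravel_index(img_tilted.argmax(), img_tilted.shape))
```

View the two magnitude images side by side (e.g. with matplotlib's imshow) and you get two different-looking renderings of the same patch of space, which is exactly the interpretation problem described above.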