No, it's just simple physics. The larger sensor results in a sharper image.
No, sharpness is an almost entirely lens sided characteristic and has very little to do with the sensor. Basically the only thing that affects sharpness on a sensor is whether it has an AA filter or not.
You are not talking about the same thing I am talking about. You are talking about the sharpness of the lens; I am talking about the sharpness of the whole image.
The D500 sensor doesn't have an AA filter while the Z6 does, so if anything you should be able to get slightly sharper photos with the D500.
You cannot. And if you tried it, you would see that you can't.
If one gets less sharp photos with it, it's either because a less sharp lens has been used or perhaps focus is a bit off. It could also be that the Z6 has IBIS and he's getting some motion blur in his D500 images.
Nope. You seem to fail to understand what the sharpness of an image is.
Sharpness is measured in line pairs over a distance. Sharpness of a lens is measured in line pairs per mm (lp/mm). Sharpness of an image is measured in line pairs per picture height (lp/ph).
Let's imagine using the very same hypothetical lens on a D500 and a D850. I suggest the D850 because it has a very similar pixel pitch to the D500. Now our hypothetical lens can resolve, on an optical test bed, an average of 50 line pairs per millimetre over the height of the image circle. So how many line pairs are being cast over the height of the sensor? Well, that depends on how high the sensor is. On the D500's DX sensor it will cast about 800 line pairs. That's because it is casting 50 lp/mm over about 16mm. Over the D850's FX sensor, which is about 24mm high, the very same lens casts about 1,200 line pairs.
So even before digitization, the image cast on the FX sensor has approximately 50% more line pairs per picture height. The digitization process will reduce the achieved resolution, but since the two sensors have similar pixel pitch the reduction on each sensor will be very similar.
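To make the arithmetic above concrete, here is a minimal sketch in Python. The function name is just something made up for illustration, and the 16mm and 24mm heights are the same rounded figures used above, not exact sensor dimensions.

```python
def line_pairs_per_picture_height(lens_lp_per_mm, sensor_height_mm):
    """Line pairs the lens casts over the sensor height, before digitization."""
    return lens_lp_per_mm * sensor_height_mm

# Hypothetical lens averaging 50 lp/mm, as in the example above.
dx = line_pairs_per_picture_height(50, 16)  # DX height rounded to 16mm
fx = line_pairs_per_picture_height(50, 24)  # FX height rounded to 24mm

print(dx, fx, fx / dx)
```

The ratio depends only on the two sensor heights, so any lens with the same average lp/mm over both formats gives the same 50% advantage to FX.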
Now, reality does get a little more complicated than this hypothetical example. One complication is that lenses tend to produce an image circle that gets decreasingly sharp as one approaches the edge of the circle from the middle. In the centre of the image circle, the lens will cast the same number of line pairs per millimetre on sensors of either size. But at the edge of the DX image circle the lens will be casting a higher number of line pairs per millimetre than it will at the edge of the FX image circle.
The number of line pairs cast over the image height will be about (image height in millimetres) x (average of centre and edge sharpness). If C is the centre sharpness of the lens in lp/mm, D is the sharpness at the DX edge and F is the sharpness at the FX edge, the number of line pairs cast over the FX sensor height will be better than 24mm x (C+F)/2 and the number of line pairs cast over the DX image height will be better than 16mm x (C+D)/2.
Using basic algebra we can see what condition would have to be true for the DX image to be sharper (have more line pairs per picture height).
Starting from 24 x (C+F)/2 < 16 x (C+D)/2 and simplifying, we want to see the relation of F and D where
12C + 12F < 8C + 8D
4C + 12F < 8D
C/2 + 3F/2 < D
So the DX image will only be sharper than the FX image when the DX edge sharpness in lp/mm is greater than the sum of half the centre sharpness plus three halves of the FX edge sharpness. There may be a few lenses where edge sharpness drops off this badly from DX edge to FX edge, but they are very rare.
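Here is a rough sketch of that condition in Python, in case anyone wants to plug in numbers for a lens. The function names and the sample values of C, D and F are invented for illustration and are not measurements of any real lens; the 16mm and 24mm heights are the rounded figures from above.

```python
def dx_edge_threshold(c, f):
    """DX edge sharpness (lp/mm) that D must exceed for DX to win: C/2 + 3F/2."""
    return c / 2 + 3 * f / 2

def dx_is_sharper(c, d, f):
    """Compare the estimated line pairs per picture height directly."""
    dx_lp_ph = 16 * (c + d) / 2
    fx_lp_ph = 24 * (c + f) / 2
    return dx_lp_ph > fx_lp_ph

# Made-up lens: 60 lp/mm centre, 50 at the DX edge, 30 at the FX edge.
print(dx_edge_threshold(60, 30), dx_is_sharper(60, 50, 30))

# Made-up lens with a severe fall-off: only 10 lp/mm at the FX edge.
print(dx_edge_threshold(60, 10), dx_is_sharper(60, 55, 10))
```

If you also assume the DX edge is never sharper than the centre (D no greater than C), the condition can only hold when F is less than a third of C, which is why such lenses are so rare.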
In general, FX images from a given lens are significantly sharper than DX images from the very same lens (which obviously has the same sharpness itself regardless of which sensor you put it in front of).
Sensor size has a significant effect on image sharpness. Another sensor parameter that affects image sharpness is pixel count.
Imagine a 6-pixel sensor (not a 6MP sensor, a sensor with just 6 pixels). It should be obvious that the highest number of line pairs per image height it can digitize is one. Imagine the opposite case: a sensor with an infinite number of pixels. It should be obvious that the highest number of line pairs per image height it can digitize is the number of line pairs cast on the sensor by the lens. In between these two extremes, the maximum lp/ph achievable is a convolution of the pixel count and the lens's analog resolution.
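The two extremes can be sketched as a simple upper bound in Python. This is only a ceiling, not the real combination of lens and sensor: it assumes one line pair needs at least two pixel rows, and it assumes the 6-pixel sensor is laid out 3 wide by 2 high. The function name and the 1,200 lp/ph lens figure are placeholders carried over from the earlier FX example.

```python
import math

def max_lp_per_picture_height(lens_lp_ph, pixel_rows):
    """Upper bound on digitized lp/ph: the lower of the lens and sensor limits."""
    sensor_limit = pixel_rows / 2  # assumption: two pixel rows per line pair
    return min(lens_lp_ph, sensor_limit)

# 6-pixel sensor, assumed to be 3 wide by 2 high, so 2 pixel rows.
print(max_lp_per_picture_height(1200, 2))

# Sensor with an infinite number of pixels: only the lens limits the result.
print(max_lp_per_picture_height(1200, math.inf))
```

Real sensors land somewhere below this ceiling, which is the in-between case described above.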
So both the size and pixel count of the sensor influence how sharp the resulting image will be.