>>4x CPU power only results in 2x speed

Possibly because hyperthreading is in the mix? Just guessing. HT has almost zero benefit in tasks like JPEG encoding, and as I recall either the application has to mask the HT cores or the OS masks them for the app. I forget exactly how that works.

Video encoding is highly scalable because sequential frame rendering can be assigned sequential threads. You'd think JPEG batch encoding would be just as efficient with many images to process, and there's *less* data to share than video encoding in the time domain.
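To make the batch idea concrete, here is a minimal Python sketch of handing independent images to a pool of workers. It is only an illustration, not how LR or any real encoder does it: `encode_one` and `encode_batch` are made-up names, and `zlib.compress` is a stand-in for an actual JPEG encoder so the example runs with the standard library alone.

```python
import os
import zlib
from concurrent.futures import ProcessPoolExecutor


def encode_one(raw: bytes) -> bytes:
    # Stand-in for a JPEG encoder (assumption): CPU-bound work on one
    # image that shares no data with any other image in the batch.
    return zlib.compress(raw, 9)


def encode_batch(images, workers=None, executor_cls=ProcessPoolExecutor):
    # os.cpu_count() reports logical CPUs, so a 4-core chip with HT
    # shows up as 8 here.
    workers = workers or os.cpu_count() or 1
    with executor_cls(max_workers=workers) as pool:
        # map() keeps the results in the same order as the inputs.
        return list(pool.map(encode_one, images))
```

When using the default process pool, call `encode_batch(list_of_image_bytes)` from under an `if __name__ == "__main__":` guard. Since each image is independent, the only shared resources are disk and memory bandwidth, which is why you would expect this kind of job to scale well.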

H265 is rumored to speed a lot of this up, and should be more efficient than the dated JPEG spec.

The OP's CPU is a very fast 4-core CPU with HT, which can run 8 threads concurrently. It is true that on early HT CPUs (P4s and Xeons from circa 2002) the advantages were not that obvious, and turning HT on actually slowed down some workloads a bit.

On modern chips like the OP's, when running a mix of number crunching and some I/O, an HT CPU acts *mostly* as if it had twice as many cores, so 8 in his case. The obvious exception is a 100% CPU-bound task with no I/O or wait time at all. ("The entire pipeline of the Nehalem-based processor core is set up to recognize 2 separate streams of instructions (one for each hardware thread)." -- Intel)

In theory, if LR scaled linearly, the OP would benefit from running as many as 8 LR export tasks concurrently, provided his disks and available RAM can keep up with the load.
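For a rough feel of what "not scaling linearly" means, here is a back-of-the-envelope sketch using Amdahl's law. It rests on assumptions that are mine, not the OP's measurements: that "4x CPU power" means 4 workers, that the export fits the simple Amdahl model, and that disk and RAM are not the bottleneck. The function names are made up for the example.

```python
def parallel_fraction(speedup: float, n: int) -> float:
    # Solve Amdahl's law S = 1 / ((1 - p) + p / n) for p,
    # the fraction of the work that runs in parallel.
    return (1 - 1 / speedup) / (1 - 1 / n)


def amdahl_speedup(p: float, n: int) -> float:
    # Predicted speedup with n workers when a fraction p is parallel.
    return 1 / ((1 - p) + p / n)


# Reported observation: 4x the CPU gives only 2x the speed.
p = parallel_fraction(2.0, 4)
print(round(p, 3))                     # 0.667
print(round(amdahl_speedup(p, 8), 2))  # 2.4
```

Under those assumptions only about two thirds of the export work is running in parallel, and going from 4 to 8 threads inside a single export would move the speedup from 2x to about 2.4x. That is the argument for running several independent export tasks at once instead of relying on one task to use all 8 threads.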

Regarding parallel JPEG encoding, a 2010 IEEE paper describes a 7x speedup using parallel encoding without quality degradation, plus a method to go beyond that limit, with speedups of up to 18.8x relative to sequential processing:

