There are many algorithms that can combine many low-res images into a single hi-res image.
(This one was announced 15 years ago: https://users.soe.ucsc.edu/~milanfar/publications/journal/SR-challengesIJIST.pdf )

I think that Silkypix just uses a newer and better hi-res algorithm than ACR.

But that's not what happens to a HiRes raw file at the raw converter stage. The camera internally merges the subframes using a fixed algorithm (perhaps similar to what's discussed in the linked paper) and outputs an ordinary-looking Bayer-style raw file, albeit a much larger one constructed from the subsampling of each normal-sized pixel position. The raw converters are not called upon to do anything different with one of these HiRes raws than they do with normal raws.

The hi-res mode takes 8 Bayer shots (equivalent to 2 RGB shots). The key point is that the pixel size is still the same as in a normal shot. When all the shots are stacked together, the pixels overlap, so we need a good algorithm to separate the information in the overlapping pixels.
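
To illustrate the overlap, here is a rough 1-D sketch in Python. It is a toy model, not what any camera actually does: the helper names (`capture`, `interleave`, `unmix`), the half-pixel shift, and the assumption that the first cell's value is known are all made up for illustration. Each full-size pixel averages two adjacent half-pixel cells, so the stacked result is a blurred version of the fine detail, and something has to undo that mixing.

```python
def capture(fine, offset):
    """One shot: each full-size pixel averages two adjacent half-pixel cells.
    `offset` is the shift in half-pixel units (0 = unshifted, 1 = shifted)."""
    return [(fine[i] + fine[i + 1]) / 2 for i in range(offset, len(fine) - 1, 2)]


def interleave(shot_a, shot_b):
    """Stack the two shots onto one grid with half-pixel spacing."""
    stacked = []
    for a, b in zip(shot_a, shot_b):
        stacked += [a, b]
    return stacked


def unmix(stacked, first_cell):
    """Separate the overlapped samples, assuming the first cell is known
    (a hypothetical boundary condition): f[k+1] = 2*g[k] - f[k]."""
    cells = [first_cell]
    for g in stacked:
        cells.append(2 * g - cells[-1])
    return cells


fine = [0, 0, 8, 8, 0, 0, 4, 4, 0]   # detail at half-pixel resolution
shot_a = capture(fine, 0)            # normal position
shot_b = capture(fine, 1)            # shifted by half a pixel
stacked = interleave(shot_a, shot_b)

print(stacked)                  # twice the samples, but edges are smeared
print(unmix(stacked, fine[0]))  # overlap undone, matches `fine`
```

This simple recursion would amplify noise badly on real data, which is one reason a more careful algorithm is needed.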

Yes, that's understood and implicit in my reference to "subsampling of each normal-sized pixel position." What you're not addressing is my point that the "good algorithm" you're referencing is applied in camera during the construction of the Bayer-style raw file, not later in the raw converter. Consider this: when the S1R generates a HiRes image, does it output 8 individual/interim raw image files or just one? If it were the former, then you would be correct that the raw converter would need a built-in capability for handling the 8 samples per subpixel position. But in fact it's the latter, which means that the camera has already done the heavy algorithmic lifting of merging the subsamples into a specific R, G1, G2 or B value for each of the subpixels. This allows the raw converter of choice to simply see the raw file as a normal Bayer-style RGGB raw file.
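
As a rough illustration of that last point, here is a small Python sketch of the converter's side. It assumes an RGGB layout with R at the top-left, and `split_rggb` is a made-up helper, not any real converter's API. The same code handles a normal-sized mosaic and a larger HiRes one without knowing or caring how the values were produced.

```python
def split_rggb(mosaic):
    """Split a Bayer mosaic (list of rows) into its four colour planes.
    Assumes an RGGB pattern with R at row 0, column 0."""
    return {
        "R":  [row[0::2] for row in mosaic[0::2]],
        "G1": [row[1::2] for row in mosaic[0::2]],
        "G2": [row[0::2] for row in mosaic[1::2]],
        "B":  [row[1::2] for row in mosaic[1::2]],
    }


# Stand-in data: a "normal" 4x4 mosaic and a "HiRes" 8x8 mosaic.
normal = [[r * 4 + c for c in range(4)] for r in range(4)]
hires = [[r * 8 + c for c in range(8)] for r in range(8)]

planes_normal = split_rggb(normal)
planes_hires = split_rggb(hires)

print(len(planes_normal["R"]), len(planes_normal["R"][0]))  # plane size, normal
print(len(planes_hires["R"]), len(planes_hires["R"][0]))    # plane size, HiRes
```

The only thing that differs between the two calls is the size of the input; the pixel-shift merging has no counterpart here.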

