The problem has an "easy solution" using general techniques.
There are extensions of Yao's Millionaries problem about how to securely compute any circuit $C(x, y)$. This is an entire field of study, which usually goes by the name "Secure Computation" or "Multiparty Computation".
There exist circuits that can sort inputs that go by the name of sorting networks. While there are technically $O(n \log n)$ ones (AKS sorting network + Zigzag sort), they are technically complex, with bad constants, so it is almost always better in practice to use an $O(n (\log n)^2)$ sorting network of which there are many (say Batcher Odd-Even Mergesort).
This is to say your problem is solved by computing a sorting network (say Batcher Odd-Even Mergesort) using general 2 party computation techniques.
Looking at implementations of MPC protocols, it looks like this one by Boston University includes a relevant demo. In particular, they
- sum two input arrays, and then
- sort the result, using Bubblesort.
You can probably re-use some of this code, but will need to be somewhat careful.
The security guarantee of MPC is that
The MPC protocol only reveals $C(x,y)$, e.g. is as good as both participants giving their inputs $x, y$ to some trusted third party $T$, who then computes $C(x,y)$, and tells them the answer.
You could theoretically take the sample code above, swap the part that sums things to a concatenation, and go on your way. This will lead to a practically insecure scheme --- given $x$ and $C(x,y)$, it is trivial to recover $y$.
I would instead suggest have $C(x,y)$ compute (in the sorted representation of the concatenation $x||y$) a bit vector indicating where each party's elements should end up.
This still encodes all of the relevant order information, but hides each party's input from the other party (except for the order information, which "must leak").
This can be done by
- Mapping each element $t$ of $x$ to $2t$,
- Mapping each element $t$ of $y$ to $2t+1$,
- concatenating these transformed elements,
- sorting the concatenation,
- mapping $t\mapsto t\bmod 2$ at the end.
This circuit computes the function I described previously, and "breaks ties" between $x$ and $y$ by putting the element of $y$ later in the vector. This choice is arbitrary.
Of course, this is my idea of a sensible circuit $C$ to compute which may achieve your goals.
You may think it leaks too much information, and have some other circuit you want to compute.
You can still use the general techniques of MPC to do this --- provided you can specify the computation you want to do.
In the comments you mentioned a desire for a multi-party version of this.
This is immediate from the construction above, but I will discuss it explicitly anyway.
Let $P_0,\dots, P_{k-1}$ be parties with inputs $\vec x_0,\dots, \vec x_{k-1}$.
Define a circuit $C(x_0,\dots, x_{k-1})$ that
- for each $i$, applies the function $t\mapsto kt + i$ to each coordinate of $x_i$,
- concatenates all of the results together
- sorts the results, for example with a sorting network,
- computes the function $t\mapsto t\bmod k$ on each element of the output.
The result will be a vector with elements in $\{0,\dots,k-1\}$.
For simplicity, say that the result is $[0,3,2,2,1,0,1,3]$.
This would indicate to all 4 parties that the relative order of their inputs is
- party 0's smallest element
- party 3's smallest element
- party 2's smallest element
- party 2's largest element
- party 1's smallest element
- party 0's largest element
- party 1's largest element
- party 3's largest element.
If one uses an appropriate MPC scheme (the words to look up are "Maliciously Secure Multiparty Computation"), then this is provably all that an adversary will learn, modulo some technical assumptions. These assumptions depend on the particular scheme you use, but they are typically something like
- not too many participants in the protocol "collude to break the protocol",
- the adversaries are computationally efficient, and
- perhaps there is some "trusted setup" phase.