I don't see anywhere on that page that describes oblivious transfer in the way you described.
Oblivious transfer:
- Alice holds two strings $x_0, x_1$
- Bob holds a bit $b$
- Bob learns $x_b$ but learns nothing about $x_{1-b}$
- Alice learns nothing about $b$
Garbled circuit:
- When Alice garbles a boolean circuit $f$, she obtains a big collection of ciphertexts $F$ ("the garbled circuit") and she also learns two keys (labels) on each wire of the circuit. On wire #$i$ she learns $W_i^0$ which represents false on that wire, and $W_i^1$ which represents true on that wire.
- If Bob knows the garbled circuit $F$ and for each input wire $i$ of the circuit he knows exactly one of $\{ W_i^0, W_i^1\}$ then he can evaluate the garbled circuit to learn exactly one label on each wire of the circuit (not just the input wires). He learns the "correct" wire label (i.e., the one corresponding to the behavior of $f$) on each wire, according to which of the input labels he starts with.
- As Bob evaluates the garbled circuit, he does not learn whether he is holding $W_i^0$ or $W_i^1$ for any wire $i$. In other words, he doesn't know whether any wire in the circuit carries true or false. He evaluates the circuit "blindly."
If Alice & Bob want to evaluate a function using garbled circuits, then Alice garbles the circuit and sends the garbled circuit $F$ to Bob.
But Bob needs to also learn, for each input wire $i$ in the circuit, exactly one of $\{ W_i^0, W_i^1\}$.
Both Alice & Bob have inputs to this computation.
If wire #$i$ is one of Alice's input wires, then she can just send the "correct" one from $\{W_i^0, W_i^1\}$, since she knows both of these and she knows her input bit on that wire.
If wire #$i$ is one of Bob's input wires, there must be a way for Bob to choose to learn exactly one of $\{W_i^0, W_i^1\}$.
Since Bob chooses between these values according to his private input, Alice should not learn which one he chose.
Oblivious transfer is used for exactly this step.
For input wire #$i$ belonging to Bob's input, Alice provides $W_i^0$ and $W_i^1$ to an instance of oblivious transfer, and Bob chooses which of these to learn.
There are variants of oblivious transfer where Alice has $N$ strings and Bob chooses one of them to learn.
If Alice & Bob want to compute a very simple function $f(a,b)$, where there are only $N$ possible inputs for Bob ($b \in \{1, \ldots, N\}$), then yes they can use oblivious transfer to compute the function directly like you suggest, without any garbled circuits.
Alice uses $f(a,1), f(a,2), \ldots, f(a,N)$ as inputs to oblivious transfer.
Bob chooses to learn the $b$th of these, which is $f(a,b)$.
This only works when $N$ is very small, because it requires communication and computation proportional to $N$.
Remember that $N$ is the number of possible inputs for Bob.
If there is a circuit where Bob has $k$ input wires, then he has $N=2^k$ possible inputs.
Hence, usually Bob has too many inputs to use this approach.
Response to edited post:
Your approach for the millionaire's problem is not exactly clear, so I have to guess at a few things.
It doesn't work to reveal partial information about the inputs during the protocol.
If Alice has input 1, she shouldn't be able to distinguish between Bob having input 2 vs input $2^{20}$ -- but these two inputs would look very different if the outputs of the comparisons were revealed in the clear.
With this in mind, you suggest encrypting things somehow. But you are essentially proposing a binary search. A binary search requires you to know the result of the previous comparisons to decide which comparison to make next. If you are encrypting the result of the comparisons then it is not clear how to proceed in the next rounds.
A big problem with your proposal is that it is inherently sequential. The parties can't know their inputs to the next round until receiving output from the previous round.
So for $k$ comparisons you need $\Theta(k)$ rounds of communication.
Compare this to a GC-based approach which always takes a constant number of communication rounds.
I think you are onto a good conceptual connection, though.
Imagine the function $f$ is very small, so that it's possible to write down the entire truth table of $f$.
Let's say $f : \{1,2,3,4\}^2 \to \{0,1\}$.
Alice chooses random encryption keys $A_1, \ldots, A_4$ and $B_1, \ldots, B_4$.
She also encrypts the truth table of $f$ as
$$\textsf{Enc}( A_1 \| B_1, f(1,1)), ~~
\textsf{Enc}( A_1 \| B_2, f(1,2)), ~~\ldots,
\textsf{Enc}( A_i \| B_j, f(i,j)), ~~\ldots$$
To evaluate $f$ on their private inputs, Alice can send all of these ciphertexts to Bob.
She can send the correct $A_i$ value, and they can use 1-out-of-4 OT to let Bob learn the correct $B_j$ value.
Now Bob can decrypt exactly one of these ciphertexts, which will contain the correct output of $f$.
Garbled circuits is just what you get when you take this simple construction, but instead of encrypting the final function output, you encrypt keys to another gadget of the same type.