I would assume that for quantum computer to execute one step executing the Shor's algorithm would need to pass (logically) one input through all those gates
You misunderstand what is being measured by 'gate operations'. The Quantum Computer wouldn't have $2.9 \times 10^7$ gates (with all the data sent through that set of gates repeatedly).
Instead, the Quantum Computer would need to perform a total of $2.9 \times 10^7$ gate operations; obviously, there is no need to perform them all simultaneously (and in fact, with Shor's, we can't, both because the no-cloning theorem prohibits generating copies of qubits to send to independent gates, and for the more pragmatic reason that the inputs of some gate operations depend on the outputs of previous gate operations).
As for how these $2.9 \times 10^7$ gate operations are mapped to hardware gates, well, it is quite unlikely that we will have $2.9 \times 10^7$ physical gates; some hardware gates are likely to be reused multiple times during the course of the computation (just like, when a classical computer performs an RSA operation, the same gates are reused to implement the various modular multiply operations).
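To make the classical analogy concrete, here is a minimal Python sketch of square-and-multiply modular exponentiation that counts how many times a single multiply routine is invoked. The function name `modexp_counting` and the toy parameters are illustrative assumptions, not anything from the question; the point is only that the number of operations performed is much larger than the number of distinct units performing them.

```python
def modexp_counting(base, exponent, modulus):
    """Left-to-right square-and-multiply.

    Returns (result, calls), where calls is the number of times the one
    shared modmul routine was invoked.
    """
    calls = 0

    def modmul(a, b):
        # The single "hardware unit" that every step reuses.
        nonlocal calls
        calls += 1
        return (a * b) % modulus

    result = 1
    for bit in bin(exponent)[2:]:
        result = modmul(result, result)      # square for every exponent bit
        if bit == "1":
            result = modmul(result, base)    # multiply when the bit is set
    return result, calls


if __name__ == "__main__":
    value, calls = modexp_counting(7, 65537, 3233)
    print(value, calls)
```

One multiplier is used many times in sequence here; the count of gate operations in a quantum circuit relates to the count of physical gates in the same way.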
And if you need any error correction between the gates, that will require extra space and hence increase latency, too.
Yes, we know; the $2.9 \times 10^7$ figure above counts operations on logical qubits; each logical qubit would translate to some larger number of physical qubits. The size of the increase would depend on the quantum error correction code used (which would depend on, among other things, the actual error rate of the physical qubit operations).
How many guesses on average you would actually need for factoring numbers used in 2048 bit RSA?
With extremely high probability, one. The Quantum Computer finds the order of $g$ modulo $n$, that is, the smallest positive value $x$ where $g^x \equiv 1 \bmod n$. Unless the order of $g$ with respect to both $p$ and $q$ (the prime factors) is anomalously small (which can be shown to happen only with tiny probability if $g$ was selected randomly), that value of $x$ can be used to quickly factor $n$.
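For illustration, here is a minimal Python sketch of the textbook classical post-processing step: given the order $x$ of $g$, compute $\gcd(g^{x/2} - 1, n)$. This is the simple even-order variant, which fails more often than the procedure alluded to above, and the brute-force `order` function is only a stand-in for the quantum step on toy-sized $n$. All names here are hypothetical.

```python
from math import gcd


def order(g, n):
    """Brute-force multiplicative order of g modulo n (toy stand-in for the quantum step)."""
    if gcd(g, n) != 1:
        raise ValueError("g must be coprime to n")
    x, acc = 1, g % n
    while acc != 1:
        acc = (acc * g) % n
        x += 1
    return x


def factor_from_order(n, g, x):
    """Try to split n given x with g^x = 1 (mod n).

    Returns a pair of nontrivial factors, or None if this g is unlucky
    for the simple even-order method.
    """
    if x % 2 != 0:
        return None                 # odd order: this variant gives up
    y = pow(g, x // 2, n)           # a square root of 1 modulo n
    if y == n - 1:
        return None                 # trivial root -1: no information
    p = gcd(y - 1, n)
    if 1 < p < n:
        return p, n // p
    return None


if __name__ == "__main__":
    n, g = 15, 7
    print(factor_from_order(n, g, order(g, n)))
```

For $n = 15$ and $g = 7$ the order is 4, so $y = 7^2 \bmod 15 = 4$ and $\gcd(3, 15) = 3$ splits $n$.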