The method in the (revised) question is (mostly†) OK. That is, $A_i$ calculates the shared key as $Y_i^{a_i^{-1}\bmod p}$, where the exponentiation is in the group $G$ of order $p$. And (assuming no alteration of messages) every party including $B$ gets the same shared key, because
$$\begin{align}
Y_i^{a_i^{-1}\bmod p}&={X_i}^{b\,(a_i^{-1}\bmod p)}\\
&=g^{a_i\,(b\,(a_i^{-1}\bmod p))}\\
&=g^{b\,(a_i\,(a_i^{-1}\bmod p))}\\
&=g^{b\,(k\,p+1)}&&\text{for some }k\in\mathbb Z\\
&=g^{p\,k\,b+b}\\
&=(g^p)^{k\,b}\,g^b\\
&=1^{k\,b}\,g^b&&\text{because the order of }g\text{ is }p\\
&=g^b
\end{align}
$$
Given that the initial question used two primes, it's possible that a confusion is made between a prime modulus $n$ used to compute in the group $G$, and the prime order $p$ (formerly $q$) of group $G$ and it's generator $g$. That would explain why numerical experiments fail.
Among many others, our options include:
- Making $G$ an unspecified abstract group of order $p$, and not making numerical experiments.
- Making $G$ the subgroup of quadratic residues of $\mathbb Z_n^*$ for $n$ a safe prime, which has prime order $p$ the matching Sophie Germain prime with $p=(n-1)/2$. A toy (unsafe) example is $n=83$, $p=41$. An actual example of such group is the 2048-bit MODP group of RFC 3526.
- Making $G$ a Schnorr group, that is a subgroup $G$ of prime order $p$ of $\mathbb Z_n^*$. Then $p$ must be a prime divisor of $n-1$. This extends the above, allowing smaller $p$ thus faster operation. An actual example of such group is the 2048-bit MODP group with 256-bit prime order subgroup of RFC 5114.
- Using an elliptic curve of prime order $n$ on the prime field $\mathbb F_n$. An actual example would be secp256r1 (except using the $p$ there as our $n$ and the $n$ there as our $p$), with the group operation described in sec1.
† It fails when one of the $a_i$ is $p$, but that has vanishing probability $1-(1-1/p)^n$.