$
\newcommand{\AA}{\mathsf{A}}
\newcommand{\sO}{\mathcal{O}}
\newcommand{\fO}{O}
\newcommand{\str}{\{0,1\}^*}
\newcommand{\strn}{\{0,1\}^n}
\newcommand{\NN}{\mathbb{N}}
\newcommand{\adv}{\varepsilon_{\AA,\fO}}
\newcommand{\sP}{\mathcal{P}}
\newcommand{\SUCCESS}{\text{SUCCESS}_{\AA,\fO,x}}
\newcommand{\DEVIATION}{\text{DEVIATION}_{\AA,\fO,n}}
$
Let's go through the proof from [Y,IR] in a bit more detail since it is technical and subtle (I struggled a lot with it). The role of Borel-Cantelli lemma will become clear in the process, and is summarised at the end.
Random oracles. We consider (function) oracles $\fO:\str\to\str$ interpreted as an ensemble of oracles $\{O_1,O_2,\cdots\}$, where $\fO_n:\strn\to\strn$.
Let $\sO$, denote the set of all such oracles.
Since $|\sO|$ is uncountable, before talking about random oracles, we need to define what it means to randomly sample from a sample space that is uncountable.
To this end, one defines a probability measure.
Since there is a one-to-one correspondence$^*$ between $\sO$ and $[0,1)$, one can resort to Borel sets and Lebesgue measure: see this lecture note for more details.
Random oracles are one-way. Now, our goal is to show that random oracles are one-way in a very strong sense: for measure $1$ of random oracles $\fO$, $\fO$ is a one-way function (OWF), i.e.,
$$
\Pr_{\fO\leftarrow\sO}[\forall\AA\in\text{PPT}:\adv(\cdot)\text{ is negligible}]=1,
$$
where the advantage $\adv(\cdot)$ is defined as
$$
\adv(n):=\Pr_{\AA,x\leftarrow\strn}[\underbrace{\AA(\fO(x))\in\fO^{-1}(O(x))}_{\text{Event }\SUCCESS}],
$$
and it is negligible if
$$
\forall c\in\NN~\exists n_c\in\NN~\forall n>n_c:\adv(n)\geq1/n^c.
$$
We proceed as follows.
Let's first analyse the advantage with respect to a fixed adversary and input.
It can be shown by lazy sampling$^{**}$ that
$$
\forall\AA\in\text{PPT}~\forall n\in\NN~\forall x\in\strn:\Pr_{\fO\leftarrow\sO}[\SUCCESS]\leq n^a/2^n,
$$
where, for $a\in\NN$ (which depends on $\AA$), $n^a$ is the upper bound on $\AA$'s runtime.
Next, we bound the probability that $\AA$ deviates from expected behaviour.
To this end, let's define a bad event
$$
\DEVIATION:\adv(n)> n^{a+2}/2^n.
$$
It can be shown by applying Markov's inequality$^{**}$ that
$$
\forall\AA\in\text{PPT}\forall n\in\NN:\Pr_{\fO\leftarrow\sO}[\DEVIATION]\leq1/n^2.
$$
We are now ready to apply Borel-Cantelli lemma.
Since
$$
\sum_{n=1}^\infty\Pr_{\fO\leftarrow\sO}[\DEVIATION]\leq \sum_{n=1}^\infty1/n^2<\infty,
$$
by Borel-Cantelli lemma, we get that
$$
\forall\AA\in\text{PPT}:\Pr_{\fO\leftarrow\sO}[\DEVIATION \text{ occurs for infinitely-many } n \text{s}]=0.
$$
This means for each adversary $\AA$, we can fix a measure $0$ of "bad" oracles $\sO^*_\AA\subseteq\sO$.
Since the number of PPT Turing machines is countable (but infinite) and union of countable measure $0$ oracles is still measure $0$, we get that the set of all bad oracles $\sO^*=\cup_\AA\sO^*_\AA\subseteq\sO$ is still measure $0$.
Therefore, we can switch the order of the quantifiers to get:
$$
\Pr_{\fO\leftarrow\sO}[\forall\AA\in\text{PPT}:\DEVIATION \text{ occurs for infinitely-many } n \text{s}]=0.
$$
Finally, let's establish one-wayness.
The above equation is equivalent to
$$
\Pr_{\fO\leftarrow\sO}[\forall\AA\in\text{PPT}:\adv(n)>n^{a+2}/2^n \text{ for finitely-many } n \text{s}]=1.
$$
It follows that there exists a $n_\AA\in\NN$ such that $\forall n>n_\AA,\adv(n)\leq n^{a+2}/2^n$.
Since $n^{a+2}/2^n$ is a negligible function, and $\adv(n)$ grows slower than $n^{a+2}/2^n$ for all $n>n_\AA$, it follows that $\adv(n)$ is also negligible, which completes the proof.
To sum up, the Borel-Cantelli lemma is used to show that the bad event, $\DEVIATION$, does not occur infinitely-often, which then implies that it does not occur after a sufficiently large $n$, just like in the definition of negligible. This is key to establishing that the advantage is negligible.
$^*$This correspondence is easy to see for bit oracles: given a real number $r=0.r_0r_1\cdots\in[0,1)$, simply set the output $O(x)$ as $r_x$, the $x$-th bit of $r$. This can be extended to the function oracles as shown in [IR].
$^{**}$ See [IR,Y] for details.
[IR]: Impagliazzo and Rudich, Limits on the Provable Consequences of One-Way Permutations. STOC 1989.
[Y]: Yerukhimovich, A Study of Separations in Cryptography: New Results and Models, 2011, PhD Thesis