My question concerns the proof of the continuous mapping theorem for convergence in probability (see link).
To make the post self-contained, I will provide the necessary definitions here.
Defn 1: Convergence in Probability
The sequence of random variables $\{T_n\}_{n\geq 1}$ converges in probability to $T$ (possibly a random variable) if and only if \begin{align*} \forall \epsilon>0:\lim_{n\rightarrow\infty}Pr(\|T_n-T\|\leq \epsilon)=1, \end{align*} or equivalently \begin{align*} \forall \epsilon>0:\lim_{n\rightarrow\infty}Pr(\|T_n-T\|> \epsilon)=0. \end{align*} We denote this by $T_n\xrightarrow[]{p}T$, $\text{plim}_{n\rightarrow\infty}T_n=T$ or $T_n-T=o_p(1)$.
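As a numerical illustration of this definition (the sequence $T_n = T + Z/n$ with $Z\sim N(0,1)$ and the function `estimate_tail_prob` are my own hypothetical example, not part of the linked proof), a Monte Carlo sketch can show the tail probability $Pr(\|T_n-T\|>\epsilon)$ shrinking as $n$ grows:

```python
import random

def estimate_tail_prob(n, eps=0.1, trials=10_000, seed=0):
    """Monte Carlo estimate of Pr(|T_n - T| > eps) for the toy
    sequence T_n = T + Z/n with Z ~ N(0, 1), so T_n - T = Z/n."""
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        z = rng.gauss(0.0, 1.0)
        if abs(z / n) > eps:
            count += 1
    return count / trials

# The estimated tail probability decreases as n grows, consistent
# with T_n -> T in probability.
print(estimate_tail_prob(1), estimate_tail_prob(100))
```

For this toy sequence the estimate is large at $n=1$ and essentially zero at $n=100$, matching the second displayed limit in the definition.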
Defn 2: Continuity
Let $(X,d)$ and $(Y,\rho)$ be two metric spaces, and let $f:X\mapsto Y$ be a function. The function $f$ is called continuous at $x\in X$ if for every $\epsilon>0$ there exists $\delta>0$ such that $d(y,x)<\delta\implies \rho(f(y),f(x))<\epsilon$. If $f$ is continuous at all $x\in X$, then we say $f$ is continuous on $X$.
Theorem: Continuous Mapping Theorem
Let $\{X_n\}$, $X$ be random elements defined on a metric space $S$. Suppose a function $g:S\mapsto S'$ (where $S'$ is another metric space) has the set of discontinuity points $D_g$ such that $Pr(X\in D_g)=0$. Then \begin{align*} X_n\xrightarrow[]{(\cdot)}X\implies g(X_n)\xrightarrow[]{(\cdot)}g(X) \end{align*} where $\xrightarrow[]{(\cdot)}$ can be either convergence in probability or convergence in distribution or convergence almost surely.
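To make the hypothesis $Pr(X\in D_g)=0$ concrete, here is a hypothetical simulation of my own (not from the linked proof): $g$ is a sign-like function discontinuous only at $0$, $X=1$ is a constant (so $Pr(X\in D_g)=Pr(X=0)=0$), and $X_n = 1 + Z/n$ with $Z\sim N(0,1)$, so $X_n\xrightarrow[]{p}X$:

```python
import random

def g(x):
    # Discontinuous at 0, continuous everywhere else.
    return 1.0 if x > 0 else -1.0

def tail_prob_g(n, eps=0.5, trials=10_000, seed=1):
    """Monte Carlo estimate of Pr(|g(X_n) - g(X)| > eps) where
    X = 1 (a constant, so Pr(X in D_g) = 0) and X_n = 1 + Z/n."""
    rng = random.Random(seed)
    x = 1.0
    count = 0
    for _ in range(trials):
        xn = x + rng.gauss(0.0, 1.0) / n
        if abs(g(xn) - g(x)) > eps:
            count += 1
    return count / trials

# The tail probability for g(X_n) also shrinks as n grows,
# as the theorem asserts for convergence in probability.
print(tail_prob_g(1), tail_prob_g(100))
```

At $n=1$ a sizable fraction of draws land below $0$ and flip the sign of $g$, while at $n=100$ essentially none do, illustrating $g(X_n)\xrightarrow[]{p}g(X)$.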
Question
In the proof, it says
The second term converges to zero as $\delta\rightarrow 0$, since the set $B_\delta$ shrinks to an empty set.
I feel this statement is correct, but it is not the reason that $Pr(X\in B_\delta)=0$. Instead, the correct explanation should use the property of continuity. In particular, since $Pr(|X_n-X|\geq \delta)\rightarrow 0$ for any value of $\delta$ and $g(\cdot)$ is continuous at $X$, for all $\epsilon>0$ there exists $\delta_\epsilon$ such that $|X_n-X|<\delta_\epsilon\implies|g(X_n)-g(X)|<\epsilon$. By this event containment, $Pr(|X_n-X|<\delta_\epsilon)\leq Pr(|g(X_n)-g(X)|<\epsilon)$. Since $\lim_{n\rightarrow\infty}Pr(|X_n-X|<\delta_\epsilon)=1$, we have $\lim_{n\rightarrow\infty}Pr(|g(X_n)-g(X)|<\epsilon)=1$ (I have essentially proved the claim here). This also means $\lim_{n\rightarrow\infty}Pr(|g(X_n)-g(X)|>\epsilon)=0$, and I suspect it is possible to use this fact to bound $\lim_{n\rightarrow \infty}Pr(X\in B_{\delta_\epsilon})$. Overall, I don't understand why we need to define this $B_\delta$ object.
I have edited the last bit, as the previous use of notation was inaccurate.