Okay, I still don't completely understand you, but I think I pinpointed the problem.
Using your notation, $S$ is not really a section of the associated bundle, but rather the component representation of such. So the notation $\nabla_T S$ is either incorrect, or just requires a lot of interpretation.
To see an example, let $F$ be a real, $k$-dimensional vector space, and let the Latin indices $a, b, \dots$ range from $1$ to $k$.
There are actually several separate objects here. We have a principal fiber bundle $(P,\pi,M,G)$ and an associated vector bundle $(E,\pi,M)$ with $E=P\times_\rho F$, where $\rho:G\rightarrow GL(F)$ is the representation that defines the associated vector bundle.
The thing is, if $\psi:M\rightarrow E$ is a (smooth) section of $E$, then there exists a unique corresponding equivariant map $\phi:P\rightarrow F$, but the two are not the same object.
You should instead think about them like this: If $\{e_a\}$ is a local frame of $E$, then we have locally $\psi=\psi^ae_a$, where the components $\psi^a$ depend on two things: the manifold point $x\in M$, and the local frame $\{e_b(x)\}$ (at $x$) chosen to represent it, so $$ \psi^a=\psi^a(x,\{e_b(x)\}). $$
But if $E$ is an associated vector bundle to $P$, then, essentially, $P$ is an associated principal bundle to $E$, so $P$ can be identified with the frame bundle of $E$ (because a local frame $e_a$ provides a local trivialization of both $E$ and $P$, so local sections of $P$ can be understood as frames of $E$). Well, not the total frame bundle, but a reduction of the frame bundle to a $G$-subbundle.
So the point is that, because the components $\psi^a$ depend on points of $P$ rather than of $M$, together they form an $F$-valued (essentially $\mathbb{R}^k$-valued) function on $P$. But this function needs to be equivariant. Why? Because if $\Lambda^a_{\ b}(x)$ is a local frame transformation on $E$, then the components transform by $\Lambda^{-1}$. But a local frame transformation (acting on the model fiber $F$) is equivalent to the fibrewise transitive right action of $G$ on $P$, so $$ \psi^a(x,e(x)\Lambda(x))=(\Lambda^{-1})^a_{\ b}(x)\psi^b(x,e(x)).$$
Essentially, invariant fields are sections of $E$, while their component representations are realized, independently of any particular choice of basis, as equivariant functions on $P$. Equivariance guarantees that the components transform "properly" under frame transformations (frame transformation = group action on $P$).
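To see why the inverse has to appear, here is a short check that the combination $\psi^ae_a$ does not care which frame is used. The convention $e'_a=e_b\Lambda^b_{\ a}$ for the transformed frame and the primed symbols are assumptions made for this illustration: $$ \psi'^a e'_a=(\Lambda^{-1})^a_{\ c}\,\psi^c\,e_b\,\Lambda^b_{\ a}=\Lambda^b_{\ a}(\Lambda^{-1})^a_{\ c}\,\psi^c e_b=\delta^b_{\ c}\,\psi^c e_b=\psi^b e_b. $$ So the frame and the components transform oppositely, and only their contraction is an object living on $M$.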
With this discussed, let us consolidate the notation:
$\psi$ is the invariant section of $E$.
$\psi^a$ are components of it taken with respect to a local trivialization of either $E$ or $P$ (they agree).
$\phi$ is the equivariant function on $P$ satisfying $$ \phi(x,e(x))=(\psi^1(x,e(x)),\dots,\psi^k(x,e(x))), $$ where $(x,e(x))$ is a point of $P$.
What actually holds is that $$ \nabla_T\psi=(d(\sigma^*\psi^a)(T)+(\sigma^*\omega)^a_{\ b}(T)(\sigma^*\psi^b))e_a, $$ where $\omega$ is the connection form on $P$ (acting on $F$ through $\rho$) and $e_a$ is the frame determined by the local section (local trivialization) $\sigma$ of $P$.
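As a consistency check on this formula, here is a sketch of how the bracketed components behave when the local section is changed. The shorthand $A=\sigma^*\omega$ (a matrix-valued one-form, with $\rho$ suppressed), the column $\psi_\sigma=(\sigma^*\psi^a)$, and the convention $\sigma'=\sigma\Lambda$ are assumptions made for this illustration. With $\psi_{\sigma'}=\Lambda^{-1}\psi_\sigma$, $A'=\Lambda^{-1}A\Lambda+\Lambda^{-1}d\Lambda$ and $d(\Lambda^{-1})=-\Lambda^{-1}d\Lambda\,\Lambda^{-1}$, $$ d\psi_{\sigma'}+A'\psi_{\sigma'}=-\Lambda^{-1}d\Lambda\,\Lambda^{-1}\psi_\sigma+\Lambda^{-1}d\psi_\sigma+\Lambda^{-1}A\psi_\sigma+\Lambda^{-1}d\Lambda\,\Lambda^{-1}\psi_\sigma=\Lambda^{-1}\left(d\psi_\sigma+A\psi_\sigma\right). $$ The inhomogeneous terms cancel, so the components of the derivative transform exactly like the components of a section, and contracting them with the transformed frame $e'_a=e_b\Lambda^b_{\ a}$ gives the same $\nabla_T\psi$ in either frame.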
Since this got quite complicated and the otherwise simple essence gets lost: your "$\nabla_TS$" is an analogue of $\nabla_T X^\mu$ for tangent vectors, when in fact what interests you is not $\nabla_T X^\mu$ but $\nabla_TX=\nabla_T(X^\mu\partial_\mu)$.
You got the component representation of the covariant derivative, which is fine. It is only the notation that confuses you, because $\nabla_T$ should act not on the components but on the invariant section.
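To spell the tangent-bundle analogy out in a coordinate frame (the connection coefficients $\Gamma^\mu_{\ \nu\lambda}$ and their index placement are an assumed convention, not something fixed above): $$ \nabla_TX=T^\nu\left(\partial_\nu X^\mu+\Gamma^\mu_{\ \nu\lambda}X^\lambda\right)\partial_\mu. $$ What is usually written as "$\nabla_TX^\mu$" is shorthand for the coefficient $(\nabla_TX)^\mu$ in this expansion. Taken literally, as a derivative of the scalar function $X^\mu$, it would only give $T^\nu\partial_\nu X^\mu$ and miss the connection term.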