You’re perfectly justified in being confused. The notation isn’t great, and this is made worse by the fact that everything is happening in $\Bbb{R}^3$, so it is very easy to conflate a vector, its components with respect to one basis, and its components with respect to a different basis. The unnatural use of the cross product doesn’t help either (I know you didn’t object to this, but it was confusing, at least for me, when I was learning). Let us therefore speak in general terms, so as to limit the chances for such confusion. Also, for the most part I won’t talk about two systems; I’ll just talk about one and do things generally, so that for the second you just apply the analogous equations.
Let $V$ be an $n$-dimensional real (or complex) vector space. Let $I\subset\Bbb{R}$ be an open interval, and $\xi_1,\dots,\xi_n:I\to V$ be differentiable maps such that for each $t\in I$, $(\xi_i(t))_{i=1}^n$ is an ordered basis for $V$; we may write this as simply $\xi(t)$. Now, let us define an operator $\frac{d_{\xi}}{dt}$:
Definition 1. (‘Time derivative relative to $\xi$’)
With the notation $V,\xi$ as above, for each differentiable map $f:I\to V$, let us first denote the components of $f$ relative to the $\xi$ basis as $f_{\xi}^1,\dots,f_{\xi}^n:I\to\Bbb{R}$ (or $\Bbb{C}$ in the complex case), meaning that for each $t\in I$, we have $f(t)=f_{\xi}^i(t)\xi_i(t)$ (Einstein summation used). Next, we shall define the operator $\frac{d_{\xi}}{dt}$ as follows: its action on a function $f$ is defined to be
\begin{align}
\frac{d_{\xi}f}{dt}\bigg|_t:=\frac{d f_{\xi}^i}{dt}\bigg|_t\,\xi_i(t).
\end{align}
The motivation for this definition is that someone who “moves along with the basis $\xi$” sees it as fixed, so this is how they would describe the derivative of $f$. In general, this is not the same as the ordinary derivative $f'=\frac{df}{dt}$ (see below).
Next comes the definition of (generalized) angular velocity.
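To make the distinction concrete, here is a small illustration (the specific basis is my own choice for this example, not something taken from your post). Take $V=\Bbb{R}^2$, $\xi_1(t)=(\cos t,\sin t)$, $\xi_2(t)=(-\sin t,\cos t)$, and $f(t)=\xi_1(t)$. Then the components $f_{\xi}^1=1$ and $f_{\xi}^2=0$ are constant, so
\begin{align}
\frac{d_{\xi}f}{dt}\bigg|_t=0\cdot\xi_1(t)+0\cdot\xi_2(t)=0,
\qquad\text{whereas}\qquad
\frac{df}{dt}\bigg|_t=(-\sin t,\cos t)=\xi_2(t)\neq 0.
\end{align}
An observer rotating with $\xi$ sees $f$ as stationary, even though $f$ is moving.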
Definition 2. (Angular velocity endomorphism)
With $V,\xi$ defined as above, note that for each $t\in I$, we have the velocity vectors $\dot{\xi_i}(t)\equiv\frac{d\xi_i}{dt}(t)$ for $i\in\{1,\dots, n\}$. So by linear algebra there is a unique linear transformation $V\to V$ which sends the basis elements $\xi_1(t),\dots,\xi_n(t)$ to $\dot{\xi_1}(t),\dots, \dot{\xi_n}(t)$. Let us denote this unique linear transformation $\Omega^{(\xi)}(t):V\to V$, and call this the angular velocity at time $t$ of the basis $\xi$.
If there is no chance for confusion, we shall simply write $\Omega$ instead of $\Omega^{(\xi)}$. So, to emphasize, the defining property of $\Omega^{(\xi)}$ is that for each $t\in I$ and each $i\in\{1,\dots, n\}$,
\begin{align}
\frac{d\xi_i}{dt}\bigg|_t\equiv\dot{\xi_i}(t)&=\Omega^{(\xi)}(t)[\xi_i(t)].
\end{align}
Also, the only reason I put “generalized” in parentheses is that so far I have not restricted myself to $3$ dimensions, I have not assumed an inner product on $V$, and I have not assumed that the basis is orthonormal.
With these two definitions in place, we see that for any differentiable function $f:I\to V$ and each $t\in I$,
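As a sanity check on Definition 2, consider (purely as an assumed example) the rotating basis $\xi_1(t)=(\cos t,\sin t)$, $\xi_2(t)=(-\sin t,\cos t)$ of $\Bbb{R}^2$. Differentiating gives $\dot{\xi_1}=\xi_2$ and $\dot{\xi_2}=-\xi_1$, so $\Omega(t)$ is the linear map determined by
\begin{align}
\Omega(t)[\xi_1(t)]=\xi_2(t),\qquad \Omega(t)[\xi_2(t)]=-\xi_1(t),
\end{align}
whose matrix relative to the standard basis of $\Bbb{R}^2$ is $\begin{pmatrix}0&-1\\1&0\end{pmatrix}$ for every $t$.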
\begin{align}
\frac{df}{dt}\bigg|_t&=\frac{d}{dt}\bigg|_t\left(f_{\xi}^i\,\xi_i\right)\\
&=\frac{df_{\xi}^i}{dt}\bigg|_t\xi_i(t)+f_{\xi}^i(t)\frac{d\xi_i}{dt}\bigg|_t\tag{product rule}\\
&=\frac{d_{\xi}f}{dt}\bigg|_t+f_{\xi}^i(t)\,\Omega^{(\xi)}(t)[\xi_i(t)]\tag{definitions 1,2}\\
&= \frac{d_{\xi}f}{dt}\bigg|_t+\Omega^{(\xi)}(t)[f_{\xi}^i(t)\xi_i(t)]\tag{$\Omega^{(\xi)}(t)$ is linear}\\
&= \frac{d_{\xi}f}{dt}\bigg|_t+\Omega^{(\xi)}(t)[f(t)]
\end{align}
This gives us our sought-after equation (omitting the $t$ from the notation):
\begin{align}
\frac{df}{dt}&=\frac{d_{\xi}f}{dt}+\Omega^{(\xi)}(\cdot)[f(\cdot)].\tag{$*$}
\end{align}
This equation tells us how the derivative of an arbitrary function $f:I\to V$ can be expressed as a sum of two parts: the first part is the “naive interpretation by an otherwise oblivious observer” and the second part “fixes the observer’s blunder”. So, really, $(*)$ is nothing but the product rule plus notation/definition.
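For instance, one can check $(*)$ by hand in a simple case (an example of my own choosing). In $\Bbb{R}^2$ take $\xi_1(t)=(\cos t,\sin t)$ and $\xi_2(t)=(-\sin t,\cos t)$, for which $\dot{\xi_1}=\xi_2$ and $\dot{\xi_2}=-\xi_1$, i.e. $\Omega[\xi_1]=\xi_2$ and $\Omega[\xi_2]=-\xi_1$. Let $f(t)=(1,0)$ be constant, so that $f=\cos t\,\xi_1-\sin t\,\xi_2$. Then
\begin{align}
\frac{d_{\xi}f}{dt}&=-\sin t\,\xi_1-\cos t\,\xi_2,\\
\Omega[f]&=\cos t\,\Omega[\xi_1]-\sin t\,\Omega[\xi_2]=\cos t\,\xi_2+\sin t\,\xi_1,
\end{align}
and the two terms add up to $0=\frac{df}{dt}$, as they must.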
Special Cases.
Now let’s talk about some special cases.
The first and most important is of course if $\frac{d\xi_i}{dt}=0$ identically for each $i$, because then $\Omega^{(\xi)}(t)=0$ for all $t\in I$. In words, if the basis is not changing in time, then it has no angular velocity. As a result, from $(*)$, we have $\frac{df}{dt}=\frac{d_{\xi}f}{dt}$. In particular, if we now have two bases, $e$ and $\xi$, where $e$ is not time-dependent, then
\begin{align}
\frac{df}{dt}&=\frac{d_ef}{dt}=\frac{d_{\xi}f}{dt}+\Omega^{(\xi)}(\cdot)[f(\cdot)].
\end{align}
Another special case to keep in mind is when $V=\Bbb{R}^3$ with its standard inner product, and the $\xi_i(t)$ are orthonormal for each $t$. In this case, the linear map $\Omega^{(\xi)}(t):\Bbb{R}^3\to\Bbb{R}^3$ is skew-adjoint (differentiate $\langle\xi_i,\xi_j\rangle=\delta_{ij}$ to get $\langle\Omega^{(\xi)}\xi_i,\xi_j\rangle+\langle\xi_i,\Omega^{(\xi)}\xi_j\rangle=0$), meaning that it has a skew-symmetric matrix representation relative to the standard basis. Hence there is a unique vector $\omega^{(\xi)}(t)\in\Bbb{R}^3$, the angular velocity vector, such that $\Omega^{(\xi)}(t)[v]=\omega^{(\xi)}(t)\times v$ for all $v\in\Bbb{R}^3$. In this case, equation $(*)$ becomes
\begin{align}
\frac{df}{dt}=\frac{d_{\xi}f}{dt}+\omega^{(\xi)}\times f.
\end{align}
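Explicitly, writing $\omega=(\omega_1,\omega_2,\omega_3)$ and $v=(v_1,v_2,v_3)$, the correspondence between skew-symmetric matrices and vectors is the standard one:
\begin{align}
\begin{pmatrix}0&-\omega_3&\omega_2\\ \omega_3&0&-\omega_1\\ -\omega_2&\omega_1&0\end{pmatrix}
\begin{pmatrix}v_1\\v_2\\v_3\end{pmatrix}
=\begin{pmatrix}\omega_2v_3-\omega_3v_2\\ \omega_3v_1-\omega_1v_3\\ \omega_1v_2-\omega_2v_1\end{pmatrix}
=\omega\times v.
\end{align}
As an illustrative example (my choice, for concreteness), take a basis rotating about the third axis: $\xi_1(t)=(\cos t,\sin t,0)$, $\xi_2(t)=(-\sin t,\cos t,0)$, $\xi_3(t)=(0,0,1)$. One finds $\dot{\xi_1}=\xi_2$, $\dot{\xi_2}=-\xi_1$, $\dot{\xi_3}=0$, and these are exactly $\omega\times\xi_i$ for $\omega=(0,0,1)$.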
Putting the two remarks above together, we see that in $\Bbb{R}^3$, if we have an orthonormal (time-dependent) basis $\xi=\{\xi_1,\xi_2,\xi_3\}$ and a time-independent basis $e=\{e_1,e_2,e_3\}$, then for any differentiable function $f:I\to\Bbb{R}^3$,
\begin{align}
\frac{df}{dt}=\frac{d_ef}{dt}=\frac{d_{\xi}f}{dt}+\omega^{(\xi)}\times f.
\end{align}
It is this second equality which is written in your post and in all textbooks.
Next, just so we directly answer everything:
- $(\vec{A})_S$ typically means exactly what you wrote: take the vector $\vec{A}$, figure out its components relative to a basis defining $S$, and stick those components into a 3-tuple. However, as I hope my presentation shows, you don’t really need this notation.
- For your questions regarding the various derivatives, see my definition 1 and my first bullet point above. In particular, in light of the notation $(\vec{A})_{S'}$ (your point 1), the notation $\left(\frac{d\vec{A}}{dt}\right)_{S'}$ is indeed confusing (especially since $S'$ is non-inertial), because it does not mean “write $\frac{d\vec{A}}{dt}=\alpha^i\hat{e}_i'$ and consider the tuple $(\alpha^1,\alpha^2,\alpha^3)$”; i.e., there is a wonderful conflict of notation with the bracket and subscript $(\cdot)_{S'}$. The intended meaning is what I wrote in definition 1. This is why I wrote the operator as $\frac{d_{\xi}}{dt}$: to emphasize that it is a differential operator built using the basis $\xi$, and not that you take the time derivative and then extract the components relative to that basis.
- For your question (A), the components of $\frac{df}{dt}$ relative to the $\xi$ basis are different from the components of $\frac{d_{\xi}f}{dt}$ relative to the $\xi$ basis (there’s the extra angular velocity term which needs to be accounted for).
- For your question (B), hopefully I’ve answered it already in my definition 1, and in my special cases above.
Calculating accelerations.
You didn’t ask, but let’s work out second derivatives using this language. We start from equation $(*)$, and apply $\frac{d}{dt}$ to both sides (and of course assume everything is twice differentiable):
\begin{align}
\frac{d^2f}{dt^2}&=\frac{d}{dt}\left(\frac{d_{\xi}f}{dt}+\Omega^{(\xi)}[f]\right)\\
&=\frac{d}{dt}\left(\frac{d_{\xi}f}{dt}\right)+\frac{d}{dt}\left(\Omega^{(\xi)}[f]\right)\\
&=\left(\frac{d_{\xi}}{dt}\left(\frac{d_{\xi}f}{dt}\right)+\Omega^{(\xi)}\left[\frac{d_{\xi}f}{dt}\right]\right)
+
\frac{d\Omega^{(\xi)}}{dt}[f]+\Omega^{(\xi)}\left[\frac{df}{dt}\right]\\
&=\left(\frac{d_{\xi}}{dt}\right)^2f+\Omega^{(\xi)}\left[\frac{d_{\xi}f}{dt}\right]+
\frac{d\Omega^{(\xi)}}{dt}[f]+ \Omega^{(\xi)}\left[\frac{d_{\xi}f}{dt}+\Omega^{(\xi)}[f]\right]\\
&=\frac{d_{\xi}^2f}{dt^2}
+ 2\Omega^{(\xi)}\left[\frac{d_{\xi}f}{dt}\right]
+\frac{d\Omega^{(\xi)}}{dt}[f]
+\left(\Omega^{(\xi)}\circ\Omega^{(\xi)}\right)[f]
\end{align}
Keep in mind that the square brackets $[\cdot]$ mean you evaluate the endomorphism on the given vector (after plugging in the time $t$ everywhere). So, we see that the second derivative of a function $f$ equals a sum of four terms: the “second derivative relative to $\xi$”, the Coriolis acceleration term, the Euler acceleration term, and the centrifugal acceleration term, respectively.
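In the orthonormal $\Bbb{R}^3$ case, where $\Omega^{(\xi)}[v]=\omega^{(\xi)}\times v$, differentiating this identity for a constant vector $v$ gives $\frac{d\Omega^{(\xi)}}{dt}[v]=\dot{\omega}^{(\xi)}\times v$, so the formula above should reduce to the familiar form (writing $\omega$ for $\omega^{(\xi)}$):
\begin{align}
\frac{d^2f}{dt^2}=\frac{d_{\xi}^2f}{dt^2}+2\,\omega\times\frac{d_{\xi}f}{dt}+\dot{\omega}\times f+\omega\times(\omega\times f).
\end{align}
As a quick check under assumed data (uniform rotation, my own example): for $\xi_1(t)=(\cos t,\sin t,0)$, $\xi_2(t)=(-\sin t,\cos t,0)$, $\xi_3(t)=(0,0,1)$ we have $\omega=(0,0,1)$ and $\dot{\omega}=0$. For $f(t)=r\,\xi_1(t)$ with $r$ constant, $\frac{d_{\xi}f}{dt}=0$, so only the last term survives: $\omega\times(\omega\times f)=\omega\times(r\,\xi_2)=-r\,\xi_1(t)$, which agrees with differentiating $f(t)=(r\cos t,r\sin t,0)$ twice directly.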