I have a few questions on how to use Lie series as canonical transformations, which are widely used in perturbation theory (celestial mechanics).
I know that these series are related to a Taylor expansion in $\varepsilon$ around the origin, but I struggle to reproduce the whole story up to the final variable transformations. Below I summarize the process as presented in Sections 5.2 and 5.3 of the book "Canonical Perturbation Theories: Degenerate Systems and Resonance" by Ferraz-Mello (2007).
Lie Series
Let's start by considering $(p,q)$, a set of $2N$ canonically conjugate variables that depend on a time-like parameter $\lambda$ and evolve under the Lie generating function $W(p,q)$, that is
\begin{equation} \frac{dq_i}{d\lambda} = \frac{\partial W}{\partial p_i}\ \ ;\ \ \frac{dp_i}{d\lambda} = -\frac{\partial W}{\partial q_i},\ \ \ \ \ \ \ (i=1,\ldots,N) \end{equation}
Now, let's consider the Taylor series of $(p,q)$ around $\lambda=0$
\begin{equation} q_i(\lambda) = \sum_{n=0}^\infty\frac{\lambda^n}{n!}\frac{d^nq_i}{d\lambda^n}\Big\rvert_{\lambda=0} \end{equation}
\begin{equation} p_i(\lambda) = \sum_{n=0}^\infty\frac{\lambda^n}{n!}\frac{d^np_i}{d\lambda^n}\Big\rvert_{\lambda=0} \end{equation}
Now consider the $\lambda$-derivative of any function $f(q(\lambda),p(\lambda))$ (which does not depend on $\lambda$ explicitly). By the chain rule and the canonical equations for $W$, it equals the Poisson bracket of $f$ with $W$
\begin{equation} \frac{df}{d\lambda}=\sum_{i=1}^{N}\left(\frac{\partial f}{\partial q_i}\frac{dq_i}{d\lambda} + \frac{\partial f}{\partial p_i}\frac{dp_i}{d\lambda}\right) = \sum_{i=1}^{N}\left(\frac{\partial f}{\partial q_i}\frac{\partial W}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial W}{\partial q_i}\right) = \{f,W\} = D_W(f) \end{equation}
where $D_W(f)$ is defined as an operator known as the Lie derivative of $f$ generated by $W$.
Substituting $\frac{df}{d\lambda}$ for $f$ in the previous equation gives us
\begin{equation} \frac{d^2f}{d\lambda^2} = \{\{f,W\},W\}=D_W(D_W(f))=D_W^2(f) \end{equation}
And so on
\begin{equation} \frac{d^nf}{d\lambda^n} = D_W^n(f) \end{equation}
If $f$ is taken to be one of the variables $q_i,p_i$ themselves, we get an expression for their derivatives
\begin{equation} \frac{d^nq_i}{d\lambda^n} = D_W^n(q_i)\ \ ;\ \ \frac{d^np_i}{d\lambda^n} = D_W^n(p_i) \end{equation}
which we can use in their Taylor representations given above
\begin{equation} q_i(\lambda) = \sum_{n=0}^\infty\frac{\lambda^n}{n!}D_W^n(q_i)\Big\rvert_{\lambda=0} = E_Wq_i \end{equation}
\begin{equation} p_i(\lambda) = \sum_{n=0}^\infty\frac{\lambda^n}{n!}D_W^n(p_i)\Big\rvert_{\lambda=0} = E_Wp_i \end{equation}
where $E_Wq_i$ and $E_Wp_i$ are the Lie series representations of $(p,q)$ around $\lambda=0$.
The left-hand sides of the previous equations are the variables $(p,q)$ at parameter value $\lambda$, whereas the right-hand sides are functions of the initial values $(p_0,q_0)=(p(0),q(0))$; this is where the canonical transformation to new variables comes from.
To make explicit that the right-hand sides depend on these new variables, Ferraz-Mello rewrites the previous equations as follows:
\begin{equation} q_i = E_{W^*}q_i^* \end{equation}
\begin{equation} p_i = E_{W^*}p_i^* \end{equation}
where $(p^*,q^*)=(p_0,q_0)$ and $W^*=W(p^*,q^*)$.
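As a quick sanity check of these formulas, here is a minimal worked example for one degree of freedom. The choice $W^*=\frac{1}{2}(p^{*2}+q^{*2})$ is my own illustrative assumption and is not taken from the book. With $D_{W^*}(f)=\{f,W^*\}$ as defined above,
\begin{equation} D_{W^*}(q^*) = p^*\ \ ;\ \ D_{W^*}^2(q^*) = D_{W^*}(p^*) = -q^*\ \ ;\ \ D_{W^*}^3(q^*) = -p^*\ \ ;\ \ D_{W^*}^4(q^*) = q^* \end{equation}
so the Lie series sum to
\begin{equation} q = E_{W^*}q^* = q^*\cos\lambda + p^*\sin\lambda\ \ ;\ \ p = E_{W^*}p^* = p^*\cos\lambda - q^*\sin\lambda \end{equation}
This is exactly the solution of $dq/d\lambda = p$, $dp/d\lambda = -q$ with initial values $(p^*,q^*)$, and the map is canonical, since the Poisson bracket of $q$ and $p$ taken with respect to $(q^*,p^*)$ is $\cos^2\lambda+\sin^2\lambda=1$.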
It can be shown that this procedure can be extended to any function $f=f(p,q)$ (commutation theorem), giving
\begin{equation} f = E_{W^*}f(p^*,q^*) \end{equation}
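The commutation theorem can be checked by hand in a simple case. As an illustrative assumption (not from the book), take one degree of freedom, $W^*=\frac{1}{2}(p^{*2}+q^{*2})$ and $f=qp$. Then
\begin{equation} D_{W^*}(q^*p^*) = p^{*2}-q^{*2}\ \ ;\ \ D_{W^*}^2(q^*p^*) = -4q^*p^* \end{equation}
and the Lie series of $f$ sums to
\begin{equation} E_{W^*}(q^*p^*) = q^*p^*\cos 2\lambda + \frac{1}{2}\left(p^{*2}-q^{*2}\right)\sin 2\lambda \end{equation}
For this $W^*$ the flow is $q=q^*\cos\lambda+p^*\sin\lambda$, $p=p^*\cos\lambda-q^*\sin\lambda$, and multiplying these two expressions gives the same right-hand side, so $f(p,q)=E_{W^*}f(p^*,q^*)$ holds in this case.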
Hori's method
Everything said above is frequently applied in Hori's averaging method. Ferraz-Mello does so as well in Chapter 6, where we have a Hamiltonian $H=H(J,\theta)$ which depends on the "old" variables and we seek to transform it into $H^*=H^*(J^*,\theta^*)$.
The Lie series given above are known to define canonical transformations and are therefore proposed as the change of variables, that is
\begin{equation} \theta_i = E_{W^*}\theta_i^* \end{equation}
\begin{equation} J_i = E_{W^*}J_i^* \end{equation}
\begin{equation} H = E_{W^*}H(J^*,\theta^*) \end{equation}
which is actually a transformation from the new variables to the old ones and not the other way around, but this direction turns out to be the more practical one.
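To sketch how the last relation is used in practice, suppose (these orderings are my assumptions and are not stated above) that the parameter $\lambda$ is absorbed into the generating function, that $W^*=\varepsilon W_1^*+O(\varepsilon^2)$, and that $H=H_0+\varepsilon H_1+O(\varepsilon^2)$. With $D_{W^*}(f)=\{f,W^*\}$, the first terms of the Lie series would then read
\begin{equation} H^*(J^*,\theta^*) = E_{W^*}H(J^*,\theta^*) = H_0 + \varepsilon\left(H_1 + \{H_0,W_1^*\}\right) + O(\varepsilon^2) \end{equation}
where every function on the right-hand side is evaluated at $(J^*,\theta^*)$ and the bracket is taken with respect to those variables. Under these assumptions, $W_1^*$ enters at first order only through the bracket $\{H_0,W_1^*\}$.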
Questions
Is the derivation given above (which uses Taylor series to arrive at Lie series) correct?
With respect to which variables are the canonical equations with the generating function $W$ defined in Hori's method? Hori himself (and other sources) defines the generating function with respect to the new variables ($(J^*,\theta^*)$ in our case), although the explanation given above requires the relation with $W$ to be with the old variables.