Before I get to your questions, I need to do a quick derivation. Recall that the Taylor expansion of a function $f(x)$ around zero is given by
$$
f(x) = f(0) + \frac{df}{dx}\Bigg{|}_{0}x + \frac{1}{2}\frac{d^2f}{dx^2}\Bigg{|}_{0}x^2 + \cdots.
$$
The multivariable generalization of this is simply
$$
f(x^{\mu}) = f(0) + \partial_\nu f(x^\mu)|_{0}x^\nu + \frac{1}{2}\partial_\nu\partial_\rho f(x^\mu)|_0 x^\nu x^\rho+ \cdots.
$$
Now if we look at a function like $f(\xi^\mu):=g(x^\mu + \xi^\mu)$, where $\xi^\mu:=\epsilon v^\mu(x)$ for infinitesimal $\epsilon$ and arbitrary functions $v^\mu(x)$, we can do a multivariate Taylor expansion of $f(\xi^\mu)$ around zero, giving us
\begin{align*}
g(x^\mu+\xi^\mu)=f(\xi^\mu) &= f(0) + \xi^\nu\frac{\partial f}{\partial \xi^\nu}\Bigg{|}_0 + \mathcal{O}(\epsilon^2)\\
&= g(x^\mu)+\xi^\nu\left(\frac{\partial(x^\rho+\xi^\rho)}{\partial\xi^\nu}\frac{\partial g(x^\mu+\xi^\mu)}{\partial (x^\rho+\xi^\rho)}\right)_{\xi=0}+\mathcal{O}(\epsilon^2)\\
&= g(x^\mu)+\xi^\rho\partial_\rho g(x^\mu)+\mathcal{O}(\epsilon^2).
\end{align*}
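As a quick sanity check of this first-order expansion, here is a one-dimensional example of my own choosing (it is not part of the original problem): take $g(x)=x^2$ and $\xi=\epsilon x$. Then
$$
g(x+\xi)=(x+\epsilon x)^2 = x^2 + 2\epsilon x^2 + \epsilon^2 x^2 = g(x) + \xi\,\frac{dg}{dx} + \mathcal{O}(\epsilon^2),
$$
since $\xi\,dg/dx = (\epsilon x)(2x) = 2\epsilon x^2$, in agreement with the general formula.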
Applying this to the problem at hand we find
\begin{align*}
\delta S&=\int \mathcal{L}'(x')d^4x'- \int \mathcal{L}(x)d^4x\\
&=\int (\mathcal{L}'(x)+\xi^\mu\partial_\mu\mathcal{L}'(x)+\mathcal{O}(\epsilon^2))d^4x'- \int \mathcal{L}(x)d^4x.
\end{align*}
But we still have to deal with the transformed measure, $d^4x'$. Recall that the measure transforms as follows
$$
d^4x'=\left|\frac{\partial x'^\mu}{\partial x^\nu}\right|d^4x
$$
which in our case, since $\xi^\mu(x)$ is proportional to an infinitesimal, gives (check this yourself)
$$
d^4x'=(1+\partial_\mu\xi^\mu + \mathcal{O}(\epsilon^2))d^4x.
$$
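If you want the check spelled out, here is a sketch. It assumes $x'^\mu = x^\mu+\xi^\mu(x)$, as in the expansion above, and uses the standard first-order identity $\det(\mathbb{1}+M)=1+\operatorname{tr}M+\mathcal{O}(M^2)$:
\begin{align*}
\frac{\partial x'^\mu}{\partial x^\nu} &= \delta^\mu_\nu + \partial_\nu\xi^\mu,\\
\left|\frac{\partial x'^\mu}{\partial x^\nu}\right| &= \det\left(\delta^\mu_\nu + \partial_\nu\xi^\mu\right) = 1 + \delta^\nu_\mu\,\partial_\nu\xi^\mu + \mathcal{O}(\epsilon^2) = 1+\partial_\mu\xi^\mu+\mathcal{O}(\epsilon^2).
\end{align*}
The absolute value can be dropped because the determinant is close to $1$, and hence positive, for infinitesimal $\epsilon$.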
Finally, using $\mathcal{L}'(x') = \mathcal{L}(x)$ (i.e. the Lagrangian density is a scalar) to show that $\xi^\mu\partial_\mu\mathcal{L}'(x) = \xi^\mu\partial_\mu\mathcal{L}(x) + \mathcal{O}(\epsilon^2)$, then applying the product rule and dropping terms of order $\epsilon^2$, we arrive at
$$
\delta S=\int (\delta \mathcal{L}(x)+\partial_{\mu}(\mathcal{L}(x)\xi^{\mu}))d^4x
$$
as desired.
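For completeness, the intermediate steps can be sketched as follows. Here I am reading $\delta\mathcal{L}(x)$ as $\mathcal{L}'(x)-\mathcal{L}(x)$, which is my interpretation of the notation and is itself of order $\epsilon$:
\begin{align*}
\delta S&=\int \left(\mathcal{L}'(x)+\xi^\mu\partial_\mu\mathcal{L}'(x)\right)\left(1+\partial_\nu\xi^\nu\right)d^4x- \int \mathcal{L}(x)d^4x + \mathcal{O}(\epsilon^2)\\
&=\int \left(\mathcal{L}'(x)-\mathcal{L}(x)+\xi^\mu\partial_\mu\mathcal{L}(x)+\mathcal{L}(x)\partial_\mu\xi^\mu\right)d^4x + \mathcal{O}(\epsilon^2)\\
&=\int \left(\delta\mathcal{L}(x)+\partial_\mu(\mathcal{L}(x)\xi^\mu)\right)d^4x + \mathcal{O}(\epsilon^2).
\end{align*}
In the second line, $\mathcal{L}'(x)\partial_\mu\xi^\mu$ was replaced by $\mathcal{L}(x)\partial_\mu\xi^\mu$, which is allowed because the two differ by $\delta\mathcal{L}\,\partial_\mu\xi^\mu=\mathcal{O}(\epsilon^2)$.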
Hopefully it is now clear where the extra term comes from: since our transformation depended explicitly on the coordinates, we could not avoid its effect on the measure or on the Taylor expansion of $\mathcal{L}$. However, this is not unique to local transformations such as the coordinate variation, as we would obtain a derivative term even for a translation by a constant amount, $x'^\mu=x^\mu+c^\mu$. The important thing is that these are total derivatives, so given sufficiently nice boundary conditions this term vanishes and the action is invariant, making this a symmetry of the theory.
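To make the constant-translation case concrete, here is a short sketch, assuming $c^\mu$ is itself infinitesimal so that the first-order expansion applies. With $\xi^\mu=c^\mu$ we have $\partial_\mu\xi^\mu=0$, so the measure is unchanged, but the Taylor term survives:
$$
d^4x'=d^4x,\qquad \delta S=\int \left(\delta \mathcal{L}(x)+c^\mu\partial_\mu\mathcal{L}(x)\right)d^4x=\int \left(\delta \mathcal{L}(x)+\partial_\mu(c^\mu\mathcal{L}(x))\right)d^4x.
$$
In either case the extra term is a total derivative, so by the divergence theorem it reduces to a boundary term,
$$
\int_V \partial_\mu(\mathcal{L}\xi^\mu)\,d^4x=\oint_{\partial V}\mathcal{L}\,\xi^\mu\, d\Sigma_\mu,
$$
which vanishes if $\mathcal{L}\xi^\mu$ falls off sufficiently fast at the boundary (here $d\Sigma_\mu$ denotes the outward surface element, a symbol introduced only for this illustration).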
A note on your statement:
"Usually while deriving the Euler-Lagrange equations we only consider $\delta L$ while varying the action..."
When deriving the Euler-Lagrange equations we vary with respect to the fields, not the coordinates, since the fields are the degrees of freedom in a classical field theory. The above is not an attempt at deriving the equations of motion but rather an examination of what happens to the action under an infinitesimal coordinate transformation. Such an examination is useful when you want to determine the restrictions you would need to impose on the Lagrangian density to make such a transformation a symmetry of the theory.