$\begingroup$

I am looking for a method to transform my dataset from its current mean and standard deviation to a target mean and a target standard deviation. Basically, I want to shrink or expand the dispersion and shift all the values so that they have the target mean.

Applying two separate linear transformations, one for the standard deviation and then one for the mean, doesn't work for me. What method should I use?

$\endgroup$
  • $\begingroup$ If $Y = aX + b$, then $E(Y) = a E(X) + b$ and $Var(Y) = a^2 Var(X)$. Does this help? $\endgroup$
    – ocram
    Commented Dec 22, 2012 at 9:00
  • $\begingroup$ @ocram, I think that's an answer (and a good one)... $\endgroup$ Commented Dec 22, 2012 at 10:09
  • $\begingroup$ @PeterEllis: Thanks! I'll make it an answer then :-) $\endgroup$
    – ocram
    Commented Dec 22, 2012 at 10:20

2 Answers

$\begingroup$

Suppose you start with a set $\{x_i\}$ with mean $m_1$ and non-zero standard deviation $s_1$, and you want to arrive at a similar set with mean $m_2$ and standard deviation $s_2$.

Then multiplying all your values by $\frac{s_2}{s_1}$ will give a set with mean $m_1 \times \frac{s_2}{s_1}$ and standard deviation $s_2$.

Now adding $m_2 - m_1 \times \frac{s_2}{s_1}$ to every value will give a set with mean $m_2$ and standard deviation $s_2$.

So a new set $\{y_i\}$ with $$y_i= m_2+ (x_i- m_1) \times \frac{s_2}{s_1} $$ has mean $m_2$ and standard deviation $s_2$.
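
As a quick check of that claim, here is a short derivation. It assumes $s_2 \ge 0$ and uses the $\frac{1}{n}$ (population) form of the standard deviation, with $n$ the number of values; the same argument goes through with $\frac{1}{n-1}$, as long as one convention is used throughout.

$$\frac{1}{n}\sum_i y_i = m_2 + \frac{s_2}{s_1}\left(\frac{1}{n}\sum_i x_i - m_1\right) = m_2 + \frac{s_2}{s_1}\,(m_1 - m_1) = m_2$$

$$\sqrt{\frac{1}{n}\sum_i (y_i - m_2)^2} = \frac{s_2}{s_1}\sqrt{\frac{1}{n}\sum_i (x_i - m_1)^2} = \frac{s_2}{s_1} \times s_1 = s_2$$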

You would get the same result with three steps: translate the mean to $0$, scale to the desired standard deviation, then translate to the desired mean.
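
For anyone who wants to try this on real numbers, here is a minimal Python sketch of the formula $y_i = m_2 + (x_i - m_1) \times \frac{s_2}{s_1}$. The function name `rescale`, its parameter names, and the sample data are made up for illustration. It uses the population standard deviation (`statistics.pstdev`); swapping in the sample version (`statistics.stdev`) should work just as well, provided the same convention is used when checking the result.

```python
from statistics import fmean, pstdev


def rescale(xs, target_mean, target_sd):
    """Linearly map xs so the result has the requested mean and standard deviation."""
    m1 = fmean(xs)
    s1 = pstdev(xs)  # population SD; an assumption, stdev() would also work
    if s1 == 0:
        # The formula divides by s1, so constant data cannot be rescaled.
        raise ValueError("cannot rescale data with zero standard deviation")
    scale = target_sd / s1
    # y_i = m_2 + (x_i - m_1) * s_2 / s_1
    return [target_mean + (x - m1) * scale for x in xs]


# Illustrative data only (mean 5, population SD 2).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
rescaled = rescale(data, target_mean=100.0, target_sd=15.0)

# Both values should match the targets up to floating-point rounding.
print(fmean(rescaled), pstdev(rescaled))
```

The zero-variance check mirrors the "non-zero standard deviation $s_1$" condition above.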

$\endgroup$
$\begingroup$

Let’s consider the z-score calculation of data $x_i$ with mean $\bar{x}$ and standard deviation $s_x$.

$$z_i = \dfrac{x_i-\bar{x}}{s_x}$$

This means that, given some data $(x_i)$, we can transform it into data with a mean of $0$ and a standard deviation of $1$.

Rearranging, we get:

$$x_i = z_i s_x+ \bar{x}$$

This gives us back our original data with the original mean $\bar{x}$ and standard deviation $s_x$. But we could’ve gone to data $y_i$ with any mean $\bar{y}$ and standard deviation $s_y$.

$$y_i = z_i s_y +\bar{y}$$

Now combine the two transformations, first to $z_i$ and then to $y_i$.

$$y_i = \dfrac{x_i-\bar{x}}{s_x}s_y + \bar{y}$$

This is the same as what Henry posted, but I do think it is helpful to see that we get there by first going to standardized data and then transforming to data with the mean and standard deviation values we desire.
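
The two-step route can be sketched in Python as follows. The helper names `standardize` and `from_z` and the sample values are hypothetical, chosen only for illustration. This version uses the sample standard deviation (`statistics.stdev`) and assumes the data are not all identical.

```python
from statistics import fmean, stdev


def standardize(xs):
    """Step 1: z_i = (x_i - x_bar) / s_x."""
    x_bar = fmean(xs)
    s_x = stdev(xs)  # sample SD; an assumption, any consistent convention works
    if s_x == 0:
        raise ValueError("z-scores are undefined when the standard deviation is zero")
    return [(x - x_bar) / s_x for x in xs]


def from_z(zs, y_bar, s_y):
    """Step 2: y_i = z_i * s_y + y_bar."""
    return [z * s_y + y_bar for z in zs]


# Illustrative data only.
x = [12.0, 15.0, 9.0, 20.0, 14.0]
z = standardize(x)
y = from_z(z, y_bar=50.0, s_y=10.0)

# z should have mean ~0 and SD ~1; y should have mean ~50 and SD ~10,
# up to floating-point rounding.
print(fmean(z), stdev(z))
print(fmean(y), stdev(y))
```

Keeping the intermediate $z_i$ around can be handy if the same data need to be mapped to several different target means and standard deviations.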

$\endgroup$
