3
$\begingroup$

I want to confirm whether taking the exponential of the MAD of log-transformed data gives me a measure of relative distance from the median of the original, untransformed data.

Say I have a MAD of 0.2 for the log-transformed data and I get a value of 1.22 by exponentiating it. Does this 1.22 mean the original data vary by 22% around the median?

$\endgroup$
2
  • 1
    $\begingroup$ It's not actually clear what you're asking; what do you mean by "varies by 22% around the median"? Is there a particular measure of variability on the untransformed data you have in mind? $\endgroup$
    – jbowman
    Commented Apr 21 at 18:32
  • $\begingroup$ @jbowman What I am wondering is this: does the MAD of the log-transformed data already measure the relative distance from the median of the original data, or do I need to exponentiate it to get that relative distance? $\endgroup$
    – Anon9001
    Commented Apr 21 at 18:42

2 Answers

5
$\begingroup$

Well, it gives you a measure of relative distance around the median, but I'm not sure it's the one you want. Raising e to the MAD of the log-transformed data does not give you the MAD of the original data; if that is what you want, you can compute it directly. Here is some R code (anything after a # is a comment), using a deliberately mixed-up distribution:

set.seed(1234)

x <- c(rnorm(100), rnorm(100,3,2), rbinom(100, 1, .5)) + 10 #Adding 10 to avoid
   #negative numbers


mad(x)  #1.43
mad(log(x)) #0.14
exp(0.14)  #1.15

Do the original data "vary by 15% around the median"? I don't know. It depends on what you mean. But a percentage is probably not the right way to look at it, because the percentage changes if you add a constant to the data. Here is a plot of the density of the original data with lines at 10.96 and 15% below and above.

[Figure: density of the original data, with lines at 10.96 and at 15% below and above.]
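As a rough sketch of the "adding a constant" point (not part of the original code): the variable names below are illustrative, and `constant = 1` is passed to `mad()` so that it returns the plain median absolute deviation instead of R's default, which multiplies by 1.4826. The sample is the same mixed one as above.

```r
# Same mixed sample as in the code above.
set.seed(1234)
x <- c(rnorm(100), rnorm(100, 3, 2), rbinom(100, 1, .5)) + 10

# constant = 1 gives the plain median absolute deviation;
# the default (constant = 1.4826) rescales it to estimate a normal sigma.
raw_mad    <- mad(x, constant = 1)
scaled_mad <- mad(x)                      # 1.4826 * raw_mad

# Multiplicative spread factor implied by the log-scale MAD.
factor_x <- exp(mad(log(x), constant = 1))

# Shifting the data by a constant leaves the MAD itself unchanged,
# but it changes the log-scale MAD and hence the "percentage".
shifted         <- x + 100
raw_mad_shifted <- mad(shifted, constant = 1)
factor_shifted  <- exp(mad(log(shifted), constant = 1))

print(c(raw_mad = raw_mad, raw_mad_shifted = raw_mad_shifted, scaled_mad = scaled_mad))
print(c(factor_x = factor_x, factor_shifted = factor_shifted))
```

The two raw MADs should agree, while the factor for the shifted data should sit much closer to 1, which is why a percentage reading only makes sense when the data have a meaningful zero.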

$\endgroup$
6
  • $\begingroup$ Firstly, thank you for the quick answer! But I am wondering whether the exponent is even necessary to get a measure of relative distance from the median for the original data. I think it's not necessary, as the absolute difference of two logs is equal to the relative difference of the original values, but please correct me if I am wrong! $\endgroup$
    – Anon9001
    Commented Apr 21 at 18:54
  • 2
    $\begingroup$ R's default mad has a scaling factor of $1.4826$ which means that it does not give the actual median absolute deviation. Try mad(c(-1,1)) where the median absolute deviation is $1$ but R gives $1.4826$ $\endgroup$
    – Henry
    Commented Apr 22 at 0:42
  • 1
    $\begingroup$ @Henry Geeze. R does a lot of weird things by default, but that is really weird. $\endgroup$
    – Peter Flom
    Commented Apr 22 at 8:27
  • 2
    $\begingroup$ The "justification" for this is that for a normal distribution $X \sim N(\mu, \sigma^2)$ you have $\operatorname{mad}(X) = \sigma\, \Phi^{-1}(\frac34)$, and so a robust estimator of $\sigma$ is $\frac{\operatorname{mad}(\{x_i\})}{ \Phi^{-1}(\frac34)}\approx 1.4826\, \operatorname{mad}(\{x_i\})$. But it is only useful when you want an estimate of $\sigma$ from a normally distributed sample, and you need to change the constant to $1$ if you want the actual median absolute deviation. $\endgroup$
    – Henry
    Commented Apr 22 at 9:06
  • 1
    $\begingroup$ That makes sense, but they should rename the function. $\endgroup$
    – Peter Flom
    Commented Apr 22 at 9:13
7
$\begingroup$

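Let's call your data $\{x_i\}$ with median $m_x$ and log-transformed data $\{y_i\}=\{\log(x_i)\}$ (natural logarithm) with median $m_y$. Note that $m_y = \log(m_x)$ thanks to the monotonicity of logarithms. This is exact when the median is one of the data points, as with an odd sample size; if the median is taken as the average of the two middle values, it holds only approximately.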

Then knowing that the median absolute deviation of the $\{y_i\}$ is $0.2$ tells you that half of the $y_i$ satisfy $$m_y-0.2 \le y_i \le m_y+0.2$$

though this does not tell you that a quarter is below $m_y-0.2$ and a quarter above $m_y+0.2$, since there is no presumption of symmetry in the distribution of the data.

Since $e^{0.2} \approx 1.2214$, this then tells you that half of the $x_i$ satisfy $$\frac{m_x}{1.2214} \le x_i \le 1.2214 \,m_x$$

and again this does not tell you that a quarter is below $\frac{m_x}{1.2214}$ and a quarter above $1.2214 \,m_x$.
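This can be checked numerically. The following is a minimal sketch under stated assumptions: the log-normal sample and the names `d`, `f`, `inside`, `below` and `above` are made up for illustration, the sample size is odd so that the median is a data point, and `constant = 1` is used so that R's `mad()` returns the plain median absolute deviation.

```r
# Hypothetical positive, skewed sample; odd n so the median is a data point.
set.seed(1)
x <- rlnorm(1001)

m_x <- median(x)
d   <- mad(log(x), constant = 1)  # plain MAD on the log scale
f   <- exp(d)                     # multiplicative factor on the original scale

# Share of the data inside [m_x / f, m_x * f], and on either side of it.
inside <- mean(x >= m_x / f & x <= m_x * f)
below  <- mean(x <  m_x / f)
above  <- mean(x >  m_x * f)

print(c(inside = inside, below = below, above = above))
```

The `inside` share should come out at essentially one half, while `below` and `above` need not each be a quarter. With $d = 0.2$ the band runs from $e^{-0.2}\,m_x \approx 0.819\,m_x$ to $e^{0.2}\,m_x \approx 1.221\,m_x$, that is, from about 18% below the median to about 22% above it: symmetric as a ratio, not as a percentage.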

$\endgroup$
