3
$\begingroup$

The Mahalanobis distance provides a value that can be used to detect outliers. My question: how can I calculate the direction of the outlier (as a vector)?

A simple answer would be to use the vector from the center of the distribution to the outlier, but this would not use the "normalization" property provided by the Mahalanobis distance...

$\endgroup$
  • 1
    $\begingroup$ Hello, I am not sure I understand what you mean by "the direction of an outlier". I would go with the direction of the observation, taking the center of your data as the origin, as you suggested. You could normalize it by the positive definite square root of the variance matrix of your data, but I wonder if this is relevant... $\endgroup$
    – Pohoua
    Commented Aug 10, 2020 at 12:52
  • 2
    $\begingroup$ The direction, as always, is given by the vector going from the center to the outlier. What more might you be looking for?? $\endgroup$
    – whuber
    Commented Aug 10, 2020 at 16:47
  • $\begingroup$ @whuber, it seems like the solution should involve mapping the outlier vector by multiplying it with the inverse covariance matrix and then measuring the angle. The problem is that multiplication by the inverse covariance matrix is similar to division by the variance rather than division by the standard deviation, which seems more appropriate in this case. $\endgroup$ Commented Aug 11, 2020 at 8:41
  • $\begingroup$ You ignore the square root in the formula, Gideon: this turns the variance into the equivalent of an SD. $\endgroup$
    – whuber
    Commented Aug 11, 2020 at 13:28
  • $\begingroup$ @GideonKogan Could you please draw out your idea in two dimensions? $\endgroup$
    – Dave
    Commented Mar 21, 2022 at 19:42
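
The point about the square root in the exchange above can be written out explicitly. The symbols here are assumptions for illustration and do not appear in the thread: $\mu$ is the center, $\Sigma$ the covariance matrix (assumed positive definite), and $\Sigma^{-1/2}$ its symmetric inverse square root.

$$d_M(x) = \sqrt{(x-\mu)^\top \Sigma^{-1} (x-\mu)} = \left\lVert \Sigma^{-1/2}(x-\mu) \right\rVert$$

Under these assumptions, the vector $z = \Sigma^{-1/2}(x-\mu)$ is scaled by the standard deviation (not the variance) along each principal axis, its length is the Mahalanobis distance, and $z / \lVert z \rVert$ is one candidate for a normalized direction.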

2 Answers

0
$\begingroup$

The direction of the outlier could be calculated by zero-phase component analysis (ZCA) whitening. (The image attached to the original answer is not available.)
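
Since the image for this answer is missing, here is a minimal sketch of what ZCA whitening of an outlier might look like. It is not the answerer's own code. The function name `whitened_direction` and its arguments are hypothetical, and the sketch assumes the sample covariance matrix is positive definite.

```python
import numpy as np

def whitened_direction(x, data):
    """Return (unit direction, Mahalanobis distance) of x after ZCA whitening.

    `data` is an (n_samples, n_features) array; `x` is one observation.
    Assumes the sample covariance of `data` is positive definite.
    """
    mu = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    # Symmetric inverse square root of the covariance (the ZCA whitening matrix)
    eigvals, eigvecs = np.linalg.eigh(cov)
    w_zca = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    # Whitened deviation from the center; its norm is the Mahalanobis distance
    z = w_zca @ (x - mu)
    dist = np.linalg.norm(z)
    return z / dist, dist
```

The returned unit vector lives in the whitened coordinates, where every direction has unit variance, so it works in any number of dimensions.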

$\endgroup$
-1
$\begingroup$

Option (1):

You can use an angle as the direction, $$\tan\theta = \frac{y_{center}-y_1}{x_{center}-x_1},$$ and the Mahalanobis distance itself as the magnitude of the vector.

Option (2):

Calculate the angle between one of the eigenvectors and the point (the outlier). (The image attached to the original answer is not available.)

$\endgroup$
  • 1
    $\begingroup$ Since there are at least two distances in play--the original Euclidean distance and the Mahalanobis distance--could you explain which distance should be used to compute the angle and why? And what exactly do the terms in your formula mean? They look like a slope in a 2D problem. The ratio is almost surely not an angle! $\endgroup$
    – whuber
    Commented Aug 10, 2020 at 16:46
  • 1
    $\begingroup$ Do you mean the $\arctan$ of that ratio? I could see the polar-style coordinate working for data in two dimensions, but what happens in three dimensions or in ten dimensions? $\endgroup$
    – Dave
    Commented Aug 10, 2020 at 17:16
  • 2
    $\begingroup$ Re the edit: to calculate the angle in the original metric, take the arc cosine of the dot product of the unit vector to the point with the unit (directed) eigenvector. To calculate the angle in the Mahalanobis metric, first standardize the point as described at stats.stackexchange.com/a/62147/919 and proceed with the preceding recipe. These formulas work in any number of dimensions -- but please note that the angle does not usually give full information about the direction requested in the question, which asks for a "vector." $\endgroup$
    – whuber
    Commented Aug 10, 2020 at 21:05
  • $\begingroup$ @whuber, what else, in addition to the angle, can contribute to the direction? $\endgroup$
    – Michael D
    Commented Aug 11, 2020 at 8:36
  • $\begingroup$ @whuber, I could not find in the linked answer how to standardize the point in the original coordinates. It seems like it might be something similar to SVD, right? $\endgroup$ Commented Aug 11, 2020 at 8:47
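
The recipe in whuber's comment above (arc cosine of the dot product of unit vectors, optionally after standardizing) might be sketched as follows. The function names are hypothetical, and the standardization step is an assumption: it uses the symmetric inverse square root of the covariance matrix, which may differ in detail from the procedure in the linked answer.

```python
import numpy as np

def angle_between(a, b):
    """Angle in radians between vectors a and b (any number of dimensions)."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip to guard against rounding slightly outside [-1, 1]
    return np.arccos(np.clip(cos, -1.0, 1.0))

def angles_to_eigenvector(x, mu, cov, k=-1):
    """Angles between the centered point and the k-th eigenvector of cov.

    Returns (angle in the original metric, angle in the Mahalanobis metric).
    Assumes cov is symmetric positive definite.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    v = eigvecs[:, k]                       # k=-1 picks the largest-variance axis
    d = x - mu
    # Assumed standardization: multiply by the symmetric inverse square root
    inv_sqrt = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return angle_between(d, v), angle_between(inv_sqrt @ d, inv_sqrt @ v)
```

The sign of an eigenvector returned by `np.linalg.eigh` is arbitrary, so each angle is only determined up to replacing it with its supplement unless a direction for the eigenvector is fixed beforehand. As the comment notes, a single angle does not fully specify a direction vector in more than two dimensions.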
