Questions tagged [distance]

Measure of distance between distributions or variables, such as Euclidean distance between points in n-space.

0 votes
0 answers
17 views

GLMM with cumulative distances

I have calculated the cumulative distance travelled by each individual. The objective is to plot the data with a trend curve using the R package ggplot2. However, the data are too ...
Klervi's user avatar
  • 11
2 votes
1 answer
32 views

analytical asymptotic approximation of the expected maximum, mean, and minimum distance of nearest neighbours in unit ball

Say I uniformly at random distribute $x = n^3$ (independent identically distributed) points in a ball of radius $r=1$ in $\mathbb{R}^3$. What can be said about the expected maximum, minimum, and mean ...
kram1032's user avatar
  • 255
0 votes
0 answers
28 views

Wasserstein distance to assess the degree of normality

The Wasserstein distance between two probability measures with quantile functions $F^{-1}$ and $G^{-1}$ is given by \begin{align} W(F,G) = \int_{[0,1]} |F^{-1}(t) - G^{-1}(t)|dt \end{align} Now let's ...
thesecond's user avatar
  • 390
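For readers of the excerpt above, here is a minimal sketch of the quantile-function formula applied to two empirical samples. It assumes equal sample sizes, in which case the integral of $|F^{-1}(t) - G^{-1}(t)|$ reduces to the mean absolute difference between the sorted samples. The function name `wasserstein_1d` is hypothetical and not taken from the question.

```python
def wasserstein_1d(x, y):
    """Empirical 1-Wasserstein distance between two equal-size samples.

    Assumption: len(x) == len(y), so both empirical quantile functions
    are step functions on the same grid and the integral becomes a mean.
    """
    if len(x) != len(y):
        raise ValueError("this sketch assumes equal sample sizes")
    xs = sorted(float(v) for v in x)
    ys = sorted(float(v) for v in y)
    # Pair the i-th smallest value of x with the i-th smallest value of y.
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    print(wasserstein_1d([0, 1, 2], [3, 2, 1]))
```

For unequal sample sizes the quantile functions would have to be evaluated on a common grid of $t$ values, which this sketch does not attempt.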
0 votes
0 answers
21 views

Looking for a suitable way to find groups of events

I have an Excel file with three columns. The first is the name of an event, the second is the time at which the event starts, and the third is the time at which it ends. Let's say ...
slow_learner's user avatar
1 vote
1 answer
24 views

After matching: How do I interpret the value of the type ‘distance’ (= propensity score) in the balance measures table from bal.tab in the R package cobalt?

I have used the R-package ‘MatchIt’ to perform (1) a nearest neighbour propensity score matching (NNM) based on the Framingham Heart Study and (2) for comparison, an optimal PS matching (OM) for the ...
user19939387's user avatar
1 vote
1 answer
58 views

Hypothesis testing by asymptotic distribution

Consider the following hypothesis testing problem: under $H_0$: $(X_1,\cdots,X_n) \sim P_n,$ under $H_1$: $(X_1,\cdots,X_n) \sim Q_n.$ We want to show that the minimum testing error goes to zero when $...
efsdfmo12's user avatar
  • 123
1 vote
1 answer
33 views

Distance between two multivariate distributions

I have data structured as follows: 2 years -> several metrics per year (let's say 2000) -> several measurements per metric (let's say 1000 for year 1 and 800 for year 2). So I have a 2000x1000 ...
Lou's user avatar
  • 21
1 vote
1 answer
45 views

Hierarchical clustering of a distance matrix with element weights

I am computing a hierarchical clustering of some geospatial data. I need to add in an element weighting to the approach. My current approach is: I compute temporal cross-correlations between my N ...
JoshD's user avatar
  • 51
2 votes
1 answer
44 views

What are downsides to "genetic matching," particularly outside of causal inference settings?

Multivariate matching methods typically involve two steps. First the user computes $D$, a matrix of the multivariate distances between units. Second, the user applies a matching function (e.g., 1:1 ...
socialscientist's user avatar
1 vote
0 answers
71 views

Understanding $\chi^2$ in plain language [closed]

Suppose I was trying to explain what $\chi^2$ is and why it's important to my grandma. I want to give core intuition to this formula: $$\chi^2 = \sum_{i}\frac{(O_i-E_i)^2}{E_i}$$ I would tell her ...
Swike's user avatar
  • 139
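As a small illustration of the formula quoted in the excerpt above, the sketch below sums the squared gaps between observed and expected counts, each scaled by the expected count. The function name `chi_square_statistic` and the example counts are made up for illustration.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum over cells of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Hypothetical counts: three categories, 20 expected in each.
    print(chi_square_statistic([10, 20, 30], [20, 20, 20]))
```

Dividing by $E_i$ means that a gap of 10 counts weighs more in a cell where 20 were expected than in a cell where 2000 were expected.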
1 vote
0 answers
48 views

Why not use the $L^2$ norm as the difference between two probability distributions (as opposed to KL-Divergence and others) [closed]

So I was wondering why not just use: $$dist(p,q)=\bigg(\int_{x \in X} |p(x)-q(x)|^2 dx\bigg)^{1/2}$$ instead of the commonly used KL-Divergence, which isn't even a distance measure and therefore not ...
Anon's user avatar
  • 121
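For concreteness, the $L^2$ distance in the excerpt above can be approximated numerically when both densities are evaluated on the same evenly spaced grid. The sketch below uses a plain Riemann sum; the function name `l2_distance` and the grid spacing parameter `dx` are assumptions made for illustration.

```python
import math


def l2_distance(p, q, dx):
    """Riemann-sum approximation of (integral |p(x) - q(x)|^2 dx)^(1/2).

    Assumption: p and q hold density values on the same evenly spaced
    grid with spacing dx.
    """
    if len(p) != len(q):
        raise ValueError("p and q must be evaluated on the same grid")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)) * dx)


if __name__ == "__main__":
    # Two hypothetical densities sampled at two grid points, spacing 0.5.
    print(l2_distance([1.0, 1.0], [0.0, 0.0], 0.5))
```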
0 votes
0 answers
10 views

Distance to find similar samples in a multivariate dataset

Apart from the Euclidean and Mahalanobis distance metrics, given a sample with multivariate values, is there a way to find the samples that are similar to the given sample? Does KNN clustering find the ...
sveer's user avatar
  • 103
0 votes
0 answers
85 views

PCA and Gower's Distance

I have a dataset with nutrient information about different ingredients. There are a total of 70 nutrients (numeric features) and 3 categorical features for a total of around 550 ingredients. I am ...
MSingh's user avatar
  • 1
0 votes
0 answers
24 views

Is it possible to convert a similarity (distance) index to correlation coefficient?

I have two cases, each of which has values on a series of variables (e.g. A, B & C). Is it possible to calculate a distance or similarity index between these two cases and then convert it to a ...
Mahdi Karvandi's user avatar
0 votes
0 answers
18 views

Distance metric for dummy and continuous variables

I'm trying to apply the KNN regression model to the data I have at my disposal which contains one dummy variable and two continuous variables (which I have normalized). I was wondering if it is okay ...
soph's user avatar
  • 1
