$\begingroup$

Let $Y$, $X_1$ and $X_2$ be three continuous real random variables, with the joint density of $(X_1, X_2)$ satisfying $f(x_1, x_2) > 0$ everywhere on $\mathbb{R}^2$, and write $g(x_1, x_2) = E[Y \mid X_1 = x_1, X_2 = x_2]$. Then $g(0,0) = E[Y \mid X_1 = 0, X_2 = 0]$ is well defined.

Suppose we apply a transformation $h(\cdot, \cdot)$ to $(X_1, X_2)$ such that $h(x_1, x_2) = 0$ iff $x_1 = 0$ and $x_2 = 0$. Suppose further that the resulting density of $Z = h(X_1, X_2)$, denote it by $f_z(z)$, satisfies $f_z(0) = 0$, and that there exists a neighborhood $N_0$ of $z = 0$ with $f_z(z) > 0$ for all $z \in N_0 \setminus \{0\}$.

My question is: is it true that $E[Y \mid Z = 0] = E[Y \mid X_1 = 0, X_2 = 0] = g(0,0)$? My problem is that $E[Y \mid Z = 0]$ is not well defined because $f_z(0) = 0$, so the latter equality is meaningless as stated. But is there perhaps a way to approximate $g(0,0)$ by exploiting the fact that $E[Y \mid Z = z]$ is defined in a punctured neighborhood of $z = 0$? Maybe by imposing smoothness conditions on $g(x_1, x_2)$?

I encountered this problem while studying local polynomial estimators with transformed variables, but it could also apply to the estimation of regression functions conditional on a univariate variable with a sparse density.
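To see why the answer can be negative, here is a small Monte Carlo sketch (everything in it is an illustrative assumption, not part of the setup above): take $X_1, X_2$ i.i.d. standard Cauchy, $Y = g(X_1, X_2) = \mathbf{1}\{|X_1| > 1\}$ so that $g(0,0) = 0$, and compare two transformations that both vanish only at the origin. Conditioning on $\{h(X_1, X_2) < \epsilon\}$ gives very different answers, because the near-zero level sets of the second transformation also contain points far from the origin.

```python
import math
import random

# Illustrative choices (not from the question): X1, X2 i.i.d. standard
# Cauchy, Y = 1{|X1| > 1}, so g(0,0) = 0. Both h1 and h2 are zero iff
# (x1, x2) = (0, 0).

def h1(x1, x2):
    # Level sets {h1 = z} shrink to the origin as z -> 0.
    return x1**2 + x2**2

def h2(x1, x2):
    # Also zero only at the origin, but h2 is small whenever |x1| is
    # large, so its near-zero level sets reach far from (0, 0).
    return (x1**2 + x2**2) / (1 + x1**4)

random.seed(0)
n, eps = 200_000, 0.01
# Standard Cauchy via the inverse-CDF transform tan(pi * (U - 1/2)).
samples = [(math.tan(math.pi * (random.random() - 0.5)),
            math.tan(math.pi * (random.random() - 0.5)))
           for _ in range(n)]

def cond_mean(h):
    """Monte Carlo estimate of E[Y | h(X1, X2) < eps]."""
    ys = [1.0 if abs(x1) > 1 else 0.0
          for x1, x2 in samples if h(x1, x2) < eps]
    return sum(ys) / len(ys)

m1 = cond_mean(h1)  # ~0: the event {h1 < eps} is a small ball around (0,0)
m2 = cond_mean(h2)  # far from 0: {h2 < eps} is dominated by |x1| > 10
print(m1, m2)
```

Here $m_1 \approx g(0,0) = 0$ while $m_2$ is close to $1$, even though both conditioning events are "near $\{X_1 = 0, X_2 = 0\}$" in the sense that $h = 0$ only there.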

$\endgroup$
  • $\begingroup$ See en.m.wikipedia.org/wiki/Borel%E2%80%93Kolmogorov_paradox $\endgroup$
    – Mason
    Commented Nov 21, 2023 at 18:01
  • $\begingroup$ In general, you can't expect $E(Y \mid Z = 0) = E(Y \mid X_1 = 0, X_2 = 0)$, because both of these quantities are one point of a function that is in a sense only defined up to almost sure equivalence. $\endgroup$
    – Mason
    Commented Nov 21, 2023 at 18:02
  • $\begingroup$ Thanks a lot! I have a follow-up question. In many bivariate Regression Discontinuity Designs (a type of non-experimental design, see for example here) it is assumed that $E[Y \mid f(X_1, X_2) = f(x_1, x_2)]$ is continuous in $(x_1, x_2)$, where $f(\cdot, \cdot)$ is the Euclidean distance from $(0,0)$. In this way it is stated that $\lim_{(x_1,x_2) \to (0,0)} E[Y \mid f(X_1, X_2) = f(x_1, x_2)] = E[Y \mid X_1 = 0, X_2 = 0]$. $Y$ is continuous. Must I then conclude that this is potentially not always true? $\endgroup$
    – Ldt
    Commented Nov 22, 2023 at 17:50
  • $\begingroup$ If $E(Y \mid X_1 = 0, X_2 = 0)$ is replaced with $E(Y \mid f(X_1, X_2) = 0)$ then it is true by assumption, since $f(0, 0) = 0$. However, if you repeat this procedure with a function $g$ instead of $f$, then even if $g(x_1, x_2) = 0$ if and only if $x_1 = x_2 = 0$, it may be that $E(Y \mid f(X_1, X_2) = 0) \neq E(Y \mid g(X_1, X_2) = 0)$. I recall there is a simple example of this in Casella and Berger's Statistical Inference book. Here you have $g(X_1, X_2) = (X_1, X_2)$. $\endgroup$
    – Mason
    Commented Nov 22, 2023 at 23:25
  • $\begingroup$ I understand, so these papers are actually incorrect in some of their conclusions. Are there any conditions to impose to make these equalities hold? For example, I've seen this formulation: $E[Y \mid X_1 = x_1, X_2 = x_2]$ is continuous, so define a neighborhood of radius $\epsilon$ around $(0,0)$, $N_\epsilon$; then $\lim_{\epsilon \to 0} E[Y \mid (X_1, X_2) \in N_\epsilon] = E[Y \mid X_1 = 0, X_2 = 0]$. Is this correct? To me it seems the same thing as defining the Euclidean distance function and taking the limit as before, but maybe I'm missing something. Thanks a lot for your time! $\endgroup$
    – Ldt
    Commented Nov 23, 2023 at 11:56
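The $\epsilon$-ball formulation in the last comment can be checked numerically. A minimal sketch, with everything in it an illustrative assumption: take $X_1, X_2$ i.i.d. $N(0,1)$ and $Y = g(X_1, X_2) = \cos(X_1) + X_2$, a continuous $g$ with $g(0,0) = 1$. Conditioning on the full two-dimensional ball $N_\epsilon$ around $(0,0)$ shrinks to the single point $(0,0)$, so (unlike conditioning on level sets of a one-dimensional transform) continuity of $g$ does give convergence to $g(0,0)$:

```python
import math
import random

# Illustrative assumptions: X1, X2 i.i.d. N(0,1) and
# Y = g(X1, X2) = cos(X1) + X2, a continuous g with g(0,0) = 1.

random.seed(1)
n = 500_000
samples = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]

def ball_mean(eps):
    """Monte Carlo estimate of E[Y | (X1, X2) in the eps-ball N_eps]."""
    ys = [math.cos(x1) + x2 for x1, x2 in samples
          if x1 * x1 + x2 * x2 < eps * eps]
    return sum(ys) / len(ys)

for eps in (1.0, 0.5, 0.25):
    print(eps, ball_mean(eps))  # approaches g(0,0) = 1 as eps shrinks
```

The key difference from the counterexample above is that here the conditioning events $N_\epsilon$ shrink to the point $(0,0)$ itself, whereas the level sets $\{h(X_1, X_2) = z\}$ of a transform need not.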
