$\begingroup$

I am assuming a model of the form

$$Y_i=\alpha+\beta X_i+g(\mathbf{Z}_i)+\epsilon_i,$$

where $\mathbf{Z}_i$ is an $m$-dimensional vector and $\epsilon_i$ is i.i.d. white noise. I would like to establish whether $\beta$ is statistically significant based on my data, without taking a strong stance on the form of $g$. What types of methods are usually applied to this kind of problem?

$\endgroup$

2 Answers

$\begingroup$

This sounds like a great job for GAMs via the mgcv package. Use a penalized smoothing spline to estimate $g$ and add a linear (parametric) effect of $X$. The model would look like gam(y ~ x + s(z)).

library(mgcv)
#> Loading required package: nlme
#> This is mgcv 1.8-31. For overview type 'help("mgcv-package")'.


z = rnorm(1000)
x = rnorm(1000)
y = 2 + 0.25*x + sin(pi*z) + rnorm(1000, 0, 0.3)
d = data.frame(x, y, z)

model = gam(y ~ x + s(z), data = d)

summary(model)
#> 
#> Family: gaussian 
#> Link function: identity 
#> 
#> Formula:
#> y ~ x + s(z)
#> 
#> Parametric coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 1.968566   0.009514  206.91   <2e-16 ***
#> x           0.262245   0.009888   26.52   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Approximate significance of smooth terms:
#>        edf Ref.df     F p-value    
#> s(z) 8.977      9 625.1  <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> R-sq.(adj) =  0.865   Deviance explained = 86.6%
#> GCV = 0.091407  Scale est. = 0.090404  n = 1000

Created on 2020-10-20 by the reprex package (v0.3.0)
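
When $\mathbf{Z}$ has several components, a single s(z) no longer applies directly. The following is a minimal sketch, not part of the original reprex, of two common options in mgcv: a purely additive fit with one smooth per covariate, and a tensor-product smooth te() that allows the covariates to interact. The variable names z1 and z2 and the simulated interaction are assumptions made purely for illustration.

```r
library(mgcv)

set.seed(3)
n  <- 1000
z1 <- runif(n)
z2 <- runif(n)
x  <- rnorm(n)
# Illustrative data: the effect of z1 depends on z2 (an interaction)
y  <- 2 + 0.25 * x + sin(2 * pi * z1) * z2 + rnorm(n, 0, 0.3)
d  <- data.frame(x, y, z1, z2)

# Additive smooths: no interaction between z1 and z2 is modelled
additive <- gam(y ~ x + s(z1) + s(z2), data = d, method = "REML")

# Tensor-product smooth: a joint surface in (z1, z2), interaction allowed
interact <- gam(y ~ x + te(z1, z2), data = d, method = "REML")

# Parametric coefficient table (estimate, SE, t, p) for the interaction model
summary(interact)$p.table

# Rough comparison of the two specifications
AIC(additive, interact)
```

A full tensor product over many covariates grows quickly in basis size, so with six covariates it is usually more practical to add te() or ti() terms only for the pairs where an interaction is plausible.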

$\endgroup$
  • $\begingroup$ Thanks will try the package. $\endgroup$
    – fes
    Commented Oct 20, 2020 at 19:36
  • $\begingroup$ The package seems to work fine. Clarifying question: My $\textbf{Z}$ consists of 6 variables. Does this approach account for possible interactions between these variables? $\endgroup$
    – fes
    Commented Oct 30, 2020 at 12:26
$\begingroup$

This is a partially linear regression model, and in your case $g(\mathbf{Z})$ is a nuisance parameter. See page 62 of this link for a primer on the subject. Of particular note in applications is Robinson's transformation (Section 7.7 on page 62 of the linked file).
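
The idea behind Robinson's transformation is to take conditional expectations given $\mathbf{Z}$ and subtract, which removes $g$: $Y_i - E[Y_i \mid \mathbf{Z}_i] = \beta\,(X_i - E[X_i \mid \mathbf{Z}_i]) + \epsilon_i$. A rough sketch in R follows. Robinson's original proposal uses kernel regression for the two conditional means; here mgcv smooths are swapped in for convenience, and the simulated data are an assumption for illustration only.

```r
library(mgcv)

set.seed(1)
n <- 1000
z <- rnorm(n)
x <- 0.5 * z + rnorm(n)                      # x is correlated with z
y <- 2 + 0.25 * x + sin(pi * z) + rnorm(n, 0, 0.3)

# Step 1: estimate E[Y | Z] and E[X | Z] nonparametrically, keep the residuals
ry <- residuals(gam(y ~ s(z)), type = "response")
rx <- residuals(gam(x ~ s(z)), type = "response")

# Step 2: OLS of residualised Y on residualised X (no intercept needed)
fit <- lm(ry ~ rx - 1)
summary(fit)$coefficients
```

The slope on rx estimates $\beta$. Its OLS standard error ignores the first-stage estimation error, so it should be read as an approximation that relies on the conditional means being estimated well enough.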

Inference is particularly tricky in these settings: it is hard to say anything general about the asymptotics of an estimate of $g(\mathbf{Z})$, so you typically need to assume that $g$ lies in some function space. A recent, very general approach to inference was proposed by Chernozhukov et al. (2017), if that is of interest.
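
If the reference is to the double/debiased machine learning line of work, its key ingredient for the partially linear model is cross-fitting: the nuisance regressions are fit on one part of the sample and the residuals are formed on the held-out part. Below is a hedged sketch of that idea, not the authors' implementation. The learner (an mgcv smooth), the number of folds K, and the simulated data are all assumptions chosen for illustration.

```r
library(mgcv)

set.seed(2)
n <- 1000
z <- rnorm(n)
x <- 0.5 * z + rnorm(n)
y <- 2 + 0.25 * x + sin(pi * z) + rnorm(n, 0, 0.3)
d <- data.frame(x, y, z)

K    <- 5                                   # number of folds (assumed)
fold <- sample(rep(1:K, length.out = n))    # random fold assignment
ry   <- numeric(n)
rx   <- numeric(n)

for (k in 1:K) {
  train <- d[fold != k, ]
  test  <- d[fold == k, ]
  # Fit the nuisance regressions on the training folds only
  fit_y <- gam(y ~ s(z), data = train)
  fit_x <- gam(x ~ s(z), data = train)
  # Out-of-fold residuals for the held-out observations
  ry[fold == k] <- test$y - as.numeric(predict(fit_y, newdata = test))
  rx[fold == k] <- test$x - as.numeric(predict(fit_x, newdata = test))
}

# Residual-on-residual estimate of beta
beta_hat <- sum(rx * ry) / sum(rx^2)

# Heteroskedasticity-robust (sandwich) standard error
eps <- ry - beta_hat * rx
se  <- sqrt(sum(rx^2 * eps^2)) / sum(rx^2)

c(estimate = beta_hat, se = se, z = beta_hat / se)
```

The z statistic can be compared with standard normal quantiles to judge whether $\beta$ differs from zero, under the conditions that the chosen learner estimates both conditional means sufficiently well.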

$\endgroup$