$\begingroup$

I have a question about a scenario that has been described in the literature: the treatment causes censoring, and an unmeasured confounder is associated with both censoring and the outcome.

In this case, censoring is a collider, and analyzing only the observed (uncensored) participants induces selection bias.
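
To make the bias concrete, here is a sketch in my own notation (an assumption, not taken from a reference). Let $\gamma$ be the effect of $u$ on $y$, take $c = 0$ as uncensored, and suppose, as in my simulation, that $x$ has no effect on $y$ and $u$ is binary. Then among the uncensored:

$$E[y \mid x=1, c=0] - E[y \mid x=0, c=0] = \gamma\,\bigl\{P(u=1 \mid x=1, c=0) - P(u=1 \mid x=0, c=0)\bigr\}$$

The true contrast is 0, so the right-hand side is the selection bias. It is non-zero whenever $x$ and $u$ are associated among the uncensored, which is what conditioning on the collider induces.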

I provide a figure to show this selection bias.

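[Figure: causal diagram, x → c ← u → y, where x is the treatment, c is censoring, u is the unmeasured confounder, and y is the outcome]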

It has been proposed that inverse probability of censoring weighting (IPCW) can be used to create a pseudo-population that represents the full population. I use it to remove the association between treatment and censoring. The idea is that, with no association between treatment and censoring, no selection bias should occur. However, my results are still biased.
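
One thing I noticed while checking this, written as a sketch under the assumption that the censoring model contains only $x$: the weight is then a function $w(x)$ alone, so it is constant within each level of $x$ and cancels from the weighted mean of $y$ among the uncensored:

$$\frac{\sum_{i:\,x_i = x,\;c_i = 0} w(x)\,y_i}{\sum_{i:\,x_i = x,\;c_i = 0} w(x)} = \frac{1}{n_x}\sum_{i:\,x_i = x,\;c_i = 0} y_i$$

Here $n_x$ is the number of uncensored participants with $x_i = x$ (my notation). For a binary $x$, the weighted regression of $y$ on $x$ would therefore give the same slope as the unweighted one, so weights built from $x$ alone may not be able to change the estimate at all.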

Please help me address this issue and provide code that yields an unbiased result.

SAS code:

data t;
  call streaminit(123);
  do i = 1 to 1000000;
    /* Simulating x with probability p_x */
    p_x = 1/3;
    x = rand("bernoulli", p_x);
    
    /* Simulating u with probability p_u */
    p_u = 1/2;
    u = rand("bernoulli", p_u);

    /* Calculating the linear predictor for c from x and u */
    model_c = 0 + x * 0.811 - 0.5 * u;
    /* Logistic transformation to get probability p_c */
    p_c = exp(model_c) / (1 + exp(model_c));
    /* Simulating c using the calculated probability p_c */
    c = rand("bernoulli", p_c);
    /* Simulating y: normal noise plus an effect of u (no effect of x) */
    y = rand ("normal",100,15)+ u * 10;
    output;
  end;
run;

/* correct result */
/* Unconditional linear regression of y on x (all participants) */
proc reg data=t;
  model y = x;
run;

/* biased result */
/* Linear regression of y on x, conditioned on c = 0 */
proc reg data=t;
where c = 0;
  model y = x;
run;

/*IPW for censoring*/
proc psmatch data=t; 
class c;
psmodel c  = x;
output out=z 
atewgt = w_c;
run; 

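/* still a biased result */
/* Weighted regression on the same subset (c = 0) as the unweighted analysis above */
proc reg data=z;
where c = 0;
model y = x;
weight w_c;
run;

/* sum of weights among c = 0: size of the weighted pseudo-population */
proc means sum data=z;
where c = 0;
var w_c;
run;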

R code:

rm(list=ls(all=TRUE))
library(tidyverse)

# Set the seed for reproducibility
set.seed(123)
# Define parameters
sample_size <- 100000
p_x <- 1/3
p_u <- 1/2
# Generate the data frame
t <- tibble(
  x = rbinom(n = sample_size, size = 1, prob = p_x),  # Simulating x
  u = rbinom(n = sample_size, size = 1, prob = p_u),  # Simulating u
  model_c = 0 + x * 0.811 - 0.5*u,  # Linear predictor for c from x and u
  p_c = exp(model_c) / (1 + exp(model_c)),  # Logistic transformation
  c = rbinom(n = sample_size, size = 1, prob = p_c),
  y = rnorm (n = sample_size,100,15)+ u * 50)  # Simulating y
#correct
summary(lm(y~x, data=t))

# conditional on c == 0 (biased)
summary(lm(y~x, data=t, subset = c==0))

psmodel <- glm(c ~ x, data = t,
                family = binomial(link = "logit"))

# psmodel predicts P(c = 1 | x); rows with c == 0 need the weight 1 / P(c = 0 | x)
t$weight <- 1 / (1 - predict(psmodel, newdata = t, type = "response"))

# still a biased result
summary(lm(y~x, data=t, subset = c==0, weights=weight))
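
To check whether the weights can matter at all, here is a minimal base-R sketch using the same data-generating values as my R simulation. The names `d`, `fit_x`, `fit_xu`, `w_x`, `w_xu`, and `obs` are mine. The model `fit_xu` is a hypothetical "oracle" that uses `u`, which is only possible in a simulation because `u` is unmeasured in practice.

```r
set.seed(1)
n <- 200000
x <- rbinom(n, 1, 1/3)                           # treatment
u <- rbinom(n, 1, 1/2)                           # confounder (unmeasured in practice)
c <- rbinom(n, 1, plogis(0.811 * x - 0.5 * u))   # censoring indicator
y <- rnorm(n, 100, 15) + 50 * u                  # outcome: depends on u, not on x
d <- data.frame(x, u, c, y)

# Censoring model with x only (as in my analysis) and with x and u (oracle)
fit_x  <- glm(c ~ x,     data = d, family = binomial)
fit_xu <- glm(c ~ x + u, data = d, family = binomial)

# Weights for uncensored rows: 1 / P(c = 0 | covariates)
d$w_x  <- 1 / (1 - fitted(fit_x))
d$w_xu <- 1 / (1 - fitted(fit_xu))

obs <- d[d$c == 0, ]                             # uncensored participants only
b_unw <- coef(lm(y ~ x, data = obs))[["x"]]                   # unweighted
b_wx  <- coef(lm(y ~ x, data = obs, weights = w_x))[["x"]]    # weights from x
b_wxu <- coef(lm(y ~ x, data = obs, weights = w_xu))[["x"]]   # oracle weights

print(c(unweighted = b_unw, weights_x = b_wx, weights_xu = b_wxu))
```

If my reasoning is right, the first two estimates should agree to numerical precision, because weights that depend only on a binary `x` are constant within each group. Only the oracle weights, which use `u`, should move the estimate toward 0.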
$\endgroup$
  • $\begingroup$ I found that this path cannot be solved by IPW. $\endgroup$
    – Elong Chen
    Commented Jun 9 at 14:27
