
Given a high-dimensional dataset $X$ with potentially redundant features, how can we efficiently aggregate and/or select features so that a target variable $Y$ can still be predicted accurately, while reducing dimensionality? (I do not want methods like PCA or VAEs, which reduce the dimensions but give little or no understanding of the latent dimensions; I want the result to be more explainable.)

The hypothesis is that some of the features of $X$ might not be needed at all, while others might only be needed as aggregates. $X$ is a $d$-dimensional sequence, and the idea is to reduce it to $m$ dimensions, where $m \ll d$, by applying aggregating functions such as the mean, maximum, and minimum.
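For a concrete sense of what "aggregate groups of features and keep the result interpretable" can look like, here is a minimal sketch using scikit-learn's `FeatureAgglomeration`, which clusters correlated features and replaces each cluster by a pooled value (the mean here, but any pooling function such as `max` or `min` works). The data is synthetic and `n_clusters=5` is an arbitrary illustrative choice:

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))          # n=200 samples, d=50 features
# make the last 25 features near-duplicates of the first 25 (redundancy)
X[:, 25:50] = X[:, 0:25] + 0.01 * rng.normal(size=(200, 25))

# cluster the 50 features into 5 groups and pool each group by its mean
agg = FeatureAgglomeration(n_clusters=5, pooling_func=np.mean)
X_reduced = agg.fit_transform(X)        # shape (200, 5)

print(X_reduced.shape)                  # (200, 5)
# agg.labels_ maps each original feature to its cluster, so each of the
# m aggregate dimensions remains explainable in terms of original features
print(len(agg.labels_))                 # 50
```

Because each reduced dimension is a pooled group of named original features (via `agg.labels_`), this stays far more interpretable than a PCA projection.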

I would like to know which methods exist in statistics that can do this.
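One classical statistical option worth mentioning alongside the question is sparsity-based selection, e.g. the Lasso: the L1 penalty drives the coefficients of unneeded features to exactly zero, so the surviving features are directly interpretable. A minimal sketch on synthetic data (the choice `alpha=0.1` is illustrative only; in practice it would be tuned by cross-validation):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, d = 200, 30
X = rng.normal(size=(n, d))
# Y depends only on the first 3 features; the rest are pure noise
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.8 * X[:, 2] + 0.1 * rng.normal(size=n)

model = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(model.coef_)  # indices of features kept by the L1 penalty
print(selected)                          # predominantly the informative features 0, 1, 2
```

This handles the "some features might not be needed at all" half of the hypothesis; it does not by itself produce aggregates of the remaining features.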

One attempt I made was to train a transformer model on $X$, which yields attention values $A$ (per sample). The attention values have the same dimensions as the input and can be interpreted as the contribution of each feature to the output (for each sample). I was wondering whether these attention values could also be used, in some sense, to aggregate the features of $X$ in a way that still gives a good prediction of $Y$.
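To make the attention idea concrete, here is a hedged sketch of one simple recipe (the names and the recipe itself are illustrative assumptions, not an established method): average the per-sample attention $A$ over samples to get a global importance weight per feature, then either keep the top-$m$ most-attended features or form a single attention-weighted aggregate. Random values stand in for the transformer's attention:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 100, 20, 5
X = rng.normal(size=(n, d))
A = rng.random(size=(n, d))             # stand-in for per-sample attention values

w = A.mean(axis=0)                      # global per-feature importance, shape (d,)
w = w / w.sum()                         # normalize into a weight vector

top_m = np.argsort(w)[::-1][:m]         # indices of the m most-attended features
X_selected = X[:, top_m]                # selection: shape (n, m)
X_weighted = X @ w                      # aggregation: one weighted feature, shape (n,)

print(X_selected.shape, X_weighted.shape)
```

Whether the reduced representation still predicts $Y$ well would of course have to be validated by refitting a model on `X_selected` (or `X_weighted`) and comparing held-out accuracy.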

So this question is essentially asking about existing methods for feature aggregation and/or selection of the input that preserve its predictive accuracy. And can deep learning help here?
