
I am currently writing my master's thesis on the double descent curve in neural networks, and while doing some research I came across the paper "On the Double Descent of Random Features Models Trained with SGD". The paper introduces a fairly basic model; see the following picture.

enter image description here

The authors then present the following plot

enter image description here

which I am currently trying to understand. In particular, I do not understand what is meant by "the Gaussian kernel outputs the 2m feature mapping, i.e. $\sigma(Wx) \in \mathbb{R}^{2m}$". Does anybody know what this means? Any help is appreciated. Thank you!
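
For reference, one standard construction that yields a $2m$-dimensional map for the Gaussian kernel is random Fourier features: with $W \in \mathbb{R}^{m \times d}$ drawn from a Gaussian, the cosines and sines of $Wx$ are stacked into a single vector. Whether the paper means exactly this is an assumption on my part. Below is a minimal NumPy sketch of that construction; the function name `gaussian_rff`, the bandwidth parameter `gamma`, and the $1/\sqrt{m}$ scaling are my own choices and are not taken from the paper.

```python
import numpy as np

def gaussian_rff(X, m, gamma=1.0, seed=0):
    """Map X of shape (n, d) to 2m random Fourier features.

    Assumed target kernel: k(x, y) = exp(-gamma * ||x - y||^2).
    This is a hypothetical helper, not the paper's code.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Rows of W are sampled from N(0, 2 * gamma * I), the spectral
    # density of the Gaussian kernel above.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(m, d))
    Z = X @ W.T  # shape (n, m): one row of Wx per input point
    # Stacking cos and sin doubles the width: m rows of W -> 2m features.
    return np.hstack([np.cos(Z), np.sin(Z)]) / np.sqrt(m)

# Usage: 5 points in R^3 with m = 100 give features in R^200.
X = np.random.default_rng(1).normal(size=(5, 3))
Phi = gaussian_rff(X, m=100)
print(Phi.shape)
```

If this reading is right, the "2m" would simply come from each of the $m$ random directions contributing one cosine and one sine coordinate, and inner products of these features approximate the Gaussian kernel as $m$ grows.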


