I am currently writing my master's thesis on the double descent curve in neural networks. While doing some research, I came across the paper "On the Double Descent of Random Features Models Trained with SGD". The paper introduces a fairly simple model, shown in the following picture.
The authors then present the following plot
which I am currently trying to understand. In particular, I do not understand what is meant by "the Gaussian kernel outputs the 2m feature mapping, i.e. $\sigma(Wx) \in \mathbb{R}^{2m}$". Why is the output dimension $2m$ rather than $m$? Does anybody know what is meant here? Any help is appreciated. Thank you!
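One possible reading, offered as an assumption rather than something confirmed by the quoted sentence, is the standard random Fourier features construction for the Gaussian kernel: each of the $m$ rows of $W$ contributes both a cosine and a sine feature, so the feature vector has $2m$ entries. Below is a minimal NumPy sketch of that reading. The function name `rff_features`, the `bandwidth` parameter, the $1/\sqrt{m}$ scaling, and all sizes are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def rff_features(X, W):
    """Map each row x of X to (1/sqrt(m)) * [cos(Wx), sin(Wx)] in R^{2m}.

    Assumed construction (standard random Fourier features), not
    necessarily the exact normalization used in the paper.
    """
    m = W.shape[0]
    Z = X @ W.T                                   # shape (n, m): rows are Wx
    return np.hstack([np.cos(Z), np.sin(Z)]) / np.sqrt(m)

rng = np.random.default_rng(0)
d, m, bandwidth = 5, 2000, 1.0                    # hypothetical sizes

# Rows of W drawn i.i.d. from N(0, I / bandwidth^2)
W = rng.normal(scale=1.0 / bandwidth, size=(m, d))
X = rng.normal(size=(3, d))

Phi = rff_features(X, W)                          # shape (3, 2m)
K_approx = Phi @ Phi.T                            # inner products of features

# Exact Gaussian kernel exp(-||x - x'||^2 / (2 * bandwidth^2)) for comparison
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_exact = np.exp(-sq_dists / (2 * bandwidth ** 2))

print(Phi.shape)
print(np.abs(K_approx - K_exact).max())
```

Under this reading, $\sigma$ is not an elementwise activation from $\mathbb{R}^m$ to $\mathbb{R}^m$; it stacks $\cos(Wx)$ and $\sin(Wx)$, which is where the factor of 2 would come from. The inner product of two such feature vectors then approximates the Gaussian kernel, with the approximation improving as $m$ grows.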