There is some resemblance. I will reuse some of what I wrote in my answer to the previous post.
Using the notation there, we have
\begin{align*}
f(x^*) \mid y &\sim N\Big(K(x^*,x)\big(\sigma^2 I + K(x,x)\big)^{-1} y, \\ &\qquad K(x^*,x^*) - K(x^*,x) \big(\sigma^2 I + K(x,x)\big)^{-1} K(x,x^*)\Big).
\end{align*}
Let us forget about the covariance and only look at the posterior mean, i.e.,
$$
\hat f(x^*) := \mathbb E [f(x^*) \mid y] = K(x^*,x)\big(\sigma^2 I + K(x,x)\big)^{-1} y.
$$
Recall that $x = (x_1,\dots,x_n)$ contains the training data. Let me write $K_{tr} = K(x,x)$ for the kernel matrix obtained from the training data. Also, let $x^*$ be a single test point, which we rename $t$ for simplicity. Our function estimate is
\begin{align*}
\hat f(t) = K(t, x) \big(\sigma^2 I + K_{tr}\big)^{-1}y
\end{align*}
that is
$$
\hat f(t) = \sum_{i=1}^n \alpha_i(y) K(t, x_i), \quad
\alpha_i(y) := [\big(\sigma^2 I + K_{tr}\big)^{-1} y]_i
$$
where $[\cdot]_i$ denotes the $i$th coordinate of a vector.
That is, the function estimate for GP regression is a linear combination of the kernel functions anchored at the training points $x_i$, with weights $\alpha_i(y)$. Note the resemblance to the solution of kernel ridge regression (derived elsewhere!).
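As a quick sanity check, here is a minimal NumPy sketch of the posterior mean above. The RBF kernel, the synthetic data, and the noise level $\sigma^2$ are my own choices for illustration; the point is only that $\hat f(t)$ is computed exactly as $\sum_i \alpha_i(y) K(t, x_i)$:

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel K(a, b) between two 1-D point sets
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=8)                 # training inputs
y = np.sin(x) + 0.1 * rng.standard_normal(8)   # noisy responses
sigma2 = 0.1 ** 2                              # assumed noise variance

K_tr = rbf_kernel(x, x)
# alpha_i(y) = [(sigma^2 I + K_tr)^{-1} y]_i
alpha = np.linalg.solve(sigma2 * np.eye(len(x)) + K_tr, y)

def f_hat(t):
    # Posterior mean: sum_i alpha_i K(t, x_i)
    return rbf_kernel(np.atleast_1d(t), x) @ alpha
```

Note that the weights $\alpha$ are computed once from the training data; evaluating $\hat f$ at a new point only requires the kernel values $K(t, x_i)$.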
The $k$-NN function estimate is
$$
\hat g(t) = \sum_{j} \beta_j(y)\,1_{N_j}(t)
$$
where $N_j$ is one of the cells in the $k$th order Voronoi partition of the space by the training data $\{x_i\}$ and $1_{N_j}(t) = 1\{t \in N_j\}$ is the indicator function of set $N_j$. In other words, $N_j$ is such that all the points in it have the same $k$ nearest neighbors in the training data $\{x_i\}$. Here,
$$
\beta_j(y) = \frac{1}{k} \sum_{i \in S_j} y_i
$$
where $S_j \subset \{1,\dots,n\}$ indexes the $k$ nearest neighbors shared by all points of cell $N_j$. That is, $\beta_j(y)$ is the average of the training responses over the neighbors associated with $N_j$.
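In practice we never build the Voronoi cells explicitly: evaluating $\hat g(t)$ only requires finding the $k$ nearest neighbors of $t$ and averaging their responses. A minimal 1-D sketch (the helper name `knn_mean` and the Euclidean distance are my own choices for illustration):

```python
import numpy as np

def knn_mean(t, x, y, k=3):
    # g_hat(t): average of y over the k training points nearest to t.
    # The index set below plays the role of S_j for the cell containing t.
    idx = np.argsort(np.abs(x - t))[:k]
    return y[idx].mean()
```

Two points $t, t'$ in the same order-$k$ Voronoi cell produce the same index set, hence the same value: the estimate is piecewise constant.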
Both estimators are linear in $y$, since both $\alpha_i(y)$ and $\beta_j(y)$ are linear functions of $y$. The GP regression estimate is a weighted combination of kernel functions anchored at the training features. The $k$NN estimate is a piecewise-constant function over the cells defined by the training features. When going from $k$NN to GP regression, you are replacing the hard indicators $1_{N_j}(t)$ with soft kernels $K(t, x_i)$.
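The linearity in $y$ is easy to verify numerically. A small self-contained sketch (synthetic 1-D data, an RBF kernel, and the noise level are assumptions for illustration): both estimators satisfy $\hat f(t; a y_1 + b y_2) = a \hat f(t; y_1) + b \hat f(t; y_2)$.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=10)   # shared training inputs
t = 0.7                           # a test point
sigma2 = 0.05                     # assumed noise variance

def rbf(a, b):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2)

def gp_mean(t, y):
    # GP posterior mean at t for responses y
    A = sigma2 * np.eye(len(x)) + rbf(x, x)
    return float(rbf(np.atleast_1d(t), x) @ np.linalg.solve(A, y))

def knn_mean(t, y, k=3):
    # k-NN regression estimate at t for responses y
    idx = np.argsort(np.abs(x - t))[:k]
    return float(y[idx].mean())

y1, y2 = rng.standard_normal(10), rng.standard_normal(10)
for est in (gp_mean, knn_mean):
    lhs = est(t, 2 * y1 + 3 * y2)
    rhs = 2 * est(t, y1) + 3 * est(t, y2)
    assert np.isclose(lhs, rhs)   # linearity in y holds for both
```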
In my opinion, it is much easier to understand GP regression than $k$-nearest neighbor regression (due to the complexity of the order-$k$ Voronoi partition).