It seems easier to discuss the stronger "combinatorial" square theorem, which states:
For each integer $r>0$ there is some $n>0$ such that, whenever $[n]^2$ is $r$-colored, there exists a set of the form $$\{(x,y), (x+d,y), (x,y+d), (x+d,y+d)\}$$ whose four elements all receive the same color.
Here (and later), $[n]$ refers to $\{1,2,\dots,n\}$.
The combinatorial square theorem implies the square theorem in $\mathbb R^2$, since we can draw $[n]^2$ as a grid in $\mathbb R^2$. Looking at the theorem in the plane also allows "lopsided" squares; it seems unlikely, but possible, that those make the proof easier.
The combinatorial theorem is sandwiched between the length-$4$-AP case of van der Waerden's theorem and the size-$4$-alphabet case of the Hales–Jewett theorem in the Ramsey-type theorem hierarchy. Specifically, there are maps
$$
[4]^d \longrightarrow [2^d]^2 \longrightarrow [3 \cdot 2^d]
$$
such that every combinatorial line in $[4]^d$ maps to a square in $[2^d]^2$ (though not all squares are obtained in this way), and every square in $[2^d]^2$ maps to a length-$4$ arithmetic progression in $[3\cdot 2^d]$ (though not all arithmetic progressions are obtained in this way).
This means that the Hales–Jewett theorem implies the square theorem. Given an $r$, let $d$ be the Hales–Jewett number for a size-$4$ alphabet and $r$ colors, and take $n=2^d$. Any $r$-coloring of $[n]^2$ can be pulled back to a coloring of $[4]^d$, which must contain a combinatorial line, and the image of that combinatorial line is a monochromatic square in our coloring of $[n]^2$.
This also means that the square theorem implies the length-$4$-AP case of van der Waerden's theorem. Given an $r$, choose $n$ large enough that the $n/3$ by $n/3$ grid is big enough for the square theorem. Any $r$-coloring of $[n]$ can be pulled back to a coloring of $[n/3]^2$, which must contain a monochromatic square, and the image of that monochromatic square is a length-$4$ AP in our coloring of $[n]$.
For this reason, the square theorem should not have a proof significantly easier than van der Waerden's theorem, unless the length-$4$ case is easier than the general case to prove.
Here are the details of the two maps mentioned earlier.
The map from $[4]^d$ to $[2^d]^2$ is defined recursively. To map $[4]$ to $[2]^2$, send $1$ to $(1,1)$, $2$ to $(1,2)$, $3$ to $(2,1)$, and $4$ to $(2,2)$. To map $[4]^d$ to $[2^d]^2$, separate $[4]^d$ out into $4$ copies of $4^{d-1}$, with last coordinate $1$, $2$, $3$, or $4$. Map each copy to $[2^{d-1}]^2$. Then shift the second copy over by $(0,2^{d-1})$, the third copy over by $(2^{d-1},0)$, and the fourth copy over by $(2^{d-1},2^{d-1})$. The effect of this is that a combinatorial line which changes coordinates in a set $I \subseteq [d]$ corresponds to a square with side length $\sum_{i \in I} 2^{i-1}$.
The map from $[2^d]^2$ to $[3 \cdot 2^d]$ is much simpler: the point $(x,y)$ is mapped to $x+2y$. A square $\{(x,y), (x+d,y), (x,y+d), (x+d,y+d)\}$ is mapped to $\{x+y, x+y+d, x+y+2d, x+y+3d\}$, which is an arithmetic progression of length $4$.