Use
ffmpeg -i video1.mp4 -i video2.mp4 -filter_complex '[1][0]scale2ref=iw:ow/mdar[2nd][ref];[ref][2nd]vstack[vid]' -map [vid] -c:v libx264 -crf 23 -preset veryfast output.mp4
In your command, the vstack output pad hasn't been labelled, so the map doesn't refer to anything; here it is labelled [vid]. Depending on your shell, you may need to quote the map value, e.g. -map '[vid]'.
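If quoting turns out to be necessary, a minimal sketch of the same command with the map value quoted could look like the following. The function name stack_videos and its three positional arguments are hypothetical, introduced only for illustration; the ffmpeg options are the ones from the answer.

```shell
# stack_videos TOP BOTTOM OUTPUT
# Hypothetical wrapper (not part of ffmpeg) around the command from the answer.
stack_videos() {
  # Single quotes keep the shell from interpreting the brackets and the semicolon.
  filter='[1][0]scale2ref=iw:ow/mdar[2nd][ref];[ref][2nd]vstack[vid]'

  # '[vid]' is quoted because an unquoted [vid] is a glob pattern:
  # zsh (default settings) stops with "no matches found", and bash would
  # expand it if a file named v, i or d exists in the current directory.
  ffmpeg -i "$1" -i "$2" -filter_complex "$filter" \
    -map '[vid]' -c:v libx264 -crf 23 -preset veryfast "$3"
}
```

It would be called as stack_videos video1.mp4 video2.mp4 output.mp4. Quoting the label is harmless in shells that do not need it, so it is a safe default.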