When you use timestamp offsets, ffmpeg implements the shift through edit lists in the output MP4. It looks like browsers don't parse those, so we need a workaround that bakes the delay into the streams themselves:
ffmpeg -i remote.mp4 -i local.mp4 \
-filter_complex \
" [1:v]scale=iw/4:-1:flags=lanczos[loc0]; \
[0:v]transpose=1,setpts='if(eq(N,0),PTS,PTS+2.501/TB)',fps=30[rotate1]; \
[rotate1][loc0]overlay=main_w-overlay_w-10:main_h-overlay_h-10:eof_action=pass[rem0]; \
[0:a]adelay=2501|2501,apad[0a]; \
[0a][1:a]amerge=inputs=2[a]" \
-map "[rem0]" -map "[a]" \
-ac 2 -vcodec libx264 \
-ar 44100 -acodec aac \
completed.mp4
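The setpts filter shifts the timestamps of all frames except the first by 2.501 seconds. The fps filter then fills that gap with duplicates of the first frame. The adelay filter applies the same 2501 ms delay to the audio of the first input so it stays in sync with the shifted video. I've assumed an input stream frame rate of 30; if yours differs, change the fps value to match.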
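If the delay or the frame rate needs to change, the two filter fragments have to stay consistent with each other: setpts takes the offset in seconds, while adelay takes it in milliseconds. Here is a minimal sketch of deriving both from a single value. The variable names (delay_ms, fps, video_shift, audio_shift, pad_frames) are my own and not part of ffmpeg, and the duplicate-frame count is only an estimate.

```shell
# Hypothetical helper variables; only the filter syntax comes from the command above.
delay_ms=2501   # desired delay of the remote stream, in milliseconds
fps=30          # assumed frame rate of the input video

# setpts wants the offset in seconds, adelay wants it in milliseconds.
delay_s=$(awk -v ms="$delay_ms" 'BEGIN { printf "%.3f", ms / 1000 }')

# Video: keep frame 0 where it is, push every later frame back by the delay.
video_shift="setpts='if(eq(N,0),PTS,PTS+${delay_s}/TB)',fps=${fps}"

# Audio: delay both channels by the same amount, then pad the tail with silence.
audio_shift="adelay=${delay_ms}|${delay_ms},apad"

# Rough number of first-frame duplicates the fps filter inserts to fill the gap.
pad_frames=$(awk -v ms="$delay_ms" -v r="$fps" 'BEGIN { printf "%d", ms * r / 1000 + 0.5 }')

echo "$video_shift"
echo "$audio_shift"
echo "$pad_frames"
```

The first two printed strings can be dropped into the filter_complex in place of the hard-coded setpts/fps and adelay/apad parts. With the values above, the gap works out to roughly 75 duplicated frames.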