I have these 2 lines on more than 3000 HTML pages:
<link rel="canonical" href="https://mywebsite.com/hi/about.html" />
and
<link rel="canonical" href="https://mywebsite.com/about.html" />
So, I want to use a regex to find all the pages where this line does NOT contain the word hi (from the /hi/ part of the link).
Is hi always between mywebsite.com/ and /about.html, or can it be anywhere in the URL?
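If hi is always the first path segment right after mywebsite.com/ (an assumption, since the question above is still open), a negative lookahead is one possible way to match only the canonical lines without it. Here is a minimal Python sketch; the function name lacks_hi and the exact attribute order and spacing of the link tag are assumptions based on the two sample lines.

```python
import re

# Assumption: "hi" only ever appears as the first path segment after the domain,
# and the tag is written exactly like the two sample lines in the question.
CANONICAL_WITHOUT_HI = re.compile(
    r'<link\s+rel="canonical"\s+href="https://mywebsite\.com/(?!hi/)[^"]*"\s*/>'
)

def lacks_hi(html: str) -> bool:
    """Return True if the page has a canonical link whose path does not start with hi/."""
    return CANONICAL_WITHOUT_HI.search(html) is not None

with_hi = '<link rel="canonical" href="https://mywebsite.com/hi/about.html" />'
without_hi = '<link rel="canonical" href="https://mywebsite.com/about.html" />'

print(lacks_hi(with_hi))     # False
print(lacks_hi(without_hi))  # True
```

The lookahead (?!hi/) includes the trailing slash, so a page such as history.html is not excluded by accident. The same pattern should also work in the "find in files" search of an editor whose regex engine supports lookaheads.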