Let us say that I want to search (on some Stack Exchange site) for posts linking to some website, and the URL contains the tilde ~. For the sake of example, let us try python.net/~goodger. (I basically took some random site of this form that appears on SO.)
I have already realized that in addition to searching for url:"*python.net/~goodger*" (Stack Overflow, the whole network) I need to try url:"*python.net/%7Egoodger*", too (Stack Overflow, Stack Exchange). Even for the posts from the second search, in the source I see python.net/~goodger. But in SEDE, it seems that Posts.Body actually contains %7E. Here is a query on GIS. (I chose a smaller site - I expect that on SO such a query will time out.)
Are there some other variations I have to try to make sure that I find all such posts? Is this expected behavior, or should this be considered a bug?
(I stumbled upon this after some bulk replacements that were done on some sites recently - such as Physics and Mathematics. But I suppose that sometimes people might use search of this type for entirely different reasons.)
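For what it is worth, here is a minimal sketch of how one might list the spellings to try. It relies on RFC 3986, under which ~ is an unreserved character and the hex digits of a percent-escape are case-insensitive, so ~, %7E and %7e all denote the same URL. The helper names tilde_variants and search_terms are made up for this illustration, and whether the site search actually treats %7E and %7e differently is an assumption I have not verified.

```python
import re


def tilde_variants(fragment):
    """Return the spellings of a URL fragment that differ only in how '~' is written."""
    # Bring every percent-escaped tilde (%7E or %7e) back to a literal '~' first.
    canonical = re.sub(r"%7e", "~", fragment, flags=re.IGNORECASE)
    candidates = [
        canonical,                        # literal tilde
        canonical.replace("~", "%7E"),    # upper-case escape
        canonical.replace("~", "%7e"),    # lower-case escape (assumed possible in old posts)
    ]
    # Drop duplicates while keeping the order (relevant when there is no tilde at all).
    return list(dict.fromkeys(candidates))


def search_terms(fragment):
    """Wrap each variant in the url:"*...*" search syntax used above."""
    return ['url:"*{}*"'.format(variant) for variant in tilde_variants(fragment)]


if __name__ == "__main__":
    for term in search_terms("python.net/~goodger"):
        print(term)
```

Each printed term would then be pasted into the search box separately. The sketch only handles the tilde; other characters that can be written either literally or percent-escaped would need the same treatment.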
PostHistory.Text into HTML, i.e. Posts.Body) around 2020. I encountered similar peculiarities while repairing broken links, and I think I still account for it in some of the unit tests in that project. Of course, both styles are equally valid, though it's not necessary to use the percent-escape.
liberte or liberté will coincide and only Google may, or may not, keep them separate. But if you use search in a page, your browser also conflates the results. Besides, partial vs. full-text search is a yet unsolved problem.
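To illustrate the kind of conflation mentioned here, a rough sketch of accent folding follows. It assumes the search in question compares words after stripping diacritics; that is a guess about the mechanism, not something confirmed for any particular search engine or browser. The function name fold_accents is hypothetical.

```python
import unicodedata


def fold_accents(text):
    """Strip combining marks so that accented and unaccented spellings compare equal."""
    # NFKD splits a character such as 'é' into the base letter 'e' plus a combining accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    # Both spellings fold to the same key, so a search built on this key cannot tell them apart.
    print(fold_accents("liberté") == fold_accents("liberte"))
```

A search that indexes the folded form returns both spellings for either query, which matches the behaviour described above.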