20

This is related to:

Often when searching, I get duplicate results:

screenshot of duplicate search results

The problem used to be that canonical URLs were wrong (e.g. the canonical URL for an "answer URL" or "comment URL" pointed to itself), but now it seems that canonical URLs have been removed entirely.

Please add them and ensure they are correct. From a user point of view, this is incredibly annoying.

  • 3
    I don't understand how people can ignore this. This is utterly annoying. And it happens not only for answer links but also for all sorts of query params like ?rq=1 (e.g. google.com/search?q=asterism%20latex). Commented Feb 3, 2019 at 10:40
  • @user1666456 This was changed for a while, but the change was reverted because it was causing problems. The ending portion of a link to an answer in the URL is in fact necessary to change the page if the answer is not on the first page, on questions with multiple pages of answers. Commented Feb 11, 2019 at 14:37
  • It's not about changing the URL, but about indicating to search engines that these different URLs point to the same content, using a canonical tag (see e.g. support.google.com/webmasters/answer/139066). Agreed, people who are really searching for a specific answer to a question on Google (does anyone even do this?) might be misled to the first page, but many more people will search for questions and get duplicate results. This happens on a daily basis for me right now. And bear in mind ?rq=1 and similar query parameters, which cause the same problem. Commented Feb 11, 2019 at 14:50
  • @SonictheIntrovertedHedgehog Actually, the Related section pointed me to this post, meta.stackexchange.com/q/130749/384103, regarding multi-page questions, which, from my point of view, comes to the same conclusion as I do. It even shows a possible solution (the "show all" page) that could resolve both issues. Yet, since then, someone has decided to remove the canonical tags altogether, and I haven't yet found a post where that is explained. Anyway, at least the query parameters rq, noredirect, lq etc., which apparently make no difference to the content, should be handled. Commented Feb 11, 2019 at 17:16
  • What's also annoying is that sometimes one of the results has a different title because it's an earlier cached version.
    – Laurel
    Commented Feb 12, 2019 at 3:00
  • It dawns on me that the root cause of this issue is the Hot Network Questions section, which is regenerated randomly on every page load. Normally, Google would probably detect duplicate pages by itself, but it can't just arbitrarily discard that "new content" it sees with every access (be it under a different URL or not). I think there is no way to exclude this section from indexing, so again, rel="canonical" is the only solution. But apparently nobody cares... Commented Feb 17, 2019 at 18:29
  • 4
    This came up in other investigations today - it was a series of combined issues that meant canonicals weren't being rendered, or were being rendered incorrectly (e.g. per-answer). This should be fixed up first thing Monday. Commented Feb 1, 2020 at 2:41
  • Okay, I lied about the Monday part entirely. A fix went out and was then rolled back. The issue we stumbled on here is more systemic. A much more global fix is on track to go out tomorrow to fix many other cases. Sorry about that, almost done though! Commented Feb 3, 2020 at 22:11
  • @Stijn and also a system to avoid asking duplicate questions
    – user718628
    Commented Mar 9, 2020 at 20:47

1 Answer

13

A fix for this went live last week - canonical URLs should now be working properly across the network. Canonicals also now point to the question rather than being per-answer, which should clean up Google's results over time.

This won't be immediate as there are tens of millions of links to crawl, but I'm keeping an eye on their search console dashboards and trying to remedy what I can.
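
To make "canonical is the question, not per-answer" concrete, here is a minimal Python sketch. It is not our actual code. The URL layout (/questions/<id>/<slug>, optionally followed by an answer id) and both function names are assumptions made for illustration. Note that it drops every query parameter, not just rq, lq and noredirect.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_question_url(url: str) -> str:
    """Map any variant of a question URL to one per-question URL (illustrative only)."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumed layout: /questions/<question id>/<slug>[/<answer id>]
    if len(segments) >= 3 and segments[0] == "questions" and segments[1].isdigit():
        path = "/" + "/".join(segments[:3])  # keep id and slug, drop any answer id
    else:
        path = parts.path  # not a question page: leave the path alone
    # Drop the whole query string (rq, lq, noredirect, ...) and the #fragment.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Render the <link rel="canonical"> element for the page head."""
    href = escape(canonical_question_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'
```

With this sketch, an answer link such as https://example.com/questions/123/some-title/456?rq=1#456 and the plain question link both produce the same tag, which is what lets a search engine fold the duplicates together.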


Some Details:

For the curious, the root of this was several issues that combined into craziness. Attempted fixes years ago for proper one-boxing of answers on various social platforms (which never actually worked when we dug in) were why canonicals were being set per-answer.

Additionally, a subtle and important behavior in ASP.NET MVC, which we use, is how ViewData works. Because this collection has copy-on-write dictionary semantics, when we set a canonical on a top-level .ViewContext.ViewData, it should be set for the master layout view we're in. And that worked for many, many years, until a call was added further down, ahead of the canonical, which caused that copy-on-write style dictionary to fork in the view chain. This meant we wrote the canonical the master view should be using just fine, but we were no longer writing it to the same dictionary the master view was using.
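
For anyone who wants to see the shape of the bug, here is a toy Python model of the forking behavior described above. It is not ASP.NET MVC's implementation: the class, its share() and fork() methods, and the render() helper are all hypothetical, and fork() simply stands in for whatever earlier call split the dictionary.

```python
class ViewDataHandle:
    """Toy stand-in for a view-data dictionary (hypothetical, not the ASP.NET MVC type)."""

    def __init__(self, store=None):
        self._store = {} if store is None else store

    def share(self):
        """Return a handle backed by the same underlying storage."""
        return ViewDataHandle(self._store)

    def fork(self):
        """Detach this handle onto a private copy of the storage."""
        self._store = dict(self._store)

    def __setitem__(self, key, value):
        self._store[key] = value

    def get(self, key, default=None):
        return self._store.get(key, default)


def render(fork_before_write: bool):
    """Return the canonical the layout sees after the view has set it."""
    layout = ViewDataHandle()
    view = layout.share()  # the view starts out sharing the layout's storage
    if fork_before_write:
        view.fork()  # stands in for the earlier call that split the chain
    view["canonical"] = "https://example.com/questions/123/some-title"
    return layout.get("canonical")
```

In this model, render(False) returns the URL because the write lands in shared storage, while render(True) returns None: the write succeeds, but into a copy the layout never reads.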

It's a crazy subtle behavior that's always been there, but it bit us here. One of the reasons my team is working hard on moving us to .NET Core (almost done for Q&A!) is the ability to have things like this in a test, and in a fully supported way. Testing views in MVC 5 is a nightmarish mess...but in Core it's beautiful. This will be one of the first tests we add to the new suite.
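
As a language-agnostic illustration of the kind of check meant here (a Python sketch using only the standard library, not the test we will actually write), a test could parse the rendered page and assert that the canonical link made it into the output. The class and function names are made up for this example.

```python
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_tokens:
            self.hrefs.append(attributes.get("href"))


def canonical_hrefs(html_text: str) -> list:
    """Return all canonical hrefs found in the given HTML text."""
    collector = CanonicalCollector()
    collector.feed(html_text)
    collector.close()
    return collector.hrefs
```

A test would then render a question page and an answer link to the same question, and assert that canonical_hrefs() returns exactly one entry, the question URL, for both.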

  • 1
    Thank you for all you and your team have done to address this! Commented Feb 11, 2020 at 14:32
