Closed Bug 1878690 Opened 6 months ago Closed 2 months ago

Returning early for error pages in "onStateChange" stop of the WebProgressListener leads to not yet loaded error pages

Categories

(Remote Protocol :: Agent, defect, P2)

defect
Points:
3

Tracking

(firefox128 fixed)

RESOLVED FIXED
128 Branch
Tracking Status
firefox128 --- fixed

People

(Reporter: whimboo, Assigned: whimboo)

References

(Blocks 1 open bug)

Details

(Whiteboard: [webdriver:m11][webdriver:relnote])

Attachments

(2 files, 1 obsolete file)

Currently we run a check for Ci.nsIWebProgressListener.LOCATION_CHANGE_ERROR_PAGE in the onLocationChange handler of the WebProgressListener to detect if an error page has been loaded:

https://searchfox.org/mozilla-central/rev/26dd94a024f21971e539a8650c2f177323c98fe2/remote/shared/Navigate.sys.mjs#300-306

This is not accurate given that at this point the actual page hasn't finished loading yet, and accessing baseURI will not yet report the final URL - which might also not have synced via IPC yet. Instead if needed we could set an internal flag to indicate that an error page is about to load. Once we get the stop state we could then correctly validate the document URI.

Examples where this can be seen are safe browsing tests.

This bug actually blocks the work for Marionette (bug 1664165) to use the WebProgress listener implementation.

Points: --- → 2
Priority: -- → P2
Whiteboard: [webdriver:m10]

I was actually wrong with my above assumption. Further debugging and testing has shown that the onLocationChange callback is actually called after onStateChange with state=stop. This is because we actually have two navigations here. The first one is for the real request, while the second one is for the load of the error page. As of now it's not clear why we sometimes get another state=stop after the onLocationChange and sometimes not.

Because of the above behavior and the fact that we also raise an error in state=stop, we are stopping the listener early before the actual error page has been loaded, which actually leads to a not yet expected documentURI, and a possible overlap of navigations if the test starts another navigation right away.

Summary: Using error page detection only in "onLocationChange" of the WebProgressListener leads to invalid results → Returning early for error pages in "onStateChange" stop of the WebProgressListener leads to not yet loaded error pages

The severity field is not set for this bug.
:whimboo, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(hskupin)
Severity: -- → S3
Flags: needinfo?(hskupin)
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Whiteboard: [webdriver:m10] → [webdriver:m11]
Attachment #9390144 - Attachment is obsolete: true
Attachment #9390145 - Attachment description: WIP: Bug 1878690 - [remote] Improve progress listener for error pages. → Bug 1878690 - [remote] Improve progress listener for error pages.

Hi Nika, while implementing a solution for this issue it's still unclear to me if there is a possibility to know at the time when the onStateChange notification of the WebProgress listener is fired if there might be a following onLocationChange notification or not. It is really important for WebDriver to wait until the target page is fully loaded, because otherwise failures can occur like retrieving elements from the page.

I've also talked to Olli and I checked the various flags and states for different errors as sent via the onStateChange event, but there is no indication for it. As of now I have seen that errors like NS_BINDING_ABORTED and NS_ERROR_ABORTED do not trigger loading an error page. But there are probably way more kind of errors that behave the same way. And I'm not eager to blacklist all of them with the risk that we still have some uncovered scenarios which could cause our navigation code to wait forever or timeout after a number of seconds.

Flags: needinfo?(nika)

Error page loading is handled in https://searchfox.org/mozilla-central/rev/dd6e430c1bc2db90d9b3b1dd7e5215b4edc4d51a/docshell/base/nsDocShell.cpp#6314 in response to the OnStateChange call from the web progress listener itself (https://searchfox.org/mozilla-central/rev/dd6e430c1bc2db90d9b3b1dd7e5215b4edc4d51a/docshell/base/nsDocShell.cpp#5637), so the decision about whether to load an error page is made during the OnStateChange call.

This is a normal progress listener added to the docshell from within nsDocShell::Create (https://searchfox.org/mozilla-central/rev/dd6e430c1bc2db90d9b3b1dd7e5215b4edc4d51a/docshell/base/nsDocShell.cpp#488-490), so there's not really a clean way to propagate flags from within it out to other listeners.

I do think that the nsDocShell listener will have run before any other listeners, so by the time the listener is running state should have updated such that the error page load is starting. Only a limited set of flags for that are visible to another process, however there's a chance that if we're loading an error page we'll have already started the load, so the nsIWebProgress::loadType value may have been updated already to LOAD_ERROR_PAGE. Might be worth a check to see if that's the case, as we should send that info to the parent and then make it available from BrowsingContextWebProgress (https://searchfox.org/mozilla-central/rev/dd6e430c1bc2db90d9b3b1dd7e5215b4edc4d51a/dom/ipc/BrowserParent.cpp#3148-3151).

Flags: needinfo?(nika)

Thank you for the details Nika! I checked if the webprogress for the canonical browsing context that we observe has different load types set, but as it looks like this is not the case. When I reproduce those situations I can see that we get two different values in onStateChange:

  1. 0x100001: Set when we indeed see a load of an error page and get a onLocationChange notification
  2. 0x200001: Set when the navigation is aborted

Both the 0x200000 and 0x100000 flags don't have anything to do with error pages but are related to URI fixup:
https://searchfox.org/mozilla-central/rev/7a8904165618818f73ab7fc692ace4a57ecd38c9/docshell/base/nsIWebNavigation.idl#233-242

So I don't think that both are helpful and that we can use the different values to assume a location change.

Also LOAD_FLAGS_ERROR_PAGE is defined in nsDocShellLoadTypes as 0x0001U which is similar to LOAD_CMD_NORMAL that is defined as 0x1 as well. So maybe we hide the state of loading the error page behind the normal load?

Flags: needinfo?(nika)

While talking to Olli today I noticed that I've made a mistake when writing my last comment. It's actually not 0x100001 that I get when an error page is loaded but 0x10001. I was too eagerly copying/pasting because of the other value.

Knowing that the LOAD_FLAGS_ERROR_PAGE constant has a value of 0x0001U and that it gets shifted by 16 when constructing the load type the result is 0x10000, which perfectly matches the flag that I get.

Taking this into account I've updated my local changes and as it looks like all the tests that we run are passing. That means I should be fine and not blocked anymore to get this bug fixed. Thanks again to Nika and Olli for their help!

Flags: needinfo?(nika)

Adjusting the points based on the fact that this bug was a bit more complicated to get fixed.

Points: 2 → 3
Pushed by hskupin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5dc3d359cbe9
[remote] Improve progress listener for error pages. r=webdriver-reviewers,jdescottes
https://hg.mozilla.org/integration/autoland/rev/3d4b2ae33187
[wdspec] Improve navigation error tests for loaded about:error pages. r=webdriver-reviewers,jdescottes
Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 128 Branch
Whiteboard: [webdriver:m11] → [webdriver:m11][webdriver:relnote]
You need to log in before you can comment on or make changes to this bug.