PyQuery object doesn't use the correct parser #245

spookylukey · 2022-08-24T16:06:03Z

The pyquery object that WebTest constructs does not specify the parser:

Line 502 in 561ef78

d = PyQuery(self.testbody)

This means pyquery falls back to its default, which tries the XML parser first:

https://pyquery.readthedocs.io/en/latest/tips.html#using-different-parsers

For HTML responses, we should be using the HTML parser. For an example of the difference it makes, consider code inside a <script> tag:

>>> PyQuery('<html><body><script>var x = "<span></span>"; </script></body></html>').find('span')
[<span>]
>>> PyQuery('<html><body><script>var x = "<span></span>"; </script></body></html>', parser='html').find('span')
[]

The latter is what we want, and agrees with what browsers do.
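
One way the parser could be chosen per response is sketched below. This is only an illustration under stated assumptions, not the change that landed in WebTest: `choose_parser` is a hypothetical helper, and treating only `text/html` as HTML is an assumption made for the sketch.

```python
def choose_parser(content_type: str) -> str:
    """Return a pyquery parser name for a Content-Type header value.

    Hypothetical helper for illustration; not part of WebTest or pyquery.
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    # Assumption: only text/html selects the HTML parser; every other
    # type (including application/xhtml+xml) stays on the XML parser.
    return "html" if mime == "text/html" else "xml"


if __name__ == "__main__":
    for header in ("text/html; charset=utf-8", "application/xml"):
        print(header, "->", choose_parser(header))
```

The returned name would then be passed through the same `parser=` keyword used in the example above, e.g. `PyQuery(body, parser=choose_parser(content_type))`.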

gawel closed this as completed in 30d4a7b Jan 3, 2023
