Questions tagged [lxml]
lxml is a full-featured, high performance Python library for processing XML and HTML.
5,469 questions
-2 votes, 0 answers, 17 views
Issue with lxml Freezing in requests_html After Multiple Iterations in Python
I'm facing an issue with my Python code that opens multiple HTML pages and extracts elements of a specific class, if they exist. The code works fine for a few iterations but eventually freezes after ...
1 vote, 1 answer, 36 views
How to fix "XPath syntax error: Invalid expression" when using etree.XPath in Python while using union operator "|"
I'm trying to compile an XPath expression using etree.XPath in Python, but I'm encountering a syntax error. Here's the code snippet:
XPATH = '//bridge-domain/(bridge-domain-group-name|bridge-domain-...
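A possible explanation, assuming the expression is evaluated by lxml's XPath 1.0 engine: XPath 1.0 does not accept a parenthesized union as a step after `/`. The sketch below shows two common rewrites. The sample document and the second child name `other-child` are hypothetical, since the original expression is truncated.

```python
from lxml import etree

# Hypothetical sample document; only bridge-domain-group-name comes from the question.
doc = etree.fromstring(
    "<root><bridge-domain>"
    "<bridge-domain-group-name>g1</bridge-domain-group-name>"
    "<other-child>x</other-child>"
    "<ignored>y</ignored>"
    "</bridge-domain></root>"
)

# A parenthesized union used as a path step is rejected when the expression is compiled.
try:
    etree.XPath("//bridge-domain/(bridge-domain-group-name|other-child)")
    compiled_ok = True
except etree.XPathSyntaxError:
    compiled_ok = False

# Rewrite 1: select all children, then filter them with the self:: axis.
by_self = etree.XPath(
    "//bridge-domain/*[self::bridge-domain-group-name or self::other-child]"
)

# Rewrite 2: take the union of two complete paths.
by_union = etree.XPath(
    "//bridge-domain/bridge-domain-group-name | //bridge-domain/other-child"
)

tags_by_self = [el.tag for el in by_self(doc)]
tags_by_union = [el.tag for el in by_union(doc)]
```

Both rewrites return the matching elements in document order, so they can usually be swapped for each other.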
0 votes, 0 answers, 24 views
LXML automatically converts Windows newlines [duplicate]
I am trying to parse an XML string that contains Windows newlines (the CR, LF pair):
from lxml.etree import XML
root = XML('<root>_\r\n_\n_</root>')
print(
[ord(char) for char in root....
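For context, the XML specification requires parsers to normalize a CR LF pair to a single LF, so this behavior is not specific to lxml. The sketch below assumes the truncated snippet inspects `root.text`. It also shows that a carriage return written as the character reference `&#13;` survives parsing.

```python
from lxml import etree

# The literal CR LF pair is normalized to a single LF by the parser.
root = etree.XML("<root>_\r\n_\n_</root>")
normalized = [ord(ch) for ch in root.text]

# A character reference is not subject to line-end normalization,
# so the carriage return (code point 13) is kept.
kept = etree.XML("<root>_&#13;\n_</root>")
preserved = [ord(ch) for ch in kept.text]
```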
1 vote, 2 answers, 89 views
Find the index of a child in lxml
I am using Python 3.12 and lxml.
I want to find a particular tag, and I can do it with elem.find("tag"). elem is of type Element.
But I want to move child elements of this child into the ...
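A minimal sketch of one way to do this, assuming the goal is to move the found child's own children into its parent at the child's position (the question text is truncated, so this is a guess at the intent). The sample document is hypothetical. `index()` and `insert()` are regular methods of lxml elements.

```python
from lxml import etree

# Hypothetical sample tree.
parent = etree.fromstring("<root><a/><tag><x/><y/></tag><b/></root>")

child = parent.find("tag")
position = parent.index(child)  # index of the child among its siblings

# Inserting an element that already has a parent moves it,
# so iterate over a copy of the child list.
for offset, grandchild in enumerate(list(child)):
    parent.insert(position + offset, grandchild)

# The now-empty wrapper element can be dropped.
parent.remove(child)

tags = [el.tag for el in parent]
```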
-1 votes, 1 answer, 53 views
XPath Python Error: The 'list' object has no attribute 'xpath'
I'm brand new to Python and web scraping and cannot figure out what is wrong with my code for the life of me. Is it because I'm scraping just one element and not a list? I've checked my XPaths so many ...
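This error usually means `.xpath()` was called on the result of an earlier `.xpath()` call. In lxml, `xpath()` returns a list even when only one element matches, so you have to index into the list or loop over it first. A minimal sketch with a hypothetical HTML fragment (the question's own code is not shown here):

```python
from lxml import html

# Hypothetical markup standing in for the scraped page.
doc = html.fromstring('<div><ul class="items"><li>a</li><li>b</li></ul></div>')

matches = doc.xpath('//ul[@class="items"]')  # always a list
is_list = isinstance(matches, list)

# Pick one element out of the list before calling xpath() again.
first = matches[0]
texts = first.xpath("./li/text()")
```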
0 votes, 1 answer, 47 views
Cannot extract element xpath from docx
With python-docx-oss I use the following code (I want to write the Heading 3 style to a TXT file and include only the outline/level of numbering that is level 3, i.e. x.x.x):
from docx import Document
...
1 vote, 3 answers, 124 views
Code reads from an XML file but can't find anything
I want to read an XML file and write the information to a MySQL database (localhost).
The table:
CREATE TABLE ClimateMonitoring (
id INT PRIMARY KEY,
name VARCHAR(255),
label VARCHAR(255),
...
0 votes, 1 answer, 86 views
How to retrieve all the text (including the tags/child elements) from an element using lxml
xml_content = <root><para>Brother set had private his letters observe outward resolve. Shutters ye marriage to throwing we as. <child1>Effect in if agreed he wished wanted admire ...
0 votes, 1 answer, 241 views
How to solve the xmlsec Error: (100, 'lxml & xmlsec libxml2 library version mismatch')
I am facing the following error when deploying my Django (version 4.1) backend. I have the following Dockerfile (some non-relevant parts omitted) and need to install python3-saml (which has ...
0 votes, 0 answers, 40 views
In Python, is it possible to control xmlns attributes ordering when printing [duplicate]
Say I have a valid XML file. The root element may have various xmlns attributes, which are in a random order. I want to write out the same XML back but with a predictable order for xmlns attributes (...
0 votes, 1 answer, 75 views
imported module not found in class in Pyodide
I am trying to use Pyodide with lxml and urllib3. For reasons I don't understand, when I try to use urllib3 in a class that is meant to be a Resolver for lxml etree, I get the error NameError: name '...
0 votes, 1 answer, 561 views
I encounter the error xmlsec.InternalError: (-1, 'lxml & xmlsec libxml2 library version mismatch')
When I run sentry devserver --workers on my computer, I encounter the error xmlsec.InternalError: (-1, 'lxml & xmlsec libxml2 library version mismatch'). My computer is an M3 Pro Mac running macOS, with ...
0 votes, 2 answers, 89 views
Using XPath with Python's lxml module, how to validate a node's path?
I'm using XPath with Python's lxml module, and have the following XML.
<library>
<section1>
<book>
<title>Harry Potter</title>
<author>J.K. ...
0 votes, 1 answer, 70 views
xml.etree.ElementTree.ParseError: unclosed token with XML Library [duplicate]
I am getting the following error when trying to parse XML using Python's XML library.
xml.etree.ElementTree.ParseError: unclosed token
I am using the following code to parse the XML string.
from xml.etree ...
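For illustration only, one way this error can come up is when the input ends in the middle of a tag, for example after a truncated download or read. The sketch below reproduces that case with a hypothetical string. It is not necessarily the cause in the question, whose code is truncated.

```python
import xml.etree.ElementTree as ET

# Hypothetical input that stops inside the second <item> tag.
truncated = "<root><item>value</item><item"

try:
    ET.fromstring(truncated)
    message = ""
except ET.ParseError as exc:
    # The message also carries the line and column where parsing stopped.
    message = str(exc)
```

Checking the reported line and column against the raw input is often the quickest way to find where the data was cut off.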
0 votes, 1 answer, 29 views
lxml.etree.tostring show both start and end tags for empty node [duplicate]
from lxml import etree
tree = etree.XML('<foo class="abc"></foo>')
print(etree.tostring(tree, encoding='utf-8').decode('utf-8'))
The above code shows the following.
<foo ...
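A commonly used workaround, sketched here under the assumption that the goal is to get `<foo class="abc"></foo>` instead of the self-closing form: set the element's text to an empty string. The element then has an empty text node, and the serializer writes separate start and end tags.

```python
from lxml import etree

tree = etree.XML('<foo class="abc"></foo>')

# An element with no children and no text is serialized as a self-closing tag.
self_closing = etree.tostring(tree, encoding="unicode")

# An empty string (not None) gives the element an empty text node.
tree.text = ""
expanded = etree.tostring(tree, encoding="unicode")
```

For a whole document, the same assignment can be applied to every element whose `text` is `None` while looping over `tree.iter()`.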