
I have an XMLTV file. I need to get all of the programme sections for a specific channel. An example snippet:

    <programme start="2023031305000 -0400" end="2023031305300 -0400" channel="Bleah.us">
      <title>This is the Title</title>
      <desc>This is the Description</desc>
    </programme>

Simple enough if the programme section were always 4 lines: grep -A3 'channel="Bleah.us"'

But the programme section may vary in length. Sometimes it has additional sub-elements like <sub-title> and/or <category> and/or <icon>.

So my question: how do I find each line that contains 'channel="Bleah.us"' and print that line and all of the following lines until '</programme>' is found (and print that line too)? There could be 1 section, there could be 100 sections; I won't know in advance.

Thanks in advance!

  • Please provide expected output. Text or XML? Consider using xidel, xmllint or xmlstarlet Commented Mar 13, 2023 at 9:53

1 Answer

grep, sed, and awk are not the right tools for parsing XML. Instead, use a proper XML parser:

With xidel:

$ cat file.xml
<root>
    <programme start="2023031305000 -0400" end="2023031305300 -0400" channel="Bleah.us">
      <title>This is the Title</title>
      <desc>This is the Description</desc>
    </programme>
    <programme start="2023031305000 -0400" end="2023031305300 -0400" channel="a">
      <title>This is the Title</title>
      <desc>This is the Description</desc>
    </programme>
    <programme start="2023031305000 -0400" end="2023031305300 -0400" channel="s">
      <title>This is the Title</title>
      <desc>This is the Description</desc>
    </programme>
</root>
$ xidel --output-node-format=xml -e '//programme[@channel="Bleah.us"]' file.xml

With xmllint:

$ xmllint --xpath '//programme[@channel="Bleah.us"]' file.xml

Output

<programme start="2023031305000 -0400" end="2023031305300 -0400" channel="Bleah.us">
      <title>This is the Title</title>
      <desc>This is the Description</desc>
    </programme>
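
If no XML parser is installed, a line-based fallback can approximate what the question literally asks for. This is only a rough sketch, not a replacement for a real parser. It assumes each opening <programme ...> tag, including its channel attribute, sits on a single line, as in the question's snippet. The function name extract_channel and the sample file are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer): print every
# <programme> block for one channel using plain line matching.
# Assumption: the opening <programme ...> tag and its channel attribute
# are on one line. XML laid out differently will not be handled.
extract_channel() {
    # $1 = channel id, $2 = XMLTV file
    awk -v chan="$1" '
        # Start printing at an opening tag for the wanted channel.
        index($0, "<programme") && index($0, "channel=\"" chan "\"") { printing = 1 }
        # Print every line while inside a wanted block.
        printing { print }
        # Stop after the closing tag (also works if it is on the same line).
        index($0, "</programme>") { printing = 0 }
    ' "$2"
}

# Small sample file, for demonstration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<tv>
  <programme start="2023031305000 -0400" end="2023031305300 -0400" channel="Bleah.us">
    <title>This is the Title</title>
    <desc>This is the Description</desc>
  </programme>
  <programme start="2023031305000 -0400" end="2023031305300 -0400" channel="Other.us">
    <title>Another Title</title>
  </programme>
  <programme start="2023031305300 -0400" end="2023031306000 -0400" channel="Bleah.us">
    <title>This is the Title</title>
    <sub-title>Episode 2</sub-title>
    <category>News</category>
  </programme>
</tv>
EOF

result=$(extract_channel "Bleah.us" "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

The channel is matched with index() as a fixed string rather than a regular expression, so the dot in "Bleah.us" is not treated as a wildcard. The sketch prints blocks of any length, which covers the optional <sub-title>, <category> and <icon> elements, but it will misbehave on attributes split across lines, comments, or CDATA that happen to contain the same text.
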
  • Well, shoot - that makes it soooo much easier! I didn't have xidel as an option, but I was able to do the same thing with xmllint. Commented Mar 13, 2023 at 10:51
  • If your XML is well-formed, yes Commented Mar 13, 2023 at 10:59
  • Added xmllint way Commented Mar 13, 2023 at 12:06
