10

I have the following HTML:

<!--
<option value="HVAC">HVAC</option>
<option value="Cooling">|---Cooling</option>
<option value="Heating">|---Heating</option>
-->
....

I fetch this file dynamically using jQuery's get method and store it in a string variable named load_types.

How can I strip the HTML comment tags and everything outside of them? I only want the inside HTML:

<option value="HVAC">HVAC</option>
<option value="Cooling">|---Cooling</option>
<option value="Heating">|---Heating</option>

I tried to use the solutions here but nothing worked properly--I just get null as a match.

Thanks for the help!


2 Answers

17

Please never use regex to parse HTML. You can use the following instead:

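// Parse the string into a detached <div>, then pick out the first comment node.
var div = $("<div>").html(load_types),
    comment = div.contents().filter(function() {
        return this.nodeType === 8; // 8 is Node.COMMENT_NODE
    }).get(0);

// nodeValue holds the text between <!-- and -->.
console.log(comment.nodeValue);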

DEMO: http://jsfiddle.net/HHtW7/
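
For readers without jQuery, the same idea can be sketched in plain JavaScript. This is only an illustration, not the code behind the fiddles linked in the comments: the helper name collectComments is hypothetical, and it assumes it is handed a DOM node (or any object exposing nodeType, nodeValue and childNodes). Unlike the snippet above, it also descends into child elements, so it finds comments that are not at the top level.

```javascript
// Collect the text of every comment node under `root`, depth-first.
// `collectComments` is a hypothetical helper name used for illustration.
function collectComments(root) {
    var COMMENT_NODE = 8; // same value as Node.COMMENT_NODE
    var found = [];
    var children = root.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        var node = children[i];
        if (node.nodeType === COMMENT_NODE) {
            // nodeValue is the text between <!-- and -->
            found.push(node.nodeValue);
        } else {
            // Recurse so nested comments are found as well.
            found = found.concat(collectComments(node));
        }
    }
    return found;
}

// Browser usage (assumes load_types holds the fetched string):
// var holder = document.createElement("div");
// holder.innerHTML = load_types;
// var inner = collectComments(holder)[0];
```

The function returns an empty array when no comment is present, so the caller can check the length before reading the first entry.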

10
  • 5
    Native JavaScript solution: jsfiddle.net/4g3FT. Native JS solution assuming ES5: jsfiddle.net/TUR65 Commented May 25, 2013 at 20:19
  • 1
    @MatíasFidemraizer Do I really need to post a really complicated counterexample, or is something trivial like <!--- <script>var str="hello ---> world";</script> ---> enough to convince you that a regex is a bad tool for this sort of thing? Commented May 26, 2013 at 19:41
  • 1
    @MatíasFidemraizer that thing may be regular, but the context in which to find it is certainly not
    – PeeHaa
    Commented May 26, 2013 at 19:44
  • 2
    You're reaching for an edge case. The OP's case is regular, very, very regular. The question wasn't "I want to parse any HTML" but just the HTML shown in the sample code. Is that regular? It is very regular! I know that you wouldn't build a full HTML parser out of regexes, but regexes could be enough for this case. Commented May 27, 2013 at 6:16
  • 2
    @MatíasFidemraizer Regular expressions are an extremely poor choice for extracting a string out of a comment. If the OP's string changes even slightly, the regular expression you suggested earlier can break. While matching a comment in a string like the OP's surely is regular (just imagine the automaton), in practice it's an extremely poor idea, especially since this ability is already built into the browser and any slightly competent JS coder who knows DOM 101 can do it effortlessly. Commented May 27, 2013 at 19:32
0

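You can get the HTML of the parent element that contains the comment and call .replace("<!--","").replace("-->", "") on it, which removes the comment delimiters. You can then append the resulting markup to another parent, replace your current markup with it, or create a new parent for it. Note that .replace with a string pattern only replaces the first occurrence, so this handles a single comment and leaves any text outside the comment in place.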

This will allow you to use the jQuery selectors to retrieve the required data.

var comment = '<!-- <option value="HVAC">HVAC</option> <option value="Cooling">|---Cooling</option> <option value="Heating">|---Heating</option> --> ';

jQuery("#juni").append("<select>" + comment.replace("<!--", "").replace("-->", "") + "</select>");
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div id="juni"></div>
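
Since the question also asks to drop everything outside the comment, a string-only variant can slice between the delimiters instead of deleting them. The following is a minimal sketch under stated assumptions: extractCommentBody is a hypothetical helper name, it only looks at the first comment in the string, and it treats the first --> after the opener as the end of that comment.

```javascript
// Return the text between the first "<!--" and the next "-->",
// or null when the string has no complete comment.
// `extractCommentBody` is a hypothetical helper name used for illustration.
function extractCommentBody(source) {
    var open = "<!--";
    var close = "-->";
    var start = source.indexOf(open);
    if (start === -1) {
        return null; // no opening delimiter
    }
    var end = source.indexOf(close, start + open.length);
    if (end === -1) {
        return null; // comment is never closed
    }
    // Drop the delimiters and the surrounding whitespace/newlines.
    return source.slice(start + open.length, end).trim();
}

// Usage with the markup from the question (load_types is the fetched string):
// var options = extractCommentBody(load_types);
// if (options !== null) {
//     jQuery("#juni").append("<select>" + options + "</select>");
// }
```

Returning null for a missing or unterminated comment lets the caller decide what to do instead of appending an empty select.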
