7

Suppose I have this HTML in a string:

<meta http-equiv="Set-Cookie" content="COOKIE1_VALUE_HERE">
<meta http-equiv="Set-Cookie" content="COOKIE2_VALUE_HERE">
<meta http-equiv="Set-Cookie" content="COOKIE3_VALUE_HERE">

And I have this regular expression, to get the values inside the content attributes:

/<meta http-equiv=[\"']?set-cookie[\"']? content=[\"'](.*)[\"'].*>/ig

How do I, in JavaScript, get all three content values?

I've tried:

var setCookieMetaRegExp = /<meta http-equiv=[\"']?set-cookie[\"']? content=[\"'](.*)[\"'].*>/ig;
var match = setCookieMetaRegExp.exec(htmlstring);

but match doesn't contain the values I need. Help?

Note: the regular expression is already correct (see here). I just need to match it against the string.

Note: I'm using NodeJS.

0

6 Answers

3

You were so close! All that needs to be done now is a simple loop:

var htmlString = '<meta http-equiv="Set-Cookie" content="COOKIE1_VALUE_HERE">\n'+
'<meta http-equiv="Set-Cookie" content="COOKIE2_VALUE_HERE">\n'+
'<meta http-equiv="Set-Cookie" content="COOKIE3_VALUE_HERE">\n';

var setCookieMetaRegExp = /<meta http-equiv=[\"']?set-cookie[\"']? content=[\"'](.*)[\"'].*>/ig;

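var matches = [];
var match;
// with the g flag, each exec() call resumes from lastIndex and returns null when done
while ((match = setCookieMetaRegExp.exec(htmlString)) !== null) {
  matches.push(match[1]); // first capture group (RegExp.$1 is a deprecated legacy property)
}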

//contains all cookie values
console.log(matches);

JSBIN: http://jsbin.com/OpepUjeW/1/edit?js,console
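For comparison, the same collection can be written without an explicit loop. This is only a sketch, assuming a runtime that supports String.prototype.matchAll (Node.js 12 or later, which is newer than this thread); the variable name cookieValues is made up for the illustration.

```javascript
// Sketch: collect every captured content value with String.prototype.matchAll.
// Assumption: the runtime supports matchAll; the regex must carry the g flag.
var htmlString =
  '<meta http-equiv="Set-Cookie" content="COOKIE1_VALUE_HERE">\n' +
  '<meta http-equiv="Set-Cookie" content="COOKIE2_VALUE_HERE">\n' +
  '<meta http-equiv="Set-Cookie" content="COOKIE3_VALUE_HERE">\n';

var setCookieMetaRegExp = /<meta http-equiv=["']?set-cookie["']? content=["'](.*)["'].*>/ig;

// Each item yielded by matchAll is a match array; index 1 is the first capture group.
var cookieValues = Array.from(htmlString.matchAll(setCookieMetaRegExp), function (m) {
  return m[1];
});

console.log(cookieValues);
// → [ 'COOKIE1_VALUE_HERE', 'COOKIE2_VALUE_HERE', 'COOKIE3_VALUE_HERE' ]
```

matchAll works on a copy of the regex, so unlike exec() it leaves lastIndex on setCookieMetaRegExp untouched.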

1
  • So this only seems to work when content comes at the end of the tag. If an attribute follows content then it picks it up in the regexp. How do you tell the matching to stop when it reaches the closing quote? Here is my bin of the problem... jsbin.com/xebimu/1/edit?js,console Commented Nov 7, 2014 at 21:22
3

Keep it simple:

/content=\"(.*?)\">/gi

demo: http://regex101.com/r/dF9cD8

Update (based on your comment):

/<meta http-equiv=\"Set-Cookie\" content=\"(.*?)\">/gi

runs only on this exact string. Demo: http://regex101.com/r/pT0fC2

You really need the (.*?) with the question mark; otherwise the greedy (.*) keeps going until the last "> on the line (the dot does not match a newline). The ? makes the quantifier lazy, so the match stops at the first "> it reaches (you can change the quote to [\"'] if you want to match either single or double quotes).
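To make the greedy versus lazy difference concrete, here is a small sketch. The trailing id attribute is a made-up example, not part of the original HTML, and the negated character class is an alternative that is not taken from this answer.

```javascript
// Sketch: greedy vs. lazy capture when another attribute follows content.
// Assumption: the id attribute is hypothetical, added only to show the problem.
var tag = '<meta http-equiv="Set-Cookie" content="COOKIE1_VALUE_HERE" id="extra">';

var greedy = /content=["'](.*)["']/i;      // .* runs on to the last quote on the line
var lazy = /content=["'](.*?)["']/i;       // .*? stops at the first closing quote
var negated = /content=["']([^"']*)["']/i; // [^"']* can never cross a quote

console.log(greedy.exec(tag)[1]);  // → COOKIE1_VALUE_HERE" id="extra
console.log(lazy.exec(tag)[1]);    // → COOKIE1_VALUE_HERE
console.log(negated.exec(tag)[1]); // → COOKIE1_VALUE_HERE
```

The negated class assumes the value itself contains neither kind of quote; if it may, match the opening quote and reuse it with a backreference instead.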

1
  • I need to run the regular expression specifically on set-cookie, and the HTML string is a complete HTML document
    – Obay
    Commented Jan 24, 2014 at 3:20
1

No need for regular expressions; just do some DOM work:

var head = document.createElement("head");
head.innerHTML = '<meta http-equiv="Set-Cookie" content="COOKIE1_VALUE_HERE"><meta http-equiv="Set-Cookie" content="COOKIE2_VALUE_HERE"><meta http-equiv="Set-Cookie" content="COOKIE3_VALUE_HERE">';

var metaNodes = head.childNodes;
var contentValues = []; // collects the content attribute of each <meta> node
for (var i = 0; i < metaNodes.length; i++) {
   contentValues.push(metaNodes[i].attributes.getNamedItem("content").value);
}

Since you are using NodeJS, and BlackSheep mentions cheerio, here is the same thing in that library's syntax, if you wish to use it:

//Assume htmlString contains the html
var cheerio = require('cheerio'),
$ = cheerio.load(htmlString);
var values=[];
$("meta").each(function(i, elem) {
  values[i] = $(this).attr("content");
});
5
  • @Obay you might want to mention that you are using NodeJS in your question then lol Commented Jan 24, 2014 at 3:20
  • 1
    @Obay Why don't you use cheerio lib?
    – Ram
    Commented Jan 24, 2014 at 3:21
  • Sorry about that! Will modify :P
    – Obay
    Commented Jan 24, 2014 at 3:22
  • @Obay edited to include snippet on how to do it with cheerio lib since BlackSheep mentions that lib. Commented Jan 24, 2014 at 3:32
  • 1
    While the OP may not relate to browsers, it's worth noting for this answer that not all browsers allow setting of the innerHTML property of head elements (e.g. IE).
    – RobG
    Commented Jan 24, 2014 at 4:49
1

Try this

(?:class|href)([\s='"./]+)([\w-./?=&\\#"]+)((['#\\&?=/".\w\d]+|[\w)('-."\s]+)['"]|)

example :

function getTagAttribute(tag, attribute){    
    var regKey = '(?:' + attribute + ')([\\s=\'"./]+)([\\w-./?=\\#"]+)(([\'#\\&?=/".\\w\\d]+|[\\w)(\'-."\\s]+)[\'"]|)'
    var regExp = new RegExp(regKey,'g');
    var regResult = regExp.exec(tag);   
    if(regResult && regResult.length>0){                        
        var splitKey = '(?:(' + attribute + ')+(|\\s)+([=])+(|\\s|[\'"])+)|(?:([\\s\'"]+)$)'                
        return regResult[0].replace(new RegExp(splitKey,'g'),'');
    }else{
        return '';
    }
}


getTagAttribute('<a href  =   "./test.html#bir/deneme/?k=1&v=1"    class=   "xyz_bir-ahmet abc">','href');

// returns "./test.html#bir/deneme/?k=1&v=1"

Live Regexp101

Live JS Script Example

0

Try this:

// use a regex literal, not a quoted string; with the g flag, match() returns the full matched tags
var setCookieMetaRegExp = /<meta http-equiv=[\"']?set-cookie[\"']? content=[\"'](.*)[\"'].*>/ig;
var match = stringToFindPartFrom.match(setCookieMetaRegExp);
2
  • It doesn't work, even when changing the parameter of exec() into htmlstring
    – Obay
    Commented Jan 24, 2014 at 3:21
  • I tried the modified code, it doesn't work. Uncaught TypeError
    – Obay
    Commented Jan 24, 2014 at 3:23
0

Try this:

var myString = '<meta http-equiv="Set-Cookie" content="COOKIE2_VALUE_HERE">';
var myRegexp = /<meta http-equiv=[\"']?set-cookie[\"']? content=[\"'](.*)[\"'].*>/ig;
var match = myRegexp.exec(myString);
alert(match[1]); // should show you the part
1
  • Use single quotes around the string; otherwise you will get errors because the string itself contains double quotes. Commented Jan 24, 2014 at 3:35
