Using RegExp.exec with parenthesized groups to extract matches from a string


Sometimes you don't want to simply replace or remove a string; you want to extract the matches and process them. Here is an example of how to manipulate matches.

What is a match? When exec finds a substring that satisfies the entire regex, it produces a match. A match is an array whose first element is the whole matched substring, followed by one element for each pair of parentheses (capture group) in the regex.
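To make the shape of that array concrete, here is a minimal sketch. The input string and the names `sample`, `idRe`, `match` and `noMatch` are hypothetical and chosen only for illustration; they are not part of the link example on this page.

```javascript
// Hypothetical sample input, chosen only to illustrate the shape of a match.
var sample = 'id=42; id=7';
var idRe = /id=(\d+)/;      // no g flag: exec always returns the first match

var match = idRe.exec(sample);

console.log(match[0]);      // whole matched substring: 'id=42'
console.log(match[1]);      // content of the first pair of parentheses: '42'
console.log(match.index);   // position of the match in the input: 0
console.log(match.length);  // 2 (the whole match plus one group)

// When nothing matches, exec returns null rather than an empty array.
var noMatch = idRe.exec('no ids here');
console.log(noMatch);       // null
```

Because a failed search returns null, code that reads match[1] should check the result first.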

Imagine an HTML string:

<html>
<head></head>
<body>
  <h1>Example</h1>
  <p>Look a this great link : <a href="https://stackoverflow.com">Stackoverflow</a> http://anotherlinkoutsideatag</p>
  Copyright <a href="https://stackoverflow.com">Stackoverflow</a>
</body>

You want to extract every link that sits inside an a tag. A first attempt at the regex might be:

var re = /<a[^>]*href="https?:\/\/[^"]*"[^>]*>[^<]*<\/a>/g;

Now imagine you want both the href and the anchor text of each link, kept together. You could run a second regex on each match, or you can simply add parentheses (capture groups) to the first one:

var re = /<a[^>]*href="(https?:\/\/[^"]*)"[^>]*>([^<]*)<\/a>/g;
var str = '<html>\n    <head></head>\n    <body>\n        <h1>Example</h1>\n        <p>Look a this great link : <a href="https://stackoverflow.com">Stackoverflow</a> http://anotherlinkoutsideatag</p>\n\n        Copyright <a href="https://stackoverflow.com">Stackoverflow</a>\n    </body>';
var m;
var links = [];

// With the g flag, each exec call resumes at re.lastIndex and returns null once no match is left
while ((m = re.exec(str)) !== null) {
    // Guard against an infinite loop if the regex ever matches an empty string
    if (m.index === re.lastIndex) {
        re.lastIndex++;
    }
    console.log(m[0]); // The whole matched substring
    console.log(m[1]); // The href subpart
    console.log(m[2]); // The anchor subpart

    links.push({
      match : m[0],   // the entire match
      href : m[1],    // the first parenthesis => (https?:\/\/[^"]*), stops at the closing quote
      anchor : m[2],  // the second one => ([^<]*)
    });
}
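The loop above only advances because the regex carries the g flag: each call to exec starts searching at re.lastIndex and moves it past the match it returns. The following sketch shows that bookkeeping on a hypothetical two-word input; `wordRe`, `text`, `first`, `second` and `third` are illustrative names only.

```javascript
// Hypothetical input used only to trace lastIndex between exec calls.
var wordRe = /(\w+)/g;
var text = 'ab cd';

console.log(wordRe.lastIndex);            // 0 before the first call

var first = wordRe.exec(text);
console.log(first[1], wordRe.lastIndex);  // ab 2

var second = wordRe.exec(text);
console.log(second[1], wordRe.lastIndex); // cd 5

// No match is left: exec returns null and resets lastIndex to 0.
var third = wordRe.exec(text);
console.log(third, wordRe.lastIndex);     // null 0
```

Without the g flag, lastIndex is ignored and exec keeps returning the first match, so the same while loop would never end.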

At the end of the loop, you have an array of links, each with its anchor and href, which you can use to write Markdown, for example:

links.forEach(function(link) {
  console.log('[%s](%s)', link.anchor, link.href);
});
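As an alternative to the exec loop, environments that support ES2020 offer String.prototype.matchAll, which yields the same kind of match arrays through an iterator. The sketch below is not part of the original example: the shortened input `html` and the names `linkRe` and `found` are hypothetical, and the pattern uses [^"]* for the href so that it stops at the closing quote.

```javascript
// Hypothetical shortened input; any HTML string containing <a> tags would do.
var html = '<p><a href="https://stackoverflow.com">Stackoverflow</a></p>\n' +
           'Copyright <a href="https://example.com">Example</a>';

// matchAll requires the g flag and throws a TypeError without it.
var linkRe = /<a[^>]*href="(https?:\/\/[^"]*)"[^>]*>([^<]*)<\/a>/g;

// Each item yielded by matchAll has the same layout as an exec result.
var found = Array.from(html.matchAll(linkRe), function (m) {
  return { match: m[0], href: m[1], anchor: m[2] };
});

found.forEach(function (link) {
  console.log('[' + link.anchor + '](' + link.href + ')');
});
// [Stackoverflow](https://stackoverflow.com)
// [Example](https://example.com)
```

matchAll works on a copy of the regex, so it needs neither the null check nor the manual lastIndex guard.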

