MnSearch Snippets April 2019: Screaming Frog Custom Extraction - Griffin Roer

Presented by
SCREAMING FROG
CUSTOM EXTRACTION
April 24, 2019

“The industry leading website crawler… trusted by
thousands of SEOs and agencies worldwide for
technical SEO audits.”
Screaming Frog
MORE THAN AN SEO
AUDIT TOOL
● Custom website scraping
● Advanced reporting
○ GA & GSC integrations

CUSTOM EXTRACTION 101
Scrape any data from the HTML of a web page.

DEFAULT
EXTRACTIONS
Screaming Frog extracts a
bunch of data from the HTML of
web pages by default.
Page Title
Meta Description
Meta Keywords
H1
H2
Meta Robots
Meta Refresh
Canonical Link
Pagination
On-page links
Anchor text
Alt text
Hreflang
AMP

CUSTOM
EXTRACTIONS
Examples:
● Publish date
● Product reviews
● Article comments
● Schema markup
● Pixel IDs

HOW THEY HELP
Use cases:
● Analyze performance by
factors you may normally
not have access to
● Diagnose hidden site issues
● Speed up data-gathering

HOW TO USE CUSTOM
EXTRACTION

ACCESS CUSTOM
EXTRACTION
Go to Configuration > Custom >
Extraction.

SET UP YOUR
EXTRACTION RULES
More on this in a minute.

RUN YOUR CRAWL
The data you extract is available
in the Custom tab. Set the Filter
dropdown to Extraction.

EXTRACTION RULES
How to tell Screaming Frog what you
want it to extract.

SET UP YOUR
EXTRACTION RULES
Screaming Frog requires some
information to know how and
what to extract:
● Extractor Name (optional)
● Extraction Method
● Rule
● Extraction Filter
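
The Extraction Filter controls how much of the matched element is returned. The sketch below approximates the difference in Python's standard library, assuming the filter options are named Extract HTML Element, Extract Inner HTML, and Extract Text. The page fragment and author name are invented for illustration, and this is not how Screaming Frog implements the filters internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment; the markup and names are invented.
FRAGMENT = '<div class="byline">By <a href="/authors/jane">Jane Doe</a></div>'

element = ET.fromstring(FRAGMENT)

# Roughly "Extract HTML Element": the matched element including its own tag.
html_element = ET.tostring(element, encoding="unicode")

# Roughly "Extract Inner HTML": the element's contents without its own tag.
inner_html = (element.text or "") + "".join(
    ET.tostring(child, encoding="unicode") for child in element
)

# Roughly "Extract Text": the text content only, with tags stripped.
text = "".join(element.itertext())

print(html_element)
print(inner_html)
print(text)
```

For a publish date or a review count, the text-only result is usually the one you want in a spreadsheet.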

XPATH & REGEX
Two syntaxes that you can use
to tell Screaming Frog what you
want to extract from a web
page.
XPATH
● Use to extract any HTML element of a
webpage
○ Anything in a <div>, <p>, <span>, <a>,
<meta>, etc.
REGEX
● Use to extract inline JavaScript
○ Like schema markup in JSON-LD or
an account ID from a tracking pixel
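
To make the regex use case concrete, here is a minimal sketch that pulls an account ID out of an inline tracking snippet. The snippet is modeled on a typical pixel call and the ID is made up. It is written in Python so it can be tested outside the crawler; regex flavors differ slightly between tools, so treat the pattern as a starting point rather than a rule guaranteed to paste in unchanged.

```python
import re

# Hypothetical inline tracking snippet; the ID is invented for illustration.
PAGE_SOURCE = """
<script>
  fbq('init', '123456789012345');
  fbq('track', 'PageView');
</script>
"""

# The capture group (the parentheses) isolates just the digits between the quotes.
PIXEL_ID_RULE = r"fbq\('init',\s*'(\d+)'\)"

match = re.search(PIXEL_ID_RULE, PAGE_SOURCE)
pixel_id = match.group(1) if match else None

print(pixel_id)
```

Pages where the rule finds nothing are the interesting ones: they are candidates for a missing or broken pixel.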

EXTRACTION RULES
YOU CAN COPY!
Check out this guide for
extraction rules you can copy +
paste into Screaming Frog, plus
learn to write your own.
bit.ly/custom-extraction

EXAMPLES
A quick example showing how
to extract the date from articles
on the MnSearch blog.
Chrome: Right-Click > Inspect
Date is in a <span> element with the class
“meta-date date updated”

EXAMPLES
A quick example showing how
to extract the date from articles
on the MnSearch blog.
Custom Extraction Results
XPath Rule: //span[@class='meta-date date updated']
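
One way to sanity-check a rule like this before a crawl is to run it against a saved copy of the markup. The sketch below uses Python's standard-library ElementTree, which supports only a subset of XPath and needs well-formed markup. The article HTML and the date value are invented; only the class name comes from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified article markup; only the class name is from the slide.
ARTICLE_HTML = """
<article>
  <h1>Example post title</h1>
  <span class="meta-date date updated">April 1, 2019</span>
  <p>Body text.</p>
</article>
"""

root = ET.fromstring(ARTICLE_HTML)

# ElementTree needs a leading "." for a descendant search;
# otherwise this is the same rule as on the slide.
rule = ".//span[@class='meta-date date updated']"
dates = [span.text for span in root.findall(rule)]

print(dates)
```

Note that @class='...' is an exact string match on the whole attribute, so the rule stops matching if the theme reorders or adds class names.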

EXAMPLES
Extract information about each
article on a blog.

EXAMPLES
Extract product information
from the schema markup.
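
As a rough illustration of what extracting product information from schema markup involves, the sketch below finds a JSON-LD block and reads a few Product fields from it. The page source, product name, and price are invented, and extract_product is a hypothetical helper, not part of any tool. Inside the crawler you would more likely write one regex rule per field; parsing the JSON is just an easy way to check offline what those rules should return.

```python
import json
import re

# Hypothetical page source; the product and price are invented.
PAGE_SOURCE = """
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Widget",
  "offers": {"@type": "Offer", "price": "19.99", "priceCurrency": "USD"}
}
</script>
"""

# Captures the raw JSON between the opening and closing script tags.
JSON_LD_BLOCK = re.compile(
    r'<script type="application/ld\+json">(.*?)</script>', re.DOTALL
)


def extract_product(source):
    """Return name, price and currency from the first Product block, if any."""
    for raw in JSON_LD_BLOCK.findall(source):
        data = json.loads(raw)
        # Assumes each block is a single JSON object, not a list or @graph.
        if isinstance(data, dict) and data.get("@type") == "Product":
            offer = data.get("offers", {})
            return {
                "name": data.get("name"),
                "price": offer.get("price"),
                "currency": offer.get("priceCurrency"),
            }
    return None


product = extract_product(PAGE_SOURCE)
print(product)
```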

EXAMPLES
See what types of schema
markup are being used on each
page.
● This example shows one
rule using XPath and
another using regex
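
A minimal sketch of that two-rule idea, assuming the XPath rule targets microdata (itemtype attributes) and the regex rule targets "@type" values in JSON-LD. The page below is invented, and the exact rules shown on the slide are not reproduced here. ElementTree cannot select attributes directly, so the attribute is read in Python; in a full XPath engine the equivalent would be something like //*[@itemtype]/@itemtype.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical page that uses both microdata and JSON-LD.
PAGE = """
<html>
  <body>
    <div itemscope="" itemtype="https://schema.org/Article">
      <span itemprop="headline">Example headline</span>
    </div>
    <script type="application/ld+json">
      {"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}
    </script>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# XPath-style rule: every element that declares an itemtype (microdata).
microdata_types = [el.get("itemtype") for el in root.findall(".//*[@itemtype]")]

# Regex rule: every "@type" value inside inline JSON-LD.
json_ld_types = re.findall(r'"@type"\s*:\s*"([^"]+)"', PAGE)

print(microdata_types)
print(json_ld_types)
```

Running both rules side by side shows which pages mix formats and which have no structured data at all.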

ONE MORE TIME...
Check out this guide for
extraction rules you can copy +
paste into Screaming Frog, plus
learn to write your own.
bit.ly/custom-extraction
