Here is a short JavaScript snippet to extract all URLs from a webpage with Google Chrome Developer Tools. No browser extension is required!
The JavaScript code generates a list of URLs in CSV format (tab-separated), with the anchor text of each link and a boolean indicating whether the URL is internal or external to the current website. Just copy-paste the results into Google Sheets or the Datablist CSV editor.
Note: The code also works with Firefox, Safari, etc.
Step 1: Run JavaScript code in Google Chrome Developer Tools
- Open Google Chrome Developer Tools with Cmd + Opt + i (Mac) or F12 (Windows), and click on the Console tab.
- Copy-paste the following JavaScript code and press Enter.
const results = [
  ['Url', 'Anchor Text', 'External']
];
// Collect every <a> element on the page
const links = document.getElementsByTagName('a');
// for...of visits only the elements (for...in would also visit "length", "item", etc.)
for (const link of links) {
  const externalLink = link.host !== window.location.host;
  // Keep only absolute URLs and skip empty hrefs
  if (link.href && link.href.indexOf('://') !== -1) results.push([link.href, link.text, externalLink]); // link.rel
}
const csvContent = results.map((line) => {
  return line.map((cell) => {
    if (typeof cell === 'boolean') return cell ? 'TRUE' : 'FALSE';
    if (!cell) return '';
    let value = cell.replace(/[\f\n\v]*\n\s*/g, "\n").replace(/[\t\f ]+/g, ' ');
    value = value.replace(/\t/g, ' ').trim();
    // Double any embedded quotes so the quoted cell stays valid
    return `"${value.replace(/"/g, '""')}"`;
  }).join('\t');
}).join("\n");
console.log(csvContent);
Step 2: Copy-paste exported URLs into a CSV file or spreadsheet tools
The JavaScript code will extract all URLs from the webpage with the following values:
- URL - The link URL
- Anchor Text - The visible text of the link, known as its "anchor text".
- External - A boolean (TRUE, FALSE) indicating whether the link points outside the current website
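The External value comes from comparing the link's host with the page's host. Here is a minimal sketch of that comparison outside the browser, using the standard URL class. The helper name isExternal and the example URLs are hypothetical and only serve as an illustration:

```javascript
// Hypothetical helper: true when `href` points to a different host than `pageUrl`.
function isExternal(href, pageUrl) {
  // Resolve relative links against the page URL, as the browser does for <a> elements.
  const linkHost = new URL(href, pageUrl).host;
  const pageHost = new URL(pageUrl).host;
  return linkHost !== pageHost;
}

console.log(isExternal('/pricing', 'https://example.com/blog'));                   // false
console.log(isExternal('https://other.example.org/', 'https://example.com/blog')); // true
```

The comparison is strict, so a subdomain such as shop.example.com counts as external to example.com.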
The result is printed to the console in CSV format, with tabs separating the columns. You can copy-paste it into a spreadsheet tool or our online CSV editor, or create a new file in your text editor and save it with a .csv extension.
Chrome automatically adds a Copy button when you get a lot of results.
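The snippet separates cells with tabs, which pastes cleanly into spreadsheets. If you would rather save a comma-separated file, here is a minimal sketch of the usual CSV quoting rule: wrap each cell in double quotes and double any embedded quote. The function name toCsv and the sample rows are illustrative and not part of the snippet above:

```javascript
// Hypothetical helper: turn an array of rows into comma-separated text.
function toCsv(rows) {
  return rows
    .map((row) =>
      row
        .map((cell) => {
          if (typeof cell === 'boolean') return cell ? 'TRUE' : 'FALSE';
          // Quote every cell and double embedded quotes so commas and quotes survive.
          const text = String(cell ?? '');
          return `"${text.replace(/"/g, '""')}"`;
        })
        .join(',')
    )
    .join('\n');
}

const sampleCsvRows = [
  ['Url', 'Anchor Text', 'External'],
  ['https://example.com/a', 'Say "hi", now', false],
];
console.log(toCsv(sampleCsvRows));
```

In the browser console you could call toCsv(results) after running the extraction code, assuming the results array is still defined.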
Step 3: Filter CSV data to get relevant links
To filter your links quickly, open the Datablist CSV editor. Create a collection and paste your CSV data. Check the Google and Amazon examples below for a step-by-step guide.
Select 'First row contains header' and click on the "+" button for each column to import them.
Then you will be able to use text search or filter on a specific property. You can use Datablist to consolidate all your CSV data. Datablist also comes with a deduplication algorithm to give you a clean list of URLs.
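If you want to remove obvious duplicates before copying the data, a simple exact-match pass can run in the console. This is a minimal sketch and not Datablist's algorithm. The name dedupeByUrl and the sample rows are hypothetical:

```javascript
// Hypothetical helper: keep the header row and the first row seen for each URL.
function dedupeByUrl(rows) {
  const [header, ...data] = rows;
  const seen = new Set();
  const unique = data.filter(([url]) => {
    if (seen.has(url)) return false;
    seen.add(url);
    return true;
  });
  return [header, ...unique];
}

const sampleRows = [
  ['Url', 'Anchor Text', 'External'],
  ['https://example.com/a', 'First', false],
  ['https://example.com/a', 'Duplicate', false],
  ['https://example.org/b', 'Other', true],
];
console.log(dedupeByUrl(sampleRows).length); // 3 (header plus two unique URLs)
```

Only exact URL matches are merged here, so two links that differ by a trailing slash or a tracking parameter are kept as separate rows.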
Example with Google Results
This is a step-by-step example with the Google results page. We want to extract all external links from a Google search result.
Step 1: Search Google for the term whose result links you want to extract.
A nice tip is to add the &num=100 parameter to the URL to force Google into showing 100 results per page.
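Instead of editing the address bar by hand, the parameter can be added with the standard URL class. A minimal sketch, where the helper name withHundredResults and the example search URL are hypothetical. Whether Google honors the num value is outside the scope of this sketch:

```javascript
// Hypothetical helper: return the search URL with num=100 set.
function withHundredResults(searchUrl) {
  const url = new URL(searchUrl);
  // searchParams.set adds the parameter or overwrites an existing value.
  url.searchParams.set('num', '100');
  return url.toString();
}

console.log(withHundredResults('https://www.google.com/search?q=csv+editor'));
// https://www.google.com/search?q=csv+editor&num=100
```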
Step 2: Copy-paste the code and run it
Open the Console in Google Chrome Developer Tools, paste the JavaScript code and press Enter.
Copy the results: select the output and press Ctrl + C, or use the Copy button when Chrome shows it.
Step 3: Filter results
Open Datablist CSV editor, create a collection, and paste the copied URLs.
Note: Don't forget to select the option First row contains headers and to click the "+" button as shown.
Once imported, click "Filter on property" in the "External" column.
And filter items with External set to TRUE and Url that does not contain google. Check our guide on how to clean your scraped data.
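The same filter can be sketched in JavaScript if you prefer to trim the list before pasting it. This assumes rows shaped like the results array from the extraction code (URL, anchor text, boolean). The helper name filterExternal and the sample rows are hypothetical:

```javascript
// Hypothetical helper: keep external rows whose URL does not contain `excluded`.
function filterExternal(rows, excluded) {
  const [header, ...data] = rows;
  const kept = data.filter(
    ([url, , external]) => external === true && !url.includes(excluded)
  );
  return [header, ...kept];
}

const searchRows = [
  ['Url', 'Anchor Text', 'External'],
  ['https://www.google.com/preferences', 'Settings', false],
  ['https://maps.google.com/', 'Maps', true],
  ['https://example.org/article', 'An article', true],
];
console.log(filterExternal(searchRows, 'google'));
// Keeps the header row and the example.org row
```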
And here are your external links:
Example with Amazon Results
This is a step-by-step example with the Amazon products page. We want to extract all links from this page.
Open the Chrome Developer Tools:
Copy-paste the JavaScript code:
And run it by pressing Enter:
With Datablist CSV editor, create a collection.
Paste your scraped URLs:
Your scraped URLs are listed in the tool. Filter the links, deduplicate the results, and clean your scraped data.