How to Scrape Websites with Node.js and Cheerio


Fri Nov 11 2022 13:37:02 GMT+0000 (Coordinated Universal Time)

// Load the dependencies. The pretty package is not needed
// because we will not log HTML to the terminal
const axios = require("axios");
const cheerio = require("cheerio");
const fs = require("fs");

// URL of the page we want to scrape
const url = "https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3";

// Async function which scrapes the data
async function scrapeData() {
  try {
    // Fetch HTML of the page we want to scrape
    const { data } = await axios.get(url);
    // Load HTML we fetched in the previous line
    const $ = cheerio.load(data);
    // Select all the list items in plainlist class
    const listItems = $(".plainlist ul li");
    // Stores data for all countries
    const countries = [];
    // Use .each method to loop through the li we selected
    listItems.each((idx, el) => {
      // Object holding data for each country/jurisdiction
      const country = { name: "", iso3: "" };
      // Select the text content of a and span elements
      // Store the textcontent in the above object
      country.name = $(el).children("a").text();
      country.iso3 = $(el).children("span").text();
      // Populate countries array with country data
      countries.push(country);
    });
    // Logs countries array to the console
    console.dir(countries);
    // Write countries array in countries.json file
    fs.writeFile("countries.json", JSON.stringify(countries, null, 2), (err) => {
      if (err) {
        console.error(err);
        return;
      }
      console.log("Successfully written data to file");
    });
  } catch (err) {
    console.error(err);
  }
}
// Invoke the above function
scrapeData();
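
The selector `.plainlist ul li` can match list items that have no direct `a` or `span` child, in which case `.text()` returns an empty string and the array picks up blank entries. The following is a minimal sketch of one way to tidy the array before writing it to disk. `cleanCountries` is a hypothetical helper, not part of the original snippet, and the sample rows are hand-written stand-ins for scraped data.

```javascript
// Hypothetical helper (not in the original snippet): trims both fields
// and drops any entry where either field is empty after trimming.
function cleanCountries(countries) {
  return countries
    .map(({ name, iso3 }) => ({ name: name.trim(), iso3: iso3.trim() }))
    .filter(({ name, iso3 }) => name !== "" && iso3 !== "");
}

// Hand-written sample rows in the same shape scrapeData() builds
const sample = [
  { name: "Afghanistan", iso3: " AFG " },
  { name: "", iso3: "" },
  { name: "Aruba", iso3: "ABW" },
];

const cleaned = cleanCountries(sample);
console.log(cleaned);
```

Under these assumptions, `scrapeData()` could pass `cleanCountries(countries)` to `JSON.stringify` instead of the raw array.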


https://www.freecodecamp.org/news/how-to-scrape-websites-with-node-js-and-cheerio/

More like this

#html #javascript #nodejs

How to Save HTML Form Data in JSON - Express

<form action="/new" method="post">
 
  <input name="title" type="text">
  <input name="description" type="text">
  <button type="submit">Submit Form</button>
 
</form>

#javascript

Get URL and URL Parts in JavaScript

// window.location.pathname already starts with "/", so no extra separator is needed
var newURL = window.location.protocol + "//" + window.location.host + window.location.pathname + window.location.search;

#javascript

Add event listener to multiple buttons with the same class

    // Declare the collection so it does not leak as an implicit global
    var btns = document.getElementsByClassName("saveBtn");
    for (var i = 0; i < btns.length; i++) {
        btns[i].addEventListener("click", function () {
            // Add function here
        });
    }

#javascript

Replace Textarea with Codemirror editor

<link rel="stylesheet" type="text/css" href="plugin/codemirror/lib/codemirror.css">

<body>
	<textarea class="codemirror-textarea"></textarea>
</body>

<script>

$(document).ready(function(){
    var codeText = $(".codemirror-textarea")[0];
    var editor = CodeMirror.fromTextArea(codeText, {
        lineNumbers : true
    });
});

</script>

<script type="text/javascript" src="plugin/codemirror/lib/codemirror.js"></script>

#javascript #nodejs #commandline

HTTPS Redirect in Nodejs

npm install heroku-ssl-redirect

#javascript #nodejs

Save to MongoDB without Duplicates Using Mongoose

var mongoose = require('mongoose');
var Schema = mongoose.Schema;

const exampleSchema = new Schema({
    title: { type: String , required: true},
    content: [{type: String}]
});


var Example = mongoose.model('Example', exampleSchema);
module.exports = Example;

#html #javascript #jsfunctions #dom #dommanipulation

getAttribute() - how to get the value of any attribute of HTML element

<html>

<input id="contact" name="address">

<script>

    var x = document.getElementById("contact").getAttribute('name');

</script>

</html>

#javascript

Full Stack Developer

function full_stack_developer() {
    full_stack_developer();
}

#javascript #functions #parameters

Required Parameters for Functions in JavaScript

const isRequired = () => { throw new Error('param is required'); };

const hello = (name = isRequired()) => { console.log(`hello ${name}`) };

// These will throw errors
hello();
hello(undefined);

// These will not
hello(null);
hello('David');

The idea is that this relies on default parameters: a default expression runs only when the argument is missing or undefined, just as the b parameter here falls back to 1 if you don't pass it anything:

function multiply(a, b = 1) {
  return a * b;
}
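
To make the timing of default parameters concrete, here is a small sketch that counts how often the default expression actually runs. The names `scale`, `defaultFactor`, and `defaultCalls` are made up for illustration and do not come from the original snippet.

```javascript
// Hypothetical counter showing when the default expression is evaluated
let defaultCalls = 0;
const defaultFactor = () => {
  defaultCalls += 1;
  return 1;
};

// The default for b is an expression that is evaluated at call time
function scale(a, b = defaultFactor()) {
  return a * b;
}

const results = [
  scale(5, 2),         // b supplied: default expression is skipped
  scale(5),            // b missing: default expression runs
  scale(5, undefined), // undefined also triggers the default
  scale(5, null),      // null is a real value, so the default is skipped
];

console.log(results, defaultCalls);
```

This is why `hello(null)` does not throw in the snippet above: `null` counts as a supplied value, so `isRequired()` is never called.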

#javascript

3 Ways To Detect Selenium Bots Using Javascript - CodingTutz

if (navigator.webdriver) {
    document.body.innerHTML = "This is a Bot";
}

#javascript

Copy Text to Clipboard with Line Breaks

function copyToClipboard() {
    var codeToBeCopied = document.getElementById('code-snippet').innerText;
    var emptyArea = document.createElement('textarea');
    // Use .value rather than .innerHTML so entities such as &amp; are copied literally
    emptyArea.value = codeToBeCopied;
    const parentElement = document.getElementById('post-title');
    parentElement.appendChild(emptyArea);

    emptyArea.select();
    document.execCommand('copy');

    parentElement.removeChild(emptyArea);
    // M.toast is assumed to be the Materialize toast helper, loaded elsewhere on the page
    M.toast({html: 'Code copied to clipboard'});
}

#javascript #promises #howto

How to Get the Domain From A URL Using Javascript

var url = "http://scratch99.com/web-development/javascript/";
var urlParts = url.replace('http://','').replace('https://','').split(/[/?#]/);
var domain = urlParts[0];
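
As an alternative to string replacement, the built-in `URL` class can do the parsing. The sketch below assumes a runtime that provides the WHATWG `URL` constructor (modern browsers and Node.js). `getDomain` is a hypothetical helper name, and the second URL is a made-up example.

```javascript
// Hypothetical helper: returns the host name of an absolute URL.
// new URL() throws a TypeError for relative or malformed input.
function getDomain(rawUrl) {
  return new URL(rawUrl).hostname;
}

const domainFromUrl = getDomain("http://scratch99.com/web-development/javascript/");

// hostname leaves out the port, unlike splitting the string on "/"
const domainWithoutPort = getDomain("https://example.com:8080/path?q=1#top");

console.log(domainFromUrl, domainWithoutPort);
```

Use `host` instead of `hostname` if the port should be kept.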

#javascript

Mapping an Object

const object1 = {
  a: 'somestring',
  b: 42
};

for (let [key, value] of Object.entries(object1)) {
  console.log(`${key}: ${value}`);
}
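
The loop above only reads the entries. To build a new object with transformed values, one option is to pair `Object.entries` with `Object.fromEntries`. This is a sketch assuming an ES2019 runtime; the uppercase transform and the names `source` and `mapped` are illustrative choices.

```javascript
const source = {
  a: 'somestring',
  b: 42
};

// Turn the object into [key, value] pairs, transform each value,
// then rebuild an object from the resulting pairs.
const mapped = Object.fromEntries(
  Object.entries(source).map(([key, value]) => [key, String(value).toUpperCase()])
);

console.log(mapped);
```

The original object is left unchanged because `map` returns a new array of pairs.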

#javascript #jquery

jQuery.post() - AJAX for post requests

$.ajax({
  type: "POST",
  url: url,
  data: data,
  success: success,
  dataType: dataType
});

#javascript

How to store fetch response in javascript variable - Stack Overflow

async function fun() {
  return fetch('https://jsonplaceholder.typicode.com/todos/1').then(res => res.json());
}

// Top-level await works only in ES modules; elsewhere, call fun() inside an async function
const data = await fun();
