Scraping Data From the Web
Coming August 2018…
About this Course
Much of the information you might want is published somewhere on the Internet. Web scraping is a key data-mining tool for exploring web pages and collecting that information for a variety of reporting needs. The tools and techniques covered in this course let you collect data that would be impractical to gather by hand.
What you'll learn
- An introduction to the Beautiful Soup Python package
- How to scrape a web page with Beautiful Soup
- An introduction to the Scrapy Python package
- How to crawl a website with Scrapy
- Web scraping considerations
Introducing Data Scraping
A look at what data scraping is and how it is used. We'll discuss how a web page is structured and use the Python package *Beautiful Soup* to scrape data from the web.
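As a rough preview of the kind of code this module works toward, here is a minimal sketch of parsing a page with Beautiful Soup. It is not taken from the course materials: the HTML snippet, the second course title, the link paths, and the `extract_links` helper are all hypothetical, and the page is inlined as a string so the sketch runs without network access.

```python
from bs4 import BeautifulSoup

# Hypothetical page content, inlined so no network request is needed.
HTML = """
<html>
  <head><title>Example Page</title></head>
  <body>
    <h1>Course List</h1>
    <ul>
      <li><a href="/courses/scraping">Scraping Data From the Web</a></li>
      <li><a href="/courses/python">Python Basics</a></li>
    </ul>
  </body>
</html>
"""


def extract_links(html):
    """Return (link text, href) pairs for every anchor that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


# Parse once to read the page title, then collect the links.
soup = BeautifulSoup(HTML, "html.parser")
page_title = soup.title.string
links = extract_links(HTML)

print(page_title)
for text, href in links:
    print(f"{text} -> {href}")
```

The sketch uses Python's built-in `html.parser` backend so that nothing beyond the `beautifulsoup4` package is required. A real scraper would first download the page, for example with an HTTP client library.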
A World Full of Spiders
To go beyond scraping a single web page, we need to crawl the web. Enter web crawlers, or *spiders*. We'll take a look at the basics of crawling the web with Scrapy and talk about saving scraped data.
Additional Scraping Tasks
Going beyond static web pages can be a challenge when scraping. Working with web forms and APIs can require a different approach. We'll also touch on how to write tests for a web scraper.
Ken W. Alger
Ken has a long history with computers, starting with early Commodore PETs and VIC-20s. A MongoDB Certified Developer, he enjoys discussing programming and how to get started in the tech industry.
He lives in Oregon with his wife and three children. He can be found most places online @kenwalger.