Introduction to Beautiful Soup
Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages. Say you’ve found some webpages that display data relevant to your research, such as date or address information, but that do not provide any way of downloading the data directly. Beautiful Soup helps yo...
Main Author: | |
---|---|
Format: | Article |
Language: | English |
Published: |
Editorial Board of the Programming Historian
2012-12-01
|
Series: | The Programming Historian |
Subjects: | |
Online Access: | http://programminghistorian.org/lessons/intro-to-beautiful-soup |
Summary: | Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages. Say you’ve found some webpages that display data relevant to your research, such as date or address information, but that do not provide any way of downloading the data directly. Beautiful Soup helps you pull particular content from a webpage, remove the HTML markup, and save the information. It is a tool for web scraping that helps you clean up and parse the documents you have pulled down from the web. |
---|---|
ISSN: | 2397-2068 |