RCrawler: An R package for parallel web crawling and scraping

RCrawler is a contributed R package for domain-based web crawling and content scraping. As the first implementation of a parallel web crawler in the R environment, RCrawler can crawl, parse, store pages, extract contents, and produce data that can be directly employed for web content mining applicat...

Bibliographic Details
Main Authors: Salim Khalil, Mohamed Fakir
Format: Article
Language: English
Published: Elsevier 2017-01-01
Series: SoftwareX
Online Access: http://www.sciencedirect.com/science/article/pii/S2352711017300110