A Reproducible IT-Blog Corpus
The dataset comprises text and metadata extracted from several hundred IT blogs and websites, along with a method to reproduce the data by updating its contents and downloading them to the user’s local machine. The targets have been hand-picked with the intention of representing the discourse on blogs an...
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | Ubiquity Press, 2021-07-01 |
Series: | Journal of Open Humanities Data |
Subjects: | |
Online Access: | https://openhumanitiesdata.metajnl.com/articles/35 |