Multi-class and Multi-label Classification of Darkweb Data

Bibliographic Details
Other Authors: Patil, Revanth (Author)
Format: Dissertation
Language: English
Published: 2018
Online Access: http://hdl.handle.net/2286/R.I.48469
Description
Summary: abstract: In this research, I address a multi-class, multi-label classification problem in which the goal is to automatically assign one or more labels (tags) to discussion topics seen on the deep web. I observed a natural hierarchy in the dataset and used different techniques to enforce a hierarchical integrity constraint on the predicted tag list. To address the `class imbalance' and `scarcity of labeled data' problems, I developed a semi-supervised model based on the Elasticsearch (ES) document relevance score. I evaluated the models using standard K-fold cross-validation. Enforcing hierarchical integrity constraints improved the F1 score by 11.9% over standard supervised learning, while the ES-based semi-supervised learning model outperformed the other models in precision (78.4%) while maintaining a comparable recall (21%). === Dissertation/Thesis === Masters Thesis Computer Science 2018