Pattern Analysis of Citation-Anchors in Citing Documents for Accurate Identification of In-Text Citations

Citations play an important role in ranking of authors, journals, institutions, and organizations. Sometimes, citing documents cite a reference many times in their full-text, which is further used in many application scenarios, such as: 1) finding relationship between cited and citing papers; 2) ide...

Bibliographic Details
Main Authors: Riaz Ahmad, Muhammad Tanvir Afzal, Muhammad Abdul Qadir
Format: Article
Language: English
Published: IEEE 2017-01-01
Series: IEEE Access
Subjects: Citations; citation-anchor; taxonomy; citing papers; reference string; in-text citations
Online Access: https://ieeexplore.ieee.org/document/7891553/
id doaj-b55eb203035f4d2ca979ec9aa4b9b2d5
record_format Article
spelling doaj-b55eb203035f4d2ca979ec9aa4b9b2d5
2021-03-29T20:09:37Z
eng
IEEE Access, ISSN 2169-3536, 2017-01-01, vol. 5, pp. 5819-5828
DOI 10.1109/ACCESS.2017.2689925
IEEE Xplore document 7891553
Riaz Ahmad (https://orcid.org/0000-0003-2908-5020)
Muhammad Tanvir Afzal
Muhammad Abdul Qadir
Department of Computer Science, Capital University of Science and Technology, Islamabad, Pakistan (all three authors)
collection DOAJ
language English
format Article
sources DOAJ
author Riaz Ahmad
Muhammad Tanvir Afzal
Muhammad Abdul Qadir
title Pattern Analysis of Citation-Anchors in Citing Documents for Accurate Identification of In-Text Citations
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2017-01-01
description Citations play an important role in the ranking of authors, journals, institutions, and organizations. Citing documents sometimes cite a reference many times in their full text, and these in-text citations are used in many application scenarios, such as: 1) finding the relationship between cited and citing papers; 2) identifying the influential cited papers among the references of a citing paper; 3) identifying suitable citation functions; and 4) studying in-text citations in different logical sections of papers to draw different findings. The accurate identification of in-text citations has remained an open area of research. Recently, the automatic identification of in-text citations has been reported with an accuracy rate of 58%, owing to many issues highlighted by state-of-the-art research. This paper investigates these issues in further detail: 1) by building on previous research; 2) by analyzing different referencing formats; and 3) by experimenting on a comprehensive data set. Based on the investigation, this paper proposes a taxonomy and a workable system that uses a set of heuristics built from a detailed study. The proposed model is then applied to an unseen, diversified data set taken from the Journal of Universal Computer Science and CiteSeer. The proposed model achieved an average F-score of 0.97, compared with the baseline of 0.58.
topic Citations
citation-anchor
taxonomy
citing papers
reference string
in-text citations
url https://ieeexplore.ieee.org/document/7891553/