Pattern Analysis of Citation-Anchors in Citing Documents for Accurate Identification of In-Text Citations
Citations play an important role in ranking authors, journals, institutions, and organizations. Citing documents sometimes cite a reference many times in their full text, and these in-text citations are used in many application scenarios, such as: 1) finding the relationship between cited and citing papers; 2) ide...
Main Authors: | Riaz Ahmad, Muhammad Tanvir Afzal, Muhammad Abdul Qadir |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2017-01-01 |
Series: | IEEE Access |
Subjects: | Citations; citation-anchor; taxonomy; citing papers; reference string; in-text citations |
Online Access: | https://ieeexplore.ieee.org/document/7891553/ |
id |
doaj-b55eb203035f4d2ca979ec9aa4b9b2d5 |
---|---|
record_format |
Article |
spelling |
doaj-b55eb203035f4d2ca979ec9aa4b9b2d5; 2021-03-29T20:09:37Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2017-01-01; vol. 5; pp. 5819-5828; DOI 10.1109/ACCESS.2017.2689925; article no. 7891553; Pattern Analysis of Citation-Anchors in Citing Documents for Accurate Identification of In-Text Citations; Riaz Ahmad (https://orcid.org/0000-0003-2908-5020), Muhammad Tanvir Afzal, Muhammad Abdul Qadir; Department of Computer Science, Capital University of Science and Technology, Islamabad, Pakistan (all three authors) |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Riaz Ahmad Muhammad Tanvir Afzal Muhammad Abdul Qadir |
author_sort |
Riaz Ahmad |
title |
Pattern Analysis of Citation-Anchors in Citing Documents for Accurate Identification of In-Text Citations |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2017-01-01 |
description |
Citations play an important role in ranking authors, journals, institutions, and organizations. Citing documents sometimes cite a reference many times in their full text, and these in-text citations are used in many application scenarios, such as: 1) finding the relationship between cited and citing papers; 2) identifying influential cited papers from the set of references in a citing paper; 3) identifying suitable citation functions; and 4) studying in-text citations in different logical sections of papers to draw different findings. The accurate identification of in-text citations remains an open area of research. Recently, automatic identification of in-text citations was reported to reach an accuracy of 58%, owing to many issues highlighted by state-of-the-art research. This paper investigates these issues in further detail: 1) by building on previous research; 2) by analyzing different referencing formats; and 3) by experimenting on a comprehensive data set. Based on the investigation, this paper proposes a taxonomy and a workable system that uses a set of heuristics built from a detailed study. The proposed model is then applied to an unseen, diversified data set taken from the Journal of Universal Computer Science and CiteSeer. The proposed model achieved an average F-score of 0.97, compared with the baseline of 0.58. |
topic |
Citations citation-anchor taxonomy citing papers reference string in-text citations |
url |
https://ieeexplore.ieee.org/document/7891553/ |