Criminal networks analysis in missing data scenarios through graph distances

Data collected in criminal investigations may suffer from issues like: (i) incompleteness, due to the covert nature of criminal organizations; (ii) incorrectness, caused by either unintentional data collection errors or intentional deception by criminals; (iii) inconsistency, when the same informati...


Bibliographic Details
Main Authors: Annamaria Ficara, Lucia Cavallaro, Francesco Curreri, Giacomo Fiumara, Pasquale De Meo, Ovidiu Bagdasar, Wei Song, Antonio Liotta
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2021-01-01
Series: PLoS ONE
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8357088/?tool=EBI
id doaj-11c848f35e2c47729b66b384f6ea4022
record_format Article
spelling doaj-11c848f35e2c47729b66b384f6ea4022 2021-08-14T04:31:04Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2021-01-01 16 8 Criminal networks analysis in missing data scenarios through graph distances Annamaria Ficara Lucia Cavallaro Francesco Curreri Giacomo Fiumara Pasquale De Meo Ovidiu Bagdasar Wei Song Antonio Liotta Data collected in criminal investigations may suffer from issues like: (i) incompleteness, due to the covert nature of criminal organizations; (ii) incorrectness, caused by either unintentional data collection errors or intentional deception by criminals; (iii) inconsistency, when the same information is collected into law enforcement databases multiple times, or in different formats. In this paper we analyze nine real criminal networks of different nature (i.e., Mafia networks, criminal street gangs and terrorist organizations) in order to quantify the impact of incomplete data, and to determine which network type is most affected by it. The networks are firstly pruned using two specific methods: (i) random edge removal, simulating the scenario in which the Law Enforcement Agencies fail to intercept some calls, or to spot sporadic meetings among suspects; (ii) node removal, modeling the situation in which some suspects cannot be intercepted or investigated. Finally we compute spectral distances (i.e., Adjacency, Laplacian and normalized Laplacian Spectral Distances) and matrix distances (i.e., Root Euclidean Distance) between the complete and pruned networks, which we compare using statistical analysis. Our investigation identifies two main features: first, the overall understanding of the criminal networks remains high even with incomplete data on criminal interactions (i.e., when 10% of edges are removed); second, removing even a small fraction of suspects not investigated (i.e., 2% of nodes are removed) may lead to significant misinterpretation of the overall network. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8357088/?tool=EBI
collection DOAJ
language English
format Article
sources DOAJ
author Annamaria Ficara
Lucia Cavallaro
Francesco Curreri
Giacomo Fiumara
Pasquale De Meo
Ovidiu Bagdasar
Wei Song
Antonio Liotta
spellingShingle Annamaria Ficara
Lucia Cavallaro
Francesco Curreri
Giacomo Fiumara
Pasquale De Meo
Ovidiu Bagdasar
Wei Song
Antonio Liotta
Criminal networks analysis in missing data scenarios through graph distances
PLoS ONE
author_facet Annamaria Ficara
Lucia Cavallaro
Francesco Curreri
Giacomo Fiumara
Pasquale De Meo
Ovidiu Bagdasar
Wei Song
Antonio Liotta
author_sort Annamaria Ficara
title Criminal networks analysis in missing data scenarios through graph distances
title_short Criminal networks analysis in missing data scenarios through graph distances
title_full Criminal networks analysis in missing data scenarios through graph distances
title_fullStr Criminal networks analysis in missing data scenarios through graph distances
title_full_unstemmed Criminal networks analysis in missing data scenarios through graph distances
title_sort criminal networks analysis in missing data scenarios through graph distances
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2021-01-01
description Data collected in criminal investigations may suffer from issues like: (i) incompleteness, due to the covert nature of criminal organizations; (ii) incorrectness, caused by either unintentional data collection errors or intentional deception by criminals; (iii) inconsistency, when the same information is collected into law enforcement databases multiple times, or in different formats. In this paper we analyze nine real criminal networks of different nature (i.e., Mafia networks, criminal street gangs and terrorist organizations) in order to quantify the impact of incomplete data, and to determine which network type is most affected by it. The networks are firstly pruned using two specific methods: (i) random edge removal, simulating the scenario in which the Law Enforcement Agencies fail to intercept some calls, or to spot sporadic meetings among suspects; (ii) node removal, modeling the situation in which some suspects cannot be intercepted or investigated. Finally we compute spectral distances (i.e., Adjacency, Laplacian and normalized Laplacian Spectral Distances) and matrix distances (i.e., Root Euclidean Distance) between the complete and pruned networks, which we compare using statistical analysis. Our investigation identifies two main features: first, the overall understanding of the criminal networks remains high even with incomplete data on criminal interactions (i.e., when 10% of edges are removed); second, removing even a small fraction of suspects not investigated (i.e., 2% of nodes are removed) may lead to significant misinterpretation of the overall network.
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8357088/?tool=EBI
work_keys_str_mv AT annamariaficara criminalnetworksanalysisinmissingdatascenariosthroughgraphdistances
AT luciacavallaro criminalnetworksanalysisinmissingdatascenariosthroughgraphdistances
AT francescocurreri criminalnetworksanalysisinmissingdatascenariosthroughgraphdistances
AT giacomofiumara criminalnetworksanalysisinmissingdatascenariosthroughgraphdistances
AT pasqualedemeo criminalnetworksanalysisinmissingdatascenariosthroughgraphdistances
AT ovidiubagdasar criminalnetworksanalysisinmissingdatascenariosthroughgraphdistances
AT weisong criminalnetworksanalysisinmissingdatascenariosthroughgraphdistances
AT antonioliotta criminalnetworksanalysisinmissingdatascenariosthroughgraphdistances
_version_ 1721207731713474560
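
The pruning-and-distance procedure summarized in the description field can be illustrated with a minimal NumPy sketch. This record does not give the paper's exact definitions, so everything below is an assumption chosen for illustration: the synthetic random graph stands in for a real criminal network, a removed node is modeled as an isolated node (so both matrices keep the same size), a spectral distance is taken as the Euclidean distance between sorted eigenvalue vectors, and the Root Euclidean Distance is taken as the Frobenius norm of the difference between the square roots of the two Laplacians. All function names are hypothetical.

```python
import numpy as np

def random_graph(n, p, rng):
    """Symmetric 0/1 adjacency matrix of a random undirected graph (stand-in data)."""
    upper = np.triu(rng.random((n, n)) < p, k=1).astype(float)
    return upper + upper.T

def remove_edges(adj, fraction, rng):
    """Delete a random fraction of the edges (missed calls or meetings)."""
    pruned = adj.copy()
    rows, cols = np.nonzero(np.triu(adj, k=1))
    k = int(round(fraction * len(rows)))
    drop = rng.choice(len(rows), size=k, replace=False)
    pruned[rows[drop], cols[drop]] = 0.0
    pruned[cols[drop], rows[drop]] = 0.0
    return pruned

def remove_nodes(adj, fraction, rng):
    """Isolate a random fraction of the nodes (assumption: size is preserved)."""
    pruned = adj.copy()
    n = adj.shape[0]
    drop = rng.choice(n, size=int(round(fraction * n)), replace=False)
    pruned[drop, :] = 0.0
    pruned[:, drop] = 0.0
    return pruned

def laplacian(adj):
    """Combinatorial Laplacian L = D - A."""
    return np.diag(adj.sum(axis=1)) - adj

def normalized_laplacian(adj):
    """D^{-1/2} (D - A) D^{-1/2}, using 0 in place of 1/sqrt(0) for isolated nodes."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    mask = deg > 0
    inv_sqrt[mask] = 1.0 / np.sqrt(deg[mask])
    return inv_sqrt[:, None] * laplacian(adj) * inv_sqrt[None, :]

def spectral_distance(m1, m2):
    """Euclidean distance between the sorted spectra of two symmetric matrices."""
    return float(np.linalg.norm(np.linalg.eigvalsh(m1) - np.linalg.eigvalsh(m2)))

def psd_sqrt(m):
    """Square root of a symmetric positive semidefinite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)  # clear tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.T

def root_euclidean_distance(adj1, adj2):
    """Frobenius norm of sqrt(L1) - sqrt(L2)."""
    diff = psd_sqrt(laplacian(adj1)) - psd_sqrt(laplacian(adj2))
    return float(np.linalg.norm(diff, "fro"))

rng = np.random.default_rng(0)
full = random_graph(50, 0.1, rng)
edge_pruned = remove_edges(full, 0.10, rng)   # 10% of edges removed
node_pruned = remove_nodes(full, 0.02, rng)   # 2% of nodes removed

for label, pruned in [("edges -10%", edge_pruned), ("nodes -2%", node_pruned)]:
    print(label,
          "adjacency:", spectral_distance(full, pruned),
          "laplacian:", spectral_distance(laplacian(full), laplacian(pruned)),
          "normalized:", spectral_distance(normalized_laplacian(full),
                                           normalized_laplacian(pruned)),
          "root-euclidean:", root_euclidean_distance(full, pruned))
```

A single random draw like this only shows the mechanics; any comparison between network types would need repeated removals on the real datasets and the statistical analysis that the description mentions.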