Machine-Learning-Augmented Analysis of Textual Data: Application in Transit Disruption Management

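Despite rapid advances in automated text processing, many related tasks in transit and other transportation agencies are still performed manually. For example, incident management reports are often manually processed and subsequently stored in a standardized format for later use. The information contained in such reports can be valuable for many reasons: identification of issues with response actions, underlying causes of each incident, impacts on the system, etc. In this article, we develop a comprehensive, pragmatic automated framework for analyzing rail incident reports to support a wide range of applications and functions, depending on the constraints of the available data. The objectives are twofold: a) extract information that is required in the standard report forms (automation), and b) extract other useful content and insights from the unstructured text in the original report that would have otherwise been lost/ignored (knowledge discovery). The approach is demonstrated through a case study involving analysis of 23,728 records of general incidents in the London Underground (LU). The results show that it is possible to automatically extract delays, impacts on trains, mitigating strategies, underlying incident causes, and insights related to the potential actions and causes, as well as accurate classification of incidents into predefined categories.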

Bibliographic Details
Main Authors: Peyman Noursalehi, Haris N. Koutsopoulos, Jinhua Zhao
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Open Journal of Intelligent Transportation Systems
Subjects: Incidents; information extraction; natural language processing; deep learning; BERT
Online Access: https://ieeexplore.ieee.org/document/9261594/
id doaj-b8e014672a1b4b9596652f00062037df
record_format Article
spelling doaj-b8e014672a1b4b9596652f00062037df
2021-04-09T23:00:28Z
eng
IEEE
IEEE Open Journal of Intelligent Transportation Systems
ISSN 2687-7813
2020-01-01
Volume 1, pages 227-236
DOI 10.1109/OJITS.2020.3038395
9261594
Machine-Learning-Augmented Analysis of Textual Data: Application in Transit Disruption Management
Peyman Noursalehi (0) https://orcid.org/0000-0001-5491-835X
Haris N. Koutsopoulos (1) https://orcid.org/0000-0003-3830-9794
Jinhua Zhao (2) https://orcid.org/0000-0002-1929-7583
(0) Department of Urban Studies and Planning, Massachusetts Institute of Technology, Cambridge, MA, USA
(1) Department of Civil and Environmental Engineering, Northeastern University, Boston, MA, USA
(2) Department of Urban Studies and Planning, Massachusetts Institute of Technology, Cambridge, MA, USA
https://ieeexplore.ieee.org/document/9261594/
Incidents
information extraction
natural language processing
deep learning
BERT
collection DOAJ
language English
format Article
sources DOAJ
author Peyman Noursalehi
Haris N. Koutsopoulos
Jinhua Zhao
title Machine-Learning-Augmented Analysis of Textual Data: Application in Transit Disruption Management
publisher IEEE
series IEEE Open Journal of Intelligent Transportation Systems
issn 2687-7813
publishDate 2020-01-01
description Despite rapid advances in automated text processing, many related tasks in transit and other transportation agencies are still performed manually. For example, incident management reports are often manually processed and subsequently stored in a standardized format for later use. The information contained in such reports can be valuable for many reasons: identification of issues with response actions, underlying causes of each incident, impacts on the system, etc. In this article, we develop a comprehensive, pragmatic automated framework for analyzing rail incident reports to support a wide range of applications and functions, depending on the constraints of the available data. The objectives are twofold: a) extract information that is required in the standard report forms (automation), and b) extract other useful content and insights from the unstructured text in the original report that would have otherwise been lost/ignored (knowledge discovery). The approach is demonstrated through a case study involving analysis of 23,728 records of general incidents in the London Underground (LU). The results show that it is possible to automatically extract delays, impacts on trains, mitigating strategies, underlying incident causes, and insights related to the potential actions and causes, as well as accurate classification of incidents into predefined categories.
topic Incidents
information extraction
natural language processing
deep learning
BERT
url https://ieeexplore.ieee.org/document/9261594/