A survey of backdoor attacks and defences: From deep neural networks to large language models


Detailed description

Bibliographic details
Journal: Journal of Electronic Science and Technology
Main authors: Ling-Xin Jin, Wei Jiang, Xiang-Yu Wen, Mei-Yu Lin, Jin-Yu Zhan, Xing-Zhi Zhou, Maregu Assefa Habtie, Naoufel Werghi
Format: Article
Language: English
Published: KeAi Communications Co., Ltd., 2025-09-01
Online access: http://www.sciencedirect.com/science/article/pii/S1674862X25000278
Additional bibliographic details
Abstract: Deep neural networks (DNNs) have found extensive applications in safety-critical artificial intelligence systems, such as autonomous driving and facial recognition systems. However, recent research has revealed their susceptibility to backdoors maliciously injected by adversaries. This vulnerability arises due to the intricate architecture and opacity of DNNs, resulting in numerous redundant neurons embedded within the models. Adversaries exploit these vulnerabilities to conceal malicious backdoor information within DNNs, thereby causing erroneous outputs and posing substantial threats to the efficacy of DNN-based applications. This article presents a comprehensive survey of backdoor attacks against DNNs and the countermeasure methods employed to mitigate them. Initially, we trace the evolution of the concept from traditional backdoor attacks to backdoor attacks against DNNs, highlighting the feasibility and practicality of generating backdoor attacks against DNNs. Subsequently, we provide an overview of notable works encompassing various attack and defense strategies, facilitating a comparative analysis of their approaches. Through these discussions, we offer constructive insights aimed at refining these techniques. Finally, we extend our research perspective to the domain of large language models (LLMs) and synthesize the characteristics and developmental trends of backdoor attacks and defense methods targeting LLMs. Through a systematic review of existing studies on backdoor vulnerabilities in LLMs, we identify critical open challenges in this field and propose actionable directions for future research.
ISSN:2666-223X