Federated Learning Spam Detection Based on FedProx and Multi-Level Multi-Feature Fusion

Traditional spam detection methodologies often neglect user privacy preservation, potentially incurring data leakage risks. Furthermore, current federated learning models for spam detection face several critical challenges: (1) data heterogeneity and instability during server-side parameter aggregat...

Bibliographic Details
Published in: Informatics
Main Authors: Yunpeng Xiong, Junkuo Cao, Guolian Chen
Format: Article
Language: English
Published: MDPI AG 2025-09-01
Subjects: spam email detection; federated learning; multi-feature fusion; FedProx; privacy protection
Online Access: https://www.mdpi.com/2227-9709/12/3/93
author Yunpeng Xiong
Junkuo Cao
Guolian Chen
collection DOAJ
container_title Informatics
description Traditional spam detection methods often neglect user privacy, which creates a risk of data leakage. Current federated learning models for spam detection also face three challenges: (1) data heterogeneity and instability during server-side parameter aggregation, (2) training instability in single neural network architectures, which can lead to mode collapse, and (3) limited expressive capability in multi-module frameworks due to excessive complexity. To address these challenges, this study integrates a federated learning framework with multi-feature fusion techniques and proposes a spam detection model, FPW-BC. FPW-BC addresses data distribution imbalance through the FedProx aggregation algorithm and improves stability during server-side parameter aggregation with a horse-racing selection strategy. Hierarchical multi-feature fusion mitigates the limitations of both single-module and multi-module architectures. To validate FPW-BC, experiments were conducted on six benchmark datasets with distinct distribution characteristics (CEAS, Enron, Ling, Phishing_email, Spam_email, and Fake_phishing) and compared against multiple baseline methods. The results indicate that FPW-BC generalizes well across various spam patterns while preserving user privacy. The model attained 99.40% accuracy on CEAS and 99.78% on Fake_phishing, improving both privacy protection and detection efficiency.
format Article
id doaj-art-cd5a035a5ffa4c70bf278a2e02beb211
institution Directory of Open Access Journals
issn 2227-9709
language English
publishDate 2025-09-01
publisher MDPI AG
record_format Article
spelling doaj-art-cd5a035a5ffa4c70bf278a2e02beb211
2025-09-26T14:47:09Z
eng
MDPI AG
Informatics
2227-9709
2025-09-01
12
3
93
10.3390/informatics12030093
Federated Learning Spam Detection Based on FedProx and Multi-Level Multi-Feature Fusion
Yunpeng Xiong (School of Information Science and Technology, Hainan Normal University, Haikou 571158, China)
Junkuo Cao (Information Network and Data Center, Hainan Normal University, Haikou 571158, China)
Guolian Chen (State-Owned Assets Management Office, Hainan Normal University, Haikou 571158, China)
https://www.mdpi.com/2227-9709/12/3/93
spam email detection
federated learning
multi-feature fusion
FedProx
privacy protection
title Federated Learning Spam Detection Based on FedProx and Multi-Level Multi-Feature Fusion
topic spam email detection
federated learning
multi-feature fusion
FedProx
privacy protection
url https://www.mdpi.com/2227-9709/12/3/93
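
The description above names FedProx as the aggregation algorithm but this record does not give the authors' configuration. As a rough illustration only, the sketch below shows the generic FedProx idea: each client minimizes its local loss plus a proximal penalty (mu/2)·||w − w_global||², and the server averages client models weighted by sample count. The logistic-regression model, the toy two-feature data, and the values of `mu`, `lr`, and `steps` are all assumptions made for this example; the horse-racing selection strategy and the multi-feature fusion of FPW-BC are not modeled.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_gradient(w, data):
    """Average logistic-loss gradient over one client's (features, label) pairs."""
    grad = [0.0] * len(w)
    for x, y in data:
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj / len(data)
    return grad


def fedprox_local_update(w_global, data, mu=0.1, lr=0.5, steps=20):
    """Gradient steps on F_k(w) + (mu / 2) * ||w - w_global||^2.

    The extra mu * (w - w_global) term is the gradient of the proximal
    penalty; it pulls the local model back toward the global one.
    """
    w = list(w_global)
    for _ in range(steps):
        g = local_gradient(w, data)
        w = [wi - lr * (gi + mu * (wi - wg))
             for wi, gi, wg in zip(w, g, w_global)]
    return w


def aggregate(client_weights, client_sizes):
    """Sample-size-weighted average of the client models."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
            for j in range(dim)]


# Hypothetical toy data: two clients with different label distributions.
clients = [
    [([1.0, 0.0], 1), ([0.9, 0.1], 1), ([0.1, 0.8], 0)],
    [([0.0, 1.0], 0), ([0.2, 0.9], 0)],
]

w_global = [0.0, 0.0]
for _ in range(5):  # five communication rounds
    updates = [fedprox_local_update(w_global, d) for d in clients]
    w_global = aggregate(updates, [len(d) for d in clients])
```

Setting `mu=0` reduces the local step to plain gradient descent, as in FedAvg; larger `mu` keeps heterogeneous clients closer to the shared model at the cost of slower local progress.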