| Summary: | Traditional spam detection methods often neglect user privacy and therefore risk data leakage. Existing federated learning models for spam detection also face three challenges: (1) data heterogeneity and instability during server-side parameter aggregation, (2) training instability in single neural network architectures, which can lead to mode collapse, and (3) limited expressive capability in multi-module frameworks due to excessive complexity. To address these challenges, this study integrates a federated learning framework with multi-feature fusion and proposes a spam detection model, FPW-BC. FPW-BC handles imbalanced data distributions with the FedProx aggregation algorithm and stabilizes server-side parameter aggregation with a horse-racing selection strategy. Hierarchical multi-feature fusion mitigates the limitations of both single-module and multi-module architectures. To validate FPW-BC’s performance, experiments were conducted on six benchmark datasets with distinct distribution characteristics (CEAS, Enron, Ling, Phishing_email, Spam_email, and Fake_phishing) and compared against multiple baseline methods. The results show that FPW-BC generalizes well across spam patterns while preserving user privacy: it attained 99.40% accuracy on CEAS and 99.78% on Fake_phishing, improving both privacy protection and detection efficiency.
|