Summary: | As worst-case interacting false data against power system state estimation (SE), cyber data attacks can evade most bad data detectors. In this study, coordinated attacks (unobservable attack and logic bomb attack) and coordinated defences (honeypot and weakening vision) are used to analyse attackers’ and defenders’ behaviours, respectively. To quantify the potential physical influence, that is, the benefits of attack and defence, the residual of the expected state is devised. Subsequently, a zero-sum stochastic game is utilised to model the interaction between the cyber-physical power system and the external attack-and-defence actions. This game is shown to admit a Nash equilibrium, and the minimax Q-learning algorithm is introduced to enable the two players to reach their equilibrium strategies while maximising their respective minimum rewards over a sequence of stages. Numerous simulations of the stochastic game model on the IEEE 14-bus system show that, when resisting isolated or coordinated attacks, the optimal coordinated defences are more effective than isolated defences.
|