A Reinforcement Learning Framework for Smart, Secure, and Efficient Cyber-Physical Autonomy

Embedded sensors, computation, and communication have enabled the development of sophisticated control devices for a wide range of cyber-physical applications, including safety monitoring, surveillance, health care, motion planning, search and rescue, traffic monitoring, and power systems. However, the deployment of such devices has been slowed by concerns about their sensitivity to modeling accuracy and their vulnerability to both stochastic failures and malicious attacks. The efficiency of these systems will increasingly be defined by their ability to adapt with complete autonomy in decentralized, unknown, and complex environments, enabling capabilities beyond human limits. Until such autonomy is achieved, the dependable deployment of cyber-physical technologies remains a critical issue.

Methods from network security and control theory will be combined to design a new paradigm of proactive defense control mechanisms. To this end, different modes of operation will be defined for the system in order to isolate and identify suspicious actuators and sensors. Following the principles of moving target defense, the system’s unpredictability, quantified by information entropy, will be maximized so that the attack surface is switched dynamically and stochastically while the system is still controlled optimally. To better understand the behavior of the attackers acting on this system, a framework of bounded reasoning will be introduced to approximate the strategies used by attackers of different levels of intelligence.
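
The trade-off between unpredictability and control performance described above can be illustrated with a minimal sketch. It is not the formulation used in the talk. It assumes that each mode of operation (for example, a particular subset of actuators and sensors) has a known scalar control cost, and that the switching probabilities minimize the expected cost minus a weighted entropy term. Under that assumption the minimizer is the well-known Gibbs (softmax) distribution. The names `mode_costs` and `temperature` and the numerical values are hypothetical.

```python
import math
import random


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


def switching_distribution(mode_costs, temperature):
    """Probabilities minimizing sum(p_i * cost_i) - temperature * entropy(p).

    The minimizer over the probability simplex is the Gibbs distribution,
    with p_i proportional to exp(-cost_i / temperature).
    """
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    c_min = min(mode_costs)  # shift costs for numerical stability
    weights = [math.exp(-(c - c_min) / temperature) for c in mode_costs]
    total = sum(weights)
    return [w / total for w in weights]


def sample_mode(p, rng):
    """Draw the index of the next operating mode at random."""
    return rng.choices(range(len(p)), weights=p, k=1)[0]


# Hypothetical control costs of three operating modes (assumed values).
mode_costs = [1.0, 1.2, 2.0]

# A small temperature favors the cheapest mode (predictable, efficient);
# a large temperature approaches uniform switching (unpredictable, costly).
p_low = switching_distribution(mode_costs, temperature=0.1)
p_high = switching_distribution(mode_costs, temperature=10.0)

rng = random.Random(0)
next_mode = sample_mode(p_high, rng)
```

In this sketch the temperature plays the role of the weight placed on unpredictability: raising it increases the entropy of the switching rule at the price of a higher expected control cost.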

Finally, a novel model-free deep Q-learning control framework will be presented that combines all the aforementioned techniques. The framework converges online, in real time, to game-theoretic control solutions in the presence of persistent adversaries while guaranteeing closed-loop stability of the equilibrium point.
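
As a rough illustration of learning a game-theoretic control solution without a model, the sketch below replaces the deep network with a lookup table and restricts both players to pure strategies, so it is a simplified stand-in rather than the framework of the talk. The defender minimizes a discounted cost, the attacker maximizes it, and the learner only queries a black-box `step` function. The two-state plant, its costs, and all constants are hypothetical, and the systematic sweep over all state-action pairs is an assumption made to keep the example deterministic.

```python
import itertools

STATES = (0, 1)            # hypothetical: 0 = nominal, 1 = degraded
DEFENDER_ACTIONS = (0, 1)  # hypothetical: 0 = idle, 1 = protect
ATTACKER_ACTIONS = (0, 1)  # hypothetical: 0 = idle, 1 = attack
GAMMA = 0.9                # discount factor (assumed)
ALPHA = 0.5                # learning rate (assumed)


def step(state, u, d):
    """Hypothetical black-box plant: returns (stage cost, next state)."""
    breached = (d == 1 and u == 0)  # an unprotected attack succeeds
    cost = state + 0.2 * u - 0.1 * d + (1.0 if breached else 0.0)
    return cost, (1 if breached else 0)


def security_value(q, state):
    """Worst-case cost-to-go: defender minimizes, attacker maximizes."""
    return min(
        max(q[(state, u, d)] for d in ATTACKER_ACTIONS)
        for u in DEFENDER_ACTIONS
    )


def minmax_q_learning(sweeps=400):
    """Tabular Q-learning with a pure-strategy min-max backup."""
    keys = list(itertools.product(STATES, DEFENDER_ACTIONS, ATTACKER_ACTIONS))
    q = {key: 0.0 for key in keys}
    for _ in range(sweeps):
        for (s, u, d) in keys:
            cost, s_next = step(s, u, d)  # sample from the black box
            target = cost + GAMMA * security_value(q, s_next)
            q[(s, u, d)] += ALPHA * (target - q[(s, u, d)])
    return q


def defender_policy(q, state):
    """Defender action with the smallest worst-case Q-value."""
    return min(
        DEFENDER_ACTIONS,
        key=lambda u: max(q[(state, u, d)] for d in ATTACKER_ACTIONS),
    )


q_table = minmax_q_learning()
policy = {s: defender_policy(q_table, s) for s in STATES}
```

For this toy plant the learned defender policy is to protect in both states. A deep variant would replace the table with a function approximator, and a mixed-strategy equilibrium would require solving a small matrix game at each backup instead of taking the pure-strategy min-max.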