Elsevier

Future Generation Computer Systems

Volume 115, February 2021, Pages 171-187

Autonomous mitigation of cyber risks in the Cyber–Physical Systems

https://doi.org/10.1016/j.future.2020.09.002

Highlights

  • We introduce a new Autonomous Response Controller (ARC) that responds to attacks across Cyber–Physical Systems (CPS).

  • ARC responds in a scalable and autonomous way, with or without human intervention, depending on the criticality of the CPS asset to be protected.

  • ARC provides timely responses by considering the system situation awareness at each response point using a new intelligent Competitive Markov Model.

  • ARC composes a long-term response plan that optimizes long-term gains to respond to tactical multistage attacks, and it considers the special characteristics, requirements, and response impact of Cyber–Physical Power Systems (CPPS).

  • We introduce two practical case studies along with several experiments to evaluate the proposed approaches.

Abstract

Attacks on and vulnerabilities in Cyber–Physical Systems (CPS) are increasing, and the consequences of such attacks can be catastrophic. A CPS needs to be self-resilient to cyber-attacks through a precise, autonomous, and timely risk mitigation model that can analyze and assess the risk to the CPS and apply a proper response strategy against ongoing attacks. There is a limited amount of work on self-protection against cyber risks in the CPS. This paper addresses the need for advanced security approaches that respond to attacks across the CPS autonomously once an alert about a possible intrusion has been raised, with or without a system administrator in the loop for troubleshooting, depending on the criticality of the CPS asset to be protected. To this end, this paper augments our existing security framework with an Autonomous Response Controller (ARC). ARC uses our quantitative Hierarchical Risk Correlation Tree (HRCT), which models the paths an attacker can traverse to reach certain goals and measures the financial risk that the CPS assets face from cyber-attacks. ARC also uses a Competitive Markov Decision Process (CMDP) to model the reciprocal security interaction between the protection system and the attacker/adversary as a multi-step, sequential, two-player stochastic game in which each player tries to maximize his/her benefit. The experimental results show that the accuracy of ARC exceeds that of the traditional Static Intrusion Response System (S-IRS) by 43.61%. To experimentally test and validate ARC on real-time, large-scale data, we ran the Aurora attack to open the generator breaker in our testbed and create a cascading failure and voltage collapse. ARC was able to recover the CPS and provide a timely response in less than 6 s. We also compared the output of ARC against the current state of the art, the Suricata intrusion response system. ARC was able to mitigate the single line to ground (SLG) attacks and recover the CPS to its normal state 122 s before Suricata did.

Introduction

Most CPS protocols were designed long before network security was perceived to be a problem. The traditional CPS was a closed serial network that contained only trusted devices with little or no connection to the outside world. As CPS and control networks evolved, the use of TCP/IP and Ethernet became commonplace and interfacing with business systems became the norm. As a result, the closed trust model no longer applied and vulnerabilities in these systems began to appear [1]. In particular, network security problems from the business network and the world at large could be passed on to the process and CPS networks, putting industrial production, environmental integrity, and human safety at risk. There are increasing security assessment requirements for the CPS, specifically to meet the compliance requirements of regulatory agencies. A CPS needs a timely response in the presence of attacks. In this paper, we focus on power systems as one of the application domains of the CPS.

The energy sector is one of the most important sectors, and within it the electricity delivery system plays a critical role in maintaining most of the functionality of all other sectors. Coordinated cyber-attacks on Cyber–Physical Power Systems (CPPS) may cause cascading failures over large areas of operation. Cyber-attacks on CPPS can create N-k contingencies, which are even more critical than a single or double component failure. Coordinated attacks on multiple components can disrupt electrical power over a wide geographical area. Current practices in power systems design do not consider cyber-security requirements, and monitoring and control practices do not consider potential cyber-borne contingencies and disturbances [2]. Hence, there is a need to incorporate advanced security management in power systems to enhance situational awareness for reliable operation. In addition, such security management must identify the event location as fast as possible to avert chain failures and must isolate the affected part of the system to ensure system stability and reliability.

Recently, attacks and other incidents affecting the CPPS have increased and have led to cascading failures and power outages [3]. There is a limited amount of work on the development of autonomous risk mitigation systems that work adaptively for the CPPS. Such systems should be able to select the proper countermeasure to mitigate and respond to ongoing attacks in the CPPS in an autonomous way.

The state of the art distinguishes two types of Intrusion Response Systems (IRS): static and dynamic. The static IRS is based on a static mapping between the detected attack and its best countermeasure (e.g., [4]), and it exhibits all the limits of a static approach. In particular, it scales poorly, because a static mapping requires the system administrator to periodically update the set of known attacks and to associate each of them with the proper response. The dynamic IRS is based on a dynamic evaluation of all the response actions according to the detected attack, the observed alerts, and a list of evaluation criteria that determine the appropriate response to take. The first step in the response selection process is to determine which services in the system are likely affected, taking into account the characteristics of the detector, the network topology, etc. The actual choice of the response then depends on a host of factors, such as the amount of evidence about the attack, the severity of the response, etc. The last step is to determine the effectiveness of the deployed response, either to decide whether further responses are required for the current attack or to update the measure of effectiveness of the deployed response to guide future choices. Not all IRSs in this class include all these steps, and the class shows wide variety in the sophistication of its algorithms. The systems presented in [5], [6], [7], [8], [9], [10], [11], [12], [13], [14] fall into this category.
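The contrast between the two classes can be sketched in a few lines of Python. The sketch is illustrative only: the attack names, countermeasure names, the scoring formula, and the `impact_weight` parameter are assumptions made for this example and are not taken from any of the cited systems.

```python
from dataclasses import dataclass

# Static IRS: a fixed attack -> countermeasure table that an administrator
# must keep up to date by hand (all entries are hypothetical).
STATIC_RESPONSE_TABLE = {
    "command_injection": "block_source_ip",
    "relay_trip": "restore_breaker",
}


def static_response(attack: str) -> str:
    """Look up the pre-assigned countermeasure; unknown attacks have none."""
    return STATIC_RESPONSE_TABLE.get(attack, "notify_administrator")


@dataclass(frozen=True)
class Candidate:
    name: str
    effectiveness: float  # assumed 0..1: how well the action counters the attack
    impact: float         # assumed 0..1: negative side effect on services


def dynamic_response(candidates, evidence: float, impact_weight: float = 0.5) -> str:
    """Score every candidate at decision time and return the best one.

    `evidence` (0..1) stands for the amount of evidence about the attack.
    The linear score below is an assumption chosen for readability.
    """
    def score(c: Candidate) -> float:
        return evidence * c.effectiveness - impact_weight * c.impact

    return max(candidates, key=score).name


candidates = [
    Candidate("block_source_ip", effectiveness=0.6, impact=0.1),
    Candidate("isolate_substation", effectiveness=0.9, impact=0.8),
]
chosen = dynamic_response(candidates, evidence=0.9)
```

With these made-up numbers the dynamic selector prefers the low-impact action, and it switches to the more drastic one only when the impact weight is lowered. Even in this toy form, the scores for every action still have to be supplied by someone, which is the scalability concern raised for the dynamic class.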

In [10], the authors introduced a recent real-time IRS for SDN (Software Defined Networking) that uses precomputation to estimate the likelihood of future attack paths from an ongoing attack. Their experimental analysis shows that the proposed system can estimate the possible attack paths of an ongoing attack in order to mitigate it in real time, and it reports security metrics that depend on the flow table, including the SDN component. Other real-time IDS/IPS solutions on the market in 2020 are the Snort IDS log analyzer tool [15], which works with Snort, and the Security Event Manager [16]. The dynamic IRS is more scalable than the static one, but it still has an evident scalability problem, as the administrator must assign a score to every considered response action for every considered attack. None of these systems, particularly in the CPPS security domain, composes a long-term response plan that connects the attack-related events and system states with its response selection process, that is, the situation awareness that should be considered at each response point. Nor do they consider the special CPPS characteristics and requirements, such as the criticality of the assets' operation, the high impact and consequences of a response, the compliance requirements of regulatory agencies, and the high level of scalability and interoperability that the CPPS maintains. Recent studies in [17], [18], [19] introduced machine learning techniques to automate the security process of the CPS. However, none of these studies considers the system situation awareness at each response point, and they are not practical for real-time deployment. In [20], the authors focus on the problem of structural controllability in the context of electrical power network control. These problems are known to be NP-hard with poor approximability. The authors study strategies for the efficient restoration of controllability following attacks and attacker-defender interactions in power-law networks.
In [21], the authors introduced a model for wide-area situational awareness based on a set of current technologies, such as wireless sensor networks, the ISA100.11a standard, and cloud computing, together with a set of high-level functional services, such as the detection of anomalies in the observation tasks, response to incidents, recovery of states and control in crisis situations, and global and local support for prevention through a simple forecast scheme. In [22], the authors studied the security of cyber–physical control systems from the perspective of redundancy-based restoration mechanisms. They presented a network infrastructure based on three layers, in which the redundant support is primarily concentrated on a fog-based structure that protects a specific subset of cyber–physical control devices. The specification of the context and the abstract construction of the approach include a set of conceptual theories related to structural controllability, power dominance, supernodes, and opinion dynamics.

Based on the state of the art, a proper autonomous controller for the CPPS should be able to (1) compose a long-term response plan, (2) consider the attack-related events, the CPPS states, and the response actions in a way that helps identify a set of target system states, and (3) replace the manual mapping done by the CPPS system administrator with an autonomous response system that drives the CPPS towards the set of desired target states. Such a long-term plan can exploit different combinations of the same atomic actions to deal effectively with different attacks, including unknown attacks. Building such an autonomous controller requires two main components: a risk assessment model and an autonomous countermeasure selection controller.

In this paper, we introduce a new Autonomous Response Controller (ARC) that responds to attacks across the CPPS in a scalable and autonomous way, with or without human intervention, depending on the criticality of the CPPS asset to be protected. ARC relies on our risk assessment model, which is built on the Hierarchical Risk Correlation Tree (HRCT) described in Section 5. The HRCT quantitatively assesses the risk in the CPPS, provides the required risk assessment input parameters to the ARC, and helps assess how much security improves when a specific response or security enhancement is applied. This in turn helps the ARC to choose between enhancement options, prioritize them by their relative effectiveness (measured as the improvement in the proposed degree-of-security indices), and justify their cost.
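As a rough illustration of how quantitative risk figures can drive the choice between enhancement options, the following Python sketch ranks hypothetical responses by their net benefit. It assumes the common formulation risk = probability × financial impact. The HRCT computation itself is not reproduced here, and every name and figure below is invented for the example.

```python
def risk(probability: float, impact_usd: float) -> float:
    """Expected financial loss under the assumed risk = probability x impact."""
    return probability * impact_usd


def rank_responses(baseline_probability, impact_usd, responses):
    """Rank candidate responses by net benefit (risk reduction minus cost).

    `responses` is a list of (name, residual_probability, cost_usd) tuples,
    where residual_probability is the assumed attack success probability
    after the response is applied.
    Returns (name, risk_reduction, net_benefit) tuples, best first.
    """
    base = risk(baseline_probability, impact_usd)
    ranked = []
    for name, residual_probability, cost_usd in responses:
        reduction = base - risk(residual_probability, impact_usd)
        ranked.append((name, reduction, reduction - cost_usd))
    return sorted(ranked, key=lambda entry: entry[2], reverse=True)


# Hypothetical asset: 40% attack success probability, 100,000 USD impact.
ranking = rank_responses(
    baseline_probability=0.4,
    impact_usd=100_000,
    responses=[
        ("add_firewall_rule", 0.3, 500),
        ("patch_relay_firmware", 0.1, 5_000),
    ],
)
for name, reduction, net in ranking:
    print(f"{name}: risk reduced by {reduction:.0f} USD, net benefit {net:.0f} USD")
```

Ranking by net benefit is only one possible cost justification. A ratio such as risk reduction per unit cost would order the same options differently when budgets are tight.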

The ARC can: (1) Compute proper responses despite the coarse alert granularity of today's IDSs, which cannot generate alerts that map perfectly to successful intrusions. It does so by accounting for the inherent uncertainties (i.e., false alarm rates) in IDS alerts while converting the attack-response sets into a Competitive Markov Decision Process (CMDP) [23]. In this way, ARC considers the system situation awareness at each response point by connecting the attack-related events and system security states with the response selection process. (2) Compose a long-term response plan that optimizes long-term gains. Such plans are used to respond to tactical multistage attacks, in which attackers follow intelligently planned strategies that address the countermeasures taken by intrusion response systems along the way. (3) Consider the special CPPS characteristics and requirements, such as the criticality of asset operation, the availability of CPPS resources, and the impact and consequences of a response, by giving dynamic weights to the payoff gain and the security risk. (4) Provide timely responses against ongoing attacks, owing to the small security state space that the HRCT produces.
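The idea of planning for long-term gains in a two-player stochastic game can be illustrated with a small Python sketch. It is not the ARC algorithm: it restricts both players to pure strategies (a simplification, since stochastic games in general require mixed strategies), it ignores IDS false alarms, and the two states, the action names, the rewards, and the transition probabilities are all invented for the example.

```python
GAMMA = 0.9  # assumed discount factor weighting future payoffs

DEFENDER_ACTIONS = ("monitor", "isolate")
ATTACKER_ACTIONS = ("idle", "attack")

# MODEL[state][(defender_action, attacker_action)] =
#     (defender_reward, {next_state: probability})
MODEL = {
    "normal": {
        ("monitor", "idle"): (0.0, {"normal": 1.0}),
        ("monitor", "attack"): (-1.0, {"under_attack": 0.8, "normal": 0.2}),
        ("isolate", "idle"): (-3.0, {"normal": 1.0}),
        ("isolate", "attack"): (-3.0, {"normal": 1.0}),
    },
    "under_attack": {
        ("monitor", "idle"): (-5.0, {"under_attack": 1.0}),
        ("monitor", "attack"): (-10.0, {"under_attack": 1.0}),
        ("isolate", "idle"): (-2.0, {"normal": 1.0}),
        ("isolate", "attack"): (-3.0, {"normal": 0.9, "under_attack": 0.1}),
    },
}


def q_value(values, state, defender_action, attacker_action):
    """Immediate reward plus the discounted expected value of the next state."""
    reward, transitions = MODEL[state][(defender_action, attacker_action)]
    return reward + GAMMA * sum(p * values[s] for s, p in transitions.items())


def worst_case(values, state, defender_action):
    """Defender payoff if the attacker picks the most damaging reply."""
    return min(q_value(values, state, defender_action, a) for a in ATTACKER_ACTIONS)


def solve(sweeps=200):
    """Maximin value iteration over pure strategies."""
    values = {state: 0.0 for state in MODEL}
    for _ in range(sweeps):
        values = {
            state: max(worst_case(values, state, d) for d in DEFENDER_ACTIONS)
            for state in MODEL
        }
    policy = {
        state: max(DEFENDER_ACTIONS, key=lambda d: worst_case(values, state, d))
        for state in MODEL
    }
    return values, policy


values, policy = solve()
print(policy)
```

In this toy model the resulting plan keeps monitoring in the normal state, because isolation has a service cost, and isolates once the system is under attack, even though isolation is not the cheapest single step in every case. The choice is driven by the discounted long-run payoff rather than by the immediate reward alone.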

Our results indicate that (1) adjusting the detection and response strength to the detected attacker strength and behavior can significantly improve the reliability of the CPS, and (2) the ARC can mitigate a multistage attack, recover the affected CPS services, and take longer-term reconfiguration actions that make future attacks of a similar kind less likely to succeed.

This paper is organized as follows. Section 2 describes our CPPS testbed. Section 3 highlights our existing security framework, and Section 4 discusses the attack scenarios in our testbed. Section 5 highlights our existing risk assessment model. Section 6 introduces the ARC and a practical case study for the implementation of the ARC model. Finally, Section 7 concludes the paper and outlines future work.

Section snippets

The CPPS testbed

Our CPPS testbed is modeled according to [12] using physical device integration, including phasor measurement units (PMUs), relays, a phasor data concentrator (PDC), and a real-time digital simulator (RTDS); see Fig. 1. It also has a PC running the Snort IDS [24], [25], OpenPDC, and Syslog, and a PC to launch attacks. The testbed consists of three main components: the MATLAB/RSCAD parameter calculation engine, the data collection and processing system, and the power system model. This testbed is capable of simulating

The existing security framework

In the following, we give a high-level description of our security framework components and processes as shown in Fig. 2. In this paper, we will focus only on the autonomic response selection process. For more details about the framework’s processes see [28], [29].

(a) Collection. This process collects events and logs from several signature-based sensors and sends them to the integration process. The collection sensors perform three core functions through various means: collecting logs,

Attack scenarios in the testbed

We used our testbed to simulate attacks from each of the following five categories:

1- Cyber command injection attacks and physical attacks are available to create contingencies. Relays can be tripped by remote command injection attacks. Relays from two vendors are available in the testbed. Attacks are available to remotely trip both types of relays. In both cases, a network packet capture tool was used to capture commands used to remotely trip the relay. In the attack scenario, these commands

The hierarchical risk correlation tree model

In this section, we briefly highlight our risk assessment model that assesses the risk in the CPPS infrastructure based on the alert level of different events by measuring the potential impact of a threat on assets given the probability that it will occur, and it provides useful information to evaluate the system’s overall security state. The estimated risk of each event is not assigned statically; rather it is assigned an initial value that is modified dynamically as the event is correlated to

The autonomous response controller

The purpose of the ARC is to plan an optimal response policy able to drive the system towards a secure state. Since we use a long-term probabilistic model, it is important to verify, step by step, that the system evolves as predicted. To develop the ARC, we model the security reciprocal interaction between the ARC system and the attacker/adversary as a multi-step, sequential, two-player stochastic game in which each player tries to maximize his/her benefit. ARC leverages the HRCT and the

Conclusion and future work

The power grid has become one of the vital Cyber–Physical Systems. However, there are increasing security assessment requirements for the CPPS, specifically to achieve compliance requirements for regulatory agencies. There is a limited amount of work on the development of an autonomous risk assessment system that adaptively works for the CPPS and quantitatively assesses the financial damage of the attacks and computes the severity of such attacks in the CPPS. In this paper, we introduced a

CRediT authorship contribution statement

Hisham A. Kholidy: Conception and design of study, Acquisition of data, Analysis and/or interpretation of data, Writing - original draft, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This research was generously supported in part by the SUNY Polytechnic Institute Research Seed Grant Program. The author approved the version of the manuscript to be published.


References (50)

  • Alcaraz, Cristina et al., WASAM: A dynamic wide-area situational awareness model for critical domains in smart grids, Future Gener. Comput. Syst. (2014)
  • Mu, Chengpo et al., An intrusion response decision-making model based on hierarchical task network planning, Expert Syst. Appl. (2010)
  • Byres, E.J. et al., The use of attack trees in assessing vulnerabilities in SCADA systems
  • Adhikari, U., Event and Intrusion Detection Systems for Cyber-Physical Power Systems (2015)
  • Wang, J. et al., Cascade-based attack vulnerability on the US power grid, Saf. Sci. (2009)
  • M.E. Locasto, K. Wang, A.D. Keromytis, S.J. Stolfo, FLIPS: Hybrid adaptive intrusion prevention, in: Paper presented at...
  • Kholidy, Hisham A. et al., A risk mitigation approach for autonomous cloud intrusion response system, J. Comput. (2016)
  • Hisham A. Kholidy, Abdelkarim Erradi, A cost-aware model for risk mitigation in cloud computing systems, in: Successful...
  • Shameli-Sendi, A. et al., ORCEF: Online response cost evaluation framework for intrusion response system, J. Netw. Comput. Appl. (2015)
  • Ossenbuhl, S. et al., Towards automated incident handling: How to select an appropriate response against a network-based attack?
  • Chung, C.-J. et al., NICE: Network intrusion detection and countermeasure selection in virtual network systems, IEEE Trans. Dependable Secure Comput. (2013)
  • Eom, Taehoon et al., A framework for real-time intrusion response in software defined networking using precomputed graphical security models, Secur. Commun. Netw. J. (2020)
  • Miehling, E. et al., Optimal defense policies for partially observable spreading processes on Bayesian attack graphs
  • Uttam Adhikari, Thomas H. Morris, Shengyi Pan, A cyber-physical power system test bed for intrusion detection systems,...
  • Chen, Qian et al., Towards realizing a distributed event and intrusion detection system
  • Hisham A. Kholidy, Abdelkarim Erradi, Sherif Abdelwahed, Abdulrahman Azab, A finite state hidden Markov model for...
  • Snort IDS Log Analyzer Tool,...
  • Security Event Manager,...
  • Cedric Carter, Zachary Thomas, Christian Birk Jones, Intrusion Detection & Response using an Unsupervised Artificial...
  • Ding, Jianguo, Intrusion detection, prevention, and response system (IDPRS) for cyber-physical systems (CPSs)
  • Mitchell, Robert et al., Effect of intrusion detection and response on reliability of cyber physical systems, IEEE Trans. Reliab. (2013)
  • Alcaraz, C. et al., Recovery of structural controllability for control systems
  • Alcaraz, C., Cloud-assisted dynamic resilience for cyber-physical control systems, IEEE Wirel. Commun. (2018)
  • Wallenberg AI, Autonomous Systems and Software Program (WASP), Sequential Decision Making,...
  • T. Morris, R. Vaughn, Y. Dandass, A Retrofit Network Intrusion Detection System for MODBUS RTU and ASCII Industrial...

Hesham AbdElazim I. Mohamed (Hisham A. Kholidy) received his Ph.D. in Computer Science from a joint Ph.D. program between the University of Pisa in Italy and the University of Arizona in the USA. He works as an assistant professor in the Department of Networks and Computer Security (NCS), College of Engineering, State University of New York (SUNY) Polytechnic Institute. Prior to that, he worked as a postdoctoral associate at Mississippi State University. During his Ph.D., he worked as an associate researcher at the NSF Cloud and Autonomic Computing Center, Electrical and Computer Engineering Dept., at the University of Arizona. He holds two patents in cybersecurity published by the United States Patent and Trademark Office (USPTO), and he has published more than 30 papers in major journals and conferences, including IEEE Transactions, IET, and Springer journals. He has participated as PI, Co-PI, and senior personnel in 6 international research projects. His research interests include cybersecurity and SCADA systems security, 5G and SDN systems, autonomic and cloud computing systems, and machine learning systems.
