Run-time optimization for learned controllers through quantitative games

G. Avni, R. Bloem, K. Chatterjee, T.A. Henzinger, B. Konighofer, S. Pranger, in:, 31st International Conference on Computer-Aided Verification, Springer, 2019, pp. 630–649.

Download
OA 659.77 KB

Conference Paper | Published | English
Author
; ; ; ; ;
Series Title
LNCS
Abstract
A controller is a device that interacts with a plant. At each time point,it reads the plant’s state and issues commands with the goal that the plant oper-ates optimally. Constructing optimal controllers is a fundamental and challengingproblem. Machine learning techniques have recently been successfully applied totrain controllers, yet they have limitations. Learned controllers are monolithic andhard to reason about. In particular, it is difficult to add features without retraining,to guarantee any level of performance, and to achieve acceptable performancewhen encountering untrained scenarios. These limitations can be addressed bydeploying quantitative run-timeshieldsthat serve as a proxy for the controller.At each time point, the shield reads the command issued by the controller andmay choose to alter it before passing it on to the plant. We show how optimalshields that interfere as little as possible while guaranteeing a desired level ofcontroller performance, can be generated systematically and automatically usingreactive synthesis. First, we abstract the plant by building a stochastic model.Second, we consider the learned controller to be a black box. Third, we mea-surecontroller performanceandshield interferenceby two quantitative run-timemeasures that are formally defined using weighted automata. Then, the problemof constructing a shield that guarantees maximal performance with minimal inter-ference is the problem of finding an optimal strategy in a stochastic2-player game“controller versus shield” played on the abstract state space of the plant with aquantitative objective obtained from combining the performance and interferencemeasures. We illustrate the effectiveness of our approach by automatically con-structing lightweight shields for learned traffic-light controllers in various roadnetworks. The shields we generate avoid liveness bugs, improve controller per-formance in untrained and changing traffic situations, and add features to learnedcontrollers, such as giving priority to emergency vehicles.
Publishing Year
Date Published
2019-07-12
Proceedings Title
31st International Conference on Computer-Aided Verification
Volume
11561
Page
630-649
Conference
CAV: Computer Aided Verification
Conference Location
New York, NY, United States
Conference Date
2019-07-13 – 2019-07-18
ISSN
IST-REx-ID

Cite this

Avni G, Bloem R, Chatterjee K, Henzinger TA, Konighofer B, Pranger S. Run-time optimization for learned controllers through quantitative games. In: 31st International Conference on Computer-Aided Verification. Vol 11561. Springer; 2019:630-649. doi:10.1007/978-3-030-25540-4_36
Avni, G., Bloem, R., Chatterjee, K., Henzinger, T. A., Konighofer, B., & Pranger, S. (2019). Run-time optimization for learned controllers through quantitative games. In 31st International Conference on Computer-Aided Verification (Vol. 11561, pp. 630–649). New York, NY, United States: Springer. https://doi.org/10.1007/978-3-030-25540-4_36
Avni, Guy, Roderick Bloem, Krishnendu Chatterjee, Thomas A Henzinger, Bettina Konighofer, and Stefan Pranger. “Run-Time Optimization for Learned Controllers through Quantitative Games.” In 31st International Conference on Computer-Aided Verification, 11561:630–49. Springer, 2019. https://doi.org/10.1007/978-3-030-25540-4_36.
G. Avni, R. Bloem, K. Chatterjee, T. A. Henzinger, B. Konighofer, and S. Pranger, “Run-time optimization for learned controllers through quantitative games,” in 31st International Conference on Computer-Aided Verification, New York, NY, United States, 2019, vol. 11561, pp. 630–649.
Avni G, Bloem R, Chatterjee K, Henzinger TA, Konighofer B, Pranger S. 2019. Run-time optimization for learned controllers through quantitative games. 31st International Conference on Computer-Aided Verification. CAV: Computer Aided Verification, LNCS, vol. 11561. 630–649.
Avni, Guy, et al. “Run-Time Optimization for Learned Controllers through Quantitative Games.” 31st International Conference on Computer-Aided Verification, vol. 11561, Springer, 2019, pp. 630–49, doi:10.1007/978-3-030-25540-4_36.
All files available under the following license(s):
Creative Commons License:
CC-BYCreative Commons Attribution 4.0 International Public License (CC-BY 4.0)
Main File(s)
File Name
Access Level
OA Open Access
Last Uploaded
2019-08-14T09:35:24Z


Export

Marked Publications

Open Data IST Research Explorer

Search this title in

Google Scholar
ISBN Search