This paper studies the applicability of a stochastic approximation-based policy gradient method to the optimal control of office building HVAC (heating, ventilation, and air conditioning) systems. The focus is on real-world building thermal dynamics with occupant interactions, which form a complex stochastic system in the sense that its statistical properties depend on its state variables. In this setting, existing approaches such as stochastic model predictive control cannot be applied to optimal control design. As a remedy, we approximate the gradient of the cost function using simulations and use a gradient-descent-type algorithm to design a suboptimal control policy. We assess the performance of this policy through a simulation study of building HVAC systems.
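The simulation-based gradient idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the one-zone thermal model, its coefficients, the state-dependent noise term standing in for occupant interactions, the scalar feedback gain `theta`, and the function names. The gradient is estimated with a central finite difference over paired simulations that share a random seed (a Kiefer-Wolfowitz-style estimator), which may differ from the estimator the paper uses.

```python
import random


def simulate_cost(theta, seed, steps=200):
    """Average cost of one simulated episode of a toy one-zone thermal model."""
    rng = random.Random(seed)
    temp, setpoint, outdoor = 26.0, 22.0, 30.0  # hypothetical values, in deg C
    total = 0.0
    for _ in range(steps):
        error = temp - setpoint
        # Policy: cooling effort proportional to the temperature error, clipped to [0, 5].
        u = min(max(theta * error, 0.0), 5.0)
        # State-dependent noise: the disturbance grows with the temperature error
        # (a hypothetical stand-in for occupant interactions).
        sigma = 0.1 + 0.05 * abs(error)
        temp += 0.1 * (outdoor - temp) - 0.5 * u + sigma * rng.gauss(0.0, 1.0)
        # Stage cost: squared comfort deviation plus a penalty on control effort.
        total += (temp - setpoint) ** 2 + 0.1 * u ** 2
    return total / steps


def estimate_gradient(theta, seed, c=0.1):
    """Central finite-difference gradient estimate from two simulations.

    Both runs reuse the same seed (common random numbers) to reduce variance.
    """
    j_plus = simulate_cost(theta + c, seed)
    j_minus = simulate_cost(theta - c, seed)
    return (j_plus - j_minus) / (2.0 * c)


def train(theta=0.5, iterations=50):
    """Projected stochastic gradient descent on the policy parameter."""
    for k in range(iterations):
        step = 0.02 / (1 + k) ** 0.6  # decaying step size
        grad = estimate_gradient(theta, seed=k)
        # Keep the gain inside a hypothetical admissible range [0, 3].
        theta = min(max(theta - step * grad, 0.0), 3.0)
    return theta


if __name__ == "__main__":
    tuned = train()
    print("tuned gain:", tuned)
    print("cost at initial gain:", simulate_cost(0.5, seed=1000))
    print("cost at tuned gain:", simulate_cost(tuned, seed=1000))
```

The sketch needs only the ability to simulate the system, not a closed-form model of its state-dependent statistics, which is the property that motivates a simulation-based approach. The resulting policy is at best locally optimal within the chosen policy class, consistent with the "suboptimal" qualifier above.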


Building Control, Simulation-based Methods, Stochastic Optimal Control
