


active learning, building energy forecasting, data-driven model, training data quality, excitation method


For data-driven building energy forecasting models, the quality of the training data strongly affects a model's accuracy and cost-effectiveness. To obtain high-quality training data within a short time period, experiment design, active learning, and system excitation are becoming increasingly important, especially for nonlinear systems such as building energy systems. Experiment design and system excitation have been widely studied and applied to model development in fields such as robotics and the automobile industry, but they have rarely been applied to building energy modeling. This paper presents an overall discussion of applying system excitation to the development of building energy forecasting models. For gray-box and white-box models, a model's physical representations and theories can guide the collection of training data. For black-box (purely data-driven) models, however, the quality required of the training data is sensitive to the model structure, so there is no universal theory for collecting training data. The focus of black-box modeling has traditionally been on how well a model represents a given data set. How well such a data set represents the real system, and how the quality of a training data set affects the performance of black-box models, have not been well studied. In this paper, the system excitation method used in the system identification field is applied to excite zone temperature set-points and generate training data. These excited training data are then used to train a variety of black-box building energy forecasting models. The models' performances (accuracy and extendibility) are compared across model structures. For each model structure, performance is also compared between training on typical building operational data and training on excited training data.
Results show that black-box models trained on normal operation data achieve better performance than those trained on excited training data, but have worse extendibility. Training data obtained from excitation help to improve the performance of system identification models.
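To make the idea of exciting zone temperature set-points concrete, the following is a minimal sketch and not the excitation signal used in this paper. It assumes a random piecewise-constant schedule as the excitation and a two-level occupied/unoccupied schedule as a stand-in for normal operation. The function names, set-point bounds, hold lengths, and time step are all hypothetical and chosen only for illustration.

```python
import random


def excited_setpoints(n_steps, low=20.0, high=26.0, min_hold=2, max_hold=8, seed=0):
    """Piecewise-constant random set-point schedule (deg C), one value per step.

    Bounds and hold lengths are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)  # seeded so the schedule is reproducible
    schedule = []
    while len(schedule) < n_steps:
        level = round(rng.uniform(low, high), 1)  # new random set-point level
        hold = rng.randint(min_hold, max_hold)    # number of steps to hold it
        schedule.extend([level] * hold)
    return schedule[:n_steps]


def typical_setpoints(n_steps, steps_per_day=24, occupied=(8, 18),
                      occupied_sp=24.0, unoccupied_sp=28.0):
    """Two-level occupied/unoccupied schedule standing in for normal operation."""
    start, end = occupied
    return [occupied_sp if start <= (t % steps_per_day) < end else unoccupied_sp
            for t in range(n_steps)]


if __name__ == "__main__":
    excited = excited_setpoints(96)
    typical = typical_setpoints(96)
    # Number of distinct levels visited: a crude proxy for input-range coverage.
    print("excited levels:", len(set(excited)))
    print("typical levels:", len(set(typical)))
```

The excited schedule visits many set-point levels, while the typical schedule visits only two. This is one simple way to see why excited data may cover more of the operating range that a black-box model is later asked to extrapolate over.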