Conference Year
2021
Keywords
Vapour Compression Refrigeration Cycle, Refrigeration Cycle Optimal Control, Artificial Intelligence, Reinforcement Learning, Refrigeration energy efficiency
Abstract
Control optimization of vapor compression refrigeration cycles (VCRC) is an effective way to increase their reliability and energy efficiency. In modern VCRC systems, the introduction of inverter compressors, electronic expansion valves, and fan and pump speed control has significantly increased controllability. These improvements led to the development of many multivariable and predictive control strategies that improved temperature tracking performance and increased the coefficient of performance (COP) by up to 30% compared with conventional on/off and PID controls. However, a VCRC also exhibits strong nonlinearities and parameter couplings, which make these modern control laws difficult to apply. This problem motivates the application of reinforcement learning (RL) to VCRC control, as RL has demonstrated an unprecedented ability to optimize complex control problems. This study explores the idea by using RL to train a direct optimal controller for a VCRC. To test the concept, a VCRC simulation model was developed in the MATLAB Simulink environment, and an RL VCRC controller was trained using the MATLAB Reinforcement Learning Toolbox. The controller's goal is to track the desired internal air temperature and a 10˚C superheat setting. The controller uses 17 observations, comprising the VCRC states, tracking errors, operating conditions, and previous actions, to determine the optimal compressor speed and expansion valve opening percentage. The VCRC operating conditions were limited to ambient and internal air temperature ranges of 28-32˚C and 16-20˚C, respectively. The twin delayed deep deterministic policy gradient (TD3) algorithm was used to train the controller, and its training hyperparameters, such as the noise model and deep neural network parameters, were tuned to balance exploration and exploitation of the solution space.
The training converged to a suboptimal solution after 6500 episodes in 5 days on an Intel Core i7-8700 CPU (3.2 GHz) with 32 GB RAM. The developed RL controller was tested using custom ambient and internal air temperature profiles. It tracked both the internal air temperature and superheat settings well, with low error and fast response times. However, when the ambient temperature fell below 29˚C, the actuators began to fluctuate, indicating that the controller had not learned a good policy in this region. This study showed that RL can optimize VCRC control, but further research is needed to improve it.
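To make the tracking objective concrete, the sketch below shows a reward function of the kind typically maximized by a TD3 agent for this dual-setpoint task. The abstract does not give the study's actual reward; the function name, the weights, and their values are illustrative assumptions only.

```python
# Hypothetical reward for a VCRC tracking controller: penalize squared error
# on the internal air temperature setpoint and on a 10 degC superheat setting.
# The weights w_temp and w_sh are assumed values, not taken from the study.

def vcrc_reward(air_temp, air_setpoint, superheat, superheat_setpoint=10.0,
                w_temp=1.0, w_sh=0.5):
    """Negative weighted squared tracking error; values closer to 0 are better."""
    temp_err = air_temp - air_setpoint
    sh_err = superheat - superheat_setpoint
    return -(w_temp * temp_err**2 + w_sh * sh_err**2)

# Perfect tracking yields the maximum reward of 0:
# vcrc_reward(18.0, 18.0, 10.0) -> 0.0
# A 1 degC air error and 2 degC superheat error yield -(1*1 + 0.5*4):
# vcrc_reward(19.0, 18.0, 12.0) -> -3.0
```

A shaped, always-negative reward of this form gives the agent a smooth gradient toward both setpoints; a penalty on large action changes is sometimes added to discourage the actuator fluctuations noted above.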