energy audit, school buildings, energy consumptions, energy retrofit, cluster analysis
The current European debate on the energy retrofit of existing buildings focuses on identifying the most convenient retrofit actions from both a technical and an economic point of view. The usual methodology is a cost-optimal analysis of alternative retrofit measures, conducted on a representative reference building, as is done when defining new legal performance requirements. Defining a reference building within a sample requires the analysis of a large amount of information. Many data mining algorithms can be used to find correlations and patterns. One such technique is cluster analysis, in which a set is divided into several homogeneous groups whose elements have similar characteristics. The aim of this work is to explore the possibility of supporting the energy audit of a large building stock by using a few synthetic descriptors calculated for homogeneous groups identified by means of clustering. A group of 60 schools located in the province of Treviso, in Northern Italy, has been analyzed. Metered energy consumption and seasonal degree days were available for the last five-year period. Regarding the schools' geometrical features, the gross and net heated volume, the floor area, the window area, and the dispersing envelope surface are known. Moreover, the thermal resistance of the building envelope components and the type of heating system are available. Energy and geometrical indicators have been calculated: the ratio between dispersing area and gross heated volume, the window-to-wall ratio, the energy consumption per unit volume, and the energy consumption per unit volume and degree day. In order to cluster the schools, the sets of parameters that best explain the energy performance have been determined by considering the best multiple regressions between each possible group of parameters and the total energy consumption. K-means cluster analysis was then performed on the school population using the parameters in those sets.
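As a rough illustration of the indicators and the clustering step described above, the following minimal sketch computes the four indicators and runs a plain k-means (Lloyd's algorithm) on standardized values. It is not the authors' implementation: the function names, the units, the separate wall area used for the window-to-wall ratio, the choice of three clusters, and the randomly generated data are all assumptions made only so the example runs.

```python
import numpy as np


def energy_indicators(energy_kwh, gross_volume_m3, envelope_area_m2,
                      window_area_m2, wall_area_m2, degree_days):
    """Return the four indicators named in the text, one column each."""
    shape_ratio = envelope_area_m2 / gross_volume_m3  # dispersing area / gross heated volume
    window_to_wall = window_area_m2 / wall_area_m2    # wall area is an assumed input
    energy_per_volume = energy_kwh / gross_volume_m3
    energy_per_volume_dd = energy_per_volume / degree_days
    return np.column_stack([shape_ratio, window_to_wall,
                            energy_per_volume, energy_per_volume_dd])


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on z-scored columns (assumes no constant column)."""
    rng = np.random.default_rng(seed)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Initialize centers with k distinct observations chosen at random.
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each observation to its nearest center (Euclidean distance).
        dists = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Hypothetical data for 60 buildings; none of these values come from the study.
rng = np.random.default_rng(42)
n = 60
volume = rng.uniform(2000, 20000, n)
envelope = volume * rng.uniform(0.3, 0.7, n)
wall = envelope * rng.uniform(0.3, 0.5, n)
window = wall * rng.uniform(0.1, 0.4, n)
degree_days = rng.uniform(2200, 2800, n)
energy = volume * rng.uniform(20, 60, n)

indicators = energy_indicators(energy, volume, envelope, window, wall, degree_days)
labels, centers = kmeans(indicators, k=3)
```

Standardizing the columns before clustering is a design choice of this sketch, made because the indicators differ by orders of magnitude; the source does not state how the variables were scaled.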
Two main issues have to be dealt with in this analysis: the type and most suitable number of parameters to be correlated with energy consumption, and the suitable number of clusters to be determined. Concerning the first aspect, the parameters have been grouped in all possible combinations of 2 to 8 elements, and a multiple linear regression was calculated for each combination. The larger the set, the more precise the correlation is expected to be, but only negligible changes in the coefficient of determination were found beyond 6 parameters, so a set of 6 seems to be an acceptable compromise between representativeness and complexity. As regards the second issue, the regression analysis has been repeated for each cluster found, to check whether the correlation between the parameters and the energy consumption improves inside each cluster with respect to the whole sample. Increasing the number of clusters is expected to improve the correlation coefficient. In the paper, optimization techniques have been applied to define the parameters and the minimum number of clusters that give the best level of correlation.
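The exhaustive search over parameter sets described above can be sketched as follows: every combination of 2 to 8 candidate parameters is fitted by ordinary least squares, and the combination with the highest coefficient of determination is kept for each set size. This is only an illustrative sketch. The helper names `r_squared` and `best_subsets`, the parameter labels `p0` to `p7`, and the synthetic data are hypothetical and do not reproduce the study's data or results.

```python
from itertools import combinations

import numpy as np


def r_squared(X, y):
    """Coefficient of determination of an OLS fit of y on X with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


def best_subsets(X, y, names, min_size=2, max_size=8):
    """For each subset size, return (R^2, parameter names) of the best subset."""
    n_params = X.shape[1]
    best = {}
    for size in range(min_size, min(max_size, n_params) + 1):
        scored = [
            (r_squared(X[:, list(idx)], y), idx)
            for idx in combinations(range(n_params), size)
        ]
        r2, idx = max(scored)  # highest R^2 among all subsets of this size
        best[size] = (r2, tuple(names[i] for i in idx))
    return best


# Synthetic example: 60 observations, 8 candidate parameters, and a response
# that depends on the first three only (an assumption made for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(scale=0.1, size=60)
names = [f"p{i}" for i in range(8)]

best = best_subsets(X, y, names)
for size, (r2, subset) in best.items():
    print(size, round(r2, 4), subset)
```

The same `r_squared` helper could be applied to the rows belonging to each cluster to compare the within-cluster fit against the fit on the whole sample. Because the best R² cannot decrease as parameters are added, a diminishing-returns criterion (such as the plateau beyond 6 parameters reported in the text) is needed to choose the set size.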