Towards Building Analytical Models for Monitoring Large-scale System
The core business of many companies relies on its IT system, which comprises a large number of heterogeneous software and hardware resources. Maintaining the performance and functionality of the whole IT system when failures inevitably occur has become an increasingly difficult challenge for these companies' IT administrators. A monitoring system helps IT administrators track the ever-changing states of resources and can speed up problem solving when a failure or anomaly occurs. Although a number of open source and commercial tools are available, each with varying capabilities, they cannot satisfy requirements such as scalability, intelligent analysis, and high-level decision-making support as IT systems become more and more complicated. Analytical models should be developed to address some of these system monitoring problems. This tutorial first introduces the challenges of developing such models. It then discusses three important topics in detail: an event-based approach for information representation, processing, and transfer; performance prediction algorithms; and anomaly detection algorithms.
AnalyticModelsforMonitoring.pdf — PDF document, 13169Kb