Cloud Control Systems - Real Time Analytics
Abstract:
Event-Driven, Service-Oriented Cloud Systems can be viewed as systems of multi-variable time-series data. This makes them amenable to analysis using discrete-time signal-processing and feedback control system analysis techniques. Some illustrative examples from a real-world Cloud System are presented: anomaly detection and classification using statistical tests, correlation, and causal inference based on service dependencies; qualification of new code deployments into production, automated failure detection and recovery; and auto-scaling challenges and remedies, with use of feed-forward predictors to survive outages. A general architecture for Cloud Systems Analytics is presented, together with a view of the interesting challenges ahead.
Biography:
Dr. P. Simon Tuffs obtained his D.Phil. degree in Self-Tuning Control Systems at the University of Oxford, and has worked for the past 25 years in a wide range of positions spanning Control and Signal Processing Systems and Software Engineering. He is currently engaged at Netflix performing Real-Time analytics: automated decision making based on real-time data to diagnose, adapt, and repair the large-scale Netflix Cloud system.