Sparsity and asynchrony in distributed optimization: models and convergence results
Abstract:
We live in a post-Moore era. The single-thread performance of computers has ceased to improve, and it is no longer possible to rely on the next generation of processors to solve an optimization problem faster or to handle more data. Instead, we have to develop optimization algorithms that make efficient concurrent use of multiple computing elements, be it the many cores of modern CPUs, the thousands of cores in typical GPUs, or the seemingly limitless pool of virtual machines in the cloud. The development of efficient parallel optimization algorithms requires a clear understanding of where performance is wasted and creative solutions for reducing this waste. On desktop computers, it is essential to respect the limitations of communication buses and memory hierarchies and to reduce the idle times of processing elements. In the cloud, it is important to parallelize algorithms so that they run close to the data stores, while accounting for heterogeneous workloads and communication delays between the participating compute servers.

In this talk, we will describe some of our recent efforts to understand how asynchrony and problem structure affect the provable performance of optimization algorithms. In the first part of the talk, we will focus on the effect of asynchrony on iterative algorithms. We will present two simple, yet powerful, convergence results that allow us to quantify how the degree of asynchrony affects convergence times, and we will demonstrate how these results yield short and sharp convergence proofs for a number of lock-free and distributed optimization algorithms.

In the second part of the talk, we will turn our attention to how problem structure affects the convergence speed of parallel optimization algorithms. We will discuss several measures of data sparsity and quantify how they influence the admissible step-sizes and the guaranteed convergence times of mini-batch algorithms for convex optimization. Special attention will be given to the pre-processing effort required to compute the different measures, and to their effectiveness in speeding up the online performance of mini-batch optimization on real-world data sets.

The talk is based on joint work with Hamid Reza Feyzmahdavian, Arda Aytekin, Sarit Khirirat and Vien Van Mai.
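As a rough flavor of the first theme (a minimal sketch, not the speaker's actual analysis), the following self-contained Python snippet simulates gradient descent on a convex quadratic when each update may use a gradient that is up to `tau` iterations stale. The 1/(L(tau+1)) step-size rule is a common heuristic from the delayed-gradient literature and is assumed here purely for illustration, as are all names and constants.

```python
import numpy as np

# Illustrative only: asynchronous gradient descent on the convex quadratic
# f(x) = 0.5 * x^T A x - b^T x, where each update may use a gradient
# evaluated at an iterate that is up to `tau` iterations old.

rng = np.random.default_rng(0)
n = 50
M = rng.standard_normal((n, n))
A = M @ M.T / n + np.eye(n)          # positive definite Hessian
b = rng.standard_normal(n)
x_star = np.linalg.solve(A, b)       # minimizer, used to measure error

L = np.linalg.eigvalsh(A).max()      # smoothness constant of f

def async_gd(tau, iters=2000):
    """Gradient descent with gradients delayed by up to tau steps.

    The step-size 1/(L*(tau+1)) is a common rule of thumb for delayed
    gradients, used here purely for illustration.
    """
    gamma = 1.0 / (L * (tau + 1))
    history = [np.zeros(n)]
    for _ in range(iters):
        d = rng.integers(0, tau + 1)                 # random staleness in [0, tau]
        x_old = history[max(0, len(history) - 1 - d)]
        g = A @ x_old - b                            # gradient at the stale iterate
        history.append(history[-1] - gamma * g)
    return np.linalg.norm(history[-1] - x_star)

for tau in [0, 5, 20]:
    print(f"tau={tau:2d}  final error={async_gd(tau):.2e}")
```

Running the sketch shows the qualitative effect the talk quantifies: larger admissible delays force smaller step-sizes and hence slower convergence for the same iteration budget.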
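For the second theme, a hypothetical sketch of a data-sparsity measure: for a data matrix whose rows are samples, count how many samples touch the busiest feature. Measures of this kind (akin to the "conflict degree" used in the Hogwild!/mini-batch literature) let a mini-batch method safely take larger step-sizes on sparse data. The scaling rule below is an illustrative assumption, not one of the measures from the talk.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, batch = 1000, 200, 32

# Sparse synthetic data: each sample touches about 5 of the n features.
X = np.zeros((m, n))
for i in range(m):
    X[i, rng.choice(n, size=5, replace=False)] = rng.standard_normal(5)

# Fraction of samples that touch the most heavily used feature.
nnz_per_feature = np.count_nonzero(X, axis=0)
delta = nnz_per_feature.max() / m
print(f"busiest feature is touched by {100 * delta:.1f}% of samples")

# Illustrative step-size rule: with nearly orthogonal (sparse) samples, a
# batch of size b can use a step-size close to b times the serial one;
# dense, highly-conflicting data forces it back towards the serial value.
serial_gamma = 0.1
batch_gamma = serial_gamma * batch / (1 + delta * (batch - 1))
print(f"serial step-size {serial_gamma}, batch step-size {batch_gamma:.3f}")
```

The pre-processing cost here is a single pass over the nonzeros of the data, which hints at the trade-off the abstract mentions between the effort of computing a sparsity measure and the speed-up it buys at run time.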
Biography: Mikael Johansson received the M.Sc. and Ph.D. degrees in Electrical Engineering from Lund University, Sweden, in 1994 and 1999, respectively. He held postdoctoral research positions at Stanford University and UC Berkeley before joining KTH in 2002, where he now serves as a full professor.
His research interests are in distributed optimization, wireless networking, and control. He has published several ISI highly cited papers, has served on the editorial boards of Automatica and the IEEE Transactions on Control of Network Systems, and has served on the program committees of conferences such as IEEE CDC, IEEE INFOCOM, and ACM SenSys.