Machine Learning for Architecture: Self-Learning, Predictive Computer Systems

GOAL

The research team led by Dr. Yoav Etsion (Technion) will lay the foundation for intelligent computer systems that use machine learning methods to learn from the workload they experience, adapt to it, and improve their performance, efficiency, and robustness over time. This overall objective can be broken into several interrelated research paths, such as:

  • Self-learning workload characterization. Apply machine learning and data-mining techniques to workload analysis, and develop efficient methods that enable a computer system to uncover system resource usage patterns and predict future requirements (see the sketch after this list).
  • Predictive resource allocation and scheduling. Develop resource management and scheduling algorithms that combine the dynamically refined usage patterns and performance characteristics to improve the overall system performance over time.
  • Intelligent computer architectures. Design architectural mechanisms that learn from and adapt to the running workload, e.g., smart memory hierarchies and processing units. Explore architectural support for the analysis of execution patterns, such as performance counters and data flow trackers, to enable real-time system adaptation. This architectural support will provide the features (or context) used by many machine learning algorithms.
  • Agile memory and storage architectures. Design efficient memory hierarchies and endow them with dynamic and adaptive placement and access capabilities.
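
As a purely illustrative sketch of the first research path, workload characterization can be framed as unsupervised learning over sampled hardware performance counters. The specific counters, the 10ms sampling interval, and the use of off-the-shelf k-means clustering below are assumptions made for illustration only, not choices prescribed by the project.

# Illustrative sketch: cluster per-interval performance-counter vectors
# (IPC, LLC miss rate, DRAM bandwidth) into workload "phases" without labels.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for counters sampled every 10ms: [IPC, LLC miss rate, DRAM BW (GB/s)]
compute_phase = rng.normal([2.0, 0.02, 2.0], 0.1, size=(200, 3))
memory_phase  = rng.normal([0.6, 0.30, 18.0], 0.5, size=(200, 3))
samples = np.vstack([compute_phase, memory_phase])

# Uncover recurring resource-usage patterns (phases) from the samples.
phases = KMeans(n_clusters=2, n_init=10, random_state=0).fit(samples)

# Classify a newly observed interval; a resource manager could key allocation
# decisions (e.g., cache partitioning or DVFS) off the predicted phase label.
new_interval = np.array([[0.7, 0.28, 17.5]])
print("predicted phase:", phases.predict(new_interval)[0])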

These objectives embody the tight connection between predictable application behavior and efficient resource provisioning. At its most fundamental level, the research will focus on tracking application behavior and predicting its future execution profile and resource consumption. The natural fit between these objectives and machine learning has sporadically yielded promising results in the design of memory controllers [1], performance estimation [2], and even branch predictors [3]. This project strives to generalize this concept and to apply machine learning methods to the computer system as a whole.
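
To make this fit concrete, a perceptron-style branch predictor is one well-known example of learning embedded inside a microarchitectural mechanism. The minimal sketch below is illustrative only; the table size, history length, and training threshold are arbitrary choices and do not describe any specific published design or the work cited above.

# Minimal sketch of a perceptron-style branch predictor: per-branch weights are
# trained online from outcomes and a global history register.
HIST_LEN = 8
THRESHOLD = 16          # train only on mispredictions or low-confidence outputs
N_ENTRIES = 256

weights = [[0] * (HIST_LEN + 1) for _ in range(N_ENTRIES)]  # +1 for bias weight
history = [1] * HIST_LEN                                    # +1 taken, -1 not taken

def predict(pc):
    w = weights[pc % N_ENTRIES]
    y = w[0] + sum(wi * hi for wi, hi in zip(w[1:], history))
    return y, y >= 0                       # predict "taken" iff y >= 0

def update(pc, y, taken):
    t = 1 if taken else -1
    w = weights[pc % N_ENTRIES]
    if (y >= 0) != taken or abs(y) <= THRESHOLD:
        w[0] += t
        for i, hi in enumerate(history):
            w[i + 1] += t * hi
    history.pop(0)
    history.append(t)

# Toy usage: the predictor learns an alternating taken/not-taken pattern.
correct = 0
for i in range(1000):
    outcome = (i % 2 == 0)
    y, guess = predict(0x40)
    correct += (guess == outcome)
    update(0x40, y, outcome)
print(f"accuracy on alternating pattern: {correct / 1000:.2f}")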

The first two planned projects are:

Getting More from the Memory Hierarchy with Machine Learning. The memory hierarchy consists of layers (registers, caches, DRAM, NVM, disks), where each layer is (usually) smaller, faster, and more expensive than the layer below it. Managing the memory hierarchy effectively requires answering at least three questions: (1) which data should be inserted into the hierarchy (a prediction problem); (2) when the data should be inserted (a prefetching problem); and (3) into which level the data should be inserted (a cache-bypassing problem). Employing sophisticated machine learning approaches to these prediction problems may yield significant performance improvements.
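
As an illustration of question (3), the sketch below shows a cache level whose bypass decision is learned online: a per-PC predictor is rewarded when blocks inserted by that PC are reused and penalized when they are evicted dead. The per-PC saturating counters, the tiny LRU cache model, and the specific sizes are simplifications chosen for illustration and are not the project's design.

# Illustrative sketch of learned cache bypassing: blocks predicted "dead on
# arrival" skip this cache level instead of polluting it.
from collections import OrderedDict

class BypassingCache:
    def __init__(self, capacity=4, counter_max=3):
        self.capacity = capacity
        self.counter_max = counter_max
        self.lines = OrderedDict()      # addr -> (inserting_pc, reused_flag)
        self.reuse_counter = {}         # pc -> saturating reuse confidence

    def _predict_reuse(self, pc):
        # Bypass only once we are confident this PC's blocks see no reuse.
        return self.reuse_counter.get(pc, self.counter_max // 2) > 0

    def access(self, pc, addr):
        if addr in self.lines:                          # hit: reward the PC
            ins_pc, _ = self.lines[addr]
            self.lines[addr] = (ins_pc, True)
            self.lines.move_to_end(addr)
            c = self.reuse_counter.get(ins_pc, 0)
            self.reuse_counter[ins_pc] = min(c + 1, self.counter_max)
            return "hit"
        if not self._predict_reuse(pc):                 # predicted dead: bypass
            return "bypass"                             # (a real design would occasionally insert anyway to keep learning)
        if len(self.lines) >= self.capacity:            # evict LRU, learn from it
            _, (ev_pc, reused) = self.lines.popitem(last=False)
            if not reused:                              # dead block: penalize its PC
                c = self.reuse_counter.get(ev_pc, 0)
                self.reuse_counter[ev_pc] = max(c - 1, 0)
        self.lines[addr] = (pc, False)
        return "miss"

# Toy usage: a streaming PC (0xA) thrashes the cache until it is learned to
# bypass, after which a reuse-heavy PC (0xB) keeps its hot working set cached.
cache = BypassingCache()
for round_ in range(50):
    for addr in range(8):
        cache.access(0xA, 0x1000 + round_ * 8 + addr)   # streaming, no reuse
    for addr in range(4):
        cache.access(0xB, 0x2000 + addr)                # small hot working set
print("streaming PC bypasses now:", not cache._predict_reuse(0xA))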

Predictive Task Scheduling. Task-based programming is emerging as an effective methodology for expressing parallelism: it enables programmers to focus on expressing the computation itself and delegate the management and scheduling of parallel tasks to a runtime software layer. Existing task schedulers are limited in their potential because they are oblivious to program semantics; they typically have no notion of intra-task characteristics, inter-task data and control dependencies, or the essence of the computation. This project will integrate machine learning into the task scheduling process. Machine learning can dynamically model application-specific workloads, adapt to them, and guide the task scheduler whenever a task dispatch decision must be made.
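
The sketch below illustrates one possible shape of such integration: a per-task-type cost model is refined online from measured runtimes, and each dispatch goes to the worker with the smallest predicted completion time. The linear cost model, the exponential moving average, the greedy dispatch rule, and the task names ("fft", "copy") are illustrative assumptions, not the project's scheduler.

# Illustrative sketch of a predictive task scheduler.
import heapq
from collections import defaultdict

class RuntimeModel:
    """Online estimate: runtime(task) ~= coef[task_type] * input_size."""
    def __init__(self):
        self.coef = defaultdict(lambda: 1.0)   # optimistic prior

    def predict(self, task_type, input_size):
        return self.coef[task_type] * input_size

    def observe(self, task_type, input_size, measured_runtime):
        # Exponential moving average of the observed cost per unit of input.
        sample = measured_runtime / max(input_size, 1)
        self.coef[task_type] = 0.9 * self.coef[task_type] + 0.1 * sample

def dispatch(tasks, n_workers, model):
    """Greedy dispatch: send each task to the worker predicted to free up earliest."""
    workers = [(0.0, w) for w in range(n_workers)]   # (predicted finish time, worker id)
    heapq.heapify(workers)
    placement = []
    for task_type, input_size in tasks:
        finish, worker = heapq.heappop(workers)
        est = model.predict(task_type, input_size)
        heapq.heappush(workers, (finish + est, worker))
        placement.append((task_type, worker))
    return placement

# Toy usage: the model learns that "fft" tasks cost ~3x per element vs "copy",
# so the dispatcher balances predicted work rather than task counts.
model = RuntimeModel()
for _ in range(100):
    model.observe("fft", 1000, 3000.0)
    model.observe("copy", 1000, 1000.0)
plan = dispatch([("fft", 500), ("copy", 500), ("fft", 500), ("copy", 2000)],
                n_workers=2, model=model)
print(plan)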

The team plans to evaluate the research by extending the gem5 architectural simulator and the Linux kernel with experimental proof-of-concept implementations of the exploratory designs. These proof-of-concept implementations will carry several intelligent properties, including autonomous workload characterization mechanisms that automatically generate workload models, and predictive data and work scheduling that uses these models to improve overall system efficiency. The first phase of the project (through the end of 2013) will be dedicated to studying how well existing machine learning methods are suited to computer-system workloads and how they should be enhanced to model such workloads. Towards the end of the project’s third year, the team will design architectural and software components, such as a resource scheduler, that integrate machine intelligence into the computer system proper. This experimental stage will enhance the Linux kernel’s existing subsystems with machine-intelligence-enabled components.

STATUS
PEOPLE
Prof. Yoav Etsion, Technion EE/CS
Prof. Assaf Schuster, Technion EE
Prof. Dan Tsafrir, Technion EE
Prof. Shie Mannor, Technion EE
Nadav Amit
Muli Ben-Yehuda
Adi Fuchs
Mickey Gabel
Moshe Malka
PUBLICATIONS
Yoav Etsion ➭
  1. Adi Fuchs, Shie Mannor, Uri Weiser and Yoav Etsion, “Loop-Aware Memory Prefetching Using Code-Block Working Sets”, Under submission to ISCA 2014.

Assaf Schuster ➭

Dan Tsafrir ➭

Shie Mannor ➭