Heterogeneous Computing Platforms

GOAL

The research team under Prof. Uri Weiser (Technion) will take a holistic system approach. The approach encompasses HW solutions and user- and OS-level SW that address the multiple facets of heterogeneous platforms (e.g., power management, scheduling, and memory sharing), while providing optimal solutions for the individual aspects arising in this context. This objective is broken into five interrelated pillars:

  • Optimal resource allocation. One of the basic themes of the team’s latest findings, the MultiAmdahl framework, will be used to extend optimal design to a range of problems and solutions, e.g., optimal on-die heterogeneous design, optimal system solutions, cloud resource sharing, and optimal scheduling. The space of optimal system targets under different limited resources will be explored. Formalizing targets versus resources should give researchers and designers new insights that will drive new research domains. The target of the optimal solution can be defined in different terms, e.g., maximum performance, minimum power, minimum energy, or maximum performance/power. Resources may be defined individually, e.g., power, energy, area, or cost.

The MultiAmdahl framework will be applied to a diversity of applications and market domains that have a variety of desired targets and resource constraints. The heterogeneous-computing research results will be supported by a simulator (e.g., Simics-based) running real benchmarks. The simulator will enable dynamic evaluation, using machine learning algorithms, of systems constructed according to the framework's equations.

  • Heterogeneous power management. Power management will play a central role in heterogeneous systems. Dynamic power tuning may provide considerable improvements in system efficiency. The goal of this research is to achieve maximum performance within a power envelope while minimizing power and energy consumption. The research will attempt to formalize the physical constraints on computation density, address power-efficient architectures for a class of applications, and propose formal controls and heuristics to manage power and performance at the various levels.
  • Machine learning-based scheduling. The research will investigate future heterogeneous scheduling based on machine learning. The system will collect behavioral information about the available tasks and use dynamic optimal resource sharing (based on Dynamic MultiAmdahl) to optimally assign OS tasks to the available heterogeneous HW. Scheduling and task assignment will be done by firmware, based on the OS characteristics of the ready tasks, the available HW, and a dynamic evaluation of the optimal solution. The team will investigate the use of machine learning and other techniques to optimally schedule tasks in a heterogeneous system.
  • HW-aware middleware SW. Programming multi-core hardware is widely considered one of the key challenges facing the computing world today. Heterogeneity, asymmetry, and novel cache architectures render the programming task even more daunting. We therefore contend that the usability of heterogeneous platforms critically relies on good programming abstractions, implemented as part of middleware services. This research will provide a range of abstractions for effective programming of parallel, heterogeneous systems. It seeks to provide standard OS abstractions, such as a file system abstraction, that can be utilized across diverse accelerators. Such an abstraction would allow, for example, a CPU and a GPU to interact in a standard way using files, breaking the current restrictive master-slave computing model of GPU usage.
  • Data sharing. Sharing data in today’s computing systems, which comprise multiple heterogeneous devices, can be difficult, cumbersome, and accelerator-specific, both within the operating system and within the applications it supports. Systems that utilize accelerators must be written so that they can deal with the details and peculiarities of each different accelerator they use. The goal of this research is to unify the disjoint system-software/operating-system layers of memory management, developing a data sharing model for applications and accelerators in which sharing is made simple and seamless. The result would be system software and applications with better performance and power efficiency, and a simplified programming model that increases productivity.
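
As an illustration of the optimal resource allocation pillar above, the following is a minimal sketch of splitting one limited resource (die area) among heterogeneous units so as to minimize total execution time. It is not the team's MultiAmdahl formulation: the square-root performance model, the function names allocate_area and total_time, and the sample numbers are all assumptions made for illustration.

```python
# Hypothetical model (an assumption, not taken from the project):
#   segment i takes time t_i / (k_i * sqrt(a_i)) on a unit of area a_i,
#   and the areas must satisfy sum(a_i) == total_area.
# Setting the derivative of the Lagrangian to zero gives the closed form
#   a_i proportional to (t_i / k_i) ** (2/3).


def allocate_area(times, speed_coeffs, total_area):
    """Return the area split that minimizes total time under the model above."""
    weights = [(t / k) ** (2.0 / 3.0) for t, k in zip(times, speed_coeffs)]
    total_weight = sum(weights)
    return [total_area * w / total_weight for w in weights]


def total_time(times, speed_coeffs, areas):
    """Total execution time of all segments for a given area split."""
    return sum(t / (k * a ** 0.5) for t, k, a in zip(times, speed_coeffs, areas))


if __name__ == "__main__":
    # Illustrative inputs: baseline time share and speed coefficient per unit.
    times = [0.6, 0.3, 0.1]
    speed_coeffs = [1.0, 4.0, 10.0]
    budget = 100.0

    best = allocate_area(times, speed_coeffs, budget)
    equal = [budget / len(times)] * len(times)

    print("optimized split:", [round(a, 2) for a in best])
    print("time, optimized:", total_time(times, speed_coeffs, best))
    print("time, equal    :", total_time(times, speed_coeffs, equal))
```

Under this assumed model, the optimum is reached when moving a small amount of area from any unit to any other no longer reduces the total time. Other targets named above (minimum power, minimum energy, maximum performance/power) would change the objective function but not the overall structure of the problem.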

The research team plans a holistic approach to heterogeneous systems by merging several of the projects into one coherent solution.

The research outcome will include:

  • First year deliverables:
    • Concept of the practical tools for optimal resource sharing.
    • Algorithms and heuristics for new power-efficient CPU management methods.
    • Phase 1 of programming abstractions that work the same way across a range of architectures.
    • Conceptual SW abstractions for HW idiosyncrasies such as LLC, NUMA, and accelerators.
    • Unified framework for the multiple memory-management SW layers and data sharing models.
  • Long-term (3 years) deliverables:
    • Provide practical tools for optimal resource sharing.
    • Programming abstractions that will work the same way across a range of architectures. The approach includes libraries to facilitate parallel programming, and OS-like abstractions for multi-process synchronization across heterogeneous processors.
    • Exploit the unified framework for the multiple memory-management SW layers and data sharing models for applications and accelerators in which sharing is made simple and seamless.
    • Efficient implementation of abstractions for HW idiosyncrasies such as LLC, NUMA, and accelerators, and their interfaces. This tackles a key obstacle to widespread adoption of heterogeneous platforms with a multitude of computing units.

The expected overall result is significantly simplified and better-performing system software and user applications, leading to reduced runtime overheads and higher application development productivity, notably in domains that benefit from accelerators, such as machine learning and data analytics.

STATUS

TBD

PEOPLE

Prof. Uri Weiser, Technion EE
Prof. Yoav Etsion, Technion EE/CS
Prof. Ran Ginosar, Technion EE
Prof. Idit Keidar, Technion EE
Prof. Isaac Keslassy, Technion EE
Prof. Dan Tsafrir, Technion CS
Tomer Morad
Noam Shalev
Tsahee Zidenberg
Alon Nave
Oved Itzhak
Efi Rotem
Yinnon Meshei
Leeor Peled
Muli Ben-Yehuda
Nadav Amit
Moshe Malka
Ilya Lesokhin

PUBLICATIONS

  1. Jawad Haj-Yihia, Yosi Ben-Asher, Efraim Rotem, “Compiler Assessed CPU Power Management”, Compiler, Architecture and Tools Conference, sponsored by HiPEAC, Haifa, Israel, November 2013.
  2. Jawad Haj-Yihia, Yosi Ben-Asher, Efraim Rotem, “Superscalar Micro-Architecture and Compiler Techniques for Power Reduction”, submitted to ISPASS 2014.

Uri Weiser

  1. E. Rotem, A. Mendelson, R. Ginosar, and U. C. Weiser, “Multiple clock and voltage domains for chip multi processors”. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 42), 2009.
  2. Amir Morad, Leonid Yavits, Tomer Morad, Ran Ginosar, Uri Weiser, “Generalized MultiAmdahl: Optimization of Heterogeneous Multi-Accelerator SoC”, IEEE Computer Architecture Letters, 2013.
  3. E. Rotem, R. Ginosar, U. C. Weiser, A. Mendelson, “Energy-efficient Computing in High Performance Systems”, in: Proceedings of the Fifth International Workshop on Energy-Efficient Design (WEED 2013), held with ISCA-40, June 24th, 2013.
  4. E. Rotem, R. Ginosar, U. C. Weiser, A. Mendelson, “Power and thermal constraints of modern system on a chip computer”, in: Proceedings of the 19th International Workshop on Thermal Investigations of ICs, THERMINIC 2013, September 2013.
  5. E. Rotem, R. Ginosar, U. C. Weiser, A. Mendelson, “Power and thermal constraints of modern system on a chip computer”, accepted for publication in Elsevier Microelectronics Journal.
  6. E. Rotem, R. Ginosar, U. C. Weiser, A. Mendelson, “Energy Aware Race to Halt: A Down to EARtH Approach for Platform Energy Management”, IEEE Computer Architecture Letters, vol. 99.
  7. E. Rotem, R. Ginosar, U. C. Weiser, A. Mendelson, “H-EARtH: Heterogeneous Platform Energy Management”, submitted for publication in IEEE TC.
  8. T. Zidenberg, Isaac Keslassy, U. Weiser, “Optimal Resource Allocation with MultiAmdahl”, IEEE MICRO Journal August 2013.
  9. T. Morad, A. Kolodny, U. Weiser, “Task Scheduling Based On Thread Essence and Resource Limitations”, Journal of Computers, Vol 7, No 1 (2012), 53-64, January 2012.
  10. T. Zidenberg, Isaac Keslassy, U. Weiser, “MultiAmdahl: How Should I Divide My Heterogeneous Chip?”, Computer Architecture Letters (CAL), February 20th, 2012. CAL best paper award.
  11. T. Morad, A. Kolodny, U. Weiser, “Scheduling Multiple Multithreaded Applications on Asymmetric and Symmetric Chip Multiprocessors”, PAAD’10 Conference, Dalian, LiaoNing, China, December 2010.
  12. Zvika Guz, Oved Itzhak, Idit Keidar, Avinoam Kolodny, Avi Mendelson, Uri C. Weiser, “Threads vs. Caches: Modeling the Behavior of Parallel Workloads on High-Performance Engines”, ICCD, Amsterdam, Holland, October 2010.
  13. E. Rotem, R. Ginosar, A. Mendelson, U. Weiser, “Multiple Clock and Voltage Domains for Chip Multi Processors”, MICRO 2009 conference, NY, NY, December 12th, 2009. HiPEAC grant award.
  14. S. Kvatinsky, Y. Nacson, Y. Etsion, E. Friedman, A. Kolodny, U. Weiser, “Memristor-Based Multithreading”, Computer Architecture Letters (CAL) Journal March 2013
  15. S. Kvatinsky, E. G. Friedman, A. Kolodny, and U. C. Weiser, “TEAM – ThrEshold Adaptive Memristor Model,” IEEE Transactions on Circuits and Systems, Journal Vol. 60, No. 1, pp. 211-221, January 2013
  16. S. Kvatinsky, K. Talisveyberg, D. Fliter, E. G. Friedman, A. Kolodny, and U. C. Weiser, “Models of Memristors for SPICE Simulations”, Proc. of the IEEE Convention of Electrical and Electronics Engineers in Israel, pp. 1-5, November 2012.
  17. S. Kvatinsky, K. Talisveyberg, D. Fliter, E.G. Friedman, A. Kolodny, and U. Weiser “Verilog-A for Memristor Models”, Technion CCIT Technical Report #801 January 2012
  18. O. Itzhak, I. Keidar, A. Kolodny, and U. C. Weiser, “Performance Scalability and Dynamic Behavior of Parsec Benchmarks on Many-Core Processors”. In 4th Workshop on Systems for Future Multicore Architectures (SFMA 2014), co-located with EuroSys 2014.

Yoav Etsion

  1. N. Azuelos, Y. Etsion, I. Keidar, A. Zaks, and E. Ayguade, “Introducing Speculative Optimizations in Task Dataflow with Language Extensions and Runtime Support”. DFM Workshop, in conjunction with PACT’12.
  2. S. Kvatinsky, Y. Nacson, Y. Etsion, E. Friedman, A. Kolodny, U. Weiser, “Memristor-Based Multithreading”, Computer Architecture Letters (CAL) Journal March 2013

Ran Ginosar

  1. Amir Morad, Leonid Yavits, Tomer Morad, Ran Ginosar, Uri Weiser, “Generalized MultiAmdahl: Optimization of Heterogeneous Multi-Accelerator SoC”, IEEE Computer Architecture Letters, 2013.
  2. Amir Morad, Leonid Yavits, Tomer Morad, Ran Ginosar, “Optimization of Asymmetric and Heterogeneous MultiCore”, submitted to IEEE Transactions on Computers, 2013.
  3. Amir Morad, Leonid Yavits, Ran Ginosar, “Convex Optimization of Resource Allocation in Asymmetric and Heterogeneous MultiCores”, in preparation.
  4. Leonid Yavits, Amir Morad, and Ran Ginosar. “Cache Hierarchy Optimization.” IEEE Computer Architecture Letters, 2013.
  5. Leonid Yavits, Amir Morad, and Ran Ginosar. “The effect of communication and synchronization on Amdahl’s law in multicore systems”, Parallel Computing journal, 2013.
  6. E. Rotem, A. Mendelson, R. Ginosar, and U. C. Weiser, “Multiple clock and voltage domains for chip multi processors”. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 42), 2009.
  7. E. Rotem, R. Ginosar, U. C. Weiser, A. Mendelson, “Energy-efficient Computing in High Performance Systems”, in: Proceedings of the Fifth International Workshop on Energy-Efficient Design (WEED 2013), held with ISCA-40, June 24th, 2013.
  8. E. Rotem, R. Ginosar, U. C. Weiser, A. Mendelson, “Power and thermal constraints of modern system on a chip computer”, in: Proceedings of the 19th International Workshop on Thermal Investigations of ICs, THERMINIC 2013, September 2013.
  9. E. Rotem, R. Ginosar, U. C. Weiser, A. Mendelson, “Power and thermal constraints of modern system on a chip computer”, accepted for publication in Elsevier Microelectronics Journal.
  10. E. Rotem, R. Ginosar, U. C. Weiser, A. Mendelson, “Energy Aware Race to Halt: A Down to EARtH Approach for Platform Energy Management”, IEEE Computer Architecture Letters, vol. 99.
  11. E. Rotem, R. Ginosar, U. C. Weiser, A. Mendelson, “H-EARtH: Heterogeneous Platform Energy Management”, submitted for publication in IEEE TC.
  12. E. Rotem, R. Ginosar, A. Mendelson, U. Weiser, “Multiple Clock and Voltage Domains for Chip Multi Processors”, MICRO 2009 conference, NY, NY, December 12th, 2009. HiPEAC grant award.

Idit Keidar

  1. Zvika Guz, Oved Itzhak, Idit Keidar, Avinoam Kolodny, Avi Mendelson, Uri C. Weiser, “Threads vs. Caches: Modeling the Behavior of Parallel Workloads on High-Performance Engines”, ICCD, Amsterdam, Holland, October 2010.
  2. E. Gidron, I. Keidar, D. Perelman, and Y. Perez, “SALSA: Scalable and Low Synchronization NUMA-aware Algorithm for Producer-Consumer Pools”. SPAA’12
  3. M. Silberstein, B. Ford, I. Keidar and E. Witchel, “GPUfs: Integrating a File System with GPUs”. To appear in TOCS, earlier version in ASPLOS’13.
  4. N. Azuelos, Y. Etsion, I. Keidar, A. Zaks, and E. Ayguade, “Introducing Speculative Optimizations in Task Dataflow with Language Extensions and Runtime Support”. DFM Workshop, in conjunction with PACT’12.
  5. O. Itzhak, I. Keidar, A. Kolodny, and U. C. Weiser, “Performance Scalability and Dynamic Behavior of Parsec Benchmarks on Many-Core Processors”. In 4th Workshop on Systems for Future Multicore Architectures (SFMA 2014), co-located with EuroSys 2014.
  6. S. Patterson, Y. Eldar, and I. Keidar, “Distributed Compressed Sensing in Dynamic Networks”. In the 1st IEEE Global Conference on Signal and Information Processing (GlobalSIP’13), December 2013, Austin, Texas. Full version (also of ICASSP’13 paper): Distributed Compressed Sensing For Static and Time-Varying Networks, arXiv:1308.6086v1.
  7. I. Eyal, I. Keidar, S. Patterson, and R. Rom, “In-Network Analytics for Ubiquitous Sensing”. In the 27th Int’l Symp. on DIStributed Computing (DISC), Lecture Notes in Computer Science Volume 8205, pages 512-526, Jerusalem, Israel, October 2013.
  8. I. Eyal, F. Junqueira, and I. Keidar, “Thinner Clouds with Preallocation”. In 5th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud ’13), San Jose, CA, June 2

Isaac Keslassy

  1. T. Zidenberg, Isaac Keslassy, U. Weiser, “Optimal Resource Allocation with MultiAmdahl”, IEEE MICRO Journal August 2013.
  2. Ori Rottenstreich, Yossi Kanizo and Isaac Keslassy, “The Variable-Increment Counting Bloom Filter”, IEEE/ACM Transactions on Networking, accepted for publication, 2014.
  3. Ori Rottenstreich, Pu Li, Inbal Horev, Isaac Keslassy and Shivkumar Kalyanaraman, “The Switch Reordering Contagion: Preventing a Few Late Packets from Ruining the Whole Party,” IEEE Transactions on Computers, accepted for publication, 2014.
  4. Eitan Zahavi, Isaac Keslassy, and Avinoam Kolodny, “Distributed Adaptive Routing Convergence to Non-Blocking DCN Routing Assignments,” IEEE Journal on Selected Areas in Communications, Vol. 56, No. 1, January 2014.
  5. Ori Rottenstreich, Marat Radan, Yuval Cassuto, Isaac Keslassy, Carmi Arad, Tal Mizrahi, Yoram Revah, and Avinatan Hassidim, “Compressing Forwarding Tables,” IEEE Infocom ’13, Turin, Italy, April 2013.
  6. Ori Rottenstreich, Marat Radan, Yuval Cassuto, Isaac Keslassy, Carmi Arad, Tal Mizrahi, Yoram Revah, and Avinatan Hassidim, “Compressing Forwarding Tables for Datacenter Scalability,” IEEE Journal on Selected Areas in Communications, Vol. 56, No. 1, January 2014.
  7. Ori Rottenstreich, Isaac Keslassy, Yoram Revah, and Aviran Kadosh, “Minimizing Delay in Shared Pipelines,” IEEE Hot Interconnects 21, San Jose, CA, August 2013.
  8. Ori Rottenstreich, Amit Berman, Yuval Cassuto and Isaac Keslassy, “Compression for Fixed-Width Memories,” IEEE ISIT ’13, Istanbul, Turkey, July 2013.
  9. Ori Rottenstreich, Isaac Keslassy, Avinatan Hassidim, Haim Kaplan, and Ely Porat, “On Finding an Optimal TCAM Encoding Scheme for Packet Classification,” IEEE Infocom ’13, Turin, Italy, April 2013.
  10. Yossi Kanizo, David Hay and Isaac Keslassy, “Palette: Distributing Tables in Software-Defined Networks,” IEEE Infocom ’13, Mini-conference, Turin, Italy, April 2013.
  11. Yossi Kanizo, David Hay and Isaac Keslassy, “Access-efficient Balanced Bloom Filters,” Computer Communications, Vol. 36, No. 4, pp. 373-385, February 2013.
  12. Eitan Zahavi, Isaac Keslassy, and Avinoam Kolodny, “Distributed Adaptive Routing for Big-Data Applications Running on Data Center Networks,” ACM/IEEE ANCS ’12, Austin, TX, October 2012.
  13. Alex Shpiner, Erez Kantor, Pu Li, Israel Cidon and Isaac Keslassy, “On the Capacity of Bufferless Networks-on-Chip,” 50th Allerton Conference, Monticello, IL, October 2012.

Dan Tsafrir