Ring-mesh: A Scalable and High-performance Approach for Manycore Accelerators

Somnath Mazumdar*, Alberto Scionti

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

A growing number of works address the design challenges of fast, scalable solutions for an expanding range of new application types. Recently, many solutions have aimed at improving processing-element capabilities to speed up the execution of machine learning applications. However, only a few works have focused on the interconnection subsystem as a potential source of performance improvement. Packing many cores together offers excellent parallelism, but it brings other challenges (e.g., adequate interconnections). Scalable, power-aware interconnects are required to support such a growing number of processing elements, as well as modern applications. In this paper, we propose a scalable and energy-efficient network-on-chip architecture that fuses the advantages of rings and the 2D mesh, without using any bridge router, to provide high performance. A dynamic adaptation mechanism allows the network to better adapt to application requirements. Simulation results show efficient power consumption (up to 141.3% saving when connecting 1024 cores) and 2× (on average) throughput growth with better scalability (up to 1024 processing elements) compared to the popular 2D mesh when tested under multiple statistical traffic patterns.
Original language: English
Journal: Journal of Supercomputing
Volume: 76
Issue number: 9
Pages (from-to): 6720–6752
Number of pages: 33
ISSN: 0920-8542
Publication status: Published - Sep 2020
Externally published: Yes

Keywords

  • Interconnect
  • Network-on-chip
  • Manycores
  • Performance
  • Energy
  • Latency
  • Throughput
