Network scientific periodic publication

Trends of high-speed communication networks development for multiprocessor computing systems

2019, No. 108

DOI: 10.34759/trd-2019-108-14

Authors

Simonov A. S.¹^*, Semenov A. S.²^**, Makagon D. V.¹^***

1. Scientific Research Center of Electronic Computing Technology, 125, Varshavskoye shosse, Moscow, 117587, Russia
2. Moscow Aviation Institute (National Research University), 4, Volokolamskoe shosse, Moscow, A-80, GSP-3, 125993, Russia

*e-mail: simonov@nicevt.ru
**e-mail: semenov@nicevt.ru
***e-mail: makagond@nicevt.ru

Abstract

The internode communication network is a key component of massively parallel supercomputers. JSC “NICEVT” has developed the first generation of the Angara high-performance communication network, designed for use in petaflops-class multiprocessor computing systems (MCS) and for ensuring high performance scalability on real-world problems.

The following main trends can be distinguished in the development of communication networks for MCS:

enhanced connectivity,
use of optical connections,
integration of the network adapter (NIC) with the processor,
network universalization for the mass market through support for the Ethernet protocol family.

The second generation of the Angara network with multidimensional torus topology, whose development is scheduled for completion in 2019, is designed for building MCS in the sub-exaflops performance range. It supports a modified torus topology and offers much higher NIC characteristics, allowing up to four processors to be connected at each node.

In developing the specification of the second-generation Angara communication network, emphasis was placed not only on improving operational parameters but also on significantly expanding functionality, with the aim of effective operation both in the high-performance MCS segment and in data storage and Big Data processing systems.

These capabilities include support for SR-IOV virtualization technology, protection of packets from third-party tasks, non-guaranteed packet delivery for implementing the TCP/IP protocol stack, more efficient routing algorithms for networks with a “modified torus” topology, zonal adaptive routing, and other features.

The third generation of the Angara network, whose development is planned for 2021-2023, targets exaflops-class supercomputers and is characterized by further improvement in NIC characteristics and the use of optical connections.

Keywords:

multiprocessor computing systems, communication network, performance scalability

