

"Aurora, with over 60,000 Intel Max GPUs, a very fast I/O system, and an all-solid-state mass storage system, is the perfect environment to train these models."Įven though Aurora blades have been installed, the supercomputer still has to undergo and pass a series of acceptance tests, a common procedure for supercomputers. " While we work toward acceptance testing, we are going to be using Aurora to train some large-scale open-source generative AI models for science," said Rick Stevens, Argonne National Laboratory associate laboratory director. Meanwhile, before the system passes ANL's acceptance tests, it will be used for large-scale scientific generative AI models. The supercomputer, which will be used for a wide variety of workloads from nuclear fusion simulations to whether prediction and from aerodynamics to medical research, uses HPE's Shasta supercomputer architecture with Slingshot interconnects.

For now, Argonne National Laboratory does not publish official power consumption numbers for Aurora or its storage subsystem.

Meanwhile, that does not count the storage subsystem of Aurora, which employs 1,024 all-flash storage nodes offering 220TB of storage capacity and a total bandwidth of 31 TB/s. It spans eight rows and occupies a space equivalent to two basketball courts. The Aurora machine uses 166 racks that house 66 blades each. On the memory side of matters, Aurora has 1.36 PB of on-package HBM2E memory and 19.9 PB of DDR5 memory that is used by the CPUs as well as 8.16 PB of HBM2E carried by the Ponte Vecchi compute GPUs. The machine is powered by 21,248 general-purpose processors with over 1.1 million cores for workloads that require traditional CPU horsepower and 63,744 compute GPUs that will serve AI and HPC workloads. The Aurora supercomputer looks quite impressive, even by the numbers. "Aurora is the first deployment of Intel's Max Series GPU, the biggest Xeon Max CPU-based system, and the largest GPU cluster in the world," said Jeff McVeigh, Intel corporate vice president and general manager of the Super Compute Group. The system will come online later this year. The system promises to deliver a peak theoretical compute performance over 2 FP64 ExaFLOPS using its array of tens of thousands of Xeon Max 'Sapphire Rapids' CPUs with on-package HBM2E memory as well as Data Center GPU Max 'Ponte Vecchio' compute GPUs. Argonne National Laboratory and Intel said on Thursday that they had installed all 10,624 blades for the Aurora supercomputer, a machine announced back in 2015 with a particularly bumpy history.
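The hardware totals quoted in this article can be cross-checked with some simple arithmetic. The short Python sketch below derives the per-rack and per-blade figures implied by those totals; the variable and function names are illustrative, and the memory conversion assumes decimal units (1 PB = 1,000,000 GB), which the article does not state.

```python
# Totals as quoted in the article.
BLADES = 10_624
RACKS = 166
CPUS = 21_248
GPUS = 63_744
CPU_HBM_PB = 1.36   # on-package HBM2E attached to the CPUs
GPU_HBM_PB = 8.16   # HBM2E carried by the GPUs

GB_PER_PB = 1_000_000  # assumption: decimal petabytes


def per_unit(total: int, count: int) -> int:
    """Divide total by count, raising if the result is not a whole number."""
    quotient, remainder = divmod(total, count)
    if remainder:
        raise ValueError(f"{total} is not evenly divisible by {count}")
    return quotient


blades_per_rack = per_unit(BLADES, RACKS)
cpus_per_blade = per_unit(CPUS, BLADES)
gpus_per_blade = per_unit(GPUS, BLADES)

# The petabyte totals are rounded, so round the per-device results as well.
hbm_per_cpu_gb = round(CPU_HBM_PB * GB_PER_PB / CPUS)
hbm_per_gpu_gb = round(GPU_HBM_PB * GB_PER_PB / GPUS)

print(f"blades per rack: {blades_per_rack}")
print(f"CPUs per blade:  {cpus_per_blade}")
print(f"GPUs per blade:  {gpus_per_blade}")
print(f"HBM2E per CPU:   ~{hbm_per_cpu_gb} GB")
print(f"HBM2E per GPU:   ~{hbm_per_gpu_gb} GB")
```

Under these assumptions, the totals work out to 64 blades per rack, with two CPUs and six GPUs on each blade, and roughly 64 GB of HBM2E per CPU and 128 GB per GPU.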
