Tuesday, August 22, 2017

2017-08 Visit to TACC (Texas Advanced Computing Center)

After working for almost a year on a software team that enables High Performance Computing (HPC), today I finally saw my first supercomputer in person. In fact, counting it as only one supercomputer is perhaps a misnomer. The machines are arranged in groups called clusters, and each cluster has its own name, types of chips, and purpose. The largest cluster of all at the Texas Advanced Computing Center (TACC) is made up of 4200 Intel(R) Knights Landing (KNL) chips and 1300 Intel(R) Skylake (SKL) processors. This is the newly installed Stampede2 system.



Stampede2

We not only got to view Stampede2 from behind the glass, we actually got to walk down one of its "hot aisles", an aisle where the backs of the machines all face in. It's "hot" because the cooling system blows cold air onto the fronts of the racks, and the air has warmed up by the time it reaches the back of the cabinets. Stampede2 takes up five and a half rows. Looking in from the back of the machine, we were able to see this rack full of Intel(R) Omni-Path switches and cabling. These specialized switches and cabling let all of the chips in the Stampede2 cluster communicate more quickly than they could over standard switches and cabling.

Omni-Path switches and cabling

Lest you think that this is all one big advertisement for Intel HPC products, we did see several other clusters, including ones with AMD and Nvidia chips. Strangely, I seem to have forgotten to take pictures of those...

The last cluster we viewed, Hikari, is a collaboration between a Japanese corporation, a Japanese government agency, and TACC. It is an experiment to see whether a supercomputer can be powered by solar energy. The solar panels are installed as a carport in the parking lot. Unfortunately, current solar panel technology doesn't provide enough power to run a supercomputer, even a smaller cluster like Hikari.

Checking out the solar versus grid powering Hikari