ArcaStream delivers HPC Storage Refresh to the University of Edinburgh

Having recently procured a new 5600-core Dell HPC cluster to benefit research users in areas including bioinformatics, speech processing, particle physics, material physics, chemistry, cosmology, medical imaging and psychiatry, the University of Edinburgh called upon ArcaStream to design and implement a storage solution to drive it.


Dr Orlando Richards, Research Services Manager – University of Edinburgh

The new HPC system, a refresh of the ‘Eddie’ series of computational resources at the University, required a scalable storage solution that could provide guaranteed performance in both throughput and IOPS to serve a wide and varied workload. The storage solution also needed to scale granularly as additional compute capacity is added to the overall cluster.

Taking a software-defined approach to storage architecture, and working in partnership with Dell, Esteem and Alces Software, ArcaStream provided the University with nearly 1PB of usable capacity delivering a sustained 22.5GB/s and 102,000 4K random read IOPS to the cluster. Using IBM’s Spectrum Scale file system to aggregate commodity Dell PowerVault storage hardware into a single usable namespace with 40Gb connectivity into the cluster fabric, ArcaStream exceeded several of the University’s initial performance requirements by more than 4X.

Their software-defined solution… allowed us to invest in more compute resource within the project than we had originally anticipated, whilst exceeding the storage performance targets.

“ArcaStream’s engineers are amongst the best we’ve worked with, and they are able to deliver storage solutions using hardware not traditionally considered for large-scale HPC systems,” explains Dr Orlando Richards, Research Services Manager at the University of Edinburgh. “Their software-defined solution allowed us to provide our researchers a generous amount of storage performance per core at a very compelling price. This has allowed us to invest in more compute resource within the project than we had originally anticipated, whilst exceeding the storage performance targets.”

“With higher education and research funding increasingly being squeezed, the traditional hardware-led approach to HPC storage solutions is no longer sustainable,” says Barry Evans, Founder and Technical Director of ArcaStream. “With our software-defined architecture, we have provided the University with a robust and scalable high-performance solution. It enables storage hardware to be treated as it should be – a low-cost commodity without vendor lock-in.”

Our users benefit from greatly simplified data management.

The solution also introduced a method for integrating existing research data repositories across the University with the HPC storage namespace, removing the need for users to copy data back and forth between isolated islands of storage. Data is automatically cached in HPC storage on access, then expunged once capacity thresholds are exceeded – without any impact on the user experience. Researchers see a single view of all of their data, no matter where they happen to be accessing it from. Additionally, any new data committed to HPC storage is automatically delivered back to the researcher’s particular data repository. This gives researchers access to HPC-generated data from their workstations without any additional management overhead.
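The caching model described above – fetch on first access, write changes back to the home repository, and evict least-recently-used data once a capacity threshold is exceeded – can be illustrated with a minimal Python sketch. This is a toy model for clarity only, not the actual Spectrum Scale implementation; all class and method names here are invented for illustration.

```python
from collections import OrderedDict

class HomeBackedCache:
    """Toy model of an HPC cache tier in front of a 'home' repository:
    reads pull data into the cache, writes flow back to home, and
    least-recently-used entries are expunged once the cache exceeds
    a capacity threshold. Names and policy are illustrative only."""

    def __init__(self, home, capacity_bytes):
        self.home = home                   # home repository: path -> bytes
        self.capacity = capacity_bytes     # cache capacity threshold
        self.cache = OrderedDict()         # path -> bytes, in LRU order

    def _used(self):
        return sum(len(v) for v in self.cache.values())

    def _evict_to_threshold(self):
        # Expunge least-recently-used entries once over the threshold;
        # the data remains safe at home, so eviction is invisible to users.
        while self._used() > self.capacity and self.cache:
            self.cache.popitem(last=False)

    def read(self, path):
        if path not in self.cache:
            self.cache[path] = self.home[path]   # fetch on first access
        self.cache.move_to_end(path)             # mark most recently used
        self._evict_to_threshold()
        return self.cache[path]

    def write(self, path, data):
        self.cache[path] = data
        self.cache.move_to_end(path)
        self.home[path] = data                   # deliver back to the repository
        self._evict_to_threshold()
```

Because every write is propagated back to the home repository and every read can be re-fetched from it, the cache can evict freely: the researcher always sees a single, consistent view of the data.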

“With the solution built in partnership with ArcaStream, our users benefit from greatly simplified data management. They can store their data in one location, confident that it will have the required protection, performance and capacity without having to consider where the data is physically located,” continues Dr Richards. “With the scalable software-defined architecture, we can independently scale up performance and capacity where needed, using the best and most cost-effective hardware technologies for the purpose.”