Building world-class computational services for data-driven innovation

The Edinburgh International Data Facility has entered an exciting phase with the launch of new services and hardware - and there’s still more to come.

The Edinburgh International Data Facility brings together regional, national and international datasets to create new products, services, and research. It is funded by the UK and Scottish Governments under the Data Driven Innovation Initiative of the Edinburgh and South-East Scotland City Region Deal.

EIDF virtual desktops

In June 2022 we were delighted to launch the first version of our data science virtual desktop service, offering Linux virtual machines (VMs) through a “desktop in a browser” virtual desktop interface (VDI). This service has been designed for users who need more compute and storage than their local resources allow but don’t need the full power of a high-performance computing (HPC) or dedicated machine learning system.

EIDF virtual desktops run on a computing cloud hosted entirely within the University of Edinburgh’s Advanced Computing Facility. Desktop machines can be ordered with a range of pre-installed data science software including Python, R, and JupyterLab. The service also offers a full visual desktop rather than a bare Linux command line, enabling the use of graphical tools such as RStudio or Jupyter Notebooks.
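As a flavour of the kind of interactive work these desktops support, the sketch below shows a small analysis one might run in a JupyterLab session on a virtual desktop. It is illustrative only, using Python's standard library; the dataset and values are invented for demonstration.

```python
# Illustrative only: a small interactive analysis of the kind a user
# might run in JupyterLab on an EIDF virtual desktop.
# The readings below are hypothetical values, not real data.
import statistics

readings = [12.1, 11.8, 12.4, 13.0, 12.2]  # hypothetical sensor values

mean = statistics.mean(readings)
stdev = statistics.stdev(readings)  # sample standard deviation
print(f"mean={mean:.2f}, stdev={stdev:.2f}")
```

In practice a user would work with far larger datasets via libraries such as pandas, pre-installed on the desktop images, rather than hand-typed lists.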

EIDF computing services

The data science cloud is our latest EIDF computing service, joining the large-memory Ultra2 system and two Cerebras CS-2 AI supercomputers.

Our plans for the rest of the year will complete the set, seeing the launch of a lightweight data science notebook service, a new Nvidia GPU cluster, and the acquisition of a Graphcore BowPod64 AI system. The GPU cluster and BowPod64, together with the CS-2s, will give us an unrivalled array of leading-edge hardware for the hardest machine learning challenges.

The combination of data science notebooks and desktops covers the smaller-scale, long tail of data science, while the Ultra2 system, in tandem with our traditional HPC services ARCHER2 and Cirrus, offers high-performance data analytics at scale.

But what about data?

The computational side of things, then, is in good shape, but EIDF would be nothing without data, of course. The Facility currently offers some 50 petabytes of storage, and to complete the overall picture we are working on the final stages of a general-purpose pipeline for data ingest, processing, and publishing, which will supply the rich and varied analytics-ready data lake at the heart of the Facility.

Our eventual aim is to make this data visible to any and all of the computational services – including the national HPC services – to enable our users to incorporate data of all types into their research.

Sign up

Join our mailing list to receive updates about EIDF. 

Author

Rob Baxter, EIDF Programme Manager