
Parallel I/O & Libraries (PRACE training course)

Date: 
Thu, 03/22/2018 - 9:30am to Fri, 03/23/2018 - 5:00pm
Registration deadline: 
Mon, 03/19/2018 - 11:45pm
Venue: 
VŠB - Technical University Ostrava, IT4Innovations building, room 207
Tutor: 
Nicole Audiffren (CINES, France), Sebastian Lührs (JSC, Germany)
Level: 
beginners-intermediate
Language: 
English

Summary and benefits for the attendees

Numerical simulations conducted on current high-performance computing (HPC) systems face an ever-growing need for scalability. Larger HPC platforms provide opportunities to push the limits on the size and properties of what can be accurately simulated. On massively parallel systems, however, serial approaches to handling I/O in a parallel application will dominate the overall performance. Heterogeneity of platforms can also impose a high maintenance burden when different data representations are needed. Portable, self-describing data formats such as HDF5 are already widely used within certain communities.

The course will present the main concepts relevant to I/O on high-end parallel systems. MPI-IO, the underlying standard for parallel I/O, will be introduced, and then two high-level I/O libraries (HDF5 and SIONlib) will be presented.

Writing a large number of files to the same directory can overload the metadata handling of some parallel file systems and lead to poor performance. The purpose of MPI-IO is to provide a high-performance, portable, parallel I/O interface for parallel MPI programs.
The HDF5 library provides a set of high-level library functions for describing and storing simple and complex data structures. It also allows for parallel I/O through MPI-IO.
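
The idea of many processes sharing one file, instead of each writing its own, can be illustrated with a minimal sketch. It is not MPI-IO code: the "ranks" are simulated by a plain loop, and the helper names (`block_range`, `write_shared`, `read_all`, `demo`) are hypothetical. It only shows the offset arithmetic behind explicit-offset writes, where each process writes its contiguous block of a global array at its own byte offset in a single shared file.

```python
import os
import struct
import tempfile


def block_range(rank, nranks, n_global):
    """Return (start, count) of the contiguous block owned by `rank`."""
    base, extra = divmod(n_global, nranks)
    count = base + (1 if rank < extra else 0)   # first `extra` ranks get one more
    start = rank * base + min(rank, extra)
    return start, count


def write_shared(path, nranks, n_global):
    """Each simulated rank writes its block of doubles at its own byte offset."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    try:
        for rank in range(nranks):              # stands in for parallel processes
            start, count = block_range(rank, nranks, n_global)
            values = [float(i) for i in range(start, start + count)]
            payload = struct.pack(f"<{count}d", *values)
            os.pwrite(fd, payload, start * 8)   # 8 bytes per double
    finally:
        os.close(fd)


def read_all(path, n_global):
    """Read the whole shared file back as a list of doubles."""
    with open(path, "rb") as f:
        return list(struct.unpack(f"<{n_global}d", f.read()))


def demo(nranks=4, n_global=10):
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shared.bin")
        write_shared(path, nranks, n_global)
        return read_all(path, n_global)


if __name__ == "__main__":
    print(demo())
```

Because the blocks do not overlap, the order in which the writes happen does not matter, which is what makes this pattern suitable for concurrent access. In an actual MPI program the loop disappears and each process computes only its own offset.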

SIONlib is a library for writing and reading binary data to/from several thousands of processors into one or a small number of physical files. The SIONlib file layout and API allow the application to take advantage of the scaling behaviour and asynchronous access of a logical task-local pattern while keeping the number of files independent of and significantly smaller than the number of processes.
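
The mapping of many logical task-local files onto one physical file can be sketched as follows. This is neither the SIONlib API nor its actual file format; it is a simplified illustration under stated assumptions: every task gets one chunk, chunks are rounded up to whole file-system blocks, and the block size (`fs_block`) and all function names are hypothetical.

```python
import os
import tempfile


def chunk_offsets(chunk_sizes, fs_block=4096, header=0):
    """Start offset of each task's chunk; chunks occupy whole blocks."""
    offsets = []
    pos = header
    for size in chunk_sizes:
        offsets.append(pos)
        blocks = max(1, -(-size // fs_block))   # ceiling division, at least 1
        pos += blocks * fs_block
    return offsets


def write_task_local(path, task_data, fs_block=4096):
    """Write every task's bytes into its own chunk of one container file."""
    sizes = [len(d) for d in task_data]
    offsets = chunk_offsets(sizes, fs_block)
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    try:
        for off, data in zip(offsets, task_data):   # simulated tasks
            os.pwrite(fd, data, off)
    finally:
        os.close(fd)
    return offsets, sizes


def read_task(path, offsets, sizes, task):
    """Read back only the bytes that belong to one task."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, sizes[task], offsets[task])
    finally:
        os.close(fd)


def demo():
    task_data = [f"task {t} ".encode() * (t + 1) for t in range(4)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "container.bin")
        offsets, sizes = write_task_local(path, task_data, fs_block=64)
        back = [read_task(path, offsets, sizes, t) for t in range(4)]
        return offsets, back == task_data, len(os.listdir(tmp))


if __name__ == "__main__":
    print(demo())
```

Each task still sees what looks like a private region it can write to independently, while the directory holds a single file regardless of the number of tasks. Aligning chunks to block boundaries is a common way to keep two tasks from touching the same file-system block.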

About the tutors

After a PhD in fluid mechanics, Nicole Audiffren worked for several years in computational atmospheric sciences at the Observatoire de Physique du Globe de Clermont-Ferrand (France). In 2002, she joined the Support Team at CINES (Centre Informatique de l'Enseignement Supérieur, Montpellier), which hosts one of the major French Tier-1 systems: the supercomputer Occigen (3.5 Pflops). Her main work is advising scientists who deal with large I/O, especially on Lustre systems. She is also regularly involved in the bidding processes for the Tier-1 machine. Besides that, she is a regular member of national hiring panels and a member of a European Tier-1 resource allocation board.

Sebastian Lührs studied Technomathematics at the University of Applied Science Aachen. Since 2014 he has been working in the cross-sectional team "Application Optimization" within the "Application Support" division of the Jülich Supercomputing Centre of Forschungszentrum Jülich GmbH. Besides general application optimization, especially in the context of I/O, for the Tier-0 and Tier-1 systems in Jülich, his main work areas are tool development and user training. In addition, he is involved in several industrial, national and EU-funded projects such as PRACE and EoCoE.

Preliminary agenda

Day 1

 9:30-10:00     Registration
10:00-10:45    IT4Innovations - intro and IO hardware overview (Branislav Jansík, IT4Innovations)
10:45-11:45     General parallel IO strategies (Sebastian Lührs)
11:45-13:00     Lunch break
13:00-13:45     IO Profiling (Sebastian Lührs)
13:45-14:30     MPI-IO (Nicole Audiffren)
Overview of MPI-IO (motivation, file view code, collective I/O, hints)
14:30-15:00     Coffee
15:00-16:30     MPI-IO (Nicole Audiffren)
Performance and ROMIO (hints)
16:30-17:00     Coffee
17:00-18:00     MPI-IO (Nicole Audiffren)
Hands-on

Day 2

9:00-10:30      HDF5 (Nicole Audiffren)
Overview of basic HDF5 and API (manipulating HDF5 files, predefined datatypes, dataspaces and datasets, writing & reading data)
Hands-on (serial HDF5)
10:30-11:00    Coffee
11:00-12:30     Parallel HDF5 (Nicole Audiffren)
Overview (creating/accessing a file, writing and reading hyperslabs, programming model)
Hands-on:  hyperslab example
12:30-13:30     Lunch break
13:30-15:00    SIONlib (Sebastian Lührs)
Introduction (motivation, SIONlib file format)
Basic routines
Hands-on
15:00-15:30     Coffee
15:30-17:00     SIONlib (Sebastian Lührs)
Advanced SIONlib features
Utility tools
Hands-on

Prerequisites

Knowledge of the C or Fortran programming language and prior exposure to parallel programming with MPI.

Participants need to bring their own notebook to access Anselm and/or Salomon. Training accounts will be provided during the on-site registration.

Registration

Registration is obligatory and is done through the event web page on the PRACE events portal.

Capacity and Fees

30 participants. The event is provided free of charge.

Practicalities

  • See the links below for how to get to the campus of VŠB - Technical University Ostrava and to the IT4Innovations building.
  • Documentation for IT4Innovations' computer systems is available at https://docs.it4i.cz/.

Acknowledgements

This course is organized as a joint event of the IT4Innovations National Supercomputing Center, Czech PTC (PRACE Training Centre), Maison de la Simulation and CINES, PATC (PRACE Advanced Training Centre) in France, and Jülich Supercomputing Centre, PATC in Germany. All of these centres provide high-level HPC training for the Partnership for Advanced Computing in Europe (PRACE).
