This tutorial serves as a first step towards exploring the Hadoop platform and provides a short introduction to working with big data in Hadoop. It begins with an overview of Big Data, including definitions, the sources of Big Data, and the main challenges it introduces. We will then present MapReduce as an important programming model for Big Data processing in the Cloud. The Hadoop ecosystem and some of the major Hadoop features will then be discussed. Finally, we will discuss several approaches and methods used to optimise the performance of Hadoop in the Cloud.
About the tutor
Dr. Shadi Ibrahim is a permanent Inria research scientist within the KerData research team. He obtained his Ph.D. in Computer Science from Huazhong University of Science and Technology in Wuhan, China, in 2011. His research interests are in cloud computing, big data management, data-intensive computing, high performance computing, virtualization technology, and file and storage systems. He has published several research papers in recognized big data and cloud computing journals and conferences, including several papers on optimizing and improving Hadoop MapReduce performance in the cloud, as well as one book chapter on the MapReduce framework.
|Wednesday October 14, 2015|
|10:00-11:30||An introduction to Big Data|
|13:00-14:30||Big Data processing in the Cloud: The MapReduce programming model|
|15:00-16:30||Hadoop ecosystem: An overview|
|17:00-18:00||Practical session on deploying Hadoop|
|Thursday October 15, 2015|
|09:00-10:30||Hadoop: Optimizations and open issues|
|11:00-12:45||Practical session on using and configuring Hadoop|
Practical session on developing MapReduce applications
This tutorial assumes some experience with the Linux command line. Programming skills in Java are a plus. To participate in the exercises, a laptop is needed.
NEW: For the practical session environment, we are going to use an Ubuntu virtual machine, to be installed on participants' laptops, preferably before the event. Please download it from
To run this virtual machine, install the free VMware Workstation Player. The link:
Registration is obligatory - registration form here. Registration closes at the deadline given above or when the course capacity is exhausted.
The event is provided free of charge for the participants.
- NEW: For training environment preparation, please see the Prerequisites section above.
- See the page on transport and accommodation (in Czech) for how to get to the campus of VŠB - Technical University Ostrava and to the new IT4Innovations building.
- Participants without an IT4Innovations card should arrive early enough to complete the formalities for obtaining an entry permit.
- System documentation is available at http://support.it4i.cz/docs.