Data science with R and Python (PTC course)

Date: Wed, 24.04.2019 9:30 - Thu, 25.04.2019 14:30
Registration deadline: Mon, 15.04.2019 23:00
Venue: VŠB - Technical University Ostrava, IT4Innovations building, room 207
Tutor(s): Tomáš Martinovič, Stanislav Böhm (IT4Innovations)
Level: beginner-intermediate
Language: English

Annotation

The R part of the course (first day) will focus on the basics of data analysis and data visualization in R. It will begin with an introduction to the R statistical language, covering the basic data types and workflow. Afterwards, packages from the “tidyverse” collection will be presented. These include packages for loading data, preprocessing data, basic data exploration, and visualization.

The Python-oriented part (second day) will introduce essential data science packages. It will be complemented with hands-on exercises that demonstrate their usage on real-world data analysis problems and show how to tackle such problems.

Up to 50% of the course will consist of hands-on exercises covering all topics, so that participants can practice the techniques and patterns presented.

Purpose of the course (benefits for the attendees)

Target audience: Users who want to use Python and/or R for data analysis and prototyping. The participants will learn basic and intermediate skills for exploratory data analysis and visualization in R and Python.

About the tutor(s)

Tomáš Martinovič obtained his PhD in computational sciences at IT4Innovations, VSB - Technical University of Ostrava in 2018. From 2015 to 2018 he worked in a team focused on analysis of complex dynamical systems, where he worked on scalable implementations of algorithms from the field of nonlinear time series analysis. Since the start of 2019 he has been working in a team focused on high performance data analysis with the defined objective of research and transfer of knowledge in cooperation with industry.

Stanislav Böhm has a PhD in computer science and is a researcher at IT4Innovations. He is interested in distributed systems, verification, and scheduling.

Preliminary agenda

Day 1
9:30 - 10:00 Registration
10:00 - 11:30
  • Introduction
  • Data import in R
  • Tidying data in R
  • Hands-on
11:30 - 13:00 Lunch
13:00 - 14:30
  • Exploratory analysis with tidyverse in R
  • Hands-on
14:30 - 15:00 Coffee
15:00 - 16:30
  • Advanced data visualization and analysis with ggplot2 and trelliscopejs in R
  • Hands-on
  • Q & A

 

Day 2
9:45 - 10:45 Introduction to Pandas
10:45 - 11:00 Coffee
11:00 - 12:00 Hands-on
12:00 - 13:00 Lunch
13:00 - 14:30 Exploratory analysis in Pandas

 

Prerequisites

Basic knowledge of Python and/or R.
Participants must bring their own laptops.

Registration

Registration is obligatory; please register via the PRACE Events Portal and its registration form.

Capacity and Fees

30 participants. The event is provided free of charge.

Practicalities

  • See the links below for how to get to the campus of VŠB - Technical University Ostrava, and to the IT4Innovations building.
  • Documentation for IT4Innovations' computer systems is available at https://docs.it4i.cz/.

Remark

This training is a PRACE Training Centre (PTC) course, co-funded by the Partnership for Advanced Computing in Europe (PRACE). The main web page of the course is located on the PRACE Events Portal.

Acknowledgements

This event was partially supported by the Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project “IT4Innovations excellence in science - LQ1602” and partially by the PRACE-5IP project - the European Union’s Horizon 2020 research and innovation programme under grant agreement No 730913.
 
