Veit Schiele

Seminar «Data Analysis in Python» from May 03 to May 04, 2017

After this course you will be able to process, summarize and visualize tabular data efficiently using the pandas library.
When May 03, 2017 09:00 AM to May 04, 2017 05:00 PM
Where Veit Schiele Communications GmbH, Mansteinstr. 7, D-10783 Berlin
Contact Name
Contact Phone +49 30 8185667-1

Target Audience

Analysts, researchers and engineers who would like to handle larger data sets more efficiently.

Prerequisites

Basic knowledge of Python

Course Description

The pandas Python library is a practical everyday tool for the analysis of tabular data. This course improves your skill set for working with datasets ranging from a few dozen to several million entries in Python. The course uses hands-on examples to cover exploratory data analysis, extracting relevant summaries and creating attractive diagrams. The integration of pandas with interactive environments like IPython and Jupyter will allow you to quickly back up your answers to many questions with data.

Course Duration

2 days

Course Outline

Day 1                      Day 2
Introduction to pandas     Aggregation
Data Wrangling             Analyzing Time Series
Summarizing Data           Geographical Data
Data Visualization         pandas Best Practices

Day 1

Introduction to pandas

  • Your environment for interactive data analysis
  • overview of the pandas library
  • Series
  • DataFrames
  • Improvements in Python 3
  • Jupyter Notebooks
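
To give an impression of the two core data structures named above, here is a minimal sketch. The city names and measurements are invented for illustration and are not taken from the course material.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
temperatures = pd.Series(
    [12.5, 14.0, 9.8],
    index=["Berlin", "Munich", "Hamburg"],
    name="temp_c",
)

# A DataFrame is a table whose columns share one index.
cities = pd.DataFrame(
    {
        "temp_c": [12.5, 14.0, 9.8],
        "rain_mm": [3.1, 0.0, 7.4],
    },
    index=["Berlin", "Munich", "Hamburg"],
)

# Selecting a single column of a DataFrame yields a Series.
column = cities["temp_c"]

# idxmax returns the index label of the largest value.
warmest = temperatures.idxmax()
```

In a Jupyter notebook, evaluating `cities` as the last expression of a cell renders it as a formatted table.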

Data Wrangling

  • reading CSV and Excel files into pandas
  • sorting data
  • transposing tables
  • selecting rows and columns
  • saving pandas tables
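
The steps in this unit could look roughly like the following sketch. The CSV content and column names are made up for the example, and an in-memory buffer stands in for real files so that the snippet is self-contained.

```python
import io

import pandas as pd

# Hypothetical CSV content; the figures are illustrative, not real statistics.
csv_text = """city,year,population
Berlin,2015,3520000
Hamburg,2015,1787000
Munich,2015,1450000
"""

# Reading: read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Sorting by a column, largest value first.
by_population = df.sort_values("population", ascending=False)

# Transposing swaps rows and columns.
transposed = df.T

# Selecting rows with a boolean mask and columns by name.
large = df.loc[df["population"] > 1_500_000, ["city", "population"]]

# Saving: to_csv writes to a path or a file-like object.
buffer = io.StringIO()
large.to_csv(buffer, index=False)
saved_text = buffer.getvalue()
```

Excel files are read in the same way with `pd.read_excel`, which needs an additional Excel engine package to be installed.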

Summarizing Data

  • extracting statistical metrics
  • merging tables
  • hierarchical indexing
  • cross tabulations
  • pivot tables
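
A minimal sketch of these summarizing tools, assuming two small invented tables (`sales` and `stores` are hypothetical names chosen for the example):

```python
import pandas as pd

# Hypothetical tables, invented for illustration only.
sales = pd.DataFrame(
    {
        "store": ["A", "A", "B", "B"],
        "product": ["tea", "coffee", "tea", "coffee"],
        "units": [10, 20, 30, 40],
    }
)
stores = pd.DataFrame({"store": ["A", "B"], "region": ["north", "south"]})

# Statistical metrics (count, mean, std, min, quartiles, max) of one column.
stats = sales["units"].describe()

# Merging tables on a shared key column.
merged = sales.merge(stores, on="store", how="left")

# Hierarchical indexing: two index levels, sorted for efficient lookups.
indexed = merged.set_index(["region", "product"]).sort_index()

# Pivot table: one row per region, one column per product.
pivot = merged.pivot_table(
    index="region", columns="product", values="units", aggfunc="sum"
)

# Cross tabulation: how often each region/product combination occurs.
counts = pd.crosstab(merged["region"], merged["product"])
```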

Data Visualization

  • creating diagrams with matplotlib
  • using matplotlib from within pandas
  • visualizing data in Jupyter notebooks
  • heatmaps
  • multi-panel diagrams
  • creating high-quality figures
  • other libraries for visualizing data

Day 2

Aggregation

  • iterating rows and columns
  • grouping
  • aggregation functions
  • transformation functions
  • applying your own functions
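
The difference between iterating, aggregating, transforming and applying can be sketched as follows. The data and the helper `value_range` are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Hypothetical scores, invented for illustration only.
df = pd.DataFrame(
    {
        "team": ["x", "x", "y", "y"],
        "score": [1.0, 3.0, 10.0, 20.0],
    }
)

# Grouping: split the score column by team.
grouped = df.groupby("team")["score"]

# Iterating over groups yields (key, sub-series) pairs.
sizes = {team: len(scores) for team, scores in grouped}

# Aggregation functions reduce each group to one value per function.
summary = grouped.agg(["mean", "max"])

# Transformation functions return a result aligned with the original rows.
df["centered"] = grouped.transform(lambda s: s - s.mean())


def value_range(scores):
    """Hypothetical user-defined function: spread of values within a group."""
    return scores.max() - scores.min()


# Applying your own function to every group.
ranges = grouped.apply(value_range)
```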

Analyzing Time Series

  • series of timestamps
  • rescaling time series
  • changing timezones
  • handling data with gaps
  • rolling means
  • simple predictions
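
Several of these time series topics fit into one short sketch. The daily measurements below are invented, and linear interpolation is only one of several possible ways to handle a gap.

```python
import numpy as np
import pandas as pd

# Hypothetical daily measurements with one missing value.
index = pd.date_range("2017-05-01", periods=6, freq="D")
series = pd.Series([1.0, 2.0, np.nan, 4.0, 5.0, 6.0], index=index)

# Handling data with gaps: fill the missing value by linear interpolation.
filled = series.interpolate()

# Rescaling: resample from daily values to two-day sums.
two_day = filled.resample("2D").sum()

# Rolling mean over a window of three observations.
rolling = filled.rolling(window=3).mean()

# Changing timezones: declare the timestamps as UTC, then convert.
local = filled.tz_localize("UTC").tz_convert("Europe/Berlin")
```

The first two entries of `rolling` are missing because a full window of three observations is not yet available at those positions.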

Geographical Data

  • storing coordinates in pandas
  • drawing maps with Basemap

pandas Best Practices

  • myths and facts
  • NumPy
  • machine learning models in scikit-learn
  • alternative libraries and modeling strategies
  • handling huge datasets
  • do's and don'ts