Online course: Introduction to data processing using Python

This is a short introductory course in which participants learn how to use Python for basic data processing. The concepts of Python and data processing are explained and exemplified through a series of video lectures. As the videos are pre-recorded, participants can take the course at a time of their convenience. The workload of the course is approximately one to two days full-time.

Aim:

The aim is to give the participants the competences and skills to apply Python in their daily data processing operations, including preparing datasets in formats suitable for the statistical analysis introduced in the course on GLMs.

Learning goals:

After the course, the participant is expected to be able to:

• Prepare and import data into Python (pandas dataframes).
• Conduct basic cleaning of data.
• Sort, slice and select data for analysis.
• Export data to record format (text or Excel), ready for use in GLMs, as exemplified in the course on GLMs.
• Apply basic statistical analysis.
• Apply basic data visualizations.

Course coordinators:

Prof. Dennis Trolle (trolle@ecos.au.dk).
Senior researcher Anders Nielsen (an@ecos.au.dk).

Video lecture introducing Python

Video lecture on basic Python code syntax

Video lecture on how to work with exercises

Course materials

Software

To get started with Python, we recommend downloading and installing the Anaconda distribution here: Anaconda website

You may download the free, open-source distribution (you do not need to set up an account to download or use it). The Anaconda installation includes the Spyder graphical interface as well as many of the commonly used Python libraries for data analysis.

Lecture slides

The slides used as a basis for the introductory lecture video can be downloaded as a single zip file here: lecture slides

Exercises

To train yourself in the basics of Python for data processing and analysis, a series of code examples covering the topics of the lectures is provided below.

You can copy/paste the code examples directly into your own Spyder editor and run them. Try to complete all the examples. Once you have completed these, try to replace the example dataset with one of your own datasets, and adapt the code so that you can re-run the various analyses on your own data.


Importing data into a pandas dataframe
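
As a minimal sketch of this step, the example below reads a small comma-separated table into a pandas dataframe and inspects the result. The data and the column names (date, station, temperature) are hypothetical stand-ins and not the course dataset. The text is read from an in-memory buffer so that the example runs without any file on disk.

```python
import io

import pandas as pd

# Hypothetical example data (not the course dataset). To read your own
# file, replace io.StringIO(csv_text) with the file path,
# e.g. pd.read_csv("my_data.csv", parse_dates=["date"]).
csv_text = """date,station,temperature
2020-01-01,A,4.1
2020-01-02,A,3.8
2020-01-03,A,
2020-01-01,B,5.0
2020-01-02,B,4.6
"""

# parse_dates converts the 'date' column to datetime values on import.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Inspect what was imported: first rows, column types, missing values.
print(df.head())
print(df.dtypes)
print(df.isna().sum())
```

Checking the column types and the number of missing values right after import is a quick way to spot problems, such as numbers read as text, before any analysis.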

Simple line plots
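
A simple line plot could look roughly like the sketch below, which uses matplotlib together with a small hypothetical time series (the values and the column names are made up for illustration). The figure is saved to a file named line_plot.png, which is an arbitrary name chosen here.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical daily measurements (not the course dataset).
df = pd.DataFrame(
    {
        "date": pd.date_range("2020-01-01", periods=6, freq="D"),
        "temperature": [4.1, 3.8, 4.4, 5.0, 4.6, 5.3],
    }
)

fig, ax = plt.subplots()
ax.plot(df["date"], df["temperature"], marker="o", label="temperature")
ax.set_xlabel("Date")
ax.set_ylabel("Temperature")
ax.legend()
fig.autofmt_xdate()  # rotate the date labels so they do not overlap

# Save the figure to a file; use plt.show() instead to display it.
fig.savefig("line_plot.png", dpi=150)
```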

Slicing and selecting data
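
The sketch below illustrates some common ways of selecting, slicing and sorting data in a pandas dataframe. The dataframe and its columns (station, depth, temperature) are hypothetical and only serve as an illustration.

```python
import pandas as pd

# Hypothetical example data (not the course dataset).
df = pd.DataFrame(
    {
        "station": ["A", "A", "A", "B", "B", "B"],
        "depth": [1, 5, 10, 1, 5, 10],
        "temperature": [18.2, 15.1, 9.4, 17.8, 14.9, 9.9],
    }
)

# Select a single column (a Series) or several columns (a DataFrame).
temps = df["temperature"]
subset = df[["station", "temperature"]]

# Slice rows by position with iloc (the end position is excluded).
first_three = df.iloc[0:3]

# Select rows by condition with a boolean mask; combine with & or |.
warm_a = df[(df["station"] == "A") & (df["temperature"] > 10)]

# Select rows and columns together by label with loc.
shallow = df.loc[df["depth"] <= 5, ["station", "temperature"]]

# Sort by one column, largest values first.
sorted_df = df.sort_values("temperature", ascending=False)

print(warm_a)
print(shallow)
print(sorted_df.head(3))
```

Note that each condition in a combined mask needs its own parentheses, because & and | bind more tightly than comparison operators in Python.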

Boxplots
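
A boxplot comparing two groups can be sketched as follows. The data, the group names and the output file name boxplot.png are hypothetical. The tick labels are set explicitly after plotting, which is one of several possible ways to label the boxes.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical example data (not the course dataset).
df = pd.DataFrame(
    {
        "station": ["A"] * 5 + ["B"] * 5,
        "temperature": [4.1, 3.8, 4.4, 5.0, 4.6, 6.2, 5.9, 6.8, 7.1, 6.0],
    }
)

# Collect the values of each group; groupby returns groups in sorted order.
groups = {
    name: group["temperature"].to_numpy()
    for name, group in df.groupby("station")
}

fig, ax = plt.subplots()
ax.boxplot(list(groups.values()))  # one box per group, at positions 1..N

# Label each box with its group name.
ax.set_xticks(list(range(1, len(groups) + 1)))
ax.set_xticklabels(list(groups.keys()))
ax.set_xlabel("Station")
ax.set_ylabel("Temperature")

# Save the figure to a file; use plt.show() instead to display it.
fig.savefig("boxplot.png", dpi=150)
```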

Example of t-test (comparing the means of two datasets)
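
As an illustration, the means of two independent samples can be compared with scipy.stats.ttest_ind. The sketch below uses hypothetical measurements and Welch's variant of the test (equal_var=False), which does not assume equal variances in the two groups. Both the data and the choice of variant are assumptions made here and are not necessarily those used in the course.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements from two groups (not the course dataset).
group_a = np.array([4.1, 3.8, 4.4, 5.0, 4.6])
group_b = np.array([6.2, 5.9, 6.8, 7.1, 6.0])

# Two-sided t-test for independent samples. equal_var=False selects
# Welch's t-test; equal_var=True would give the classical Student's t-test.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)

alpha = 0.05  # significance level chosen for this example
print(f"mean A = {group_a.mean():.2f}, mean B = {group_b.mean():.2f}")
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

if p_value < alpha:
    print("The difference in means is statistically significant.")
else:
    print("No statistically significant difference in means.")
```

For paired measurements (for example the same sites measured twice), scipy.stats.ttest_rel would be the appropriate function instead.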

Exporting data to file
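
Exporting a dataframe to a text file can be sketched as below. The data and the file name export.txt are hypothetical, and "record format" is assumed here to mean one row per observation and one column per variable. The file is read back at the end to check that the export worked.

```python
import pandas as pd

# Hypothetical example data (not the course dataset), assumed to be in
# record format: one row per observation, one column per variable.
df = pd.DataFrame(
    {
        "station": ["A", "A", "B", "B"],
        "depth": [1, 5, 1, 5],
        "temperature": [18.2, 15.1, 17.8, 14.9],
    }
)

# Write a tab-separated text file; index=False leaves out the row index.
df.to_csv("export.txt", sep="\t", index=False)

# For Excel output, the equivalent call is shown below. It is commented
# out because it requires an Excel writer package such as openpyxl.
# df.to_excel("export.xlsx", index=False)

# Read the file back to verify the export.
check = pd.read_csv("export.txt", sep="\t")
print(check)
```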