Introduction to pandas for Data Analysis

About this course

This course builds a rapid, working understanding of the pandas library for data analysis in Python. If you know some basic Python and want to use pandas for data tasks, this course is for you! It demonstrates how to create pandas datasets from other Python data types, introduces built-in pandas functionality, and builds the skills necessary to use pandas for creating, cleaning, and analyzing datasets.

By the end of this hands-on online course, you’ll understand:

How to create pandas DataFrames from various data sources
How to subset, filter, and combine DataFrames
How to use descriptive statistics, aggregate functions, and “group by” in order to view different dimensions of your data
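
As a rough preview of these topics, here is a minimal sketch, assuming pandas is installed. The `sales` data and its column names are made up for illustration and are not taken from the course notebooks.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
sales = pd.DataFrame(
    {
        "region": ["north", "south", "north", "south"],
        "units": [10, 4, 7, 12],
    }
)

# Filter rows with a Boolean mask: keep orders of more than 5 units.
large_orders = sales[sales["units"] > 5]

# Descriptive statistics for one column (count, mean, std, min, ...).
summary = sales["units"].describe()

# "Group by" region, then aggregate the units in each group.
units_by_region = sales.groupby("region")["units"].sum()

print(large_orders)
print(summary)
print(units_by_region)
```

The same pattern (build a DataFrame, filter with a mask, then group and aggregate) should carry over to larger datasets loaded from files.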

And you’ll be able to:

Work with pandas to rapidly slice and dice your data
Import data into pandas
Clean, process, and export your pandas DataFrames
Transform, reshape, and quantify your data
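
The import, clean, and export steps listed above might look something like the following sketch. The CSV text, the column names, and the choice to fill gaps with the column mean are assumptions made for illustration; the course may use different data and a different filling strategy.

```python
import io

import pandas as pd

# Hypothetical CSV content; an in-memory buffer stands in for a real file.
csv_text = "name,score\nAda,90\nGrace,\nLinus,75\n"

# Import: read the CSV text into a DataFrame.
scores = pd.read_csv(io.StringIO(csv_text))

# Clean: count the missing scores, then fill them with the column mean.
missing_count = scores["score"].isna().sum()
cleaned = scores.fillna({"score": scores["score"].mean()})

# Export: with no path given, to_csv returns the CSV as a string.
exported = cleaned.to_csv(index=False)

print(missing_count)
print(exported)
```

Passing a file path to `read_csv` and `to_csv` instead of an in-memory buffer reads from and writes to disk.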

This training is for you because...

You have some Python basics and want to learn and use the popular pandas library to import, clean, subset, combine, and analyze your data
You currently work with spreadsheets and need to process larger amounts of data in a more programmatic way
You want to become a data analyst, data scientist, or machine learning engineer

Prerequisites

Foundational knowledge of Python (experience with data types, operators, and syntax), or completion of the Introduction to Python Programming Learning Path.

Recommended follow-up

Introduction to Data Visualization with Python course

Setup

To follow along using your desktop IDE:

1. Install or update to the latest version of Anaconda
2. Launch your command line tool and configure your conda environment

For macOS and Linux users: Search and launch Terminal in your system

For Windows users: Locate and launch Anaconda Prompt in your system

3. (Optional but recommended) From the command line, run the following commands to create and activate a new environment

conda create --name NEW_ENV_NAME

conda activate NEW_ENV_NAME

4. Install required packages in the command line

conda install pandas

5. Launch JupyterLab from the command line

jupyter lab
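
Once JupyterLab is running, a quick way to confirm that the installation worked is to run a small check in a new notebook cell. This is only a suggested sanity check, not part of the official setup steps; the `check` variable name is arbitrary.

```python
import pandas as pd

# Print the installed pandas version to confirm that the import works.
print(pd.__version__)

# Build a tiny Series and sum it as a basic smoke test.
check = pd.Series([1, 2, 3])
print(check.sum())
```

If the import fails, the environment that JupyterLab is using may not be the one where pandas was installed.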

To open Anaconda Notebooks:

Go to Anaconda Notebooks
From the top navigation menu, click 'Sign Up', or 'Sign In' if you already have an account
Click 'Launch Notebook'

The Notebooks for this course are also available at this public GitHub link.

Facilitator Bio

Ryan Orsinger serves as the Director of Data Science and Research at Haven for Hope, a large non-profit serving individuals experiencing homelessness in South Texas. Prior to joining Haven for Hope, Ryan taught data science and software development for 8 years at Codeup, a programming bootcamp, where he helped over 600 individuals become practicing data scientists and software developers.

As an individual contributor, Ryan has worked on data projects from customer segmentation analysis and anomaly detection for security to building software for learning management systems, events management platforms, and CRM systems.

You can find Ryan at:

GitHub

YouTube

Questions? Issues? Contact learning@anaconda.com.

Curriculum 03:35:23

Getting started with Anaconda Notebooks 00:01:02
Starting with pandas Series
Creating and filtering series 00:08:40
Exercise 1: pandas series 00:04:23
Getting the data you need out of a series 00:08:23
Exercise 2: pandas series 00:08:28
Strings 00:08:55
Exercise 3: Strings 00:08:36
Introducing the pandas DataFrame
Making DataFrames from Python collections and describing your DataFrame 00:10:30
Renaming columns + exercise 1: DataFrames 00:07:28
Creating a DataFrame, counting true values in a Boolean series, and using AND and OR operators 00:08:52
Exercise 2: DataFrames 00:10:25
Using pandas methods to clean, transform, and summarize your data
Identifying missing values 00:12:44
Exercise 1: Identifying missing values 00:03:08
Filling missing values 00:12:03
Exercise 2: Filling missing values 00:07:18
Using group-by and aggregate functions on DataFrames
Using .crosstab and .pivot_table 00:11:31
Exercise 1: Aggregating 00:07:15
Using .groupby and aggregate methods 00:12:59
Exercise 2: Aggregating 00:06:28
Combining DataFrames, more on reading files, and writing DataFrames
Using .concat to combine DataFrames horizontally or vertically 00:07:45
Using .merge to join DataFrames 00:13:28
Exercise 1: Combining DataFrames 00:06:56
DataFrames: Working with files 00:10:51
DataFrames: Separator characters (Delimiters) 00:08:07
Exercise 2: Working with files 00:08:14
Conclusion 00:00:54
Practice Quiz: Introduction to pandas for Data Analysis
End of course survey
Course Completion
