Introduction to pandas for Data Analysis

Building a foundation in Python using pandas DataFrames for analysis.

About this course

This course builds a rapid, working understanding of the pandas library for data analysis in Python. If you know some basic Python and want to use pandas for data tasks, this course is for you! It demonstrates how to create pandas datasets from other Python data types, introduces built-in pandas functionality, and builds the skills necessary to use pandas for creating, cleaning, and analyzing datasets.

By the end of this hands-on online course, you’ll understand:

  • How to create pandas DataFrames from various data sources
  • How to subset, filter, and combine DataFrames
  • How to use descriptive statistics, aggregate functions, and “group by” to view different dimensions of your data
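As a rough illustration of the three topics above, here is a minimal sketch that builds a DataFrame from a Python dictionary, filters it, combines it with a second DataFrame, and groups the result. The column names and values are invented for this example and are not taken from the course materials.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
sales = pd.DataFrame(
    {
        "region": ["north", "south", "north", "south"],
        "units": [10, 4, 7, 12],
    }
)

# Filter rows with a Boolean mask: keep orders of more than 5 units.
large_orders = sales[sales["units"] > 5]

# Combine with a second DataFrame by stacking its rows underneath.
more_sales = pd.DataFrame({"region": ["east"], "units": [3]})
all_sales = pd.concat([sales, more_sales], ignore_index=True)

# Group by region and aggregate the units sold in each one.
units_by_region = all_sales.groupby("region")["units"].sum()
print(units_by_region)
```

The same pattern (mask, combine, group) applies to larger datasets; only the source of the DataFrame changes.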

And you’ll be able to:

  • Work with pandas to rapidly slice and dice your data
  • Import data into pandas 
  • Clean, process, and export your pandas DataFrames
  • Transform, reshape, and quantify your data
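One possible shape of the import, clean, and export workflow is sketched below. It assumes a small in-memory CSV (a hypothetical stand-in for a real file) so that it runs without any files on disk; filling with the column mean is just one cleaning choice among many.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a file on disk.
raw_csv = io.StringIO("name,score\nAda,90\nGrace,\nLinus,70\n")

# Import: read_csv accepts a file path or any file-like object.
scores = pd.read_csv(raw_csv)

# Clean: fill the missing score with the mean of the known scores.
scores["score"] = scores["score"].fillna(scores["score"].mean())

# Export: to_csv returns the CSV as a string when no path is given.
csv_text = scores.to_csv(index=False)
print(csv_text)
```

Passing a file name to `read_csv` and `to_csv` instead would read from and write to disk.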

This training is for you because...

  • You have some Python basics and want to learn and use the popular pandas library to import, clean, subset, combine, and analyze your data
  • You currently work with spreadsheets and need to process larger amounts of data in a more programmatic way
  • You want to become a data analyst, data scientist, or machine learning engineer

Prerequisites

Recommended follow-up

Setup

To follow along using your desktop IDE:

  1. Install or update to the latest version of Anaconda
  2. Launch your command line tool and configure your conda environment

     For macOS and Linux users: Search for and launch Terminal on your system

     For Windows users: Locate and launch Anaconda Prompt on your system

  3. (Optional but recommended) From the command line, run the following commands to create and activate a new environment

     conda create --name NEW_ENV_NAME

     conda activate NEW_ENV_NAME

  4. Install the required packages from the command line

     conda install pandas

     If you created a new environment in step 3, JupyterLab is not installed in it yet; add it with: conda install jupyterlab

  5. Launch JupyterLab from the command line

     jupyter lab
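Once JupyterLab is running, a quick way to confirm the environment works is to run a cell like the following. This is only a suggested sanity check, not part of the course materials; the variable name `check` is arbitrary.

```python
import sys

import pandas as pd

# Report which interpreter and pandas version the notebook is using.
print("Python:", sys.version.split()[0])
print("pandas:", pd.__version__)

# Build a tiny Series to confirm that pandas imports and computes correctly.
check = pd.Series([1, 2, 3])
print(check.sum())
```

If the import fails, the notebook is probably running in a different environment from the one where pandas was installed.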

To open Anaconda Notebooks:

  1. Go to Anaconda Notebooks
  2. From the top navigation menu, click 'Sign Up', or 'Sign In' if you already have an account
  3. Click 'Launch Notebook'

The Notebooks for this course are also available at this public GitHub link

Facilitator Bio

Ryan Orsinger serves as the Director of Data Science and Research at Haven for Hope, a large non-profit serving individuals experiencing homelessness in South Texas. Prior to joining Haven for Hope, Ryan taught data science and software development for 8 years at Codeup, a programming bootcamp, where he helped over 600 individuals become practicing data scientists and software developers.

As an individual contributor, Ryan has worked on data projects from customer segmentation analysis and anomaly detection for security to building software for learning management systems, events management platforms, and CRM systems.

You can find Ryan at:

LinkedIn

GitHub

YouTube

Curriculum (total running time 03:35:23)

  • Getting started with Anaconda Notebooks 00:01:02
  • Starting with pandas Series
  • Creating and filtering series 00:08:40
  • Exercise 1: pandas series 00:04:23
  • Getting the data you need out of a series 00:08:23
  • Exercise 2: pandas series 00:08:28
  • Strings 00:08:55
  • Exercise 3: Strings 00:08:36
  • Introducing the pandas DataFrame
  • Making DataFrames from Python collections and describing your DataFrame 00:10:30
  • Renaming columns + exercise 1: DataFrames 00:07:28
  • Creating a DataFrame, counting true values in a Boolean series, and using AND and OR operators 00:08:52
  • Exercise 2: DataFrames 00:10:25
  • Using pandas methods to clean, transform, and summarize your data
  • Identifying missing values 00:12:44
  • Exercise 1: Identifying missing values 00:03:08
  • Filling missing values 00:12:03
  • Exercise 2: Filling missing values 00:07:18
  • Using group-by and aggregate functions on DataFrames
  • Using .crosstab and .pivot_table 00:11:31
  • Exercise 1: Aggregating 00:07:15
  • Using .groupby and aggregate methods 00:12:59
  • Exercise 2: Aggregating 00:06:28
  • Combining DataFrames, more on reading files, and writing DataFrames
  • Using .concat to combine DataFrames horizontally or vertically 00:07:45
  • Using .merge to join DataFrames 00:13:28
  • Exercise 1: Combining DataFrames 00:06:56
  • DataFrames: Working with files 00:10:51
  • DataFrames: Separator characters (Delimiters) 00:08:07
  • Exercise 2: Working with files 00:08:14
  • Conclusion 00:00:54
  • Practice Quiz: Introduction to pandas for Data Analysis
  • End of course survey
  • Course Completion
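To give a feel for how several curriculum topics fit together, the sketch below identifies and fills a missing value, joins two DataFrames with .merge, and summarizes the result with .pivot_table. The tables, column names, and the choice to fill with 0 are assumptions made for this example, not content from the course.

```python
import pandas as pd

# Hypothetical tables, invented for illustration only.
students = pd.DataFrame(
    {"student_id": [1, 2, 3], "name": ["Ada", "Grace", "Linus"]}
)
grades = pd.DataFrame(
    {
        "student_id": [1, 1, 2, 3],
        "subject": ["math", "art", "math", "art"],
        "grade": [95.0, None, 88.0, 72.0],
    }
)

# Identify missing values, then fill them with a placeholder of 0.
missing_count = grades["grade"].isna().sum()
grades["grade"] = grades["grade"].fillna(0)

# Join the two tables on their shared key column.
merged = students.merge(grades, on="student_id", how="left")

# Summarize: one row per student, one column per subject.
summary = merged.pivot_table(
    index="name", columns="subject", values="grade", aggfunc="mean"
)
print(summary)
```

Cells in the summary stay empty (NaN) wherever a student has no grade recorded for a subject.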
