- Getting started with Anaconda Notebooks 00:01:02
- Starting with pandas Series
  - Creating and filtering series 00:08:40
  - Exercise 1: pandas series 00:04:23
  - Getting the data you need out of a series 00:08:23
  - Exercise 2: pandas series 00:08:28
  - Strings 00:08:55
  - Exercise 3: Strings 00:08:36
- Introducing the pandas DataFrame
  - Making DataFrames from Python collections and describing your DataFrame 00:10:30
  - Renaming columns + exercise 1: DataFrames 00:07:28
  - Creating a DataFrame, counting true values in a Boolean series, and using AND and OR operators 00:08:52
  - Exercise 2: DataFrames 00:10:25
- Using pandas methods to clean, transform, and summarize your data
  - Identifying missing values 00:12:44
  - Exercise 1: Identifying missing values 00:03:08
  - Filling missing values 00:12:03
  - Exercise 2: Filling missing values 00:07:18
- Using group-by and aggregate functions on DataFrames
  - Using .crosstab and .pivot_table 00:11:31
  - Exercise 1: Aggregating 00:07:15
  - Using .groupby and aggregate methods 00:12:59
  - Exercise 2: Aggregating 00:06:28
- Combining DataFrames, more on reading files, and writing DataFrames
  - Using .concat to combine DataFrames horizontally or vertically 00:07:45
  - Using .merge to join DataFrames 00:13:28
  - Exercise 1: Combining DataFrames 00:06:56
  - DataFrames: Working with files 00:10:51
  - DataFrames: Separator characters (Delimiters) 00:08:07
  - Exercise 2: Working with files 00:08:14
  - Conclusion 00:00:54
  - Practice Quiz: Introduction to pandas for Data Analysis
  - End of course survey
  - Course Completion
Introduction to pandas for Data Analysis
Building a foundation in Python using pandas DataFrames for analysis.
This course focuses on building a rapid understanding of how to use the pandas library for data analysis in Python. If you know some basic Python and want to use pandas for data tasks, this course is for you! It demonstrates how to create pandas datasets from other Python data types, introduces built-in pandas functionality, and builds the skills necessary to use pandas for creating, cleaning, and analyzing datasets.
By the end of this hands-on online course, you’ll understand:
- How to create pandas DataFrames from various data sources
- How to subset, filter, and combine DataFrames
- How to use descriptive statistics, aggregate functions, and “group by” in order to view different dimensions of your data
And you’ll be able to:
- Work with pandas to rapidly slice and dice your data
- Import data into pandas
- Clean, process, and export your pandas DataFrames
- Transform, reshape, and quantify your data
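As a rough illustration of the skills listed above (not material taken from the course itself), the sketch below builds two small DataFrames from Python collections, fills a missing value, filters with a Boolean mask, merges the tables, and summarizes with a group-by. All column names and values are invented for the example.

```python
import pandas as pd

# Hypothetical data: every name and number here is made up for illustration.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer": ["ana", "ben", "ana", "cy"],
    "amount": [20.0, 30.0, None, 40.0],
})
customers = pd.DataFrame({
    "customer": ["ana", "ben", "cy"],
    "region": ["north", "south", "north"],
})

# Clean: replace the missing amount with 0.0.
orders["amount"] = orders["amount"].fillna(0.0)

# Filter: keep orders of at least 15, using a Boolean mask.
large = orders[orders["amount"] >= 15]

# Combine: attach each customer's region with a left join.
merged = large.merge(customers, on="customer", how="left")

# Summarize: total amount per region.
totals = merged.groupby("region")["amount"].sum()
print(totals)

# Export: with no path given, to_csv returns the CSV text as a string.
csv_text = merged.to_csv(index=False)
```

Passing a file path to `to_csv` would write the same content to disk instead of returning a string.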
This training is for you because...
- You have some Python basics and want to learn and use the popular pandas library to import, clean, subset, combine, and analyze your data
- You currently work with spreadsheets and need to process larger amounts of data in a more programmatic way
- You want to become a data analyst, data scientist, or machine learning engineer
Prerequisites
- Foundational knowledge of Python (experience with data types, operators, and syntax), or completion of the Introduction to Python Programming Learning Path.
Recommended follow-up
Setup
To follow along using your desktop IDE:
1. Install or update to the latest version of Anaconda
2. Launch your command line tool and configure your conda environment
   For macOS and Linux users: Search for and launch Terminal on your system
   For Windows users: Locate and launch Anaconda Prompt on your system
3. (Optional but recommended) From the command line, run the following commands to create and activate a new environment
   conda create --name NEW_ENV_NAME
   conda activate NEW_ENV_NAME
4. Install the required packages from the command line
   conda install pandas
5. Launch JupyterLab from the command line
   jupyter lab
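Once JupyterLab is running, one way to confirm that the environment is working is to run a short check in a new notebook cell. This is a minimal sketch rather than part of the official setup instructions, and the Series values are arbitrary.

```python
import pandas as pd

# Print the installed pandas version to confirm the import works.
print(pd.__version__)

# Build a tiny Series as a smoke test; the values are arbitrary.
s = pd.Series([1, 2, 3])
print(s.sum())
```

If the import fails, the notebook is probably running in a different environment from the one where pandas was installed.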
To open Anaconda Notebooks:
- Go to https://anaconda.cloud
- Click on 'Notebooks' from the top navigation menu
- Create an account or log in if you already have one
The Notebooks for this course are also available at this public GitHub link.
Facilitator Bio
Ryan Orsinger serves as the Director of Data Science and Research at Haven for Hope, a large non-profit serving individuals experiencing homelessness in South Texas. Prior to joining Haven for Hope, Ryan taught data science and software development for 8 years at Codeup, a programming bootcamp, where he helped over 600 individuals become practicing data scientists and software developers.
As an individual contributor, Ryan has worked on data projects from customer segmentation analysis and anomaly detection for security to building software for learning management systems, events management platforms, and CRM systems.
You can find Ryan at: