PandasZoo

Pandas Python Tutorial: Learn Pandas and Python in stages

Let's get started!

Pandas Tutorials

Pandas Tutorial 1 - Titanic data set.

Pandas Basics: Titanic exercises

The first tutorial uses the Titanic data set, an Excel file called titanic3.xls. Its only tab is called titanic3.

The questions in this tutorial are based on the titanic3 dataset, which can be found here.

Note: At PandasZoo we use single quotes for our answers.

This is a sample of what the titanic3 data set looks like:

   pclass  survived  name                                             sex     age      sibsp  parch  ticket  fare      cabin    embarked  boat  body   home.dest
0  1       1         Allen, Miss. Elisabeth Walton                    female  29.0000  0      0      24160   211.3375  B5       S         2     NaN    St Louis, MO
1  1       1         Allison, Master. Hudson Trevor                   male    0.9167   1      2      113781  151.5500  C22 C26  S         11    NaN    Montreal, PQ / Chesterville, ON
2  1       0         Allison, Miss. Helen Loraine                     female  2.0000   1      2      113781  151.5500  C22 C26  S         NaN   NaN    Montreal, PQ / Chesterville, ON
3  1       0         Allison, Mr. Hudson Joshua Creighton             male    30.0000  1      2      113781  151.5500  C22 C26  S         NaN   135.0  Montreal, PQ / Chesterville, ON
4  1       0         Allison, Mrs. Hudson J C (Bessie Waldo Daniels)  female  25.0000  1      2      113781  151.5500  C22 C26  S         NaN   NaN    Montreal, PQ / Chesterville, ON

Question 1

Import the Pandas module.

Hint: We put in an example answer that you should try typing in.
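A minimal sketch of one possible answer. The alias pd is the conventional one and matches the hints in later questions:

```python
# Import pandas under the conventional alias pd.
import pandas as pd
```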





Question 2

Import the NumPy module.
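One possible answer, assuming the conventional alias np (the tutorial itself does not name an alias for NumPy):

```python
# Import NumPy under the conventional alias np (an assumption, not stated in the question).
import numpy as np
```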





Question 3

Import the matplotlib.pyplot module.
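One possible answer, assuming the conventional alias plt (the tutorial itself does not name an alias for matplotlib.pyplot):

```python
# Import the pyplot interface under the conventional alias plt (an assumption).
import matplotlib.pyplot as plt
```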





Question 4

Read in the titanic3.xls data set. The only tab is called titanic3. Make sure values labeled 'NA' are treated as missing.

When you read in the dataframe, call it titanic_df. Also, make sure index_col is set to None.

Pass the arguments to read_excel in this order: filename, tab name, index_col, and na_values.

Hint: We refer to Pandas as pd. You can find the official documentation for read_excel here.
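A hedged sketch of the call in the requested argument order. The helper load_titanic is hypothetical and exists only so the example does nothing when titanic3.xls is missing; the answer itself is the single pd.read_excel line assigned to titanic_df. Reading a .xls file also assumes the xlrd package is installed.

```python
import os

import pandas as pd


def load_titanic(path='titanic3.xls'):
    # Hypothetical helper: filename and tab name are passed positionally,
    # followed by index_col and na_values as keywords, in that order.
    return pd.read_excel(path, 'titanic3', index_col=None, na_values=['NA'])


# Only attempt the read when the file is in the working directory.
if os.path.exists('titanic3.xls'):
    titanic_df = load_titanic()
    print(titanic_df.shape)
```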





Question 5

Use the head function to look at the titanic_df DataFrame.
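A small sketch of head. So that it runs without the Excel file, titanic_df here is a stand-in built from a few columns of the five sample rows shown earlier, not the full data set:

```python
import pandas as pd

# Stand-in for titanic_df: four columns from the five sample rows.
titanic_df = pd.DataFrame({
    'pclass': [1, 1, 1, 1, 1],
    'survived': [1, 1, 0, 0, 0],
    'sex': ['female', 'male', 'female', 'male', 'female'],
    'age': [29.0, 0.9167, 2.0, 30.0, 25.0],
})

# head() returns the first five rows by default.
print(titanic_df.head())
```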





Question 6

Use the describe function to learn more about the titanic_df dataset.
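A small sketch of describe, again on a stand-in titanic_df built from the five sample rows rather than the full file. With mixed column types, describe summarizes only the numeric columns by default:

```python
import pandas as pd

# Stand-in for titanic_df: four columns from the five sample rows.
titanic_df = pd.DataFrame({
    'pclass': [1, 1, 1, 1, 1],
    'survived': [1, 1, 0, 0, 0],
    'sex': ['female', 'male', 'female', 'male', 'female'],
    'age': [29.0, 0.9167, 2.0, 30.0, 25.0],
})

# Summary statistics (count, mean, std, min, quartiles, max) per numeric column.
print(titanic_df.describe())
```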





Question 7

Drop the 'ticket', 'cabin', 'boat', and 'body' columns, in this order, from the titanic_df DataFrame.
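One way this could look, sketched on a stand-in titanic_df built from the sample rows. Reassigning the result back to titanic_df is an assumption; the question does not say whether the drop should happen in place:

```python
import numpy as np
import pandas as pd

# Stand-in for titanic_df: six columns from the five sample rows.
titanic_df = pd.DataFrame({
    'survived': [1, 1, 0, 0, 0],
    'ticket': [24160, 113781, 113781, 113781, 113781],
    'fare': [211.3375, 151.55, 151.55, 151.55, 151.55],
    'cabin': ['B5', 'C22 C26', 'C22 C26', 'C22 C26', 'C22 C26'],
    'boat': ['2', '11', np.nan, np.nan, np.nan],
    'body': [np.nan, np.nan, np.nan, 135.0, np.nan],
})

# axis=1 tells drop to remove columns rather than rows.
titanic_df = titanic_df.drop(['ticket', 'cabin', 'boat', 'body'], axis=1)
print(list(titanic_df.columns))
```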





Question 8

Let's use Pandas to make a bar plot of the value counts of the 'survived' column.

Hint: We refer to Pandas as pd.
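A hedged sketch on a stand-in 'survived' column taken from the sample rows. It uses the Series method value_counts, which may differ from the intended answer if that relied on the top-level pd.value_counts function (deprecated in recent pandas releases). The Agg backend is selected only so the example runs without a display:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; an assumption for headless runs

import pandas as pd

# Stand-in for titanic_df: the 'survived' values of the five sample rows.
titanic_df = pd.DataFrame({'survived': [1, 1, 0, 0, 0]})

# Count each distinct value, then draw one bar per value.
ax = titanic_df['survived'].value_counts().plot(kind='bar')
```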





Question 9

Let's look at the mean of the 'survived' column, i.e. the proportion of passengers who survived.

Hint: Don't make an object.
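A minimal sketch on a stand-in 'survived' column from the five sample rows. The mean is computed directly without assigning it to a variable; print is used only so a script shows the value:

```python
import pandas as pd

# Stand-in for titanic_df: the 'survived' values of the five sample rows.
titanic_df = pd.DataFrame({'survived': [1, 1, 0, 0, 0]})

# The mean of a 0/1 column is the share of ones, here 2 out of 5.
print(titanic_df['survived'].mean())
```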





Question 10

Let's group our data by the sex of the passenger and see what the means are.

Hint: Don't make an object.
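A hedged sketch on a stand-in titanic_df built from the sample rows. The numeric_only=True argument is an addition that is not mentioned in the question: recent pandas versions raise an error when mean is asked to average text columns such as 'name', so the sketch restricts the result to numeric columns.

```python
import pandas as pd

# Stand-in for titanic_df: four columns from the five sample rows.
titanic_df = pd.DataFrame({
    'survived': [1, 1, 0, 0, 0],
    'name': [
        'Allen, Miss. Elisabeth Walton',
        'Allison, Master. Hudson Trevor',
        'Allison, Miss. Helen Loraine',
        'Allison, Mr. Hudson Joshua Creighton',
        'Allison, Mrs. Hudson J C (Bessie Waldo Daniels)',
    ],
    'sex': ['female', 'male', 'female', 'male', 'female'],
    'age': [29.0, 0.9167, 2.0, 30.0, 25.0],
})

# One row per sex, one column per numeric variable.
print(titanic_df.groupby('sex').mean(numeric_only=True))
```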





Question 11

Let's group the data by the sex and class of the passenger, in that order, and see what the means are.

Hint: Don't make an object.

Another Hint: 'pclass' is the class column.
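A hedged sketch on a stand-in titanic_df built from the sample rows. Passing a list to groupby groups by several columns in the given order; numeric_only=True is an assumption added to keep the call safe if text columns are present.

```python
import pandas as pd

# Stand-in for titanic_df: four columns from the five sample rows.
titanic_df = pd.DataFrame({
    'pclass': [1, 1, 1, 1, 1],
    'survived': [1, 1, 0, 0, 0],
    'sex': ['female', 'male', 'female', 'male', 'female'],
    'age': [29.0, 0.9167, 2.0, 30.0, 25.0],
})

# Group by sex first, then by passenger class.
print(titanic_df.groupby(['sex', 'pclass']).mean(numeric_only=True))
```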





Question 12

Let's group by sex and class again, but only for passengers under 18 years old, and see what the means are.

Hint: Don't make an object and do this in one line of code.

Hint: The group by order should be sex then class.
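A hedged one-line sketch on a stand-in titanic_df built from the sample rows: a boolean mask keeps the rows with age below 18, and the grouping is chained onto the filtered frame. As before, numeric_only=True is an assumption added for safety with text columns.

```python
import pandas as pd

# Stand-in for titanic_df: four columns from the five sample rows.
titanic_df = pd.DataFrame({
    'pclass': [1, 1, 1, 1, 1],
    'survived': [1, 1, 0, 0, 0],
    'sex': ['female', 'male', 'female', 'male', 'female'],
    'age': [29.0, 0.9167, 2.0, 30.0, 25.0],
})

# Filter to under-18s, then group by sex and class, all in one expression.
print(titanic_df[titanic_df['age'] < 18].groupby(['sex', 'pclass']).mean(numeric_only=True))
```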