Reading Stock Data from a CSV File with Python Pandas

June 30, 2022

Now that you have learned how easy it is to work with spreadsheet data in Pandas, let’s practice with some real financial data! In this activity, you will create a DataFrame from a CSV file and then explore its contents using the DataFrame’s built-in functions.

Starter file

1. Import the `pandas` library as `pd` and the `Path` class from `pathlib`

# initial imports

2. Create a DataFrame by reading in a CSV file

# set the file path

# create a Pandas DataFrame from a csv file

3. Explore the data

# get the first 10 rows from the DataFrame

4. Fix the column names

# set column names

# recreate the DataFrame

# add column names

5. Get the first 10 rows

# get the first 10 rows from the DataFrame

6. Challenge: Get the bottom 10 rows

# get the bottom 10 rows from the DataFrame

Instructions

Using the starter file, complete the following steps.

1. Import the Pandas library as `pd` and the `Path` class from `pathlib`.
2. Create a DataFrame by reading in the `shopify_stock_data.csv` file, which contains historical price data for Shopify on the Toronto Stock Exchange from 2015 to 2019.
3. Perform an initial data exploration by getting the first 10 rows of the DataFrame.
4. Oh no! The DataFrame has no column names: the first row of data was used as the header instead. Fix this problem by recreating the DataFrame and setting the column names to “date”, “close”, “volume”, “open”, “high”, “low”.
5. When the column names are fixed, get the first 10 rows from the DataFrame.

Challenge

Get the bottom 10 rows of the DataFrame. Use Google to figure out how to do this.

Hint

Consult the Pandas head() function documentation.

Solution

1. Import the `pandas` library as `pd` and the `Path` class from `pathlib`

In [1]:

# initial imports
import pandas as pd
from pathlib import Path

2. Create a DataFrame by reading in a CSV file

In [2]:

# set the file path
file_path = Path("../Resources/shopify_stock_data.csv")

# create a Pandas DataFrame from a csv file
df = pd.read_csv(file_path)
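
If `read_csv` cannot find the file, it helps to inspect the `Path` object first. The following is a minimal sketch that only examines the path used above; whether the file exists depends on the directory the notebook runs from, so no result is shown for that check.

```python
from pathlib import Path

# Same relative path as in the solution cell above
file_path = Path("../Resources/shopify_stock_data.csv")

print(file_path.name)         # file name without the directories
print(file_path.suffix)       # file extension, including the dot
print(file_path.parent.name)  # name of the folder that holds the file

# exists() is True only if the path resolves from the current working directory
print(file_path.exists())
```

If `exists()` prints `False`, the relative path is probably being resolved from a different working directory than expected.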

3. Explore the data

In [3]:

# get the first 10 rows from the DataFrame
df.head(10)

Out [3]:


	5/21/2015 16:00:00	31.25	211058	35.03	35.03.1	30
0	5/22/2015 16:00:00	34.94	224174	32.32	38.00	32.00
1	5/25/2015 16:00:00	37.26	105460	35.00	37.47	35.00
2	5/26/2015 16:00:00	36.92	75935	37.26	37.69	36.30
3	5/27/2015 16:00:00	34.50	135778	38.00	38.16	33.63
4	5/28/2015 16:00:00	34.00	28756	34.60	34.60	33.14
5	5/29/2015 16:00:00	33.52	15319	33.71	34.11	33.35
6	6/1/2015 16:00:00	33.85	8839	33.73	34.27	33.52
7	6/2/2015 16:00:00	33.13	7318	34.72	34.72	33.13
8	6/3/2015 16:00:00	34.19	29821	33.98	34.26	33.00
9	6/4/2015 16:00:00	33.45	32221	34.99	34.99	31.88
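
The header in the output above is really the first row of data: by default, `read_csv` treats the first line of a file as column names. The sketch below reproduces the effect on two rows copied from the output, assuming the file is comma-separated; `sample_csv` and `inferred` are illustrative names, not part of the original notebook.

```python
import io

import pandas as pd

# Two rows copied from the output above, standing in for the real CSV file
sample_csv = (
    "5/21/2015 16:00:00,31.25,211058,35.03,35.03,30\n"
    "5/22/2015 16:00:00,34.94,224174,32.32,38.00,32.00\n"
)

# With default settings, the first line is consumed as the header row
inferred = pd.read_csv(io.StringIO(sample_csv))

print(list(inferred.columns))  # the first data row, used as column labels
print(len(inferred))           # only one data row is left
```

This also explains the odd `35.03.1` label: the value 35.03 appears twice in the first row, and Pandas renames the duplicate so that column labels stay unique.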


4. Fix the column names

In [4]:

# set column names
col_names = ["date", "close", "volume", "open", "high", "low"]

# recreate the DataFrame; header=None tells Pandas the file has no header row,
# so the first line is kept as data
df = pd.read_csv(file_path, header=None)

# add column names
df.columns = col_names
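
As an alternative, `read_csv` can assign the column names while reading through its `names` parameter, which saves the separate assignment step. This is a minimal sketch on two rows copied from the earlier output, assuming a comma-separated file; `sample_csv` and `named_df` are illustrative names.

```python
import io

import pandas as pd

# Two rows copied from the earlier output, standing in for the real CSV file
sample_csv = (
    "5/21/2015 16:00:00,31.25,211058,35.03,35.03,30\n"
    "5/22/2015 16:00:00,34.94,224174,32.32,38.00,32.00\n"
)
col_names = ["date", "close", "volume", "open", "high", "low"]

# names= supplies the labels; header=None keeps the first line as data
named_df = pd.read_csv(io.StringIO(sample_csv), header=None, names=col_names)

print(named_df)
```

Both approaches give the same DataFrame; the two-step version in the solution makes the renaming explicit, which is useful when learning.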

5. Get the first 10 rows

In [5]:

# get the first 10 rows from the DataFrame
df.head(10)

Out [5]:


	date	close	volume	open	high	low
0	5/21/2015 16:00:00	31.25	211058	35.03	35.03	30.00
1	5/22/2015 16:00:00	34.94	224174	32.32	38.00	32.00
2	5/25/2015 16:00:00	37.26	105460	35.00	37.47	35.00
3	5/26/2015 16:00:00	36.92	75935	37.26	37.69	36.30
4	5/27/2015 16:00:00	34.50	135778	38.00	38.16	33.63
5	5/28/2015 16:00:00	34.00	28756	34.60	34.60	33.14
6	5/29/2015 16:00:00	33.52	15319	33.71	34.11	33.35
7	6/1/2015 16:00:00	33.85	8839	33.73	34.27	33.52
8	6/2/2015 16:00:00	33.13	7318	34.72	34.72	33.13
9	6/3/2015 16:00:00	34.19	29821	33.98	34.26	33.00


6. Challenge: Get the bottom 10 rows

In [6]:

# get the bottom 10 rows from the DataFrame
df.tail(10)

Out [6]:

	date	close	volume	open	high	low
1145	12/13/2019 16:00:00	508.43	312817	488.94	516.60	488.90
1146	12/16/2019 16:00:00	517.42	221975	513.98	521.26	502.85
1147	12/17/2019 16:00:00	510.85	252381	517.20	521.97	500.77
1148	12/18/2019 16:00:00	520.56	391152	513.00	530.17	512.97
1149	12/19/2019 16:00:00	516.06	178895	520.99	526.66	513.85
1150	12/20/2019 16:00:00	513.22	715483	516.20	524.99	510.29
1151	12/23/2019 16:00:00	511.62	243940	516.83	524.99	510.37
1152	12/24/2019 13:30:00	525.39	125214	512.46	527.17	508.21
1153	12/27/2019 16:00:00	534.76	156355	539.97	544.00	528.00
1154	12/30/2019 16:00:00	517.79	162031	535.36	535.36	512.57
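
Note that `tail()` keeps the original index labels, which is why the rows above are numbered 1145 to 1154. The sketch below shows the same behavior on a small hypothetical DataFrame (closing prices copied from the first rows shown earlier) and compares `tail()` with an equivalent positional slice.

```python
import pandas as pd

# Small hypothetical frame; closing prices copied from the first rows shown earlier
demo = pd.DataFrame({"close": [31.25, 34.94, 37.26, 36.92, 34.50]})

last_two = demo.tail(2)         # last 2 rows, original index labels kept
also_last_two = demo.iloc[-2:]  # positional slice that selects the same rows

print(last_two)
print(len(demo.tail()))         # with no argument, tail() returns up to 5 rows
```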

