Welcome! In this video, we'll introduce you to the concat() function and the arguments used to successfully concatenate data between two datasets.
Data files for download
Load 2019 data into pandas
billboard19 = pd.read_csv("billboard_100_2019.csv", index_col="ID")
spotify19 = pd.read_csv("spotify_200_2019.csv", index_col="ID")
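As a minimal sketch of what index_col="ID" does, the snippet below reads a tiny in-memory CSV instead of the real chart files. The rows and the Name column are made up for illustration; only the ID and Artists column names come from these notes.

```python
import io

import pandas as pd

# Hypothetical stand-in for billboard_100_2019.csv; the rows are invented.
csv_text = """ID,Name,Artists
1,Song A,Ariana Grande
2,Song B,Some Other Artist
3,Song C,"Ariana Grande, Featured Artist"
"""

# index_col="ID" makes the ID column the row index instead of a data column.
sample19 = pd.read_csv(io.StringIO(csv_text), index_col="ID")

print(sample19.index.name)     # ID
print(list(sample19.columns))  # ['Name', 'Artists']
```

With the real files, the only change would be passing the file name in place of the io.StringIO object.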
Create DataFrames for 2019 Ariana Grande Billboard and Spotify song data
ariana_bill19 = billboard19[billboard19["Artists"].str.contains("Ariana Grande")]
ariana_spot19 = spotify19[spotify19["Artists"].str.contains("Ariana Grande")]
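The filter above can be sketched on a small, invented DataFrame to show the two steps involved: building a boolean mask with str.contains, then indexing with it. The song rows below are hypothetical.

```python
import pandas as pd

# Hypothetical chart rows; only the "Artists" column name comes from the notes.
charts = pd.DataFrame(
    {
        "Name": ["Song A", "Song B", "Song C"],
        "Artists": [
            "Ariana Grande",
            "Some Other Artist",
            "Ariana Grande, Featured Artist",
        ],
    },
    index=pd.Index([1, 2, 3], name="ID"),
)

# str.contains returns a boolean Series: True where the text includes the substring.
is_ariana = charts["Artists"].str.contains("Ariana Grande")

# Indexing with that mask keeps only the True rows, including collaborations.
ariana_rows = charts[is_ariana]
print(ariana_rows)
```

If the Artists column could contain missing values, passing na=False to str.contains treats them as non-matches; whether the course files need that is an assumption to check against the data.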
Concatenate 2017-2018 Ariana Grande song data with 2019 Ariana Grande song data in Billboard
ariana_bill_all = pd.concat([ariana_bill, ariana_bill19])
Concatenate 2017-2018 Ariana Grande song data with 2019 Ariana Grande song data in Spotify
ariana_spot_all = pd.concat([ariana_spot, ariana_spot19])
Additional Resources
- Pandas API: concat() function
Welcome! In this video,
we will combine two data frames into one,
0:00
using concatenation.
0:04
That's a big word.
0:06
But, if you're in this workshop
you're probably familiar with string
0:08
concatenation, which is the combination
of two or more strings.
0:11
In Python,
this is performed with the plus operator.
0:15
For example,
0:19
print("book" + "keeper")
0:23
outputs bookkeeper.
0:28
We simply combine two small
words into a new, larger word.
0:34
It's like we added the words together.
0:38
You can even consider this a full
outer join of the two words.
0:41
Notice there is no merge
among the letters.
0:45
The end of "book" and the beginning
of "keeper" are the same letter, and
0:48
both ks are retained in the final output.
0:53
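A short sketch of the string example, using illustrative variable names that are not from the video:

```python
first = "book"
second = "keeper"

# The + operator joins the strings end to end without merging any letters.
word = first + second

print(word)             # bookkeeper
print(word.count("k"))  # 2
```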
Where am I going with this?
0:56
Well, I have found more data for
us to use.
0:57
Six additional months of Billboard
100 charts and Spotify 200 charts.
1:00
I'd like to add these to my existing lists.
1:05
billboard_100_2019.csv contains
the Billboard 100 chart data from 2019.
1:12
spotify_200_2019.csv contains
the Spotify 200 chart data from 2019.
1:20
This new data has the same
columns as the original data,
1:27
but new dates, new songs, and new artists.
1:32
So we just need to stitch these rows
to the end of the existing data frame.
1:36
Let's first load these new files
into their own data frames.
1:43
Make sure you download these files
from the teacher's notes and
1:46
save them to your working folder.
1:49
I'll add 19 to the end to differentiate
them from the original datasets.
1:51
billboard19 =
2:00
pd.read_csv("billboard_100_2019.csv",
2:03
index_col="ID").
2:16
spotify19 =
2:26
pd.read_csv("spotify_200_2019.csv",
2:29
index_col="ID").
2:40
And let's isolate Ariana Grande songs, so
2:45
we can work on a smaller
portion of the data.
2:49
ariana_bill19 = billboard19[
2:55
billboard19["Artists"].str.contains("Ariana
3:05
Grande")].
3:16
ariana_spot19 =
3:25
spotify19[spotify19["Artists"].str.contains("Ariana
3:29
Grande")].
3:44
Let's take a peek at the top of
the original Billboard dataset.
3:50
ariana_bill.head(). Then the new dataset.
3:59
ariana_bill19.head().
4:13
Great, they have the same headings.
4:20
Let's see the shape of the original.
4:27
ariana_bill.shape, and the new,
4:31
ariana_bill19.shape.
4:38
What I want to do here is
add the 101 new rows
4:43
to the 102 existing rows
in my original dataset.
4:46
In this case, I want to concatenate
the two data frames into a new data frame.
4:51
Let's talk about the concat function.
4:57
I appreciate the abbreviation here.
5:00
It has one required argument,
5:02
a Python list of pandas objects in
the order we wish to connect them.
5:04
By default, it performs an outer join
along the row axis, which is what we want.
5:08
Because the existing data and the new
data have the same column headings,
5:15
we don't need any optional
arguments in this case.
5:18
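To illustrate what the default outer join along the row axis means, here is a sketch with two invented frames whose columns only partly overlap. The column names and values are hypothetical; the chart data in this video has matching columns, so this case does not arise there.

```python
import pandas as pd

# Hypothetical frames whose columns only partly overlap.
old = pd.DataFrame({"Name": ["Song A"], "Artists": ["Artist X"]})
new = pd.DataFrame({"Name": ["Song B"], "Streams": [1000]})

# Default: join="outer" along axis=0 keeps every column and fills gaps with NaN.
outer = pd.concat([old, new])

# join="inner" keeps only the columns that both frames share.
inner = pd.concat([old, new], join="inner")

print(sorted(outer.columns))  # ['Artists', 'Name', 'Streams']
print(list(inner.columns))    # ['Name']
```

When the column headings match exactly, as they do for the chart data, both settings give the same columns.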
So let's start the concatenation.
5:22
ariana_bill_all = pd.concat([ariana_bill,
5:28
ariana_bill19]).
5:38
Let's check the dimensions,
ariana_bill_all.shape.
5:46
So this was my expectation.
5:56
We appended the second dataset
to the first dataset.
5:57
The new dataset has 102 plus 101, or
203, rows.
6:01
They both have the same 7 columns, so
the new dataset also has 7 columns.
6:08
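The row and column arithmetic can be checked in code. This is a sketch on small invented frames, not the chart data, so the sizes differ from the 102 and 101 rows mentioned in the video.

```python
import pandas as pd

# Hypothetical frames with identical columns; sizes are arbitrary.
cols = ["Name", "Artists"]
original = pd.DataFrame(
    [["Song A", "Artist X"], ["Song B", "Artist Y"]], columns=cols
)
newer = pd.DataFrame([["Song C", "Artist X"]], columns=cols)

combined = pd.concat([original, newer])

# Rows add up; the column count is unchanged because the headings match.
expected_rows = original.shape[0] + newer.shape[0]
print(combined.shape)  # (3, 2)
assert combined.shape == (expected_rows, original.shape[1])
```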
Now we'll do the same for
the Spotify data.
6:15
Let's make sure they
have the same headings.
6:17
The original, ariana_spot.head().
6:21
And the new, ariana_spot19.head(). Same headings,
6:35
let's check the shape, ariana_spot.shape.
6:45
And the new, ariana_spot19.shape.
6:54
Let's concatenate the Spotify data.
7:03
ariana_spot_all
7:05
= pd.concat([ariana_spot,
7:10
ariana_spot19]).
7:18
And let's check its dimensions.
7:28
ariana_spot_all.shape. The new
7:31
dataset has 186 plus 196 rows.
7:37
That's 382 records.
7:44
They both have the same 5 columns, so
the new data frame also has 5 columns.
7:46
The concat function has optional arguments,
although we didn't need any for our data,
7:51
but make sure to check out
the teacher's notes for more info.
7:55
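As a hedged sketch of a few of those optional arguments, the example below shows ignore_index and keys on two invented frames that reuse the same index labels. Whether the course files reuse ID values across years is not stated, so treat the overlap here as an assumption.

```python
import pandas as pd

# Hypothetical frames that reuse the same index labels.
first = pd.DataFrame({"Name": ["Song A", "Song B"]}, index=[1, 2])
second = pd.DataFrame({"Name": ["Song C", "Song D"]}, index=[1, 2])

# Default: original index labels are kept, so labels 1 and 2 each appear twice.
kept = pd.concat([first, second])

# ignore_index=True discards the old labels and renumbers from 0.
renumbered = pd.concat([first, second], ignore_index=True)

# keys adds an outer index level recording which frame each row came from.
labeled = pd.concat([first, second], keys=["old", "new"])

print(kept.index.tolist())        # [1, 2, 1, 2]
print(renumbered.index.tolist())  # [0, 1, 2, 3]
print(labeled.loc["new"])         # only the rows from the second frame
```

If repeated labels should be treated as an error rather than kept, pd.concat also accepts verify_integrity=True, which raises a ValueError when the combined index contains duplicates.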
I have another challenge for you.
7:59
We concatenated the new Ariana Grande
Billboard data to her existing data.
8:02
We started with 24 months of data, and
we now have 30 months of data.
8:07
We did the same for her Spotify data.
8:11
I would like you to use the same method
demonstrated in this video to concatenate
8:13
the full set of existing Billboard data
with the new Billboard dataset.
8:18
Do the same for
the full set of existing Spotify data
8:23
with the new Spotify dataset.
8:26
Call your new data frames billboard_all
and spotify_all.
8:29
In the next video,
I'll show you my solution.
8:35
See you there.
8:37