1
00:00:00,200 --> 00:00:03,990
Often, the data that we work with will not be just in one single data frame,

2
00:00:03,990 --> 00:00:06,409
it will be spread across multiple data frames.

3
00:00:06,409 --> 00:00:09,368
And our job is often to take these different data frames, and

4
00:00:09,368 --> 00:00:12,280
combine them together, and somehow produce a new result.

5
00:00:12,280 --> 00:00:16,544
Sometimes we're lucky and these data frames have clearly related information,

6
00:00:16,544 --> 00:00:20,572
and other times it's a bit of a challenge to figure out how to relate the rows.

7
00:00:20,572 --> 00:00:24,507
If the labels match between the data frames, it's possible to join the two

8
00:00:24,507 --> 00:00:29,022
together quite easily, much like you would see in SQL with primary and foreign keys.

9
00:00:29,022 --> 00:00:31,735
And, of course, sometimes it's not that cut and dry,

10
00:00:31,735 --> 00:00:34,643
you have to do some work to relate the data frames together.

11
00:00:34,643 --> 00:00:35,809
But that's okay.

12
00:00:35,809 --> 00:00:36,979
We're ready for that work.

13
00:00:36,979 --> 00:00:39,051
We've been picking up manipulation skills so

14
00:00:39,051 --> 00:00:41,924
that we can get things together in the right shape for our needs.

15
00:00:41,924 --> 00:00:44,166
After your data frames are merged together,

16
00:00:44,166 --> 00:00:47,509
some new problems will most likely introduce themselves.

17
00:00:47,509 --> 00:00:50,692
You might have to worry about duplicate records or missing data.

18
00:00:50,692 --> 00:00:51,985
Let's do some merging and

19
00:00:51,985 --> 00:00:55,054
cleaning up of those problems that you're bound to run into,