1 00:00:00,350 --> 00:00:04,440 When analyzing data, it'll usually be presented as a table. 2 00:00:04,440 --> 00:00:08,250 You might be used to viewing tabular data in Microsoft's Excel or 3 00:00:08,250 --> 00:00:09,800 Apple's Numbers app. 4 00:00:09,800 --> 00:00:12,656 However, for this course, we'll be using Google Sheets, 5 00:00:12,656 --> 00:00:16,100 which offers a spreadsheet app right within the browser. 6 00:00:16,100 --> 00:00:17,970 If you're not familiar with Google Sheets, or 7 00:00:17,970 --> 00:00:20,430 need a refresher on how to use spreadsheets, 8 00:00:20,430 --> 00:00:23,710 check out our spreadsheet basics course linked in the teacher's notes below. 9 00:00:24,980 --> 00:00:28,098 Let's go to sheets.google.com to get started. 10 00:00:31,675 --> 00:00:34,870 You'll need a Google account to work with Google Sheets. 11 00:00:34,870 --> 00:00:38,580 If you don't have one, it's easy and free to sign up. 12 00:00:38,580 --> 00:00:43,077 Here, we can create a new spreadsheet by clicking the blank section under Start 13 00:00:43,077 --> 00:00:44,225 a new spreadsheet. 14 00:00:47,431 --> 00:00:51,482 For this course, we'll be looking at the results of the Boston Marathon, 15 00:00:51,482 --> 00:00:54,510 which is a popular race in the United States. 16 00:00:54,510 --> 00:00:59,055 So for the first step, let's click up here on Untitled Spreadsheet and 17 00:00:59,055 --> 00:01:02,023 change the title to Boston Marathon Results. 18 00:01:04,915 --> 00:01:08,460 Awesome, next we need to import the data. 19 00:01:08,460 --> 00:01:11,951 It's available down below in the teacher's notes as a CSV file. 20 00:01:14,164 --> 00:01:17,459 CSV stands for Comma Separated Values, and 21 00:01:17,459 --> 00:01:20,241 if we open up the file in a text editor, 22 00:01:23,816 --> 00:01:28,210 we can see that each line of the file represents a different runner. 23 00:01:28,210 --> 00:01:32,670 And each piece of data about that runner is separated by a comma. 
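To make the CSV layout concrete, here is a minimal Python sketch of what "one line per runner, fields separated by commas" looks like when read by a program. The sample text, column headers, and runner names are invented for illustration and are not taken from the course's actual file; `parse_csv` is a hypothetical helper name.

```python
import csv
import io

# Hypothetical sample data: these headers and values are made up for
# illustration and are NOT copied from the real marathon results file.
SAMPLE_CSV = """Bib,Name,Age,Official Time
101,Runner One,24,2:09:37
102,Runner Two,30,2:09:58
"""


def parse_csv(text):
    """Return (header, rows); each row is a list of field strings."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)   # first line holds the column names
    rows = list(reader)     # every remaining line is one runner
    return header, rows


header, rows = parse_csv(SAMPLE_CSV)
for row in rows:
    # Pair each field with its column name so the record is easy to read.
    print(dict(zip(header, row)))
```

Using the standard `csv` module rather than splitting on commas by hand means quoted fields that contain a comma would still be parsed as a single value.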
24 00:01:32,670 --> 00:01:37,240 CSV files are easy to understand and easy to deal with. 25 00:01:37,240 --> 00:01:41,550 You'll be seeing lots of CSV files as a data analyst. 26 00:01:41,550 --> 00:01:47,223 Once you've downloaded the marathon_results_2017.csv file, 27 00:01:47,223 --> 00:01:52,369 back in Google Sheets, choose File, Import, 28 00:01:52,369 --> 00:01:56,575 and then on the Upload tab, drag in your 29 00:01:56,575 --> 00:02:01,428 marathon_results_2017.csv file. 30 00:02:02,979 --> 00:02:07,290 We then need to choose an import action and a separator character. 31 00:02:07,290 --> 00:02:11,430 For the import action, since all we've got is an empty sheet, 32 00:02:11,430 --> 00:02:15,038 let's just choose Append rows to current sheet. 33 00:02:15,038 --> 00:02:18,424 And for the separator character, we could choose Comma, 34 00:02:18,424 --> 00:02:22,200 but Detect automatically will work just fine. 35 00:02:22,200 --> 00:02:24,370 So let's leave it as is, and click Import. 36 00:02:25,650 --> 00:02:31,573 It might take a minute to import the more than 25,000 finishers of the Boston Marathon. 37 00:02:31,573 --> 00:02:36,178 But after it does, it should look something like this, and to make it a little easier 38 00:02:36,178 --> 00:02:40,150 to see on my screen, I'm going to enter presentation mode. 39 00:02:40,150 --> 00:02:44,390 In this data set, each row represents a single finisher, and 40 00:02:44,390 --> 00:02:49,360 each column represents a discrete piece of information about that finisher. 41 00:02:50,360 --> 00:02:53,590 Let's look through the data we have for each finisher. 42 00:02:53,590 --> 00:02:57,823 First off, we've got an unused column that seems to be a line number, 43 00:02:57,823 --> 00:02:59,280 followed by their Bib number, 44 00:02:59,280 --> 00:03:02,820 which is the number the runner was wearing during the race. 
45 00:03:02,820 --> 00:03:06,263 Next, we've got the runner's Name, Age, and 46 00:03:06,263 --> 00:03:09,988 whether the runner registered as male or female. 47 00:03:09,988 --> 00:03:13,360 After that, we've got some geographic information: 48 00:03:13,360 --> 00:03:16,546 City, State, Country, and even Citizenship for 49 00:03:16,546 --> 00:03:19,593 athletes that live outside their home country. 50 00:03:22,528 --> 00:03:26,073 Next, we have an empty column followed by the runners' times at 51 00:03:26,073 --> 00:03:28,990 various intervals throughout the race. 52 00:03:28,990 --> 00:03:30,793 K here stands for kilometers. 53 00:03:34,040 --> 00:03:37,206 Then, we've got the runner's pace per mile, 54 00:03:37,206 --> 00:03:42,700 an empty projected time column, and the official time of the runner. 55 00:03:42,700 --> 00:03:46,600 Finally, at the end, we've got the runner's overall ranking, 56 00:03:46,600 --> 00:03:50,890 ranking within their gender, and the ranking within their division. 57 00:03:50,890 --> 00:03:53,990 It's a lot of data, but I'm sure we'll be able to analyze it.
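The walkthrough above points out a couple of empty columns in the data. As a rough sketch of how you might spot those programmatically, the Python below reads a small CSV and reports which columns have no values in any row. The sample data and column names (such as `Proj Time`) are assumptions made up for this example, and `empty_columns` is a hypothetical helper, not part of the course materials.

```python
import csv
import io

# Hypothetical sample: the column names and values are invented for
# illustration; the real file's headers may differ.
SAMPLE_CSV = """Bib,Name,Age,Proj Time,Official Time,Overall
101,Runner One,24,,2:09:37,1
102,Runner Two,30,,2:09:58,2
"""


def empty_columns(text):
    """Return the names of columns whose value is blank in every row."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)  # each row maps column name -> field string
    return [
        name
        for name in reader.fieldnames
        # `or ""` guards against short rows, where missing fields are None.
        if all((row[name] or "").strip() == "" for row in rows)
    ]


print(empty_columns(SAMPLE_CSV))
```

Columns flagged this way are candidates to hide or delete in the spreadsheet before starting any analysis.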