1 00:00:00,270 --> 00:00:03,970 Let's get our soccer game data into a more workable format. 2 00:00:03,970 --> 00:00:07,230 We can try to break up this big file content string 3 00:00:07,230 --> 00:00:10,200 by using some helpful methods on the string class. 4 00:00:10,200 --> 00:00:13,663 Our string actually has some hidden characters in it. 5 00:00:13,663 --> 00:00:16,291 In a text document like data.text, 6 00:00:16,291 --> 00:00:21,240 when I press the Enter key, the cursor moves to the next line. 7 00:00:21,240 --> 00:00:25,610 Well, that's actually being stored as a new line, or a line break character. 8 00:00:26,700 --> 00:00:29,760 Another word for it is a carriage return, which is, kind of, 9 00:00:29,760 --> 00:00:34,190 a holdover from the typewriter days when the carriage physically moved down and 10 00:00:34,190 --> 00:00:38,510 to the beginning of the next line, like it does in our text document. 11 00:00:38,510 --> 00:00:43,140 In some systems, it's only one character, and in others, it's two characters. 12 00:00:43,140 --> 00:00:45,070 And those characters can differ. 13 00:00:45,070 --> 00:00:47,940 In Unix, it's called a line feed character. 14 00:00:47,940 --> 00:00:50,600 In other systems, it's called a carriage return. 15 00:00:50,600 --> 00:00:55,080 And in Windows, it's two characters, a carriage return followed by a line feed. 16 00:00:55,080 --> 00:00:55,960 Confusing, right? 17 00:00:56,960 --> 00:00:58,870 We can use these characters to our advantage, 18 00:00:58,870 --> 00:01:03,270 though, in combination with a method on the String class called Split. 19 00:01:03,270 --> 00:01:07,244 Let's check out the documentation on String.Split. 20 00:01:07,244 --> 00:01:12,471 String.Split .NET. 21 00:01:12,471 --> 00:01:13,794 Here it is. 22 00:01:15,896 --> 00:01:20,037 Returns a string array that contains the substrings in this instance that 23 00:01:20,037 --> 00:01:24,390 are delimited by elements of a specified string or Unicode character array. 
24 00:01:25,840 --> 00:01:28,310 Notice the word delimited. 25 00:01:28,310 --> 00:01:30,980 Delimited means that something has a boundary. 26 00:01:30,980 --> 00:01:35,040 So in the case of our CSV file, each of the lines in the file 27 00:01:35,040 --> 00:01:37,530 is delimited by a newline character at the end. 28 00:01:38,740 --> 00:01:41,520 Let's try and figure out what those characters look like. 29 00:01:41,520 --> 00:01:43,820 We can set a breakpoint in Visual Studio, and 30 00:01:43,820 --> 00:01:48,720 take a peek at what the value of fileContents looks like. 31 00:01:48,720 --> 00:01:49,785 Press F5 to debug. 32 00:01:51,723 --> 00:01:55,180 We can hover over fileContents here and see its value. 33 00:01:56,270 --> 00:02:01,440 It's pretty long, but I can see some funny stuff right here, with the backslashes. 34 00:02:01,440 --> 00:02:06,790 We've got a \r\n; those must be our newline characters. 35 00:02:06,790 --> 00:02:10,080 These backslashes are indicating escape sequences. 36 00:02:10,080 --> 00:02:16,080 Like the \u in our Unicode character, and the \\ in our string directory. 37 00:02:16,080 --> 00:02:21,930 The \r\n represents two escape sequences in this string. 38 00:02:21,930 --> 00:02:27,100 The \r is a carriage return, and the \n is a newline. 39 00:02:27,100 --> 00:02:30,690 There are a handful of other escape sequences too, for quotes, and tabs, and 40 00:02:30,690 --> 00:02:31,990 other characters. 41 00:02:31,990 --> 00:02:34,790 I've linked to the full list in the notes. 42 00:02:34,790 --> 00:02:38,600 The backslash itself has to be represented by an escape sequence, 43 00:02:38,600 --> 00:02:44,190 the double backslash, since it's used to indicate an escape sequence has started. 44 00:02:44,190 --> 00:02:49,218 We can stop debugging, and let's use the Split method 45 00:02:49,218 --> 00:02:54,586 on the file content string, fileContents.Split(). 
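To make the escape sequences described above concrete, here is a minimal sketch. It assumes C# 9 or later so the statements can sit at the top level of a file; in an older project the same lines would go inside Main. The variable names are illustrative and do not come from the video.

```csharp
using System;

// "\r\n" is written with four keystrokes but holds only two characters.
string windowsNewline = "\r\n";
Console.WriteLine(windowsNewline.Length);  // 2
Console.WriteLine((int)windowsNewline[0]); // 13, the carriage return
Console.WriteLine((int)windowsNewline[1]); // 10, the line feed

// A double backslash in source code is a single backslash in the string.
string backslash = "\\";
Console.WriteLine(backslash.Length);       // 1
```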
46 00:02:54,586 --> 00:03:00,625 It'll return a string array, so let's assign it to a string[] fileLines. 47 00:03:03,760 --> 00:03:06,637 And then, we'll pass in an array of our newline characters. 48 00:03:06,637 --> 00:03:14,139 new char[] {, and so we'll do a '\r' character, 49 00:03:14,139 --> 00:03:18,136 then we'll also do a '\n'. 50 00:03:21,494 --> 00:03:23,782 Then, we can print each one of these to the console. 51 00:03:26,613 --> 00:03:32,015 foreach (var line in fileLines). 52 00:03:35,579 --> 00:03:42,090 Console.WriteLine(line). 53 00:03:44,976 --> 00:03:47,207 Let's run this and see what we get. 54 00:03:47,207 --> 00:03:49,764 Ctrl+F5. 55 00:03:52,420 --> 00:03:55,780 Well, it looks like we've got a double line break here. 56 00:03:55,780 --> 00:03:59,320 I think since we're passing it two characters, it's splitting our string at 57 00:03:59,320 --> 00:04:04,080 the carriage return, and the newline, and giving us an empty string between the two. 58 00:04:04,080 --> 00:04:06,050 That's not really ideal. 59 00:04:06,050 --> 00:04:07,490 Well, we could go in afterwards, 60 00:04:07,490 --> 00:04:11,100 and try to remove or ignore the empty string elements later. 61 00:04:11,100 --> 00:04:13,680 But let's check out the documentation on String.Split again. 62 00:04:15,570 --> 00:04:17,610 One of these overloads might help us. 63 00:04:18,990 --> 00:04:23,320 Aha, splits a string into substrings based on the characters in an array. 64 00:04:23,320 --> 00:04:28,210 You can specify whether the substrings include empty array elements. 65 00:04:28,210 --> 00:04:32,420 We can use this overload to tell it that we don't want the empty array elements. 66 00:04:32,420 --> 00:04:34,660 I bet a lot of developers had the same problem we did. 67 00:04:35,860 --> 00:04:41,605 Back in our code, we'll pass an additional 68 00:04:41,605 --> 00:04:47,041 parameter here, StringSplitOptions, 69 00:04:47,041 --> 00:04:51,869 and then, .RemoveEmptyEntries. 
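The two Split calls walked through above can be sketched side by side. This is a minimal example and not the exact code from the video: the sample string is a hypothetical stand-in for the real file contents, withEmpties and newlineChars are illustrative names, and it assumes C# 9 or later top-level statements (in an older project the same lines would go inside Main).

```csharp
using System;

// Hypothetical stand-in for the contents read from the file.
string fileContents = "line one\r\nline two\r\nline three";

char[] newlineChars = new char[] { '\r', '\n' };

// Splitting on both characters leaves an empty string
// between each '\r' and the '\n' that follows it.
string[] withEmpties = fileContents.Split(newlineChars);
Console.WriteLine(withEmpties.Length); // 5

// The overload that takes StringSplitOptions drops those empty entries.
string[] fileLines = fileContents.Split(newlineChars, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(fileLines.Length);   // 3

foreach (var line in fileLines)
{
    Console.WriteLine(line);
}
```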
70 00:04:53,672 --> 00:04:58,130 This StringSplitOptions is a type you may not have seen before. 71 00:04:58,130 --> 00:05:01,370 It's called an enum, which is short for enumeration. 72 00:05:01,370 --> 00:05:05,200 An enum is a type that defines a related set of constants. 73 00:05:05,200 --> 00:05:08,600 Let's check it out, Go To Definition. 74 00:05:10,410 --> 00:05:16,420 So this enum has two constants defined, None and RemoveEmptyEntries. 75 00:05:16,420 --> 00:05:19,250 Each of these constants has an integer value attached to it. 76 00:05:20,710 --> 00:05:25,500 Enums are pretty useful for when you have a finite set of values, like Options. 77 00:05:25,500 --> 00:05:28,240 We'll get into more enums later on in this course. 78 00:05:29,430 --> 00:05:33,009 Let's run it to see if our empty array elements are gone. 79 00:05:33,009 --> 00:05:35,900 Ctrl+F5. 80 00:05:35,900 --> 00:05:36,630 Looks like it worked. 81 00:05:38,920 --> 00:05:43,080 Now that we can read our file, next up is to do something with it. 82 00:05:43,080 --> 00:05:44,410 In the videos that follow, 83 00:05:44,410 --> 00:05:47,690 we'll parse our data into something we can use in our application.
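As a quick way to see the integer attached to each enum constant mentioned above, the values can be cast to int. This is a small sketch that again assumes C# 9 or later top-level statements; the variable names are illustrative.

```csharp
using System;

// Each enum constant is backed by an integer; a cast exposes it.
int noneValue = (int)StringSplitOptions.None;
int removeEmptyValue = (int)StringSplitOptions.RemoveEmptyEntries;

Console.WriteLine(noneValue);        // 0
Console.WriteLine(removeEmptyValue); // 1

// Printing the constant itself shows its name instead of the number.
Console.WriteLine(StringSplitOptions.RemoveEmptyEntries); // RemoveEmptyEntries
```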