1 00:00:00,690 --> 00:00:05,690 A common data structure built into nearly every programming language is the array. 2 00:00:05,690 --> 00:00:07,980 Arrays are a fundamental data structure, and 3 00:00:07,980 --> 00:00:10,550 can be used to represent a collection of values. 4 00:00:10,550 --> 00:00:12,480 But it is much more than that. 5 00:00:12,480 --> 00:00:16,922 Arrays are also used as building blocks to create even more custom data types and 6 00:00:16,922 --> 00:00:17,732 structures. 7 00:00:17,732 --> 00:00:22,604 In fact in most programming languages, text is represented using the string type. 8 00:00:22,604 --> 00:00:23,904 And under the hood, 9 00:00:23,904 --> 00:00:29,040 strings are just a bunch of characters stored in a particular order in an array. 10 00:00:29,040 --> 00:00:34,670 Before we go further and dig into arrays, what exactly is a data structure? 11 00:00:34,670 --> 00:00:37,850 A data structure is a way of storing data when programming. 12 00:00:37,850 --> 00:00:42,020 It's not just a collection of values and the format they are stored in, but 13 00:00:42,020 --> 00:00:44,760 the relationship between the values in the collection, 14 00:00:44,760 --> 00:00:49,380 as well as the operations applied on the data stored in the structure. 15 00:00:49,380 --> 00:00:52,440 An array is one of very many data structures. 16 00:00:52,440 --> 00:00:57,430 In general, an array is a data structure that stores a collection of values 17 00:00:57,430 --> 00:01:00,770 where each value is referenced using an index or a key. 18 00:01:01,898 --> 00:01:06,720 A common analogy for thinking about arrays is as a set of train cars. 19 00:01:06,720 --> 00:01:10,190 Each car has a number, and these cars are ordered sequentially. 20 00:01:10,190 --> 00:01:14,703 Inside each car, the array, or the train in this analogy stores some data. 21 00:01:14,703 --> 00:01:18,416 While this is the general representation of an array, 22 00:01:18,416 --> 00:01:22,058 it can differ slightly from one language to another. 23 00:01:22,058 --> 00:01:25,458 But for the most part, all these fundamentals remain the same. 24 00:01:25,458 --> 00:01:30,208 In a language like Swift or Java, arrays are homogeneous containers, 25 00:01:30,208 --> 00:01:34,097 which means they can only contain values of the same type. 26 00:01:34,097 --> 00:01:39,483 If you use an array to store integers in Java, it can only store integers. 27 00:01:39,483 --> 00:01:40,762 In other languages, 28 00:01:40,762 --> 00:01:45,031 arrays are heterogeneous structures that can store any kind of value. 29 00:01:45,031 --> 00:01:49,230 In Python, for example, you can mix numbers and text with no issues. 30 00:01:49,230 --> 00:01:51,597 Now regardless of this nuance, 31 00:01:51,597 --> 00:01:55,285 the fundamental concept of an array is the index. 32 00:01:55,285 --> 00:01:58,054 This index value is used for every operation on 33 00:01:58,054 --> 00:02:03,420 the array from accessing values, to inserting, updating, and deleting. 34 00:02:03,420 --> 00:02:05,930 In Python, the language we're going to be using for 35 00:02:05,930 --> 00:02:09,000 this course is a tiny bit confusing. 36 00:02:09,000 --> 00:02:13,340 The type that we generally refer to as an array in most languages is 37 00:02:13,340 --> 00:02:17,180 best represented by the least type in Python. 38 00:02:17,180 --> 00:02:21,040 Python does have a type called array as well, but it's something different. 39 00:02:21,040 --> 00:02:22,694 So, we're not going to use it. 40 00:02:22,694 --> 00:02:26,359 While Python calls it a list, when we use a list in this course, 41 00:02:26,359 --> 00:02:31,148 we'll be talking about concepts that apply to arrays as well in other languages. 42 00:02:31,148 --> 00:02:33,344 So definitely don't skip any of this. 43 00:02:33,344 --> 00:02:34,886 There's one more thing. 44 00:02:34,886 --> 00:02:40,019 In computer science, a list is actually a different data structure than an array and 45 00:02:40,019 --> 00:02:43,651 in fact we're going to build a list later on in this course. 46 00:02:43,651 --> 00:02:48,812 Generally though this structure is called a linked list as opposed to just list. 47 00:02:48,812 --> 00:02:51,720 So hopefully the terminology isn't too confusing. 48 00:02:51,720 --> 00:02:54,251 To properly understand how arrays work, 49 00:02:54,251 --> 00:02:58,630 let's take a peek at how arrays are stored under the hood. 50 00:02:58,630 --> 00:03:01,660 An array is a contiguous data structure. 51 00:03:01,660 --> 00:03:05,750 This means that the array is stored in blocks of memory that are right beside 52 00:03:05,750 --> 00:03:08,090 each other with no gaps. 53 00:03:08,090 --> 00:03:12,290 The advantage of doing this is that retrieving values is very easy. 54 00:03:12,290 --> 00:03:16,130 In a non-contiguous data structure, we're going to build one soon. 55 00:03:16,130 --> 00:03:21,088 The structure stores a value, as well as a reference to where the next value is. 56 00:03:21,088 --> 00:03:24,730 To retrieve that next value, the language has to follow that reference, 57 00:03:24,730 --> 00:03:28,140 also called a pointer, to the next block of memory. 58 00:03:28,140 --> 00:03:31,017 This adds some overhead, which as you will see, 59 00:03:31,017 --> 00:03:33,762 increases the run time of common operations. 60 00:03:33,762 --> 00:03:37,137 A second ago, I mentioned that depending on the language, 61 00:03:37,137 --> 00:03:41,256 a race can either be homogeneous, containing the same type of value, or 62 00:03:41,256 --> 00:03:44,385 heterogeneous, where any kind of value can be mixed. 63 00:03:44,385 --> 00:03:48,120 This choice also affects the memory layout of the array. 64 00:03:48,120 --> 00:03:53,500 For example, in a language like C, Swift, or Java, where arrays are homogeneous. 65 00:03:53,500 --> 00:03:55,161 When an array is created, 66 00:03:55,161 --> 00:03:59,084 since the kind of value is known to the language compiler, and 67 00:03:59,084 --> 00:04:03,988 you can think of the compiler as the brains behind the language, it can choose 68 00:04:03,988 --> 00:04:08,768 a contiguous block of memory that fits the arrays size and values created. 69 00:04:08,768 --> 00:04:12,838 If the values were integers, assuming an integer took up space, 70 00:04:12,838 --> 00:04:16,984 represented by one of these blocks, then for a five-item array, 71 00:04:16,984 --> 00:04:20,995 the compiler can allocate five blocks of equally sized memory. 72 00:04:20,995 --> 00:04:24,042 In Python, however, this not the case. 73 00:04:24,042 --> 00:04:26,205 We can put any value in a Python list. 74 00:04:26,205 --> 00:04:28,272 There's no restriction. 75 00:04:28,272 --> 00:04:32,843 The way this works is a combination of contiguous memory, and the pointers or 76 00:04:32,843 --> 00:04:34,895 references I mentioned earlier. 77 00:04:34,895 --> 00:04:36,926 When we create a list in Python, 78 00:04:36,926 --> 00:04:40,690 there is no information about what will go into that array, 79 00:04:40,690 --> 00:04:45,001 which makes it hard to allocate contiguous memory of the same size. 80 00:04:45,001 --> 00:04:48,281 There are several advantages to having contiguous memory. 81 00:04:48,281 --> 00:04:51,287 Since the values are stored beside each other, 82 00:04:51,287 --> 00:04:54,831 accessing the values happens in almost constant time. 83 00:04:54,831 --> 00:04:58,083 So this is a characteristic we want to preserve. 84 00:04:58,083 --> 00:05:02,628 The way Python gets around this is by allocating contiguous memory and 85 00:05:02,628 --> 00:05:06,788 storing in it, not the value we want to store, but a reference or 86 00:05:06,788 --> 00:05:10,968 a pointer to the value that's stored somewhere else in memory. 87 00:05:10,968 --> 00:05:15,867 By doing this, it can allocate equally sized contiguous memory since regardless 88 00:05:15,867 --> 00:05:20,842 of the value size, the size of the pointer to that value is always going to be equal. 89 00:05:20,842 --> 00:05:25,024 This incurs an additional cost in that when a value is accessed, we need 90 00:05:25,024 --> 00:05:29,707 to follow the pointer to the block of memory where the value is actually stored. 91 00:05:29,707 --> 00:05:33,967 But Python has ways of dealing with these costs that are outside the scope of this 92 00:05:33,967 --> 00:05:34,485 course. 93 00:05:34,485 --> 00:05:37,215 Now that we know how an array stores its values, 94 00:05:37,215 --> 00:05:40,777 let's look at common operations that we execute on an array.