1 00:00:00,590 --> 00:00:04,330 There are many different approaches, or models, in machine learning. 2 00:00:04,330 --> 00:00:07,600 But generally, they can be broken down into two 3 00:00:07,600 --> 00:00:13,010 major categories called supervised learning and unsupervised learning. 4 00:00:13,010 --> 00:00:17,310 I'm going to use a lot of new terms here that might be confusing at first. 5 00:00:17,310 --> 00:00:18,330 But don't worry, 6 00:00:18,330 --> 00:00:23,600 I'll break each one down a little further after the initial explanations. 7 00:00:23,600 --> 00:00:27,470 You might wanna pause the video periodically to read the teacher notes as 8 00:00:27,470 --> 00:00:28,920 each concept is introduced. 9 00:00:28,920 --> 00:00:33,580 You might also need to go back and rewatch parts of a video to review. 10 00:00:34,620 --> 00:00:38,800 Let's start with an important vocabulary word. 11 00:00:38,800 --> 00:00:44,150 When you hear the word model or algorithm in reference to machine learning, 12 00:00:44,150 --> 00:00:47,500 it essentially means an approach to the problem. 13 00:00:47,500 --> 00:00:53,870 A model in machine learning is just like an architectural model or a doll house. 14 00:00:53,870 --> 00:00:57,900 It's a simplification that attempts to simulate or 15 00:00:57,900 --> 00:01:01,630 demonstrate some aspect of the real world. 16 00:01:01,630 --> 00:01:06,060 Meteorologists use weather models to try and forecast the weather. 17 00:01:06,060 --> 00:01:09,590 Because even all the computing power in the world 18 00:01:09,590 --> 00:01:13,550 couldn't perfectly simulate every tiny air particle and 19 00:01:13,550 --> 00:01:17,200 temperature and pressure change that would influence the outcome. 20 00:01:17,200 --> 00:01:21,530 So instead, we can use a model to simplify the problem and 21 00:01:21,530 --> 00:01:24,310 get a useful approximation of the result. 22 00:01:25,400 --> 00:01:29,330 A model in machine learning is typically probabilistic. 
23 00:01:29,330 --> 00:01:34,190 In other words, it usually does not produce an exact result. 24 00:01:34,190 --> 00:01:39,640 Rather, it makes a prediction with a corresponding percentage of confidence. 25 00:01:39,640 --> 00:01:43,620 This isn't something you need to understand in great detail, but 26 00:01:43,620 --> 00:01:45,030 it's good to know the basics. 27 00:01:46,050 --> 00:01:51,760 Probability is a means of expressing how likely it is that an event will occur, or 28 00:01:51,760 --> 00:01:58,240 a way of measuring how close a value might be to the actual correct value. 29 00:01:58,240 --> 00:02:03,330 Probability is typically quantified as a value between 0 and 1, 30 00:02:03,330 --> 00:02:08,920 with 0 meaning the event cannot happen, and 1 being complete certainty. 31 00:02:08,920 --> 00:02:13,610 For example, if you're rolling a die and hoping to roll a 2, 32 00:02:13,610 --> 00:02:17,010 you have a 1 in 6 chance of rolling a 2, 33 00:02:17,010 --> 00:02:22,032 because the die has 6 sides, and only one of them is a 2. 34 00:02:22,032 --> 00:02:27,192 1 divided by 6 is 0.16 with the 6 repeating, 35 00:02:27,192 --> 00:02:30,401 or about a 16.7% chance. 36 00:02:30,401 --> 00:02:35,360 This is relevant to machine learning because if a prediction is 37 00:02:35,360 --> 00:02:40,590 far outside the norm, the model might have low confidence in the answer, 38 00:02:40,590 --> 00:02:44,370 because it's highly probable that it's not correct. 39 00:02:44,370 --> 00:02:47,580 If the prediction matches up almost perfectly with 40 00:02:47,580 --> 00:02:53,280 an existing example in a data set, then the confidence will be very high. 41 00:02:53,280 --> 00:02:56,750 Now, with the definition of a model in mind, 42 00:02:56,750 --> 00:03:00,760 let's take a look at the two major categories of machine learning. 43 00:03:01,830 --> 00:03:05,570 The first of the two categories for machine learning approaches or 44 00:03:05,570 --> 00:03:09,340 models is called Supervised Learning.
45 00:03:09,340 --> 00:03:12,580 Supervised learning is when a machine intelligence 46 00:03:12,580 --> 00:03:17,740 is tasked with predicting a category or a quantity. 47 00:03:17,740 --> 00:03:19,330 Predicting a category or 48 00:03:19,330 --> 00:03:24,590 a quantity corresponds to the two subcategories of supervised learning, 49 00:03:24,590 --> 00:03:30,590 which are called classification and regression, respectively. Put another way, 50 00:03:30,590 --> 00:03:33,570 a classifier looks at a piece of data and 51 00:03:33,570 --> 00:03:38,160 tries to categorize it, or, in other words, classify it. 52 00:03:38,160 --> 00:03:43,850 And a regression tries to predict a quantity or a number. 53 00:03:43,850 --> 00:03:48,480 The second of the two major categories is Unsupervised Learning. 54 00:03:48,480 --> 00:03:53,330 Unsupervised learning is when a computer analyzes unlabeled data, 55 00:03:53,330 --> 00:03:58,970 has no previous examples, and tries to identify patterns in the data. 56 00:03:58,970 --> 00:04:03,470 One of the most common subcategories of unsupervised learning is called 57 00:04:03,470 --> 00:04:08,560 clustering, which refers to models that attempt to group similar things together. 58 00:04:08,560 --> 00:04:11,010 Because learning is unsupervised, 59 00:04:11,010 --> 00:04:15,240 the model's definition of similar might be different from our own. 60 00:04:16,598 --> 00:04:20,870 Regression, classification, and clustering are not the only 61 00:04:20,870 --> 00:04:25,360 approaches in the two categories of supervised and unsupervised learning. 62 00:04:25,360 --> 00:04:26,930 There are many more, and 63 00:04:26,930 --> 00:04:31,280 you should check out the notes associated with this video for more resources. 64 00:04:31,280 --> 00:04:33,610 That's a lot of concepts all at once.
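Before moving on, the difference between a classifier and a regression can be sketched in a few lines of Python. This is only a rough illustration: the weather-style data, the function names, and the nearest-neighbor rule are all assumptions made up for this note, not the approach used later in the course.

```python
# Hypothetical labeled examples, invented for illustration.
# Each pair is (hours_of_sun, answer).
labeled_categories = [(1.0, "cloudy"), (2.0, "cloudy"), (8.0, "sunny"), (9.0, "sunny")]
labeled_quantities = [(1.0, 10.0), (2.0, 12.0), (8.0, 24.0), (9.0, 26.0)]


def classify(examples, x):
    """Classification: return the category of the closest labeled example."""
    closest = min(examples, key=lambda ex: abs(ex[0] - x))
    return closest[1]


def predict_quantity(examples, x, k=2):
    """Regression: return the average quantity of the k closest examples."""
    ranked = sorted(examples, key=lambda ex: abs(ex[0] - x))
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)


category = classify(labeled_categories, 7.5)          # a category
quantity = predict_quantity(labeled_quantities, 7.5)  # a number
print(category, quantity)
```

Both functions learn from examples that already have answers attached, which is what makes them supervised. The only difference is the kind of answer: the classifier returns a label, and the regression returns a number.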
65 00:04:33,610 --> 00:04:38,760 So now, let's focus and break things down further by thinking about 66 00:04:38,760 --> 00:04:43,750 an example application for one of these, starting with supervised learning and 67 00:04:43,750 --> 00:04:46,700 one of its subcategories, classification. 68 00:04:48,050 --> 00:04:54,300 Let's say you want to classify an email as spam or not spam. 69 00:04:54,300 --> 00:04:58,600 You could give a machine intelligence millions of email messages that 70 00:04:58,600 --> 00:05:06,140 are already labeled as not spam, and millions that are labeled as spam. 71 00:05:06,140 --> 00:05:11,590 With each example message, you would identify features of the data, like 72 00:05:11,590 --> 00:05:18,210 the subject line, the sender, the body of the email, the attachments, and so forth. 73 00:05:18,210 --> 00:05:23,480 Then, when a new email comes through, the machine intelligence can refer 74 00:05:23,480 --> 00:05:28,460 to all of the features of the spam and not-spam messages and 75 00:05:28,460 --> 00:05:33,010 decide how closely the new email matches any patterns in the data. 76 00:05:34,010 --> 00:05:37,760 Then, it assigns a category to the new message 77 00:05:37,760 --> 00:05:39,670 with some percentage of confidence. 78 00:05:41,190 --> 00:05:46,080 In this course, we're going to create our own classifier that will attempt to label 79 00:05:46,080 --> 00:05:50,400 new entries into a data set based on the existing data. 80 00:05:50,400 --> 00:05:55,149 However, for completeness, let's take a quick look at regressions and clustering. 81 00:05:57,592 --> 00:06:01,350 A regression is another type of supervised learning. 82 00:06:01,350 --> 00:06:06,980 Instead of attempting to categorize data, it tries to predict quantities. 83 00:06:06,980 --> 00:06:10,030 For example, say you're opening a restaurant, and 84 00:06:10,030 --> 00:06:13,890 you're trying to decide how dishes should be priced.
85 00:06:13,890 --> 00:06:19,210 A regression algorithm could look at a data set of other restaurants in the area, 86 00:06:19,210 --> 00:06:24,070 and use features like the average price of a dish, the relative distance to 87 00:06:24,070 --> 00:06:29,950 the new restaurant's location, the average review score from Yelp, and so forth. 88 00:06:29,950 --> 00:06:35,609 Based on that information, the regression could try and predict appropriate prices. 89 00:06:36,870 --> 00:06:40,490 The last approach I mentioned is called clustering, 90 00:06:40,490 --> 00:06:44,990 which is one of the biggest categories of approaches to unsupervised learning. 91 00:06:46,120 --> 00:06:48,210 Have you ever been on a social network and 92 00:06:48,210 --> 00:06:52,830 been shown suggested friends or targeted advertisements? 93 00:06:52,830 --> 00:06:56,240 Or have you watched something on a video sharing site 94 00:06:56,240 --> 00:06:58,445 that shows you suggested videos? 95 00:06:58,445 --> 00:07:01,847 [SOUND] How do these websites know what to show you? 96 00:07:01,847 --> 00:07:04,540 Or what content is similar? 97 00:07:04,540 --> 00:07:07,550 A cluster analysis can attempt to group 98 00:07:07,550 --> 00:07:12,840 data by similarity without attempting to apply any type of labels. 99 00:07:12,840 --> 00:07:18,460 This cluster analysis might help to identify hidden patterns in the data, 100 00:07:18,460 --> 00:07:22,600 and automatically group things together based on features of the data. 101 00:07:23,990 --> 00:07:28,080 This was a broad overview of the different approaches to machine learning. 102 00:07:28,080 --> 00:07:33,290 And, as you can imagine, it's a deep topic with lots more to explore. 103 00:07:33,290 --> 00:07:36,950 I encourage you to check the notes associated with this video 104 00:07:36,950 --> 00:07:39,100 to help review what we've learned.
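As one item for that review, the clustering idea can be sketched in Python. This is a minimal, simplified version of the k-means technique working on a single feature. The viewing-time numbers, the starting centers, and the function name are hypothetical, and real recommendation systems are far more involved than this.

```python
def k_means_1d(values, centers, rounds=10):
    """Group numbers around starting centers (a tiny one-feature k-means)."""
    clusters = [[] for _ in centers]
    for _ in range(rounds):
        # Step 1: assign every value to its closest center.
        clusters = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # Step 2: move each center to the average of its group.
        centers = [
            sum(group) / len(group) if group else centers[i]
            for i, group in enumerate(clusters)
        ]
    return clusters


# Hypothetical minutes watched per video; note there are no labels at all.
minutes_watched = [2, 3, 4, 40, 42, 45]
groups = k_means_1d(minutes_watched, centers=[0, 50])
print(groups)
```

Nothing in the data says "short videos" or "long videos". The groups come only from how close the numbers are to each other, which is why the model's idea of similar may not match a person's.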
105 00:07:39,100 --> 00:07:41,160 I also want to mention again, 106 00:07:41,160 --> 00:07:44,610 don't worry if you're not understanding everything right away. 107 00:07:44,610 --> 00:07:49,510 In upcoming videos, we'll take a closer look at some of these concepts. 108 00:07:49,510 --> 00:07:52,890 Machine learning is a huge area of study, and 109 00:07:52,890 --> 00:07:57,090 it might take some review for you to fully absorb everything. 110 00:07:57,090 --> 00:07:59,585 Remember, you can always go back and 111 00:07:59,585 --> 00:08:02,240 rewatch videos if you feel like you need to.
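As a small review aid, the die example from the probability discussion earlier in this video can be worked through in Python. This is just a sketch of the arithmetic. The function name is made up for this note, and `Fraction` comes from Python's standard library.

```python
from fractions import Fraction


def probability(favorable_outcomes, total_outcomes):
    """Probability as favorable outcomes divided by all possible outcomes."""
    return Fraction(favorable_outcomes, total_outcomes)


# One face out of six shows a 2.
p_two = probability(1, 6)

print(p_two)                   # the exact fraction
print(f"{float(p_two):.1%}")   # the same value as a rounded percentage
```

The result always lands between 0 and 1. A value near 1 means the outcome is very likely, and a value near 0 means it is very unlikely.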