Al Craig
22,220 Points

Probability: "With zero being a complete guess". Is this right? I think zero probability is certainty of non-occurrence.

I think the 'complete guess' situation would be most analogous to a probability of NULL (i.e. unquantified probability value).

3 Answers

Nick Pettit
STAFF
Treehouse Teacher

Thanks for pointing this out! This gets into probability theory and statistics, which are deep topics beyond the scope of an overview course. My goal was to highlight the probabilistic nature of ML models while still prioritizing the compact nature of the lessons, and sometimes that's challenging to do without generalizing or betraying the proper semantics of some explanations.

You are correct that a probability of zero means an event cannot occur. Using the example of a single die again, the sample space of outcomes can be expressed as {1, 2, 3, 4, 5, 6}, so the probability of rolling a zero is itself zero: it never occurs.
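To make the die example concrete, here is a minimal sketch that counts favorable outcomes over the sample space. It assumes a fair die (every face equally likely), and the helper name `probability` is made up for illustration.

```python
from fractions import Fraction

# Sample space for a single six-sided die, as described above.
# Assumption: the die is fair, so every face is equally likely.
SAMPLE_SPACE = [1, 2, 3, 4, 5, 6]


def probability(outcome, sample_space=SAMPLE_SPACE):
    """Probability of `outcome`: favorable faces divided by total faces."""
    favorable = sum(1 for face in sample_space if face == outcome)
    return Fraction(favorable, len(sample_space))


print(probability(2))  # 1/6
print(probability(0))  # 0, because 0 is not in the sample space
print(probability(7))  # 0, because 7 is not in the sample space
```

An outcome outside the sample space gets probability zero because nothing in the space matches it, which is the "certainty of non-occurrence" reading from the original question.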

Most performance analysis of ML models stems from the fact that training sets are finite and therefore any new example might not perfectly match an existing example. It's common to impose probabilistic bounds on a model. For more on this, see this probability calibration page from the scikit-learn documentation.

In the case of a classifier, probability is used as a measure of confidence in the prediction. Put another way, we're not necessarily asking about accuracy, but rather we're asking something to the effect of, "What is the probability that this answer is accurate?"
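As a rough illustration of probability as confidence, the sketch below turns raw class scores into probabilities with a softmax and reports the largest one as the confidence. The scores, labels, and function names are hypothetical, and softmax is only one common way to get such values; the course does not say which method it has in mind.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)
    # Subtracting the largest score avoids overflow in exp().
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(scores, labels):
    """Return the most likely label and the probability assigned to it."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


labels = ["cat", "dog"]  # hypothetical classes

# One score clearly dominates: confidence is about 0.98.
print(predict_with_confidence([4.0, 0.0], labels))

# Equal scores: confidence is 0.5, no better than a coin flip.
print(predict_with_confidence([0.0, 0.0], labels))
```

In this two-class sketch, the least informed prediction has a confidence of 0.5 rather than 0, since the probabilities across classes must sum to 1.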

Hopefully that makes sense. Again, I genuinely appreciate this type of feedback. It's hard to break down complicated concepts in an overview course, and I don't always get it right or make explanations as clear as they need to be. I am by no means an expert in probability and statistics (I'm a programmer and creator), so I welcome any additional questions or corrections you might have. :)

Steven Parker
230,274 Points

The video mentions two applications of probability:

  1. a means of expressing how likely it is that an event will occur, or
  2. a way of measuring how close a value might be to the actual correct value.

I'd agree that "zero probability is certainty of non-occurrence" when talking about the first type. But the scale where 0 represents a "complete guess" (as shown in the video) would be for the 2nd type.

Following the scale with the die-rolling example was a bit confusing, since die rolling is an example of the first type. The probability of rolling a "2" would be 1/6 (about 0.167), and the probability of rolling a "7" would be 0.

Al Craig
22,220 Points

My understanding of the second example is that this measure is not properly 'probability' but rather would be 'accuracy' (possibly combined also with 'precision').

Steven Parker
230,274 Points

You might want to report this to Support. If they agree they might post a correction in the Teacher's Notes. And you'll get an "Exterminator" badge. :beetle:

Al Craig
22,220 Points

Having thought about it a little further, I think the video is confusing 'p-value' with 'probability'. These are closely linked concepts but not the same thing. I'm not really into collecting badges for the sake of badges so I will leave things here.

Steven Parker
230,274 Points

The badge is just a way they thank you for your help. The real reason I encourage you to report this is to bring it to staff attention so they can make changes that will give other students more complete and correct information.

Al Craig
22,220 Points

And the subscription fee is my way to thank the staff for providing students with complete and correct information.

Steven Parker
230,274 Points

Sure, but ultimately the benefit of correcting the course will be to the students, including those willing to read and respond to forum posts such as this one.

No need to bother this time, though. I've already given the staff a "heads-up" on this issue.

The value does not in any way refer to what the model's guess might be. It simply refers to how confident the model is in its guess.