

Particular value seems arbitrary.

When choosing the value for Max Difference in this video, I can't see any reason why this particular number was chosen. Can someone provide more details as to why he would halve the runners per age to obtain the max difference?

1 Answer

codyl
4,704 Points

The number of runners per age was ~400. You get that by taking the difference between the oldest and youngest runners' ages and dividing the total number of runners by it. That means if the youngest was 18, and there were 400 18-year-olds, 400 19-year-olds, and 400 each of 20, 21, 22, 23, etc. up to 84, that would account for all the runners and be a perfectly even distribution. So Max Difference couldn't sensibly be over 400, and is unlikely to even be near it.
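Here's a rough sketch of that arithmetic in Python. The total of 26,800 is a made-up figure I picked so the result lands on 400; it is not the real count from the video, and `expected_per_age` is just a hypothetical helper name. It counts both endpoint ages, so 18 to 84 is 67 ages rather than 66.

```python
def expected_per_age(total_runners, youngest, oldest):
    """Runners per age if every age from youngest to oldest had the same count."""
    n_ages = oldest - youngest + 1  # count both endpoint ages
    return total_runners / n_ages

# Assumed numbers for illustration only: ages 18 to 84, 26,800 runners.
print(expected_per_age(26_800, 18, 84))  # 400.0
```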

From there it's just some assumptions as to what's too low. If an age had 100 runners, it might seem underrepresented, since that's only 1/4 of what it would have under an 'even' distribution of the total number of runners.

So I think 200 was picked because it just happens to be sort of in the middle, half of what each age would have under a 100% even distribution, and a 'good enough' assumption to use for the lesson.
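To make the "half of the even share" idea concrete, here's a minimal sketch. I don't know exactly how the video applies Max Difference, so this just treats half of the even share as a cutoff for "too low". The per-age counts are invented, and `underrepresented_ages` is a hypothetical name.

```python
def underrepresented_ages(counts, even_share, fraction=0.5):
    """Return the ages whose runner count is below fraction * even_share."""
    cutoff = even_share * fraction  # 200 when even_share is 400
    return sorted(age for age, n in counts.items() if n < cutoff)

# Invented counts, for illustration only.
counts = {18: 390, 19: 100, 20: 210, 21: 199}
print(underrepresented_ages(counts, even_share=400))  # [19, 21]
```

Changing `fraction` to 0.25 would flag only ages below 100, which shows how much the result depends on that one judgment call.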