Igors Smirnovs
10,441 Points
Loading raw data - Data Science in Python
Hi,
In the second part of the video Loading raw data (Data Science), we define a DATATYPES variable while using the NumPy library.
e.g. DATATYPES = [('price', 'f8'), ('name', 'a200'), ('brandId', '<i8') ...]
Where do these types come from? Why do we use them? The video briefly mentions what they mean, but it would be nice to expand on that. Why do they take these particular forms (f8, |a200, |s500, etc.)?
Thanks ;)
2 Answers
Chris Freeman
Treehouse Moderator 68,468 Points
The data types are defined in the NumPy Spec:
The first character specifies the kind of data and the remaining characters specify the number
of bytes per item, except for Unicode, where it is interpreted as the number of characters. The
item size must correspond to an existing type, or an error will be raised. The supported kinds are
'b' boolean
'i' (signed) integer
'u' unsigned integer
'f' floating-point
'c' complex-floating point
'O' (Python) objects
'S', 'a' (byte-)string
'U' Unicode
'V' raw data (void)
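The number is the item size in bytes (or in characters for Unicode). So 'f8' is an 8-byte floating-point number, and 'a200' is a 200-byte string. A leading '<' or '|', as in '<i8', specifies the byte order: '<' is little-endian and '|' means byte order does not apply.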
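To see what these codes mean in practice, here is a small sketch using the three fields from the question. I've written 'S200' instead of 'a200' (the list above gives 'S' and 'a' as the same kind, and as far as I know newer NumPy releases deprecate the 'a' spelling). The names dt and field are just for illustration and are not from the video.

```python
import numpy as np

# Three fields from the question; the real DATATYPES list has more entries.
# 'S200' stands in for 'a200' (assumption: same byte-string kind).
DATATYPES = [('price', 'f8'), ('name', 'S200'), ('brandId', '<i8')]

# Build a structured dtype from the list of (field name, type code) tuples.
dt = np.dtype(DATATYPES)

# For each field, show the kind character and the size in bytes.
for field_name in dt.names:
    field = dt[field_name]
    print(field_name, field.kind, field.itemsize)

# Size of one whole record: 8 + 200 + 8 bytes.
print(dt.itemsize)

# For Unicode the number counts characters, not bytes:
# NumPy stores 4 bytes per character, so 'U10' takes 40 bytes.
print(np.dtype('U10').itemsize)
```

The fixed sizes are what let NumPy pack every record into the same number of bytes, which is why each column needs an explicit type and width.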
Peter Lodri
6,757 Points
I see, sorry for the irrelevant answer :)
Peter Lodri
6,757 Points
It's a list of tuples. "A tuple is a sequence of immutable Python objects. Tuples are sequences, just like lists. The differences between tuples and lists are, the tuples cannot be changed unlike lists and tuples use parentheses, whereas lists use square brackets."
And these tuples are contained in a list, which means you can easily iterate over them.
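A minimal sketch of such a loop, assuming only the three fields shown in the question (the loop variable names field_name and type_code are made up for illustration):

```python
# Shortened version of the list from the question.
DATATYPES = [('price', 'f8'), ('name', 'a200'), ('brandId', '<i8')]

# Each element is a (field name, type code) tuple, so it can be unpacked directly.
field_names = []
for field_name, type_code in DATATYPES:
    field_names.append(field_name)
    print(field_name)
```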
Printing the first item of each tuple gives you 'price', 'name', 'brandId', etc. So you see, you can really think of them as a list in a list. As for the data you work with, I can't comment on it; it's just raw data, so don't worry about where it comes from or what it means unless you need that info :)
Hope this helped!