Eric Hodgins
29,207 Points

numpy.genfromtxt() datatype error

I tried following along but when I run the program I get an error.

TypeError: data type "myInt" not understood

The only way I could get it to run was to set dtype=None. I'm not sure what I did wrong. I ran it both locally and in Workspaces. Any guidance or help is much appreciated!

Thanks, Eric

3 Answers

I copied and pasted the DATATYPES list of tuples from s2v2.py and was getting the same error until I looked back through the traceback in the error and realised that it was complaining about the data type for 'description', which was set as '|900'.

I changed that to '|S900', matching the other string fields and it all works now.
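
For anyone hitting the same error, here is a minimal sketch of how each type string could be checked on its own before the list ever reaches genfromtxt. The helper name find_bad_fields and the shortened three-field lists are made up for illustration; only the '|900' versus '|S900' strings come from the post above.

```python
import numpy as np


def find_bad_fields(spec):
    """Return the names of fields whose type string numpy.dtype rejects.

    `spec` is a list of (field_name, type_string) tuples, like DATATYPES.
    """
    bad = []
    for name, type_string in spec:
        try:
            np.dtype(type_string)
        except (TypeError, ValueError, SyntaxError):
            # numpy could not interpret this type string
            bad.append(name)
    return bad


# Hypothetical, shortened field lists; the course file has more entries.
broken = [("name", "|S200"), ("description", "|900"), ("pattern", "|S50")]
fixed = [("name", "|S200"), ("description", "|S900"), ("pattern", "|S50")]

print("broken list:", find_bad_fields(broken))
print("fixed list:", find_bad_fields(fixed))
```

Checking field by field points at the offending entry directly, which is easier than reading it out of the genfromtxt traceback.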

Eric Hodgins
29,207 Points

Thanks Iain! That fixed it for me.

No problem!

In general, there seem to be a few issues in this course with code differing between files and between the downloaded and workspace copies. It might be worth comparing multiple sources when something goes wrong.

Eric Hodgins
29,207 Points

Thanks for the pointer, I'll definitely keep that in mind next time.

David Bentzon-Ehlers
1,438 Points

I think I got this figured out. After spending a couple of hours trying to work out why the TypeError: data type "<insert your data type here>" not understood occurred, I tried looking at the NumPy docs. Apparently, you have to write numpy.dtype before creating your "array protocol string". Your code should look something like this:

DATATYPES = numpy.dtype([
    ('myint', 'i'), ('myid', 'i'), ('price', 'f8'),
    ('name', 'a200'), ('brandId', '<i8'), ('brandName', 'a200'),
    ('imageUrl', '|S500'), ('description', '|S900'), ('vender', '|S100'),
    ('pattern', '|S50'), ('material', '|S50'),
])

You cannot just leave that part out, even though Dr Kat does exactly that in her video.
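
As a rough sketch of this approach, the snippet below builds the numpy.dtype first and then hands it to genfromtxt. The three-field list and the in-memory CSV text are assumptions made up for the example, not the course data. For comparison it also passes the bare list of tuples, which genfromtxt accepts as well; building the dtype up front mainly means a malformed type string fails before any file is read.

```python
import io

import numpy as np

# Hypothetical two-row sample with three columns; the course CSV has more.
CSV_TEXT = "1,9.99,shirt\n2,19.50,scarf\n"

# Shortened field list (an assumption for this sketch).
FIELDS = [("myint", "i"), ("price", "f8"), ("name", "|S200")]

# Building the dtype explicitly raises TypeError right here
# if any type string in FIELDS is malformed.
DATATYPES = np.dtype(FIELDS)

# Same data read twice: once with the dtype object, once with the bare list.
from_dtype = np.genfromtxt(io.StringIO(CSV_TEXT), delimiter=",", dtype=DATATYPES)
from_list = np.genfromtxt(io.StringIO(CSV_TEXT), delimiter=",", dtype=FIELDS)

print(from_dtype.dtype)
print(from_dtype["name"])
print(from_dtype.dtype == from_list.dtype)
```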

Brian Verdi
10,170 Points

Thanks. That helped a lot.

Chris Freeman
Treehouse Moderator 68,454 Points

According to the numpy.genfromtxt documentation (http://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html):

numpy.genfromtxt(fname, dtype=<type 'float'>, comments='#', delimiter=None, skiprows=0, skip_header=0, skip_footer=0, converters=None, missing='', missing_values=None, filling_values=None, usecols=None, names=None, excludelist=None, deletechars=None, replace_space='_', autostrip=False, case_sensitive=True, defaultfmt='f%i', unpack=None, usemask=False, loose=True, invalid_raise=True)

Load data from a text file, with missing values handled as specified.

Each line past the first skip_header lines is split at the delimiter character, and characters following the comments character are discarded.

Parameters:

  • fname : file or str -- File, filename, or generator to read. If the filename extension is gz or bz2, the file is first decompressed. Note that generators must return byte strings in Python 3k.
  • dtype : dtype, optional -- Data type of the resulting array. If None, the dtypes will be determined by the contents of each column, individually.
  • ...

By setting dtype to None, the type of each column is derived from its contents, which apparently works.
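
A small sketch of what that looks like in practice, assuming a made-up three-column CSV held in memory (the column names and values are illustrative only). The encoding argument is passed explicitly so that text columns come back as str rather than bytes.

```python
import io

import numpy as np

# Hypothetical sample data with a header row.
csv_text = "myint,price,name\n1,9.99,shirt\n2,19.5,scarf\n"

# dtype=None lets genfromtxt infer each column's type from its contents;
# names=True takes the field names from the header row.
data = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    names=True,
    dtype=None,
    encoding="utf-8",
)

print(data.dtype)     # inferred structured dtype, one field per column
print(data["price"])  # columns are accessed by field name
```

The trade-off is that inferred types can change with the data, so an explicit dtype is still the safer choice once the type strings are correct.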

In the video Loading Raw Data at 9:23, Dr Kat pastes in a DATATYPES list of tuples where each tuple contains a 'heading' and a 'type'. The first one listed is ('myint', 'i'). In her code, dtype=DATATYPES.

Eric Hodgins
29,207 Points

Thanks for the response, Chris, but I tried that. I basically copied it line by line, and when it didn't work I set dtype to None. I guess I'll just keep going; I really have no idea why it's doing that.