Python

Identifying what kind of data is in a column of a csv

I haven't started this project yet and am still working out how to approach it. I am very new to Python, so I don't have many tools at my disposal. I want to take data from a csv file and write it into a text document. The problem is that the data has to be formatted very specifically, and each csv being imported may look different.

For example:

CSV1 may look like: First Name, Last Name, Email, Telephone Number, Zip Code

But CSV2 may look like: Email, Telephone Number, Last Name, First Name, Zip Code

But in the end I need both to go into the txt file as: First Name, Last Name, Zip Code, Telephone Number, Email

At first I was going to say first_name = row[0], last_name = row[1], etc. But that obviously won't work if the csv layouts are not the same. Does anyone have any suggestions?

1 Answer

Jeff Muday
Treehouse Moderator 28,249 Points

You should look into Python's csv module. In particular, you would probably be interested in csv.DictReader. It maps each row to a dictionary keyed by the names in the header row, so you can look values up by column name no matter what order the columns appear in. I think this is what you were asking about.

As long as the first line of each file has consistent fieldnames with the same case, you could write this in fewer than 10 lines of code. A more advanced approach might use the pandas library, but that is something you would only want to tackle after you have mastered basic CSV reading/writing and Python dictionaries.
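To make that concrete, here is a minimal sketch of the idea, not a finished solution. The function name reorder_csv and the OUTPUT_FIELDS list are made up for illustration, and the code assumes every file's header row uses exactly the column names from the question. Passing skipinitialspace=True is also an assumption: it handles headers written with a space after each comma, as in the examples above.

```python
import csv

# Desired output order. Assumption: these names match the CSV header row exactly.
OUTPUT_FIELDS = ["First Name", "Last Name", "Zip Code", "Telephone Number", "Email"]


def reorder_csv(csv_path, txt_path):
    """Write each row of csv_path to txt_path with columns in OUTPUT_FIELDS order."""
    with open(csv_path, newline="") as src, open(txt_path, "w", newline="") as dst:
        # skipinitialspace=True drops the space that follows each comma,
        # so " Last Name" in the header is read as "Last Name".
        reader = csv.DictReader(src, skipinitialspace=True)
        for row in reader:
            # Look up each value by column name, not by position.
            dst.write(", ".join(row[name] for name in OUTPUT_FIELDS) + "\n")
```

Calling reorder_csv("input.csv", "output.txt") (both filenames are placeholders) gives the same output layout whether the input looks like CSV1 or CSV2. A header that is missing one of the expected names raises a KeyError, which is usually what you want while you are still testing.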

From the documentation (https://docs.python.org/3/library/csv.html):

class csv.DictReader(f, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)

Create an object that operates like a regular reader but maps the information in each row to an OrderedDict whose keys are given by the optional fieldnames parameter.

The fieldnames parameter is a sequence. If fieldnames is omitted, the values in the first row of file f will be used as the fieldnames. Regardless of how the fieldnames are determined, the ordered dictionary preserves their original ordering.
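The two ways of supplying fieldnames described in that passage can be sketched as follows. The sample data is made up, and io.StringIO stands in for a real open file. Note that the quoted text comes from an older version of the documentation: since Python 3.8, DictReader returns plain dict objects instead of OrderedDict, though looking values up by column name works the same way.

```python
import csv
import io

# Case 1: no fieldnames argument, so the first row is used as the header.
with_header = io.StringIO("Email,Last Name,First Name\nann@example.com,Lee,Ann\n")
reader = csv.DictReader(with_header)
header_names = reader.fieldnames  # reading this attribute consumes the header row
first_row = next(reader)          # the first data row, keyed by header name

# Case 2: the file has no header row, so the names are passed in explicitly.
no_header = io.StringIO("ann@example.com,Lee,Ann\n")
reader2 = csv.DictReader(no_header, fieldnames=["Email", "Last Name", "First Name"])
explicit_row = next(reader2)      # the first line is treated as data here

print(header_names)
print(first_row["First Name"], explicit_row["Last Name"])
```

For the original question the first case is the relevant one: each csv carries its own header row, so DictReader works out the column order for you.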