Identifying what kind of data is in a column of a csv

Question

I have not begun this project yet and am currently just theorizing about how I will go about it. I am very new to Python, so I do not have many tools at my disposal. What I want to do is take data from a csv file and write it into a text document. The problem is that the data has to be formatted very specifically, and each csv being imported may look different.

For example:

CSV1 may look like: First Name, Last Name, Email, Telephone Number, Zip Code

But CSV2 may look like: Email, Telephone Number, Last Name, First Name, Zip Code

But in the end I need both to go into the txt file as: First Name, Last Name, Zip Code, Telephone Number, Email

At first I was going to say first_name = row[0], last_name = row[1], etc. But that obviously won't work if the csv layouts are not the same. Does anyone have any suggestions?

Answer 1 · 2018-10-29T19:49:20Z

You should look into Python's csv module. In particular, you would probably be interested in csv.DictReader, which reads each row into a dictionary keyed by the names in the file's header row, so you can look up values by column name instead of by position. I think this is what you were asking about.

As long as the first line of each file has consistent field names with the same case, you could write something like this in fewer than 10 lines of code. A more advanced approach might use the pandas library, but that is something you would only want to tackle after you have mastered basic CSV reading/writing and Python dictionaries.
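A minimal sketch of that idea, assuming the header names match the ones in the question exactly. The function name reorder_csv, the OUTPUT_FIELDS list, and the sample rows are made up for illustration, and io.StringIO stands in for real files so the example runs on its own:

```python
import csv
import io

# Desired output order, taken from the question.
OUTPUT_FIELDS = ["First Name", "Last Name", "Zip Code", "Telephone Number", "Email"]


def reorder_csv(infile, outfile):
    """Write each row of infile to outfile with columns in OUTPUT_FIELDS order."""
    # skipinitialspace=True ignores the blank after each comma ("a, b" -> "a", "b"),
    # so the header " Last Name" is read as "Last Name".
    reader = csv.DictReader(infile, skipinitialspace=True)
    for row in reader:
        # Look values up by column name, so the input column order does not matter.
        outfile.write(", ".join(row[name] for name in OUTPUT_FIELDS) + "\n")


# Two hypothetical inputs with different column layouts (made-up sample data).
csv1 = io.StringIO(
    "First Name, Last Name, Email, Telephone Number, Zip Code\n"
    "Ada, Lovelace, ada@example.com, 555-0100, 12345\n"
)
csv2 = io.StringIO(
    "Email, Telephone Number, Last Name, First Name, Zip Code\n"
    "ada@example.com, 555-0100, Lovelace, Ada, 12345\n"
)

out = io.StringIO()
reorder_csv(csv1, out)
reorder_csv(csv2, out)
print(out.getvalue())
```

Both inputs come out as the same line, "Ada, Lovelace, 12345, 555-0100, ada@example.com". With real files you would pass objects from open(path, newline="") in place of the StringIO objects. A header that is missing or spelled differently raises a KeyError in this sketch.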

From the documentation: https://docs.python.org/3/library/csv.html

class csv.DictReader(f, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)

Create an object that operates like a regular reader but maps the information in each row to an OrderedDict whose keys are given by the optional fieldnames parameter.

The fieldnames parameter is a sequence. If fieldnames is omitted, the values in the first row of file f will be used as the fieldnames. Regardless of how the fieldnames are determined, the ordered dictionary preserves their original ordering.
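To illustrate the fieldnames behaviour quoted above, here is a small sketch with made-up sample data. It reads one file that has no header row, so the names are passed in by hand, and one file whose first row supplies them. Note that from Python 3.8 onward the rows are plain dict objects rather than OrderedDict, but lookup by name works the same way:

```python
import csv
import io

# Hypothetical headerless file: the column names are supplied explicitly.
names = ["Email", "Telephone Number", "Last Name", "First Name", "Zip Code"]
headerless = io.StringIO("ada@example.com,555-0100,Lovelace,Ada,12345\n")

reader = csv.DictReader(headerless, fieldnames=names)
first = next(reader)  # the first line is treated as data, not as a header
print(first["First Name"], first["Last Name"])

# Hypothetical file with a header: fieldnames is omitted, so the first row is used.
with_header = io.StringIO("Email,Zip Code\nada@example.com,12345\n")
reader2 = csv.DictReader(with_header)
print(reader2.fieldnames)  # names in the order they appear in the file
```

Passing fieldnames is mostly useful when a file arrives without a header. For the files in the question, which do have headers, leaving it out is the simpler choice.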

Nate Spry

Jeff Muday
