Python CSV

CSV reading in Python: Is it a dictionary or a list?

So, I'm watching Kenneth read a csv file 4:45min in using csv.DictReader. Then I got stuck here... 'How does this work?!' You index like a list with "rows[1:3]" but then index like a dictionary with "row['group1']". I FINALLY realized that this is what you do when you have a list comprised of dictionaries and want to access a value in one of the dictionaries. I guess that's what we have here. But it STILL doesn't make sense because we used the list() function.... so I would think that the variable "rows" would be a list of tuples at best... and you can't index with a key in tuples.... well, I can't seem to. Any explanations why the example below (which I've recreated) works?

import csv
with open('museum.csv', newline='') as csvfile:
    artreader = csv.DictReader(csvfile, delimiter='|')
    rows = list(artreader)
    for row in rows[1:3]:
        print(row['group1'])

2 Answers

Chris Freeman
Treehouse Moderator 67,989 Points

Let's dig into the object created by csv.DictReader. A csv.DictReader is an iterator that produces each row as needed. To get all of the rows at once, wrap the iterator in list(). In this case, all the data goes into the list rows. When no fieldnames argument is provided, csv.DictReader uses the first line of the file as the field names, so that header line is not returned as a data row. Each row comes back as an OrderedDict, which is new in Python 3.6 (from Python 3.8 onward, rows are plain dict objects). Either way, a row is indexed by key, as in row['group1'].
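A minimal sketch of that behavior, using an in-memory string in place of museum.csv. The column names and values are made up for illustration, since the real file's contents are not shown in this thread:

```python
import csv
import io

# Hypothetical stand-in for museum.csv (assumed data, pipe-delimited).
sample = "group1|title\nPaintings|Sunrise\nSculpture|Bronze Cat\nPrints|Harbor\n"

reader = csv.DictReader(io.StringIO(sample), delimiter="|")

# The header line supplies the keys, so it never appears as a data row.
fieldnames = reader.fieldnames

# list() drains the iterator; each item is one dict-like row.
rows = list(reader)

# Index the list by position, then index the row by key.
first_group = rows[0]["group1"]
sliced = [row["group1"] for row in rows[1:3]]

print(fieldnames, first_group, sliced)
```

So rows is a list (sliceable by position) whose items are mappings (indexable by column name), which is why both kinds of indexing appear in the question's code.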

Hope this helps. Post back if you need more help. Good Luck!

Thanks, Chris. I checked out the documentation and played with IDLE beforehand (trying to get an answer myself)... I guess where I'm stumped is how using list() on an OrderedDict object returns what I assume to be a list with one dictionary for each row. I just now opened IDLE and used list() on an ordered dict, and it made a list of just the keys.

Chris Freeman
Treehouse Moderator 67,989 Points

list() is being used on the iterable returned by csv.DictReader. Each item returned from that iterable object is an OrderedDict. So the list() creates a list of OrderedDict objects generated by the DictReader. list() is not run on the OrderedDict objects. The key idea is list() is being used to create a version of the data that can be sliced.
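The difference can be checked directly. In this sketch (the sample data is made up), list() applied to a single row returns that row's keys, while list() applied to the reader returns the rows themselves:

```python
import csv
import io

# Hypothetical data standing in for the real file.
sample = "group1|title\nPaintings|Sunrise\nSculpture|Bronze Cat\n"

reader = csv.DictReader(io.StringIO(sample), delimiter="|")

# list() on the reader: one mapping per data row.
rows = list(reader)

# list() on one row: just that row's keys, as seen in IDLE.
keys_only = list(rows[0])

# OrderedDict is a dict subclass, so this holds on Python 3.6 and later.
row_is_mapping = isinstance(rows[0], dict)

print(len(rows), keys_only, row_is_mapping)
```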

If the intent was to iterate over all the rows returned by csv.DictReader and not a slice of the data, then a simple for loop would work:

for row in artreader:
    print(row['group1'])

In this case, artreader would provide a new row as needed to feed the for loop. However, the intent is to loop over just certain rows of the data. One way would be to track the iteration count in the loop and act only when it is 1 or 2, matching rows[1:3]. But this gets messy:

for count, row in enumerate(artreader):
    if 1 <= count < 3:
        print(row['group1'])

An alternative presented by Kenneth is to run the iterator through to completion to create a list of all items before starting the loop. Using list() on an iterator will accomplish this and leave you with a list that can be sliced as needed. One downside to the list() approach is memory use, since the entire file has to be read in to create the list before the loop can start.

itertools.islice provides a way to slice an iterator without reading it to the end.

import itertools
rows = list(itertools.islice(artreader, 3))
for row in rows[1:3]:
    print(row['group1'])

This gives the same output as the original list() code, but stops reading the file after the first 3 rows.
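itertools.islice also accepts start and stop arguments, so the slice can be taken straight from the iterator with no intermediate list. A small check with made-up one-column data, comparing it against the read-everything approach:

```python
import csv
import io
import itertools

# Hypothetical one-column data; the first line is the header.
sample = "group1\nA\nB\nC\nD\nE\n"

# Approach 1: read everything, then slice the list.
all_rows = list(csv.DictReader(io.StringIO(sample)))
from_list = [row["group1"] for row in all_rows[1:3]]

# Approach 2: slice the iterator; reading stops once the slice is filled.
reader = csv.DictReader(io.StringIO(sample))
from_islice = [row["group1"] for row in itertools.islice(reader, 1, 3)]

# Rows past the slice have not been read yet and are still available.
remaining = [row["group1"] for row in reader]

print(from_list, from_islice, remaining)
```

Both approaches select the same rows. The second never holds more than one row at a time, which matters for large files.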

Wow, great answer... thanks Chris! This link may be helpful to others, like it was to me, in explaining iterator vs iterable: https://stackoverflow.com/questions/9884132/what-exactly-are-pythons-iterator-iterable-and-iteration-protocols