Brendan Whiting
Front End Web Development Techdegree Graduate 84,738 Points
A couple questions about this challenge
1) Isn't the index method already going to return an int? Why wrap this with the int() function?
2) I don't understand the purpose of this line of code: filtered_rows.append([str(x).encode('utf8') for x in row]). Wasn't the data already utf-8?
See comments in code below:
import csv

def open_with_csv(filename, d='\t'):
    data = []
    with open(filename, encoding='utf-8') as tsvin:  # doesn't the data become utf-8 starting here?
        tie_reader = csv.reader(tsvin, delimiter=d)
        for row in tie_reader:
            data.append(row)
    return data

def filter_col_by_string(the_data, field, filter_condition):
    filtered_rows = []
    col = int(the_data[0].index(field))  # isn't the index method guaranteed to return an int (or an error)?
    filtered_rows.append(the_data[0])
    for row in the_data[1:]:
        if row[col] == filter_condition:
            filtered_rows.append([str(x).encode('utf8') for x in row])  # Why do we need to encode everything to utf-8 here, didn't we do that already?
    return filtered_rows

data_from_csv = open_with_csv('data.csv')
dkny_ties = filter_col_by_string(data_from_csv, "brandName", "DKNY")
2 Answers
Chris Freeman
Treehouse Moderator 68,454 Points
I would agree that the_data[0].index(field) either returns an int or raises an error (a ValueError if the field is not in the list). So the int() seems unnecessary.
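A quick way to check this is the minimal sketch below. The header list is made up for illustration and is not taken from the course data.

```python
# Hypothetical header row, standing in for the_data[0].
header = ["id", "brandName", "price"]

# list.index returns the position of the first match as a plain int.
col = header.index("brandName")
print(type(col).__name__, col)  # int 1

# A missing field does not return some non-int value; it raises ValueError.
missing_error = None
try:
    header.index("colour")
except ValueError as err:
    missing_error = err
    print("ValueError raised:", err)
```

Since the only two outcomes are an int or an exception, wrapping the call in int() changes nothing.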
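UTF-8 is an encoding, that is, a way of storing text as raw bytes in a file. With encoding='utf-8', the open function decodes the file's bytes as UTF-8 and returns the data as regular strings, so once the data is in memory it is no longer "UTF-8", it is just str.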
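The difference between the bytes on disk and the str you get back from open can be seen in a small sketch like this one. The file name and the brand name in it are invented for the example.

```python
import os
import tempfile

# Hypothetical text containing one non-ASCII character (e with grave accent).
name = "Herm\u00e8s"
raw = name.encode("utf-8")  # the bytes that would sit in the file

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    # Write the raw bytes, as a UTF-8 encoded file on disk.
    with open(path, "wb") as f:
        f.write(raw)
    # Read it back in text mode: open() decodes the bytes into a str.
    with open(path, encoding="utf-8") as f:
        text = f.read()

print(type(raw).__name__, len(raw))    # bytes 7
print(type(text).__name__, len(text))  # str 6
```

The accented character takes two bytes in UTF-8, which is why the byte count and the character count differ.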
The line filtered_rows.append([str(x).encode('utf8') for x in row]) is using a list comprehension to walk down each item in the list row and return a list where each string is encoded back into a byte string with encoding UTF-8. It's not clear why this is necessary. Perhaps there is other code that expects filter_col_by_string to return byte strings. This is where a little docstring would have gone a long way.
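To see what that comprehension does to a single row, here is a small sketch. The row values are hypothetical, shaped like what csv.reader would yield.

```python
# Hypothetical row of str values, as csv.reader produces them.
row = ["42", "DKNY", "12.50"]

# Same expression as in the challenge code: every item becomes a bytes object.
encoded = [str(x).encode("utf8") for x in row]
print(encoded)  # [b'42', b'DKNY', b'12.50']

# In Python 3 a bytes object never compares equal to a str.
same_as_text = encoded[1] == "DKNY"
print(same_as_text)  # False

# Decoding gets the original text back.
round_trip = [b.decode("utf8") for b in encoded]
print(round_trip == row)  # True
```

One practical consequence is that any later comparison against a plain string such as "DKNY" would fail on the encoded rows.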
Looking through the other code from this course, every use of the output of filter_col_by_string, such as totaling the prices or finding the minimum price, calls float(row[col]) or something similar to convert the byte string into a float. This could just as easily have been done if the strings had been left in regular text format. I am curious whether anyone else has more insight into this code.
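As a rough sketch of that point, here is a variant of the filter that leaves the values as str and drops the int() call. The name filter_col_by_string_text and the sample data are assumptions made for this example; this is not the course's code.

```python
def filter_col_by_string_text(the_data, field, filter_condition):
    """Return the header row plus every row whose `field` column matches.

    Hypothetical variant: rows stay as lists of str, nothing is encoded.
    """
    header = the_data[0]
    col = header.index(field)  # already an int, or raises ValueError
    matches = [list(row) for row in the_data[1:] if row[col] == filter_condition]
    return [header] + matches


# Made-up sample data in the shape open_with_csv would return.
data = [
    ["id", "brandName", "price"],
    ["1", "DKNY", "12.50"],
    ["2", "Other", "9.00"],
    ["3", "DKNY", "20.00"],
]

dkny_ties = filter_col_by_string_text(data, "brandName", "DKNY")
price_col = dkny_ties[0].index("price")
total = sum(float(row[price_col]) for row in dkny_ties[1:])
print(total)  # 32.5

# float() accepts both forms, so the encoding step buys nothing here.
print(float("12.50") == float(b"12.50"))  # True
```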
Sorry, but the answer is IDK! :-/
Ary de Oliveira
28,298 Points
import csv  # from s2q1.py

# function from s2q2.py
def open_with_csv(filename, d='\t'):
    data = []
    with open(filename, encoding='utf-8') as tsvin:
        tie_reader = csv.reader(tsvin, delimiter=d)
        for row in tie_reader:
            data.append(row)
    return data

def filter_col_by_string(the_data, field, filter_condition):
    filtered_rows = []
    # find index of field in first row
    col = int(the_data[0].index(field))
    filtered_rows.append(the_data[0])
    for row in the_data[1:]:
        if row[col] == filter_condition:
            filtered_rows.append([x for x in row])
    return filtered_rows

data_from_csv = open_with_csv('data.csv')
# code above this line is included to make this file compile on its own.
# -------------------------------------------------------------------------
# here is the answer:
dkny_ties = filter_col_by_string(data_from_csv, "brandName", "DKNY")
Brendan Whiting
Front End Web Development Techdegree Graduate 84,738 Points
Cool, thanks, that much makes sense.