How to filter CSV rows by keywords from another CSV file?

Question

Hi,

I have a CSV file with up to 1000 rows, and I want to filter out rows within the CSV file that meets specific criteria.

I have a list of keywords/strings in one file ('maths teacher', 'English teacher', 'receptionist', 'deputy head' etc) named titles.csv

In the other file, I have multiple columns and rows, and one of the column fields is occupation. I want to be able to filter out all rows in this CSV file that don't include the list of occupations in titles.csv

Can anyone help me with this problem? I've heard Pandas may be the way to solve this, but I am keen to try and do it using csv module.

Answer 1 · 2018-08-06T02:57:39Z

on Aug 6, 2018

If you want to use pythons csv module you could basically open both of your csv files and store them as 2D lists [[],[],[]]. Then you could loop through both 2D list and compare the individual columns that would have matching keywords. Then only save the non-matching rows to a new 2D list and finally write that to a new csv file.

here is a start:

import csv
# function to read a csv file to a 2D list ex: [[col1,col2,col3],[col1,col,2col3]]
def read_from_file(csv_file):
    with open(csv_file, 'rb') as f:
        reader = csv.reader(f)
        data = list(reader)
    return data

data_from_first_csv = read_from_file(file1.csv)
data_from_second_csv = read_from_file(file2.csv)

output_data = []
for row in data_from_first_csv:
    for key in data_from_second_csv:
        # if col 1 in the first csv matches any key in col 2 of the second csv skip it
        if row[0] != key[1]:
            output_data.append(row)

// write the output to a new csv
with open("output.csv", "wb") as f:
    writer = csv.writer(f)
    writer.writerows(output_data)

Hope this helps

Welcome to the Treehouse Community

Looking to learn something new?

Sam Frampton

Sam Frampton

How to filter CSV rows by keywords from another CSV file?

1 Answer

Dave Faliskie

Dave Faliskie

Sam Frampton

Sam Frampton