How do I return all lines in a CSV containing specific words or phrases in a specific column?



Posted on 16 Aug 2022, this text provides information on Bugs & Fixes related to General Tech. Please note that while accuracy is prioritized, the data presented might not be entirely correct or up-to-date. This information is offered for general knowledge and informational purposes only, and should not be considered as a substitute for professional advice.

manpreet 2 years ago

 

I have a CSV file containing a data set (in this case, addresses). I would like to make a second CSV file containing only the entries that have one of a set of phrases in a specific column. For example, I would like to return all the people who currently live in "Viridian", but not those who previously lived there or never lived there.

The example data is:

First Name,Second Name,ID,Home Town,County,Current Town,Street
Sam,Smith,1234,Pallet,North,Orange,Lemon
Jenny,Walton,1456,Viridian,West,York,High View
Alan,Kirk,2378,Orange,West,Viridian,High street
Reese,Small,9840,Minsk,East,Viridian,Ocean Avenue
Audry,Owen,7865,York,South,Blackmarsh,8th Street
Marco,Jefferson,1580,Amsterdam,Central,Oxford,Church Road
Jim,Lowe,5218,Windy City,East,Windy City,Oak
Gillian,Pope,3217,Rome,Central,Rome,Low road

I have previously used this code:

towns = ["Viridian", "Rome"]

with open("addresses.csv") as oldfile, open("Filtered addresses.csv", "w") as newfile:
    for line in oldfile:
        # Matches the town name anywhere on the line, in any column
        if any(town in line for town in towns):
            newfile.write(line)

However, this returns lines with the specified towns in any column. I just want the ones with the specified towns in the "Current Town" column.

I tried this instead:

import csv

town = ["Viridian", "Rome"]

with open("Filtered addresses.csv", "w", encoding="Latin-1") as newfile:

    reader = csv.reader(open("addresses.csv", 'r', encoding="Latin-1"))

    for data in reader:
        if any(town in data[6] for town in town):
            newfile.write(data)

But this results in an error:

TypeError: write() argument must be str, not list

While altering the code to read:

newfile.write(str(data))

returns some entries, but they are written as one long line rather than as separate rows.

What is the best way to achieve my aim? I would like to keep the full row of data in each case.

Thanks!

manpreet 2 years ago

 

pandas will make it extremely easy:

import pandas as pd

town = ["Viridian", "Rome"]
# Read csv as pandas dataframe
original = pd.read_csv("addresses.csv", index_col=False)
# Select rows where `Current Town` column's value is in `town`
filtered = original[original['Current Town'].isin(town)]
# Save the filtered dataframe to a file, without adding an extra index column
filtered.to_csv("Filtered addresses.csv", index=False)

If you don't have pandas installed, you can install it by running the following in your command line:

pip install pandas
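
If you would rather stay with the standard library, the second attempt in the question is close: the missing piece is a `csv.writer` (or `csv.DictWriter`), which turns each row back into a properly delimited line. Below is a minimal sketch under a few assumptions: `filter_csv` is a hypothetical helper name, not something from the question, and the Latin-1 encoding is simply carried over from the question's code. It selects the column by its header name, which avoids counting indexes by hand (in the sample header, "Current Town" is the sixth column, index 5).

```python
import csv


def filter_csv(src_path, dst_path, column, wanted):
    """Copy rows whose `column` value is in `wanted`; return how many were kept.

    Hypothetical helper for illustration; matching is exact and case-sensitive.
    """
    wanted = set(wanted)
    kept = 0
    # newline="" lets the csv module handle line endings itself
    with open(src_path, newline="", encoding="Latin-1") as src, \
         open(dst_path, "w", newline="", encoding="Latin-1") as dst:
        reader = csv.DictReader(src)  # uses the first row as column names
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row[column] in wanted:
                writer.writerow(row)  # writes the full row, correctly delimited
                kept += 1
    return kept
```

With the sample data from the question, `filter_csv("addresses.csv", "Filtered addresses.csv", "Current Town", ["Viridian", "Rome"])` should keep the rows for Alan, Reese and Gillian, and leave out Jenny, whose home town is Viridian but whose current town is York.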

