Import data from a text file into a pandas dataframe

Posted on 16 Aug 2022 in Bugs & Fixes (General Tech).

Answers (2)

manpreet, Best Answer, 2 years ago

 

I'm building a web app using Django. I uploaded a text file using

csv_file = request.FILES['file'].

I can't read the CSV into pandas. The file that I'm trying to import contains both comment text and numeric data, but I only want the data.

I've tried the following:

  1. df = pd.read_csv(csv_file, sep=" ", header=None, names=["col1","col2","col3"], skiprows=2) to skip the comment lines and read only the numbers.

Error: pandas does not read all 3 columns. It reads only 1 column.

  2. df = pd.read_csv(csv_file, sep="\s{2}", header=None, names=["col1","col2","col3"], skiprows=2) to skip the comment lines and read only the numbers.

Error: cannot use a string pattern on a bytes-like object

  3. df = pd.read_csv(csv_file.read(), sep=" ", header=None, names=["col1","col2","col3"], skiprows=2) to skip the comment lines and read only the numbers.

File I uploaded

% filename
% username
2.0000  117.441  -0.430
2.0100  117.499  -0.337
2.0200  117.557  -0.246
2.0300  117.615  -0.157
2.0400  117.672  -0.069

views.py

def new_measurement(request, pk):
    material = Material.objects.get(pk=pk)
    if request.method == 'POST':
        form = NewTopicForm(request.POST)
        if form.is_valid():
            topic = form.save(commit=False)
            topic.material = material
            topic.message=form.cleaned_data.get('message')
            csv_file = request.FILES['file']
            df = genDataFrame(csv_file)
            topic.data = df
            topic.created_by = request.user
            topic.save()
            return redirect('topic_detail', pk =  material.pk)
    else:
        form = NewTopicForm()
    return render(request, 'new_topic.html', {'material': material, 'form': form})
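def genDataFrame(csv_file):
    df = pd.read_csv(csv_file, sep=" ", header=None, names=["col1","col2","col3"])
    # Coerce every column to numeric; entries that cannot be parsed become NaN
    # (DataFrame.convert_objects is not available in pandas 1.0 and later)
    df = df.apply(pd.to_numeric, errors="coerce")
    # Drop rows that contain NaN, then renumber the remaining rows from 0
    df = df.dropna()
    df = df.reset_index(drop=True)
    return df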

manpreet 2 years ago

This works on the data you provided and gives you the dataframe you expect:

df = pd.read_csv(csv_filepath, sep='  ', header=None, 
                 names=['col1', 'col2', 'col3'], skiprows=2, engine='python')

Because sep is more than one character, pandas has to use the python engine instead of the C engine. The python engine sometimes has trouble with quotes, but your file doesn't contain any, so that's fine. You don't strictly need to specify the engine: pandas falls back to the python engine automatically, but it then emits a ParserWarning. Passing engine='python' explicitly suppresses that warning.
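Since the question reads from a Django upload rather than a file path, here is a minimal sketch of one way to handle that case. It rests on a few assumptions: the upload is UTF-8 text, the "cannot use a string pattern on a bytes-like object" error comes from a regex separator being applied to the raw bytes stream, and the helper name read_measurements is made up for illustration. It swaps the two-space separator for sep=r"\s+" (any run of whitespace, which the C engine supports) and uses comment="%" instead of skiprows=2, so the number of header lines does not matter. An io.BytesIO object stands in for request.FILES['file'].

```python
import io

import pandas as pd

# Sample contents matching the file shown in the question.
RAW_BYTES = (
    b"% filename\n"
    b"% username\n"
    b"2.0000  117.441  -0.430\n"
    b"2.0100  117.499  -0.337\n"
)


def read_measurements(uploaded):
    """Hypothetical helper: parse an uploaded whitespace-delimited file."""
    # Decode the raw bytes first so pandas receives text.
    # Assumption: the upload is UTF-8 encoded.
    text = uploaded.read().decode("utf-8")
    return pd.read_csv(
        io.StringIO(text),
        sep=r"\s+",      # one or more whitespace characters between fields
        comment="%",     # ignore lines that start with '%'
        header=None,
        names=["col1", "col2", "col3"],
    )


# io.BytesIO stands in for the uploaded file object here.
df = read_measurements(io.BytesIO(RAW_BYTES))
print(df)
```

In the view, the call would look like df = read_measurements(request.FILES['file']), assuming the uploaded object has not already been read.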

