Error with FuzzyWuzzy: StringProcessor.replace_non_letters_non_numbers_with_whitespace(s)

Posted on 16 Aug 2022 in General Tech, Bugs & Fixes.

manpreet Best Answer 2 years ago

 

I cannot get the following function to run:

match, match_score = process.extractOne(score, pct_dict.keys())

I get a whitespace error I cannot seem to resolve. Any idea what is causing this?

What it should do: If the score is 15 it should return 0.026

Error:

Error: output = self.func(*resolved_args, **resolved_kwargs)
wnas1 | File "/code/cleveland/templatetags/percentiles_ratings.py", line 32, in get_percentile_standard
wnas1 | match, match_score = process.extractOne(score, pct_dict.keys())
wnas1 | File "/usr/local/lib/python3.7/site-packages/fuzzywuzzy/process.py", line 220, in extractOne
wnas1 | return max(best_list, key=lambda i: i[1])
wnas1 | File "/usr/local/lib/python3.7/site-packages/fuzzywuzzy/process.py", line 78, in extractWithoutOrder
wnas1 | processed_query = processor(query)
wnas1 | File "/usr/local/lib/python3.7/site-packages/fuzzywuzzy/utils.py", line 95, in full_process
wnas1 | string_out = StringProcessor.replace_non_letters_non_numbers_with_whitespace(s)
wnas1 | File "/usr/local/lib/python3.7/site-packages/fuzzywuzzy/string_processing.py", line 26, in replace_non_letters_non_numbers_with_whitespace
wnas1 | return cls.regex.sub(" ", a_string)

Code:

from __future__ import unicode_literals
from django import template
from fuzzywuzzy import fuzz
from fuzzywuzzy import process


register = template.Library()


@register.simple_tag
def get_perc(score):
    MATCH_THRESHOLD = 80
    pct_dict = {14: 0.016, 14.7: 0.021, 15.3: 0.026, 16: 0.034, 16.7: 0.04, 17.3: 0.05, 18: 0.07, 18.7: 0.09,
                    19.3: 0.11, 20: 0.13, 20.7: 0.17, 21.3: 0.21, 22: 0.26, 22.7: 0.31, 23.3: 0.38, 24: 0.47}
    if not score:
        return '--'
    elif score < 26.7:
        return '<1'

    match, match_score = process.extractOne(score, pct_dict.keys())

    if match_score >= MATCH_THRESHOLD:
        return pct_dict[match]
    else:
        return '--'
manpreet 2 years ago

 

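As per the fuzzywuzzy documentation, it compares strings, so both the query and every choice must be a string. Your score and the dictionary keys are numbers, which is why the regex substitution at the bottom of the traceback fails. Convert the values to strings before comparing them:

match, match_score = process.extractOne(str(score), [str(k) for k in pct_dict])

Note that match is then a string, so convert it back with float(match) before looking it up in pct_dict.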
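To see why a number triggers the error, here is a minimal sketch that uses only the standard library `re` module. The pattern and the helper names (`NON_WORD`, `replace_non_word_with_space`, `try_clean`) are assumptions made for illustration and are not fuzzywuzzy's actual internals. The point is only that a regex substitution accepts a string and rejects a float.

```python
import re

# Hypothetical stand-in for the library's internal pattern:
# any character that is not a letter, digit, or underscore.
NON_WORD = re.compile(r"\W")


def replace_non_word_with_space(value):
    """Replace every non-word character with a space; value must be a str."""
    return NON_WORD.sub(" ", value)


def try_clean(value):
    """Return the cleaned string, or the exception's type name on failure."""
    try:
        return replace_non_word_with_space(value)
    except TypeError as exc:
        return type(exc).__name__


print(try_clean(15.3))       # a float is rejected by the regex substitution
print(try_clean(str(15.3)))  # the string "15.3" is accepted
```

In this sketch the decimal point is also replaced by a space, which hints at why string matching on numbers behaves oddly.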

I would not recommend this approach, though, because string similarity does not reflect numeric closeness, so the result will not be accurate.

>>> from fuzzywuzzy import process
>>> x = ['1','2','3']
>>> y='2'
>>> process.extractOne(y,x)
('2', 100)
>>> y='2.2'
>>> process.extractOne(y,x)
('2', 90)
>>> y = '2.9'
>>> process.extractOne(y,x)
('2', 90)

In the last two calls, both 2.2 and 2.9 score 90 against '2', even though 2.9 is much closer to 3.
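The same effect can be reproduced without fuzzywuzzy. The sketch below swaps in the standard library's `difflib.SequenceMatcher` as the string-similarity measure, so the ratios differ from fuzzywuzzy's scores, and the helper names are hypothetical. It contrasts picking the closest choice by text with picking it by numeric distance.

```python
from difflib import SequenceMatcher


def string_similarity(a, b):
    """Character-based similarity in [0, 1]; knows nothing about numbers."""
    return SequenceMatcher(None, a, b).ratio()


def closest_by_text(query, choices):
    """Pick the choice whose text looks most like the query."""
    return max(choices, key=lambda c: string_similarity(query, c))


def closest_by_value(query, choices):
    """Pick the choice whose numeric value is nearest to the query."""
    return min(choices, key=lambda c: abs(float(c) - float(query)))


choices = ["1", "2", "3"]
print(closest_by_text("2.9", choices))   # shares the character "2" with "2"
print(closest_by_value("2.9", choices))  # 3 is numerically nearest
```

Text matching picks "2" because it shares a character with "2.9", while the numeric comparison picks "3".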

Since your values are numbers, I would recommend simply comparing them numerically, like this:

value = min(pct_dict, key=lambda x: abs(x - score))
# then add some logic to check that value is close enough to score, e.g. a static threshold such as `abs(value - score) < .3`
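Put together, the nearest-key lookup with a static threshold might look like the sketch below. The function name `nearest_percentile` and the `max_gap` parameter are hypothetical, the table is the one from the question, and the 0.3 default is just the example threshold mentioned above.

```python
# Lookup table copied from the question.
PCT_DICT = {14: 0.016, 14.7: 0.021, 15.3: 0.026, 16: 0.034,
            16.7: 0.04, 17.3: 0.05, 18: 0.07, 18.7: 0.09,
            19.3: 0.11, 20: 0.13, 20.7: 0.17, 21.3: 0.21,
            22: 0.26, 22.7: 0.31, 23.3: 0.38, 24: 0.47}


def nearest_percentile(score, table=PCT_DICT, max_gap=0.3):
    """Return the value for the key nearest to score, or None if too far."""
    nearest = min(table, key=lambda k: abs(k - score))
    if abs(nearest - score) < max_gap:
        return table[nearest]
    return None  # no key is close enough to count as a match


print(nearest_percentile(15.2))  # nearest key is 15.3
print(nearest_percentile(30))    # nearest key is 24, which is too far away
```

A score that sits exactly halfway between two keys resolves to whichever key `min` encounters first, so pick the threshold and key spacing with that in mind.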

There are a few Stack Overflow answers that might help you with this.

