Posted on 16 Aug 2022, this text provides information on Bugs & Fixes related to General Tech. Please note that while accuracy is prioritized, the data presented might not be entirely correct or up-to-date. This information is offered for general knowledge and informational purposes only, and should not be considered as a substitute for professional advice.
I'm trying to combine two types of features before clustering.
My features are text, represented as a sparse matrix, and a second array holding the other features of each data point.
I've tried to combine the two types of features into one array and pass it as input to the algorithm:
db = DBSCAN(eps=1, min_samples=3, metric=get_distance).fit(array(combined_list))
Also I've built a custom distance metric which I'm going to use.
    def get_distance(vec1,vec2):
        text_distance = cosine_similarity(vec1[0] ,vec2[0])
        other_distance = vec1[1]-vec2[1]
        return (text_distance+other_distance)/2
But I get an error when I pass my input array. The combined array was constructed as follows:
    combined_list = []
    for i in range(len(hashes_list)):
        combined_list.append((hashes_list[i],text_list[i]))
    combined_list = array(combined_list)
Full Error Traceback:
    db = DBSCAN(eps=1, min_samples=3, metric=get_distance ).fit(array(combined_list))
    Traceback (most recent call last):
      File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydevd_bundle/pydevd_exec2.py", line 3, in Exec
        exec(exp, global_vars, local_vars)
      File "", line 1, in <module>
      File "/Users/tal/src/campaign_detection/Data_Extractor/venv/lib/python3.7/site-packages/sklearn/cluster/dbscan_.py", line 319, in fit
        X = check_array(X, accept_sparse='csr')
      File "/Users/tal/src/campaign_detection/Data_Extractor/venv/lib/python3.7/site-packages/sklearn/utils/validation.py", line 527, in check_array
        array = np.asarray(array, dtype=dtype, order=order)
      File "/Users/tal/src/campaign_detection/Data_Extractor/venv/lib/python3.7/site-packages/numpy/core/numeric.py", line 538, in asarray
        return array(a, dtype, copy=False, order=order)
    ValueError: setting an array element with a sequence.
Is this the correct approach for combining a text vector with other features?
I have a couple of suggestions for your approach.
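From the documentation, DBSCAN's fit() expects:
X : array or sparse (CSR) matrix of shape (n_samples, n_features), or array of shape (n_samples, n_samples)
An array of (hash, text) pairs does not have that shape, which is what the ValueError is reporting. Instead, concatenate both feature sets into a single 2-D matrix with one row per data point, and split each row back into its text part and its other features inside get_distance().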
Example:
    import numpy as np
    from scipy.sparse import hstack
    from sklearn.cluster import DBSCAN
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_distances, euclidean_distances

    corpus = [
        'This is the first document.',
        'This document is the second document.',
        'And this is the third one.',
        'Is this the first document?',
    ]
    vectorizer = TfidfVectorizer()
    text_list = vectorizer.fit_transform(corpus)
    hashes_list = np.array([[12,12,12], [12,13,11], [12,1,16], [4,8,11]])

    # Text columns go first, so the first n1 entries of each row are the text features.
    combined_list = hstack((text_list, hashes_list))
    n1 = text_list.shape[1]  # number of text features

    def get_distance(vec1, vec2):
        # cosine_distances is 1 - cosine similarity, so both terms are distances;
        # [0, 0] pulls the scalar out of the 1x1 result.
        text_distance = cosine_distances([vec1[:n1]], [vec2[:n1]])[0, 0]
        other_distance = euclidean_distances([vec1[n1:]], [vec2[n1:]])[0, 0]
        return (text_distance + other_distance) / 2

    db = DBSCAN(eps=1, min_samples=3, metric=get_distance).fit(combined_list.toarray())
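The documentation line quoted above also allows an array of shape (n_samples, n_samples). As a rough sketch of that route, and not something the original answer proposes, the two distance matrices could be computed separately, averaged, and passed with metric='precomputed'. The names text_features, other_features and dist_matrix are made up for illustration, and the equal weighting simply mirrors get_distance(); it assumes the two distances are on comparable scales, which may not hold for real data.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_distances, euclidean_distances

corpus = [
    'This is the first document.',
    'This document is the second document.',
    'And this is the third one.',
    'Is this the first document?',
]
# Sparse TF-IDF matrix, one row per document.
text_features = TfidfVectorizer().fit_transform(corpus)
# Illustrative non-text features, one row per document.
other_features = np.array([[12, 12, 12], [12, 13, 11], [12, 1, 16], [4, 8, 11]])

# One (n_samples, n_samples) distance matrix per feature group.
text_dist = cosine_distances(text_features)
other_dist = euclidean_distances(other_features)

# Same averaging as get_distance(); the equal weights are an assumption.
dist_matrix = (text_dist + other_dist) / 2

# DBSCAN reads dist_matrix as pairwise distances instead of feature rows.
db = DBSCAN(eps=1, min_samples=3, metric='precomputed').fit(dist_matrix)
labels = db.labels_
```

This avoids calling a Python function for every pair of points, at the cost of holding the full n_samples by n_samples matrix in memory.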