Posted on 16 Aug 2022 in General Tech, Bugs & Fixes.
I have a couple of suggestions for your approach.
First, the input to DBSCAN's fit() has to be a single two-dimensional array or sparse matrix. The scikit-learn documentation describes the parameter as:
X : array or sparse (CSR) matrix of shape (n_samples, n_features), or array of shape (n_samples, n_samples)
Second, get_distance() has to return a single value, not an array. I would therefore suggest using a separate measure for the non-text features. The example below uses Euclidean distance for them and cosine distance for the text.
Example:
import numpy as np
from scipy.sparse import hstack
from sklearn.cluster import DBSCAN
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_distances, euclidean_distances

corpus = [
    'This is the first document.',
    'This document is the second document.',
    'And this is the third one.',
    'Is this the first document?',
]
vectorizer = TfidfVectorizer()
text_list = vectorizer.fit_transform(corpus)  # sparse TF-IDF matrix, one row per document

# Non-text features, one row per document
hashes_list = np.array([[12, 12, 12],
                        [12, 13, 11],
                        [12, 1, 16],
                        [4, 8, 11]])

# Text columns go first so that they occupy the first n1 columns of every row
combined_list = hstack((text_list, hashes_list))
n1 = text_list.shape[1]  # number of text (TF-IDF) columns

def get_distance(vec1, vec2):
    # Cosine distance (1 - cosine similarity) on the text part
    text_distance = cosine_distances([vec1[:n1]], [vec2[:n1]])[0, 0]
    # Euclidean distance on the remaining features
    other_distance = euclidean_distances([vec1[n1:]], [vec2[n1:]])[0, 0]
    # Return a single float, not a (1, 1) array
    return (text_distance + other_distance) / 2

db = DBSCAN(eps=1, min_samples=3, metric=get_distance).fit(combined_list.toarray())
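As an alternative to a Python callback, the averaged distance can be computed for all pairs up front and handed to DBSCAN with metric="precomputed". The following is only a sketch under a few assumptions: it uses cosine distance (1 - cosine similarity) for the text part, the names other_features and combined_dist are made up for illustration, and eps and min_samples are simply carried over from the example above rather than tuned.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_distances, euclidean_distances

corpus = [
    'This is the first document.',
    'This document is the second document.',
    'And this is the third one.',
    'Is this the first document?',
]
# Hypothetical name for the non-text features, one row per document
other_features = np.array([[12, 12, 12],
                           [12, 13, 11],
                           [12, 1, 16],
                           [4, 8, 11]])

text_matrix = TfidfVectorizer().fit_transform(corpus)  # stays sparse

# Pairwise distance matrices, each of shape (n_samples, n_samples)
text_dist = cosine_distances(text_matrix)
other_dist = euclidean_distances(other_features)

# Same averaging as in get_distance(), applied to whole matrices
combined_dist = (text_dist + other_dist) / 2

# DBSCAN treats the input as a distance matrix when metric="precomputed"
labels = DBSCAN(eps=1, min_samples=3, metric="precomputed").fit_predict(combined_dist)
print(labels)
```

This keeps the text matrix sparse and avoids calling a Python function for every pair of rows, at the cost of holding the full n_samples by n_samples matrix in memory.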
manpreet
Best Answer
2 years ago
I'm trying to combine two types of parameters before clustering.
My parameters are text, represented as a sparse matrix, and another array representing other features of my data points.
I've tried to combine the two types of parameters into one array and pass it as input to the algorithm:
I've also built a custom distance metric, which I'm going to use.
But I get an error when I try to pass my input array. The combined array was constructed as follows:
Full Error Traceback:
Is this the correct approach for combining a text vector with other parameters?