Speak now
Please Wait Image Converting Into Text...
Embark on a journey of knowledge! Take the quiz and earn valuable credits.
Challenge yourself and boost your learning! Start the quiz now to earn credits.
Unlock your potential! Begin the quiz, answer questions, and accumulate credits along the way.
General Tech Technology & Software 2 years ago
Posted on 16 Aug 2022, this text provides information on Technology & Software related to General Tech. Please note that while accuracy is prioritized, the data presented might not be entirely correct or up-to-date. This information is offered for general knowledge and informational purposes only, and should not be considered as a substitute for professional advice.
Turn Your Knowledge into Earnings.
I am looking for a NoSQL technology that meets the requirement of being able to process geospatial as well as time queries on a large scale with decent performance. I want to batch-process several hundred of GBs to TBs of data with the proposed NoSQL technology along with Spark. This will obviously be run on a cluster with several nodes.
Types of queries I want to run:
I am currently evaluating which technologies are possible for my usecase but I'm overwhelmed by the sheer amount of technologies there are available. I have thought about popular technologies like MongoDB and Cassandra. Both seem to be applicable for my usecase (Cassandra only with Stratios Lucene index) but there might be a different technology that works even better.
Is there any technology that will heavily outperform others based on these requirements?
I want to batch-process several hundred of GBs to TBs of data
That's not really a cassandra use case. Cassandra is firstly optimized for write performance. If you have a really huge amount of writes, Cassandra could be a good option for your. Cassandra isn't a database for Exploratory queries. Cassandra is a database for known queries. On read level Cassandra is optimized for sequentiell reads. Cassandra can only query data sequentially. It's also possible to ignore this but it's not recommended. Huge amount of data could be, with the wrong data model, a problem in Cassandra. Maybe a hadoop based database system is a better option for your.
Time queries like "date <= 01.01.2011" or "time >= 11:00 and time <= 14:00"
Cassandra is really good for time series data.
"normal" queries for attributes like "field <= value"
If you know the queries before you modeling you database, Cassandra is also a good choice.
a combination of all of the three query types (something like "query all data that where location is within bbox and on date 01.01.2011 and time <= 14:00 and field_x <= 100")
Cassandra could be a good solution. Why could? As i said: You have to know this queries before you create your tables. If you know that you will have thousands of queries where you need a time range and the location (city, country, content etc.) it is a good solution for your.
time queries on a large scale with decent performance.
Cassandra will have the best performance in this use case. The data are already in the needed order. MonoDB is a nice replacement for MySQL use cases. If you need a better scale, but scaling mongodb is not so simple as in Cassandra, and flexibly and you care about the consistency. Cassandra has eventual consistency is scalable and performance is really important. MongoDB has also relations, Cassandra not. In Cassandra is everything denormalized because performance cares.
No matter what stage you're at in your education or career, TuteeHub will help you reach the next level that you're aiming for. Simply,Choose a subject/topic and get started in self-paced practice sessions to improve your knowledge and scores.
General Tech 10 Answers
General Tech 7 Answers
General Tech 3 Answers
General Tech 9 Answers
General Tech 2 Answers
Ready to take your education and career to the next level? Register today and join our growing community of learners and professionals.