Posted on 16 Aug 2022 under Technology & Software (General Tech).
I was wondering if you could tell me which NoSQL database or technology/tools I should use for my scenario. We are looking at replacing our OLAP cubes, which are based on SQL Server Analysis Services, with an open source technology because the data is getting too large to manage and queries are taking too long to return. We have followed every rule in the book: sharding the data and optimizing the cube design with aggregations, partitions, etc. Still, some of our distinct count queries take 1-2 minutes :( The fact table is roughly 250 GB, and there are 10-12 dimensions connected to it in a star schema.
So we decided to give open source technologies like Hadoop/HBase/NoSQL dbs a try to see if they can solve our OLAP scenarios with minimal setup and onboarding.
Our main requirements for the new technology are:
It has to return blazing fast or instantaneous results for distinct count queries (< 2 secs).
It has to support the concept of measures and dimensions (like in OLAP).
As there are so many new technologies and tools in the open source world today, I was hoping you could point me in the right direction.
Note: I'm from the Apache Kylin team.
Please see the inline answers below, which may give you some ideas:
It has to get blazing fast or instantaneous results for distinct count queries ( < 2 secs)
--Luke: Our current statistics show a 90th percentile query latency of less than 5s. For < 2s on distinct count, how much data will you have? Is an approximate result OK?
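To make the exact-versus-approximate question concrete, here is a minimal sketch of one generic approximate distinct-count technique, a K-minimum-values (KMV) sketch. It is only an illustration of the trade-off and is not Kylin's implementation; the class name `KMVSketch`, the parameter `k`, and the sample user ids are all made up for this example. The idea: hash every value to a number in [0, 1), keep only the `k` smallest hashes, and estimate the distinct count from how small the k-th hash is.

```python
import hashlib
import heapq


class KMVSketch:
    """Hypothetical K-minimum-values sketch for approximate distinct counts."""

    def __init__(self, k=1024):
        self.k = k
        self._heap = []        # negated hashes, so the largest kept hash is on top
        self._members = set()  # hashes currently kept, used to skip duplicates

    @staticmethod
    def _hash01(value):
        # Deterministic 64-bit hash mapped into [0, 1).
        digest = hashlib.blake2b(str(value).encode("utf-8"), digest_size=8).digest()
        return int.from_bytes(digest, "big") / 2**64

    def add(self, value):
        h = self._hash01(value)
        if h in self._members:
            return  # same value seen before
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the largest kept one: swap them.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        if len(self._heap) < self.k:
            return float(len(self._heap))  # fewer than k distinct values: exact
        kth_smallest = -self._heap[0]
        return (self.k - 1) / kth_smallest


sketch = KMVSketch(k=1024)
for i in range(50000):
    sketch.add(f"user-{i}")
    sketch.add(f"user-{i}")  # duplicates do not change the estimate

print(round(sketch.estimate()))  # close to 50000, but not exact
```

The sketch stores at most `k` numbers no matter how many rows it sees, and the relative error shrinks roughly as 1/sqrt(k). That is why an approximate answer can come back much faster than an exact one on a large fact table.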
Supports the concept of measures and dimensions (like in OLAP).
--Luke: Kylin is a pure OLAP engine with dimension (hierarchies are also supported) and measure (Sum/Count/Min/Max/Avg/DistinctCount) definitions.
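For readers less familiar with the terms, the following is a small sketch of what "measures grouped by dimensions" means, using the six measure types listed above. It is plain Python over an in-memory list and says nothing about how Kylin stores or computes anything; the function `aggregate`, the `SALES` rows, and the column names are hypothetical.

```python
from collections import defaultdict


def aggregate(rows, dimensions, measure_column, distinct_column):
    """Group rows by the given dimension columns and compute the measures."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        groups[key].append(row)

    result = {}
    for key, members in groups.items():
        values = [r[measure_column] for r in members]
        result[key] = {
            "sum": sum(values),
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
            "distinct_count": len({r[distinct_column] for r in members}),
        }
    return result


# Hypothetical fact rows: two dimensions (region, year), one numeric measure.
SALES = [
    {"region": "EU", "year": 2021, "user_id": "u1", "amount": 10.0},
    {"region": "EU", "year": 2021, "user_id": "u1", "amount": 30.0},
    {"region": "EU", "year": 2022, "user_id": "u2", "amount": 5.0},
    {"region": "US", "year": 2022, "user_id": "u3", "amount": 20.0},
    {"region": "US", "year": 2022, "user_id": "u1", "amount": 40.0},
]

by_region = aggregate(SALES, ["region"], "amount", "user_id")
overall = aggregate(SALES, [], "amount", "user_id")

print(by_region[("EU",)]["distinct_count"])  # 2
print(by_region[("US",)]["distinct_count"])  # 2
print(overall[()]["distinct_count"])         # 3, not 2 + 2
```

The last line shows why distinct count is harder than the other measures: sums and counts per region can be added up to get the total, but distinct counts cannot, because user `u1` appears in both regions.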
Support SQL like query language as many of our developers are SQL experts.
--Luke: Kylin supports an ANSI SQL interface (most SELECT functions).
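As a rough idea of the kind of SELECT statement being discussed, here is a distinct count over a tiny star schema. To keep it runnable, the example uses Python's built-in `sqlite3` module as a stand-in SQL engine; it is not run against Kylin, and the tables `fact_sales` and `dim_region` and their columns are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region_name TEXT);
CREATE TABLE fact_sales (region_id INTEGER, user_id TEXT, amount REAL);
INSERT INTO dim_region VALUES (1, 'EU'), (2, 'US');
INSERT INTO fact_sales VALUES
    (1, 'u1', 10.0), (1, 'u1', 30.0), (1, 'u2', 5.0),
    (2, 'u3', 20.0), (2, 'u1', 40.0);
""")

# Join the fact table to one dimension, then compute a sum and a distinct count.
QUERY = """
SELECT d.region_name,
       SUM(f.amount)             AS total_amount,
       COUNT(DISTINCT f.user_id) AS distinct_users
FROM fact_sales AS f
JOIN dim_region AS d ON f.region_id = d.region_id
GROUP BY d.region_name
ORDER BY d.region_name
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
# ('EU', 45.0, 2)
# ('US', 60.0, 2)
```

A query of this shape (fact table joined to dimension tables, grouped by dimension attributes) is what SQL-literate developers would typically write against a star schema, whichever engine ends up answering it.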
Ability to connect Excel/Tableau to visualize the data.
--Luke: Kylin has an ODBC driver that works very well with Tableau; Excel/PowerBI support is coming soon.
Please let us know if you have more questions.
Thanks.