storm-crawler - Technology stack and Apache Nutch


Posted on 16 Aug 2022 in Technology & Software (General Tech).


Answers (1)

manpreet (Best Answer), 2 years ago

I want to crawl a particular forum in near real time and dump the data into HDFS, if not HBase.

I heard that Apache Nutch could serve this purpose, but the technology stack it requires is quite old. I don't want to downgrade Hadoop from 2.6 to an earlier version, or Elasticsearch to 1.7/1.4, so I shifted my focus to storm-crawler.

Since I am using Hadoop 2.6, Elasticsearch 2.0 and HBase 1.1.3, can anyone tell me whether storm-crawler 0.9 can be used along with them?
