Which technology is best for web scraping?



Posted on 16 Aug 2022, this text provides information on Technology & Software related to General Tech. Please note that while accuracy is prioritized, the data presented might not be entirely correct or up-to-date. This information is offered for general knowledge and informational purposes only, and should not be considered as a substitute for professional advice.


Answers (2)

manpreet (Best Answer) 2 years ago


Mates, I need to know which programming technology is best for web scraping from dynamic sites such as Google Search, Bing Search, and social media sites. I hope you get my point.

I want something that is highly scalable and uses few resources.

It should also have a large developer community.

I am looking for a modern language that pairs well with a database. I was thinking of MySQL with InnoDB, since we need to store the scraped data and present it.

We have been using PHP with MySQL, which is slow at scraping.

Please let me know. Thanks.

Regards

manpreet 2 years ago


Look for an API for the particular scraping you want (such as rankings for keywords).

Then use an appropriate language to decode what the API gives you. If it gives you JSON or CSV, then Perl and PHP are excellent. Use the programming language to massage the data, then build a bulk INSERT or a CSV file (for LOAD DATA) and insert the stuff into an InnoDB table.
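To make the decode, massage, and bulk-load steps concrete, here is a minimal sketch. It is written in Python for illustration rather than the Perl or PHP mentioned above, and it assumes a hypothetical API that returns keyword rankings as JSON. The payload, the `rankings` table name, and the column names are all made up for the example. The `%s` placeholders follow the style used by the common Python MySQL drivers; no database connection is opened here.

```python
import csv
import io
import json

# Hypothetical API payload: keyword rankings returned as JSON.
SAMPLE_RESPONSE = """
[
  {"keyword": " Web Scraping ", "rank": "3", "url": "https://example.com/a"},
  {"keyword": "InnoDB bulk insert", "rank": 7, "url": "https://example.com/b"}
]
"""

# Hypothetical column layout of the target InnoDB table.
COLUMNS = ("keyword", "rank", "url")


def decode_rows(payload):
    """Decode the JSON payload and massage it into tuples in column order."""
    records = json.loads(payload)
    return [
        (rec["keyword"].strip().lower(), int(rec["rank"]), rec["url"])
        for rec in records
    ]


def build_bulk_insert(table, rows):
    """Return one multi-row INSERT with %s placeholders plus flat parameters.

    The table name is interpolated directly, so it must be a trusted constant.
    """
    row_placeholder = "(" + ", ".join(["%s"] * len(COLUMNS)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table,
        ", ".join("`{}`".format(c) for c in COLUMNS),
        ", ".join([row_placeholder] * len(rows)),
    )
    params = [value for row in rows for value in row]
    return sql, params


def build_csv(rows):
    """Return CSV text that could be written to a file for LOAD DATA."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = decode_rows(SAMPLE_RESPONSE)
sql, params = build_bulk_insert("rankings", rows)
csv_text = build_csv(rows)
```

The `sql` and `params` pair would be passed to a driver's `execute` call, while `csv_text` is the alternative route: write it to a file and load it with LOAD DATA. Sending many rows per statement is usually what makes the insert side fast.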

If you cannot find a suitable API, but you can find suitable web pages, then Perl may be the best for parsing. Look in CPAN for a suitable library to help you; there will be several (some better than others).
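For the page-parsing route, the idea looks roughly like this. Again this is a Python sketch rather than Perl with a CPAN module, using only the standard library's `html.parser`. The sample HTML and the `LinkExtractor` class are invented for the example. Note that it only sees the HTML as delivered; it does not run any JavaScript on the page.

```python
from html.parser import HTMLParser

# Made-up page content standing in for a fetched results page.
SAMPLE_HTML = """
<html><body>
  <div class="result"><a href="https://example.com/a">First result</a></div>
  <div class="result"><a href="https://example.com/b">Second <b>result</b></a></div>
  <p>No link here</p>
</body></html>
"""


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


parser = LinkExtractor()
parser.feed(SAMPLE_HTML)
parser.close()
links = parser.links
```

The resulting `links` list holds plain tuples, so it can go straight into a bulk INSERT or a CSV file for the InnoDB table.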


