Python BeautifulSoup web scraping issue

Posted on 16 Aug 2022.

manpreet 2 years ago

import requests
from bs4 import BeautifulSoup

page = requests.get("http://www.freejobalert.com/upsc-recruitment/16960/#Engg-Services2019")
c = page.content
soup = BeautifulSoup(c, "html.parser")
data = soup.find_all("tr")
dict = {}  # not shown in the original post; presumably defined before the loop
for r in data:
    td = r.find_all("td", {"style": "text-align: center;"})
    for d in td:
        link = d.find_all("a")
        for li in link:
            span = li.find_all("span", {"style": "color: #008000;"})
            for s in span:
                strong = s.find_all("strong")
                for st in strong:
                    dict['title'] = st.text
        for l in link:
            dict["link"] = l['href']
    print(dict)

It prints:

{'title': 'Syllabus', 'link': 'http://www.upsc.gov.in/'}
{'title': 'Syllabus', 'link': 'http://www.upsc.gov.in/'}
{'title': 'Syllabus', 'link': 'http://www.upsc.gov.in/'}

I am expecting:

{'title': 'Apply Online', 'link': 'https://upsconline.nic.in/mainmenu2.php'}
{'title': 'Notification', 'link': 'http://www.freejobalert.com/wp-content/uploads/2018/09/Notification-UPSC-Engg-Services-Prelims-Exam-2019.pdf'}
{'title': 'Official Website ', 'link': 'http://www.upsc.gov.in/'}

I want all the "Important Links", meaning "Apply Online", "Notification", and "Official Website", together with the link for each, from every table. Instead, the title is always "Syllabus" and the links are repeated.

Please have a look at this.

manpreet 2 years ago

This may help you. Check the code below.

import requests
from bs4 import BeautifulSoup

page = requests.get('http://www.freejobalert.com/'
                    'upsc-recruitment/16960/#Engg-Services2019')
c = page.content
soup = BeautifulSoup(c, "html.parser")
rows = soup.find_all('tr')
entry = {}  # named `entry` to avoid shadowing the built-in dict
for row in rows:
    # The green span (color: #008000) holds the link title
    for title in row.find_all('span', attrs={'style': 'color: #008000;'}):
        entry['Title'] = title.text
    # Print the entry once for every link in the row that has an href
    for link in row.find_all('a', href=True):
        entry['Link'] = link['href']
        print(entry)
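
One thing that may explain the repeated output in the question: a single dictionary is shared across all rows, so a row with no matching span or link prints whatever was stored last. Below is a minimal sketch that builds a fresh dictionary for each link, so the title and href always come from the same `<a>` tag. `SAMPLE_HTML`, the `example.com` URLs, and the `extract_links` helper are all made up for illustration; the markup only imitates what the question describes, and the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page; the actual markup may differ.
SAMPLE_HTML = """
<table>
  <tr><td style="text-align: center;"><a href="https://example.com/apply">
    <span style="color: #008000;"><strong>Apply Online</strong></span></a></td></tr>
  <tr><td style="text-align: center;"><a href="https://example.com/notice.pdf">
    <span style="color: #008000;"><strong>Notification</strong></span></a></td></tr>
  <tr><td>No link in this row</td></tr>
  <tr><td style="text-align: center;"><a href="https://example.com/">
    <span style="color: #008000;"><strong>Official Website </strong></span></a></td></tr>
</table>
"""


def extract_links(html):
    """Return one {'title', 'link'} dict per <a> that wraps a <strong> title."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for row in soup.find_all("tr"):
        for a in row.find_all("a", href=True):
            strong = a.find("strong")
            if strong is None:
                continue  # skip links without a bold title
            # A new dict per link, so nothing leaks over from earlier rows.
            results.append({
                "title": strong.get_text(strip=True),
                "link": a["href"],
            })
    return results


links = extract_links(SAMPLE_HTML)
for entry in links:
    print(entry)
```

To try this against the live page, pass `requests.get(url).content` to `extract_links` instead of `SAMPLE_HTML`. The row and link filters will probably need adjusting once you inspect the actual HTML.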
