What am I doing wrong?

Question

I'm very new to web scraping, and I'm trying to read some information off a website. It seems like the URL doesn't work.
```python 
from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("https://www.battlemetrics.com/servers/ark/2339725")
soup = BeautifulSoup(html.read(), "html.parser")
for th in soup.find_all('th'):
    print(th)
```

Brendan Whiting · Accepted Answer

This is the error I get:

HTTP Error 403: Forbidden

Basically, the problem is that the site can tell the request is coming from a bot rather than a browser, and a lot of people don't want their sites to be scraped :P.
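One likely giveaway is the User-Agent header. As a small sketch (no network request is made), you can check what `urllib` announces itself as by default; the exact version suffix depends on your Python install:

```python
from urllib.request import build_opener

# build_opener() creates the same kind of opener that urlopen() falls back on.
opener = build_opener()

# addheaders lists the headers this opener attaches to every request.
default_headers = dict(opener.addheaders)
print(default_headers["User-agent"])  # begins with "Python-urllib/"
```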
I found this Stack Overflow question (https://stackoverflow.com/questions/16627227/http-error-403-in-python-3-web-scraping) with a solution, modified your code by adding the headers it describes, and it seems to work. It's sort of like the bot disguises itself with a header that says "I'm a Mozilla browser!" and the site says "OK, I believe you".
```python
from urllib.request import urlopen, Request
from bs4 import BeautifulSoup

# Send a browser-like User-Agent instead of urllib's default one
req = Request('https://www.battlemetrics.com/servers/ark/2339725', headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()
soup = BeautifulSoup(webpage, "html.parser")

# Print every table header cell on the page
for th in soup.find_all('th'):
    print(th)
```
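If you'd rather have the script report a block like this instead of crashing with a traceback, one option is to catch `HTTPError`. This is only a rough sketch: `build_request` and `fetch_html` are helper names made up for illustration, and there's no guarantee a given site will keep accepting this header.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def build_request(url, user_agent="Mozilla/5.0"):
    # Hypothetical helper: attach a browser-like User-Agent to the request.
    return Request(url, headers={"User-Agent": user_agent})


def fetch_html(url):
    # Hypothetical helper: return the page bytes, or None if the server
    # answers with an HTTP error status such as 403.
    try:
        with urlopen(build_request(url)) as response:
            return response.read()
    except HTTPError as err:
        print(f"Request failed: HTTP {err.code} {err.reason}")
        return None
```

You would then pass the result of `fetch_html(...)` to `BeautifulSoup` only when it isn't `None`.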

Nathan English
