
Jason Law

urllib2 is not supported in Python 3. How can the import script be adjusted to do the same thing without urllib2?

I would like to download the data directly, but I am having an issue because I am using Python 3 instead of Python 2. Specifically, my code breaks because urlopen(ties) does not return JSON; it returns an http.client.HTTPResponse object. How can I adjust this to make it work with Python 3?

from urllib.request import urlopen
import json
import pandas as pd

my_api_key = "xxxxxxx"

url = "http://api.shopstyle.com/api/v2/"
ties = "{}products?pid={}&cat=mens-ties&limit=100".format(url, my_api_key)
jsonResponse = urlopen(ties)
print(type(jsonResponse))  # prints <class 'http.client.HTTPResponse'>
data = json.load(jsonResponse)  # fails here: the response body is raw bytes, not decoded JSON text
Jason Law

I figured it out. The read() and decode() methods were needed to turn the HTTP response into JSON text. This is the script that worked for me in Python 3.
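In isolation, the core of the fix is just a read-then-decode step, roughly like this (a minimal sketch; `ties` is the request URL built the same way as in the question):

from urllib.request import urlopen
import json

response = urlopen(ties)                       # http.client.HTTPResponse
raw_bytes = response.read()                    # body as bytes
data = json.loads(raw_bytes.decode('utf-8'))   # decode to str, then parse the JSON

The full script: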

from urllib.request import urlopen
import json
import pandas as pd
import math

my_api_key = "xxxxxxxxxxxxx"

url = "http://api.shopstyle.com/api/v2/"
ties = "{}products?pid={}&cat=mens-ties&limit=100".format(url, my_api_key)

data = json.loads(urlopen(ties).read().decode(encoding='UTF-8'))

total = data['metadata']['total']
limit = data['metadata']['limit']
offset = data['metadata']['offset']
pages = math.ceil(total / limit)

print("{} total, {} per page. {} pages to process".format(total, limit, pages))

# tmp = pd.DataFrame(data['products'])

# dictionary to hold one DataFrame per page of results
dfs = {}

# connect to the API again, page by page, and store each page's products in the dictionary
for page in range(pages):
    # advance the offset by the page size so pages neither overlap nor skip items
    allTies = "{}products?pid={}&cat=mens-ties&limit={}&offset={}&sort=popular".format(url, my_api_key, limit, page * limit)
    data = json.loads(urlopen(allTies).read().decode(encoding='UTF-8'))
    dfs[page] = pd.DataFrame(data['products'])

df = pd.concat(dfs, ignore_index=True)

df = df.drop_duplicates('id')
# strip the literal '$' and ',' from the price label so it can be converted to a number
df['priceLabel'] = df['priceLabel'].str.replace('$', '', regex=False).str.replace(',', '', regex=False)
df['priceLabel'] = df['priceLabel'].astype(float)


# pull the 'id' field out of a brand dict; fall back to 0 when the value is missing or not a dict
def breakId(x, y=0):
    try:
        y = x["id"]
    except (TypeError, KeyError):
        pass
    return y


# pull the 'name' field out of a brand dict; fall back to "" when the value is missing or not a dict
def breakName(x, y=""):
    try:
        y = x["name"]
    except (TypeError, KeyError):
        pass
    return y


df['brandId'] = df['brand'].map(breakId)
df['brandName'] = df['brand'].map(breakName)


# get the first color's canonical color name; fall back to "" when colors are missing or malformed
def breakCanC(x, y=""):
    try:
        y = x[0]["canonicalColors"][0]["name"]
    except (TypeError, KeyError, IndexError):
        pass
    return y


# get the first color's own name; fall back to "" when colors are missing or malformed
def breakColorName(x, y=""):
    try:
        y = x[0]["name"]
    except (TypeError, KeyError, IndexError):
        pass
    return y


# get the first color's canonical color id; fall back to "" when colors are missing or malformed
def breakColorId(x, y=""):
    try:
        y = x[0]["canonicalColors"][0]["id"]
    except (TypeError, KeyError, IndexError):
        pass
    return y


df['colorId'] = df['colors'].map(breakColorId)
df['colorFamily'] = df['colors'].map(breakCanC)
df['colorNamed'] = df['colors'].map(breakColorName)

# save the selected columns as a tab-separated file
df.to_csv("data.csv", sep='\t', encoding='utf-8',
          columns=['id', 'priceLabel', 'name', 'brandId', 'colorId', 'colorFamily', 'colorNamed'])
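
One thing to keep in mind when using the file later: since the script writes it with a tab separator, the same separator has to be passed when reading it back, for example:

import pandas as pd

# index_col=0 picks up the unnamed index column that to_csv writes by default
df = pd.read_csv("data.csv", sep='\t', encoding='utf-8', index_col=0)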