
Python > Scraping Data From the Web > A World Full of Spiders > Crawling Spiders

Jonathan Kuhl
26,133 Points

Not getting any data at all from spider

I've been following along with the video and made my spider with almost no changes to the code, but I'm getting no results from it. The crawl report says zero web pages were crawled, even though the URLs were copied and pasted straight from the pages Treehouse provided:

import scrapy

class HorseSpider(scrapy.Spider):
    name = 'ike'

    def start_request(self):
        urls = [
            'https://treehouse-projects.github.io/horse-land/index.html',
            'https://treehouse-projects.github.io/horse-land/mustang.html'
        ]
        return [scrapy.Request(url=url, callback=self.parse) for url in urls]

    def parse(self, response):
        url = response.url
        page = url.split('/')[-1]
        filename = 'horses-%s' % page
        print('URL: {}'.format(url))
        with open(filename, 'wb') as file:
            file.write(response.body)
        print('Saved as %s' % filename)

What am I missing?

2 Answers

That's just a convention. Yes, you can call it start_request or start_requests, but you do have to be consistent. Good catch on that detail, though, Trevor!

Actually, now that I've reached the end of the video and heard @kenalger say "we need start_requests and parse", I'm not so sure...
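
For what it's worth, my reading of the Scrapy docs is that the name isn't arbitrary: the engine looks up a method called exactly start_requests on the spider, and the base Spider class already provides a default one that builds requests from a start_urls list and sends each response to parse. So a minimal sketch like this (same two Treehouse URLs, no start_requests override at all) should also crawl both pages:

import scrapy

class HorseSpider(scrapy.Spider):
    name = 'ike'

    # The default start_requests() on scrapy.Spider builds a request for each
    # entry in start_urls and passes the response to self.parse.
    start_urls = [
        'https://treehouse-projects.github.io/horse-land/index.html',
        'https://treehouse-projects.github.io/horse-land/mustang.html',
    ]

    def parse(self, response):
        page = response.url.split('/')[-1]
        filename = 'horses-%s' % page
        with open(filename, 'wb') as file:
            file.write(response.body)
        print('Saved as %s' % filename)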

jessechapman
4,088 Points

I made the same mistake as the original poster (defining a "start_request" method). Changing it to start_requests fixed it for me.
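
For reference, here is roughly what the original spider looks like after the rename. Yielding the requests instead of returning a list also works and is the style the Scrapy docs tend to use; everything else is unchanged from the question:

import scrapy

class HorseSpider(scrapy.Spider):
    name = 'ike'

    def start_requests(self):  # plural: Scrapy's engine calls this exact name
        urls = [
            'https://treehouse-projects.github.io/horse-land/index.html',
            'https://treehouse-projects.github.io/horse-land/mustang.html',
        ]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        page = response.url.split('/')[-1]
        filename = 'horses-%s' % page
        with open(filename, 'wb') as file:
            file.write(response.body)
        print('Saved as %s' % filename)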