Andrew Cottage
20,718 Points
I'm stuck on how to integrate a Nokogiri scrape into my Rails application.
Here is my issue.
I wrote a nokogiri scrape in ruby that simply scrapes and puts the results on the screen.
require 'nokogiri'
require 'open-uri'
require 'rubygems'
url = 'http://disneyauditions.com/audition-calendar/'
data = Nokogiri::HTML(open(url))
auditions = data.css('.audtion')
auditions.each do |audition|
  puts audition.css('.name').text
  puts audition.css('.businessunit').text
  puts audition.css('.location').text
  puts audition.css('.venue').text
  puts audition.css('.talent_type').text
  puts audition.css('.start_date').text
  puts audition.css('.start_time').text
  puts audition.css('.time_zone').text
end
I was able to implement it and display it by putting half the code in my controller, and the other half in the view.
Controller
def list
  require 'nokogiri'
  require 'open-uri'
  require 'rubygems'
  url = 'http://disneyauditions.com/audition-calendar/'
  data = Nokogiri::HTML(open(url))
  @auditions = data.css('.audtion')
end
View
<div class="col-md-8 col-md-offset-2">
  <table class="table table-striped">
    <thead>
      <tr>
        <th>Resort</th>
        <th>Type</th>
        <th>Venue</th>
        <th>Location</th>
        <th>Date</th>
        <th>Time</th>
        <th>Zone</th>
      </tr>
    </thead>
    <tbody>
      <% @auditions.each do |a| %>
        <% unit = a.css('.businessunit').text %>
        <% location = a.css('.location').text %>
        <% venue = a.css('.venue').text %>
        <% type = a.css('.talent_type').text %>
        <% date = a.css('.start_date').text %>
        <% time = a.css('.start_time').text %>
        <% time_zone = a.css('.time_zone').text %>
        <tr>
          <td><%= unit %></td>
          <td><%= type %></td>
          <td><%= venue %></td>
          <td><%= location %></td>
          <td><%= date %></td>
          <td><%= time %></td>
          <td><%= time_zone %></td>
        </tr>
      <% end %>
    </tbody>
  </table>
</div>
I'm truly stuck on this next part.
I would like to scrape the data and store it in the database, then have my view display it by simply reading from the database.
As my site is set up now, every time I visit the page it scrapes the website.
Questions:
Where should I put the code in my Rails app to scrape the website?
How do I edit my code to insert scraped data into the database?
How do I schedule my scrape to run at a certain time daily?
How do I display my scraped data in my view once it's in the database?
Thanks for all of your help in advance. I have spent hours searching the web trying to figure this out.
I'm feeling so defeated ARGH!
6 Answers

Nick Fuller
9,027 Points
Wooooah there Andrew!
First of all, this is fun stuff you're doing, and that's quite a loaded post you have here. You have four primary questions, and they aren't small.
a. Where should I put the code in my rails app to scrape the website?
First, I suggest looking at this book. I have no affiliation with the author or the publisher, but it's simply an amazing book that will really help a person of your skill set reach the next level. http://www.poodr.com/
Next, this sounds like business logic, right? What is a model? It's an object that holds business logic! Rails usually treats a model as an object that talks to your database, which is true, but you can create models that don't deal with your database at all!
For instance, what if you created a file called disney_scraper.rb and put it in your models directory? (FYI, I haven't tested this; I'm just trying to demonstrate.)
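require 'nokogiri'
require 'open-uri'

# Fetches a page once and exposes CSS lookups against it.
class DisneyScraper
  attr_reader :url

  def initialize(url)
    @url = url
  end

  # `class` is a reserved word in Ruby, so the parameter is named `selector`.
  def get_class_items(selector)
    data.css(selector)
  end

  # Memoized so the page is downloaded only once per scraper instance.
  # URI.open needs Ruby 2.5+; older versions used plain open(url).
  def data
    @data ||= Nokogiri::HTML(URI.open(url))
  end
end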
With this, in your controller you can now do something like:
def list
  @disney_scrape = DisneyScraper.new('http://disneyauditions.com/audition-calendar/')
  @auditions = @disney_scrape.get_class_items('.audition')
end
b. How do I edit my code to insert scraped data into the database?
This still doesn't save them to the database, but you can create a Rails model called, say, DisneyAuditions, and then have your DisneyScraper object work with that model to save your values. I'm kind of just spouting ideas here, because you're getting into some fun design concepts, and with Ruby there are lots of ways to do things!
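As a rough sketch of that idea (not tested inside a Rails app): the helper below turns one scraped node into a hash of attributes and hands it to a model. The column names, the `FIELDS` mapping, and both method names are assumptions made up for illustration. `store` stands for an ActiveRecord model class such as the DisneyAuditions model mentioned above, and `find_or_create_by` is the standard ActiveRecord call that avoids inserting the same row twice.

```ruby
# Hypothetical column names mapped to the CSS selectors from the original scrape.
FIELDS = {
  name: '.name',
  business_unit: '.businessunit',
  location: '.location',
  venue: '.venue',
  talent_type: '.talent_type',
  start_date: '.start_date',
  start_time: '.start_time',
  time_zone: '.time_zone'
}.freeze

# Builds an attributes hash from anything that responds to css(selector).text,
# such as a Nokogiri element.
def audition_attributes(node)
  FIELDS.transform_values { |selector| node.css(selector).text.strip }
end

# Saves each node through the given model class (assumed to be an
# ActiveRecord model with the columns listed in FIELDS).
def import_auditions(nodes, store)
  nodes.map { |node| store.find_or_create_by(audition_attributes(node)) }
end
```

With the scraper above, a call might look like `import_auditions(DisneyScraper.new(url).get_class_items('.audition'), DisneyAuditions)`.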
c. How do I schedule my scrape to run at a certain time daily?
This is also fun! Check out Delayed Job!
https://github.com/collectiveidea/delayed_job
and from the immortal Ryan Bates:
http://railscasts.com/episodes/171-delayed-job-revised
d. How do I display my scraped data in my view once it's in the database?
If you end up creating an ActiveRecord model to save the data into your database, you can use that same model to pull the data back out, just like any other Rails model :)
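To make the difference concrete, here is a small sketch of a view that reads attributes from records instead of calling `.css` on scraped nodes. It uses plain ERB and a `Struct` as a stand-in for the database-backed model so that it runs outside Rails; the attribute names, `COLUMNS`, and `render_rows` are hypothetical. In a real app the controller would load the records from the model and the markup would live in the view file.

```ruby
require 'erb'

# Hypothetical columns, in the same order as the table headers in the view.
COLUMNS = %i[business_unit talent_type venue location
             start_date start_time time_zone].freeze

# Stand-in for a database-backed record; an ActiveRecord model also
# supports record[:column] lookups.
Audition = Struct.new(*COLUMNS, keyword_init: true)

# One <tr> per record, one <td> per column. Plain ERB does not escape
# output the way Rails views do, so values are escaped explicitly.
ROWS_TEMPLATE = ERB.new(<<~HTML, trim_mode: '-')
  <tbody>
  <%- auditions.each do |a| -%>
    <tr>
    <%- COLUMNS.each do |column| -%>
      <td><%= ERB::Util.html_escape(a[column]) %></td>
    <%- end -%>
    </tr>
  <%- end -%>
  </tbody>
HTML

def render_rows(auditions)
  ROWS_TEMPLATE.result_with_hash(auditions: auditions)
end
```

The view no longer knows anything about Nokogiri or the source page; it only sees records.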
I hope this helps! It's a big project man, keep at it, this sounds fun!

Andrew Cottage
20,718 Points
Thanks Nick Fuller. I haven't had time to implement any of your suggestions yet, but I will be trying it this coming week. It surely is fun and will be a massive undertaking to get it up to the state that I want it to be. I appreciate your help and will update this ticket when I reach a solution.

Andrew Cottage
20,718 Points
After trying Nick Fuller's suggestions I still have not been able to figure this out. I think that my issue stems from not having a fundamental understanding of how Rails relates models, views, and controllers.
I tried the following:
To start, I currently have:
Controller:
class AuditionsController < ApplicationController
  def list
    @disney_audition_scrape = Audition.new('http://disneyauditions.com/audition-calendar/')
    @disney_auditions = @disney_audition_scrape.get_class_items('.audition')
  end
end
Model:
class Audition < ActiveRecord::Base
  require 'nokogiri'
  require 'open-uri'
  require 'rubygems'
  attr_reader :url, :data, :selector
  def initialize(url)
    @url = url
  end
  def get_class_items(selector)
    data.css(selector)
  end
  def data
    @data = Nokogiri::HTML(open(url))
  end
end
View:
<div class="col-md-8 col-md-offset-2">
<table class="table table-striped">
<thead>
<tr>
<th>Resort</th>
<th>Type</th>
<th>Venue</th>
<th>Location</th>
<th>Date</th>
<th>Time</th>
<th>Zone</th>
</tr>
</thead>
<% @disney_auditons.each do |a| %>
<% unit = a.css('.businessunit').text %>
<% location = a.css('.location').text %>
<% venue = a.css('.venue').text %>
<% type = a.css('.talent_type').text %>
<% date = a.css('.start_date').text %>
<% time = a.css('.start_time').text %>
<% time_zone = a.css('.time_zone').text %>
<tbody>
<tr>
<td><%= unit %></td>
<td><%= type %></td>
<td><%= venue %></td>
<td><%= location %></td>
<td><%= date %></td>
<td><%= time %></td>
<td><%= time_zone %></td>
</tr>
<% end %>
</tbody>
</table>
</div>
When I try to load the page I get the following:
Showing /home/andrew/Projects/vsrb/app/views/auditions/list.html.erb where line #15 raised:
undefined method `each' for nil:NilClass
Extracted source (around line #15):
12 <th>Zone</th>
13 </tr>
14 </thead>
15 <% @disney_auditons.each do |a| %>
16 <% unit = a.css('.businessunit').text %>
17 <% location = a.css('.location').text %>
18 <% venue = a.css('.venue').text %>
Rails.root: /home/andrew/Projects/vsrb
app/views/auditions/list.html.erb:15:in `_app_views_auditions_list_html_erb__3705441443227933535_12536740'
How does a particular controller know what model to talk to? Does it just talk to them all?
Do Models hold business logic and control communication with the database?
If I create a class inside a model, from which controller can I access that?
Again any and all help is appreciated!

Nick Fuller
9,027 Points
Do you have this on GitHub?

Doug Tucker
7,437 Points
Are you still working on this project? I would be interested in learning about what you've done if you have made any progress.

Andrew Cottage
20,718 Points
Hello, sorry I never responded to this post. Yes, I figured out how to get it integrated, thanks Nick Fuller. And yes, the project is on GitHub. It's actually part of the back end that I built for my girlfriend's actress site, victoriaspringer.com
github.com/lambbear/vsrb
I basically just put the logic for the scrape into a rake task, then I call that rake task with Heroku Scheduler and have it run every 15 minutes or so.
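For anyone landing here later, the general shape of such a task might look like the sketch below. This is not the code from the linked repository: the `auditions:scrape` task name and the `AuditionImporter` module are hypothetical placeholders for wherever the scrape-and-save logic lives. In a Rails app the file would typically sit in lib/tasks, and the task would depend on `:environment` so the models are loaded.

```ruby
require 'rake'

# Hypothetical stand-in for the code that fetches the page, parses it,
# and saves the results to the database.
module AuditionImporter
  def self.run
    # fetch, parse, and save would go here
    :done
  end
end

namespace :auditions do
  desc 'Scrape the audition calendar and store the results'
  # In a Rails app this line would read: task scrape: :environment do
  task :scrape do
    AuditionImporter.run
  end
end
```

A scheduler such as Heroku Scheduler would then be configured to run `rake auditions:scrape` at the chosen interval, so page views only ever read from the database.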