On my local machine it’s working okay, but on Morph.io it doesn’t seem to be finding the data injected by JavaScript on the website in question. Could this be because I’m using the most recent version of PhantomJS locally and Morph isn’t? (How do I get around this - do I put the executable in the repo and tell it to use that copy somehow? Which version would I need - Linux x86?)
The legend at the top is hard-coded and gives the class IDs for where the data is, but when Morph.io scrapes the page, the table cells containing the data - which are populated by JavaScript - appear to be nonexistent or empty.
It looks like it takes a little while for the calendar to load the first time, and I’m guessing that’s why it’s failing. You can wait for it to finish loading by applying a not-so-small dose of RSpec:
require 'rspec/expectations'
require 'capybara'
require 'capybara/poltergeist'
require 'capybara/rspec/matchers'
require 'scraperwiki'

class CalendarSearch
  include RSpec::Matchers
  include Capybara::RSpecMatchers

  @@url = "http://hansardpublic.parliament.sa.gov.au/#/search/1"

  def initialize
    @session = Capybara::Session.new(:poltergeist)
    # The page throws JS errors that would otherwise abort the scrape
    @session.driver.browser.js_errors = false
  end

  # Visits the page, blocks until the calendar widget has rendered
  # (waiting up to 10 seconds), then yields the ready session
  def ready
    @session.visit(@@url)
    warn 'waiting...'
    expect(@session).to(have_css('.scheduler .yearWrapper .k-widget', wait: 10))
    warn 'all set!'
    yield(@session)
  end
end

CalendarSearch.new.ready do |session|
  # do stuff with `session`
end
Or you can just sleep for a few seconds at the beginning and hope for the best.
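If you go that route, a slightly safer variant of a blind sleep is to poll: keep retrying until the content shows up or a deadline passes. Here’s a minimal pure-Ruby sketch - the `wait_until` name and its defaults are my own invention, not part of Capybara:

```ruby
# Polling helper: re-run the block until it returns something truthy
# or the timeout expires. Name and defaults are hypothetical.
def wait_until(timeout: 10, interval: 0.5)
  deadline = Time.now + timeout
  loop do
    result = yield
    return result if result
    raise "timed out after #{timeout}s" if Time.now >= deadline
    sleep interval
  end
end

# e.g. wait_until { session.all('.k-widget').any? && session }
```

Unlike a fixed sleep, this returns as soon as the condition holds, and fails loudly instead of silently scraping an empty page.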
As an aside, when I used apt to install PhantomJS on my Ubuntu 16.* VPS, it installed fine but errored when run - something about the repo version being compiled against Unity or another graphics library. I ended up using a version compiled from source.
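On the version question from the original post: Poltergeist lets you point at a specific PhantomJS binary via its `:phantomjs` driver option, so one approach is to commit a Linux x86-64 build into the repo and resolve its path at runtime. A hedged sketch - the `bin/` layout and the helper name are my assumptions, not a Morph.io convention:

```ruby
# Hypothetical helper: prefer a phantomjs binary vendored under bin/
# in the repo, falling back to whatever is on the PATH.
def phantomjs_path(repo_root = __dir__)
  vendored = File.join(repo_root, 'bin', 'phantomjs')
  File.executable?(vendored) ? vendored : 'phantomjs'
end

# Poltergeist accepts the binary location when the driver is registered:
#   Capybara.register_driver :poltergeist do |app|
#     Capybara::Poltergeist::Driver.new(app, phantomjs: phantomjs_path)
#   end
```

Make sure the committed binary has its executable bit set, since that survives a git checkout but is easy to lose when copying files around.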