Scraping Javascript Heavy Sites - PhantomJS version [With Python and Selenium]

otherchirps · August 16, 2015, 3:58pm

Hiya,

Sounds like a hell of a corner case you’ve fallen on… I haven’t done a heap of stuff with python + phantom, but have dabbled a little.

The few that I’ve run on morph seemed to work ok with the system default phantomjs. Here’s one that’s using the python+selenium+phantomjs combo. Superficially, it doesn’t seem to be doing anything too different to your scraper. It’s not trying to call send_keys, as you are, but I wouldn’t think that should have this kind of effect.

I cloned your repo, and had a little try. Likewise, it seemed to work locally. I took your word that it wasn’t working up on morph. Instead of stepping through selenium stuff, I dropped in a library called Splinter, which is a slightly higher-level wrapper for selenium, in python. Firstly, I ran it without your custom phantom binaries, just to try the system default.

It appeared to run ok for me over here, but there’s no data saved yet (still just print statements, instead of saving the rows. I’m guessing that was next on the list?).

So… Sorry, no clues here on where that initial error you’re seeing is coming from, but maybe this is a side-step around the problem? If it works for you as well?