Just recently started playing around with Morph again as QuickCode is shutting down. I’ve had a scraper working for quite a while on QuickCode (Scraperwiki) scraping course registration data. It’s working fine in QuickCode/Scraperwiki but on Morph.io I get a 502 Bad Gateway Error:
Traceback (most recent call last):
  File "scraper.py", line 47, in <module>
    response = br.open(url, timeout=60)
  File "/app/.heroku/python/lib/python2.7/site-packages/mechanize/_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "/app/.heroku/python/lib/python2.7/site-packages/mechanize/_mechanize.py", line 255, in _mech_open
mechanize._response.httperror_seek_wrapper: HTTP Error 502: Bad Gateway
On Twitter, someone from Morph.io said it looked like a bad certificate. And, indeed, when I go to this site in Chrome I get a warning that says the certificate is invalid, so that sounds right.
But I don’t really care about the security vulnerability here. I just want to scrape the data. Is there some way I could just get mechanize to ignore the security certificate and plow on anyway?
At the time, the answer was “Not easily; we funnel web traffic through a proxy which validates the certificates and you can’t override that”.
However, because of this issue (and a few other things that came up at the same time) we’ve disabled that proxy, so the answer now is “Yes, it should be fairly easy”. I think the opt-out described at https://www.python.org/dev/peps/pep-0476/#opting-out is still relevant, but I haven’t tested this.
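For reference, here is a rough sketch of the process-wide opt-out that PEP 476 describes. It is untested against this scraper, and whether the installed mechanize version honours it is an assumption you would need to check. The helper name disable_https_verification is made up for illustration; the ssl attributes are the ones named in the PEP.

```python
import ssl


def disable_https_verification():
    """Apply the PEP 476 process-wide opt-out.

    Returns True if the default HTTPS context was replaced, False if this
    Python has no ssl._create_unverified_context to swap in.
    """
    try:
        unverified = ssl._create_unverified_context
    except AttributeError:
        # Pythons that predate PEP 476 lack this attribute and do not
        # verify HTTPS certificates by default, so there is nothing to patch.
        return False
    # From here on, code that builds its HTTPS context through this hook
    # skips certificate and hostname checks for the whole process.
    ssl._create_default_https_context = unverified
    return True


disable_https_verification()

# The scraper would then open pages as before, for example:
#   br = mechanize.Browser()
#   response = br.open(url, timeout=60)
# (assumption: mechanize picks up the patched default context)
```

Call this once near the top of scraper.py, before the first br.open(). It turns verification off for every HTTPS connection that goes through the default context, not just the one site with the bad certificate, so it is only sensible for a scraper where you really don't care about that.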