
Mass backlog again


#1

None of my scrapers are currently running, and a long queue is building up.

Looking at the currently running queue, some started over 20 hours ago.

(Not sure if it’s connected, but https://morph.io/jflavan/pitchfork_review_data, which is one of the currently running scrapers, took a very long time to load (almost 2 minutes) and looks like it has a lot of console output lines.)


#2

Thanks for posting @tmtmtmtm, sorry about the delay responding - it was a long weekend here. I’m glad to see that it looks like things are running OK again now.

Your hunch was almost certainly right - there were lots of CPU and memory alerts from New Relic over the last few days. When someone creates a scraper with a crazy number of log lines, the Sidekiq worker that pulls those out of the container can use heaps of memory and CPU: https://github.com/openaustralia/morph/issues/919
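To illustrate the general idea of keeping memory bounded when a scraper is very noisy, here is a minimal Python sketch that reads log lines lazily and stops after a cap. This is not morph's actual code (the worker is a Sidekiq job, and the real fix is whatever issue #919 ends up with); `capped_lines` and `MAX_LINES` are hypothetical names made up for this example.

```python
from itertools import islice
from typing import Iterable, Iterator

MAX_LINES = 10_000  # hypothetical cap, not a real morph.io setting


def capped_lines(stream: Iterable[str], max_lines: int = MAX_LINES) -> Iterator[str]:
    """Yield at most max_lines lines from stream, then one truncation notice."""
    iterator = iter(stream)
    # islice pulls lines lazily, so the full log is never held in memory.
    yield from islice(iterator, max_lines)
    # Peek at one more line to see whether anything was cut off,
    # without reading the rest of the stream.
    if next(iterator, None) is not None:
        yield f"[output truncated after {max_lines} lines]"


if __name__ == "__main__":
    # Simulate a scraper that prints a million lines.
    noisy = (f"line {i}" for i in range(1_000_000))
    for line in capped_lines(noisy, max_lines=3):
        print(line)
```

Because the stream is consumed one line at a time and abandoned after the cap, the cost stays roughly proportional to `max_lines` rather than to however much the scraper prints.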


#3

Everything is running super super slow again today.

And when I examine the queue, 4 of the top 5 scrapers are again the Pitchfork review ones (https://morph.io/atoumey/pitchfork_review_data, https://morph.io/mattbrehmer/pitchfork-full-text-album-reviews, https://morph.io/jflavan/pitchfork_review_data, https://morph.io/benfb/pitchfork_review_data), which even trying to look at takes forever and almost crashes my laptop.


#4

What a pain. I’ve just spent the day fixing https://github.com/openaustralia/morph/issues/984 and then https://github.com/openaustralia/morph/issues/985. I’m afraid things might be a bit slow again for the next day or so as buildpacks rebuild after the fix for #985.

The root cause of that is almost certainly an insane number of log lines. The fix is https://github.com/openaustralia/morph/issues/919 - when that’s done a lot of our problems will go away.