Mass backlog again

None of my scrapers are currently running, and a long queue is building up.

Looking at the currently running queue, some started over 20 hours ago.

(Not sure if it’s connected, but https://morph.io/jflavan/pitchfork_review_data, which is one of the currently running scrapers, took a very long time to load (almost 2 minutes), and it looks like it has a lot of console output lines.)

Thanks for posting @tmtmtmtm, sorry about the delay responding - it was a long weekend here. I’m glad to see that it looks like things are running OK again now.

Your hunch was almost certainly right - there were lots of CPU and memory alerts from New Relic over the last few days. When someone creates a scraper with a crazy amount of log lines, the Sidekiq worker that pulls those out of the container can use heaps of memory and CPU: https://github.com/openaustralia/morph/issues/919

Everything is running super super slow again today.

And when I examine the queue, 4 of the top 5 scrapers are again the Pitchfork review ones (https://morph.io/atoumey/pitchfork_review_data, https://morph.io/mattbrehmer/pitchfork-full-text-album-reviews, https://morph.io/jflavan/pitchfork_review_data, https://morph.io/benfb/pitchfork_review_data). Even trying to look at them takes forever and almost crashes my laptop.

What a pain. I’ve just spent the day fixing "Every scraper started failing with a buildpack error" (openaustralia/morph issue #984) and then "Python scrapers without requirements.txt are failing due to missing scraperwiki module" (issue #985). I’m afraid things might be a bit slow again for the next day or so as buildpacks rebuild after the fix for #985.

The root cause of that is almost certainly an insane number of log lines. The fix is "Consider limiting the number of lines of output your scraper can write" (https://github.com/openaustralia/morph/issues/919). When that’s done a lot of our problems will go away.
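
For anyone wondering what a limit like that could look like, here is a rough sketch in Python. It is not morph's actual code and not necessarily how #919 will be implemented. The names `cap_lines` and `max_lines` are made up for illustration. The idea is to store only the first few lines of output, keep reading the rest so the scraper never blocks on a full pipe, and record how many lines were dropped.

```python
from typing import Iterable, Iterator


def cap_lines(lines: Iterable[str], max_lines: int) -> Iterator[str]:
    """Yield at most max_lines lines, then one notice saying how many were dropped.

    Hypothetical helper for illustration only; not part of morph.
    """
    dropped = 0
    for index, line in enumerate(lines):
        if index < max_lines:
            yield line
        else:
            # Keep draining the source so the producer is never blocked,
            # but do not store or forward the line.
            dropped += 1
    if dropped:
        yield f"[output truncated: {dropped} more lines not stored]"


# A stand-in for a very chatty scraper: 10,000 lines of console output.
noisy = (f"review {n}" for n in range(10_000))
kept = list(cap_lines(noisy, max_lines=3))
print(kept)
```

Memory use stays flat no matter how much the scraper prints, because only the kept lines and a counter are held. A scraper author can apply the same idea on their own side by printing a progress line every few hundred records instead of one per record.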