My scraper is stuck, what should I do?

I have a scraper: https://morph.io/IvanGoncharov/ProgrammableWeb
It got stuck yesterday, and “Stop scraper” doesn’t work.
What should I do?

Thanks for opening this thread and sorry about this problem :frowning: We’ve got an unresolved issue where scrapers can be killed by a backend process and then get stuck.

I’ve manually unstuck your scraper :smile: If anyone else has this problem, just post a message here and we can look into it until it’s properly resolved.

Thanks.
Can I work around it on my side?
Why does the backend try to kill the scraper? Is it some kind of timeout?

Not currently, but you should be able to; we just need to develop a solution.

It’s not a timeout, it’s a bug in the underlying backend that we haven’t solved. From time to time, the memory of the background process blows out, possibly due to this, which either results in the queue being restarted (which has its own problems) or sometimes leaves scrapers orphaned like yours.

We need two fixes: one that means a scraper never appears “stuck” like this, and one that fixes the underlying backend issue so it doesn’t happen in the first place.
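
To make the first of those fixes a bit more concrete, here is a minimal sketch in Python. It is not morph.io's actual code, just an illustration of the idea: find runs that are still marked as running even though their backend process is gone, and put them back on the queue. The `Run` record, the `is_alive` callback and the ten-minute grace period are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, List


@dataclass
class Run:
    """Hypothetical record of one scraper run (not morph.io's real schema)."""
    scraper: str
    status: str  # "queued", "running" or "finished"
    last_heartbeat: datetime


def find_orphaned(
    runs: Iterable[Run],
    is_alive: Callable[[Run], bool],
    now: datetime,
    grace: timedelta = timedelta(minutes=10),
) -> List[Run]:
    """Return runs marked "running" whose backend process no longer exists.

    `is_alive` is an assumed callback that asks the backend whether the
    process for a run is still there. The grace period avoids flagging a
    run that has only just started or only just reported in.
    """
    return [
        run
        for run in runs
        if run.status == "running"
        and not is_alive(run)
        and now - run.last_heartbeat > grace
    ]


def requeue(orphans: Iterable[Run]) -> None:
    """Mark orphaned runs as queued so they stop appearing stuck."""
    for run in orphans:
        run.status = "queued"
```

A periodic job could call `find_orphaned` and then `requeue`, which is roughly what the manual fix in this thread does by hand. It would only hide the symptom; the memory problem in the backend would still need its own fix.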

Hello, 4 of my scrapers got stuck xD.

Can you please stop them?

Looks like we had quite a few stuck scrapers. I’ve re-added them to the queue so they should start finishing up.

Hi, one of my scrapers got stuck as well. It’s called appex_tagline, could you stop it for me please? Thanks so much!

Also, is there a way to avoid getting stuck again?

Hi @jennyj5, I can’t see the scraper any more so I assume you deleted it.

Ideally scrapers should never get stuck; it’s a bug in morph.io, not in your scraper.

Hi, I think my scraper is stuck… Could you check it and stop it for me please?

sp_DFT004_DFT_gov

@woodbine That’s fixed.

Hello henare,

can you fix my stuck scraper, please?

https://morph.io/arasabbasi/behesht_zahra_5

I am still missing about 2000 entries. So if you unstick it and I scrape the rest, I won’t need to come back and ask you to fix my scrapers ;)…

Thank you

Sorted. Luckily it’s been happening a lot less lately because of some other fixes we made.

I always meant to ask @arasabbasi, what’s the data you’re scraping and what are you using it for?

Hello henare,

thank you for your support.

I am scraping the databases of some Iranian graveyards. Searching in them is not so easy, e.g. I couldn’t find my great-great-grandmother and my great-grandfather, even though I knew how their names were spelled and where they are buried. Now I have the database of the Behesht Zahra (= Garden of Zahra) graveyard in Tehran.

I will try to scrape some other cemetery databases too and share them with others who have Iranian ancestors. Maybe I will give them to some Iranian studies institutes here in Germany… maybe they are interested in that…

@henare
My scraper got stuck once more:
IvanGoncharov/API_specifications
It has been stuck for 12 hours already and doesn’t react to the “Stop scraper” button.

I’ve given it a kick and it’s fixed now.

Hi,
quite a few of my scrapers seem to be getting stuck! I don’t know if it is something in my code, but a few that used to run okay have started getting stuck without any changes. Some have even run for over a week and don’t respond to “Stop scraper”. Thank you!

Hi @ErinClark, it’s almost certainly not your code but a long-standing issue in morph.io.

I’ve once again done a manual fix. Because we’re so busy with other projects we’re looking to get some outside help to keep developing morph.io and fix this issue once and for all.

Thank you for the fix!

Hi @henare,

My project heavily depends on morph.io for daily scraping.
And I would love to help but I don’t have any Ruby knowledge.

Instead, I can ask users of my project to help on Gitter and Twitter.
Can you please write a document or blog post describing what kind of help you need?

I also think that there are other people willing to help; they just don’t know how.
I strongly suggest you add this message at the top of the README file and as a banner on morph.io itself.

IMHO, even if it’s an open-source project you still have some responsibilities to your users.
If you don’t have enough resources to maintain its operation, you should ask your community to help.
And the fact that I don’t see any mention of this problem on the main page is very disturbing.