Get help with morph.io and scraping

Scraper stuck on 'Creating GitHub repository'

I tried to create some new scrapers at morph.io/wdiv-scrapers/DC-PollingStations-Doncaster and morph.io/wdiv-scrapers/DC-PollingStations-Kingston but they are both stuck on ‘Creating GitHub repository’.

I’ve created scrapers ok under my own github username, so I’m guessing that maybe there’s some permissions issue with creating the scraper under this organisation namespace? I guess I have 2 questions/requests:

  1. Is there some setting or permission I need to change to allow morph to create repositories in this github organisation? My user account is already the organisation owner.
  2. Can someone reset/delete the 2 scrapers I’ve tried to create?

Thanks,
Chris

Quick update on this: I’ve now managed to create some scrapers in my organisation by creating the repositories on GitHub first and then importing them via morph.io/scrapers/new/github, instead of trying to create the repositories through morph. So if it isn’t obvious what I need to change, just delete the 2 that are stuck and I can make my scrapers this way in future.
Cheers

Apologies for bumping this, but would it be possible to get these 2 deleted? I’d like to be able to import scrapers at these URLs and keep my naming convention.
Cheers,
Chris

@chris48s first up, sorry for the problems you had with the forums marking your posts as spam. Discourse has pretty aggressive anti-spam stuff. I’ve whitelisted morph.io so hopefully it’ll be a little less overbearing about posts with links.

After much fiddling on the console I’ve tried to rerun the scraper creation process. The problem morph.io is hitting is this bug:

Which is caused by the permissions on your GitHub repository. The error I can see is this one:

Octokit::Forbidden: POST https://api.github.com/orgs/wdiv-scrapers/repos: 403 - You need admin access to the organization before adding a repository to it. // See: https://developer.github.com/v3

If you fix up the GitHub permissions I can rerun the manual thing on the console to try creating these scrapers.

To save me typing and working it out next time, this magic should work on the console:

# Arguments: scraper id, user id, scraper URL
CreateScraperWorker.new.perform(
  Scraper.find_by(full_name: "wdiv-scrapers/DC-PollingStations-Doncaster").id,
  User.find_by(nickname: "chris48s").id,
  "https://morph.io/wdiv-scrapers/DC-PollingStations-Doncaster"
)
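
Since two scrapers were stuck, the console call above could be wrapped in a small helper that retries each name in turn. This is only a sketch: it assumes the `perform(scraper_id, user_id, scraper_url)` argument order shown above, and the helper name `retry_stuck_scrapers` and its keyword arguments are made up for illustration. The model classes and the worker are passed in so the snippet does not depend on a loaded Rails environment.

```ruby
# Sketch only: re-run repository creation for several stuck scrapers.
# Assumes the perform(scraper_id, user_id, scraper_url) order used above;
# the helper name and keyword arguments are hypothetical.
def retry_stuck_scrapers(full_names, nickname, scrapers:, users:, worker:)
  user = users.find_by(nickname: nickname)
  raise ArgumentError, "no user with nickname #{nickname}" if user.nil?

  results = full_names.map do |full_name|
    scraper = scrapers.find_by(full_name: full_name)
    if scraper.nil?
      # Skip names that no longer exist instead of failing on nil.id
      [full_name, :not_found]
    else
      worker.perform(scraper.id, user.id, "https://morph.io/#{full_name}")
      [full_name, :retried]
    end
  end
  results.to_h
end

# On the morph.io console this might be called as:
#   retry_stuck_scrapers(
#     ["wdiv-scrapers/DC-PollingStations-Doncaster",
#      "wdiv-scrapers/DC-PollingStations-Kingston"],
#     "chris48s",
#     scrapers: Scraper, users: User, worker: CreateScraperWorker.new
#   )
```

The returned hash maps each scraper name to `:retried` or `:not_found`, which makes it easy to see which names still need attention.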

Hi Henare,
Thanks for responding. Whitelisting morph.io on Discourse’s spam filter is probably a sensible move 🙂

I appreciate your explanation of the problem. Looking at the organisation permissions for my account on the wdiv-scrapers organisation:

As an owner, chris48s has admin access to all repositories that belong to the wdiv-scrapers organization.

…I can’t see what I can change that would give my account any higher level of access to the organisation than that.

As I say in my second post, I’m perfectly happy to just create repositories on github and import them (as I’ve done with my other 32 working scrapers) rather than have morph create the repos for me - I’d just like to free up the names:
wdiv-scrapers/DC-PollingStations-Kingston
wdiv-scrapers/DC-PollingStations-Doncaster
so I can create github repos myself and import them into morph with those names. Is there any chance you could just delete those 2 scrapers?

Cheers,
Chris

I was hoping to fix the deeper problem but meh, I’ve just done this now 🙂

Thanks Henare,
I’d also ideally like to know what I need to change in my organisation permissions, but as I say I can’t see what else I can grant. Perhaps someone else who creates scrapers under a GitHub org rather than their own namespace might be able to let me know what I’m missing.

Cheers,
Chris

Hey friends,

This scraper has been stuck on ‘Creating GitHub repository’ for the last 40 min or so: https://morph.io/austccr/fossil_fuel_lobby_mediacloud_mentions

Would someone mind giving it a nudge when you get the chance?

Thanks for your help,

Luke

Hi Luke,

It looks like it’s a permission problem with the organisation that you’re trying to add a scraper to. GitHub is coming back with the error: “Octokit::Forbidden: POST https://api.github.com/orgs/austccr/repos: 403 - You need admin access to the organization before adding a repository to it. // See: https://developer.github.com/v3/repos/#crea…”

Sorry that it’s just been hanging rather than giving you a sensible error message.

Once you fix the permissions at your end, it should retry and hopefully work. I can also kick a retry from here if necessary.

All the best,
Matthew

Thanks for your help Matthew,

I’m the owner of the org on GitHub, and the settings page says ‘As an owner, equivalentideas has admin access to all repositories that belong to the austccr organization.’

Do you know if there are any additional permissions required on my end?

For context, I’ve created other scrapers on the org through the ‘New scraper’ path successfully over the last few months (this is the first one in Python, in case that might be relevant), and I’m not aware of any permissions changes to my GitHub account.

Best wishes,
Luke

I think if you join or create an organisation after you created your morph account, GitHub doesn’t automatically give morph permission on the new org. I was able to fix it like this:

  • Sign into GitHub, click your avatar and click ‘settings’
  • Choose ‘Authorized OAuth Apps’ in the left menu bar
  • Click on ‘Morph.io’ in your list of apps
  • Under ‘Organization access’, find the org you want to create scrapers in and click ‘Grant’
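
Since the scraper simply hung instead of reporting this, a hint derived from the 403 message could point people at the steps above. The following is a minimal sketch, not anything morph.io actually does: the function name `org_access_hint`, the constant, and the hint wording are all hypothetical, and it only pattern-matches the error text quoted earlier in this thread.

```ruby
# Sketch only: turn GitHub's 403 message into a hint about OAuth org access.
# The names and the hint wording are hypothetical, not part of morph.io.
ORG_REPO_403 = %r{POST https://api\.github\.com/orgs/([^/]+)/repos: 403}

def org_access_hint(error_message)
  match = ORG_REPO_403.match(error_message)
  # Any other error is left for the caller to handle
  return nil if match.nil?

  org = match[1]
  "GitHub refused to create a repository in '#{org}'. If you are already an " \
    "owner, open Settings, choose 'Authorized OAuth Apps', click 'Morph.io' " \
    "and grant access to '#{org}' under 'Organization access'."
end

message = "Octokit::Forbidden: POST https://api.github.com/orgs/austccr/repos: " \
          "403 - You need admin access to the organization before adding a repository to it."
puts org_access_hint(message)
```

Matching on the message text is brittle if GitHub ever rewords it, so this is better treated as a convenience for users than as real error handling.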

Amazing @chris48s, thanks for that. I didn’t even realise there were more detailed per-app settings on that settings page. Morph wasn’t granted permissions for that org, so I’ve granted them now.

“Once you fix the permissions at your end, it should retry and hopefully work. I can also kick a retry from here if necessary.”

I’ve given it 10 min or so, but I think it needs some encouragement, if you don’t mind @mlandauer 🙂

Thanks

Yay, it’s unstuck and created successfully. Thanks @chris48s and @mlandauer 💎

Best wishes,

Luke