One and only release scraper issues thread


Shiroi Hane
Encyclopedia Editor


Joined: 25 Oct 2003
Posts: 7578
Location: Wales
Posted: Wed Dec 14, 2016 7:34 am
Roundup of current issues with the scrapers. Let me know if I have missed anything.

General
  • eBooks are not identified as such when submitted using the wizard, whether using B&N and Amazon or B&N alone (Amazon alone is not an option, per below); they are entered as GNs instead.


Right Stuf Anime
  • Completely broken. Returning "This doesn't appear to be a valid page; could not find urlcode,refcode" through the wizard and when adding links and "no product found!? (could not extract urlcode from page; could not extract refcode from page; Does not appear to be a product page)" on existing releases.
  • Does not identify combo releases.
  • Does not pick up the SKU field (usually UPC or ISBN) so cannot be used to create a release entry via the wizard without a secondary source.
  • Does not retrieve the description.
  • Does not retrieve the MSRP.


Barnes and Noble
  • Completely broken. Returning "This doesn't appear to be a valid page; could not find urlcode,refcode" through the wizard and when adding links and "no product found!? (could not extract urlcode from page; could not extract refcode from page; Does not appear to be a product page)" on existing releases.
  • Not retrieving correct in-stock status for books and ebooks.
  • Not retrieving/updating price for books (seems to be working for discs so may be due to erroneously believing out of stock?).
  • Not retrieving price for eBooks ("could not extract price from page")
  • No longer working at all: error "This doesn't appear to be a valid page; could not find urlcode,refcode" when trying to add to a release and no longer picking up price and in-stock changes for existing entries.


Amazon

  • Completely broken: "Could not fetch page" when adding links or using the wizard and "page missing/gone (Could not find page)" on existing entries (although the back-end is still picking up data through the API for physical releases).
  • Cannot create an eBook release using the wizard with only an Amazon link as "No UPC or ISBN was found at the URL".
  • Does not retrieve price for ebooks.
  • Detecting eBooks as "out of stock" when they are available to buy (I assume; being in the UK they all show as not available to me on the US Amazon site)
  • Often returns "Could not find ASIN in that page" for eBooks. With the B&N scraper down this makes it impossible to add most eBooks using the wizard.
  • "Could not fetch page" error when trying to add an Amazon listing to a release. Does not appear to be causing fetch errors on existing entries.


BookWalker
  • I assume this is still in testing since it cannot be added by users, and this isn't a problem with the scraper per se, but because it is picking up prices in Yen and combined with the problems with the B&N and Amazon scrapers, there are a number of releases showing as "from $760.00" on Manga pages (e.g. manga#13456).
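The "from $760.00" symptom is what a lowest-price display would produce if a Yen amount were compared and printed as though it were dollars. The following is only a minimal sketch of one way to guard against that, not how the ANN back-end actually works: the `Offer` class, the `format_from_price` function and the sample amounts are all hypothetical. It simply ignores offers that are not in the display currency instead of converting them.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Iterable, Optional

DISPLAY_CURRENCY = "USD"  # assumption: "from" prices are shown in US dollars


@dataclass(frozen=True)
class Offer:
    """Hypothetical record of one retailer's scraped price."""
    retailer: str
    amount: Decimal
    currency: str  # ISO 4217 code, e.g. "USD" or "JPY"


def lowest_offer(offers: Iterable[Offer]) -> Optional[Offer]:
    """Return the cheapest offer priced in the display currency, if any."""
    comparable = [o for o in offers if o.currency == DISPLAY_CURRENCY]
    return min(comparable, key=lambda o: o.amount, default=None)


def format_from_price(offers: Iterable[Offer]) -> str:
    """Build the 'from ...' label without mixing currencies."""
    best = lowest_offer(offers)
    if best is None:
        return "price unavailable"
    return f"from ${best.amount:.2f}"


# Only a Yen price is known: nothing comparable, so no misleading dollar figure.
yen_only = [Offer("BookWalker", Decimal("760"), "JPY")]
print(format_from_price(yen_only))

# Once a dollar price is available it is used, and the Yen price is skipped.
mixed = yen_only + [Offer("Amazon", Decimal("6.99"), "USD")]
print(format_from_price(mixed))
```

Skipping foreign-currency offers avoids having to pick an exchange rate; converting them would be an alternative, but it needs a rate source that this sketch does not assume.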


Rakuten
  • No longer working at all: error "This doesn't appear to be a valid page; could not find urlcode,refcode" when trying to add to a release and no longer picking up price and in-stock changes for existing entries.


Last edited by Shiroi Hane on Tue Aug 22, 2017 11:09 am; edited 27 times in total
Shiroi Hane
Encyclopedia Editor


Joined: 25 Oct 2003
Posts: 7578
Location: Wales
Posted: Wed Dec 14, 2016 11:55 am
I don't know if I've prodded the hornets' nest too much, but since posting I've been getting consistent "Could not fetch page" errors when trying to pull from Amazon for print books (and I've tried it from Daru, Mayuri, Ruka and Kurisu).

--edit--

Affects discs also. Updated top post.
Shiroi Hane
Encyclopedia Editor


Joined: 25 Oct 2003
Posts: 7578
Location: Wales
Posted: Thu Dec 15, 2016 9:12 am
The "Could not fetch page" issue appears to have been resolved or cleared itself.

--edit--

Although I have a wonderful new problem with Amazon itself since yesterday: if I search amazon.com for Kindle books by name, author or even eISBN, I get no results. However, if I get the ASIN via the API or from the UK store and visit the page directly, the page is actually there.

It's possible the Amazon scraper has been fixed for eBooks now as well, but confirming that will take longer than it ought to now that I need to jump through hoops.

--edit--

Every Amazon link I've tried adding to existing eBook releases has worked so far. However, when trying to create a release with the wizard I am getting "No UPC or ISBN was found at the URL(s) above", even though the ISBN is definitely present in the Amazon API (under Items/Item/ItemAttributes/EISBN).

I will leave this as an example rather than adding manually:

Manga#16928
eBook vol. 7
Details: https://www.hachettebookgroup.biz/titles/tsuyoshi-watanabe/dragons-rioting-vol-7/9780316470933/
Amazon link: https://www.amazon.com/dp/B01N0GITHY
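To illustrate the EISBN point, here is a minimal sketch of an identifier lookup that falls back to EISBN when no UPC or ISBN is present. It is an assumption about what a fix could look like, not the wizard's actual code. The XML is a simplified, hypothetical sample: the root element name, the ASIN and the EISBN value are placeholders, and the XML namespace a real API response would carry is left out. Only the Items/Item/ItemAttributes/EISBN path comes from the post above.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# Hypothetical, simplified response; all values are placeholders.
SAMPLE_RESPONSE = """<Response>
  <Items>
    <Item>
      <ASIN>B000000000</ASIN>
      <ItemAttributes>
        <EISBN>9780000000002</EISBN>
      </ItemAttributes>
    </Item>
  </Items>
</Response>"""

# Order of preference; EISBN is the fallback for eBooks.
IDENTIFIER_TAGS = ("UPC", "ISBN", "EISBN")


def find_identifier(xml_text: str) -> Optional[Tuple[str, str]]:
    """Return (tag, value) for the first identifier found, else None."""
    root = ET.fromstring(xml_text)
    attributes = root.find("./Items/Item/ItemAttributes")
    if attributes is None:
        return None
    for tag in IDENTIFIER_TAGS:
        value = attributes.findtext(tag)
        if value and value.strip():
            return tag, value.strip()
    return None


print(find_identifier(SAMPLE_RESPONSE))
```

With a lookup like this, an eBook listing that carries only an EISBN would still yield an identifier instead of "No UPC or ISBN was found".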
Shiroi Hane
Encyclopedia Editor


Joined: 25 Oct 2003
Posts: 7578
Location: Wales
Posted: Thu Dec 15, 2016 2:47 pm
I knew it was too good to be true. First example of "Could not find ASIN in that page" for today:

release#32971, https://www.amazon.com/dp/B019IAA69W

What makes this different to the ones that are working, I don't know.
bglassbrook



Joined: 29 Aug 2006
Posts: 1243
Location: Gaithersburg, MD
Posted: Mon Dec 19, 2016 7:49 pm
Shiroi Hane wrote:
I knew it was too good to be true. First example of "Could not find ASIN in that page" for today:

release#32971, https://www.amazon.com/dp/B019IAA69W

What makes this different to the ones that are working, I don't know.

Could it be the short URL you are trying to add it by? Not sure if it was timing, but it worked using the full address (though for some reason it is missing price & availability).

Is B&N down for everything again, or just books? Videos do seem to have a better than hair-rending chance of getting accepted.
Shiroi Hane
Encyclopedia Editor


Joined: 25 Oct 2003
Posts: 7578
Location: Wales
Posted: Tue Dec 20, 2016 7:17 am
I think I used the full URL from the search and cut it down just for posting here.

I've mainly been doing books and the latest disc solicitations that aren't listed on B&N yet, so I'm not certain how badly disc releases are affected (I'm sure I did check a few though and got the same error).

--edit--

Actually, I may have just used the short Amazon URL. Yen Press Kindle books weren't coming up when I was searching on amazon.com, so I took to searching for the ISBN via the API scratchpad and appending the ASIN to the URL manually. I do remember times in the past where copying the ANN affiliate link from another release and changing the ASIN has caused the link to be recognised for some reason, so this might be something similar.
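On the short versus full URL question, a minimal sketch of URL-based ASIN extraction that treats both forms the same way is shown below. This is an illustrative assumption only: the error message "Could not find ASIN in that page" suggests the real scraper looks in the page content rather than the URL, and the `extract_asin` helper and its pattern are hypothetical.

```python
import re
from typing import Optional

# An ASIN is 10 characters of uppercase letters and digits; for print books
# it is the ISBN-10. It follows "/dp/" or "/gp/product/" in the URL path.
ASIN_PATTERN = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?:[/?#]|$)")


def extract_asin(url: str) -> Optional[str]:
    """Return the ASIN embedded in an Amazon product URL, or None."""
    match = ASIN_PATTERN.search(url)
    return match.group(1) if match else None


# Short form, as posted in this thread.
print(extract_asin("https://www.amazon.com/dp/B019IAA69W"))

# Full form: title slug before /dp/, tracking parameters after the ASIN.
print(extract_asin(
    "https://www.amazon.com/Kuma-Miko-Girl-Meets-Bear/dp/1935548530/"
    "ref=sr_1_2?ie=UTF8&qid=1500481977&sr=8-2&keywords=kuma+miko"
))
```

Normalising every link to the ASIN first would make the short and full forms equivalent, whatever the scraper does with the page afterwards.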
Shiroi Hane
Encyclopedia Editor


Joined: 25 Oct 2003
Posts: 7578
Location: Wales
Posted: Tue Dec 20, 2016 9:20 am
So, addendum: I've just been through the whole of animenewsnetwork.com/encyclopedia/releases.php?format=ebook adding Amazon links (where they exist) to everything and it worked every time, using the short link. I will strike out the issue for now.
Shiroi Hane
Encyclopedia Editor


Joined: 25 Oct 2003
Posts: 7578
Location: Wales
Posted: Fri Mar 17, 2017 12:37 pm
I believe the Right Stuf scraper is fixed, so I have struck out all the problems. I have added one, however: it does not pick up on combos as being combos (the one I tried was picked up as a DVD). I need to find a BD release to see if those are being picked up OK, but I don't have one immediately to hand. The combo problem is a longstanding one that I believe affects all retailers, but I need to find examples to confirm.
I obviously need to re-check the other listed bugs to see if any of those have been fixed as well.
Shiroi Hane
Encyclopedia Editor


Joined: 25 Oct 2003
Posts: 7578
Location: Wales
Posted: Tue Mar 21, 2017 11:00 am
When attempting to add a Rakuten page (e.g. https://www.rakuten.com/prod/304260489.html for release#31639) it is returning "This doesn't appear to be a valid page; could not find urlcode,refcode"
Existing listings (e.g. release#30771) are also now returning the same errors.
Adding to list.

--edit--

This is for books. Haven't tried any other products yet.
Confirmed for discs as well, e.g. https://www.rakuten.com/prod/307566128.html for release#33440.
Shiroi Hane
Encyclopedia Editor


Joined: 25 Oct 2003
Posts: 7578
Location: Wales
Posted: Tue Mar 21, 2017 11:18 am
Also, the B&N scraper does appear to be working now (in fact, I think it is picking up pages and adding them automatically itself before I can test it), but it is not pulling prices ("could not extract price from page") and appears to be treating everything as out of stock (e.g. it says out of stock for release#31971 while the B&N site says "Get it by Friday, March 24").
Shiroi Hane
Encyclopedia Editor


Joined: 25 Oct 2003
Posts: 7578
Location: Wales
Posted: Wed Mar 22, 2017 2:33 pm
I've made a few new additions and tweaks to the active list.

Also, there are a couple of little things that we can't easily work around but which really annoy me:
  • Yen Press assigns age ratings like "13 & up". B&N always interprets that as "13-18", as if the only books that can be read by adults are ones that are rated 18+ (e.g. Yen vs B&N).
  • If Yen Press hasn't published a synopsis yet, they will repeat the one from an earlier volume with the volume number appended. If no description is available yet I prefer to leave it blank until it is, but because the retailers pick up these placeholder descriptions, the system automatically re-inserts them as soon as I remove them (e.g., at time of typing, release#33567).
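The age rating complaint comes down to whether "13 & up" has an upper bound. A minimal sketch of a parser that keeps open-ended ratings open-ended follows; the `parse_age_rating` function and the label spellings it accepts are assumptions for illustration, not anything taken from the B&N or ANN code.

```python
import re
from typing import Optional, Tuple

# (minimum age, maximum age); a maximum of None means "no upper limit".
AgeRange = Tuple[int, Optional[int]]


def parse_age_rating(label: str) -> Optional[AgeRange]:
    """Parse labels such as "13 & up", "18+" or "13-18" into an age range."""
    text = label.strip().lower()

    # Open-ended forms: "13 & up", "13 and up", "13+".
    open_ended = re.fullmatch(r"(\d+)\s*(?:\+|&\s*up|and\s+up)", text)
    if open_ended:
        return int(open_ended.group(1)), None

    # Bounded form: "13-18".
    bounded = re.fullmatch(r"(\d+)\s*-\s*(\d+)", text)
    if bounded:
        return int(bounded.group(1)), int(bounded.group(2))

    return None  # unrecognised label; leave the rating unset


print(parse_age_rating("13 & up"))  # open-ended: no maximum age
print(parse_age_rating("13-18"))    # bounded range
```

Keeping the maximum as `None` means a "13 & up" title is never narrowed to "13-18" unless the source actually states an upper age.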
Shiroi Hane
Encyclopedia Editor


Joined: 25 Oct 2003
Posts: 7578
Location: Wales
Posted: Thu Mar 30, 2017 1:33 pm
B&N price scraper is working now. Not sure about stock status?
Shiroi Hane
Encyclopedia Editor


Joined: 25 Oct 2003
Posts: 7578
Location: Wales
Posted: Tue Jul 04, 2017 11:55 am
Amazon returning "Could not fetch page" for both books and ebooks (unsure about video products). See also animenewsnetwork.com/bbs/phpBB2/viewtopic.php?t=3069867
stevek504



Joined: 29 Apr 2007
Posts: 216
Posted: Wed Jul 19, 2017 12:00 pm
Shiroi Hane wrote:
B&N price scraper is working now. Not sure about stock status?


I just tried adding the Kuma Miko volume 1 release and, as you say, it allowed me to add the release, but pricing and stock status do not appear to flow through.

I also tried Amazon (Could not fetch page) and Right Stuf (This doesn't appear to be a valid page; could not find urlcode,refcode) with no luck.

The links I used:
https://www.rightstufanime.com/Kuma-Miko-Girl-Meets-Bear-Manga-Volume-1
https://www.amazon.com/Kuma-Miko-Girl-Meets-Bear/dp/1935548530/ref=sr_1_2?ie=UTF8&qid=1500481977&sr=8-2&keywords=kuma+miko
BigOnAnime
Encyclopedia Editor


Joined: 01 Jul 2010
Posts: 1219
Location: Minnesota, USA
Posted: Thu Jul 20, 2017 1:34 am
Amazon, RightStuf, and DVDEmpire aren't retrieving stock status or the latest price. Also, mysteriously, many entries now have missing covers when they had them before.
release#31461
release#33792
release#31770 -Volume number needs to be changed to a "2" BTW.

I also get "Could not fetch page" errors when using RightStuf and Amazon links to add releases, a pretty big problem given the number of releases currently missing, old and new.
All times are GMT - 5 Hours