URLs

Posted: Thu Oct 04, 2018 11:16 pm
by JeanneMarie
Is there a guide to de-coding the URLs used here at IMSLP? I'm asking because I've run into a problem and figure it's probably not the last time.

I created a book Appendix listing a lot of scores, hotlinking them here. All my hotlinks were good a year or so ago. Now some are not. I'm getting diverse messages on about a half-dozen. Sometimes I get "file not found" while other times I get a can't-open-the-file message. And just a few are resulting in "The owner of imslp.nl has configured their website improperly. To protect your information from being stolen, Firefox has not connected to this website."

This is one example:
Currently, this is a good link--

http://hz.imslp.info/files/imglnks/usimg/9/9b/IMSLP43383-PMLP07734-Puccini_-_Madama_Butterfly_-_Act_I_(full_score).pdf
The URL that was good but now can't display is--

http://petrucci.mus.auth.gr/imglnks/usimg/9/9b/IMSLP43383-PMLP07734-Puccini_-_Madama_Butterfly_-_Act_I__full_score_.pdf
The link that Firefox refuses to connect to is this

https://imslp.nl/imglnks/usimg/d/d8/IMSLP43372-PMLP50401-Puccini_-_Tosca_-_Act_III__full_score_.pdf
But the file is now at

http://conquest.imslp.info/files/imglnks/usimg/d/d8/IMSLP43372-PMLP50401-Puccini_-_Tosca_-_Act_III_(full_score).pdf
So is the old URL pointing to a mirror that is no longer used? Is there someplace one keeps up on changes like this?

As mentioned, I've got about five more. So, any guidance on how to read these? How to anticipate any problems? Are they just inevitably going to change, and unpredictably? Just understanding why things go wrong would help a lot.

Sincerest thanks for any info.

EDIT Oct. 5: It was wishful thinking and denial when I said I had a few problems. It's a big mess. Lots and lots of all three kinds of problem and a pattern is clear with respect to the "insecure connection" message. Still, it's not all clear and I could really use any suggestions going forward. If it's just the nature of the beast, I do understand. Again, thanks for any information.

Re: URLs

Posted: Fri Oct 05, 2018 5:45 pm
by Choralia
The IMSLP architecture distributes traffic over several mirrors, so the same edition may be reachable at multiple URLs. Also, page titles or file names may change slightly from time to time, due to changes in title standardization, encoding of special characters, etc.

I'd suggest using the IMSLP number as a key, because the IMSLP number is a unique identifier that does not depend on the mirror, and that doesn't change if small changes to the title or filename are made. Given an IMSLP number, you reach the relevant file as follows:

https://imslp.org/wiki/Special:ImagefromIndex/<edition number>

For example:

https://imslp.org/wiki/Special:ImagefromIndex/43383

https://imslp.org/wiki/Special:ImagefromIndex/43372
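
For a long list of old mirror links, the conversion could be scripted. The following is only a minimal sketch, not an IMSLP tool: it assumes that every mirror file name starts with "IMSLP" followed by the number and a hyphen, as in the URLs quoted in this thread, and the function names are made up for illustration.

```python
import re
from typing import Optional

# Assumption: mirror file names look like ".../IMSLP43383-PMLP07734-....pdf",
# as in the URLs quoted in this thread. Other naming patterns are not handled.
_IMSLP_NUMBER = re.compile(r"/IMSLP(\d+)-")


def imslp_number(url: str) -> Optional[str]:
    """Return the IMSLP number embedded in a mirror URL, or None if absent."""
    match = _IMSLP_NUMBER.search(url)
    return match.group(1) if match else None


def file_link(number: str) -> str:
    """Build the mirror-independent link to the file for an IMSLP number."""
    return f"https://imslp.org/wiki/Special:ImagefromIndex/{number}"


old_url = (
    "http://petrucci.mus.auth.gr/imglnks/usimg/9/9b/"
    "IMSLP43383-PMLP07734-Puccini_-_Madama_Butterfly_-_Act_I__full_score_.pdf"
)
number = imslp_number(old_url)
if number is not None:
    print(file_link(number))
```

Links from which no number can be extracted return None and would need to be checked by hand.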

I hope this helps.

Max

P.S.: I'd also recommend using "shallow links" (i.e., links to the web page) rather than "deep links" (i.e., direct links to PDF files). This lets the user see any copyright restrictions applicable to the score, which would not be displayed otherwise.

Re: URLs

Posted: Fri Oct 05, 2018 8:51 pm
by JeanneMarie
WOW! Thank you so much. Yes, this helps enormously.

And the URLs you supplied will persist unless a file's "edition" number changes? They are somehow forcing a lookup/redirection based on that edition number? Some sort of server-side routine?

By "shallow" links, you mean the page for a particular composition? A link to imslp.org like the following is not as subject to change?
https://imslp.org/wiki/Tosca,_SC_69_(Puccini,_Giacomo) ? (I'm dealing entirely with public domain music except for a few Richard Strauss works and I've indicated their copyright status prominently. But linking the work's page would make other editions/versions visible, so that is worth doing.)

I really do appreciate this.

Re: URLs

Posted: Sat Oct 06, 2018 6:16 am
by Choralia
JeanneMarie wrote: And the URLs you supplied will persist unless a file's "edition" number changes?
I have no control over this; however, I think this is probably the most persistent key for reaching a given file.
JeanneMarie wrote: They are somehow forcing a lookup/redirection based on that edition number? Some sort of server-side routine?
Yes.
JeanneMarie wrote: By "shallow" links, you mean the page for a particular composition? A link to imslp.org like the following is not as subject to change?
The link to a page title may be less persistent; in this case, for example, it would break if the opus numbers for Puccini were changed. I'd suggest using the IMSLP number as a key anyway. (I sometimes call it the edition number, as this is the term we use at CPDL; however, IMSLP number is more accurate.) Shallow linking can be achieved as follows:

https://imslp.org/wiki/Special:ReverseLookup/<IMSLP number>

Examples:

https://imslp.org/wiki/Special:ReverseLookup/43383

https://imslp.org/wiki/Special:ReverseLookup/43372
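
If the Appendix is generated from a list of IMSLP numbers, the shallow links could be produced in one pass. This is a minimal sketch under that assumption: the helper name is hypothetical, and only the URL pattern shown above is taken as given.

```python
def page_link(imslp_number: int) -> str:
    """Build a shallow link to the work page that hosts the given IMSLP number."""
    if imslp_number <= 0:
        raise ValueError("IMSLP numbers are positive integers")
    return f"https://imslp.org/wiki/Special:ReverseLookup/{imslp_number}"


# The two numbers used as examples in this thread.
for n in (43383, 43372):
    print(page_link(n))
```

Storing only the number in the Appendix source and generating the link at build time means a future change of URL scheme needs a fix in one place only.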

Max

Re: URLs

Posted: Sat Oct 06, 2018 11:37 am
by JeanneMarie
Aha! Again, I'm in your debt. This has been sooooo helpful.

Is this documented somewhere? I feel I'd like to know whatever else is available.

Even if our conversation stops here however, thank you, thank you, thank you.

Re: URLs

Posted: Sat Oct 06, 2018 8:33 pm
by Choralia
JeanneMarie wrote: Is this documented somewhere?
I don't think so. I did some reverse engineering by inspecting the URLs invoked by download links and the HTML code of pages where these functions are used.

I'd suggest you have a look at:

https://imslp.org/wiki/Special:SpecialPages

Most (all?) pages listed under "Other special pages" are specific to IMSLP, and may provide useful functions. For example, the "Reverse Lookup" function is listed there.
JeanneMarie wrote: thank you, thank you, thank you
You're welcome, you're welcome, you're welcome :D

Max