MIT Project: submission of 14,000 scores
Moderator: kcleung
The problem is that the files still need to be checked for copyright status and duplication, so it is mainly a difference of putting the information directly in the submission vs. in a table format to be processed by the bot. Considering the number of possible factors (since submitting those files is akin to submitting normal scores from the internet; they have no inherent order or set), a bot may not yield significant benefits. However, I haven't thought much about this, so I may be wrong... if you can think of a way in which the bot can automate the process greatly, I think it'd be a fine idea! In the meantime I think it should be done normally (manual submission).
By the way, do you want in on the project too? That way you can also check to see if you can automate it somehow.
-
- Groundskeeper
- Posts: 553
- Joined: Fri Feb 16, 2007 8:55 am
I can't contribute in the near future, probably not before August/September. But if you nonetheless want to send me the URL, I'd be curious to have a look.
As for the bot, Feldmahler is completely right that it is mainly a difference of putting the information directly in the submission vs. in a table format to be processed by the bot. I have no idea myself if there would be a benefit at all. One could also think of having the bot prepare the spreadsheets with the meta-data that can be extracted from the filenames (though that's only work title and opus, as far as I can tell), and have the users fill in the rest. But this might also be a useless idea, dunno... It would be different if there was some database with a significant amount of meta-data that we could try to match against the file names, but I have no idea if such a thing exists... maybe I'm just talking lots of nonsense.
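To make the spreadsheet idea a bit more concrete, here is a rough sketch of what such a pre-fill step might look like. It is only an illustration: the filename pattern (something like "Piano_Sonata_Op13.pdf"), the column names, and the helper names `parse_filename` and `build_sheet` are all assumptions of mine, not anything taken from the actual files or from the bot.

```python
import csv
import io
import re
from pathlib import PurePosixPath

# Assumed pattern: an "Op"/"Opus" marker followed by a number, e.g. "Op13" or "Op. 9".
# The real filenames may look quite different.
OPUS_RE = re.compile(r"\bop(?:us)?\.?\s*(\d+[a-z]?)\b", re.IGNORECASE)

# Hypothetical columns; the last two are left blank for users to fill in.
FIELDS = ["file", "title", "opus", "composer", "copyright_status"]


def parse_filename(path):
    """Guess work title and opus number from a file name."""
    # Drop the folder and extension, and treat underscores as spaces.
    stem = PurePosixPath(path).stem.replace("_", " ")
    match = OPUS_RE.search(stem)
    if match:
        opus = match.group(1)
        # Everything before the opus marker is taken as the title.
        title = stem[:match.start()].strip(" -,")
    else:
        opus = ""
        title = stem.strip()
    return {"file": path, "title": title, "opus": opus}


def build_sheet(paths):
    """Return CSV text with one row per file; unknown columns stay empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="", lineterminator="\n")
    writer.writeheader()
    for path in paths:
        writer.writerow(parse_filename(path))
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_sheet(["Beethoven/Piano_Sonata_Op13.pdf", "Minuet.pdf"]))
```

The copyright and duplication checks would still have to be done by hand, so this only saves the typing for title and opus, which is the limited benefit described above.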
I wonder if, for the big-name people (and consequently the highest concentration of scores), people could assign themselves the individual folders? I'm going to work my way from the bottom up, but with 'B' for example, someone would have to otherwise tackle Bach, Beethoven and Brahms(!) unless the letters could be divided by certain folders.
-
- active poster
- Posts: 385
- Joined: Mon Apr 16, 2007 11:09 pm
- notabot: 42
- notabot2: Human
- Location: Melbourne, Australia
I don't have the time or the experience so I'll leave it to those who know what they're doing.
But 14,000 scores??? The mind boggles.
It makes my 500-600 Schubert Lieder fade into insignificance.
aldona
“all great composers wrote music that could be described as ‘heavenly’; but others have to take you there. In Schubert’s music you hear the very first notes, and you know that you’re there already.” - Steven Isserlis
-
- Copyright Reviewer
- Posts: 182
- Joined: Fri Jun 22, 2007 9:12 pm
- notabot: 42
- notabot2: Human
- Location: Milky Way galaxy
Re: New Project: submission of 14,000 scores
I will definitely be interested in helping with indexing. I can use any data structure (Excel, Access, etc.). It will be a long-term project, but hey, 10xxx scores added? Super!
Duke
-
- Site Admin
- Posts: 1139
- Joined: Sun Jan 14, 2007 8:16 am
- notabot: YES
- notabot2: Bot
- Location: Perth, Australia
I wonder if we should have a system or registry for indicating, by individual folders or even files, which works (or composers) are not public domain so that people coming in behind us won't think we overlooked them. For example, I've come across composers who have no works in the public domain (at least among the works in that folder) and thus that folder would have to be skipped. For now I'm simply keeping a running list of the folders/composers that could not be added, but it may be nice to have an "official" list of sorts on the project page.
I've put up a new page for listing composers who have already been done, so I don't think there is a need to treat those composers specially anymore.
Dpajalic, ArcticWind, I've put you on the waiting list. Basically, I want to see how this works out first before adding more people to the project. I think the 7 people we currently have are enough to get the project going, and I really want to check (and iron out) any problems that might arise. This may be better with fewer people, so that coordination doesn't become a problem instead. After things are sorted out I'll add more people.
willing to help also
I'd be willing to help, albeit in small doses.