Google Books

Post by **Carolus** » Thu Jul 26, 2007 6:43 am

As some of you probably already know, Google has been scanning a huge number of items from several university libraries in the USA. There appear to be a number of scores among the vast quantity of items in their collections. The couple of files I looked at have problems on several pages, such as missing sections of a music page.

Does anyone else here have experience with these? Google does appear to be scanning things at 600 dpi. It also appears that they've inserted a Google watermark on each page of some files. I expect we'd have to remove those so as not to violate their trademark. If you do an advanced search in the Google Books section, searching by publisher name (like Schirmer or Novello) and limiting it to "full text" (meaning downloadable), you get a much more manageable hit-list.
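For anyone who would rather script that search than click through the advanced search form, here is a rough sketch. It assumes the public Google Books API volumes endpoint and its `inpublisher:` keyword, `filter=full` and `maxResults` parameters, none of which are mentioned in this thread; the function name `build_publisher_query` is made up for illustration. It only builds the query URL and does not fetch anything.

```python
from urllib.parse import urlencode

# Assumption: the Google Books API v1 volumes endpoint (not named in the thread).
BOOKS_API = "https://www.googleapis.com/books/v1/volumes"


def build_publisher_query(publisher, max_results=20):
    """Build a search URL for fully viewable volumes from one publisher.

    Hypothetical helper: the parameters below are assumed to mirror the
    advanced-search options described in the post.
    """
    params = {
        "q": f"inpublisher:{publisher}",  # restrict the search to a publisher name
        "filter": "full",                 # only volumes whose full text is viewable
        "maxResults": max_results,        # keep the hit-list manageable
    }
    return f"{BOOKS_API}?{urlencode(params)}"


if __name__ == "__main__":
    for name in ("Schirmer", "Novello"):
        print(build_publisher_query(name))
```

Opening the printed URLs in a browser (or fetching them with any HTTP client) should return a JSON list of matching volumes, which could then be checked by hand for scores.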

Post by **imslp** » Fri Jul 27, 2007 12:21 pm

I actually never thought of using Google Books... but this is a good idea! About the missing sections: yes, apparently Google likes to leave out sections sometimes (like with the old LoC catalog), maybe because of (imaginary) copyright issues?

In any case, it would be a very nice idea to mine what Google has, though the watermarks do need to be removed. I may start a project around this soon, when the MIT project nears completion.