Scanning/PDF Guidelines

Advice and Help

Moderator: kcleung

Puzzi22
regular poster
Posts: 26
Joined: Wed Jan 20, 2010 3:43 pm

Scanning/PDF Guidelines

Post by Puzzi22 »

Hello:

I'm new to all this, but I think I have my fair share of things to contribute. I have some questions about the guidelines for scanning and merging into PDF:

1. The website recommends using "CCITT Group 4 compression" for scans, but I don't know how to achieve this. My scans are usually 600 dpi, B/W. Up to now they have always been JPEG files, but I'll try switching to TIFF, as it appears to give better quality. The problem is that the files are quite heavy: 3-7 MB each. When I merge them into a PDF file I can choose either "High Quality Impression" (600 dpi = 13 MB for 4 pages) or "Standard" (150 dpi = 1 MB for 4 pages). I would like a smaller file without sacrificing the quality of the page. What should I do? I guess the CCITT Group 4 compression will help me with that, won't it?

2. Some people use automatic "filters" to clean out some of the dirt and dust before cleaning up manually. Could anyone suggest a good one that is free?

That's pretty much it for now. Thanks for helping me! I just wish to share the things I have in a good way so that people can make good use of it...

patremblay22
kalliwoda
active poster
Posts: 504
Joined: Fri Dec 19, 2008 8:36 pm
Location: Berlin, Germany

Re: Scanning/PDF Guidelines

Post by kalliwoda »

I am in a somewhat similar situation, with a pretty old full version of Adobe Photoshop (7.0).
In Photoshop 7.0 there seems to be no way to choose the PDF compression level (at least I have not found an option), but if I save a file as "Adobe pdf", the size is pretty similar to what other users get. This way I obtain single-page PDF files with exactly the page size of the original file.
An alternative on my Mac (not sure how this works on PCs) is to "print" the page and then choose "pdf" in the printer menu. This again produces a single-page PDF file, but you are limited to the page size options of your printer.

To compile a multipage pdf, you can use pdfsam (free). I usually use my old Acrobat 5 full version, but that is not free.

File size for a 600 dpi monochrome (B/W) A4 page: in TIFF about 4 MB, in PDF about 200-350 KB, depending on the complexity of the page.
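
As a rough cross-check of the TIFF figure, the size of an uncompressed 1-bit A4 page at 600 dpi can be worked out directly. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and it assumes no compression, one bit per pixel, and rows padded to whole bytes, ignoring file headers.

```python
# Rough size of an uncompressed 1-bit (black-and-white) scan.
A4_WIDTH_IN = 210 / 25.4   # A4 is 210 mm x 297 mm, converted to inches
A4_HEIGHT_IN = 297 / 25.4


def uncompressed_bilevel_bytes(width_in, height_in, dpi):
    """Bytes for a 1-bit-per-pixel image, each row padded to a whole byte."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    bytes_per_row = (width_px + 7) // 8
    return bytes_per_row * height_px


size = uncompressed_bilevel_bytes(A4_WIDTH_IN, A4_HEIGHT_IN, 600)
print(f"{size / 1_000_000:.2f} MB uncompressed")
```

The result lands a little above 4 MB, which fits an uncompressed TIFF; the much smaller PDF sizes come from the compression applied to the same pixels.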
horndude77
active poster
Posts: 293
Joined: Sun Apr 23, 2006 5:08 am
Location: Phoenix, AZ

Re: Scanning/PDF Guidelines

Post by horndude77 »

Avoid JPEG and other lossy formats. Most image programs should be able to save an image as a TIFF with CCITT Group 4 compression. Some free possibilities are GIMP and IrfanView.

The people over at http://www.diybookscanner.org/ have been steadily improving scantailor. It can do some fairly impressive cleaning including dewarping (not just the regular deskewing). It's definitely worth checking out.
Carolus
Site Admin
Posts: 2249
Joined: Sun Dec 10, 2006 11:18 pm

Re: Scanning/PDF Guidelines

Post by Carolus »

That's really quite impressive. Warping and skewing have been the major bugaboos of scanned music from the start. A program that could handle this automatically would be a tremendous help. With printed scores, it theoretically should be possible to correct images on the basis that all staff lines should be at true horizontal and all bar lines at true vertical - adjusting the relative positions of all the other elements accordingly. The problem with much scanned music is that there is not an even amount of distortion from page to page, or even within a single page - which is why it is so nasty to fix bad pages manually, even with Photoshop and similar software. For example, there are some fairly ugly pages in the vocal score of Elijah I just uploaded, not unreadable just warped in a rather unattractive manner for several pages.
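
The staff-line idea can be sketched for the simplest case, a uniform skew. The snippet below is an illustration only, not how Scan Tailor or any other program works: the coordinates are hypothetical, the function name is made up, and it fits a straight line through points sampled along one staff line to estimate its tilt.

```python
import math


def skew_angle_degrees(points):
    """Angle of the least-squares line through (x, y) points on one staff line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.degrees(math.atan2(sxy, sxx))


# Hypothetical pixel coordinates picked along a single staff line.
staff_line = [(0, 100), (1000, 110), (2000, 120), (3000, 130)]
angle = skew_angle_degrees(staff_line)
print(f"staff line is tilted by {angle:.2f} degrees")
```

A single angle like this only fixes an evenly skewed page. Where the distortion varies within a page, each staff (or each stretch of a staff) would need its own fit, which is exactly what makes manual correction so tedious.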

Of course, there's the issue of the crazy isolated low-res grayscale segments one encounters in the scores scanned by Google. We'd probably have more of their scores here if they weren't so problematic. Apparently some outfit named Kessinger Publishing just reprints them regardless of all the bad images, etc. At any rate, software like Scan Tailor will be sorely needed before any serious progress is made with music OCR.
Puzzi22
regular poster
Posts: 26
Joined: Wed Jan 20, 2010 3:43 pm

Re: Scanning/PDF Guidelines

Post by Puzzi22 »

Thanks, horndude77! Scan Tailor seems to be a very fine piece of software. Do you know anything about filters?
daphnis
Copyright Reviewer
Posts: 1635
Joined: Thu May 17, 2007 7:15 pm

Re: Scanning/PDF Guidelines

Post by daphnis »

I was not familiar with Scan Tailor, and it does look promising. It still needs some work, or perhaps I'm missing how to deactivate some of the steps it runs on the pages. For example, I just want to test the dewarping and despeckling algorithms without applying anything else, and I can't seem to do so, unless you know something I don't?
daphnis
Copyright Reviewer
Posts: 1635
Joined: Thu May 17, 2007 7:15 pm

Re: Scanning/PDF Guidelines

Post by daphnis »

@ patremblay22: Regarding filters, it's much better to set up a scan profile with a black-and-white conversion threshold tailored to each source than to post-process with filters. In printings of lesser quality (for example, where the poor printing is caused by under-inking), filters can remove musical elements such as staccato dots. Most scanning software lets you set this threshold, and it should be controlled manually rather than left on auto.
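
To make the threshold point concrete, here is a toy sketch with made-up pixel values (0 = black, 255 = white). The function name and the numbers are assumptions for illustration; real scanning software exposes this as a setting rather than as code.

```python
def to_black_and_white(gray_rows, threshold):
    """Map grayscale pixels to 1 (ink) if darker than the threshold, else 0 (paper)."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray_rows]


# Hypothetical 3x5 patch: a well-inked stem (dark) beside a faint,
# under-inked staccato dot (the single 170 value).
patch = [
    [250, 30, 250, 250, 250],
    [250, 20, 250, 170, 250],
    [250, 30, 250, 250, 250],
]

for threshold in (128, 200):
    ink = sum(map(sum, to_black_and_white(patch, threshold)))
    print(f"threshold {threshold}: {ink} ink pixels")
```

With the lower threshold the faint dot is treated as paper and disappears; with the higher one it survives. That is why the threshold is worth tuning per source instead of trusting an automatic setting.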
Mazin
regular poster
Posts: 44
Joined: Sun Jan 25, 2009 3:03 am

Re: Scanning/PDF Guidelines

Post by Mazin »

I would suggest asking (willing) experienced members to do the PDF conversion if it's just a few items. I for one would be happy to convert something to PDF, given the scans. It would be worth it to save future downloaders the trouble of downloading oversized, poor-quality PDFs.