What is Optical Character Recognition, or OCR, and why should you care? Well, before I answer that, let me ask you a question. Have you ever been handed a pile of typed pages for a project instead of a nice digital word processor file? This has happened to me more than once during my career. Maybe the pages are an excerpt from your client’s book. Maybe you were given a typed biography or company history that had long been forgotten on some relic of a computer that the client doesn’t even have anymore. I even had one instance where a client had a digital copy of the file, but it was saved in a format that wasn’t supported anymore. I’m looking at you, AppleWorks .cwk file. Luckily, the client had a hard copy that could be scanned to a PDF. This is where discovering OCR saved me a ton of time that would have been wasted if I had to transcribe those old files by hand.
OCR does exactly what the acronym stands for. It optically scans a document and recognizes the characters printed in it. Then it serves that info up as editable text. With just a little processor power and the right program, you can save yourself a ton of extra work.
Now your next question is probably, “How can I get this wonderful software that Julian is talking about?” Well, here is the best part. Are you an Adobe Creative Cloud subscriber? If you are, this magic Rosetta Stone for your scanned documents may actually be BUILT IN! That’s right, OCR software is included with Adobe Acrobat Pro DC. I also used it earlier in Acrobat Pro CC, but I can’t tell you with certainty if it was included in editions before that. Still, with so many of you using the Creative Cloud nowadays, just know that if your subscription includes Acrobat Pro, you are all set.
Now let’s do a quick walkthrough to show you where it is and how you can start using it.
First, we will scan in our document. I’m just using one of my old college papers here so you get the idea.
Now open the scanned file in Acrobat.
Switch over to the Tools tab and look for the Enhance Scans icon. This is where OCR lives inside Acrobat.
You’ll see an option labeled Recognize Text with a dropdown. Here you can choose to run OCR on just this document or on multiple documents. If you have a bunch of scans, go ahead and use the multiple-files option, but for this walkthrough I’ll select “In This File.”
I wrote this paper in English, so I’ll keep the language set to English and click the highlighted button to the left marked “Recognize Text.”
Just like that, we now have a file with selectable text that you can copy and paste into your editor of choice. Remember that no method is going to be 100% perfect, but this does a remarkable job and seems to get better at stringing together words with each software update.
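Since no OCR pass is perfect, the pasted text usually needs a little tidying before it goes into a layout. If you are comfortable running a small script, here is a minimal Python sketch of the kind of cleanup I mean. It is not part of Acrobat, and the function name `clean_ocr_text` is just something I made up for illustration. It assumes the common leftovers are hard line breaks in the middle of sentences, words hyphenated across line ends, and stray extra spaces.

```python
import re


def clean_ocr_text(raw: str) -> str:
    """Tidy text copied out of an OCR'd PDF (hypothetical helper)."""
    # Normalize Windows and old Mac line endings to plain "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")

    # Rejoin words that were hyphenated across a line break,
    # e.g. "char-\nacter" becomes "character".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)

    # Treat one or more blank lines as a paragraph break.
    paragraphs = re.split(r"\n\s*\n", text)

    cleaned = []
    for paragraph in paragraphs:
        # Unwrap single line breaks and collapse runs of whitespace.
        paragraph = re.sub(r"\s+", " ", paragraph).strip()
        if paragraph:
            cleaned.append(paragraph)

    # Put the paragraphs back together with one blank line between them.
    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "Optical char-\nacter   recognition\nsaves time.\n\n\nSecond  para-\ngraph."
    print(clean_ocr_text(sample))
```

One caveat: the hyphen rule also joins real compound words that happen to break at the end of a line, so "well-" followed by "known" on the next line comes out as "wellknown". You still need to proofread the result against the original pages.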
Thanks for reading and have a great week everyone. I’ll be back next week with a new post from the world of graphic design.