Anatomy of a Scanned Image

A “scanned image” or “document image” is, as the name implies, a picture (image) of a document.  Document scanners are just fancy cameras that take a picture of a document.  Scanners with an Automatic Document Feeder (ADF) move the pages across a special type of image sensor (for the nerds out there, it is typically a CCD), and the computer processor inside the scanner assembles many small pictures into a single picture representing a page of the document (along with performing a lot of other cool features).

There are many variables in scanning documents, many of which are beyond the scope of this post.  Below we’ll try to give you the information you need to know.

Types of scanned images

Here are some of the most relevant types of scanned images:

  • Bi-tonal or Black and White – In this mode the scanner is only looking for 2 colors (technically 1) for each pixel.  This creates the smallest possible image because each pixel of the document can be represented by a single computer bit (see the picture on the right).
  • Color – In this mode the scanner is looking for multiple colors for each pixel.  Because each pixel has to record which of many possible colors it is, a color file can be much larger than a black and white one.  Most scanners allow several “color depth” settings, which define how many possible colors each pixel can have.  The higher the color depth, the larger the file size.  Higher color depth will generally improve the quality of a color image (up to a point).
  • Grayscale – Similar to color, except the colors are limited to shades of gray.  Not commonly used for document imaging.
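
To make “color depth” a little more concrete, here is a minimal Python sketch of how the number of bits stored per pixel relates to the number of possible colors.  The specific depths listed (8-bit grayscale, 15-bit and 24-bit color) are common values we are assuming for illustration; your scanner may offer different ones.

```python
def colors_for_depth(bits_per_pixel: int) -> int:
    """Number of distinct values one pixel can take at a given bit depth."""
    return 2 ** bits_per_pixel


# Assumed example depths; check your scanner's settings for the real options.
examples = [
    ("bi-tonal", 1),
    ("grayscale", 8),
    ("32K color", 15),
    ("24-bit color", 24),
]

for label, bits in examples:
    print(f"{label}: {bits} bit(s) per pixel, {colors_for_depth(bits):,} possible values")
```

Every extra bit per pixel doubles the number of possible colors, and also adds to the amount of data stored for every single pixel on the page.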

Pixels and DPI

A pixel is a small dot on the page that the scanner captures.  One of the most important settings when scanning a paper document is the Dots Per Inch, or DPI, because it drives both the image quality and the size of the file.  DPI is the number of pixels captured along one inch of the page, in each direction.  300 DPI, for example, means that each square inch of the page is captured as 300×300, or 90,000, pixels.  See the recommendations below on setting the DPI in your scanner.
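
For a rough feel of how DPI and color mode drive file size, the sketch below estimates the number of pixels and the uncompressed size of one page.  It assumes a US Letter page (8.5 × 11 inches) and ignores file headers and row padding, so treat the results as ballpark figures rather than what your scanner will actually write.

```python
def page_pixels(dpi: int, width_in: float = 8.5, height_in: float = 11.0) -> int:
    """Total pixels captured for one page at the given DPI."""
    return round(width_in * dpi) * round(height_in * dpi)


def raw_size_bytes(dpi: int, bits_per_pixel: int,
                   width_in: float = 8.5, height_in: float = 11.0) -> int:
    """Approximate uncompressed size of one page, in bytes."""
    total_bits = page_pixels(dpi, width_in, height_in) * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


for dpi in (200, 300):
    for label, bits in (("bi-tonal", 1), ("24-bit color", 24)):
        size_kb = raw_size_bytes(dpi, bits) / 1024
        print(f"{dpi} DPI, {label}: {page_pixels(dpi):,} pixels, "
              f"about {size_kb:,.0f} KB uncompressed")
```

At 300 DPI a letter-size page works out to 2,550 × 3,300 = 8,415,000 pixels.  Even at one bit per pixel that is roughly 1 MB before compression, which is why the compression step described next matters so much.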

Compression

OK, now we’re getting really nerdy: why do we care about compression?  Compression becomes very important when you don’t have it.  Almost all images are stored with some sort of compression to reduce the file size and make the image easier to move around.  A JPG file, for example, uses a type of compression that is very efficient at storing photographs.  One of the most popular ways of storing a black and white image is in TIFF format with CCITT T.4 compression.  The good news is that the better scanners generally do all this for you, and you don’t need to worry about it.

A word of caution about compression: some types of compression actually cause data in the image to be lost.  This is referred to as “lossy compression”.  In most cases the loss is irrelevant because enough of the original is preserved for human viewing.  JPG is one of the most popular types of image compression, and it is lossy.  TIFF CCITT T.4 compression is lossless, which is one reason it is commonly recommended for black and white scanning.  To put this into perspective: virtually every photographic process loses data; it is just the nature of photography.  So the act of scanning means you are creating a lower quality reproduction of the original.  From a legal perspective you should be conscious of what and how you are scanning to ensure that the loss of data is not altering evidence.  In some cases you can avoid scanning altogether, as described in this blog post.
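
To show what “lossless” means in practice, here is a toy run-length encoder in Python.  CCITT T.4 builds on the same basic idea (recording how long each run of white or black pixels is) but adds its own code tables, so this is only an illustration of the principle and not the actual T.4 algorithm.  The sample row of pixels is made up.

```python
from itertools import groupby


def rle_encode(row: str) -> list[tuple[str, int]]:
    """Collapse a row of '0'/'1' pixels into (pixel value, run length) pairs."""
    return [(pixel, len(list(run))) for pixel, run in groupby(row)]


def rle_decode(runs: list[tuple[str, int]]) -> str:
    """Rebuild the original row of pixels from its runs."""
    return "".join(pixel * length for pixel, length in runs)


# Hypothetical scan line: mostly white ('0') with two short black marks ('1').
row = "0" * 40 + "1" * 6 + "0" * 30 + "1" * 4 + "0" * 20

runs = rle_encode(row)
assert rle_decode(runs) == row  # lossless: decoding gives back every pixel

print(f"{len(row)} pixels stored as {len(runs)} runs: {runs}")
```

Because a typical page of text is mostly long stretches of white, this kind of scheme shrinks a black and white image dramatically without throwing anything away.  A lossy format like JPG, by contrast, would not pass the round-trip check above.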

What is the big deal about PDFs?

The Adobe Portable Document Format (PDF) has become the standard for storing documents in the digital age.  It became the standard mainly because Adobe made a genius business decision to freely distribute the Adobe Acrobat Reader application to everyone, at a time when other formats sometimes required proprietary technology to view their documents.  The PDF format is very sophisticated and is often used as a wrapper around other image formats.  For example, TIFF and JPG files can be stored in a PDF wrapper, which makes them viewable in a consistent way.  One of the best things PDF does is allow multiple pages to be stored as a single document (something JPG does not allow).  Another very popular PDF feature is that it can store the text of the document and not just an image, which allows the text to be searched or copied and pasted.  Today most modern scanners and other modern software products, like Microsoft Office, can save directly to PDF format, so you should not need to buy any additional products to create PDF files.

All scanners are not created equal

So you need a scanner; what should you buy?  This is one of those “you get what you pay for” situations.  There are many scanners to choose from, and most will do an adequate job.  A lot of Multi-Function Printers (MFPs) do a decent job of scanning.  You can even scan a paper document using a modern smartphone along with a scanning app (see the related blog post).  However, if you are a professional law firm and you’re going to scan more than a few pages a week, we recommend you consider buying a true production scanner, which will save you a lot of money and headaches in the long run.  You can purchase a production scanner, preconfigured to work with MiFILE, at this page.

Recommendations

  1. Scan using Bi-tonal (Black and White) wherever possible to keep file size down.  For most court documents the color in the document is not important to the process.
  2. If you are scanning a black and white document, scan at 200-300 DPI into a TIFF or PDF file (your file size should be less than 75K per page).
  3. If you are scanning photographs or color documents where the color is important, scan at 150-200 DPI (32K color depth) into a JPG or PDF file (your file size should be less than 200K per page).  Some courts don’t accept color files, so check the file-stamped copy that is returned to you to make sure it represents what you intended to submit to the court.
  4. In the above two scenarios, if your file sizes are significantly higher than what is shown, then you are probably not using compression correctly.  You may need to experiment with these settings.
  5. If you are a professional law firm and you’re going to scan more than a few pages a week, we recommend buying a true production scanner which will save you a lot of money and headaches in the long run.  You can purchase a production scanner, preconfigured to work with MiFILE at this page.
  6. Once the scanner is working the way you want it, save the settings so that you can reuse them next time.  Most people will save separate settings for color photographs, color documents, and black & white documents.
  7. Newer scanners have some really nice software features built in, such as blank-page delete, auto-cropping/page size detection, deskew, and auto-orientation.  Become familiar with these features; they can really help create a good-looking document image.
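
If you would like to check your scans against the per-page size targets in recommendations 2 and 3, a small Python helper like the one below can do the arithmetic.  The function name and the mode labels are our own inventions for this sketch, and it assumes “K” means 1,024 bytes.

```python
# Per-page targets from the recommendations above, in KB per page.
BUDGET_KB = {"bitonal": 75, "color": 200}


def over_budget(file_size_bytes: int, page_count: int, mode: str) -> bool:
    """Return True if the average page size exceeds the target for this mode."""
    if page_count <= 0:
        raise ValueError("page_count must be at least 1")
    kb_per_page = file_size_bytes / 1024 / page_count
    return kb_per_page > BUDGET_KB[mode]


# Made-up example: a 10-page black and white scan that came out at 2 MB.
size_bytes = 2 * 1024 * 1024
if over_budget(size_bytes, page_count=10, mode="bitonal"):
    print("Pages are larger than expected; check the compression and DPI settings.")
```

For a real file you could get the size with `os.path.getsize(path)` from the Python standard library and count the pages in your PDF or TIFF viewer.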

 
