How to convert raster to text by command line?

VeryDOC Raster to Text OCR Converter Command Line converts raster images to text. The supported image file formats are TIFF, JPG, PNG, BMP, GIF, PCX, TGA, JP2, PNM and MNG. The software can also convert PDF files, including image-only (scanned) PDF files, to text from the command line. The following steps show how to use it.

Step 1. Download Raster to Text OCR Converter

  • When the download finishes, you will have a zip file. Extract it to a folder, then call the executable file from an MS-DOS command prompt window.
  • This is a Windows application; for now it cannot be used on Mac or Linux systems.

Step 2. Convert raster to text by command line.

  • When you use this software, please refer to its usage and examples.
  • Here is the usage for your reference:  pdf2txtocr.exe [options] <PDF-file> <Text-file>
  • To convert raster images to text, follow these command line templates.
  • pdf2txtocr.exe C:\in.tif C:\out.txt
    pdf2txtocr.exe C:\in.jpg C:\out.txt
    pdf2txtocr.exe C:\in.bmp C:\out.txt
    pdf2txtocr.exe C:\in.png C:\out.txt
    pdf2txtocr.exe C:\in.pcx C:\out.txt
    pdf2txtocr.exe C:\in.tga C:\out.txt
    pdf2txtocr.exe C:\in.pnm C:\out.txt
    pdf2txtocr.exe C:\in.mng C:\out.txt
    When converting raster image files to text, simply give the full paths of the input file and the output file; no other parameters are needed. To convert image files to text in batch, use wildcard characters as in the following command line template.
    pdf2txtocr.exe C:\*.tif C:\*.txt
    To improve the OCR recognition rate, you can convert the image to PDF first, because when converting raster to PDF you can adjust the image threshold and rotate the image by a given number of degrees.
    pdf2txtocr.exe -ocrmode 3 -threshold 200 -ocr C:\in.tif C:\out.pdf
    pdf2txtocr.exe -ocrmode 4 -rotate 90 -ocr C:\in.tif C:\out.pdf
    Now let us check the related parameters:
    -rotate <int>       : rotate pages before OCR.
    -threshold <int>    : adjust the lightness threshold used to convert the image to B&W.
    -ocr                : enable the OCR function when converting image files or scanned PDF files to text or searchable PDF files.
    -ocrmode <int>      : set the OCR mode
      -ocrmode 4: output to OCRed PDF file (Color) with hidden text layer
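
If you would rather drive the converter from a script than type each command, the templates above can be assembled programmatically. The following is only a minimal Python sketch, not part of the VeryDOC product: the helper names (build_command, convert, convert_folder) are hypothetical, it assumes pdf2txtocr.exe is on PATH or in the current folder, and it uses only the options shown in this article. Options are placed before the two file paths, in the same order as the templates.

```python
import subprocess
from pathlib import Path

# Assumption: pdf2txtocr.exe is on PATH or in the current folder.
EXE = "pdf2txtocr.exe"


def build_command(src, dst, ocr=False, ocrmode=None, threshold=None, rotate=None):
    """Return the argument list for one conversion.

    Only options mentioned in this article are supported, and they are
    emitted before the input and output paths, as in the templates.
    """
    cmd = [EXE]
    if ocrmode is not None:
        cmd += ["-ocrmode", str(ocrmode)]
    if threshold is not None:
        cmd += ["-threshold", str(threshold)]
    if rotate is not None:
        cmd += ["-rotate", str(rotate)]
    if ocr:
        cmd.append("-ocr")
    cmd += [str(src), str(dst)]
    return cmd


def convert(src, dst, **options):
    """Run one conversion (Windows only) and return the finished process.

    The meaning of the tool's exit codes is not documented in this article,
    so the caller decides how to interpret result.returncode.
    """
    return subprocess.run(build_command(src, dst, **options))


def convert_folder(folder, pattern="*.tif"):
    """Convert every matching image in a folder to a .txt file beside it.

    This loops in Python instead of passing wildcards to the tool.
    """
    for src in sorted(Path(folder).glob(pattern)):
        convert(src, src.with_suffix(".txt"))


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe anywhere.
    print(build_command(r"C:\in.tif", r"C:\out.txt"))
    print(build_command(r"C:\in.tif", r"C:\out.pdf", ocr=True, ocrmode=4, rotate=90))
```

Building the command as a list and passing it to subprocess.run avoids quoting problems when a path contains spaces. To run a conversion for real, call convert(r"C:\in.tif", r"C:\out.txt") on a Windows machine where the executable has been extracted.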

With this function, you can extract text from raster image files. You can also convert raster images to searchable PDF files. When the output file is a PDF file, there are many parameters available for you to choose from. To check more functions and parameters of this software, please visit its homepage. If you have any questions while using it, please contact us as soon as possible. Now let us check the conversion effect in the following snapshots.

input tiff file
                       This is from input tiff file.

output text from PDF
   This is from output text file.
