How to convert a multi-page PDF to text and insert a page break symbol?

     In this article, I will show you how to convert a multi-page PDF to text and insert a page break symbol between pages from the command line. The software I will use is VeryDOC Raster to Text OCR Converter Command Line, which can recognize text in many languages in a PDF and convert it to plain text. You can find more information on the homepage; in the following part, I will show you how to use this software.

Step 1. Download Raster to Text OCR Converter Command Line 

  • Although this software is named Raster to Text Converter, it supports many input file types, such as scanned PDF, text-based PDF, TIFF, JPG, PNG, BMP, GIF, PCX, TGA and others. So it can help you convert both scanned and text-based PDF files to plain text and insert page break symbols.
  • When the download finishes, you will have a zip file. Extract it to a folder; then you can call the executable file from an MS-DOS window.

Step 2.  Convert multi-page PDF to text and insert page break symbol.

  • When you use this software, please follow its usage rules and the example templates below.
  • Here is the usage for your reference:  pdf2txtocr.exe [options] <PDF-file> <Text-file>
  • To convert a text-based PDF to text and insert page breaks, please refer to the following command line templates.
  • pdf2txtocr.exe C:\in.pdf C:\out.txt
    With this simple command line, we can convert PDF to text and insert page breaks automatically.
    pdf2txtocr.exe -firstpage 1 -lastpage 1 C:\in.pdf C:\out.txt
    With this command line, we can convert PDF to text and choose the page range to convert.
    pdf2txtocr.exe -ownerpwd 123 -userpwd 456 C:\in.pdf C:\out.txt
    With this command line, we can convert a password-protected PDF file to text and insert page breaks.
    pdf2txtocr.exe -layout C:\in.pdf C:\out.txt
    With this command line, we can convert PDF to text and maintain the original layout.
    Do not be surprised that none of the command lines above contains a page break parameter: this software inserts page breaks automatically when it converts PDF to text. If you do not need page breaks, please add this parameter:
    -noc                : don't insert page breaks 0x0C between pages in text file. 
    The command lines above can only be used to convert text-based PDF to text and insert page breaks.
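
Because the page break written between pages is the character 0x0C (form feed), the output text can be split back into pages afterwards. The following is a minimal Python sketch of that idea; it is not part of the VeryDOC software, and the function name `split_pages` and the sample string are made up for illustration.

```python
def split_pages(text: str) -> list[str]:
    """Split converter output into pages on the form feed (0x0C) character."""
    pages = text.split("\f")  # "\f" is the same character as 0x0C
    # A form feed after the last page would leave one empty entry at the end.
    if pages and pages[-1].strip() == "":
        pages.pop()
    return pages


# Hypothetical sample standing in for the contents of C:\out.txt.
sample = "Text of page 1\n\fText of page 2\n\f"
for number, page in enumerate(split_pages(sample), start=1):
    print(f"page {number}: {page.strip()}")
```

To use it on a real output file, read the file into a string first, for example with `open(r"C:\out.txt", encoding="utf-8", errors="replace").read()`. The encoding of the output file is an assumption here and may need adjusting.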

  • To convert an image-based PDF to text and insert page breaks, please refer to the following command line templates.
  • pdf2txtocr.exe -ocr -lang eng -ocrmode 0 C:\in.pdf C:\out.txt
    pdf2txtocr.exe -ocr -lang deu -ocrmode 1 C:\in.pdf C:\out.pdf
    You need to add the -ocr parameter to enable the OCR function; then you can run the conversion successfully.
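
If you call the converter from a script rather than typing the commands by hand, the templates above can be assembled programmatically. The following Python sketch is only an illustration: the helper names `build_command` and `convert` are hypothetical, it uses only the parameters shown in this article, and whether all of them can be combined freely is an assumption that should be checked against readme.txt.

```python
import subprocess


def build_command(pdf_path, txt_path, *, first_page=None, last_page=None,
                  layout=False, page_breaks=True, ocr_lang=None,
                  ocr_mode=None, exe="pdf2txtocr.exe"):
    """Build an argument list following: pdf2txtocr.exe [options] <PDF-file> <Text-file>."""
    args = [exe]
    if first_page is not None:
        args += ["-firstpage", str(first_page)]
    if last_page is not None:
        args += ["-lastpage", str(last_page)]
    if layout:
        args.append("-layout")      # maintain the original layout
    if not page_breaks:
        args.append("-noc")         # do not insert 0x0C between pages
    if ocr_lang is not None:
        # Image-based PDF: enable OCR and choose the language, e.g. "eng" or "deu".
        args += ["-ocr", "-lang", ocr_lang]
        if ocr_mode is not None:
            args += ["-ocrmode", str(ocr_mode)]
    args += [pdf_path, txt_path]
    return args


def convert(pdf_path, txt_path, **options):
    """Run the converter; assumes pdf2txtocr.exe is on PATH or passed via exe=."""
    subprocess.run(build_command(pdf_path, txt_path, **options), check=True)


# Print the command for an OCR conversion without running it.
print(" ".join(build_command(r"C:\in.pdf", r"C:\out.txt", ocr_lang="eng", ocr_mode=0)))
```

Passing the arguments as a list avoids quoting problems when a file path contains spaces.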

There are more examples and parameters in readme.txt; please check there for details, as I cannot list all of them here. If you have any questions while using the software, please contact us.
