Converting non-indexed PDF to text indexed PDF

Question

I found a conversation from eight years ago where Mika Javanien says: 
 
 M-Files has a tool (extra cost) that will convert non-indexed PDF to text indexed PDF as a background task in M-Files server. 
 This tool is actually available in M-Files Solution catalog: catalog.m-files.com/.../ It requires M-Files OCR module. This tool can be configured to find the non-searchable documents and to convert them to searchable. 
 
 We need this tool. What is it, and where can we get it?

Joonas Linkola · Answer

This tool was deprecated years ago and is no longer available, as it was causing a lot of issues. OCR is a heavy operation on the server, and M-Files OCR is not designed for mass operations such as converting thousands of documents in the vault to searchable in one go. OCR in M-Files is basically meant for human use: upload a document from a scanner and convert it to searchable as you store it in the vault.

If mass operations are needed, the recommendation is to use dedicated software or to have OCR support directly in the scanner, so that the documents are already searchable when they are stored in the vault. If the scanner does not have OCR support, you can also convert the documents to searchable as part of the import operation in external file sources, or put them in a workflow where you convert them with a workflow script (the PerformOCROperation method in the API). Users can also convert documents to searchable manually.
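As a rough illustration of the "dedicated software" route, here is a minimal Python sketch that runs OCR on a folder of scanned PDFs before they are imported into the vault. It assumes the open-source OCRmyPDF command-line tool is installed and on the PATH. That tool is only one possible choice; it is not part of M-Files and was not mentioned in the original discussion. The function names and folder paths are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src: Path, dst: Path) -> list[str]:
    """Build the OCRmyPDF command line for one file.

    --skip-text leaves pages that already have a text layer untouched,
    so PDFs that are already searchable are not OCRed a second time.
    """
    return ["ocrmypdf", "--skip-text", str(src), str(dst)]


def convert_folder(inbox: Path, outbox: Path) -> list[Path]:
    """OCR every *.pdf in inbox into outbox, one file at a time.

    Returns the list of output files written during this run.
    Assumption: outbox is the folder that an M-Files external
    file source (or a manual import) picks documents up from.
    """
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf is not installed or not on PATH")
    outbox.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(inbox.glob("*.pdf")):
        dst = outbox / src.name
        if dst.exists():
            # Already converted in an earlier run; skip it.
            continue
        # check=True raises CalledProcessError if OCR fails for this file.
        subprocess.run(build_ocr_command(src, dst), check=True)
        written.append(dst)
    return written


if __name__ == "__main__":
    # Hypothetical folder names; adjust to your environment.
    done = convert_folder(Path("scans/incoming"), Path("scans/searchable"))
    print(f"Converted {len(done)} file(s)")
```

Processing the files sequentially on a separate machine keeps the heavy OCR load off the M-Files server, which is the point of the recommendation above.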
