Slow XML import

Former Member
Hi,

We have a conversion mechanism running that converts legacy PostScript output to generic PDF with formatted XML (one-to-one, same naming).
About 5000 files are generated daily and imported with an external file source job that uses the XML and XPath expressions for the metadata.
It's all working nicely, however, the import into M-Files is so slo...o..w.....

Does anyone have experience speeding up this process, for example by increasing the number of files per import (say 500 instead of 100)?
The (virtual) hardware shouldn't be the issue, nor the SQL vault database.

We have a backlog of let's say a few million files, so any speed increase would be great.


PS: best wishes all!
  • Are you using text recognition (OCR) in the file source job? OCR processing uses a large share of the available CPU resources during import, which can slow it down. If you have OCR enabled, try running the import job with it disabled and see what effect that has. If the source PDFs are already text-searchable, OCR processing is unnecessary anyway.

    If you are not using OCR, could you please describe the current speed of the import? How long does it take to import, say, 100 documents?
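Once the time for a batch of 100 documents is known, a rough estimate of how long the backlog would take can be sketched as below. This is only back-of-the-envelope arithmetic, not anything provided by M-Files: the function name `backlog_days` is hypothetical, and the 600 seconds per batch and the 2,000,000-file backlog in the usage lines are made-up placeholders to be replaced with measured values. Only the 5000 new files per day comes from the original post.

```python
def backlog_days(backlog_files, files_per_batch, seconds_per_batch, daily_new_files=0):
    """Estimate the days needed to clear a backlog at a measured import rate.

    All inputs are measurements or assumptions supplied by the caller.
    Returns float('inf') if the import cannot keep up with the daily intake.
    """
    # Sustained throughput if the job ran around the clock at the measured rate.
    files_per_day = files_per_batch / seconds_per_batch * 86_400
    # New files arriving each day reduce the capacity left for the backlog.
    net_per_day = files_per_day - daily_new_files
    if net_per_day <= 0:
        return float("inf")
    return backlog_files / net_per_day


# Placeholder numbers: 100 files per 600 s, a 2,000,000-file backlog, 5000 new files/day.
days = backlog_days(2_000_000, 100, 600, daily_new_files=5000)
print(f"Estimated days to clear the backlog: {days:.0f}")
```

Comparing this figure for a run with OCR enabled and one with it disabled gives a quick sense of whether that setting alone makes the backlog manageable.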