
M-Files as storage for large, frequently produced documents

Hi there,

I was approached today with a requirement to store large technical reports containing image snippets and summaries. These reports are usually 250-500 MB each, in PDF format.
Another point is that these reports could easily be produced 1-10 times per month.
So if I calculate a middle case of 5 documents of 300 MB per month, I arrive at 60 documents per year with a total size of about 17.6 GB.
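
To make the whole range explicit, here is a quick back-of-envelope calculation (just a sketch; the low/mid/high figures are the ones assumed above, and any version history kept in the vault would add to these numbers):

    # Back-of-envelope yearly storage for the report scenario above.
    # (documents per month, MB per document) -- the figures assumed in this thread.
    scenarios = {
        "low":  (1, 250),
        "mid":  (5, 300),   # the middle case from the post
        "high": (10, 500),
    }
    for name, (per_month, size_mb) in scenarios.items():
        docs_per_year = per_month * 12
        gb_per_year = docs_per_year * size_mb / 1024
        print(f"{name:>4}: {docs_per_year:3d} docs/year, ~{gb_per_year:.1f} GB/year")
    # mid -> 60 docs/year, ~17.6 GB/year (before any version-history overhead)
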
I did some load tests with 200-250 MB files over the REST-based API and it was not particularly fast (obviously due to the size of the documents). I am currently not aware of the complexity of the metadata for that specific document type; this can of course influence upload speed as well.
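
For context, my load test followed roughly this pattern (a minimal sketch against the M-Files Web Service (MFWS) REST endpoints as I understand them; the server URL, credentials, vault GUID, class ID 42 and the document title are placeholders, so check the exact paths and payloads against the MFWS documentation):

    import requests

    BASE = "https://mfiles.example.com/REST"  # placeholder server URL

    # 1. Get an authentication token for the vault (placeholder credentials/GUID).
    token = requests.post(f"{BASE}/server/authenticationtokens", json={
        "Username": "loadtest",
        "Password": "secret",
        "VaultGuid": "{00000000-0000-0000-0000-000000000000}",
    }).json()["Value"]
    headers = {"X-Authentication": token}

    # 2. Upload the raw file content; the server answers with upload info
    #    describing the temporary file it now holds.
    with open("report.pdf", "rb") as f:
        uploads = requests.post(f"{BASE}/files", headers=headers,
                                files={"file": ("report.pdf", f)}).json()

    # 3. Create the document object (object type 0) referencing the upload.
    #    PropertyDef 100 = class (lookup), PropertyDef 0 = name or title (text).
    creation = {
        "PropertyValues": [
            {"PropertyDef": 100, "TypedValue": {"DataType": 9, "Lookup": {"Item": 42}}},
            {"PropertyDef": 0, "TypedValue": {"DataType": 1, "Value": "Technical report"}},
        ],
        "Files": uploads,
    }
    obj = requests.post(f"{BASE}/objects/0", headers=headers, json=creation).json()
    print("Created object version:", obj["ObjVer"])

Timing steps 2 and 3 separately should show whether the raw upload or the object creation (metadata handling and indexing) dominates.
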
I am not aware of any document size limit in M-Files, but I am wondering whether M-Files is a proper solution for such a use case.
I am also worried that the vault will grow quite large and that this could influence the other document types used in the system (50-100 document types are planned).

What are your experiences with large, frequently produced documents? I would appreciate any insights.

Dejan
  • Hi Karl,

    Thanks for the summary and insights. That is very encouraging.

    I am wondering if you could share the server setup (RAM, disk size, and disk throughput, since I have read throughput is very important for fast storage), as I imagine the setup is key. I know about the general setup that M-Files recommends, which I suppose is good guidance. I would be very curious about the type of disks and the disk throughput, since the documents are saved on disk. In our case we would have a shared drive provided by NetApp which could grow, so we could prepare ourselves for a growth scenario.
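
    In case it helps others compare, the kind of raw sequential-write check I have in mind for the share is a simple timed write like this (a sketch only; the UNC path and sizes are placeholders):

        import os, time

        PATH = r"\\netapp-share\mfiles\throughput_test.bin"  # placeholder share path
        CHUNK = 8 * 1024 * 1024            # write in 8 MB blocks
        TOTAL = 1024 * 1024 * 1024         # 1 GB test file

        buf = os.urandom(CHUNK)
        start = time.perf_counter()
        with open(PATH, "wb") as f:
            for _ in range(TOTAL // CHUNK):
                f.write(buf)
            f.flush()
            os.fsync(f.fileno())           # make sure the data actually reached the disk
        elapsed = time.perf_counter() - start
        print(f"Sequential write: {TOTAL / elapsed / (1024 * 1024):.0f} MB/s")
        os.remove(PATH)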

    Is the number of those 500 big ZIPs constantly growing, or is it more or less fixed? 100 incoming documents per day is quite a solid number for standard documents, so this seems to be quite a solid throughput. I also wonder which search engine the client is using, dtSearch or IDOL. I suppose the indexing services are set up on a separate server so that the index can be rebuilt quickly as new documents arrive.

    Obviously users would not preview ZIP files. Do you have any experience with previewing big 300 MB files in preview mode? I did some testing and it freezes the client for some time on a VM with 16 GB of RAM (but I have to admit many other things are running in parallel; it is our testing VM). Since in our case we plan to upload PDFs, preview mode would see realistic usage.
    I do agree with you about complexity: we currently have 2 effective document types with quite complex workflows. M-Files is amazing because it offers so much flexibility, but of course with that flexibility comes complexity as well. We are now trying to reduce the number of auto-calculated properties and to reuse as much metadata as we can. Not an easy task when you have real scenarios already running on existing solutions. We would need to grow substantially (+50 document types and then even more), so maintenance costs and maintenance best practices are points of interest/worry for me as well.

    Nevertheless, what you have summarized sounds encouraging and has a lot of potential.

    Thanks again for your insights.

    Dejan

