Automatic metadata based on document scanning

Hi all,

I want to scan different types of documents and keep them in a network folder. M-Files will scan them and perform OCR before importing them into M-Files.

Then I would like all the metadata to be populated automatically so that users do not have to check every document one by one.

For automatic classification, we can use M-Files Discovery. However, I am not sure about the automatic metadata part.

The Information Extractor only suggests metadata, but we need the metadata to be applied automatically.

The only option I can think of is the OCR source value configuration, but that only works when you scan the same type of document with the same layout. In this case, the documents and layouts will vary, and I am not sure whether we can retrieve the content as metadata.

  • M-Files Discovery can add metadata automatically using Information Extractor and Smart Metadata starting from the August '21 release.

    Whether or not this will work in your case really depends on what kind of information you need to extract from the documents: for instance, whether it always follows a well-defined pattern that can be configured with a regular expression. If it is more like line-item data (from an invoice, for instance), these intelligent services won't be able to help, and you will probably need to look into third-party capture solutions with more advanced extraction options.
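To make the "well-defined pattern" case concrete, here is a minimal Python sketch of pulling two values out of OCR text with regular expressions. It only illustrates the idea and is not how M-Files configures or runs its extraction: the sample texts, the `CN-YYYY-NNNN` contract number format, the property names, and the `extract_metadata` helper are all assumptions made up for this example.

```python
import re

# Hypothetical OCR output from two scanned documents with different layouts.
contract_text = "Service Agreement\nContract No: CN-2021-0042\nSigned on 2021-08-15"
letter_text = "Dear customer,\nregarding your contract no. CN-2020-0917 dated 2020-03-01 ..."

# Assumed patterns; real ones depend on how the values appear in your documents.
PATTERNS = {
    "contract_number": re.compile(r"\bCN-\d{4}-\d{4}\b"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def extract_metadata(text):
    """Return the first match of each pattern, or None when it is absent."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(0) if match else None
    return result


print(extract_metadata(contract_text))
print(extract_metadata(letter_text))
```

Because the patterns describe the value itself rather than its position on the page, the same rules work for both layouts. Line-item data, where the number of rows and their meaning vary from document to document, does not reduce to a fixed pattern like this.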

  • Hi Joonas, 

    I really appreciate your response. To make sure we are on the same page, let me give you a brief outline of the flow.

    Normally, this is what we do.

    The user adds a document to M-Files -> The document is analyzed for metadata suggestions -> Metadata suggestions are generated -> The user tags the document with metadata -> The document is added to the vault

    I want to remove the two extra steps so that the flow looks like this:

    The user adds a document to M-Files -> The document is analyzed for metadata suggestions -> The document is added to the vault automatically with the suggested metadata.

    So from the end user's perspective, they don't have to do anything (e.g. choose a suggestion). The files could be imported automatically from external file sources (e.g. a scanned-file folder), and M-Files would automatically choose the best metadata for them without the user having to pick a suggestion. It would be transparent to the user.
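The "choose the best metadata without the user" step can be sketched roughly as below. This is a thought experiment in plain Python, not M-Files functionality: the `Suggestion` type, the confidence score, the `pick_metadata` helper, and the 0.8 threshold are all hypothetical, and whether suggestions carry a usable confidence value at all is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    """Hypothetical metadata suggestion; not an M-Files type."""
    prop: str          # property name, e.g. "Customer"
    value: str         # suggested value
    confidence: float  # assumed score between 0 and 1


def pick_metadata(suggestions, threshold=0.8):
    """Keep the highest-confidence suggestion per property, if it passes the threshold."""
    best = {}
    for s in suggestions:
        if s.confidence < threshold:
            continue  # too uncertain to apply without a human check
        current = best.get(s.prop)
        if current is None or s.confidence > current.confidence:
            best[s.prop] = s
    return {prop: s.value for prop, s in best.items()}


suggestions = [
    Suggestion("Customer", "Acme", 0.95),
    Suggestion("Customer", "Acme Ltd", 0.85),
    Suggestion("Project", "Alpha", 0.40),
]
print(pick_metadata(suggestions))
```

With a threshold like this, properties that have no confident suggestion are simply left empty, so some documents would still need a manual check.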

  • Yes, M-Files Discovery works in the background and can apply metadata automatically to documents both in the vault and in connected external repositories (for instance, if you have the Network Folder Connector configured to access files on a network share). This background operation works at its own pace, so the metadata is not applied immediately when the user saves the document to the vault, but at some later point as Discovery works through the vault content.