Smart Metadata Model Training (Too few samples)

Hi all, 

Hope that everyone is having a good day.

I'm currently testing Smart Metadata with resumes. I want M-Files to populate the Applicant Name property automatically (without showing it as a suggestion) whenever I upload a resume.

For this, I have already installed M-Files Smart Metadata + Knowledge graph, M-Files Discovery, Information Extractor, and M-Files Classifier (to automatically classify documents into the resume class).

I have 60 sample resumes; a little over 20 of them are in Word format and the rest are PDFs.

This is the metadata for my resume class. It includes a property called Applicant Name.

      

For the Word documents (docx), I also inserted the Applicant Name property into the document itself.

However, whenever I try to train the model, it always shows the message "Too few samples for training". To make it more frustrating, the system cannot find even 1 document, even though I have 60 sample resumes and, for the Word files, I even inserted the Applicant Name property.

      

This is my Smart Metadata general settings configuration. For the service-specific settings, I just use the westeurope URL and the key provided in the partner portal.

Am I missing a step that causes M-Files to not detect even 1 document?

  • The Smart Metadata FAQ says you need approximately 100 examples of each class/property you want to be able to detect, but if the documents are very similar, fewer than that might be enough. You need to have manually set the property value for all those examples so that Smart Metadata can use them to learn what data to look for in the file content. If I understood correctly, you have done this for the Word documents but not for the PDFs so far?

    The FAQ also has further information on possible reasons why the training might not succeed.

  • Hi Joonas,

    Yup, not done for the PDFs. Noted on the minimum of about 100 documents.

    However, is it considered normal behaviour for M-Files to report 0 files detected?

    I am afraid that if I create more than 100 docs, it will still show 0.

  • That doesn't sound right. Did you set the MF_AMP_PERSON prefixed alias for the Applicant Name property definition like described in the Getting Started guide?

  • I've tried to configure the alias and retrain the model.

    However, it is still not able to detect even 1 document.

      

    Maybe I should try adding samples until I have 100?

  • Thanks Joonas.

    All is good now after I retrained the model with the prefixed alias and also specified the alias in the configuration.

  • Hi Joonas,

    I hope you don't mind me replying to a two-month-old question.

    You mentioned that we have to manually set the property value for all those examples so that Smart Metadata can use them to learn what data to look for in the file content.

    Let's say we have lots of PDF invoices provided by our supplier that we would like to store in M-Files. Since the PDFs are provided by the supplier, we cannot manually set the property value. In this kind of scenario, is it impossible to use Smart Metadata? Do we really need Word documents with our own defined properties in them?
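
For anyone hitting the same "Too few samples for training" message, the two checks that came up in this thread (enough samples with the property value actually set, and an alias starting with MF_AMP_PERSON) can be sketched as a quick pre-training sanity check. This is only an illustration in plain Python and does not call any M-Files API: the `Sample` class, the function names, and the idea of working from an exported list of documents are all assumptions made for the example. The threshold of 100 is the approximate figure quoted from the FAQ above, and fewer samples may be enough when the documents are very similar.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Approximate figure quoted from the FAQ in this thread; not a hard limit.
MIN_SAMPLES = 100
# Alias prefix mentioned in the thread; the rest of the alias format is not shown there.
ALIAS_PREFIX = "MF_AMP_PERSON"


@dataclass
class Sample:
    """Hypothetical stand-in for one training document exported from the vault."""
    file_name: str
    applicant_name: Optional[str] = None  # None or blank means the value was never set


def count_labeled(samples: List[Sample]) -> int:
    """Count the samples whose Applicant Name value is actually filled in."""
    return sum(1 for s in samples if s.applicant_name and s.applicant_name.strip())


def readiness_report(samples: List[Sample], alias: str) -> Tuple[int, List[str]]:
    """Return the labeled-sample count and a list of likely problems."""
    labeled = count_labeled(samples)
    problems: List[str] = []
    if not alias.startswith(ALIAS_PREFIX):
        problems.append(f"alias {alias!r} does not start with {ALIAS_PREFIX}")
    if labeled < MIN_SAMPLES:
        problems.append(
            f"only {labeled} labeled samples; the FAQ suggests about {MIN_SAMPLES}"
        )
    return labeled, problems
```

With the setup from the original question (about 20 Word files with the value set, 40 PDFs without it, and no prefixed alias), this sketch would flag both issues, which matches what the replies pointed out.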