Import data from XML file with namespace

Hi All!

We are trying to read an XML file (and a PDF too) from an external source.

If the XML file contains namespaces, we cannot set up the rule correctly on the metadata tab.

For example:

<?xml version="1.0" encoding="UTF-8"?><InvoiceData xmlns="">schemas.nav.gov.hu/.../data" xmlns:ns2="">schemas.nav.gov.hu/.../base" xmlns:ns3="">invoice.dto.icon.icellmobilsoft.hu/invoice">
<invoiceNumber>INVOICENUMBER</invoiceNumber>
<invoiceIssueDate>2021-12-31</invoiceIssueDate>
<completenessIndicator>false</completenessIndicator>
<invoiceMain>
<invoice>
<invoiceHead>
<supplierInfo>
<supplierTaxNumber>
<ns2:taxpayerId>12345678</ns2:taxpayerId>
<ns2:vatCode>1</ns2:vatCode>
<ns2:countyCode>12</ns2:countyCode>
</supplierTaxNumber>
<supplierName>"Company" KFT</supplierName>
<supplierAddress>
<ns2:simpleAddress>
<ns2:countryCode>HU</ns2:countryCode>
<ns2:postalCode>1234</ns2:postalCode>
<ns2:city>Budapest</ns2:city>
<ns2:additionalAddressDetail>Street details</ns2:additionalAddressDetail>
</ns2:simpleAddress>
</supplierAddress>
<supplierBankAccountNumber>12345678-12345678</supplierBankAccountNumber>
<individualExemption>false</individualExemption>
</supplierInfo>
<customerInfo>
<customerVatStatus>DOMESTIC</customerVatStatus>
<customerVatData>
<customerTaxNumber>
<ns2:taxpayerId>12345678</ns2:taxpayerId>
<ns2:vatCode>1</ns2:vatCode>
<ns2:countyCode>12</ns2:countyCode>
</customerTaxNumber>
</customerVatData>
<customerName>Customer</customerName>
<customerAddress>
<ns2:simpleAddress>
<ns2:countryCode>HU</ns2:countryCode>
<ns2:postalCode>1234</ns2:postalCode>
<ns2:city>Budapest</ns2:city>
<ns2:additionalAddressDetail>Street details</ns2:additionalAddressDetail>
</ns2:simpleAddress>
</customerAddress>
</customerInfo>
<invoiceDetail>
<invoiceCategory>NORMAL</invoiceCategory>
<invoiceDeliveryDate>2021-12-31</invoiceDeliveryDate>
<currencyCode>HUF</currencyCode>
<exchangeRate>1</exchangeRate>
<selfBillingIndicator>false</selfBillingIndicator>
<paymentMethod>TRANSFER</paymentMethod>
<paymentDate>2021-12-31</paymentDate>
<cashAccountingIndicator>false</cashAccountingIndicator>
<invoiceAppearance>PAPER</invoiceAppearance>
</invoiceDetail>
</invoiceHead>
<invoiceLines>
<mergedItemIndicator>false</mergedItemIndicator>
<line>
<lineNumber>1</lineNumber>
<lineExpressionIndicator>true</lineExpressionIndicator>
<lineDescription>Row details</lineDescription>
<quantity>1</quantity>
<unitOfMeasure>PIECE</unitOfMeasure>
<unitPrice>15000</unitPrice>
<unitPriceHUF>15000</unitPriceHUF>
<lineAmountsNormal>
<lineNetAmountData>
<lineNetAmount>15000</lineNetAmount>
<lineNetAmountHUF>15000</lineNetAmountHUF>
</lineNetAmountData>
<lineVatRate>
<vatPercentage>0.00</vatPercentage>
</lineVatRate>
<lineVatData>
<lineVatAmount>0</lineVatAmount>
<lineVatAmountHUF>0</lineVatAmountHUF>
</lineVatData>
</lineAmountsNormal>
</line>
</invoiceLines>
<invoiceSummary>
<summaryNormal>
<summaryByVatRate>
<vatRate>
<vatPercentage>0.0</vatPercentage>
</vatRate>
<vatRateNetData>
<vatRateNetAmount>0</vatRateNetAmount>
<vatRateNetAmountHUF>0</vatRateNetAmountHUF>
</vatRateNetData>
<vatRateVatData>
<vatRateVatAmount>0</vatRateVatAmount>
<vatRateVatAmountHUF>0</vatRateVatAmountHUF>
</vatRateVatData>
<vatRateGrossData>
<vatRateGrossAmount>15000</vatRateGrossAmount>
<vatRateGrossAmountHUF>15000</vatRateGrossAmountHUF>
</vatRateGrossData>
</summaryByVatRate>
<invoiceNetAmount>15000</invoiceNetAmount>
<invoiceNetAmountHUF>15000</invoiceNetAmountHUF>
<invoiceVatAmount>0</invoiceVatAmount>
<invoiceVatAmountHUF>0</invoiceVatAmountHUF>
</summaryNormal>
<summaryGrossData>
<invoiceGrossAmount>15000</invoiceGrossAmount>
<invoiceGrossAmountHUF>15000</invoiceGrossAmountHUF>
</summaryGrossData>
</invoiceSummary>
</invoice>
</invoiceMain>
</InvoiceData>

How can we read values from the XML, for example the invoiceNumber and taxpayerId tags?

We have tried many configurations but cannot read anything from the XML.

Thank you for any information about it!

Best regards,

Kornél

  • Have you seen the note on namespaces in the user guide? Step 10 on this page: https://www.m-files.com/user-guide/latest/eng/defining_metadata_for_external_file_source.html

    I think you need to use something like local-name()='taxpayerId' in the XPath expression to access the value.

  • Hi Joonas! 

    Yes, we read the user guide. :)

    I tried your formula, and this is what the event log shows:

    M-Files Online
    Bemutato {091FD1EB-4F80-4CDE-AB7F-5C962F560810}

    Error reading the property value: Megnevezés (11039244-krfn-2021-592nav.pdf)
    Unspecified error. (NodeTest must be specified.

    InvoiceData/invoiceMain/invoice/invoiceHead/supplierInfo/supplierTaxNumber/-->local-name<--()='taxpayerId')

    ExternalFileMonitor.cpp, 5606, Error reading the property value: Megnevezés (11039244-krfn-2021-592nav.pdf) (0x800400E3)
    ExternalFileMonitor.cpp, 5606, Unspecified error. (NodeTest must be specified.

    InvoiceData/invoiceMain/invoice/invoiceHead/supplierInfo/supplierTaxNumber/-->local-name<--()='taxpayerId') (0x80004005)
    ExternalFileMonitor.cpp, 5694, Unspecified error. (NodeTest must be specified.

    InvoiceData/invoiceMain/invoice/invoiceHead/supplierInfo/supplierTaxNumber/-->local-name<--()='taxpayerId') (0x80004005)
    ExternalFileMonitor.cpp, 6424, Unspecified error. (NodeTest must be specified.

    InvoiceData/invoiceMain/invoice/invoiceHead/supplierInfo/supplierTaxNumber/-->local-name<--()='taxpayerId') (0x80004005)
    ExternalFileMonitor.cpp, 6460, Unspecified error. (NodeTest must be specified.

    InvoiceData/invoiceMain/invoice/invoiceHead/supplierInfo/supplierTaxNumber/-->local-name<--()='taxpayerId') (0x80004005)
    (M-Files 21.9.10629.5)

    In English, the error message says that a NodeTest must be specified.

    I used this XPath:

    InvoiceData/invoiceMain/invoice/invoiceHead/supplierInfo/supplierTaxNumber/local-name()='taxpayerId'

  • There are two points to consider in this scenario: first the XML structure, then the XPath.

    The provided XML example is not well-formed because the root node leaves out the namespace declarations. When I load the XML, I get an error about the missing namespace declarations. The namespace declarations should be inside the <InvoiceData> tag.

    Once that is fixed, the following XPath returns the value from the <taxpayerId> node:

    /*[name()='InvoiceData']/*[name()='invoiceMain']/*[name()='invoice']/*[name()='invoiceHead']/*[name()='supplierInfo']/*[name()='supplierTaxNumber']/*[local-name()='taxpayerId']
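
For anyone who wants to check both points outside M-Files, here is a minimal sketch in Python using only the standard library. It is not how M-Files evaluates the rule; it only illustrates the two ideas from the reply above: an XML file with an undeclared prefix does not load at all, and once the declarations are in place, elements must be matched with their namespace (or with a wildcard, which plays the role of local-name()). The namespace URIs below are placeholders, because the real ones are elided in the thread, and the function names are made up for this example. As a side note, the earlier attempt failed because local-name()='taxpayerId' was written as a path step after the slash; in XPath 1.0 a function call is only valid inside a predicate such as *[local-name()='taxpayerId'].

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs: the real ones are elided in the thread,
# so these values are assumptions for illustration only.
DATA_NS = "urn:example:data"
BASE_NS = "urn:example:base"

# Trimmed-down invoice with the declarations inside the root tag.
WELL_FORMED = f"""<InvoiceData xmlns="{DATA_NS}" xmlns:ns2="{BASE_NS}">
  <invoiceNumber>INVOICENUMBER</invoiceNumber>
  <invoiceMain><invoice><invoiceHead><supplierInfo>
    <supplierTaxNumber>
      <ns2:taxpayerId>12345678</ns2:taxpayerId>
    </supplierTaxNumber>
  </supplierInfo></invoiceHead></invoice></invoiceMain>
</InvoiceData>"""

# Same idea, but the ns2 prefix is never declared on any element.
MISSING_DECLARATION = (
    "<InvoiceData><ns2:taxpayerId>12345678</ns2:taxpayerId></InvoiceData>"
)


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def read_invoice_fields(xml_text: str) -> dict:
    """Read the two values asked about in the thread, in several ways."""
    root = ET.fromstring(xml_text)
    ns = {"d": DATA_NS, "b": BASE_NS}  # local prefixes, chosen freely
    tax_id_path = (
        "d:invoiceMain/d:invoice/d:invoiceHead/d:supplierInfo/"
        "d:supplierTaxNumber/b:taxpayerId"
    )
    return {
        # Namespace-qualified lookups find the elements.
        "invoiceNumber": root.findtext("d:invoiceNumber", namespaces=ns),
        "supplierTaxpayerId": root.findtext(tax_id_path, namespaces=ns),
        # A plain, unqualified name does not match a namespaced element.
        "unqualifiedLookup": root.findtext("invoiceNumber"),
        # "{*}" (Python 3.8+) matches any namespace, much like local-name().
        "anyNamespace": root.findtext(".//{*}taxpayerId"),
    }


if __name__ == "__main__":
    print(is_well_formed(MISSING_DECLARATION))
    print(read_invoice_fields(WELL_FORMED))
```

The unqualified lookup returning nothing is the same symptom as a plain InvoiceData/invoiceMain/... path finding no value in a namespaced file.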