I am trying to get XML validation errors with line numbers, and the fact that there is a LineNumberOffset property of [xml.xmlReaderSettings] suggests this is possible. But I can't seem to find how to either enable line numbers, or access line numbers in the resultant error. This talks about doing it in C# with LoadOptions.SetLineInfo; but that's not a valid property when I try $xmlReaderSettings.SetLineInfo = $true.

function readXMLFile ([string]$path) {
    $readXMLFile = [psCustomObject]@{    
        xml    = [xml.xmlDocument]::New()
        error = $null
    }
        
    $fileStream = $null
    $xmlreader = $null
    $importFile = [xml.xmlDocument]::New()
    $xmlReaderSettings = [xml.xmlReaderSettings]::New()
    #$xmlReaderSettings.ignoreComments = $true
    $xmlReaderSettings.closeInput = $true
    $xmlReaderSettings.prohibitDtd = $false
    $xmlReaderSettings.ValidationType = [System.Xml.ValidationType]::Schema
    $xmlReaderSettings.ValidationFlags = [System.Xml.Schema.XmlSchemaValidationFlags]::ProcessInlineSchema -bor
                                         [System.Xml.Schema.XmlSchemaValidationFlags]::ProcessSchemaLocation -bor 
                                         [System.Xml.Schema.XmlSchemaValidationFlags]::ReportValidationWarnings
    $xmlReaderSettings.Schemas.Add($Null, $SchemaFile)


    try {
        $fileStream = [io.fileStream]::New($path, [System.IO.FileMode]::Open, [System.IO.FileAccess]::Read, [System.IO.FileShare]::ReadWrite)
        $xmlreader = [xml.xmlreader]::Create($fileStream, $xmlReaderSettings)
        $importFile.Load($xmlreader)
    } catch {
        $exceptionName = $_.exception.GetType().name
        $exceptionMessage = $_.exception.message
        switch ($exceptionName) {
            MethodInvocationException {
                if ($exceptionMessage -match ': "(?<string>.*)"$') {
                    $readXMLFile.error = "Error loading XML; $($matches['string'])"
                } else {
                    $readXMLFile.error = "Error loading XML; $exceptionMessage"
                }
            }
            Default {
                $readXMLFile.error = "Error loading XML; $($exceptionName) - $exceptionMessage" # Or just the message?
            }
        }
    } finally {
        if ($xmlreader) {
            $xmlreader.Dispose()
        }
        if ($readXMLFile.error) {
            $readXMLFile.xml = $null
        } else {
            $readXMLFile.xml = $importFile
        }
    }
        
    return $readXMLFile
}

EDIT: The schema I have been working on is

<?xml version = "1.0"?>
<xs:schema xmlns:xs = "http://www.w3.org/2001/XMLSchema">
    <xs:element name = 'Definitions'>
        <xs:complexType>
         <xs:sequence>
            <xs:element name = 'Sets' type = 'Sets' minOccurs = '0'  maxOccurs = '1' />
            <xs:element name = 'Packages' type = 'Packages' minOccurs = '0'  maxOccurs = '1' />
         </xs:sequence>
      </xs:complexType>
    </xs:element>
    
    <xs:complexType name = 'Sets'>
        <xs:sequence>
            <xs:element name = "Set" type = 'Set' minOccurs = '0' maxOccurs='unbounded' />
        </xs:sequence>
    </xs:complexType>
    
    <xs:complexType name = 'Set'>
        <xs:sequence>
            <xs:element name = 'Set' type='xs:string' minOccurs = '0' maxOccurs='unbounded' />
            <xs:element name = 'Package' type='xs:string' minOccurs = '0' maxOccurs='unbounded' />
            <xs:element name = 'Rollout' type='xs:string' minOccurs = '0' maxOccurs='unbounded' />
            <xs:element name = 'Remove' type='xs:string' minOccurs = '0' maxOccurs='unbounded' />
        </xs:sequence>
        <!--<xs:attribute name = 'id' type = 'xs:string'/>-->
    </xs:complexType>
    
    <xs:complexType name = 'Packages'>
        <xs:sequence>
            <xs:element name = 'Package' type = 'Package' minOccurs = '0' maxOccurs='unbounded' />
        </xs:sequence>
        <xs:attribute name = 'id' type = 'xs:string'/>
    </xs:complexType>
    
    <xs:complexType name = 'Package'>
        <xs:sequence>
            <xs:element name = 'Package' type='xs:string' minOccurs = '0' maxOccurs='unbounded' />
            <xs:element name = 'Task' type='Task' minOccurs = '0' maxOccurs='unbounded' />
        </xs:sequence>
    </xs:complexType>
    
    
    
    <xs:complexType name = 'Task'>
        <xs:sequence>
            <xs:element name = 'PreProcess' type='TaskPrePostProcess' minOccurs = '0' maxOccurs='1' />
            <xs:element name = 'Process' type='TaskProcess' minOccurs = '1' maxOccurs='1' />
            <xs:element name = 'PostProcess' type='TaskPrePostProcess' minOccurs = '0' maxOccurs='1' />
        </xs:sequence>
    </xs:complexType>
    <xs:complexType name = 'TaskPrePostProcess'>
        <xs:sequence>
            <xs:element name = 'Task' type='Task' minOccurs = '0' maxOccurs='unbounded' />
        </xs:sequence>
    </xs:complexType>
    <xs:complexType name = 'TaskProcess'>
    </xs:complexType>
</xs:schema>

And some simple sample data would be

<?xml version="1.0" encoding="utf-8" ?>
<Definitions>
    <Sets>
        <Set id="Arch">
            <Package>DTV_2017</Package>
        </Set>
        <Set id="Px_Arch">
            <Package>RVT_2017</Package>
            <Package>RVT_2018</Package>
        </Set>
    </Sets> 

    <Packages>
    </Packages>
</Definitions>

EDIT: Interestingly, when I remove the validation and just catch malformed XML errors, I DO get line numbers. It's only validating with the XSD file that produces errors that aren't particularly helpful.

  • Do you have a schema + an invalid sample document to test against? Commented Jul 3, 2020 at 18:38
  • @mathias-r-jessen Updated with XML & XSD examples. At least I made sure I wasn't testing on a Windows 7/PS 2.0 machine, as I have a few times. It's Windows 10/PS5.1 this time.
    – Gordon
    Commented Jul 3, 2020 at 18:59

1 Answer

You're fighting PowerShell's habit of wrapping exceptions thrown by .NET method calls in its own exception types :-(.

If you look at the System.Management.Automation.MethodInvocationException you caught, you'll see it's got an InnerException property which contains the System.Xml.Schema.XmlSchemaValidationException instance that the XmlReader actually threw, and that has got the LineNumber and LinePosition properties that you want.
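As a minimal sketch of that unwrapping, the snippet below validates a small in-memory document and walks the InnerException chain until it reaches the XmlSchemaValidationException. The inline schema, the sample document and the variable names ($xsd, $xml, $validationError) are made up for illustration and are not taken from the question.

```powershell
# Hypothetical schema and document, for illustration only
$xsd = @'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Item" type="xs:string" minOccurs="0" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
'@
$xml = @'
<Root>
  <Item id="1">a</Item>
</Root>
'@

$settings = [System.Xml.XmlReaderSettings]::new()
$settings.ValidationType = [System.Xml.ValidationType]::Schema
$schemaReader = [System.Xml.XmlReader]::Create([System.IO.StringReader]::new($xsd))
$null = $settings.Schemas.Add($null, $schemaReader)   # discard the returned XmlSchema

$doc = [System.Xml.XmlDocument]::new()
$reader = [System.Xml.XmlReader]::Create([System.IO.StringReader]::new($xml), $settings)
$validationError = $null
try {
    $doc.Load($reader)
} catch {
    # $_.Exception may be PowerShell's wrapper; walk down to the validation exception
    $ex = $_.Exception
    while ($ex -and $ex -isnot [System.Xml.Schema.XmlSchemaValidationException]) {
        $ex = $ex.InnerException
    }
    if ($ex) {
        $validationError = "Line $($ex.LineNumber), position $($ex.LinePosition): $($ex.Message)"
    } else {
        throw   # not a validation problem, so rethrow the original error
    }
} finally {
    $reader.Dispose()
}
$validationError
```

The loop works whether or not the exception arrives wrapped, so the same handler can stay inside a generic catch block like the one in the question.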

However, a cleaner way is to only catch XmlSchemaValidationException exceptions in the first place and let everything else throw. That way, PowerShell gives you the original exception rather than its wrapper:

catch [System.Xml.Schema.XmlSchemaValidationException]
{
    # $_.Exception is the original validation exception here, not PowerShell's wrapper
    $ex = $_.Exception
    $type = $ex.GetType().FullName
    $lineNumber = $ex.LineNumber
    $linePosition = $ex.LinePosition
    $message = $ex.Message
    Write-Host "type = $type"
    Write-Host "line = $lineNumber"
    Write-Host "position = $linePosition"
    Write-Host "message = $message"
    # ...
}

which outputs:

type = System.Xml.Schema.XmlSchemaValidationException
line = 4
position = 14
message = The 'id' attribute is not declared.

As an aside, you might also want to capture (or discard) the return value from $xmlReaderSettings.Schemas.Add($Null, $SchemaFile). Otherwise the XmlSchema object it returns is written to your function's output stream, and the caller receives it alongside your result object, which gives some odd results:

$null = $xmlReaderSettings.Schemas.Add($Null, $SchemaFile)
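To illustrate why that matters, here is a small sketch of the same effect using ArrayList.Add as a stand-in for Schemas.Add (both return a value). The function names Get-Leaky and Get-Clean are hypothetical.

```powershell
function Get-Leaky {
    $list = [System.Collections.ArrayList]::new()
    $list.Add('a')            # Add returns the new index (0), which leaks into the output
    return 'result'
}

function Get-Clean {
    $list = [System.Collections.ArrayList]::new()
    $null = $list.Add('a')    # return value discarded, so nothing leaks
    return 'result'
}

$leaky = @(Get-Leaky)   # two items: 0 and 'result'
$clean = @(Get-Clean)   # one item: 'result'
"Leaky returned $($leaky.Count) item(s); clean returned $($clean.Count) item(s)"
```

In the question's function the effect is the same: without the $null assignment, readXMLFile returns an array containing the schema object followed by the custom object, so properties such as .error no longer refer only to the object you intended to return.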