Remove XML that does not validate against an XSD

Last posted: 2017-01-18 11:01:06


Question

I have an XML document and an XSD.

The problem is that if a single element or attribute fails during the upload, nothing is uploaded. Therefore I would like to use the XSD to strip out any invalid "rows" before the upload.

Take the following as an example:

<Row>
    <Column1>1</Column1>
    <Column2>2</Column2>
</Row>
<Row>
    <Column1>1</Column1>
    <Column2>2</Column2>
</Row>
<Row>
    <Column1>1</Column1>
    <Column2>B</Column2>
</Row>
<Row>
    <Column1>1</Column1>
    <Column2>C</Column2>
</Row>

In the above example, Column2 in the 3rd and 4th rows is invalid, so I would like to remove both of those rows from the XML.

I tried:

    foreach (XmlElement row in doc.SelectNodes("TableName/Row"))
    {
        if (row.SchemaInfo.Validity == XmlSchemaValidity.Invalid)
        {
            row.ParentNode.RemoveChild(row);
        }
    }

but it removes only the first invalid section; for any later sections with errors, the SchemaInfo.Validity value is "NotKnown".

c# xml xsd
Answer

I think the only way to do this would be to validate the XML manually using your own code.

Given the possible structures of an XSD and the errors that can occur when validating against it, creating a validator that can consistently skip over an error and continue would be very difficult (and hence is not something that any of the parsers I'm aware of have done).

In some circumstances they will continue validating after an error, but typically they then ignore all siblings after the initial error in order to get back to a more consistent state. Once an error is encountered, the validation state becomes ambiguous and there are often multiple validation paths that could be taken.

That said, if your data is along the lines of your sample and you have some control over your XSD, you could refactor the XSD definition of <Row> to be a root (global) element, then use an element ref wherever you need it. You could then load the <Row> elements one at a time and validate each one as you go. That way the code that reads the document is decoupled from the validation of each <Row>, so if one is invalid you discard it and move on to the next.

NOTE: This approach means the rest of the XML document is NOT validated.
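
The per-row approach described above could look roughly like the following. This is only a minimal sketch: the class name RowFilter and the method RemoveInvalidRows are hypothetical, and it assumes that Row is declared as a global element (with no target namespace) in the schema set passed in, and that the rows sit under TableName/Row as in the question. Each row is copied into its own XmlDocument and validated there, so an error in one row cannot leave its siblings in the "NotKnown" state.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper; the names here are illustrative only.
static class RowFilter
{
    // Removes every TableName/Row that fails validation and returns the
    // number of rows removed. Assumes rowSchemas declares Row as a
    // global element.
    public static int RemoveInvalidRows(XmlDocument doc, XmlSchemaSet rowSchemas)
    {
        // Snapshot the rows first, because nodes are removed during the loop.
        var rows = new List<XmlNode>();
        foreach (XmlNode row in doc.SelectNodes("TableName/Row"))
        {
            rows.Add(row);
        }

        int removed = 0;
        foreach (XmlNode row in rows)
        {
            // Validate the row in a document of its own so that one failure
            // does not affect how the other rows are validated.
            var single = new XmlDocument();
            single.Schemas = rowSchemas;
            single.AppendChild(single.ImportNode(row, true));

            bool valid = true;
            single.Validate((sender, e) =>
            {
                // Treat any reported validation error as a failed row.
                if (e.Severity == XmlSeverityType.Error)
                {
                    valid = false;
                }
            });

            if (!valid)
            {
                row.ParentNode.RemoveChild(row);
                removed++;
            }
        }
        return removed;
    }
}
```

With the sample data from the question and a schema that types Column2 as an integer, this would be expected to drop the 3rd and 4th rows and leave the first two in place. As noted above, TableName itself and anything outside the Row elements are never checked.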