XML Infoset and XML Schemas versus RDF and RDF Schemas

In my other mnot response today, I wrote the response here and just referenced it in the comments section of mnot's entry. In this case, I responded in full in the comments section and then decided afterwards to include it here as well. More evidence that comments and trackbacks are the same thing.

I wrote the following in response to mnot's Informational Properties of Infosets:

It could be my document-centric bias (like many members of the original WG, I came from a publishing and text processing background) but, for the most part, I've viewed XML as surface syntax (and by extension, XML schemas as grammars for surface syntax and the Infoset as modelling surface syntactic information).

RDF/RDFS has always seemed to me to be a much better data modelling language. The problem I've always had with the syntax of RDF is that it is neither a fixed serialization of the data model nor a generic mapping to-and-from arbitrary XML. It is rather a middle-ground where some common XML patterns are supported but not the generic case.

I've always argued that RDF should support a mapping to any XML surface syntax. Back when I was writing PyTREX (never updated to support RELAX NG, unfortunately) I was hoping to annotate the TREX grammar (which I saw as being about surface syntax) with a mapping to RDF (which I saw as the right way to express the underlying data model of the document). This plus something like Sparta would then be the XML data binding.
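
A minimal sketch of what such an annotation-driven mapping might look like in Python. It is not PyTREX or Sparta code. The MAPPING table, the xml_to_triples function, the sample document and the use of an id attribute as the subject are all assumptions made for illustration; only the Dublin Core predicate URIs are real. A real version would attach the annotations to the grammar's patterns rather than to bare element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation table: element name -> RDF predicate URI.
# In the scheme described above, this information would be attached
# to the grammar instead of living in a separate dict.
MAPPING = {
    "title": "http://purl.org/dc/elements/1.1/title",
    "author": "http://purl.org/dc/elements/1.1/creator",
}


def xml_to_triples(xml_text, subject_attr="id", mapping=MAPPING):
    """Return (subject, predicate, object) triples for mapped child elements."""
    root = ET.fromstring(xml_text)
    triples = []
    for record in root:
        subject = record.get(subject_attr)
        if subject is None:
            continue  # no identifier, so nothing to hang statements on
        for child in record:
            predicate = mapping.get(child.tag)
            # Unmapped elements stay surface syntax only and yield no triple.
            if predicate is not None and child.text:
                triples.append((subject, predicate, child.text.strip()))
    return triples


# Made-up sample document; <price> is deliberately left unmapped.
DOC = """<library>
  <book id="urn:example:book1">
    <title>Example Title</title>
    <author>A. Writer</author>
    <price>10</price>
  </book>
</library>"""

triples = xml_to_triples(DOC)
for triple in triples:
    print(triple)
```

The point of the split is that the grammar still decides what a valid document looks like, while the mapping decides which parts of it become statements in the data model.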

The Infoset is priceless for modelling the surface syntax. For everything else there's RDF.