Here are some XHTML test cases with links to the following services:
The ASCII character encoding does not use octets above 0x7F.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
Octets between 0x80 and 0x9F do not encode printable characters in the ISO-8859-1 character encoding.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
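The two octet-range rules above can be checked mechanically. A minimal sketch, assuming the document is available as raw bytes; the function names are hypothetical and not part of any of the listed services:

```python
def non_ascii_offsets(data: bytes) -> list[int]:
    """Offsets of octets above 0x7F, which ASCII does not use."""
    return [i for i, octet in enumerate(data) if octet > 0x7F]


def c1_offsets(data: bytes) -> list[int]:
    """Offsets of octets in 0x80-0x9F, which are not printable in ISO-8859-1."""
    return [i for i, octet in enumerate(data) if 0x80 <= octet <= 0x9F]
```

An empty result means the stream passes the respective check.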
The character encoding declared by the HTTP Content-Type header does not match the character encoding declared by the meta element. The latter must be ignored.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
The character encoding declared by the HTTP Content-Type header does not match the character encoding declared by the XML declaration. According to RFC 3023, the HTTP header takes precedence.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
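The precedence rule can be sketched as follows. This is an illustration only: the helper names are hypothetical, the regular expression handles only ASCII-compatible encodings, and the fallback to the XML declaration assumes a media type such as application/xml or application/xhtml+xml where that fallback applies.

```python
import re
from email.message import Message
from typing import Optional


def charset_from_content_type(header: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type header, lowercased."""
    msg = Message()
    msg["Content-Type"] = header
    return msg.get_content_charset()


def charset_from_xml_declaration(data: bytes) -> Optional[str]:
    """Return the encoding named in a leading XML declaration, if any."""
    m = re.match(
        rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']', data
    )
    return m.group(1).decode("ascii").lower() if m else None


def effective_charset(content_type: str, data: bytes) -> Optional[str]:
    """The HTTP header wins; the XML declaration is consulted only without it."""
    return charset_from_content_type(content_type) or charset_from_xml_declaration(data)
```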
Opera and Firefox do not detect invalid UTF-8 inside comments.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
A UTF-8 BOM cannot appear in an ISO-8859-1 encoded stream. Since the character encoding is set by the HTTP Content-Type header, the three BOM octets must be decoded as the characters ï»¿.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
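What happens to the BOM octets under ISO-8859-1 can be shown directly. A small sketch; the function name is hypothetical:

```python
import codecs


def decode_as_declared(data: bytes, charset: str) -> str:
    """Decode a stream with the externally declared encoding, BOM or not."""
    return data.decode(charset)


# The UTF-8 BOM is the octet sequence EF BB BF. In ISO-8859-1 each octet
# is a character of its own: U+00EF, U+00BB, U+00BF.
decoded = decode_as_declared(codecs.BOM_UTF8 + b"<html/>", "iso-8859-1")
```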
The character encoding is declared only via a meta element. This is insufficient for XHTML.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
For text/xml entities with the charset parameter omitted, XML processors must use US-ASCII.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
There is no unknown character encoding.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
XML processors must support UTF-8 and UTF-16.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
Entities encoded in UTF-16 must begin with a BOM.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
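The UTF-16 BOM requirement amounts to a check of the first two octets. A minimal sketch with a hypothetical function name:

```python
import codecs


def has_utf16_bom(data: bytes) -> bool:
    """True if the stream starts with FE FF (big-endian) or FF FE (little-endian)."""
    return data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE))
```

Python's "utf-16" codec writes a BOM when encoding, while "utf-16-le" and "utf-16-be" do not, which makes it easy to produce both kinds of test input.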
Elements with a non-empty content model should not use an empty-element tag, and elements with an empty content model should not use a start-tag and end-tag pair.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
lang and xml:lang attributes should be specified with the same value.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
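A check for this rule could look like the sketch below. It assumes the document is well-formed and uses Python's xml.etree.ElementTree; the function name is hypothetical. An element is reported when the two attributes differ or when only one of them is present.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def lang_mismatches(xml_text: str) -> list[str]:
    """Return the tags of elements whose lang and xml:lang values disagree."""
    root = ET.fromstring(xml_text)
    return [el.tag for el in root.iter() if el.get("lang") != el.get(XML_LANG)]
```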
XML parsers are required to ignore comments.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
Strictly conforming XHTML documents must contain a namespace declaration for the XHTML namespace.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
This is a valid HTML 4.01 document, but not XHTML compliant.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
The name attributes of a elements and the id attributes of all elements populate the same namespace.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
Unlike SGML DTDs, XML DTDs and schemas are unable to express element prohibitions. However, certain XHTML elements should not be nested.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
The ins and del elements must not contain block-level content when these elements behave as inline elements.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
DTDs do not provide strong datatyping.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
1 2 3 is not an XML name.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
A standalone document declaration requires that the XML document not use entity references other than the built-in ones, not rely on attributes with default values, give all attribute values in normalized form, and not use white space in element types with element content.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
The name in the document type declaration must match the element type of the root element.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
The ID attribute is not set in normalized form.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
The ID attribute is not set in normalized form. This is not allowed in conjunction with a standalone document declaration.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
While SGML allows adjacent attribute specifications, XML requires white space there.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
& must be escaped in element content and attribute values.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
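When generating markup, the escaping can be left to a library. A short sketch using the helpers from Python's xml.sax.saxutils; the sample strings are made up:

```python
from xml.sax.saxutils import escape, quoteattr

# Element content: & becomes &amp; (< and > are escaped as well).
content = escape("Tom & Jerry")

# Attribute value: escaped and wrapped in suitable quotes.
attribute = quoteattr("search.cgi?q=xhtml&lang=en")
```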
An attribute name must not appear more than once in the same tag.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
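A duplicate attribute is a well-formedness error, so any conforming parser should reject it. A minimal sketch using Python's xml.etree.ElementTree; the helper name is hypothetical:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """True if the parser accepts the text, False on a ParseError."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

The same helper can be pointed at other well-formedness test cases, since it only reports whether the parser accepted the input.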
If the XML declaration is present, it must appear at the very beginning of the document, i.e. it must not be preceded by a comment.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
The first character already breaks well-formedness constraints.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
Entity references such as &auml; require a document type definition (DTD) with a definition of that entity.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
U+0001 is not an XML character, therefore a character reference to it, such as &#x1;, is illegal.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
U+10FFFF is the highest Unicode code point, therefore a character reference beyond it, such as &#x110000;, is illegal.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
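Whether a character reference is legal depends on the Char production of XML 1.0, which can be written down as a range check. A sketch with a hypothetical function name:

```python
def is_xml_char(code_point: int) -> bool:
    """True if the code point matches the Char production of XML 1.0."""
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )
```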
< must not appear in attribute values.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
]]> must not appear in a CDATA section.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
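Text that contains ]]> can still be carried in CDATA by splitting it across two sections. This is a common workaround, not something the test case itself prescribes; the function name is hypothetical:

```python
def to_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting every ]]> across two adjacent sections."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"
```

Each ]]> ends the current section after the two brackets and reopens a new one before the >, so a parser reassembles the original text.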
The XML specification provides a grammar for the internal subset, too.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
The XML declaration must exactly match the corresponding production in the XML 1.0 specification.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
The typical well-formedness violation.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
If the XML declaration is present, it must appear at the very beginning of the document, i.e. it must not be preceded by white space.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet
The processing instruction target names "XML", "xml", and so on are reserved for standardization of XML.
,sniffer ,sv ,relaxed ,sivonen ,w3c ,w3c-dev ,validome ,pagevalet