Here are some XHTML test cases with links to the following services:

Encoding

  1. ascii-above-127

    The ASCII character encoding does not use octets above 0x7F.

  2. control-characters

    Octets between 0x80 and 0x9F do not encode printable characters in the ISO-8859-1 character encoding.

  3. http-meta-mismatch

    The character encoding declared by the HTTP Content-Type header does not match the character encoding declared by the meta element. The latter must be ignored.

  4. http-xml-mismatch

    The character encoding declared by the HTTP Content-Type header does not match the character encoding declared by the XML declaration. According to RFC 3023, the HTTP header takes precedence.

  5. invalid-bytes-in-comment

    Opera and Firefox do not detect invalid UTF-8 inside comments.

  6. iso-bom

    A UTF-8 BOM cannot appear in an ISO-8859-1 encoded stream. Because the character encoding is set by the HTTP Content-Type header, the three BOM octets (0xEF 0xBB 0xBF) must be decoded as the characters ï»¿.

  7. meta-only-encoding

    The character encoding is declared only via a meta element. This is insufficient for XHTML.

  8. text-xml-without-encoding

    For text/xml entities with the charset parameter omitted, XML processors must use US-ASCII.

  9. unknown-encoding

    There is no unknown character encoding.

  10. utf-16

    XML processors must support UTF-8 and UTF-16.

  11. utf-16-bom-with-iso-xml-decl

  12. utf-16-without-bom

    Entities encoded in UTF-16 must begin with a BOM.

  13. utf-8-bom-with-iso-xml-decl
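
Several of the encoding cases above reduce to simple byte-level checks. The following is a minimal sketch in Python, not part of the test suite itself; the helper names are hypothetical, and `effective_encoding` deliberately simplifies the case without an HTTP charset (it ignores BOM sniffing and assumes UTF-8 as the fallback).

```python
import codecs


def non_ascii_offsets(data: bytes) -> list:
    """Offsets of octets above 0x7F, which US-ASCII never uses."""
    return [i for i, octet in enumerate(data) if octet > 0x7F]


def c1_offsets(data: bytes) -> list:
    """Offsets of octets 0x80-0x9F, which ISO-8859-1 maps to control characters."""
    return [i for i, octet in enumerate(data) if 0x80 <= octet <= 0x9F]


def effective_encoding(media_type: str, http_charset, xml_decl_encoding) -> str:
    """Hypothetical helper: choose the encoding in the order described above."""
    if http_charset:
        # The HTTP Content-Type charset wins over the XML declaration and meta.
        return http_charset
    if media_type == "text/xml":
        # text/xml without a charset parameter falls back to US-ASCII.
        return "us-ascii"
    # Simplification (assumption): trust the XML declaration, else UTF-8.
    return xml_decl_encoding or "utf-8"


# The UTF-8 BOM octets read as ISO-8859-1 yield three visible characters.
bom_as_latin1 = codecs.BOM_UTF8.decode("iso-8859-1")
```

For example, `effective_encoding("text/xml", None, "utf-8")` selects `us-ascii`, which is why the text-xml-without-encoding case matters even when the XML declaration names an encoding.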

HTML compatibility

  1. empty-elements

    Elements with a non-empty content model should not use an empty-element tag, and elements with an empty content model should not use a start-tag and end-tag pair.

  2. lang-mismatch

    lang and xml:lang attributes should be specified with the same value.
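
The lang-mismatch rule can be checked mechanically. This is a rough sketch using Python's standard `xml.etree.ElementTree`; the function name is hypothetical, and the comparison is an exact string match (an assumption, since language tags are case-insensitive in practice).

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def lang_mismatches(xhtml_source: str) -> list:
    """Return (tag, lang, xml:lang) for every element where the two differ.

    An element that sets only one of the two attributes is reported as well;
    an element that sets neither is skipped.
    """
    root = ET.fromstring(xhtml_source)
    found = []
    for element in root.iter():
        lang = element.get("lang")
        xml_lang = element.get(XML_LANG)
        if lang != xml_lang:
            found.append((element.tag, lang, xml_lang))
    return found
```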

Miscellaneous

  1. malformed-doctype-in-comment

    XML parsers are required to ignore comments.

  2. missing-namespace-declaration

    Strictly conforming XHTML documents must contain a namespace declaration for the XHTML namespace.

  3. valid-html4

    This is a valid HTML 4.01 document, but not XHTML compliant.
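
As an illustration of the first two cases, a namespace-aware parser makes both easy to observe. The sketch below uses Python's `xml.etree.ElementTree`; the function name is hypothetical and the check covers only the root element.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def root_is_xhtml_html(source: str) -> bool:
    """True if the root element is html in the XHTML namespace."""
    root = ET.fromstring(source)
    return root.tag == "{%s}html" % XHTML_NS


# A broken DOCTYPE inside a comment is ignored by the parser.
with_comment = (
    "<!-- <!DOCTYPE html PUBLIC -->\n"
    '<html xmlns="http://www.w3.org/1999/xhtml"><head/></html>'
)

# Without the namespace declaration the root is a plain, unqualified html.
without_namespace = "<html><head/></html>"
```

Here `root_is_xhtml_html(with_comment)` holds, while `root_is_xhtml_html(without_namespace)` does not.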

Validity

  1. duplicate-anchors

    The name attributes of a elements and the id attributes of all elements populate the same namespace.

  2. improper-nesting-of-a-elements

    Unlike SGML DTDs, XML DTDs and schemas are unable to express element prohibitions. However, certain XHTML elements should not be nested.

  3. inline-del-block.xml

    The ins and del elements must not contain block-level content when these elements behave as inline elements.

  4. invalid-attributes

    DTDs do not provide strong datatyping.

  5. invalid-id

    1 2 3 is not an XML name.

  6. must-not-standalone

    A standalone document declaration (standalone="yes") requires that the document not depend on external markup declarations: it must not reference entities other than the built-in ones, must not rely on attribute default values, must give all attribute values in normalized form, and must not use white space in elements with element content.

  7. root-element-name-mismatch

    The name in the document type declaration must match the element type of the root element.

  8. unnormalized-id

    The ID attribute is not set in normalized form.

  9. unnormalized-id-standalone

    The ID attribute is not set in normalized form. This is not allowed in conjunction with a standalone document declaration.
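
The duplicate-anchors case lends itself to a short check. The following is a sketch only, written with Python's `xml.etree.ElementTree`; the function name is hypothetical, and treating an a element that carries the same value in both id and name as a single anchor is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XHTML_NS = "http://www.w3.org/1999/xhtml"
A_TAG = "{%s}a" % XHTML_NS


def duplicate_anchor_names(source: str) -> list:
    """Names used more than once across id attributes and a/@name attributes."""
    root = ET.fromstring(source)
    names = Counter()
    for element in root.iter():
        element_id = element.get("id")
        if element_id is not None:
            names[element_id] += 1
        if element.tag == A_TAG:
            anchor_name = element.get("name")
            # Assumption: id and name with equal values on one element
            # describe the same anchor and are counted once.
            if anchor_name is not None and anchor_name != element_id:
                names[anchor_name] += 1
    return sorted(name for name, count in names.items() if count > 1)
```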

Well-Formedness

  1. adjacent-attribute-specifications

    While SGML allows adjacent attribute specifications, XML requires white space there.

  2. ampersand-as-data

    & must be escaped in element content and attribute values.

  3. attrs-not-unique

    An attribute name must not appear more than once in the same tag.

  4. comment-before-xml-declaration

    If the XML declaration is present, it must appear at the very beginning of the document, i.e. it must not be preceded by a comment.

  5. early-error

    The first character already breaks well-formedness constraints.

  6. entity-without-doctype-declaration

    Entity references such as &auml; require a document type definition (DTD) with a declaration of that entity.

  7. illegal-char-reference-1

    U+0001 is not an XML character, therefore a character reference to it, such as &#1;, is illegal.

  8. illegal-char-reference-110000

    U+10FFFF is the highest Unicode code point, therefore &#x110000; is an illegal reference.

  9. lt-in-attr

    < must not appear literally in attribute values.

  10. malformed-cdata-section

    ]]> must not appear in a CDATA section.

  11. malformed-internal-subset

    The XML specification provides a grammar for the internal subset, too.

  12. malformed-xml-declaration

    The XML declaration must exactly match the corresponding production in the XML 1.0 specification.

  13. unmatched-tags

    Start-tags and end-tags do not match: the typical well-formedness violation.

  14. whitespace-before-xml-declaration

    If the XML declaration is present, it must appear at the very beginning of the document, i.e. it must not be preceded by white space.

  15. xml-pi

    The processing instruction target names "XML", "xml", and so on are reserved for standardization of XML.
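
Most of the well-formedness cases above can be reproduced with any conforming XML parser. The sketch below uses Python's `xml.etree.ElementTree` (backed by Expat); the snippets are minimal stand-ins written for illustration, not the actual test files, and the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(source: str) -> bool:
    """True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(source)
    except ET.ParseError:
        return False
    return True


# Illustrative snippets, keyed by the test case they imitate.
SAMPLES = {
    "adjacent-attribute-specifications": '<p a="1"b="2"/>',
    "ampersand-as-data": "<p>Tom & Jerry</p>",
    "attrs-not-unique": '<p a="1" a="2"/>',
    "comment-before-xml-declaration": '<!-- c --><?xml version="1.0"?><p/>',
    "entity-without-doctype-declaration": "<p>&auml;</p>",
    "illegal-char-reference-1": "<p>&#1;</p>",
    "illegal-char-reference-110000": "<p>&#x110000;</p>",
    "lt-in-attr": '<p a="<"/>',
    "unmatched-tags": "<p><b></p></b>",
    "whitespace-before-xml-declaration": ' <?xml version="1.0"?><p/>',
}

# Names of the samples the parser rejects.
rejected = sorted(name for name, text in SAMPLES.items() if not is_well_formed(text))
```

Every snippet in `SAMPLES` should end up in `rejected`, whereas an escaped variant such as `<p>Tom &amp; Jerry</p>` parses without error.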