XML Schema versus DTD

An XML document with correct syntax is called “Well Formed”.

An XML document validated against a DTD is both “Well Formed” and “Valid”.
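As a rough illustration of the first half of that distinction, the sketch below uses Python's standard xml.etree.ElementTree module to test whether a string is well formed. The helper name is_well_formed is made up for this example. The check says nothing about validity, because ElementTree does not validate against a DTD.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses without a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<note><to>Tove</to></note>"))  # True: tags are properly nested
print(is_well_formed("<note><to>Tove</note>"))       # False: <to> is never closed
```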


What is a DTD?

DTD stands for Document Type Definition.

A DTD defines the structure and the legal elements and attributes of an XML document.


Valid XML Documents

A “Valid” XML document is “Well Formed” and also conforms to the rules of a DTD:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don’t forget me this weekend!</body>
</note>

The DOCTYPE declaration above contains a reference to a DTD file. The content of the DTD file is shown and explained below.
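To see what a parser actually reads from such a DOCTYPE line, here is a minimal sketch using Python's standard xml.parsers.expat module. The function name read_doctype is hypothetical. Expat only reports the reference in this setup; it does not load Note.dtd and does not validate the document.

```python
import xml.parsers.expat

DOCUMENT = """<!DOCTYPE note SYSTEM "Note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

def read_doctype(xml_text):
    """Return (root element name, system identifier) from the DOCTYPE, or None."""
    found = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called once when the parser reaches the DOCTYPE declaration.
        found.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)
    return found[0] if found else None

print(read_doctype(DOCUMENT))  # ('note', 'Note.dtd')
```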


XML DTD

The purpose of a DTD is to define the structure and the legal elements and attributes of an XML document:

Note.dtd:

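<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

An external DTD file holds only the declarations. The DOCTYPE declaration stays in the XML document that references the file.

The DOCTYPE declaration in the XML document and the DTD above are interpreted like this:

  • !DOCTYPE note –  Defines that the root element of the document is note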
  • !ELEMENT note – Defines that the note element must contain the elements “to, from, heading, body”, in that order
  • !ELEMENT to – Defines the to element to be of type “#PCDATA”
  • !ELEMENT from – Defines the from element to be of type “#PCDATA”
  • !ELEMENT heading  – Defines the heading element to be of type “#PCDATA”
  • !ELEMENT body – Defines the body element to be of type “#PCDATA”

Tip: #PCDATA means parsed character data.

Using DTD for Entity Declaration

A DOCTYPE declaration can also be used to define special characters or strings that are used in the document:

Example

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE note [
<!ENTITY nbsp "&#xA0;">
<!ENTITY writer "Writer: Donald Duck.">
<!ENTITY copyright "Copyright: W3Schools.">
]>

<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don’t forget me this weekend!</body>
<footer>&writer; &copyright;</footer>
</note>

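Output (the text of the footer element once the entities are expanded):

Writer: Donald Duck. Copyright: W3Schools.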

Tip: An entity has three parts: it starts with an ampersand (&), then comes the entity name, and it ends with a semicolon (;).
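The expansion of entity references can be observed with a short sketch in Python. It assumes the standard xml.etree.ElementTree parser, which expands entities declared in the internal DTD subset. The document is a trimmed copy of the example above, and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<!DOCTYPE note [
<!ENTITY writer "Writer: Donald Duck.">
<!ENTITY copyright "Copyright: W3Schools.">
]>
<note>
<footer>&writer; &copyright;</footer>
</note>"""

root = ET.fromstring(DOCUMENT)

# By the time the tree is built, each &name; reference has been
# replaced with the string from its ENTITY declaration.
footer_text = root.find("footer").text
print(footer_text)
```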

When to Use a DTD?

With a DTD, independent groups of people can agree to use a standard DTD for interchanging data.

With a DTD, you can verify that the data you receive from the outside world is valid.

You can also use a DTD to verify your own data.



When NOT to Use a DTD?

XML does not require a DTD.

When you are experimenting with XML, or when you are working with small XML files, creating DTDs may be a waste of time.

If you develop applications, wait until the specification is stable before you add a DTD. Otherwise, your software might stop working because of validation errors.

Defining DTD Elements

In a DTD, elements are declared with an ELEMENT declaration.


Declaring Elements

In a DTD, XML elements are declared with the following syntax:

<!ELEMENT element-name category>
or
<!ELEMENT element-name (element-content)>


Empty Elements

Empty elements are declared with the category keyword EMPTY:

<!ELEMENT element-name EMPTY>

Example:

<!ELEMENT br EMPTY>

XML example:

<br />


Elements with Parsed Character Data

Elements with only parsed character data are declared with #PCDATA inside parentheses:

<!ELEMENT element-name (#PCDATA)>

Example:

<!ELEMENT from (#PCDATA)>

Elements with any Contents

Elements declared with the category keyword ANY can contain any combination of parsable data:

<!ELEMENT element-name ANY>

Example:

<!ELEMENT note ANY>


Elements with Children (sequences)

Elements with one or more children are declared with the names of the child elements inside parentheses:

<!ELEMENT element-name (child1)>
or
<!ELEMENT element-name (child1,child2,…)>

Example:

<!ELEMENT note (to,from,heading,body)>

When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document. In a full declaration, the children must also be declared, and the children can also have children. The full declaration of the “note” element is:

<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>


Declaring Only One Occurrence of an Element

<!ELEMENT element-name (child-name)>

Example:

<!ELEMENT note (message)>

The example above declares that the child element “message” must occur once, and only once inside the “note” element.


Declaring Minimum One Occurrence of an Element

<!ELEMENT element-name (child-name+)>

Example:

<!ELEMENT note (message+)>

The + sign in the example above declares that the child element “message” must occur one or more times inside the “note” element.


Declaring Zero or More Occurrences of an Element

<!ELEMENT element-name (child-name*)>

Example:

<!ELEMENT note (message*)>

The * sign in the example above declares that the child element “message” can occur zero or more times inside the “note” element.


Declaring Zero or One Occurrences of an Element 

<!ELEMENT element-name (child-name?)>

Example:

<!ELEMENT note (message?)>

The ? sign in the example above declares that the child element “message” can occur zero or one time inside the “note” element.


Declaring either/or Content

<!ELEMENT note (to,from,header,(message|body))>

The example above declares that the “note” element must contain a “to” element, a “from” element, a “header” element, and either a “message” or a “body” element.
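The sequence, occurrence, and either/or rules above behave much like a regular expression over the child element names. The sketch below makes that analogy concrete in Python. It is a teaching aid, not a DTD validator: the helper names model_to_regex and children_match are invented here, and only element-only content models are handled (no #PCDATA, EMPTY, or ANY).

```python
import re

def model_to_regex(model):
    """Translate a DTD children content model into a compiled regex.

    The regex runs over the child names written one after another,
    each followed by a single space.
    """
    compact = re.sub(r"\s+", "", model)
    # Each element name becomes a group that consumes "name ".
    pattern = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: "(?:" + re.escape(m.group()) + " )",
        compact,
    )
    # "," (sequence) is plain concatenation in a regex; "|", "?", "*", "+"
    # and parentheses already carry the same meaning in both notations.
    return re.compile(pattern.replace(",", ""))

def children_match(model, children):
    """Check an ordered list of child element names against the model."""
    text = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(text) is not None

either_or = "(to,from,header,(message|body))"
print(children_match(either_or, ["to", "from", "header", "body"]))             # True
print(children_match(either_or, ["to", "from", "header", "message", "body"]))  # False
print(children_match("(message+)", []))                                        # False
print(children_match("(message*)", []))                                        # True
print(children_match("(message?)", ["message", "message"]))                    # False
```

The second call fails because the group (message|body) allows exactly one of the two elements, not both.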


Declaring Mixed Content

<!ELEMENT note (#PCDATA|to|from|header|message)*>

The example above declares that the “note” element can contain zero or more occurrences of parsed character data, “to”, “from”, “header”, or “message” elements.
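For a sense of what mixed content looks like once parsed, the following sketch reads a made-up note whose text is interleaved with child elements. It uses Python's standard xml.etree.ElementTree, which stores the text before the first child in text and the text after each child in that child's tail. The sample sentence is an assumption for illustration, not taken from the examples above.

```python
import xml.etree.ElementTree as ET

note = ET.fromstring(
    "<note>Dear <to>Tove</to>, greetings from <from>Jani</from>.</note>"
)

print(repr(note.text))  # 'Dear ' (character data before the first child)
for child in note:
    # child.tail is the character data that follows the child's end tag.
    print(child.tag, repr(child.text), repr(child.tail))

# itertext() walks text and child elements in document order.
print("".join(note.itertext()))  # Dear Tove, greetings from Jani.
```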
