XML Tutorial
Volume 5 : XML Schema Data Types (Part I)
Seiichi Kinugasa
Index
XML Schema Data Types
Simple Type Restrictions
Simple Type Extensions
Review Questions
XML Schema Data Types
XML Schema data types can be broadly categorized as "simple types" (including embedded simple types) and "complex types." An "embedded simple type" (called a built-in type in the XML Schema specification) is already defined, and can be used to create a new type through restriction or extension.
Table: XML Schema Data Types
Type | Description |
---|---|
Simple Type | User can independently define. This type is used when a restriction is placed on an embedded simple type to create and use a new type. |
Complex Type | User can independently define. This type is used when the type has a child element or attribute. |
A simple type is a type that only contains text data when expressed according to XML 1.0. This type can be used with element declarations and attribute declarations. The embedded simple type is provided for in XML Schema Part 2. A restriction may be placed on an embedded simple type to create a new, unique simple type.
On the other hand, a complex type is a type that has a child element or attribute structure when expressed according to XML 1.0. This type can be used only with element declarations. There are no predefined complex types, so the user always defines their own.
●Simple Type Example
<xs:element name="Department" type="xs:string" />
Here, "xs:string", specified in the type attribute, is an embedded simple type defined by XML Schema. This example declares that the data type of the element called "Department" is a text string.
●Complex Type Example
<xs:complexType name="EmployeeType">
<xs:sequence maxOccurs="unbounded">
<xs:element ref="Name" />
<xs:element ref="Department" />
</xs:sequence>
</xs:complexType>
<xs:element name="Name" type="xs:string" />
<xs:element name="Department" type="xs:string" />
In this case, the type name "EmployeeType" is designated by the name attribute of the complexType element. A model group (which designates the order of occurrence of the child elements) is described as the child element of the complexType element; here it is the sequence element.
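As a rough illustration of how this named type might be used, the sketch below declares an Employee element of type EmployeeType and shows an instance that the type would accept. The element name "Employee" and the sample values are assumptions made for illustration; they do not appear in the original article. Because maxOccurs="unbounded" is set on the sequence, the Name/Department pair may repeat.

```xml
<!-- Hypothetical element declaration; "Employee" is an assumed name -->
<xs:element name="Employee" type="EmployeeType" />

<!-- An instance that matches: the Name/Department pair appears twice, in order -->
<Employee>
  <Name>Suzuki</Name>
  <Department>Sales</Department>
  <Name>Tanaka</Name>
  <Department>Development</Department>
</Employee>
```

An instance that placed Department before Name would be rejected, since the sequence element fixes the order of the child elements.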
New types are created by placing restrictions on or extending simple or complex types. In this volume, we will discuss restrictions and extensions for simple types.
Simple Type Restrictions
"Simple type restrictions" are restrictions set on conditions such as maximum value/ minimum value, number of characters, etc., related to basic data types such as "int" and "string" defined by embedded simple types. Restrictions allow the user to create user-defined simple types as new types.
Since the concept is somewhat difficult to communicate in words alone, let’s look at the following three examples of how simple type restrictions are defined:
- [1] Password element is a string type, between six and 12 characters
- [2] Employee Number element is an integer data type, greater than or equal to 1,000 and less than 10,000
- [3] Department element is a string type, limited to either "Sales," "Development," or "Service"
[1] Password element is a string type, between six and 12 characters
First, we will use the "simpleType" element to declare the new data type (simple type). The name of the data type created is designated using the name attribute. Here, we will create a data type with the name "passwordType."
<xs:simpleType name="passwordType">
:
:
</xs:simpleType>
Next, we will describe a "restriction" element. We use its base attribute to designate the data type from which our new data type is derived. Since we want to create the password data type based on a string type, we specify "xs:string" as the base attribute value.
<xs:restriction base="xs:string">
:
:
</xs:restriction>
Next, we will designate the restriction conditions using child elements of the restriction element. The restriction conditions call for "between six and 12 characters." Under XML Schema, this kind of restriction condition is called a "restriction facet." The setting value of each facet is established using the value attribute. The restriction facets are shown in the table below. The facets that are allowed differ according to the type that serves as the base.
Table: Types of Restriction Facets
Restriction Facet | Meaning |
---|---|
length | Fixes the length of the value |
minLength | Designates the minimum length of the value |
maxLength | Designates the maximum length of the value |
pattern | Designates a pattern using a regular expression |
enumeration | Designates the allowable values |
minInclusive | Minimum value of the value range (includes designated value) |
maxInclusive | Maximum value of the value range (includes designated value) |
minExclusive | Minimum value of the value range (does not include designated value) |
maxExclusive | Maximum value of the value range (does not include designated value) |
whiteSpace | Normalization of blank characters |
totalDigits | Designates the maximum total number of digits |
fractionDigits | Designates the maximum number of decimal places |
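To make a few of the facets in the table more concrete, here is a minimal sketch of two types that use pattern, totalDigits and fractionDigits. The type names "postalCodeType" and "rateType" and the chosen values are hypothetical and are not part of the original article.

```xml
<!-- Hypothetical type: a string such as 123-4567 (three digits, a hyphen, four digits).
     An XML Schema pattern must match the whole value, so no anchors are needed. -->
<xs:simpleType name="postalCodeType">
  <xs:restriction base="xs:string">
    <xs:pattern value="[0-9]{3}-[0-9]{4}" />
  </xs:restriction>
</xs:simpleType>

<!-- Hypothetical type: a decimal with at most 5 digits in total,
     of which at most 2 may follow the decimal point (123.45 fits, 1234.567 does not) -->
<xs:simpleType name="rateType">
  <xs:restriction base="xs:decimal">
    <xs:totalDigits value="5" />
    <xs:fractionDigits value="2" />
  </xs:restriction>
</xs:simpleType>
```

The pattern facet is applied here to a string base and the digit facets to a decimal base, which reflects the point that the allowable facets depend on the base type.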
Since the password element is between six and 12 characters, we will use the "minLength" and "maxLength" restriction facets. These facets designate the respective minimum and maximum lengths that the value can be.
●Setting the Designated Value of Each Facet
<xs:minLength value="6" />
<xs:maxLength value="12" />
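Placing these two facets inside the restriction element, and the restriction element inside the simpleType element, completes the definition of the passwordType type.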
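Assembled from the three fragments shown so far, the complete definition would look roughly like the sketch below. It only combines the pieces already given; the indentation is added for readability.

```xml
<!-- passwordType: a string of at least 6 and at most 12 characters -->
<xs:simpleType name="passwordType">
  <xs:restriction base="xs:string">
    <xs:minLength value="6" />
    <xs:maxLength value="12" />
  </xs:restriction>
</xs:simpleType>
```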
The PW element is then declared with the passwordType type as follows:
<xs:element name="PW" type="passwordType" />
The following shows examples of a valid XML document and an invalid XML document for this Schema document:
●Valid XML Document
<PW>dbmagazine</PW>
●Invalid XML Document
<PW>db</PW>
[2] Employee_Number element is an integer data type greater than or equal to 1,000 and less than 10,000
In this case, the conditions call for an integer greater than or equal to 1,000 and less than 10,000. We will use "xs:int" for the value of the base attribute of the restriction element, and "minInclusive" and "maxExclusive" for the restriction facets.
Let’s be sure we have a good understanding of the differences between minInclusive and minExclusive, and maxInclusive and maxExclusive.
●Setting the Designated Value of the Restriction Facet
<xs:restriction base="xs:int">
<xs:minInclusive value="1000" />
<xs:maxExclusive value="10000" />
</xs:restriction>
We will use the name "snoType" as the Employee Number element type. The final data type definition is as follows:
<xs:simpleType name="snoType">
<xs:restriction base="xs:int">
<xs:minInclusive value="1000"/>
<xs:maxExclusive value="10000"/>
</xs:restriction>
</xs:simpleType>
The snoType Employee_Number element declaration is as follows:
<xs:element name="Employee_Number" type="snoType" />
The following shows examples of a valid XML document and an invalid XML document for this Schema document:
●Valid XML Document
<Employee_Number>1998</Employee_Number>
●Invalid XML Document
<Employee_Number>999</Employee_Number>
<Employee_Number>10000</Employee_Number>
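One way to check the difference between the inclusive and exclusive facets is to write the same range both ways. Since the base type is an integer, "less than 10,000" and "9,999 or less" describe the same set of values, so the sketch below should accept exactly the same documents as snoType. The type name "snoTypeInclusive" is hypothetical and used only for this comparison.

```xml
<!-- Hypothetical alternative to snoType: same range, written with maxInclusive -->
<xs:simpleType name="snoTypeInclusive">
  <xs:restriction base="xs:int">
    <xs:minInclusive value="1000"/>  <!-- 1000 itself is allowed -->
    <xs:maxInclusive value="9999"/>  <!-- 9999 is allowed, 10000 is not -->
  </xs:restriction>
</xs:simpleType>
```

With a non-integer base such as xs:decimal the two forms would no longer be equivalent, because values between 9,999 and 10,000 exist.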
[3] Department element is a string type, limited to either "Sales," "Development," or "Service"
In this case, we will be defining the candidate values that are allowed (Sales, Development, Service). Accordingly, we will use "xs:string" as the value for the restriction element’s base attribute, and the "enumeration" restriction facet to designate the candidate values, one enumeration element per value.
●Setting the Designated Value of the Restriction Facet
<xs:restriction base="xs:string">
<xs:enumeration value="Sales" />
<xs:enumeration value="Development" />
<xs:enumeration value="Service" />
</xs:restriction>
We will use the name "belongType" as the Department element type. The final data type definition is as follows:
<xs:simpleType name="belongType">
<xs:restriction base="xs:string">
<xs:enumeration value="Sales"/>
<xs:enumeration value="Development"/>
<xs:enumeration value="Service"/>
</xs:restriction>
</xs:simpleType>
The belongType type Department element declaration is as follows:
<xs:element name="Department" type="belongType" />
The following shows examples of a valid XML document and an invalid XML document for this Schema document:
●Valid XML Document
<Department>Sales</Department>
●Invalid XML Document
<Department>Human Resources</Department>
Once we understand how a type is reused and what each restriction facet means, describing simple type restrictions is straightforward.
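A restriction does not have to start from an embedded simple type; a user-defined simple type can also serve as the base. As a minimal sketch, the type below restricts the passwordType type from example [1] further so that only alphanumeric characters are allowed. The name "alnumPasswordType" and the pattern are assumptions made for illustration.

```xml
<!-- Hypothetical type derived from passwordType.
     The 6 to 12 character limit is inherited from the base type;
     the pattern facet adds the alphanumeric-only condition. -->
<xs:simpleType name="alnumPasswordType">
  <xs:restriction base="passwordType">
    <xs:pattern value="[A-Za-z0-9]+" />
  </xs:restriction>
</xs:simpleType>
```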
Simple Type Extensions
The phrase "simple type extension" may lead some to think that this means expanding the value range of the base data type. In fact, the only thing an extension can add to a simple type is attribute definitions. Because a type that has attributes is a complex type, extending a simple type always produces a complex type. In other words, extension is how we add an attribute definition to a simple type. Let’s look at an example:
The content of the Product_Price element is an integer value greater than or equal to 2,000 and less than or equal to 5,000. In addition, the Product_Price element has a string-type Currency attribute.
The following procedure is used to define this example:
- [1] Create a "priceType type" that is an integer greater than or equal to 2,000 and less than or equal to 5,000
- [2] Create a product price "goodsPriceType" type from the priceType type. The goodsPriceType type adds a string-type Currency attribute
[1] Create a priceType type that is an integer greater than or equal to 2,000 and less than or equal to 5,000
This uses the simple type restrictions explained above. We will use the restriction element and the minInclusive and maxInclusive restriction facets.
<xs:simpleType name="priceType">
<xs:restriction base="xs:int">
<xs:minInclusive value="2000"/>
<xs:maxInclusive value="5000"/>
</xs:restriction>
</xs:simpleType>
[2] From the priceType type, create a product price goodsPriceType type that has a string-type Currency attribute
Since we will be adding a Currency attribute to the priceType simple type, this becomes a simple type extension. Also, in this case, the simple type extension will have an attribute, thus becoming a complex type.
Table: Type Reuse Method
Base Type | Reuse | New Type | Usage |
---|---|---|---|
Simple Type | Restriction | Simple Type (simpleType) | Add restrictions to data type |
Simple Type | Extension | Complex Type (complexType) | Assign an attribute to the simple type |
Complex Type | Restriction | Complex Type (complexType) | Reuse the type |
Complex Type | Extension | Complex Type (complexType) | Reuse the type |
In practice, a complexType element is used to extend the simple type. The name "goodsPriceType" for the created data type is designated using the name attribute.
<xs:complexType name="goodsPriceType">
:
:
</xs:complexType>
Next, we will describe the simpleContent element as the child element of the complexType element. This declares that the content of the type is simple (text only, with no child elements).
<xs:simpleContent>
:
:
</xs:simpleContent>
Now, we will describe the extension element that represents the extension, designating the "priceType" data type as the value of its base attribute.
<xs:extension base="priceType">
:
:
</xs:extension>
Last, we will designate the Currency attribute with an attribute element placed inside the extension element. This completes the goodsPriceType type.
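Putting the fragments above together, the completed type might look like the following sketch. The use="required" setting is an assumption: it is inferred from the invalid example in this section in which the Currency attribute is omitted, and is not stated explicitly in the text.

```xml
<!-- goodsPriceType: integer content restricted by priceType, plus a Currency attribute -->
<xs:complexType name="goodsPriceType">
  <xs:simpleContent>
    <xs:extension base="priceType">
      <!-- use="required" is assumed, based on the invalid example without Currency -->
      <xs:attribute name="Currency" type="xs:string" use="required" />
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>
```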
The Product_Price element declaration for the goodsPriceType type is as follows:
<xs:element name="Product_Price" type="goodsPriceType" />
The following shows examples of a valid XML document and an invalid XML document for this Schema document:
● Valid XML Document
<Product_Price Currency="¥">2500</Product_Price>
<Product_Price Currency="$">4300</Product_Price>
● Invalid XML Document
<Product_Price Currency="¥">5200</Product_Price>
<Product_Price>3500</Product_Price>
Review Questions
Question 1
Select which of the following is incorrect as an XML Schema description. Assume that the namespace prefix xs is declared as "http://www.w3.org/2001/XMLSchema."
<xs:simpleType name="Data_Type">
<xs:restriction base="xs:int">
<xs:minExclusive value="20"/>
<xs:maxInclusive value="200"/>
</xs:restriction>
</xs:simpleType>
<xs:element name="Price" type="Data_Type"/>
- A. A value of 20 is allowed under the Price element
- B. A value of 23.4 is not allowed under the Price element
- C. A value of 200 is allowed under the Price element
- D. A value of 99 is allowed under the Price element
Comments
Both a simpleType element and a restriction element are used, so this is a simple type restriction. The type defined here is based on an integer type, with a value greater than 20 (minExclusive) and up to and including 200 (maxInclusive). Statement A is incorrect, because 20 itself is excluded from the range. Statements B, C and D are correct: 23.4 is not an integer, and 200 and 99 both fall within the range. Accordingly, the answer is A.
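The boundary cases discussed in this comment can be laid out as instance fragments. This is only an illustrative sketch; the value 21 is added here as an example and is not one of the answer choices.

```xml
<Price>20</Price>   <!-- invalid: minExclusive excludes 20 itself -->
<Price>21</Price>   <!-- valid: the smallest integer greater than 20 -->
<Price>23.4</Price> <!-- invalid: not a valid xs:int value -->
<Price>99</Price>   <!-- valid: inside the range -->
<Price>200</Price>  <!-- valid: maxInclusive includes 200 -->
```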
Question 2
Select which of the following is an XML Schema document that correctly defines the Unit Price element as either 3500 or 4500.
- A.
<xs:simpleType name="priceType">
<xs:restriction base="xs:int">
<xs:minInclusive value="3500"/>
<xs:maxInclusive value="4500"/>
</xs:restriction>
</xs:simpleType>
<xs:element name="Unit_Price" type="priceType"/>
- B.
<xs:simpleType name="priceType">
<xs:restriction base="xs:int">
<xs:minExclusive value="3500"/>
<xs:maxExclusive value="4500"/>
</xs:restriction>
</xs:simpleType>
<xs:element name="Unit_Price" type="priceType"/>
- C.
<xs:simpleType name="priceType">
<xs:restriction base="xs:int">
<xs:enumeration value="3500"/>
<xs:enumeration value="4500"/>
</xs:restriction>
</xs:simpleType>
<xs:element name="Unit_Price" type="priceType"/>
- D.
<xs:simpleType name="priceType">
<xs:restriction base="xs:int">
<xs:minLength value="3500"/>
<xs:maxLength value="4500"/>
</xs:restriction>
</xs:simpleType>
<xs:element name="Unit_Price" type="priceType"/>
Comments
This question also involves a simpleType element and restriction element, forming a simple type restriction. Since A uses minInclusive and maxInclusive, the value designated for the Unit Price element is 3500 or greater and 4500 or less, which does not meet the required conditions. Answer B uses minExclusive and maxExclusive, which means that the value designated for the Unit Price element is greater than 3500 and less than 4500, contrary to the requirements defined. Accordingly, B is also incorrect as an answer here. Answer C uses enumeration, meaning that the value designated for the Unit Price element is either 3500 or 4500. This satisfies the conditions, and is accordingly our correct answer. Answer D uses minLength and maxLength. If the base type is "int," then minLength and maxLength cannot be designated, and therefore, D is an incorrect answer.
Question 3
Select which two of the XML documents below are valid with respect to the following XML Schema document.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Product" default="Computer">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="ship_date" type="xs:date" use="required" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
</xs:schema>
- A. <Product ship_date = "April 4, 2006">Notebook Computer</Product>
- B. <Product ship_date="2006-04-15">Mouse</Product>
- C. <Product ship_date="2006-4-20">Hard Disk</Product>
- D. <Product ship_date="2006-04-30"></Product>
Comments
This XML Schema document makes the following definitions:
- The Product element is text data, with a default value of "Computer"
- The Product element has a date type ship_date attribute, and the attribute must be designated
The Product element has a default attribute assigned. The default value is applied only when the Product element has no content, so an empty Product element, as in D, is valid. The ship_date attribute is designated as "required" in the use attribute, so it must always be present, and any value is allowed as long as it is date type data. The format for the date type is "YYYY-MM-DD" (for example, 2006-05-12). A is not written in this format, and C writes the month with a single digit, so both are invalid. Accordingly, the correct answers are B and D.
Seiichi Kinugasa
Hewlett-Packard Japan HP Training Services.
I currently oversee XML training courses as an Infoteria-certified trainer, providing technical and training support for IT professional development programs, including large-scale Web development support courses. Not having been asked to write magazine articles for quite some time, I am truly feeling the pressure, but I will continue to give my best for the next two articles I am writing for this series.
The content presented here is an HTML version of an article that originally appeared in the July 2007 issue of DB Magazine published by Shoeisya.