Abstract
The primary purpose of the chapter is to introduce the basics for reading and writing text with XML markup. The logical structure of an XML document includes primarily nested elements, some of which have associated attributes. A document type definition (DTD) can be included to constrain the contents and structure of the document. The concepts of well-formedness and validity of documents are defined. Two alternative constraining mechanisms, XML Schema and RELAX NG, are introduced and compared to DTDs. Finally, the two standard processing models for XML, one based on streams and one based on trees, are introduced. Although not all details of XML are covered, the chapter provides some literacy with respect to XML specifications, so that the complete language can be learned as necessary.
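The contrast between the two processing models can be sketched with Python's standard library. This is only an illustration under stated assumptions: the sample document `DOC`, the class `ElementNameCollector`, and the two helper functions are hypothetical names invented here, not taken from the chapter. `xml.sax` stands in for the stream-based model and `xml.etree.ElementTree` for the tree-based one.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample document: nested elements, one with an attribute.
DOC = '<memo priority="high"><to>Ann</to><body>Call <em>today</em>.</body></memo>'


class ElementNameCollector(xml.sax.ContentHandler):
    """Stream model: react to events as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called once per start-tag, in document order.
        self.names.append(name)


def stream_element_names(text):
    """Collect element names without ever holding the whole document."""
    handler = ElementNameCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.names


def tree_element_names(text):
    """Tree model: build the document in memory, then navigate it."""
    root = ET.fromstring(text)
    return [element.tag for element in root.iter()]
```

Both helpers visit the same elements in the same order. The stream version keeps only what its handler chooses to record, whereas the tree version keeps the full structure available for later navigation, for example `ET.fromstring(DOC).get("priority")`.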
Notes
- 1.
- 2.
“Once a fatal error is detected, however, the processor MUST NOT continue normal processing (i.e., it MUST NOT continue to pass character data and information about the document’s logical structure to the application in the normal way).” “This innocent-looking definition embodies one of the most important and unprecedented aspects of XML: ‘Draconian’ error-handling.” [3]
- 3.
By default, curly apostrophes and quotation marks are commonly used in place of straight ones in documents prepared by word processors. However, these marks are not accepted by XML processors and are a common cause of parsing errors when examples are copied for XML parsing.
- 4.
Character references are syntactically similar to general entity references and can appear in the same contexts. However, character references are not parsed as described in this section. See instead Sect. 2.3.3.
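Notes 2 to 4 can be illustrated with a small sketch, assuming Python's standard `xml.etree.ElementTree` parser as the XML processor. The helper `is_well_formed` and the sample strings are hypothetical names introduced only for this example. The parser stops at the first fatal error rather than attempting recovery, curly quotation marks around an attribute value trigger such an error, and both a character reference and a predefined entity reference are resolved in element content.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML, False on a fatal error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # The parser reports the error and delivers no document at all.
        return False
    return True


# Straight quotation marks delimit the attribute value correctly.
STRAIGHT = '<note lang="en">ok</note>'

# Curly quotation marks (U+201C, U+201D), as a word processor might insert.
CURLY = '<note lang=\u201cen\u201d>ok</note>'

# A start-tag without its matching end-tag.
UNCLOSED = '<note><to>Ann</note>'

# &#233; is a character reference; &amp; is a predefined entity reference.
REFS = ET.fromstring('<p>caf&#233; &amp; bar</p>').text
```

Here `is_well_formed(STRAIGHT)` succeeds while `CURLY` and `UNCLOSED` are rejected outright, which is the draconian behaviour described in note 2.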
References
Austin, D., Peruvemba, S., McCarron, S., Birbeck, M. (eds): XHTML™ Modularization 1.1 – Second Edition. W3C Recommendation (29 July 2010) http://www.w3.org/TR/xhtml-modularization/, Cited 10 March 2011.
Biron, P., Malhotra, A. (eds): XML Schema Part 2: Datatypes (Second Edition). W3C Recommendation (28 Oct 2004) http://www.w3.org/TR/xmlschema-2/, Cited 10 March 2011.
Bray, T.: The Annotated XML Specification. http://www.xml.com/axml/testaxml.htm, Cited 10 March 2011.
Bray, T., Hollander, D., Layman, A., Tobin, R., Thompson, H.S. (eds): Namespaces in XML 1.0 (Third Edition). W3C Recommendation (8 December 2009) http://www.w3.org/TR/xml-names/, Cited 10 March 2011.
Bray, T., Paoli, J., Sperberg-McQueen, C.M. (eds): Extensible Markup Language (XML) 1.0. W3C Recommendation (10 February 1998) http://www.w3.org/TR/1998/REC-xml-19980210, Cited 10 March 2011.
Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., Yergeau, F., Cowan, J. (eds): Extensible Markup Language (XML) 1.1. W3C Recommendation (4 February 2004, edited in place 15 April 2004) http://www.w3.org/TR/2004/REC-xml11-20040204/, Cited 10 March 2011.
Brownell, D. (ed): SAX. http://www.saxproject.org/, Cited 10 March 2011.
Clark, J., Pieters, S., Thompson, H.S. (eds): Associating Stylesheets with XML documents 1.0 (Second Edition). W3C Recommendation (28 October 2010) http://www.w3.org/TR/xml-stylesheet, Cited 10 March 2011.
Clark, J., DeRose, S. (eds): XML Path Language (XPath) Version 1.0. W3C Recommendation (16 November 1999) http://www.w3.org/TR/xpath, Cited 10 March 2011.
Clark, J., Murata, M.: RELAX NG Specification, Committee Specification. OASIS (3 December 2001) http://www.oasis-open.org/committees/relax-ng/spec-20011203.html, Cited 10 March 2011.
Clark, J., Murata, M.: RELAX NG Tutorial, OASIS Committee Specification (3 December 2001) http://www.relaxng.org/tutorial-20011203.html, Cited 10 March 2011.
Cowan, J., Tobin, R. (eds): XML Information Set (Second Edition). W3C Recommendation (4 February 2004) http://www.w3.org/TR/xml-infoset/, Cited 10 March 2011.
Duerst, M., Suignard, M.: Internationalized Resource Identifiers (IRIs). The Internet Society (January 2005) http://www.rfc-editor.org/rfc/rfc3987.txt, Cited 10 March 2011.
Fallside, D.C., Walmsley, P. (eds): XML Schema Part 0: Primer Second Edition. W3C Recommendation (28 October 2004) http://www.w3.org/TR/xmlschema-0/, Cited 10 March 2011.
Fernández, M., et al. (eds): XQuery 1.0 and XPath 2.0 Data Model (XDM). W3C Recommendation (23 January 2007) http://www.w3.org/TR/2007/REC-xpath-datamodel-20070123/, Cited 10 March 2011.
Goldfarb, C.F.: The SGML Handbook, edited by Y. Rubinsky. Oxford University Press, Oxford, UK (1990).
ISO/IEC JTC1/SC34 Web Server, Information Technology – Document Description and Processing Languages. International Organization for Standardization and the International Electrotechnical Commission. http://www.ornl.gov/sgml/, Cited 10 March 2011.
Le Hégaret, P., et al. (eds): Document Object Model (DOM). http://www.w3.org/DOM/, Cited 10 March 2011.
Maler, E., El Andaloussi, J.: Developing SGML DTDs. From Text to Model to Markup. Prentice Hall PTR, Upper Saddle River, NJ (1995). Available online at http://www.xmlgrrl.com/publications/DSDTD/, Cited 10 March 2011.
Pemberton, S., et al.: XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition): A Reformulation of HTML 4 in XML 1.0. W3C Recommendation (26 January 2000, revised 1 August 2002) http://www.w3.org/TR/xhtml1/, Cited 10 March 2011.
Thompson, H.S., Bech, D., Maloney, M., Mendelsohn, N. (eds): XML Schema Part 1: Structures Second Edition. W3C Recommendation (28 October 2004) http://www.w3.org/TR/xmlschema-1/, Cited 10 March 2011.
Tompa, F.W.: What is (tagged) text? In: Dictionaries in the Electronic Age, Proceedings of the Fifth Annual Conference of UW Centre for the New Oxford English Dictionary and Text Research, pp. 81–93. Waterloo, Ont.: University of Waterloo (1989). Available online at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.40.5411&rep=rep1&type=pdf, Cited 10 March 2011.
W3C, All Standards and Drafts. http://www.w3.org/TR/, Cited 10 March 2011.
Copyright information
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Salminen, A., Tompa, F. (2011). Fundamentals. In: Communicating with XML. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-0992-2_2
DOI: https://doi.org/10.1007/978-1-4614-0992-2_2
Published:
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4614-0991-5
Online ISBN: 978-1-4614-0992-2
eBook Packages: Computer Science (R0)