Fundamentals

Communicating with XML

Abstract

The primary purpose of the chapter is to introduce the basics for reading and writing text with XML markup. The logical structure of an XML document includes primarily nested elements, some of which have associated attributes. A document type definition (DTD) can be included to constrain the contents and structure of the document. The concepts of well-formedness and validity of documents are defined. Two alternative constraining mechanisms, XML Schema and RELAX NG, are introduced and compared to DTDs. Finally, the two standard processing models for XML, one based on streams and one based on trees, are introduced. Although not all details of XML are covered, the chapter provides some literacy with respect to XML specifications, so that the complete language can be learned as necessary.

Notes

  1. SGML, the Standard Generalized Markup Language, was accepted as ISO standard 8879 in 1986 [16] and later augmented by supplements [17].

  2. “Once a fatal error is detected, however, the processor MUST NOT continue normal processing (i.e., it MUST NOT continue to pass character data and information about the document’s logical structure to the application in the normal way).” “This innocent-looking definition embodies one of the most important and unprecedented aspects of XML: ‘Draconian’ error-handling.” [3]

  3. By default, curly apostrophes and quotation marks are commonly used in place of straight ones in documents prepared by word processors. However, these marks are not accepted by XML processors and are a common cause of parsing errors when examples are copied for XML parsing.

  4. Character references are syntactically similar to general entity references and can appear in the same contexts. However, character references are not parsed as described in this section. See instead Sect. 2.3.3.
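
As an illustration of notes 2 and 3, the following minimal sketch shows how a curly quotation mark around an attribute value leads to a fatal error. It is not taken from the chapter: the sample documents and the helper name `is_well_formed` are assumptions made for this example, and Python's standard `xml.etree.ElementTree` module stands in for an arbitrary XML processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents: identical except for the quotation marks
# that delimit the attribute value.
STRAIGHT = '<note lang="en">hello</note>'
CURLY = '<note lang=\u201cen\u201d>hello</note>'  # U+201C and U+201D


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text, False on a fatal error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # The parser stops at the first well-formedness error and
        # hands no partial tree back to the caller.
        return False
    return True


print(is_well_formed(STRAIGHT))  # True
print(is_well_formed(CURLY))     # False
```

The second document is rejected as a whole rather than repaired, which is the "Draconian" behaviour described in note 2.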

References

  1. Austin, D., Peruvemba, S., McCarron, S., Birbeck, M. (eds): XHTML™ Modularization 1.1 – Second Edition. W3C Recommendation (29 July 2010) http://www.w3.org/TR/xhtml-modularization/, Cited 10 March 2011.

  2. Biron, P., Malhotra, A. (eds): XML Schema Part 2: Datatypes (Second Edition). W3C Recommendation (28 Oct 2004) http://www.w3.org/TR/xmlschema-2/, Cited 10 March 2011.

  3. Bray, T.: The Annotated XML Specification. http://www.xml.com/axml/testaxml.htm, Cited 10 March 2011.

  4. Bray, T., Hollander, D., Layman, A., Tobin, R., Thompson, H.S. (eds): Namespaces in XML 1.0 (Third Edition). W3C Recommendation (8 December 2009) http://www.w3.org/TR/xml-names/, Cited 10 March 2011.

  5. Bray, T., Paoli, J., Sperberg-McQueen, C.M. (eds): Extensible Markup Language (XML) 1.0. W3C Recommendation (10 February 1998) http://www.w3.org/TR/1998/REC-xml-19980210, Cited 10 March 2011.

  6. Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., Yergeau, F., Cowan, J. (eds): Extensible Markup Language (XML) 1.1. W3C Recommendation (4 February 2004, edited in place 15 April 2004) http://www.w3.org/TR/2004/REC-xml11-20040204/, Cited 10 March 2011.

  7. Brownell, D. (ed): SAX. http://www.saxproject.org/, Cited 10 March 2011.

  8. Clark, J., Pieters, S., Thompson, H.S. (eds): Associating Stylesheets with XML documents 1.0 (Second Edition). W3C Recommendation (28 October 2010) http://www.w3.org/TR/xml-stylesheet, Cited 10 March 2011.

  9. Clark, J., DeRose, S. (eds): XML Path Language (XPath) Version 1.0. W3C Recommendation (16 November 1999) http://www.w3.org/TR/xpath, Cited 10 March 2011.

  10. Clark, J., Murata, M.: RELAX NG Specification, Committee Specification. OASIS (3 December 2001) http://www.oasis-open.org/committees/relax-ng/spec-20011203.html, Cited 10 March 2011.

  11. Clark, J., Murata, M.: RELAX NG Tutorial, OASIS Committee Specification (3 December 2001) http://www.relaxng.org/tutorial-20011203.html, Cited 10 March 2011.

  12. Cowan, J., Tobin, R. (eds): XML Information Set (Second Edition). W3C Recommendation (4 February 2004) http://www.w3.org/TR/xml-infoset/, Cited 10 March 2011.

  13. Duerst, M., Suignard, M.: Internationalized Resource Identifiers (IRIs). The Internet Society (January 2005) http://www.rfc-editor.org/rfc/rfc3987.txt, Cited 10 March 2011.

  14. Fallside, D.C., Walmsley, P. (eds): XML Schema Part 0: Primer Second Edition. W3C Recommendation (28 October 2004) http://www.w3.org/TR/xmlschema-0/, Cited 10 March 2011.

  15. Fernández, M., et al. (eds): XQuery 1.0 and XPath 2.0 Data Model (XDM). W3C Recommendation (23 January 2007) http://www.w3.org/TR/2007/REC-xpath-datamodel-20070123/, Cited 10 March 2011.

  16. Goldfarb, C.F.: The SGML Handbook, edited by Y. Rubinsky. Oxford University Press, Oxford, UK (1990).

  17. ISO/IEC JTC1/SC34 Web Server, Information Technology – Document Description and Processing Languages. International Organization for Standardization and the International Electrotechnical Commission. http://www.ornl.gov/sgml/, Cited 10 March 2011.

  18. Le Hégaret, P., et al. (eds): Document Object Model (DOM). http://www.w3.org/DOM/, Cited 10 March 2011.

  19. Maler, E., El Andaloussi, J.: Developing SGML DTDs. From Text to Model to Markup. Prentice Hall PTR, Upper Saddle River, NJ (1995). Available online at http://www.xmlgrrl.com/publications/DSDTD/, Cited 10 March 2011.

  20. Pemberton, S., et al.: XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition): A Reformulation of HTML 4 in XML 1.0. W3C Recommendation (26 January 2000, revised 1 August 2002) http://www.w3.org/TR/xhtml1/, Cited 10 March 2011.

  21. Thompson, H.S., Bech, D., Maloney, M., Mendelsohn, N. (eds): XML Schema Part 1: Structures Second Edition. W3C Recommendation (28 October 2004) http://www.w3.org/TR/xmlschema-1/, Cited 10 March 2011.

  22. Tompa, F.W.: What is (tagged) text? In: Dictionaries in the Electronic Age, Proceedings of the Fifth Annual Conference of UW Centre for the New Oxford English Dictionary and Text Research, pp. 81–93. Waterloo, Ont.: University of Waterloo (1989). Available online at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.40.5411&rep=rep1&type=pdf, Cited 10 March 2011.

  23. W3C, All Standards and Drafts. http://www.w3.org/TR/, Cited 10 March 2011.

Author information

Corresponding author

Correspondence to Airi Salminen.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Salminen, A., Tompa, F. (2011). Fundamentals. In: Communicating with XML. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-0992-2_2

  • DOI: https://doi.org/10.1007/978-1-4614-0992-2_2

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4614-0991-5

  • Online ISBN: 978-1-4614-0992-2

  • eBook Packages: Computer Science (R0)
