XML (Extensible Markup Language) Computer Science YouTube Lecture Handouts

Dr. Manishika Jain

Topics to be discussed

  • Introduction
  • XML VS. HTML
    • Differences
    • Similarities
  • Working Model of XML
    • Syntax
      • XML Declaration
      • XML Tags & Elements
      • Text
      • References
  • XML Document
  • XML Parser
  • XML Validation

Introduction

  • XML stands for Extensible Markup Language. It is called extensible because you can adapt it to your needs by creating your own self-descriptive tags.
  • It is derived from SGML, i.e. Standard Generalized Markup Language.
  • It was developed by the W3C, i.e. the World Wide Web Consortium.
  • It is used to store and transfer/exchange data between different applications or systems.
  • XML documents can be easily parsed.
  • XML is a case-sensitive language. The names of the start and end tags must be in the same case.
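The points above can be sketched in a few lines of Python using the standard xml.etree.ElementTree module. Python is an assumption here, since the handout does not prescribe a language, and the tag names Employee, Name and Department are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Build a document from user-defined tags (the names are hypothetical).
employee = ET.Element("Employee")
ET.SubElement(employee, "Name").text = "Ram Singh"
ET.SubElement(employee, "Department").text = "Sales"

# Serialise to text, as one application might before sending the data.
payload = ET.tostring(employee, encoding="unicode")
print(payload)

# A receiving application parses the text back into elements.
received = ET.fromstring(payload)
name = received.find("Name").text
print(name)

# Tag names are case sensitive, so a lower-case lookup finds nothing.
missing = received.find("name")
print(missing)
```

The same text could be written to a file or sent over a network; the receiver only needs an XML parser to read it.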

XML vs. HTML- Differences

XML | HTML
It is extensible: it can be customized with user-defined tags. | It is a pre-defined language: only pre-defined tags can be used.
It is used to store and exchange data between applications, so it focuses on content. | It is used to design web pages, so it focuses on the presentation of data.
It is case sensitive. | It is not case sensitive.
It is strict about document structure: every tag that is opened must be closed properly. | It is lenient about document structure.
It provides a framework for defining markup languages. | It is a markup language itself.
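
The difference in strictness and case handling can be seen with Python's standard library, taken here as one possible illustration. The class name TagCollector and the sample snippet are hypothetical; html.parser and xml.etree.ElementTree are real standard-library modules.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

class TagCollector(HTMLParser):
    """Hypothetical helper: records every start tag the HTML parser sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

snippet = "<P>Hello<BR>world"  # upper-case tags, nothing is closed

# The HTML parser accepts the snippet and reports tag names in lower case.
collector = TagCollector()
collector.feed(snippet)
collector.close()
html_tags = collector.tags
print(html_tags)

# The XML parser rejects the same text because the tags are never closed.
try:
    ET.fromstring(snippet)
    xml_accepts = True
except ET.ParseError as err:
    xml_accepts = False
    print("XML parser rejected the snippet:", err)
```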

XML vs. HTML- Similarities

  • Both are derived from SGML.
  • Both are structure-based languages.
  • Both are markup languages, since both use tags.
  • In both languages, code can be written in a plain text editor such as Notepad and viewed in a web browser.

Working Model of XML


Syntax Rule- XML Declaration

The XML declaration, when present, must appear at the very beginning of the XML document.

XML Declaration Code:

<?xml version="1.0" encoding="UTF-8"?>

Attributes:

version: specifies the XML version

encoding: specifies the character encoding

Note: XML declaration is optional.
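
A minimal sketch of how a parser treats the declaration, assuming Python's standard xml.etree.ElementTree module. The Employee element is illustrative only.

```python
import xml.etree.ElementTree as ET

with_declaration = (
    b'<?xml version="1.0" encoding="UTF-8"?><Employee>Ram Singh</Employee>'
)
without_declaration = b"<Employee>Ram Singh</Employee>"
# Same as the first document, but a newline precedes the declaration.
late_declaration = b"\n" + with_declaration

# Both forms parse to the same element: the declaration is optional.
root_a = ET.fromstring(with_declaration)
root_b = ET.fromstring(without_declaration)
print(root_a.tag, root_a.text)
print(root_b.tag, root_b.text)

# A declaration that is not at the very start of the document is rejected.
try:
    ET.fromstring(late_declaration)
    late_ok = True
except ET.ParseError as err:
    late_ok = False
    print("rejected:", err)
```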

Syntax Rule- Tags/Elements

Tags: They define the scope of an element. A pair of tags acts like a container that can hold text and other elements. In XML, tags always come in pairs (a start tag and an end tag). The text that appears between these tags is called the 'content'.

Example:

<Employee>
  <!-- Content -->
</Employee>

Rules for Creating Tags

  • XML tags are case sensitive.
  • An XML document can have only one root element.
  • Tags/elements can be nested. XML tags must be closed in the proper order because XML is strict about structure. Nesting follows a LIFO (Last In, First Out) order: a tag opened inside another tag must be closed before the outer tag.

Example:

<?xml version="1.0"?>
<EMPLOYEE> <!-- ROOT ELEMENT -->
  <NAME>RAM SINGH</NAME> <!-- NAME, AGE, DEPARTMENT ARE CHILD ELEMENTS -->
  <AGE>26</AGE> <!-- RAM SINGH, 26, SALES ARE CONTENT ITEMS -->
  <DEPARTMENT>SALES</DEPARTMENT>
</EMPLOYEE>
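
The rules for creating tags can be tried out with a parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper name parses is hypothetical.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Inner tags are closed before the outer tag (LIFO order).
nested_ok = "<EMPLOYEE><NAME>RAM SINGH</NAME><AGE>26</AGE></EMPLOYEE>"
# The outer tag is closed before the inner one.
wrong_order = "<EMPLOYEE><NAME>RAM SINGH</EMPLOYEE></NAME>"
# Two top-level elements instead of a single root.
two_roots = "<EMPLOYEE></EMPLOYEE><EMPLOYEE></EMPLOYEE>"
# Start and end tag differ in case.
wrong_case = "<EMPLOYEE></Employee>"

for label, text in [
    ("nested_ok", nested_ok),
    ("wrong_order", wrong_order),
    ("two_roots", two_roots),
    ("wrong_case", wrong_case),
]:
    print(label, parses(text))
```

Only the first document is accepted; each of the others breaks one rule from the list.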

Syntax Rule- XML Text

  • The names of XML elements and attributes are case sensitive, which means the names of the start and end tags must be in the same case.
  • Names that begin with the letters "xml" (in any combination of case) are reserved, so you cannot use them as XML element names.
  • Characters such as >, <, &, ' and " have special meaning in XML markup and cannot always be written directly in text (< and & never can). Use the replacements &gt;, &lt;, &amp;, &apos; and &quot; respectively.
  • White space is allowed in XML content but not in element names. To separate words in an element name, use a dot (.) instead of white space.

Example:

<Employeename>
</Employeename>
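
Replacing the special characters by hand is error-prone, so it is usually left to a library. The sketch below assumes Python's standard xml.sax.saxutils module; the sample text and the Dept element are made up for illustration.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

# escape() handles &, < and > by default; the two quote characters are
# added here through its optional entities mapping.
QUOTES = {'"': "&quot;", "'": "&apos;"}

raw = 'Sales & "R&D" < 5'
safe = escape(raw, QUOTES)
print(safe)

# unescape() reverses the replacement.
restored = unescape(safe, {"&quot;": '"', "&apos;": "'"})
print(restored)

# Once escaped, the text can be placed inside an element and parsed.
parsed_text = ET.fromstring("<Dept>" + safe + "</Dept>").text
print(parsed_text)
```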

Syntax Rule- References

A reference is used to add additional text or markup to an XML document. It always begins with '&' and ends with ';'.

There are two types of References in XML:

Entity References: An entity reference contains a name between the start [&] and end [;] delimiters. E.g. &gt; represents 'greater than'.

Character References: A character reference contains a hash mark '#' followed by a number, between the start and end delimiters. It represents a Unicode character by its code number. For example, &#68; represents the letter 'D'.
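
Both kinds of reference are resolved by the parser when the document is read. A minimal sketch, assuming Python's standard xml.etree.ElementTree module; the Note element is hypothetical, and the hexadecimal form &#x44; is an addition not covered in the text above.

```python
import xml.etree.ElementTree as ET

# Entity reference: a name between & and ;
entity_example = ET.fromstring("<Note>5 &gt; 3</Note>").text

# Character reference: # followed by a decimal code number (68 is 'D').
char_decimal = ET.fromstring("<Note>&#68;</Note>").text

# The same character written with a hexadecimal code number (x44).
char_hex = ET.fromstring("<Note>&#x44;</Note>").text

print(entity_example)
print(char_decimal)
print(char_hex)
```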

XML Document

An XML Document Contains Two Sections

  • Prolog Section- contains the XML declaration (and, if one is used, the document type declaration).
  • Element Section

XML Parser

An XML parser is a software package that provides an interface for working with an XML document.

  • It checks the format/structure of the XML document.
  • It can also validate the XML document.
  • It converts the XML document into a form that a program can read.
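
As one concrete illustration, Python ships a parser in its standard xml.etree.ElementTree module. The sketch below reads an employee document and turns it into a dictionary; the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

document = """<?xml version="1.0"?>
<EMPLOYEE>
  <NAME>RAM SINGH</NAME>
  <AGE>26</AGE>
  <DEPARTMENT>SALES</DEPARTMENT>
</EMPLOYEE>"""

# Parsing checks the structure and returns the root element.
root = ET.fromstring(document)
print(root.tag)

# Iterating over the root yields its child elements in document order.
record = {child.tag: child.text for child in root}
print(record)
```

The parser hands back element content as text, so a value such as the age still has to be converted to a number by the program that uses it.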

XML Validation

It checks the format of the XML document.

Validation Checks

  • Case sensitivity
  • Tags formation.
    • All elements must have an end tag
    • Nesting of tags must be in correct format.
  • XML document must have a root element.
  • Quotation marks ["]: attribute values must be quoted.
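
The checks listed above can be exercised with a small script. This is a sketch assuming Python's standard xml.etree.ElementTree module, which reports structural errors but does not validate against a DTD or schema. The helper name check and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

def check(text):
    """Return None if the parser accepts the text, else its error message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        return str(err)

samples = {
    "well formed": '<Employee id="1"><Name>Ram Singh</Name></Employee>',
    "case mismatch": "<Employee></employee>",
    "missing end tag": "<Employee><Name>Ram Singh</Employee>",
    "bad nesting": "<Employee><Name><Age></Name></Age></Employee>",
    "no root element": "",
    "unquoted attribute": "<Employee id=1></Employee>",
}

results = {label: check(text) for label, text in samples.items()}
for label, message in results.items():
    print(label + ":", "OK" if message is None else message)
```

Only the first sample passes; every other sample fails one of the checks in the list.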

MCQs

Q. 1 From which language is XML derived?

1. HTML

2. SGML

3. DHTML

4. C Language

Answer: 2

Q. 2 XML Validator checks ________.

1. The number of tags used

2. Colour of elements.

3. Whether the root element is defined or not.

4. Size of xml tags.

Answer: 3
