summaryrefslogtreecommitdiffhomepage
path: root/xml.html.markdown
blob: 1e1eff865ff753bc286306058e2715cf75b146fc (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
---
language: xml
filename: learnxml.xml
contributors:
  - ["João Farias", "https://github.com/JoaoGFarias"]
---

XML is a markup language designed to store and transport data.

Unlike HTML, XML does not specifies how to display or to format data, just carry it.

* XML Syntax

```xml
<!-- Comments in XML are like this -->

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>

<!-- Above is a typical XML file.
  It starts with a declaration, informing some metadata (optional).
  
  XML uses a tree structure. Above, the root node is 'bookstore', which has
  three child nodes, all 'books'. Those nodes has more child nodes, and so on... 
  
  Nodes are created using open/close tags, and childs are just nodes between
  the open and close tags.-->


<!-- XML carries two kind of data:
  1 - Attributes -> That's metadata about a node.
      Usually, the XML parser uses this information to store the data properly.
      It is characterized by appearing in parenthesis within the opening tag
  2 - Elements -> That's pure data.
      That's what the parser will retrieve from the XML file.
      Elements appear between the open and close tags, without parenthesis. -->
      
  
<!-- Below, an element with two attributes -->
<file type="gif" id="4293">computer.gif</file>


```

* Well-Formated Document x Validation

A XML document is well-formated if it is syntactically correct.
However, it is possible to inject more constraints in the document,
using document definitions, such as DTD and  XML Schema.

A XML document which follows a document definition is called valid, 
regarding that document. 

With this tool, you can check the XML data outside the application logic.

```xml

<!-- Below, you can see an simplified version of bookstore document, 
  with the addition of DTD definition.-->

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note SYSTEM "Bookstore.dtd">
<bookstore>
  <book category="COOKING">
    <title >Everyday Italian</title>
    <price>30.00</price>
  </book>
</bookstore>

<!-- This DTD could be something like:-->

<!DOCTYPE note
[
<!ELEMENT bookstore (book+)>
<!ELEMENT book (title,price)>
<!ATTLIST book category CDATA "Literature">
<!ELEMENT title (#PCDATA)>
<!ELEMENT price (#PCDATA)>
]>


<!-- The DTD starts with a declaration.
  Following, the root node is declared, requiring 1 or more child nodes 'book'.
  Each 'book' should contain exactly one 'title' and 'price' and an attribute
  called 'category', with "Literature" as its default value.
  The 'title' and 'price' nodes contain a parsed character data.-->

<!-- The DTD could be declared inside the XML file itself.-->

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE note
[
<!ELEMENT bookstore (book+)>
<!ELEMENT book (title,price)>
<!ATTLIST book category CDATA "Literature">
<!ELEMENT title (#PCDATA)>
<!ELEMENT price (#PCDATA)>
]>

<bookstore>
  <book category="COOKING">
    <title >Everyday Italian</title>
    <price>30.00</price>
  </book>
</bookstore>


```