Annotation of html5/spec/tokenization.html, revision 1.47
1.1 mike 1: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
2: <!DOCTYPE html>
3: <!-- when publishing, change bits marked ZZZ --><html lang="en-US-x-Hixie" class="split chapter"><head><title>8.2.4 Tokenization — HTML5 </title><style type="text/css">
4: pre { margin-left: 2em; white-space: pre-wrap; }
5: h2 { margin: 3em 0 1em 0; }
6: h3 { margin: 2.5em 0 1em 0; }
7: h4 { margin: 2.5em 0 0.75em 0; }
8: h5, h6 { margin: 2.5em 0 1em; }
9: h1 + h2, h1 + h2 + h2 { margin: 0.75em 0 0.75em; }
10: h2 + h3, h3 + h4, h4 + h5, h5 + h6 { margin-top: 0.5em; }
11: p { margin: 1em 0; }
12: hr:not(.top) { display: block; background: none; border: none; padding: 0; margin: 2em 0; height: auto; }
13: dl, dd { margin-top: 0; margin-bottom: 0; }
14: dt { margin-top: 0.75em; margin-bottom: 0.25em; clear: left; }
15: dt + dt { margin-top: 0; }
16: dd dt { margin-top: 0.25em; margin-bottom: 0; }
17: dd p { margin-top: 0; }
18: dd dl + p { margin-top: 1em; }
19: dd table + p { margin-top: 1em; }
20: p + * > li, dd li { margin: 1em 0; }
21: dt, dfn { font-weight: bold; font-style: normal; }
22: dt dfn { font-style: italic; }
23: pre, code { font-size: inherit; font-family: monospace; font-variant: normal; }
24: pre strong { color: black; font: inherit; font-weight: bold; background: yellow; }
25: pre em { font-weight: bolder; font-style: normal; }
26: @media screen { code { color: orangered; } code :link, code :visited { color: inherit; } }
27: var sub { vertical-align: bottom; font-size: smaller; position: relative; top: 0.1em; }
28: table { border-collapse: collapse; border-style: hidden hidden none hidden; }
29: table thead, table tbody { border-bottom: solid; }
30: table tbody th:first-child { border-left: solid; }
31: table tbody th { text-align: left; }
32: table td, table th { border-left: solid; border-right: solid; border-bottom: solid thin; vertical-align: top; padding: 0.2em; }
33: blockquote { margin: 0 0 0 2em; border: 0; padding: 0; font-style: italic; }
34:
35: .bad, .bad *:not(.XXX) { color: gray; border-color: gray; background: transparent; }
36: .matrix, .matrix td { border: none; text-align: right; }
37: .matrix { margin-left: 2em; }
38: .dice-example { border-collapse: collapse; border-style: hidden solid solid hidden; border-width: thin; margin-left: 3em; }
39: .dice-example caption { width: 30em; font-size: smaller; font-style: italic; padding: 0.75em 0; text-align: left; }
40: .dice-example td, .dice-example th { border: solid thin; width: 1.35em; height: 1.05em; text-align: center; padding: 0; }
41:
42: .toc dfn, h1 dfn, h2 dfn, h3 dfn, h4 dfn, h5 dfn, h6 dfn { font: inherit; }
43: img.extra { float: right; }
44: pre.idl { border: solid thin; background: #EEEEEE; color: black; padding: 0.5em 1em; }
45: pre.idl :link, pre.idl :visited { color: inherit; background: transparent; }
46: pre.css { border: solid thin; background: #FFFFEE; color: black; padding: 0.5em 1em; }
47: pre.css:first-line { color: #AAAA50; }
48: dl.domintro { color: green; margin: 2em 0 2em 2em; padding: 0.5em 1em; border: none; background: #DDFFDD; }
49: hr + dl.domintro, div.impl + dl.domintro { margin-top: 2.5em; margin-bottom: 1.5em; }
50: dl.domintro dt, dl.domintro dt * { color: black; text-decoration: none; }
51: dl.domintro dd { margin: 0.5em 0 1em 2em; padding: 0; }
52: dl.domintro dd p { margin: 0.5em 0; }
53: dl.switch { padding-left: 2em; }
54: dl.switch > dt { text-indent: -1.5em; }
55: dl.switch > dt:before { content: '\21AA'; padding: 0 0.5em 0 0; display: inline-block; width: 1em; text-align: right; line-height: 0.5em; }
56: dl.triple { padding: 0 0 0 1em; }
57: dl.triple dt, dl.triple dd { margin: 0; display: inline }
58: dl.triple dt:after { content: ':'; }
59: dl.triple dd:after { content: '\A'; white-space: pre; }
60: .diff-old { text-decoration: line-through; color: silver; background: transparent; }
61: .diff-chg, .diff-new { text-decoration: underline; color: green; background: transparent; }
62: a .diff-new { border-bottom: 1px blue solid; }
63:
64: h2 { page-break-before: always; }
65: h1, h2, h3, h4, h5, h6 { page-break-after: avoid; }
66: h1 + h2, hr + h2.no-toc { page-break-before: auto; }
67:
1.44 mike 68: p > span:not([title=""]):not([class="XXX"]):not([class="impl"]):not([class="note"]),
69: li > span:not([title=""]):not([class="XXX"]):not([class="impl"]):not([class="note"]) { border-bottom: solid #9999CC; }
1.1 mike 70:
71: div.head { margin: 0 0 1em; padding: 1em 0 0 0; }
72: div.head p { margin: 0; }
73: div.head h1 { margin: 0; }
74: div.head .logo { float: right; margin: 0 1em; }
75: div.head .logo img { border: none } /* remove border from top image */
76: div.head dl { margin: 1em 0; }
77: div.head p.copyright, div.head p.alt { font-size: x-small; font-style: oblique; margin: 0; }
78:
79: body > .toc > li { margin-top: 1em; margin-bottom: 1em; }
80: body > .toc.brief > li { margin-top: 0.35em; margin-bottom: 0.35em; }
81: body > .toc > li > * { margin-bottom: 0.5em; }
82: body > .toc > li > * > li > * { margin-bottom: 0.25em; }
83: .toc, .toc li { list-style: none; }
84:
85: .brief { margin-top: 1em; margin-bottom: 1em; line-height: 1.1; }
86: .brief li { margin: 0; padding: 0; }
87: .brief li p { margin: 0; padding: 0; }
88:
89: .category-list { margin-top: -0.75em; margin-bottom: 1em; line-height: 1.5; }
90: .category-list::before { content: '\21D2\A0'; font-size: 1.2em; font-weight: 900; }
91: .category-list li { display: inline; }
92: .category-list li:not(:last-child)::after { content: ', '; }
93: .category-list li > span, .category-list li > a { text-transform: lowercase; }
94: .category-list li * { text-transform: none; } /* don't affect <code> nested in <a> */
95:
96: .XXX { color: #E50000; background: white; border: solid red; padding: 0.5em; margin: 1em 0; }
97: .XXX > :first-child { margin-top: 0; }
98: p .XXX { line-height: 3em; }
99: .annotation { border: solid thin black; background: #0C479D; color: white; position: relative; margin: 8px 0 20px 0; }
100: .annotation:before { position: absolute; left: 0; top: 0; width: 100%; height: 100%; margin: 6px -6px -6px 6px; background: #333333; z-index: -1; content: ''; }
101: .annotation :link, .annotation :visited { color: inherit; }
102: .annotation :link:hover, .annotation :visited:hover { background: transparent; }
103: .annotation span { border: none ! important; }
104: .note { color: green; background: transparent; font-family: sans-serif; }
105: .warning { color: red; background: transparent; }
106: .note, .warning { font-weight: bolder; font-style: italic; }
107: p.note, div.note { padding: 0.5em 2em; }
108: span.note { padding: 0 2em; }
109: .note p:first-child, .warning p:first-child { margin-top: 0; }
110: .note p:last-child, .warning p:last-child { margin-bottom: 0; }
111: .warning:before { font-style: normal; }
112: p.note:before { content: 'Note: '; }
113: p.warning:before { content: '\26A0 Warning! '; }
114:
115: .bookkeeping:before { display: block; content: 'Bookkeeping details'; font-weight: bolder; font-style: italic; }
116: .bookkeeping { font-size: 0.8em; margin: 2em 0; }
117: .bookkeeping p { margin: 0.5em 2em; display: list-item; list-style: square; }
1.19 mike 118: .bookkeeping dt { margin: 0.5em 2em 0; }
119: .bookkeeping dd { margin: 0 3em 0.5em; }
1.1 mike 120:
121: h4 { position: relative; z-index: 3; }
122: h4 + .element, h4 + div + .element { margin-top: -2.5em; padding-top: 2em; }
123: .element {
124: background: #EEEEFF;
125: color: black;
126: margin: 0 0 1em 0.15em;
127: padding: 0 1em 0.25em 0.75em;
128: border-left: solid #9999FF 0.25em;
129: position: relative;
130: z-index: 1;
131: }
132: .element:before {
133: position: absolute;
134: z-index: 2;
135: top: 0;
136: left: -1.15em;
137: height: 2em;
138: width: 0.9em;
139: background: #EEEEFF;
140: content: ' ';
141: border-style: none none solid solid;
142: border-color: #9999FF;
143: border-width: 0.25em;
144: }
145:
146: .example { display: block; color: #222222; background: #FCFCFC; border-left: double; margin-left: 2em; padding-left: 1em; }
147: td > .example:only-child { margin: 0 0 0 0.1em; }
148:
149: ul.domTree, ul.domTree ul { padding: 0 0 0 1em; margin: 0; }
150: ul.domTree li { padding: 0; margin: 0; list-style: none; position: relative; }
151: ul.domTree li li { list-style: none; }
152: ul.domTree li:first-child::before { position: absolute; top: 0; height: 0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }
153: ul.domTree li:not(:last-child)::after { position: absolute; top: 0; bottom: -0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }
154: ul.domTree span { font-style: italic; font-family: serif; }
155: ul.domTree .t1 code { color: purple; font-weight: bold; }
156: ul.domTree .t2 { font-style: normal; font-family: monospace; }
157: ul.domTree .t2 .name { color: black; font-weight: bold; }
158: ul.domTree .t2 .value { color: blue; font-weight: normal; }
159: ul.domTree .t3 code, .domTree .t4 code, .domTree .t5 code { color: gray; }
160: ul.domTree .t7 code, .domTree .t8 code { color: green; }
161: ul.domTree .t10 code { color: teal; }
162:
163: body.dfnEnabled dfn { cursor: pointer; }
164: .dfnPanel {
165: display: inline;
166: position: absolute;
167: z-index: 10;
168: height: auto;
169: width: auto;
170: padding: 0.5em 0.75em;
171: font: small sans-serif, Droid Sans Fallback;
172: background: #DDDDDD;
173: color: black;
174: border: outset 0.2em;
175: }
176: .dfnPanel * { margin: 0; padding: 0; font: inherit; text-indent: 0; }
177: .dfnPanel :link, .dfnPanel :visited { color: black; }
178: .dfnPanel p { font-weight: bolder; }
179: .dfnPanel * + p { margin-top: 0.25em; }
180: .dfnPanel li { list-style-position: inside; }
181:
182: #configUI { position: absolute; z-index: 20; top: 10em; right: 1em; width: 11em; font-size: small; }
183: #configUI p { margin: 0.5em 0; padding: 0.3em; background: #EEEEEE; color: black; border: inset thin; }
184: #configUI p label { display: block; }
185: #configUI #updateUI, #configUI .loginUI { text-align: center; }
186: #configUI input[type=button] { display: block; margin: auto; }
1.17 mike 187:
1.43 mike 188: fieldset { margin: 1em; }
189: fieldset > legend + * { margin-top: 0; }
190: fieldset > :last-child { margin-bottom: 0; }
191:
1.1 mike 192: </style><style type="text/css">
193:
194: .applies thead th > * { display: block; }
195: .applies thead code { display: block; }
196: .applies tbody th { white-space: nowrap; }
197: .applies td { text-align: center; }
198: .applies .yes { background: yellow; }
199:
1.20 mike 200: .matrix, .matrix td { border: hidden; text-align: right; }
1.1 mike 201: .matrix { margin-left: 2em; }
202:
203: .dice-example { border-collapse: collapse; border-style: hidden solid solid hidden; border-width: thin; margin-left: 3em; }
204: .dice-example caption { width: 30em; font-size: smaller; font-style: italic; padding: 0.75em 0; text-align: left; }
205: .dice-example td, .dice-example th { border: solid thin; width: 1.35em; height: 1.05em; text-align: center; padding: 0; }
206:
1.32 mike 207: td.eg { border-width: thin; text-align: center; }
208:
1.1 mike 209: #table-example-1 { border: solid thin; border-collapse: collapse; margin-left: 3em; }
210: #table-example-1 * { font-family: "Essays1743", serif; line-height: 1.01em; }
211: #table-example-1 caption { padding-bottom: 0.5em; }
212: #table-example-1 thead, #table-example-1 tbody { border: none; }
213: #table-example-1 th, #table-example-1 td { border: solid thin; }
214: #table-example-1 th { font-weight: normal; }
215: #table-example-1 td { border-style: none solid; vertical-align: top; }
216: #table-example-1 th { padding: 0.5em; vertical-align: middle; text-align: center; }
217: #table-example-1 tbody tr:first-child td { padding-top: 0.5em; }
218: #table-example-1 tbody tr:last-child td { padding-bottom: 1.5em; }
219: #table-example-1 tbody td:first-child { padding-left: 2.5em; padding-right: 0; width: 9em; }
220: #table-example-1 tbody td:first-child::after { content: leader(". "); }
221: #table-example-1 tbody td { padding-left: 2em; padding-right: 2em; }
222: #table-example-1 tbody td:first-child + td { width: 10em; }
223: #table-example-1 tbody td:first-child + td ~ td { width: 2.5em; }
224: #table-example-1 tbody td:first-child + td + td + td ~ td { width: 1.25em; }
225:
226: .apple-table-examples { border: none; border-collapse: separate; border-spacing: 1.5em 0em; width: 40em; margin-left: 3em; }
227: .apple-table-examples * { font-family: "Times", serif; }
228: .apple-table-examples td, .apple-table-examples th { border: none; white-space: nowrap; padding-top: 0; padding-bottom: 0; }
229: .apple-table-examples tbody th:first-child { border-left: none; width: 100%; }
230: .apple-table-examples thead th:first-child ~ th { font-size: smaller; font-weight: bolder; border-bottom: solid 2px; text-align: center; }
231: .apple-table-examples tbody th::after, .apple-table-examples tfoot th::after { content: leader(". ") }
232: .apple-table-examples tbody th, .apple-table-examples tfoot th { font: inherit; text-align: left; }
233: .apple-table-examples td { text-align: right; vertical-align: top; }
234: .apple-table-examples.e1 tbody tr:last-child td { border-bottom: solid 1px; }
235: .apple-table-examples.e1 tbody + tbody tr:last-child td { border-bottom: double 3px; }
236: .apple-table-examples.e2 th[scope=row] { padding-left: 1em; }
237: .apple-table-examples sup { line-height: 0; }
238:
239: .details-example img { vertical-align: top; }
240:
241: #named-character-references-table {
1.41 mike 242: white-space: nowrap;
1.1 mike 243: font-size: 0.6em;
1.41 mike 244: column-width: 30em;
1.1 mike 245: column-gap: 1em;
1.41 mike 246: -moz-column-width: 30em;
1.1 mike 247: -moz-column-gap: 1em;
1.41 mike 248: -webkit-column-width: 30em;
1.1 mike 249: -webkit-column-gap: 1em;
250: }
1.41 mike 251: #named-character-references-table > table > tbody > tr > td:first-child + td,
1.1 mike 252: #named-character-references-table > table > tbody > tr > td:last-child { text-align: center; }
253: #named-character-references-table > table > tbody > tr > td:last-child:hover > span { position: absolute; top: auto; left: auto; margin-left: 0.5em; line-height: 1.2; font-size: 5em; border: outset; padding: 0.25em 0.5em; background: white; width: 1.25em; height: auto; text-align: center; }
1.41 mike 254: #named-character-references-table > table > tbody > tr#entity-CounterClockwiseContourIntegral > td:first-child { font-size: 0.5em; }
1.1 mike 255:
1.2 mike 256: .glyph.control { color: red; }
257:
1.4 mike 258: @font-face {
259: font-family: 'Essays1743';
260: src: url('http://www.whatwg.org/specs/web-apps/current-work/fonts/Essays1743.ttf');
261: }
262: @font-face {
263: font-family: 'Essays1743';
264: font-weight: bold;
265: src: url('http://www.whatwg.org/specs/web-apps/current-work/fonts/Essays1743-Bold.ttf');
266: }
267: @font-face {
268: font-family: 'Essays1743';
269: font-style: italic;
270: src: url('http://www.whatwg.org/specs/web-apps/current-work/fonts/Essays1743-Italic.ttf');
271: }
272: @font-face {
273: font-family: 'Essays1743';
274: font-style: italic;
275: font-weight: bold;
276: src: url('http://www.whatwg.org/specs/web-apps/current-work/fonts/Essays1743-BoldItalic.ttf');
277: }
278:
1.1 mike 279: </style><style type="text/css">
280: .domintro:before { display: table; margin: -1em -0.5em -0.5em auto; width: auto; content: 'This box is non-normative. Implementation requirements are given below this box.'; color: black; font-style: italic; border: solid 2px; background: white; padding: 0 0.25em; }
281: </style><link href="data:text/css," id="complete" rel="stylesheet" title="Complete specification"><link href="data:text/css,.impl%20%7B%20display:%20none;%20%7D%0Ahtml%20%7B%20border:%20solid%20yellow;%20%7D%20.domintro:before%20%7B%20display:%20none;%20%7D" id="author" rel="alternate stylesheet" title="Author documentation only"><link href="data:text/css,.impl%20%7B%20background:%20%23FFEEEE;%20%7D%20.domintro:before%20%7B%20background:%20%23FFEEEE;%20%7D" id="highlight" rel="alternate stylesheet" title="Highlight implementation requirements"><script type="text/javascript">
282: function getCookie(name) {
283: var params = location.search.substr(1).split("&");
284: for (var index = 0; index < params.length; index++) {
285: if (params[index] == name)
286: return "1";
287: var data = params[index].split("=");
288: if (data[0] == name)
289: return unescape(data[1]);
290: }
291: var cookies = document.cookie.split("; ");
292: for (var index = 0; index < cookies.length; index++) {
293: var data = cookies[index].split("=");
294: if (data[0] == name)
295: return unescape(data[1]);
296: }
297: return null;
298: }
299: function load(script) {
300: var e = document.createElement('script');
1.46 mike 301: e.setAttribute('src', 'http://www.whatwg.org/specs/web-apps/current-work/' + script + '?' + encodeURIComponent(location) + '&' + encodeURIComponent(document.referrer));
1.1 mike 302: document.body.appendChild(e);
303: }
304: function init() {
305: if (location.search == '?slow-browser')
306: return;
307: var configUI = document.createElement('div');
308: configUI.id = 'configUI';
309: document.body.appendChild(configUI);
310: // load('reviewer.js'); // would need cross-site XHR
311: if (document.getElementById('head'))
312: load('toc.js');
313: load('styler.js');
314: // load('updater.js'); // would need cross-site XHR
315: load('dfn.js'); // doesn't support split-out specs, but, oh well.
316: // load('status.js'); // would need cross-site XHR
317: if (getCookie('profile') == '1')
318: document.getElementsByTagName('h2')[0].textContent += '; load: ' + (new Date() - loadTimer) + 'ms';
319: fixBrokenLink();
320: }
1.46 mike 321: </script><link href="http://www.w3.org/StyleSheets/TR/W3C-ED" rel="stylesheet" type="text/css">
1.1 mike 322: <script src="link-fixup.js"></script>
323: <link href="parsing.html" title="8.2 Parsing HTML documents" rel="prev">
324: <link href="spec.html#contents" title="Table of contents" rel="index">
325: <link href="the-end.html" title="8.2.6 The end" rel="next">
326: </head><body onload="fixBrokenLink(); init()"><div class="head" id="head">
327: <p><a href="http://www.w3.org/"><img alt="W3C" height="48" src="http://www.w3.org/Icons/w3c_home" width="72"></a></p>
1.3 mike 328:
1.1 mike 329: <h1>HTML5</h1>
330: </div><div>
331: <a href="parsing.html">← 8.2 Parsing HTML documents</a> –
332: <a href="spec.html#contents">Table of contents</a> –
333: <a href="the-end.html">8.2.6 The end →</a>
334: <ol class="toc"><li><ol><li><ol><li><a href="tokenization.html#tokenization"><span class="secno">8.2.4 </span>Tokenization</a>
1.37 mike 335: <ol><li><a href="tokenization.html#data-state"><span class="secno">8.2.4.1 </span>Data state</a></li><li><a href="tokenization.html#character-reference-in-data-state"><span class="secno">8.2.4.2 </span>Character reference in data state</a></li><li><a href="tokenization.html#rcdata-state"><span class="secno">8.2.4.3 </span>RCDATA state</a></li><li><a href="tokenization.html#character-reference-in-rcdata-state"><span class="secno">8.2.4.4 </span>Character reference in RCDATA state</a></li><li><a href="tokenization.html#rawtext-state"><span class="secno">8.2.4.5 </span>RAWTEXT state</a></li><li><a href="tokenization.html#script-data-state"><span class="secno">8.2.4.6 </span>Script data state</a></li><li><a href="tokenization.html#plaintext-state"><span class="secno">8.2.4.7 </span>PLAINTEXT state</a></li><li><a href="tokenization.html#tag-open-state"><span class="secno">8.2.4.8 </span>Tag open state</a></li><li><a href="tokenization.html#end-tag-open-state"><span class="secno">8.2.4.9 </span>End tag open state</a></li><li><a href="tokenization.html#tag-name-state"><span class="secno">8.2.4.10 </span>Tag name state</a></li><li><a href="tokenization.html#rcdata-less-than-sign-state"><span class="secno">8.2.4.11 </span>RCDATA less-than sign state</a></li><li><a href="tokenization.html#rcdata-end-tag-open-state"><span class="secno">8.2.4.12 </span>RCDATA end tag open state</a></li><li><a href="tokenization.html#rcdata-end-tag-name-state"><span class="secno">8.2.4.13 </span>RCDATA end tag name state</a></li><li><a href="tokenization.html#rawtext-less-than-sign-state"><span class="secno">8.2.4.14 </span>RAWTEXT less-than sign state</a></li><li><a href="tokenization.html#rawtext-end-tag-open-state"><span class="secno">8.2.4.15 </span>RAWTEXT end tag open state</a></li><li><a href="tokenization.html#rawtext-end-tag-name-state"><span class="secno">8.2.4.16 </span>RAWTEXT end tag name state</a></li><li><a 
href="tokenization.html#script-data-less-than-sign-state"><span class="secno">8.2.4.17 </span>Script data less-than sign state</a></li><li><a href="tokenization.html#script-data-end-tag-open-state"><span class="secno">8.2.4.18 </span>Script data end tag open state</a></li><li><a href="tokenization.html#script-data-end-tag-name-state"><span class="secno">8.2.4.19 </span>Script data end tag name state</a></li><li><a href="tokenization.html#script-data-escape-start-state"><span class="secno">8.2.4.20 </span>Script data escape start state</a></li><li><a href="tokenization.html#script-data-escape-start-dash-state"><span class="secno">8.2.4.21 </span>Script data escape start dash state</a></li><li><a href="tokenization.html#script-data-escaped-state"><span class="secno">8.2.4.22 </span>Script data escaped state</a></li><li><a href="tokenization.html#script-data-escaped-dash-state"><span class="secno">8.2.4.23 </span>Script data escaped dash state</a></li><li><a href="tokenization.html#script-data-escaped-dash-dash-state"><span class="secno">8.2.4.24 </span>Script data escaped dash dash state</a></li><li><a href="tokenization.html#script-data-escaped-less-than-sign-state"><span class="secno">8.2.4.25 </span>Script data escaped less-than sign state</a></li><li><a href="tokenization.html#script-data-escaped-end-tag-open-state"><span class="secno">8.2.4.26 </span>Script data escaped end tag open state</a></li><li><a href="tokenization.html#script-data-escaped-end-tag-name-state"><span class="secno">8.2.4.27 </span>Script data escaped end tag name state</a></li><li><a href="tokenization.html#script-data-double-escape-start-state"><span class="secno">8.2.4.28 </span>Script data double escape start state</a></li><li><a href="tokenization.html#script-data-double-escaped-state"><span class="secno">8.2.4.29 </span>Script data double escaped state</a></li><li><a href="tokenization.html#script-data-double-escaped-dash-state"><span class="secno">8.2.4.30 </span>Script data double 
escaped dash state</a></li><li><a href="tokenization.html#script-data-double-escaped-dash-dash-state"><span class="secno">8.2.4.31 </span>Script data double escaped dash dash state</a></li><li><a href="tokenization.html#script-data-double-escaped-less-than-sign-state"><span class="secno">8.2.4.32 </span>Script data double escaped less-than sign state</a></li><li><a href="tokenization.html#script-data-double-escape-end-state"><span class="secno">8.2.4.33 </span>Script data double escape end state</a></li><li><a href="tokenization.html#before-attribute-name-state"><span class="secno">8.2.4.34 </span>Before attribute name state</a></li><li><a href="tokenization.html#attribute-name-state"><span class="secno">8.2.4.35 </span>Attribute name state</a></li><li><a href="tokenization.html#after-attribute-name-state"><span class="secno">8.2.4.36 </span>After attribute name state</a></li><li><a href="tokenization.html#before-attribute-value-state"><span class="secno">8.2.4.37 </span>Before attribute value state</a></li><li><a href="tokenization.html#attribute-value-double-quoted-state"><span class="secno">8.2.4.38 </span>Attribute value (double-quoted) state</a></li><li><a href="tokenization.html#attribute-value-single-quoted-state"><span class="secno">8.2.4.39 </span>Attribute value (single-quoted) state</a></li><li><a href="tokenization.html#attribute-value-unquoted-state"><span class="secno">8.2.4.40 </span>Attribute value (unquoted) state</a></li><li><a href="tokenization.html#character-reference-in-attribute-value-state"><span class="secno">8.2.4.41 </span>Character reference in attribute value state</a></li><li><a href="tokenization.html#after-attribute-value-quoted-state"><span class="secno">8.2.4.42 </span>After attribute value (quoted) state</a></li><li><a href="tokenization.html#self-closing-start-tag-state"><span class="secno">8.2.4.43 </span>Self-closing start tag state</a></li><li><a href="tokenization.html#bogus-comment-state"><span class="secno">8.2.4.44 
</span>Bogus comment state</a></li><li><a href="tokenization.html#markup-declaration-open-state"><span class="secno">8.2.4.45 </span>Markup declaration open state</a></li><li><a href="tokenization.html#comment-start-state"><span class="secno">8.2.4.46 </span>Comment start state</a></li><li><a href="tokenization.html#comment-start-dash-state"><span class="secno">8.2.4.47 </span>Comment start dash state</a></li><li><a href="tokenization.html#comment-state"><span class="secno">8.2.4.48 </span>Comment state</a></li><li><a href="tokenization.html#comment-end-dash-state"><span class="secno">8.2.4.49 </span>Comment end dash state</a></li><li><a href="tokenization.html#comment-end-state"><span class="secno">8.2.4.50 </span>Comment end state</a></li><li><a href="tokenization.html#comment-end-bang-state"><span class="secno">8.2.4.51 </span>Comment end bang state</a></li><li><a href="tokenization.html#doctype-state"><span class="secno">8.2.4.52 </span>DOCTYPE state</a></li><li><a href="tokenization.html#before-doctype-name-state"><span class="secno">8.2.4.53 </span>Before DOCTYPE name state</a></li><li><a href="tokenization.html#doctype-name-state"><span class="secno">8.2.4.54 </span>DOCTYPE name state</a></li><li><a href="tokenization.html#after-doctype-name-state"><span class="secno">8.2.4.55 </span>After DOCTYPE name state</a></li><li><a href="tokenization.html#after-doctype-public-keyword-state"><span class="secno">8.2.4.56 </span>After DOCTYPE public keyword state</a></li><li><a href="tokenization.html#before-doctype-public-identifier-state"><span class="secno">8.2.4.57 </span>Before DOCTYPE public identifier state</a></li><li><a href="tokenization.html#doctype-public-identifier-double-quoted-state"><span class="secno">8.2.4.58 </span>DOCTYPE public identifier (double-quoted) state</a></li><li><a href="tokenization.html#doctype-public-identifier-single-quoted-state"><span class="secno">8.2.4.59 </span>DOCTYPE public identifier (single-quoted) state</a></li><li><a 
href="tokenization.html#after-doctype-public-identifier-state"><span class="secno">8.2.4.60 </span>After DOCTYPE public identifier state</a></li><li><a href="tokenization.html#between-doctype-public-and-system-identifiers-state"><span class="secno">8.2.4.61 </span>Between DOCTYPE public and system identifiers state</a></li><li><a href="tokenization.html#after-doctype-system-keyword-state"><span class="secno">8.2.4.62 </span>After DOCTYPE system keyword state</a></li><li><a href="tokenization.html#before-doctype-system-identifier-state"><span class="secno">8.2.4.63 </span>Before DOCTYPE system identifier state</a></li><li><a href="tokenization.html#doctype-system-identifier-double-quoted-state"><span class="secno">8.2.4.64 </span>DOCTYPE system identifier (double-quoted) state</a></li><li><a href="tokenization.html#doctype-system-identifier-single-quoted-state"><span class="secno">8.2.4.65 </span>DOCTYPE system identifier (single-quoted) state</a></li><li><a href="tokenization.html#after-doctype-system-identifier-state"><span class="secno">8.2.4.66 </span>After DOCTYPE system identifier state</a></li><li><a href="tokenization.html#bogus-doctype-state"><span class="secno">8.2.4.67 </span>Bogus DOCTYPE state</a></li><li><a href="tokenization.html#cdata-section-state"><span class="secno">8.2.4.68 </span>CDATA section state</a></li><li><a href="tokenization.html#tokenizing-character-references"><span class="secno">8.2.4.69 </span>Tokenizing character references</a></li></ol></li><li><a href="tokenization.html#tree-construction"><span class="secno">8.2.5 </span>Tree construction</a>
1.1 mike 336: <ol><li><a href="tokenization.html#creating-and-inserting-elements"><span class="secno">8.2.5.1 </span>Creating and inserting elements</a></li><li><a href="tokenization.html#closing-elements-that-have-implied-end-tags"><span class="secno">8.2.5.2 </span>Closing elements that have implied end tags</a></li><li><a href="tokenization.html#foster-parenting"><span class="secno">8.2.5.3 </span>Foster parenting</a></li><li><a href="tokenization.html#the-initial-insertion-mode"><span class="secno">8.2.5.4 </span>The "initial" insertion mode</a></li><li><a href="tokenization.html#the-before-html-insertion-mode"><span class="secno">8.2.5.5 </span>The "before html" insertion mode</a></li><li><a href="tokenization.html#the-before-head-insertion-mode"><span class="secno">8.2.5.6 </span>The "before head" insertion mode</a></li><li><a href="tokenization.html#parsing-main-inhead"><span class="secno">8.2.5.7 </span>The "in head" insertion mode</a></li><li><a href="tokenization.html#parsing-main-inheadnoscript"><span class="secno">8.2.5.8 </span>The "in head noscript" insertion mode</a></li><li><a href="tokenization.html#the-after-head-insertion-mode"><span class="secno">8.2.5.9 </span>The "after head" insertion mode</a></li><li><a href="tokenization.html#parsing-main-inbody"><span class="secno">8.2.5.10 </span>The "in body" insertion mode</a></li><li><a href="tokenization.html#parsing-main-incdata"><span class="secno">8.2.5.11 </span>The "text" insertion mode</a></li><li><a href="tokenization.html#parsing-main-intable"><span class="secno">8.2.5.12 </span>The "in table" insertion mode</a></li><li><a href="tokenization.html#parsing-main-intabletext"><span class="secno">8.2.5.13 </span>The "in table text" insertion mode</a></li><li><a href="tokenization.html#parsing-main-incaption"><span class="secno">8.2.5.14 </span>The "in caption" insertion mode</a></li><li><a href="tokenization.html#parsing-main-incolgroup"><span class="secno">8.2.5.15 </span>The "in column group" 
insertion mode</a></li><li><a href="tokenization.html#parsing-main-intbody"><span class="secno">8.2.5.16 </span>The "in table body" insertion mode</a></li><li><a href="tokenization.html#parsing-main-intr"><span class="secno">8.2.5.17 </span>The "in row" insertion mode</a></li><li><a href="tokenization.html#parsing-main-intd"><span class="secno">8.2.5.18 </span>The "in cell" insertion mode</a></li><li><a href="tokenization.html#parsing-main-inselect"><span class="secno">8.2.5.19 </span>The "in select" insertion mode</a></li><li><a href="tokenization.html#parsing-main-inselectintable"><span class="secno">8.2.5.20 </span>The "in select in table" insertion mode</a></li><li><a href="tokenization.html#parsing-main-inforeign"><span class="secno">8.2.5.21 </span>The "in foreign content" insertion mode</a></li><li><a href="tokenization.html#parsing-main-afterbody"><span class="secno">8.2.5.22 </span>The "after body" insertion mode</a></li><li><a href="tokenization.html#parsing-main-inframeset"><span class="secno">8.2.5.23 </span>The "in frameset" insertion mode</a></li><li><a href="tokenization.html#parsing-main-afterframeset"><span class="secno">8.2.5.24 </span>The "after frameset" insertion mode</a></li><li><a href="tokenization.html#the-after-after-body-insertion-mode"><span class="secno">8.2.5.25 </span>The "after after body" insertion mode</a></li><li><a href="tokenization.html#the-after-after-frameset-insertion-mode"><span class="secno">8.2.5.26 </span>The "after after frameset" insertion mode</a></li></ol></li></ol></li></ol></li></ol></div>
337:
338: <div class="impl">
339:
1.29 mike 340: <h4 id="tokenization"><span class="secno">8.2.4 </span><dfn>Tokenization</dfn></h4>
1.1 mike 341:
342: <p>Implementations must act as if they used the following state
343: machine to tokenize HTML. The state machine must start in the
344: <a href="#data-state">data state</a>. Most states consume a single character,
345: which may have various side effects, and then either switch the state
346: machine to a new state to <em>reconsume</em> the same character,
347: switch it to a new state (to consume the next character), or
348: repeat the same state (to consume the next character). Some states
349: have more complicated behavior and can consume several characters
350: before switching to another state. In some cases, the tokenizer
351: state is also changed by the tree construction stage.</p>
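<p>The three kinds of transition described above (repeat the same state, switch to a new state, and reconsume in a new state) can be sketched as follows. This is an illustrative sketch only, not part of the state machine definition: the class name <code>SketchTokenizer</code>, the tuple-shaped tokens, and the use of <code>None</code> as the EOF sentinel are assumptions of the example, and it covers only a small subset of the data, tag open, and tag name states.</p>

```python
from typing import Callable, List, Optional, Tuple

EOF = None  # sentinel for end of input (a convention of this sketch)


def is_upper(c: Optional[str]) -> bool:
    return c is not EOF and "A" <= c <= "Z"


def is_lower(c: Optional[str]) -> bool:
    return c is not EOF and "a" <= c <= "z"


class SketchTokenizer:
    """Hypothetical, heavily reduced tokenizer: no attributes, end tags,
    comments, or character references."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.pos = 0
        self.current: Optional[str] = EOF
        self.reconsume = False
        self.state: Callable[[Optional[str]], None] = self.data_state
        self.tag_name = ""
        self.tokens: List[Tuple[str, str]] = []
        self.finished = False

    def next_input_character(self) -> Optional[str]:
        # When a state asked to reconsume, hand back the same character.
        if self.reconsume:
            self.reconsume = False
            return self.current
        if self.pos < len(self.text):
            self.current = self.text[self.pos]
            self.pos += 1
        else:
            self.current = EOF
        return self.current

    def reconsume_in(self, state: Callable[[Optional[str]], None]) -> None:
        self.reconsume = True
        self.state = state

    def run(self) -> List[Tuple[str, str]]:
        while not self.finished:
            self.state(self.next_input_character())
        return self.tokens

    def data_state(self, c: Optional[str]) -> None:
        if c == "&":
            raise NotImplementedError("character references are out of scope here")
        if c == "<":
            self.state = self.tag_open_state        # switch to a new state
        elif c is EOF:
            self.tokens.append(("eof", ""))
            self.finished = True
        else:
            self.tokens.append(("char", c))         # repeat the same state

    def tag_open_state(self, c: Optional[str]) -> None:
        if c is not EOF and c in "!/?":
            raise NotImplementedError("declarations, end tags and bogus comments are out of scope here")
        if is_upper(c):
            self.tag_name = chr(ord(c) + 0x20)      # lowercase by adding 0x0020
            self.state = self.tag_name_state
        elif is_lower(c):
            self.tag_name = c
            self.state = self.tag_name_state
        else:
            # Parse error: emit "<" and reconsume the character in the data state.
            self.tokens.append(("char", "<"))
            self.reconsume_in(self.data_state)

    def tag_name_state(self, c: Optional[str]) -> None:
        if c is EOF:
            self.reconsume_in(self.data_state)      # parse error; the tag is dropped
        elif c == ">":
            self.state = self.data_state
            self.tokens.append(("start_tag", self.tag_name))
        elif c in "\t\n\f /":
            raise NotImplementedError("attributes and self-closing tags are out of scope here")
        elif is_upper(c):
            self.tag_name += chr(ord(c) + 0x20)
        else:
            self.tag_name += c
```

<p>In this sketch, <code>SketchTokenizer("a&lt;B&gt;c").run()</code> yields a character token, a start tag token named <code>b</code>, another character token, and an end-of-file token. A stray <code>&lt;</code> followed by a space is emitted as two character tokens because the space is reconsumed in the data state.</p>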
352:
353: <p>The exact behavior of certain states depends on the
354: <a href="parsing.html#insertion-mode">insertion mode</a> and the <a href="parsing.html#stack-of-open-elements">stack of open
355: elements</a>. Certain states also use a <dfn id="temporary-buffer"><var>temporary
356: buffer</var></dfn> to track progress.</p>
357:
358: <p>The output of the tokenization step is a series of zero or more
359: of the following tokens: DOCTYPE, start tag, end tag, comment,
360: character, end-of-file. DOCTYPE tokens have a name, a public
361: identifier, a system identifier, and a <i>force-quirks
362: flag</i>. When a DOCTYPE token is created, its name, public
363: identifier, and system identifier must be marked as missing (which
364: is a distinct state from the empty string), and the <i>force-quirks
365: flag</i> must be set to <i>off</i> (its other state is
366: <i>on</i>). Start and end tag tokens have a tag name, a
367: <i>self-closing flag</i>, and a list of attributes, each of which
368: has a name and a value. When a start or end tag token is created,
369: its <i>self-closing flag</i> must be unset (its other state is
370: <i>set</i>), and its attributes list must be empty. Comment and
371: character tokens have data.</p>
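<p>As a rough illustration of the token shapes just described, the following sketch models them as Python dataclasses. The class and field names are assumptions of this example, not names defined by this specification. "Missing" is modelled as <code>None</code> so that it stays distinct from the empty string.</p>

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DoctypeToken:
    # None stands for "missing", which is distinct from the empty string.
    name: Optional[str] = None
    public_identifier: Optional[str] = None
    system_identifier: Optional[str] = None
    force_quirks: bool = False  # the flag starts out "off"


@dataclass
class TagToken:
    tag_name: str
    is_end_tag: bool = False
    self_closing: bool = False  # the flag starts out unset
    # Each attribute is a (name, value) pair; the list starts out empty.
    attributes: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class CommentToken:
    data: str = ""


@dataclass
class CharacterToken:
    data: str


@dataclass
class EndOfFileToken:
    pass
```

<p>Using <code>default_factory=list</code> gives every tag token its own attribute list, so appending an attribute to one token cannot leak into another.</p>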
372:
373: <p>When a token is emitted, it must immediately be handled by the
374: <a href="#tree-construction">tree construction</a> stage. The tree construction stage
375: can affect the state of the tokenization stage, and can insert
376: additional characters into the stream. (For example, the
377: <code><a href="scripting-1.html#script">script</a></code> element can result in scripts executing and
378: using the <a href="apis-in-html-documents.html#dynamic-markup-insertion">dynamic markup insertion</a> APIs to insert
379: characters into the stream being tokenized.)</p>
380:
381: <p>When a start tag token is emitted with its <i>self-closing
382: flag</i> set, if the flag is not <dfn id="acknowledge-self-closing-flag" title="acknowledge
383: self-closing flag">acknowledged</dfn> when it is processed by the
384: tree construction stage, that is a <a href="parsing.html#parse-error">parse error</a>.</p>
385:
386: <p>When an end tag token is emitted with attributes, that is a
387: <a href="parsing.html#parse-error">parse error</a>.</p>
388:
389: <p>When an end tag token is emitted with its <i>self-closing
390: flag</i> set, that is a <a href="parsing.html#parse-error">parse error</a>.</p>
391:
392: <p>An <dfn id="appropriate-end-tag-token">appropriate end tag token</dfn> is an end tag token whose
393: tag name matches the tag name of the last start tag to have been
394: emitted from this tokenizer, if any. If no start tag has been
395: emitted from this tokenizer, then no end tag token is
396: appropriate.</p>
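<p>One way to picture the "appropriate end tag token" check is a small helper that remembers the name of the last start tag emitted. This is a minimal sketch under the assumption that tag names are already lowercased. The class <code>EndTagMatcher</code> and its method names are hypothetical.</p>

```python
from typing import Optional


class EndTagMatcher:
    """Tracks the last emitted start tag for one tokenizer instance."""

    def __init__(self) -> None:
        # None means no start tag has been emitted yet.
        self.last_start_tag_name: Optional[str] = None

    def note_start_tag_emitted(self, tag_name: str) -> None:
        self.last_start_tag_name = tag_name

    def is_appropriate(self, end_tag_name: str) -> bool:
        # With no start tag emitted so far, no end tag is appropriate.
        if self.last_start_tag_name is None:
            return False
        return end_tag_name == self.last_start_tag_name
```

<p>For example, after a <code>title</code> start tag has been emitted, an end tag named <code>title</code> is appropriate and one named <code>textarea</code> is not.</p>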
397:
398: <p>Before each step of the tokenizer, the user agent must first
399: check the <a href="parsing.html#parser-pause-flag">parser pause flag</a>. If it is true, then the
400: tokenizer must abort the processing of any nested invocations of the
401: tokenizer, yielding control back to the caller.</p>
402:
403: <p>The tokenizer state machine consists of the states defined in the
404: following subsections.</p>
405:
406:
407: <!-- Order of the lists below is supposed to be non-error then
408: error, by unicode, then EOF, ending with "anything else" -->
409:
410:
1.29 mike 411: <h5 id="data-state"><span class="secno">8.2.4.1 </span><dfn>Data state</dfn></h5>
1.1 mike 412:
413: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
414:
415: <dl class="switch"><dt>U+0026 AMPERSAND (&amp;)</dt>
416: <dd>Switch to the <a href="#character-reference-in-data-state">character reference in data
417: state</a>.</dd>
418:
419: <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
420: <dd>Switch to the <a href="#tag-open-state">tag open state</a>.</dd>
421:
422: <dt>EOF</dt>
423: <dd>Emit an end-of-file token.</dd>
424:
425: <dt>Anything else</dt>
426: <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
1.14 mike 427: token.</dd>
1.1 mike 428:
1.29 mike 429: </dl><h5 id="character-reference-in-data-state"><span class="secno">8.2.4.2 </span><dfn>Character reference in data state</dfn></h5>
1.1 mike 430:
431: <p>Attempt to <a href="#consume-a-character-reference">consume a character reference</a>, with no
432: <a href="#additional-allowed-character">additional allowed character</a>.</p>
433:
1.18 mike 434: <p>If nothing is returned, emit a U+0026 AMPERSAND character (&amp;)
1.1 mike 435: token.</p>
436:
437: <p>Otherwise, emit the character token that was returned.</p>
438:
439: <p>Finally, switch to the <a href="#data-state">data state</a>.</p>
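<p>The interplay between the data state and the character reference in data state can be sketched as below. The real "consume a character reference" algorithm is defined elsewhere in this specification. The function <code>consume_character_reference</code> here is a hypothetical stand-in that recognizes only three named references, and <code>tokenize_data</code> is likewise an invented name. Tokens are represented as plain strings for brevity.</p>

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in table: the real algorithm knows many more references
# and also handles numeric ones.
_NAMED = {"amp;": "&", "lt;": "<", "gt;": ">"}


def consume_character_reference(text: str, pos: int) -> Tuple[Optional[str], int]:
    """Return (character, new position), or (None, pos) if nothing is consumed."""
    for name, char in _NAMED.items():
        if text.startswith(name, pos):
            return char, pos + len(name)
    return None, pos


def tokenize_data(text: str) -> List[str]:
    """Emit character tokens for text containing no tags, then an EOF marker."""
    out: List[str] = []
    pos = 0
    while pos < len(text):
        c = text[pos]
        pos += 1
        if c == "&":
            # Character reference in data state.
            char, pos = consume_character_reference(text, pos)
            # If nothing is returned, emit the ampersand itself.
            out.append("&" if char is None else char)
            # Finally, control returns to the data state (the loop continues).
        elif c == "<":
            raise NotImplementedError("the tag open state is out of scope here")
        else:
            out.append(c)
    out.append("EOF")
    return out
```

<p>An ampersand that does not start a recognized reference is passed through unchanged, which matches the "if nothing is returned" branch above.</p>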
440:
441:
1.29 mike 442: <h5 id="rcdata-state"><span class="secno">8.2.4.3 </span><dfn>RCDATA state</dfn></h5>
1.1 mike 443:
444: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
445:
446: <dl class="switch"><dt>U+0026 AMPERSAND (&amp;)</dt>
447: <dd>Switch to the <a href="#character-reference-in-rcdata-state">character reference in RCDATA
448: state</a>.</dd>
449:
450: <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
451: <dd>Switch to the <a href="#rcdata-less-than-sign-state">RCDATA less-than sign state</a>.</dd>
452:
453: <dt>EOF</dt>
454: <dd>Emit an end-of-file token.</dd>
455:
456: <dt>Anything else</dt>
457: <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
1.14 mike 458: token.</dd>
1.1 mike 459:
460: </dl><h5 id="character-reference-in-rcdata-state"><span class="secno">8.2.4.4 </span><dfn>Character reference in RCDATA state</dfn></h5>
461:
462: <p>Attempt to <a href="#consume-a-character-reference">consume a character reference</a>, with no
463: <a href="#additional-allowed-character">additional allowed character</a>.</p>
464:
1.18 mike 465: <p>If nothing is returned, emit a U+0026 AMPERSAND character (&amp;)
1.1 mike 466: token.</p>
467:
468: <p>Otherwise, emit the character token that was returned.</p>
469:
470: <p>Finally, switch to the <a href="#rcdata-state">RCDATA state</a>.</p>
471:
472:
1.29 mike 473: <h5 id="rawtext-state"><span class="secno">8.2.4.5 </span><dfn>RAWTEXT state</dfn></h5>
1.1 mike 474:
475: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
476:
477: <dl class="switch"><dt>U+003C LESS-THAN SIGN (&lt;)</dt>
478: <dd>Switch to the <a href="#rawtext-less-than-sign-state">RAWTEXT less-than sign state</a>.</dd>
479:
480: <dt>EOF</dt>
481: <dd>Emit an end-of-file token.</dd>
482:
483: <dt>Anything else</dt>
484: <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
1.14 mike 485: token.</dd>
1.1 mike 486:
1.29 mike 487: </dl><h5 id="script-data-state"><span class="secno">8.2.4.6 </span><dfn>Script data state</dfn></h5>
1.1 mike 488:
489: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
490:
491: <dl class="switch"><dt>U+003C LESS-THAN SIGN (&lt;)</dt>
492: <dd>Switch to the <a href="#script-data-less-than-sign-state">script data less-than sign state</a>.</dd>
493:
494: <dt>EOF</dt>
495: <dd>Emit an end-of-file token.</dd>
496:
497: <dt>Anything else</dt>
498: <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
1.14 mike 499: token.</dd>
1.1 mike 500:
1.29 mike 501: </dl><h5 id="plaintext-state"><span class="secno">8.2.4.7 </span><dfn>PLAINTEXT state</dfn></h5>
1.1 mike 502:
503: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
504:
505: <dl class="switch"><dt>EOF</dt>
506: <dd>Emit an end-of-file token.</dd>
507:
508: <dt>Anything else</dt>
509: <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
1.14 mike 510: token.</dd>
1.1 mike 511:
1.29 mike 512: </dl><h5 id="tag-open-state"><span class="secno">8.2.4.8 </span><dfn>Tag open state</dfn></h5>
1.1 mike 513:
514: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
515:
516: <dl class="switch"><dt>U+0021 EXCLAMATION MARK (!)</dt>
517: <dd>Switch to the <a href="#markup-declaration-open-state">markup declaration open state</a>.</dd>
518:
519: <dt>U+002F SOLIDUS (/)</dt>
520: <dd>Switch to the <a href="#end-tag-open-state">end tag open state</a>.</dd>
521:
522: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
523: <dd>Create a new start tag token, set its tag name to the
524: lowercase version of the <a href="parsing.html#current-input-character">current input character</a> (add 0x0020 to the
525: character's code point), then switch to the <a href="#tag-name-state">tag name
526: state</a>. (Don't emit the token yet; further details will
527: be filled in before it is emitted.)</dd>
528:
529: <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
530: <dd>Create a new start tag token, set its tag name to the
531: <a href="parsing.html#current-input-character">current input character</a>, then switch to the <a href="#tag-name-state">tag
532: name state</a>. (Don't emit the token yet; further details will
533: be filled in before it is emitted.)</dd>
534:
535: <dt>U+003F QUESTION MARK (?)</dt>
536: <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#bogus-comment-state">bogus
537: comment state</a>.</dd>
538:
539: <dt>Anything else</dt>
540: <dd><a href="parsing.html#parse-error">Parse error</a>. Emit a U+003C LESS-THAN SIGN
541: character token and reconsume the <a href="parsing.html#current-input-character">current input
542: character</a> in the <a href="#data-state">data state</a>.</dd>
543:
544: </dl><h5 id="end-tag-open-state"><span class="secno">8.2.4.9 </span><dfn>End tag open state</dfn></h5>
545:
546: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
547:
548: <dl class="switch"><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
549: <dd>Create a new end tag token, set its tag name to the lowercase
550: version of the <a href="parsing.html#current-input-character">current input character</a> (add 0x0020 to
551: the character's code point), then switch to the <a href="#tag-name-state">tag name
552: state</a>. (Don't emit the token yet; further details will be
553: filled in before it is emitted.)</dd>
554:
555: <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
556: <dd>Create a new end tag token, set its tag name to the
557: <a href="parsing.html#current-input-character">current input character</a>, then switch to the <a href="#tag-name-state">tag
558: name state</a>. (Don't emit the token yet; further details will
559: be filled in before it is emitted.)</dd>
560:
561: <dt>U+003E GREATER-THAN SIGN (>)</dt>
562: <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
563: state</a>.</dd>
564:
565: <dt>EOF</dt>
566: <dd><a href="parsing.html#parse-error">Parse error</a>. Emit a U+003C LESS-THAN SIGN
567: character token and a U+002F SOLIDUS character token. Reconsume
568: the EOF character in the <a href="#data-state">data state</a>.</dd>
569:
570: <dt>Anything else</dt>
571: <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#bogus-comment-state">bogus
572: comment state</a>.</dd>
573:
1.29 mike 574: </dl><h5 id="tag-name-state"><span class="secno">8.2.4.10 </span><dfn>Tag name state</dfn></h5>
1.1 mike 575:
576: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
577:
578: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
579: <dt>U+000A LINE FEED (LF)</dt>
580: <dt>U+000C FORM FEED (FF)</dt>
581: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
582: <dt>U+0020 SPACE</dt>
583: <dd>Switch to the <a href="#before-attribute-name-state">before attribute name state</a>.</dd>
584:
585: <dt>U+002F SOLIDUS (/)</dt>
586: <dd>Switch to the <a href="#self-closing-start-tag-state">self-closing start tag state</a>.</dd>
587:
588: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 589: <dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag
590: token.</dd>
1.1 mike 591:
592: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
593: <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
594: character</a> (add 0x0020 to the character's code point) to the
1.14 mike 595: current tag token's tag name.</dd>
1.1 mike 596:
597: <dt>EOF</dt>
598: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
599: <a href="#data-state">data state</a>.</dd>
600:
601: <dt>Anything else</dt>
602: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
1.14 mike 603: tag token's tag name.</dd>
1.1 mike 604:
1.29 mike 605: </dl><h5 id="rcdata-less-than-sign-state"><span class="secno">8.2.4.11 </span><dfn>RCDATA less-than sign state</dfn></h5>
1.1 mike 606: <!-- identical to the RAWTEXT less-than sign state, except s/RAWTEXT/RCDATA/g -->
607:
608: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
609:
610: <dl class="switch"><dt>U+002F SOLIDUS (/)</dt>
611: <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Switch
612: to the <a href="#rcdata-end-tag-open-state">RCDATA end tag open state</a>.</dd>
613:
614: <dt>Anything else</dt>
615: <dd>Emit a U+003C LESS-THAN SIGN character token and reconsume the
616: <a href="parsing.html#current-input-character">current input character</a> in the <a href="#rcdata-state">RCDATA
617: state</a>.</dd>
618:
1.29 mike 619: </dl><h5 id="rcdata-end-tag-open-state"><span class="secno">8.2.4.12 </span><dfn>RCDATA end tag open state</dfn></h5>
1.1 mike 620: <!-- identical to the RAWTEXT (and Script data) end tag open state, except s/RAWTEXT/RCDATA/g -->
621:
622: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
623:
624: <dl class="switch"><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
625: <dd>Create a new end tag token, and set its tag name to the
626: lowercase version of the <a href="parsing.html#current-input-character">current input character</a> (add
627: 0x0020 to the character's code point). Append the <a href="parsing.html#current-input-character">current
628: input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
629: switch to the <a href="#rcdata-end-tag-name-state">RCDATA end tag name state</a>. (Don't emit
630: the token yet; further details will be filled in before it is
631: emitted.)</dd>
632:
633: <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
634: <dd>Create a new end tag token, and set its tag name to the
635: <a href="parsing.html#current-input-character">current input character</a>. Append the <a href="parsing.html#current-input-character">current
636: input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
637: switch to the <a href="#rcdata-end-tag-name-state">RCDATA end tag name state</a>. (Don't emit
638: the token yet; further details will be filled in before it is
639: emitted.)</dd>
640:
641: <dt>Anything else</dt>
642: <dd>Emit a U+003C LESS-THAN SIGN character token and a U+002F SOLIDUS
643: character token, then reconsume the <a href="parsing.html#current-input-character">current input
644: character</a> in the <a href="#rcdata-state">RCDATA state</a>.</dd>
645:
1.29 mike 646: </dl><h5 id="rcdata-end-tag-name-state"><span class="secno">8.2.4.13 </span><dfn>RCDATA end tag name state</dfn></h5>
1.1 mike 647: <!-- identical to the RAWTEXT (and Script data) end tag name state, except s/RAWTEXT/RCDATA/g -->
648:
649: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
650:
651: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
652: <dt>U+000A LINE FEED (LF)</dt>
653: <dt>U+000C FORM FEED (FF)</dt>
654: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
655: <dt>U+0020 SPACE</dt>
656: <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
657: token</a>, then switch to the <a href="#before-attribute-name-state">before attribute name
658: state</a>. Otherwise, treat it as per the "anything else" entry
659: below.</dd>
660:
661: <dt>U+002F SOLIDUS (/)</dt>
662: <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
663: token</a>, then switch to the <a href="#self-closing-start-tag-state">self-closing start tag
664: state</a>. Otherwise, treat it as per the "anything else" entry
665: below.</dd>
666:
667: <dt>U+003E GREATER-THAN SIGN (>)</dt>
668: <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
669: token</a>, then emit the current tag token and switch to the
670: <a href="#data-state">data state</a>. Otherwise, treat it as per the "anything
671: else" entry below.</dd>
672:
673: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
674: <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
675: character</a> (add 0x0020 to the character's code point) to the
676: current tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
1.14 mike 677: character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>
1.1 mike 678:
679: <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
680: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
681: tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
1.14 mike 682: character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>
1.1 mike 683:
684: <dt>Anything else</dt>
685: <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
686: character token, and a character token for each of the characters in
687: the <var><a href="#temporary-buffer">temporary buffer</a></var> (in the order they were added to
688: the buffer), then reconsume the <a href="parsing.html#current-input-character">current input character</a>
689: in the <a href="#rcdata-state">RCDATA state</a>.</dd>
690:
1.29 mike 691: </dl><h5 id="rawtext-less-than-sign-state"><span class="secno">8.2.4.14 </span><dfn>RAWTEXT less-than sign state</dfn></h5>
1.1 mike 692: <!-- identical to the RCDATA less-than sign state, except s/RCDATA/RAWTEXT/g -->
693:
694: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
695:
696: <dl class="switch"><dt>U+002F SOLIDUS (/)</dt>
697: <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Switch
698: to the <a href="#rawtext-end-tag-open-state">RAWTEXT end tag open state</a>.</dd>
699:
700: <dt>Anything else</dt>
701: <dd>Emit a U+003C LESS-THAN SIGN character token and reconsume the
702: <a href="parsing.html#current-input-character">current input character</a> in the <a href="#rawtext-state">RAWTEXT
703: state</a>.</dd>
704:
1.29 mike 705: </dl><h5 id="rawtext-end-tag-open-state"><span class="secno">8.2.4.15 </span><dfn>RAWTEXT end tag open state</dfn></h5>
1.1 mike 706: <!-- identical to the RCDATA (and Script data) end tag open state, except s/RCDATA/RAWTEXT/g -->
707:
708: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
709:
710: <dl class="switch"><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
711: <dd>Create a new end tag token, and set its tag name to the
712: lowercase version of the <a href="parsing.html#current-input-character">current input character</a> (add
713: 0x0020 to the character's code point). Append the <a href="parsing.html#current-input-character">current
714: input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
715: switch to the <a href="#rawtext-end-tag-name-state">RAWTEXT end tag name state</a>. (Don't emit
716: the token yet; further details will be filled in before it is
717: emitted.)</dd>
718:
719: <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
720: <dd>Create a new end tag token, and set its tag name to the
721: <a href="parsing.html#current-input-character">current input character</a>. Append the <a href="parsing.html#current-input-character">current
722: input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
723: switch to the <a href="#rawtext-end-tag-name-state">RAWTEXT end tag name state</a>. (Don't emit
724: the token yet; further details will be filled in before it is
725: emitted.)</dd>
726:
727: <dt>Anything else</dt>
728: <dd>Emit a U+003C LESS-THAN SIGN character token and a U+002F SOLIDUS
729: character token, then reconsume the <a href="parsing.html#current-input-character">current input
730: character</a> in the <a href="#rawtext-state">RAWTEXT state</a>.</dd>
731:
1.29 mike 732: </dl><h5 id="rawtext-end-tag-name-state"><span class="secno">8.2.4.16 </span><dfn>RAWTEXT end tag name state</dfn></h5>
1.1 mike 733: <!-- identical to the RCDATA (and Script data) end tag name state, except s/RCDATA/RAWTEXT/g -->
734:
735: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
736:
737: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
738: <dt>U+000A LINE FEED (LF)</dt>
739: <dt>U+000C FORM FEED (FF)</dt>
740: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
741: <dt>U+0020 SPACE</dt>
742: <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
743: token</a>, then switch to the <a href="#before-attribute-name-state">before attribute name
744: state</a>. Otherwise, treat it as per the "anything else" entry
745: below.</dd>
746:
747: <dt>U+002F SOLIDUS (/)</dt>
748: <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
749: token</a>, then switch to the <a href="#self-closing-start-tag-state">self-closing start tag
750: state</a>. Otherwise, treat it as per the "anything else" entry
751: below.</dd>
752:
753: <dt>U+003E GREATER-THAN SIGN (>)</dt>
754: <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
755: token</a>, then emit the current tag token and switch to the
756: <a href="#data-state">data state</a>. Otherwise, treat it as per the "anything
757: else" entry below.</dd>
758:
759: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
760: <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
761: character</a> (add 0x0020 to the character's code point) to the
762: current tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
1.14 mike 763: character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>
1.1 mike 764:
765: <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
766: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
767: tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
1.14 mike 768: character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>
1.1 mike 769:
770: <dt>Anything else</dt>
771: <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
772: character token, and a character token for each of the characters in
773: the <var><a href="#temporary-buffer">temporary buffer</a></var> (in the order they were added to
774: the buffer), then reconsume the <a href="parsing.html#current-input-character">current input character</a>
775: in the <a href="#rawtext-state">RAWTEXT state</a>.</dd>
776:
1.29 mike 777: </dl><h5 id="script-data-less-than-sign-state"><span class="secno">8.2.4.17 </span><dfn>Script data less-than sign state</dfn></h5>
1.1 mike 778:
779: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
780:
781: <dl class="switch"><dt>U+002F SOLIDUS (/)</dt>
782: <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Switch
783: to the <a href="#script-data-end-tag-open-state">script data end tag open state</a>.</dd>
784:
785: <dt>U+0021 EXCLAMATION MARK (!)</dt>
1.14 mike 786: <dd>Switch to the <a href="#script-data-escape-start-state">script data escape start state</a>. Emit
787: a U+003C LESS-THAN SIGN character token and a U+0021 EXCLAMATION
788: MARK character token.</dd>
1.1 mike 789:
790: <dt>Anything else</dt>
791: <dd>Emit a U+003C LESS-THAN SIGN character token and reconsume the
792: <a href="parsing.html#current-input-character">current input character</a> in the <a href="#script-data-state">script data
793: state</a>.</dd>
794:
1.29 mike 795: </dl><h5 id="script-data-end-tag-open-state"><span class="secno">8.2.4.18 </span><dfn>Script data end tag open state</dfn></h5>
1.1 mike 796: <!-- identical to the RCDATA (and RAWTEXT) end tag open state, except s/RCDATA/Script data/g -->
797:
798: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
799:
800: <dl class="switch"><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
801: <dd>Create a new end tag token, and set its tag name to the
802: lowercase version of the <a href="parsing.html#current-input-character">current input character</a> (add
803: 0x0020 to the character's code point). Append the <a href="parsing.html#current-input-character">current
804: input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
805: switch to the <a href="#script-data-end-tag-name-state">script data end tag name state</a>. (Don't emit
806: the token yet; further details will be filled in before it is
807: emitted.)</dd>
808:
809: <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
810: <dd>Create a new end tag token, and set its tag name to the
811: <a href="parsing.html#current-input-character">current input character</a>. Append the <a href="parsing.html#current-input-character">current
812: input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
813: switch to the <a href="#script-data-end-tag-name-state">script data end tag name state</a>. (Don't emit
814: the token yet; further details will be filled in before it is
815: emitted.)</dd>
816:
817: <dt>Anything else</dt>
818: <dd>Emit a U+003C LESS-THAN SIGN character token and a U+002F SOLIDUS
819: character token, then reconsume the <a href="parsing.html#current-input-character">current input
820: character</a> in the <a href="#script-data-state">script data state</a>.</dd>
821:
1.29 mike 822: </dl><h5 id="script-data-end-tag-name-state"><span class="secno">8.2.4.19 </span><dfn>Script data end tag name state</dfn></h5>
1.1 mike 823: <!-- identical to the RCDATA (and RAWTEXT) end tag name state, except s/RCDATA/Script data/g -->
824:
825: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
826:
827: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
828: <dt>U+000A LINE FEED (LF)</dt>
829: <dt>U+000C FORM FEED (FF)</dt>
830: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
831: <dt>U+0020 SPACE</dt>
832: <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
833: token</a>, then switch to the <a href="#before-attribute-name-state">before attribute name
834: state</a>. Otherwise, treat it as per the "anything else" entry
835: below.</dd>
836:
837: <dt>U+002F SOLIDUS (/)</dt>
838: <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
839: token</a>, then switch to the <a href="#self-closing-start-tag-state">self-closing start tag
840: state</a>. Otherwise, treat it as per the "anything else" entry
841: below.</dd>
842:
843: <dt>U+003E GREATER-THAN SIGN (>)</dt>
844: <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
845: token</a>, then emit the current tag token and switch to the
846: <a href="#data-state">data state</a>. Otherwise, treat it as per the "anything
847: else" entry below.</dd>
848:
849: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
850: <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
851: character</a> (add 0x0020 to the character's code point) to the
852: current tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
1.14 mike 853: character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>
1.1 mike 854:
855: <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
856: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
857: tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
1.14 mike 858: character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>
1.1 mike 859:
860: <dt>Anything else</dt>
861: <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
862: character token, and a character token for each of the characters in
863: the <var><a href="#temporary-buffer">temporary buffer</a></var> (in the order they were added to
864: the buffer), then reconsume the <a href="parsing.html#current-input-character">current input character</a>
865: in the <a href="#script-data-state">script data state</a>.</dd>
866:
1.29 mike 867: </dl><h5 id="script-data-escape-start-state"><span class="secno">8.2.4.20 </span><dfn>Script data escape start state</dfn></h5>
1.1 mike 868:
869: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
870:
871: <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
1.14 mike 872: <dd>Switch to the <a href="#script-data-escape-start-dash-state">script data escape start dash
873: state</a>. Emit a U+002D HYPHEN-MINUS character token.</dd>
1.1 mike 874:
875: <dt>Anything else</dt>
876: <dd>Reconsume the <a href="parsing.html#current-input-character">current input character</a> in the
877: <a href="#script-data-state">script data state</a>.</dd>
878:
1.29 mike 879: </dl><h5 id="script-data-escape-start-dash-state"><span class="secno">8.2.4.21 </span><dfn>Script data escape start dash state</dfn></h5>
1.1 mike 880:
881: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
882:
883: <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
1.14 mike 884: <dd>Switch to the <a href="#script-data-escaped-dash-dash-state">script data escaped dash dash
885: state</a>. Emit a U+002D HYPHEN-MINUS character token.</dd>
1.1 mike 886:
887: <dt>Anything else</dt>
888: <dd>Reconsume the <a href="parsing.html#current-input-character">current input character</a> in the
889: <a href="#script-data-state">script data state</a>.</dd>
890:
1.29 mike 891: </dl><h5 id="script-data-escaped-state"><span class="secno">8.2.4.22 </span><dfn>Script data escaped state</dfn></h5>
1.1 mike 892:
893: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
894:
895: <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
1.14 mike 896: <dd>Switch to the <a href="#script-data-escaped-dash-state">script data escaped dash state</a>. Emit
897: a U+002D HYPHEN-MINUS character token.</dd>
1.1 mike 898:
899: <dt>U+003C LESS-THAN SIGN (<)</dt>
900: <dd><p>Switch to the <a href="#script-data-escaped-less-than-sign-state">script data escaped less-than sign
901: state</a>.</p></dd>
902:
903: <dt>EOF</dt>
904: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
905: <a href="#data-state">data state</a>.</dd>
906:
907: <dt>Anything else</dt>
908: <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
1.14 mike 909: token.</dd>
1.1 mike 910:
1.29 mike 911: </dl><h5 id="script-data-escaped-dash-state"><span class="secno">8.2.4.23 </span><dfn>Script data escaped dash state</dfn></h5>
1.1 mike 912:
913: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
914:
915: <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
1.14 mike 916: <dd>Switch to the <a href="#script-data-escaped-dash-dash-state">script data escaped dash dash
917: state</a>. Emit a U+002D HYPHEN-MINUS character token.</dd>
1.1 mike 918:
919: <dt>U+003C LESS-THAN SIGN (<)</dt>
920: <dd><p>Switch to the <a href="#script-data-escaped-less-than-sign-state">script data escaped less-than sign
921: state</a>.</p></dd>
922:
923: <dt>EOF</dt>
924: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
925: <a href="#data-state">data state</a>.</dd>
926:
927: <dt>Anything else</dt>
1.14 mike 928: <dd>Switch to the <a href="#script-data-escaped-state">script data escaped state</a>. Emit the
929: <a href="parsing.html#current-input-character">current input character</a> as a character token.</dd>
1.1 mike 930:
1.29 mike 931: </dl><h5 id="script-data-escaped-dash-dash-state"><span class="secno">8.2.4.24 </span><dfn>Script data escaped dash dash state</dfn></h5>
1.1 mike 932:
933: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
934:
935: <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
1.14 mike 936: <dd>Emit a U+002D HYPHEN-MINUS character token.</dd>
1.1 mike 937:
938: <dt>U+003C LESS-THAN SIGN (<)</dt>
939: <dd><p>Switch to the <a href="#script-data-escaped-less-than-sign-state">script data escaped less-than sign
940: state</a>.</p></dd>
941:
942: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 943: <dd>Switch to the <a href="#script-data-state">script data state</a>. Emit a U+003E
944: GREATER-THAN SIGN character token.</dd>
1.1 mike 945:
946: <dt>EOF</dt>
947: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
948: <a href="#data-state">data state</a>.</dd>
949:
950: <dt>Anything else</dt>
1.14 mike 951: <dd>Switch to the <a href="#script-data-escaped-state">script data escaped state</a>. Emit the
952: <a href="parsing.html#current-input-character">current input character</a> as a character token.</dd>
1.1 mike 953:
1.29 mike 954: </dl><h5 id="script-data-escaped-less-than-sign-state"><span class="secno">8.2.4.25 </span><dfn>Script data escaped less-than sign state</dfn></h5>
1.1 mike 955:
956: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
957:
958: <dl class="switch"><dt>U+002F SOLIDUS (/)</dt>
959: <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Switch
960: to the <a href="#script-data-escaped-end-tag-open-state">script data escaped end tag open state</a>.</dd>
961:
962: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
1.14 mike 963: <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Append
964: the lowercase version of the <a href="parsing.html#current-input-character">current input character</a>
965: (add 0x0020 to the character's code point) to the <var><a href="#temporary-buffer">temporary
1.1 mike 966: buffer</a></var>. Switch to the <a href="#script-data-double-escape-start-state">script data double escape start
1.14 mike 967: state</a>. Emit a U+003C LESS-THAN SIGN character token and the
968: <a href="parsing.html#current-input-character">current input character</a> as a character token.</dd>
1.1 mike 969:
970: <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
1.14 mike 971: <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Append
972: the <a href="parsing.html#current-input-character">current input character</a> to the <var><a href="#temporary-buffer">temporary
1.1 mike 973: buffer</a></var>. Switch to the <a href="#script-data-double-escape-start-state">script data double escape start
1.14 mike 974: state</a>. Emit a U+003C LESS-THAN SIGN character token and the
975: <a href="parsing.html#current-input-character">current input character</a> as a character token.</dd>
1.1 mike 976:
977: <dt>Anything else</dt>
978: <dd>Emit a U+003C LESS-THAN SIGN character token and reconsume the
979: <a href="parsing.html#current-input-character">current input character</a> in the <a href="#script-data-escaped-state">script data
980: escaped state</a>.</dd>
981:
1.29 mike 982: </dl><h5 id="script-data-escaped-end-tag-open-state"><span class="secno">8.2.4.26 </span><dfn>Script data escaped end tag open state</dfn></h5>
1.1 mike 983:
984: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
985:
986: <dl class="switch"><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
987: <dd>Create a new end tag token, and set its tag name to the
988: lowercase version of the <a href="parsing.html#current-input-character">current input character</a> (add
989: 0x0020 to the character's code point). Append the <a href="parsing.html#current-input-character">current
990: input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
991: switch to the <a href="#script-data-escaped-end-tag-name-state">script data escaped end tag name
992: state</a>. (Don't emit the token yet; further details will be
993: filled in before it is emitted.)</dd>
994:
995: <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
996: <dd>Create a new end tag token, and set its tag name to the
997: <a href="parsing.html#current-input-character">current input character</a>. Append the <a href="parsing.html#current-input-character">current
998: input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
999: switch to the <a href="#script-data-escaped-end-tag-name-state">script data escaped end tag name
1000: state</a>. (Don't emit the token yet; further details will be
1001: filled in before it is emitted.)</dd>
1002:
1003: <dt>Anything else</dt>
1004: <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
1005: character token, and reconsume the <a href="parsing.html#current-input-character">current input
1006: character</a> in the <a href="#script-data-escaped-state">script data escaped state</a>.</dd>
1007:
1.29 mike 1008: </dl><h5 id="script-data-escaped-end-tag-name-state"><span class="secno">8.2.4.27 </span><dfn>Script data escaped end tag name state</dfn></h5>
1.1 mike 1009:
1010: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1011:
1012: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1013: <dt>U+000A LINE FEED (LF)</dt>
1014: <dt>U+000C FORM FEED (FF)</dt>
1015: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1016: <dt>U+0020 SPACE</dt>
1017: <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
1018: token</a>, then switch to the <a href="#before-attribute-name-state">before attribute name
1019: state</a>. Otherwise, treat it as per the "anything else" entry
1020: below.</dd>
1021:
1022: <dt>U+002F SOLIDUS (/)</dt>
1023: <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
1024: token</a>, then switch to the <a href="#self-closing-start-tag-state">self-closing start tag
1025: state</a>. Otherwise, treat it as per the "anything else" entry
1026: below.</dd>
1027:
1028: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1029: <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
1030: token</a>, then emit the current tag token and switch to the
1031: <a href="#data-state">data state</a>. Otherwise, treat it as per the "anything
1032: else" entry below.</dd>
1033:
1034: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
1035: <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
1036: character</a> (add 0x0020 to the character's code point) to the
1037: current tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
1.14 mike 1038: character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>
1.1 mike 1039:
1040: <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
1041: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
1042: tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
1.14 mike 1043: character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>
1.1 mike 1044:
1045: <dt>Anything else</dt>
1046: <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
1047: character token, a character token for each of the characters in
1048: the <var><a href="#temporary-buffer">temporary buffer</a></var> (in the order they were added to
1049: the buffer), and reconsume the <a href="parsing.html#current-input-character">current input character</a>
1050: in the <a href="#script-data-escaped-state">script data escaped state</a>.</dd>
1051:
1.29 mike 1052: </dl><h5 id="script-data-double-escape-start-state"><span class="secno">8.2.4.28 </span><dfn>Script data double escape start state</dfn></h5>
1.1 mike 1053:
1054: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1055:
1056: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1057: <dt>U+000A LINE FEED (LF)</dt>
1058: <dt>U+000C FORM FEED (FF)</dt>
1059: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1060: <dt>U+0020 SPACE</dt>
1061: <dt>U+002F SOLIDUS (/)</dt>
1062: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1063: <dd>If the <var><a href="#temporary-buffer">temporary buffer</a></var> is the string "<code title="">script</code>", then switch to the <a href="#script-data-double-escaped-state">script data
1.1 mike 1064: double escaped state</a>. Otherwise, switch to the <a href="#script-data-escaped-state">script
1.14 mike 1065: data escaped state</a>. Emit the <a href="parsing.html#current-input-character">current input
1066: character</a> as a character token.</dd>
1.1 mike 1067:
1068: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
1.14 mike 1069: <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
1.1 mike 1070: character</a> (add 0x0020 to the character's code point) to the
1.14 mike 1071: <var><a href="#temporary-buffer">temporary buffer</a></var>. Emit the <a href="parsing.html#current-input-character">current input
1072: character</a> as a character token.</dd>
1.1 mike 1073:
1074: <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
1.14 mike 1075: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the
1076: <var><a href="#temporary-buffer">temporary buffer</a></var>. Emit the <a href="parsing.html#current-input-character">current input
1077: character</a> as a character token.</dd>
1.1 mike 1078:
1079: <dt>Anything else</dt>
1080: <dd>Reconsume the <a href="parsing.html#current-input-character">current input character</a> in the
1081: <a href="#script-data-escaped-state">script data escaped state</a>.</dd>
1082:
1.29 mike 1083: </dl><h5 id="script-data-double-escaped-state"><span class="secno">8.2.4.29 </span><dfn>Script data double escaped state</dfn></h5>
1.1 mike 1084:
1085: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1086:
1087: <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
1.14 mike 1088: <dd>Switch to the <a href="#script-data-double-escaped-dash-state">script data double escaped dash
1089: state</a>. Emit a U+002D HYPHEN-MINUS character token.</dd>
1.1 mike 1090:
1091: <dt>U+003C LESS-THAN SIGN (<)</dt>
1.14 mike 1092: <dd><p>Switch to the <a href="#script-data-double-escaped-less-than-sign-state">script data double escaped less-than
1093: sign state</a>. Emit a U+003C LESS-THAN SIGN character
1094: token.</p></dd>
1.1 mike 1095:
1096: <dt>EOF</dt>
1097: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
1098: <a href="#data-state">data state</a>.</dd>
1099:
1100: <dt>Anything else</dt>
1101: <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
1.14 mike 1102: token.</dd>
1.1 mike 1103:
1.29 mike 1104: </dl><h5 id="script-data-double-escaped-dash-state"><span class="secno">8.2.4.30 </span><dfn>Script data double escaped dash state</dfn></h5>
1.1 mike 1105:
1106: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1107:
1108: <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
1.14 mike 1109: <dd>Switch to the <a href="#script-data-double-escaped-dash-dash-state">script data double escaped dash dash
1110: state</a>. Emit a U+002D HYPHEN-MINUS character token.</dd>
1.1 mike 1111:
1112: <dt>U+003C LESS-THAN SIGN (<)</dt>
1.14 mike 1113: <dd><p>Switch to the <a href="#script-data-double-escaped-less-than-sign-state">script data double escaped less-than
1114: sign state</a>. Emit a U+003C LESS-THAN SIGN character
1115: token.</p></dd>
1.1 mike 1116:
1117: <dt>EOF</dt>
1118: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
1119: <a href="#data-state">data state</a>.</dd>
1120:
1121: <dt>Anything else</dt>
1.14 mike 1122: <dd>Switch to the <a href="#script-data-double-escaped-state">script data double escaped
1123: state</a>. Emit the <a href="parsing.html#current-input-character">current input character</a> as a
1124: character token.</dd>
1.1 mike 1125:
1.29 mike 1126: </dl><h5 id="script-data-double-escaped-dash-dash-state"><span class="secno">8.2.4.31 </span><dfn>Script data double escaped dash dash state</dfn></h5>
1.1 mike 1127:
1128: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1129:
1130: <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
1.14 mike 1131: <dd>Emit a U+002D HYPHEN-MINUS character token.</dd>
1.1 mike 1132:
1133: <dt>U+003C LESS-THAN SIGN (<)</dt>
1.14 mike 1134: <dd><p>Switch to the <a href="#script-data-double-escaped-less-than-sign-state">script data double escaped less-than
1135: sign state</a>. Emit a U+003C LESS-THAN SIGN character
1136: token.</p></dd>
1.1 mike 1137:
1138: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1139: <dd>Switch to the <a href="#script-data-state">script data state</a>. Emit a U+003E
1140: GREATER-THAN SIGN character token.</dd>
1.1 mike 1141:
1142: <dt>EOF</dt>
1143: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
1144: <a href="#data-state">data state</a>.</dd>
1145:
1146: <dt>Anything else</dt>
1.14 mike 1147: <dd>Switch to the <a href="#script-data-double-escaped-state">script data double escaped
1148: state</a>. Emit the <a href="parsing.html#current-input-character">current input character</a> as a
1149: character token.</dd>
1.1 mike 1150:
1.29 mike 1151: </dl><h5 id="script-data-double-escaped-less-than-sign-state"><span class="secno">8.2.4.32 </span><dfn>Script data double escaped less-than sign state</dfn></h5>
1.1 mike 1152:
1153: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1154:
1155: <dl class="switch"><dt>U+002F SOLIDUS (/)</dt>
1.14 mike 1156: <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Switch
1157: to the <a href="#script-data-double-escape-end-state">script data double escape end state</a>. Emit a
1158: U+002F SOLIDUS character token.</dd>
1.1 mike 1159:
1160: <dt>Anything else</dt>
1161: <dd>Reconsume the <a href="parsing.html#current-input-character">current input character</a> in the
1162: <a href="#script-data-double-escaped-state">script data double escaped state</a>.</dd>
1163:
1.29 mike 1164: </dl><h5 id="script-data-double-escape-end-state"><span class="secno">8.2.4.33 </span><dfn>Script data double escape end state</dfn></h5>
1.1 mike 1165:
1166: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1167:
1168: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1169: <dt>U+000A LINE FEED (LF)</dt>
1170: <dt>U+000C FORM FEED (FF)</dt>
1171: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1172: <dt>U+0020 SPACE</dt>
1173: <dt>U+002F SOLIDUS (/)</dt>
1174: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1175: <dd>If the <var><a href="#temporary-buffer">temporary buffer</a></var> is the string "<code title="">script</code>", then switch to the <a href="#script-data-escaped-state">script data
1.1 mike 1176: escaped state</a>. Otherwise, switch to the <a href="#script-data-double-escaped-state">script data
1.14 mike 1177: double escaped state</a>. Emit the <a href="parsing.html#current-input-character">current input
1178: character</a> as a character token.</dd>
1.1 mike 1179:
1180: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
1.14 mike 1181: <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
1.1 mike 1182: character</a> (add 0x0020 to the character's code point) to the
1.14 mike 1183: <var><a href="#temporary-buffer">temporary buffer</a></var>. Emit the <a href="parsing.html#current-input-character">current input
1184: character</a> as a character token.</dd>
1.1 mike 1185:
1186: <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
1.14 mike 1187: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the
1188: <var><a href="#temporary-buffer">temporary buffer</a></var>. Emit the <a href="parsing.html#current-input-character">current input
1189: character</a> as a character token.</dd>
1.1 mike 1190:
1191: <dt>Anything else</dt>
1192: <dd>Reconsume the <a href="parsing.html#current-input-character">current input character</a> in the
1193: <a href="#script-data-double-escaped-state">script data double escaped state</a>.</dd>
1194:
1.29 mike 1195: </dl><h5 id="before-attribute-name-state"><span class="secno">8.2.4.34 </span><dfn>Before attribute name state</dfn></h5>
1.1 mike 1196:
1197: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1198:
1199: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1200: <dt>U+000A LINE FEED (LF)</dt>
1201: <dt>U+000C FORM FEED (FF)</dt>
1202: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1203: <dt>U+0020 SPACE</dt>
1.14 mike 1204: <dd>Ignore the character.</dd>
1.1 mike 1205:
1206: <dt>U+002F SOLIDUS (/)</dt>
1207: <dd>Switch to the <a href="#self-closing-start-tag-state">self-closing start tag state</a>.</dd>
1208:
1209: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1210: <dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag
1211: token.</dd>
1.1 mike 1212:
1213: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
1214: <dd>Start a new attribute in the current tag token. Set that
1215: attribute's name to the lowercase version of the <a href="parsing.html#current-input-character">current input
1216: character</a> (add 0x0020 to the character's code point), and its
1217: value to the empty string. Switch to the <a href="#attribute-name-state">attribute name
1218: state</a>.</dd>
1219:
1220: <dt>U+0022 QUOTATION MARK (")</dt>
1221: <dt>U+0027 APOSTROPHE (')</dt>
1222: <dt>U+003C LESS-THAN SIGN (<)</dt>
1223: <dt>U+003D EQUALS SIGN (=)</dt>
1224: <dd><a href="parsing.html#parse-error">Parse error</a>. Treat it as per the "anything else"
1225: entry below.</dd>
1226:
1227: <dt>EOF</dt>
1228: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
1229: <a href="#data-state">data state</a>.</dd>
1230:
1231: <dt>Anything else</dt>
1232: <dd>Start a new attribute in the current tag token. Set that
1233: attribute's name to the <a href="parsing.html#current-input-character">current input character</a>, and its value to
1234: the empty string. Switch to the <a href="#attribute-name-state">attribute name
1235: state</a>.</dd>
1236:
1.29 mike 1237: </dl><h5 id="attribute-name-state"><span class="secno">8.2.4.35 </span><dfn>Attribute name state</dfn></h5>
1.1 mike 1238:
1239: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1240:
1241: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1242: <dt>U+000A LINE FEED (LF)</dt>
1243: <dt>U+000C FORM FEED (FF)</dt>
1244: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1245: <dt>U+0020 SPACE</dt>
1246: <dd>Switch to the <a href="#after-attribute-name-state">after attribute name state</a>.</dd>
1247:
1248: <dt>U+002F SOLIDUS (/)</dt>
1249: <dd>Switch to the <a href="#self-closing-start-tag-state">self-closing start tag state</a>.</dd>
1250:
1251: <dt>U+003D EQUALS SIGN (=)</dt>
1252: <dd>Switch to the <a href="#before-attribute-value-state">before attribute value state</a>.</dd>
1253:
1254: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1255: <dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag
1256: token.</dd>
1.1 mike 1257:
1258: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
1259: <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
1260: character</a> (add 0x0020 to the character's code point) to the
1.14 mike 1261: current attribute's name.</dd>
1.1 mike 1262:
1263: <dt>U+0022 QUOTATION MARK (")</dt>
1264: <dt>U+0027 APOSTROPHE (')</dt>
1265: <dt>U+003C LESS-THAN SIGN (<)</dt>
1266: <dd><a href="parsing.html#parse-error">Parse error</a>. Treat it as per the "anything else"
1267: entry below.</dd>
1268:
1269: <dt>EOF</dt>
1270: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
1271: <a href="#data-state">data state</a>.</dd>
1272:
1273: <dt>Anything else</dt>
1274: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
1.14 mike 1275: attribute's name.</dd>
1.1 mike 1276:
1277: </dl><p>When the user agent leaves the attribute name state (and before
1278: emitting the tag token, if appropriate), the complete attribute's
1279: name must be compared to the other attributes on the same token;
1280: if there is already an attribute on the token with the exact same
1281: name, then this is a <a href="parsing.html#parse-error">parse error</a> and the new
1282: attribute must be dropped, along with the value that gets
1283: associated with it (if any).</p>
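<p>As a rough illustration of the duplicate-attribute rule above, the following Python sketch keeps the first attribute with a given name, records a parse error for any later attribute with the same name, and drops that later attribute together with its value. The <code>TagToken</code> class, its fields, and the <code>finish_attribute</code> helper are hypothetical names chosen for this example and are not defined by this specification. The sketch also simplifies matters by running the check once the value is known, whereas the rule above applies at the moment the attribute name state is left.</p>

```python
from dataclasses import dataclass, field


@dataclass
class TagToken:
    """Hypothetical tag token; field names are illustrative only."""
    name: str
    attributes: list = field(default_factory=list)  # (name, value) pairs kept on the token
    parse_errors: int = 0                            # count of parse errors reported so far


def finish_attribute(token, name, value=""):
    """Apply the duplicate-name check to a completed attribute.

    Returns True if the attribute was kept on the token, or False if it
    was dropped because an attribute with the exact same name exists.
    """
    if any(existing == name for existing, _ in token.attributes):
        token.parse_errors += 1  # duplicate name: this is a parse error
        return False             # the new attribute and its value are dropped
    token.attributes.append((name, value))
    return True
```

<p>Because attribute names are lowercased while they are being built, a comparison for the exact same name in a sketch like this one can be a plain string equality test.</p>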
1284:
1285:
1.29 mike 1286: <h5 id="after-attribute-name-state"><span class="secno">8.2.4.36 </span><dfn>After attribute name state</dfn></h5>
1.1 mike 1287:
1288: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1289:
1290: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1291: <dt>U+000A LINE FEED (LF)</dt>
1292: <dt>U+000C FORM FEED (FF)</dt>
1293: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1294: <dt>U+0020 SPACE</dt>
1.14 mike 1295: <dd>Ignore the character.</dd>
1.1 mike 1296:
1297: <dt>U+002F SOLIDUS (/)</dt>
1298: <dd>Switch to the <a href="#self-closing-start-tag-state">self-closing start tag state</a>.</dd>
1299:
1300: <dt>U+003D EQUALS SIGN (=)</dt>
1301: <dd>Switch to the <a href="#before-attribute-value-state">before attribute value state</a>.</dd>
1302:
1303: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1304: <dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag
1305: token.</dd>
1.1 mike 1306:
1307: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
1308: <dd>Start a new attribute in the current tag token. Set that
1309: attribute's name to the lowercase version of the <a href="parsing.html#current-input-character">current
1310: input character</a> (add 0x0020 to the character's code point),
1311: and its value to the empty string. Switch to the <a href="#attribute-name-state">attribute
1312: name state</a>.</dd>
1313:
1314: <dt>U+0022 QUOTATION MARK (")</dt>
1315: <dt>U+0027 APOSTROPHE (')</dt>
1316: <dt>U+003C LESS-THAN SIGN (<)</dt>
1317: <dd><a href="parsing.html#parse-error">Parse error</a>. Treat it as per the "anything else"
1318: entry below.</dd>
1319:
1320: <dt>EOF</dt>
1321: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
1322: <a href="#data-state">data state</a>.</dd>
1323:
1324: <dt>Anything else</dt>
1325: <dd>Start a new attribute in the current tag token. Set that
1326: attribute's name to the <a href="parsing.html#current-input-character">current input character</a>, and
1327: its value to the empty string. Switch to the <a href="#attribute-name-state">attribute name
1328: state</a>.</dd>
1329:
1.29 mike 1330: </dl><h5 id="before-attribute-value-state"><span class="secno">8.2.4.37 </span><dfn>Before attribute value state</dfn></h5>
1.1 mike 1331:
1332: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1333:
1334: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1335: <dt>U+000A LINE FEED (LF)</dt>
1336: <dt>U+000C FORM FEED (FF)</dt>
1337: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1338: <dt>U+0020 SPACE</dt>
1.14 mike 1339: <dd>Ignore the character.</dd>
1.1 mike 1340:
1341: <dt>U+0022 QUOTATION MARK (")</dt>
1342: <dd>Switch to the <a href="#attribute-value-double-quoted-state">attribute value (double-quoted) state</a>.</dd>
1343:
1344: <dt>U+0026 AMPERSAND (&)</dt>
1345: <dd>Switch to the <a href="#attribute-value-unquoted-state">attribute value (unquoted) state</a>
1346: and reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>
1347:
1348: <dt>U+0027 APOSTROPHE (')</dt>
1349: <dd>Switch to the <a href="#attribute-value-single-quoted-state">attribute value (single-quoted) state</a>.</dd>
1350:
1351: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1352: <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
1353: state</a>. Emit the current tag token.</dd>
1.1 mike 1354:
1355: <dt>U+003C LESS-THAN SIGN (<)</dt>
1356: <dt>U+003D EQUALS SIGN (=)</dt>
1357: <dt>U+0060 GRAVE ACCENT (`)</dt>
1358: <dd><a href="parsing.html#parse-error">Parse error</a>. Treat it as per the "anything else"
1359: entry below.</dd>
1360:
1361: <dt>EOF</dt>
1362: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
1363: <a href="#data-state">data state</a>.</dd>
1364:
1365: <dt>Anything else</dt>
1366: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
1367: attribute's value. Switch to the <a href="#attribute-value-unquoted-state">attribute value (unquoted)
1368: state</a>.</dd>
1369:
1370: </dl><h5 id="attribute-value-double-quoted-state"><span class="secno">8.2.4.38 </span><dfn>Attribute value (double-quoted) state</dfn></h5>
1371:
1372: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1373:
1374: <dl class="switch"><dt>U+0022 QUOTATION MARK (")</dt>
1375: <dd>Switch to the <a href="#after-attribute-value-quoted-state">after attribute value (quoted)
1376: state</a>.</dd>
1377:
1378: <dt>U+0026 AMPERSAND (&)</dt>
1379: <dd>Switch to the <a href="#character-reference-in-attribute-value-state">character reference in attribute value
1380: state</a>, with the <a href="#additional-allowed-character">additional allowed character</a>
1381: being U+0022 QUOTATION MARK (").</dd>
1382:
1383: <dt>EOF</dt>
1384: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
1385: <a href="#data-state">data state</a>.</dd>
1386:
1387: <dt>Anything else</dt>
1388: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
1.14 mike 1389: attribute's value.</dd>
1.1 mike 1390:
1391: </dl><h5 id="attribute-value-single-quoted-state"><span class="secno">8.2.4.39 </span><dfn>Attribute value (single-quoted) state</dfn></h5>
1392:
1393: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1394:
1395: <dl class="switch"><dt>U+0027 APOSTROPHE (')</dt>
1396: <dd>Switch to the <a href="#after-attribute-value-quoted-state">after attribute value (quoted)
1397: state</a>.</dd>
1398:
1399: <dt>U+0026 AMPERSAND (&)</dt>
1400: <dd>Switch to the <a href="#character-reference-in-attribute-value-state">character reference in attribute value
1401: state</a>, with the <a href="#additional-allowed-character">additional allowed character</a>
1402: being U+0027 APOSTROPHE (').</dd>
1403:
1404: <dt>EOF</dt>
1405: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
1406: <a href="#data-state">data state</a>.</dd>
1407:
1408: <dt>Anything else</dt>
1409: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
1.14 mike 1410: attribute's value.</dd>
1.1 mike 1411:
1412: </dl><h5 id="attribute-value-unquoted-state"><span class="secno">8.2.4.40 </span><dfn>Attribute value (unquoted) state</dfn></h5>
1413:
1414: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1415:
1416: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1417: <dt>U+000A LINE FEED (LF)</dt>
1418: <dt>U+000C FORM FEED (FF)</dt>
1419: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1420: <dt>U+0020 SPACE</dt>
1421: <dd>Switch to the <a href="#before-attribute-name-state">before attribute name state</a>.</dd>
1422:
1423: <dt>U+0026 AMPERSAND (&)</dt>
1424: <dd>Switch to the <a href="#character-reference-in-attribute-value-state">character reference in attribute value
1425: state</a>, with the <a href="#additional-allowed-character">additional allowed character</a>
1426: being U+003E GREATER-THAN SIGN (>).</dd>
1427:
1428: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1429: <dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag
1430: token.</dd>
1.1 mike 1431:
1432: <dt>U+0022 QUOTATION MARK (")</dt>
1433: <dt>U+0027 APOSTROPHE (')</dt>
1434: <dt>U+003C LESS-THAN SIGN (<)</dt>
1435: <dt>U+003D EQUALS SIGN (=)</dt>
1436: <dt>U+0060 GRAVE ACCENT (`)</dt>
1437: <dd><a href="parsing.html#parse-error">Parse error</a>. Treat it as per the "anything else"
1438: entry below.</dd>
1439:
1440: <dt>EOF</dt>
1441: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
1442: <a href="#data-state">data state</a>.</dd>
1443:
1444: <dt>Anything else</dt>
1445: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
1.14 mike 1446: attribute's value.</dd>
1.1 mike 1447:
1.29 mike 1448: </dl><h5 id="character-reference-in-attribute-value-state"><span class="secno">8.2.4.41 </span><dfn>Character reference in attribute value state</dfn></h5>
1.1 mike 1449:
1450: <p>Attempt to <a href="#consume-a-character-reference">consume a character reference</a>.</p>
1451:
1.18 mike 1452: <p>If nothing is returned, append a U+0026 AMPERSAND character
1453: (&) to the current attribute's value.</p>
1.1 mike 1454:
1455: <p>Otherwise, append the returned character token to the current
1456: attribute's value.</p>
1457:
1.27 mike 1458: <p>Finally, switch back to the attribute value state that switched
1459: into this state.</p>
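
<p>The steps above can be sketched as follows. This is a minimal illustration, not the normative algorithm: <code>consume_character_reference</code> is a hypothetical, heavily simplified stand-in for the real "consume a character reference" algorithm (it knows only four named references and requires the semicolon), and the function names, the <code>NAMED</code> table and the <code>return_state</code> parameter are assumptions made for this example.</p>

```python
# Hypothetical, simplified stand-in for "consume a character reference".
# Assumption: only these four named references exist, and ";" is required.
NAMED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"'}


def consume_character_reference(text, pos, additional_allowed):
    """Return (char, new_pos), or (None, pos) if nothing is consumed.

    `pos` is the index of the character just after the "&".
    """
    if pos >= len(text):
        return None, pos
    # Whitespace, "<", "&" or the additional allowed character right after
    # the "&" means this is not a character reference.
    if text[pos] in "\t\n\f <&" or text[pos] == additional_allowed:
        return None, pos
    end = text.find(";", pos)
    if end != -1 and text[pos:end] in NAMED:
        return NAMED[text[pos:end]], end + 1
    return None, pos


def char_ref_in_attribute_value(text, pos, value, additional_allowed, return_state):
    """Run the state once; `value` is the current attribute's value (a list).

    Returns (new_pos, next_state), where next_state is the attribute value
    state that switched into this state.
    """
    char, pos = consume_character_reference(text, pos, additional_allowed)
    # Nothing returned: the "&" is kept as literal text.
    value.append("&" if char is None else char)
    return pos, return_state
```

<p>In this sketch the caller remembers which attribute value state it came from and passes it as <code>return_state</code>, which is one simple way to model "switch back to the attribute value state that switched into this state".</p>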
1.1 mike 1460:
1461:
1462: <h5 id="after-attribute-value-quoted-state"><span class="secno">8.2.4.42 </span><dfn>After attribute value (quoted) state</dfn></h5>
1463:
1464: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1465:
1466: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1467: <dt>U+000A LINE FEED (LF)</dt>
1468: <dt>U+000C FORM FEED (FF)</dt>
1469: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1470: <dt>U+0020 SPACE</dt>
1471: <dd>Switch to the <a href="#before-attribute-name-state">before attribute name state</a>.</dd>
1472:
1473: <dt>U+002F SOLIDUS (/)</dt>
1474: <dd>Switch to the <a href="#self-closing-start-tag-state">self-closing start tag state</a>.</dd>
1475:
1476: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1477: <dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag
1478: token.</dd>
1.1 mike 1479:
1480: <dt>EOF</dt>
1481: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
1482: <a href="#data-state">data state</a>.</dd>
1483:
1484: <dt>Anything else</dt>
1485: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the character in
1486: the <a href="#before-attribute-name-state">before attribute name state</a>.</dd>
1487:
1.29 mike 1488: </dl><h5 id="self-closing-start-tag-state"><span class="secno">8.2.4.43 </span><dfn>Self-closing start tag state</dfn></h5>
1.1 mike 1489:
1490: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1491:
1492: <dl class="switch"><dt>U+003E GREATER-THAN SIGN (>)</dt>
1493: <dd>Set the <i>self-closing flag</i> of the current tag
1.14 mike 1494: token. Switch to the <a href="#data-state">data state</a>. Emit the current tag
1495: token.</dd>
1.1 mike 1496:
1497: <dt>EOF</dt>
1498: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the EOF character in the
1499: <a href="#data-state">data state</a>.</dd>
1500:
1501: <dt>Anything else</dt>
1502: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the character in
1503: the <a href="#before-attribute-name-state">before attribute name state</a>.</dd>
1504:
1.29 mike 1505: </dl><h5 id="bogus-comment-state"><span class="secno">8.2.4.44 </span><dfn>Bogus comment state</dfn></h5>
1.1 mike 1506:
1507: <p>Consume every character up to and including the first U+003E
1508: GREATER-THAN SIGN character (>) or the end of the file (EOF),
1509: whichever comes first. Emit a comment token whose data is the
1510: concatenation of all the characters starting from and including
1511: the character that caused the state machine to switch into the
1512: bogus comment state, up to and including the character immediately
1513: before the last consumed character (i.e. up to the character just
1514: before the U+003E or EOF character). (If the comment was started
1515: by the end of the file (EOF), the token is empty.)</p>
1516:
1517: <p>Switch to the <a href="#data-state">data state</a>.</p>
1518:
1519: <p>If the end of the file was reached, reconsume the EOF
1520: character.</p>
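
<p>As an illustration only, the bogus comment state can be sketched as a single scan over the input. The function name, the string-plus-index model of the input stream and the returned tuple are assumptions made for this example.</p>

```python
def bogus_comment(text, start):
    """Sketch of the bogus comment state.

    `start` is the index of the character that caused the switch into the
    bogus comment state (equal to len(text) if EOF caused the switch).
    Returns (comment_data, new_pos, reached_eof).
    """
    end = text.find(">", start)
    if end == -1:
        # EOF came first: everything that remains is comment data, and the
        # caller reconsumes EOF in the data state.
        return text[start:], len(text), True
    # The ">" is consumed but is not part of the comment data.
    return text[start:end], end + 1, False
```

<p>After the call, the tokenizer would emit a comment token carrying <code>comment_data</code> and continue in the data state at <code>new_pos</code>.</p>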
1521:
1522:
1.29 mike 1523: <h5 id="markup-declaration-open-state"><span class="secno">8.2.4.45 </span><dfn>Markup declaration open state</dfn></h5>
1.1 mike 1524:
1525: <p>If the next two characters are both U+002D HYPHEN-MINUS
1526: characters (-), consume those two characters, create a comment token
1527: whose data is the empty string, and switch to the <a href="#comment-start-state">comment
1528: start state</a>.</p>
1529:
1530: <p>Otherwise, if the next seven characters are an <a href="infrastructure.html#ascii-case-insensitive">ASCII
1531: case-insensitive</a> match for the word "DOCTYPE", then consume
1532: those characters and switch to the <a href="#doctype-state">DOCTYPE state</a>.</p>
1533:
1534: <p>Otherwise, if the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inforeign" title="insertion mode: in foreign content">in foreign
1535: content</a>" and the <a href="parsing.html#current-node">current node</a> is not an element
1536: in the <a href="namespaces.html#html-namespace-0">HTML namespace</a> and the next seven characters are
1537: a <a href="infrastructure.html#case-sensitive">case-sensitive</a> match for the string "[CDATA[" (the
1538: five uppercase letters "CDATA" with a U+005B LEFT SQUARE BRACKET
1539: character before and after), then consume those characters and
1540: switch to the <a href="#cdata-section-state">CDATA section state</a>.</p>
1541:
1542: <p>Otherwise, this is a <a href="parsing.html#parse-error">parse error</a>. Switch to the
1543: <a href="#bogus-comment-state">bogus comment state</a>. The next character that is
1544: consumed, if any, is the first character that will be in the
1545: comment.</p>
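
<p>The lookahead described above can be sketched as follows, as a rough illustration under stated assumptions. The function and parameter names are hypothetical; <code>in_foreign_non_html</code> stands in for the combined condition that the insertion mode is "in foreign content" and the current node is not an element in the HTML namespace.</p>

```python
def ascii_lower(s):
    """Lowercase only A-Z by adding 0x20 to the code point."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


def markup_declaration_open(text, pos, in_foreign_non_html=False):
    """Sketch of the markup declaration open state.

    `pos` is the index just after "<!".
    Returns (next_state, new_pos, comment_token_or_None).
    """
    if text.startswith("--", pos):
        # Consume both hyphens and start an empty comment token.
        return "comment start", pos + 2, {"type": "comment", "data": ""}
    if ascii_lower(text[pos:pos + 7]) == "doctype":
        return "DOCTYPE", pos + 7, None
    # The CDATA match is case-sensitive and only applies in foreign content.
    if in_foreign_non_html and text.startswith("[CDATA[", pos):
        return "CDATA section", pos + 7, None
    # Parse error: nothing is consumed, so the next character consumed
    # becomes the first character of the bogus comment.
    return "bogus comment", pos, None
```

<p>Returning the unchanged position in the last branch mirrors the rule that the next consumed character, if any, is the first character of the comment.</p>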
1546:
1547:
1.29 mike 1548: <h5 id="comment-start-state"><span class="secno">8.2.4.46 </span><dfn>Comment start state</dfn></h5>
1.1 mike 1549:
1550: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1551:
1552: <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
1553: <dd>Switch to the <a href="#comment-start-dash-state">comment start dash state</a>.</dd>
1554:
1555: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1556: <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
1557: state</a>. Emit the comment token.</dd> <!-- see comment in
1558: comment end state -->
1.1 mike 1559:
1560: <dt>EOF</dt>
1561: <dd><a href="parsing.html#parse-error">Parse error</a>. Emit the comment token. Reconsume
1562: the EOF character in the <a href="#data-state">data state</a>.</dd>
1563:
1564: <dt>Anything else</dt>
1565: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the comment
1566: token's data. Switch to the <a href="#comment-state">comment state</a>.</dd>
1567:
1.29 mike 1568: </dl><h5 id="comment-start-dash-state"><span class="secno">8.2.4.47 </span><dfn>Comment start dash state</dfn></h5>
1.1 mike 1569:
1570: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1571:
1572: <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
1573: <dd>Switch to the <a href="#comment-end-state">comment end state</a>.</dd>
1574:
1575: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1576: <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
1577: state</a>. Emit the comment token.</dd>
1.1 mike 1578:
1579: <dt>EOF</dt>
1580: <dd><a href="parsing.html#parse-error">Parse error</a>. Emit the comment token. Reconsume the
1581: EOF character in the <a href="#data-state">data state</a>.</dd> <!-- see comment
1582: in comment end state -->
1583:
1584: <dt>Anything else</dt>
1585: <dd>Append a U+002D HYPHEN-MINUS character (-) and the
1586: <a href="parsing.html#current-input-character">current input character</a> to the comment token's
1587: data. Switch to the <a href="#comment-state">comment state</a>.</dd>
1588:
1.29 mike 1589: </dl><h5 id="comment-state"><span class="secno">8.2.4.48 </span><dfn id="comment">Comment state</dfn></h5>
1.1 mike 1590:
1591: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1592:
1593: <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
1594: <dd>Switch to the <a href="#comment-end-dash-state">comment end dash state</a>.</dd>
1595:
1596: <dt>EOF</dt>
1597: <dd><a href="parsing.html#parse-error">Parse error</a>. Emit the comment token. Reconsume the
1598: EOF character in the <a href="#data-state">data state</a>.</dd> <!-- see comment
1599: in comment end state -->
1600:
1601: <dt>Anything else</dt>
1602: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the comment
1.14 mike 1603: token's data.</dd>
1.1 mike 1604:
1.29 mike 1605: </dl><h5 id="comment-end-dash-state"><span class="secno">8.2.4.49 </span><dfn>Comment end dash state</dfn></h5>
1.1 mike 1606:
1607: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1608:
1609: <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
1610: <dd>Switch to the <a href="#comment-end-state">comment end state</a>.</dd>
1611:
1612: <dt>EOF</dt>
1613: <dd><a href="parsing.html#parse-error">Parse error</a>. Emit the comment token. Reconsume the
1614: EOF character in the <a href="#data-state">data state</a>.</dd> <!-- see comment
1615: in comment end state -->
1616:
1617: <dt>Anything else</dt>
1618: <dd>Append a U+002D HYPHEN-MINUS character (-) and the
1619: <a href="parsing.html#current-input-character">current input character</a> to the comment token's
1620: data. Switch to the <a href="#comment-state">comment state</a>.</dd>
1621:
1.29 mike 1622: </dl><h5 id="comment-end-state"><span class="secno">8.2.4.50 </span><dfn>Comment end state</dfn></h5>
1.1 mike 1623:
1624: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1625:
1626: <dl class="switch"><dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1627: <dd>Switch to the <a href="#data-state">data state</a>. Emit the comment
1628: token.</dd>
1.1 mike 1629:
1630: <dt>U+0021 EXCLAMATION MARK (!)</dt>
1631: <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#comment-end-bang-state">comment end bang
1632: state</a>.</dd>
1633:
1634: <dt>U+002D HYPHEN-MINUS (-)</dt>
1635: <dd><a href="parsing.html#parse-error">Parse error</a>. Append a U+002D HYPHEN-MINUS
1.14 mike 1636: character (-) to the comment token's data.</dd>
1.1 mike 1637:
1638: <dt>EOF</dt>
1639: <dd><a href="parsing.html#parse-error">Parse error</a>. Emit the comment token. Reconsume
1640: the EOF character in the <a href="#data-state">data state</a>.</dd> <!-- For
1641: security reasons: otherwise, a hostile user could put a <script type="text/javascript"> in
1642: a comment (e.g. in a blog comment) and then DoS the server so that
1643: the end tag isn't read, and then the commented-out <script type="text/javascript"> tag would
1644: be treated as live code -->
1645:
1646: <dt>Anything else</dt>
1647: <dd><a href="parsing.html#parse-error">Parse error</a>. Append two U+002D HYPHEN-MINUS
1648: characters (-) and the <a href="parsing.html#current-input-character">current input character</a> to the
1649: comment token's data. Switch to the <a href="#comment-state">comment
1650: state</a>.</dd>
1651:
1.29 mike 1652: </dl><h5 id="comment-end-bang-state"><span class="secno">8.2.4.51 </span><dfn>Comment end bang state</dfn></h5>
1.1 mike 1653:
1654: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1655:
1656: <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
1657: <dd>Append two U+002D HYPHEN-MINUS characters (-) and a U+0021
1658: EXCLAMATION MARK character (!) to the comment token's data. Switch
1659: to the <a href="#comment-end-dash-state">comment end dash state</a>.</dd>
1660:
1661: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1662: <dd>Switch to the <a href="#data-state">data state</a>. Emit the comment
1663: token.</dd>
1.1 mike 1664:
1665: <dt>EOF</dt>
1666: <dd><a href="parsing.html#parse-error">Parse error</a>. Emit the comment token. Reconsume
1667: the EOF character in the <a href="#data-state">data state</a>.</dd> <!-- see
1668: comment in comment end state -->
1669:
1670: <dt>Anything else</dt>
1671: <dd>Append two U+002D HYPHEN-MINUS characters (-), a U+0021
1672: EXCLAMATION MARK character (!), and the <a href="parsing.html#current-input-character">current input
1673: character</a> to the comment token's data. Switch to the
1674: <a href="#comment-state">comment state</a>.</dd>
1675:
1.37 mike 1676: </dl><h5 id="doctype-state"><span class="secno">8.2.4.52 </span><dfn>DOCTYPE state</dfn></h5>
1.1 mike 1677:
1678: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1679:
1680: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1681: <dt>U+000A LINE FEED (LF)</dt>
1682: <dt>U+000C FORM FEED (FF)</dt>
1683: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1684: <dt>U+0020 SPACE</dt>
1685: <dd>Switch to the <a href="#before-doctype-name-state">before DOCTYPE name state</a>.</dd>
1686:
1687: <dt>EOF</dt>
1688: <dd><a href="parsing.html#parse-error">Parse error</a>. Create a new DOCTYPE token. Set its
1689: <i>force-quirks flag</i> to <i>on</i>. Emit the token. Reconsume
1690: the EOF character in the <a href="#data-state">data state</a>.</dd>
1691:
1692: <dt>Anything else</dt>
1693: <dd><a href="parsing.html#parse-error">Parse error</a>. Reconsume the character in the
1694: <a href="#before-doctype-name-state">before DOCTYPE name state</a>.</dd>
1695:
1.37 mike 1696: </dl><h5 id="before-doctype-name-state"><span class="secno">8.2.4.53 </span><dfn>Before DOCTYPE name state</dfn></h5>
1.1 mike 1697:
1698: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1699:
1700: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1701: <dt>U+000A LINE FEED (LF)</dt>
1702: <dt>U+000C FORM FEED (FF)</dt>
1703: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1704: <dt>U+0020 SPACE</dt>
1.14 mike 1705: <dd>Ignore the character.</dd>
1.1 mike 1706:
1707: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
1708: <dd>Create a new DOCTYPE token. Set the token's name to the
1709: lowercase version of the <a href="parsing.html#current-input-character">current input character</a> (add 0x0020 to the
1710: character's code point). Switch to the <a href="#doctype-name-state">DOCTYPE name
1711: state</a>.</dd>
1712:
1713: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1714: <dd><a href="parsing.html#parse-error">Parse error</a>. Create a new DOCTYPE token. Set its
1.14 mike 1715: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
1716: state</a>. Emit the token.</dd>
1.1 mike 1717:
1718: <dt>EOF</dt>
1719: <dd><a href="parsing.html#parse-error">Parse error</a>. Create a new DOCTYPE token. Set its
1720: <i>force-quirks flag</i> to <i>on</i>. Emit the token. Reconsume
1721: the EOF character in the <a href="#data-state">data state</a>.</dd>
1722:
1723: <dt>Anything else</dt>
1724: <dd>Create a new DOCTYPE token. Set the token's name to the
1725: <a href="parsing.html#current-input-character">current input character</a>. Switch to the <a href="#doctype-name-state">DOCTYPE name
1726: state</a>.</dd>
1727:
1.37 mike 1728: </dl><h5 id="doctype-name-state"><span class="secno">8.2.4.54 </span><dfn>DOCTYPE name state</dfn></h5>
1.1 mike 1729:
1730: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1731:
1732: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1733: <dt>U+000A LINE FEED (LF)</dt>
1734: <dt>U+000C FORM FEED (FF)</dt>
1735: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1736: <dt>U+0020 SPACE</dt>
1737: <dd>Switch to the <a href="#after-doctype-name-state">after DOCTYPE name state</a>.</dd>
1738:
1739: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1740: <dd>Switch to the <a href="#data-state">data state</a>. Emit the current DOCTYPE
1741: token.</dd>
1.1 mike 1742:
1743: <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
1744: <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
1745: character</a> (add 0x0020 to the character's code point) to the
1.14 mike 1746: current DOCTYPE token's name.</dd>
1.1 mike 1747:
1748: <dt>EOF</dt>
1749: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1750: <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
1751: Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
1752:
1753: <dt>Anything else</dt>
1754: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
1.14 mike 1755: DOCTYPE token's name.</dd>
1.1 mike 1756:
1.37 mike 1757: </dl><h5 id="after-doctype-name-state"><span class="secno">8.2.4.55 </span><dfn>After DOCTYPE name state</dfn></h5>
1.1 mike 1758:
1759: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1760:
1761: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1762: <dt>U+000A LINE FEED (LF)</dt>
1763: <dt>U+000C FORM FEED (FF)</dt>
1764: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1765: <dt>U+0020 SPACE</dt>
1.14 mike 1766: <dd>Ignore the character.</dd>
1.1 mike 1767:
1768: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1769: <dd>Switch to the <a href="#data-state">data state</a>. Emit the current DOCTYPE
1770: token.</dd>
1.1 mike 1771:
1772: <dt>EOF</dt>
1773: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1774: <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
1775: Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
1776:
1777: <dt>Anything else</dt>
1778: <dd>
1779:
1780: <p>If the six characters starting from the <a href="parsing.html#current-input-character">current input
1781: character</a> are an <a href="infrastructure.html#ascii-case-insensitive">ASCII case-insensitive</a> match
1782: for the word "PUBLIC", then consume those characters and switch to
1783: the <a href="#after-doctype-public-keyword-state">after DOCTYPE public keyword state</a>.</p>
1784:
1785: <p>Otherwise, if the six characters starting from the
1786: <a href="parsing.html#current-input-character">current input character</a> are an <a href="infrastructure.html#ascii-case-insensitive">ASCII
1787: case-insensitive</a> match for the word "SYSTEM", then consume
1788: those characters and switch to the <a href="#after-doctype-system-keyword-state">after DOCTYPE system
1789: keyword state</a>.</p>
1790:
1791: <p>Otherwise, this is a <a href="parsing.html#parse-error">parse error</a>. Set the
1792: DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to
1793: the <a href="#bogus-doctype-state">bogus DOCTYPE state</a>.</p>
1794:
1795: </dd>
1796:
1.37 mike 1797: </dl><h5 id="after-doctype-public-keyword-state"><span class="secno">8.2.4.56 </span><dfn>After DOCTYPE public keyword state</dfn></h5>
1.1 mike 1798:
1799: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1800:
1801: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1802: <dt>U+000A LINE FEED (LF)</dt>
1803: <dt>U+000C FORM FEED (FF)</dt>
1804: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1805: <dt>U+0020 SPACE</dt>
1806: <dd>Switch to the <a href="#before-doctype-public-identifier-state">before DOCTYPE public identifier
1807: state</a>.</dd>
1808:
1809: <dt>U+0022 QUOTATION MARK (")</dt>
1810: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's public
1811: identifier to the empty string (not missing), then switch to the
1812: <a href="#doctype-public-identifier-double-quoted-state">DOCTYPE public identifier (double-quoted) state</a>.</dd>
1813:
1814: <dt>U+0027 APOSTROPHE (')</dt>
1815: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's public
1816: identifier to the empty string (not missing), then switch to the
1817: <a href="#doctype-public-identifier-single-quoted-state">DOCTYPE public identifier (single-quoted) state</a>.</dd>
1818:
1819: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1820: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1.14 mike 1821: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
1822: state</a>. Emit that DOCTYPE token.</dd>
1.1 mike 1823:
1824: <dt>EOF</dt>
1825: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1826: <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
1827: Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
1828:
1829: <dt>Anything else</dt>
1830: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1831: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#bogus-doctype-state">bogus
1832: DOCTYPE state</a>.</dd>
1833:
1.37 mike 1834: </dl><h5 id="before-doctype-public-identifier-state"><span class="secno">8.2.4.57 </span><dfn>Before DOCTYPE public identifier state</dfn></h5>
1.1 mike 1835:
1836: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1837:
1838: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1839: <dt>U+000A LINE FEED (LF)</dt>
1840: <dt>U+000C FORM FEED (FF)</dt>
1841: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1842: <dt>U+0020 SPACE</dt>
1.14 mike 1843: <dd>Ignore the character.</dd>
1.1 mike 1844:
1845: <dt>U+0022 QUOTATION MARK (")</dt>
1846: <dd>Set the DOCTYPE token's public identifier to the empty string
1847: (not missing), then switch to the <a href="#doctype-public-identifier-double-quoted-state">DOCTYPE public identifier
1848: (double-quoted) state</a>.</dd>
1849:
1850: <dt>U+0027 APOSTROPHE (')</dt>
1851: <dd>Set the DOCTYPE token's public identifier to the empty string
1852: (not missing), then switch to the <a href="#doctype-public-identifier-single-quoted-state">DOCTYPE public identifier
1853: (single-quoted) state</a>.</dd>
1854:
1855: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1856: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1.14 mike 1857: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
1858: state</a>. Emit that DOCTYPE token.</dd>
1.1 mike 1859:
1860: <dt>EOF</dt>
1861: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1862: <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
1863: Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
1864:
1865: <dt>Anything else</dt>
1866: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1867: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#bogus-doctype-state">bogus
1868: DOCTYPE state</a>.</dd>
1869:
1.37 mike 1870: </dl><h5 id="doctype-public-identifier-double-quoted-state"><span class="secno">8.2.4.58 </span><dfn>DOCTYPE public identifier (double-quoted) state</dfn></h5>
1.1 mike 1871:
1872: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1873:
1874: <dl class="switch"><dt>U+0022 QUOTATION MARK (")</dt>
1875: <dd>Switch to the <a href="#after-doctype-public-identifier-state">after DOCTYPE public identifier state</a>.</dd>
1876:
1877: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1878: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1.14 mike 1879: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
1880: state</a>. Emit that DOCTYPE token.</dd>
1.1 mike 1881:
1882: <dt>EOF</dt>
1883: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1884: <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
1885: Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
1886:
1887: <dt>Anything else</dt>
1888: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current DOCTYPE
1.14 mike 1889: token's public identifier.</dd>
1.1 mike 1890:
1.37 mike 1891: </dl><h5 id="doctype-public-identifier-single-quoted-state"><span class="secno">8.2.4.59 </span><dfn>DOCTYPE public identifier (single-quoted) state</dfn></h5>
1.1 mike 1892:
1893: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1894:
1895: <dl class="switch"><dt>U+0027 APOSTROPHE (')</dt>
1896: <dd>Switch to the <a href="#after-doctype-public-identifier-state">after DOCTYPE public identifier state</a>.</dd>
1897:
1898: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1899: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1.14 mike 1900: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
1901: state</a>. Emit that DOCTYPE token.</dd>
1.1 mike 1902:
1903: <dt>EOF</dt>
1904: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1905: <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
1906: Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
1907:
1908: <dt>Anything else</dt>
1909: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current DOCTYPE
1.14 mike 1910: token's public identifier.</dd>
1.1 mike 1911:
1.37 mike 1912: </dl><h5 id="after-doctype-public-identifier-state"><span class="secno">8.2.4.60 </span><dfn>After DOCTYPE public identifier state</dfn></h5>
1.1 mike 1913:
1914: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1915:
1916: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1917: <dt>U+000A LINE FEED (LF)</dt>
1918: <dt>U+000C FORM FEED (FF)</dt>
1919: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1920: <dt>U+0020 SPACE</dt>
1921: <dd>Switch to the <a href="#between-doctype-public-and-system-identifiers-state">between DOCTYPE public and system
1922: identifiers state</a>.</dd>
1923:
1924: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1925: <dd>Switch to the <a href="#data-state">data state</a>. Emit the current DOCTYPE
1926: token.</dd>
1.1 mike 1927:
1928: <dt>U+0022 QUOTATION MARK (")</dt>
1929: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's system
1930: identifier to the empty string (not missing), then switch to the
1931: <a href="#doctype-system-identifier-double-quoted-state">DOCTYPE system identifier (double-quoted) state</a>.</dd>
1932:
1933: <dt>U+0027 APOSTROPHE (')</dt>
1934: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's system
1935: identifier to the empty string (not missing), then switch to the
1936: <a href="#doctype-system-identifier-single-quoted-state">DOCTYPE system identifier (single-quoted) state</a>.</dd>
1937:
1938: <dt>EOF</dt>
1939: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1940: <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
1941: Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
1942:
1943: <dt>Anything else</dt>
1944: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1945: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#bogus-doctype-state">bogus
1946: DOCTYPE state</a>.</dd>
1947:
1.37 mike 1948: </dl><h5 id="between-doctype-public-and-system-identifiers-state"><span class="secno">8.2.4.61 </span><dfn>Between DOCTYPE public and system identifiers state</dfn></h5>
1.1 mike 1949:
1950: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1951:
1952: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1953: <dt>U+000A LINE FEED (LF)</dt>
1954: <dt>U+000C FORM FEED (FF)</dt>
1955: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1956: <dt>U+0020 SPACE</dt>
1.14 mike 1957: <dd>Ignore the character.</dd>
1.1 mike 1958:
1959: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 1960: <dd>Switch to the <a href="#data-state">data state</a>. Emit the current DOCTYPE
1961: token.</dd>
1.1 mike 1962:
1963: <dt>U+0022 QUOTATION MARK (")</dt>
1964: <dd>Set the DOCTYPE token's system identifier to the empty string
1965: (not missing), then switch to the <a href="#doctype-system-identifier-double-quoted-state">DOCTYPE system identifier
1966: (double-quoted) state</a>.</dd>
1967:
1968: <dt>U+0027 APOSTROPHE (')</dt>
1969: <dd>Set the DOCTYPE token's system identifier to the empty string
1970: (not missing), then switch to the <a href="#doctype-system-identifier-single-quoted-state">DOCTYPE system identifier
1971: (single-quoted) state</a>.</dd>
1972:
1973: <dt>EOF</dt>
1974: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1975: <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
1976: Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
1977:
1978: <dt>Anything else</dt>
1979: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1980: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#bogus-doctype-state">bogus
1981: DOCTYPE state</a>.</dd>
1982:
1.37 mike 1983: </dl><h5 id="after-doctype-system-keyword-state"><span class="secno">8.2.4.62 </span><dfn>After DOCTYPE system keyword state</dfn></h5>
1.1 mike 1984:
1985: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
1986:
1987: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
1988: <dt>U+000A LINE FEED (LF)</dt>
1989: <dt>U+000C FORM FEED (FF)</dt>
1990: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
1991: <dt>U+0020 SPACE</dt>
1992: <dd>Switch to the <a href="#before-doctype-system-identifier-state">before DOCTYPE system identifier
1993: state</a>.</dd>
1994:
1995: <dt>U+0022 QUOTATION MARK (")</dt>
1996: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's system
1997: identifier to the empty string (not missing), then switch to the
1998: <a href="#doctype-system-identifier-double-quoted-state">DOCTYPE system identifier (double-quoted) state</a>.</dd>
1999:
2000: <dt>U+0027 APOSTROPHE (')</dt>
2001: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's system
2002: identifier to the empty string (not missing), then switch to the
2003: <a href="#doctype-system-identifier-single-quoted-state">DOCTYPE system identifier (single-quoted) state</a>.</dd>
2004:
2005: <dt>U+003E GREATER-THAN SIGN (>)</dt>
2006: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1.14 mike 2007: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
2008: state</a>. Emit that DOCTYPE token.</dd>
1.1 mike 2009:
2010: <dt>EOF</dt>
2011: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
2012: <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
2013: Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
2014:
2015: <dt>Anything else</dt>
2016: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
2017: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#bogus-doctype-state">bogus
2018: DOCTYPE state</a>.</dd>
2019:
1.37 mike 2020: </dl><h5 id="before-doctype-system-identifier-state"><span class="secno">8.2.4.63 </span><dfn>Before DOCTYPE system identifier state</dfn></h5>
1.1 mike 2021:
2022: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
2023:
2024: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
2025: <dt>U+000A LINE FEED (LF)</dt>
2026: <dt>U+000C FORM FEED (FF)</dt>
2027: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
2028: <dt>U+0020 SPACE</dt>
1.14 mike 2029: <dd>Ignore the character.</dd>
1.1 mike 2030:
2031: <dt>U+0022 QUOTATION MARK (")</dt>
2032: <dd>Set the DOCTYPE token's system identifier to the empty string
2033: (not missing), then switch to the <a href="#doctype-system-identifier-double-quoted-state">DOCTYPE system identifier
2034: (double-quoted) state</a>.</dd>
2035:
2036: <dt>U+0027 APOSTROPHE (')</dt>
2037: <dd>Set the DOCTYPE token's system identifier to the empty string
2038: (not missing), then switch to the <a href="#doctype-system-identifier-single-quoted-state">DOCTYPE system identifier
2039: (single-quoted) state</a>.</dd>
2040:
2041: <dt>U+003E GREATER-THAN SIGN (>)</dt>
2042: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1.14 mike 2043: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
2044: state</a>. Emit that DOCTYPE token.</dd>
1.1 mike 2045:
2046: <dt>EOF</dt>
2047: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
2048: <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
2049: Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
2050:
2051: <dt>Anything else</dt>
2052: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
2053: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#bogus-doctype-state">bogus
2054: DOCTYPE state</a>.</dd>
2055:
1.37 mike 2056: </dl><h5 id="doctype-system-identifier-double-quoted-state"><span class="secno">8.2.4.64 </span><dfn>DOCTYPE system identifier (double-quoted) state</dfn></h5>
1.1 mike 2057:
2058: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
2059:
2060: <dl class="switch"><dt>U+0022 QUOTATION MARK (")</dt>
2061: <dd>Switch to the <a href="#after-doctype-system-identifier-state">after DOCTYPE system identifier
2062: state</a>.</dd>
2063:
2064: <dt>U+003E GREATER-THAN SIGN (>)</dt>
2065: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1.14 mike 2066: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
2067: state</a>. Emit that DOCTYPE token.</dd>
1.1 mike 2068:
2069: <dt>EOF</dt>
2070: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
2071: <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
2072: Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
2073:
2074: <dt>Anything else</dt>
2075: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
1.14 mike 2076: DOCTYPE token's system identifier.</dd>
1.1 mike 2077:
1.37 mike 2078: </dl><h5 id="doctype-system-identifier-single-quoted-state"><span class="secno">8.2.4.65 </span><dfn>DOCTYPE system identifier (single-quoted) state</dfn></h5>
1.1 mike 2079:
2080: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
2081:
2082: <dl class="switch"><dt>U+0027 APOSTROPHE (')</dt>
2083: <dd>Switch to the <a href="#after-doctype-system-identifier-state">after DOCTYPE system identifier
2084: state</a>.</dd>
2085:
2086: <dt>U+003E GREATER-THAN SIGN (>)</dt>
2087: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
1.14 mike 2088: <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
2089: state</a>. Emit that DOCTYPE token.</dd>
1.1 mike 2090:
2091: <dt>EOF</dt>
2092: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
2093: <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
2094: Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
2095:
2096: <dt>Anything else</dt>
2097: <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
1.14 mike 2098: DOCTYPE token's system identifier.</dd>
1.1 mike 2099:
1.37 mike 2100: </dl><h5 id="after-doctype-system-identifier-state"><span class="secno">8.2.4.66 </span><dfn>After DOCTYPE system identifier state</dfn></h5>
1.1 mike 2101:
2102: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
2103:
2104: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
2105: <dt>U+000A LINE FEED (LF)</dt>
2106: <dt>U+000C FORM FEED (FF)</dt>
2107: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
2108: <dt>U+0020 SPACE</dt>
1.14 mike 2109: <dd>Ignore the character.</dd>
1.1 mike 2110:
2111: <dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 2112: <dd>Switch to the <a href="#data-state">data state</a>. Emit the current DOCTYPE
2113: token.</dd>
1.1 mike 2114:
2115: <dt>EOF</dt>
2116: <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's
2117: <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
2118: Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
2119:
2120: <dt>Anything else</dt>
2121: <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#bogus-doctype-state">bogus DOCTYPE
2122: state</a>. (This does <em>not</em> set the DOCTYPE token's
2123: <i>force-quirks flag</i> to <i>on</i>.)</dd>
2124:
1.37 mike 2125: </dl><h5 id="bogus-doctype-state"><span class="secno">8.2.4.67 </span><dfn>Bogus DOCTYPE state</dfn></h5>
1.1 mike 2126:
2127: <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>
2128:
2129: <dl class="switch"><dt>U+003E GREATER-THAN SIGN (>)</dt>
1.14 mike 2130: <dd>Switch to the <a href="#data-state">data state</a>. Emit the DOCTYPE
2131: token.</dd>
1.1 mike 2132:
2133: <dt>EOF</dt>
2134: <dd>Emit the DOCTYPE token. Reconsume the EOF character in the
2135: <a href="#data-state">data state</a>.</dd>
2136:
2137: <dt>Anything else</dt>
1.14 mike 2138: <dd>Ignore the character.</dd>
1.1 mike 2139:
1.37 mike 2140: </dl><h5 id="cdata-section-state"><span class="secno">8.2.4.68 </span><dfn>CDATA section state</dfn></h5>
1.1 mike 2141:
2142: <p>Consume every character up to the next occurrence of the three
2143: character sequence U+005D RIGHT SQUARE BRACKET U+005D RIGHT SQUARE
2144: BRACKET U+003E GREATER-THAN SIGN (<code title="">]]></code>), or the
2145: end of the file (EOF), whichever comes first. Emit a series of
2146: character tokens consisting of all the characters consumed except
2147: the matching three character sequence at the end (if one was found
2148: before the end of the file).</p>
2149:
2150: <p>Switch to the <a href="#data-state">data state</a>.</p>
2151:
2152: <p>If the end of the file was reached, reconsume the EOF
2153: character.</p>
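<p class="note">As a rough illustration of the CDATA section state described above, the following Python sketch scans for the terminator and falls back to the end of the input. The function name <code>consume_cdata_section</code>, the string-plus-index input model, and the returned tuple are assumptions made for this example; they are not defined by this specification.</p>

```python
def consume_cdata_section(stream, pos):
    """Sketch of the CDATA section state (hypothetical helper).

    `stream` is the whole input as a string and `pos` is the index of the
    first character inside the CDATA section. Returns a tuple of
    (character tokens, new position, whether EOF was reached).
    """
    end = stream.find("]]>", pos)
    if end == -1:
        # No terminator before EOF: every remaining character is emitted,
        # and the caller reconsumes EOF in the data state.
        return list(stream[pos:]), len(stream), True
    # Emit everything before "]]>" and skip the three-character terminator.
    return list(stream[pos:end]), end + 3, False
```

<p class="note">In both cases the caller would then switch to the data state; only the EOF case requires reconsuming the EOF character.</p>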
2154:
2155:
2156:
1.37 mike 2157: <h5 id="tokenizing-character-references"><span class="secno">8.2.4.69 </span>Tokenizing character references</h5>
1.1 mike 2158:
2159: <p>This section defines how to <dfn id="consume-a-character-reference">consume a character
2160: reference</dfn>. This definition is used when parsing character
2161: references <a href="#character-reference-in-data-state" title="character reference in data state">in
2162: text</a> and <a href="#character-reference-in-attribute-value-state" title="character reference in attribute value
2163: state">in attributes</a>.</p>
2164:
2165: <p>The behavior depends on the identity of the next character (the
2166: one immediately after the U+0026 AMPERSAND character):</p>
2167:
2168: <dl class="switch"><dt>U+0009 CHARACTER TABULATION</dt>
2169: <dt>U+000A LINE FEED (LF)</dt>
2170: <dt>U+000C FORM FEED (FF)</dt>
2171: <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
2172: <dt>U+0020 SPACE</dt>
2173: <dt>U+003C LESS-THAN SIGN</dt>
2174: <dt>U+0026 AMPERSAND</dt>
2175: <dt>EOF</dt>
2176: <dt>The <dfn id="additional-allowed-character">additional allowed character</dfn>, if there is one</dt>
2177:
2178: <dd>Not a character reference. No characters are consumed, and
2179: nothing is returned. (This is not an error, either.)</dd>
2180:
2181:
2182: <dt>U+0023 NUMBER SIGN (#)</dt>
2183:
2184: <dd>
2185:
2186: <p>Consume the U+0023 NUMBER SIGN.</p>
2187:
2188: <p>The behavior further depends on the character after the U+0023
2189: NUMBER SIGN:</p>
2190:
2191: <dl class="switch"><dt>U+0078 LATIN SMALL LETTER X</dt>
2192: <dt>U+0058 LATIN CAPITAL LETTER X</dt>
2193:
2194: <dd>
2195:
2196: <p>Consume the X.</p>
2197:
2198: <p>Follow the steps below, but using the range of characters
2199: U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), U+0061 LATIN
2200: SMALL LETTER A to U+0066 LATIN SMALL LETTER F, and U+0041 LATIN
2201: CAPITAL LETTER A to U+0046 LATIN CAPITAL LETTER F (in other
2202: words, 0-9, A-F, a-f).</p>
2203:
2204: <p>When it comes to interpreting the number, interpret it as a
2205: hexadecimal number.</p>
2206:
2207: </dd>
2208:
2209:
2210: <dt>Anything else</dt>
2211:
2212: <dd>
2213:
2214: <p>Follow the steps below, but using the range of characters
2215: U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9).</p>
2216:
2217: <p>When it comes to interpreting the number, interpret it as a
2218: decimal number.</p>
2219:
2220: </dd>
2221:
2222: </dl><p>Consume as many characters as match the range of characters
2223: given above.</p>
2224:
2225: <p>If no characters match the range, then don't consume any
2226: characters (and unconsume the U+0023 NUMBER SIGN character and, if
2227: appropriate, the X character). This is a <a href="parsing.html#parse-error">parse
2228: error</a>; nothing is returned.</p>
2229:
2230: <p>Otherwise, if the next character is a U+003B SEMICOLON, consume
2231: that too. If it isn't, there is a <a href="parsing.html#parse-error">parse
2232: error</a>.</p>
2233:
2234: <p>If one or more characters match the range, then take them all
2235: and interpret the string of characters as a number (either
2236: hexadecimal or decimal as appropriate).</p>
2237:
2238: <p>If that number is one of the numbers in the first column of the
2239: following table, then this is a <a href="parsing.html#parse-error">parse error</a>. Find the
2240: row with that number in the first column, and return a character
2241: token for the Unicode character given in the second column of that
2242: row.</p>
2243:
1.26 mike 2244: <table id="table-charref-overrides"><thead><tr><th>Number </th><th colspan="2">Unicode character
1.1 mike 2245: </th></tr></thead><tbody><tr><td>0x00 </td><td>U+FFFD </td><td>REPLACEMENT CHARACTER
2246: </td></tr><tr><td>0x0D </td><td>U+000D </td><td>CARRIAGE RETURN (CR)
2247: </td></tr><tr><td>0x80 </td><td>U+20AC </td><td>EURO SIGN (€)
2248: </td></tr><tr><td>0x81 </td><td>U+0081 </td><td>&lt;control&gt;
2249: </td></tr><tr><td>0x82 </td><td>U+201A </td><td>SINGLE LOW-9 QUOTATION MARK (‚)
2250: </td></tr><tr><td>0x83 </td><td>U+0192 </td><td>LATIN SMALL LETTER F WITH HOOK (ƒ)
2251: </td></tr><tr><td>0x84 </td><td>U+201E </td><td>DOUBLE LOW-9 QUOTATION MARK („)
2252: </td></tr><tr><td>0x85 </td><td>U+2026 </td><td>HORIZONTAL ELLIPSIS (…)
2253: </td></tr><tr><td>0x86 </td><td>U+2020 </td><td>DAGGER (†)
2254: </td></tr><tr><td>0x87 </td><td>U+2021 </td><td>DOUBLE DAGGER (‡)
2255: </td></tr><tr><td>0x88 </td><td>U+02C6 </td><td>MODIFIER LETTER CIRCUMFLEX ACCENT (ˆ)
2256: </td></tr><tr><td>0x89 </td><td>U+2030 </td><td>PER MILLE SIGN (‰)
2257: </td></tr><tr><td>0x8A </td><td>U+0160 </td><td>LATIN CAPITAL LETTER S WITH CARON (Š)
2258: </td></tr><tr><td>0x8B </td><td>U+2039 </td><td>SINGLE LEFT-POINTING ANGLE QUOTATION MARK (‹)
2259: </td></tr><tr><td>0x8C </td><td>U+0152 </td><td>LATIN CAPITAL LIGATURE OE (Œ)
2260: </td></tr><tr><td>0x8D </td><td>U+008D </td><td>&lt;control&gt;
2261: </td></tr><tr><td>0x8E </td><td>U+017D </td><td>LATIN CAPITAL LETTER Z WITH CARON (Ž)
2262: </td></tr><tr><td>0x8F </td><td>U+008F </td><td>&lt;control&gt;
2263: </td></tr><tr><td>0x90 </td><td>U+0090 </td><td>&lt;control&gt;
2264: </td></tr><tr><td>0x91 </td><td>U+2018 </td><td>LEFT SINGLE QUOTATION MARK (‘)
2265: </td></tr><tr><td>0x92 </td><td>U+2019 </td><td>RIGHT SINGLE QUOTATION MARK (’)
2266: </td></tr><tr><td>0x93 </td><td>U+201C </td><td>LEFT DOUBLE QUOTATION MARK (“)
2267: </td></tr><tr><td>0x94 </td><td>U+201D </td><td>RIGHT DOUBLE QUOTATION MARK (”)
2268: </td></tr><tr><td>0x95 </td><td>U+2022 </td><td>BULLET (•)
2269: </td></tr><tr><td>0x96 </td><td>U+2013 </td><td>EN DASH (–)
2270: </td></tr><tr><td>0x97 </td><td>U+2014 </td><td>EM DASH (—)
2271: </td></tr><tr><td>0x98 </td><td>U+02DC </td><td>SMALL TILDE (˜)
2272: </td></tr><tr><td>0x99 </td><td>U+2122 </td><td>TRADE MARK SIGN (™)
2273: </td></tr><tr><td>0x9A </td><td>U+0161 </td><td>LATIN SMALL LETTER S WITH CARON (š)
2274: </td></tr><tr><td>0x9B </td><td>U+203A </td><td>SINGLE RIGHT-POINTING ANGLE QUOTATION MARK (›)
2275: </td></tr><tr><td>0x9C </td><td>U+0153 </td><td>LATIN SMALL LIGATURE OE (œ)
2276: </td></tr><tr><td>0x9D </td><td>U+009D </td><td>&lt;control&gt;
2277: </td></tr><tr><td>0x9E </td><td>U+017E </td><td>LATIN SMALL LETTER Z WITH CARON (ž)
2278: </td></tr><tr><td>0x9F </td><td>U+0178 </td><td>LATIN CAPITAL LETTER Y WITH DIAERESIS (Ÿ)
2279: </td></tr></tbody></table><p>Otherwise, if the number is in the range 0xD800 to 0xDFFF<!--
2280: surrogates not allowed; see the comment in the "preprocessing the
2281: input stream" section for details --> or is greater than 0x10FFFF,
2282: then this is a <a href="parsing.html#parse-error">parse error</a>. Return a U+FFFD
2283: REPLACEMENT CHARACTER.</p>
2284:
2285: <p>Otherwise, return a character token for the Unicode character
2286: whose code point is that number.
2287:
2288: <!-- this is the same as the equivalent list in the input stream
2289: section -->
2290: If the number is in the range 0x0001 to 0x0008, <!-- HT, LF
2291: allowed --> <!-- U+000B is in the next list --> <!-- FF, CR
2292: allowed --> 0x000E to 0x001F, <!-- ASCII allowed --> 0x007F <!--to
2293: 0x0084, (0x0085 NEL not allowed), 0x0086--> to 0x009F, 0xFDD0 to
2294: 0xFDEF, or is one of 0x000B, 0xFFFE, 0xFFFF, 0x1FFFE, 0x1FFFF,
2295: 0x2FFFE, 0x2FFFF, 0x3FFFE, 0x3FFFF, 0x4FFFE, 0x4FFFF, 0x5FFFE,
2296: 0x5FFFF, 0x6FFFE, 0x6FFFF, 0x7FFFE, 0x7FFFF, 0x8FFFE, 0x8FFFF,
2297: 0x9FFFE, 0x9FFFF, 0xAFFFE, 0xAFFFF, 0xBFFFE, 0xBFFFF, 0xCFFFE,
2298: 0xCFFFF, 0xDFFFE, 0xDFFFF, 0xEFFFE, 0xEFFFF, 0xFFFFE, 0xFFFFF,
2299: 0x10FFFE, or 0x10FFFF, then this is a <a href="parsing.html#parse-error">parse
2300: error</a>.</p>
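<p class="note">The numeric branch above can be sketched in Python as follows. This is an illustrative sketch only: the names <code>CHARREF_OVERRIDES</code>, <code>is_disallowed</code> and <code>consume_numeric_reference</code>, and the convention of returning a (character, new position, parse error) tuple, are assumptions of the example. The override values are copied from the table above.</p>

```python
import string

# Number -> replacement code point, copied from the overrides table above.
CHARREF_OVERRIDES = {
    0x00: 0xFFFD, 0x0D: 0x000D, 0x80: 0x20AC, 0x81: 0x0081,
    0x82: 0x201A, 0x83: 0x0192, 0x84: 0x201E, 0x85: 0x2026,
    0x86: 0x2020, 0x87: 0x2021, 0x88: 0x02C6, 0x89: 0x2030,
    0x8A: 0x0160, 0x8B: 0x2039, 0x8C: 0x0152, 0x8D: 0x008D,
    0x8E: 0x017D, 0x8F: 0x008F, 0x90: 0x0090, 0x91: 0x2018,
    0x92: 0x2019, 0x93: 0x201C, 0x94: 0x201D, 0x95: 0x2022,
    0x96: 0x2013, 0x97: 0x2014, 0x98: 0x02DC, 0x99: 0x2122,
    0x9A: 0x0161, 0x9B: 0x203A, 0x9C: 0x0153, 0x9D: 0x009D,
    0x9E: 0x017E, 0x9F: 0x0178,
}


def is_disallowed(n):
    """True for numbers that are returned as-is but flagged as parse errors."""
    return (0x0001 <= n <= 0x0008 or n == 0x000B or 0x000E <= n <= 0x001F
            or 0x007F <= n <= 0x009F or 0xFDD0 <= n <= 0xFDEF
            or (n & 0xFFFF) in (0xFFFE, 0xFFFF))


def consume_numeric_reference(text, pos):
    """`pos` indexes the '#' after '&'. Returns (char, new_pos, parse_error)."""
    start = pos
    pos += 1  # consume the U+0023 NUMBER SIGN
    if pos < len(text) and text[pos] in "xX":
        pos += 1  # consume the X
        digits, base = string.hexdigits, 16
    else:
        digits, base = string.digits, 10
    first = pos
    while pos < len(text) and text[pos] in digits:
        pos += 1
    if pos == first:
        # Nothing matched: unconsume '#' (and 'x'); parse error, no result.
        return None, start, True
    number = int(text[first:pos], base)
    if pos < len(text) and text[pos] == ";":
        pos += 1
        error = False
    else:
        error = True  # missing semicolon
    if number in CHARREF_OVERRIDES:
        return chr(CHARREF_OVERRIDES[number]), pos, True
    if 0xD800 <= number <= 0xDFFF or number > 0x10FFFF:
        return "\uFFFD", pos, True
    return chr(number), pos, error or is_disallowed(number)
```

<p class="note">For instance, under these assumptions the input <code title="">#x80;</code> yields U+20AC EURO SIGN together with a parse error, because 0x80 appears in the overrides table.</p>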
2301:
2302: </dd>
2303:
2304:
2305: <dt>Anything else</dt>
2306:
2307: <dd>
2308:
2309: <p>Consume the maximum number of characters possible, with the
2310: consumed characters matching one of the identifiers in the first
2311: column of the <a href="named-character-references.html#named-character-references">named character references</a> table (in a
2312: <a href="infrastructure.html#case-sensitive">case-sensitive</a> manner).</p>
2313:
2314: <p>If no match can be made, then no characters are consumed, and
2315: nothing is returned. In this case, if the characters after the
2316: U+0026 AMPERSAND character (&) consist of a sequence of one or
2317: more characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT
2318: NINE (9), U+0061 LATIN SMALL LETTER A to U+007A LATIN SMALL LETTER
2319: Z, and U+0041 LATIN CAPITAL LETTER A to U+005A LATIN CAPITAL
2320: LETTER Z, followed by a U+003B SEMICOLON character (;), then this
2321: is a <a href="parsing.html#parse-error">parse error</a>.</p>
2322:
2323: <p>If the character reference is being consumed <a href="#character-reference-in-attribute-value-state" title="character reference in attribute value state">as part of an
2324: attribute</a>, and the last character matched is not a U+003B
2325: SEMICOLON character (;), and the next character is either a U+003D
2326: EQUALS SIGN character (=) or in the range U+0030 DIGIT ZERO (0) to
2327: U+0039 DIGIT NINE (9), U+0041 LATIN CAPITAL LETTER A to U+005A
2328: LATIN CAPITAL LETTER Z, or U+0061 LATIN SMALL LETTER A to U+007A
2329: LATIN SMALL LETTER Z, then, for historical reasons, all the
2330: characters that were matched after the U+0026 AMPERSAND character
2331: (&) must be unconsumed, and nothing is returned.</p>
2332: <!-- "=" added because of http://www.w3.org/Bugs/Public/show_bug.cgi?id=9207#c5 -->
2333:
2334: <p>Otherwise, a character reference is parsed. If the last
2335: character matched is not a U+003B SEMICOLON character (;), there
2336: is a <a href="parsing.html#parse-error">parse error</a>.</p>
2337:
1.41 mike 2338: <p>Return one or two character tokens for the character(s)
2339: corresponding to the character reference name (as given by the
2340: second column of the <a href="named-character-references.html#named-character-references">named character references</a>
2341: table).</p>
1.1 mike 2342:
2343: <div class="example">
2344:
2345: <p>If the markup contains (not in an attribute) the string <code title="">I'm &amp;notit; I tell you</code>, the character
2346: reference is parsed as "not", as in, <code title="">I'm ¬it;
2347: I tell you</code> (and this is a parse error). But if the markup
2348: was <code title="">I'm &amp;notin; I tell you</code>, the
2349: character reference would be parsed as "notin;", resulting in
2350: <code title="">I'm ∉ I tell you</code> (and no parse
2351: error).</p>
2352:
2353: </div>
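<p class="note">The longest-match rule and the attribute exception above can be sketched in Python as follows. The table <code>NAMED_SUBSET</code> is a deliberately tiny, hypothetical excerpt of the named character references table, and the function name and return convention (characters, new position, parse error) are assumptions of this example.</p>

```python
import string

# Hypothetical excerpt; the real table is far larger.
NAMED_SUBSET = {
    "amp": "&", "amp;": "&", "lt": "<", "lt;": "<",
    "not": "\u00AC", "not;": "\u00AC", "notin;": "\u2209",
}
ALNUM = string.ascii_letters + string.digits
MAX_NAME = max(len(name) for name in NAMED_SUBSET)


def consume_named_reference(text, pos, in_attribute=False):
    """`pos` indexes the first character after '&'."""
    match = None
    # Keep the longest case-sensitive prefix that is a known name.
    for end in range(pos + 1, min(len(text), pos + MAX_NAME) + 1):
        if text[pos:end] in NAMED_SUBSET:
            match = text[pos:end]
    if match is None:
        # Parse error only for one or more alphanumerics followed by ';'.
        end = pos
        while end < len(text) and text[end] in ALNUM:
            end += 1
        error = end > pos and end < len(text) and text[end] == ";"
        return None, pos, error
    new_pos = pos + len(match)
    if in_attribute and not match.endswith(";"):
        nxt = text[new_pos:new_pos + 1]
        if nxt and (nxt == "=" or nxt in ALNUM):
            # Historical exception: unconsume everything, return nothing.
            return None, pos, False
    return NAMED_SUBSET[match], new_pos, not match.endswith(";")
```

<p class="note">With this excerpt, the two strings from the example above behave as described: <code title="">notit;</code> matches only "not" (with a parse error), while <code title="">notin;</code> matches the full name.</p>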
2354:
2355: </dd>
2356:
2357: </dl></div><div class="impl">
2358:
2359: <!-- v2: One thing that this doesn't define is handling deeply
2360: nested documents. There are compatibility requirements around that:
2361: you can't throw away the elements altogether, consider Tux made only
2362: with opening <font> elements, one per character. Seems that the best
2363: thing to do is to close some formatting elements from the middle of
2364: the stack when you hit a limit, or something. -->
2365:
1.29 mike 2366: <h4 id="tree-construction"><span class="secno">8.2.5 </span><dfn>Tree construction</dfn></h4>
1.1 mike 2367:
2368: <p>The input to the tree construction stage is a sequence of tokens
2369: from the <a href="#tokenization">tokenization</a> stage. The tree construction
2370: stage is associated with a DOM <code><a href="infrastructure.html#document">Document</a></code> object when a
2371: parser is created. The "output" of this stage consists of
2372: dynamically modifying or extending that document's DOM tree.</p>
2373:
2374: <p>This specification does not define when an interactive user agent
2375: has to render the <code><a href="infrastructure.html#document">Document</a></code> so that it is available to
2376: the user, or when it has to begin accepting user input.</p>
2377:
2378: <p>As each token is emitted from the tokenizer, the user agent must
2379: process the token according to the rules given in the section
2380: corresponding to the current <a href="parsing.html#insertion-mode">insertion mode</a>.</p>
2381:
2382: <p>When the steps below require the UA to <dfn id="insert-a-character">insert a
2383: character</dfn> into a node, if that node has a child immediately
2384: before where the character is to be inserted, and that child is a
1.24 mike 2385: <code><a href="infrastructure.html#text">Text</a></code> node, then the character must be appended to that
2386: <code><a href="infrastructure.html#text">Text</a></code> node; otherwise, a new <code><a href="infrastructure.html#text">Text</a></code> node
2387: whose data is just that character must be inserted in the
2388: appropriate place.</p>
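<p class="note">A minimal Python sketch of the "insert a character" rule follows. The <code>Node</code> and <code>Text</code> classes and the child-index parameter are simplified stand-ins invented for this example; they are not the DOM interfaces.</p>

```python
class Node:
    """Simplified stand-in for a DOM node that can hold children."""
    def __init__(self):
        self.children = []


class Text:
    """Simplified stand-in for a DOM Text node."""
    def __init__(self, data):
        self.data = data


def insert_character(parent, index, ch):
    """Insert `ch` so that it lands at child position `index` of `parent`."""
    if index > 0 and isinstance(parent.children[index - 1], Text):
        # The preceding sibling is a Text node: append to it.
        parent.children[index - 1].data += ch
    else:
        # Otherwise create a new Text node holding just this character.
        parent.children.insert(index, Text(ch))
```

<p class="note">Inserting "A" and then "B" at the end of an empty node therefore leaves a single text node containing "AB", which is why consecutive character tokens do not fragment the tree.</p>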
1.1 mike 2389:
2390: <div class="example">
2391:
2392: <p>Here are some sample inputs to the parser and the corresponding
2393: number of text nodes that they result in, assuming a user agent
2394: that executes scripts.</p>
2395:
2396: <table><thead><tr><th>Input </th><th>Number of text nodes
2397: </th></tr></thead><tbody><tr><td><pre>A&lt;script>
2398: var script = document.getElementsByTagName('script')[0];
2399: document.body.removeChild(script);
2400: &lt;/script>B</pre>
1.24 mike 2401: </td><td>One text node in the document, containing "AB".
1.1 mike 2402: </td></tr><tr><td><pre>A&lt;script>
2403: var text = document.createTextNode('B');
2404: document.body.appendChild(text);
2405: &lt;/script>C</pre>
1.24 mike 2406: </td><td>Three text nodes: "A" before the script, the script's contents, and "BC" after the script (the parser appends to the text node created by the script).
1.1 mike 2407: </td></tr><tr><td><pre>A&lt;script>
2408: var text = document.getElementsByTagName('script')[0].firstChild;
2409: text.data = 'B';
2410: document.body.appendChild(text);
1.24 mike 2411: &lt;/script>C</pre>
2412: </td><td>Two adjacent text nodes in the document, containing "A" and "BC".
2413: </td></tr><tr><td><pre>A&lt;table>B&lt;tr>C&lt;/tr>D&lt;/table></pre>
2414: </td><td>One text node before the table, containing "ABCD". (This is caused by <a href="#foster-parent" title="foster parent">foster parenting</a>.)
2415: </td></tr><tr><td><pre>A&lt;table>&lt;tr> B&lt;/tr> C&lt;/table></pre>
2416: </td><td>One text node before the table, containing "A B C" (A-space-B-space-C). (This is caused by <a href="#foster-parent" title="foster parent">foster parenting</a>.)
1.1 mike 2417: </td></tr><tr><td><pre>A&lt;table>&lt;tr> B&lt;/tr> &lt;/em>C&lt;/table></pre>
1.24 mike 2418: </td><td>One text node before the table, containing "A BC" (A-space-B-C), and one text node inside the table (as a child of a <code><a href="tabular-data.html#the-tbody-element">tbody</a></code>) with a single space character. (Space characters separated from non-space characters by non-character tokens are not affected by <a href="#foster-parent" title="foster parent">foster parenting</a>, even if those other tokens then get ignored.)
1.1 mike 2419: </td></tr></tbody></table></div>
2420:
2421: <p id="mutation-during-parsing">DOM mutation events must not fire
2422: for changes caused by the UA parsing the document. (Conceptually,
2423: the parser is not mutating the DOM, it is constructing it.) This
2424: includes the parsing of any content inserted using <code title="dom-document-write"><a href="apis-in-html-documents.html#dom-document-write">document.write()</a></code> and <code title="dom-document-writeln"><a href="apis-in-html-documents.html#dom-document-writeln">document.writeln()</a></code> calls. <a href="references.html#refsDOMEVENTS">[DOMEVENTS]</a></p>
2425:
2426: <p class="note">Not all of the tag names mentioned below are
2427: conformant tag names in this specification; many are included to
2428: handle legacy content. They still form part of the algorithm that
2429: implementations are required to implement to claim conformance.</p>
2430:
2431: <p class="note">The algorithm described below places no limit on the
2432: depth of the DOM tree generated, or on the length of tag names,
2433: attribute names, attribute values, text nodes, etc. While
2434: implementors are encouraged to avoid arbitrary limits, it is
2435: recognized that <a href="infrastructure.html#hardwareLimitations">practical
2436: concerns</a> will likely force user agents to impose nesting depth
2437: constraints.</p>
2438:
2439:
1.29 mike 2440: <h5 id="creating-and-inserting-elements"><span class="secno">8.2.5.1 </span>Creating and inserting elements</h5>
1.1 mike 2441:
2442: <p>When the steps below require the UA to <dfn id="create-an-element-for-the-token" title="create an
2443: element for the token">create an element for a token</dfn> in a
2444: particular namespace, the UA must create a node implementing the
2445: interface appropriate for the element type corresponding to the tag
2446: name of the token in the given namespace (as given in the
2447: specification that defines that element, e.g. for an <code><a href="text-level-semantics.html#the-a-element">a</a></code>
2448: element in the <a href="namespaces.html#html-namespace-0">HTML namespace</a>, this specification
2449: defines it to be the <code><a href="text-level-semantics.html#htmlanchorelement">HTMLAnchorElement</a></code> interface), with
2450: the tag name being the name of that element, with the node being in
2451: the given namespace, and with the attributes on the node being those
2452: given in the given token.</p>
2453:
2454: <p>The interface appropriate for an element in the <a href="namespaces.html#html-namespace-0">HTML
2455: namespace</a> that is not defined in this specification (or
2456: <a href="infrastructure.html#other-applicable-specifications">other applicable specifications</a>) is
2457: <code><a href="elements.html#htmlunknownelement">HTMLUnknownElement</a></code>. Elements in other namespaces whose
2458: interface is not defined by that namespace's specification must use
2459: the interface <code><a href="infrastructure.html#element">Element</a></code>.</p>
2460:
2461: <p>When a <a href="forms.html#category-reset" title="category-reset">resettable element</a> is
2462: created in this manner, its <a href="association-of-controls-and-forms.html#concept-form-reset-control" title="concept-form-reset-control">reset algorithm</a> must be
2463: invoked once the attributes are set. (This initializes the element's
2464: <a href="association-of-controls-and-forms.html#concept-fe-value" title="concept-fe-value">value</a> and <a href="association-of-controls-and-forms.html#concept-fe-checked" title="concept-fe-checked">checkedness</a> based on the element's
2465: attributes.)</p>
2466:
2467: <hr><p>When the steps below require the UA to <dfn id="insert-an-html-element">insert an HTML
2468: element</dfn> for a token, the UA must first <a href="#create-an-element-for-the-token">create an element
2469: for the token</a> in the <a href="namespaces.html#html-namespace-0">HTML namespace</a>, and then
2470: append this node to the <a href="parsing.html#current-node">current node</a>, and push it onto
2471: the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> so that it is the new
2472: <a href="parsing.html#current-node">current node</a>.</p>
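<p class="note">The "insert an HTML element" steps above can be sketched in Python as follows. The <code>Element</code> class, the dictionary shape of the token, and the list used for the stack of open elements are assumptions of this example. Interface selection, the reset algorithm, and form association are deliberately left out.</p>

```python
HTML_NAMESPACE = "http://www.w3.org/1999/xhtml"


class Element:
    """Simplified stand-in for a DOM element (not a real DOM interface)."""
    def __init__(self, name, namespace, attributes):
        self.name = name
        self.namespace = namespace
        self.attributes = dict(attributes)
        self.children = []


def create_element_for_token(token, namespace):
    # The tag name and attributes come from the token; the namespace is given.
    return Element(token["name"], namespace, token["attributes"])


def insert_html_element(token, stack_of_open_elements):
    element = create_element_for_token(token, HTML_NAMESPACE)
    current_node = stack_of_open_elements[-1]
    current_node.children.append(element)   # append to the current node
    stack_of_open_elements.append(element)  # it becomes the new current node
    return element
```

<p class="note">After the call, the last entry of the stack is the newly created element, so any following insertion is appended inside it until it is popped.</p>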
2473:
2474: <p>The steps below may also require that the UA insert an HTML
2475: element in a particular place, in which case the UA must follow the
2476: same steps except that it must insert or append the new node in the
2477: location specified instead of appending it to the <a href="parsing.html#current-node">current
2478: node</a>. (This happens in particular during the parsing of
2479: tables with invalid content.)</p>
2480:
2481: <p>If an element created by the <a href="#insert-an-html-element">insert an HTML element</a>
2482: algorithm is a <a href="forms.html#form-associated-element">form-associated element</a>, and the
2483: <a href="parsing.html#form-element-pointer"><code title="">form</code> element pointer</a> is not null,
2484: and the newly created element doesn't have a <code title="attr-fae-form"><a href="association-of-controls-and-forms.html#attr-fae-form">form</a></code> attribute, the user agent must
2485: <a href="association-of-controls-and-forms.html#concept-form-association" title="concept-form-association">associate</a> the newly
2486: created element with the <code><a href="forms.html#the-form-element">form</a></code> element pointed to by the
1.30 mike 2487: <a href="parsing.html#form-element-pointer"><code title="">form</code> element pointer</a> when the
2488: element is inserted, instead of running the <a href="association-of-controls-and-forms.html#reset-the-form-owner">reset the form
2489: owner</a> algorithm.</p>
1.1 mike 2490:
2491: <hr><p>When the steps below require the UA to <dfn id="insert-a-foreign-element">insert a foreign
2492: element</dfn> for a token, the UA must first <a href="#create-an-element-for-the-token">create an element
2493: for the token</a> in the given namespace, and then append this
2494: node to the <a href="parsing.html#current-node">current node</a>, and push it onto the
2495: <a href="parsing.html#stack-of-open-elements">stack of open elements</a> so that it is the new
2496: <a href="parsing.html#current-node">current node</a>. If the newly created element has an <code title="">xmlns</code> attribute in the <a href="namespaces.html#xmlns-namespace">XMLNS namespace</a>
2497: whose value is not exactly the same as the element's namespace, that
2498: is a <a href="parsing.html#parse-error">parse error</a>. Similarly, if the newly created
2499: element has an <code title="">xmlns:xlink</code> attribute in the
2500: <a href="namespaces.html#xmlns-namespace">XMLNS namespace</a> whose value is not the <a href="namespaces.html#xlink-namespace">XLink
2501: Namespace</a>, that is a <a href="parsing.html#parse-error">parse error</a>.</p>
2502:
2503: <p>When the steps below require the user agent to <dfn id="adjust-mathml-attributes">adjust MathML
2504: attributes</dfn> for a token, then, if the token has an attribute
2505: named <code title="">definitionurl</code>, change its name to <code title="">definitionURL</code> (note the case difference).</p>
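<p class="note">As a small sketch, the MathML adjustment above amounts to a single rename. The mapping name <code>MATHML_ATTRIBUTE_FIXES</code>, the function name, and the representation of a token's attributes as a dictionary are assumptions of this example.</p>

```python
# The only rename required for MathML, per the paragraph above.
MATHML_ATTRIBUTE_FIXES = {"definitionurl": "definitionURL"}


def adjust_mathml_attributes(attributes):
    """Return a copy of `attributes` with MathML attribute names case-fixed."""
    return {
        MATHML_ATTRIBUTE_FIXES.get(name, name): value
        for name, value in attributes.items()
    }
```

<p class="note">Attributes that are not listed in the mapping pass through unchanged.</p>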
2506:
2507: <p>When the steps below require the user agent to <dfn id="adjust-svg-attributes">adjust SVG
2508: attributes</dfn> for a token, then, for each attribute on the token
2509: whose attribute name is one of the ones in the first column of the
2510: following table, change the attribute's name to the name given in
2511: the corresponding cell in the second column. (This fixes the case of
2512: SVG attributes that are not all lowercase.)</p>
2513:
2514: <table><thead><tr><th> Attribute name on token </th><th> Attribute name on element
2515: </th></tr></thead><tbody><tr><td> <code title="">attributename</code> </td><td> <code title="">attributeName</code>
2516: </td></tr><tr><td> <code title="">attributetype</code> </td><td> <code title="">attributeType</code>
2517: </td></tr><tr><td> <code title="">basefrequency</code> </td><td> <code title="">baseFrequency</code>
2518: </td></tr><tr><td> <code title="">baseprofile</code> </td><td> <code title="">baseProfile</code>
2519: </td></tr><tr><td> <code title="">calcmode</code> </td><td> <code title="">calcMode</code>
2520: </td></tr><tr><td> <code title="">clippathunits</code> </td><td> <code title="">clipPathUnits</code>
2521: </td></tr><tr><td> <code title="">contentscripttype</code> </td><td> <code title="">contentScriptType</code>
2522: </td></tr><tr><td> <code title="">contentstyletype</code> </td><td> <code title="">contentStyleType</code>
2523: </td></tr><tr><td> <code title="">diffuseconstant</code> </td><td> <code title="">diffuseConstant</code>
2524: </td></tr><tr><td> <code title="">edgemode</code> </td><td> <code title="">edgeMode</code>
2525: </td></tr><tr><td> <code title="">externalresourcesrequired</code> </td><td> <code title="">externalResourcesRequired</code>
2526: </td></tr><tr><td> <code title="">filterres</code> </td><td> <code title="">filterRes</code>
2527: </td></tr><tr><td> <code title="">filterunits</code> </td><td> <code title="">filterUnits</code>
2528: </td></tr><tr><td> <code title="">glyphref</code> </td><td> <code title="">glyphRef</code>
2529: </td></tr><tr><td> <code title="">gradienttransform</code> </td><td> <code title="">gradientTransform</code>
2530: </td></tr><tr><td> <code title="">gradientunits</code> </td><td> <code title="">gradientUnits</code>
2531: </td></tr><tr><td> <code title="">kernelmatrix</code> </td><td> <code title="">kernelMatrix</code>
2532: </td></tr><tr><td> <code title="">kernelunitlength</code> </td><td> <code title="">kernelUnitLength</code>
2533: </td></tr><tr><td> <code title="">keypoints</code> </td><td> <code title="">keyPoints</code>
2534: </td></tr><tr><td> <code title="">keysplines</code> </td><td> <code title="">keySplines</code>
2535: </td></tr><tr><td> <code title="">keytimes</code> </td><td> <code title="">keyTimes</code>
2536: </td></tr><tr><td> <code title="">lengthadjust</code> </td><td> <code title="">lengthAdjust</code>
2537: </td></tr><tr><td> <code title="">limitingconeangle</code> </td><td> <code title="">limitingConeAngle</code>
2538: </td></tr><tr><td> <code title="">markerheight</code> </td><td> <code title="">markerHeight</code>
2539: </td></tr><tr><td> <code title="">markerunits</code> </td><td> <code title="">markerUnits</code>
2540: </td></tr><tr><td> <code title="">markerwidth</code> </td><td> <code title="">markerWidth</code>
2541: </td></tr><tr><td> <code title="">maskcontentunits</code> </td><td> <code title="">maskContentUnits</code>
2542: </td></tr><tr><td> <code title="">maskunits</code> </td><td> <code title="">maskUnits</code>
2543: </td></tr><tr><td> <code title="">numoctaves</code> </td><td> <code title="">numOctaves</code>
2544: </td></tr><tr><td> <code title="">pathlength</code> </td><td> <code title="">pathLength</code>
2545: </td></tr><tr><td> <code title="">patterncontentunits</code> </td><td> <code title="">patternContentUnits</code>
2546: </td></tr><tr><td> <code title="">patterntransform</code> </td><td> <code title="">patternTransform</code>
2547: </td></tr><tr><td> <code title="">patternunits</code> </td><td> <code title="">patternUnits</code>
2548: </td></tr><tr><td> <code title="">pointsatx</code> </td><td> <code title="">pointsAtX</code>
2549: </td></tr><tr><td> <code title="">pointsaty</code> </td><td> <code title="">pointsAtY</code>
2550: </td></tr><tr><td> <code title="">pointsatz</code> </td><td> <code title="">pointsAtZ</code>
2551: </td></tr><tr><td> <code title="">preservealpha</code> </td><td> <code title="">preserveAlpha</code>
2552: </td></tr><tr><td> <code title="">preserveaspectratio</code> </td><td> <code title="">preserveAspectRatio</code>
2553: </td></tr><tr><td> <code title="">primitiveunits</code> </td><td> <code title="">primitiveUnits</code>
2554: </td></tr><tr><td> <code title="">refx</code> </td><td> <code title="">refX</code>
2555: </td></tr><tr><td> <code title="">refy</code> </td><td> <code title="">refY</code>
2556: </td></tr><tr><td> <code title="">repeatcount</code> </td><td> <code title="">repeatCount</code>
2557: </td></tr><tr><td> <code title="">repeatdur</code> </td><td> <code title="">repeatDur</code>
2558: </td></tr><tr><td> <code title="">requiredextensions</code> </td><td> <code title="">requiredExtensions</code>
2559: </td></tr><tr><td> <code title="">requiredfeatures</code> </td><td> <code title="">requiredFeatures</code>
2560: </td></tr><tr><td> <code title="">specularconstant</code> </td><td> <code title="">specularConstant</code>
2561: </td></tr><tr><td> <code title="">specularexponent</code> </td><td> <code title="">specularExponent</code>
2562: </td></tr><tr><td> <code title="">spreadmethod</code> </td><td> <code title="">spreadMethod</code>
2563: </td></tr><tr><td> <code title="">startoffset</code> </td><td> <code title="">startOffset</code>
2564: </td></tr><tr><td> <code title="">stddeviation</code> </td><td> <code title="">stdDeviation</code>
2565: </td></tr><tr><td> <code title="">stitchtiles</code> </td><td> <code title="">stitchTiles</code>
2566: </td></tr><tr><td> <code title="">surfacescale</code> </td><td> <code title="">surfaceScale</code>
2567: </td></tr><tr><td> <code title="">systemlanguage</code> </td><td> <code title="">systemLanguage</code>
2568: </td></tr><tr><td> <code title="">tablevalues</code> </td><td> <code title="">tableValues</code>
2569: </td></tr><tr><td> <code title="">targetx</code> </td><td> <code title="">targetX</code>
2570: </td></tr><tr><td> <code title="">targety</code> </td><td> <code title="">targetY</code>
2571: </td></tr><tr><td> <code title="">textlength</code> </td><td> <code title="">textLength</code>
2572: </td></tr><tr><td> <code title="">viewbox</code> </td><td> <code title="">viewBox</code>
2573: </td></tr><tr><td> <code title="">viewtarget</code> </td><td> <code title="">viewTarget</code>
2574: </td></tr><tr><td> <code title="">xchannelselector</code> </td><td> <code title="">xChannelSelector</code>
2575: </td></tr><tr><td> <code title="">ychannelselector</code> </td><td> <code title="">yChannelSelector</code>
2576: </td></tr><tr><td> <code title="">zoomandpan</code> </td><td> <code title="">zoomAndPan</code>
2577: </td></tr></tbody></table><p>When the steps below require the user agent to <dfn id="adjust-foreign-attributes">adjust
2578: foreign attributes</dfn> for a token, then, if any of the attributes
2579: on the token match the strings given in the first column of the
2580: following table, let the attribute be a namespaced attribute, with
2581: the prefix being the string given in the corresponding cell in the
2582: second column, the local name being the string given in the
2583: corresponding cell in the third column, and the namespace being the
2584: namespace given in the corresponding cell in the fourth
2585: column. (This fixes the use of namespaced attributes, in particular
2586: <a href="elements.html#attr-xml-lang" title="attr-xml-lang"><code title="">lang</code> attributes in
2587: the <span>XML namespace</span></a>.)</p>
2588:
2589: <table><thead><tr><th> Attribute name </th><th> Prefix </th><th> Local name </th><th> Namespace
2590: </th></tr></thead><tbody><tr><td> <code title="">xlink:actuate</code> </td><td> <code title="">xlink</code> </td><td> <code title="">actuate</code> </td><td> <a href="namespaces.html#xlink-namespace">XLink namespace</a>
2591: </td></tr><tr><td> <code title="">xlink:arcrole</code> </td><td> <code title="">xlink</code> </td><td> <code title="">arcrole</code> </td><td> <a href="namespaces.html#xlink-namespace">XLink namespace</a>
2592: </td></tr><tr><td> <code title="">xlink:href</code> </td><td> <code title="">xlink</code> </td><td> <code title="">href</code> </td><td> <a href="namespaces.html#xlink-namespace">XLink namespace</a>
2593: </td></tr><tr><td> <code title="">xlink:role</code> </td><td> <code title="">xlink</code> </td><td> <code title="">role</code> </td><td> <a href="namespaces.html#xlink-namespace">XLink namespace</a>
2594: </td></tr><tr><td> <code title="">xlink:show</code> </td><td> <code title="">xlink</code> </td><td> <code title="">show</code> </td><td> <a href="namespaces.html#xlink-namespace">XLink namespace</a>
2595: </td></tr><tr><td> <code title="">xlink:title</code> </td><td> <code title="">xlink</code> </td><td> <code title="">title</code> </td><td> <a href="namespaces.html#xlink-namespace">XLink namespace</a>
2596: </td></tr><tr><td> <code title="">xlink:type</code> </td><td> <code title="">xlink</code> </td><td> <code title="">type</code> </td><td> <a href="namespaces.html#xlink-namespace">XLink namespace</a>
2597: </td></tr><tr><td> <code title="">xml:base</code> </td><td> <code title="">xml</code> </td><td> <code title="">base</code> </td><td> <a href="namespaces.html#xml-namespace">XML namespace</a> <!-- attr-xml-base -->
2598: </td></tr><tr><td> <code title="">xml:lang</code> </td><td> <code title="">xml</code> </td><td> <code title="">lang</code> </td><td> <a href="namespaces.html#xml-namespace">XML namespace</a>
2599: </td></tr><tr><td> <code title="">xml:space</code> </td><td> <code title="">xml</code> </td><td> <code title="">space</code> </td><td> <a href="namespaces.html#xml-namespace">XML namespace</a>
2600: </td></tr><tr><td> <code title="">xmlns</code> </td><td> (none) </td><td> <code title="">xmlns</code> </td><td> <a href="namespaces.html#xmlns-namespace">XMLNS namespace</a>
2601: </td></tr><tr><td> <code title="">xmlns:xlink</code> </td><td> <code title="">xmlns</code> </td><td> <code title="">xlink</code> </td><td> <a href="namespaces.html#xmlns-namespace">XMLNS namespace</a>
2602: </td></tr></tbody></table><hr><p>The <dfn id="generic-raw-text-element-parsing-algorithm">generic raw text element parsing algorithm</dfn> and the
2603: <dfn id="generic-rcdata-element-parsing-algorithm">generic RCDATA element parsing algorithm</dfn> consist of the
2604: following steps. These algorithms are always invoked in response to
2605: a start tag token.</p>
2606:
2607: <ol><li><p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p></li>
2608:
2609: <li><p>If the algorithm that was invoked is the <a href="#generic-raw-text-element-parsing-algorithm">generic raw
2610: text element parsing algorithm</a>, switch the tokenizer to the
2611: <a href="#rawtext-state">RAWTEXT state</a>; otherwise, the algorithm invoked
2612: was the <a href="#generic-rcdata-element-parsing-algorithm">generic RCDATA element parsing algorithm</a>,
2613: so switch the tokenizer to the <a href="#rcdata-state">RCDATA state</a>.</p></li>
2614:
2615: <li><p>Let the <a href="parsing.html#original-insertion-mode">original insertion mode</a> be the current
2616: <a href="parsing.html#insertion-mode">insertion mode</a>.</p>
2617:
2618: </li><li><p>Then, switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-incdata" title="insertion mode: text">text</a>".</p></li>
2619:
1.29 mike 2620: </ol><h5 id="closing-elements-that-have-implied-end-tags"><span class="secno">8.2.5.2 </span>Closing elements that have implied end tags</h5>
1.1 mike 2621:
2622: <p>When the steps below require the UA to <dfn id="generate-implied-end-tags">generate implied end
2623: tags</dfn>, then, while the <a href="parsing.html#current-node">current node</a> is a
2624: <code><a href="grouping-content.html#the-dd-element">dd</a></code> element, a <code><a href="grouping-content.html#the-dt-element">dt</a></code> element, an
2625: <code><a href="grouping-content.html#the-li-element">li</a></code> element, an <code><a href="the-button-element.html#the-option-element">option</a></code> element, an
2626: <code><a href="the-button-element.html#the-optgroup-element">optgroup</a></code> element, a <code><a href="grouping-content.html#the-p-element">p</a></code> element, an
2627: <code><a href="text-level-semantics.html#the-rp-element">rp</a></code> element, or an <code><a href="text-level-semantics.html#the-rt-element">rt</a></code> element, the UA must
2628: pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
2629: elements</a>.</p>
2630:
2631: <p>If a step requires the UA to generate implied end tags but lists
2632: an element to exclude from the process, then the UA must perform the
2633: above steps as if that element was not in the above list.</p>
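<p class="note">The rule above can be sketched as follows. This is a minimal illustration, assuming the stack of open elements is modelled as a Python list of tag names with the current node last; the names <code title="">generate_implied_end_tags</code> and <code title="">exclude</code> are hypothetical and not part of this specification.</p>

```python
# Elements whose end tags may be implied, as listed in the paragraph above.
IMPLIED_END_TAG_ELEMENTS = frozenset(
    {"dd", "dt", "li", "option", "optgroup", "p", "rp", "rt"}
)


def generate_implied_end_tags(open_elements, exclude=None):
    """Pop the current node while it is an element with an implied end tag.

    open_elements: list of tag names; the last item is the current node.
    exclude: optional tag name to treat as if it were not in the list.
    """
    names = IMPLIED_END_TAG_ELEMENTS - {exclude}
    while open_elements and open_elements[-1] in names:
        open_elements.pop()
    return open_elements
```

<p class="note">For example, with the stack <code title="">html, body, ul, li, p</code>, the sketch pops both <code title="">p</code> and <code title="">li</code>; when <code title="">li</code> is excluded, it pops only <code title="">p</code>.</p>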
2634:
2635:
1.29 mike 2636: <h5 id="foster-parenting"><span class="secno">8.2.5.3 </span>Foster parenting</h5>
1.1 mike 2637:
2638: <p>Foster parenting happens when content is misnested in tables.</p>
2639:
2640: <p>When a node <var title="">node</var> is to be <dfn id="foster-parent" title="foster
2641: parent">foster parented</dfn>, the node <var title="">node</var>
2642: must be inserted into the <i><a href="#foster-parent-element">foster parent element</a></i>.</p>
2643:
2644: <p>The <dfn id="foster-parent-element">foster parent element</dfn> is the parent element of the
2645: last <code><a href="tabular-data.html#the-table-element">table</a></code> element in the <a href="parsing.html#stack-of-open-elements">stack of open
2646: elements</a>, if there is a <code><a href="tabular-data.html#the-table-element">table</a></code> element and it has
1.11 mike 2647: such a parent element.</p>
2648:
2649: <p class="note">It might have no parent or some other kind of parent if
2650: a script manipulated the DOM after the element was inserted by the
2651: parser.</p>
2652:
2653: <p>If there is no <code><a href="tabular-data.html#the-table-element">table</a></code> element in the <a href="parsing.html#stack-of-open-elements">stack of
2654: open elements</a> (<a href="the-end.html#fragment-case">fragment case</a>), then the
2655: <i><a href="#foster-parent-element">foster parent element</a></i> is the first element in the <a href="parsing.html#stack-of-open-elements">stack
2656: of open elements</a> (the <code><a href="semantics.html#the-html-element-0">html</a></code> element). Otherwise,
2657: if there is a <code><a href="tabular-data.html#the-table-element">table</a></code> element in the <a href="parsing.html#stack-of-open-elements">stack of open
1.1 mike 2658: elements</a>, but the last <code><a href="tabular-data.html#the-table-element">table</a></code> element in the
2659: <a href="parsing.html#stack-of-open-elements">stack of open elements</a> has no parent, or its parent
1.11 mike 2660: node is not an element, then the <i><a href="#foster-parent-element">foster parent element</a></i> is the
2661: element before the last <code><a href="tabular-data.html#the-table-element">table</a></code> element in the
1.1 mike 2662: <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p>
2663:
2664: <p>If the <i><a href="#foster-parent-element">foster parent element</a></i> is the parent element of the
2665: last <code><a href="tabular-data.html#the-table-element">table</a></code> element in the <a href="parsing.html#stack-of-open-elements">stack of open
1.11 mike 2666: elements</a>, then <var title="">node</var> must be inserted into
2667: the <i><a href="#foster-parent-element">foster parent element</a></i>, immediately <em>before</em> the
2668: last <code><a href="tabular-data.html#the-table-element">table</a></code> element in the <a href="parsing.html#stack-of-open-elements">stack of open
2669: elements</a>; otherwise, <var title="">node</var> must be
1.1 mike 2670: <em>appended</em> to the <i><a href="#foster-parent-element">foster parent element</a></i>.</p>
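<p class="note">The foster parenting rules above can be sketched as follows. The <code title="">Node</code> class is a hypothetical stand-in for a DOM node (it is not a real DOM API), and the stack of open elements is assumed to be a list with the most recently opened element last.</p>

```python
class Node:
    """Hypothetical minimal DOM node used only for this illustration."""

    def __init__(self, name, is_element=True):
        self.name = name
        self.is_element = is_element
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)

    def insert_before(self, child, reference):
        child.parent = self
        self.children.insert(self.children.index(reference), child)


def foster_parent(node, open_elements):
    """Insert node following the foster parenting rules described above."""
    # Locate the last table element in the stack of open elements, if any.
    table_index = None
    for i in range(len(open_elements) - 1, -1, -1):
        if open_elements[i].name == "table":
            table_index = i
            break

    if table_index is None:
        # Fragment case: append to the first element in the stack (html).
        open_elements[0].append(node)
        return

    table = open_elements[table_index]
    parent = table.parent
    if parent is not None and parent.is_element:
        # The foster parent is the table's parent element:
        # insert immediately before the table.
        parent.insert_before(node, table)
    else:
        # No parent, or a non-element parent: append to the element
        # that precedes the table in the stack of open elements.
        open_elements[table_index - 1].append(node)
```

<p class="note">The sketch assumes that a <code title="">table</code> element is never the first entry in the stack, which holds when the stack starts with the <code title="">html</code> element.</p>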
2671:
2672:
2673:
1.29 mike 2674: <h5 id="the-initial-insertion-mode"><span class="secno">8.2.5.4 </span>The "<dfn title="insertion mode: initial">initial</dfn>" insertion mode</h5>
1.1 mike 2675:
2676: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#the-initial-insertion-mode" title="insertion
2677: mode: initial">initial</a>", tokens must be handled as follows:</p>
2678:
2679: <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
2680: TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
2681: U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
2682: <dd>
2683: <p>Ignore the token.</p>
2684: </dd>
2685:
2686: <dt>A comment token</dt>
2687: <dd>
2688: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <code><a href="infrastructure.html#document">Document</a></code>
2689: object with the <code title="">data</code> attribute set to the
2690: data given in the comment token.</p>
2691: </dd>
2692:
2693: <dt>A DOCTYPE token</dt>
2694: <dd>
2695:
2696: <p>If the DOCTYPE token's name is not a
2697: <a href="infrastructure.html#case-sensitive">case-sensitive</a> match for the string "<code title="">html</code>", or the token's public identifier is not
2698: missing, or the token's system identifier is neither missing nor a
2699: <a href="infrastructure.html#case-sensitive">case-sensitive</a> match for the string
2700: "<code><a href="urls.html#about:legacy-compat">about:legacy-compat</a></code>", and none of the sets of
2701: conditions in the following list are matched, then there is a
2702: <a href="parsing.html#parse-error">parse error</a>.</p>
2703:
2704: <ul><!-- only things that trigger no-quirks mode and were valid in
2705: some other spec are allowed in this list --><li>The DOCTYPE token's name is a <a href="infrastructure.html#case-sensitive">case-sensitive</a>
2706: match for the string "<code title="">html</code>", the token's
2707: public identifier is the <a href="infrastructure.html#case-sensitive">case-sensitive</a> string
2708: "<code title="">-//W3C//DTD HTML 4.0//EN</code>", and
2709: the token's system identifier is either missing or the
2710: <a href="infrastructure.html#case-sensitive">case-sensitive</a> string "<code title="">http://www.w3.org/TR/REC-html40/strict.dtd</code>".</li>
2711:
2712: <li>The DOCTYPE token's name is a <a href="infrastructure.html#case-sensitive">case-sensitive</a>
2713: match for the string "<code title="">html</code>", the token's
2714: public identifier is the <a href="infrastructure.html#case-sensitive">case-sensitive</a> string
2715: "<code title="">-//W3C//DTD HTML 4.01//EN</code>", and
2716: the token's system identifier is either missing or the
2717: <a href="infrastructure.html#case-sensitive">case-sensitive</a> string "<code title="">http://www.w3.org/TR/html4/strict.dtd</code>".</li>
2718:
2719: <li>The DOCTYPE token's name is a <a href="infrastructure.html#case-sensitive">case-sensitive</a>
2720: match for the string "<code title="">html</code>", the token's
2721: public identifier is the <a href="infrastructure.html#case-sensitive">case-sensitive</a> string
2722: "<code title="">-//W3C//DTD XHTML 1.0 Strict//EN</code>",
2723: and the token's system identifier is the
2724: <a href="infrastructure.html#case-sensitive">case-sensitive</a> string "<code title="">http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd</code>".</li>
2725:
2726: <li>The DOCTYPE token's name is a <a href="infrastructure.html#case-sensitive">case-sensitive</a>
2727: match for the string "<code title="">html</code>", the token's
2728: public identifier is the <a href="infrastructure.html#case-sensitive">case-sensitive</a> string
2729: "<code title="">-//W3C//DTD XHTML 1.1//EN</code>", and
2730: the token's system identifier is the <a href="infrastructure.html#case-sensitive">case-sensitive</a>
2731: string "<code title="">http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd</code>".</li>
2732:
2733: </ul><p>Conformance checkers may, based on the values (including
2734: presence or lack thereof) of the DOCTYPE token's name, public
2735: identifier, or system identifier, switch to a conformance checking
2736: mode for another language (e.g. based on the DOCTYPE token a
2737: conformance checker could recognize that the document is an
2738: HTML4-era document, and defer to an HTML4 conformance
2739: checker.)</p>
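<p class="note">The parse error conditions for a DOCTYPE token, given above, can be sketched as follows. This is an illustration only: the function name and the use of <code title="">None</code> to represent a missing identifier are assumptions, and all comparisons are case-sensitive as the text requires.</p>

```python
LEGACY_COMPAT = "about:legacy-compat"

# (public identifier, system identifier, whether the system identifier
# may also be missing), one entry per item in the list above.
PERMITTED_LEGACY_DOCTYPES = [
    ("-//W3C//DTD HTML 4.0//EN",
     "http://www.w3.org/TR/REC-html40/strict.dtd", True),
    ("-//W3C//DTD HTML 4.01//EN",
     "http://www.w3.org/TR/html4/strict.dtd", True),
    ("-//W3C//DTD XHTML 1.0 Strict//EN",
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd", False),
    ("-//W3C//DTD XHTML 1.1//EN",
     "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd", False),
]


def doctype_is_parse_error(name, public_id=None, system_id=None):
    """Return True if the DOCTYPE token is a parse error (None = missing)."""
    # The plain form: name "html", no public identifier, and a system
    # identifier that is either missing or about:legacy-compat.
    if name == "html" and public_id is None and system_id in (None, LEGACY_COMPAT):
        return False
    # Every permitted legacy form also requires the name "html".
    if name != "html":
        return True
    for pub, sys_id, sys_may_be_missing in PERMITTED_LEGACY_DOCTYPES:
        if public_id == pub and (
            system_id == sys_id or (sys_may_be_missing and system_id is None)
        ):
            return False
    return True
```

<p class="note">Under these assumptions, a bare <code title="">html</code> DOCTYPE is not a parse error, while an XHTML 1.1 public identifier with no system identifier is.</p>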
2740:
2741: <p>Append a <code><a href="infrastructure.html#documenttype">DocumentType</a></code> node to the
2742: <code><a href="infrastructure.html#document">Document</a></code> node, with the <code title="">name</code>
2743: attribute set to the name given in the DOCTYPE token, or the empty
2744: string if the name was missing; the <code title="">publicId</code>
2745: attribute set to the public identifier given in the DOCTYPE token,
2746: or the empty string if the public identifier was missing; the
2747: <code title="">systemId</code> attribute set to the system
2748: identifier given in the DOCTYPE token, or the empty string if the
2749: system identifier was missing; and the other attributes specific
2750: to <code><a href="infrastructure.html#documenttype">DocumentType</a></code> objects set to null and empty lists
2751: as appropriate. Associate the <code><a href="infrastructure.html#documenttype">DocumentType</a></code> node with
2752: the <code><a href="infrastructure.html#document">Document</a></code> object so that it is returned as the
2753: value of the <code title="">doctype</code> attribute of the
2754: <code><a href="infrastructure.html#document">Document</a></code> object.</p>
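<p class="note">The mapping from DOCTYPE token to node attributes described above can be sketched as follows. The <code title="">DocumentType</code> dataclass here is a hypothetical stand-in that carries only the three attributes named in the text; <code title="">None</code> is assumed to represent a missing value in the token.</p>

```python
from dataclasses import dataclass


@dataclass
class DocumentType:
    """Hypothetical stand-in holding only the attributes named above."""
    name: str
    publicId: str
    systemId: str


def document_type_from_token(name=None, public_id=None, system_id=None):
    """Build the node, replacing each missing (None) value with ""."""
    return DocumentType(
        name="" if name is None else name,
        publicId="" if public_id is None else public_id,
        systemId="" if system_id is None else system_id,
    )
```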
2755:
2756: <p id="quirks-mode-doctypes">Then, if the DOCTYPE token matches
2757: one of the conditions in the following list, set the
2758: <code><a href="infrastructure.html#document">Document</a></code> to <a href="dom.html#quirks-mode">quirks mode</a>:</p>
2759:
2760: <ul class="brief"><li> The <i>force-quirks flag</i> is set to <i>on</i>. </li>
2761: <li> The name is set to anything other than "<code title="">html</code>" (compared <a href="infrastructure.html#case-sensitive" title="case-sensitive">case-sensitively</a>). </li>
2762: <li> The public identifier starts with: "<code title="">+//Silmaril//dtd html Pro v0r11 19970101//<!--EN--></code>" </li>
2763: <li> The public identifier starts with: "<code title="">-//AdvaSoft Ltd//DTD HTML 3.0 asWedit + extensions//<!--EN--></code>" </li>
2764: <li> The public identifier starts with: "<code title="">-//AS//DTD HTML 3.0 asWedit + extensions//<!--EN--></code>" </li>
2765: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.0 Level 1//<!--EN--></code>" </li>
2766: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.0 Level 2//<!--EN--></code>" </li>
2767: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.0 Strict Level 1//<!--EN--></code>" </li>
2768: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.0 Strict Level 2//<!--EN--></code>" </li>
2769: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.0 Strict//<!--EN--></code>" </li>
2770: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.0//<!--EN--></code>" </li>
2771: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.1E//<!--EN--></code>" </li>
2772: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 3.0//<!--EN--></code>" </li>
2773: <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML 3.0//EN//</code>" </li>-->
2774: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 3.2 Final//<!--EN--></code>" </li>
2775: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 3.2//<!--EN--></code>" </li>
2776: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 3//<!--EN--></code>" </li>
2777: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Level 0//<!--EN--></code>" </li>
2778: <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Level 0//EN//2.0</code>" </li>-->
2779: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Level 1//<!--EN--></code>" </li>
2780: <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Level 1//EN//2.0</code>" </li>-->
2781: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Level 2//<!--EN--></code>" </li>
2782: <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Level 2//EN//2.0</code>" </li>-->
2783: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Level 3//<!--EN--></code>" </li>
2784: <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Level 3//EN//3.0</code>" </li>-->
2785: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Strict Level 0//<!--EN--></code>" </li>
2786: <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Strict Level 0//EN//2.0</code>" </li>-->
2787: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Strict Level 1//<!--EN--></code>" </li>
2788: <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Strict Level 1//EN//2.0</code>" </li>-->
2789: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Strict Level 2//<!--EN--></code>" </li>
2790: <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Strict Level 2//EN//2.0</code>" </li>-->
2791: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Strict Level 3//<!--EN--></code>" </li>
2792: <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Strict Level 3//EN//3.0</code>" </li>-->
2793: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Strict//<!--EN--></code>" </li>
2794: <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Strict//EN//2.0</code>" </li>-->
2795: <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Strict//EN//3.0</code>" </li>-->
2796: <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML//<!--EN--></code>" </li>
2797: <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML//EN//2.0</code>" </li>-->
2798: <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML//EN//3.0</code>" </li>-->
2799: <li> The public identifier starts with: "<code title="">-//Metrius//DTD Metrius Presentational//<!--EN--></code>" </li>
2800: <li> The public identifier starts with: "<code title="">-//Microsoft//DTD Internet Explorer 2.0 HTML Strict//<!--EN--></code>" </li>
2801: <li> The public identifier starts with: "<code title="">-//Microsoft//DTD Internet Explorer 2.0 HTML//<!--EN--></code>" </li>
2802: <li> The public identifier starts with: "<code title="">-//Microsoft//DTD Internet Explorer 2.0 Tables//<!--EN--></code>" </li>
2803: <li> The public identifier starts with: "<code title="">-//Microsoft//DTD Internet Explorer 3.0 HTML Strict//<!--EN--></code>" </li>
2804: <li> The public identifier starts with: "<code title="">-//Microsoft//DTD Internet Explorer 3.0 HTML//<!--EN--></code>" </li>
2805: <li> The public identifier starts with: "<code title="">-//Microsoft//DTD Internet Explorer 3.0 Tables//<!--EN--></code>" </li>
2806: <li> The public identifier starts with: "<code title="">-//Netscape Comm. Corp.//DTD HTML//<!--EN--></code>" </li>
2807: <li> The public identifier starts with: "<code title="">-//Netscape Comm. Corp.//DTD Strict HTML//<!--EN--></code>" </li>
2808: <li> The public identifier starts with: "<code title="">-//O'Reilly and Associates//DTD HTML 2.0//<!--EN--></code>" </li>
2809: <li> The public identifier starts with: "<code title="">-//O'Reilly and Associates//DTD HTML Extended 1.0//<!--EN--></code>" </li>
2810: <li> The public identifier starts with: "<code title="">-//O'Reilly and Associates//DTD HTML Extended Relaxed 1.0//<!--EN--></code>" </li>
2811: <li> The public identifier starts with: "<code title="">-//SoftQuad Software//DTD HoTMetaL PRO 6.0::19990601::extensions to HTML 4.0//<!--EN--></code>" </li>
2812: <li> The public identifier starts with: "<code title="">-//SoftQuad//DTD HoTMetaL PRO 4.0::19971010::extensions to HTML 4.0//<!--EN--></code>" </li>
2813: <li> The public identifier starts with: "<code title="">-//Spyglass//DTD HTML 2.0 Extended//<!--EN--></code>" </li>
2814: <li> The public identifier starts with: "<code title="">-//SQ//DTD HTML 2.0 HoTMetaL + extensions//<!--EN--></code>" </li>
2815: <li> The public identifier starts with: "<code title="">-//Sun Microsystems Corp.//DTD HotJava HTML//<!--EN--></code>" </li>
2816: <li> The public identifier starts with: "<code title="">-//Sun Microsystems Corp.//DTD HotJava Strict HTML//<!--EN--></code>" </li>
2817: <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 3 1995-03-24//<!--EN--></code>" </li>
2818: <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 3.2 Draft//<!--EN--></code>" </li>
2819: <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 3.2 Final//<!--EN--></code>" </li>
2820: <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 3.2//<!--EN--></code>" </li>
2821: <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 3.2S Draft//<!--EN--></code>" </li>
2822: <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 4.0 Frameset//<!--EN--></code>" </li>
2823: <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 4.0 Transitional//<!--EN--></code>" </li>
2824: <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML Experimental 19960712//<!--EN--></code>" </li>
2825: <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML Experimental 970421//<!--EN--></code>" </li>
2826: <li> The public identifier starts with: "<code title="">-//W3C//DTD W3 HTML//<!--EN--></code>" </li>
2827: <li> The public identifier starts with: "<code title="">-//W3O//DTD W3 HTML 3.0//<!--EN--></code>" </li>
2828: <!--<li> The public identifier is set to: "<code title="">-//W3O//DTD W3 HTML 3.0//EN//</code>" </li>-->
2829: <li> The public identifier is set to: "<code title="">-//W3O//DTD W3 HTML Strict 3.0//EN//</code>" </li>
2830: <li> The public identifier starts with: "<code title="">-//WebTechs//DTD Mozilla HTML 2.0//<!--EN--></code>" </li>
2831: <li> The public identifier starts with: "<code title="">-//WebTechs//DTD Mozilla HTML//<!--EN--></code>" </li>
2832: <li> The public identifier is set to: "<code title="">-/W3C/DTD HTML 4.0 Transitional/EN</code>" </li>
2833: <li> The public identifier is set to: "<code title="">HTML</code>" </li>
2834: <li> The system identifier is set to: "<code title="">http://www.ibm.com/data/dtd/v11/ibmxhtml1-transitional.dtd</code>" </li>
2835: <li> The system identifier is missing and the public identifier starts with: "<code title="">-//W3C//DTD HTML 4.01 Frameset//<!--EN--></code>" </li>
2836: <li> The system identifier is missing and the public identifier starts with: "<code title="">-//W3C//DTD HTML 4.01 Transitional//<!--EN--></code>" </li>
2837: </ul><p>Otherwise, if the DOCTYPE token matches one of the conditions
2838: in the following list, then set the <code><a href="infrastructure.html#document">Document</a></code> to
2839: <a href="dom.html#limited-quirks-mode">limited-quirks mode</a>:</p>
2840:
2841: <ul class="brief"><li> The public identifier starts with: "<code title="">-//W3C//DTD XHTML 1.0 Frameset//<!--EN--></code>" </li>
2842: <li> The public identifier starts with: "<code title="">-//W3C//DTD XHTML 1.0 Transitional//<!--EN--></code>" </li>
2843: <li> The system identifier is not missing and the public identifier starts with: "<code title="">-//W3C//DTD HTML 4.01 Frameset//<!--EN--></code>" </li>
2844: <li> The system identifier is not missing and the public identifier starts with: "<code title="">-//W3C//DTD HTML 4.01 Transitional//<!--EN--></code>" </li>
2845: </ul><p>The system identifier and public identifier strings must be
2846: compared to the values given in the lists above in an <a href="infrastructure.html#ascii-case-insensitive">ASCII
2847: case-insensitive</a> manner. A system identifier whose value is
2848: the empty string is not considered missing for the purposes of the
2849: conditions above.</p>
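<p class="note">The mode selection described above can be sketched as follows. This is an illustration under stated assumptions: the prefix tuple <code title="">QUIRKS_PUBLIC_PREFIXES</code> is deliberately abbreviated to a few entries from the full list, <code title="">None</code> represents a missing identifier, and the function and constant names are hypothetical.</p>

```python
import string

# ASCII case-insensitive comparison: fold only A-Z, leave other characters.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(s):
    return s.translate(_ASCII_LOWER)


# Abbreviated: a real implementation would carry every prefix listed above.
QUIRKS_PUBLIC_PREFIXES = (
    "-//ietf//dtd html 2.0//",
    "-//w3c//dtd html 3.2 final//",
    "-//w3c//dtd html 4.0 frameset//",
    "-//w3c//dtd html 4.0 transitional//",
)
QUIRKS_PUBLIC_EXACT = (
    "-//w3o//dtd w3 html strict 3.0//en//",
    "-/w3c/dtd html 4.0 transitional/en",
    "html",
)
QUIRKS_SYSTEM_EXACT = (
    "http://www.ibm.com/data/dtd/v11/ibmxhtml1-transitional.dtd",
)
# Quirks when the system identifier is missing, limited-quirks otherwise.
HTML401_PREFIXES = (
    "-//w3c//dtd html 4.01 frameset//",
    "-//w3c//dtd html 4.01 transitional//",
)
LIMITED_QUIRKS_PUBLIC_PREFIXES = (
    "-//w3c//dtd xhtml 1.0 frameset//",
    "-//w3c//dtd xhtml 1.0 transitional//",
)


def document_mode(name, public_id=None, system_id=None, force_quirks=False):
    """Return "quirks", "limited-quirks" or "no-quirks" for a DOCTYPE token."""
    pub = None if public_id is None else ascii_lower(public_id)
    sys_id = None if system_id is None else ascii_lower(system_id)

    # The name is compared case-sensitively.
    if force_quirks or name != "html":
        return "quirks"
    if pub is not None:
        if pub.startswith(QUIRKS_PUBLIC_PREFIXES) or pub in QUIRKS_PUBLIC_EXACT:
            return "quirks"
        if sys_id is None and pub.startswith(HTML401_PREFIXES):
            return "quirks"
    if sys_id in QUIRKS_SYSTEM_EXACT:
        return "quirks"
    if pub is not None:
        if pub.startswith(LIMITED_QUIRKS_PUBLIC_PREFIXES):
            return "limited-quirks"
        # An empty-string system identifier is not "missing".
        if sys_id is not None and pub.startswith(HTML401_PREFIXES):
            return "limited-quirks"
    return "no-quirks"
```

<p class="note">In this sketch, an HTML 4.01 Transitional public identifier yields quirks mode when the system identifier is missing, and limited-quirks mode when a system identifier is present, even if it is the empty string.</p>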
2850:
2851: <p>Then, switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#the-before-html-insertion-mode" title="insertion mode: before html">before html</a>".</p>
2852:
2853: </dd>
2854:
2855: <dt>Anything else</dt>
2856: <dd>
2857:
2858: <p>If the document is <em>not</em> <a href="the-iframe-element.html#an-iframe-srcdoc-document">an <code>iframe</code>
2859: <code title="attr-iframe-srcdoc">srcdoc</code> document</a>,
2860: then this is a <a href="parsing.html#parse-error">parse error</a>; set the
2861: <code><a href="infrastructure.html#document">Document</a></code> to <a href="dom.html#quirks-mode">quirks mode</a>.</p>
2862:
2863: <p>In any case, switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#the-before-html-insertion-mode" title="insertion mode: before html">before html</a>", then
2864: reprocess the current token.</p>
2865:
2866: </dd>
2867:
1.29 mike 2868: </dl><h5 id="the-before-html-insertion-mode"><span class="secno">8.2.5.5 </span>The "<dfn title="insertion mode: before html">before html</dfn>" insertion mode</h5>
1.1 mike 2869:
2870: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#the-before-html-insertion-mode" title="insertion
2871: mode: before html">before html</a>", tokens must be handled as follows:</p>
2872:
2873: <dl class="switch"><dt>A DOCTYPE token</dt>
2874: <dd>
2875: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
2876: </dd>
2877:
2878: <dt>A comment token</dt>
2879: <dd>
2880: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <code><a href="infrastructure.html#document">Document</a></code>
2881: object with the <code title="">data</code> attribute set to the
2882: data given in the comment token.</p>
2883: </dd>
2884:
2885: <dt>A character token that is one of U+0009 CHARACTER
2886: TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
2887: U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
2888: <dd>
2889: <p>Ignore the token.</p>
2890: </dd>
2891:
2892: <dt>A start tag whose tag name is "html"</dt>
2893: <dd>
2894:
2895: <p><a href="#create-an-element-for-the-token">Create an element for the token</a> in the <a href="namespaces.html#html-namespace-0">HTML
2896: namespace</a>. Append it to the <code><a href="infrastructure.html#document">Document</a></code>
2897: object. Put this element in the <a href="parsing.html#stack-of-open-elements">stack of open
2898: elements</a>.</p>
2899:
2900: <p id="parser-appcache">If the <code><a href="infrastructure.html#document">Document</a></code> is being
2901: loaded as part of <a href="history.html#navigate" title="navigate">navigation</a> of a
2902: <a href="browsers.html#browsing-context">browsing context</a>, then: if the newly created element
2903: has a <code title="attr-html-manifest"><a href="semantics.html#attr-html-manifest">manifest</a></code> attribute
2904: whose value is not the empty string, then <a href="urls.html#resolve-a-url" title="resolve a
2905: url">resolve</a> the value of that attribute to an
2906: <a href="urls.html#absolute-url">absolute URL</a>, relative to the newly created element,
2907: and if that is successful, run the <a href="offline.html#concept-appcache-init" title="concept-appcache-init">application cache selection
2908: algorithm</a> with the resulting <a href="urls.html#absolute-url">absolute URL</a> with
2909: any <a href="urls.html#url-fragment" title="url-fragment">&lt;fragment&gt;</a> component
2910: removed; otherwise, if there is no such attribute, or its value is
2911: the empty string, or resolving its value fails, run the <a href="offline.html#concept-appcache-init" title="concept-appcache-init">application cache selection
2912: algorithm</a> with no manifest. The algorithm must be passed
2913: the <code><a href="infrastructure.html#document">Document</a></code> object.</p>
2914:
2915: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#the-before-head-insertion-mode" title="insertion mode: before head">before head</a>".</p>
2916:
2917: </dd>
2918:
2919: <dt>An end tag whose tag name is one of: "head", "body", "html", "br"</dt>
2920: <dd>
2921: <p>Act as described in the "anything else" entry below.</p>
2922: </dd>
2923:
2924: <dt>Any other end tag</dt>
2925: <dd>
2926: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
2927: </dd>
2928:
2929: <dt>Anything else</dt>
2930: <dd>
2931:
2932: <p>Create an <code><a href="semantics.html#the-html-element-0">html</a></code> element. Append it to the
2933: <code><a href="infrastructure.html#document">Document</a></code> object. Put this element in the <a href="parsing.html#stack-of-open-elements">stack
2934: of open elements</a>.</p>
2935:
2936: <p>If the <code><a href="infrastructure.html#document">Document</a></code> is being loaded as part of <a href="history.html#navigate" title="navigate">navigation</a> of a <a href="browsers.html#browsing-context">browsing
2937: context</a>, then: run the <a href="offline.html#concept-appcache-init" title="concept-appcache-init">application cache selection
2938: algorithm</a> with no manifest, passing it the
2939: <code><a href="infrastructure.html#document">Document</a></code> object.</p>
2940:
2941: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#the-before-head-insertion-mode" title="insertion mode: before head">before head</a>", then
2942: reprocess the current token.</p>
2943:
2944: </dd>
2945:
2946: </dl><p>The root element can end up being removed from the
2947: <code><a href="infrastructure.html#document">Document</a></code> object, e.g. by scripts; nothing in particular
2948: happens in such cases; content continues being appended to the nodes
2949: as described in the next section.</p>
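<p class="note">The "before html" rules can be sketched as a single dispatch function. This is a minimal illustration, not a conforming implementation: the <code title="">Token</code> and <code title="">Parser</code> classes and the <code title="">before_html</code> function are hypothetical names, elements are modelled as plain tuples, attributes are dropped, and the application cache selection step is omitted entirely.</p>

```python
from dataclasses import dataclass, field

WHITESPACE = {"\t", "\n", "\f", "\r", " "}


@dataclass
class Token:
    kind: str        # "doctype", "comment", "character", "start", "end" or "eof"
    name: str = ""   # tag name, for start and end tags
    data: str = ""   # comment text, or the single character


@dataclass
class Parser:
    document: list = field(default_factory=list)       # children of the Document
    open_elements: list = field(default_factory=list)  # stack of open elements
    mode: str = "before html"
    errors: int = 0


def before_html(parser, token):
    """Handle one token; return True if it must be reprocessed in the new mode."""
    if token.kind == "doctype":
        parser.errors += 1          # parse error, token ignored
        return False
    if token.kind == "comment":
        parser.document.append(("comment", token.data))
        return False
    if token.kind == "character" and token.data in WHITESPACE:
        return False                # ignored
    if token.kind == "start" and token.name == "html":
        element = ("element", "html")
        parser.document.append(element)
        parser.open_elements.append(element)
        parser.mode = "before head"
        return False
    if token.kind == "end" and token.name not in {"head", "body", "html", "br"}:
        parser.errors += 1          # any other end tag: parse error, ignored
        return False
    # Anything else: synthesize the html element, then reprocess the token.
    element = ("element", "html")
    parser.document.append(element)
    parser.open_elements.append(element)
    parser.mode = "before head"
    return True
```

<p class="note">Returning a flag rather than recursing keeps "reprocess the current token" in the hands of the caller's main loop, which is one possible design among several.</p>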
2950:
2951:
1.29 mike 2952: <h5 id="the-before-head-insertion-mode"><span class="secno">8.2.5.6 </span>The "<dfn title="insertion mode: before head">before head</dfn>" insertion mode</h5>
1.1 mike 2953:
2954: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#the-before-head-insertion-mode" title="insertion
2955: mode: before head">before head</a>", tokens must be handled as follows:</p>
2956:
2957: <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
2958: TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
2959: U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
2960: <dd>
2961: <p>Ignore the token.</p> <!-- :-( -->
2962: </dd>
2963:
2964: <dt>A comment token</dt>
2965: <dd>
2966: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current
2967: node</a> with the <code title="">data</code> attribute set to
2968: the data given in the comment token.</p>
2969: </dd>
2970:
2971: <dt>A DOCTYPE token</dt>
2972: <dd>
2973: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
2974: </dd>
2975:
2976: <dt>A start tag whose tag name is "html"</dt>
2977: <dd>
2978: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
2979: mode</a>.</p>
2980: </dd>
2981:
2982: <dt>A start tag whose tag name is "head"</dt>
2983: <dd>
2984:
2985: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
2986:
2987: <p>Set the <a href="parsing.html#head-element-pointer"><code title="">head</code> element pointer</a>
2988: to the newly created <code><a href="semantics.html#the-head-element-0">head</a></code> element.</p>
2989:
2990: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>".</p>
2991:
2992: </dd>
2993:
2994: <dt>An end tag whose tag name is one of: "head", "body", "html", "br"</dt>
2995: <dd>
2996:
2997: <p>Act as if a start tag token with the tag name "head" and no
2998: attributes had been seen, then reprocess the current token.</p>
2999:
3000: </dd>
3001:
3002: <dt>Any other end tag</dt>
3003: <dd>
3004:
3005: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
3006:
3007: </dd>
3008:
3009: <dt>Anything else</dt>
3010: <dd>
3011:
3012: <p>Act as if a start tag token with the tag name "head" and no
3013: attributes had been seen, then reprocess the current
3014: token.</p>
3015:
3016: </dd>
3017:
1.29 mike 3018: </dl><h5 id="parsing-main-inhead"><span class="secno">8.2.5.7 </span>The "<dfn title="insertion mode: in head">in head</dfn>" insertion mode</h5>
1.1 mike 3019:
3020: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inhead" title="insertion
3021: mode: in head">in head</a>", tokens must be handled as follows:</p>
3022:
3023: <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
3024: TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
3025: U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
3026: <dd>
3027: <p><a href="#insert-a-character" title="insert a character">Insert the character</a> into
3028: the <a href="parsing.html#current-node">current node</a>.</p>
3029: </dd>
3030:
3031: <dt>A comment token</dt>
3032: <dd>
3033: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current
3034: node</a> with the <code title="">data</code> attribute set to
3035: the data given in the comment token.</p>
3036: </dd>
3037:
3038: <dt>A DOCTYPE token</dt>
3039: <dd>
3040: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
3041: </dd>
3042:
3043: <dt>A start tag whose tag name is "html"</dt>
3044: <dd>
3045: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
3046: mode</a>.</p>
3047: </dd>
3048:
1.10 mike 3049: <dt>A start tag whose tag name is one of: "base", "basefont",
3050: "bgsound", "command", "link"</dt>
1.1 mike 3051: <dd>
3052:
3053: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token. Immediately
3054: pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
3055: elements</a>.</p>
3056:
3057: <p><a href="#acknowledge-self-closing-flag" title="acknowledge self-closing flag">Acknowledge the
3058: token's <i>self-closing flag</i></a>, if it is set.</p>
3059:
3060: </dd>
3061:
3062: <dt>A start tag whose tag name is "meta"</dt>
3063: <dd>
3064:
3065: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token. Immediately
3066: pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
3067: elements</a>.</p>
3068:
3069: <p><a href="#acknowledge-self-closing-flag" title="acknowledge self-closing flag">Acknowledge the
3070: token's <i>self-closing flag</i></a>, if it is set.</p>
3071:
1.22 mike 3072: <p id="meta-charset-during-parse">If the element has a <code title="attr-meta-charset"><a href="semantics.html#attr-meta-charset">charset</a></code> attribute, and its value
3073: is either a supported <a href="infrastructure.html#ascii-compatible-character-encoding">ASCII-compatible character
3074: encoding</a> or a UTF-16 encoding, and the <a href="parsing.html#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> is currently
3075: <i>tentative</i>, then <a href="parsing.html#change-the-encoding">change the encoding</a> to the
3076: encoding given by the value of the <code title="attr-meta-charset"><a href="semantics.html#attr-meta-charset">charset</a></code> attribute.</p>
1.1 mike 3077:
3078: <p>Otherwise, if the element has an <code title="attr-meta-http-equiv"><a href="semantics.html#attr-meta-http-equiv">http-equiv</a></code> attribute whose
3079: value is an <a href="infrastructure.html#ascii-case-insensitive">ASCII case-insensitive</a> match for the
3080: string "<code title="">Content-Type</code>", and the element has a
3081: <code title="attr-meta-content"><a href="semantics.html#attr-meta-content">content</a></code> attribute, and
3082: applying the <a href="fetching-resources.html#algorithm-for-extracting-an-encoding-from-a-content-type">algorithm for extracting an encoding from a
3083: Content-Type</a> to that attribute's value returns a supported
3084: encoding <var title="">encoding</var>, and the <a href="parsing.html#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> is currently
3085: <i>tentative</i>, then <a href="parsing.html#change-the-encoding">change the encoding</a> to the
3086: encoding <var title="">encoding</var>.</p>
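<p class="note">The two conditions above can be read as one decision function. The sketch below is illustrative only and rests on several assumptions: <code title="">SUPPORTED</code> is a hypothetical stand-in for the user agent's table of supported encodings (it does not distinguish ASCII-compatible encodings from UTF-16), <code title="">extract_encoding</code> is a heavily simplified substitute for the real algorithm for extracting an encoding from a Content-Type, and attributes are modelled as a dictionary with lowercase names.</p>

```python
import re

# Hypothetical stand-in for the encodings this user agent supports.
SUPPORTED = {"utf-8", "windows-1252", "iso-8859-2", "utf-16"}


def extract_encoding(content):
    """Simplified stand-in: find charset=VALUE and return VALUE lowercased."""
    match = re.search(r"charset\s*=\s*([^\s;]+)", content, re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).strip("\"'").lower()


def encoding_from_meta(attrs, confidence):
    """Return the encoding to change to, or None to leave the encoding alone."""
    if confidence != "tentative":
        return None
    charset = attrs.get("charset")
    if charset is not None and charset.strip().lower() in SUPPORTED:
        return charset.strip().lower()
    # "Otherwise": fall back to http-equiv="Content-Type" with a content attribute.
    if attrs.get("http-equiv", "").lower() == "content-type" and "content" in attrs:
        encoding = extract_encoding(attrs["content"])
        if encoding in SUPPORTED:
            return encoding
    return None
```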
3087:
3088: </dd>
3089:
3090: <dt>A start tag whose tag name is "title"</dt>
3091: <dd>
3092: <p>Follow the <a href="#generic-rcdata-element-parsing-algorithm">generic RCDATA element parsing algorithm</a>.</p>
3093: </dd>
3094:
3095: <dt>A start tag whose tag name is "noscript", if the <a href="parsing.html#scripting-flag">scripting flag</a> is enabled</dt>
3096: <dt>A start tag whose tag name is one of: "noframes", "style"</dt>
3097: <dd>
3098: <p>Follow the <a href="#generic-raw-text-element-parsing-algorithm">generic raw text element parsing algorithm</a>.</p>
3099: </dd>
3100:
3101: <dt>A start tag whose tag name is "noscript", if the <a href="parsing.html#scripting-flag">scripting flag</a> is disabled</dt>
3102: <dd>
3103:
3104: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
3105:
3106: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inheadnoscript" title="insertion mode: in head noscript">in head
3107: noscript</a>".</p>
3108:
3109: </dd>
3110:
3111: <dt id="scriptTag">A start tag whose tag name is "script"</dt>
3112: <dd>
3113:
3114: <p>Run these steps:</p>
3115:
3116: <ol><li><p><a href="#create-an-element-for-the-token">Create an element for the token</a> in the
3117: <a href="namespaces.html#html-namespace-0">HTML namespace</a>.</p></li>
3118:
3119: <li>
3120:
3121: <p>Mark the element as being <a href="scripting-1.html#parser-inserted">"parser-inserted"</a>.</p>
3122:
3123: <p class="note">This ensures that, if the script is external,
3124: any <code title="dom-document-write"><a href="apis-in-html-documents.html#dom-document-write">document.write()</a></code>
3125: calls in the script will execute in-line, instead of blowing the
3126: document away, as would happen in most other cases. It also
3127: prevents the script from executing until the end tag is
3128: seen.</p>
3129:
3130: </li>
3131:
3132: <li><p>If the parser was originally created for the <a href="the-end.html#html-fragment-parsing-algorithm">HTML
3133: fragment parsing algorithm</a>, then mark the
3134: <code><a href="scripting-1.html#script">script</a></code> element as <a href="scripting-1.html#already-started">"already
3135: started"</a>. (<a href="the-end.html#fragment-case">fragment case</a>)</p></li>
3136:
3137: <li><p>Append the new element to the <a href="parsing.html#current-node">current node</a>
3138: and push it onto the <a href="parsing.html#stack-of-open-elements">stack of open
3139: elements</a>.</p></li>
3140:
3141: <li><p>Switch the tokenizer to the <a href="#script-data-state">script data
3142: state</a>.</p></li>
3143:
3144: <li><p>Let the <a href="parsing.html#original-insertion-mode">original insertion mode</a> be the current
3145: <a href="parsing.html#insertion-mode">insertion mode</a>.</p>
3146:
3147: </li><li><p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-incdata" title="insertion mode: text">text</a>".</p></li>
3148:
3149: </ol></dd>
3150:
3151: <dt>An end tag whose tag name is "head"</dt>
3152: <dd>
3153:
3154: <p>Pop the <a href="parsing.html#current-node">current node</a> (which will be the
3155: <code><a href="semantics.html#the-head-element-0">head</a></code> element) off the <a href="parsing.html#stack-of-open-elements">stack of open
3156: elements</a>.</p>
3157:
3158: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#the-after-head-insertion-mode" title="insertion mode: after head">after head</a>".</p>
3159:
3160: </dd>
3161:
3162: <dt>An end tag whose tag name is one of: "body", "html", "br"</dt>
3163: <dd>
3164: <p>Act as described in the "anything else" entry below.</p>
3165: </dd>
3166:
3167: <dt>A start tag whose tag name is "head"</dt>
3168: <dt>Any other end tag</dt>
3169: <dd>
3170: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
3171: </dd>
3172:
3173: <dt>Anything else</dt>
3174: <dd>
3175:
3176: <!-- can't get here with an EOF and a fragment case -->
3177:
3178: <p>Act as if an end tag token with the tag name "head" had
3179: been seen, and reprocess the current token.</p>
3180:
3181: </dd>
3182:
1.29 mike 3183: </dl><h5 id="parsing-main-inheadnoscript"><span class="secno">8.2.5.8 </span>The "<dfn title="insertion mode: in head noscript">in head noscript</dfn>" insertion mode</h5>
1.1 mike 3184:
3185: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inheadnoscript" title="insertion
3186: mode: in head noscript">in head noscript</a>", tokens must be handled as follows:</p>
3187:
3188: <dl class="switch"><dt>A DOCTYPE token</dt>
3189: <dd>
3190: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
3191: </dd>
3192:
3193: <dt>A start tag whose tag name is "html"</dt>
3194: <dd>
3195: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
3196: mode</a>.</p>
3197: </dd>
3198:
3199: <dt>An end tag whose tag name is "noscript"</dt>
3200: <dd>
3201:
3202: <p>Pop the <a href="parsing.html#current-node">current node</a> (which will be a
3203: <code><a href="scripting-1.html#the-noscript-element">noscript</a></code> element) from the <a href="parsing.html#stack-of-open-elements">stack of open
3204: elements</a>; the new <a href="parsing.html#current-node">current node</a> will be a
3205: <code><a href="semantics.html#the-head-element-0">head</a></code> element.</p>
3206:
3207: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>".</p>
3208:
3209: </dd>
3210:
3211: <dt>A character token that is one of U+0009 CHARACTER
3212: TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
3213: U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
3214: <dt>A comment token</dt>
1.10 mike 3215: <dt>A start tag whose tag name is one of: "basefont", "bgsound",
3216: "link", "meta", "noframes", "style"</dt>
1.1 mike 3217: <dd>
3218: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>" <a href="parsing.html#insertion-mode">insertion
3219: mode</a>.</p>
3220: </dd>
3221:
3222: <dt>An end tag whose tag name is "br"</dt>
3223: <dd>
3224: <p>Act as described in the "anything else" entry below.</p>
3225: </dd>
3226:
3227: <dt>A start tag whose tag name is one of: "head", "noscript"</dt>
3228: <dt>Any other end tag</dt>
3229: <dd>
3230: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
3231: </dd>
3232:
3233: <dt>Anything else</dt>
3234: <dd>
3235:
3236: <!-- can't get here with an EOF and a fragment case -->
3237:
3238: <p><a href="parsing.html#parse-error">Parse error</a>. Act as if an end tag with the tag
3239: name "noscript" had been seen and reprocess the current
3240: token.</p>
3241:
3242: </dd>
3243:
1.29 mike 3244: </dl><h5 id="the-after-head-insertion-mode"><span class="secno">8.2.5.9 </span>The "<dfn title="insertion mode: after head">after head</dfn>" insertion mode</h5>
1.1 mike 3245:
3246: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#the-after-head-insertion-mode" title="insertion
3247: mode: after head">after head</a>", tokens must be handled as follows:</p>
3248:
3249: <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
3250: TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
3251: U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
3252: <dd>
3253: <p><a href="#insert-a-character" title="insert a character">Insert the character</a> into
3254: the <a href="parsing.html#current-node">current node</a>.</p>
3255: </dd>
3256:
3257: <dt>A comment token</dt>
3258: <dd>
3259: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current
3260: node</a> with the <code title="">data</code> attribute set to
3261: the data given in the comment token.</p>
3262: </dd>
3263:
3264: <dt>A DOCTYPE token</dt>
3265: <dd>
3266: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
3267: </dd>
3268:
3269: <dt>A start tag whose tag name is "html"</dt>
3270: <dd>
3271: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
3272: mode</a>.</p>
3273: </dd>
3274:
3275: <dt>A start tag whose tag name is "body"</dt>
3276: <dd>
3277:
3278: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
3279:
3280: <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
3281:
3282: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>".</p>
3283:
3284: </dd>
3285:
3286: <dt>A start tag whose tag name is "frameset"</dt>
3287: <dd>
3288:
3289: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
3290:
3291: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inframeset" title="insertion mode: in frameset">in frameset</a>".</p>
3292:
3293: </dd>
3294:
1.10 mike 3295: <dt>A start tag token whose tag name is one of: "base", "basefont",
3296: "bgsound", "link", "meta", "noframes", "script", "style",
3297: "title"</dt>
1.1 mike 3298: <dd>
3299:
3300: <p><a href="parsing.html#parse-error">Parse error</a>.</p>
3301:
3302: <p>Push the node pointed to by the <a href="parsing.html#head-element-pointer"><code title="">head</code> element pointer</a> onto the
3303: <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p>
3304:
3305: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>" <a href="parsing.html#insertion-mode">insertion
3306: mode</a>.</p>
3307:
3308: <p>Remove the node pointed to by the <a href="parsing.html#head-element-pointer"><code title="">head</code> element pointer</a> from the <a href="parsing.html#stack-of-open-elements">stack
3309: of open elements</a>.</p>
3310:
3311: <p class="note">The <a href="parsing.html#head-element-pointer"><code title="">head</code> element
3312: pointer</a> cannot be null at this point.</p>
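<p class="note">The push, process, remove sequence can be sketched as follows. The parser is modelled as a dictionary and <code title="">in_head_rules</code> is a hypothetical callback standing in for the "in head" rules; neither is defined by this specification. The sketch removes the <code title="">head</code> element by identity rather than popping, because the "in head" rules may leave another element (for example <code title="">title</code> or <code title="">script</code>) above it on the stack.</p>

```python
def process_stray_head_content(parser, token, in_head_rules):
    """Handle a head-only start tag seen in the "after head" insertion mode."""
    parser["errors"] += 1                      # parse error
    head = parser["head_pointer"]              # cannot be None at this point
    stack = parser["open_elements"]
    stack.append(head)                         # temporarily re-open head
    in_head_rules(parser, token)               # may push further elements
    # Remove that exact node, wherever it now sits in the stack.
    for index in range(len(stack) - 1, -1, -1):
        if stack[index] is head:
            del stack[index]
            break
```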
3313:
3314: </dd>
3315:
3316: <dt>An end tag whose tag name is one of: "body", "html", "br"</dt>
3317: <dd>
3318: <p>Act as described in the "anything else" entry below.</p>
3319: </dd>
3320:
3321: <dt>A start tag whose tag name is "head"</dt>
3322: <dt>Any other end tag</dt>
3323: <dd>
3324: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
3325: </dd>
3326:
3327: <dt>Anything else</dt>
3328: <dd>
3329: <p>Act as if a start tag token with the tag name "body" and no
3330: attributes had been seen, then set the <a href="parsing.html#frameset-ok-flag">frameset-ok
3331: flag</a> back to "ok", and then reprocess the current
3332: token.</p>
3333: </dd>
3334:
1.29 mike 3335: </dl><h5 id="parsing-main-inbody"><span class="secno">8.2.5.10 </span>The "<dfn title="insertion mode: in body">in body</dfn>" insertion mode</h5>
1.1 mike 3336:
3337: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inbody" title="insertion
3338: mode: in body">in body</a>", tokens must be handled as follows:</p>
3339:
3340: <dl class="switch"><dt>A character token</dt>
3341: <dd>
3342:
3343: <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
3344: any.</p>
3345:
3346: <p><a href="#insert-a-character" title="insert a character">Insert the token's
3347: character</a> into the <a href="parsing.html#current-node">current node</a>.</p>
3348:
3349: <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
3350: LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
1.7 mike 3351: (CR), U+0020 SPACE, or U+FFFD REPLACEMENT CHARACTER, then set the
3352: <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
3353:
3354: <!-- U+FFFD REPLACEMENT CHARACTER is in this list because the
3355: D-Link DSL-G604T ADSL router has a zero byte in its
3356: configuration UI before a <frameset>. Zero bytes get
3357: converted to U+FFFD, which (without that character in this
3358: list) would mean the <frameset> would be ignored.
3359: refs: https://bugzilla.mozilla.org/show_bug.cgi?id=563526
3360: http://www.w3.org/Bugs/Public/show_bug.cgi?id=9659
3361: -->
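<p class="note">The effect of a character token on the frameset-ok flag can be sketched as a pure function. The names <code title="">FRAMESET_NEUTRAL</code> and <code title="">frameset_ok_after</code> are hypothetical, and the flag is modelled as a boolean (<code title="">True</code> for "ok", <code title="">False</code> for "not ok").</p>

```python
# Characters that leave the frameset-ok flag untouched.
FRAMESET_NEUTRAL = {"\u0009", "\u000A", "\u000C", "\u000D", "\u0020", "\uFFFD"}


def frameset_ok_after(character, frameset_ok):
    """Return the flag's value after a character token is inserted in "in body"."""
    if character not in FRAMESET_NEUTRAL:
        return False          # any other character sets the flag to "not ok"
    return frameset_ok        # neutral characters never change the flag
```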
1.1 mike 3362:
3363: </dd>
3364:
3365: <dt>A comment token</dt>
3366: <dd>
3367: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current
3368: node</a> with the <code title="">data</code> attribute set to
3369: the data given in the comment token.</p>
3370: </dd>
3371:
3372: <dt>A DOCTYPE token</dt>
3373: <dd>
3374: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
3375: </dd>
3376:
3377: <dt>A start tag whose tag name is "html"</dt>
3378: <dd>
3379: <p><a href="parsing.html#parse-error">Parse error</a>. For each attribute on the token,
3380: check to see if the attribute is already present on the top
3381: element of the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>. If it is not,
3382: add the attribute and its corresponding value to that element.</p>
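<p class="note">The attribute merge described here only adds attributes that are missing and never overwrites an existing value. A minimal sketch, assuming attributes are modelled as dictionaries and using the hypothetical name <code title="">merge_missing_attributes</code>:</p>

```python
def merge_missing_attributes(element_attrs, token_attrs):
    """Copy onto the element only those token attributes it does not already have."""
    for name, value in token_attrs.items():
        if name not in element_attrs:
            element_attrs[name] = value
    return element_attrs
```

<p class="note">With this model, an element that already has <code title="">lang="en"</code> keeps that value even if a later <code title="">html</code> start tag carries <code title="">lang="fr"</code>.</p>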
3383: </dd>
3384:
1.10 mike 3385: <dt>A start tag token whose tag name is one of: "base", "basefont",
3386: "bgsound", "command", "link", "meta", "noframes", "script",
3387: "style", "title"</dt>
1.1 mike 3388: <dd>
3389: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>" <a href="parsing.html#insertion-mode">insertion
3390: mode</a>.</p>
3391: </dd>
3392:
3393: <dt>A start tag whose tag name is "body"</dt>
3394: <dd>
3395:
3396: <p><a href="parsing.html#parse-error">Parse error</a>.</p>
3397:
3398: <p>If the second element on the <a href="parsing.html#stack-of-open-elements">stack of open
3399: elements</a> is not a <code><a href="sections.html#the-body-element-0">body</a></code> element, or, if the
3400: <a href="parsing.html#stack-of-open-elements">stack of open elements</a> has only one node on it,
3401: then ignore the token. (<a href="the-end.html#fragment-case">fragment case</a>)</p>
3402:
1.40 mike 3403: <p>Otherwise, set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok";
3404: then, for each attribute on the token, check to see if the
3405: attribute is already present on the <code><a href="sections.html#the-body-element-0">body</a></code> element (the
3406: second element) on the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>, and if
3407: it is not, add the attribute and its corresponding value to that
3408: element.</p>
1.1 mike 3409:
3410: </dd>
3411:
3412: <dt>A start tag whose tag name is "frameset"</dt>
3413: <dd>
3414:
3415: <p><a href="parsing.html#parse-error">Parse error</a>.</p>
3416:
3417: <p>If the second element on the <a href="parsing.html#stack-of-open-elements">stack of open
3418: elements</a> is not a <code><a href="sections.html#the-body-element-0">body</a></code> element, or, if the
3419: <a href="parsing.html#stack-of-open-elements">stack of open elements</a> has only one node on it,
3420: then ignore the token. (<a href="the-end.html#fragment-case">fragment case</a>)</p>
3421:
3422: <p>If the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> is set to "not ok", ignore
3423: the token.</p>
3424:
3425: <p>Otherwise, run the following steps:</p>
3426:
3427: <ol><li><p>Remove the second element on the <a href="parsing.html#stack-of-open-elements">stack of open
3428: elements</a> from its parent node, if it has one.</p></li>
3429:
3430: <li><p>Pop all the nodes from the bottom of the <a href="parsing.html#stack-of-open-elements">stack of
3431: open elements</a>, from the <a href="parsing.html#current-node">current node</a> up to,
3432: but not including, the root <code><a href="semantics.html#the-html-element-0">html</a></code> element.</p>
3433:
3434: </li><li><p><a href="#insert-an-html-element">Insert an HTML element</a> for the
3435: token.</p></li>
3436:
3437: <li><p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inframeset" title="insertion mode: in frameset">in frameset</a>".</p>
3438:
3439: </li></ol></dd>
3440:
3441: <dt>An end-of-file token</dt>
3442: <dd>
3443:
3444: <p>If there is a node in the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>
3445: that is not either a <code><a href="grouping-content.html#the-dd-element">dd</a></code> element, a <code><a href="grouping-content.html#the-dt-element">dt</a></code>
3446: element, an <code><a href="grouping-content.html#the-li-element">li</a></code> element, a <code><a href="grouping-content.html#the-p-element">p</a></code> element, a
3447: <code><a href="tabular-data.html#the-tbody-element">tbody</a></code> element, a <code><a href="tabular-data.html#the-td-element">td</a></code> element, a
3448: <code><a href="tabular-data.html#the-tfoot-element">tfoot</a></code> element, a <code><a href="tabular-data.html#the-th-element">th</a></code> element, a
3449: <code><a href="tabular-data.html#the-thead-element">thead</a></code> element, a <code><a href="tabular-data.html#the-tr-element">tr</a></code> element, the
3450: <code><a href="sections.html#the-body-element-0">body</a></code> element, or the <code><a href="semantics.html#the-html-element-0">html</a></code> element, then
3451: this is a <a href="parsing.html#parse-error">parse error</a>.</p> <!-- (some of those are
3452: fragment cases) -->
3453:
3454: <p><a href="the-end.html#stop-parsing">Stop parsing</a>.</p>
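<p class="note">The end-of-file check amounts to a membership test over the stack of open elements. In this sketch the stack is modelled as a list of tag names, and <code title="">EOF_ALLOWED</code> and <code title="">eof_is_parse_error</code> are hypothetical names; parsing stops in either case.</p>

```python
# Elements that may still be open at end-of-file without a parse error.
EOF_ALLOWED = {
    "dd", "dt", "li", "p", "tbody", "td", "tfoot",
    "th", "thead", "tr", "body", "html",
}


def eof_is_parse_error(open_element_names):
    """Return True if any open element is outside the allowed set."""
    return any(name not in EOF_ALLOWED for name in open_element_names)
```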
3455:
3456: </dd>
3457:
3458: <dt>An end tag whose tag name is "body"</dt>
3459: <dd>
3460:
3461: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-scope" title="has an element in scope">have a <code>body</code> element
3462: in scope</a>, this is a <a href="parsing.html#parse-error">parse error</a>; ignore the
3463: token.</p>
3464:
3465: <!-- if we get here, the insertion mode here is forcibly "in
3466: body". -->
3467:
3468: <p>Otherwise, if there is a node in the <a href="parsing.html#stack-of-open-elements">stack of open
3469: elements</a> that is not either a <code><a href="grouping-content.html#the-dd-element">dd</a></code> element, a
3470: <code><a href="grouping-content.html#the-dt-element">dt</a></code> element, an <code><a href="grouping-content.html#the-li-element">li</a></code> element, an
3471: <code><a href="the-button-element.html#the-optgroup-element">optgroup</a></code> element, an <code><a href="the-button-element.html#the-option-element">option</a></code> element, a
3472: <code><a href="grouping-content.html#the-p-element">p</a></code> element, an <code><a href="text-level-semantics.html#the-rp-element">rp</a></code> element, an
3473: <code><a href="text-level-semantics.html#the-rt-element">rt</a></code> element, a <code><a href="tabular-data.html#the-tbody-element">tbody</a></code> element, a
3474: <code><a href="tabular-data.html#the-td-element">td</a></code> element, a <code><a href="tabular-data.html#the-tfoot-element">tfoot</a></code> element, a
3475: <code><a href="tabular-data.html#the-th-element">th</a></code> element, a <code><a href="tabular-data.html#the-thead-element">thead</a></code> element, a
3476: <code><a href="tabular-data.html#the-tr-element">tr</a></code> element, the <code><a href="sections.html#the-body-element-0">body</a></code> element, or the
3477: <code><a href="semantics.html#the-html-element-0">html</a></code> element, then this is a <a href="parsing.html#parse-error">parse
3478: error</a>.</p> <!-- (some of those are fragment cases, e.g. for
3479: <tbody> you'd have hit the first paragraph since the <body>
3480: wouldn't be in scope, unless it was a fragment case) -->
3481:
3482: <!-- If we ever change the frameset-ok flag to an insertion mode,
3483: then we'd have to somehow keep track of its state when we switch
3484: to after-body. -->
3485:
3486: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-afterbody" title="insertion mode: after body">after body</a>".</p>
3487:
3488: </dd>
3489:
3490: <dt>An end tag whose tag name is "html"</dt>
3491: <dd>
3492:
3493: <p>Act as if an end tag with tag name "body" had been seen,
3494: then, if that token wasn't ignored, reprocess the current
3495: token.</p>
3496:
3497: </dd>
3498:
3499: <!-- start tags for non-phrasing flow content elements -->
3500:
3501: <!-- the normal ones -->
3502: <dt>A start tag whose tag name is one of: "address", "article",
3503: "aside", "blockquote", "center", <!--v2DATAGRID"datagrid",-->
1.15 mike 3504: "details", "dir", "div", "dl", "fieldset", "figcaption", "figure",
3505: "footer", "header", "hgroup", "menu", "nav", "ol", "p", "section",
3506: "summary", "ul"</dt>
1.1 mike 3507: <dd>
3508:
3509: <!-- As of May 2008 this doesn't match any browser exactly, but is
3510: as close to what IE does as I can get without doing the non-tree
3511: DOM nonsense, and thus should actually afford better compatibility
3512: when implemented by the other browsers. -->
3513:
1.8 mike 3514: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has an
3515: element in button scope">has a <code>p</code> element in button
3516: scope</a>, then act as if an end tag with the tag name "p" had
3517: been seen.</p>
1.1 mike 3518:
3519: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
3520:
3521: </dd>
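<dd>

<p class="note">As a non-normative illustration, the two steps above can be sketched in Python. The stack of open elements is modelled as a list of tag names whose last item is the current node. The function names and the set of scope boundaries are assumptions made for this sketch (the scope definitions live in another part of the specification), and the implied end tag is reduced to popping the stack.</p>

```python
# Scope boundaries are defined elsewhere in the specification; the sets
# below are an assumption made for this sketch.
SCOPE_MARKERS = {"applet", "caption", "html", "table", "td", "th", "marquee", "object"}
BUTTON_SCOPE_MARKERS = SCOPE_MARKERS | {"button"}


def has_in_scope(stack, target, markers):
    """Walk from the current node (end of the list) towards the root."""
    for name in reversed(stack):
        if name == target:
            return True
        if name in markers:
            return False
    return False


def start_block(stack, tag):
    """Handle a start tag such as "div": close an open p, then insert."""
    if has_in_scope(stack, "p", BUTTON_SCOPE_MARKERS):
        # Simplified stand-in for "act as if an end tag p had been seen".
        while stack.pop() != "p":
            pass
    stack.append(tag)
    return stack
```

<p class="note">In this sketch a <code>button</code> element sitting between the open <code>p</code> and the new element shields the <code>p</code> from being closed.</p>

</dd>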
3522:
3523: <!-- as normal, but close h1-h6 if it's the current node -->
3524: <dt>A start tag whose tag name is one of: "h1", "h2", "h3", "h4",
3525: "h5", "h6"</dt>
3526: <dd>
3527:
1.8 mike 3528: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has
3529: an element in button scope">has a <code>p</code> element in button
1.1 mike 3530: scope</a>, then act as if an end tag with the tag name
3531: "p" had been seen.</p>
3532:
3533: <p>If the <a href="parsing.html#current-node">current node</a> is an element whose tag name
3534: is one of "h1", "h2", "h3", "h4", "h5", or "h6", then this is a
3535: <a href="parsing.html#parse-error">parse error</a>; pop the <a href="parsing.html#current-node">current node</a> off
3536: the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p>
3537: <!-- See https://bugs.webkit.org/show_bug.cgi?id=12646 -->
3538:
3539: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
3540:
3541: </dd>
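<dd>

<p class="note">A minimal Python sketch of the heading-specific step, assuming the stack of open elements is a list of tag names with the current node last. The preceding step that closes an open <code>p</code> is left out, and the error label is hypothetical.</p>

```python
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def start_heading(stack, tag, errors):
    """Insert a heading, first popping a heading that is the current node."""
    # The "close an open p" step that comes first is omitted from this sketch.
    if stack and stack[-1] in HEADINGS:
        errors.append("heading-in-heading")  # hypothetical parse error label
        stack.pop()
    stack.append(tag)
    return stack
```

<p class="note">Only the current node is examined, so a heading that is open further up the stack is left alone.</p>

</dd>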
3542:
3543: <!-- as normal, but drops leading newline -->
3544: <dt>A start tag whose tag name is one of: "pre", "listing"</dt>
3545: <dd>
3546:
1.8 mike 3547: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has
3548: an element in button scope">has a <code>p</code> element in button
1.1 mike 3549: scope</a>, then act as if an end tag with the tag name
3550: "p" had been seen.</p>
3551:
3552: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
3553:
3554: <p>If the next token is a U+000A LINE FEED (LF) character
3555: token, then ignore that token and move on to the next
3556: one. (Newlines at the start of <code><a href="grouping-content.html#the-pre-element">pre</a></code> blocks are
3557: ignored as an authoring convenience.)</p>
3558:
3559: <!-- <pre>[CR]X will eat the [CR], <pre>&#x0A;X will eat the
3560: &#x0A;, but <pre>&#x0D;X will not eat the &#x0D;. -->
3561:
3562: <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
3563:
3564: </dd>
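<dd>

<p class="note">The newline rule above can be illustrated with a small Python sketch. It assumes the character tokens that follow the start tag are available as a list of one-character strings; the function name is hypothetical.</p>

```python
def drop_leading_newline(tokens):
    """tokens: character tokens that follow a pre or listing start tag."""
    if tokens and tokens[0] == "\n":
        return tokens[1:]  # ignore exactly one leading LINE FEED
    return tokens
```

<p class="note">Only the first LINE FEED is dropped; a second consecutive newline is kept as content.</p>

</dd>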
3565:
3566: <!-- as normal, but interacts with the form element pointer -->
3567: <dt>A start tag whose tag name is "form"</dt>
3568: <dd>
3569:
3570: <p>If the <a href="parsing.html#form-element-pointer"><code title="form">form</code> element
3571: pointer</a> is not null, then this is a <a href="parsing.html#parse-error">parse
3572: error</a>; ignore the token.</p>
3573:
3574: <p>Otherwise:</p>
3575:
1.8 mike 3576: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has
3577: an element in button scope">has a <code>p</code> element in button
1.1 mike 3578: scope</a>, then act as if an end tag with the tag name
3579: "p" had been seen.</p>
3580:
3581: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token, and set the
3582: <a href="parsing.html#form-element-pointer"><code title="form">form</code> element pointer</a> to
3583: point to the element created.</p>
3584:
3585: </dd>
3586:
3587: <!-- as normal, but imply </li> when there's another <li> open in weird cases -->
3588: <dt>A start tag whose tag name is "li"</dt>
3589: <dd>
3590:
3591: <p>Run these steps:</p>
3592:
3593: <ol><li><p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p></li>
3594:
3595: <li><p>Initialize <var title="">node</var> to be the <a href="parsing.html#current-node">current
3596: node</a> (the bottommost node of the stack).</p></li>
3597:
3598: <li><p><i>Loop</i>: If <var title="">node</var> is an
3599: <code><a href="grouping-content.html#the-li-element">li</a></code> element, then act as if an end tag with the tag
3600: name "li" had been seen, then jump to the last step.</p></li>
3601:
1.12 mike 3602: <li><p>If <var title="">node</var> is in the <a href="parsing.html#special">special</a>
3603: category, but is not an <code><a href="sections.html#the-address-element">address</a></code>, <code><a href="grouping-content.html#the-div-element">div</a></code>,
3604: or <code><a href="grouping-content.html#the-p-element">p</a></code> element, then jump to the last step.</p></li>
3605: <!-- an element <foo> is in this list if the following markup:
1.1 mike 3606:
3607: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><body><ol><li><foo><li>
3608:
3609: ...results in the second <li> not being (in any way) a descendant
3610: of the first <li>, or if <foo> is a formatting element that gets
3611: reopened later. -->
3612:
3613: <li><p>Otherwise, set <var title="">node</var> to the previous
3614: entry in the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> and return to
3615: the step labeled <i>loop</i>.</p></li>
3616:
3617: <li>
3618:
3619: <p>This is the last step.</p>
3620:
1.8 mike 3621: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has
3622: an element in button scope">has a <code>p</code> element in button
1.1 mike 3623: scope</a>, then act as if an end tag with the tag name
3624: "p" had been seen.</p>
3625:
3626: <p>Finally, <a href="#insert-an-html-element">insert an HTML element</a> for the
3627: token.</p>
3628:
3629: </li>
3630:
3631: </ol></dd>
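<dd>

<p class="note">The loop above can be sketched in Python as a search from the current node upwards. The stack is a list of tag names with the current node last. The contents of <code>SPECIAL</code> are an assumed subset of the special category, which is defined in another part of the specification, and the function name is hypothetical.</p>

```python
# Membership of the "special" category is defined elsewhere in the
# specification; this subset is an assumption made for the sketch.
SPECIAL = {"address", "blockquote", "body", "div", "html", "li", "ol", "p",
           "table", "td", "ul"}
NOT_BLOCKING = {"address", "div", "p"}


def open_li_to_close(stack):
    """Return the index of the open li that a new li closes, or None."""
    for index in range(len(stack) - 1, -1, -1):
        name = stack[index]
        if name == "li":
            return index
        if name in SPECIAL and name not in NOT_BLOCKING:
            return None  # a special element stops the search
    return None
```

<p class="note">A nested <code>ul</code> stops the search, so a list item inside a nested list does not close the outer item, while an intervening <code>div</code> or <code>p</code> does not stop it.</p>

</dd>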
3632:
3633: <!-- as normal, but imply </dt> or </dd> when there's another <dt> or <dd> open in weird cases -->
3634: <dt>A start tag whose tag name is one of: "dd", "dt"</dt>
3635: <dd>
3636:
3637: <p>Run these steps:</p>
3638:
3639: <ol><li><p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p></li>
3640:
3641: <li><p>Initialize <var title="">node</var> to be the <a href="parsing.html#current-node">current
3642: node</a> (the bottommost node of the stack).</p></li>
3643:
3644: <li><p><i>Loop</i>: If <var title="">node</var> is a
3645: <code><a href="grouping-content.html#the-dd-element">dd</a></code> or <code><a href="grouping-content.html#the-dt-element">dt</a></code> element, then act as if an end
3646: tag with the same tag name as <var title="">node</var> had been
3647: seen, then jump to the last step.</p></li>
3648:
1.12 mike 3649: <li><p>If <var title="">node</var> is in the <a href="parsing.html#special">special</a>
3650: category, but is not an <code><a href="sections.html#the-address-element">address</a></code>, <code><a href="grouping-content.html#the-div-element">div</a></code>,
3651: or <code><a href="grouping-content.html#the-p-element">p</a></code> element, then jump to the last step.</p></li>
3652: <!-- an element <foo> is in this list if the following markup:
1.1 mike 3653:
3654: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><body><dl><dt><foo><dt>
3655:
3656: ...results in the second <dt> not being (in any way) a descendant
3657: of the first <dt>, or if <foo> is a formatting element that gets
3658: reopened later. -->
3659:
3660: <li><p>Otherwise, set <var title="">node</var> to the previous
3661: entry in the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> and return to
3662: the step labeled <i>loop</i>.</p></li>
3663:
3664: <li>
3665:
3666: <p>This is the last step.</p>
3667:
1.8 mike 3668: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has
3669: an element in button scope">has a <code>p</code> element in button
1.1 mike 3670: scope</a>, then act as if an end tag with the tag name
3671: "p" had been seen.</p>
3672:
3673: <p>Finally, <a href="#insert-an-html-element">insert an HTML element</a> for the
3674: token.</p>
3675:
3676: </li>
3677:
3678: </ol></dd>
3679:
3680: <!-- same as normal, but effectively ends parsing -->
3681: <dt>A start tag whose tag name is "plaintext"</dt>
3682: <dd>
3683:
1.8 mike 3684: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has
3685: an element in button scope">has a <code>p</code> element in button
1.1 mike 3686: scope</a>, then act as if an end tag with the tag name
3687: "p" had been seen.</p>
3688:
3689: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
3690:
3691: <p>Switch the tokenizer to the <a href="#plaintext-state">PLAINTEXT state</a>.</p>
3692:
3693: <p class="note">Once a start tag with the tag name "plaintext" has
3694: been seen, that will be the last token ever seen other than
3695: character tokens (and the end-of-file token), because there is no
3696: way to switch out of the <a href="#plaintext-state">PLAINTEXT state</a>.</p>
3697:
3698: </dd>
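<dd>

<p class="note">A toy Python sketch of the consequence described in the note above. It assumes the remaining input is a string, and the token tuples are a representation invented for this example.</p>

```python
def tokenize_after_plaintext(rest):
    """Everything after the plaintext start tag becomes character tokens."""
    tokens = [("character", ch) for ch in rest]
    tokens.append(("eof", None))  # the only non-character token left
    return tokens
```

<p class="note">Even text that looks like a matching end tag is emitted one character at a time.</p>

</dd>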
3699:
3700: <!-- button is a hybrid -->
3701: <dt>A start tag whose tag name is "button"</dt>
3702: <dd>
3703:
3704: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-scope" title="has
3705: an element in scope">has a <code>button</code> element in
3706: scope</a>, then this is a <a href="parsing.html#parse-error">parse error</a>;
3707: act as if an end tag with the tag name "button" had been seen,
3708: then reprocess the token.</p>
3709:
3710: <p>Otherwise:</p>
3711:
3712: <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
3713: any.</p>
3714:
3715: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
3716:
3717: <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
3718:
3719: </dd>
3720:
3721: <!-- end tags for non-phrasing flow content elements (and button) -->
3722:
3723: <!-- the normal ones -->
3724: <dt>An end tag whose tag name is one of: "address", "article",
3725: "aside", "blockquote", "button", "center",
3726: <!--v2DATAGRID"datagrid",--> "details", "dir", "div", "dl",
1.15 mike 3727: "fieldset", "figcaption", "figure", "footer", "header", "hgroup",
3728: "listing", "menu", "nav", "ol", "pre", "section", "summary",
3729: "ul"</dt>
1.1 mike 3730: <dd>
3731:
3732: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-scope" title="has an element in scope">have an element in scope</a>
3733: with the same tag name as that of the token, then this is a
3734: <a href="parsing.html#parse-error">parse error</a>; ignore the token.</p>
3735:
3736: <p>Otherwise, run these steps:</p>
3737:
3738: <ol><li><p><a href="#generate-implied-end-tags">Generate implied end tags</a>.</p></li>
3739:
3740: <li><p>If the <a href="parsing.html#current-node">current node</a> is not an element with
3741: the same tag name as that of the token, then this is a
3742: <a href="parsing.html#parse-error">parse error</a>.</p></li>
3743:
3744: <li><p>Pop elements from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>
3745: until an element with the same tag name as the token has been
3746: popped from the stack.</p></li>
3747:
3748: </ol></dd>
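<dd>

<p class="note">The steps above can be sketched in Python, with the stack as a list of tag names and the current node last. Both constant sets are assumptions taken from definitions that live in other parts of the specification, and the error labels are hypothetical.</p>

```python
# Both sets are assumptions: the scope boundaries and the list of tags with
# implied end tags are defined in other parts of the specification.
SCOPE_MARKERS = {"applet", "caption", "html", "table", "td", "th", "marquee", "object"}
IMPLIED_END = {"dd", "dt", "li", "option", "optgroup", "p", "rp", "rt"}


def end_block(stack, tag, errors):
    """Handle an end tag such as "div"; the last list item is the current node."""
    in_scope = False
    for name in reversed(stack):
        if name == tag:
            in_scope = True
            break
        if name in SCOPE_MARKERS:
            break
    if not in_scope:
        errors.append("ignored")      # parse error; ignore the token
        return stack
    while stack[-1] in IMPLIED_END:   # generate implied end tags
        stack.pop()
    if stack[-1] != tag:
        errors.append("misnested")    # parse error, but keep going
    while stack.pop() != tag:         # pop up to and including the match
        pass
    return stack
```

<p class="note">An open <code>p</code> below the target is closed silently by the implied end tags, whereas an open <code>b</code> is reported as an error before being popped.</p>

</dd>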
3749:
3750: <!-- removes the form element pointer instead of the matching node -->
3751: <dt>An end tag whose tag name is "form"</dt>
3752: <dd>
3753:
3754: <p>Let <var title="">node</var> be the element that the
3755: <a href="parsing.html#form-element-pointer"><code title="">form</code> element pointer</a> is set
3756: to.</p>
3757:
3758: <p>Set the <a href="parsing.html#form-element-pointer"><code title="">form</code> element pointer</a>
3759: to null.</p>
3760:
3761: <p>If <var title="">node</var> is null or the <a href="parsing.html#stack-of-open-elements">stack of open
3762: elements</a> does not <a href="parsing.html#has-an-element-in-scope" title="has an element in
3763: scope">have <var title="">node</var> in scope</a>, then this is
3764: a <a href="parsing.html#parse-error">parse error</a>; ignore the token.</p>
3765:
3766: <p>Otherwise, run these steps:</p>
3767:
3768: <ol><li><p><a href="#generate-implied-end-tags">Generate implied end tags</a>.</p></li>
3769:
3770: <li><p>If the <a href="parsing.html#current-node">current node</a> is not <var title="">node</var>, then this is a <a href="parsing.html#parse-error">parse
3771: error</a>.</p></li>
3772:
3773: <li><p>Remove <var title="">node</var> from the <a href="parsing.html#stack-of-open-elements">stack of
3774: open elements</a>.</p></li>
3775:
3776: </ol></dd>
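<dd>

<p class="note">A Python sketch of how the <code>form</code> element pointer interacts with the start and end tags. The <code>Parser</code> class and its members are hypothetical. The scope test is simplified to "is on the stack", and the implied end tags and the implied <code>p</code> end tag are omitted; those simplifications are assumptions of the sketch, not part of the rules above.</p>

```python
class Parser:
    def __init__(self):
        self.stack = []           # stack of open elements; last item is the current node
        self.form_pointer = None  # the form element pointer
        self.errors = 0

    def start_form(self):
        if self.form_pointer is not None:
            self.errors += 1      # parse error; the token is ignored
            return
        node = ["form"]           # a fresh list stands in for an element object
        self.stack.append(node)
        self.form_pointer = node

    def end_form(self):
        node, self.form_pointer = self.form_pointer, None
        # Simplification: "in scope" is reduced to "is on the stack".
        if node is None or not any(n is node for n in self.stack):
            self.errors += 1      # parse error; the token is ignored
            return
        if self.stack[-1] is not node:
            self.errors += 1      # parse error, but continue
        # Remove this exact node, wherever it sits; nodes below it stay open.
        self.stack = [n for n in self.stack if n is not node]
```

<p class="note">Unlike the generic end tag handling, elements opened after the <code>form</code> remain on the stack once it is removed.</p>

</dd>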
3777:
3778: <!-- as normal, except </p> implies <p> if there's no <p> in scope, and needs care as the elements have optional tags -->
3779: <dt>An end tag whose tag name is "p"</dt>
3780: <dd>
3781:
1.29 mike 3782: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-button-scope" title="has an element in button scope">have an element in button
3783: scope</a> with the same tag name as that of the token, then this
1.8 mike 3784: is a <a href="parsing.html#parse-error">parse error</a>; act as if a start tag with the tag
3785: name "p" had been seen, then reprocess the current token.</p>
1.1 mike 3786:
3787: <p>Otherwise, run these steps:</p>
3788:
3789: <ol><li><p><a href="#generate-implied-end-tags">Generate implied end tags</a>, except
3790: for elements with the same tag name as the token.</p></li>
3791:
3792: <li><p>If the <a href="parsing.html#current-node">current node</a> is not an element with
3793: the same tag name as that of the token, then this is a
3794: <a href="parsing.html#parse-error">parse error</a>.</p></li>
3795:
3796: <li><p>Pop elements from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>
3797: until an element with the same tag name as the token has been
3798: popped from the stack.</p></li>
3799:
3800: </ol></dd>
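<dd>

<p class="note">The special case for a stray <code>p</code> end tag can be sketched in Python. The stack is a list of tag names with the current node last, the <code>inserted</code> list records elements the sketch creates, and the set of button scope boundaries is an assumption drawn from a definition elsewhere in the specification. The implied end tags step is omitted because it does not change the resulting stack here.</p>

```python
# Assumed button-scope boundaries; the real definition lives elsewhere.
BUTTON_SCOPE_MARKERS = {"applet", "button", "caption", "html", "table", "td",
                        "th", "marquee", "object"}


def end_p(stack, inserted):
    """Handle a p end tag; an empty p is created when none is in button scope."""
    in_scope = False
    for name in reversed(stack):
        if name == "p":
            in_scope = True
            break
        if name in BUTTON_SCOPE_MARKERS:
            break
    if not in_scope:
        # Parse error: behave as if a p start tag came first, then reprocess.
        stack.append("p")
        inserted.append("p")
    while stack.pop() != "p":
        pass
    return stack
```

<p class="note">The stack ends up unchanged in the stray case, but a new empty <code>p</code> element has been inserted into the document.</p>

</dd>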
3801:
3802: <!-- as normal, but needs care as the elements have optional tags, and are further scoped by <ol>/<ul> -->
3803: <dt>An end tag whose tag name is "li"</dt>
3804: <dd>
3805:
3806: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-list-item-scope" title="has an element in list item scope">have an element in list
3807: item scope</a> with the same tag name as that of the token,
3808: then this is a <a href="parsing.html#parse-error">parse error</a>; ignore the token.</p>
3809:
3810: <p>Otherwise, run these steps:</p>
3811:
3812: <ol><li><p><a href="#generate-implied-end-tags">Generate implied end tags</a>, except
3813: for elements with the same tag name as the token.</p></li>
3814:
3815: <li><p>If the <a href="parsing.html#current-node">current node</a> is not an element with
3816: the same tag name as that of the token, then this is a
3817: <a href="parsing.html#parse-error">parse error</a>.</p></li>
3818:
3819: <li><p>Pop elements from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>
3820: until an element with the same tag name as the token has been
3821: popped from the stack.</p></li>
3822:
3823: </ol></dd>
3824:
3825: <!-- as normal, but needs care as the elements have optional tags -->
3826: <dt>An end tag whose tag name is one of: "dd", "dt"</dt>
3827: <dd>
3828:
3829: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-scope" title="has an element in scope">have an element in scope</a>
3830: with the same tag name as that of the token, then this is a
3831: <a href="parsing.html#parse-error">parse error</a>; ignore the token.</p>
3832:
3833: <p>Otherwise, run these steps:</p>
3834:
3835: <ol><li><p><a href="#generate-implied-end-tags">Generate implied end tags</a>, except
3836: for elements with the same tag name as the token.</p></li>
3837:
3838: <li><p>If the <a href="parsing.html#current-node">current node</a> is not an element with
3839: the same tag name as that of the token, then this is a
3840: <a href="parsing.html#parse-error">parse error</a>.</p></li>
3841:
3842: <li><p>Pop elements from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>
3843: until an element with the same tag name as the token has been
3844: popped from the stack.</p></li>
3845:
3846: </ol></dd>
3847:
3848: <!-- as normal, except acts as a closer for any of the h1-h6 elements -->
3849: <dt>An end tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"</dt>
3850: <dd>
3851:
3852: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-scope" title="has an element in scope">have an element in scope</a>
3853: whose tag name is one of "h1", "h2", "h3", "h4", "h5", or "h6",
3854: then this is a <a href="parsing.html#parse-error">parse error</a>; ignore the token.</p>
3855:
3856: <p>Otherwise, run these steps:</p>
3857:
3858: <ol><li><p><a href="#generate-implied-end-tags">Generate implied end tags</a>.</p></li>
3859:
3860: <li><p>If the <a href="parsing.html#current-node">current node</a> is not an element with
3861: the same tag name as that of the token, then this is a
3862: <a href="parsing.html#parse-error">parse error</a>.</p></li>
3863:
3864: <li><p>Pop elements from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>
3865: until an element whose tag name is one of "h1", "h2", "h3", "h4",
3866: "h5", or "h6" has been popped from the stack.</p></li>
3867:
3868: </ol></dd>
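<dd>

<p class="note">A Python sketch of the heading end tag rule, where any open heading satisfies any heading end tag. The stack is a list of tag names with the current node last. The scope boundaries are an assumption, the error labels are hypothetical, and the implied end tags step is omitted.</p>

```python
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
# Assumed scope boundaries; the real definition lives elsewhere.
SCOPE_MARKERS = {"applet", "caption", "html", "table", "td", "th", "marquee", "object"}


def end_heading(stack, tag, errors):
    """Handle a heading end tag; the last list item is the current node."""
    in_scope = False
    for name in reversed(stack):
        if name in HEADINGS:
            in_scope = True
            break
        if name in SCOPE_MARKERS:
            break
    if not in_scope:
        errors.append("ignored")    # parse error; ignore the token
        return stack
    if stack[-1] != tag:
        errors.append("mismatch")   # parse error, but keep going
    while stack.pop() not in HEADINGS:
        pass                        # stop once any heading has been popped
    return stack
```

<p class="note">With this rule an <code>h3</code> end tag closes an open <code>h1</code>, at the cost of a parse error.</p>

</dd>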
3869:
3870: <!-- see also applet/marquee/object lower down -->
3871:
3872: <dt>An end tag whose tag name is "sarcasm"</dt>
3873: <dd>
3874: <p>Take a deep breath, then act as described in the "any other end
3875: tag" entry below.</p>
3876: </dd>
3877:
3878: <!-- ADOPTION AGENCY ELEMENTS
3879: Mozilla-only: bdo blink del ins sub sup q
3880: Safari-only: code dfn kbd nobr samp var wbr
3881: Both: a b big em font i s small strike strong tt u -->
3882:
3883: <dt>A start tag whose tag name is "a"</dt>
3884: <dd>
3885:
3886: <p>If the <a href="parsing.html#list-of-active-formatting-elements">list of active formatting elements</a>
3887: contains an element whose tag name is "a" between the end of
3888: the list and the last marker on the list (or the start of the
3889: list if there is no marker on the list), then this is a
3890: <a href="parsing.html#parse-error">parse error</a>; act as if an end tag with the tag
3891: name "a" had been seen, then remove that element from the
3892: <a href="parsing.html#list-of-active-formatting-elements">list of active formatting elements</a> and the
3893: <a href="parsing.html#stack-of-open-elements">stack of open elements</a> if the end tag didn't
3894: already remove it (it might not have if the element is not
3895: <a href="parsing.html#has-an-element-in-table-scope" title="has an element in table scope">in table
3896: scope</a>).</p>
3897:
3898: <p class="example">In the non-conforming stream
3899: <code>&lt;a href="a"&gt;a&lt;table&gt;&lt;a href="b"&gt;b&lt;/table&gt;x</code>,
3900: the first <code><a href="text-level-semantics.html#the-a-element">a</a></code> element would be closed upon seeing the
3901: second one, and the "x" character would be inside a link to "b",
3902: not to "a". This is despite the fact that the outer <code><a href="text-level-semantics.html#the-a-element">a</a></code>
3903: element is not in table scope (meaning that a regular
3904: <code>&lt;/a&gt;</code> end tag at the start of the table wouldn't
3905: close the outer <code><a href="text-level-semantics.html#the-a-element">a</a></code> element). The result is that the
3906: two <code><a href="text-level-semantics.html#the-a-element">a</a></code> elements are indirectly nested inside each
3907: other — non-conforming markup will often result in
3908: non-conforming DOMs when parsed.</p>
3909:
3910: <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
3911: any.</p>
3912:
1.45 mike 3913: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token. <a href="parsing.html#push-onto-the-list-of-active-formatting-elements">Push
3914: onto the list of active formatting elements</a> that
3915: element.</p>
1.1 mike 3916:
3917: </dd>
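<dd>

<p class="note">The search "between the end of the list and the last marker" can be sketched in Python. The list of active formatting elements is modelled as a list of tag names, with <code>None</code> standing in for a marker; that representation and the function name are assumptions of the sketch.</p>

```python
MARKER = None  # stand-in for a marker entry in the list


def find_after_last_marker(active, tag):
    """Return the index of the last matching entry after the last marker."""
    for index in range(len(active) - 1, -1, -1):
        entry = active[index]
        if entry is MARKER:
            return None   # reached the marker without a match
        if entry == tag:
            return index
    return None           # reached the start of the list
```

<p class="note">Entries that precede the last marker are never considered, which keeps formatting opened outside an <code>object</code> or similar element from being matched inside it.</p>

</dd>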
3918:
3919: <dt>A start tag whose tag name is one of: "b", "big", "code", "em",
3920: "font", "i", "s", "small", "strike", "strong", "tt", "u"</dt>
3921: <dd>
3922:
3923: <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
3924: any.</p>
3925:
1.45 mike 3926: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token. <a href="parsing.html#push-onto-the-list-of-active-formatting-elements">Push
3927: onto the list of active formatting elements</a> that
3928: element.</p>
1.1 mike 3929:
3930: </dd>
3931:
3932: <dt>A start tag whose tag name is "nobr"</dt>
3933: <dd>
3934:
3935: <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
3936: any.</p>
3937:
3938: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-scope" title="has an
3939: element in scope">has a <code>nobr</code> element in scope</a>,
3940: then this is a <a href="parsing.html#parse-error">parse error</a>; act as if an end tag with
3941: the tag name "nobr" had been seen, then once again
3942: <a href="parsing.html#reconstruct-the-active-formatting-elements">reconstruct the active formatting elements</a>, if
3943: any.</p>
3944:
1.45 mike 3945: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token. <a href="parsing.html#push-onto-the-list-of-active-formatting-elements">Push
3946: onto the list of active formatting elements</a> that
3947: element.</p>
1.1 mike 3948:
3949: </dd>
3950:
3951: <dt id="adoptionAgency">An end tag whose tag name is one of: "a",
3952: "b", "big", "code", "em", "font", "i", "nobr", "s", "small",
3953: "strike", "strong", "tt", "u"</dt>
3954: <dd>
3955:
3956: <p>Run these steps:</p>
3957:
1.47 ! mike 3958: <ol><li><p>Let <var title="">outer loop counter</var> be
! 3959: zero.</p></li>
1.1 mike 3960:
1.47 ! mike 3961: <li><p><i>Outer loop</i>: If <var title="">outer loop
! 3962: counter</var> is greater than or equal to eight, then abort these
! 3963: steps.</p></li>
! 3964:
! 3965: <li><p>Increment <var title="">outer loop counter</var> by
! 3966: one.</p></li>
! 3967:
! 3968: <li>
! 3969:
! 3970: <p>Let the <var title="">formatting element</var> be the last
! 3971: element in the <a href="parsing.html#list-of-active-formatting-elements">list of active formatting elements</a>
! 3972: that:</p>
1.1 mike 3973:
3974: <ul><li>is between the end of the list and the last scope
3975: marker in the list, if any, or the start of the list
3976: otherwise, and</li>
3977:
3978: <li>has the same tag name as the token.</li>
3979:
1.45 mike 3980: </ul><p>If there is no such node, then abort these steps and instead
3981: act as described in the "any other end tag" entry below.</p>
1.1 mike 3982:
3983: <p>Otherwise, if there is such a node, but that node is not
3984: in the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>, then this is a
3985: <a href="parsing.html#parse-error">parse error</a>; remove the element from the list,
3986: and abort these steps.</p>
3987:
1.45 mike 3988: <p>Otherwise, if there is such a node, and that node is also in
3989: the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>, but the element is not
3990: <a href="parsing.html#has-an-element-in-scope" title="has an element in scope">in scope</a>, then this
3991: is a <a href="parsing.html#parse-error">parse error</a>; ignore the token, and abort these
3992: steps.</p>
3993:
1.1 mike 3994: <p>Otherwise, there is a <var title="">formatting
3995: element</var> and that element is in <a href="parsing.html#stack-of-open-elements" title="stack of
3996: open elements">the stack</a> and is <a href="parsing.html#has-an-element-in-scope" title="has an
3997: element in scope">in scope</a>. If the element is not the
3998: <a href="parsing.html#current-node">current node</a>, this is a <a href="parsing.html#parse-error">parse
3999: error</a>. In any case, proceed with the algorithm as
4000: written in the following steps.</p>
4001:
4002: </li>
4003:
1.12 mike 4004: <li><p>Let the <var title="">furthest block</var> be the topmost
4005: node in the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> that is lower in
4006: the stack than the <var title="">formatting element</var>, and is
4007: an element in the <a href="parsing.html#special">special</a> category. There might not
4008: be one.</p></li>
1.1 mike 4009:
4010: <li><p>If there is no <var title="">furthest block</var>,
4011: then the UA must skip the subsequent steps and instead just
4012: pop all the nodes from the bottom of the <a href="parsing.html#stack-of-open-elements">stack of open
4013: elements</a>, from the <a href="parsing.html#current-node">current node</a> up to and
4014: including the <var title="">formatting element</var>, and
4015: remove the <var title="">formatting element</var> from the
4016: <a href="parsing.html#list-of-active-formatting-elements">list of active formatting elements</a>.</p></li>
4017:
4018: <li><p>Let the <var title="">common ancestor</var> be the element
4019: immediately above the <var title="">formatting element</var> in the
4020: <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p></li>
4021:
4022: <li><p>Let a bookmark note the position of the <var title="">formatting element</var> in the <a href="parsing.html#list-of-active-formatting-elements">list of active
4023: formatting elements</a> relative to the elements on either
4024: side of it in the list.</p></li>
4025:
4026: <li>
4027:
4028: <p>Let <var title="">node</var> and <var title="">last node</var> be the
4029: <var title="">furthest block</var>. Follow these steps:</p>
4030:
1.47 ! mike 4031: <ol><li><p>Let <var title="">inner loop counter</var> be
! 4032: zero.</p></li>
! 4033:
! 4034: <li><p><i>Inner loop</i>: If <var title="">inner loop
! 4035: counter</var> is greater than or equal to three, then abort these
! 4036: steps.</p></li>
! 4037:
! 4038: <li><p>Increment <var title="">inner loop counter</var> by
! 4039: one.</p></li>
! 4040:
! 4041: <li>Let <var title="">node</var> be the element immediately
1.1 mike 4042: above <var title="">node</var> in the <a href="parsing.html#stack-of-open-elements">stack of open
4043: elements</a>, or if <var title="">node</var> is no longer in
4044: the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> (e.g. because it got
4045: removed by the next step), the element that was immediately
4046: above <var title="">node</var> in the <a href="parsing.html#stack-of-open-elements">stack of open
4047: elements</a> before <var title="">node</var> was
4048: removed.</li>
4049:
4050: <li>If <var title="">node</var> is not in the <a href="parsing.html#list-of-active-formatting-elements">list of
4051: active formatting elements</a>, then remove <var title="">node</var> from the <a href="parsing.html#stack-of-open-elements">stack of open
1.47 ! mike 4052: elements</a> and then go back to the step labeled <i>inner
! 4053: loop</i>.</li>
1.1 mike 4054:
4055: <li>Otherwise, if <var title="">node</var> is the <var title="">formatting element</var>, then go to the next step
4056: in the overall algorithm.</li>
4057:
4058: <li><a href="#create-an-element-for-the-token">Create an element for the token</a> for which the
4059: element <var title="">node</var> was created, replace the entry
4060: for <var title="">node</var> in the <a href="parsing.html#list-of-active-formatting-elements">list of active
4061: formatting elements</a> with an entry for the new element,
4062: replace the entry for <var title="">node</var> in the
4063: <a href="parsing.html#stack-of-open-elements">stack of open elements</a> with an entry for the new
4064: element, and let <var title="">node</var> be the new
4065: element.</li>
4066:
1.13 mike 4067: <li>If <var title="">last node</var> is the <var title="">furthest block</var>, then move the aforementioned
4068: bookmark to be immediately after the new <var title="">node</var> in the <a href="parsing.html#list-of-active-formatting-elements">list of active formatting
4069: elements</a>.</li>
4070:
1.1 mike 4071: <li>Insert <var title="">last node</var> into <var title="">node</var>, first removing it from its previous
4072: parent node if any.</li>
4073:
4074: <li>Let <var title="">last node</var> be <var title="">node</var>.</li>
4075:
1.47 ! mike 4076: <li>Return to the step labeled <i>inner loop</i>.</li>
1.1 mike 4077:
4078: </ol></li>
4079:
4080: <li>
4081:
4082: <p>If the <var title="">common ancestor</var> node is a
4083: <code><a href="tabular-data.html#the-table-element">table</a></code>, <code><a href="tabular-data.html#the-tbody-element">tbody</a></code>, <code><a href="tabular-data.html#the-tfoot-element">tfoot</a></code>,
4084: <code><a href="tabular-data.html#the-thead-element">thead</a></code>, or <code><a href="tabular-data.html#the-tr-element">tr</a></code> element, then,
4085: <a href="#foster-parent">foster parent</a> whatever <var title="">last
4086: node</var> ended up being in the previous step, first removing
4087: it from its previous parent node if any.</p>
4088:
4089: <p>Otherwise, append whatever <var title="">last node</var>
4090: ended up being in the previous step to the <var title="">common
4091: ancestor</var> node, first removing it from its previous parent
4092: node if any.</p>
4093:
4094: </li>
4095:
4096: <li><p><a href="#create-an-element-for-the-token">Create an element for the token</a> for which the
4097: <var title="">formatting element</var> was created.</p></li>
4098:
4099: <li><p>Take all of the child nodes of the <var title="">furthest
4100: block</var> and append them to the element created in the last
4101: step.</p></li>
4102:
4103: <li><p>Append that new element to the <var title="">furthest
4104: block</var>.</p></li>
4105:
4106: <li><p>Remove the <var title="">formatting element</var> from the
4107: <a href="parsing.html#list-of-active-formatting-elements">list of active formatting elements</a>, and insert the
4108: new element into the <a href="parsing.html#list-of-active-formatting-elements">list of active formatting
4109: elements</a> at the position of the aforementioned
4110: bookmark.</p></li>
4111:
4112: <li><p>Remove the <var title="">formatting element</var> from the
4113: <a href="parsing.html#stack-of-open-elements">stack of open elements</a>, and insert the new element
4114: into the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> immediately below
4115: the position of the <var title="">furthest block</var> in that
4116: stack.</p></li>
4117:
1.47 ! mike 4118: <li><p>Jump back to the step labeled <i>outer loop</i>.</p></li>
1.1 mike 4119:
4120: </ol><p class="note">Because of the way this algorithm causes elements
4121: to change parents, it has been dubbed the "adoption agency
4122: algorithm" (in contrast with other possible algorithms for dealing
4123: with misnested content, which included the "incest algorithm", the
4124: "secret affair algorithm", and the "Heisenberg algorithm").</p>
4125:
4126: </dd>
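<dd>

<p class="note">A partial Python sketch of the first steps of the algorithm above, up to and including the case where there is no furthest block. It does not attempt the reparenting steps or the loop counters. The <code>Element</code> class, the returned labels and the contents of <code>SPECIAL</code> are assumptions of the sketch, and the "in scope" check is left out.</p>

```python
class Element:
    def __init__(self, name):
        self.name = name


# Assumed subset of the "special" category, which is defined elsewhere.
SPECIAL = {"address", "blockquote", "body", "div", "html", "li", "ol", "p",
           "td", "ul"}


def close_formatting_simple(stack, active, tag):
    """stack and active hold Element objects; None in active is a marker."""
    formatting = None
    for entry in reversed(active):
        if entry is None:
            break                      # stop at the last marker
        if entry.name == tag:
            formatting = entry
            break
    if formatting is None:
        return "any-other-end-tag"
    if formatting not in stack:
        active.remove(formatting)      # parse error; drop it from the list
        return "removed-from-list"
    pos = stack.index(formatting)
    # Furthest block: topmost special element below the formatting element.
    furthest = next((el for el in stack[pos + 1:] if el.name in SPECIAL), None)
    if furthest is None:
        del stack[pos:]                # pop up to and including the element
        active.remove(formatting)
        return "popped"
    return "needs-full-algorithm"
```

<p class="note">Elements are compared by identity here, since the same object has to be found both on the stack and in the list of active formatting elements.</p>

</dd>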
4127:
4128: <dt>A start tag whose tag name is one of: "applet",
4129: "marquee", "object"</dt>
4130: <dd>
4131:
4132: <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
4133: any.</p>
4134:
4135: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
4136:
4137: <p>Insert a marker at the end of the <a href="parsing.html#list-of-active-formatting-elements">list of active
4138: formatting elements</a>.</p>
4139:
4140: <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
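<p class="note">As a non-normative illustration, the marker can be pictured as a sentinel entry that later bounds how much of the list is cleared when the matching end tag is seen. The sketch below is in Python; <code>MARKER</code> and both function names are hypothetical, and formatting elements are modeled as plain strings rather than DOM elements.</p>

```python
# Hypothetical sentinel standing in for a marker entry.
MARKER = object()


def insert_marker(active_formatting):
    """Append a marker to the end of the list of active formatting elements."""
    active_formatting.append(MARKER)


def clear_up_to_last_marker(active_formatting):
    """Remove entries from the end of the list until a marker has been
    removed, or until the list is empty."""
    while active_formatting:
        entry = active_formatting.pop()
        if entry is MARKER:
            break


formatting = ["b", "i"]               # entries opened before the applet/marquee/object
insert_marker(formatting)             # start tag: push the marker
formatting.append("em")               # formatting element opened inside it
clear_up_to_last_marker(formatting)   # matching end tag: drop "em" and the marker
```

<p class="note">Entries that were present before the marker was inserted are left in place by the clearing step.</p>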
4141:
4142: </dd>
4143:
4144: <dt>An end tag token whose tag name is one of: "applet",
4145: "marquee", "object"</dt>
4146: <dd>
4147:
4148: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-scope" title="has an element in scope">have an element in scope</a>
4149: with the same tag name as that of the token, then this is a
4150: <a href="parsing.html#parse-error">parse error</a>; ignore the token.</p>
4151:
4152: <p>Otherwise, run these steps:</p>
4153:
4154: <ol><li><p><a href="#generate-implied-end-tags">Generate implied end tags</a>.</p></li>
4155:
4156: <li><p>If the <a href="parsing.html#current-node">current node</a> is not an element with
4157: the same tag name as that of the token, then this is a
4158: <a href="parsing.html#parse-error">parse error</a>.</p></li>
4159:
4160: <li><p>Pop elements from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>
4161: until an element with the same tag name as the token has been
4162: popped from the stack.</p></li>
4163:
4164: <li><p><a href="parsing.html#clear-the-list-of-active-formatting-elements-up-to-the-last-marker">Clear the list of active formatting elements up to the
4165: last marker</a>.</p></li>
4166:
4167: </ol></dd>
4168:
4169: <dt>A start tag whose tag name is "table"</dt>
4170: <dd>
4171:
4172: <p>If the <code><a href="infrastructure.html#document">Document</a></code> is <em>not</em> set to
4173: <a href="dom.html#quirks-mode">quirks mode</a>, and the <a href="parsing.html#stack-of-open-elements">stack of open
1.8 mike 4174: elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has an element in button scope">has a
4175: <code>p</code> element in button scope</a>, then act as if an
4176: end tag with the tag name "p" had been seen.</p> <!-- i hate
4177: myself (this quirk was basically caused by acid2; if i'd realised
4178: we could change the specs when i wrote acid2, we could have
4179: avoided having any parsing-mode quirks) -Hixie -->
1.1 mike 4180:
4181: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
4182:
4183: <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
4184:
4185: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-intable" title="insertion mode: in table">in table</a>".</p>
4186:
4187: </dd>
4188:
1.10 mike 4189: <dt>A start tag whose tag name is one of: "area", "br", "embed",
4190: "img", "input", "keygen", "wbr"</dt>
1.1 mike 4191: <dd>
4192:
4193: <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
4194: any.</p>
4195:
4196: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token. Immediately
4197: pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
4198: elements</a>.</p>
4199:
4200: <p><a href="#acknowledge-self-closing-flag" title="acknowledge self-closing flag">Acknowledge the
4201: token's <i>self-closing flag</i></a>, if it is set.</p>
4202:
1.10 mike 4203: <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
4204: <!-- shouldn't really do this for <area> -->
1.1 mike 4205:
4206: </dd>
4207:
4208: <dt>A start tag whose tag name is one of: "param", "source", "track"</dt>
4209: <dd>
4210:
4211: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token. Immediately
4212: pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
4213: elements</a>.</p>
4214:
4215: <p><a href="#acknowledge-self-closing-flag" title="acknowledge self-closing flag">Acknowledge the
4216: token's <i>self-closing flag</i></a>, if it is set.</p>
4217:
4218: </dd>
4219:
4220: <dt>A start tag whose tag name is "hr"</dt>
4221: <dd>
4222:
1.8 mike 4223: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has
4224: an element in button scope">has a <code>p</code> element in button
1.1 mike 4225: scope</a>, then act as if an end tag with the tag name
4226: "p" had been seen.</p>
4227:
4228: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token. Immediately
4229: pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
4230: elements</a>.</p>
4231:
4232: <p><a href="#acknowledge-self-closing-flag" title="acknowledge self-closing flag">Acknowledge the
4233: token's <i>self-closing flag</i></a>, if it is set.</p>
4234:
4235: <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
4236:
4237: </dd>
4238:
4239: <dt>A start tag whose tag name is "image"</dt>
4240: <dd>
4241: <p><a href="parsing.html#parse-error">Parse error</a>. Change the token's tag name
4242: to "img" and reprocess it. (Don't ask.)</p> <!-- As of
4243: 2005-12, studies showed that around 0.2% of pages used the
4244: <image> element. -->
4245: </dd>
4246:
4247: <dt id="isindex">A start tag whose tag name is "isindex"</dt>
4248: <dd>
4249:
4250: <p><a href="parsing.html#parse-error">Parse error</a>.</p>
4251:
4252: <p>If the <a href="parsing.html#form-element-pointer"><code title="">form</code> element
4253: pointer</a> is not null, then ignore the token.</p>
4254:
4255: <p>Otherwise:</p>
4256:
4257: <p><a href="#acknowledge-self-closing-flag" title="acknowledge self-closing flag">Acknowledge the
4258: token's <i>self-closing flag</i></a>, if it is set.</p> <!--
4259: purely to reduce the number of errors (we don't care if they
4260: included the /, they're not supposed to be including the tag at
4261: all!) -->
4262:
4263: <p>Act as if a start tag token with the tag name "form" had been seen.</p>
4264:
4265: <p>If the token has an attribute called "action", set the
4266: <code title="attr-form-action">action</code> attribute on the
4267: resulting <code><a href="forms.html#the-form-element">form</a></code> element to the value of the
4268: "action" attribute of the token.</p>
4269:
4270: <p>Act as if a start tag token with the tag name "hr" had been
4271: seen.</p>
4272:
4273: <p>Act as if a start tag token with the tag name "label" had been
4274: seen.</p>
4275:
4276: <p>Act as if a stream of character tokens had been seen (see below
4277: for what they should say).</p>
4278:
4279: <p>Act as if a start tag token with the tag name "input" had been
4280: seen, with all the attributes from the "isindex" token except
4281: "name", "action", and "prompt". Set the <code title="attr-fe-name"><a href="association-of-controls-and-forms.html#attr-fe-name">name</a></code> attribute of the resulting
1.16 mike 4282: <code><a href="the-input-element.html#the-input-element">input</a></code> element to the value "<code title="attr-fe-name-isindex"><a href="association-of-controls-and-forms.html#attr-fe-name-isindex">isindex</a></code>".</p>
1.1 mike 4283:
4284: <p>Act as if a stream of character tokens had been seen (see
4285: below for what they should say).</p>
4286:
4287: <p>Act as if an end tag token with the tag name "label" had been
4288: seen.</p>
4289:
4290: <p>Act as if a start tag token with the tag name "hr" had been
4291: seen.</p>
4292:
4293: <p>Act as if an end tag token with the tag name "form" had been
4294: seen.</p>
4295:
4296: <p>If the token has an attribute with the name "prompt", then the
4297: first stream of characters must be the same string as given in
4298: that attribute, and the second stream of characters must be
4299: empty. Otherwise, the two streams of character tokens
4300: should, together with the <code><a href="the-input-element.html#the-input-element">input</a></code> element, express the
4301: equivalent of "This is a searchable index. Enter search keywords:
4302: (input field)" in the user's preferred language.</p>
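<p class="note">The sequence of synthetic tokens described above can be sketched as follows. This is a non-normative Python illustration under stated assumptions: the tuple shape <code>(kind, name_or_text, attributes)</code>, the function name <code>expand_isindex</code>, and the choice to put the whole default prompt in the first character stream are all hypothetical. Carrying "action" on the synthetic "form" token is a simplification of setting the attribute on the resulting element.</p>

```python
def expand_isindex(attrs, form_pointer=None,
                   default_prompt="This is a searchable index. Enter search keywords: "):
    """Return the synthetic tokens that stand in for an "isindex" start tag.

    Each token is a (kind, name_or_text, attributes) tuple; this shape is an
    assumption made for the sketch, not something the specification defines.
    """
    if form_pointer is not None:
        return []  # the token is ignored when a form is already open

    # All attributes except "name", "action" and "prompt" go to the input.
    input_attrs = {key: value for key, value in attrs.items()
                   if key not in ("name", "action", "prompt")}
    input_attrs["name"] = "isindex"

    # Simplification: the action is carried on the synthetic form token.
    form_attrs = {"action": attrs["action"]} if "action" in attrs else {}

    # With a prompt attribute the first stream is that string and the second
    # is empty; otherwise a localized default would be used (assumed here).
    first_stream = attrs["prompt"] if "prompt" in attrs else default_prompt
    second_stream = ""

    return [
        ("start", "form", form_attrs),
        ("start", "hr", {}),
        ("start", "label", {}),
        ("chars", first_stream, {}),
        ("start", "input", input_attrs),
        ("chars", second_stream, {}),
        ("end", "label", {}),
        ("start", "hr", {}),
        ("end", "form", {}),
    ]
```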
4303:
4304: </dd>
4305:
4306: <dt>A start tag whose tag name is "textarea"</dt>
4307: <dd>
4308:
4309: <p>Run these steps:</p>
4310:
4311: <ol><li><p><a href="#insert-an-html-element">Insert an HTML element</a> for the
4312: token.</p></li>
4313:
4314: <li><p>If the next token is a U+000A LINE FEED (LF) character
4315: token, then ignore that token and move on to the next
4316: one. (Newlines at the start of <code><a href="the-button-element.html#the-textarea-element">textarea</a></code> elements are
4317: ignored as an authoring convenience.)</p></li>
4318:
4319: <!-- see comment in <pre> start tag bit -->
4320:
4321: <li><p>Switch the tokenizer to the <a href="#rcdata-state">RCDATA
4322: state</a>.</p></li>
4323:
4324: <li><p>Let the <a href="parsing.html#original-insertion-mode">original insertion mode</a> be the
4325: current <a href="parsing.html#insertion-mode">insertion mode</a>.</p>
4326:
4327: </li><li><p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not
4328: ok".</p></li>
4329:
4330: <li><p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-incdata" title="insertion mode: text">text</a>".</p></li>
4331:
4332: </ol></dd>
4333:
4334: <dt>A start tag whose tag name is "xmp"</dt>
4335: <dd>
4336:
1.8 mike 4337: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has
4338: an element in button scope">has a <code>p</code> element in button
1.1 mike 4339: scope</a>, then act as if an end tag with the tag name
4340: "p" had been seen.</p>
4341:
4342: <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
4343: any.</p>
4344:
4345: <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
4346:
4347: <p>Follow the <a href="#generic-raw-text-element-parsing-algorithm">generic raw text element parsing algorithm</a>.</p>
4348:
4349: </dd>
4350:
4351: <dt>A start tag whose tag name is "iframe"</dt>
4352: <dd>
4353:
4354: <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
4355:
4356: <p>Follow the <a href="#generic-raw-text-element-parsing-algorithm">generic raw text element parsing algorithm</a>.</p>
4357:
4358: </dd>
4359:
4360: <dt>A start tag whose tag name is "noembed"</dt>
4361: <dt>A start tag whose tag name is "noscript", if the <a href="parsing.html#scripting-flag">scripting flag</a> is enabled</dt>
4362: <dd>
4363:
4364: <p>Follow the <a href="#generic-raw-text-element-parsing-algorithm">generic raw text element parsing algorithm</a>.</p>
4365:
4366: </dd>
4367:
4368: <dt>A start tag whose tag name is "select"</dt>
4369: <dd>
4370:
4371: <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
4372: any.</p>
4373:
4374: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
4375:
4376: <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
4377:
1.28 mike 4378: <p>If the <a href="parsing.html#insertion-mode">insertion mode</a> is one of "<a href="#parsing-main-intable" title="insertion mode: in table">in table</a>", "<a href="#parsing-main-incaption" title="insertion mode: in caption">in caption</a>", "<a href="#parsing-main-intbody" title="insertion mode: in table body">in table body</a>",
4379: "<a href="#parsing-main-intr" title="insertion mode: in row">in row</a>", or "<a href="#parsing-main-intd" title="insertion mode: in cell">in cell</a>", then switch the
4380: <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inselectintable" title="insertion mode: in
4381: select in table">in select in table</a>". Otherwise, switch the
4382: <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inselect" title="insertion mode: in
4383: select">in select</a>".</p>
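<p class="note">The choice of insertion mode in the last paragraph amounts to a membership test. A non-normative Python sketch, in which insertion modes are modeled as plain strings and the names <code>TABLE_MODES</code> and <code>mode_after_select</code> are hypothetical:</p>

```python
# Insertion modes that send a "select" start tag to "in select in table".
TABLE_MODES = {"in table", "in caption", "in table body", "in row", "in cell"}


def mode_after_select(current_mode):
    """Return the insertion mode to switch to after inserting a select element."""
    if current_mode in TABLE_MODES:
        return "in select in table"
    return "in select"
```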
1.1 mike 4384:
4385: </dd>
4386:
4387: <dt>A start tag whose tag name is one of: "optgroup", "option"</dt>
4388: <dd>
4389:
1.35 mike 4390: <p>If the <a href="parsing.html#current-node">current node</a> is an <code><a href="the-button-element.html#the-option-element">option</a></code>
4391: element, then act as if an end tag with the tag name "option" had
4392: been seen.</p>
1.1 mike 4393:
4394: <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
4395: any.</p>
4396:
4397: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
4398:
4399: </dd>
4400:
4401: <dt>A start tag whose tag name is one of: "rp", "rt"</dt>
4402: <dd>
4403:
4404: <!-- the parsing rules for ruby really don't match IE much at all,
4405: but in practice the markup used is very simple and so strict
4406: compatibility with IE isn't required. For example, as defined
4407: here we get very, very different behaviour than IE for
4408: pathological cases like:
4409:
4410: <ruby><ol><li><p>a<rt>b
4411: <ruby>a<rt>b<p>c
4412:
4413: But in practice most ruby markup falls into these cases:
4414:
4415: <ruby>a<rt>b</ruby>
4416: <ruby>a<rp>b<rt>c<rp>d</ruby>
4417: <ruby>a<rt>b</rt></ruby>
4418: <ruby>a<rp>b</rp><rt>c</rt><rp>d</rp></ruby>
4419:
4420: -->
4421:
4422: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-scope" title="has an
4423: element in scope">has a <code>ruby</code> element in scope</a>,
4424: then <a href="#generate-implied-end-tags">generate implied end tags</a>. If the <a href="parsing.html#current-node">current
4425: node</a> is not then a <code><a href="text-level-semantics.html#the-ruby-element">ruby</a></code> element, this is a
4426: <a href="parsing.html#parse-error">parse error</a>; pop all the nodes from the <a href="parsing.html#current-node">current
4427: node</a> up to the node immediately before the bottommost
4428: <code><a href="text-level-semantics.html#the-ruby-element">ruby</a></code> element on the <a href="parsing.html#stack-of-open-elements">stack of open
4429: elements</a>.</p>
4430:
4431: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
4432:
4433: </dd>
4434:
4435: <dt>An end tag whose tag name is "br"</dt>
4436: <dd>
4437: <p><a href="parsing.html#parse-error">Parse error</a>. Act as if a start tag token with
4438: the tag name "br" had been seen. Ignore the end tag token.</p>
4439: </dd>
4440:
4441: <dt>A start tag whose tag name is "math"</dt>
4442: <dd>
4443:
4444: <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
4445: any.</p>
4446:
4447: <p><a href="#adjust-mathml-attributes">Adjust MathML attributes</a> for the token. (This
4448: fixes the case of MathML attributes that are not all
4449: lowercase.)</p>
4450:
4451: <p><a href="#adjust-foreign-attributes">Adjust foreign attributes</a> for the token. (This
4452: fixes the use of namespaced attributes, in particular XLink.)</p>
4453:
4454: <p><a href="#insert-a-foreign-element">Insert a foreign element</a> for the token, in the
4455: <a href="namespaces.html#mathml-namespace">MathML namespace</a>.</p>
4456:
4457: <!-- If we ever change the frameset-ok flag to an insertion mode,
4458: the following change would be implied, except we'd have to do it
4459: even in the face of a self-closed tag:
4460: <p>Set the <span>frameset-ok flag</span> to "not ok".</p>
4461: -->
4462:
4463: <p>If the token has its <i>self-closing flag</i> set, pop the
4464: <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
4465: elements</a> and <a href="#acknowledge-self-closing-flag" title="acknowledge self-closing
4466: flag">acknowledge the token's <i>self-closing flag</i></a>.</p>
4467:
4468: <p>Otherwise, if the <a href="parsing.html#insertion-mode">insertion mode</a> is not already
4469: "<a href="#parsing-main-inforeign" title="insertion mode: in foreign content">in foreign
1.39 mike 4470: content</a>", switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inforeign" title="insertion mode: in foreign content">in foreign
4471: content</a>".</p>
1.1 mike 4472:
4473: </dd>
4474:
4475: <dt>A start tag whose tag name is "svg"</dt>
4476: <dd>
4477:
4478: <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
4479: any.</p>
4480:
4481: <p><a href="#adjust-svg-attributes">Adjust SVG attributes</a> for the token. (This fixes
4482: the case of SVG attributes that are not all lowercase.)</p>
4483:
4484: <p><a href="#adjust-foreign-attributes">Adjust foreign attributes</a> for the token. (This
4485: fixes the use of namespaced attributes, in particular XLink in
4486: SVG.)</p>
4487:
4488: <p><a href="#insert-a-foreign-element">Insert a foreign element</a> for the token, in the
4489: <a href="namespaces.html#svg-namespace">SVG namespace</a>.</p>
4490:
4491: <!-- If we ever change the frameset-ok flag to an insertion mode,
4492: the following change would be implied, except we'd have to do it
4493: even in the face of a self-closed tag:
4494: <p>Set the <span>frameset-ok flag</span> to "not ok".</p>
4495: -->
4496:
4497: <p>If the token has its <i>self-closing flag</i> set, pop the
4498: <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
4499: elements</a> and <a href="#acknowledge-self-closing-flag" title="acknowledge self-closing
4500: flag">acknowledge the token's <i>self-closing flag</i></a>.</p>
4501:
4502: <p>Otherwise, if the <a href="parsing.html#insertion-mode">insertion mode</a> is not already
4503: "<a href="#parsing-main-inforeign" title="insertion mode: in foreign content">in foreign
1.39 mike 4504: content</a>", switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inforeign" title="insertion mode: in foreign content">in foreign
4505: content</a>".</p>
1.1 mike 4506:
4507: </dd>
4508:
4509: <dt>A start <!--or end--> tag whose tag name is one of: "caption",
4510: "col", "colgroup", "frame", "head", "tbody", "td", "tfoot", "th",
4511: "thead", "tr"</dt>
4512: <!--<dt>An end tag whose tag name is one of: "area", "base",
4513: "basefont", "bgsound", "command", "embed", "hr", "iframe", "image",
4514: "img", "input", "isindex", "keygen", "link", "meta", "noembed",
4515: "noframes", "param", "script", "select", "source", "style",
4516: "table", "textarea", "title", "track", "wbr"</dt>-->
4517: <!--<dt>An end tag whose tag name is "noscript", if the
4518: <span>scripting flag</span> is enabled</dt>-->
4519: <dd>
4520: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
4521: <!-- end tags are commented out because since they can never end
4522: up on the stack anyway, the default end tag clause will
4523: automatically handle them. we don't want to have text in the spec
4524: that is just an optimisation, as that detracts from the spec
4525: itself -->
4526: </dd>
4527:
4528: <dt>Any other start tag</dt>
4529: <dd>
4530:
4531: <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
4532: any.</p>
4533:
4534: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
4535:
1.8 mike 4536: <p class="note">This element will be an <a href="parsing.html#ordinary">ordinary</a>
1.1 mike 4537: element.</p>
4538:
4539: </dd>
4540:
4541: <dt>Any other end tag</dt>
4542: <dd>
4543:
4544: <p>Run these steps:</p>
4545:
4546: <ol><li><p>Initialize <var title="">node</var> to be the <a href="parsing.html#current-node">current
4547: node</a> (the bottommost node of the stack).</p></li>
4548:
1.12 mike 4549: <li><p><i>Loop</i>: If <var title="">node</var> has the same tag
4550: name as the token, then:</p>
1.1 mike 4551:
1.12 mike 4552: <ol><li><p><a href="#generate-implied-end-tags">Generate implied end tags</a>, except
4553: for elements with the same tag name as the token.</p></li>
1.1 mike 4554:
4555: <li><p>If the tag name of the end tag token does not match
4556: the tag name of the <a href="parsing.html#current-node">current node</a>, this is a
4557: <a href="parsing.html#parse-error">parse error</a>.</p></li>
4558:
4559: <li><p>Pop all the nodes from the <a href="parsing.html#current-node">current node</a> up
4560: to <var title="">node</var>, including <var title="">node</var>, then stop these steps.</p></li>
4561:
4562: </ol></li>
4563:
1.12 mike 4564: <li><p>Otherwise, if <var title="">node</var> is in the
4565: <a href="parsing.html#special">special</a> category, then this is a <a href="parsing.html#parse-error">parse
4566: error</a>; ignore the token, and abort these steps.</p></li>
1.1 mike 4567:
4568: <li><p>Set <var title="">node</var> to the previous entry in the
4569: <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p></li>
4570:
1.12 mike 4571: <li><p>Return to the step labeled <i>loop</i>.</p></li>
1.1 mike 4572:
4573: </ol></dd>
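<dd>
<p class="note">The steps for any other end tag can be sketched as a walk up the stack of open elements. This is a non-normative Python illustration under stated assumptions: elements are modeled as tag-name strings, the last list item is the current node, the sets of <a href="parsing.html#special">special</a> names and of names that take implied end tags are supplied by the caller, and the function name and return shape are hypothetical.</p>

```python
def any_other_end_tag(tag_name, open_elements, special, implied_end_tags):
    """Process an end tag against a stack of tag names.

    Returns (handled, parse_errors). The stack is modified in place.
    """
    parse_errors = 0
    # Start at the current node (bottommost entry) and walk upwards.
    for index in range(len(open_elements) - 1, -1, -1):
        node = open_elements[index]
        if node == tag_name:
            # Generate implied end tags, except for elements named tag_name.
            while (open_elements[-1] in implied_end_tags
                   and open_elements[-1] != tag_name):
                open_elements.pop()
            # A current node that still differs from the token is a parse error.
            if open_elements[-1] != tag_name:
                parse_errors += 1
            # Pop everything from the current node up to and including node.
            del open_elements[index:]
            return True, parse_errors
        if node in special:
            # Parse error; the token is ignored and the stack is untouched.
            return False, parse_errors + 1
    return False, parse_errors
```

<p class="note">Because the walk begins at the current node, the first match is the bottommost element with that tag name, so generating implied end tags can never pop past it.</p>
</dd>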
4574:
1.29 mike 4575: </dl><h5 id="parsing-main-incdata"><span class="secno">8.2.5.11 </span>The "<dfn title="insertion mode: text">text</dfn>" insertion mode</h5>
1.1 mike 4576:
4577: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-incdata" title="insertion
4578: mode: text">text</a>", tokens must be handled as follows:</p>
4579:
4580: <dl class="switch"><dt>A character token</dt>
4581: <dd>
4582:
4583: <p><a href="#insert-a-character" title="insert a character">Insert the token's
4584: character</a> into the <a href="parsing.html#current-node">current node</a>.</p>
4585:
4586: </dd>
4587:
4588: <dt>An end-of-file token</dt>
4589: <dd>
4590:
4591: <!-- can't be the fragment case -->
4592: <p><a href="parsing.html#parse-error">Parse error</a>.</p>
4593:
4594: <p>If the <a href="parsing.html#current-node">current node</a> is a <code><a href="scripting-1.html#script">script</a></code>
4595: element, mark the <code><a href="scripting-1.html#script">script</a></code> element as <a href="scripting-1.html#already-started">"already
4596: started"</a>.</p>
4597:
4598: <p>Pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
4599: elements</a>.</p>
4600:
4601: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to the <a href="parsing.html#original-insertion-mode">original
4602: insertion mode</a> and reprocess the current token.</p>
4603:
4604: </dd>
4605:
4606: <dt id="scriptEndTag">An end tag whose tag name is "script"</dt>
4607: <dd>
4608:
4609: <p>Let <var title="">script</var> be the <a href="parsing.html#current-node">current node</a>
4610: (which will be a <code><a href="scripting-1.html#script">script</a></code> element).</p>
4611:
4612: <p>Pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
4613: elements</a>.</p>
4614:
4615: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to the <a href="parsing.html#original-insertion-mode">original
4616: insertion mode</a>.</p>
4617:
4618: <p>Let the <var title="">old insertion point</var> have the
4619: same value as the current <a href="parsing.html#insertion-point">insertion point</a>. Let
4620: the <a href="parsing.html#insertion-point">insertion point</a> be just before the <a href="parsing.html#next-input-character">next
4621: input character</a>.</p>
4622:
4623: <p>Increment the parser's <a href="parsing.html#script-nesting-level">script nesting level</a> by
4624: one.</p>
4625:
4626: <p><a href="scripting-1.html#running-a-script" title="running a script">Run</a> the <var title="">script</var>. This might cause some script to execute,
4627: which might cause <a href="apis-in-html-documents.html#dom-document-write" title="dom-document-write">new characters
4628: to be inserted into the tokenizer</a>, and might cause the
4629: tokenizer to output more tokens, resulting in a <a href="parsing.html#nestedParsing">reentrant invocation of the parser</a>.</p>
4630:
4631: <p>Decrement the parser's <a href="parsing.html#script-nesting-level">script nesting level</a> by
4632: one. If the parser's <a href="parsing.html#script-nesting-level">script nesting level</a> is zero,
4633: then set the <a href="parsing.html#parser-pause-flag">parser pause flag</a> to false.</p>
4634:
4635: <p>Let the <a href="parsing.html#insertion-point">insertion point</a> have the value of the <var title="">old insertion point</var>. (In other words, restore the
4636: <a href="parsing.html#insertion-point">insertion point</a> to its previous value. This value
4637: might be the "undefined" value.)</p>
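<p class="note">The bookkeeping around running the script (saving and restoring the insertion point, and maintaining the script nesting level) can be sketched as below. This is a non-normative Python illustration: the <code>Parser</code> class, its attribute names, and the use of an integer position for "just before the next input character" are assumptions, and <code>None</code> stands in for the "undefined" insertion point.</p>

```python
class Parser:
    """Hypothetical container for the parser state named in the prose."""

    def __init__(self):
        self.insertion_point = None      # None models the "undefined" value
        self.script_nesting_level = 0
        self.pause_flag = False
        self.next_input_position = 0     # models "just before the next input character"


def run_script_end_tag(parser, run_script):
    """Run a script with the insertion point and nesting level maintained.

    run_script is a callable taking the parser; it may re-enter this function,
    which models a reentrant invocation of the parser.
    """
    old_insertion_point = parser.insertion_point
    parser.insertion_point = parser.next_input_position

    parser.script_nesting_level += 1
    run_script(parser)
    parser.script_nesting_level -= 1

    # Only the outermost invocation clears the parser pause flag.
    if parser.script_nesting_level == 0:
        parser.pause_flag = False

    # Restore the previous insertion point, which may be "undefined".
    parser.insertion_point = old_insertion_point
```

<p class="note">In a nested invocation the nesting level returns to one rather than zero, so the parser pause flag is left alone until the outermost script has finished.</p>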
4638:
4639: <p id="scriptTagParserResumes">At this stage, if there is a
4640: <a href="scripting-1.html#pending-parsing-blocking-script">pending parsing-blocking script</a>, then:</p>
4641:
4642: <dl class="switch"><dt>If the <a href="parsing.html#script-nesting-level">script nesting level</a> is not zero:</dt>
4643:
4644: <dd>
4645:
4646: <p>Set the <a href="parsing.html#parser-pause-flag">parser pause flag</a> to true, and abort the
4647: processing of any nested invocations of the tokenizer, yielding
4648: control back to the caller. (Tokenization will resume when the
4649: caller returns to the "outer" tree construction stage.)</p>
4650:
4651: <p class="note">The tree construction stage of this particular
4652: parser is <a href="parsing.html#nestedParsing">being called reentrantly</a>,
4653: say from a call to <code title="dom-document-write"><a href="apis-in-html-documents.html#dom-document-write">document.write()</a></code>.</p>
4654:
4655: </dd>
4656:
4657:
4658: <dt>Otherwise:</dt>
4659:
4660: <dd>
4661:
4662: <p>Run these steps:</p>
4663:
4664: <ol><li><p>Let <var title="">the script</var> be the <a href="scripting-1.html#pending-parsing-blocking-script">pending
4665: parsing-blocking script</a>. There is no longer a <a href="scripting-1.html#pending-parsing-blocking-script">pending
4666: parsing-blocking script</a>.</p></li>
4667:
4668: <li><p>Block the <a href="#tokenization" title="tokenization">tokenizer</a>
4669: for this instance of the <a href="parsing.html#html-parser">HTML parser</a>, such that
4670: the <a href="webappapis.html#event-loop">event loop</a> will not run <a href="webappapis.html#concept-task" title="concept-task">tasks</a> that invoke the <a href="#tokenization" title="tokenization">tokenizer</a>.</p></li>
4671:
1.34 mike 4672: <li><p><a href="webappapis.html#spin-the-event-loop">Spin the event loop</a> until there is no <a href="semantics.html#a-style-sheet-that-is-blocking-scripts" title="a style sheet that is blocking scripts">style sheet that
4673: is blocking scripts</a> and <var title="">the script</var>'s
4674: <a href="scripting-1.html#ready-to-be-parser-executed">"ready to be parser-executed"</a> flag is
4675: set.</p></li>
1.1 mike 4676:
4677: <li><p>Unblock the <a href="#tokenization" title="tokenization">tokenizer</a>
4678: for this instance of the <a href="parsing.html#html-parser">HTML parser</a>, such that
4679: <a href="webappapis.html#concept-task" title="concept-task">tasks</a> that invoke the <a href="#tokenization" title="tokenization">tokenizer</a> can again be
4680: run.</p></li>
4681:
4682: <li><p>Let the <a href="parsing.html#insertion-point">insertion point</a> be just before the
4683: <a href="parsing.html#next-input-character">next input character</a>.</p></li>
4684:
4685: <li><p>Increment the parser's <a href="parsing.html#script-nesting-level">script nesting level</a>
4686: by one (it should be zero before this step, so this sets it to
4687: one).</p></li>
4688:
4689: <li><p><a href="scripting-1.html#executing-a-script-block" title="executing a script block">Execute</a>
4690: <var title="">the script</var>.</p></li>
4691:
4692: <li><p>Decrement the parser's <a href="parsing.html#script-nesting-level">script nesting level</a>
4693: by one. If the parser's <a href="parsing.html#script-nesting-level">script nesting level</a> is
4694: zero (which it always should be at this point), then set the
4695: <a href="parsing.html#parser-pause-flag">parser pause flag</a> to false.</p>
4696:
4697: </li><li><p>Let the <a href="parsing.html#insertion-point">insertion point</a> be undefined
4698: again.</p></li>
4699:
4700: <li><p>If there is once again a <a href="scripting-1.html#pending-parsing-blocking-script">pending parsing-blocking
4701: script</a>, then repeat these steps from step 1.</p></li>
4702:
4703: </ol></dd>
4704:
4705: </dl></dd>
4706:
4707: <dt>Any other end tag</dt>
4708: <dd>
4709:
4710: <p>Pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
4711: elements</a>.</p>
4712:
4713: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to the <a href="parsing.html#original-insertion-mode">original
4714: insertion mode</a>.</p>
4715:
4716: </dd>
4717:
1.29 mike 4718: </dl><h5 id="parsing-main-intable"><span class="secno">8.2.5.12 </span>The "<dfn title="insertion mode: in table">in table</dfn>" insertion mode</h5>
1.1 mike 4719:
4720: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-intable" title="insertion
4721: mode: in table">in table</a>", tokens must be handled as follows:</p>
4722:
4723: <dl class="switch"><dt>A character token</dt>
4724: <dd>
4725:
4726: <p>Let the <dfn id="pending-table-character-tokens"><var>pending table character tokens</var></dfn>
4727: be an empty list of tokens.</p>
4728:
4729: <p>Let the <a href="parsing.html#original-insertion-mode">original insertion mode</a> be the current
4730: <a href="parsing.html#insertion-mode">insertion mode</a>.</p>
4731:
4732: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-intabletext" title="insertion mode: in table text">in table text</a>" and
4733: reprocess the token.</p>
4734:
4735: </dd>
4736:
4737: <dt>A comment token</dt>
4738: <dd>
4739: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current
4740: node</a> with the <code title="">data</code> attribute set to
4741: the data given in the comment token.</p>
4742: </dd>
4743:
4744: <dt>A DOCTYPE token</dt>
4745: <dd>
4746: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
4747: </dd>
4748:
4749: <dt>A start tag whose tag name is "caption"</dt>
4750: <dd>
4751:
4752: <p><a href="#clear-the-stack-back-to-a-table-context">Clear the stack back to a table context</a>. (See
4753: below.)</p>
4754:
4755: <p>Insert a marker at the end of the <a href="parsing.html#list-of-active-formatting-elements">list of active
4756: formatting elements</a>.</p>
4757:
4758: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token, then
4759: switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-incaption" title="insertion mode: in caption">in caption</a>".</p>
4760:
4761: </dd>
4762:
4763: <dt>A start tag whose tag name is "colgroup"</dt>
4764: <dd>
4765:
4766: <p><a href="#clear-the-stack-back-to-a-table-context">Clear the stack back to a table context</a>. (See
4767: below.)</p>
4768:
4769: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token, then
4770: switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-incolgroup" title="insertion mode: in column group">in column
4771: group</a>".</p>
4772:
4773: </dd>
4774:
4775: <dt>A start tag whose tag name is "col"</dt>
4776: <dd>
4777: <p>Act as if a start tag token with the tag name "colgroup"
4778: had been seen, then reprocess the current token.</p>
4779: </dd>
4780:
4781: <dt>A start tag whose tag name is one of: "tbody", "tfoot", "thead"</dt>
4782: <dd>
4783:
4784: <p><a href="#clear-the-stack-back-to-a-table-context">Clear the stack back to a table context</a>. (See
4785: below.)</p>
4786:
4787: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token, then
4788: switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-intbody" title="insertion mode: in table body">in table
4789: body</a>".</p>
4790:
4791: </dd>
4792:
4793: <dt>A start tag whose tag name is one of: "td", "th", "tr"</dt>
4794: <dd>
4795: <p>Act as if a start tag token with the tag name "tbody" had
4796: been seen, then reprocess the current token.</p>
4797: </dd>
4798:
4799: <dt>A start tag whose tag name is "table"</dt>
4800: <dd>
4801:
4802: <p><a href="parsing.html#parse-error">Parse error</a>. Act as if an end tag token with
4803: the tag name "table" had been seen, then, if that token wasn't
4804: ignored, reprocess the current token.</p>
4805:
4806: <p class="note">The fake end tag token here can only be
4807: ignored in the <a href="the-end.html#fragment-case">fragment case</a>.</p>
4808:
4809: </dd>
4810:
4811: <dt>An end tag whose tag name is "table"</dt>
4812: <dd>
4813:
4814: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-table-scope" title="has an element in table scope">have an element in table
4815: scope</a> with the same tag name as the token, this is a
4816: <a href="parsing.html#parse-error">parse error</a>. Ignore the token. (<a href="the-end.html#fragment-case">fragment
4817: case</a>)</p>
4818:
4819: <p>Otherwise:</p>
4820:
4821: <p>Pop elements from this stack until a <code><a href="tabular-data.html#the-table-element">table</a></code>
4822: element has been popped from the stack.</p>
4823:
4824: <p><a href="parsing.html#reset-the-insertion-mode-appropriately">Reset the insertion mode appropriately</a>.</p>
4825:
4826: </dd>
4827:
4828: <dt>An end tag whose tag name is one of: "body", "caption",
4829: "col", "colgroup", "html", "tbody", "td", "tfoot", "th",
4830: "thead", "tr"</dt>
4831: <dd>
4832: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
4833: </dd>
4834:
4835: <dt>A start tag whose tag name is one of: "style", "script"</dt>
4836: <dd>
4837:
4838: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>" <a href="parsing.html#insertion-mode">insertion
4839: mode</a>.</p>
4840:
4841: </dd>
4842:
4843: <dt>A start tag whose tag name is "input"</dt>
4844: <dd>
4845:
4846: <p>If the token does not have an attribute with the name "type",
4847: or if it does, but that attribute's value is not an <a href="infrastructure.html#ascii-case-insensitive">ASCII
4848: case-insensitive</a> match for the string "<code title="">hidden</code>", then: act as described in the "anything
4849: else" entry below.</p>
4850:
4851: <p>Otherwise:</p>
4852:
4853: <p><a href="parsing.html#parse-error">Parse error</a>.</p>
4854:
4855: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
4856:
4857: <p>Pop that <code><a href="the-input-element.html#the-input-element">input</a></code> element off the <a href="parsing.html#stack-of-open-elements">stack of
4858: open elements</a>.</p>
4859:
4860: </dd>
4861:
4862: <dt>A start tag whose tag name is "form"</dt>
4863: <dd>
4864:
4865: <p><a href="parsing.html#parse-error">Parse error</a>.</p>
4866:
4867: <p>If the <a href="parsing.html#form-element-pointer"><code title="form">form</code> element
4868: pointer</a> is not null, ignore the token.</p>
4869:
4870: <p>Otherwise:</p>
4871:
1.21 mike 4872: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token, and set the
4873: <a href="parsing.html#form-element-pointer"><code title="form">form</code> element pointer</a> to
4874: point to the element created.</p>
1.1 mike 4875:
4876: <p>Pop that <code><a href="forms.html#the-form-element">form</a></code> element off the <a href="parsing.html#stack-of-open-elements">stack of
4877: open elements</a>.</p>
4878:
4879: </dd>
4880:
4881: <!-- "form" end tag falls through to in-body, which does the right thing -->
4882:
4883: <dt>An end-of-file token</dt>
4884: <dd>
4885:
4886: <p>If the <a href="parsing.html#current-node">current node</a> is not the root
4887: <code><a href="semantics.html#the-html-element-0">html</a></code> element, then this is a <a href="parsing.html#parse-error">parse
4888: error</a>.</p>
4889:
4890: <p class="note">The root <code><a href="semantics.html#the-html-element-0">html</a></code> element can only be the <a href="parsing.html#current-node">current node</a> in
4891: the <a href="the-end.html#fragment-case">fragment case</a>.</p>
4892:
4893: <p><a href="the-end.html#stop-parsing">Stop parsing</a>.</p>
4894:
4895: </dd>
4896:
4897: <dt>Anything else</dt>
4898: <dd>
4899:
4900: <p><a href="parsing.html#parse-error">Parse error</a>. Process the token <a href="parsing.html#using-the-rules-for">using the
4901: rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in
4902: body</a>" <a href="parsing.html#insertion-mode">insertion mode</a>, except that if the
4903: <a href="parsing.html#current-node">current node</a> is a <code><a href="tabular-data.html#the-table-element">table</a></code>,
4904: <code><a href="tabular-data.html#the-tbody-element">tbody</a></code>, <code><a href="tabular-data.html#the-tfoot-element">tfoot</a></code>, <code><a href="tabular-data.html#the-thead-element">thead</a></code>, or
4905: <code><a href="tabular-data.html#the-tr-element">tr</a></code> element, then, whenever a node would be inserted
4906: into the <a href="parsing.html#current-node">current node</a>, it must instead be <a href="#foster-parent" title="foster parent">foster parented</a>.</p>
4907:
4908: </dd>
4909:
4910: </dl><p>When the steps above require the UA to <dfn id="clear-the-stack-back-to-a-table-context">clear the stack
4911: back to a table context</dfn>, it means that the UA must, while
4912: the <a href="parsing.html#current-node">current node</a> is not a <code><a href="tabular-data.html#the-table-element">table</a></code>
4913: element or an <code><a href="semantics.html#the-html-element-0">html</a></code> element, pop elements from the
4914: <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p>
4915:
4916: <p class="note">The <a href="parsing.html#current-node">current node</a> being an
4917: <code><a href="semantics.html#the-html-element-0">html</a></code> element after this process is a <a href="the-end.html#fragment-case">fragment
4918: case</a>.</p>
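
<p class="note">The "input" start tag rule in the "in table" switch above depends on an ASCII case-insensitive comparison against the string "hidden". The following is a minimal, non-normative sketch of that check. It assumes the token's attributes are available as a plain dictionary mapping names to values. The names <code title="">ascii_lower</code> and <code title="">is_hidden_input</code> are hypothetical and are not defined by this specification.</p>

```python
def ascii_lower(s: str) -> str:
    """Lowercase only the ASCII letters A-Z; leave every other character untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def is_hidden_input(attributes: dict) -> bool:
    """Return True when the token has a "type" attribute that matches "hidden".

    `attributes` is an assumed representation: attribute name -> value.
    """
    value = attributes.get("type")
    return value is not None and ascii_lower(value) == "hidden"
```

<p class="note">When the check returns false, the token is handled by the "anything else" entry. When it returns true, the element is inserted and immediately popped. Restricting the case folding to A-Z keeps non-ASCII look-alikes, such as U+212A KELVIN SIGN, from being treated as ASCII letters.</p>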
4919:
4920:
4921:
1.29 mike 4922: <h5 id="parsing-main-intabletext"><span class="secno">8.2.5.13 </span>The "<dfn title="insertion mode: in table text">in table text</dfn>" insertion mode</h5>
1.1 mike 4923:
4924: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-intabletext" title="insertion
4925: mode: in table text">in table text</a>", tokens must be handled
4926: as follows:</p>
4927:
4928: <dl class="switch"><dt>A character token</dt>
4929: <dd>
4930:
4931: <p>Append the character token to the <var><a href="#pending-table-character-tokens">pending table character
4932: tokens</a></var> list.</p>
4933:
4934: </dd>
4935:
4936:
4937: <dt>Anything else</dt>
4938: <dd>
4939:
4940: <p>If any of the tokens in the <var><a href="#pending-table-character-tokens">pending table character
4941: tokens</a></var> list are character tokens that are not one of U+0009
4942: CHARACTER TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED
4943: (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE, then
4944: reprocess those character tokens using the rules given in the
4945: "anything else" entry in the "<a href="#parsing-main-intable" title="insertion mode: in
4946: table">in table</a>" insertion mode.</p>
4947:
4948: <p>Otherwise, <a href="#insert-a-character" title="insert a character">insert the
4949: characters</a> given by the <var><a href="#pending-table-character-tokens">pending table character
4950: tokens</a></var> list into the <a href="parsing.html#current-node">current node</a>.</p>
4951:
4952: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to the <a href="parsing.html#original-insertion-mode">original
4953: insertion mode</a> and reprocess the token.</p>
4954:
4955: </dd>
4956:
1.29 mike 4957: </dl><h5 id="parsing-main-incaption"><span class="secno">8.2.5.14 </span>The "<dfn title="insertion mode: in caption">in caption</dfn>" insertion mode</h5>
1.1 mike 4958:
4959: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-incaption" title="insertion
4960: mode: in caption">in caption</a>", tokens must be handled as follows:</p>
4961:
4962: <dl class="switch"><dt>An end tag whose tag name is "caption"</dt>
4963: <dd>
4964:
4965: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-table-scope" title="has an element in table scope">have an element in table
4966: scope</a> with the same tag name as the token, this is a
4967: <a href="parsing.html#parse-error">parse error</a>. Ignore the token. (<a href="the-end.html#fragment-case">fragment
4968: case</a>)</p>
4969:
4970: <p>Otherwise:</p>
4971:
4972: <p><a href="#generate-implied-end-tags">Generate implied end tags</a>.</p>
4973:
4974: <p>Now, if the <a href="parsing.html#current-node">current node</a> is not a
4975: <code><a href="tabular-data.html#the-caption-element">caption</a></code> element, then this is a <a href="parsing.html#parse-error">parse
4976: error</a>.</p>
4977:
4978: <p>Pop elements from this stack until a <code><a href="tabular-data.html#the-caption-element">caption</a></code>
4979: element has been popped from the stack.</p>
4980:
4981: <p><a href="parsing.html#clear-the-list-of-active-formatting-elements-up-to-the-last-marker">Clear the list of active formatting elements up to
4982: the last marker</a>.</p>
4983:
4984: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-intable" title="insertion mode: in table">in table</a>".</p>
4985:
4986: </dd>
4987:
4988: <dt>A start tag whose tag name is one of: "caption", "col",
4989: "colgroup", "tbody", "td", "tfoot", "th", "thead", "tr"</dt>
4990: <dt>An end tag whose tag name is "table"</dt>
4991: <dd>
4992:
4993: <p><a href="parsing.html#parse-error">Parse error</a>. Act as if an end tag with the tag
4994: name "caption" had been seen, then, if that token wasn't
4995: ignored, reprocess the current token.</p>
4996:
4997: <p class="note">The fake end tag token here can only be
4998: ignored in the <a href="the-end.html#fragment-case">fragment case</a>.</p>
4999:
5000: </dd>
5001:
5002: <dt>An end tag whose tag name is one of: "body", "col",
5003: "colgroup", "html", "tbody", "td", "tfoot", "th", "thead",
5004: "tr"</dt>
5005: <dd>
5006: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
5007: </dd>
5008:
5009: <dt>Anything else</dt>
5010: <dd>
5011: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
5012: mode</a>.</p>
5013: </dd>
5014:
1.29 mike 5015: </dl><h5 id="parsing-main-incolgroup"><span class="secno">8.2.5.15 </span>The "<dfn title="insertion mode: in column group">in column group</dfn>" insertion mode</h5>
1.1 mike 5016:
5017: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-incolgroup" title="insertion
5018: mode: in column group">in column group</a>", tokens must be handled as follows:</p>
5019:
5020: <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
5021: TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
5022: U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
5023: <dd>
5024: <p><a href="#insert-a-character" title="insert a character">Insert the character</a> into
5025: the <a href="parsing.html#current-node">current node</a>.</p>
5026: </dd>
5027:
5028: <dt>A comment token</dt>
5029: <dd>
5030: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current
5031: node</a> with the <code title="">data</code> attribute set to
5032: the data given in the comment token.</p>
5033: </dd>
5034:
5035: <dt>A DOCTYPE token</dt>
5036: <dd>
5037: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
5038: </dd>
5039:
5040: <dt>A start tag whose tag name is "html"</dt>
5041: <dd>
5042: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
5043: mode</a>.</p>
5044: </dd>
5045:
5046: <dt>A start tag whose tag name is "col"</dt>
5047: <dd>
5048:
5049: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token. Immediately
5050: pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
5051: elements</a>.</p>
5052:
5053: <p><a href="#acknowledge-self-closing-flag" title="acknowledge self-closing flag">Acknowledge the
5054: token's <i>self-closing flag</i></a>, if it is set.</p>
5055:
5056: </dd>
5057:
5058: <dt>An end tag whose tag name is "colgroup"</dt>
5059: <dd>
5060:
5061: <p>If the <a href="parsing.html#current-node">current node</a> is the root
5062: <code><a href="semantics.html#the-html-element-0">html</a></code> element, then this is a <a href="parsing.html#parse-error">parse
5063: error</a>; ignore the token. (<a href="the-end.html#fragment-case">fragment
5064: case</a>)</p>
5065:
5066: <p>Otherwise, pop the <a href="parsing.html#current-node">current node</a> (which will be
5067: a <code><a href="tabular-data.html#the-colgroup-element">colgroup</a></code> element) from the <a href="parsing.html#stack-of-open-elements">stack of open
5068: elements</a>. Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to
5069: "<a href="#parsing-main-intable" title="insertion mode: in table">in table</a>".</p>
5070:
5071: </dd>
5072:
5073: <dt>An end tag whose tag name is "col"</dt>
5074: <dd>
5075: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
5076: </dd>
5077:
5078: <dt>An end-of-file token</dt>
5079: <dd>
5080:
5081: <p>If the <a href="parsing.html#current-node">current node</a> is the root <code><a href="semantics.html#the-html-element-0">html</a></code>
5082: element, then <a href="the-end.html#stop-parsing">stop parsing</a>. (<a href="the-end.html#fragment-case">fragment
5083: case</a>)</p>
5084:
5085: <p>Otherwise, act as described in the "anything else" entry
5086: below.</p>
5087:
5088: </dd>
5089:
5090: <dt>Anything else</dt>
5091: <dd>
5092:
5093: <p>Act as if an end tag with the tag name "colgroup" had been
5094: seen, and then, if that token wasn't ignored, reprocess the
5095: current token.</p>
5096:
5097: <p class="note">The fake end tag token here can only be
5098: ignored in the <a href="the-end.html#fragment-case">fragment case</a>.</p>
5099:
5100: </dd>
5101:
1.29 mike 5102: </dl><h5 id="parsing-main-intbody"><span class="secno">8.2.5.16 </span>The "<dfn title="insertion mode: in table body">in table body</dfn>" insertion mode</h5>
1.1 mike 5103:
5104: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-intbody" title="insertion
5105: mode: in table body">in table body</a>", tokens must be handled as follows:</p>
5106:
5107: <dl class="switch"><dt>A start tag whose tag name is "tr"</dt>
5108: <dd>
5109:
5110: <p><a href="#clear-the-stack-back-to-a-table-body-context">Clear the stack back to a table body
5111: context</a>. (See below.)</p>
5112:
5113: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token, then switch
5114: the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-intr" title="insertion mode:
5115: in row">in row</a>".</p>
5116:
5117: </dd>
5118:
5119: <dt>A start tag whose tag name is one of: "th", "td"</dt>
5120: <dd>
5121: <p><a href="parsing.html#parse-error">Parse error</a>. Act as if a start tag with
5122: the tag name "tr" had been seen, then reprocess the current
5123: token.</p>
5124: </dd>
5125:
5126: <dt>An end tag whose tag name is one of: "tbody", "tfoot",
5127: "thead"</dt>
5128: <dd>
5129:
5130: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-table-scope" title="has an element in table scope">have an element in table
5131: scope</a> with the same tag name as the token, this is a
5132: <a href="parsing.html#parse-error">parse error</a>. Ignore the token.</p>
5133:
5134: <p>Otherwise:</p>
5135:
5136: <p><a href="#clear-the-stack-back-to-a-table-body-context">Clear the stack back to a table body
5137: context</a>. (See below.)</p>
5138:
5139: <p>Pop the <a href="parsing.html#current-node">current node</a> from the <a href="parsing.html#stack-of-open-elements">stack of
5140: open elements</a>. Switch the <a href="parsing.html#insertion-mode">insertion mode</a>
5141: to "<a href="#parsing-main-intable" title="insertion mode: in table">in table</a>".</p>
5142:
5143: </dd>
5144:
5145: <dt>A start tag whose tag name is one of: "caption", "col",
5146: "colgroup", "tbody", "tfoot", "thead"</dt>
5147: <dt>An end tag whose tag name is "table"</dt>
5148: <dd>
5149:
5150: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-table-scope" title="has an element in table scope">have a
5151: <code>tbody</code>, <code>thead</code>, or <code>tfoot</code>
5152: element in table scope</a>, this is a <a href="parsing.html#parse-error">parse
5153: error</a>. Ignore the token. (<a href="the-end.html#fragment-case">fragment
5154: case</a>)</p>
5155:
5156: <p>Otherwise:</p>
5157:
5158: <p><a href="#clear-the-stack-back-to-a-table-body-context">Clear the stack back to a table body
5159: context</a>. (See below.)</p>
5160:
5161: <p>Act as if an end tag with the same tag name as the
5162: <a href="parsing.html#current-node">current node</a> ("tbody", "tfoot", or "thead") had
5163: been seen, then reprocess the current token.</p>
5164:
5165: </dd>
5166:
5167: <dt>An end tag whose tag name is one of: "body", "caption",
5168: "col", "colgroup", "html", "td", "th", "tr"</dt>
5169: <dd>
5170: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
5171: </dd>
5172:
5173: <dt>Anything else</dt>
5174: <dd>
5175: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-intable" title="insertion mode: in table">in table</a>" <a href="parsing.html#insertion-mode">insertion
5176: mode</a>.</p>
5177: </dd>
5178:
5179: </dl><p>When the steps above require the UA to <dfn id="clear-the-stack-back-to-a-table-body-context">clear the stack
5180: back to a table body context</dfn>, it means that the UA must,
5181: while the <a href="parsing.html#current-node">current node</a> is not a <code><a href="tabular-data.html#the-tbody-element">tbody</a></code>,
5182: <code><a href="tabular-data.html#the-tfoot-element">tfoot</a></code>, <code><a href="tabular-data.html#the-thead-element">thead</a></code>, or <code><a href="semantics.html#the-html-element-0">html</a></code>
5183: element, pop elements from the <a href="parsing.html#stack-of-open-elements">stack of open
5184: elements</a>.</p>
5185:
5186: <p class="note">The <a href="parsing.html#current-node">current node</a> being an
5187: <code><a href="semantics.html#the-html-element-0">html</a></code> element after this process is a <a href="the-end.html#fragment-case">fragment
5188: case</a>.</p>
5189:
5190:
1.29 mike 5191: <h5 id="parsing-main-intr"><span class="secno">8.2.5.17 </span>The "<dfn title="insertion mode: in row">in row</dfn>" insertion mode</h5>
1.1 mike 5192:
5193: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-intr" title="insertion
5194: mode: in row">in row</a>", tokens must be handled as follows:</p>
5195:
5196: <dl class="switch"><dt>A start tag whose tag name is one of: "th", "td"</dt>
5197: <dd>
5198:
5199: <p><a href="#clear-the-stack-back-to-a-table-row-context">Clear the stack back to a table row
5200: context</a>. (See below.)</p>
5201:
5202: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token, then switch
5203: the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-intd" title="insertion mode:
5204: in cell">in cell</a>".</p>
5205:
5206: <p>Insert a marker at the end of the <a href="parsing.html#list-of-active-formatting-elements">list of active
5207: formatting elements</a>.</p>
5208:
5209: </dd>
5210:
5211: <dt>An end tag whose tag name is "tr"</dt>
5212: <dd>
5213:
5214: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-table-scope" title="has an element in table scope">have an element in table
5215: scope</a> with the same tag name as the token, this is a
5216: <a href="parsing.html#parse-error">parse error</a>. Ignore the token. (<a href="the-end.html#fragment-case">fragment
5217: case</a>)</p>
5218:
5219: <p>Otherwise:</p>
5220:
5221: <p><a href="#clear-the-stack-back-to-a-table-row-context">Clear the stack back to a table row
5222: context</a>. (See below.)</p>
5223:
5224: <p>Pop the <a href="parsing.html#current-node">current node</a> (which will be a
5225: <code><a href="tabular-data.html#the-tr-element">tr</a></code> element) from the <a href="parsing.html#stack-of-open-elements">stack of open
5226: elements</a>. Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to
5227: "<a href="#parsing-main-intbody" title="insertion mode: in table body">in table
5228: body</a>".</p>
5229:
5230: </dd>
5231:
5232: <dt>A start tag whose tag name is one of: "caption", "col",
5233: "colgroup", "tbody", "tfoot", "thead", "tr"</dt>
5234: <dt>An end tag whose tag name is "table"</dt>
5235: <dd>
5236:
5237: <p>Act as if an end tag with the tag name "tr" had been seen,
5238: then, if that token wasn't ignored, reprocess the current
5239: token.</p>
5240:
5241: <p class="note">The fake end tag token here can only be
5242: ignored in the <a href="the-end.html#fragment-case">fragment case</a>.</p>
5243:
5244: </dd>
5245:
5246: <dt>An end tag whose tag name is one of: "tbody", "tfoot",
5247: "thead"</dt>
5248: <dd>
5249:
5250: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-table-scope" title="has an element in table scope">have an element in table
5251: scope</a> with the same tag name as the token, this is a
5252: <a href="parsing.html#parse-error">parse error</a>. Ignore the token.</p>
5253:
5254: <p>Otherwise, act as if an end tag with the tag name "tr" had
5255: been seen, then reprocess the current token.</p>
5256:
5257: </dd>
5258:
5259: <dt>An end tag whose tag name is one of: "body", "caption",
5260: "col", "colgroup", "html", "td", "th"</dt>
5261: <dd>
5262: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
5263: </dd>
5264:
5265: <dt>Anything else</dt>
5266: <dd>
5267: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-intable" title="insertion mode: in table">in table</a>" <a href="parsing.html#insertion-mode">insertion
5268: mode</a>.</p>
5269: </dd>
5270:
5271: </dl><p>When the steps above require the UA to <dfn id="clear-the-stack-back-to-a-table-row-context">clear the stack
5272: back to a table row context</dfn>, it means that the UA must,
5273: while the <a href="parsing.html#current-node">current node</a> is not a <code><a href="tabular-data.html#the-tr-element">tr</a></code>
5274: element or an <code><a href="semantics.html#the-html-element-0">html</a></code> element, pop elements from the
5275: <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p>
5276:
5277: <p class="note">The <a href="parsing.html#current-node">current node</a> being an
5278: <code><a href="semantics.html#the-html-element-0">html</a></code> element after this process is a <a href="the-end.html#fragment-case">fragment
5279: case</a>.</p>
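
<p class="note">The three "clear the stack back to a table context", "table body context", and "table row context" definitions share one shape: pop until the current node is one of a small set of elements. The following non-normative sketch illustrates this under simplifying assumptions. The stack of open elements is modelled as a list of tag-name strings whose last entry is the current node, and all names used here (<code title="">clear_stack_back_to</code> and the three constants) are hypothetical.</p>

```python
# Elements at which each "clear the stack back to ..." step stops.
TABLE_CONTEXT = {"table", "html"}
TABLE_BODY_CONTEXT = {"tbody", "tfoot", "thead", "html"}
TABLE_ROW_CONTEXT = {"tr", "html"}


def clear_stack_back_to(stack: list, context: set) -> None:
    """Pop entries from the end of `stack` until the current node is in `context`.

    The emptiness guard is a defensive assumption of this sketch; in the
    parser the root html element is always on the stack.
    """
    while stack and stack[-1] not in context:
        stack.pop()
```

<p class="note">For example, starting from a stack of <code title="">html, body, table, tbody, tr, td, b</code>, clearing back to a table row context leaves <code title="">tr</code> as the current node. Stopping at <code title="">html</code> corresponds to the fragment case described in the notes above.</p>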
5280:
5281:
1.29 mike 5282: <h5 id="parsing-main-intd"><span class="secno">8.2.5.18 </span>The "<dfn title="insertion mode: in cell">in cell</dfn>" insertion mode</h5>
1.1 mike 5283:
5284: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-intd" title="insertion
5285: mode: in cell">in cell</a>", tokens must be handled as follows:</p>
5286:
5287: <dl class="switch"><dt>An end tag whose tag name is one of: "td", "th"</dt>
5288: <dd>
5289:
5290: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-table-scope" title="has an element in table scope">have an element in table
5291: scope</a> with the same tag name as that of the token, then
5292: this is a <a href="parsing.html#parse-error">parse error</a> and the token must be
5293: ignored.</p>
5294:
5295: <p>Otherwise:</p>
5296:
5297: <p><a href="#generate-implied-end-tags">Generate implied end tags</a>.</p>
5298:
5299: <p>Now, if the <a href="parsing.html#current-node">current node</a> is not an element
5300: with the same tag name as the token, then this is a
5301: <a href="parsing.html#parse-error">parse error</a>.</p>
5302:
5303: <p>Pop elements from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>
5304: until an element with the same tag name as the token has been
5305: popped from the stack.</p>
5306:
5307: <p><a href="parsing.html#clear-the-list-of-active-formatting-elements-up-to-the-last-marker">Clear the list of active formatting elements up to
5308: the last marker</a>.</p>
5309:
1.22 mike 5310: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-intr" title="insertion mode: in row">in row</a>".</p> <!-- current
5311: node here will be a <tr> normally; but could be <html> in the
5312: fragment case -->
1.1 mike 5313:
5314: </dd>
5315:
5316: <dt>A start tag whose tag name is one of: "caption", "col",
5317: "colgroup", "tbody", "td", "tfoot", "th", "thead", "tr"</dt>
5318: <dd>
5319:
5320: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does
5321: <em>not</em> <a href="parsing.html#has-an-element-in-table-scope" title="has an element in table scope">have
5322: a <code>td</code> or <code>th</code> element in table
5323: scope</a>, then this is a <a href="parsing.html#parse-error">parse error</a>; ignore
5324: the token. (<a href="the-end.html#fragment-case">fragment case</a>)</p>
5325:
5326: <p>Otherwise, <a href="#close-the-cell">close the cell</a> (see below) and
5327: reprocess the current token.</p>
5328:
5329: </dd>
5330:
5331: <dt>An end tag whose tag name is one of: "body", "caption",
5332: "col", "colgroup", "html"</dt>
5333: <dd>
5334: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
5335: </dd>
5336:
5337: <dt>An end tag whose tag name is one of: "table", "tbody",
5338: "tfoot", "thead", "tr"</dt>
5339: <dd>
5340:
5341: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-table-scope" title="has an element in table scope">have an element in table
1.14 mike 5342: scope</a> with the same tag name as that of the token (which
5343: can only happen for "tbody", "tfoot" and "thead", or in the
5344: <a href="the-end.html#fragment-case">fragment case</a>), then this is a <a href="parsing.html#parse-error">parse
1.1 mike 5345: error</a> and the token must be ignored.</p>
5346:
5347: <p>Otherwise, <a href="#close-the-cell">close the cell</a> (see below) and
5348: reprocess the current token.</p>
5349:
5350: </dd>
5351:
5352: <dt>Anything else</dt>
5353: <dd>
5354: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
5355: mode</a>.</p>
5356: </dd>
5357:
5358: </dl><p>Where the steps above say to <dfn id="close-the-cell">close the cell</dfn>, they
5359: mean to run the following algorithm:</p>
5360:
5361: <ol><li><p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-table-scope" title="has an element in table scope">has a <code>td</code>
5362: element in table scope</a>, then act as if an end tag token
5363: with the tag name "td" had been seen.</p></li>
5364:
5365: <li><p>Otherwise, the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> will
5366: <a href="parsing.html#has-an-element-in-table-scope" title="has an element in table scope">have a
5367: <code>th</code> element in table scope</a>; act as if an end
5368: tag token with the tag name "th" had been seen.</p></li>
5369:
1.31 mike 5370: </ol><p class="note">The <a href="parsing.html#stack-of-open-elements">stack of open elements</a> cannot have
5371: both a <code><a href="tabular-data.html#the-td-element">td</a></code> and a <code><a href="tabular-data.html#the-th-element">th</a></code> element <a href="parsing.html#has-an-element-in-table-scope" title="has an element in table scope">in table scope</a> at the
5372: same time, nor can it have neither when the <a href="#close-the-cell">close the
5373: cell</a> algorithm is invoked.</p>
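
<p class="note">The "close the cell" algorithm above can be sketched as follows. This is a simplified, non-normative illustration: the stack of open elements is modelled as a list of tag-name strings, the table scope check is an assumed approximation that stops at <code title="">table</code> and <code title="">html</code> (the real definition is given elsewhere in this specification), and the fake end tag is reduced to popping the stack, omitting the implied end tags, parse error reporting, list of active formatting elements, and insertion mode steps. The function names are hypothetical.</p>

```python
def has_in_table_scope(stack: list, name: str) -> bool:
    """Assumed approximation of "has an element in table scope"."""
    for node in reversed(stack):
        if node == name:
            return True
        if node in ("html", "table"):
            return False
    return False


def close_cell(stack: list) -> str:
    """Pick the end tag to fake ("td" or "th") and pop the cell off the stack.

    Assumes, as the note above states, that exactly one of td and th is
    in table scope when this is called.
    """
    name = "td" if has_in_table_scope(stack, "td") else "th"
    # Simplified stand-in for "act as if an end tag with that name had been seen".
    while stack.pop() != name:
        pass
    return name
```

<p class="note">Because a <code title="">td</code> and a <code title="">th</code> element cannot both be in table scope, testing for <code title="">td</code> first and falling back to <code title="">th</code> is sufficient.</p>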
1.1 mike 5374:
5375:
1.29 mike 5376: <h5 id="parsing-main-inselect"><span class="secno">8.2.5.19 </span>The "<dfn title="insertion mode: in select">in select</dfn>" insertion mode</h5>
1.1 mike 5377:
5378: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inselect" title="insertion
5379: mode: in select">in select</a>", tokens must be handled as follows:</p>
5380:
5381: <dl class="switch"><dt>A character token</dt>
5382: <dd>
5383: <p><a href="#insert-a-character" title="insert a character">Insert the token's
5384: character</a> into the <a href="parsing.html#current-node">current node</a>.</p>
5385: </dd>
5386:
5387: <dt>A comment token</dt>
5388: <dd>
5389: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current
5390: node</a> with the <code title="">data</code> attribute set to
5391: the data given in the comment token.</p>
5392: </dd>
5393:
5394: <dt>A DOCTYPE token</dt>
5395: <dd>
5396: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
5397: </dd>
5398:
5399: <dt>A start tag whose tag name is "html"</dt>
5400: <dd>
5401: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
5402: mode</a>.</p>
5403: </dd>
5404:
5405: <dt>A start tag whose tag name is "option"</dt>
5406: <dd>
5407:
5408: <p>If the <a href="parsing.html#current-node">current node</a> is an <code><a href="the-button-element.html#the-option-element">option</a></code>
5409: element, act as if an end tag with the tag name "option" had
5410: been seen.</p>
5411:
5412: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
5413:
5414: </dd>
5415:
5416: <dt>A start tag whose tag name is "optgroup"</dt>
5417: <dd>
5418:
5419: <p>If the <a href="parsing.html#current-node">current node</a> is an <code><a href="the-button-element.html#the-option-element">option</a></code>
5420: element, act as if an end tag with the tag name "option" had
5421: been seen.</p>
5422:
5423: <p>If the <a href="parsing.html#current-node">current node</a> is an
5424: <code><a href="the-button-element.html#the-optgroup-element">optgroup</a></code> element, act as if an end tag with the
5425: tag name "optgroup" had been seen.</p>
5426:
5427: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
5428:
5429: </dd>
5430:
5431: <dt>An end tag whose tag name is "optgroup"</dt>
5432: <dd>
5433:
5434: <p>First, if the <a href="parsing.html#current-node">current node</a> is an
5435: <code><a href="the-button-element.html#the-option-element">option</a></code> element, and the node immediately before
5436: it in the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> is an
5437: <code><a href="the-button-element.html#the-optgroup-element">optgroup</a></code> element, then act as if an end tag with
5438: the tag name "option" had been seen.</p>
5439:
5440: <p>If the <a href="parsing.html#current-node">current node</a> is an
5441: <code><a href="the-button-element.html#the-optgroup-element">optgroup</a></code> element, then pop that node from the
5442: <a href="parsing.html#stack-of-open-elements">stack of open elements</a>. Otherwise, this is a
5443: <a href="parsing.html#parse-error">parse error</a>; ignore the token.</p>
5444:
5445: </dd>
5446:
5447: <dt>An end tag whose tag name is "option"</dt>
5448: <dd>
5449:
5450: <p>If the <a href="parsing.html#current-node">current node</a> is an <code><a href="the-button-element.html#the-option-element">option</a></code>
5451: element, then pop that node from the <a href="parsing.html#stack-of-open-elements">stack of open
5452: elements</a>. Otherwise, this is a <a href="parsing.html#parse-error">parse
5453: error</a>; ignore the token.</p>
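<p class="note">The <code>option</code> and <code>optgroup</code> rules above can be sketched as operations on the stack of open elements alone. This is a minimal, non-normative illustration: the stack is modelled as a list of tag names with the current node last, the function names and the <code>errors</code> list are hypothetical, and no DOM nodes are created.</p>

```python
def end_option(stack, errors):
    """End tag "option": pop only if the current node is an option."""
    if stack and stack[-1] == "option":
        stack.pop()
    else:
        errors.append("parse error: unexpected option end tag")


def end_optgroup(stack, errors):
    """End tag "optgroup": implicitly close an option nested in an optgroup."""
    if len(stack) >= 2 and stack[-1] == "option" and stack[-2] == "optgroup":
        end_option(stack, errors)
    if stack and stack[-1] == "optgroup":
        stack.pop()
    else:
        errors.append("parse error: unexpected optgroup end tag")


def start_option(stack, errors):
    """Start tag "option": close an open option first, then push."""
    if stack and stack[-1] == "option":
        end_option(stack, errors)
    stack.append("option")  # stands in for "insert an HTML element"


def start_optgroup(stack, errors):
    """Start tag "optgroup": close an open option, then an open optgroup."""
    if stack and stack[-1] == "option":
        end_option(stack, errors)
    if stack and stack[-1] == "optgroup":
        end_optgroup(stack, errors)
    stack.append("optgroup")  # stands in for "insert an HTML element"
```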
5454:
5455: </dd>
5456:
5457: <dt>An end tag whose tag name is "select"</dt>
5458: <dd>
5459:
1.39 mike 5460: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-select-scope" title="has an element in select scope">have an element in select
1.1 mike 5461: scope</a> with the same tag name as the token, this is a
5462: <a href="parsing.html#parse-error">parse error</a>. Ignore the token. (<a href="the-end.html#fragment-case">fragment
5463: case</a>)</p>
5464:
5465: <p>Otherwise:</p>
5466:
5467: <p>Pop elements from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>
5468: until a <code><a href="the-button-element.html#the-select-element">select</a></code> element has been popped from the
5469: stack.</p>
5470:
5471: <p><a href="parsing.html#reset-the-insertion-mode-appropriately">Reset the insertion mode appropriately</a>.</p>
5472:
5473: </dd>
5474:
5475: <dt>A start tag whose tag name is "select"</dt>
5476: <dd>
5477:
5478: <p><a href="parsing.html#parse-error">Parse error</a>. Act as if the token had been
5479: an end tag with the tag name "select" instead.</p>
5480:
5481: </dd>
5482:
5483: <dt>A start tag whose tag name is one of: "input", "keygen", "textarea"</dt>
5484: <dd>
5485:
5486: <p><a href="parsing.html#parse-error">Parse error</a>.</p>
5487:
1.39 mike 5488: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-select-scope" title="has an element in select scope">have a <code>select</code>
5489: element in select scope</a>, ignore the token. (<a href="the-end.html#fragment-case">fragment
1.1 mike 5490: case</a>)</p>
5491:
5492: <p>Otherwise, act as if an end tag with the tag name "select" had
5493: been seen, and reprocess the token.</p>
5494:
5495: </dd>
5496:
5497: <dt>A start tag whose tag name is "script"</dt>
5498: <dd>
5499: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>" <a href="parsing.html#insertion-mode">insertion
5500: mode</a>.</p>
5501: </dd>
5502:
5503: <dt>An end-of-file token</dt>
5504: <dd>
5505:
5506: <p>If the <a href="parsing.html#current-node">current node</a> is not the root
5507: <code><a href="semantics.html#the-html-element-0">html</a></code> element, then this is a <a href="parsing.html#parse-error">parse
5508: error</a>.</p>
5509:
5510: <p class="note">The root <code><a href="semantics.html#the-html-element-0">html</a></code> element can only be the <a href="parsing.html#current-node">current node</a> in
5511: the <a href="the-end.html#fragment-case">fragment case</a>.</p>
5512:
5513: <p><a href="the-end.html#stop-parsing">Stop parsing</a>.</p>
5514:
5515: </dd>
5516:
5517: <dt>Anything else</dt>
5518: <dd>
5519: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
5520: </dd>
5521:
1.29 mike 5522: </dl><h5 id="parsing-main-inselectintable"><span class="secno">8.2.5.20 </span>The "<dfn title="insertion mode: in select in table">in select in table</dfn>" insertion mode</h5>
1.1 mike 5523:
5524: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inselectintable" title="insertion
5525: mode: in select in table">in select in table</a>", tokens must be handled as follows:</p>
5526:
5527: <dl class="switch"><dt>A start tag whose tag name is one of: "caption", "table",
5528: "tbody", "tfoot", "thead", "tr", "td", "th"</dt>
5529: <dd>
5530: <p><a href="parsing.html#parse-error">Parse error</a>. Act as if an end tag with the tag
5531: name "select" had been seen, and reprocess the token.</p>
5532: </dd>
5533:
5534: <dt>An end tag whose tag name is one of: "caption", "table",
5535: "tbody", "tfoot", "thead", "tr", "td", "th"</dt>
5536: <dd>
5537:
5538: <p><a href="parsing.html#parse-error">Parse error</a>.</p>
5539:
5540: <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-table-scope">has an
5541: element in table scope</a> with the same tag name as that
5542: of the token, then act as if an end tag with the tag name
5543: "select" had been seen, and reprocess the token. Otherwise,
5544: ignore the token.</p>
5545:
5546: </dd>
5547:
5548: <dt>Anything else</dt>
5549: <dd>
5550: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inselect" title="insertion mode: in select">in select</a>" <a href="parsing.html#insertion-mode">insertion
5551: mode</a>.</p>
5552: </dd>
5553:
1.29 mike 5554: </dl><h5 id="parsing-main-inforeign"><span class="secno">8.2.5.21 </span>The "<dfn title="insertion mode: in foreign content">in foreign content</dfn>" insertion mode</h5>
1.1 mike 5555:
5556: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inforeign" title="insertion
5557: mode: in foreign content">in foreign content</a>", tokens must be
5558: handled as follows:</p>
5559:
1.42 mike 5560: <dl class="switch"><dt>Any token, if the <a href="parsing.html#current-node">current node</a> is an element in the <a href="namespaces.html#html-namespace-0">HTML namespace</a></dt>
5561: <dt>A start tag whose tag name is neither "mglyph" nor "malignmark", if the <a href="parsing.html#current-node">current node</a> is a <a href="#mathml-text-integration-point">MathML text integration point</a></dt>
5562: <dt>A start tag whose tag name is "svg", if the <a href="parsing.html#current-node">current node</a> is an <code title="">annotation-xml</code> element in the <a href="namespaces.html#mathml-namespace">MathML namespace</a></dt>
5563: <dt>A start tag, if the <a href="parsing.html#current-node">current node</a> is an <a href="#html-integration-point">HTML integration point</a></dt>
5564: <dt>A character token, if the <a href="parsing.html#current-node">current node</a> is an <a href="#html-integration-point">HTML integration point</a></dt>
5565: <dt>An end-of-file token</dt>
5566: <dd>
5567:
5568: <ol><li><p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the
5569: "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>"
5570: <a href="parsing.html#insertion-mode">insertion mode</a>, except that if those rules say to
5571: reprocess the token, these steps must be finished first (i.e. the
5572: insertion mode is reset by the following step before the token is
5573: reprocessed).</p></li>
5574:
5575: <li><p>If, after doing so, the <a href="parsing.html#insertion-mode">insertion mode</a> is
5576: still "<a href="#parsing-main-inforeign" title="insertion mode: in foreign content">in
5577: foreign content</a>", <a href="parsing.html#reset-the-insertion-mode-appropriately">reset the insertion mode
5578: appropriately</a>.</p></li>
5579:
5580: </ol></dd>
5581:
5582: <dt>A character token</dt>
1.1 mike 5583: <dd>
5584:
5585: <p><a href="#insert-a-character" title="insert a character">Insert the token's
5586: character</a> into the <a href="parsing.html#current-node">current node</a>.</p>
5587:
5588: <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
5589: LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
5590: (CR), or U+0020 SPACE, then set the <a href="parsing.html#frameset-ok-flag">frameset-ok
5591: flag</a> to "not ok".</p>
5592:
5593: </dd>
5594:
5595: <dt>A comment token</dt>
5596: <dd>
5597: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current
5598: node</a> with the <code title="">data</code> attribute set to
5599: the data given in the comment token.</p>
5600: </dd>
5601:
5602: <dt>A DOCTYPE token</dt>
5603: <dd>
5604: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
5605: </dd>
5606:
5607: <dt>A start tag whose tag name is one of: <!--"a",--> "b", "big",
5608: "blockquote", "body"<!--by inspection-->, "br", "center", "code",
5609: "dd", "div", "dl", "dt"<!-- so that dd and dt can be handled
5610: uniformly throughout the parser -->, "em", "embed", "h1", "h2",
5611: "h3", "h4"<!--for completeness-->, "h5", "h6"<!--for
5612: completeness-->, "head"<!--by inspection-->, "hr", "i", "img",
5613: "li", "listing"<!-- so that pre and listing can be handled
5614: uniformly throughout the parser -->, "menu", "meta", "nobr",
5615: "ol"<!-- so that dl, ul, and ol can be handled uniformly throughout
5616: the parser -->, "p", "pre", "ruby", "s", <!--"script",--> "small",
5617: "span", "strong", "strike"<!-- so that s and strike can be handled
5618: uniformly throughout the parser -->, <!--"style",--> "sub", "sup",
5619: "table"<!--by inspection-->, "tt", "u", "ul", "var"</dt> <!-- this
5620: list was determined empirically by studying over 6,000,000,000
5621: pages that were specifically not XML pages -->
5622: <dt>A start tag whose tag name is "font", if the token has any
5623: attributes named "color", "face", or "size"</dt> <!-- the
5624: attributes here are required so that SVG <font> will go through as
5625: SVG but legacy <font>s won't -->
1.42 mike 5626:
5627: <dd>
1.1 mike 5628:
5629: <p><a href="parsing.html#parse-error">Parse error</a>.</p>
5630:
1.33 mike 5631: <p>Pop an element from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>,
5632: and then keep popping more elements from the <a href="parsing.html#stack-of-open-elements">stack of open
1.42 mike 5633: elements</a> until the <a href="parsing.html#current-node">current node</a> is a
5634: <a href="#mathml-text-integration-point">MathML text integration point</a>, an <a href="#html-integration-point">HTML
5635: integration point</a>, or an element in the <a href="namespaces.html#html-namespace-0">HTML
5636: namespace</a>.</p>
1.25 mike 5637:
1.42 mike 5638: <p>Then, <a href="parsing.html#reset-the-insertion-mode-appropriately">reset the insertion mode appropriately</a> and
1.38 mike 5639: reprocess the token.</p>
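<p class="note">The popping step above can be sketched as follows. This is a non-normative illustration under stated assumptions: the <code>Element</code> class, the helper names, and the list-based stack (current node last) are hypothetical, and the root <code>html</code> element is assumed to be at the bottom of the stack so that the loop terminates. The two predicates follow the definitions of MathML text integration point and HTML integration point given in this section.</p>

```python
import string
from dataclasses import dataclass, field

HTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
SVG_NS = "http://www.w3.org/2000/svg"

# Lowercases A-Z only, for ASCII case-insensitive comparison.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


@dataclass
class Element:
    namespace: str
    name: str
    attrs: dict = field(default_factory=dict)  # start tag attributes


def is_mathml_text_integration_point(el):
    return el.namespace == MATHML_NS and el.name in {"mi", "mo", "mn", "ms", "mtext"}


def is_html_integration_point(el):
    if el.namespace == MATHML_NS and el.name == "annotation-xml":
        encoding = el.attrs.get("encoding")
        if encoding is None:
            return False
        return encoding.translate(_ASCII_LOWER) in {"text/html", "application/xhtml+xml"}
    return el.namespace == SVG_NS and el.name in {"foreignObject", "desc", "title"}


def break_out_of_foreign_content(stack):
    """Pop once, then keep popping until the current node is a stopping point."""
    stack.pop()
    while not (
        stack[-1].namespace == HTML_NS
        or is_mathml_text_integration_point(stack[-1])
        or is_html_integration_point(stack[-1])
    ):
        stack.pop()
    # The caller would then reset the insertion mode and reprocess the token.
```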
1.1 mike 5640:
5641: </dd>
5642:
5643: <dt>Any other start tag</dt>
5644: <dd>
5645:
5646: <p>If the <a href="parsing.html#current-node">current node</a> is an element in the
5647: <a href="namespaces.html#mathml-namespace">MathML namespace</a>, <a href="#adjust-mathml-attributes">adjust MathML
5648: attributes</a> for the token. (This fixes the case of MathML
5649: attributes that are not all lowercase.)</p>
5650:
5651: <p>If the <a href="parsing.html#current-node">current node</a> is an element in the <a href="namespaces.html#svg-namespace">SVG
5652: namespace</a>, and the token's tag name is one of the ones in
5653: the first column of the following table, change the tag name to
5654: the name given in the corresponding cell in the second
5655: column. (This fixes the case of SVG elements that are not all
5656: lowercase.)</p>
5657:
5658: <table><thead><tr><th> Tag name </th><th> Element name
5659: </th></tr></thead><tbody><tr><td> <code title="">altglyph</code> </td><td> <code title="">altGlyph</code>
5660: </td></tr><tr><td> <code title="">altglyphdef</code> </td><td> <code title="">altGlyphDef</code>
5661: </td></tr><tr><td> <code title="">altglyphitem</code> </td><td> <code title="">altGlyphItem</code>
5662: </td></tr><tr><td> <code title="">animatecolor</code> </td><td> <code title="">animateColor</code>
5663: </td></tr><tr><td> <code title="">animatemotion</code> </td><td> <code title="">animateMotion</code>
5664: </td></tr><tr><td> <code title="">animatetransform</code> </td><td> <code title="">animateTransform</code>
5665: </td></tr><tr><td> <code title="">clippath</code> </td><td> <code title="">clipPath</code>
5666: </td></tr><tr><td> <code title="">feblend</code> </td><td> <code title="">feBlend</code>
5667: </td></tr><tr><td> <code title="">fecolormatrix</code> </td><td> <code title="">feColorMatrix</code>
5668: </td></tr><tr><td> <code title="">fecomponenttransfer</code> </td><td> <code title="">feComponentTransfer</code>
5669: </td></tr><tr><td> <code title="">fecomposite</code> </td><td> <code title="">feComposite</code>
5670: </td></tr><tr><td> <code title="">feconvolvematrix</code> </td><td> <code title="">feConvolveMatrix</code>
5671: </td></tr><tr><td> <code title="">fediffuselighting</code> </td><td> <code title="">feDiffuseLighting</code>
5672: </td></tr><tr><td> <code title="">fedisplacementmap</code> </td><td> <code title="">feDisplacementMap</code>
5673: </td></tr><tr><td> <code title="">fedistantlight</code> </td><td> <code title="">feDistantLight</code>
5674: </td></tr><tr><td> <code title="">feflood</code> </td><td> <code title="">feFlood</code>
5675: </td></tr><tr><td> <code title="">fefunca</code> </td><td> <code title="">feFuncA</code>
5676: </td></tr><tr><td> <code title="">fefuncb</code> </td><td> <code title="">feFuncB</code>
5677: </td></tr><tr><td> <code title="">fefuncg</code> </td><td> <code title="">feFuncG</code>
5678: </td></tr><tr><td> <code title="">fefuncr</code> </td><td> <code title="">feFuncR</code>
5679: </td></tr><tr><td> <code title="">fegaussianblur</code> </td><td> <code title="">feGaussianBlur</code>
5680: </td></tr><tr><td> <code title="">feimage</code> </td><td> <code title="">feImage</code>
5681: </td></tr><tr><td> <code title="">femerge</code> </td><td> <code title="">feMerge</code>
5682: </td></tr><tr><td> <code title="">femergenode</code> </td><td> <code title="">feMergeNode</code>
5683: </td></tr><tr><td> <code title="">femorphology</code> </td><td> <code title="">feMorphology</code>
5684: </td></tr><tr><td> <code title="">feoffset</code> </td><td> <code title="">feOffset</code>
5685: </td></tr><tr><td> <code title="">fepointlight</code> </td><td> <code title="">fePointLight</code>
5686: </td></tr><tr><td> <code title="">fespecularlighting</code> </td><td> <code title="">feSpecularLighting</code>
5687: </td></tr><tr><td> <code title="">fespotlight</code> </td><td> <code title="">feSpotLight</code>
5688: </td></tr><tr><td> <code title="">fetile</code> </td><td> <code title="">feTile</code>
5689: </td></tr><tr><td> <code title="">feturbulence</code> </td><td> <code title="">feTurbulence</code>
5690: </td></tr><tr><td> <code title="">foreignobject</code> </td><td> <code title="">foreignObject</code>
5691: </td></tr><tr><td> <code title="">glyphref</code> </td><td> <code title="">glyphRef</code>
5692: </td></tr><tr><td> <code title="">lineargradient</code> </td><td> <code title="">linearGradient</code>
5693: </td></tr><tr><td> <code title="">radialgradient</code> </td><td> <code title="">radialGradient</code>
5694: <!--<tr> <td> <code title="">solidcolor</code> <td> <code title="">solidColor</code> (SVG 1.2)-->
5695: </td></tr><tr><td> <code title="">textpath</code> </td><td> <code title="">textPath</code>
5696: </td></tr></tbody></table><p>If the <a href="parsing.html#current-node">current node</a> is an element in the <a href="namespaces.html#svg-namespace">SVG
5697: namespace</a>, <a href="#adjust-svg-attributes">adjust SVG attributes</a> for the
5698: token. (This fixes the case of SVG attributes that are not all
5699: lowercase.)</p>
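<p class="note">The tag-name table above amounts to a lookup from the lowercased tag name to the mixed-case SVG element name. A minimal, non-normative sketch follows; the names <code>SVG_TAG_NAME_FIXES</code> and <code>adjust_svg_tag_name</code> are hypothetical, and the tag name is assumed to have already been lowercased by the tokenizer.</p>

```python
SVG_NS = "http://www.w3.org/2000/svg"

# Second column of the table; the first column is each name lowercased.
_SVG_ELEMENT_NAMES = [
    "altGlyph", "altGlyphDef", "altGlyphItem", "animateColor",
    "animateMotion", "animateTransform", "clipPath", "feBlend",
    "feColorMatrix", "feComponentTransfer", "feComposite",
    "feConvolveMatrix", "feDiffuseLighting", "feDisplacementMap",
    "feDistantLight", "feFlood", "feFuncA", "feFuncB", "feFuncG",
    "feFuncR", "feGaussianBlur", "feImage", "feMerge", "feMergeNode",
    "feMorphology", "feOffset", "fePointLight", "feSpecularLighting",
    "feSpotLight", "feTile", "feTurbulence", "foreignObject", "glyphRef",
    "linearGradient", "radialGradient", "textPath",
]

SVG_TAG_NAME_FIXES = {name.lower(): name for name in _SVG_ELEMENT_NAMES}


def adjust_svg_tag_name(tag_name, current_node_namespace):
    """Return the corrected element name for a start tag token."""
    # The fix applies only when the current node is in the SVG namespace.
    if current_node_namespace != SVG_NS:
        return tag_name
    return SVG_TAG_NAME_FIXES.get(tag_name, tag_name)
```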
5700:
5701: <p><a href="#adjust-foreign-attributes">Adjust foreign attributes</a> for the token. (This
5702: fixes the use of namespaced attributes, in particular XLink in
5703: SVG.)</p>
5704:
5705: <p><a href="#insert-a-foreign-element">Insert a foreign element</a> for the token, in the
5706: same namespace as the <a href="parsing.html#current-node">current node</a>.</p>
5707:
5708: <p>If the token has its <i>self-closing flag</i> set, pop the
5709: <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
5710: elements</a> and <a href="#acknowledge-self-closing-flag" title="acknowledge self-closing
5711: flag">acknowledge the token's <i>self-closing flag</i></a>.</p>
5712:
5713: </dd>
5714:
1.42 mike 5715: <dt id="scriptForeignEndTag">An end tag whose tag name is "script", if the <a href="parsing.html#current-node">current node</a> is a <code title="">script</code> element in the <a href="namespaces.html#svg-namespace">SVG namespace</a></dt>
5716: <dd>
5717:
5718: <p>Pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open
5719: elements</a>.</p>
5720:
5721: <p>Let the <var title="">old insertion point</var> have the
5722: same value as the current <a href="parsing.html#insertion-point">insertion point</a>. Let
5723: the <a href="parsing.html#insertion-point">insertion point</a> be just before the <a href="parsing.html#next-input-character">next
5724: input character</a>.</p>
5725:
5726: <p>Increment the parser's <a href="parsing.html#script-nesting-level">script nesting level</a> by
5727: one. Set the <a href="parsing.html#parser-pause-flag">parser pause flag</a> to true.</p>
5728:
5729: <p><a href="http://www.w3.org/TR/SVGMobile12/script.html#ScriptContentProcessing">Process
5730: the <code title="">script</code> element</a> according to the SVG
5731: rules, if the user agent supports SVG. <a href="references.html#refsSVG">[SVG]</a></p>
5732:
5733: <p class="note">Even if this causes <a href="apis-in-html-documents.html#dom-document-write" title="dom-document-write">new characters to be inserted into the
5734: tokenizer</a>, the parser will not be executed reentrantly,
5735: since the <a href="parsing.html#parser-pause-flag">parser pause flag</a> is true.</p>
5736:
5737: <p>Decrement the parser's <a href="parsing.html#script-nesting-level">script nesting level</a> by
5738: one. If the parser's <a href="parsing.html#script-nesting-level">script nesting level</a> is zero,
5739: then set the <a href="parsing.html#parser-pause-flag">parser pause flag</a> to false.</p>
5740:
5741: <p>Let the <a href="parsing.html#insertion-point">insertion point</a> have the value of the <var title="">old insertion point</var>. (In other words, restore the
5742: <a href="parsing.html#insertion-point">insertion point</a> to its previous value. This value
5743: might be the "undefined" value.)</p>
5744:
5745: </dd>
5746:
5747: <dt>Any other end tag</dt>
5748:
5749: <dd>
5750:
5751: <p>Run these steps:</p>
5752:
5753: <ol><li><p>Initialize <var title="">node</var> to be the <a href="parsing.html#current-node">current
5754: node</a> (the bottommost node of the stack).</p></li>
5755:
5756: <li><p>If <var title="">node</var> is not an element with the
5757: same tag name as the token, then this is a <a href="parsing.html#parse-error">parse
5758: error</a>.</p></li>
5759:
5760: <li><p><i>Loop</i>: If <var title="">node</var>'s tag name,
5761: <a href="infrastructure.html#converted-to-ascii-lowercase">converted to ASCII lowercase</a>, is the same as the
5762: tag name of the token, pop elements from the <a href="parsing.html#stack-of-open-elements">stack of open
5763: elements</a> until <var title="">node</var> has been popped
5764: from the stack, and then jump to the last step of this list of
5765: steps.</p></li>
5766:
5767: <li><p>Set <var title="">node</var> to the previous entry in the
5768: <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p></li>
5769:
5770: <li><p>If <var title="">node</var> is not an element in the
5771: <a href="namespaces.html#html-namespace-0">HTML namespace</a>, return to the step labeled
5772: <i>loop</i>.</p></li>
5773:
5774: <li><p>Otherwise, process the token <a href="parsing.html#using-the-rules-for">using the rules
5775: for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in
5776: body</a>" <a href="parsing.html#insertion-mode">insertion mode</a>, except that if those
5777: rules say to reprocess the token, these steps must be finished
5778: first (i.e. the insertion mode is reset by the following step
5779: before the token is reprocessed).</p></li>
5780:
5781: <li><p>If the <a href="parsing.html#insertion-mode">insertion mode</a> is still "<a href="#parsing-main-inforeign" title="insertion mode: in foreign content">in foreign
5782: content</a>", <a href="parsing.html#reset-the-insertion-mode-appropriately">reset the insertion mode
5783: appropriately</a>.</p></li>
5784:
5785: </ol></dd>
5786:
5787: </dl><p>The <a href="parsing.html#current-node">current node</a> is a <dfn id="mathml-text-integration-point">MathML text
5788: integration point</dfn> if it is one of the following elements:</p>
5789:
5790: <ul class="brief"><li>An <code title="">mi</code> element in the <a href="namespaces.html#mathml-namespace">MathML namespace</a></li>
5791: <li>An <code title="">mo</code> element in the <a href="namespaces.html#mathml-namespace">MathML namespace</a></li>
5792: <li>An <code title="">mn</code> element in the <a href="namespaces.html#mathml-namespace">MathML namespace</a></li>
5793: <li>An <code title="">ms</code> element in the <a href="namespaces.html#mathml-namespace">MathML namespace</a></li>
5794: <li>An <code title="">mtext</code> element in the <a href="namespaces.html#mathml-namespace">MathML namespace</a></li>
5795: </ul><p>The <a href="parsing.html#current-node">current node</a> is an <dfn id="html-integration-point">HTML
5796: integration point</dfn> if it is one of the following elements:</p>
5797:
5798: <ul class="brief"><li>An <code title="">annotation-xml</code> element in the <a href="namespaces.html#mathml-namespace">MathML namespace</a> whose start tag token had an attribute with the name "encoding" whose value was an <a href="infrastructure.html#ascii-case-insensitive">ASCII case-insensitive</a> match for the string "<code title="">text/html</code>"</li>
5799: <li>An <code title="">annotation-xml</code> element in the <a href="namespaces.html#mathml-namespace">MathML namespace</a> whose start tag token had an attribute with the name "encoding" whose value was an <a href="infrastructure.html#ascii-case-insensitive">ASCII case-insensitive</a> match for the string "<code title="">application/xhtml+xml</code>"</li>
5800: <li>A <code title="">foreignObject</code> element in the <a href="namespaces.html#svg-namespace">SVG namespace</a></li>
5801: <li>A <code title="">desc</code> element in the <a href="namespaces.html#svg-namespace">SVG namespace</a></li>
5802: <li>A <code title="">title</code> element in the <a href="namespaces.html#svg-namespace">SVG namespace</a></li>
5803: </ul><h5 id="parsing-main-afterbody"><span class="secno">8.2.5.22 </span>The "<dfn title="insertion mode: after body">after body</dfn>" insertion mode</h5>
1.1 mike 5804:
5805: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-afterbody" title="insertion
5806: mode: after body">after body</a>", tokens must be handled as follows:</p>
5807:
5808: <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
5809: TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
5810: U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
5811: <dd>
5812: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
5813: mode</a>.</p>
5814: </dd>
5815:
5816: <dt>A comment token</dt>
5817: <dd>
5818: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the first element in
5819: the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> (the <code><a href="semantics.html#the-html-element-0">html</a></code>
5820: element), with the <code title="">data</code> attribute set to
5821: the data given in the comment token.</p>
5822: </dd>
5823:
5824: <dt>A DOCTYPE token</dt>
5825: <dd>
5826: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
5827: </dd>
5828:
5829: <dt>A start tag whose tag name is "html"</dt>
5830: <dd>
5831: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
5832: mode</a>.</p>
5833: </dd>
5834:
5835: <dt>An end tag whose tag name is "html"</dt>
5836: <dd>
5837:
5838: <p>If the parser was originally created as part of the <a href="the-end.html#html-fragment-parsing-algorithm">HTML
5839: fragment parsing algorithm</a>, this is a <a href="parsing.html#parse-error">parse
5840: error</a>; ignore the token. (<a href="the-end.html#fragment-case">fragment case</a>)</p>
5841:
5842: <p>Otherwise, switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#the-after-after-body-insertion-mode" title="insertion mode: after after body">after after
5843: body</a>".</p>
5844:
5845: </dd>
5846:
5847: <dt>An end-of-file token</dt>
5848: <dd>
5849: <p><a href="the-end.html#stop-parsing">Stop parsing</a>.</p>
5850: </dd>
5851:
5852: <dt>Anything else</dt>
5853: <dd>
5854:
5855: <p><a href="parsing.html#parse-error">Parse error</a>. Switch the <a href="parsing.html#insertion-mode">insertion
5856: mode</a> to "<a href="#parsing-main-inbody" title="insertion mode: in body">in
5857: body</a>" and reprocess the token.</p>
5858:
5859: </dd>
5860:
1.29 mike 5861: </dl><h5 id="parsing-main-inframeset"><span class="secno">8.2.5.23 </span>The "<dfn title="insertion mode: in frameset">in frameset</dfn>" insertion mode</h5>
1.1 mike 5862:
5863: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inframeset" title="insertion
5864: mode: in frameset">in frameset</a>", tokens must be handled as follows:</p>
5865:
5866: <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
5867: TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
5868: U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
5869: <dd>
5870: <p><a href="#insert-a-character" title="insert a character">Insert the character</a> into
5871: the <a href="parsing.html#current-node">current node</a>.</p>
5872: </dd>
5873:
5874: <dt>A comment token</dt>
5875: <dd>
5876: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current
5877: node</a> with the <code title="">data</code> attribute set to
5878: the data given in the comment token.</p>
5879: </dd>
5880:
5881: <dt>A DOCTYPE token</dt>
5882: <dd>
5883: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
5884: </dd>
5885:
5886: <dt>A start tag whose tag name is "html"</dt>
5887: <dd>
5888: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
5889: mode</a>.</p>
5890: </dd>
5891:
5892: <dt>A start tag whose tag name is "frameset"</dt>
5893: <dd>
5894: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
5895: </dd>
5896:
5897: <dt>An end tag whose tag name is "frameset"</dt>
5898: <dd>
5899:
5900: <p>If the <a href="parsing.html#current-node">current node</a> is the root
5901: <code><a href="semantics.html#the-html-element-0">html</a></code> element, then this is a <a href="parsing.html#parse-error">parse
5902: error</a>; ignore the token. (<a href="the-end.html#fragment-case">fragment
5903: case</a>)</p>
5904:
5905: <p>Otherwise, pop the <a href="parsing.html#current-node">current node</a> from the
5906: <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p>
5907:
5908: <p>If the parser was <em>not</em> originally created as part
5909: of the <a href="the-end.html#html-fragment-parsing-algorithm">HTML fragment parsing algorithm</a>
5910: (<a href="the-end.html#fragment-case">fragment case</a>), and the <a href="parsing.html#current-node">current
5911: node</a> is no longer a <code><a href="obsolete.html#frameset">frameset</a></code> element, then
5912: switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-afterframeset" title="insertion mode: after frameset">after
5913: frameset</a>".</p>
5914:
5915: </dd>
5916:
5917: <dt>A start tag whose tag name is "frame"</dt>
5918: <dd>
5919:
5920: <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.
5921: Immediately pop the <a href="parsing.html#current-node">current node</a> off the
5922: <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p>
5923:
5924: <p><a href="#acknowledge-self-closing-flag" title="acknowledge self-closing flag">Acknowledge the
5925: token's <i>self-closing flag</i></a>, if it is set.</p>
5926:
5927: </dd>
5928:
5929: <dt>A start tag whose tag name is "noframes"</dt>
5930: <dd>
5931: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>" <a href="parsing.html#insertion-mode">insertion
5932: mode</a>.</p>
5933: </dd>
5934:
5935: <dt>An end-of-file token</dt>
5936: <dd>
5937:
5938: <p>If the <a href="parsing.html#current-node">current node</a> is not the root
5939: <code><a href="semantics.html#the-html-element-0">html</a></code> element, then this is a <a href="parsing.html#parse-error">parse
5940: error</a>.</p>
5941:
5942: <p class="note">The root <code><a href="semantics.html#the-html-element-0">html</a></code> element can only be the <a href="parsing.html#current-node">current node</a> in
5943: the <a href="the-end.html#fragment-case">fragment case</a>.</p>
5944:
5945: <p><a href="the-end.html#stop-parsing">Stop parsing</a>.</p>
5946:
5947: </dd>
5948:
5949: <dt>Anything else</dt>
5950: <dd>
5951: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
5952: </dd>
5953:
1.29 mike 5954: </dl><h5 id="parsing-main-afterframeset"><span class="secno">8.2.5.24 </span>The "<dfn title="insertion mode: after frameset">after frameset</dfn>" insertion mode</h5>
1.1 mike 5955:
5956: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#parsing-main-afterframeset" title="insertion
5957: mode: after frameset">after frameset</a>", tokens must be handled as follows:</p>
5958:
5959: <!-- due to rules in the "in frameset" mode, this can't be entered in the fragment case -->
5960: <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
5961: TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
5962: U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
5963: <dd>
5964: <p><a href="#insert-a-character" title="insert a character">Insert the character</a> into
5965: the <a href="parsing.html#current-node">current node</a>.</p>
5966: </dd>
5967:
5968: <dt>A comment token</dt>
5969: <dd>
5970: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current
5971: node</a> with the <code title="">data</code> attribute set to
5972: the data given in the comment token.</p>
5973: </dd>
5974:
5975: <dt>A DOCTYPE token</dt>
5976: <dd>
5977: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
5978: </dd>
5979:
5980: <dt>A start tag whose tag name is "html"</dt>
5981: <dd>
5982: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
5983: mode</a>.</p>
5984: </dd>
5985:
5986: <dt>An end tag whose tag name is "html"</dt>
5987: <dd>
5988: <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#the-after-after-frameset-insertion-mode" title="insertion mode: after after frameset">after after
5989: frameset</a>".</p>
5990: </dd>
5991:
5992: <dt>A start tag whose tag name is "noframes"</dt>
5993: <dd>
5994: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>" <a href="parsing.html#insertion-mode">insertion
5995: mode</a>.</p>
5996: </dd>
5997:
5998: <dt>An end-of-file token</dt>
5999: <dd>
6000: <p><a href="the-end.html#stop-parsing">Stop parsing</a>.</p>
6001: </dd>
6002:
6003: <dt>Anything else</dt>
6004: <dd>
6005: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
6006: </dd>
6007:
1.29 mike 6008: </dl><h5 id="the-after-after-body-insertion-mode"><span class="secno">8.2.5.25 </span>The "<dfn title="insertion mode: after after body">after after body</dfn>" insertion mode</h5>
1.1 mike 6009:
6010: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#the-after-after-body-insertion-mode" title="insertion
6011: mode: after after body">after after body</a>", tokens must be handled as follows:</p>
6012:
6013: <dl class="switch"><dt>A comment token</dt>
6014: <dd>
6015: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <code><a href="infrastructure.html#document">Document</a></code>
6016: object with the <code title="">data</code> attribute set to the
6017: data given in the comment token.</p>
6018: </dd>
6019:
6020: <dt>A DOCTYPE token</dt>
6021: <dt>A character token that is one of U+0009 CHARACTER
6022: TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
6023: U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
6024: <dt>A start tag whose tag name is "html"</dt>
6025: <dd>
6026: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
6027: mode</a>.</p>
6028: </dd>
6029:
6030: <dt>An end-of-file token</dt>
6031: <dd>
6032: <p><a href="the-end.html#stop-parsing">Stop parsing</a>.</p>
6033: </dd>
6034:
6035: <dt>Anything else</dt>
6036: <dd>
6037: <p><a href="parsing.html#parse-error">Parse error</a>. Switch the <a href="parsing.html#insertion-mode">insertion mode</a>
6038: to "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" and
6039: reprocess the token.</p>
6040: </dd>
6041:
1.29 mike 6042: </dl><h5 id="the-after-after-frameset-insertion-mode"><span class="secno">8.2.5.26 </span>The "<dfn title="insertion mode: after after frameset">after after frameset</dfn>" insertion mode</h5>
1.1 mike 6043:
6044: <p>When the <a href="parsing.html#insertion-mode">insertion mode</a> is "<a href="#the-after-after-frameset-insertion-mode" title="insertion
6045: mode: after after frameset">after after frameset</a>", tokens must be handled as follows:</p>
6046:
6047: <dl class="switch"><dt>A comment token</dt>
6048: <dd>
6049: <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <code><a href="infrastructure.html#document">Document</a></code>
6050: object with the <code title="">data</code> attribute set to the
6051: data given in the comment token.</p>
6052: </dd>
6053:
6054: <dt>A DOCTYPE token</dt>
6055: <dt>A character token that is one of U+0009 CHARACTER
6056: TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
6057: U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
6058: <dt>A start tag whose tag name is "html"</dt>
6059: <dd>
6060: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion
6061: mode</a>.</p>
6062: </dd>
6063:
6064: <dt>An end-of-file token</dt>
6065: <dd>
6066: <p><a href="the-end.html#stop-parsing">Stop parsing</a>.</p>
6067: </dd>
6068:
6069: <dt>A start tag whose tag name is "noframes"</dt>
6070: <dd>
6071: <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>" <a href="parsing.html#insertion-mode">insertion
6072: mode</a>.</p>
6073: </dd>
6074:
6075: <dt>Anything else</dt>
6076: <dd>
6077: <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
6078: </dd>
6079:
6080: </dl></div></body></html>
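
The three insertion modes listed above ("after frameset", "after after body", and "after after frameset") each reduce to a small token dispatch. The following is a minimal sketch of that dispatch in Python, not the specification's own algorithm text. The `Token` class, the `step` function, and the action strings are all hypothetical names introduced for illustration. The sketch only reports which rule applies and which insertion mode results; it does not build a DOM, and it does not model any mode change that the "in body" or "in head" rules might themselves cause.

```python
from dataclasses import dataclass

# The five characters the rules above treat as whitespace:
# TAB, LF, FF, CR, SPACE.
WHITESPACE = frozenset("\t\n\f\r ")


@dataclass(frozen=True)
class Token:
    """Hypothetical token shape, assumed for this sketch only."""
    kind: str       # "character", "comment", "doctype", "start_tag", "end_tag", "eof"
    data: str = ""  # the character, the comment text, or the tag name


def is_space(tok):
    return tok.kind == "character" and tok.data in WHITESPACE


def is_start(tok, name):
    return tok.kind == "start_tag" and tok.data == name


def step(mode, tok):
    """Return (action, next_mode) for one token in one of the three modes."""
    if mode == "after frameset":
        if is_space(tok):
            return ("insert character", mode)
        if tok.kind == "comment":
            return ("append comment to current node", mode)
        if tok.kind == "doctype":
            return ("parse error, ignore", mode)
        if is_start(tok, "html"):
            return ("use in body rules", mode)
        if tok.kind == "end_tag" and tok.data == "html":
            return ("switch mode", "after after frameset")
        if is_start(tok, "noframes"):
            return ("use in head rules", mode)
        if tok.kind == "eof":
            return ("stop parsing", mode)
        return ("parse error, ignore", mode)

    if mode in ("after after body", "after after frameset"):
        # Both modes share their first three rules.
        if tok.kind == "comment":
            return ("append comment to Document", mode)
        if tok.kind == "doctype" or is_space(tok) or is_start(tok, "html"):
            return ("use in body rules", mode)
        if tok.kind == "eof":
            return ("stop parsing", mode)
        if mode == "after after frameset":
            if is_start(tok, "noframes"):
                return ("use in head rules", mode)
            return ("parse error, ignore", mode)
        # "after after body": fall back to "in body" and reprocess the token.
        return ("parse error, reprocess", "in body")

    raise ValueError("mode not covered by this sketch: " + mode)
```

For instance, `step("after frameset", Token("end_tag", "html"))` yields the switch to "after after frameset", while an unexpected start tag in "after after body" yields a parse error followed by reprocessing in "in body". Returning the next mode alongside the action keeps the sketch free of parser state, so each rule can be checked in isolation.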