File:  [Public] / html5 / spec / tokenization.html
Revision 1.215
Wed Nov 21 04:44:11 2012 UTC by sruby
Branches: MAIN
CVS tags: HEAD
commit 1077bca89fbac7a99525653cda746ae1c3e87409
Author: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date:   Wed Nov 21 15:38:38 2012 +1100

    [Editorial] Update of revision number up to which WHATWG changes have
    been merged.

<!DOCTYPE html><html lang="en-US-x-Hixie"><meta charset="utf-8"><title>8.2.4 Tokenization &mdash; HTML5</title><style type="text/css">

   .applies thead th > * { display: block; }
   .applies thead code { display: block; }
   .applies tbody th { white-space: nowrap; }
   .applies td { text-align: center; }
   .applies .yes { background: yellow; }

   .matrix, .matrix td { border: hidden; text-align: right; }
   .matrix { margin-left: 2em; }

   .dice-example { border-collapse: collapse; border-style: hidden solid solid hidden; border-width: thin; margin-left: 3em; }
   .dice-example caption { width: 30em; font-size: smaller; font-style: italic; padding: 0.75em 0; text-align: left; }
   .dice-example td, .dice-example th { border: solid thin; width: 1.35em; height: 1.05em; text-align: center; padding: 0; }

   td.eg { border-width: thin; text-align: center; }

   #table-example-1 { border: solid thin; border-collapse: collapse; margin-left: 3em; }
   #table-example-1 * { font-family: "Essays1743", serif; line-height: 1.01em; }
   #table-example-1 caption { padding-bottom: 0.5em; }
   #table-example-1 thead, #table-example-1 tbody { border: none; }
   #table-example-1 th, #table-example-1 td { border: solid thin; }
   #table-example-1 th { font-weight: normal; }
   #table-example-1 td { border-style: none solid; vertical-align: top; }
   #table-example-1 th { padding: 0.5em; vertical-align: middle; text-align: center; }
   #table-example-1 tbody tr:first-child td { padding-top: 0.5em; }
   #table-example-1 tbody tr:last-child td { padding-bottom: 1.5em; }
   #table-example-1 tbody td:first-child { padding-left: 2.5em; padding-right: 0; width: 9em; }
   #table-example-1 tbody td:first-child::after { content: leader(". "); }
   #table-example-1 tbody td { padding-left: 2em; padding-right: 2em; }
   #table-example-1 tbody td:first-child + td { width: 10em; }
   #table-example-1 tbody td:first-child + td ~ td { width: 2.5em; }
   #table-example-1 tbody td:first-child + td + td + td ~ td { width: 1.25em; }

   .apple-table-examples { border: none; border-collapse: separate; border-spacing: 1.5em 0em; width: 40em; margin-left: 3em; }
   .apple-table-examples * { font-family: "Times", serif; }
   .apple-table-examples td, .apple-table-examples th { border: none; white-space: nowrap; padding-top: 0; padding-bottom: 0; }
   .apple-table-examples tbody th:first-child { border-left: none; width: 100%; }
   .apple-table-examples thead th:first-child ~ th { font-size: smaller; font-weight: bolder; border-bottom: solid 2px; text-align: center; }
   .apple-table-examples tbody th::after, .apple-table-examples tfoot th::after { content: leader(". ") }
   .apple-table-examples tbody th, .apple-table-examples tfoot th { font: inherit; text-align: left; }
   .apple-table-examples td { text-align: right; vertical-align: top; }
   .apple-table-examples.e1 tbody tr:last-child td { border-bottom: solid 1px; }
   .apple-table-examples.e1 tbody + tbody tr:last-child td { border-bottom: double 3px; }
   .apple-table-examples.e2 th[scope=row] { padding-left: 1em; }
   .apple-table-examples sup { line-height: 0; }

   .details-example img { vertical-align: top; }

   #base64-table {
     white-space: nowrap;
     font-size: 0.6em;
     column-width: 6em;
     column-count: 5;
     column-gap: 1em;
     -moz-column-width: 6em;
     -moz-column-count: 5;
     -moz-column-gap: 1em;
     -webkit-column-width: 6em;
     -webkit-column-count: 5;
     -webkit-column-gap: 1em;
   }
   #base64-table thead { display: none; }
   #base64-table * { border: none; }
   #base64-table tbody td:first-child:after { content: ':'; }
   #base64-table tbody td:last-child { text-align: right; }

   #named-character-references-table {
     white-space: nowrap;
     font-size: 0.6em;
     column-width: 30em;
     column-gap: 1em;
     -moz-column-width: 30em;
     -moz-column-gap: 1em;
     -webkit-column-width: 30em;
     -webkit-column-gap: 1em;
   }
   #named-character-references-table > table > tbody > tr > td:first-child + td,
   #named-character-references-table > table > tbody > tr > td:last-child { text-align: center; }
   #named-character-references-table > table > tbody > tr > td:last-child:hover > span { position: absolute; top: auto; left: auto; margin-left: 0.5em; line-height: 1.2; font-size: 5em; border: outset; padding: 0.25em 0.5em; background: white; width: 1.25em; height: auto; text-align: center; }
   #named-character-references-table > table > tbody > tr#entity-CounterClockwiseContourIntegral > td:first-child { font-size: 0.5em; }

   .glyph.control { color: red; }

   @font-face {
     font-family: 'Essays1743';
     src: url('fonts/Essays1743.ttf');
   }
   @font-face {
     font-family: 'Essays1743';
     font-weight: bold;
     src: url('fonts/Essays1743-Bold.ttf');
   }
   @font-face {
     font-family: 'Essays1743';
     font-style: italic;
     src: url('fonts/Essays1743-Italic.ttf');
   }
   @font-face {
     font-family: 'Essays1743';
     font-style: italic;
     font-weight: bold;
     src: url('fonts/Essays1743-BoldItalic.ttf');
   }

  </style><link href="data:text/css," id="complete" rel="stylesheet" title="Complete specification"><link href="data:text/css,.impl%20{%20display:%20none;%20}%0Ahtml%20{%20border:%20solid%20yellow;%20}%20.domintro:before%20{%20display:%20none;%20}" id="author" rel="alternate stylesheet" title="Author documentation only"><link href="data:text/css,.impl%20{%20background:%20%23FFEEEE;%20}%20.domintro:before%20{%20background:%20%23FFEEEE;%20}" id="highlight" rel="alternate stylesheet" title="Highlight implementation requirements"><style type="text/css">
   pre { margin-left: 2em; white-space: pre-wrap; }
   h2 { margin: 3em 0 1em 0; }
   h3 { margin: 2.5em 0 1em 0; }
   h4 { margin: 2.5em 0 0.75em 0; }
   h5, h6 { margin: 2.5em 0 1em; }
   h1 + h2, h1 + h2 + h2 { margin: 0.75em 0 0.75em; }
   h2 + h3, h3 + h4, h4 + h5, h5 + h6 { margin-top: 0.5em; }
   p { margin: 1em 0; }
   hr:not(.top) { display: block; background: none; border: none; padding: 0; margin: 2em 0; height: auto; }
   dl, dd { margin-top: 0; margin-bottom: 0; }
   dt { margin-top: 0.75em; margin-bottom: 0.25em; clear: left; }
   dt + dt { margin-top: 0; }
   dd dt { margin-top: 0.25em; margin-bottom: 0; }
   dd p { margin-top: 0; }
   dd dl + p { margin-top: 1em; }
   dd table + p { margin-top: 1em; }
   p + * > li, dd li { margin: 1em 0; }
   dt, dfn { font-weight: bold; font-style: normal; }
   i, em { font-style: italic; }
   dt dfn { font-style: italic; }
   pre, code { font-size: inherit; font-family: monospace; font-variant: normal; }
   pre strong { color: black; font: inherit; font-weight: bold; background: yellow; }
   pre em { font-weight: bolder; font-style: normal; }
   @media screen { code { color: orangered; } code :link, code :visited { color: inherit; } }
   var sub { vertical-align: bottom; font-size: smaller; position: relative; top: 0.1em; }
   table { border-collapse: collapse; border-style: hidden hidden none hidden; }
   table thead, table tbody { border-bottom: solid; }
   table tbody th:first-child { border-left: solid; }
   table tbody th { text-align: left; }
   table td, table th { border-left: solid; border-right: solid; border-bottom: solid thin; vertical-align: top; padding: 0.2em; }
   blockquote { margin: 0 0 0 2em; border: 0; padding: 0; font-style: italic; }

   .bad, .bad *:not(.XXX) { color: gray; border-color: gray; background: transparent; }
   .matrix, .matrix td { border: none; text-align: right; }
   .matrix { margin-left: 2em; }
   .dice-example { border-collapse: collapse; border-style: hidden solid solid hidden; border-width: thin; margin-left: 3em; }
   .dice-example caption { width: 30em; font-size: smaller; font-style: italic; padding: 0.75em 0; text-align: left; }
   .dice-example td, .dice-example th { border: solid thin; width: 1.35em; height: 1.05em; text-align: center; padding: 0; }

   .toc dfn, h1 dfn, h2 dfn, h3 dfn, h4 dfn, h5 dfn, h6 dfn { font: inherit; }
   img.extra, p.overview { float: right; }
   pre.idl { border: solid thin; background: #EEEEEE; color: black; padding: 0.5em 1em; position: relative; }
   pre.idl :link, pre.idl :visited { color: inherit; background: transparent; }
   pre.idl::before { content: "IDL"; font: bold small sans-serif; padding: 0.5em; background: white; position: absolute; top: 0; margin: -1px 0 0 -4em; width: 1.5em; border: thin solid; border-radius: 0 0 0 0.5em }
   pre.css { border: solid thin; background: #FFFFEE; color: black; padding: 0.5em 1em; }
   pre.css:first-line { color: #AAAA50; }
   dl.domintro { color: green; margin: 2em 0 2em 2em; padding: 0.5em 1em; border: none; background: #DDFFDD; }
   hr + dl.domintro, div.impl + dl.domintro { margin-top: 2.5em; margin-bottom: 1.5em; }
   dl.domintro dt, dl.domintro dt * { color: black; text-decoration: none; }
   dl.domintro dd { margin: 0.5em 0 1em 2em; padding: 0; }
   dl.domintro dd p { margin: 0.5em 0; }
   dl.domintro:before { display: table; margin: -1em -0.5em -0.5em auto; width: auto; content: 'This box is non-normative. Implementation requirements are given below this box.'; color: black; font-style: italic; border: solid 2px; background: white; padding: 0 0.25em; }
   dl.switch { padding-left: 2em; }
   dl.switch > dt { text-indent: -1.5em; }
   dl.switch > dt:before { content: '\21AA'; padding: 0 0.5em 0 0; display: inline-block; width: 1em; text-align: right; line-height: 0.5em; }
   dl.triple { padding: 0 0 0 1em; }
   dl.triple dt, dl.triple dd { margin: 0; display: inline }
   dl.triple dt:after { content: ':'; }
   dl.triple dd:after { content: '\A'; white-space: pre; }
   .diff-old { text-decoration: line-through; color: silver; background: transparent; }
   .diff-chg, .diff-new { text-decoration: underline; color: green; background: transparent; }
   a .diff-new { border-bottom: 1px blue solid; }

   figure.diagrams { border: double black; background: white; padding: 1em; }
   figure.diagrams img { display: block; margin: 1em auto; } 

   h2 { page-break-before: always; }
   h1, h2, h3, h4, h5, h6 { page-break-after: avoid; }
   h1 + h2, hr + h2.no-toc { page-break-before: auto; }

   p  > span:not([title=""]):not([class="XXX"]):not([class="impl"]):not([class="note"]),
   li > span:not([title=""]):not([class="XXX"]):not([class="impl"]):not([class="note"]) { border-bottom: solid #9999CC; }

   div.head { margin: 0 0 1em; padding: 1em 0 0 0; }
   div.head p { margin: 0; }
   div.head h1 { margin: 0; }
   div.head .logo { float: right; margin: 0 1em; }
   div.head .logo img { border: none } /* remove border from top image */
   div.head dl { margin: 1em 0; }
   div.head p.copyright, div.head p.alt { font-size: x-small; font-style: oblique; margin: 0; }

   body > .toc > li { margin-top: 1em; margin-bottom: 1em; }
   body > .toc.brief > li { margin-top: 0.35em; margin-bottom: 0.35em; }
   body > .toc > li > * { margin-bottom: 0.5em; }
   body > .toc > li > * > li > * { margin-bottom: 0.25em; }
   .toc, .toc li { list-style: none; }

   .brief { margin-top: 1em; margin-bottom: 1em; line-height: 1.1; }
   .brief li { margin: 0; padding: 0; }
   .brief li p { margin: 0; padding: 0; }

   .category-list { margin-top: -0.75em; margin-bottom: 1em; line-height: 1.5; }
   .category-list::before { content: '\21D2\A0'; font-size: 1.2em; font-weight: 900; }
   .category-list li { display: inline; }
   .category-list li:not(:last-child)::after { content: ', '; }
   .category-list li > span, .category-list li > a { text-transform: lowercase; }
   .category-list li * { text-transform: none; } /* don't affect <code> nested in <a> */

   .XXX { color: #E50000; background: white; border: solid red; padding: 0.5em; margin: 1em 0; }
   .XXX > :first-child { margin-top: 0; }
   p .XXX { line-height: 3em; }
   .annotation { border: solid thin black; background: #0C479D; color: white; position: relative; margin: 8px 0 20px 0; }
   .annotation:before { position: absolute; left: 0; top: 0; width: 100%; height: 100%; margin: 6px -6px -6px 6px; background: #333333; z-index: -1; content: ''; }
   .annotation :link, .annotation :visited { color: inherit; }
   .annotation :link:hover, .annotation :visited:hover { background: transparent; }
   .annotation span { border: none ! important; }
   .note { color: green; background: transparent; font-family: sans-serif; }
   .warning { color: red; background: transparent; }
   .note, .warning { font-weight: bolder; font-style: italic; }
   .note em, .warning em, .note i, .warning i { font-style: normal; }
   p.note, div.note { padding: 0.5em 2em; }
   span.note { padding: 0 2em; }
   .note p:first-child, .warning p:first-child { margin-top: 0; }
   .note p:last-child, .warning p:last-child { margin-bottom: 0; }
   .warning:before { font-style: normal; }
   p.note:before { content: 'Note: '; }
   p.warning:before { content: '\26A0 Warning! '; }

   .bookkeeping:before { display: block; content: 'Bookkeeping details'; font-weight: bolder; font-style: italic; }
   .bookkeeping { font-size: 0.8em; margin: 2em 0; }
   .bookkeeping p { margin: 0.5em 2em; display: list-item; list-style: square; }
   .bookkeeping dt { margin: 0.5em 2em 0; }
   .bookkeeping dd { margin: 0 3em 0.5em; }

   h4 { position: relative; z-index: 3; }
   h4 + .element, h4 + div + .element { margin-top: -2.5em; padding-top: 2em; }
   .element {
     background: #EEEEFF;
     color: black;
     margin: 0 0 1em 0.15em;
     padding: 0 1em 0.25em 0.75em;
     border-left: solid #9999FF 0.25em;
     position: relative;
     z-index: 1;
   }
   .element:before {
     position: absolute;
     z-index: 2;
     top: 0;
     left: -1.15em;
     height: 2em;
     width: 0.9em;
     background: #EEEEFF;
     content: ' ';
     border-style: none none solid solid;
     border-color: #9999FF;
     border-width: 0.25em;
   }

   .example { display: block; color: #222222; background: #FCFCFC; border-left: double; margin-left: 2em; padding-left: 1em; }
   td > .example:only-child { margin: 0 0 0 0.1em; }

   ul.domTree, ul.domTree ul { padding: 0 0 0 1em; margin: 0; }
   ul.domTree li { padding: 0; margin: 0; list-style: none; position: relative; }
   ul.domTree li li { list-style: none; }
   ul.domTree li:first-child::before { position: absolute; top: 0; height: 0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }
   ul.domTree li:not(:last-child)::after { position: absolute; top: 0; bottom: -0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }
   ul.domTree span { font-style: italic; font-family: serif; }
   ul.domTree .t1 code { color: purple; font-weight: bold; }
   ul.domTree .t2 { font-style: normal; font-family: monospace; }
   ul.domTree .t2 .name { color: black; font-weight: bold; }
   ul.domTree .t2 .value { color: blue; font-weight: normal; }
   ul.domTree .t3 code, .domTree .t4 code, .domTree .t5 code { color: gray; }
   ul.domTree .t7 code, .domTree .t8 code { color: green; }
   ul.domTree .t10 code { color: teal; }

   body.dfnEnabled dfn { cursor: pointer; }
   .dfnPanel {
     display: inline;
     position: absolute;
     z-index: 10;
     height: auto;
     width: auto;
     padding: 0.5em 0.75em;
     font: small sans-serif, Droid Sans Fallback;
     background: #DDDDDD;
     color: black;
     border: outset 0.2em;
   }
   .dfnPanel * { margin: 0; padding: 0; font: inherit; text-indent: 0; }
   .dfnPanel :link, .dfnPanel :visited { color: black; }
   .dfnPanel p { font-weight: bolder; }
   .dfnPanel * + p { margin-top: 0.25em; }
   .dfnPanel li { list-style-position: inside; }

   #configUI { position: absolute; z-index: 20; top: 10em; right: 1em; width: 11em; font-size: small; }
   #configUI p { margin: 0.5em 0; padding: 0.3em; background: #EEEEEE; color: black; border: inset thin; }
   #configUI p label { display: block; }
   #configUI #updateUI, #configUI .loginUI { text-align: center; }
   #configUI input[type=button] { display: block; margin: auto; }

   fieldset { margin: 1em; padding: 0.5em 1em; }
   fieldset > legend + * { margin-top: 0; }
   fieldset > :last-child { margin-bottom: 0; }
   fieldset p { margin: 0.5em 0; }

  </style><link href="http://www.w3.org/StyleSheets/TR/W3C-ED" rel="stylesheet" type="text/css"><script>
   function getCookie(name) {
     var params = location.search.substr(1).split("&");
     for (var index = 0; index < params.length; index++) {
       if (params[index] == name)
         return "1";
       var data = params[index].split("=");
       if (data[0] == name)
         return unescape(data[1]);
     }
     var cookies = document.cookie.split("; ");
     for (var index = 0; index < cookies.length; index++) {
       var data = cookies[index].split("=");
       if (data[0] == name)
         return unescape(data[1]);
     }
     return null;
   }
  </script><link href="parsing.html" rel="prev" title="8.2 Parsing HTML documents">
  <link href="index.html#contents" rel="contents" title="Table of contents">
  <link href="the-end.html" rel="next" title="8.2.6 The end">
  <body class="split chapter" onload="fixBrokenLink();"><div class="head" id="head">
   <p><a href="http://www.w3.org/"><img alt="W3C" height="48" src="http://www.w3.org/Icons/w3c_home" width="72"></a></p>

   <h1>HTML5</h1>
   <h2 class="no-num no-toc" id="a-vocabulary-and-associated-apis-for-html-and-xhtml">A vocabulary and associated APIs for HTML and XHTML</h2>
   <h2 class="no-num no-toc" id="editor's-draft-date-1-january-1970">Editor's Draft 21 November 2012</h2>
   </div>
  

  <nav class="prev_next">
   <a href="parsing.html">&slarr; 8.2 Parsing HTML documents</a> &ndash;
   <a href="index.html#contents">Table of contents</a> &ndash;
   <a href="the-end.html">8.2.6 The end &srarr;</a>
  <ol class="toc"><li><ol><li><ol><li><a href="tokenization.html#tokenization"><span class="secno">8.2.4 </span>Tokenization</a>
      <ol><li><a href="tokenization.html#data-state"><span class="secno">8.2.4.1 </span>Data state</a><li><a href="tokenization.html#character-reference-in-data-state"><span class="secno">8.2.4.2 </span>Character reference in data state</a><li><a href="tokenization.html#rcdata-state"><span class="secno">8.2.4.3 </span>RCDATA state</a><li><a href="tokenization.html#character-reference-in-rcdata-state"><span class="secno">8.2.4.4 </span>Character reference in RCDATA state</a><li><a href="tokenization.html#rawtext-state"><span class="secno">8.2.4.5 </span>RAWTEXT state</a><li><a href="tokenization.html#script-data-state"><span class="secno">8.2.4.6 </span>Script data state</a><li><a href="tokenization.html#plaintext-state"><span class="secno">8.2.4.7 </span>PLAINTEXT state</a><li><a href="tokenization.html#tag-open-state"><span class="secno">8.2.4.8 </span>Tag open state</a><li><a href="tokenization.html#end-tag-open-state"><span class="secno">8.2.4.9 </span>End tag open state</a><li><a href="tokenization.html#tag-name-state"><span class="secno">8.2.4.10 </span>Tag name state</a><li><a href="tokenization.html#rcdata-less-than-sign-state"><span class="secno">8.2.4.11 </span>RCDATA less-than sign state</a><li><a href="tokenization.html#rcdata-end-tag-open-state"><span class="secno">8.2.4.12 </span>RCDATA end tag open state</a><li><a href="tokenization.html#rcdata-end-tag-name-state"><span class="secno">8.2.4.13 </span>RCDATA end tag name state</a><li><a href="tokenization.html#rawtext-less-than-sign-state"><span class="secno">8.2.4.14 </span>RAWTEXT less-than sign state</a><li><a href="tokenization.html#rawtext-end-tag-open-state"><span class="secno">8.2.4.15 </span>RAWTEXT end tag open state</a><li><a href="tokenization.html#rawtext-end-tag-name-state"><span class="secno">8.2.4.16 </span>RAWTEXT end tag name state</a><li><a href="tokenization.html#script-data-less-than-sign-state"><span class="secno">8.2.4.17 </span>Script data less-than sign state</a><li><a 
href="tokenization.html#script-data-end-tag-open-state"><span class="secno">8.2.4.18 </span>Script data end tag open state</a><li><a href="tokenization.html#script-data-end-tag-name-state"><span class="secno">8.2.4.19 </span>Script data end tag name state</a><li><a href="tokenization.html#script-data-escape-start-state"><span class="secno">8.2.4.20 </span>Script data escape start state</a><li><a href="tokenization.html#script-data-escape-start-dash-state"><span class="secno">8.2.4.21 </span>Script data escape start dash state</a><li><a href="tokenization.html#script-data-escaped-state"><span class="secno">8.2.4.22 </span>Script data escaped state</a><li><a href="tokenization.html#script-data-escaped-dash-state"><span class="secno">8.2.4.23 </span>Script data escaped dash state</a><li><a href="tokenization.html#script-data-escaped-dash-dash-state"><span class="secno">8.2.4.24 </span>Script data escaped dash dash state</a><li><a href="tokenization.html#script-data-escaped-less-than-sign-state"><span class="secno">8.2.4.25 </span>Script data escaped less-than sign state</a><li><a href="tokenization.html#script-data-escaped-end-tag-open-state"><span class="secno">8.2.4.26 </span>Script data escaped end tag open state</a><li><a href="tokenization.html#script-data-escaped-end-tag-name-state"><span class="secno">8.2.4.27 </span>Script data escaped end tag name state</a><li><a href="tokenization.html#script-data-double-escape-start-state"><span class="secno">8.2.4.28 </span>Script data double escape start state</a><li><a href="tokenization.html#script-data-double-escaped-state"><span class="secno">8.2.4.29 </span>Script data double escaped state</a><li><a href="tokenization.html#script-data-double-escaped-dash-state"><span class="secno">8.2.4.30 </span>Script data double escaped dash state</a><li><a href="tokenization.html#script-data-double-escaped-dash-dash-state"><span class="secno">8.2.4.31 </span>Script data double escaped dash dash state</a><li><a 
href="tokenization.html#script-data-double-escaped-less-than-sign-state"><span class="secno">8.2.4.32 </span>Script data double escaped less-than sign state</a><li><a href="tokenization.html#script-data-double-escape-end-state"><span class="secno">8.2.4.33 </span>Script data double escape end state</a><li><a href="tokenization.html#before-attribute-name-state"><span class="secno">8.2.4.34 </span>Before attribute name state</a><li><a href="tokenization.html#attribute-name-state"><span class="secno">8.2.4.35 </span>Attribute name state</a><li><a href="tokenization.html#after-attribute-name-state"><span class="secno">8.2.4.36 </span>After attribute name state</a><li><a href="tokenization.html#before-attribute-value-state"><span class="secno">8.2.4.37 </span>Before attribute value state</a><li><a href="tokenization.html#attribute-value-(double-quoted)-state"><span class="secno">8.2.4.38 </span>Attribute value (double-quoted) state</a><li><a href="tokenization.html#attribute-value-(single-quoted)-state"><span class="secno">8.2.4.39 </span>Attribute value (single-quoted) state</a><li><a href="tokenization.html#attribute-value-(unquoted)-state"><span class="secno">8.2.4.40 </span>Attribute value (unquoted) state</a><li><a href="tokenization.html#character-reference-in-attribute-value-state"><span class="secno">8.2.4.41 </span>Character reference in attribute value state</a><li><a href="tokenization.html#after-attribute-value-(quoted)-state"><span class="secno">8.2.4.42 </span>After attribute value (quoted) state</a><li><a href="tokenization.html#self-closing-start-tag-state"><span class="secno">8.2.4.43 </span>Self-closing start tag state</a><li><a href="tokenization.html#bogus-comment-state"><span class="secno">8.2.4.44 </span>Bogus comment state</a><li><a href="tokenization.html#markup-declaration-open-state"><span class="secno">8.2.4.45 </span>Markup declaration open state</a><li><a href="tokenization.html#comment-start-state"><span class="secno">8.2.4.46 
</span>Comment start state</a><li><a href="tokenization.html#comment-start-dash-state"><span class="secno">8.2.4.47 </span>Comment start dash state</a><li><a href="tokenization.html#comment-state"><span class="secno">8.2.4.48 </span>Comment state</a><li><a href="tokenization.html#comment-end-dash-state"><span class="secno">8.2.4.49 </span>Comment end dash state</a><li><a href="tokenization.html#comment-end-state"><span class="secno">8.2.4.50 </span>Comment end state</a><li><a href="tokenization.html#comment-end-bang-state"><span class="secno">8.2.4.51 </span>Comment end bang state</a><li><a href="tokenization.html#doctype-state"><span class="secno">8.2.4.52 </span>DOCTYPE state</a><li><a href="tokenization.html#before-doctype-name-state"><span class="secno">8.2.4.53 </span>Before DOCTYPE name state</a><li><a href="tokenization.html#doctype-name-state"><span class="secno">8.2.4.54 </span>DOCTYPE name state</a><li><a href="tokenization.html#after-doctype-name-state"><span class="secno">8.2.4.55 </span>After DOCTYPE name state</a><li><a href="tokenization.html#after-doctype-public-keyword-state"><span class="secno">8.2.4.56 </span>After DOCTYPE public keyword state</a><li><a href="tokenization.html#before-doctype-public-identifier-state"><span class="secno">8.2.4.57 </span>Before DOCTYPE public identifier state</a><li><a href="tokenization.html#doctype-public-identifier-(double-quoted)-state"><span class="secno">8.2.4.58 </span>DOCTYPE public identifier (double-quoted) state</a><li><a href="tokenization.html#doctype-public-identifier-(single-quoted)-state"><span class="secno">8.2.4.59 </span>DOCTYPE public identifier (single-quoted) state</a><li><a href="tokenization.html#after-doctype-public-identifier-state"><span class="secno">8.2.4.60 </span>After DOCTYPE public identifier state</a><li><a href="tokenization.html#between-doctype-public-and-system-identifiers-state"><span class="secno">8.2.4.61 </span>Between DOCTYPE public and system identifiers state</a><li><a 
href="tokenization.html#after-doctype-system-keyword-state"><span class="secno">8.2.4.62 </span>After DOCTYPE system keyword state</a><li><a href="tokenization.html#before-doctype-system-identifier-state"><span class="secno">8.2.4.63 </span>Before DOCTYPE system identifier state</a><li><a href="tokenization.html#doctype-system-identifier-(double-quoted)-state"><span class="secno">8.2.4.64 </span>DOCTYPE system identifier (double-quoted) state</a><li><a href="tokenization.html#doctype-system-identifier-(single-quoted)-state"><span class="secno">8.2.4.65 </span>DOCTYPE system identifier (single-quoted) state</a><li><a href="tokenization.html#after-doctype-system-identifier-state"><span class="secno">8.2.4.66 </span>After DOCTYPE system identifier state</a><li><a href="tokenization.html#bogus-doctype-state"><span class="secno">8.2.4.67 </span>Bogus DOCTYPE state</a><li><a href="tokenization.html#cdata-section-state"><span class="secno">8.2.4.68 </span>CDATA section state</a><li><a href="tokenization.html#tokenizing-character-references"><span class="secno">8.2.4.69 </span>Tokenizing character references</a></ol><li><a href="tokenization.html#tree-construction"><span class="secno">8.2.5 </span>Tree construction</a>
      <ol><li><a href="tokenization.html#creating-and-inserting-elements"><span class="secno">8.2.5.1 </span>Creating and inserting elements</a><li><a href="tokenization.html#closing-elements-that-have-implied-end-tags"><span class="secno">8.2.5.2 </span>Closing elements that have implied end tags</a><li><a href="tokenization.html#foster-parenting"><span class="secno">8.2.5.3 </span>Foster parenting</a><li><a href="tokenization.html#parsing-main-inhtml"><span class="secno">8.2.5.4 </span>The rules for parsing tokens in HTML content</a>
        <ol><li><a href="tokenization.html#the-initial-insertion-mode"><span class="secno">8.2.5.4.1 </span>The "initial" insertion mode</a><li><a href="tokenization.html#the-before-html-insertion-mode"><span class="secno">8.2.5.4.2 </span>The "before html" insertion mode</a><li><a href="tokenization.html#the-before-head-insertion-mode"><span class="secno">8.2.5.4.3 </span>The "before head" insertion mode</a><li><a href="tokenization.html#parsing-main-inhead"><span class="secno">8.2.5.4.4 </span>The "in head" insertion mode</a><li><a href="tokenization.html#parsing-main-inheadnoscript"><span class="secno">8.2.5.4.5 </span>The "in head noscript" insertion mode</a><li><a href="tokenization.html#the-after-head-insertion-mode"><span class="secno">8.2.5.4.6 </span>The "after head" insertion mode</a><li><a href="tokenization.html#parsing-main-inbody"><span class="secno">8.2.5.4.7 </span>The "in body" insertion mode</a><li><a href="tokenization.html#parsing-main-incdata"><span class="secno">8.2.5.4.8 </span>The "text" insertion mode</a><li><a href="tokenization.html#parsing-main-intable"><span class="secno">8.2.5.4.9 </span>The "in table" insertion mode</a><li><a href="tokenization.html#parsing-main-intabletext"><span class="secno">8.2.5.4.10 </span>The "in table text" insertion mode</a><li><a href="tokenization.html#parsing-main-incaption"><span class="secno">8.2.5.4.11 </span>The "in caption" insertion mode</a><li><a href="tokenization.html#parsing-main-incolgroup"><span class="secno">8.2.5.4.12 </span>The "in column group" insertion mode</a><li><a href="tokenization.html#parsing-main-intbody"><span class="secno">8.2.5.4.13 </span>The "in table body" insertion mode</a><li><a href="tokenization.html#parsing-main-intr"><span class="secno">8.2.5.4.14 </span>The "in row" insertion mode</a><li><a href="tokenization.html#parsing-main-intd"><span class="secno">8.2.5.4.15 </span>The "in cell" insertion mode</a><li><a href="tokenization.html#parsing-main-inselect"><span 
class="secno">8.2.5.4.16 </span>The "in select" insertion mode</a><li><a href="tokenization.html#parsing-main-inselectintable"><span class="secno">8.2.5.4.17 </span>The "in select in table" insertion mode</a><li><a href="tokenization.html#parsing-main-afterbody"><span class="secno">8.2.5.4.18 </span>The "after body" insertion mode</a><li><a href="tokenization.html#parsing-main-inframeset"><span class="secno">8.2.5.4.19 </span>The "in frameset" insertion mode</a><li><a href="tokenization.html#parsing-main-afterframeset"><span class="secno">8.2.5.4.20 </span>The "after frameset" insertion mode</a><li><a href="tokenization.html#the-after-after-body-insertion-mode"><span class="secno">8.2.5.4.21 </span>The "after after body" insertion mode</a><li><a href="tokenization.html#the-after-after-frameset-insertion-mode"><span class="secno">8.2.5.4.22 </span>The "after after frameset" insertion mode</a></ol><li><a href="tokenization.html#parsing-main-inforeign"><span class="secno">8.2.5.5 </span>The rules for parsing tokens in foreign content</a></ol></ol></ol></ol></nav>

  <div class="impl">

  <h4 id="tokenization"><span class="secno">8.2.4 </span><dfn>Tokenization</dfn></h4>

  <p>Implementations must act as if they used the following state
  machine to tokenize HTML. The state machine must start in the
  <a href="#data-state">data state</a>. Most states consume a single character,
  which may have various side-effects, and then either switch the state
  machine to a new state to <i>reconsume</i> the same character,
  switch it to a new state to consume the next character, or stay
  in the same state to consume the next character. Some states have
  more complicated behavior and can consume several characters before
  switching to another state. In some cases, the tokenizer state is
  also changed by the tree construction stage.</p>
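
  <p class="note">The three kinds of transition described above (switch and
  reconsume, switch and consume the next character, stay and consume the next
  character) can be illustrated with a minimal, non-normative sketch. The class
  name <code>MiniTokenizer</code>, the <code>EOF</code> sentinel and the tuple
  representation of tokens are assumptions of this sketch, not part of this
  specification. Only a fragment of the data state and the "anything else"
  entry of the tag open state are modelled; every other entry raises an
  exception rather than guessing.</p>

```python
EOF = None  # end-of-input sentinel (a convention of this sketch)


class MiniTokenizer:
    """Toy driver: one character per step, with switch, stay, and reconsume."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.state = self.data_state  # the machine starts in the data state
        self.tokens = []
        self.parse_errors = 0
        self.done = False

    def consume(self):
        """Return the next input character (or EOF) and advance."""
        ch = self.text[self.pos] if self.pos < len(self.text) else EOF
        self.pos += 1
        return ch

    def reconsume(self):
        """Step back so that the next state consumes the same character."""
        self.pos -= 1

    def run(self):
        while not self.done:
            self.state()
        return self.tokens

    def data_state(self):
        ch = self.consume()
        if ch == "<":
            self.state = self.tag_open_state  # new state, next character
        elif ch == "&":
            raise NotImplementedError("character references are not modelled")
        elif ch is EOF:
            self.tokens.append(("eof",))
            self.done = True
        else:
            if ch == "\x00":
                self.parse_errors += 1  # U+0000 is emitted as-is, with an error
            self.tokens.append(("char", ch))  # same state, next character

    def tag_open_state(self):
        ch = self.consume()
        if ch is not EOF and (ch in "!/?" or (ch.isascii() and ch.isalpha())):
            raise NotImplementedError("only 'anything else' is modelled")
        self.parse_errors += 1
        self.state = self.data_state
        self.tokens.append(("char", "<"))
        self.reconsume()  # new state, same character
```

  <p class="note">In this sketch, <code>MiniTokenizer("a&lt;1").run()</code>
  yields character tokens for "a", "&lt;" and "1" followed by an end-of-file
  token, and records one parse error for the stray less-than sign.</p>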

  <p>The exact behavior of certain states depends on the
  <a href="parsing.html#insertion-mode">insertion mode</a> and the <a href="parsing.html#stack-of-open-elements">stack of open
  elements</a>. Certain states also use a <dfn id="temporary-buffer"><var>temporary
  buffer</var></dfn> to track progress.</p>

  <p>The output of the tokenization step is a series of zero or more
  of the following tokens: DOCTYPE, start tag, end tag, comment,
  character, end-of-file. DOCTYPE tokens have a name, a public
  identifier, a system identifier, and a <i>force-quirks
  flag</i>. When a DOCTYPE token is created, its name, public
  identifier, and system identifier must be marked as missing (which
  is a distinct state from the empty string), and the <i>force-quirks
  flag</i> must be set to <i>off</i> (its other state is
  <i>on</i>). Start and end tag tokens have a tag name, a
  <i>self-closing flag</i>, and a list of attributes, each of which
  has a name and a value. When a start or end tag token is created,
  its <i>self-closing flag</i> must be unset (its other state is that
  it be set), and its attributes list must be empty. Comment and
  character tokens have data.</p>
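
  <p class="note">As a non-normative illustration, the token types and their
  required initial values could be modelled as follows. All class and field
  names are assumptions of this sketch. <code>None</code> is used here to
  represent "missing", which keeps it distinct from the empty string, and
  <code>False</code> stands for the <i>off</i> and unset states of the two
  flags.</p>

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DoctypeToken:
    # None means "missing", which is distinct from the empty string "".
    name: Optional[str] = None
    public_identifier: Optional[str] = None
    system_identifier: Optional[str] = None
    force_quirks: bool = False  # False = off, True = on


@dataclass
class Attribute:
    name: str
    value: str


@dataclass
class TagToken:
    tag_name: str = ""
    self_closing: bool = False  # unset when the token is created
    # default_factory gives every token its own, initially empty, list.
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class StartTagToken(TagToken):
    pass


@dataclass
class EndTagToken(TagToken):
    pass


@dataclass
class CommentToken:
    data: str = ""


@dataclass
class CharacterToken:
    data: str = ""


@dataclass
class EndOfFileToken:
    pass
```

  <p class="note">Using a default factory for the attributes list, rather than
  a shared default value, avoids two tokens accidentally sharing one list.</p>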

  <p>When a token is emitted, it must immediately be handled by the
  <a href="#tree-construction">tree construction</a> stage. The tree construction stage
  can affect the state of the tokenization stage, and can insert
  additional characters into the stream. (For example, the
  <code><a href="the-script-element.html#the-script-element">script</a></code> element can result in scripts executing and
  using the <a href="dynamic-markup-insertion.html#dynamic-markup-insertion">dynamic markup insertion</a> APIs to insert
  characters into the stream being tokenized.)</p>

  <p>When a start tag token is emitted with its <i>self-closing
  flag</i> set, if the flag is not <dfn id="acknowledge-self-closing-flag" title="acknowledge
  self-closing flag">acknowledged</dfn> when it is processed by the
  tree construction stage, that is a <a href="parsing.html#parse-error">parse error</a>.</p>

  <p>When an end tag token is emitted with attributes, that is a
  <a href="parsing.html#parse-error">parse error</a>.</p>

  <p>When an end tag token is emitted with its <i>self-closing
  flag</i> set, that is a <a href="parsing.html#parse-error">parse error</a>.</p>
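
  <p class="note">The three parse-error conditions above can be checked at the
  point where a tag token is emitted. The following non-normative sketch
  assumes a hypothetical <code>handle_token</code> callback standing in for
  the tree construction stage, which sets an <code>acknowledged</code> field
  on the token when it acknowledges the self-closing flag. The
  <code>Tag</code> class, its field names and the error strings are likewise
  assumptions of this sketch.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    kind: str  # "start" or "end"
    name: str
    self_closing: bool = False
    attributes: dict = field(default_factory=dict)
    acknowledged: bool = False  # set by the tree construction stand-in


def emit_tag(token, handle_token, parse_errors):
    """Emit a tag token and record the parse errors tied to emission."""
    if token.kind == "end":
        if token.attributes:
            parse_errors.append("end tag with attributes")
        if token.self_closing:
            parse_errors.append("end tag with self-closing flag")
    # The token is handled immediately by the tree construction stage.
    handle_token(token)
    # The acknowledgement can only be checked after the token is processed.
    if token.kind == "start" and token.self_closing and not token.acknowledged:
        parse_errors.append("self-closing flag not acknowledged")


def toy_tree_construction(token):
    """Hypothetical stand-in: acknowledges the flag on 'br' start tags only."""
    if token.kind == "start" and token.name == "br" and token.self_closing:
        token.acknowledged = True
```

  <p class="note">With this stand-in, a self-closing <code>br</code> start tag
  produces no error, whereas a self-closing <code>div</code> start tag is
  reported because nothing acknowledged its flag.</p>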

  <p>An <dfn id="appropriate-end-tag-token">appropriate end tag token</dfn> is an end tag token whose
  tag name matches the tag name of the last start tag to have been
  emitted from this tokenizer, if any. If no start tag has been
  emitted from this tokenizer, then no end tag token is
  appropriate.</p>
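
  <p class="note">The definition above amounts to a simple comparison, shown
  in this non-normative sketch. The function and parameter names are
  assumptions; <code>None</code> stands for "no start tag has been emitted
  from this tokenizer".</p>

```python
from typing import Optional


def is_appropriate_end_tag(end_tag_name: str,
                           last_start_tag_name: Optional[str]) -> bool:
    """Return True if an end tag with this name is 'appropriate'."""
    if last_start_tag_name is None:
        # No start tag emitted yet: no end tag token is appropriate.
        return False
    return end_tag_name == last_start_tag_name
```

  <p class="note">For example, after a <code>title</code> start tag has been
  emitted, an end tag token named "title" is appropriate, while one named "b"
  is not.</p>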

  <p>Before each step of the tokenizer, the user agent must first
  check the <a href="parsing.html#parser-pause-flag">parser pause flag</a>. If it is true, then the
  tokenizer must abort the processing of any nested invocations of the
  tokenizer, yielding control back to the caller.</p>

  <p>The tokenizer state machine consists of the states defined in the
  following subsections.</p>


  <!-- Order of the lists below is supposed to be non-error then
  error, by unicode, then EOF, ending with "anything else" -->


  <h5 id="data-state"><span class="secno">8.2.4.1 </span><dfn>Data state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+0026 AMPERSAND (&amp;)</dt>
   <dd>Switch to the <a href="#character-reference-in-data-state">character reference in data
   state</a>.</dd>

   <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dd>Switch to the <a href="#tag-open-state">tag open state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Emit the <a href="parsing.html#current-input-character">current input
   character</a> as a character token.</dd>

   <dt>EOF</dt>
   <dd>Emit an end-of-file token.</dd>

   <dt>Anything else</dt>
   <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
   token.</dd>

  </dl><h5 id="character-reference-in-data-state"><span class="secno">8.2.4.2 </span><dfn>Character reference in data state</dfn></h5>

  <p>Switch to the <a href="#data-state">data state</a>.</p>

  <p>Attempt to <a href="#consume-a-character-reference">consume a character reference</a>, with no
  <a href="#additional-allowed-character">additional allowed character</a>.</p>

  <p>If nothing is returned, emit a U+0026 AMPERSAND character (&amp;)
  token.</p>

  <p>Otherwise, emit the character tokens that were returned.</p>
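
  <p class="note">The steps of this state can be sketched, non-normatively, as
  follows. The tokenizer is represented by a plain dictionary, and
  <code>consume_character_reference</code> is a hypothetical callable standing
  in for the algorithm of the same name defined elsewhere in this
  specification. It is assumed to return <code>None</code> when nothing is
  returned, or a string containing the returned characters otherwise.</p>

```python
def character_reference_in_data_state(tokenizer, consume_character_reference):
    """One step of the 'character reference in data' state (sketch)."""
    # The state switches back to the data state before anything is emitted.
    tokenizer["state"] = "data"
    returned = consume_character_reference()
    if not returned:
        # Nothing was returned: the ampersand is emitted literally.
        tokenizer["tokens"].append(("char", "&"))
    else:
        for ch in returned:
            tokenizer["tokens"].append(("char", ch))
```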


  <h5 id="rcdata-state"><span class="secno">8.2.4.3 </span><dfn>RCDATA state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+0026 AMPERSAND (&amp;)</dt>
   <dd>Switch to the <a href="#character-reference-in-rcdata-state">character reference in RCDATA
   state</a>.</dd>

   <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dd>Switch to the <a href="#rcdata-less-than-sign-state">RCDATA less-than sign state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER
   character token.</dd>

   <dt>EOF</dt>
   <dd>Emit an end-of-file token.</dd>

   <dt>Anything else</dt>
   <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
   token.</dd>

  </dl><h5 id="character-reference-in-rcdata-state"><span class="secno">8.2.4.4 </span><dfn>Character reference in RCDATA state</dfn></h5>

  <p>Switch to the <a href="#rcdata-state">RCDATA state</a>.</p>

  <p>Attempt to <a href="#consume-a-character-reference">consume a character reference</a>, with no
  <a href="#additional-allowed-character">additional allowed character</a>.</p>

  <p>If nothing is returned, emit a U+0026 AMPERSAND character (&amp;)
  token.</p>

  <p>Otherwise, emit the character tokens that were returned.</p>


  <h5 id="rawtext-state"><span class="secno">8.2.4.5 </span><dfn>RAWTEXT state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dd>Switch to the <a href="#rawtext-less-than-sign-state">RAWTEXT less-than sign state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER
   character token.</dd>

   <dt>EOF</dt>
   <dd>Emit an end-of-file token.</dd>

   <dt>Anything else</dt>
   <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
   token.</dd>

  </dl><h5 id="script-data-state"><span class="secno">8.2.4.6 </span><dfn>Script data state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dd>Switch to the <a href="#script-data-less-than-sign-state">script data less-than sign state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER
   character token.</dd>

   <dt>EOF</dt>
   <dd>Emit an end-of-file token.</dd>

   <dt>Anything else</dt>
   <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
   token.</dd>

  </dl><h5 id="plaintext-state"><span class="secno">8.2.4.7 </span><dfn>PLAINTEXT state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER
   character token.</dd>

   <dt>EOF</dt>
   <dd>Emit an end-of-file token.</dd>

   <dt>Anything else</dt>
   <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
   token.</dd>

  </dl><h5 id="tag-open-state"><span class="secno">8.2.4.8 </span><dfn>Tag open state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"!" (U+0021)</dt>
   <dd>Switch to the <a href="#markup-declaration-open-state">markup declaration open state</a>.</dd>

   <dt>"/" (U+002F)</dt>
   <dd>Switch to the <a href="#end-tag-open-state">end tag open state</a>.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Create a new start tag token, set its tag name to the
   lowercase version of the <a href="parsing.html#current-input-character">current input character</a> (add 0x0020 to the
   character's code point), then switch to the <a href="#tag-name-state">tag name
   state</a>. (Don't emit the token yet; further details will
   be filled in before it is emitted.)</dd>

   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
   <dd>Create a new start tag token, set its tag name to the
   <a href="parsing.html#current-input-character">current input character</a>, then switch to the <a href="#tag-name-state">tag
   name state</a>. (Don't emit the token yet; further details will
   be filled in before it is emitted.)</dd>

   <dt>"?" (U+003F)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#bogus-comment-state">bogus
   comment state</a>.</dd>

   <dt>Anything else</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Emit a U+003C LESS-THAN SIGN character token.
   Reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="end-tag-open-state"><span class="secno">8.2.4.9 </span><dfn>End tag open state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Create a new end tag token, set its tag name to the lowercase
   version of the <a href="parsing.html#current-input-character">current input character</a> (add 0x0020 to
   the character's code point), then switch to the <a href="#tag-name-state">tag name
   state</a>. (Don't emit the token yet; further details will be
   filled in before it is emitted.)</dd>

   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
   <dd>Create a new end tag token, set its tag name to the
   <a href="parsing.html#current-input-character">current input character</a>, then switch to the <a href="#tag-name-state">tag
   name state</a>. (Don't emit the token yet; further details will
   be filled in before it is emitted.)</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Emit a U+003C LESS-THAN SIGN character token and a
   U+002F SOLIDUS character token. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#bogus-comment-state">bogus
   comment state</a>.</dd>

  </dl><h5 id="tag-name-state"><span class="secno">8.2.4.10 </span><dfn>Tag name state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Switch to the <a href="#before-attribute-name-state">before attribute name state</a>.</dd>

   <dt>"/" (U+002F)</dt>
   <dd>Switch to the <a href="#self-closing-start-tag-state">self-closing start tag state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag
   token.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
   character</a> (add 0x0020 to the character's code point) to the
   current tag token's tag name.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
   character to the current tag token's tag name.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
   tag token's tag name.</dd>

  </dl><h5 id="rcdata-less-than-sign-state"><span class="secno">8.2.4.11 </span><dfn>RCDATA less-than sign state</dfn></h5>
  <!-- identical to the RAWTEXT less-than sign state, except s/RAWTEXT/RCDATA/g -->

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"/" (U+002F)</dt>
   <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Switch
   to the <a href="#rcdata-end-tag-open-state">RCDATA end tag open state</a>.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#rcdata-state">RCDATA state</a>. Emit a U+003C
   LESS-THAN SIGN character token. Reconsume the <a href="parsing.html#current-input-character">current
   input character</a>.</dd>

  </dl><h5 id="rcdata-end-tag-open-state"><span class="secno">8.2.4.12 </span><dfn>RCDATA end tag open state</dfn></h5>
  <!-- identical to the RAWTEXT (and Script data) end tag open state, except s/RAWTEXT/RCDATA/g -->

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Create a new end tag token, and set its tag name to the
   lowercase version of the <a href="parsing.html#current-input-character">current input character</a> (add
   0x0020 to the character's code point). Append the <a href="parsing.html#current-input-character">current
   input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
   switch to the <a href="#rcdata-end-tag-name-state">RCDATA end tag name state</a>. (Don't emit
   the token yet; further details will be filled in before it is
   emitted.)</dd>

   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
   <dd>Create a new end tag token, and set its tag name to the
   <a href="parsing.html#current-input-character">current input character</a>. Append the <a href="parsing.html#current-input-character">current
   input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
   switch to the <a href="#rcdata-end-tag-name-state">RCDATA end tag name state</a>. (Don't emit
   the token yet; further details will be filled in before it is
   emitted.)</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#rcdata-state">RCDATA state</a>. Emit a U+003C
   LESS-THAN SIGN character token and a U+002F SOLIDUS character token.
   Reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="rcdata-end-tag-name-state"><span class="secno">8.2.4.13 </span><dfn>RCDATA end tag name state</dfn></h5>
  <!-- identical to the RAWTEXT (and Script data) end tag name state, except s/RAWTEXT/RCDATA/g -->

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
   token</a>, then switch to the <a href="#before-attribute-name-state">before attribute name
   state</a>. Otherwise, treat it as per the "anything else" entry
   below.</dd>

   <dt>"/" (U+002F)</dt>
   <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
   token</a>, then switch to the <a href="#self-closing-start-tag-state">self-closing start tag
   state</a>. Otherwise, treat it as per the "anything else" entry
   below.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
   token</a>, then switch to the <a href="#data-state">data state</a> and emit
   the current tag token. Otherwise, treat it as per the "anything
   else" entry below.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
   character</a> (add 0x0020 to the character's code point) to the
   current tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
   character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>

   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
   tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
   character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#rcdata-state">RCDATA state</a>. Emit a U+003C
   LESS-THAN SIGN character token, a U+002F SOLIDUS character token,
   and a character token for each of the characters in the
   <var><a href="#temporary-buffer">temporary buffer</a></var> (in the order they were added to the
   buffer). Reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="rawtext-less-than-sign-state"><span class="secno">8.2.4.14 </span><dfn>RAWTEXT less-than sign state</dfn></h5>
  <!-- identical to the RCDATA less-than sign state, except s/RCDATA/RAWTEXT/g -->

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"/" (U+002F)</dt>
   <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Switch
   to the <a href="#rawtext-end-tag-open-state">RAWTEXT end tag open state</a>.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#rawtext-state">RAWTEXT state</a>. Emit a U+003C
   LESS-THAN SIGN character token. Reconsume the <a href="parsing.html#current-input-character">current
   input character</a>.</dd>

  </dl><h5 id="rawtext-end-tag-open-state"><span class="secno">8.2.4.15 </span><dfn>RAWTEXT end tag open state</dfn></h5>
  <!-- identical to the RCDATA (and Script data) end tag open state, except s/RCDATA/RAWTEXT/g -->

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Create a new end tag token, and set its tag name to the
   lowercase version of the <a href="parsing.html#current-input-character">current input character</a> (add
   0x0020 to the character's code point). Append the <a href="parsing.html#current-input-character">current
   input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
   switch to the <a href="#rawtext-end-tag-name-state">RAWTEXT end tag name state</a>. (Don't emit
   the token yet; further details will be filled in before it is
   emitted.)</dd>

   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
   <dd>Create a new end tag token, and set its tag name to the
   <a href="parsing.html#current-input-character">current input character</a>. Append the <a href="parsing.html#current-input-character">current
   input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
   switch to the <a href="#rawtext-end-tag-name-state">RAWTEXT end tag name state</a>. (Don't emit
   the token yet; further details will be filled in before it is
   emitted.)</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#rawtext-state">RAWTEXT state</a>. Emit a U+003C
   LESS-THAN SIGN character token and a U+002F SOLIDUS character
   token. Reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="rawtext-end-tag-name-state"><span class="secno">8.2.4.16 </span><dfn>RAWTEXT end tag name state</dfn></h5>
  <!-- identical to the RCDATA (and Script data) end tag name state, except s/RCDATA/RAWTEXT/g -->

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
   token</a>, then switch to the <a href="#before-attribute-name-state">before attribute name
   state</a>. Otherwise, treat it as per the "anything else" entry
   below.</dd>

   <dt>"/" (U+002F)</dt>
   <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
   token</a>, then switch to the <a href="#self-closing-start-tag-state">self-closing start tag
   state</a>. Otherwise, treat it as per the "anything else" entry
   below.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
   token</a>, then switch to the <a href="#data-state">data state</a> and emit
   the current tag token. Otherwise, treat it as per the "anything
   else" entry below.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
   character</a> (add 0x0020 to the character's code point) to the
   current tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
   character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>

   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
   tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
   character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#rawtext-state">RAWTEXT state</a>. Emit a U+003C
   LESS-THAN SIGN character token, a U+002F SOLIDUS character token,
   and a character token for each of the characters in the
   <var><a href="#temporary-buffer">temporary buffer</a></var> (in the order they were added to the
   buffer). Reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="script-data-less-than-sign-state"><span class="secno">8.2.4.17 </span><dfn>Script data less-than sign state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"/" (U+002F)</dt>
   <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Switch
   to the <a href="#script-data-end-tag-open-state">script data end tag open state</a>.</dd>

   <dt>"!" (U+0021)</dt>
   <dd>Switch to the <a href="#script-data-escape-start-state">script data escape start state</a>. Emit
   a U+003C LESS-THAN SIGN character token and a U+0021 EXCLAMATION
   MARK character token.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-state">script data state</a>. Emit a U+003C
   LESS-THAN SIGN character token. Reconsume the <a href="parsing.html#current-input-character">current
   input character</a>.</dd>

  </dl><h5 id="script-data-end-tag-open-state"><span class="secno">8.2.4.18 </span><dfn>Script data end tag open state</dfn></h5>
  <!-- identical to the RCDATA (and RAWTEXT) end tag open state, except s/RCDATA/Script data/g -->

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Create a new end tag token, and set its tag name to the
   lowercase version of the <a href="parsing.html#current-input-character">current input character</a> (add
   0x0020 to the character's code point). Append the <a href="parsing.html#current-input-character">current
   input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
   switch to the <a href="#script-data-end-tag-name-state">script data end tag name state</a>. (Don't emit
   the token yet; further details will be filled in before it is
   emitted.)</dd>

   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
   <dd>Create a new end tag token, and set its tag name to the
   <a href="parsing.html#current-input-character">current input character</a>. Append the <a href="parsing.html#current-input-character">current
   input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
   switch to the <a href="#script-data-end-tag-name-state">script data end tag name state</a>. (Don't emit
   the token yet; further details will be filled in before it is
   emitted.)</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-state">script data state</a>. Emit a U+003C
   LESS-THAN SIGN character token and a U+002F SOLIDUS character token.
   Reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="script-data-end-tag-name-state"><span class="secno">8.2.4.19 </span><dfn>Script data end tag name state</dfn></h5>
  <!-- identical to the RCDATA (and RAWTEXT) end tag name state, except s/RCDATA/Script data/g -->

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
   token</a>, then switch to the <a href="#before-attribute-name-state">before attribute name
   state</a>. Otherwise, treat it as per the "anything else" entry
   below.</dd>

   <dt>"/" (U+002F)</dt>
   <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
   token</a>, then switch to the <a href="#self-closing-start-tag-state">self-closing start tag
   state</a>. Otherwise, treat it as per the "anything else" entry
   below.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
   token</a>, then switch to the <a href="#data-state">data state</a> and emit
   the current tag token. Otherwise, treat it as per the "anything
   else" entry below.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
   character</a> (add 0x0020 to the character's code point) to the
   current tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
   character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>

   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
   tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
   character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-state">script data state</a>. Emit a U+003C
   LESS-THAN SIGN character token, a U+002F SOLIDUS character token,
   and a character token for each of the characters in the
   <var><a href="#temporary-buffer">temporary buffer</a></var> (in the order they were added to the
   buffer). Reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="script-data-escape-start-state"><span class="secno">8.2.4.20 </span><dfn>Script data escape start state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"-" (U+002D)</dt>
   <dd>Switch to the <a href="#script-data-escape-start-dash-state">script data escape start dash
   state</a>. Emit a U+002D HYPHEN-MINUS character token.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-state">script data state</a>. Reconsume the
   <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="script-data-escape-start-dash-state"><span class="secno">8.2.4.21 </span><dfn>Script data escape start dash state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"-" (U+002D)</dt>
   <dd>Switch to the <a href="#script-data-escaped-dash-dash-state">script data escaped dash dash
   state</a>. Emit a U+002D HYPHEN-MINUS character token.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-state">script data state</a>. Reconsume the
   <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="script-data-escaped-state"><span class="secno">8.2.4.22 </span><dfn>Script data escaped state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"-" (U+002D)</dt>
   <dd>Switch to the <a href="#script-data-escaped-dash-state">script data escaped dash state</a>. Emit
   a U+002D HYPHEN-MINUS character token.</dd>

   <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dd>Switch to the <a href="#script-data-escaped-less-than-sign-state">script data escaped less-than sign
   state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER
   character token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
   token.</dd>

  </dl><h5 id="script-data-escaped-dash-state"><span class="secno">8.2.4.23 </span><dfn>Script data escaped dash state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"-" (U+002D)</dt>
   <dd>Switch to the <a href="#script-data-escaped-dash-dash-state">script data escaped dash dash
   state</a>. Emit a U+002D HYPHEN-MINUS character token.</dd>

   <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dd>Switch to the <a href="#script-data-escaped-less-than-sign-state">script data escaped less-than sign
   state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#script-data-escaped-state">script data
   escaped state</a>. Emit a U+FFFD REPLACEMENT CHARACTER character
   token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-escaped-state">script data escaped state</a>. Emit the
   <a href="parsing.html#current-input-character">current input character</a> as a character token.</dd>

  </dl><h5 id="script-data-escaped-dash-dash-state"><span class="secno">8.2.4.24 </span><dfn>Script data escaped dash dash state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"-" (U+002D)</dt>
   <dd>Emit a U+002D HYPHEN-MINUS character token.</dd>

   <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dd>Switch to the <a href="#script-data-escaped-less-than-sign-state">script data escaped less-than sign
   state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Switch to the <a href="#script-data-state">script data state</a>. Emit a U+003E
   GREATER-THAN SIGN character token.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#script-data-escaped-state">script data
   escaped state</a>. Emit a U+FFFD REPLACEMENT CHARACTER character
   token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-escaped-state">script data escaped state</a>. Emit the
   <a href="parsing.html#current-input-character">current input character</a> as a character token.</dd>

  </dl><h5 id="script-data-escaped-less-than-sign-state"><span class="secno">8.2.4.25 </span><dfn>Script data escaped less-than sign state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"/" (U+002F)</dt>
   <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Switch
   to the <a href="#script-data-escaped-end-tag-open-state">script data escaped end tag open state</a>.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Append
   the lowercase version of the <a href="parsing.html#current-input-character">current input character</a>
   (add 0x0020 to the character's code point) to the <var><a href="#temporary-buffer">temporary
   buffer</a></var>. Switch to the <a href="#script-data-double-escape-start-state">script data double escape start
   state</a>. Emit a U+003C LESS-THAN SIGN character token and the
   <a href="parsing.html#current-input-character">current input character</a> as a character token.</dd>

   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
   <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Append
   the <a href="parsing.html#current-input-character">current input character</a> to the <var><a href="#temporary-buffer">temporary
   buffer</a></var>. Switch to the <a href="#script-data-double-escape-start-state">script data double escape start
   state</a>. Emit a U+003C LESS-THAN SIGN character token and the
   <a href="parsing.html#current-input-character">current input character</a> as a character token.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-escaped-state">script data escaped state</a>. Emit a U+003C
   LESS-THAN SIGN character token. Reconsume the <a href="parsing.html#current-input-character">current
   input character</a>.</dd>

  </dl><h5 id="script-data-escaped-end-tag-open-state"><span class="secno">8.2.4.26 </span><dfn>Script data escaped end tag open state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Create a new end tag token, and set its tag name to the
   lowercase version of the <a href="parsing.html#current-input-character">current input character</a> (add
   0x0020 to the character's code point). Append the <a href="parsing.html#current-input-character">current
   input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
   switch to the <a href="#script-data-escaped-end-tag-name-state">script data escaped end tag name
   state</a>. (Don't emit the token yet; further details will be
   filled in before it is emitted.)</dd>

   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
   <dd>Create a new end tag token, and set its tag name to the
   <a href="parsing.html#current-input-character">current input character</a>. Append the <a href="parsing.html#current-input-character">current
   input character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>. Finally,
   switch to the <a href="#script-data-escaped-end-tag-name-state">script data escaped end tag name
   state</a>. (Don't emit the token yet; further details will be
   filled in before it is emitted.)</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-escaped-state">script data escaped state</a>. Emit a
   U+003C LESS-THAN SIGN character token and a U+002F SOLIDUS
   character token. Reconsume the <a href="parsing.html#current-input-character">current input
   character</a>.</dd>

  </dl><h5 id="script-data-escaped-end-tag-name-state"><span class="secno">8.2.4.27 </span><dfn>Script data escaped end tag name state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
   token</a>, then switch to the <a href="#before-attribute-name-state">before attribute name
   state</a>. Otherwise, treat it as per the "anything else" entry
   below.</dd>

   <dt>"/" (U+002F)</dt>
   <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
   token</a>, then switch to the <a href="#self-closing-start-tag-state">self-closing start tag
   state</a>. Otherwise, treat it as per the "anything else" entry
   below.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>If the current end tag token is an <a href="#appropriate-end-tag-token">appropriate end tag
   token</a>, then switch to the <a href="#data-state">data state</a> and emit
   the current tag token. Otherwise, treat it as per the "anything
   else" entry below.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
   character</a> (add 0x0020 to the character's code point) to the
   current tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
   character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>

   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
   tag token's tag name. Append the <a href="parsing.html#current-input-character">current input
   character</a> to the <var><a href="#temporary-buffer">temporary buffer</a></var>.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-escaped-state">script data escaped state</a>. Emit a
   U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS character
   token, and a character token for each of the characters in the
   <var><a href="#temporary-buffer">temporary buffer</a></var> (in the order they were added to the
   buffer). Reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="script-data-double-escape-start-state"><span class="secno">8.2.4.28 </span><dfn>Script data double escape start state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dt>"/" (U+002F)</dt>
   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>If the <var><a href="#temporary-buffer">temporary buffer</a></var> is the string "<code title="">script</code>", then switch to the <a href="#script-data-double-escaped-state">script data
   double escaped state</a>. Otherwise, switch to the <a href="#script-data-escaped-state">script
   data escaped state</a>. Emit the <a href="parsing.html#current-input-character">current input
   character</a> as a character token.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
   character</a> (add 0x0020 to the character's code point) to the
   <var><a href="#temporary-buffer">temporary buffer</a></var>. Emit the <a href="parsing.html#current-input-character">current input
   character</a> as a character token.</dd>

   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the
   <var><a href="#temporary-buffer">temporary buffer</a></var>. Emit the <a href="parsing.html#current-input-character">current input
   character</a> as a character token.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-escaped-state">script data escaped state</a>. Reconsume
   the <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="script-data-double-escaped-state"><span class="secno">8.2.4.29 </span><dfn>Script data double escaped state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"-" (U+002D)</dt>
   <dd>Switch to the <a href="#script-data-double-escaped-dash-state">script data double escaped dash
   state</a>. Emit a U+002D HYPHEN-MINUS character token.</dd>

   <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dd>Switch to the <a href="#script-data-double-escaped-less-than-sign-state">script data double escaped less-than
   sign state</a>. Emit a U+003C LESS-THAN SIGN character
   token.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER
   character token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Emit the <a href="parsing.html#current-input-character">current input character</a> as a character
   token.</dd>

  </dl><h5 id="script-data-double-escaped-dash-state"><span class="secno">8.2.4.30 </span><dfn>Script data double escaped dash state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"-" (U+002D)</dt>
   <dd>Switch to the <a href="#script-data-double-escaped-dash-dash-state">script data double escaped dash dash
   state</a>. Emit a U+002D HYPHEN-MINUS character token.</dd>

   <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dd>Switch to the <a href="#script-data-double-escaped-less-than-sign-state">script data double escaped less-than
   sign state</a>. Emit a U+003C LESS-THAN SIGN character
   token.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#script-data-double-escaped-state">script data
   double escaped state</a>. Emit a U+FFFD REPLACEMENT CHARACTER
   character token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-double-escaped-state">script data double escaped
   state</a>. Emit the <a href="parsing.html#current-input-character">current input character</a> as a
   character token.</dd>

  </dl><h5 id="script-data-double-escaped-dash-dash-state"><span class="secno">8.2.4.31 </span><dfn>Script data double escaped dash dash state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"-" (U+002D)</dt>
   <dd>Emit a U+002D HYPHEN-MINUS character token.</dd>

   <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dd>Switch to the <a href="#script-data-double-escaped-less-than-sign-state">script data double escaped less-than
   sign state</a>. Emit a U+003C LESS-THAN SIGN character
   token.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Switch to the <a href="#script-data-state">script data state</a>. Emit a U+003E
   GREATER-THAN SIGN character token.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#script-data-double-escaped-state">script data
   double escaped state</a>. Emit a U+FFFD REPLACEMENT CHARACTER
   character token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-double-escaped-state">script data double escaped
   state</a>. Emit the <a href="parsing.html#current-input-character">current input character</a> as a
   character token.</dd>

  </dl><h5 id="script-data-double-escaped-less-than-sign-state"><span class="secno">8.2.4.32 </span><dfn>Script data double escaped less-than sign state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"/" (U+002F)</dt>
   <dd>Set the <var><a href="#temporary-buffer">temporary buffer</a></var> to the empty string. Switch
   to the <a href="#script-data-double-escape-end-state">script data double escape end state</a>. Emit a
   U+002F SOLIDUS character token.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-double-escaped-state">script data double escaped state</a>.
   Reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="script-data-double-escape-end-state"><span class="secno">8.2.4.33 </span><dfn>Script data double escape end state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dt>"/" (U+002F)</dt>
   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>If the <var><a href="#temporary-buffer">temporary buffer</a></var> is the string "<code title="">script</code>", then switch to the <a href="#script-data-escaped-state">script data
   escaped state</a>. Otherwise, switch to the <a href="#script-data-double-escaped-state">script data
   double escaped state</a>. Emit the <a href="parsing.html#current-input-character">current input
   character</a> as a character token.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
   character</a> (add 0x0020 to the character's code point) to the
   <var><a href="#temporary-buffer">temporary buffer</a></var>. Emit the <a href="parsing.html#current-input-character">current input
   character</a> as a character token.</dd>

   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the
   <var><a href="#temporary-buffer">temporary buffer</a></var>. Emit the <a href="parsing.html#current-input-character">current input
   character</a> as a character token.</dd>

   <dt>Anything else</dt>
   <dd>Switch to the <a href="#script-data-double-escaped-state">script data double escaped state</a>.
   Reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="before-attribute-name-state"><span class="secno">8.2.4.34 </span><dfn>Before attribute name state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Ignore the character.</dd>

   <dt>"/" (U+002F)</dt>
   <dd>Switch to the <a href="#self-closing-start-tag-state">self-closing start tag state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag
   token.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Start a new attribute in the current tag token. Set that
   attribute's name to the lowercase version of the <a href="parsing.html#current-input-character">current input
   character</a> (add 0x0020 to the character's code point), and its
   value to the empty string. Switch to the <a href="#attribute-name-state">attribute name
   state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Start a new attribute in the current
   tag token. Set that attribute's name to a U+FFFD REPLACEMENT
   CHARACTER character, and its value to the empty string. Switch to
   the <a href="#attribute-name-state">attribute name state</a>.</dd>

   <dt>U+0022 QUOTATION MARK (")</dt>
   <dt>"'" (U+0027)</dt>
   <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dt>"=" (U+003D)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Treat it as per the "anything else"
   entry below.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Start a new attribute in the current tag token. Set that
   attribute's name to the <a href="parsing.html#current-input-character">current input character</a>, and
   its value to the empty string. Switch to the <a href="#attribute-name-state">attribute name
   state</a>.</dd>

  </dl><h5 id="attribute-name-state"><span class="secno">8.2.4.35 </span><dfn>Attribute name state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Switch to the <a href="#after-attribute-name-state">after attribute name state</a>.</dd>

   <dt>"/" (U+002F)</dt>
   <dd>Switch to the <a href="#self-closing-start-tag-state">self-closing start tag state</a>.</dd>

   <dt>"=" (U+003D)</dt>
   <dd>Switch to the <a href="#before-attribute-value-state">before attribute value state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag
   token.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input
   character</a> (add 0x0020 to the character's code point) to the
   current attribute's name.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
   character to the current attribute's name.</dd>

   <dt>U+0022 QUOTATION MARK (")</dt>
   <dt>"'" (U+0027)</dt>
   <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Treat it as per the "anything else"
   entry below.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
   attribute's name.</dd>

  </dl><p>When the user agent leaves the attribute name state (and before
  emitting the tag token, if appropriate), the complete attribute's
  name must be compared to the other attributes on the same token;
  if there is already an attribute on the token with the exact same
  name, then this is a <a href="parsing.html#parse-error">parse error</a> and the new
  attribute must be dropped, along with the value that gets
  associated with it (if any).</p>
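
  <p>The duplicate-name check described above can be sketched in Python as
  follows. This is only an illustration, not part of the tokenizer: the names
  <code>TagToken</code> and <code>finish_attribute</code> are hypothetical, and
  for brevity the check is applied once the value is known, whereas the
  requirement above is to compare names on leaving the attribute name state.
  The resulting attribute list is the same either way.</p>

```python
from dataclasses import dataclass, field


@dataclass
class TagToken:
    """Hypothetical tag token: a tag name plus ordered (name, value) pairs."""
    name: str
    attributes: list = field(default_factory=list)
    parse_errors: int = 0


def finish_attribute(token, name, value):
    """Keep the attribute unless its exact name is already on the token.

    Returns True if the attribute was kept, False if it was dropped.
    """
    if any(existing == name for existing, _ in token.attributes):
        # Same name already present: parse error, drop the name and its value.
        token.parse_errors += 1
        return False
    token.attributes.append((name, value))
    return True
```

  <p>Because attribute names are lowercased as they are tokenized, an exact
  string comparison is sufficient in this sketch; the first occurrence of a
  name is the one that is kept.</p>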


  <h5 id="after-attribute-name-state"><span class="secno">8.2.4.36 </span><dfn>After attribute name state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Ignore the character.</dd>

   <dt>"/" (U+002F)</dt>
   <dd>Switch to the <a href="#self-closing-start-tag-state">self-closing start tag state</a>.</dd>

   <dt>"=" (U+003D)</dt>
   <dd>Switch to the <a href="#before-attribute-value-state">before attribute value state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag
   token.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Start a new attribute in the current tag token. Set that
   attribute's name to the lowercase version of the <a href="parsing.html#current-input-character">current
   input character</a> (add 0x0020 to the character's code point),
   and its value to the empty string. Switch to the <a href="#attribute-name-state">attribute
   name state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Start a new attribute in the current
   tag token. Set that attribute's name to a U+FFFD REPLACEMENT
   CHARACTER character, and its value to the empty string. Switch to
   the <a href="#attribute-name-state">attribute name state</a>.</dd>

   <dt>U+0022 QUOTATION MARK (")</dt>
   <dt>"'" (U+0027)</dt>
   <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Treat it as per the "anything else"
   entry below.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Start a new attribute in the current tag token. Set that
   attribute's name to the <a href="parsing.html#current-input-character">current input character</a>, and
   its value to the empty string. Switch to the <a href="#attribute-name-state">attribute name
   state</a>.</dd>

  </dl><h5 id="before-attribute-value-state"><span class="secno">8.2.4.37 </span><dfn>Before attribute value state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Ignore the character.</dd>

   <dt>U+0022 QUOTATION MARK (")</dt>
   <dd>Switch to the <a href="#attribute-value-(double-quoted)-state">attribute value (double-quoted) state</a>.</dd>

   <dt>U+0026 AMPERSAND (&amp;)</dt>
   <dd>Switch to the <a href="#attribute-value-(unquoted)-state">attribute value (unquoted) state</a>.
   Reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>

   <dt>"'" (U+0027)</dt>
   <dd>Switch to the <a href="#attribute-value-(single-quoted)-state">attribute value (single-quoted) state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
   character to the current attribute's value. Switch to the
   <a href="#attribute-value-(unquoted)-state">attribute value (unquoted) state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Emit the current tag token.</dd>

   <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dt>"=" (U+003D)</dt>
   <dt>"`" (U+0060)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Treat it as per the "anything else"
   entry below.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
   attribute's value. Switch to the <a href="#attribute-value-(unquoted)-state">attribute value (unquoted)
   state</a>.</dd>

  </dl><h5 id="attribute-value-(double-quoted)-state"><span class="secno">8.2.4.38 </span><dfn>Attribute value (double-quoted) state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+0022 QUOTATION MARK (")</dt>
   <dd>Switch to the <a href="#after-attribute-value-(quoted)-state">after attribute value (quoted)
   state</a>.</dd>

   <dt>U+0026 AMPERSAND (&amp;)</dt>
   <dd>Switch to the <a href="#character-reference-in-attribute-value-state">character reference in attribute value
   state</a>, with the <a href="#additional-allowed-character">additional allowed character</a>
   being U+0022 QUOTATION MARK (").</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
   character to the current attribute's value.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
   attribute's value.</dd>

  </dl><h5 id="attribute-value-(single-quoted)-state"><span class="secno">8.2.4.39 </span><dfn>Attribute value (single-quoted) state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"'" (U+0027)</dt>
   <dd>Switch to the <a href="#after-attribute-value-(quoted)-state">after attribute value (quoted)
   state</a>.</dd>

   <dt>U+0026 AMPERSAND (&amp;)</dt>
   <dd>Switch to the <a href="#character-reference-in-attribute-value-state">character reference in attribute value
   state</a>, with the <a href="#additional-allowed-character">additional allowed character</a>
   being "'" (U+0027).</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
   character to the current attribute's value.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
   attribute's value.</dd>

  </dl><h5 id="attribute-value-(unquoted)-state"><span class="secno">8.2.4.40 </span><dfn>Attribute value (unquoted) state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Switch to the <a href="#before-attribute-name-state">before attribute name state</a>.</dd>

   <dt>U+0026 AMPERSAND (&amp;)</dt>
   <dd>Switch to the <a href="#character-reference-in-attribute-value-state">character reference in attribute value
   state</a>, with the <a href="#additional-allowed-character">additional allowed character</a>
   being U+003E GREATER-THAN SIGN (&gt;).</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag
   token.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
   character to the current attribute's value.</dd>

   <dt>U+0022 QUOTATION MARK (")</dt>
   <dt>"'" (U+0027)</dt>
   <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
   <dt>"=" (U+003D)</dt>
   <dt>"`" (U+0060)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Treat it as per the "anything else"
   entry below.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current
   attribute's value.</dd>

  </dl><h5 id="character-reference-in-attribute-value-state"><span class="secno">8.2.4.41 </span><dfn>Character reference in attribute value state</dfn></h5>

  <p>Attempt to <a href="#consume-a-character-reference">consume a character reference</a>.</p>

  <p>If nothing is returned, append a U+0026 AMPERSAND character
  (&amp;) to the current attribute's value.</p>

  <p>Otherwise, append the returned character tokens to the current
  attribute's value.</p>

  <p>Finally, switch back to the attribute value state from which this
  state was entered.</p>
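
  <p>One way to picture the steps above is as a function that remembers the
  state it was entered from. The Python sketch below is illustrative only: the
  function name, the <code>return_state</code> parameter and the
  <code>consume_character_reference</code> callback are assumptions standing in
  for the "consume a character reference" algorithm, which is defined
  elsewhere.</p>

```python
from typing import Callable, List, Optional

AMPERSAND = "\u0026"  # U+0026 AMPERSAND


def character_reference_in_attribute_value(
    attribute_value: List[str],
    consume_character_reference: Callable[[], Optional[str]],
    return_state: str,
) -> str:
    """Run this state once and return the name of the state to switch to.

    consume_character_reference is a hypothetical callback: it returns the
    decoded characters, or None when nothing is returned.
    """
    decoded = consume_character_reference()
    if decoded is None:
        # Nothing was returned: the ampersand is kept as literal text.
        attribute_value.append(AMPERSAND)
    else:
        attribute_value.extend(decoded)
    # Switch back to whichever attribute value state entered this one.
    return return_state
```

  <p>Passing the originating state in as an argument is a design choice made
  for this sketch; an implementation could equally keep it in a tokenizer
  field.</p>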


  <h5 id="after-attribute-value-(quoted)-state"><span class="secno">8.2.4.42 </span><dfn>After attribute value (quoted) state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Switch to the <a href="#before-attribute-name-state">before attribute name state</a>.</dd>

   <dt>"/" (U+002F)</dt>
   <dd>Switch to the <a href="#self-closing-start-tag-state">self-closing start tag state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag
   token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#before-attribute-name-state">before attribute
   name state</a>. Reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="self-closing-start-tag-state"><span class="secno">8.2.4.43 </span><dfn>Self-closing start tag state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Set the <i>self-closing flag</i> of the current tag
   token. Switch to the <a href="#data-state">data state</a>. Emit the current tag
   token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#before-attribute-name-state">before attribute
   name state</a>. Reconsume the <a href="parsing.html#current-input-character">current input character</a>.</dd>

  </dl><h5 id="bogus-comment-state"><span class="secno">8.2.4.44 </span><dfn>Bogus comment state</dfn></h5>

  <p>Consume every character up to and including the first U+003E
  GREATER-THAN SIGN character (&gt;) or the end of the file (EOF),
  whichever comes first. Emit a comment token whose data is the
  concatenation of all the characters starting from and including the
  character that caused the state machine to switch into the bogus
  comment state, up to and including the character immediately before
  the last consumed character (i.e. up to the character just before
  the U+003E or EOF character), but with any U+0000 NULL characters
  replaced by U+FFFD REPLACEMENT CHARACTER characters. (If the comment
  was started by the end of the file (EOF), the token is empty.
  Similarly, the token is empty if it was generated by the string
  "<code title="">&lt;!&gt;</code>".)</p>

  <p>Switch to the <a href="#data-state">data state</a>.</p>

  <p>If the end of the file was reached, reconsume the EOF
  character.</p>
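
  <p>The scan described for the bogus comment state can be sketched in a few lines. This is an illustrative Python sketch only, not part of the tokenizer definition; the function name <code>consume_bogus_comment</code>, its parameters, and the shape of its return value are assumptions made for this example.</p>

```python
def consume_bogus_comment(text: str, start: int) -> tuple:
    """Sketch of the bogus comment state (hypothetical helper).

    `start` is the index of the character that caused the switch into the
    bogus comment state (len(text) if the switch was caused by EOF).
    Returns (comment_data, next_position, hit_eof).
    """
    end = text.find(">", start)          # first U+003E at or after `start`
    hit_eof = end == -1                  # no ">" found: the scan ran to EOF
    stop = len(text) if hit_eof else end
    # Data excludes the terminating ">" (or EOF); NULLs become U+FFFD.
    data = text[start:stop].replace("\x00", "\ufffd")
    # After ">" the tokenizer continues past it; at EOF the EOF is reconsumed,
    # so the position stays at the end of the input.
    next_position = stop if hit_eof else stop + 1
    return data, next_position, hit_eof
```

  <p>For the input "<code title="">&lt;!&gt;</code>" the scan starts at the "&gt;" itself, so the sketch returns empty comment data, in line with the parenthetical remark above.</p>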


  <h5 id="markup-declaration-open-state"><span class="secno">8.2.4.45 </span><dfn>Markup declaration open state</dfn></h5>

  <p>If the next two characters are both "-" (U+002D) characters, consume those two characters, create a comment token
  whose data is the empty string, and switch to the <a href="#comment-start-state">comment
  start state</a>.</p>

  <p>Otherwise, if the next seven characters are an <a href="infrastructure.html#ascii-case-insensitive">ASCII
  case-insensitive</a> match for the word "DOCTYPE", then consume
  those characters and switch to the <a href="#doctype-state">DOCTYPE state</a>.</p>

  <p>Otherwise, if there is a <a href="parsing.html#current-node">current node</a> and it is not
  an element in the <a href="namespaces.html#html-namespace-0">HTML namespace</a> and the next seven
  characters are a <a href="infrastructure.html#case-sensitive">case-sensitive</a> match for the string
  "[CDATA[" (the five uppercase letters "CDATA" with a U+005B LEFT
  SQUARE BRACKET character before and after), then consume those
  characters and switch to the <a href="#cdata-section-state">CDATA section state</a>.</p>

  <p>Otherwise, this is a <a href="parsing.html#parse-error">parse error</a>. Switch to the
  <a href="#bogus-comment-state">bogus comment state</a>. The next character that is
  consumed, if any, is the first character that will be in the
  comment.</p>
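
  <p>The four-way decision made in the markup declaration open state can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function name, the returned state labels, and the <code>current_node_is_foreign</code> flag (standing in for "there is a current node and it is not an element in the HTML namespace") are hypothetical and exist only for this example. Creating the empty comment token is left out.</p>

```python
import string

# ASCII-only case folding: only A-Z are mapped, other characters are untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def markup_declaration_open(text: str, pos: int,
                            current_node_is_foreign: bool = False) -> tuple:
    """Return (next_state, new_position, is_parse_error).

    `pos` is the index of the first character not yet consumed.
    """
    # Two "-" characters: consume them and start a comment.
    if text.startswith("--", pos):
        return ("comment start", pos + 2, False)
    # Seven characters matching "DOCTYPE", ASCII case-insensitively.
    if text[pos:pos + 7].translate(_ASCII_LOWER) == "doctype":
        return ("DOCTYPE", pos + 7, False)
    # "[CDATA[" is matched case-sensitively, and only in foreign content.
    if current_node_is_foreign and text.startswith("[CDATA[", pos):
        return ("CDATA section", pos + 7, False)
    # Otherwise: parse error; nothing is consumed here, so the next character
    # consumed becomes the first character of the bogus comment.
    return ("bogus comment", pos, True)
```

  <p>Note that the same "[CDATA[" input leads to the bogus comment state when the flag is false, which is why the position is returned unchanged in the last branch.</p>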


  <h5 id="comment-start-state"><span class="secno">8.2.4.46 </span><dfn>Comment start state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"-" (U+002D)</dt>
   <dd>Switch to the <a href="#comment-start-dash-state">comment start dash state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
   character to the comment token's data. Switch to the <a href="#comment-state">comment
   state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Emit the comment token.</dd> <!-- see comment in
   comment end state -->

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Emit the comment token. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the comment
   token's data. Switch to the <a href="#comment-state">comment state</a>.</dd>

  </dl><h5 id="comment-start-dash-state"><span class="secno">8.2.4.47 </span><dfn>Comment start dash state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"-" (U+002D)</dt>
   <dd>Switch to the <a href="#comment-end-state">comment end state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a "-" (U+002D) character and a U+FFFD REPLACEMENT CHARACTER character to the
   comment token's data. Switch to the <a href="#comment-state">comment
   state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Emit the comment token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Emit the comment token. Reconsume the EOF
   character.</dd> <!-- see comment in comment end state -->

   <dt>Anything else</dt>
   <dd>Append a "-" (U+002D) character and the
   <a href="parsing.html#current-input-character">current input character</a> to the comment token's
   data. Switch to the <a href="#comment-state">comment state</a>.</dd>

  </dl><h5 id="comment-state"><span class="secno">8.2.4.48 </span><dfn id="comment">Comment state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"-" (U+002D)</dt>
   <dd>Switch to the <a href="#comment-end-dash-state">comment end dash state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
   character to the comment token's data.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Emit the comment token. Reconsume the EOF
   character.</dd> <!-- see comment in comment end state -->

   <dt>Anything else</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the comment
   token's data.</dd>

  </dl><h5 id="comment-end-dash-state"><span class="secno">8.2.4.49 </span><dfn>Comment end dash state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"-" (U+002D)</dt>
   <dd>Switch to the <a href="#comment-end-state">comment end state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a "-" (U+002D) character and a U+FFFD REPLACEMENT CHARACTER character to the
   comment token's data. Switch to the <a href="#comment-state">comment
   state</a>.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Emit the comment token. Reconsume the EOF
   character.</dd> <!-- see comment in comment end state -->

   <dt>Anything else</dt>
   <dd>Append a "-" (U+002D) character and the
   <a href="parsing.html#current-input-character">current input character</a> to the comment token's
   data. Switch to the <a href="#comment-state">comment state</a>.</dd>

  </dl><h5 id="comment-end-state"><span class="secno">8.2.4.50 </span><dfn>Comment end state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Switch to the <a href="#data-state">data state</a>. Emit the comment
   token.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append two "-" (U+002D) characters and a U+FFFD REPLACEMENT CHARACTER character to the
   comment token's data. Switch to the <a href="#comment-state">comment
   state</a>.</dd>

   <dt>"!" (U+0021)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#comment-end-bang-state">comment end bang
   state</a>.</dd>

   <dt>"-" (U+002D)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a "-" (U+002D) character to the comment token's data.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
   state</a>. Emit the comment token. Reconsume the EOF
   character.</dd> <!-- For security reasons: otherwise, hostile user
   could put a <script> in a comment e.g. in a blog comment and then
   DOS the server so that the end tag isn't read, and then the
   commented <script> tag would be treated as live code -->

   <dt>Anything else</dt>
   
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append two "-" (U+002D) characters and the <a href="parsing.html#current-input-character">current input character</a> to the comment token's data. Switch to the <a href="#comment-state">comment state</a>.</dd>

  </dl><h5 id="comment-end-bang-state"><span class="secno">8.2.4.51 </span><dfn>Comment end bang state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"-" (U+002D)</dt>
   <dd>Append two "-" (U+002D) characters and a "!" (U+0021) character to the comment token's data. Switch to the <a href="#comment-end-dash-state">comment end dash state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Switch to the <a href="#data-state">data state</a>. Emit the comment token.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append two "-" (U+002D) characters, a "!" (U+0021) character, and a U+FFFD REPLACEMENT CHARACTER character to the comment token's data. Switch to the <a href="#comment-state">comment state</a>.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data state</a>. Emit the comment token. Reconsume the EOF character.</dd> <!-- see comment in comment end state -->

   <dt>Anything else</dt>
   <dd>Append two "-" (U+002D) characters, a "!" (U+0021) character, and the <a href="parsing.html#current-input-character">current input character</a> to the comment token's data. Switch to the <a href="#comment-state">comment state</a>.</dd>

  </dl><h5 id="doctype-state"><span class="secno">8.2.4.52 </span><dfn>DOCTYPE state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Switch to the <a href="#before-doctype-name-state">before DOCTYPE name state</a>.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data state</a>. Create a new DOCTYPE token. Set its <i>force-quirks flag</i> to <i>on</i>. Emit the token. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#before-doctype-name-state">before DOCTYPE name state</a>. Reconsume the character.</dd>

  </dl><h5 id="before-doctype-name-state"><span class="secno">8.2.4.53 </span><dfn>Before DOCTYPE name state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Ignore the character.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Create a new DOCTYPE token. Set the token's name to the lowercase version of the <a href="parsing.html#current-input-character">current input character</a> (add 0x0020 to the character's code point). Switch to the <a href="#doctype-name-state">DOCTYPE name state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Create a new DOCTYPE token. Set the token's name to a U+FFFD REPLACEMENT CHARACTER character. Switch to the <a href="#doctype-name-state">DOCTYPE name state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Create a new DOCTYPE token. Set its <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data state</a>. Emit the token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data state</a>. Create a new DOCTYPE token. Set its <i>force-quirks flag</i> to <i>on</i>. Emit the token. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Create a new DOCTYPE token. Set the token's name to the <a href="parsing.html#current-input-character">current input character</a>. Switch to the <a href="#doctype-name-state">DOCTYPE name state</a>.</dd>

  </dl><h5 id="doctype-name-state"><span class="secno">8.2.4.54 </span><dfn>DOCTYPE name state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Switch to the <a href="#after-doctype-name-state">after DOCTYPE name state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Switch to the <a href="#data-state">data state</a>. Emit the current DOCTYPE token.</dd>

   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
   <dd>Append the lowercase version of the <a href="parsing.html#current-input-character">current input character</a> (add 0x0020 to the character's code point) to the current DOCTYPE token's name.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER character to the current DOCTYPE token's name.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data state</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current DOCTYPE token's name.</dd>

  </dl><h5 id="after-doctype-name-state"><span class="secno">8.2.4.55 </span><dfn>After DOCTYPE name state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Ignore the character.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Switch to the <a href="#data-state">data state</a>. Emit the current DOCTYPE token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data state</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>

   <dd>

    <p>If the six characters starting from the <a href="parsing.html#current-input-character">current input character</a> are an <a href="infrastructure.html#ascii-case-insensitive">ASCII case-insensitive</a> match for the word "PUBLIC", then consume those characters and switch to the <a href="#after-doctype-public-keyword-state">after DOCTYPE public keyword state</a>.</p>

    <p>Otherwise, if the six characters starting from the <a href="parsing.html#current-input-character">current input character</a> are an <a href="infrastructure.html#ascii-case-insensitive">ASCII case-insensitive</a> match for the word "SYSTEM", then consume those characters and switch to the <a href="#after-doctype-system-keyword-state">after DOCTYPE system keyword state</a>.</p>

    <p>Otherwise, this is a <a href="parsing.html#parse-error">parse error</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#bogus-doctype-state">bogus DOCTYPE state</a>.</p>

   </dd>

  </dl><h5 id="after-doctype-public-keyword-state"><span class="secno">8.2.4.56 </span><dfn>After DOCTYPE public keyword state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Switch to the <a href="#before-doctype-public-identifier-state">before DOCTYPE public identifier state</a>.</dd>

   <dt>U+0022 QUOTATION MARK (")</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's public identifier to the empty string (not missing), then switch to the <a href="#doctype-public-identifier-(double-quoted)-state">DOCTYPE public identifier (double-quoted) state</a>.</dd>

   <dt>"'" (U+0027)</dt>

   <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's public identifier to the empty string (not missing), then switch to the <a href="#doctype-public-identifier-(single-quoted)-state">DOCTYPE public identifier (single-quoted) state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data state</a>. Emit that DOCTYPE token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data state</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#bogus-doctype-state">bogus DOCTYPE state</a>.</dd>

  </dl><h5 id="before-doctype-public-identifier-state"><span class="secno">8.2.4.57 </span><dfn>Before DOCTYPE public identifier state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Ignore the character.</dd>

   <dt>U+0022 QUOTATION MARK (")</dt>
   <dd>Set the DOCTYPE token's public identifier to the empty string (not missing), then switch to the <a href="#doctype-public-identifier-(double-quoted)-state">DOCTYPE public identifier (double-quoted) state</a>.</dd>

   <dt>"'" (U+0027)</dt>
   <dd>Set the DOCTYPE token's public identifier to the empty string (not missing), then switch to the <a href="#doctype-public-identifier-(single-quoted)-state">DOCTYPE public identifier (single-quoted) state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data state</a>. Emit that DOCTYPE token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data state</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#bogus-doctype-state">bogus DOCTYPE state</a>.</dd>

  </dl><h5 id="doctype-public-identifier-(double-quoted)-state"><span class="secno">8.2.4.58 </span><dfn>DOCTYPE public identifier (double-quoted) state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>U+0022 QUOTATION MARK (")</dt>
   <dd>Switch to the <a href="#after-doctype-public-identifier-state">after DOCTYPE public identifier state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER character to the current DOCTYPE token's public identifier.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data state</a>. Emit that DOCTYPE token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data state</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current DOCTYPE token's public identifier.</dd>

  </dl><h5 id="doctype-public-identifier-(single-quoted)-state"><span class="secno">8.2.4.59 </span><dfn>DOCTYPE public identifier (single-quoted) state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"'" (U+0027)</dt>
   <dd>Switch to the <a href="#after-doctype-public-identifier-state">after DOCTYPE public identifier state</a>.</dd>

   <dt>U+0000 NULL</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER character to the current DOCTYPE token's public identifier.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data state</a>. Emit that DOCTYPE token.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data state</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd>Append the <a href="parsing.html#current-input-character">current input character</a> to the current DOCTYPE token's public identifier.</dd>

  </dl><h5 id="after-doctype-public-identifier-state"><span class="secno">8.2.4.60 </span><dfn>After DOCTYPE public identifier state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
   <!--<dt>"CR" (U+000D)</dt>-->
   <dt>U+0020 SPACE</dt>
   <dd>Switch to the <a href="#between-doctype-public-and-system-identifiers-state">between DOCTYPE public and system identifiers state</a>.</dd>

   <dt>U+003E GREATER-THAN SIGN (&gt;)</dt>
   <dd>Switch to the <a href="#data-state">data state</a>. Emit the current DOCTYPE token.</dd>

   <dt>U+0022 QUOTATION MARK (")</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's system identifier to the empty string (not missing), then switch to the <a href="#doctype-system-identifier-(double-quoted)-state">DOCTYPE system identifier (double-quoted) state</a>.</dd>

   <dt>"'" (U+0027)</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's system identifier to the empty string (not missing), then switch to the <a href="#doctype-system-identifier-(single-quoted)-state">DOCTYPE system identifier (single-quoted) state</a>.</dd>

   <dt>EOF</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Switch to the <a href="#data-state">data state</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume the EOF character.</dd>

   <dt>Anything else</dt>
   <dd><a href="parsing.html#parse-error">Parse error</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#bogus-doctype-state">bogus DOCTYPE state</a>.</dd>

  </dl>
  <h5 id="between-doctype-public-and-system-identifiers-state"><span class="secno">8.2.4.61 </span><dfn>Between DOCTYPE public and system identifiers state</dfn></h5>

  <p>Consume the <a href="parsing.html#next-input-character">next input character</a>:</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
  </dl>

  <p>[…]</p>

  <p>If the end of the file was reached, reconsume the EOF character.</p>
  <h5><span class="secno">8.2.4.69 </span>Tokenizing character references</h5>

  <p>This section defines how to <dfn>consume a character reference</dfn>. This definition is used when parsing character references <a href="#character-reference-in-data-state" title="character reference in data state">in text</a> and in attributes.</p>

  <p>The behavior depends on the identity of the next character (the one immediately after the U+0026 AMPERSAND character):</p>

  <dl class="switch"><dt>"tab" (U+0009)</dt>
   <dt>"LF" (U+000A)</dt>
   <dt>"FF" (U+000C)</dt>
  </dl>

  <p>[…]</p>

  <dl class="switch"><dd>

    <ul class="brief"><li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.0 Level 1//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.0 Level 2//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.0 Strict Level 1//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.0 Strict Level 2//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.0 Strict//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.0//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 2.1E//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML 3.0//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Level 0//<!--EN--></code>" </li>
     <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Level 0//EN//2.0</code>" </li>-->
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Level 1//<!--EN--></code>" </li>
     <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Level 1//EN//2.0</code>" </li>-->
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Level 2//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Strict Level 0//<!--EN--></code>" </li>
     <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Strict Level 0//EN//2.0</code>" </li>-->
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Strict Level 1//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Strict Level 3//<!--EN--></code>" </li>
     <!--<li> The public identifier is set to: "<code title="">-//IETF//DTD HTML Strict Level 3//EN//3.0</code>" </li>-->
     <li> The public identifier starts with: "<code title="">-//IETF//DTD HTML Strict//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//Microsoft//DTD Internet Explorer 2.0 HTML//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//Microsoft//DTD Internet Explorer 2.0 Tables//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//Microsoft//DTD Internet Explorer 3.0 HTML Strict//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//Microsoft//DTD Internet Explorer 3.0 HTML//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//Microsoft//DTD Internet Explorer 3.0 Tables//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//Netscape Comm. Corp.//DTD HTML//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//Netscape Comm. Corp.//DTD Strict HTML//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//O'Reilly and Associates//DTD HTML 2.0//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//O'Reilly and Associates//DTD HTML Extended 1.0//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//O'Reilly and Associates//DTD HTML Extended Relaxed 1.0//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//SoftQuad Software//DTD HoTMetaL PRO 6.0::19990601::extensions to HTML 4.0//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//SoftQuad//DTD HoTMetaL PRO 4.0::19971010::extensions to HTML 4.0//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//Spyglass//DTD HTML 2.0 Extended//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//SQ//DTD HTML 2.0 HoTMetaL + extensions//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//Sun Microsystems Corp.//DTD HotJava HTML//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//Sun Microsystems Corp.//DTD HotJava Strict HTML//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 3 1995-03-24//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 3.2 Draft//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 3.2 Final//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 3.2//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 3.2S Draft//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 4.0 Frameset//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML 4.0 Transitional//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML Experimental 19960712//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//W3C//DTD HTML Experimental 970421//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//W3C//DTD W3 HTML//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//W3O//DTD W3 HTML 3.0//<!--EN--></code>" </li>
     <!--<li> The public identifier is set to: "<code title="">-//W3O//DTD W3 HTML 3.0//EN//</code>" </li>-->
     <li> The public identifier is set to: "<code title="">-//W3O//DTD W3 HTML Strict 3.0//EN//</code>" </li>
     <li> The public identifier starts with: "<code title="">-//WebTechs//DTD Mozilla HTML 2.0//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//WebTechs//DTD Mozilla HTML//<!--EN--></code>" </li>
     <li> The public identifier is set to: "<code title="">-/W3C/DTD HTML 4.0 Transitional/EN</code>" </li>
     <li> The public identifier is set to: "<code title="">HTML</code>" </li>
     <li> The system identifier is set to: "<code title="">http://www.ibm.com/data/dtd/v11/ibmxhtml1-transitional.dtd</code>" </li>
     <li> The system identifier is missing and the public identifier starts with: "<code title="">-//W3C//DTD HTML 4.01 Frameset//<!--EN--></code>" </li>
     <li> The system identifier is missing and the public identifier starts with: "<code title="">-//W3C//DTD HTML 4.01 Transitional//<!--EN--></code>" </li>
    </ul><p>Otherwise, if the DOCTYPE token matches one of the conditions in the following list, then set the <code><a href="dom.html#document">Document</a></code> to <a href="infrastructure.html#limited-quirks-mode">limited-quirks mode</a>:</p>

    <ul class="brief"><li> The public identifier starts with: "<code title="">-//W3C//DTD XHTML 1.0 Frameset//<!--EN--></code>" </li>
     <li> The public identifier starts with: "<code title="">-//W3C//DTD XHTML 1.0 Transitional//<!--EN--></code>" </li>
     <li> The system identifier is not missing and the public identifier starts with: "<code title="">-//W3C//DTD HTML 4.01 Frameset//<!--EN--></code>" </li>
     <li> The system identifier is not missing and the public identifier starts with: "<code title="">-//W3C//DTD HTML 4.01 Transitional//<!--EN--></code>" </li>
    </ul>

    <p>The system identifier and public identifier strings must be compared to the values given in the lists above in an <a href="infrastructure.html#ascii-case-insensitive">ASCII case-insensitive</a> manner. A system identifier whose value is the empty string is not considered missing for the purposes of the conditions above.</p>
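
    <p>The comparison rules above (prefix match versus exact match, ASCII case-insensitivity, and "missing" versus "empty") can be sketched as follows. This is an illustrative Python sketch under stated assumptions: it uses only a small sample of the entries listed above, covers none of the conditions that fall outside those lists, models a missing identifier as <code>None</code>, and all names in it are hypothetical.</p>

```python
import string
from typing import Optional

# ASCII-only case folding: only A-Z are mapped, other characters are untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(s: str) -> str:
    return s.translate(_ASCII_LOWER)


# A small sample copied from the lists above (NOT the complete lists).
QUIRKS_PUBLIC_PREFIXES = (
    "-//IETF//DTD HTML 2.0//",
    "-//W3C//DTD HTML 3.2 Final//",
    "-//W3C//DTD HTML 4.0 Transitional//",
)
QUIRKS_PUBLIC_EXACT = ("-/W3C/DTD HTML 4.0 Transitional/EN", "HTML")
QUIRKS_SYSTEM_EXACT = (
    "http://www.ibm.com/data/dtd/v11/ibmxhtml1-transitional.dtd",
)
# Quirks when the system identifier is missing, limited-quirks otherwise.
HTML401_PREFIXES = (
    "-//W3C//DTD HTML 4.01 Frameset//",
    "-//W3C//DTD HTML 4.01 Transitional//",
)
LIMITED_QUIRKS_PUBLIC_PREFIXES = (
    "-//W3C//DTD XHTML 1.0 Frameset//",
    "-//W3C//DTD XHTML 1.0 Transitional//",
)


def _starts_with_any(value: str, prefixes) -> bool:
    return any(value.startswith(ascii_lower(p)) for p in prefixes)


def _equals_any(value: str, candidates) -> bool:
    return any(value == ascii_lower(c) for c in candidates)


def classify(public_id: Optional[str], system_id: Optional[str]) -> str:
    """Return "quirks", "limited-quirks", or "unmatched" for the sample lists.

    None means the identifier is missing; "" is present but empty.
    """
    pub = ascii_lower(public_id) if public_id is not None else None
    sysid = ascii_lower(system_id) if system_id is not None else None
    if pub is not None and (_starts_with_any(pub, QUIRKS_PUBLIC_PREFIXES)
                            or _equals_any(pub, QUIRKS_PUBLIC_EXACT)):
        return "quirks"
    if sysid is not None and _equals_any(sysid, QUIRKS_SYSTEM_EXACT):
        return "quirks"
    if pub is not None and _starts_with_any(pub, HTML401_PREFIXES):
        # An empty system identifier counts as present, not missing.
        return "quirks" if sysid is None else "limited-quirks"
    if pub is not None and _starts_with_any(pub, LIMITED_QUIRKS_PUBLIC_PREFIXES):
        return "limited-quirks"
    return "unmatched"
```

    <p>In this sketch, an HTML 4.01 Transitional public identifier yields "quirks" when the system identifier is <code>None</code> but "limited-quirks" when it is the empty string, which mirrors the rule that an empty system identifier is not considered missing.</p>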

    <p>Then, switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#the-before-html-insertion-mode" title="insertion mode: before html">before html</a>".</p>

   </dd>

   <dt>Anything else</dt>
   <dd>

    <p>If the document is <em>not</em> <a href="the-iframe-element.html#an-iframe-srcdoc-document">an <code>iframe</code> <code>srcdoc</code> document</a>, then this is a <a href="parsing.html#parse-error">parse error</a>; set the <code><a href="dom.html#document">Document</a></code> to <a href="infrastructure.html#quirks-mode">quirks mode</a>.</p>

    <p>In any case, switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#the-before-html-insertion-mode" title="insertion mode: before html">before html</a>", then reprocess the current token.</p>

   </dd>

  </dl><h6 id="the-before-html-insertion-mode"><span class="secno">8.2.5.4.2 </span>The "<dfn title="insertion mode: before html">before html</dfn>" insertion mode</h6>

  <p>When the user agent is to apply the rules for the "<a href="#the-before-html-insertion-mode" title="insertion mode: before html">before html</a>" <a href="parsing.html#insertion-mode">insertion mode</a>, the user agent must handle the token as follows:</p>

  <dl class="switch"><dt>A DOCTYPE token</dt>
   <dd>
    <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p>
   </dd>

   <dt>A comment token</dt>
   <dd>
    <p>Append a <code>Comment</code> node to the <code><a href="dom.html#document">Document</a></code> object with the <code>data</code> attribute set to the data given in the comment token.</p>
   </dd>

   <dt>A character token that is one of U+0009 CHARACTER TABULATION, "LF" (U+000A), "FF" (U+000C), "CR" (U+000D), or U+0020 SPACE</dt>
   <dd>
    <p>Ignore the token.</p>
   </dd>

    <dt>A start tag whose tag name is "html"</dt> <dd> <p><a href="#create-an-element-for-the-token">Create an element for the token</a> in the HTML namespace. Append it to the <code><a href="dom.html#document">Document</a></code> object. Put this element in the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p> <p id="parser-appcache">If the <code><a href="dom.html#document">Document</a></code> is being loaded as part of <a href="history.html#navigate" title="navigate">navigation</a> of a browsing context, then: if the newly created element has a <code title="attr-html-manifest"><a href="the-html-element.html#attr-html-manifest">manifest</a></code> attribute whose value is not the empty string, then <a href="urls.html#resolve-a-url" title="resolve a url">resolve</a> the value of that attribute to an absolute URL, relative to the newly created element, and if that is successful, run the <a href="offline.html#concept-appcache-init" title="concept-appcache-init">application cache selection algorithm</a> with the resulting absolute URL with any <a href="urls.html#url-fragment" title="url-fragment">&lt;fragment&gt;</a> component removed; otherwise, if there is no such attribute, or its value is the empty string, or resolving its value fails, run the <a href="offline.html#concept-appcache-init" title="concept-appcache-init">application cache selection algorithm</a> with no manifest. The algorithm must be passed the <code><a href="dom.html#document">Document</a></code> object.</p>

    <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#the-before-head-insertion-mode" title="insertion mode: before head">before head</a>".</p> </dd> <dt>An end tag whose tag name is one of: "head", "body", "html", "br"</dt>

    <dd> <p>Act as described in the "anything else" entry below.</p> </dd>

    <dt>Any other end tag</dt> <dd> <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p> </dd>
    <dt>Anything else</dt> <dd> <p>Create an <code><a href="the-html-element.html#the-html-element">html</a></code> element. Append it to the <code><a href="dom.html#document">Document</a></code> object. Put this element in the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p> <p>If the <code><a href="dom.html#document">Document</a></code> is being loaded as part of <a href="history.html#navigate" title="navigate">navigation</a> of a browsing context, then: run the <a href="offline.html#concept-appcache-init" title="concept-appcache-init">application cache selection algorithm</a> with no manifest, passing it the <code><a href="dom.html#document">Document</a></code> object.</p> <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#the-before-head-insertion-mode" title="insertion mode: before head">before head</a>", then reprocess the current token.</p> </dd> </dl><p>The root element can end up being removed from the <code><a href="dom.html#document">Document</a></code> object, e.g. by scripts; nothing in particular happens in such cases, content continues being appended to the nodes as described in the next section.</p> <h6 id="the-before-head-insertion-mode"><span class="secno">8.2.5.4.3 </span>The "<dfn title="insertion mode: before head">before head</dfn>" insertion mode</h6> <p>When the user agent is to apply the rules for the "<a href="#the-before-head-insertion-mode" title="insertion mode: before head">before head</a>" <a href="parsing.html#insertion-mode">insertion mode</a>, the user agent must handle the token as follows:</p> <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER TABULATION, "LF" (U+000A), "FF" (U+000C), "CR" (U+000D), or U+0020 SPACE</dt>

    <dd> <p>Ignore the token.</p> <!-- :-( --> </dd>

    <dt>A comment token</dt> <dd> <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current node</a> with the <code title="">data</code> attribute set to the data given in the comment token.</p> </dd> <dt>A DOCTYPE token</dt>

    <dd> <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p> </dd> <dt>A start tag whose tag name is "html"</dt>

    <dd> <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion mode</a>.</p> </dd>

    <dt>A start tag whose tag name is "head"</dt> <dd> <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p> <p>Set the <a href="parsing.html#head-element-pointer"><code title="">head</code> element pointer</a> to the newly created <code><a href="the-head-element.html#the-head-element">head</a></code> element.</p>

    <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>".</p> </dd> <dt>An end tag whose tag name is one of: "head", "body", "html", "br"</dt>

    <dd> <p>Act as if a start tag token with the tag name "head" and no attributes had been seen, then reprocess the current token.</p> </dd>

    <dt>Any other end tag</dt> <dd> <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p> </dd>
    <dt>Anything else</dt> <dd> <p>Act as if a start tag token with the tag name "head" and no attributes had been seen, then reprocess the current token.</p> </dd> </dl>
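<p class="note">The "before head" rules above can be read as a small dispatch function. The following Python sketch is illustrative only and is not part of the parsing algorithm: the <code>Token</code> and <code>Parser</code> types, the <code>log</code> list, and the string stand-ins for DOM nodes are all assumptions made for the example. It returns true when the token has to be reprocessed under the new insertion mode.</p>

```python
from dataclasses import dataclass, field

# The five whitespace characters named in the "before head" rules.
WHITESPACE = {"\t", "\n", "\f", "\r", " "}


@dataclass
class Token:
    """Hypothetical token shape used only for this sketch."""
    kind: str        # "character", "comment", "doctype", "start", "end"
    name: str = ""   # tag name for start and end tags
    data: str = ""   # character or comment data


@dataclass
class Parser:
    """Hypothetical parser state; elements are modelled as plain strings."""
    mode: str = "before head"
    open_elements: list = field(default_factory=list)
    head_pointer: object = None
    errors: int = 0
    log: list = field(default_factory=list)


def insert_head(p):
    # Insert a head element, set the head element pointer, switch to "in head".
    p.open_elements.append("head")
    p.head_pointer = "head"
    p.mode = "in head"


def before_head(p, tok):
    """Apply the "before head" rules; return True if tok must be reprocessed."""
    if tok.kind == "character" and tok.data in WHITESPACE:
        return False                      # ignore the token
    if tok.kind == "comment":
        p.log.append(("comment", tok.data))
        return False
    if tok.kind == "doctype":
        p.errors += 1                     # parse error, ignore the token
        return False
    if tok.kind == "start" and tok.name == "html":
        p.log.append(("use in body rules", tok.name))
        return False
    if tok.kind == "start" and tok.name == "head":
        insert_head(p)
        return False
    if tok.kind == "end" and tok.name not in {"head", "body", "html", "br"}:
        p.errors += 1                     # any other end tag: parse error
        return False
    # Anything else, including </head>, </body>, </html> and </br>:
    # act as if <head> had been seen, then reprocess the current token.
    insert_head(p)
    return True
```

<p class="note">For instance, feeding this sketch a <code>title</code> start tag while in "before head" implies a <code>head</code> element, switches the mode to "in head", and asks the caller to reprocess the same token there.</p>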
    <h6 id="parsing-main-inhead"><span class="secno">8.2.5.4.4 </span>The "<dfn title="insertion mode: in head">in head</dfn>" insertion mode</h6>

    <p>When the user agent is to apply the rules for the "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>" <a href="parsing.html#insertion-mode">insertion mode</a>, the user agent must handle the token as follows:</p> <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER TABULATION, "LF" (U+000A), "FF" (U+000C), "CR" (U+000D), or U+0020 SPACE</dt> <dd> <p><a href="#insert-a-character" title="insert a character">Insert the character</a> into the <a href="parsing.html#current-node">current node</a>.</p> </dd> <dt>A comment token</dt> <dd> <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current node</a> with the <code title="">data</code> attribute set to the data given in the comment token.</p> </dd> <dt>A DOCTYPE token</dt> <dd> <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p> </dd> <dt>A start tag whose tag name is "html"</dt> <dd> <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion mode</a>.</p> </dd> <dt>A start tag whose tag name is one of: "base", "basefont", "bgsound", "command", "link"</dt>

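    <dd> <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token. Immediately pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p>

    <p><a href="#acknowledge-self-closing-flag" title="acknowledge self-closing flag">Acknowledge the token's <i>self-closing flag</i></a>, if it is set.</p> </dd>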

    <dt>A start tag whose tag name is "meta"</dt> <dd> <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token. Immediately pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p> <p><a href="#acknowledge-self-closing-flag" title="acknowledge self-closing flag">Acknowledge the token's <i>self-closing flag</i></a>, if it is set.</p> <p id="meta-charset-during-parse">If the element has a <code title="attr-meta-charset"><a href="the-meta-element.html#attr-meta-charset">charset</a></code> attribute, and its value is either a supported <a href="infrastructure.html#ascii-compatible-character-encoding">ASCII-compatible character encoding</a> or <a href="infrastructure.html#a-utf-16-encoding">a UTF-16 encoding</a>, and the <a href="parsing.html#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> is currently <i>tentative</i>, then <a href="parsing.html#change-the-encoding">change the encoding</a> to the encoding given by the value of the <code title="attr-meta-charset"><a href="the-meta-element.html#attr-meta-charset">charset</a></code> attribute.</p> <p>Otherwise, if the element has an <code title="attr-meta-http-equiv"><a href="the-meta-element.html#attr-meta-http-equiv">http-equiv</a></code> attribute whose value is an <a href="infrastructure.html#ascii-case-insensitive">ASCII case-insensitive</a> match for the string "<code title="">Content-Type</code>", and the element has a <code title="attr-meta-content"><a href="the-meta-element.html#attr-meta-content">content</a></code> attribute, and applying the <a href="urls.html#algorithm-for-extracting-a-character-encoding-from-a-meta-element">algorithm for extracting a character encoding from a <code>meta</code> element</a> to that attribute's value returns a supported <a href="infrastructure.html#ascii-compatible-character-encoding">ASCII-compatible character encoding</a> or <a href="infrastructure.html#a-utf-16-encoding">a UTF-16 encoding</a>, and the <a href="parsing.html#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> is currently <i>tentative</i>, then <a href="parsing.html#change-the-encoding">change the encoding</a> to the extracted encoding.</p> </dd>
    <dt>A start tag whose tag name is "title"</dt> <dd> <p>Follow the <a href="#generic-rcdata-element-parsing-algorithm">generic RCDATA element parsing algorithm</a>.</p> </dd>
    <dt>A start tag whose tag name is "noscript", if the scripting flag is enabled</dt>
    <dt>A start tag whose tag name is one of: "noframes", "style"</dt> <dd> <p>Follow the <a href="#generic-raw-text-element-parsing-algorithm">generic raw text element parsing algorithm</a>.</p> </dd>
    <dt>A start tag whose tag name is "noscript", if the scripting flag is disabled</dt>

    <dd> <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>

    <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inheadnoscript" title="insertion mode: in head noscript">in head noscript</a>".</p> </dd> <dt id="scriptTag">A start tag whose tag name is "script"</dt>

    <dd> <p>Run these steps:</p> <ol><li><p><a href="#create-an-element-for-the-token">Create an element for the token</a> in the HTML namespace.</li>

    <li><p>Mark the element as being "parser-inserted" and unset the element's <a href="the-script-element.html#force-async">"force-async"</a> flag.</p> <p class="note">This ensures that, if the script is external, any <code title="dom-document-write"><a href="dynamic-markup-insertion.html#dom-document-write">document.write()</a></code> calls in the script will execute in-line, instead of blowing the document away, as would happen in most other cases. It also prevents the script from executing until the end tag is seen.</p> </li> <li><p>If the parser was originally created for the <a href="the-end.html#html-fragment-parsing-algorithm">HTML fragment parsing algorithm</a>, then mark the <code><a href="the-script-element.html#the-script-element">script</a></code> element as <a href="the-script-element.html#already-started">"already started"</a>. (<a href="the-end.html#fragment-case">fragment case</a>)</li> <li><p>Append the new element to the <a href="parsing.html#current-node">current node</a> and push it onto the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</li> <li><p>Switch the tokenizer to the <a href="#script-data-state">script data state</a>.</li> <li><p>Let the <a href="parsing.html#original-insertion-mode">original insertion mode</a> be the current <a href="parsing.html#insertion-mode">insertion mode</a>.</p> <li><p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-incdata" title="insertion mode: text">text</a>".</li> </ol></dd> <dt>An end tag whose tag name is "head"</dt> <dd> <p>Pop the <a href="parsing.html#current-node">current node</a> (which will be the <code><a href="the-head-element.html#the-head-element">head</a></code> element) off the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p> <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#the-after-head-insertion-mode" title="insertion mode: after head">after head</a>".</p> 
</dd> <dt>An end tag whose tag name is one of: "body", "html", "br"</dt> <dd> <p>Act as described in the "anything else" entry below.</p> </dd> <dt>A start tag whose tag name is "head"</dt> <dt>Any other end tag</dt> <dd> <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p> </dd> <dt>Anything else</dt> <dd> <!-- can't get here with an EOF and a fragment case --> <p>Act as if an end tag token with the tag name "head" had been seen, and reprocess the current token.</p> </dd> </dl>
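<p class="note">The two encoding checks in the "meta" entry above can be sketched as one decision function. This Python sketch is illustrative only: the <code>SUPPORTED</code> set is a made-up stand-in for "supported ASCII-compatible character encoding or UTF-16 encoding", and <code>extract_charset</code> is a deliberately simplified regular expression, not the full algorithm for extracting a character encoding from a <code>meta</code> element. The function returns the encoding to change to, or <code>None</code> when no change is called for.</p>

```python
import re

# Hypothetical list of encodings this toy user agent supports.
SUPPORTED = {"utf-8", "windows-1252", "utf-16le", "utf-16be"}


def extract_charset(content):
    """Simplified stand-in for the meta content extraction algorithm."""
    m = re.search(r'charset\s*=\s*["\']?([^"\';\s]+)', content, re.IGNORECASE)
    return m.group(1).lower() if m else None


def encoding_change_for_meta(attrs, confidence):
    """Return the encoding to switch to for a meta start tag, or None."""
    # Both branches require the confidence to be tentative.
    if confidence != "tentative":
        return None

    # First check: a charset attribute naming a supported encoding.
    charset = attrs.get("charset")
    if charset is not None and charset.strip().lower() in SUPPORTED:
        return charset.strip().lower()

    # Otherwise: http-equiv="Content-Type" (ASCII case-insensitive)
    # together with a content attribute that yields a supported encoding.
    if attrs.get("http-equiv", "").lower() == "content-type" and "content" in attrs:
        extracted = extract_charset(attrs["content"])
        if extracted in SUPPORTED:
            return extracted

    return None
```

<p class="note">Under these assumptions, <code>encoding_change_for_meta({"charset": "UTF-8"}, "tentative")</code> yields <code>"utf-8"</code>, while the same attributes with a confidence of "certain" yield <code>None</code>.</p>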

    <h6 id="parsing-main-inheadnoscript"><span class="secno">8.2.5.4.5 </span>The "<dfn title="insertion mode: in head noscript">in head noscript</dfn>" insertion mode</h6>

    <p>When the user agent is to apply the rules for the "<a href="#parsing-main-inheadnoscript" title="insertion mode: in head noscript">in head noscript</a>" <a href="parsing.html#insertion-mode">insertion mode</a>, the user agent must handle the token as follows:</p> <dl class="switch"><dt>A DOCTYPE token</dt> <dd> <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p> </dd> <dt>A start tag whose tag name is "html"</dt> <dd> <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion mode</a>.</p> </dd> <dt>An end tag whose tag name is "noscript"</dt>

    <dd> <p>Pop the <a href="parsing.html#current-node">current node</a> (which will be a <code><a href="the-noscript-element.html#the-noscript-element">noscript</a></code> element) from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>; the new <a href="parsing.html#current-node">current node</a> will be a <code><a href="the-head-element.html#the-head-element">head</a></code> element.</p>

    <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>".</p> </dd> <dt>A character token that is one of U+0009 CHARACTER TABULATION, "LF" (U+000A), "FF" (U+000C), "CR" (U+000D), or U+0020 SPACE</dt>

    <dt>A comment token</dt> <dt>A start tag whose tag name is one of: "basefont", "bgsound", "link", "meta", "noframes", "style"</dt>

    <dd> <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>" <a href="parsing.html#insertion-mode">insertion mode</a>.</p> </dd>

    <dt>An end tag whose tag name is "br"</dt> <dd> <p>Act as described in the "anything else" entry below.</p> </dd> <dt>A start tag whose tag name is one of: "head", "noscript"</dt>
    <dt>Any other end tag</dt> <dd> <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p> </dd>
    <dt>Anything else</dt> <dd> <!-- can't get here with an EOF and a fragment case --> <p><a href="parsing.html#parse-error">Parse error</a>. Act as if an end tag with the tag name "noscript" had been seen and reprocess the current token.</p> </dd> </dl><h6 id="the-after-head-insertion-mode"><span class="secno">8.2.5.4.6 </span>The "<dfn title="insertion mode: after head">after head</dfn>" insertion mode</h6> <p>When the user agent is to apply the rules for the "<a href="#the-after-head-insertion-mode" title="insertion mode: after head">after head</a>" <a href="parsing.html#insertion-mode">insertion mode</a>, the user agent must handle the token as follows:</p> <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER TABULATION, "LF" (U+000A), "FF" (U+000C), "CR" (U+000D), or U+0020 SPACE</dt> <dd> <p><a href="#insert-a-character" title="insert a character">Insert the character</a> into the <a href="parsing.html#current-node">current node</a>.</p> </dd> <dt>A comment token</dt> <dd> <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current node</a> with the <code title="">data</code> attribute set to the data given in the comment token.</p> </dd> <dt>A DOCTYPE token</dt> <dd> <p><a href="parsing.html#parse-error">Parse error</a>. 
Ignore the token.</p> </dd> <dt>A start tag whose tag name is "html"</dt> <dd> <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion mode</a>.</p> </dd> <dt>A start tag whose tag name is "body"</dt> <dd> <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p> <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p> <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>".</p> </dd> <dt>A start tag whose tag name is "frameset"</dt> <dd> <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p> <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inframeset" title="insertion mode: in frameset">in frameset</a>".</p> </dd> <dt>A start tag token whose tag name is one of: "base", "basefont", "bgsound", "link", "meta", "noframes", "script", "style", "title"</dt> <dd> <p><a href="parsing.html#parse-error">Parse error</a>.</p> <p>Push the node pointed to by the <a href="parsing.html#head-element-pointer"><code title="">head</code> element pointer</a> onto the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p> <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>" <a href="parsing.html#insertion-mode">insertion mode</a>.</p> <p>Remove the node pointed to by the <a href="parsing.html#head-element-pointer"><code title="">head</code> element pointer</a> from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p> <p class="note">The <a href="parsing.html#head-element-pointer"><code title="">head</code> element pointer</a> cannot be 
null at this point.</p> </dd> <dt>An end tag whose tag name is one of: "body", "html", "br"</dt> <dd> <p>Act as described in the "anything else" entry below.</p> </dd> <dt>A start tag whose tag name is "head"</dt> <dt>Any other end tag</dt> <dd> <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p> </dd> <dt>Anything else</dt> <dd> <p>Act as if a start tag token with the tag name "body" and no attributes had been seen, then set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> back to "ok", and then reprocess the current token.</p> </dd> </dl><h6 id="parsing-main-inbody"><span class="secno">8.2.5.4.7 </span>The "<dfn title="insertion mode: in body">in body</dfn>" insertion mode</h6> <p>When the user agent is to apply the rules for the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="parsing.html#insertion-mode">insertion mode</a>, the user agent must handle the token as follows:</p> <dl class="switch"><dt>A character token that is U+0000 NULL</dt> <dd> <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p> <!-- The D-Link DSL-G604T ADSL router has a zero byte in its configuration UI before a <frameset>, which is why U+0000 is special-cased here. refs: https://bugzilla.mozilla.org/show_bug.cgi?id=563526 http://www.w3.org/Bugs/Public/show_bug.cgi?id=9659 --> </dd> <dt>A character token that is one of U+0009 CHARACTER TABULATION, "LF" (U+000A), "FF" (U+000C), "CR" (U+000D), or U+0020 SPACE</dt> <dd> <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if any.</p> <p><a href="#insert-a-character" title="insert a character">Insert the token's character</a> into the <a href="parsing.html#current-node">current node</a>.</p> </dd> <dt>Any other character token</dt>

    <dd> <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if any.</p>

    <p><a href="#insert-a-character" title="insert a character">Insert the token's character</a> into the <a href="parsing.html#current-node">current node</a>.</p> <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p> </dd>

    <dt>A comment token</dt> <dd> <p>Append a <code><a href="infrastructure.html#comment-0">Comment</a></code> node to the <a href="parsing.html#current-node">current node</a> with the <code title="">data</code> attribute set to the data given in the comment token.</p> </dd> <dt>A DOCTYPE token</dt>

    <dd> <p><a href="parsing.html#parse-error">Parse error</a>. Ignore the token.</p> </dd> <dt>A start tag whose tag name is "html"</dt>

    <dd> <p><a href="parsing.html#parse-error">Parse error</a>. For each attribute on the token, check to see if the attribute is already present on the top element of the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>. If it is not, add the attribute and its corresponding value to that element.</p> </dd>

    <dt>A start tag token whose tag name is one of: "base", "basefont", "bgsound", "command", "link", "meta", "noframes", "script", "style", "title"</dt> <dd> <p>Process the token <a href="parsing.html#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inhead" title="insertion mode: in head">in head</a>" <a href="parsing.html#insertion-mode">insertion mode</a>.</p> </dd> <dt>A start tag whose tag name is "body"</dt> <dd> <p><a href="parsing.html#parse-error">Parse error</a>.</p> <p>If the second element on the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> is not a <code><a href="the-body-element.html#the-body-element">body</a></code> element, or, if the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> has only one node on it, then ignore the token. (<a href="the-end.html#fragment-case">fragment case</a>)</p> <p>Otherwise, set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok"; then, for each attribute on the token, check to see if the attribute is already present on the <code><a href="the-body-element.html#the-body-element">body</a></code> element (the second element) on the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>, and if it is not, add the attribute and its corresponding value to that element.</p> </dd> <dt>A start tag whose tag name is "frameset"</dt> <dd> <p><a href="parsing.html#parse-error">Parse error</a>.</p> <p>If the second element on the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> is not a <code><a href="the-body-element.html#the-body-element">body</a></code> element, or, if the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> has only one node on it, then ignore the token. 
(<a href="the-end.html#fragment-case">fragment case</a>)</p> <p>If the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> is set to "not ok", ignore the token.</p> <p>Otherwise, run the following steps:</p> <ol><li><p>Remove the second element on the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> from its parent node, if it has one.</li> <li><p>Pop all the nodes from the bottom of the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>, from the <a href="parsing.html#current-node">current node</a> up to, but not including, the root <code><a href="the-html-element.html#the-html-element">html</a></code> element.</p> <li><p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</li> <li><p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-inframeset" title="insertion mode: in frameset">in frameset</a>".</p> </ol></dd> <dt>An end-of-file token</dt>

    <dd> <p>If there is a node in the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> that is not either a <code><a href="the-dd-element.html#the-dd-element">dd</a></code> element, a <code><a href="the-dt-element.html#the-dt-element">dt</a></code> element, an <code><a href="the-li-element.html#the-li-element">li</a></code> element, a <code><a href="the-p-element.html#the-p-element">p</a></code> element, a <code><a href="the-tbody-element.html#the-tbody-element">tbody</a></code> element, a <code><a href="the-td-element.html#the-td-element">td</a></code> element, a <code><a href="the-tfoot-element.html#the-tfoot-element">tfoot</a></code> element, a <code><a href="the-th-element.html#the-th-element">th</a></code> element, a <code><a href="the-thead-element.html#the-thead-element">thead</a></code> element, a <code><a href="the-tr-element.html#the-tr-element">tr</a></code> element, the <code><a href="the-body-element.html#the-body-element">body</a></code> element, or the <code><a href="the-html-element.html#the-html-element">html</a></code> element, then this is a <a href="parsing.html#parse-error">parse error</a>.</p> <!-- (some of those are fragment cases) --> <p><a href="the-end.html#stop-parsing">Stop parsing</a>.</p> </dd>

    <dt>An end tag whose tag name is "body"</dt> <dd> <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-scope" title="has an element in scope">have a <code>body</code> element in scope</a>, this is a <a href="parsing.html#parse-error">parse error</a>; ignore the token.</p>

    <p>Otherwise, if there is a node in the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> that is not either a <code><a href="the-dd-element.html#the-dd-element">dd</a></code> element, a <code><a href="the-dt-element.html#the-dt-element">dt</a></code> element, an <code><a href="the-li-element.html#the-li-element">li</a></code> element, an <code><a href="the-optgroup-element.html#the-optgroup-element">optgroup</a></code> element, an <code><a href="the-option-element.html#the-option-element">option</a></code> element, a <code><a href="the-p-element.html#the-p-element">p</a></code> element, an <code><a href="the-rp-element.html#the-rp-element">rp</a></code> element, an <code><a href="the-rt-element.html#the-rt-element">rt</a></code> element, a <code><a href="the-tbody-element.html#the-tbody-element">tbody</a></code> element, a <code><a href="the-td-element.html#the-td-element">td</a></code> element, a <code><a href="the-tfoot-element.html#the-tfoot-element">tfoot</a></code> element, a <code><a href="the-th-element.html#the-th-element">th</a></code> element, a <code><a href="the-thead-element.html#the-thead-element">thead</a></code> element, a <code><a href="the-tr-element.html#the-tr-element">tr</a></code> element, the <code><a href="the-body-element.html#the-body-element">body</a></code> element, or the <code><a href="the-html-element.html#the-html-element">html</a></code> element, then this is a <a href="parsing.html#parse-error">parse error</a>.</p> <!-- (some of those are fragment cases, e.g. for <tbody> you'd have hit the first paragraph since the <body> wouldn't be in scope, unless it was a fragment case) --> <!-- If we ever change the frameset-ok flag to an insertion mode, then we'd have to somehow keep track of its state when we switch to after-body. 
--> <p>Switch the <a href="parsing.html#insertion-mode">insertion mode</a> to "<a href="#parsing-main-afterbody" title="insertion mode: after body">after body</a>".</p> </dd> <dt>An end tag whose tag name is "html"</dt> <dd> <p>Act as if an end tag with tag name "body" had been seen, then, if that token wasn't ignored, reprocess the current token.</p> </dd> <!-- start tags for non-phrasing flow content elements --> <!-- the normal ones --> <dt>A start tag whose tag name is one of: "address", "article", "aside", "blockquote", "center", "details", "dialog", "dir", "div", "dl", "fieldset", "figcaption", "figure", "footer", "header", "hgroup", "menu", "nav", "ol", "p", "section", "summary", "ul"</dt>

    <dd> <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has an element in button scope">has a <code>p</code> element in button scope</a>, then act as if an end tag with the tag name "p" had been seen.</p>

    <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p> </dd> <!-- as normal, but close h1-h6 if it's the current node --> <dt>A start tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"</dt> <dd> <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has an element in button scope">has a <code>p</code> element in button scope</a>, then act as if an end tag with the tag name "p" had been seen.</p> <p>If the <a href="parsing.html#current-node">current node</a> is an element whose tag name is one of "h1", "h2", "h3", "h4", "h5", or "h6", then this is a <a href="parsing.html#parse-error">parse error</a>; pop the <a href="parsing.html#current-node">current node</a> off the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p> <!-- See https://bugs.webkit.org/show_bug.cgi?id=12646 --> <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p> </dd> <!-- as normal, but drops leading newline --> <dt>A start tag whose tag name is one of: "pre", "listing"</dt> <dd> <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has an element in button scope">has a <code>p</code> element in button scope</a>, then act as if an end tag with the tag name "p" had been seen.</p> <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p> <p>If the next token is a "LF" (U+000A) character token, then ignore that token and move on to the next one. (Newlines at the start of <code><a href="the-pre-element.html#the-pre-element">pre</a></code> blocks are ignored as an authoring convenience.)</p> <!-- <pre>[CR]X will eat the [CR], <pre>&#x10;X will eat the &#x10;, but <pre>&#x13;X will not eat the &#x13;. 
--> <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p> </dd> <!-- as normal, but interacts with the form element pointer --> <dt>A start tag whose tag name is "form"</dt> <dd> <p>If the <a href="parsing.html#form-element-pointer"><code title="form">form</code> element pointer</a> is not null, then this is a <a href="parsing.html#parse-error">parse error</a>; ignore the token.</p> <p>Otherwise:</p> <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has an element in button scope">has a <code>p</code> element in button scope</a>, then act as if an end tag with the tag name "p" had been seen.</p> <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token, and set the <a href="parsing.html#form-element-pointer"><code title="form">form</code> element pointer</a> to point to the element created.</p> </dd> <!-- as normal, but imply </li> when there's another <li> open in weird cases --> <dt>A start tag whose tag name is "li"</dt>

    <dd> <p>Run these steps:</p> <ol><li><p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</li> <li><p>Initialize <var title="">node</var> to be the <a href="parsing.html#current-node">current node</a> (the bottommost node of the stack).</li>

    <li><p><i>Loop</i>: If <var title="">node</var> is an <code><a href="the-li-element.html#the-li-element">li</a></code> element, then act as if an end tag with the tag name "li" had been seen, then jump to the last step.</li> <li><p>If <var title="">node</var> is in the special category, but is not an <code><a href="the-address-element.html#the-address-element">address</a></code>, <code><a href="the-div-element.html#the-div-element">div</a></code>, or <code><a href="the-p-element.html#the-p-element">p</a></code> element, then jump to the last step.</li>

    <li><p>Otherwise, set <var title="">node</var> to the previous entry in the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> and return to the step labeled <i>loop</i>.</li>

    <li> <p>This is the last step.</p> <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has an element in button scope">has a <code>p</code> element in button scope</a>, then act as if an end tag with the tag name "p" had been seen.</p> <p>Finally, <a href="#insert-an-html-element">insert an HTML element</a> for the token.</p> </li> </ol></dd>

    <dt>A start tag whose tag name is one of: "dd", "dt"</dt> <dd> <p>Run these steps:</p>
    <ol><li><p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</li>

    <li><p>Initialize <var title="">node</var> to be the <a href="parsing.html#current-node">current node</a> (the bottommost node of the stack).</li> <li><p><i>Loop</i>: If <var title="">node</var> is a <code><a href="the-dd-element.html#the-dd-element">dd</a></code> or <code><a href="the-dt-element.html#the-dt-element">dt</a></code> element, then act as if an end tag with the same tag name as <var title="">node</var> had been seen, then jump to the last step.</li> <li><p>If <var title="">node</var> is in the special category, but is not an <code><a href="the-address-element.html#the-address-element">address</a></code>, <code><a href="the-div-element.html#the-div-element">div</a></code>, or <code><a href="the-p-element.html#the-p-element">p</a></code> element, then jump to the last step.</li>

    <li><p>Otherwise, set <var title="">node</var> to the previous entry in the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> and return to the step labeled <i>loop</i>.</li>

    <li> <p>This is the last step.</p> <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-button-scope" title="has an element in button scope">has a <code>p</code> element in button scope</a>, then act as if an end tag with the tag name "p" had been seen.</p> <p>Finally, <a href="#insert-an-html-element">insert an HTML element</a> for the token.</p> </li> </ol></dd>

      A start tag whose tag name is "plaintext"dt> <dd> <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> has a pcode> element in button scope</a>, then act as if an end tag with the tag name "p" had been seen.p> <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.p> <p>Switch the tokenizer to the <a href="#plaintext-state">PLAINTEXT state</a>.p> <p class="note">Once a start tag with the tag name "plaintext" has been seen, that will be the last token ever seen other than character tokens (and the end-of-file token), because there is no way to switch out of the <a href="#plaintext-state">PLAINTEXT state</a>.p> </dd>
      <dt>A start tag whose tag name is "button"</dt>
      <dd>
      <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> <a href="parsing.html#has-an-element-in-scope" title="has an element in scope">has a <code>button</code> element in scope</a>, then this is a <a href="parsing.html#parse-error">parse error</a>; act as if an end tag with the tag name "button" had been seen, then reprocess the token.</p>
      <p>Otherwise:</p>
      <p><a href="parsing.html#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if any.</p>
      <p><a href="#insert-an-html-element">Insert an HTML element</a> for the token.</p>
      <p>Set the <a href="parsing.html#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
      </dd>
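<p>A rough Python sketch of the "button" start tag rule follows. It is an assumption-laden model rather than the specification's algorithm: the stack is a list of tag names, only the HTML-namespace scope markers are listed, reconstruction of the active formatting elements is left out, and <code>handle_button_start</code> and the <code>state</code> dictionary are hypothetical names.</p>

```python
# Hypothetical model: stack of open elements as a list of tag names,
# current node last. Only HTML-namespace scope markers are included.
SCOPE_MARKERS = {"applet", "caption", "html", "table", "td", "th",
                 "marquee", "object"}


def has_in_scope(stack, target):
    """True if target is found before any scope marker, walking upward."""
    for name in reversed(stack):
        if name == target:
            return True
        if name in SCOPE_MARKERS:
            return False
    return False


def handle_button_start(stack, state):
    """Process a "button" start tag; returns the number of parse errors."""
    errors = 0
    # "Reprocess the token" is modelled by re-running the scope check.
    while has_in_scope(stack, "button"):
        errors += 1
        # Simplified 'act as if an end tag "button" had been seen'.
        while stack.pop() != "button":
            pass
    # Otherwise branch (active formatting elements are not modelled here).
    stack.append("button")
    state["frameset_ok"] = False
    return errors
```

<p>In this model a <code>button</code> nested directly inside another <code>button</code> closes the outer one with a parse error, while a <code>button</code> inside a table cell within a <code>button</code> does not, because <code>td</code> ends the scope search.</p>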

      <dt>An end tag whose tag name is one of: "address", "article", "aside", "blockquote", "button", "center", "details", "dialog", "dir", "div", "dl", "fieldset", "figcaption", "figure", "footer", "header", "hgroup", "listing", "menu", "nav", "ol", "pre", "section", "summary", "ul"</dt>
      <dd>
      <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-scope" title="has an element in scope">have an element in scope</a> with the same tag name as that of the token, then this is a <a href="parsing.html#parse-error">parse error</a>; ignore the token.</p>
      <p>Otherwise, run these steps:</p>
      <ol>
      <li><p>Generate implied end tags.</p></li>
      <li><p>If the <a href="parsing.html#current-node">current node</a> is not an element with the same tag name as that of the token, then this is a <a href="parsing.html#parse-error">parse error</a>.</p></li>
      <li><p>Pop elements from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> until an element with the same tag name as the token has been popped from the stack.</p></li>
      </ol>
      </dd>
      <!-- removes the form element pointer instead of the matching node -->
      <dt>An end tag whose tag name is "form"</dt>

        <dd>
        <p>Let <var title="">node</var> be the element that the <a href="parsing.html#form-element-pointer"><code title="">form</code> element pointer</a> is set to.</p>
        <p>Set the <a href="parsing.html#form-element-pointer"><code title="">form</code> element pointer</a> to null.</p>
        <p>If <var title="">node</var> is null or the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not <a href="parsing.html#has-an-element-in-scope" title="has an element in scope">have <var title="">node</var> in scope</a>, then this is a <a href="parsing.html#parse-error">parse error</a>; ignore the token.</p>
        <p>Otherwise, run these steps:</p>
        <ol>
        <li><p>Generate implied end tags.</p></li>
        <li><p>If the <a href="parsing.html#current-node">current node</a> is not <var title="">node</var>, then this is a <a href="parsing.html#parse-error">parse error</a>.</p></li>
        <li><p>Remove <var title="">node</var> from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a>.</p></li>
        </ol>
        </dd>
        <!-- as normal, except </p> implies <p> if there's no <p> in scope, and needs care as the elements have optional tags -->
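<p>The "form" end tag rule differs from the generic end tag rule in that it removes one specific node instead of popping down to it. The Python sketch below illustrates that difference under stated assumptions: <code>El</code>, <code>handle_form_end</code> and the <code>parser</code> dictionary are hypothetical, only HTML-namespace scope markers are listed, and the implied-end-tag set is the one this sketch assumes for this revision of the specification.</p>

```python
# Hypothetical model: elements are small objects so that identity matters,
# and the stack of open elements is a list with the current node last.
SCOPE_MARKERS = {"applet", "caption", "html", "table", "td", "th",
                 "marquee", "object"}
# Assumption: elements closed by "generate implied end tags".
IMPLIED_END = {"dd", "dt", "li", "option", "optgroup", "p", "rp", "rt"}


class El:
    def __init__(self, name):
        self.name = name


def has_node_in_scope(stack, node):
    """True if this exact node is reached before any scope marker."""
    for el in reversed(stack):
        if el is node:
            return True
        if el.name in SCOPE_MARKERS:
            return False
    return False


def generate_implied_end_tags(stack):
    while stack and stack[-1].name in IMPLIED_END:
        stack.pop()


def handle_form_end(stack, parser):
    """Returns "ignored", "parse error", or "ok" for a "form" end tag."""
    node = parser["form_pointer"]
    parser["form_pointer"] = None          # the pointer is cleared in every case
    if node is None or not has_node_in_scope(stack, node):
        return "ignored"                   # parse error; the token is ignored
    generate_implied_end_tags(stack)
    status = "ok" if stack[-1] is node else "parse error"
    stack.remove(node)                     # remove only node; descendants stay
    return status
```

<p>With a <code>div</code> still open inside the form, this model reports a parse error and leaves the <code>div</code> on the stack, which is the behavior the comment "removes the form element pointer instead of the matching node" points at.</p>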

          <dt>An end tag whose tag name is "p"</dt>
          <dd>
          <p>If the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> does not have an element in button scope with the same tag name as that of the token, then this is a <a href="parsing.html#parse-error">parse error</a>; act as if a start tag with the tag name "p" had been seen, then reprocess the current token.</p>
          <p>Otherwise, run these steps:</p>
          <ol>
          <li><p>Generate implied end tags, except for elements with the same tag name as the token.</p></li>
          <li><p>If the <a href="parsing.html#current-node">current node</a> is not an element with the same tag name as that of the token, then this is a <a href="parsing.html#parse-error">parse error</a>.</p></li>
          <li><p>Pop elements from the <a href="parsing.html#stack-of-open-elements">stack of open elements</a> until an element with the same tag name as the token has been popped from the stack.</p></li>
          </ol>
          </dd>
          <!-- as normal, but needs care as the elements have optional tags, and are further scoped by <ol>/

