Week2 Slides
● Information representation
● Markup
● Raw data vs Semantics
● Logical structure vs Styling
● HTML5 and CSS
Information representation
● Computers work only with “bits”
○ Binary digits: 0 and 1
● Numbers
○ Place value: binary numbers: e.g. 6 = 0110 (4 bits)
○ Two’s complement: negative numbers: e.g. -6 = 1010 (4 bits)
● Letters? Arbitrary Text?
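The two number examples above can be checked with a minimal Python sketch. The helper names `to_twos_complement` and `from_twos_complement` are illustrative assumptions, not part of the slides, and the 4-bit width is chosen only to match the examples.

```python
def to_twos_complement(value: int, bits: int = 4) -> str:
    """Return the two's-complement bit string of `value` using `bits` bits."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking with 2**bits - 1 maps a negative value onto its bit pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Interpret a bit string as a signed two's-complement number."""
    raw = int(pattern, 2)
    # A leading 1 marks a negative number: subtract 2**bits.
    if pattern[0] == "1":
        raw -= 1 << len(pattern)
    return raw


print(to_twos_complement(6))          # 0110
print(to_twos_complement(-6))         # 1010
print(from_twos_complement("1010"))   # -6
```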
Representing Text
● ASCII
● Unicode
● UTF-8
Information Interchange
● Communicate through machines - either between machines or between humans
● Machines only work with bits
● Standard “encoding”
○ Some sequence of bits interpreted as a character
Interpretation
What is “0100 0001”?
● String of bits
● Number with value 65 decimal
● Character “A”
● All of the above
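A short Python sketch of why "all of the above" holds: the same eight bits are read three ways, and only the interpretation changes. The variable names are illustrative.

```python
bits = "01000001"

as_number = int(bits, 2)         # read the bits as an unsigned binary number
as_character = chr(as_number)    # read that number as a character code
as_bytes = bytes([as_number])    # the same value stored as one raw byte

print(bits, as_number, as_character, as_bytes)
```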
Example: [table not recovered; columns: 1st Byte, 2nd Byte, 3rd Byte, 4th Byte, Free Bits, Maximum Expressible Unicode Value]
Src: https://www.w3.org/International/articles/definitions-characters/
UTF-8
● Use 8 bits for most common characters: ASCII subset
○ All ASCII documents are automatically UTF-8 compatible
● All other characters can be encoded based on prefix string
● More difficult for a text processor:
○ First check the prefix of each byte
○ A character may span a chain of bytes, each marked by its own prefix
○ Still more efficient for the majority of documents
● Most common encoding in use today
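The prefix scheme can be sketched by encoding a code point by hand and comparing the result with Python's built-in encoder. The function name `utf8_encode` is an illustrative assumption. The byte patterns used are the standard UTF-8 ones: `0xxxxxxx` for one byte, then lead bytes `110xxxxx`, `1110xxxx`, `11110xxx`, each followed by `10xxxxxx` continuation bytes.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 using the prefix scheme."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if code_point <= 0x7F:            # 0xxxxxxx: plain ASCII, one byte
        return bytes([code_point])
    if code_point <= 0x7FF:           # 110xxxxx 10xxxxxx
        return bytes([0b11000000 | (code_point >> 6),
                      0b10000000 | (code_point & 0b111111)])
    if code_point <= 0xFFFF:          # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0b11100000 | (code_point >> 12),
                      0b10000000 | ((code_point >> 6) & 0b111111),
                      0b10000000 | (code_point & 0b111111)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0b11110000 | (code_point >> 18),
                  0b10000000 | ((code_point >> 12) & 0b111111),
                  0b10000000 | ((code_point >> 6) & 0b111111),
                  0b10000000 | (code_point & 0b111111)])


for ch in "Aé€😀":
    encoded = utf8_encode(ord(ch))
    # Cross-check against Python's own UTF-8 encoder.
    assert encoded == ch.encode("utf-8")
    print(ch, [format(b, "08b") for b in encoded])
```

A decoder works the other way round: it reads the prefix of the lead byte to learn how many continuation bytes follow.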
Markup
● Content vs Meaning
● Types of markup
● (X)HTML
Content
Markup
Result
Types of Markup
● Presentational
○ WYSIWYG: directly format output and display
○ Embed codes not part of regular text, specific to the editor
● Procedural
○ Details on how to display:
■ change font to large, bold
■ skip 2 lines, indent 4 columns
● Descriptive
○ This is a <title>, this is a <heading>, this is a <paragraph>
Coombs et al, “Markup Systems and the Future of Scholarly Text Processing”,
Communications of the ACM, 1987
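A minimal sketch of the descriptive idea: the document only says what each piece of text is, and separate renderers decide how it looks. The `document` data, the role names and both renderer functions are hypothetical, chosen only for illustration.

```python
import html

# Descriptive markup: each item records a role, not an appearance.
document = [
    ("title", "Week 2"),
    ("heading", "Markup"),
    ("paragraph", "Content and presentation are kept apart."),
]


def render_plain(doc):
    """One possible presentation: plain text with an underlined heading."""
    styles = {
        "title": str.upper,
        "heading": lambda text: text + "\n" + "-" * len(text),
        "paragraph": lambda text: text,
    }
    return "\n".join(styles[role](text) for role, text in doc)


def render_html(doc):
    """Another presentation of the same content: HTML tags."""
    tags = {"title": "title", "heading": "h1", "paragraph": "p"}
    return "\n".join(
        f"<{tags[role]}>{html.escape(text)}</{tags[role]}>" for role, text in doc
    )


print(render_plain(document))
print(render_html(document))
```

Changing the look means editing a renderer; the document itself stays untouched.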
Examples
● MS Word, Google Docs etc:
○ User interface focused on “appearance”, not meaning
○ WYSIWYG: direct control over styling
○ Often leads to complex formatting and loss of inherent meaning
● LaTeX, HTML (markup languages in general, *ML)
○ Focus on meaning
○ More complex to write and edit, not WYSIWYG in general
Semantic Markup
● Content vs Presentation
● Semantics
○ Meaning of the text
○ Structure or logic of the document
HTML (and co.)
● HyperText Markup Language
● Generalizations
● Variants of Interest
HyperText Markup Language
● HTML was first used by Tim Berners-Lee in the original Web at CERN (~1989)
● Considered an application of SGML (Standard Generalized Markup Language)
○ Strict definitions on structure, syntax, validity
● HTML meant for browser interpretation
○ Very forgiving: loose validity checks
○ Best effort to display
HTML Example
<!DOCTYPE html>
<html>
<body>
</body>
</html>
Tags
● <h1> </h1> - paired tags
● Angle brackets < >
● Closing tag with /
● Location specific: <!DOCTYPE>: only at the very start of the document
● Tag names are case-insensitive
Nesting
● <em><strong>Hello</strong></em>
● Renders as: Hello (emphasized and strong, typically bold italic)
Invalid:
● <em><strong>Hello</em></strong>
● <em><strong>Hello</em>
● <em><strong>Hell<o/em></strong>
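The nesting rule amounts to a stack discipline: every closing tag must match the most recently opened tag. Below is a minimal sketch using Python's standard `html.parser`; the names `NestingChecker` and `is_properly_nested` are illustrative assumptions. It is deliberately strict: it ignores void elements such as `<br>` and does none of the error recovery that browsers perform.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Track open tags on a stack and record mismatched closing tags."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # A closing tag must match the most recently opened tag.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def is_properly_nested(fragment: str) -> bool:
    checker = NestingChecker()
    checker.feed(fragment)
    checker.close()
    # Valid only if nothing mismatched and nothing was left open.
    return not checker.errors and not checker.stack


print(is_properly_nested("<em><strong>Hello</strong></em>"))   # True
print(is_properly_nested("<em><strong>Hello</em></strong>"))   # False
print(is_properly_nested("<em><strong>Hello</em>"))            # False
```

The parser lowercases tag names before calling the handlers, so `<EM>Hello</em>` also passes, in line with tags being case-insensitive.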
Presentation vs Semantics
● <strong>Hello</strong>
● <b>Hello</b>
● Both typically render as: Hello (bold)
● <center>
● <font>
Document Object Model
<html>
<head>
<title>My title</title>
</head>
<body>
<h1>A heading</h1>
<a href="link">Link Text</a>
</body>
</html>
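The Document Object Model treats the page above as a tree of nodes. The sketch below builds a simplified tree with Python's standard `html.parser` and prints it as an indented outline. `TreeBuilder`, `outline` and the dictionary layout are illustrative assumptions, not the browser's actual DOM API.

```python
from html.parser import HTMLParser

PAGE = """<html>
<head>
<title>My title</title>
</head>
<body>
<h1>A heading</h1>
<a href="link">Link Text</a>
</body>
</html>"""


class TreeBuilder(HTMLParser):
    """Build a nested dict tree: elements become nodes, text becomes leaves."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#document", "children": []}
        self.open_nodes = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self.open_nodes[-1]["children"].append(node)
        self.open_nodes.append(node)

    def handle_endtag(self, tag):
        if len(self.open_nodes) > 1 and self.open_nodes[-1]["tag"] == tag:
            self.open_nodes.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace-only text between tags
            leaf = {"tag": "#text", "text": text, "children": []}
            self.open_nodes[-1]["children"].append(leaf)


def outline(node, depth=0):
    """Return the tree as a list of indented lines."""
    label = "#text: " + node["text"] if node["tag"] == "#text" else node["tag"]
    lines = ["  " * depth + label]
    for child in node["children"]:
        lines.extend(outline(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.feed(PAGE)
print("\n".join(outline(builder.root)))
```

In the outline, `html` has two children, `head` and `body`, and each text run hangs below the element that contains it.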
<style>
h1 {
  color: maroon;
  margin-left: 40px;
}
</style>
External CSS
● Extract common styles for reuse
● Multiple CSS files can be included
● For rules of equal specificity, the latest definition of a style takes precedence
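The "latest definition wins" rule can be modelled as merging dictionaries in the order the stylesheets are included. This is a deliberately simplified sketch: the real CSS cascade also weighs specificity, `!important` and origin. The names `merge_stylesheets`, `base` and `theme` are hypothetical.

```python
def merge_stylesheets(*sheets):
    """Merge stylesheets in inclusion order; later declarations override earlier ones."""
    merged = {}
    for sheet in sheets:
        for selector, declarations in sheet.items():
            merged.setdefault(selector, {}).update(declarations)
    return merged


# Two stylesheets, included in this order.
base = {"h1": {"color": "maroon", "margin-left": "40px"}}
theme = {"h1": {"color": "navy"}}

print(merge_stylesheets(base, theme))
# {'h1': {'color': 'navy', 'margin-left': '40px'}}
```

Only the redefined property changes; `margin-left` from the earlier sheet still applies.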
Responsive Design
● Mobiles and tablets have smaller screens
○ Different form factors
● Adapt to the screen: respond
● CSS controls styling, HTML controls content!
Bootstrap
● Commonly used framework
○ Originated at Twitter
○ Widely used now
● Standard styles for various components
○ Buttons
○ Forms
○ Icons
● Mobile first: highly responsive layout
JavaScript?
● Interpreted language brought into the browser
● Not really related to Java in any way; formally standardized as ECMAScript
● Why?
○ HTML is not a programming language
○ CSS is not a programming language (well, …)
● Would still like to have “programmability” inside browser
● Not part of the core presentation requirements
○ Very useful, but will be considered later
● Presentation - Human interaction