UNIT I WEBSITE BASICS
By
Dr.Smitha.P.S
Associate Professor
Velammal Engineering College
WEBSITE
* A website is a collection of linked web pages that shares a
unique domain name and can be accessed from anywhere
in the world over the Internet.
* It is hosted by a web server and viewed by web clients.
* It can be developed using HTML, JavaScript, DHTML, PHP, etc.
Internet
* The Internet is a worldwide system of interconnected computer
networks.
* The Internet uses the standard Internet protocol suite (TCP/IP).
* Every computer on the Internet is identified by a unique IP address.
* An IP address is a unique set of numbers (such as 110.22.33.114) that
identifies a computer's location.
* A special computer, the DNS (Domain Name System) server, maps names to
IP addresses so that a user can locate a computer by name.
* For example, a DNS server will resolve the host name in
http://www.tutorialspoint.com to a particular IP address to uniquely
identify the computer on which this website is hosted.
* The Internet is accessible to every user all over the world.
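As a small illustration of what "a unique set of numbers" means, the sketch below uses Python's standard `ipaddress` module to split the sample address from the slide into its four numbers and to show the same address as a single 32-bit integer. The helper name `describe_ipv4` is an assumption made for this example. No DNS lookup is performed here, since that would need network access.

```python
import ipaddress


def describe_ipv4(text: str) -> dict:
    """Split a dotted-decimal IPv4 address into its parts."""
    addr = ipaddress.IPv4Address(text)   # raises ValueError if malformed
    return {
        "octets": list(addr.packed),     # four bytes, one per dotted number
        "as_int": int(addr),             # the same address as one 32-bit integer
    }


info = describe_ipv4("110.22.33.114")
print(info["octets"])
print(info["as_int"])
```

Each dotted number must fit in one byte (0 to 255), which is why a value such as `110.22.33.300` is rejected.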
Internet Evolution
* The concept of the Internet originated in 1969 and has undergone several
technological and infrastructural changes, as discussed below:
* The Internet grew out of the Advanced Research
Projects Agency Network (ARPANET).
* ARPANET was developed by the United States Department of Defense.
* The basic purpose of ARPANET was to provide communication among the
various bodies of government.
* Initially, there were only four nodes, formally called hosts.
* By 1972, the ARPANET had grown to 23 nodes; it later spread to
other countries and became known as the Internet.
* Over time, with the invention of new technologies such as the TCP/IP protocols,
DNS, WWW, browsers, scripting languages, etc., the Internet provided a
medium to publish and access information over the web.
Basic Internet Protocol:
Transmission Control Protocol (TCP)
TCP is a connection-oriented protocol that offers end-to-end packet delivery. It
acts as the backbone for connections. It exhibits the following key features:
Transmission Control Protocol (TCP) corresponds to the Transport Layer of the OSI
Model.
TCP is a reliable and connection-oriented protocol.
TCP offers:
Stream Data Transfer.
Reliability.
Efficient Flow Control.
Full-duplex operation.
Data can be transmitted in both directions on a signal carrier at the same time.
Multiplexing.
Multiplexing is a method by which multiple analog or digital signals are combined
into one signal over a shared medium.
TCP offers connection-oriented end-to-end packet delivery.
TCP ensures reliability by sequencing bytes with a forwarding acknowledgement
number that indicates to the destination the next byte the source expects to receive.
It retransmits the bytes not acknowledged within a specified time period.
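The sequencing and retransmission idea can be sketched with a toy simulation. This is not real TCP: it sends one segment at a time (stop-and-wait) over a pretend network, and the function name `send_reliably` and the `lost_first_try` parameter are assumptions made for the example. It only shows how an acknowledgement number ("next byte expected") lets the sender resend whatever was not acknowledged.

```python
def send_reliably(data: bytes, segment_size: int, lost_first_try: set) -> tuple:
    """Toy model of TCP-style sequencing, acknowledgement and retransmission.

    lost_first_try holds the starting byte offsets of segments that the
    simulated network drops the first time they are sent.
    """
    received = bytearray()       # receiver's in-order byte stream
    next_expected = 0            # receiver's ACK number: next byte it wants
    transmissions = 0
    dropped = set(lost_first_try)

    while next_expected < len(data):
        seq = next_expected                    # sender resumes at the last ACK
        segment = data[seq:seq + segment_size]
        transmissions += 1
        if seq in dropped:
            dropped.discard(seq)               # lost: no ACK arrives, so the
            continue                           # sender "times out" and resends
        received += segment
        next_expected = seq + len(segment)     # cumulative acknowledgement

    return bytes(received), transmissions


out, sent = send_reliably(b"hello world!", 4, {4})
print(out, sent)
```

With three 4-byte segments and one simulated loss, the receiver still ends up with the full byte stream, at the cost of one extra transmission.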
Internet Protocol (IP)
* Internet Protocol is a connectionless and unreliable protocol. It
provides no guarantee of successful transmission of data. To
make delivery reliable, it must be paired with a reliable protocol such as TCP
at the transport layer.
* Points to remember:
* The length of a datagram is variable.
* The datagram is divided into two parts: header and data.
* The length of the header is 20 to 60 bytes.
* The header contains information for routing and delivery of the
packet.
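The header/data split and the 20 to 60 byte header length can be seen by decoding a hand-built IPv4 datagram. The sketch below reads only a few header fields with the standard `struct` module; the sample addresses and payload are made up for illustration, and the checksum is left at zero rather than computed.

```python
import struct


def parse_ipv4_header(packet: bytes) -> dict:
    """Read a few fields from the start of an IPv4 datagram."""
    version_ihl, _tos, total_length = struct.unpack("!BBH", packet[:4])
    header_len = (version_ihl & 0x0F) * 4      # IHL counts 32-bit words
    if not 20 <= header_len <= 60:
        raise ValueError("IPv4 header must be 20 to 60 bytes")
    src, dst = struct.unpack("!4s4s", packet[12:20])
    return {
        "version": version_ihl >> 4,
        "header_len": header_len,
        "total_length": total_length,
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
        "data": packet[header_len:total_length],
    }


# Hand-built sample datagram: 20-byte header (IHL = 5) followed by 5 data bytes.
header = struct.pack("!BBHHHBBH4s4s",
                     (4 << 4) | 5, 0, 25,   # version/IHL, TOS, total length
                     0, 0,                  # identification, flags/fragment offset
                     64, 6, 0,              # TTL, protocol (6 = TCP), checksum (left 0)
                     bytes([110, 22, 33, 114]), bytes([10, 0, 0, 1]))
parsed = parse_ipv4_header(header + b"hello")
print(parsed)
```

The header length field is 4 bits wide and counts 4-byte words, which is where the upper limit of 60 bytes (15 words) comes from.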
User Datagram Protocol (UDP)
* Like IP, UDP is a connectionless and unreliable protocol. It doesn't
require making a connection with the host to exchange data. Since
UDP is an unreliable protocol, there is no mechanism for ensuring that
data sent is received. UDP transmits data in the form of datagrams.
* Points to remember:
* UDP is used by applications that typically transmit small amounts of
data at one time.
* UDP provides protocol ports, i.e. a UDP message contains both
source and destination port numbers, which makes it possible for UDP
software at the destination to deliver the message to the correct
application program.
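How port numbers steer a message to the right program can be sketched by packing and unpacking the 8-byte UDP header by hand. Nothing is sent over a network here; the `deliver` function and the `apps` table are hypothetical stand-ins for the UDP software and the applications at the destination.

```python
import struct

UDP_HEADER = struct.Struct("!HHHH")   # source port, destination port, length, checksum


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    length = UDP_HEADER.size + len(payload)       # header is always 8 bytes
    # Checksum left as 0 here, which in UDP over IPv4 means "not computed".
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload


def deliver(datagram: bytes, applications: dict) -> str:
    """Hand the payload to whichever (hypothetical) application owns the port."""
    _src, dst_port, length, _checksum = UDP_HEADER.unpack(datagram[:8])
    payload = datagram[8:length]
    handler = applications.get(dst_port)
    if handler is None:
        return "dropped: no application on port %d" % dst_port
    return handler(payload)


apps = {69: lambda p: "tftp got " + p.decode(),
        53: lambda p: "dns got " + p.decode()}
datagram = build_udp_datagram(50000, 53, b"query")
print(deliver(datagram, apps))
```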
File Transfer Protocol (FTP)
* FTP is used to copy files from one host to another. FTP
does this in the following manner:
* FTP creates two processes, a Control Process and a Data Transfer
Process, at both ends, i.e. at the client as well as at the server.
* FTP establishes two different connections: one for data transfer
and the other for control information.
* FTP uses port 21 for the control connection and port 20 for the data
connection.
Trivial File Transfer Protocol (TFTP)
* Trivial File Transfer Protocol is also used to transfer files, but it
transfers them without authentication. Unlike FTP, TFTP does not
separate control and data information. Since there is no authentication,
TFTP lacks security features, so its use is not
recommended.
Key points
* TFTP makes use of UDP for data transport. Each TFTP message is carried in
a separate UDP datagram.
* The first two bytes of a TFTP message specify the type of message.
* The TFTP session is initiated when a TFTP client sends a request to upload
or download a file.
* The request is sent from an ephemeral UDP port to UDP port 69 of a
TFTP server.
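The "first two bytes give the message type" rule can be sketched by building a download (read) request by hand. The opcode numbers and the opcode/filename/mode layout follow the TFTP specification (RFC 1350); the file name `notes.txt` and the helper names are assumptions for the example, and no packet is actually sent.

```python
import struct

OPCODES = {1: "RRQ", 2: "WRQ", 3: "DATA", 4: "ACK", 5: "ERROR"}


def build_read_request(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode, filename, 0, mode, 0."""
    return (struct.pack("!H", 1) + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")


def message_type(message: bytes) -> str:
    """Look only at the first two bytes to classify a TFTP message."""
    (opcode,) = struct.unpack("!H", message[:2])
    return OPCODES.get(opcode, "unknown")


rrq = build_read_request("notes.txt")
print(message_type(rrq), rrq)
```

A real client would send these bytes in a single UDP datagram to port 69 of the server.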
World Wide Web
* The Web is the collection of machines (Web servers) on the Internet that provide
information, particularly HTML documents, via HTTP.
* Machines that access information on the Web are known as Web clients.
* A Web browser is software used by an end user to access the Web.
Essential elements of the World Wide Web
are:
* 1. web browsers - to surf the Web
* 2. server systems - to supply information to the browsers
* 3. computer networks - to support browser-server communication
What is a protocol and HTTP?
* A protocol is a standard procedure for defining and
regulating communication.
* e.g. TCP, UDP, HTTP, etc.
* HTTP is the foundation of data communication for
the World Wide Web.
* HTTP is the Web's application-layer protocol for
transferring various forms of data between server and
client, such as
* plaintext
* hypertext
* images
* videos and
* sounds
* etc.
HTTP: Hypertext Transfer Protocol
* HTTP is a form of communication protocol which specifies how web
clients and servers should communicate.
* The basic structure of HTTP follows a request-response model.
* A client always initiates a request message to the server; the server
generates a response message.
HTTP:
* It is the standard protocol for communication between web
browsers and web servers.
* It defines:
* how a client and server establish a connection,
* how the client requests data from the server
* how the server responds to that request
* how data is transferred from the server back to the client.
* and finally, how the connection is closed
* It assumes very little about a particular system, and does not keep
state between different message exchanges.
* This makes HTTP a stateless protocol.
* The communication usually takes place over TCP/IP.
* The default port for HTTP is 80, but other ports can also be
used.
Hypertext Transfer Protocol (HTTP)
* HTTP is based on the request-response communication model:
* Client sends a request
* Server sends a response
* HTTP is a stateless protocol:
* The protocol does not require the server to remember anything about the
client between requests.
* If a particular client asks for the same object twice in a period of a few
seconds, the server does not respond by saying that it just served the object
to the client; instead, the server resends the object, as it has completely
forgotten what it did earlier.
HTTP overview (continued)
HTTP is “stateless”
* Server maintains no information about past client requests.
* Server does not remember any previous requests.
* If a particular client asks for the same object twice in a period of a few
seconds, the server does not respond by saying that it just served the object to
the client; instead, the server resends the object, as it has completely forgotten
what it did earlier.
* Protocols that maintain “state” are complex!
* Past history (state) must be maintained.
* If the server or client crashes, their views of “state” may be inconsistent and must be
reconciled.
How HTTP works?
* HTTP is implemented in two programs:
* a client program and a server program,
executing on different end systems,
* which talk to each other by exchanging HTTP
messages.
* The HTTP client first initiates a TCP
connection with the server. Once the
connection is established, the browser and
the server processes access TCP through
their socket interfaces.
HTTP
* Normally implemented over a TCP connection
* (80 is the standard port number for HTTP)
* Typical browser-server interaction:
1. Client
* User enters Web address in browser
* Browser uses DNS to locate IP address
* Browser opens TCP connection to server
* Browser sends HTTP request over connection
2. Server
* Server sends HTTP response to browser over connection
3. Client
* Browser displays body of response in the client area of the browser
window
HTTP connections
non-persistent HTTP
* At most one object is sent over a TCP connection; the connection is then closed.
* Downloading multiple objects requires multiple connections.
* A separate TCP connection is needed to serve each resource (object).
persistent HTTP
* Multiple objects can be sent over a single TCP connection between client and server.
* A single TCP connection is needed to serve multiple resources.
* The server leaves the connection open even after serving the request and closes
the connection on timeout.
Non-persistent HTTP
Suppose user enters URL: www.just.edu.jo/~zasharif/Web/SE432/se432.html
(contains text and a reference to style.css)
1a. HTTP client initiates TCP connection to HTTP server (process) at
www.just.edu.jo on port 80.
1b. HTTP server at host www.just.edu.jo, waiting for TCP connection at
port 80, “accepts” connection, notifying client.
2. HTTP client sends HTTP request message (containing URL) into
TCP connection socket. Message indicates that client wants object
Web/SE432/se432.html.
3. HTTP server receives request message, forms response message
containing requested object, and sends message into its socket.
Non-persistent HTTP (cont.)
4. HTTP server closes TCP connection.
5. HTTP client receives response message containing html file, displays
html. Parsing html file, finds 10 referenced jpeg objects.
6. Steps 1-5 repeated for each of the referenced objects.
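The cost of repeating steps 1-5 for every object can be made concrete by counting TCP connections. This is a deliberately simplified count: it assumes every object lives on the same server and ignores browsers that open several connections in parallel. The function name is made up for the example.

```python
def tcp_connections_needed(referenced_objects: int, persistent: bool) -> int:
    """Count TCP connections used to fetch one page plus its referenced objects."""
    total_objects = 1 + referenced_objects   # the HTML file itself plus the rest
    if persistent:
        return 1                             # every object reuses one connection
    return total_objects                     # one connection per object


print(tcp_connections_needed(10, persistent=False))
print(tcp_connections_needed(10, persistent=True))
```

For the page above (one HTML file that references 10 JPEG objects), non-persistent HTTP needs 11 connections where persistent HTTP needs one.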
HTTP Messages
* Request and Response Message Formats
Request: Verb + URI
Response: Status Code + Message Body
HTTP request message: general format
* Structure of the request:
1. start line/request line
2. header field(s)
3. blank line/empty line
4. optional body
Header lines:
header field name: sp value cr lf
header field name: sp value cr lf
Blank line: cr lf
Entity body
(Figure: general format of an HTTP request message)
Example: HTTP request message
* HTTP request message:
* ASCII (human-readable format)
Request line (GET, POST, HEAD commands):
GET /~zasharif/Web/SE432/SE432.html HTTP/1.1\r\n
Header lines:
Host: www.just.edu.jo\r\n
User-Agent: Firefox/3.6.10\r\n
Accept: text/html,application/xhtml+xml\r\n
Accept-Language: en-us,en;q=0.5\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7\r\n
Keep-Alive: 115\r\n
Connection: keep-alive\r\n
\r\n
(\r is the carriage return character and \n is the line-feed character. A carriage
return and line feed at the start of a line indicate the end of the header lines.)
HTTP Request: Start Line
* Start line
* Example: GET /~zasharif/Web/SE432/SE432.html HTTP/1.1
* Three space-separated parts:
* HTTP request method
* Request-URI (Uniform Resource Identifier)
* Request-URI is the portion of the requested URI that follows the host name (which
is supplied by the required Host header field)
* In addition to http, some other URL schemes are https, ftp, mailto, and file
* HTTP version
* We will cover 1.1, in which version part of start line must be exactly as shown
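The three space-separated parts of the start line can be pulled apart with a few lines of code. This is a minimal sketch that accepts only the `HTTP/1.1` version string covered here; the `RequestLine` type and the function name are assumptions for the example.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    request_uri: str
    version: str


def parse_request_line(line: str) -> RequestLine:
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError("start line must have exactly three space-separated parts")
    method, uri, version = parts
    if version != "HTTP/1.1":
        raise ValueError("only HTTP/1.1 is handled in this sketch")
    return RequestLine(method, uri, version)


start = parse_request_line("GET /~zasharif/Web/SE432/SE432.html HTTP/1.1\r\n")
print(start)
```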
HTTP Request: Common Request Methods
* GET
* Used if link is clicked or address typed in browser
* No message-body in request with GET method
* POST
* Used when submit button is clicked on a form
* Form information contained in the message-body of request
* HEAD
* Requests that only header fields (no body) be returned in the response
HTTP Request: Header field(s)
* Header field structure:
* Field-name : Field-value
* Syntax
* Field name is not case sensitive
* Field value may continue on multiple lines by starting continuation lines with
white space
* Field values may contain MIME types, quality values, and wildcard
characters (*'s)
* Multipurpose Internet Mail Extensions (MIME)
* Standardized way to indicate the nature and format of a document
* Convention for specifying content type of a message
* In HTTP, typically used to specify content type of the body of the response
* MIME content type syntax:
* top-level type / subtype
* Examples: text/html, image/jpeg
* Example header field with quality values:
accept: text/xml,text/html;q=0.9,text/plain;q=0.8,image/jpeg,
image/gif;q=0.2,*/*;q=0.1
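Quality values rank the MIME types a client is willing to accept, with a missing `q` meaning 1. The sketch below parses the example header value above into (type, q) pairs and sorts them by preference. It is a simplified reading of the syntax (for instance, it does not rank specific types above wildcards), and the function name is an assumption.

```python
def parse_accept(value: str) -> list:
    """Return (media_type, q) pairs, most preferred first."""
    prefs = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        q = 1.0                                   # q defaults to 1 when omitted
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip().lower() == "q":
                q = float(val)
        prefs.append((media_type, q))
    # sorted() is stable, so equal-q types keep their original order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


accept = ("text/xml,text/html;q=0.9,text/plain;q=0.8,image/jpeg,"
          "image/gif;q=0.2,*/*;q=0.1")
ranked = parse_accept(accept)
for media_type, q in ranked:
    print(media_type, q)
```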
HTTP Request: Common header fields
* Host: host name from URL (required)
* User-Agent: type of browser sending request
* Accept: MIME types of acceptable documents
* Connection: value close tells server to close connection after single
request/response
* Content-Type: MIME type of (POST) body, normally
application/x-www-form-urlencoded
* Content-Length: bytes in body
* Referer (spelled this way in the HTTP specification): URL of document
containing link that supplied URI for this HTTP request
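Putting the pieces together, a request message is the start line, the header fields, a blank line, and an optional body, each line ending in CR LF. The sketch below assembles a POST request by hand to show where Host, Content-Type, and Content-Length fit. The host `www.example.com`, the path `/form`, and the form data are hypothetical, and real clients would normally use a library rather than build the bytes themselves.

```python
def build_request(method: str, uri: str, headers: dict, body: bytes = b"") -> bytes:
    """Assemble start line, header fields, blank line and optional body."""
    if not any(name.lower() == "host" for name in headers):
        raise ValueError("HTTP/1.1 requests need a Host header")
    fields = dict(headers)
    if body:
        fields["Content-Length"] = str(len(body))   # bytes in the body
    lines = ["%s %s HTTP/1.1" % (method, uri)]
    lines += ["%s: %s" % (name, value) for name, value in fields.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"           # blank line ends the headers
    return head.encode("ascii") + body


post = build_request(
    "POST", "/form",                                 # hypothetical path
    {"Host": "www.example.com",                      # hypothetical host
     "Content-Type": "application/x-www-form-urlencoded",
     "Connection": "close"},
    b"name=Ada&unit=1",
)
print(post.decode("ascii"))
```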
HTTP Response
* Structure of the response:
1. status line
2. header field(s)/lines
3. blank line
4. optional body
Example: HTTP response message
Status line (protocol, status code, status phrase):
HTTP/1.1 200 OK\r\n
Header lines:
Date: Sun, 26 Sep 2010 20:09:20 GMT\r\n
Server: Apache/2.0.52 (CentOS)\r\n
Last-Modified: Tue, 30 Oct 2007 17:00:02 GMT\r\n
ETag: "17dc6-a5c-bf716880"\r\n
Accept-Ranges: bytes\r\n
Content-Length: 2652\r\n
Keep-Alive: timeout=10, max=100\r\n
Connection: Keep-Alive\r\n
Content-Type: text/html; charset=ISO-8859-1\r\n
\r\n
data data data data data ...
(data, e.g. requested HTML file)
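A response can be taken apart the same way it is laid out: the status line, then header lines up to the blank line, then the body. The following sketch parses a short hand-written response; the sample message is made up for the example, and header names are lower-cased because field names are not case sensitive.

```python
def parse_response(raw: bytes) -> tuple:
    """Split a response into (version, status code, phrase, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")         # blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, phrase = lines[0].split(" ", 2)     # phrase may contain spaces
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # field names: case-insensitive
    return version, int(code), phrase, headers, body


raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/html; charset=ISO-8859-1\r\n"
       b"Content-Length: 13\r\n"
       b"\r\n"
       b"<p>hello</p>\n")
version, code, phrase, headers, body = parse_response(raw)
print(version, code, phrase, headers, body)
```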
HTTP Response: Common header fields
* Connection, Content-Type, Content-Length
* Date: date and time at which response was generated (required)
* Location: alternate URI if status is redirection
* Last-Modified: date and time the requested resource was last modified on the
server
* Expires: date and time after which the client's copy of the resource will be
out-of-date
* ETag: a unique identifier (hashcode) for this version of the requested resource
(changes if resource changes)
HTTP Response: Status Codes
* The client can initiate requests to the server.
* In return, the server responds with status codes and message payloads.
* The status code is important and tells the client how to interpret the server
response.
* The HTTP specification defines certain number ranges for specific types of
responses.
* Three-digit number
* First digit is the class of the status code:
* 1xx: Informational Messages
* 2xx: Successful
* 3xx: Redirection
* 4xx: Client Error
* 5xx: Server Error
* Other two digits provide additional information
* See http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
and https://www.w3.org/Protocols/rfc2616/rfc2616.html
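Because the first digit carries the class, a client can decide how to react even to a status code it has never seen. A minimal sketch of that lookup follows; the table and function names are assumptions for the example.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    if not 100 <= code <= 599:
        raise ValueError("status code must be a three-digit number from 100 to 599")
    return STATUS_CLASSES[code // 100]     # first digit selects the class


for code in (200, 301, 404, 505):
    print(code, status_class(code))
```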
Common Status Codes
200 OK
* Everything worked, here's the data
301 Moved Permanently
* Requested object moved, new location specified later in this message (Location:)
302 Found (Moved Temporarily)
* URL temporarily out of service, keep the old one but use this one for now
400 Bad Request
* There is a syntax error in your request
403 Forbidden
* You can't do this, and we won't tell you why
404 Not Found
* No such document
408 Request Time-out, 504 Gateway Time-out
* Request took too long to fulfill for some reason
505 HTTP Version Not Supported
Web Client
* A web client is software that accesses a web server by
sending an HTTP request message and processing the
resulting HTTP response.
* A web browser is a typical web client.
* IE, Firefox, Safari, etc.
* Browser wars
* En.wikipedia.org/wiki/Browser_wars
* Each company tries to add features and performance to its
browser in order to increase its market share.
Server
* The primary function of every web server is to accept
an HTTP request from a web client and return an
appropriate resource (if available) in the HTTP
response.
1. Wait for connection requests from a client.
2. Receive an HTTP request.
3. Find the requested file and create an HTTP
response that contains the file in the body of the
response message.
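The three server steps can be tried out on one machine with Python's built-in `http.server` module. This is a teaching sketch, not a production server: the in-memory `PAGES` table stands in for files on disk, the path `/index.html` is hypothetical, and the server listens only on the loopback address on a port chosen by the operating system.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGES = {"/index.html": b"<h1>Hello</h1>"}      # hypothetical in-memory "files"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):                            # step 2: a request was received
        body = PAGES.get(self.path)              # step 3: find the resource
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                   # file goes in the response body

    def log_message(self, *args):                # keep the demo quiet
        pass


def fetch_once(path: str) -> tuple:
    server = HTTPServer(("127.0.0.1", 0), Handler)   # port 0: OS picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()                                   # step 1: wait for connections
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        conn.request("GET", path)                    # the web client's request
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


print(fetch_once("/index.html"))
print(fetch_once("/missing")[0])
```

Requesting a path that is not in the table yields a 404 status, which matches the "if available" condition in the description above.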
Website | Web server
It is a collection of web pages | It is a server on which a web application is executed
It is a software application that has a unique domain name | It is a physical entity that has a unique IP address
It can host many web pages | It can host many websites
They communicate with the web server | They communicate with other servers such as DB server, file server, etc.
HTML & CSS + JS + DHTML | Web server = it receives a request and gives the corresponding response
Ex: https://www.google.co.in | Ex: IIS, Apache