Internetworking 2005
For use in conjunction with TCP/IP Protocol Suite, by Behrouz A. Forouzan, 3rd Edition, McGraw-Hill
© 1998, 1999, 2000, 2002, 2003, 2005 G.Q. Maguire Jr. All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the author. Last modified: 2005.04.26:13:06
Maguire
[email protected]
Cover.fm5
2005.04.26
Total pages: 1
For this lecture: Chapters 1-5
Last modified: 2005.05.19:01:23
Total pages: 74
Internetworking/Internetteknik
To be announced
Administrative Assistant: recording of grades, registration, etc.
To give deep knowledge and competence (designing, analyzing, and developing) of Internet protocols and architecture, both practical and analytical. To be able to read and understand the Internet standardization documents (IETF RFCs and Internet Drafts) and current Internet literature. You should have the knowledge and competence to do exciting Internet related research and development.
Scope and Method
Dig deeper into the TCP/IP protocol suite by using diagnostic tools to examine, observe, and analyze these protocols in action. Understand the details! Demonstrate this by writing a report (and passing the exam).
Aim
After this course you should be able to read the current Internet literature at the level of IEEE Communications Magazine, IEEE Network, IEEE Transactions on Communications, IEEE Journal on Selected Areas in Communications, IEEE/ACM Transactions on Networking, IEEE Communications Surveys (online journal), etc. See the IEEE Communications Society's list of publications. While you may not be able to understand all of the articles in the above journals and magazines, you should be able to read 90% or more of the articles with good comprehension. You should develop a habit of reading the journals, trade papers, etc. You should be able to write internetworking articles at the level of Miller Freeman's Network Magazine or IEEE Internet Computing. In subsequent courses you will also develop your ability to present your ideas orally.
Prerequisites
Datorkommunikation och datornät/Data and computer communication, or equivalent knowledge in computer communications (this requires permission of the instructor)
Contents
This course will focus on the protocols that are the foundations of the Internet. We will explore what internetworking means and what it requires. We will give both practical and more general knowledge concerning the Internet network architecture. The course consists of 18 hours of lectures and 18 hours of recitations (övningar) [possibly some laboratory exercises].
Topics
What an internet is and what is required of protocols to allow internetworking
Details of routing and routing protocols (RIP, BGP, OSPF, ...)
Multicasting
Domain Name System (DNS, Dynamic DNS)
What happens from the time a machine boots until the applications are running (RARP, BOOTP, DHCP, TFTP)
Details of the TCP protocols and some performance issues
Details of a number of application protocols (especially with respect to distributed file systems)
Network security (including firewalls, AAA, IPSec, SOCKS, ...)
Differences between IPv6 and IPv4
Network management (SNMP)
We will also examine some emerging topics: cut-through routing, tag switching, flow switching, QoS, Mobile IP, Voice over IP, SIP, NAT, VPN, Diffserv, ...
Examination requirements
Written examination (3 p)
based on literature, lectures, and recitations
Written assignments (1 p)
based on lectures, recitations, and your references
Grades: U, 3, 4, 5
Written Assignment
Goal: to gain analytical or practical experience and to show that you have mastered some internetworking knowledge (in addition to what you show on the written examination).
Can be done in a group of 1 to 3 students (formed by yourself). Each student must contribute to the final report.
There will be one or more suggested topics; additional topics are possible (discuss this with one of the teachers before starting).
Final Report: May 25, 2005
Send email with a URL link to a PDF file to <[email protected]>
Late assignments will not be accepted (i.e., there is no guarantee that they will be graded in time for the end of the term)
Note that it is permissible to start working well in advance of the deadlines!
Literature
The course will mainly be based on the book:
Behrouz A. Forouzan, TCP/IP Protocol Suite, 3rd edition, McGraw-Hill, publication date January 2005 (Copyright 2006), 896 pages, ISBN 0072967722 (hardbound) or 0071115838 (softbound)
Other additional references include:
TCP/IP Illustrated, Volume 1: The Protocols, by W. Richard Stevens, Addison-Wesley, 1994, ISBN 0-201-63346-9
Internetworking with TCP/IP: Principles, Protocols, and Architectures, Vol. 1, by Douglas E. Comer, Prentice Hall, 4th ed., 2000, ISBN 0-13-018380-6
The commented source code in TCP/IP Illustrated, Volume 2: The Implementation, by Gary R. Wright and W. Richard Stevens, Addison-Wesley, 1995, ISBN 0-201-63354-X
IPv6: The New Internet Protocol, by Christian Huitema, Prentice-Hall, 1996, ISBN 0-13-241936-X
Concerning HTTP we will refer to TCP/IP Illustrated, Volume 3: TCP for Transactions, HTTP, NNTP, and the UNIX Domain Protocols, Addison-Wesley, 1996, ISBN 0-201-63495-3.
With regard to Mobile IP the following two books are useful as additional sources:
Mobile IP: Design Principles and Practices, by Charles E. Perkins, Addison-Wesley, 1998, ISBN 0-201-63469-4
Mobile IP: the Internet Unplugged, by James D. Solomon, Prentice Hall, 1998, ISBN 0-13-856246-6
Internetworking Technologies Handbook, by Kevin Downes (Editor), H. Kim Lew, Steve Spanier, Tim Stevenson (Online: http://www-fr.cisco.com/univercd/cc/td/doc/cisintwk/ito_doc/index.htm)
We will refer to other books, articles, and RFCs as necessary. In addition, there will be compulsory written exercises.
Lecture Plan
Subject to revision!
Lecture 1: Introduction and IP addressing
Lecture 2: Basic routing, ARP, and basic IP
Lecture 3: ICMP and User Datagram Protocol (UDP)
Lecture 4: TCP
Lecture 5: Dynamic Routing
Lecture 6: IP Multicast and Autoconfiguration
Lecture 7: Applications & Network Management
Lecture 8: IPv6 and Mobile IP
Lecture 9: Internet Security, VPNs, Firewalls, and NAT
Lecture 10: Future Issues and Summary
1. http://www.lucent.com/enterprise/sig/exchange/present/slide2.html
Network Architecture
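[Figure: network architecture showing a WLAN (AP, MH), FDDI, Token Ring, and Ethernet LANs, a WAN, cellular networks (BTS, BSC, MSC, HLR/VLR), an ad hoc network, and a PAN, interconnected by switches, routers (R), and an IWU; MH marks the mobile hosts]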
Note that some of the routers act as gateways between different types of networks.
Internet Trends
The number of users and Internet devices is increasing very rapidly
Network Wizards Internet Domain Survey - http://www.isc.org/index.pl?/ops/ds/
  Jan. 2005: 317,646,084 hosts
  July 2001: 125,888,197 hosts
RIPE's survey of European hosts:
  Jan. 2005: 25,389,171 real hosts, of which 1,521,328 are in Sweden
  Nov. 2001: 33,866,411 total, of which 816,961 are in Sweden
  {Note that the above estimates are based on DNS information}
Network Weather Maps - http://www.cybergeography.org/atlas/weather.html
QoS: demand for integrating many different types of traffic, such as video, audio, and data traffic, into one network
  Multicast, IPv6, RSVP, DiffServ, emphasis on high performance, and TCP extensions (we will examine a number of these in this course)
Mobility: both users and devices are mobile
  There is a difference between portable (bärbar) and mobile (mobil).
  IP is used in wireless systems (for example 3G cellular).
  Increasing use of wireless in the last hop (WLAN, PAN, Wireless MAN, ...)
Security:
  Wireless mobile Internet - initial concern driven by the wireless link
  Fixed Internet - distributed denial of service attacks, increasing telecommuting, ...
Traffic between the US and Sweden is many times the total voice+FAX traffic
60 Gbit/s transatlantic fiber
Fixed links - arbitrarily fast:
  LANs: 10 Mbit/s, 100 Mbit/s, 1 Gbit/s, 10 Gbit/s, ...
  Backbones: 45 Mbit/s or 34 Mbit/s, 155 Mbit/s, 622 Mbit/s, and Gbit/s
  Transoceanic fibers between continents: Gbit/s to Tbit/s
  Major sites link to backbones: T1 (1.5 Mbit/s) or E1 rate links (2 Mbit/s), increasingly 10+ Mbit/s to Gbit/s
  Individual users' links: 28.8 kbit/s and ISDN (128 kbit/s); Ethernet and xDSL (Mbit/s up to ~51 Mbit/s in the fast direction)
Points of Presence (PoPs) + FIX/CIX/GIX/MAE (see footnote 1), GigaPoPs
(George) Gilder's Law states that network speeds will triple every year for the next 25 years. This dwarfs Moore's law, which predicts that CPU processor speed will double every 18 months.
1. Federal Internet eXchange (FIX), Commercial Internet eXchange (CIX), Global Internet eXchange (GIX), Metropolitan Access Exchange (MAE)
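To get a feel for how far apart the two growth rules are, the short sketch below compounds each of them over the same 25-year span. It is only an illustration of the arithmetic; the function name growth_factor is made up for this example and both "laws" are taken at face value.

```python
# Compound growth under the two rules quoted on this slide.
YEARS = 25

def growth_factor(multiplier: float, period_years: float, years: float) -> float:
    """Total growth if the quantity is multiplied by `multiplier` every `period_years`."""
    return multiplier ** (years / period_years)

network = growth_factor(3, 1.0, YEARS)   # network speed: triples every year
cpu = growth_factor(2, 1.5, YEARS)       # CPU speed: doubles every 18 months

print(f"network speed grows by a factor of about {network:.2e}")
print(f"CPU speed grows by a factor of about {cpu:.2e}")
print(f"ratio network/CPU: about {network / cpu:.2e}")
```

Under these assumptions the network figure grows by roughly twelve orders of magnitude while the CPU figure grows by about five.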
Speed
"... The Internet world moves fast. The integration of voice and data onto a single network is not being led by the International Telecommunications Union or by Bellcore. Rather, it's being led by entrepreneurs like ... Until now, the voice networks dominated. Data could ride on top of the phone network -- when it was convenient. The explosion of data networking and Internet telephony technology is making the opposite true. Now voice can ride on data networks -- when it is convenient." (footnote 1)
"Because of bandwidth constraints, Internet telephony would not be a major factor for a long time -- maybe nine to twelve months." -- president of a major ISP (footnote 2)
"Internet time - 7x real time" -- Ira Goldstein, HP
1. from http://www.dialogic.com/solution/internet/apps.htm
2. from http://www.dialogic.com/solution/internet/apps.htm
Growth rates
"Some people think the Internet bandwidth explosion is relatively recent, but right from the beginning it's been a race against an ever-expanding load. It isn't something you can plan for. In fact, the notion of long-range planning like the telcos do is almost comical. Just last month, a local carrier asked us why we didn't do five-year plans, and we said, 'We do - about once a month!'"
-- Mike O'Dell (footnote 1), VP and Chief Technologist, UUNET
Mike points out that the growth rate of the Internet is driven by the increasing speed of computers, while telcos have traffic which was proportional to the growth in the number of people (each of whom could only use a very small amount of bandwidth).
By 1997 UUNET was adding at least one T3/day to their backbone.
1. from http://www.data.com/25years/mike_odell.html
Question?
"Which would you rather have twice as fast: your computer's processor or modem? After 30 years of semiconductor doublings under Moore's Law, processor speeds are measured in megahertz. On the other hand, after 60 years of telcos snoozing under monopoly law, modem speeds are measured in kilobits. Modems are way too slow for Internet access, but you knew that." (footnote 1)
-- Bob Metcalfe, inventor of Ethernet in 1973
1. "From the Ether: Moving intelligence and Java Packets into the Net will conserve bandwidth", by Bob Metcalfe, InfoWorld, Oct. 6, 1997, pg. 171.
Ethernet
  3 Mbps Ethernet (actually 2.944 Mbit/s)
  10 Mbps Ethernet (which became 802.3)
  100 Mbps Ethernet (100BASE-TX)
  Gigabit Ethernet (802.3z, 802.3ab)
  10 Gbps Ethernet (IEEE 802.3ae)
Optical
  Dense Wavelength Division Multiplexing (DWDM) - allowing 1000s of multi-Gbit/s channels to be carried on existing fibers
Wireless
  802.11 Wireless LAN (2 .. 54 Mbit/s)
  802.15 Wireless Personal Area Network (WPAN)
  802.16 Metropolitan Area Networks - Fixed Broadband Wireless (10 .. 66 GHz)
Internetworking
Internetworking
  is based on the interconnection (concatenation) of multiple networks
  accommodates multiple underlying hardware technologies by providing a way to interconnect heterogeneous networks and make them inter-operate
We will concern ourselves with one of the most common internetworking protocols: IP (there are other internetworking protocols, such as Novell's Internetwork Packet Exchange (IPX), Xerox Network Systems (XNS), IBM's Systems Network Architecture (SNA), and OSI's ISO-IP).
We will examine both versions of IP:
  version 4 - which is in wide use
  version 6 - which is coming into use
Internet: the worldwide internet
[Figure: local ISPs and regional ISPs]
Basic concepts
Open-architecture networking [1],[2]
  Each distinct network stands on its own and makes its own technology choices, etc.
  No changes are needed within each of these networks in order to internet
  Based on best-effort delivery of datagrams
  Gateways interconnect the networks
  No global control
Some basic design principles for the Internet:
  Specific application-level functions should not be built into the lower levels
  Functions implemented in the network should be simple and general
  Most functions are implemented (as software) at the edge
    the complexity of the core network is reduced
    this increases the chances that new applications can be easily added
See also [5], [6]
Hourglass (Stuttgart wineglass) Model
  Anything over IP
  IP over anything
  Note the broad (and open) top - enabling lots and lots of applications
[Figure: hourglass model, from top to bottom: WWW, e-mail; HTTP, RTP; TCP, UDP; IP; Ethernet, PPP; copper, fiber, radio]
Review of Layering
Applications:  user processes
Transport:     TCP, UDP
Network:       ICMP, IP, IGMP
Link:          ARP, hardware interface, RARP
               media
Figure 2: protocol layers in the TCP/IP protocol suite (see Stevens, Volume 1, figure 1.4, pg. 6)
Encapsulation
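Applications:  user data, preceded by an application (Appl) header = application data
Transport:     TCP header (20 bytes) + application data = transport data
Network:       IP header (20 bytes) + transport data
Link:          Ethernet header (14 bytes) + IP header and its data + Ethernet trailer (4 bytes)
Figure 3: Encapsulation of data (see Stevens, Volume 1, figure 1.7, pg. 10)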
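The header sizes in Figure 3 can be added up to see how much of each frame is actual application data. The sketch below does only that arithmetic; it assumes TCP and IP headers without options and ignores the Ethernet minimum payload size, and the function names are made up for this example.

```python
# Header and trailer sizes in bytes, as given in Figure 3.
TCP_HEADER = 20
IP_HEADER = 20
ETHERNET_HEADER = 14
ETHERNET_TRAILER = 4

def frame_size(app_bytes: int) -> int:
    """Size of the Ethernet frame carrying app_bytes of application data."""
    return ETHERNET_HEADER + IP_HEADER + TCP_HEADER + app_bytes + ETHERNET_TRAILER

def payload_fraction(app_bytes: int) -> float:
    """Fraction of the frame that is application data."""
    return app_bytes / frame_size(app_bytes)

for n in (100, 512, 1460):
    print(n, frame_size(n), f"{payload_fraction(n):.1%}")
```

With 1460 bytes of application data the frame is 1518 bytes long, which is the familiar maximum Ethernet frame size.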
Demultiplexing
[Figure: an incoming frame is accepted by the driver if it matches the interface address or a multicast address; it is then demultiplexed to ARP, RARP, or IP, from IP to TCP or UDP, and finally to a user process based on the TCP or UDP port number]
Figure 4: Demultiplexing (adapted from Stevens, Volume 1, figure 1.8, pg. 11; with dual IP stacks)
Addresses in TCP/IP
Transport layer
Port number
Network layer
IP address Protocol
[Figure: IP header fields used for demultiplexing: 4-bit version, 8-bit Protocol, 32-bit Source IP address, 32-bit Destination IP address, followed by options (padded to 32-bit length) and data]
The fields Version, Protocol, and Source & Destination IP addresses are all used for demultiplexing the incoming IP packet.
We will first examine version 4, then later in the course version 6.
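As an illustration of how these demultiplexing fields can be pulled out of a packet, the following sketch unpacks the fixed 20-byte IPv4 header with Python's struct module. The field layout is the standard IPv4 one; the function name parse_ipv4_header and the sample addresses are assumptions made for this example.

```python
import socket
import struct

def parse_ipv4_header(packet: bytes) -> dict:
    """Extract the fields used for demultiplexing from an IPv4 header."""
    if len(packet) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    # version/IHL, TOS, total length, identification, flags/fragment offset,
    # TTL, protocol, header checksum, source address, destination address
    (ver_ihl, _tos, _total_len, ident, _frag, _ttl,
     protocol, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": ver_ihl >> 4,                 # upper 4 bits
        "header_length": (ver_ihl & 0x0F) * 4,   # IHL counts 32-bit words
        "identification": ident,
        "protocol": protocol,                    # e.g. 1 = ICMP, 6 = TCP, 17 = UDP
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }

# A hand-built sample header (the checksum is left as zero for simplicity).
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                     socket.inet_aton("130.237.32.51"),
                     socket.inet_aton("10.0.0.1"))
print(parse_ipv4_header(sample))
```

A real stack would also verify the header checksum and handle options; those steps are left out here to keep the focus on the demultiplexing fields.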
[Figure: IPv4 header, 20 bytes without options, including the 16-bit identification field]
IP Protocol field (RFC 1700)
Decimal  Keyword      Protocol                                                   References
0        HOPOPT       IPv6 Hop-by-Hop Option                                     [RFC1883]
1        ICMP         Internet Control Message                                   [RFC792]
2        IGMP         Internet Group Management                                  [RFC1112]
3        GGP          Gateway-to-Gateway                                         [RFC823]
4        IP           IP in IP (encapsulation)                                   [RFC2003]
5        ST           Stream                                                     [RFC1190,RFC1819]
6        TCP          Transmission Control                                       [RFC793]
7        CBT          CBT                                                        [Ballardie]
8        EGP          Exterior Gateway Protocol                                  [RFC888,DLM1]
9        IGP          any private interior (e.g., used by Cisco for their IGRP)  [IANA]
10       BBN-RCC-MON  BBN RCC Monitoring                                         [SGC]
11       NVP-II       Network Voice Protocol                                     [RFC741,SC3]
12       PUP          PUP                                                        [PUP,XEROX]
IP Protocol field (RFC 1700), continued
Decimal  Keyword      Protocol                               References
13       ARGUS        ARGUS                                  [RWS4]
14       EMCON        EMCON                                  [BN7]
15       XNET         Cross Net Debugger                     [IEN158,JFH2]
16       CHAOS        Chaos                                  [NC3]
17       UDP          User Datagram                          [RFC768,JBP]
18       MUX          Multiplexing                           [IEN90,JBP]
19       DCN-MEAS     DCN Measurement Subsystems             [DLM1]
20       HMP          Host Monitoring                        [RFC869,RH6]
21       PRM          Packet Radio Measurement               [ZSU]
22       XNS-IDP      XEROX NS IDP                           [ETHERNET,XEROX]
23       TRUNK-1      Trunk-1                                [BWB6]
24       TRUNK-2      Trunk-2                                [BWB6]
25       LEAF-1       Leaf-1                                 [BWB6]
26       LEAF-2       Leaf-2                                 [BWB6]
27       RDP          Reliable Data Protocol                 [RFC908,RH6]
28       IRTP         Internet Reliable Transaction          [RFC938,TXM]
29       ISO-TP4      ISO Transport Protocol Class 4         [RFC905,RC77]
30       NETBLT       Bulk Data Transfer Protocol            [RFC969,DDC1]
31       MFE-NSP      MFE Network Services Protocol          [MFENET,BCH2]
32       MERIT-INP    MERIT Internodal Protocol              [HWB]
33       SEP          Sequential Exchange Protocol           [JC120]
34       3PC          Third Party Connect Protocol           [SAF3]
35       IDPR         Inter-Domain Policy Routing Protocol   [MXS1]
IP Protocol field (RFC 1700), continued
Decimal  Keyword      Protocol                                                               References
36       XTP          XTP                                                                    [GXC]
37       DDP          Datagram Delivery Protocol                                             [WXC]
38       IDPR-CMTP    IDPR Control Message Transport Proto                                   [MXS1]
39       TP++         TP++ Transport Protocol                                                [DXF]
40       IL           IL Transport Protocol                                                  [Presotto]
41       IPv6         IPv6                                                                   [Deering]
42       SDRP         Source Demand Routing Protocol                                         [DXE1]
43       IPv6-Route   Routing Header for IPv6                                                [Deering]
44       IPv6-Frag    Fragment Header for IPv6                                               [Deering]
45       IDRP         Inter-Domain Routing Protocol                                          [Sue Hares]
46       RSVP         Reservation Protocol                                                   [Bob Braden]
47       GRE          General Routing Encapsulation                                          [Tony Li]
48       MHRP         Mobile Host Routing Protocol                                           [David Johnson]
49       BNA          BNA                                                                    [Gary Salamon]
50       ESP          Encap Security Payload for IPv6                                        [RFC1827]
51       AH           Authentication Header for IPv6                                         [RFC1826]
52       I-NLSP       Integrated Net Layer Security TUBA                                     [GLENN]
53       SWIPE        IP with Encryption                                                     [JI6]
54       NARP         NBMA Address Resolution Protocol                                       [RFC1735]
55       MOBILE       IP Mobility                                                            [Perkins]
56       TLSP         Transport Layer Security Protocol (using Kryptonet key management)     [Oberg]
57       SKIP         SKIP                                                                   [Markson]
IP Protocol field (RFC 1700), continued
Decimal  Keyword      Protocol                                    References
58       IPv6-ICMP    ICMP for IPv6                               [RFC1883]
59       IPv6-NoNxt   No Next Header for IPv6                     [RFC1883]
60       IPv6-Opts    Destination Options for IPv6                [RFC1883]
61       -            any host internal protocol                  [IANA]
62       CFTP         CFTP                                        [CFTP,HCF2]
63       -            any local network                           [IANA]
64       SAT-EXPAK    SATNET and Backroom EXPAK                   [SHB]
65       KRYPTOLAN    Kryptolan                                   [PXL1]
66       RVD          MIT Remote Virtual Disk Protocol            [MBG]
67       IPPC         Internet Pluribus Packet Core               [SHB]
68       -            any distributed file system                 [IANA]
69       SAT-MON      SATNET Monitoring                           [SHB]
70       VISA         VISA Protocol                               [GXT1]
71       IPCV         Internet Packet Core Utility                [SHB]
72       CPNX         Computer Protocol Network Executive         [DXM2]
73       CPHB         Computer Protocol Heart Beat                [DXM2]
74       WSN          Wang Span Network                           [VXD]
75       PVP          Packet Video Protocol                       [SC3]
76       BR-SAT-MON   Backroom SATNET Monitoring                  [SHB]
77       SUN-ND       SUN ND PROTOCOL-Temporary                   [WM3]
78       WB-MON       WIDEBAND Monitoring                         [SHB]
79       WB-EXPAK     WIDEBAND EXPAK                              [SHB]
80       ISO-IP       ISO Internet Protocol                       [MTR]
IP Protocol field (RFC 1700), continued
Decimal  Keyword       Protocol                                References
81       VMTP          VMTP                                    [DRC3]
82       SECURE-VMTP   SECURE-VMTP                             [DRC3]
83       VINES         VINES                                   [BXH]
84       TTP           TTP                                     [JXS]
85       NSFNET-IGP    NSFNET-IGP                              [HWB]
86       DGP           Dissimilar Gateway Protocol             [DGP,ML109]
87       TCF           TCF                                     [GAL5]
88       EIGRP         EIGRP                                   [CISCO,GXS]
89       OSPFIGP       OSPFIGP                                 [RFC1583,JTM4]
90       Sprite-RPC    Sprite RPC Protocol                     [SPRITE,BXW]
91       LARP          Locus Address Resolution Protocol       [BXH]
92       MTP           Multicast Transport Protocol            [SXA]
93       AX.25         AX.25 Frames                            [BK29]
94       IPIP          IP-within-IP Encapsulation Protocol     [JI6]
95       MICP          Mobile Internetworking Control Pro.     [JI6]
96       SCC-SP        Semaphore Communications Sec. Pro.      [HXH]
97       ETHERIP       Ethernet-within-IP Encapsulation        [RXH1]
98       ENCAP         Encapsulation Header                    [RFC1241,RXB3]
99       -             any private encryption scheme           [IANA]
100      GMTP          GMTP                                    [RXB5]
101      IFMP          Ipsilon Flow Management Protocol        [Hinden]
102      PNNI          PNNI over IP                            [Callon]
103      PIM           Protocol Independent Multicast          [Farinacci]
IP Protocol field (RFC 1700), continued
Decimal  Keyword          Protocol                                References
104      ARIS             ARIS                                    [Feldman]
105      SCPS             SCPS                                    [Durst]
106      QNX              QNX                                     [Hunter]
107      A/N              Active Networks                         [Braden]
108      IPComp           IP Payload Compression Protocol         [RFC2393]
109      SNP              Sitara Networks Protocol                [Sridhar]
110      Compaq-Peer      Compaq Peer Protocol                    [Volpe]
111      IPX-in-IP        IPX in IP                               [Lee]
112      VRRP             Virtual Router Redundancy Protocol      [Hinden]
113      PGM              PGM Reliable Transport Protocol         [Speakman]
114      -                any 0-hop protocol                      [IANA]
115      L2TP             Layer Two Tunneling Protocol            [Aboba]
116      DDX              D-II Data Exchange (DDX)                [Worley]
117      IATP             Interactive Agent Transfer Protocol     [Murphy]
118      STP              Schedule Transfer Protocol              [JMP]
119      SRP              SpectraLink Radio Protocol              [Hamilton]
120      UTI              UTI                                     [Lothberg]
121      SMP              Simple Message Protocol                 [Ekblad]
122      SM               SM                                      [Crowcroft]
123      PTP              Performance Transparency Protocol       [Welzl]
124      ISIS over IPv4   -                                       [Przygienda]
125      FIRE             -                                       [Partridge]
126      CRTP             Combat Radio Transport Protocol         [Sautter]
IP Protocol field (RFC 1700), continued (keywords not shown)
Decimal   Protocol                                References
127       -                                       [Sautter]
128       -                                       [Waber]
129       -                                       [Hollbach]
130       Secure Packet Shield                    [McIntosh]
131       Private IP Encapsulation within IP      [Petri]
132       Stream Control Transmission Protocol    [Stewart]
133       Fibre Channel                           [Rajagopal]
134       -                                       [RFC3175]
136       -                                       [RFC3828]
137       -                                       [RFC-ietf-mpls-in-ip-or-gre-08.txt]
138-252   Unassigned                              [IANA]
253       Use for experimentation and testing     [RFC3692]
254       Use for experimentation and testing     [RFC3692]
255       Reserved                                [IANA]
There are 4 fewer available protocol numbers than when this course was given in 2003, and 41 fewer than when it was given in 1999.
Service    TCP port  UDP port  Description
echo       7         7         Server returns what the client sends
discard    9         9         Server discards what the client sends
daytime    13        13        Server returns the time and date in a human-readable format
chargen    19        19        TCP server sends a continual stream of characters until the connection is terminated by the client. UDP server sends a datagram containing a random number of characters each time the client sends a datagram.
ftp-data   20        -         File Transfer Protocol (Data)
ftp        21        -         File Transfer Protocol (Control)
telnet     23        -         Virtual Terminal Protocol
smtp       25        -         Simple Mail Transfer Protocol
time       37        37        Server returns the time as a 32-bit binary number. This number is the time in seconds since 1 Jan. 1900, UTC (RFC 868)
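The 32-bit value returned by the time service is easy to misread, because the time protocol (RFC 868) counts seconds from 1 Jan. 1900, not from the Unix epoch of 1 Jan. 1970. The sketch below decodes such a reply; the function name decode_time_reply is made up for this example, and no network access is involved.

```python
import struct
from datetime import datetime, timedelta, timezone

# Seconds between 1900-01-01 and 1970-01-01 (70 years, 17 of them leap years).
SECONDS_1900_TO_1970 = 2208988800

def decode_time_reply(reply: bytes) -> datetime:
    """Convert the 4-byte, big-endian reply of the time service to a UTC datetime."""
    (seconds_since_1900,) = struct.unpack("!I", reply[:4])
    unix_seconds = seconds_since_1900 - SECONDS_1900_TO_1970
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(seconds=unix_seconds)

# A reply containing exactly the 1900-to-1970 offset decodes to the Unix epoch.
print(decode_time_reply(struct.pack("!I", SECONDS_1900_TO_1970)))
```

Because the counter is only 32 bits wide, it wraps around in 2036; the sketch does not attempt to handle that.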
Link Layer
Possible link layers include:
  Ethernet and IEEE 802.3 Encapsulation
    with possible Trailer Encapsulation
  SLIP: Serial Line IP
  CSLIP: Compressed SLIP
  PPP: Point to Point Protocol
  Loopback Interface
  Virtual Interface
  carrier pigeons - CPIP (Carrier Pigeon Internet Protocol)
    April 1st 1990, RFC 1149 was written: a protocol for IP over avian carriers.
    Implementation (April 28 2001): http://www.blug.linux.no/rfc1149/
Some of the issues concerning links are:
  MTU and Path MTU
  Serial line throughput
[Figure: connecting devices within a network domain: repeater (layer 1), bridge and layer 2 domain, router or layer 3 switch, and subnets subnet2, subnet3, subnet4, and subnet5]
Connecting Devices
Connecting devices are divided into networking devices and internetworking devices:
  Networking devices: Repeater (Layer 1), Bridge (Layer 2)
  Internetworking devices: Router (Layer 3)
Ethernet hub = a multiport repeater
Ethernet switch = a multiport bridge
Layer 3 switch = combines functions of an Ethernet switch and a router
LAN Protocols
[Figure: OSI layers mapped to LAN specifications: the data link layer comprises the LLC sublayer (IEEE 802.2) and the MAC sublayers (Ethernet, IEEE 802.3); below it is the physical layer]
Ethernet encapsulation: DST | SRC | TYPE (2 bytes) | data (46-1500 bytes) | CRC
  TYPE 0800: IP datagram (46-1500 bytes)
  TYPE 0806: ARP request/reply (28 bytes) + PAD (18 bytes)
  TYPE 8035: RARP request/reply (28 bytes) + PAD (18 bytes)
Figure 8: Ethernet encapsulation (see Stevens, Volume 1, figure 2.1, pg. 23)
DST = Destination MAC Address, SRC = Source MAC Address (both are 48 bits in length); TYPE = Frame Type; CRC = Cyclic Redundancy Check, i.e., checksum
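The Ethernet header can be taken apart in a few lines, which also shows how the TYPE field drives demultiplexing. This is a minimal sketch: it assumes the frame is handed over without the trailing CRC (as is common for captured frames), uses the standard type values 0x0800 (IP), 0x0806 (ARP), and 0x8035 (RARP), and the names parse_ethernet and ETHERTYPES are invented for the example.

```python
import struct

# Frame types shown in Figure 8 (hexadecimal).
ETHERTYPES = {0x0800: "IP", 0x0806: "ARP", 0x8035: "RARP"}

def format_mac(raw: bytes) -> str:
    """Render a 48-bit MAC address as colon-separated hex bytes."""
    return ":".join(f"{b:02x}" for b in raw)

def parse_ethernet(frame: bytes) -> dict:
    """Split an Ethernet frame (without CRC) into header fields and payload."""
    if len(frame) < 14:
        raise ValueError("an Ethernet header is 14 bytes long")
    dst, src, frame_type = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "type": ETHERTYPES.get(frame_type, f"unknown (0x{frame_type:04x})"),
        "payload": frame[14:],
    }

# A broadcast ARP frame: 28 bytes of ARP data plus 18 bytes of padding.
frame = bytes.fromhex("ffffffffffff" "0000c0a80001" "0806") + bytes(28 + 18)
print(parse_ethernet(frame)["type"], len(parse_ethernet(frame)["payload"]))
```

The 28 + 18 bytes in the sample correspond to the ARP request/reply and PAD sizes given in the figure, which together make up the 46-byte minimum payload.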
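[Figure: IEEE 802.2/802.3 encapsulation using DSAP, SSAP, and SNAP, with a 2-byte TYPE field (e.g., 0800 or 8035) and 46-1500 bytes of data]
DSAP = Destination Service Access Point; SSAP = Source Service Access Point; SNAP = Sub-Network Access Protocol; for other TYPE values see RFC 1700.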
Link Service Access Point
IEEE binary  Internet binary  Decimal  Description                   References
00000000     00000000         0        Null LSAP                     [IEEE]
01000000     00000010         2        Individual LLC Sublayer Mgt   [IEEE]
11000000     00000011         3        Group LLC Sublayer Mgt        [IEEE]
00100000     00000100         4        SNA Path Control              [IEEE]
01100000     00000110         6        Reserved (DOD IP)             [RFC76]
01110000     00001110         14       PROWAY-LAN                    [IEEE]
01110010     01001110         78       EIA-RS 511                    [IEEE]
01111010     01011110         94       ISI IP                        [JBP]
01110001     10001110         142      PROWAY-LAN                    [IEEE]
01010101     10101010         170      SNAP                          [IEEE]
01111111     11111110         254      ISO CLNS IS 8473              [RFC926]
11111111     11111111         255      Global DSAP                   [IEEE]
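The two binary columns of the LSAP table differ only in bit order: the IEEE column lists the least significant bit first, while the Internet column lists the most significant bit first. A small sketch (the function name reverse_bits8 is made up for this example) reproduces one column from the other:

```python
def reverse_bits8(value: int) -> int:
    """Reverse the order of the 8 bits in value."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (value & 1)  # move the lowest bit of value into result
        value >>= 1
    return result

# Decimal values from the table, printed as: IEEE binary, Internet binary, decimal
for decimal in (0, 2, 3, 4, 6, 14, 78, 94, 142, 170, 254, 255):
    print(f"{reverse_bits8(decimal):08b}  {decimal:08b}  {decimal}")
```

For example, SNAP (decimal 170) is 10101010 in Internet order and 01010101 in IEEE order.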
[Figure: SLIP framing. The datagram is delimited by END (c0) characters; a c0 byte in the data is sent as db dc, and a db (ESC) byte in the data is sent as db dd]
Figure 10: SLIP Encapsulation (see Stevens, Volume 1, figure 2.2, pg. 25)
RFC 1055: A Nonstandard for transmission of IP datagrams over serial lines: SLIP
  SLIP uses character stuffing
    SLIP ESC character: 0xdb
    SLIP END character: 0xc0
  point-to-point link, so no IP addresses need to be sent
  there is no TYPE field, so you can only send IP, i.e., you can't mix protocols
  there is no CHECKSUM, so error detection has to be done by higher layers
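The character stuffing used by SLIP is short enough to write out in full. The following is a minimal sketch of an encoder and decoder for a single frame, using the END (0xc0) and ESC (0xdb) characters and the two escape sequences db dc and db dd from RFC 1055; the function names are made up for this example and no serial-line handling is included.

```python
END, ESC = 0xC0, 0xDB          # frame delimiter and escape character
ESC_END, ESC_ESC = 0xDC, 0xDD  # second byte of the two escape sequences

def slip_encode(datagram: bytes) -> bytes:
    """Frame one datagram: END, stuffed data, END."""
    out = bytearray([END])
    for byte in datagram:
        if byte == END:
            out += bytes([ESC, ESC_END])   # c0 in the data becomes db dc
        elif byte == ESC:
            out += bytes([ESC, ESC_ESC])   # db in the data becomes db dd
        else:
            out.append(byte)
    out.append(END)
    return bytes(out)

def slip_decode(frame: bytes) -> bytes:
    """Recover the datagram from a single SLIP frame."""
    out = bytearray()
    escaped = False
    for byte in frame:
        if escaped:
            out.append(END if byte == ESC_END else ESC if byte == ESC_ESC else byte)
            escaped = False
        elif byte == ESC:
            escaped = True
        elif byte != END:                  # END bytes only delimit the frame
            out.append(byte)
    return bytes(out)

print(slip_encode(b"\x01\xc0\xdb\x02").hex(" "))
```

Note that nothing in the frame says what kind of packet it carries or whether it arrived intact, which is exactly the missing TYPE field and checksum listed above.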
SLIP Problems: CSLIP (Compressed SLIP)
CSLIP (RFC 1144: Compressing TCP/IP headers for low-speed serial links, by Van Jacobson) reduces the header to 3-5 bytes, with the aim of keeping response time under 100-200 ms, by:
  keeping state about ~16 TCP connections at each end of the link
    the 96-bit tuple <src address, dst address, src port, dst port> is reduced to 4 bits
  not transmitting the many header fields that rarely change
  sending just the delta for header fields that change by a small amount
  no compression is attempted for UDP/IP
  a 5-byte compressed header on 100-200 bytes of data gives 95-98% line efficiency
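The line efficiency figures for CSLIP follow directly from the ratio of data bytes to total bytes sent. The sketch below compares a 5-byte compressed header with an uncompressed 40-byte TCP/IP header (20 bytes IP plus 20 bytes TCP, assuming no options); the function name line_efficiency is made up for this example and SLIP framing bytes are ignored.

```python
def line_efficiency(data_bytes: int, header_bytes: int) -> float:
    """Fraction of the transmitted bytes that is user data."""
    return data_bytes / (data_bytes + header_bytes)

for data in (1, 100, 200):
    compressed = line_efficiency(data, 5)     # CSLIP header
    plain = line_efficiency(data, 20 + 20)    # uncompressed IP + TCP headers
    print(f"{data:3d} data bytes: CSLIP {compressed:.1%}, uncompressed {plain:.1%}")
```

The single-byte case is the one that matters for interactive traffic, where each keystroke would otherwise drag 40 bytes of header along with it.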
Family of Network Control Protocols (NCPs), each specific to a different network protocol; currently:
  IP (see RFC 1332)
  DECnet (see RFC 1376)
  OSI network layer (see RFC 1377)
  AppleTalk (see RFC 1378)
  XNS (see RFC 1764)
See PPP Design, Implementation, and Debugging, by James D. Carlson, Second edition, Addison-Wesley, 2000, ISBN 0-201-70053-0.
PPP frames
Field         FLAG  ADDR  CNTL  protocol  data              CRC  FLAG
Value (hex)   7E    FF    03                                     7E
Size (bytes)  1     1     1     2         up to 1500        2    1
Protocol values shown: 0021 (the data is an IP datagram) and 8021
Figure 11: Format of PPP frame (see Stevens, Volume 1, figure 2.3, pg. 26)
The protocol field behaves like the Ethernet TYPE field.
CRC can be used to detect errors in the frame.
Either character or bit stuffing is done, depending on the link.
You can negotiate away the CNTL and ADDRESS fields and reduce the protocol field to 1 byte, giving a minimum overhead of 3 bytes.
Van Jacobson header compression for IP and TCP
PPP summary
support for multiple protocols on a link
CRC check on every frame
dynamic negotiation of the IP address of each end
header compression (similar to CSLIP)
link control with facilities for negotiating lots of data-link options
Loopback interface
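[Figure: the IP output function dispatches based on interface. Datagrams for the loopback interface are placed on the IP input queue. In the Ethernet driver: if the destination IP address is a broadcast or multicast address, the datagram is placed on the IP input queue and also sent on the Ethernet; otherwise, if the destination IP address equals the interface address, it is placed on the IP input queue; otherwise ARP is used and the datagram is sent on the Ethernet. Frames received from the Ethernet are passed to ARP or to the IP input function]
Figure 12: Processing of IP Datagrams (adapted from Stevens, Volume 1, figure 2.4, pg. 28)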
[Figure: a virtual interface, shown alongside the loopback interface and the Ethernet driver, which processes packets before they are sent on the Ethernet]
Figure 13: Processing of network packets via a Virtual Interface
Figure 14: Using a Virtual Interface for Tunneling (IP in IP), adapted from John Ioannidis's thesis
IP addresses
Address types
  Unicast = one-to-one
  Multicast = one-to-many
  Broadcast = one-to-all
32-bit address divided into two parts: NetID | Host ID
Figure 15: IP address format
Note that although we refer to it as the Host ID part of the address, it is really the address of an interface.
Dotted decimal notation: write each byte as a decimal number and separate these with a ".", e.g., 10000010 11101101 00100000 00110011 = 130.237.32.51, or in hexadecimal: 0x82ED2033
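The conversion between the 32-bit value and dotted decimal notation is just a matter of taking the address one byte at a time. A minimal sketch, with function names made up for this example, using the address from the slide:

```python
def to_dotted_decimal(address: int) -> str:
    """Write each of the four bytes of a 32-bit address as a decimal number."""
    return ".".join(str((address >> shift) & 0xFF) for shift in (24, 16, 8, 0))

def from_dotted_decimal(text: str) -> int:
    """Inverse conversion: pack four decimal bytes into one 32-bit value."""
    value = 0
    for part in text.split("."):
        value = (value << 8) | int(part)
    return value

print(to_dotted_decimal(0x82ED2033))
print(hex(from_dotted_decimal("130.237.32.51")))
print(f"{from_dotted_decimal('130.237.32.51'):032b}")
```

Python's standard ipaddress module offers the same conversions (for example ipaddress.IPv4Address), so the hand-written version is only meant to show the byte arithmetic.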
Classful addressing
Classically the address range was divided into classes:
Class NetID Range (dotted decimal notation) host ID
A B C D E
0.0.0.0
to 127.255.255.255
24 bits of host ID 16 bits of host ID 8 bits of host ID 28 bits of Multicast address Reserved for future use
addresses roughly $2^{7} \cdot 2^{24} + 2^{14} \cdot 2^{16} + 2^{21} \cdot 2^{8} = 3{,}758{,}096{,}384$ interfaces (not the number of hosts)
in 1983 this seemed like a lot of addresses
problems with the size of the blocks: lots of wasted addresses
this led to classless addressing!
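The address count above can be reproduced directly. The sketch below assumes only the NetID and host ID widths of classes A, B, and C (7/24, 14/16, and 21/8 bits); the dictionary and function names are illustrative.

```python
# (NetID bits after the class prefix, host ID bits) for the unicast classes.
CLASSES = {"A": (7, 24), "B": (14, 16), "C": (21, 8)}


def interfaces_per_class(net_bits: int, host_bits: int) -> int:
    """Number of networks times the number of interface addresses per network."""
    return (2 ** net_bits) * (2 ** host_bits)


per_class = {name: interfaces_per_class(n, h) for name, (n, h) in CLASSES.items()}
total = sum(per_class.values())

for name, count in per_class.items():
    print(f"class {name}: {count:,}")
print(f"total: {total:,}")  # total: 3,758,096,384
```

Class A alone accounts for half of the 32-bit address space, which is one way to see why the fixed block sizes waste so many addresses.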
NetID | SubnetID | Host
Figure 16: IP addresses and subnetting
Although the Subnet field is shown as a field which is separate from the Host field, it could actually be divided on a bit by bit basis; this is done by a Subnet Mask. A common practice to avoid wasting large amounts of address space is to use Classless Interdomain Routing (CIDR) also called supernetting {see 10.8 of Stevens Vol. 1 and RFCs 1518 and 1519}.
Host ID   May appear as destination?   Description
0         never                        this host on this net
hostid    never                        specified host on this net
any       OK                           loopback address
-1        OK                           limited broadcast (never forwarded)
-1        OK                           net-directed broadcast to netid
-1        OK                           subnet-directed broadcast to netid, subnetid
-1        OK                           all-subnets-directed broadcast to netid
(-1 denotes a field of all one bits)
Thus for every subnet, the zero host ID address refers to this net and the all ones host ID is a subnet broadcast address; this uses up two addresses from every subnet's address range.
Subnet mask
32 bit value with a 1 for each NetID + SubnetID bit and a 0 for each HostID bit
16 bits NetID = 1111 1111 1111 1111 | 8 bits SubnetID = 1111 1111 | 8 bits HostID = 0000 0000
[Figure: 2 different class B subnet arrangements, each starting with a 16 bit NetID of 1111 1111 1111 1111]
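Applying a subnet mask is a bitwise AND. The following sketch splits a class B address into NetID, SubnetID, and HostID; it assumes a contiguous mask and a 16 bit NetID, and the function name `split_address` is hypothetical. The address is the 130.237.32.51 example used earlier in these notes; the two masks are illustrative choices.

```python
import ipaddress


def split_address(address: str, mask: str, net_bits: int = 16):
    """Return (network, subnet number, host number) for a contiguous subnet mask."""
    addr = int(ipaddress.IPv4Address(address))
    mask_value = int(ipaddress.IPv4Address(mask))
    host_bits = 32 - bin(mask_value).count("1")        # zero bits of the mask
    net_mask = (0xFFFFFFFF << (32 - net_bits)) & 0xFFFFFFFF
    net_id = addr & net_mask                            # classful NetID part
    subnet_id = (addr & mask_value & ~net_mask) >> host_bits
    host_id = addr & ~mask_value & 0xFFFFFFFF           # bits where the mask is 0
    return str(ipaddress.IPv4Address(net_id)), subnet_id, host_id


# 8 bit SubnetID, 8 bit HostID
print(split_address("130.237.32.51", "255.255.255.0"))    # ('130.237.0.0', 32, 51)
# 10 bit SubnetID, 6 bit HostID
print(split_address("130.237.32.51", "255.255.255.192"))  # ('130.237.0.0', 128, 51)
```

The same address therefore belongs to a different subnet number depending on where the mask draws the boundary.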
[Table (contents not recoverable): address masks for prefix lengths /0, /1, /2, ... with notes marking the Class A, Class B, and Class C masks]
IP address assignments
Internet Service Providers (ISPs) should contact their upstream registry or their appropriate Regional Internet Registries (RIR) at one of the following addresses:
Region
APNIC (Asia-Pacific Network Information Center)
ARIN (American Registry for Internet Numbers)
RIPE NCC (Réseaux IP Européens)
This is bad for mobility and multi-homing (see textbook figure 4.12 on pg. 95)
If a host changes its point of network attachment it must change its identity
  Later we will see how Mobile IP addresses this problem
Hosts with multiple interfaces are limited in how they can use them
  Later we will see how SCTP addresses part of this problem
The result has been that multiple and dynamic addresses are difficult to handle, which has led to a number of efforts to rethink how addresses are used.
We will discuss these commands in more detail in following lectures and in the recitations.
Standardization Organizations
The most relevant to the Internet are:
Internet Society (ISOC)
  Internet Engineering Task Force (IETF)
World Wide Web Consortium (W3C)
International Organization for Standardization (ISO)
International Telecommunication Union - Telecommunication Standardization Sector (ITU-T)
Institute of Electrical and Electronics Engineers (IEEE)
Read in the textbook sections 1.4 and 1.5.
Summary
Course Introduction
Internet Basics
  Multiplexing and demultiplexing
  Datagrams
W. Richard Stevens
Born in Luanshya, Northern Rhodesia (now Zambia) in 1951
Died on September 1, 1999
He studied Aerospace Engineering, Systems Engineering (image processing major, physiology minor)
flight instructor and programmer
His many books helped many people to understand and use TCP/IP:
  UNIX Network Programming, Prentice Hall, 1990.
  Advanced Programming in the UNIX Environment, Addison-Wesley, 1992.
  TCP/IP Illustrated, Volume 1: The Protocols, Addison-Wesley, 1994.
  TCP/IP Illustrated, Volume 2: The Implementation, Addison-Wesley, 1995.
  TCP/IP Illustrated, Volume 3: TCP for Transactions, HTTP, NNTP, and the UNIX Domain Protocols, Addison-Wesley, 1996.
  UNIX Network Programming, Volume 1, Second Edition: Networking APIs: Sockets and XTI, Prentice Hall, 1998.
  UNIX Network Programming, Volume 2, Second Edition: Interprocess Communications, Prentice Hall, 1999.
References
[1] Barry M. Leiner, Vinton G. Cerf, David D. Clark, Robert E. Kahn, Leonard Kleinrock, Daniel C. Lynch, Jon Postel, Larry G. Roberts, and Stephen Wolff, "A Brief History of the Internet", On The Internet, May/June 1997, http://www.isoc.org/oti/articles/0597/leiner.html
[2] R. Kahn, "Communications Principles for Operating Systems", Internal BBN memorandum, Jan. 1972.
[3] V. Cerf and R. Kahn, "A protocol for packet network interconnection", IEEE Transactions on Communications Technology, Vol. COM-22, Number 5, May 1974, pp. 627-641, http://global.mci.com/us/enterprise/insight/cerfs_up/technical_writings/protocol_paper/
[4] Jerome H. Saltzer, David P. Reed, David D. Clark, "End-To-End Arguments In System Design", ACM Transactions on Computer Systems, V2, #4, Nov. 1984, pages 277-288, http://citeseer.ist.psu.edu/saltzer84endtoend.html
[5] David D. Clark and Marjory S. Blumenthal, "Rethinking the Design of the Internet: The end to end arguments vs. the brave new world", ACM Transactions on Internet Technology, Vol 1, No 1, August 2001, pp 70-109, http://www.ana.lcs.mit.edu/papers/PDF/Rethinking_2001.pdf
[6] D. Clark, J. Wroclawski, K. Sollins, and R. Braden, "Tussle in Cyberspace: Defining Tomorrow's Internet", Proceedings of Sigcomm 2002, http://www.acm.org/sigs/sigcomm/sigcomm2002/papers/tussle.pdf
2G1305 Internetworking/Internetteknik Spring 2005, Period 4 Module 2: IP Basics: Routing, ARP, and RARP
Lecture notes of G. Q. Maguire Jr.
For use in conjunction with TCP/IP Protocol Suite, by Behrouz A. Forouzan, 3rd Edition, McGraw-Hill. For this lecture: Chapters 6 - 8
© 1998, 1999, 2000, 2002, 2003, 2005 G.Q. Maguire Jr. All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the author. Last modified: 2005.05.02:17:29
IP Basics Outline
IP Routing: Delivery and Routing of IP packets
Address Resolution: ARP and RARP
Connection-oriented vs Connectionless
Connection-Oriented Services
  Network layer first establishes a connection between a source and a destination
  Packets are sent along this connection
  Route is decided once at the time the connection is established
  Routers/switches in connection-oriented networks are stateful
Connectionless Services
  Network layer can process each packet independently
  A route lookup is performed for each packet
  IP is connectionless
  IP routers are stateless
Of course reality is (much) more complex: to gain performance, IP routers dynamically create state (in caches), as there is frequently correlation between packets (i.e., if you just did a route lookup for destination B, there is a non-zero probability that another packet which will arrive shortly might also be headed to destination B).
Routing
The internet protocols are based on moving packets from a source to a destination with each hop making a routing decision.
Two components to routing:
packet forwarding - Routing Mechanism: search the routing table and decide which interface to send a packet out.
  A matching host address? If no,
  A matching network address? (using longest match) If no,
  Default entry.
computing routes - Routing Policy: rules that decide which routes should be added into the routing table.
Traditionally most of the complexity was in the latter (i.e., computing routes) while packet forwarding was very straightforward; this is no longer true due to QoS.
Routers vs. hosts (a node can be both)
  Routers forward IP packets
  Hosts generate or sink IP packets
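The search order described above (host route, then longest matching network, then default) falls out naturally if every entry is stored as a prefix and the longest match wins. The sketch below is a linear search for clarity, not how a real router does it; the routing table contents and next-hop names (`R1`, `R2`, `R3`, `direct`) are invented for illustration.

```python
import ipaddress

# Hypothetical routing table: (prefix, next hop). A /32 is a host route,
# 0.0.0.0/0 is the default entry.
ROUTES = [
    ("0.0.0.0/0", "R2"),
    ("130.237.0.0/16", "R1"),
    ("130.237.32.0/24", "direct"),
    ("130.237.32.51/32", "R3"),
]


def lookup(destination: str, routes=ROUTES) -> str:
    """Return the next hop of the longest prefix that contains the destination."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    if best is None:
        raise LookupError(f"no route to {destination}")
    return best[1]


print(lookup("130.237.32.51"))  # R3     (matching host address)
print(lookup("130.237.32.7"))   # direct (longest matching network)
print(lookup("130.237.1.1"))    # R1     (shorter matching network)
print(lookup("192.0.2.1"))      # R2     (default entry)
```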
Direct delivery
  [Figure: direct delivery to Host (DST) across a single network]
Indirect delivery
  From router to router (note: the last delivery is always direct!)
  Destination address is used for a routing lookup in a routing table
  [Figure: Host (SRC) reaches the destination through indirect deliveries across several networks, followed by a final direct delivery]
Forwarding
Next-Hop method - routing table holds only the address of the next hop
  [Figure: Host A, network, R1, network, R2, network, Host B]
  Routing table for R1: Destination Host B, Route R2
  Routing table for R2: Destination Host B, Route (none given)
  [Figure: Host A, network1, R1, network2, R2, network3, Host B]
  Routing table for A: Destination network2, Route R1; Destination network3, Route R1
  Routing table for R1: Destination network3, Route R2
  Routing table for R2: Destination network1, Route R2
Host-specific method - per host routes
Default method - specifies a default route (normally network address 0.0.0.0)
  [Figure: Host A on network1, with R1 leading to network2 and R2 leading to the Internet]
  Routing table for A: Destination network2, Route R1; Destination default, Route R2
Processing
[Figure labels: routing daemon, route command, netstat command; UDP; TCP; ICMP; Routing Policy (routing table updates from adjacent routers); Routing Mechanism (Routing Table); ICMP redirects; IP Output: calculate next hop router (if necessary); forward datagram (if forwarding is enabled); source routing; process IP options; IP input queue; IP Layer; network interfaces]
Forwarding module
A simplified view of forwarding using classful address without subnetting:
[Figure: the Forwarding Module takes a packet and finds its class (A, B, or C; class D or E is an error handled on the slow path), then searches the table for that class by network address to obtain the next hop address and the interface number]
The bulk of the forwarding effort is searching the tables (as most of the rest of the processing is simple logical bit operations).
Routing Table Search - Classless
[Figure: the leading 24 bits 10000010 11100000 10111000 are compared for a match; the last 8 bits are marked "?" (not examined)]
Fast forwarding
Mikael Degermark, Andrej Brodnik, Svante Carlsson, Stephen Pink, "Small Forwarding Tables for Fast Routing Lookups", in Proceedings of the ACM SIGCOMM '97. {basis for Effnet AB}
IP routing lookups must find the routing entry with the longest matching prefix.
Networking community assumed it was impossible to do IP routing lookups in software fast enough to support gigabit speeds - but they were wrong!
Paper presents a forwarding table data structure designed for quick routing lookups.
Such forwarding tables are small enough to fit in the cache of a conventional general purpose processor.
The forwarding tables are very small: a large routing table with 40,000 routing entries can be compacted to a forwarding table of 150-160 Kbytes.
With the table in cache, a 200 MHz Pentium Pro or 333 MHz Alpha 21164 can perform >2 million lookups per second.
A lookup typically requires less than 100 instructions on an Alpha, using eight memory references accessing a total of 14 bytes.
Full routing lookup of each IP packet at gigabit speeds without special hardware
Routing Tables
Aggregate IP addresses (i.e., exploit CIDR)
  more specific networks (with longer prefixes) -> less specific networks (with shorter prefixes) => smaller routing tables
If each routing domain exports (i.e., tells others) only a small set of prefixes, this makes it easier for other routers to send traffic to it
  Unfortunately this requires clever address assignments
Current routing tables have ~157,975 entries [13] (of which a large fraction are /24 prefixes) with a growth rate of 18,000 entries per year [14].
There are a limited number of prefixes for Class A + B + C networks (2,113,664). If the longest prefixes which a backbone router had to deal with were /24, then a table with 16,777,216 entries would be sufficient (even without aggregation) - each entry only needs to store the outgoing port number! This would allow a direct lookup in a memory of ~26Mbytes - with up to 256 outgoing ports.
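The two counts in this paragraph, and the idea of a direct lookup indexed by the top 24 bits of the destination, can be sketched as follows. This is an illustration only: the one-byte-per-entry layout, the function names `install` and `lookup_port`, and the prefixes and port numbers are assumptions, and prefixes must be installed shortest first so that more specific entries overwrite less specific ones.

```python
import ipaddress

# Class A + B + C network numbers: 7, 14, and 21 bits of NetID respectively.
classful_networks = 2 ** 7 + 2 ** 14 + 2 ** 21
print(f"{classful_networks:,}")  # 2,113,664

# One table slot per possible /24 prefix.
SLOTS = 2 ** 24
print(f"{SLOTS:,}")              # 16,777,216

# Assumption: one byte per slot is enough to name one of up to 256 ports.
table = bytearray(SLOTS)         # port 0 acts as the default here


def install(prefix: str, port: int) -> None:
    """Fill every /24 slot covered by the prefix with the outgoing port."""
    network = ipaddress.ip_network(prefix)
    if network.prefixlen > 24:
        raise ValueError("this sketch only handles prefixes up to /24")
    first = int(network.network_address) >> 8
    count = 1 << (24 - network.prefixlen)
    table[first:first + count] = bytes([port]) * count


def lookup_port(destination: str) -> int:
    """A single indexed memory reference replaces the table search."""
    return table[int(ipaddress.ip_address(destination)) >> 8]


install("130.237.0.0/16", 3)   # shorter prefix first
install("130.237.32.0/24", 7)  # then the more specific one
print(lookup_port("130.237.32.51"))  # 7
print(lookup_port("130.237.1.1"))    # 3
```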
Routing table
Each routing table entry holds:
  Destination IP address
  Next-hop Router IP address
  Flags (e.g., UGH, U, UG, UH)
  pointer to local interface to use
  Refcnt (ddd), Use (ddd), PMTU (ddd)
where ddd is some numeric value.
display the routing table with "netstat -rn"
  "r" is for routing table
  "n" asks for numeric IP addresses rather than names
Flags:
  U  route is Up
  G  route is to a Gateway
  H  route is to a Host
  D  route was Discovered by a redirect
  M  route was Modified by a redirect
Host routing
A host either:
knows a route - manually configured [i.e., "Static routes"]
  from the interface (for directly connected networks)
  or manually via the "route" command
or uses a default route.
On booting, hosts send ~3 ICMP router solicitation messages (~3 seconds apart) to find a default router. This allows for dynamic discovery of the default router.
Routing
[Figure: the Control Plane (Routing Table, Access List, Queuing Priority, Accounting Data) sits above the Data Plane (Cache); each Packet passes through Switching Tasks, Security Tasks, Queuing Tasks, and Accounting Tasks]
The routing table tells us which output port to use based on the destination (and possibly the source) IP address. The data plane has to run at packet rates (i.e., in real-time). However, a router also performs a lot of other processing
Combining layers
Many devices now combine processing of several layers:
Switch/Routers: combine layers 2+3
Devices combining layers 3+4 are appearing - which extract flows based on looking at transport layer port numbers in addition to network addresses.
What to do with a new computer?
[Figure labels: oscar.it.kth.se; 130.237.212.253; ?hostname?; a new computer; 08:00:20:7a:bc:2d]
Direct mapping - requires no I/O, just a computation; hard to maintain; and requires stable storage (since you have to store the mappings somewhere)
or
Dynamic Binding - easier to maintain; but has a delay while messages are exchanged
Address Resolution: ARP, RARP
[Figure labels: ccslab1.kth.se; IP address 130.237.15.254; HW address: 48 bit; Address Resolution; RARP]
Figure 20: mapping between host names, IP address, and MAC address
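ARP - Address Resolution Protocol (RFC 826)
Field (size in bytes):
  Ethernet destination address (6)
  Ethernet source address (6)
  frame TYPE (2)
  hardware type (2)
  protocol type (2)
  hardware length (1) = 6
  protocol length (1) = 4
  OP (2): Request = 1, Reply = 2
  sender Ethernet address (6)
  sender IP address (4)
  target Ethernet address (6)
  target IP address (4)
Figure 21: Format of ARP request/reply packet (see Stevens, Vol. 1, figure 4.3, pg. 56)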
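As a sketch of the RFC 826 layout, the 28 bytes that follow the Ethernet header can be packed and unpacked with Python's `struct` module. The helper names (`build_arp`, `parse_arp`, `mac_to_bytes`) are illustrative, and the addresses are the ones that appear in the Ethereal capture later in these notes. Nothing is sent on the wire here.

```python
import socket
import struct

# hardware type, protocol type, hardware length, protocol length, OP,
# sender MAC, sender IP, target MAC, target IP (network byte order).
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_REQUEST, ARP_REPLY = 1, 2


def mac_to_bytes(mac: str) -> bytes:
    return bytes.fromhex(mac.replace(":", ""))


def build_arp(op, sender_mac, sender_ip, target_mac, target_ip) -> bytes:
    """Pack an ARP message for Ethernet (hardware type 1) and IPv4 (0x0800)."""
    return struct.pack(
        ARP_FORMAT, 1, 0x0800, 6, 4, op,
        mac_to_bytes(sender_mac), socket.inet_aton(sender_ip),
        mac_to_bytes(target_mac), socket.inet_aton(target_ip),
    )


def parse_arp(packet: bytes) -> dict:
    """Unpack the fixed-size ARP message at the start of the buffer."""
    size = struct.calcsize(ARP_FORMAT)
    _hw, _proto, hw_len, proto_len, op, sha, spa, tha, tpa = struct.unpack(
        ARP_FORMAT, packet[:size])
    return {
        "op": op,
        "hardware_length": hw_len,
        "protocol_length": proto_len,
        "sender_mac": ":".join(f"{b:02x}" for b in sha),
        "sender_ip": socket.inet_ntoa(spa),
        "target_mac": ":".join(f"{b:02x}" for b in tha),
        "target_ip": socket.inet_ntoa(tpa),
    }


request = build_arp(ARP_REQUEST, "00:40:8c:30:d4:32", "172.16.33.3",
                    "00:00:00:00:00:00", "172.16.33.2")
print(len(request))  # 28
print(parse_arp(request)["target_ip"])  # 172.16.33.2
```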
ARP example 1
hostname ftp://B.kth.se/foo.dat
[Figure labels: hosts A (FTPd), B (FTPd), and C (resolver, FTP); steps (3) to (10) pass through the resolver, FTP, TCP (TCP connections), IP, ARP, and the Ethernet drivers; the ARP request is a limited broadcast]
Figure 22: Using ARP on host C to determine MAC address of the interface on host B
Note that the latter form (with the "n" option) does not look up the hostname; this is very useful when you don't yet have a name resolution service working!
ARP Refinements
Since the sender's Internet-to-Physical address binding is in every ARP broadcast, (all) receivers update their caches before processing an ARP packet
ARP Timeouts
If there is no reply to an ARP request
  the machine is down or not responding
  the request was lost, then retry (but not too often)
  eventually give up (When?)
ARP example 2
hostname ftp://B.kth.se/foo.dat
[Figure labels: hosts B (FTPd) and C (resolver, FTP) with router R; steps (3) to (8) pass through FTP, TCP (TCP connections), IP, ARP, and the Ethernet drivers]
Figure 23: Router (R) doing a Proxy ARP to provide MAC address of B
Gratuitous ARP
Host sends a request for its own address
  generally done at boot time to inform other machines of its address (possibly a new address) - gives these other hosts a chance to update their cache entries immediately
  lets hosts check to see if there is another machine claiming the same address: "duplicate IP address sent from Ethernet address a:b:c:d:e:f"
As noted before, hosts have paid the price by servicing the broadcast, so they can cache this information - this is one of the ways the proxy ARP server could know the mapping.
Note that faking that you are another machine can be used to provide failover for servers (see for example heartbeat, fake, etc. at http://www.linux-ha.org/download/ for a send_arp program). [It can also be used for various attacks!]
IEEE 802.3 Ethernet
  Destination: ff:ff:ff:ff:ff:ff (Broadcast)
  Source: 00:40:8c:30:d4:32 (172.16.33.3)
  Length: 36
  Trailer: 00000000000000000000
  Type: ARP (0x0806)
Address Resolution Protocol (request)
  Hardware type: IEEE 802 (0x0006)
  Protocol type: IP (0x0800)
  Hardware size: 6
  Protocol size: 4
  Opcode: request (0x0001)
  Sender MAC address: 00:40:8c:30:d4:32 (172.16.33.3)
  Sender IP address: 172.16.33.3 (172.16.33.3)
  Target MAC address: ff:ff:ff:ff:ff:ff (Broadcast)   <<< unlike what page 163 says it is not all zeros!
  Target IP address: 172.16.33.2 (172.16.33.2)

0000  ff ff ff ff ff ff 00 40 8c 30 d4 32 00 24 aa aa
0010  03 00 00 00 08 06 00 06 08 00 06 04 00 01 00 40
0020  8c 30 d4 32 ac 10 21 03 ff ff ff ff ff ff ac 10
0030  21 02 00 00 00 00 00 00 00 00 00 00
[Figure labels: hosts B (FTPd) and C (resolver, FTP); steps (3) and (4) pass down through TCP (TCP connections) and IP to the PPP driver on the sending side]
On the receiving side:
  (5) PPP driver: packet received and deframed
  (6) packet demultiplexed on the PPP protocol field (0x0021)
  (7) packet demultiplexed on the IP protocol field (6)
  (8) packet demultiplexed on the TCP port number (21 for control, then 20 for data)
Figure 24: On a point-to-point link there is no need for ARP (figure also shows explicitly the demultiplexing)
Note that the PPP protocol field plays a role similar to the ethernet frame type.
RARP: Reverse Address Resolution Protocol (RFC 903)
[Figure: RARP packet format with frame TYPE (2 bytes), hardware length (6), protocol length, and OP: Request = 3, Reply = 4]
Note: You can now see what the publish aspect of the arp command is for.
RARP - as seen with ethereal
Ethernet II, Src: 00:40:8c:30:d4:32, Dst: ff:ff:ff:ff:ff:ff
  Destination: ff:ff:ff:ff:ff:ff (Broadcast)
  Source: 00:40:8c:30:d4:32 (172.16.33.3)
  Type: RARP (0x8035)
  Trailer: 00000000000000000000000000000000...
Address Resolution Protocol (reverse request)
  Hardware type: Ethernet (0x0001)
  Protocol type: IP (0x0800)
  Hardware size: 6
  Protocol size: 4
  Opcode: reverse request (0x0003)
  Sender MAC address: 00:40:8c:30:d4:32 (172.16.33.3)
  Sender IP address: 0.0.0.0 (0.0.0.0)   <<< as the source does not know its own IP address
  Target MAC address: 00:40:8c:30:d4:32 (172.16.33.3)
  Target IP address: 0.0.0.0 (0.0.0.0)   <<< as the source does not know the target's IP address

0000  ff ff ff ff ff ff 00 40 8c 30 d4 32 80 35 00 01
0010  08 00 06 04 00 03 00 40 8c 30 d4 32 00 00 00 00
0020  00 40 8c 30 d4 32 00 00 00 00 00 00 00 00 00 00
0030  00 00 00 00 00 00 00 00 00 00 00 00
RARP server
Someone has to know the mappings - quite often this is in a file /etc/ethers
Since this information is generally in a file, RARP servers are generally implemented as user processes (because a kernel process should not do file I/O!)
  This is unlike ARP responses, which are generally part of the TCP/IP implementation (often part of the kernel).
How does the process get the packets - since they aren't IP and won't come across a socket?
  BSD Packet filters
  SVR4 Data Link Provider Interface (DLPI)
  SUN's Network Interface Tap (NIT)
  Interestingly, in the appendix to RFC 903 an alternative to having data link level access was to have two IOCTLs, one that would "sleep until there is a translation to be done, then pass the request out to the user process"; the other means: "enter this translation into the kernel table"
RARP requests are sent as hardware level broadcasts - therefore they are not forwarded across routers:
  multiple servers per segment - so in case one is down; the first response is used
  having the router answer
Alternatives to RARP
In a later lecture we will examine: BOOTP and DHCP (for both IPv4 and IPv6) and autoconfiguration for IPv6.
If you change Ethernet cards, you get a new address!
Assumes that all machines are attached to a high capacity LAN.
Advantages
  You only have to assign network numbers, then the hosts figure out their own address.
  Simpler administration.
Novell NetWare provides: Service Advertising Protocol (SAP), Routing Information Protocol (RIP), and NetWare Core Protocol (NCP).
Useful tools
For looking at and generating packets!
tcpdump
Under HP-UX 11.0
# ./tcpdump -i /dev/dlpi0
tcpdump: listening on /dev/dlpi0
22:25:43.217866 birk2.5900 > nucmed35.50251: . ack 3089200293 win 8080 (DF)
22:25:43.290636 birk2.5900 > nucmed35.50251: P 0:4(4) ack 1 win 8080 (DF)
22:25:43.360064 nucmed35.50251 > birk2.5900: . ack 4 win 32768
22:25:43.363786 birk2.5900 > nucmed35.50251: P 4:167(163) ack 1 win 8080 (DF)
22:25:43.364159 nucmed35.50251 > birk2.5900: P 1:11(10) ack 167 win 32768
22:25:43.543867 birk2.5900 > nucmed35.50251: . ack 11 win 8070 (DF)
22:25:43.577483 birk2.5900 > nucmed35.50251: P 167:171(4) ack 11 win 8070 (DF)
22:25:43.640052 nucmed35.50251 > birk2.5900: . ack 171 win 32768
22:25:43.643793 birk2.5900 > nucmed35.50251: P 171:334(163) ack 11 win 8070 (DF)
22:25:43.644132 nucmed35.50251 > birk2.5900: P 11:21(10) ack 334 win 32768
22:25:43.750062 birk2.5900 > nucmed35.50251: . ack 21 win 8060 (DF)
22:25:43.873349 birk2.5900 > nucmed35.50251: P 334:338(4) ack 21 win 8060 (DF)
22:25:43.940073 nucmed35.50251 > birk2.5900: . ack 338 win 32768
13 packets received by filter
0 packets dropped by kernel
tcpdump - Linux
nucmed30:/home/maguire # /usr/sbin/tcpdump -i eth1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
14:21:52.736671 IP nucmed30.local.domain.must-p2p > jackb.ssh: P 1818006646:1818006726(80) ack 307068981 win 591>
14:21:52.737291 IP jackb.ssh > nucmed30.local.domain.must-p2p: P 1:113(112) ack 80 win 32768 <nop,nop,timestamp >
14:21:52.737917 IP nucmed30.local.domain.must-p2p > jackb.ssh: P 80:160(80) ack 113 win 5910 <nop,nop,timestamp >
14:21:52.802719 IP jackb.ssh > nucmed30.local.domain.must-p2p: . ack 160 win 32768 <nop,nop,timestamp 25983516 2>
14:21:57.782196 arp who-has jackscan tell nucmed30.local.domain
14:21:57.784218 arp reply jackscan is-at 00:40:8c:30:d4:32
14:21:57.784253 IP nucmed30.local.domain > jackscan: icmp 64: echo request seq 1
14:21:57.784971 IP jackscan > nucmed30.local.domain: icmp 64: echo reply seq 1
14:21:58.782187 IP nucmed30.local.domain > jackscan: icmp 64: echo request seq 2
14:21:58.782912 IP jackscan > nucmed30.local.domain: icmp 64: echo reply seq 2
14:21:59.783036 IP nucmed30.local.domain > jackscan: icmp 64: echo request seq 3
14:21:59.783759 IP jackscan > nucmed30.local.domain: icmp 64: echo reply seq 3
14:21:59.802600 IP jackb.ssh > nucmed30.local.domain.must-p2p: . ack 2864 win 32768 <nop,nop,timestamp 25984216 >
14:22:00.739485 IP nucmed30.local.domain.must-p2p > jackb.ssh: P 2864:2944(80) ack 897 win 5910 <nop,nop,timesta>
Tools Used: tcpdump Program
[Figure labels: user processes (tcpdump, tcpdump, rarpd); kernel (IP, ICMP, IGMP); received (rcvd) and transmitted (xmit) packets; nit_buf; nit_pf; Ethernet driver]
Figure 26: Two alternatives to get packets
Note the BSD Packet Filter (BPF) gets a copy of both the received and transmitted packets.
Ethereal
sock
TCP connection UDP server
Interactive client: default
Interactive server: -s
Source client: -i
Sink server: -i -s
Default TCP, -u for UDP
Source Code Available: (Tcpdump and sock)
Traffic generators
Distributed Internet Traffic Generator (D-ITG) [18] http://www.grid.unina.it/software/ITG/
Iperf http://dast.nlanr.net/Projects/Iperf/
MGEN: network performance tests and measurements using UDP/IP traffic http://mgen.pf.itd.nrl.navy.mil/
RUDE & CRUDE - Real-time UDP Data Emitter (RUDE) and Collector for RUDE (CRUDE) http://rude.sourceforge.net/
SUN's Packet Shell - http://playground.sun.com/psh/
TG http://www.caip.rutgers.edu/~arni/linux/tg1.html
UDPgen http://www.fokus.fhg.de/usr/sebastian.zander/private/udpgen
Netcom's SmartBits - hardware tester
For additional traffic generators see: http://www.icir.org/models/trafficgenerators.html and http://www.ip-measurement.org/
Summary
This lecture we have discussed:
Routing Principles
  Routing Mechanism: Use the most specific route
    IP provides the mechanism to route packets
  Routing Policy: What routes should be put in the routing table?
    Use a routing daemon to provide the routing policy
Routing table
ARP and RARP
IPX/SPX Addresses - we will see something similar when we talk about IPv6
tcpdump, ethereal, sock
For further information about routing see:
  Bassam Halabi, Internet Routing Architectures, Cisco Press, 1997, ISBN 1-56205-652-2. -- especially useful for IGRP.
References
[7] http://www.renesas.com/fmwk.jsp?cnt=tcam_series_landing.jsp&fp=/applications/network/network_memory/tcam/
[8] Fang Yu, Randy H. Katz, and T. V. Lakshman, "Gigabit Rate Multiple-Pattern Matching with TCAM", http://sahara.cs.berkeley.edu/jan2004-retreat/slides/Fang_retreat.ppt
[9] Geoff Huston, "Analyzing the Internet BGP Routing Table", Cisco Systems web page, http://www.cisco.com/en/US/about/ac123/ac147/ac174/ac176/about_cisco_ipj_archive_article09186a00800c83cc.html
[10] Tian Bu, Lixin Gao, and Don Towsley, "On Characterizing BGP Routing Table Growth", Proceedings of Global Internet 2002, 2002, http://www-unix.ecs.umass.edu/~lgao/globalinternet2002_tian.pdf
[11] H Narayan, R Govindan, G Varghese, "The Impact of Address Allocation and Routing on the Structure and Implementation of Routing Tables", Proceedings of the 2003 Conference on Applications, technologies, architectures, and protocols for computer communications, 2003, pp 125-136, ISBN:1-58113-735-4 and SIGCOMM '03, August 25-29, 2003, Karlsruhe, Germany, http://www.cs.ucsd.edu/~varghese/PAPERS/aram.pdf
[12] Ravikumar V.C, Rabi Mahapatra, J.C. Liu, "Modified LC-Trie Based Efficient Routing Lookup", http://faculty.cs.tamu.edu/rabi/Publications/Mascot-final-proceeding.pdf
[13] APNIC, Routing Table Report 04:00 +10GMT Sat 19 Mar, 2005, North American Network Operators Group, Weekly Routing Table Report, From: Routing Table Analysis, Mar 18 13:10:37 2005, "This is an automated weekly mailing describing the state of the Internet Routing Table as seen from APNIC's router in Japan. Daily listings are sent to [email protected]", http://www.merit.edu/mail.archives/nanog/2005-03/msg00401.html
[14] Geoff Huston, "Routing Table Status Report", Policy SIG, APNIC19, Kyoto, Japan, Feb 24 2005, http://www.apnic.net/meetings/19/docs/sigs/routing/routing-pres-info-huston-routing-table.pdf
[15] Bassam Halabi, Internet Routing Architectures, Cisco Press, 1997, ISBN 1-56205-652-2.
[16] Gianluca Insolvibile, "The Linux Socket Filter: Sniffing Bytes over the Network", Linux Journal, 31 May 2001, http://www.linuxjournal.com/article/4659
[17] Gianluca Insolvibile, "Inside the Linux Packet Filter, Part II", Linux Journal, 1 March 2002, http://www.linuxjournal.com/article/5617
[18] Stefano Avallone, Antonio Pescapè, and Giorgio Ventre, "Analysis and experimentation of Internet Traffic Generator", International Conference on Next Generation Teletraffic and Wired/Wireless Advanced Networking (NEW2AN'04), February 02-06, 2004, http://www.grid.unina.it/software/ITG/D-ITGpubblications/New2an-ITG.pdf
2G1305 Internetworking/Internetteknik Spring 2005, Period 4 Module 3: IP, ICMP, and Tools
Lecture notes of G. Q. Maguire Jr.
For use in conjunction with TCP/IP Protocol Suite, by Behrouz A. Forouzan, 3rd Edition, McGraw-Hill. For this lecture: Chapters 8-9
© 1998, 1999, 2000, 2002, 2003, 2005 G.Q. Maguire Jr. All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the author. Last modified: 2005.04.30:22:05
IP header (fixed part: 20 bytes)
  4 bit version | 4 bit header length | 8-bit TOS | 16 bit total length
  16 bit identification | 3 bit flags | 13 bit Fragment Offset
  8-bit TTL | 8-bit Protocol | 16 bit header checksum
  32 bit Source IP address
  32 bit Destination IP address
  options (padded to 32 bit length)
  data
Figure 28: IP header (see Stevens, Vol. 1, figure 3.1, pg. 34)
Note: We examined the Version, Protocol, and IP address fields in the previous lecture. Today we will examine the length fields, TOS, identification, flags, offset, checksum, and options fields.
Length Fields
Header Length (4 bits)
  Size of IPv4 header including IP options
  Expressed in number of 32-bit words (4-byte words)
  Minimum is 5 words (i.e., 20 bytes)
  Maximum is 15 words (i.e., 60 bytes): limited size => limited use
Total Length (16 bits)
  Total length of datagram including header
  If datagram is fragmented: length of this fragment
  Expressed in bytes
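The two length fields sit in the first four bytes of the header, so a short sketch can show how the header size and payload size are derived from them. The function name `header_lengths` and the sample bytes are illustrative; only the first four header bytes are inspected.

```python
import struct


def header_lengths(packet: bytes):
    """Return (version, header length in bytes, payload length in bytes)."""
    version_ihl, _tos, total_length = struct.unpack("!BBH", packet[:4])
    version = version_ihl >> 4          # upper 4 bits
    header_words = version_ihl & 0x0F   # lower 4 bits, in 32-bit words
    if header_words < 5:
        raise ValueError("header length below the 5-word minimum")
    header_bytes = header_words * 4
    return version, header_bytes, total_length - header_bytes


# 0x45: version 4, header length 5 words; total length 0x0054 = 84 bytes.
sample = bytes([0x45, 0x00, 0x00, 0x54]) + bytes(16)
print(header_lengths(sample))  # (4, 20, 64)

# 0x4F: header length 15 words, the maximum (40 bytes of options).
with_options = bytes([0x4F, 0x00, 0x00, 0x54]) + bytes(56)
print(header_lengths(with_options))  # (4, 60, 24)
```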
MTU (bytes)  Network
65535        Official maximum MTU
17914        16Mbps IBM Token Ring
8166         IEEE 802.4
4464         IEEE 802.5 (4Mbps max)
4352         FDDI (Revised)
2048         Wideband Network
2002         IEEE 802.5 (4Mb recommended)
1536         Experimental Ethernet Nets
1500         Ethernet Networks
1500         Point-to-Point (default)
1492         IEEE 802.3
1006         SLIP
1006         ARPANET
576          X.25 Networks
544          DEC IP Portal
512          NETBIOS
508          IEEE 802/Source-Route Bridge
296          Point-to-Point (low delay)
68           Official minimum MTU
Annotations on the slide (the rows they point at are not recoverable): "simply a logical limit for interactive response"; "we will see this number again!"
Fragmentation
If an IP datagram is larger than the MTU of the link layer, it must be divided into several pieces: fragmentation.
Fragmentation may occur multiple times, as a fragment might need to go across a link with an even smaller MTU!
Flags: 3 bits
  Reserved Fragment (RF) - set to 0
  Don't Fragment (DF)
    Set to 1 if datagram should not be fragmented
    If set and fragmentation is needed, the datagram will be discarded and an error message will be returned to the sender
  More Fragments (MF)
    Set to 1 for all fragments, except the last
Fragments can overlap - the receiver simply assembles what it receives (ignoring duplicate parts). If there are gaps, then at some point there will be a re-assembly error.
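To make the arithmetic concrete, here is a small sketch of how a sender or router could plan the fragments of one datagram. It is an illustration only: `struct fragment` and `plan_fragments` are hypothetical names, and the code assumes a 20-byte IP header without options on every fragment. Offsets are carried in units of 8 bytes, so every fragment except the last carries a multiple of 8 payload bytes.

```c
#include <assert.h>
#include <stddef.h>

struct fragment {
    unsigned offset_bytes;   /* position of this piece in the original IP payload */
    unsigned offset_field;   /* value placed in the 13-bit offset field (8-byte units) */
    unsigned length;         /* payload bytes carried by this fragment */
    int more_fragments;      /* MF flag: 1 for every fragment except the last */
};

/*
 * Split payload_len bytes of IP payload for a link with the given MTU.
 * Returns the number of fragments written to out, or 0 if the MTU is too
 * small, the payload is empty, or more than max_out fragments are needed.
 */
size_t plan_fragments(unsigned payload_len, unsigned mtu,
                      struct fragment *out, size_t max_out)
{
    assert(out != NULL);
    if (mtu < 28 || payload_len == 0)
        return 0;                       /* need the header plus at least 8 bytes */

    unsigned per_fragment = ((mtu - 20) / 8) * 8;   /* round down to 8-byte units */
    unsigned offset = 0;
    size_t n = 0;

    while (offset < payload_len) {
        if (n == max_out)
            return 0;
        unsigned left = payload_len - offset;
        unsigned len = left < per_fragment ? left : per_fragment;
        out[n].offset_bytes = offset;
        out[n].offset_field = offset / 8;
        out[n].length = len;
        out[n].more_fragments = (offset + len < payload_len);
        offset += len;
        n++;
    }
    return n;
}
```

For an 8192-byte UDP payload plus the 8-byte UDP header (8200 bytes of IP payload) over a 1500-byte MTU this yields six fragments, the last one carrying 800 bytes at offset 7400, which agrees with the `800@7400` fragment in the tcpdump trace in the UDP module of these notes.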
Path MTU
Each link in the path from source to destination can have a different MTU; to avoid fragmentation you have to find the minimum of these.
RFC 1191: Path MTU discovery [20] uses:
  good guesses (i.e., likely values)
  By setting the Don't Fragment (DF) bit in the IP datagram, change the size while you get ICMP messages saying Destination Unreachable with a code saying fragmentation needed
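The "good guesses" strategy can be sketched as a search over a table of likely MTU values. The code below is a simulation under stated assumptions, not an implementation of RFC 1191: the table is the MTU list from these notes, the function names are hypothetical, and an array of per-link MTUs stands in for the ICMP "fragmentation needed" replies that a real sender would receive.

```c
#include <assert.h>
#include <stddef.h>

/* Likely MTU values, largest first (the distinct values from the MTU table above). */
static const unsigned likely_mtus[] = {
    65535, 17914, 8166, 4464, 4352, 2048, 2002, 1536,
    1500, 1492, 1006, 576, 544, 512, 508, 296, 68
};

/* Next guess strictly below size, or 0 when no smaller guess exists. */
unsigned next_lower_mtu(unsigned size)
{
    size_t count = sizeof(likely_mtus) / sizeof(likely_mtus[0]);
    for (size_t i = 0; i < count; i++)
        if (likely_mtus[i] < size)
            return likely_mtus[i];
    return 0;
}

/*
 * Simulated discovery: start with the first-hop MTU and shrink the datagram
 * (sent with DF set) while some link on the path would have rejected it.
 */
unsigned probe_path_mtu(unsigned first_hop_mtu,
                        const unsigned *link_mtu, size_t nlinks)
{
    assert(link_mtu != NULL);
    unsigned size = first_hop_mtu;
    while (size != 0) {
        int fits = 1;
        for (size_t i = 0; i < nlinks; i++)
            if (link_mtu[i] < size) { fits = 0; break; }
        if (fits)
            return size;
        size = next_lower_mtu(size);    /* "fragmentation needed": try a smaller guess */
    }
    return 0;
}
```

Note that guessing can undershoot: on a path whose smallest link MTU is 600 bytes this search settles on 576, the nearest likely value below it.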
Type of Service (TOS): 8 bits
  bits 0-2: Precedence
  bit 3: Delay (0 = normal)
  bit 4: Throughput (0 = normal)
  bit 5: Reliability (0 = normal)
  bit 6: monetary Cost
  bit 7: Reserved
Few applications set the TOS field (in fact most implementations will not let you set these bits!) However:
  4.3BSD Reno and later do support these bits.
  Differentiated Services (diffserv) proposes to use 6 of these bits to provide 64 priority levels, calling it the Differentiated Service (DS) field [RFC2474] (using bits 0..5 as the Differentiated Services CodePoint (DSCP)).
  SLIP guesses by looking at the protocol field and then checks the source and destination port numbers.
There has been a lot of experimentation with this field, both for TOS and more recently for Explicit Congestion Notification (ECN): RFC 3168 [19] using bits 6 and 7 {ECN Capable Transport (ECT) and Congestion Experienced (CE)}.
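Since bit 0 is the most significant bit of the byte, the DSCP occupies the upper six bits and the two ECN bits the lower two. A minimal sketch of splitting and rebuilding the byte follows; the helper names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 0..5 (the six most significant bits): the DS codepoint. */
unsigned ds_codepoint(uint8_t tos)
{
    return tos >> 2;
}

/* Bits 6 and 7 (the two least significant bits): the ECN bits. */
unsigned ecn_bits(uint8_t tos)
{
    return tos & 0x03u;
}

/* Rebuild the TOS/DS byte from its two parts. */
uint8_t make_ds_byte(unsigned dscp, unsigned ecn)
{
    assert(dscp < 64 && ecn < 4);
    return (uint8_t)((dscp << 2) | ecn);
}
```

For example, the traditional "minimize delay" TOS value 0x10 reads as codepoint 4 with both ECN bits clear when interpreted this way.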
Recommended TOS values:
Application            Minimize  Maximize    Maximize     Minimize       Hex
                       delay     throughput  reliability  monetary cost  value (a)
telnet/rlogin          1         0           0            0              0x10
FTP control            1         0           0            0              0x10
FTP data               0         1           0            0              0x08
any bulk data          0         1           0            0              0x08
TFTP                   1         0           0            0              0x10
SMTP command phase     1         0           0            0              0x10
SMTP data phase        0         1           0            0              0x08
DNS UDP query          1         0           0            0              0x10
DNS TCP query          0         0           0            0              0x00
DNS zone transfer      0         1           0            0              0x08
ICMP error             0         0           0            0              0x00
ICMP query             0         0           0            0              0x00
any IGP                0         0           1            0              0x04
SNMP                   0         0           1            0              0x04
BOOTP                  0         0           0            0              0x00
NNTP                   0         0           0            1              0x02
a. Note that this is the hex value as seen in the TOS/DS byte.
Precedence
Precedence values are defined but are largely ignored; few applications use them.
  111  Network Control
  110  Internetwork Control
  101  CRITIC/ECP
  100  Flash Override
  011  Flash
  010  Immediate
  001  Priority
  000  Routine
In the original ARPANET there were two priority levels defined (in order to support low delay services and regular traffic).
Differentiated services
If bits 3, 4, and 5 are all zero (i.e., XXX000), treat bits 0, 1, and 2 as the traditional precedence bits; else the 6 bits define 64 services:
  Category 1: numbers 0, 2, 4, ..., 62 - defined by IETF
  Category 2: numbers 3, 7, 11, 15, ..., 63 - defined by local authorities
  Category 3: numbers 1, 5, 9, ..., 61 - for temporary/experimental use
The numbering makes more sense when you see them as bit patterns:
  Category  Codepoint  Assigning authority
  1         XXXXX0     IETF
  2         XXXX11     local authorities
  3         XXXX01     temporary/experimental use
The big problems occur at gateways where the interpretation of local DS values is different on the incoming and outgoing links!
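The three categories follow directly from the two low-order bits of the codepoint. A short sketch (with hypothetical function names) of that classification, together with the XXX000 test for precedence-compatible codepoints:

```c
#include <assert.h>

/* Category of a 6-bit codepoint: XXXXX0 -> 1, XXXX11 -> 2, XXXX01 -> 3. */
int ds_category(unsigned dscp)
{
    assert(dscp < 64);
    if ((dscp & 0x1u) == 0)
        return 1;
    return (dscp & 0x2u) ? 2 : 3;
}

/* True when the low three bits are zero (XXX000), i.e. only precedence is set. */
int ds_is_precedence_only(unsigned dscp)
{
    assert(dscp < 64);
    return (dscp & 0x7u) == 0;
}
```

Even numbers land in category 1, numbers of the form 4k+3 in category 2, and numbers of the form 4k+1 in category 3, which is exactly the pattern in the lists above.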
TTL field
Time To Live (TTL) (8 bits):
  Limits the lifetime of a datagram, to avoid infinite loops
  A router receiving a packet with TTL>1 decrements the TTL field and forwards the packet
  If TTL <= 1 the packet shall not be forwarded; an ICMP time exceeded error is returned to the sender {we will cover ICMP shortly}
  Recommended value is 64
  Should really be called Hop Limit (as in IPv6)
Historically: Every router holding a datagram for more than 1 second was expected to decrement the TTL by the number of seconds the datagram resided in the router.
Header Checksum
Ensures integrity of header fields
  Hop-by-hop (not end-to-end)
  Header fields must be correct for proper and safe processing of IP!
  Payload is not covered
Other checksums
  Hop-by-hop: using link-layer CRC
    IP assumes a strong link layer checksum/CRC, as the IP checksum is weak
  End-to-end: Transport layer checksums, e.g., TCP & UDP checksums, cover payload
Note that recent work concerning IP over wireless links assumes that the payload can have errors and will still be received (see work concerning selective coverage of UDP checksum).
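The header checksum is the 16-bit one's complement of the one's complement sum of the header taken as 16-bit words, computed with the checksum field set to zero. The sketch below is one straightforward way to write it; the function name is an assumption for this example, and the sample header used to check it is a commonly quoted one, not one from the lecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One's complement checksum over len bytes, read as big-endian 16-bit words.
 * For a header whose checksum field is zero this returns the value to store;
 * for a received header with the checksum in place it returns 0 if intact.
 */
uint16_t ip_checksum(const uint8_t *data, size_t len)
{
    assert(data != NULL);
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;    /* pad an odd trailing byte with zero */

    while (sum >> 16)                            /* fold the carries back in */
        sum = (sum & 0xFFFFu) + (sum >> 16);

    return (uint16_t)~sum;
}
```

Because every router decrements the TTL, it must also update this checksum before forwarding, which is one reason the computation is kept this cheap.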
IPv4 Options
IPv4 options were intended for network testing & debugging
Options are variable sized and follow the fixed header
Contiguous (i.e., no separators)
Not required fields, but all IP implementations must include processing of options
  Unfortunately, many implementations do not!
Since the maximum header length is 60 bytes and the fixed part is 20 bytes, there is very little space left!
IP Options Encoding
Two styles:
  Single byte: Code only
  Multiple byte: Code, Length, Data
Code (1 byte): copy (1 bit) | class (2 bits) | option number (5 bits)
  Copy (to fragments) (1 bit)
    0: copy only to the first fragment
    1: copy the option to all fragments
  Class (2 bits)
    0 (00): Datagram or network control
    2 (10): Debugging and measurement
    1 (01) and 3 (11) reserved
  Option Number (5 bits)
Option Length: 1 byte, defines total length of option
Data: option specific
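The three sub-fields of the code byte can be pulled apart with shifts and masks. This is a small illustrative sketch; the struct and function names are not from the lecture.

```c
#include <assert.h>
#include <stdint.h>

struct ip_option_code {
    unsigned copy;     /* 1 bit: copy the option into every fragment? */
    unsigned opt_class;/* 2 bits: 0 = control, 2 = debugging and measurement */
    unsigned number;   /* 5 bits: option number */
};

/* Split the first byte of an IP option into copy, class, and number. */
struct ip_option_code decode_option_code(uint8_t code)
{
    struct ip_option_code c;
    c.copy = (code >> 7) & 0x1u;
    c.opt_class = (code >> 5) & 0x3u;
    c.number = code & 0x1Fu;
    return c;
}
```

Decoding the byte 0x44 (68), for instance, gives copy 0, class 2, number 4, i.e. a debugging and measurement option that is only carried in the first fragment.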
Categories of IP Options
Single byte (only code)
  No operation (Option Number=1)
  End of option list (Option Number=0)
Multiple byte
  Loose Source Route (Option Number=3)
    Path includes these routers, but there can be multiple hops between the specified addresses
  Time stamp (Option Number=4)
    Like record route (below), but adds a timestamp at each of the routers (up to the space available; after this an overflow field is incremented, but it is only 4 bits)
  Record Route (Option Number=7)
  Strict Source Route (Option Number=9)
    The exact path is specified
However, due to the very limited space available for the options, these options are of little practical value in today's internet. (Consider the diameter of today's internet versus the number of IP addresses or timestamps that could be in the options field; i.e., record route can only store 9 IP addresses: of the 40 bytes of option space, 3 are used for code, length, and pointer, leaving room for nine 4-byte addresses!)
Internet Control Message Protocol (ICMP)
  Destination Unreachable (Network/Host/Protocol/Port/...)
  Time Exceeded (TTL expired)
  Parameter problem - IP header error
  Source Quench (requests source to decrease its data rate)
  Redirect - tell source to send its messages to a better address
  Echo Request / Echo reply - for testing (e.g., ping program sends an Echo request)
  Timestamp Request / Timestamp reply
  Information Request / Information reply
  Address Mask Request / Reply
  Traceroute
  Datagram conversion error
  Mobile Host Redirect / Registration Request / Registration Reply
  IPv6 Where-Are-You / I-Am-Here
Annotations on the (lost) command lines:
  specify host and port number
  try to fetch a file
  about 25s later
tcpdump output
   1   0.0        arp who-has svr4 tell bsdi
   2   0.002050   arp reply svr4 is-at 0:0:c0:c2:9b:26
   3   0.002723   bsdi.2924 > svr4.8888 udp 20
   4   0.006399   svr4 > bsdi: icmp: svr4 udp port 8888 unreachable
   5   5.000776   bsdi.2924 > svr4.8888 udp 20
   6   5.004304   svr4 > bsdi: icmp: svr4 udp port 8888 unreachable
       (repeats every 5 seconds)
  11  20.001177   bsdi.2924 > svr4.8888 udp 20
  12  20.004759   svr4 > bsdi: icmp: svr4 udp port 8888 unreachable
ICMP Redirect
ICMP Redirect message is sent by a router (R1) to the sender of an IP datagram (host) when the datagram should have been sent to a different router (R2).
[Figure: host, R1, and R2; recovered labels: (1) IP datagram, (4) subsequent IP datagrams]
Look at ping across different connections (see note 2):
  LAN
  WAN
  Hardwired SLIP
  Dialup SLIP - extra delay due to the modems and the correction/compression
With the IP record route (RR) option: tracing the route of the ping datagram.
1. Mike Muuss was killed in an automobile accident on November 20, 2000. http://ftp.arl.mil/~mike/
2. For examples, see Stevens, Vol. 1, Chapter 7, pp. 86-90.
PING examples
On a Solaris machine:
^C
----www.kth.se PING Statistics----
5 packets transmitted, 5 packets received, 0% packet loss
round-trip (ms) min/avg/max = 11/25/54
5 packets sent via:        (this is based on the record route information, caused by -ov)
    217.208.194.247 - s31o268.telia.com
    213.64.62.150 - fre-d4-geth6-0.se.telia.net
    213.64.62.154 - fre-c3-geth6-0.se.telia.net
    195.67.220.1 - fre-b1-pos0-1.se.telia.net
    130.242.94.4 - STK-PR-2-SRP5.sunet.se
    130.242.204.130 - STK-BB-2-POS4-3.sunet.se
    130.242.204.121 - stockholm-1-FE1-1-0.sunet.se
    130.237.32.3 - [ name lookup failed ]
    130.237.32.51 - oberon.admin.kth.se
Useful Tool: Traceroute Programs
tcpdump output (time, time since previous line, packet):
   1  0.0                  arp who-has bsdi tell svr4
   2  0.000586  (0.0006)   arp reply bsdi is-at 0:0:c0:6f:2d:40
   3  0.003067  (0.0025)   svr4.42804 > slip.33435 udp 12 [ttl 1]
   4  0.004325  (0.0013)   bsdi > svr4: icmp: time exceeded in-transit
   5  0.069810  (0.0655)   svr4.42804 > slip.33436 udp 12 [ttl 1]
   6  0.071149  (0.0013)   bsdi > svr4: icmp: time exceeded in-transit
   7  0.085162  (0.0140)   svr4.42804 > slip.33437 udp 12 [ttl 1]
   8  0.086375  (0.0012)   bsdi > svr4: icmp: time exceeded in-transit
   9  0.118608  (0.0322)   svr4.42804 > slip.33438 udp 12 ttl=2
  10  0.226464  (0.1079)   slip > svr4: icmp: slip udp port 33438 unreachable
  11  0.287296  (0.0608)   svr4.42804 > slip.33439 udp 12 ttl=2
  12  0.395230  (0.1079)   slip > svr4: icmp: slip udp port 33439 unreachable
  13  0.409504  (0.0143)   svr4.42804 > slip.33440 udp 12 ttl=2
  14  0.517430  (0.1079)   slip > svr4: icmp: slip udp port 33440 unreachable
ICMP Summary
Destination (Network/Host/Protocol/Port/...) Unreachable
Time Exceeded - i.e., TTL expired
  Used to implement traceroute
Parameter problem - IP header error
Source Quench - asks source to decrease its sending rate
Redirect - tells the source to send packets to a better address
Echo Request/Echo reply - for testing
  ping: sends an Echo Request, then measures the time until the matching reply is received
Round Trip Time (RTT) computation
Clock synchronization
Summary
This lecture we have discussed:
  IP
  ICMP
  tools: Ping, Traceroute
References
[19] K. Ramakrishnan, S. Floyd, and D. Black, The Addition of Explicit Congestion Notification (ECN) to IP, IETF RFC 3168, September 2001.
[20] J. Mogul and S. Deering, Path MTU Discovery, IETF RFC 1191, November 1990.
[21] J. Postel, Internet Control Message Protocol, IETF RFC 792, September 1981 http://www.ietf.org/rfc/rfc0792.txt
For use in conjunction with TCP/IP Protocol Suite, by Behrouz A. Forouzan, 3rd Edition, McGraw-Hill. For this lecture: Chapters 11, 16, 17
1998, 1999, 2000, 2002, 2003, 2005 G.Q.Maguire Jr. All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the author. Last modified: 2005.04.24:17:47
Outline
  UDP
  Socket API
  BOOTP
  DHCP
  DNS, DDNS
Connection control - for connection-oriented transport protocols
End-to-end Flow Control (in contrast to link level flow control)
End-to-end Error Control (in contrast to link level error control)
Each output operation results in one UDP datagram, which causes one IP datagram to be sent.
Applications which use UDP: DNS, TFTP, BOOTP, DHCP, SNMP, NFS, VoIP, etc.
An advantage of UDP is that it is a base to build your own protocols on, especially if you don't need reliability and in-order delivery of lots of data.
UDP Header
8 byte header + possible data
  bits 0-15                      bits 16-31
  16 bit source port number      16 bit destination port number
  16 bit UDP length              16 bit UDP checksum
IP datagram = IP header (20 bytes) + UDP datagram
UDP datagram = UDP header (8 bytes) + UDP data (0 .. 2^16-(20+8+1) = 65,507 bytes)
[Figure: incoming datagrams delivered to Port 1, Port 2, ..., Port n]
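As a worked example of the header layout and the size limit, the sketch below fills in the 8-byte UDP header for a given payload size. It is illustrative only: the names are hypothetical, the checksum is left as zero (meaning "not computed" for UDP over IPv4), and a 20-byte IP header without options is assumed when deriving the maximum payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    IP_HEADER_LEN = 20,   /* assumed: no IP options */
    UDP_HEADER_LEN = 8,
    UDP_MAX_PAYLOAD = 65535 - IP_HEADER_LEN - UDP_HEADER_LEN   /* 65,507 bytes */
};

/*
 * Write the UDP header for payload_len bytes of data into out[0..7],
 * all fields in network byte order. Returns 0 on success and -1 if the
 * payload cannot fit into a single IP datagram.
 */
int udp_write_header(uint8_t *out, uint16_t src_port, uint16_t dst_port,
                     unsigned payload_len)
{
    assert(out != NULL);
    if (payload_len > UDP_MAX_PAYLOAD)
        return -1;

    unsigned udp_len = UDP_HEADER_LEN + payload_len;   /* header plus data */
    out[0] = (uint8_t)(src_port >> 8);
    out[1] = (uint8_t)(src_port & 0xFF);
    out[2] = (uint8_t)(dst_port >> 8);
    out[3] = (uint8_t)(dst_port & 0xFF);
    out[4] = (uint8_t)(udp_len >> 8);
    out[5] = (uint8_t)(udp_len & 0xFF);
    out[6] = 0;                                        /* checksum not computed here */
    out[7] = 0;
    return 0;
}
```

The UDP length field counts the header as well as the data, so a 20-byte payload is sent with a length of 28.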
a. Roughly 300 well-known port numbers remain unassigned and 38 reserved.
   Roughly 26k registered port numbers remain unassigned and 9 reserved.
For the purpose of providing services to unknown callers, a service contact port is defined. This list specifies the port used by the server process as its contact port. The contact port is sometimes called the "well-known port".
http://www.iana.org/assignments/port-numbers
Linux chooses the local port to use for TCP and UDP traffic from this range:
$ cat /proc/sys/net/ipv4/ip_local_port_range
1024 29999
IP header 20 bytes
Note there is no UDP header in the second fragment. Therefore, a frequent operation is to compute the path MTU before sending anything else. (see RFC 1191 for the table of common MTUs)
Fragmentation Required
If the datagram size > MTU and the DF (Don't Fragment) bit in the IP header is set, then the router sends an ICMP Unreachable Error (fragmentation needed). Of course this can be used to find the Path MTU.
Example under BSD
  bsdi% arp -a
  ARP cache is empty
  bsdi% sock -u -i -n1 -w8192 svr4 discard
tcpdump output (time, time since previous line, packet):
   1  0.0                  arp who-has svr4 tell bsdi
   2  0.001234  (0.0012)   arp who-has svr4 tell bsdi
   3  0.001941  (0.0007)   arp who-has svr4 tell bsdi
   4  0.002775  (0.0008)   arp who-has svr4 tell bsdi
   5  0.003495  (0.0007)   arp who-has svr4 tell bsdi
   6  0.004319  (0.0008)   arp who-has svr4 tell bsdi
   7  0.008772  (0.0045)   arp reply svr4 is-at 0:0:c0:c2:9b:26
   8  0.009911  (0.0011)   arp reply svr4 is-at 0:0:c0:c2:9b:26
   9  0.011127  (0.0012)   bsdi > svr4: (frag 10863:800@7400)
  10  0.011255  (0.0001)   arp reply svr4 is-at 0:0:c0:c2:9b:26
  11  0.012562  (0.0013)   arp reply svr4 is-at 0:0:c0:c2:9b:26
  12  0.013458  (0.0009)   arp reply svr4 is-at 0:0:c0:c2:9b:26
  13  0.014526  (0.0011)   arp reply svr4 is-at 0:0:c0:c2:9b:26
  14  0.015583  (0.0011)   arp reply svr4 is-at 0:0:c0:c2:9b:26
on a BSDI system:
  each of the additional (5) fragments caused an ARP request to be generated
    this violates the Host Requirements RFC, which tries to prevent ARP flooding by limiting the maximum rate to 1 per second
  when the ARP reply is received the last fragment is sent
    Host Requirements RFC says that ARP should save at least one packet and this should be the latest packet
  unexplained anomaly: the SVR4 system sent 7 ARP replies back!
  no ICMP "time exceeded during reassembly" message is sent
    BSD derived systems never generate this error! They do set the timer internally and discard the fragments, but never send an ICMP error.
    fragment 0 (which contains the UDP header) was not received, so there is no way to know which process sent the fragment; thus unless fragment 0 is received you are not required to send an ICMP "time exceeded during reassembly" error.
[...] a rare event)
The same error occurs even if you don't have fragmentation: simply sending multiple UDP datagrams rapidly when there is no ARP entry is sufficient!
  NFS sends UDP datagrams whose length just exceeds 8192 bytes
  NFS will time out and resend; however, there will always be this behavior if the ARP cache has no entry for this destination!
Still a problem?
A UDP datagram with an 8192-byte payload sent to the echo port, as seen on SuSE 9.2 Linux 2.6.8-24:
  No.  Time      Source        Destination   Protocol  Info
  37   3.020002  172.16.33.16  Broadcast     ARP       Who has 172.16.33.5? Tell 172.16.33.16
  38   3.021385  172.16.33.5   172.16.33.16  ARP       172.16.33.5 is at 00:40:8c:24:37:f4
  39   3.021422  172.16.33.16  172.16.33.5   IP        Fragmented IP protocol (proto=UDP 0x11, off=4440)
  40   3.021452  172.16.33.16  172.16.33.5   IP        Fragmented IP protocol (proto=UDP 0x11, off=5920)
  41   3.021480  172.16.33.16  172.16.33.5   IP        Fragmented IP protocol (proto=UDP 0x11, off=7400)
3.021385 - 3.020002 = 0.001383 sec., i.e., 1.383 ms for the ARP reply.
All but the last 3 fragments are dropped! This includes the initial echo request packet, so for the fragments that do arrive you don't know who they are for, because the first fragment was lost!
The initial UDP Echo request is still lost! The key parameter is /proc/sys/net/ipv4/neigh/ethX/unres_qlen where X is the interface (i.e., eth0, eth1, ...); the default value is 3.
two limits:
  sockets API limits the size of the send and receive buffers; generally 8 kbytes, but you can call a routine to change this
  TCP/IP implementation - Stevens found various limits to the sizes, even with the loopback interface (see Stevens, Vol. 1, pg. 159)
Hosts are required to handle at least 576 byte IP datagrams, thus lots of protocols limit themselves to 512 bytes or less of data:
  DNS, TFTP, BOOTP, and SNMP
Datagram truncation
What if the application is not prepared to read the datagram of the size sent? Implementation dependent:
  traditional Berkeley: silently truncate
  4.3BSD and Reno: can notify the application that the data was truncated
  SVR4: excess data returned in subsequent reads - application is not told that this all comes from one datagram
  TLI: sets a flag that more data is available, subsequent reads return the rest of the datagram
Socket API
int socket(int domain, int type, int protocol);
  creates an endpoint description that you can use to send and receive network traffic
int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);
  connect the local socket to a remote socket
int close(int fd) and int shutdown(int s, int how) - end it all!
Brian "Beej" Hall, Beej's Guide to Network Programming: Using Internet Sockets, 04/08/2004 07:22:02 PM
http://www.ecst.csuchico.edu/~beej/guide/net/
Two additional socket functions for controlling various properties of sockets are:
  int getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen);
  int setsockopt(int s, int level, int optname, const void *optval, socklen_t optlen);
For example, the options SO_SNDBUF and SO_RCVBUF control the size of the sending buffer and the receiving buffer.
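A minimal sketch of using these two calls to resize and then read back the send buffer of a socket. The helper name `set_send_buffer` is made up for this example; the value read back is whatever the kernel actually granted, which may differ from the request (Linux, for instance, reports a larger value than the one asked for).

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/*
 * Ask for a send buffer of the given size on socket fd and return the size
 * the kernel reports afterwards, or -1 if either call fails.
 */
int set_send_buffer(int fd, int bytes)
{
    assert(bytes > 0);
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) == -1)
        return -1;

    int actual = 0;
    socklen_t len = sizeof(actual);      /* in: size of actual, out: bytes written */
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &actual, &len) == -1)
        return -1;
    return actual;
}
```

SO_RCVBUF can be handled in exactly the same way by swapping the option name.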
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define bigBufferSize 8192
#define destination_host "172.16.33.5"

int main(int argc, char **argv)
{
    int client_socket_fd;             /* Socket to client, server */
    struct sockaddr_in server_addr;   /* server's address */
    char bigBuffer[bigBufferSize];    /* buffer of data to send as payload */
    int sendto_flags = 0;

    memset(bigBuffer, 0, sizeof(bigBuffer));  /* avoid sending uninitialized stack data */

    /* create a UDP socket */
    if ((client_socket_fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) == -1) {
        perror("Unable to open socket");
        exit(1);
    }

    /* initialize the server address structure */
    memset(&server_addr, 0, sizeof(server_addr));
    server_addr.sin_family = AF_INET;
    server_addr.sin_port = htons(9);  /* 9 is the UDP port number for Discard */
    /* inet_aton() takes a struct in_addr *, so no cast is needed */
    if (inet_aton(destination_host, &server_addr.sin_addr) == 0) {
        fprintf(stderr, "could not get an address for: %s", destination_host);
        exit(1);
    }

    if ((sendto(client_socket_fd, bigBuffer, bigBufferSize, sendto_flags,
                (struct sockaddr *)&server_addr, sizeof(server_addr))) == -1) {
        perror("Unable to send to socket");
        close(client_socket_fd);
        exit(1);
    }

    close(client_socket_fd);
    exit(0);
}
UDP server design
Stevens, Vol. 1, pp. 162-167 discusses how to program a UDP server
  You can often determine what IP address the request was sent to (i.e., the destination address): for example, thus ignoring datagrams sent to a broadcast address
  You can limit a server to a given incoming IP address: thus limiting requests to a given interface
  You can limit a server to a given foreign IP address and port: only accepting requests from a given foreign IP address and port #
  Multiple recipients per port (for implementations with multicasting support)
    setting SO_REUSEADDR socket option
    each process gets a copy of the incoming datagram
Note: limited size input queue to each UDP port, can result in silent discards without an ICMP message being sent back (since OS discarded, not the network!)
    int client_socket_fd;
    struct sockaddr_in client_addr;
    struct sockaddr_in other_addr;
    socklen_t other_addr_len = sizeof(other_addr);  /* must be initialized before recvfrom() */
    char bigBuffer[bigBufferSize];
    int sendto_flags = 0;
    ssize_t received;

    /* create a UDP socket */
    if ((client_socket_fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) == -1) {
        perror("Unable to open socket");
        exit(1);
    }

    memset(&client_addr, 0, sizeof(client_addr));   /* initialize address structure */
    client_addr.sin_family = AF_INET;
    client_addr.sin_port = htons(my_port);
    client_addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(client_socket_fd, (struct sockaddr *)&client_addr, sizeof(client_addr)) == -1) {
        close(client_socket_fd);
        exit(1);
    }

    /* leave room for a terminating null byte so the data can be printed as a string */
    if ((received = recvfrom(client_socket_fd, bigBuffer, bigBufferSize - 1, sendto_flags,
                             (struct sockaddr *)&other_addr, &other_addr_len)) == -1) {
        perror("Unable to receive from socket");
        close(client_socket_fd);
        exit(1);
    }
    bigBuffer[received] = '\0';

    printf("Received packet from %s:%d\nData: %s\nString length=%zu\n",
           inet_ntoa(other_addr.sin_addr), ntohs(other_addr.sin_port),
           bigBuffer, strlen(bigBuffer));

    close(client_socket_fd);
    exit(0);
}
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/udp.h>
#include <arpa/inet.h>

/* The declarations of s, addrfrom, addrto, outpack, ip, and udp are missing from
   the listing; the types and the buffer size used here are assumptions. */
int s;
struct sockaddr addrfrom, addrto;
unsigned char outpack[4096];
struct iphdr *ip;
struct udphdr *udp;

int main(int argc, char **argv)
{
    struct sockaddr_in *from;
    struct sockaddr_in *to;
    struct protoent *proto;
    int i;
    char *src, *dest;
    int srcp, destp;
    int packetsize, datasize;

    if (argc != 5) {
        fprintf(stderr, "Usage: %s src_addr src_port dst_addr dst_port\n", argv[0]);
        fprintf(stderr, "src_addr and dst_addr must be given as IP addresses (xxx.xxx.xxx.xxx)\n");
        exit(2);
    }
    src = argv[1]; srcp = atoi(argv[2]);
    dest = argv[3]; destp = atoi(argv[4]);

    if (!(proto = getprotobyname("raw"))) { perror("getprotobyname(raw)"); exit(2); }
    if ((s = socket(AF_INET, SOCK_RAW, proto->p_proto)) < 0) { perror("socket"); exit(2); }

    /* the two "Incorrect ..." messages are truncated in the listing; wording assumed */
    memset(&addrfrom, 0, sizeof(struct sockaddr));
    from = (struct sockaddr_in *)&addrfrom;
    from->sin_family = AF_INET;
    from->sin_port = htons(srcp);
    if (!inet_aton(src, &from->sin_addr)) { fprintf(stderr, "Incorrect source address: %s\n", src); exit(2); }

    memset(&addrto, 0, sizeof(struct sockaddr));
    to = (struct sockaddr_in *)&addrto;
    to->sin_family = AF_INET;
    to->sin_port = htons(destp);
    if (!inet_aton(dest, &to->sin_addr)) { fprintf(stderr, "Incorrect destination address: %s\n", dest); exit(2); }

    packetsize = 0;
    /* build a UDP packet from scratch */
    ip = (struct iphdr *)outpack;
    ip->version = 4;      /* IPv4 */
    ip->ihl = 5;          /* IP header length: 5 words */
    ip->tos = 0;          /* no special type of service */
    ip->id = 0;           /* no ID */
    ip->frag_off = 0;     /* not a fragment - so there is no offset */
    ip->ttl = 0x40;       /* TTL = 64 */
    if (!(proto = getprotobyname("udp"))) { perror("getprotobyname(udp)"); exit(2); }
    ip->protocol = proto->p_proto;
    ip->check = 0;        /* null checksum, will be automatically computed by the kernel */
    ip->saddr = from->sin_addr.s_addr;   /* set source and destination addresses */
    ip->daddr = to->sin_addr.s_addr;
    /* end of ip header */
    packetsize += ip->ihl << 2;

    /* udp header: pointer arithmetic on the buffer, not on pointers cast to int */
    udp = (struct udphdr *)(outpack + (ip->ihl << 2));
    udp->source = htons(srcp);
    udp->dest = htons(destp);
    udp->check = 0;       /* ignore UDP checksum */
    packetsize += sizeof(struct udphdr);
    /* end of udp header */

    /* add data to UDP payload if you want: */
    for (datasize = 0; datasize < 8; datasize++) {
        outpack[packetsize + datasize] = 'A' + datasize;
    }
    packetsize += datasize;
    udp->len = htons(sizeof(struct udphdr) + datasize);
    ip->tot_len = htons(packetsize);

    if (sendto(s, (char *)outpack, packetsize, 0, &addrto, sizeof(struct sockaddr)) == -1) {
        perror("sendto");
        exit(2);
    }
    return 0;
}
ICMP Source Quench Error
Since UDP has no flow control, a node could receive datagrams faster than it can process them. In this situation the host may send an ICMP source quench. Note: "may be generated" - it is not required to generate this error.
Stevens (Vol. 1, pp. 160-161) gives the example of sending 100 1024-byte datagrams from a machine on an ethernet via a router and SLIP line to another machine:
[Figure: source on the ethernet, a router, and the destination reached over a SLIP link]
  SLIP link is ~1000 times slower than the ethernet
  26 datagrams are transmitted, then a source quench is sent for each successive datagram
  the router gets all 100 packets before the first has been sent across the link!
  the new Router Requirements RFC says that routers should not generate source quench errors, since it just consumes network bandwidth and it is an ineffective and unfair fix for congestion
In any case, the sending program never responded to the source quench errors!
  BSD implementations ignore received source quenches if the protocol is UDP
  the program finished before the source quench was received!
Thus if you want reliability you have to build it in and do end-to-end flow control, error checking, and use (and thus wait for) acknowledgements.
No error control
Since UDP has no error control, the sender has to take responsibility for sending the datagram again if this datagram must be delivered. But how does the sender know if a datagram was successfully delivered?
  Unless the receiver sends a reply (or does some action due to receiving a given datagram) the sender will not know! [loss]
  Note that without some additional mechanism the sender doesn't know if the datagram was delivered multiple times! [duplicates]
If you want reliability you have to build your own protocol on top of UDP to achieve it. This includes deciding on your own retransmission scheme, timeouts, etc.
BOOTP message fields (as recovered from the figure):
  number of seconds | unused
  client IP address
  your IP address
  server IP address
  gateway IP address
  client hardware address (16 bytes of space)
  server hostname (64 bytes)
  Boot file name (128 bytes)
  Vendor specific information (64 bytes)
BOOTP continued
When a request is sent as an IP datagram:
  if the client does not know its IP address it uses 0.0.0.0
  if it does not know the server's address it uses 255.255.255.255
  if the client does not get a reply, it tries again in about 2 sec.
Server Identifier - used in DHCPOFFER and DHCPREQUEST (optionally in DHCPACK and DHCPNAK) messages. Servers include this in the DHCPOFFER to allow the client to distinguish between lease offers. DHCP clients indicate which of several lease offers is being accepted by including this in a DHCPREQUEST message. (tag=54)
Parameter Request List - used by a DHCP client to request values for specified configuration parameters. The client may list options in order of preference. The DHCP server must try to insert the requested options in the order requested by the client. (tag=55)
Message - used by a server to provide an error message to the client in a DHCPNAK message in the event of a failure. A client may use this in a DHCPDECLINE message to indicate the reason why the client declined the offered parameters. (tag=56)
Maximum DHCP Message Size - specifies the maximum length DHCP message that it is willing to accept. A client may use the maximum DHCP message size option in DHCPDISCOVER or DHCPREQUEST messages, but should not use the option in DHCPDECLINE messages. (tag=57)
Renewal (T1) Time Value - specifies the time interval from address assignment until the client transitions to the RENEWING state. (tag=58)
Rebinding (T2) Time Value - specifies the time interval from address assignment until the client transitions to the REBINDING state. (tag=59)
Class-identifier - used by DHCP clients to optionally identify the type and configuration of a DHCP client. Vendors and sites may choose to define specific class identifiers to convey particular configuration or other identification information about a client. Servers not equipped to interpret the class-specific information sent by a client must ignore it (although it may be reported). (tag=60)
Client-identifier - used by DHCP clients to specify their unique identifier. DHCP servers use this value to index their database of address bindings. This value is expected to be unique for all clients in an administrative domain. (tag=61)
DHCP's importance
  allows reuse of addresses, which avoids having to tie up addresses for systems which are not currently connected to the Internet
  avoids user configuration of IP address (avoids mistakes and effort)
  allows recycling of an IP address when devices are scrapped
How big a problem is manual configuration?
  A large site (such as DuPont Co., a large chemical company) has over 65,000 IP addressable devices; or consider what happens if each of the 815,000 Wal-Mart employees has an IP device
Address management software Product Vendor URL
The result is that a DHCP request can be answered in less than 100ms.
Example of dhcpd.conf
### Managed by Linuxconf, you may edit by hand.
### Comments may not be fully preserved by linuxconf.
server-identifier dhcptest1;
default-lease-time 1000;
max-lease-time 2000;
option domain-name "3ctechnologies.se";
option domain-name-servers 130.237.12.2;
option host-name "s1.3ctechnologies.se";
option routers 130.237.12.2;
option subnet-mask 255.255.255.0;
subnet 130.237.12.0 netmask 255.255.255.0 {
    range 130.237.12.3 130.237.12.200;
    default-lease-time 1000;
    max-lease-time 2000;
}
subnet 130.237.11.0 netmask 255.255.255.0 {
    range 130.237.11.3 130.237.11.254;
    default-lease-time 1000;
    max-lease-time 2000;
}
The IETF's Dynamic Host Configuration (dhc) Working group http://www.ietf.org/html.charters/dhc-charter.html is working on addressing the issues concerning interaction between DHCP and DNS.
The TFTP server (tftpd) is generally run setrooted (i.e., it only has access to its own directory) and with a special user and group ID, since there is no password or other protection of the access to files via TFTP!
TFTP request is sent to the well-known port number (69/udp)
TFTP server uses an unused ephemeral port for its replies
  since a TFTP transfer can last for quite some time, it uses another port; thus freeing up the well-known port for other requests
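TFTP messages (each carried in a UDP datagram after a 20 byte IP header and an 8 byte UDP header):
  opcode 1=RRQ, 2=WRQ (2 bytes) | filename (N bytes) | 0 | mode (N bytes) | 0
  opcode 3=data (2 bytes)       | block number (2 bytes) | data (0-512 bytes)
  opcode 4=ACK (2 bytes)        | block number (2 bytes)
  opcode 5=error (2 bytes)      | error number (2 bytes) | error message (N bytes) | 0
Figure 30: TFTP messages (see Stevens, Vol. 1, figure 15.1, pg. 210)
Filename and Mode (netascii or octet) are both N byte sequences terminated by a null byte.
Widely used for bootstrapping diskless systems (such as X terminals) and for dumping the configuration of routers (this is where the write request is used)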
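The message formats are simple enough to build by hand. The sketch below assembles a read or write request and an ACK into a caller-supplied buffer; it is an illustration with hypothetical function names, using the opcode values of RFC 1350 (1 = read request, 2 = write request, 4 = ACK), and it does not send anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Build a request: 2-byte opcode, filename, null byte, mode, null byte.
 * Returns the message length, or 0 if it does not fit into cap bytes.
 */
size_t tftp_build_request(uint8_t *buf, size_t cap, uint16_t opcode,
                          const char *filename, const char *mode)
{
    assert(buf != NULL && filename != NULL && mode != NULL);
    size_t flen = strlen(filename);
    size_t mlen = strlen(mode);
    size_t need = 2 + flen + 1 + mlen + 1;
    if (need > cap)
        return 0;

    buf[0] = (uint8_t)(opcode >> 8);           /* opcode in network byte order */
    buf[1] = (uint8_t)(opcode & 0xFF);
    memcpy(buf + 2, filename, flen + 1);       /* copies the terminating null too */
    memcpy(buf + 2 + flen + 1, mode, mlen + 1);
    return need;
}

/* Build an ACK: opcode 4 followed by the 2-byte block number. */
size_t tftp_build_ack(uint8_t *buf, size_t cap, uint16_t block)
{
    assert(buf != NULL);
    if (cap < 4)
        return 0;
    buf[0] = 0;
    buf[1] = 4;
    buf[2] = (uint8_t)(block >> 8);
    buf[3] = (uint8_t)(block & 0xFF);
    return 4;
}
```

A read request for a file with an 8-character name (the name `temp.foo` is just an example) in netascii mode comes to 2 + 8 + 1 + 8 + 1 = 20 bytes of UDP payload.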
[Figure: name (ccslab1.kth.se), IP address (32 bit, 130.237.15.254), and HW address (48 bit); labels: Addr. Resolution, RARP]
Uses UDP (for query) and TCP (zone transfer and large record query)
Zones
A zone is a subtree of the DNS tree which is managed separately. Each zone must have multiple name servers:
  a primary name server for the zone
    gets its data from disk files (or other stable store)
    must know the IP address of one or more root servers
To find a server you may have to walk the tree up to the root or possibly from the root down (but the latter is not friendly).
Maguire
[email protected]
Zones
2005.04.24
DNS Message format
[Figure: DNS message header, drawn 32 bits (0-31) wide with 16-bit fields; the fields shown include Parameters, Number of Answers, and Number of Additional]
Bits of the Parameters field:
    Operation: 0=Query, 1=Response
    Query type: 0=standard, 1=inverse
    Set if answer is authoritative
    Set if answer is truncated
    Set if recursion is desired
    Set if recursion is available
    reserved
    Response type: 0=No error, 1=Format error in query, 2=Server failure, 3=Name does not exist
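As a rough illustration of the parameters field, the following Python sketch packs and unpacks a DNS header. The 12-byte layout (identification, parameters, and four 16-bit counts) and the exact bit positions are assumed from RFC 1035; the function names are made up for this example.

```python
import struct

def build_query_header(ident: int, recursion_desired: bool = True) -> bytes:
    """Header of a standard query carrying one question (layout per RFC 1035)."""
    operation = 0    # 0 = query
    query_type = 0   # 0 = standard
    params = (operation << 15) | (query_type << 11) | (int(recursion_desired) << 8)
    # identification, parameters, questions, answers, authority, additional
    return struct.pack("!HHHHHH", ident, params, 1, 0, 0, 0)

def decode_parameters(params: int) -> dict:
    """Split the 16-bit parameters field into the bits listed above."""
    return {
        "response": bool(params >> 15 & 1),
        "query_type": params >> 11 & 0xF,
        "authoritative": bool(params >> 10 & 1),
        "truncated": bool(params >> 9 & 1),
        "recursion_desired": bool(params >> 8 & 1),
        "recursion_available": bool(params >> 7 & 1),
        "response_type": params & 0xF,
    }

header = build_query_header(0x1234)
print(header.hex())
print(decode_parameters(0x8583))  # an authoritative "name does not exist" response
```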
Domain   Description
com      commercial organizations
edu      educational organizations
gov      other U.S. government organizations (see RFC 1811 for policies)
int      international organizations
mil      U.S. Military
net      networks
org      other organizations
arpa     special domain for address to name mappings, e.g., 5.215.237.130.in-addr.arpa
ae       United Arab Emirates
se       Sweden
zw       Zimbabwe
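The address-to-name mapping under in-addr.arpa simply reverses the octets of the IP address. A minimal Python sketch, using the standard ipaddress module and the address implied by the example above (130.237.215.5):

```python
import ipaddress

def reverse_name(address: str) -> str:
    """Return the in-addr.arpa (or ip6.arpa) name used for a PTR lookup."""
    return ipaddress.ip_address(address).reverse_pointer

print(reverse_name("130.237.215.5"))
```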
CORE (Council of Registrars) - operational organization composed of authorized Registrars for managing allocations under gTLDs.
WIPO provides arbitration concerning names:
http://arbiter.wipo.int/domains/gtld/newgtld.html
Domain registrars
Internet Corporation for Assigned Names and Numbers (ICANN) Accredited Registrars, the full list is at http://www.icann.org/registrars/accredited-list.html Even more registrars are on their way to being accredited and operating!
Address:
    II-Stiftelsen, Sehlstedtsgatan 7, SE-115 28 Stockholm, Sweden; [email protected]; +46 8 56849050; +46 8 50618470
    Network Information Centre Sweden (NIC-SE), Box 5774, SE-114 87 Stockholm, Sweden; [email protected]; +46 8 54585700; +46 8 54585729
Resource Records (RR)
See Stevens, Vol. 1, figure 14.2, pg. 201 (augmented with additional entries)
Record type  Description
A            an IP address. Defined in RFC 1035.
AAAA         an IPv6 address. Defined in RFC 1886.
PTR          pointer record in the in-addr.arpa format. Defined in RFC 1035.
CNAME        canonical name alias (in the format of a domain name). Defined in RFC 1035.
HINFO        Host information. Defined in RFC 1035.
MX           Mail eXchange record. Defined in RFC 1035.
NS           authoritative Name Server (gives authoritative name server for this domain). Defined in RFC 1035.
TXT          other attributes. Defined in RFC 1035.
AFSDB        AFS Data Base location. Defined in RFC 1183.
ISDN         ISDN. Defined in RFC 1183.
KEY          Public key. Defined in RFC 2065.
KX           Key Exchanger. Defined in RFC 2230.
LOC          Location. Defined in RFC 1876.
MG           mail group member. Defined in RFC 1035.
MINFO        mailbox or mail list information. Defined in RFC 1035.
MR           mail rename domain name. Defined in RFC 1035.
NULL         null RR. Defined in RFC 1035.
NS           Name Server. Defined in RFC 1035.
NSAP         Network service access point address. Defined in RFC 1348. Redefined in RFC 1637. Redefined in RFC 1706.
NXT          Next. Defined in RFC 2065.
PX           Pointer to X.400/RFC822 information. Defined in RFC 1664.
RP           Responsible Person. Defined in RFC 1183.
RT           Route Through. Defined in RFC 1183.
SIG          Cryptographic signature. Defined in RFC 2065.
SOA          Start Of Authority. Defined in RFC 1035.
SRV          Server. DNS Server resource record -- RFC 2052, for use with DDNS.
TXT          Text. Defined in RFC 1035.
WKS          Well-Known Service. Defined in RFC 1035.
X25          X25. Defined in RFC 1183.
Note that a number of the RR types above are for experimental use.
Name of an organization:
ISI.EDU. PTR 0.0.9.128.IN-ADDR.ARPA.
Network names
Conventions:
    it.kth.se includes all the computers in the KTH/SU IT-University
    kth.se includes all the computers at KTH
As resource records:
> set querytype=any
> kth.se
Non-authoritative answer:
kth.se      internet address = 130.237.72.201
kth.se
            origin = kth.se
            mail addr = hostmaster.kth.se
            serial = 2002011500
            refresh = 3600 (1H)
            retry = 600 (10M)
            expire = 604800 (1W)
            minimum ttl = 86400 (1D)
kth.se      nameserver = kth.se
kth.se      nameserver = nic.lth.se
kth.se      nameserver = ns.kth.se
Authoritative answers can be found from:
kth.se      nameserver = kth.se
kth.se      nameserver = nic.lth.se
kth.se      nameserver = ns.kth.se
kth.se      internet address = 130.237.x.y
nic.lth.se  internet address = 130.235.z.w
ns.kth.se   internet address = 130.237.m.n
ARPANET.ARPA.     PTR  0.0.0.10.IN-ADDR.ARPA.
isi-net.isi.edu.  PTR  0.0.9.128.IN-ADDR.ARPA.
Example:
$ORIGIN it.kth.se.
@   1D IN SOA    bbbb hostmaster (
                 2002012001 ; serial
                 8H         ; refresh
                 2H         ; retry
                 2W         ; expiry
                 8H )       ; minimum
    1D IN NS     ns.ele.kth.se.
    1D IN NS     ns.kth.se.
    1D IN MX     0 mail
    1D IN A      130.237.x.y
    1D IN AFSDB  1 xxxx
    1D IN AFSDB  1 yyyy
    1D IN AFSDB  1 zzzz
MX information
> set querytype=MX
> kth.se
kth.se        preference = 0, mail exchanger = mail1.kth.se
kth.se        nameserver = kth.se
kth.se        nameserver = nic.lth.se
kth.se        nameserver = ns.kth.se
mail1.kth.se  internet address = 130.237.32.62
kth.se        internet address = 130.237.72.201
nic.lth.se    internet address = 130.235.20.3
ns.kth.se     internet address = 130.237.72.200
Internet Addresses: A second address for your host? To have multiple addresses for your computer, see the section on ifconfig.
Hostinfo (HINFO)
> set querytype=HINFO
> kth.se
kth.se    CPU = sun-4/60    OS = unix
Entry:
owner  TTL  class  RR type  value           comment
xxxx   1D   IN     HINFO    "PC" "FLINUX"   ; CPU OS
       1D   IN     A        130.237.x.y
For more information see: RFC 1464: Using the Domain Name System To Store Arbitrary String Attributes
Configuring DNS
Configuring the BIND resolver
/etc/resolv.conf
Root servers
<letter>.ROOT-SERVERS.NET (see http://www.root-servers.org/)
Server  Operator                            IP address(es)                     Location(s)
A       Verisign Global Registry Services   198.41.0.4                         Herndon, VA, US
B       EP.Net                              192.228.79.201, 2001:478:65::53    Marina del Rey, CA, US
C       Cogent Communications               192.33.4.12                        Herndon VA; Los Angeles; New York City; Chicago
D       University of Maryland              128.8.10.90                        College Park, MD, US
E       NASA Ames Research Centre           192.203.230.10                     Mountain View, CA, US
F       Internet Systems Consortium (ISC)   192.5.5.241, 2001:500::1035        (see list below)
G       US Department of Defence            192.112.36.4
H       US Army Research Lab                128.63.2.53, 2001:500:1::803f:235
I       Autonomica/NORDUnet                 192.36.148.17
J       Verisign Global Registry Services   192.58.128.30                      Herndon, VA, US
K                                           193.0.14.129, 2001:7fd::1          (see list below)
L                                           198.32.64.12
M                                           202.12.27.33, 2001:dc3::35
Home ASN: 19836
F locations: Ottawa; Palo Alto; San Jose CA; New York City; San Francisco; Madrid; Hong Kong; Los Angeles; Rome; Auckland; Sao Paulo; Beijing; Seoul; Moscow; Taipei; Dubai; Paris; Singapore; Brisbane; Toronto; Monterrey; Lisbon; Johannesburg; Tel Aviv; Jakarta; Munich; Osaka; Prague
K locations: London (UK); Amsterdam (NL); Frankfurt (DE); Athens (GR); Doha (QA); Milan (IT); Reykjavik (IS); Helsinki (FI); Geneva (CH); Poznan (PL); Budapest (HU)
Location        IPv4/IPv6       Node Type
Auckland        IPv4            Local Node
Brisbane        IPv4            Local Node
Jakarta         IPv4            Local Node
Hong Kong       IPv4            Local Node
Osaka           IPv4 and IPv6   Local Node
Beijing         IPv4            Local Node
Seoul           IPv4 and IPv6   Local Node
Taipei          IPv4            Local Node
Americas
Sao Paulo       IPv4            Local Node
Los Angeles     IPv4 and IPv6   Local Node
New York        IPv4 and IPv6   Local Node
Monterrey       IPv4            Local Node
Palo Alto       IPv4 and IPv6   Global Node
San Francisco   IPv4 and IPv6   Global Node
Redwood City    IPv4 and IPv6   Local Node
San Jose        IPv4            Local Node
Ottawa          IPv4 and IPv6   Local Node
Toronto         IPv4            Local Node
Rest of World
Paris           IPv4 and IPv6   Local Node
Dubai           IPv4            Local Node
Johannesburg    IPv4            Local Node
Lisbon          IPv4 and IPv6   Local Node
Madrid          IPv4            Local Node
Munich          IPv4 and IPv6   Local Node
Prague          IPv4            Local Node
Rome            IPv4            Local Node
Moscow          IPv4            Local Node
Tel Aviv        IPv4            Local Node
Where is f.root-servers.net ?
traceroute to f.root-servers.net (192.5.5.241), 30 hops max, 40 byte packets
 1  net212a.imit.kth.se (130.237.212.2)
 2  cn4-kf4-p2p.gw.kth.se (130.237.211.205)
 3  ea4-cn4-p2p.gw.kth.se (130.237.211.102)
 4  kth2-ea4-b.gw.kth.se (130.237.211.241)
 5  stockholm4-SRP2.sunet.se (130.242.85.66)
 6  130.242.82.49 (130.242.82.49)
 7  se-kth.nordu.net (193.10.252.177)
 8  so-1-0.hsa2.Stockholm1.Level3.net (213.242.69.21)
 9  so-4-1-0.mp2.Stockholm1.Level3.net (213.242.68.205)
10  as-1-0.bbr2.London1.Level3.net (212.187.128.25)
11  as-0-0.bbr1.NewYork1.Level3.net (4.68.128.106)
12  ae-0-0.bbr2.SanJose1.Level3.net (64.159.1.130)
13  so-14-0.hsa4.SanJose1.Level3.net (4.68.114.158)
14  ISC-Level3-fe.SanJose1.Level3.net (209.245.146.219)
15  f.root-servers.net (192.5.5.241)
Where is i.root-servers.net ?
traceroute to i.root-servers.net (192.36.148.17), 30 hops max, 40 byte packets
 1  net212a.imit.kth.se (130.237.212.2)  1 ms  0 ms  0 ms
 2  cn4-kf4-p2p.gw.kth.se (130.237.211.205)  1 ms  1 ms  4 ms
 3  cn5-cn4-p2p.gw.kth.se (130.237.211.201)  4 ms  4 ms  4 ms
 4  kth1-cn5-p2p.gw.kth.se (130.237.211.41)  4 ms  4 ms  4 ms
 5  stockholm3-SRP2.sunet.se (130.242.85.65)  4 ms  4 ms  4 ms
 6  ge-2-2.cyb-gw.sth.netnod.se (194.68.123.73)  2 ms  1 ms  1 ms
 7  ge-2-1.icyb-gw.sth.netnod.se (192.36.144.190)  1 ms  1 ms  2 ms
 8  srp-1-1.ibyb-gw.sth.netnod.se (192.36.144.235)  1 ms  1 ms  1 ms
 9  ge-0-0.r1.sth.dnsnode.net (194.146.105.187)  1 ms  2 ms  2 ms
10  i.root-servers.net (192.36.148.17)  3 ms  3 ms  2 ms
Dynamic Domain Name System (DDNS)
[Figure: a name mapped to the IP-addresses 130.237.x.1, 130.237.x.2, 130.237.x.3, 130.237.x.4, and a c/o address <<< we can update this dynamically]
DDNS
RFC 2136: Dynamic Updates in the Domain Name System (DNS UPDATE)
add or delete resource records
RFC 2052: A DNS RR for specifying the location of services (DNS SRV)
When a SRV-cognizant web-browser wants to retrieve http://www.asdf.com/ it does a lookup of http.tcp.www.asdf.com
Summary
In this lecture we have discussed:
    UDP
    BOOTP
    DHCP
    DNS, DDNS
References
[22] Joe Abley, f.root-servers.net, NZNOG 2005, February 2005, Hamilton, NZ
http://www.isc.org/pubs/pres/NZNOG/2005/F%20Root%20Server.pdf
2G1305 Internetworking/Internetteknik Spring 2005, Period 4 Module 5: TCP, HTTP, RPC, NFS, X
Lecture notes of G. Q. Maguire Jr.
For use in conjunction with TCP/IP Protocol Suite, by Behrouz A. Forouzan, 3rd Edition, McGraw-Hill. For this lecture: Chapter 12
1998, 1999, 2000, 2002, 2003, 2005 G.Q.Maguire Jr. All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the author. Last modified: 2005.04.26:13:15
TCP.fm5
2005.04.26
Lecture 4: Outline
    TCP
    HTTP
    Web enabled devices
    RPC, XDR, and NFS
    X Window System, and
    some tools for looking at these protocols
TCP applications write 8-bit bytes to a stream and read bytes from a stream.
TCP decides how much data to send (not the application) - each unit is a segment.
There are no records (or record markers) - just a stream of bytes: the receiver can't tell how much the sender wrote into the stream at any given time.
Applications which use TCP
Lots of applications have been implemented on top of TCP, such as:
    TELNET - provides a virtual terminal {emulation}
    FTP - used for file transfers
    SMTP - forwarding e-mail
    HTTP - transports data in the World Wide Web
Here we will focus on some features not covered in the courses: Telesys, gk and Data and computer communication.
TCP Header
[Figure: TCP header layout, drawn 32 bits (0-31) wide; it includes a reserved field (6 bits) followed by the six flag bits URG, ACK, PSH, RST, SYN, FIN]
Just as with UDP, TCP provides de/multiplexing via the 16 bit source and destination ports. The 4 bit header length indicates the number of 4-byte words in the TCP header
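To make the field layout concrete, here is a small Python sketch that unpacks the fixed 20-byte part of a TCP header. The field order (ports, sequence number, acknowledgement number, header length and flags, window, checksum, urgent pointer) is assumed from the standard header layout; the SYN segment and its values are invented for the example.

```python
import struct

FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]  # bit 0 (lowest) to bit 5

def parse_tcp_header(segment: bytes) -> dict:
    """Unpack the fixed part of a TCP header (options are not decoded)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # The 4-bit header length counts 4-byte words.
        "header_len": (offset_flags >> 12) * 4,
        "flags": [name for bit, name in enumerate(FLAG_NAMES) if offset_flags >> bit & 1],
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }

# A made-up SYN from port 1114 to port 80 with no options (header length = 5 words).
syn = struct.pack("!HHIIHHHH", 1114, 80, 1000, 0, (5 << 12) | 0x02, 4096, 0, 0)
info = parse_tcp_header(syn)
print(info)
```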
The TCP header is 20 bytes long when no options are present.
TCP resequences data at the receiving side: all the bytes are delivered in order to the receiving application.
TCP discards duplicate data at the receiving side.
Urgent pointer - specifies that the stream data is offset and that the data field begins with "urgent data" which is to bypass the normal stream - for example ^C
We will see how these bits are used as we examine each of them later.
The window size (or more exactly the receive window size (rwnd)) indicates how many bytes the receiver is prepared to receive (this number is relative to the acknowledgement number).
Options - as with IP there can be up to 40 bytes of options (we will cover these later)
Connection establishment
3-way handshake:
    Guarantees both sides are ready to transfer data
    Allows both sides to agree on initial sequence numbers
    client -> server: SYN, seq=x
    server -> client: SYN, seq=y, ACK=x+1
    client -> server: ACK=y+1
Initial sequence number (ISN) must be chosen so that each instance of a specific TCP connection between two end-points has a different ISN. The entity initiating the connection is (normally) called the "client".
SYN Flooding Attack
It is clear that if a malicious user simply sends a lot of SYN segments to a target machine (with faked source IP addresses), this machine will spend a lot of resources to set up TCP connections which subsequently never occur.
As the number of TCP control blocks and other resources are finite:
    legitimate connection requests can't be answered
    the target machine might even crash
The result is to deny service; this is one of many Denial of Service (DoS) attacks.
Connection teardown
    client -> server: FIN, seq=x (active close); EOF is sent to the server's application
    server -> client: ACK=x+1
    half-close: the server can still send data, which is ACKed by the client
    server -> client: FIN, seq=z (passive close)
    client -> server: ACK=z+1
Note: it takes 4 segments to complete the close. Normally, the client performs an active close and the server performs a passive close.
TCP options
These options are used to convey additional information to the destination or to align other options.
Single-byte Options
    No operation
    End of option
Multiple-byte Options
    Maximum segment size
    Window scale factor
    Timestamp
Maximum Segment Size
The Maximum Segment Size (MSS) is the largest amount of data TCP will send to the other side.
MSS can be announced in the options field of the TCP header during connection establishment.
If an MSS is not announced, a default value of 536 bytes is assumed.
In general, the larger the MSS the better -- until fragmentation occurs
    as when fragmentation occurs the overhead increases!
Sliding window Flow control
receiver: offered window - acknowledges data sent and what it is prepared to receive
    thus the receiver can send an ACK, but with an offered window of 0
    later the receiver sends a window update with a non-zero offered window size
    the receiver can increase or decrease this window size as it wants
[Figure: the sender's view of the window: data sent but unacknowledged, data that can be sent now, and data that can't be sent until the window advances]
Window size
Increasing window size can improve performance - more recent systems have increased buffer sizes ranging from 4096 to 16,384 bytes. The latter produces a ~40% increase in file transfer performance on an Ethernet.
The Socket API allows the user to change the size of the send and receive buffers.
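A minimal sketch of changing the buffer sizes through the sockets API, here via Python's socket module. The 16,384-byte request mirrors the figure quoted above; the kernel is free to round or adjust the value, so the code reads the result back instead of assuming it.

```python
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    # Request 16,384-byte send and receive buffers before connecting.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 16384)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 16384)
    # The operating system may grant a different size than requested.
    rcvbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    sndbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)

print("receive buffer:", rcvbuf, "send buffer:", sndbuf)
```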
Thus each keystroke not only generates a byte of data for the remote application, which has to be sent in a segment, but also triggers an ACK, an echo, and the ACK of the echo: 3 more segments! All to send a single byte of user data!!!
Silly Window Syndrome
If the receiver advertises a small window, then the sender will send a small amount of data, which fills the receiver's window, and so on.
To prevent this behavior, the sender does not transmit unless:
    a full-size segment can be sent OR
    it can send at least 1/2 of the maximum sized window that the other end has ever advertised OR
    we can send everything we have and are not expecting an ACK, or the Nagle algorithm is disabled
Nagle Algorithm
telnet/rlogin/... generate a packet (41 bytes) for each 1 byte of user data
    these small packets are called "tinygrams"
    not a problem on LANs
    adds to the congestion on WANs
Nagle Algorithm
    each TCP connection can have only one outstanding (i.e., unacknowledged) small segment (i.e., a tinygram)
    while waiting - additional data is accumulated and sent as one segment when the ACK arrives
    self-clocking: the faster ACKs come, the more often data is sent
        thus automatically on slow WANs fewer segments are sent
Round trip time on a typical Ethernet is ~16ms (for a single byte to be sent, acknowledged, and echoed) - thus to generate data faster than this would require typing faster than 60 characters per second! Thus Nagle will rarely be invoked on a LAN.
Disabling the Nagle Algorithm
But sometimes we need to send a small message without waiting (for example, handling a mouse event in the X Window System) - therefore we set TCP_NODELAY on the socket.
The Host Requirements RFC says that hosts must implement the Nagle algorithm, but there must be a way to disable it on individual connections.
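For illustration, this is how the option can be set per connection from Python; the helper name is made up, while TCP_NODELAY itself is the standard socket option named above.

```python
import socket

def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with the Nagle algorithm disabled."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A non-zero value disables Nagle for this connection only.
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return s

sock = make_low_latency_socket()
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
print("TCP_NODELAY enabled:", bool(nodelay))
```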
Delayed acknowledgements
Rather than sending an ACK immediately, TCP waits ~200ms hoping that there will be data in the reverse direction - thus enabling a piggybacked ACK.
    The Host Requirements RFC states the delay must be less than 500ms
    Implementations often use a periodic 200ms timer - rather than setting a timer specifically for computing this delay
        similar to the periodic 500ms timer used for detecting timeouts
Resulting bulk data flow
Every segment carries a full MSS worth of data. Typically there is an ACK for every other segment.
[Figure: data segments and ACKs exchanged between client and server]
Bandwidth-Delay Product
How large should the window be for optimal throughput? Calculate the capacity of the pipe as:
    capacity(bits) = bandwidth(bits/sec) * RTT(sec)
This is the size the receiver advertised window should have for optimal throughput.
Example:
    T1 connection across the US: capacity = 1.544 Mbit/s * 60 ms = 92,640 bits = 11,580 bytes
    Gigabit connection across the US: capacity = 1 Gbit/s * 60 ms = 60,000,000 bits = 7,500,000 bytes!
However, the window size field is only 16 bits, giving a maximum value of 65535 bytes.
For Long Fat Pipes we can use the window scale option to allow much larger window sizes.
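The two examples can be checked with a few lines of Python. The helper names are made up; the second function assumes the window scale option works as a left shift of the 16-bit window field by at most 14 bits.

```python
def pipe_capacity_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product, converted from bits to bytes."""
    return bandwidth_bps * rtt_s / 8

def window_scale_needed(capacity_bytes: float) -> int:
    """Smallest shift (0..14) for which a 16-bit window can cover the capacity."""
    shift = 0
    while (65535 << shift) < capacity_bytes and shift < 14:
        shift += 1
    return shift

t1 = pipe_capacity_bytes(1.544e6, 0.060)   # T1 across the US
gig = pipe_capacity_bytes(1e9, 0.060)      # Gigabit across the US
print(round(t1), window_scale_needed(t1))
print(round(gig), window_scale_needed(gig))
```

The T1 pipe fits in an unscaled window, while the gigabit pipe needs the window scale option.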
Congestion Avoidance
So far we have assumed that the sender is only limited by the receiver's available buffer space. But if we inject lots of segments into a network - up to the window size advertised by the receiver:
    works well if the hosts are on the same LAN
    if there are routers (i.e., queues) between them and if the traffic arrives faster than it can be forwarded, then either the packets have to be buffered or thrown away - we refer to this condition as congestion
Lost packets lead to retransmission by the sender.
This adds even more packets to the network, leading to network collapse.
Therefore we need to be able to reduce the window size to avoid congestion.
Congestion Control
We introduce a Congestion Window.
Thus the sender's window size will be determined both by the receiver and in reaction to congestion in the network.
Sender maintains 2 window sizes:
    Receiver-advertised window (rwnd)
        the advertised window is flow control imposed by the receiver
    Congestion window (CWND)
        the congestion window is flow control imposed by the sender
Actual window size = min(rwnd, CWND)
To deal with congestion, the sender uses several strategies:
    Slow start
    Additive increase of CWND
    Multiplicative decrease of CWND
Slow start In 1989, Van Jacobson introduced slow start based on his analysis of actual traffic and the application of control theory. All TCP implementations are now required to support slow start.
    the rate at which new packets should be injected into the network is the rate at which acknowledgements are returned
    cwnd starts at the number of bytes in one segment (as announced by the other end) and increases exponentially with each successfully received cwnd worth of data
Figure 33: Graphical plot of congestion window (cwnd) as the connection goes from slow start to congestion avoidance behavior (figure from Mattias Ronquist, TCP Reaction to Rapid Changes of the Link Characteristics due to Handover in a Mobile Environment, MS Thesis, Royal Institute of Technology, Teleinformatics, August 4, 1999.)
Round-Trip Time Measurement
Fundamental to TCP's timeout and retransmission is the measurement (M) of the round-trip time (RTT). As the RTT changes TCP should modify its timeouts. Originally TCP specified:

$$R \leftarrow \alpha R + (1 - \alpha) M$$
$$RTO = R \beta$$

where R is the smoothed RTT estimate, $\alpha$ is a smoothing factor, $\beta$ is a delay variance factor, and RTO == retransmission timeout time.
Van Jacobson found that this could not keep up with wide fluctuations in RTT, which leads to more retransmissions when the network is already loaded! So he proposed tracking the variance in RTT and gave formulas which compute the RTO based on the mean and variance in RTT and can be easily calculated using integer arithmetic (see Stevens, Vol. 1, pg. 300 for details).
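A rough Python sketch of an estimator in the style Jacobson proposed. The gains 1/8 and 1/4 and the factor 4 are the commonly used values; seeding the deviation with half of the first measurement is an assumption made here for simplicity, and floating point is used instead of the integer arithmetic of real implementations.

```python
class RtoEstimator:
    """Tracks a smoothed RTT and a smoothed mean deviation to derive the RTO."""

    def __init__(self, first_measurement: float):
        # Assumed seeding: deviation starts at half of the first measurement.
        self.srtt = first_measurement
        self.rttvar = first_measurement / 2

    def update(self, m: float) -> None:
        err = m - self.srtt                          # error of the current estimate
        self.srtt += err / 8                         # gain of 1/8 on the mean
        self.rttvar += (abs(err) - self.rttvar) / 4  # gain of 1/4 on the deviation

    def rto(self) -> float:
        return self.srtt + 4 * self.rttvar

est = RtoEstimator(1.0)
print(est.rto())
est.update(1.0)   # a second measurement equal to the estimate shrinks the deviation
print(est.rto())
```

Because the deviation term reacts to swings in the measurements, the RTO grows quickly when the RTT becomes erratic and shrinks again once it settles.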
Karn's algorithm
When a retransmission occurs, the RTO is backed off, the packet is retransmitted with the new longer RTO, and an ACK is received. But is it the ACK for the original transmission or for the retransmission?
    we don't know, thus we don't recalculate a new RTO until an ACK is received for a segment which was not retransmitted
Congestion Avoidance Algorithm
Slow start keeps increasing cwnd, but at some point we hit a limit due to intervening routers and packets start to be dropped.
The algorithm assumes that packet loss means congestion (see note 1). Signs of packet loss:
    a timeout occurring
    receipt of duplicate ACKs
Introduce another variable for each connection: ssthresh == slow start threshold
When data is acknowledged we increase cwnd:
    if cwnd < ssthresh we are doing slow start; increases continue until we are half way to where we hit congestion before
    else we are doing congestion avoidance; then increase cwnd by segsize*segsize/cwnd + segsize/8 each time an ACK is received (i.e., by 1/cwnd + 1/8 of a segment when cwnd is counted in segments)
(See Stevens, Vol. 1, figure 21.8, pg. 311 for a plot of this behavior)
1. Note: if your losses come from other causes (such as bit errors on the link) it will still think it is due to congestion!
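The window update rules can be sketched as two small Python functions. The per-ACK increase follows the rule above; the reaction to a timeout (ssthresh set to half the current window but at least two segments, cwnd reset to one segment) is not spelled out on the slide and is assumed here from RFC 2001. All sizes are in bytes and the function names are made up.

```python
def on_ack(cwnd: float, ssthresh: float, segsize: int) -> float:
    """Return the new cwnd after an ACK for new data."""
    if cwnd < ssthresh:
        # Slow start: one more segment per ACK, i.e. exponential growth per RTT.
        return cwnd + segsize
    # Congestion avoidance: roughly one more segment per round trip.
    return cwnd + segsize * segsize / cwnd + segsize / 8

def on_timeout(cwnd: float, rwnd: float, segsize: int) -> tuple:
    """Return (new cwnd, new ssthresh) after a retransmission timeout (assumed per RFC 2001)."""
    ssthresh = max(min(cwnd, rwnd) / 2, 2 * segsize)
    return segsize, ssthresh

cwnd, ssthresh, segsize = 1000, 4000, 1000
for _ in range(6):
    cwnd = on_ack(cwnd, ssthresh, segsize)
    print(cwnd)
```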
Van Jacobson's Fast Retransmit and Fast Recovery Algorithm
TCP is required to generate an immediate ACK (a duplicate ACK) when an out-of-order segment is received. This duplicate ACK should not be delayed. The purpose is to tell the sender that the segment arrived out of order and what segment number the receiver expects.
Cause: segments arriving out of order OR a lost segment
If a small number (three or more in a row) of duplicate ACKs are detected, assume that a segment has been lost; then retransmit the missing segment immediately (without waiting for a retransmission timeout) and perform congestion avoidance - but not slow start.
Why not slow start? Because the only way you could have gotten duplicate ACKs is if subsequent segments did arrive - which means that data is getting through.
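As an illustration of the duplicate ACK rule only (no congestion window handling), the following Python sketch scans a sequence of received acknowledgement numbers and reports where a fast retransmit would be triggered. The function name and the sample ACK values are invented.

```python
def fast_retransmit_points(acks, threshold: int = 3):
    """Return the ACK numbers whose third duplicate triggers a fast retransmit."""
    triggers = []
    last, duplicates = None, 0
    for ack in acks:
        if ack == last:
            duplicates += 1
            if duplicates == threshold:
                triggers.append(ack)   # retransmit the segment starting at 'ack'
        else:
            last, duplicates = ack, 0  # a new ACK resets the duplicate count
    return triggers

print(fast_retransmit_points([1000, 2000, 2000, 2000, 2000, 3000]))
```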
Per-Route Metrics
Newer TCPs keep the smoothed RTT, smoothed mean deviation, and slow start threshold in the routing table.
When a connection is closed, if there was enough data exchanged (defined as 16 windows full), then record the parameters
    i.e., with 16 RTT samples the smoothed RTT is accurate to ~5%
TCP Persist Timer If the window size is 0 and an ACK is lost, then receiver is waiting for data and sender is waiting for a non-zero window! To prevent deadlock, introduce a persist timer that causes sender to query the receiver periodically with window probes to find out if window size has increased. Window probes sent every 60 seconds - TCP never gives up sending them.
TCP Keepalive Timer
No data flows across an idle TCP connection - connections can persist for days, months, etc. Even if intermediate routers and links go down the connection persists!
However, some implementations have a keepalive timer.
The Host Requirements RFC gives 3 reasons not to use keepalive messages:
    they can cause perfectly good connections to be dropped during transient failures
    they consume unnecessary bandwidth
    they produce additional packet charges (if you are on a net that charges per packet)
The Host Requirements RFC says you can have a keepalive timer but:
    it must not be enabled unless an application specifically asks
    the interval must be configurable, with a default of no less than 2 hrs.
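Since keepalives must be requested by the application, a program that wants them sets a socket option. A minimal Python sketch follows (the helper name is made up; the keepalive interval is left at the system default):

```python
import socket

def enable_keepalive(sock: socket.socket) -> None:
    """Ask TCP to probe this connection when it has been idle for a long time."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(s)
keepalive_on = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
s.close()
print("keepalive enabled:", keepalive_on)
```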
TCP Performance
TCP's path MTU discovery:
    use min(MTU of outgoing interface, MSS announced by other end)
    use per-route saved MTU
    once an initial segment size is chosen - all packets have the don't fragment bit set
    if you get an ICMP "Can't fragment" message - recompute the Path MTU
    periodically check for the possibility of using a larger MTU (RFC 1191 recommends 10 minute intervals)
Long Fat Pipes
Networks with large bandwidth-delay products are called Long Fat Networks (LFNs) - pronounced "elefants".
TCP running over an LFN is a Long Fat Pipe.
    Window Scale option - to avoid the 16 bit window size limit
    Timestamp option - putting a time stamp in each segment allows better computation of RTT
    Protection Against Wrapped Sequence Numbers (PAWS) - with large windows you could have sequence number wrap around and not know which instance of a given sequence number is the correct one; solved by using timestamps (which must simply be monotonic)
    T/TCP - TCP extension for Transactions; to avoid the three way handshake on connection setup and shorten the TIME_WAIT state (for details of T/TCP see Stevens, Vol. 3)
Measuring TCP Performance
Measured performance:
    Performance on Ethernets at ~90% of theoretical value (using workstations)
    TCP over FDDI at 80-98% of theoretical value
    TCP (between two Crays) at 781 Mbits/sec over an 800 Mbit/sec HIPPI channel
    TCP at 907 Mbits/sec on a loopback interface of a Cray Y-MP
Practical limits:
    can't run faster than the slowest link
    can't go faster than the memory bandwidth of the slowest machine (since you have to touch the data at least once)
    you can't go faster than the window size offered by the receiver divided by the round trip time (comes from the calculation of the bandwidth delay product)
        thus with the maximum window scale factor (14) the window size is 65535 * 2^14 bytes, i.e., 1.073 Gbytes; just divide by the RTT to find the maximum bandwidth
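The last limit is easy to evaluate numerically. A short Python sketch, with made-up helper names and the 60 ms cross-country RTT used earlier as the example value:

```python
MAX_SCALED_WINDOW = 65535 << 14   # largest window with a scale factor of 14, in bytes

def max_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on throughput imposed by the offered window and the RTT."""
    return window_bytes * 8 / rtt_s

print(MAX_SCALED_WINDOW)                             # bytes
print(max_throughput_bps(65535, 0.060))              # unscaled window, 60 ms RTT
print(max_throughput_bps(MAX_SCALED_WINDOW, 0.060))  # fully scaled window, 60 ms RTT
```

With an unscaled window the bound at 60 ms is under 9 Mbit/s, which is why long fat pipes need the window scale option.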
1. Figure 7-3, from Mattias Ronquist, TCP Reaction to Rapid Changes of the Link Characteristics due to Handover in a Mobile Environment, MS Thesis, Royal Institute of Technology, Teleinformatics, August 4, 1999, p.38.
TCP servers
Stevens, Vol. 1, pp. 254-260 discusses how to design a TCP server, which is similar to the list of features discussed for a UDP server, but now it is incoming connection requests which are queued rather than UDP datagrams.
    note that incoming requests for connections which exceed the queue are silently ignored - it is up to the sender to time out its active open
    this limited queuing has been one of the targets of denial of service attacks
        TCP SYN Attack - see http://cio.cisco.com/warp/public/707/4.html
        Increase the size of the SYN_RCVD queue (the kernel variable somaxconn limits the maximum backlog on a listen socket - the backlog is the sum of both the SYN_RCVD and accept queues) and decrease the time you will wait for an ACK in response to your SYN_ACK
    for a nice HTTP server example, see http://www.cs.rice.edu/CS/Systems/Web-measurement/paper/node3.html
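From the application's side the queue length is requested with the backlog argument of listen(). A minimal Python sketch, assuming a loopback address and an operating-system-chosen port purely for illustration; how the backlog maps onto the SYN_RCVD and accept queues differs between operating systems.

```python
import socket

def make_listener(port: int = 0, backlog: int = socket.SOMAXCONN) -> socket.socket:
    """Create a listening TCP socket; port 0 lets the OS pick a free port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("127.0.0.1", port))
    # The kernel may silently cap the requested backlog at its own limit.
    s.listen(backlog)
    return s

listener = make_listener()
bound_port = listener.getsockname()[1]
print("listening on port", bound_port, "with requested backlog", socket.SOMAXCONN)
listener.close()
```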
HTTP described by an Internet Draft in 1993; replaced with RFC 1945, Hypertext Transfer Protocol -- HTTP/1.0, May 1996; RFC 2068, Hypertext Transfer Protocol -- HTTP/1.1, January 1997; replaced by RFC 2616, June 1999, RFC 2817 Upgrading to TLS Within HTTP/1.1, May 2000.
[Figure: format of an HTTP/1.0 request and an HTTP/1.0 response]
HTTP/1.0 request:
    request line
    headers (0 or more)
    <blank_line>
    body (only for a POST request)
HTTP Requests
request-line == request request-URI HTTP-version
Three requests:
    GET - returns information identified by the request URI
    HEAD - similar to GET but only returns header information
    POST - sends a body with a request; used for posting e-mail, news, sending a fill-in form, etc.
Universal Resource Identifiers (URIs) - described in RFC 1630, URLs in RFC 1738 and RFC 1808.
status-line == HTTP-version response-code response-phrase
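The request and status line formats can be illustrated with a short Python sketch. The function names and the example URI are made up; lines are terminated with CRLF as HTTP requires.

```python
def build_http_request(method: str, uri: str, headers=None, body: bytes = b"") -> bytes:
    """Request line, zero or more headers, a blank line, then an optional body."""
    lines = [f"{method} {uri} HTTP/1.0"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + body

def parse_status_line(line: str) -> tuple:
    """Split a status line into HTTP-version, response-code, response-phrase."""
    version, code, phrase = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), phrase

print(build_http_request("GET", "/index.html"))
print(parse_status_line("HTTP/1.0 404 Not Found\r\n"))
```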
HTTP/1.0 header names: Allow, Authorization, Content-Encoding, Content-Length, Content-Type, Date, Expires, From, If-Modified-Since, Last-Modified, Location, MIME-Version, Pragma, Referer, Server, User-Agent, WWW-Authenticate
Response code  Description
1yz            Informational. Not currently used
               Success
200            OK, request succeeded.
201            OK, new resource created (POST command)
202            Request accepted but processing not completed
204            OK, but no content to return
               Redirection; further action needs to be taken by user agent
301            Requested resource has been assigned a new permanent URL
302            Requested resource resides temporarily under a different URL
304            Document has not been modified (conditional GET)
               Client error
400            Bad request
401            Unauthorized; request requires user authentication
403            Forbidden for unspecified reason
404            Not found
               Server error
500            Internal server error
501            Not implemented
502            Bad gateway; invalid response from gateway or upstream server
503            Service temporarily unavailable
Client Caching Client can cache HTTP documents along with the date and time the document was fetched. If the document is cached, then the If-Modified-Since header can be sent to check if the document has changed since the copy was cached - thus saving a transfer but costing a round trip time and some processing time. This is called a conditional GET.
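A conditional GET only differs from a plain GET by one header. The Python sketch below builds such a request from the time a copy was cached; the function name and URI are made up, and the timestamp is given in seconds since the epoch.

```python
from email.utils import formatdate

def conditional_get(uri: str, cached_at: float) -> str:
    """Build a GET that asks for the document only if it changed since cached_at."""
    stamp = formatdate(cached_at, usegmt=True)   # HTTP dates are expressed in GMT
    return f"GET {uri} HTTP/1.0\r\nIf-Modified-Since: {stamp}\r\n\r\n"

print(conditional_get("/index.html", 0))
```

A server that still has the same version answers with response code 304 and no body, so the cached copy can be used.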
Server Redirect Response code 302, along with a new location of the request-URI.
Multiple simultaneous connections to server GET of a page with multiple objects on it (such as GIF images) - one new connection for each object, all but the first can occur in parallel!
Figure 35: Timeline of eight TCP connections (client ports 1114 through 1121, time in seconds) for a home page and seven GIF images (see Stevens, Vol. 3, figure 13.5, pg. 171)
Note that the port 1115, 1116, and 1117 requests start before 1114 terminates: Netscape can initiate 3 non-blocking connects after reading the end-of-file but before closing the first connection.
[Figure: performance versus the number of simultaneous connections (1 to 7)]
Why no improvement beyond 4?
    the program has an implementation limit of 4, even if you specify more!
    gains beyond 4 are probably small (given the small difference between 3 and 4) - but Stevens has not checked!
HTTP Statistics
Statistics for individual HTTP connections, given as median and mean values (see Stevens, Vol. 3, figure 13.7, pg. 172)
Logically:
    connect to www.it.kth.se
Reality!
    choose LAN adapter
    send ARP for name server or gateway MAC (request, reply)
    send DNS query (request, reply)
    send ARP for HTTP server's MAC or send via gateway MAC
    send TCP SYN to IT's IP address, via local router's MAC address
Adapted from "TCP/IP from the wire up" by Joe R. Doupnik, Novell's BrainShare99. http://netlab1.usu.edu/pub/bsuk99/
HTTP Performance Problems
HTTP opens one connection for each document.
  Each such connection involves slow start - which adds to the delay
  Each connection is normally closed by the HTTP server - which has to wait TIME_WAIT, thus lots of control blocks are waiting in the server.
Proposed changes:
  have client and server keep a TCP connection open {this requires that the size of the response (Content-Length) be generated}
    requires a change in client and server
    new header Pragma: hold-connection
  GETALL - causes server to return document and all in-lined images in a single response
  GETLIST - similar to a client issuing a series of GETs
  HTTP-NG (aka HTTP/1.1) - a single TCP connection with multiple sessions {it is perhaps the first TCP/IP session protocol}
HTTP/1.1 also has another feature - the server knows what hostname was in the request, thus a single server at a single IP address can be the HTTP server under many names, hence providing "Web hotel" services for many firms _but_ only using a single IP address.
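As a rough illustration of this name-based hosting, the sketch below reads the Host header of an HTTP/1.1 request and picks a document root for it. The host names, directory paths, and the function name select_site are hypothetical, and the parsing is deliberately minimal (for example, it does not handle IPv6 literals in the Host header).

```python
def select_site(request_text, sites, default="default"):
    """Return the site registered for the Host header of an HTTP/1.1 request."""
    lines = request_text.split("\r\n")
    for line in lines[1:]:              # skip the request line
        if line == "":                  # a blank line ends the headers
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            # Host names are case-insensitive; drop an optional ":port" suffix
            host = value.strip().lower().split(":")[0]
            return sites.get(host, sites[default])
    return sites[default]


# Hypothetical "Web hotel": several names served from one IP address
SITES = {
    "www.firm-a.example": "/var/www/firm-a",
    "www.firm-b.example": "/var/www/firm-b",
    "default": "/var/www/default",
}

request = "GET /index.html HTTP/1.1\r\nHost: www.firm-b.example\r\n\r\n"
print(select_site(request, SITES))      # prints /var/www/firm-b
```

The only input the server needs beyond the request line is the Host header, which is why one listening socket can serve all of the names.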
HTTP performance
Joe Touch, John Heidemann, and Katia Obraczka, "Analysis of HTTP Performance", USC/Information Sciences Institute, August 16, 1996, Initial Release, V1.2 -- http://www.isi.edu/lsam/publications/http-perf/
John Heidemann, Katia Obraczka, and Joe Touch, "Modeling the Performance of HTTP Over Several Transport Protocols", IEEE/ACM Transactions on Networking 5(5), October 1997. November, 1996. http://www.isi.edu/~johnh/PAPERS/Heidemann96a.html
Simon E Spero, "Analysis of HTTP Performance problems" http://sunsite.unc.edu/mdma-release/http-prob.html This is a nice introduction to HTTP performance.
John Heidemann, "Performance Interactions Between P-HTTP and TCP Implementations". ACM Computer Communication Review, 27(2), 65-73, April, 1997. http://www.isi.edu/~johnh/PAPERS/Heidemann97a.html
Splits the web server into a very tiny server on the device and more processing (via applets) in the desktop system (where the web browser is running).
Axis Communications AB - http://www.axis.com - produces many web-enabled devices - from thin clients to cameras running an embedded Linux
Remote Procedure Call (RPC)
Two versions:
  using Sockets API and works with TCP and UDP
  using TLI API - TI-RPC (Transport Independent) - and works with any transport layer provided by the kernel
Format of an RPC call message as a UDP datagram
  IP Header
  UDP Header
  transaction ID (XID)
  call (0)
  RPC version (2)
  program number
  version number
  procedure number
  credentials
  verifier
  procedure parameters
XID set by client and returned by server (client uses it to match requests and replies)
program number, version number, procedure number - identify the procedure to be called
credentials - identify the client - sometimes the user ID and group ID
verifier - used with secure RPC (to identify the server); uses DES encryption
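The call message layout can be sketched with Python's struct module. This is a minimal sketch, assuming each header field is a 32-bit big-endian integer and using the "no authentication" flavor (value 0, empty body) for both credentials and verifier; the function name build_rpc_call and the example XID are hypothetical.

```python
import struct

CALL = 0           # message type of a call
RPC_VERSION = 2
AUTH_NULL = 0      # assumption: "no authentication" flavor keeps the sketch short


def build_rpc_call(xid, program, version, procedure, params=b""):
    """Build the RPC part of a call message (what follows the UDP header)."""
    header = struct.pack(">6I", xid, CALL, RPC_VERSION, program, version, procedure)
    credentials = struct.pack(">2I", AUTH_NULL, 0)   # flavor, body length 0
    verifier = struct.pack(">2I", AUTH_NULL, 0)
    return header + credentials + verifier + params


# Procedure 0 of program 100000 (the port mapper), version 2, with no parameters
msg = build_rpc_call(xid=0x12345678, program=100000, version=2, procedure=0)
print(len(msg), msg[:4].hex())   # prints 40 12345678
```

The client would keep the XID and compare it with the first four bytes of each reply to match replies to outstanding requests.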
Format of an RPC reply message as a UDP datagram
  IP Header
  UDP Header
  transaction ID (XID)
  reply (1)
  status (0=accepted)
  verifier
  procedure results
External Data Representation (XDR)
used to encode values in RPC messages - see RFC 1014
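A minimal sketch of two XDR encodings, assuming the usual rules: an unsigned integer is four bytes in big-endian order, and a string is a four-byte length followed by the bytes, zero-padded to a multiple of four. The helper names are hypothetical.

```python
import struct


def xdr_uint(value):
    """Encode an unsigned 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">I", value)


def xdr_string(text):
    """Encode a string: 4-byte length, the bytes, then zero padding to a 4-byte boundary."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


encoded = xdr_uint(2049) + xdr_string("hello")
print(encoded.hex())   # prints 000008010000000568656c6c6f000000
```

Because every item occupies a multiple of four bytes, a receiver can step through a message without knowing the sender's native byte order or alignment.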
Port Mapper
RPC server programs use ephemeral ports - thus we need a well-known port to be able to find them
Servers register themselves with a registrar - the port mapper (called rpcbind in SVR4 and other systems using TI-RPC)
Port mapper is at well-known port: 111/UDP and 111/TCP
The port mapper is an RPC server with program number 100000, version 2, a TCP port of 111, a UDP port of 111.
Servers register themselves with RPC calls and clients query with RPC calls:
  PMAPPROC_SET - register an entry
  PMAPPROC_UNSET - unregister an entry
  PMAPPROC_GETPORT - get the port number of a given instance
  PMAPPROC_DUMP - returns all entries (used by rpcinfo -p)
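The four procedures amount to operations on a small table keyed by (program, version, protocol). The sketch below is an in-memory stand-in with no networking; the class name and method names are hypothetical, and the NFS program number 100003 used in the example is an assumption not stated on this slide.

```python
IPPROTO_TCP, IPPROTO_UDP = 6, 17


class PortMapper:
    """In-memory stand-in for the port mapper's registration table."""

    def __init__(self):
        self._table = {}

    def set(self, prog, vers, prot, port):        # cf. PMAPPROC_SET
        key = (prog, vers, prot)
        if key in self._table:
            return False                          # already registered
        self._table[key] = port
        return True

    def unset(self, prog, vers):                  # cf. PMAPPROC_UNSET
        keys = [k for k in self._table if k[:2] == (prog, vers)]
        for k in keys:
            del self._table[k]
        return bool(keys)

    def getport(self, prog, vers, prot):          # cf. PMAPPROC_GETPORT
        return self._table.get((prog, vers, prot), 0)   # 0 means "not registered"

    def dump(self):                               # cf. PMAPPROC_DUMP
        return sorted((p, v, pr, port) for (p, v, pr), port in self._table.items())


pm = PortMapper()
pm.set(100000, 2, IPPROTO_UDP, 111)     # the port mapper itself
pm.set(100003, 2, IPPROTO_UDP, 2049)    # assumed NFS program number, port from the NFS slide
print(pm.getport(100003, 2, IPPROTO_UDP))   # prints 2049
```

A client that only knows a program and version number asks the well-known port 111 first, then sends its real calls to the port that comes back.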
NFSspy
Insert a new pointer in place of the RPC we want to snoop
Embed this earlier code in our code
[Figure: the routine as originally called, and the routine after our code has been wrapped around it]
Network File System (NFS)
NFSspy problem
Imagine several students each insert a new pointer in place of the RPC they want to snoop.
What happens if this one dies?
[Figure: the routine wrapped by a chain of three prolog/epilog pairs, one inserted by each student]
nfsspy
Initial implementations were written by Seth Robertson, Jon Helfman, Larry Ruedisueli, Don Shugard, and other students for a project assignment in my course on Computer Networks at Columbia Univ. in 1989.
There is a report about one implementation by Jon Helfman, Larry Ruedisueli, and Don Shugard, "Nfspy: A System for Exploring the Network File System", AT&T Bell Laboratories, 11229-890517-07TM.
See also "NFS Tracing By Passive Network Monitoring" by Matt Blaze, ~1992, http://www.funet.fi/pub/unix/security/docs/papers/nfsspy.ps.gz
Matt's program builds upon an rpcspy program, which feeds packets to his nfstrace program and other scripts.
Seth Robertson's version even inverted the file handles to show the actual file names.
NFS protocol, version 2 (RFC 1094)
  provides transparent file access
  client-server application built on RPC
  Generally, NFS server at 2049/UDP - but it can be at different ports
[Figure: on the client, a user process reaches either local file access or the NFS client; the NFS client talks via TCP/UDP and IP to the NFS server, which uses local file access on the server]
Figure 36: NFS client and NFS server (see Stevens, Vol. 1, figure 29.3, pg. 468)
Most NFS servers are multithreaded - so that multiple requests can be in process at one time. If the server kernel does not support multithreading then multiple server processes (nfsd) are used.
Often there are multiple NFS clients (biod) running on the client machine - each processes one call and waits inside the kernel for the reply.
NFS consists of more than just the NFS protocol
Various RPC programs used with NFS (see Stevens, Vol. 1, pg. 469)
Application      Program number  Version numbers  Number of procedures
port mapper      100000          2                4
NFS              100003          2                15
mount            100005          1                5
lock manager     100021          1,2,3            19
status monitor   100024          1                6
NFS File Handles
To reference a file via NFS we need a file handle, an opaque object used to reference a file or directory on the server.
File handle is created by the server - upon a lookup; subsequent client requests simply pass this file handle to the appropriate procedure (they never look at the contents of this object - hence it is opaque).
  in version 2, a file handle is 32 bytes
  in version 3, a file handle is 64 bytes
UNIX systems generally encode the filesystem ID (major and minor dev numbers), the i-node number, and an i-node generation number into the file handle.
NFS Mount protocol
When the server gets a mount command from a client, it can check the client's IP address to see if this client is allowed to mount the given filesystem; the mount daemon returns the file handle of the given filesystem.
NFS Procedures
Procedure  Description
GETATTR    return the attributes of a file
SETATTR    set the attributes of a file
STATFS     return the status of a filesystem
LOOKUP     lookup a file - returns a file handle
READ       read from a file, starting at specified offset for n bytes (up to 8192 bytes)
WRITE      write to a file, starting at specified offset for n bytes (up to 8192 bytes). Writes are synchronous - i.e., server responds OK when file is actually written to disk (this can often be changed as an option at mount time - but you can get into trouble)
CREATE     create a file
REMOVE     delete a file
RENAME     rename a file
LINK       make a hard link to a file
SYMLINK    make a symbolic link to a file
READLINK   return the name of the file to which the symbolic link points
MKDIR      create a directory
RMDIR      delete a directory
READDIR    read a directory
NFS over TCP
Provided by some vendors for use over WANs.
All applications on a given client share the same TCP connection.
Both client and server set TCP keepalive timers
If the client detects that the server has crashed or been rebooted, it tries to reconnect to the server
If the client crashes, the client gets a new connection when it restarts; the server's keepalive timer will terminate the half-open former connection
NFS Statelessness
NFS is designed to be stateless
  the server does not keep track of what clients are accessing which files
  there are no open or close procedures; just LOOKUP
  being stateless simplifies server crash recovery
  clients don't know if the server crashes
  only the client maintains state
Most procedures (GETATTR, STATFS, LOOKUP, READ, WRITE, READDIR) are idempotent (i.e., can be executed more than once by the server with the same result).
Some (CREATE, REMOVE, RENAME, SYMLINK, MKDIR, RMDIR) are not. SETATTR is idempotent unless it is truncating a file.
To handle non-idempotent requests - most servers use a recent-reply cache, checking their cache to see if they have already performed the operation and, if so, simply returning the same value (as before).
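A recent-reply cache can be sketched as a small table keyed by the client and the RPC transaction ID. This is only an illustration under stated assumptions: the class name, the capacity, the client address, and the toy REMOVE operation are all hypothetical, and a real server would also bound entries by age.

```python
from collections import OrderedDict


class RecentReplyCache:
    """Remember recent replies so a retransmitted request is not executed twice."""

    def __init__(self, capacity=128):
        self._replies = OrderedDict()
        self._capacity = capacity

    def handle(self, client, xid, operation):
        key = (client, xid)
        if key in self._replies:
            return self._replies[key]           # duplicate request: replay the old reply
        reply = operation()                     # first time: actually perform it
        self._replies[key] = reply
        if len(self._replies) > self._capacity:
            self._replies.popitem(last=False)   # forget the oldest entry
        return reply


files = {"a.txt"}


def remove_a():
    """Toy non-idempotent operation: a second execution would fail."""
    if "a.txt" in files:
        files.remove("a.txt")
        return "OK"
    return "NOENT"


cache = RecentReplyCache()
first = cache.handle("10.0.0.1", 42, remove_a)
retry = cache.handle("10.0.0.1", 42, remove_a)   # same XID, e.g. the first reply was lost
print(first, retry)                              # prints OK OK
```

Without the cache the retransmitted REMOVE would be executed again and the client would see an error for an operation that actually succeeded.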
X Window System
Client-server application that lets multiple clients share a bit-mapped display.
One server manages the display, keyboard, mouse, ...
X requires a reliable bidirectional bitstream protocol (such as TCP).
The server does a passive open on port 6000+n, where n is the display number (usually 0)
X can also use UNIX domain sockets (with the name /tmp/.X11-unix/Xn, where n is the display number)
> 150 different messages in the X protocol (for details see Nye, 1992)
[Figure: client processes, each connected via TCP to the X server process, which manages the display and its windows]
All clients (even those on different hosts) communicate with the same server.
Lots of data can be exchanged between client and server
  xclock - sends date and time once per second
  xterm - sends each key stroke (a 32-byte X message, 72 bytes with IP and TCP headers)
  some applications read and write entire 32 bit per pixel images in cine mode from/to a window!
Low Bandwidth X
X was optimized for use across LANs.
For use across low speed links - various techniques are used:
  caching
  sending differences from previous packets
  compression, ...
Xscope Interpose a process between the X server and X client to watch traffic. For example, xscope could be run as if it were display 1, while passing traffic to and from display 0. See Stevens, Vol.1, pp. 488-489 for more details (or try running it!)
J.L. Peterson. XSCOPE: A Debugging and Performance Tool for X11. Proceedings of the IFIP 11th World Computer Congress, September, 1989, pp. 49-54.
See also XMON - An interactive X protocol monitor Both are available from: ftp://ftp.x.org/pub/R5/
IPerf - Measure bandwidth availability using a client and server. Determines total bandwidth, delay jitter, loss, determine MTU, support TCP window size, ...
Pathchar - Determine per hop network path characteristics (bandwidth, propagation delay, queue time and drop rate). It utilizes a series of packets with random payload sizes over a defined period of time to each hop in a path.
Pchar - Updated version of Pathchar -- by Bruce Mah
Netlogger - NetLogger includes tools for generating precision event logs that can be used for detailed end-to-end application & system level monitoring, and tools for visualizing log data to view the state of a distributed system in real time.
Treno - Measure single stream bulk transfer capacity. TReno doesn't actually use TCP slow start, but instead emulates it. It actually sends UDP packets to unused ports and uses the returned error messages to determine the packet timing.
Mping - Measure queuing properties during heavy congestion
tdg - produce graphs of TCP connections from tcpdump files, suitable for use with xgraph.
A perl script which produces time-sequence plots from tcpdump files.
Program  Description
  parses raw tcpdump files to extract information and xplot files.
  generate graphs from plot data in X Windows.
  automated connection diagnosis tool; generates a test flow & displays a time sequence plot based on that flow.
  determine traffic usage by AS path.
Summary
This lecture we have discussed:
  TCP
  HTTP
  Web enabled devices
  RPC, XDR, and NFS
  X Window System, and
  some tools for looking at these protocols
References
[23] Information Sciences Institute, University of Southern California, Transmission Control Protocol, IETF, RFC 793, September 1981
For use in conjunction with TCP/IP Protocol Suite, by Behrouz A. Forouzan, 3rd Edition, McGraw-Hill. For this lecture: Chapter 13
Last modified: 2005.05.01:03:06
The sender and receiver can utilize multiple interfaces with multiple IP addresses (multi-homing)
  increased fault tolerance
  current implementations do not support load balancing (i.e., they only support failover)
SCTP provides reliability
  via acknowledgements, timeouts, retransmission, ...
SCTP provides flow control
SCTP tries to avoid causing congestion
SCTP Applications Initial goal of IETF Sigtran WG was to support SS7 applications over IP
For example, SMS transfer! For an example see [24], [28]
SCTP Header
Common header (12 bytes):
  Source port (16 bits)
  Destination port (16 bits)
  Verification tag (32 bits)
  Checksum (32 bits)
Chunks
Control information is contained in Control Chunks (these always precede any data chunks)
Multiple data chunks can be present - each containing data for different streams
SCTP Chunk
Each chunk starts with Type (8 bits), Flag (8 bits), and Length (16 bits), followed by the chunk's information
Figure 39: SCTP packet (see Forouzan figure 13.8 pg. 354)
Type  Chunk              Description
0     DATA               User data
1     INIT               Sets up an association
2     INIT ACK           Acknowledge an INIT chunk
3     SACK               Selective Acknowledgement
4     HEARTBEAT          Probe to see if peer is alive
5     HEARTBEAT ACK      Acknowledgement of a HEARTBEAT chunk
6     ABORT              Abort an association
7     SHUTDOWN           Terminate an association
8     SHUTDOWN ACK       Acknowledge SHUTDOWN chunk
9     ERROR              Reports errors without shutting down
10    COOKIE ECHO        Third packet in establishment of an association
11    COOKIE ACK         Acknowledges COOKIE ECHO chunk
14    SHUTDOWN COMPLETE  Third packet in an association termination
192   FORWARD TSN        To adjust the cumulative TSN
Flag - 8 bit field defined per chunk type
Length - 16 bit length of chunk including chunk header (i.e., smallest value is 4) - does not include any padding bytes (hence you know just how much padding there is)
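Because the Length field excludes padding, the amount of padding follows directly from it. The sketch below assumes chunks are padded to a multiple of 4 bytes; the function names are hypothetical. The value 41 is the Chunk length of the DATA chunk in the Ethereal trace later in these notes, which shows three bytes of chunk padding.

```python
def padded_chunk_size(length):
    """Bytes a chunk occupies in the packet: Length rounded up to a multiple of 4."""
    return (length + 3) & ~3


def padding_bytes(length):
    """Number of padding bytes that follow a chunk with the given Length field."""
    return padded_chunk_size(length) - length


print(padding_bytes(41), padded_chunk_size(41))   # prints 3 44
```

A receiver walking the chunks in a packet would advance by padded_chunk_size(length) to find the start of the next chunk.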
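[Figure: four-way handshake between client and server]
  client -> server: INIT, VT:0 (tag: 1200)
  server -> client: INIT-ACK, VT:1200 (tag: 5000, rwnd: 2000, TSN: 1700, + Cookie)
  client -> server: COOKIE, VT:5000 (Cookie)
  server: TCB allocated
  server -> client: COOKIE-ACK, VT:1200
The entity initiating the connection is (normally) called the "client" and does an active open, whereas the server must previously have done a passive open.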
INIT Chunk
  Type = 1
  Flag = 0
  Length
  Initiation tag
  Advertised receiver window credit (rwnd)
  Outbound streams
  Maximum inbound streams
  Initial Transmission sequence number (TSN)
  variable-length parameters (optional)
Figure 40: SCTP INIT chunk (see Forouzan figure 13.10 pg. 357)
Initiation tag
  defines the tag for this association to be used by the other party
  reduces the risk due to a blind attacker (since there is only a 1 in 2^32 chance of guessing the right tag)
  can reject delayed packets - thus avoiding the need for TCP's TIME-WAIT timer
Outbound streams
  suggested upper number of streams from this sender (can be reduced by receiver)
Variable-length parameters
  IP address(es) of endpoint
    Multiple addresses are used to support multihoming
    The receiver selects the primary address for the other endpoint
  Type of addresses
  Support for Explicit Congestion Notification (ECN)
INIT ACK Chunk
  Type = 2
  Flag = 0
  Length
  Initiation tag
  Advertised receiver window credit (rwnd)
  Outbound streams
  Maximum inbound streams
  Initial Transmission sequence number (TSN)
  Parameter type: 7
  Parameter length
  State Cookie
  variable-length parameters (optional)
Figure 41: SCTP INIT ACK chunk (see Forouzan figure 13.11 pg. 358)
The same fields as in the INIT chunk (the packet's verification tag is set to the Initiation tag of the INIT; the INIT ACK carries the responder's own Initiation tag) - but with the addition of a required parameter with a state cookie.
  Parameter type: 7 = State Cookie
  Parameter length = size of State Cookie + 4 (the parameter type and length fields)
A packet carrying this INIT ACK chunk can not contain any other control or data chunks.
State Cookie
Use of the COOKIE prevents a SYN-flood-like attack - since resources are not allocated until the COOKIE ECHO chunk is received.
However, state has to be saved from the initial INIT chunk - therefore it is placed in the cookie in a way that only the server can access it (hence the cookie is sealed with an HMAC {aka digest} after being created {aka baked}). This requires that the server has a secret key which it uses to compute this digest.
If the sender of the INIT is an attacker located on another machine, they won't be able to receive the cookie if they faked the source address in the INIT - since the INIT ACK, which contains the cookie, is sent to the (faked) source address!
Without a cookie no association is created and no resources (such as TCB) are tied up!
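The baking and checking of a cookie can be sketched with Python's hmac module. This is a minimal sketch under stated assumptions: the contents of the saved state, the use of SHA-256, and the function names are all hypothetical choices for illustration, and a real cookie would also carry a timestamp and lifetime. Note that the HMAC shown here only protects the cookie against modification or forgery; it does not hide its contents.

```python
import hashlib
import hmac
import os
import struct

SECRET_KEY = os.urandom(32)   # known only to the server
MAC_LEN = 32                  # size of a SHA-256 digest


def bake_cookie(state, key=SECRET_KEY):
    """Seal the saved association state with an HMAC and return the cookie."""
    mac = hmac.new(key, state, hashlib.sha256).digest()
    return state + mac


def open_cookie(cookie, key=SECRET_KEY):
    """Return the saved state if the HMAC verifies, otherwise None."""
    state, mac = cookie[:-MAC_LEN], cookie[-MAC_LEN:]
    expected = hmac.new(key, state, hashlib.sha256).digest()
    return state if hmac.compare_digest(mac, expected) else None


# Hypothetical state taken from an INIT: peer's tag, own tag, stream counts
state = struct.pack(">IIHH", 1200, 5000, 10, 10)
cookie = bake_cookie(state)                            # sent in the INIT ACK, then forgotten
print(open_cookie(cookie) == state)                    # prints True
print(open_cookie(b"\x00" + cookie[1:]) is None)       # prints True (tampered cookie rejected)
```

The server keeps nothing between sending the INIT ACK and receiving the COOKIE ECHO; everything it needs to build the TCB comes back inside the verified cookie.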
COOKIE ECHO Chunk
  Type = 10
  Length
  State Cookie
Figure 42: SCTP COOKIE ECHO chunk (see Forouzan figure 13.12 pg. 359)
(chunk) Type: 10 = COOKIE ECHO
(chunk) length = size of State Cookie + 4 (the chunk type, flag, and length fields)
State Cookie
  simply a copy of the COOKIE data from the INIT ACK chunk
  The COOKIE data is opaque (i.e., only its original sender, the server, can read the cookie)
A packet carrying this COOKIE ECHO chunk can contain other control or data chunks -- in particular it can carry the first user (client) data!
COOKIE ACK Chunk
  Type = 11
  Flag = 0
  Length = 4
Figure 43: SCTP COOKIE ACK chunk (see Forouzan figure 13.13 pg. 359)
Completes the 4 way handshake.
A packet with this chunk can also carry control and data chunks (in particular the first of the user (server) data).
Data Chunk
  Type = 0
  Reserved, U, B, E (flags)
  Length
  Transmission sequence number (TSN)
  Stream identifier (SI)
  Stream Sequence number (SSN)
  Protocol Identifier
  User data
Figure 44: SCTP Data Chunk (see Forouzan figure 13.9 pg. 356)
Flags:
  U - Unordered - for delivery to the application right away
  B - Beginning (chunk position - for use with fragmentation)
  E - End chunk
Transmission sequence number (TSN) - only data chunks consume TSNs
Stream identifier (SI)
Stream Sequence number (SSN)
Protocol Identifier
User data
  at least 1 byte of user data; padded to 32 bit boundaries
  although a message can be spread over multiple chunks, each chunk contains data from only a single message (like UDP, each message results in one or more SCTP data chunks)
Multiple-Streams
[Figure: a sending process and a receiving process connected by several streams carried within one SCTP association]
The figure above shows a single association.
Each stream has a unique stream identifier (SI) and maintains its own stream sequence number (SSN).
Unordered data chunks (i.e., with U = 1) - do not consume an SSN and are delivered when they arrive at the destination.
Multiple streams and unordered data avoid TCP's head of line blocking.
SACK Chunk
  Type = 3
  Flag = 0
  Length
  cumulative TSN acknowledgement
  Advertised receiver window credit
  Number of gap ACK blocks: N
  Number of duplicates: M
  Gap ACK block #1 start TSN offset
  Gap ACK block #1 end TSN offset
  ...
  Gap ACK block #N start TSN offset
  Gap ACK block #N end TSN offset
  Duplicate TSN 1
  ...
  Duplicate TSN M
Figure 46: SCTP SACK chunk (see Forouzan figure 13.14)
Cumulative Transmission sequence number (TSN) acknowledgement - the last data chunk received in sequence
Gap = received sequence of chunks (indicated with start .. end TSNs)
Duplicate TSN - indicating duplicate chunks (if any)
SACK always sent to the IP address where the corresponding packet originated
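How a receiver might derive the cumulative TSN and the gap ACK blocks from the TSNs it has received can be sketched as follows. This is an illustration only: the function name is hypothetical, gap offsets are taken relative to the cumulative TSN, and the sketch ignores 32-bit TSN wrap-around and duplicate reporting.

```python
def build_sack(received_tsns, cumulative_tsn):
    """Return (new cumulative TSN, [(start offset, end offset), ...]).

    cumulative_tsn is the last TSN already received in sequence.
    """
    received = set(received_tsns)
    cum = cumulative_tsn
    while cum + 1 in received:          # advance over everything now in sequence
        cum += 1

    gaps = []
    start = prev = None
    for tsn in sorted(t for t in received if t > cum):
        if start is None:
            start = prev = tsn          # first block begins
        elif tsn == prev + 1:
            prev = tsn                  # block continues
        else:
            gaps.append((start - cum, prev - cum))
            start = prev = tsn          # a new block begins
    if start is not None:
        gaps.append((start - cum, prev - cum))
    return cum, gaps


# TSNs 103, 106, and 107 are still missing
print(build_sack([101, 102, 104, 105, 108], cumulative_tsn=100))
# prints (102, [(2, 3), (6, 6)])
```

The sender can retransmit exactly the TSNs that fall between the reported blocks instead of everything after the cumulative TSN.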
ERROR chunk
Sent when an endpoint finds some error in a packet
  Type = 9
  Length
Figure 47: SCTP ERROR chunk (see Forouzan figure 13.17 pg. 361)
Error code  Description
1           Invalid Stream identifier
2           Missing mandatory parameter
3           State cookie error
4           Out of resource
5           Unresolvable address
6           Unrecognized chunk type
7           Invalid mandatory parameters
8           Unrecognized parameter
9           No user data
10          Cookie received while shutting down
Association Termination
Two forms of termination
Association Abort
  Used in the event of a fatal error
  uses same error codes as the ERROR Chunk
  Chunk format: Type = 6, Length
Figure 48: SCTP ABORT chunk (see Forouzan figure 13.18 pg. 362)
Association Shutdown
[Figure: shutdown sequence over time]
  client (active close) -> server: SHUTDOWN, VT:x (Cumulative TSN)
  server -> client: SHUTDOWN ACK
  client -> server: SHUTDOWN COMPLETE, VT:x
Figure 49: Adapted from Forouzan figure 13.21 page 368 and slide 15 of [25]
SHUTDOWN chunk: Type = 7, Flag, Length = 8, Cumulative TSN acknowledgement
Figure 50: SCTP SHUTDOWN chunk (see Forouzan figure 13.16 pg. 361)
SHUTDOWN ACK chunk: Type = 8, Flag, Length = 4
Figure 51: SCTP SHUTDOWN ACK chunk (see Forouzan figure 13.16 pg. 361)
SHUTDOWN COMPLETE chunk: Type = 14, Flag (T bit), Length = 4
Figure 52: SCTP SHUTDOWN COMPLETE chunk (see Forouzan figure 13.16 pg. 361)
T bit indicates the sender did not have a Transmission Control Block (TCB)
Stream Control Transmission Protocol
    Source port: 10777
    Destination port: 13
    Verification tag: 0x00000000
    Checksum: 0x2b84fdb0 [note 1]
    INIT chunk (Outbound streams: 10, inbound streams: 10)
        Chunk type: INIT (1)
        Chunk flags: 0x00
        Chunk length: 32
        Initiate tag: 0x43d82c5d
        Advertised receiver window credit (a_rwnd): 131071
        Number of outbound streams: 10
        Number of inbound streams: 10
        Initial TSN: 771212194
        Forward TSN supported parameter
            Parameter type: Forward TSN supported (0xc000)
            Parameter length: 4
        Supported address types parameter (Supported types: IPv4)
            Parameter type: Supported address types (0x000c)
            Parameter length: 6
            Supported address type: IPv4 address (5)
Note 1: Ethereal complains about this checksum saying "(incorrect Adler32, should be 0x973b078d)", but this is in error - see [34].
    Source port: 13
    Destination port: 10777
    Verification tag: 0x43d82c5d
    Checksum: 0x7f61f237
    INIT_ACK chunk (Outbound streams: 1, inbound streams: 1)
        Chunk type: INIT_ACK (2)
        Chunk flags: 0x00
        Chunk length: 128
        Initiate tag: 0x5d581d9a
        Advertised receiver window credit (a_rwnd): 131071
        Number of outbound streams: 1
        Number of inbound streams: 1
        Initial TSN: 1514529259
        State cookie parameter (Cookie length: 100 bytes)
            Parameter type: State cookie (0x0007)
            Parameter length: 104
            State cookie: 5D581D9A0001FFFF000100015A45E1EB...
        Forward TSN supported parameter
            Parameter type: Forward TSN supported (0xc000)
                1... .... .... .... = Bit: Skip parameter and continue prosessing of the chunk
                .1.. .... .... .... = Bit: Do report
            Parameter length: 4
    Source port: 10777
    Destination port: 13
    Verification tag: 0x5d581d9a
    Checksum: 0x3af3f579
    COOKIE_ECHO chunk (Cookie length: 100 bytes)
        Chunk type: COOKIE_ECHO (10)
            0... .... = Bit: Stop processing of the packet
            .0.. .... = Bit: Do not report
        Chunk flags: 0x00
        Chunk length: 104
        Cookie: 5D581D9A0001FFFF000100015A45E1EB...
    Source port: 13
    Destination port: 10777
    Verification tag: 0x43d82c5d
    Checksum: 0x762d80d7
    COOKIE_ACK chunk
        Chunk type: COOKIE_ACK (11)
        Chunk flags: 0x00
        Chunk length: 4
    Source port: 13
    Destination port: 10777
    Verification tag: 0x43d82c5d
    Checksum: 0xf8fb1754
    DATA chunk(ordered, complete segment, TSN: 1514529259, SID: 0, SSN: 0, PPID: 0, payload length: 25 bytes)
        Chunk type: DATA (0)
        Chunk flags: 0x03
            .... ...1 = E-Bit: Last segment
            .... ..1. = B-Bit: First segment
            .... .0.. = U-Bit: Ordered deliviery
        Chunk length: 41
        TSN: 1514529259
        Stream Identifier: 0x0000
        Stream sequence number: 0
        Payload protocol identifier: not specified (0)
        Chunk padding: 000000
    Data (25 bytes)
0000  57 65 64 20 41 70 72 20 32 37 20 31 31 3a 34 33   Wed Apr 27 11:43
0010  3a 32 32 20 32 30 30 35 0a                        :22 2005.
    Source port: 10777
    Destination port: 13
    Verification tag: 0x5d581d9a
    Checksum: 0xfa994e35
    SACK chunk (Cumulative TSN: 1514529259, a_rwnd: 131071, gaps: 0, duplicate TSNs: 0)
        Chunk type: SACK (3)
        Chunk flags: 0x00
        Chunk length: 16
        Cumulative TSN ACK: 1514529259
        Advertised receiver window credit (a_rwnd): 131071
        Number of gap acknowldgement blocks : 0
        Number of duplicated TSNs: 0
    Source port: 13
    Destination port: 10777
    Verification tag: 0x43d82c5d
    Checksum: 0xf447d00f
    SHUTDOWN chunk (Cumulative TSN ack: 771212193)
        Chunk type: SHUTDOWN (7)
        Chunk flags: 0x00
        Chunk length: 8
        Cumulative TSN Ack: 771212193
    Source port: 10777
    Destination port: 13
    Verification tag: 0x5d581d9a
    Checksum: 0x9f44d056
    SHUTDOWN_ACK chunk
        Chunk type: SHUTDOWN_ACK (8)
        Chunk flags: 0x00
        Chunk length: 4
    Source port: 13
    Destination port: 10777
    Verification tag: 0x43d82c5d
    Checksum: 0x3db6e771
    SHUTDOWN_COMPLETE chunk
        Chunk type: SHUTDOWN_COMPLETE (14)
        Chunk flags: 0x00
            .... ...0 = T-Bit: TCB destroyed
        Chunk length: 4
Fault Management
Endpoint Failure Detection
Endpoint keeps a counter of the total number of consecutive retransmissions to its peer (including retransmissions to all the destination transport addresses [= port + IP address] of the peer if it is multi-homed).
When this counter exceeds Association.Max.Retrans, the endpoint will consider the peer endpoint unreachable and shall stop transmitting any more data to it (the association enters the CLOSED state).
Counter is reset each time:
  a DATA chunk sent to that peer is acknowledged (by the reception of a SACK) or
  a HEARTBEAT-ACK is received from the peer
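The counter logic can be sketched as a small class. The class and method names are hypothetical, and the default limit of 10 is an assumption based on the parameter values suggested in RFC 2960; a real implementation would also keep a separate error counter per destination address.

```python
class PeerMonitor:
    """Track consecutive retransmissions to a peer across all of its addresses."""

    def __init__(self, max_retrans=10):      # stands in for Association.Max.Retrans
        self.max_retrans = max_retrans
        self.error_count = 0
        self.reachable = True

    def on_retransmission(self):
        self.error_count += 1
        if self.error_count > self.max_retrans:
            self.reachable = False            # association would enter CLOSED

    def on_sack_or_heartbeat_ack(self):
        self.error_count = 0                  # the peer answered, start counting again


monitor = PeerMonitor(max_retrans=2)
monitor.on_retransmission()
monitor.on_retransmission()
print(monitor.reachable)      # prints True (the limit has not been exceeded yet)
monitor.on_retransmission()
print(monitor.reachable)      # prints False
```

Because only consecutive retransmissions count, a single acknowledged DATA chunk or HEARTBEAT-ACK is enough to keep an otherwise lossy association alive.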
Figure 54: SCTP HEARTBEAT and HEARTBEAT ACK chunks (see Forouzan figure 13.15 pg. 360)
(chunk) Type: 4 = HEARTBEAT
(chunk) Type: 5 = HEARTBEAT ACK
(chunk) length = size of sender specific information + 4 (the parameter type and length fields)
Sender specific information
  The sender puts its local time and transport address in (note that the sctplib implementation 1.0.2 puts the time in as an unsigned 32 bit integer, puts the path index in (also as an unsigned 32 bit integer), and adds an HMAC computed over these values [29])
  The acknowledgement simply contains a copy of this information
    Source port: 9
    Destination port: 38763
    Verification tag: 0x36fab554
    HEARTBEAT chunk (Information: 28 bytes)
        Chunk type: HEARTBEAT (4)
        Chunk flags: 0x00
        Chunk length: 32
        Heartbeat info parameter (Information: 24 bytes)
            Parameter type: Heartbeat info (0x0001)
            Parameter length: 28
            Heartbeat information: 0280351E00000000E1A06CFBC1C6933F...

    Source port: 38763
    Destination port: 9
    Verification tag: 0x57c3a50c
    HEARTBEAT_ACK chunk (Information: 28 bytes)
        Chunk type: HEARTBEAT_ACK (5)
        Chunk flags: 0x00
        Chunk length: 32
        Heartbeat info parameter (Information: 24 bytes)
            Parameter type: Heartbeat info (0x0001)
            Parameter length: 28
            Heartbeat information: 0280351E00000000E1A06CFBC1C6933F...
Sender
  uses the same destination address until instructed by the upper layer (however, SCTP may change to an alternate destination in the event an address is marked inactive)
    retransmission can be to a different transport address than the original transmission.
  keeps separate congestion control parameters (cwnd, ssthresh, and partial_bytes_acked) for each of the destination addresses it can send to (i.e., not each source-destination pair)
    these parameters should decay if the address is not used
  does slow-start upon the first transmission to each of the destination addresses
IPv6
Based on RFC 1981 [32] an SCTP sender using IPv6 must use Path MTU Discovery, unless all packets are less than the minimum IPv6 MTU (see RFC 2460 [33]).
SCTP differs in several ways from the description in RFC 1191 of applying MTU discovery to TCP:
1 SCTP associations can span multiple addresses
    an endpoint does PMTU discovery on a per-destination-address basis
    The term "MTU" always refers to the MTU associated with the destination address
2 Since SCTP does not have a notion of "Maximum Segment Size", for each destination MTU_initial ≤ MTU_link for the local interface to which packets for that remote destination address will be routed
When retransmitting to a remote address for which the IP datagram appears too large for the path MTU to that address, the IP datagram should be retransmitted without the DF bit set, enabling it to be fragmented, while initial transmissions of IP datagrams must have DF set.
Sender maintains an association PMTU (= smallest PMTU discovered for all of the peer's destination addresses); when fragmenting messages this association PMTU is used to calculate the size of each fragment
  retransmissions can be sent to an alternate address without encountering IP fragmentation
SCTP resequences data at the receiving side
SCTP discards duplicate data at the receiving side
The window size (or more exactly the receive window size (rwnd)) - indicates how many bytes the receiver is prepared to receive (this number is relative to the acknowledgement number).
Stream_i = a stream number that was skipped by this FWD-TSN.
Stream Sequence_i = the largest stream sequence number in stream_i being skipped
Receiver can use the Stream_i and Stream Sequence_i fields to enable delivery of (stranded) TSNs that remain in the stream re-ordering queues.
Summary
This lecture we have discussed:
  SCTP
    Message framing
    Multi-homing
    Multi-streaming
  How SCTP differs from TCP
  Measurements of an implementation (there are other implementations such as that included with [27]):
    http://www.sctp.de
    http://www.sctp.org
References
[24] G. Sidebottom, K. Morneault, and J. Pastor-Balbas, Signaling System 7 (SS7) Message Transfer Part 3 (MTP3) - User Adaptation Layer (M3UA), IETF RFC 3332, September 2002 http://www.ietf.org/rfc/rfc3332.txt
[25] Andreas Jungmaier, A Gentle Introduction to SCTP, 19th Chaos Communications Congress, Berlin, 2002 http://tdrwww.exp-math.uni-essen.de/inhalt/forschung/19ccc2002/html/slide_1.html
[26] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang, and V. Paxson, Stream Control Transmission Protocol, IETF RFC 2960, October 2000 http://www.ietf.org/rfc/rfc2960.txt
[27] Randall R. Stewart and Qiaobing Xie, Stream Control Transmission Protocol: A Reference Guide, Addison-Wesley, 2002, ISBN 0-201-72186-4.
[28] K. Morneault, S. Rengasami, M. Kalla, and G. Sidebottom, ISDN
[29] Andreas Jungmaier, Herbert Hölzlwimmer, Michael Tüxen, and Thomas Dreibholz, "sctplib-1.0.2", Siemens AG and the Institute of Computer Networking Technology, University of Essen, Germany, August 2004 http://www.sctp.de/sctp-download.html {Note that a later version 1.0.3 was released March 4th, 2005}
[30] R. Stewart, M. Ramalho, Q. Xie, M. Tuexen, and P. Conrad, Stream Control Transmission Protocol (SCTP) Partial Reliability Extension, IETF RFC 3758, May 2004 http://www.ietf.org/rfc/rfc3758.txt
[31] J. Mogul and S. Deering, Path MTU Discovery, IETF RFC 1191, November 1990 http://www.ietf.org/rfc/rfc1191.txt
[32] J. McCann, S. Deering, and J. Mogul, Path MTU Discovery for IP version 6, IETF RFC 1981, August 1996 http://www.ietf.org/rfc/rfc1981.txt
[33] S. Deering and R. Hinden, Internet Protocol, Version 6 (IPv6) Specification, IETF RFC 2460, December 1998 http://www.ietf.org/rfc/rfc2460.txt
[34] J. Stone, R. Stewart, and D. Otis, Stream Control Transmission Protocol (SCTP) Checksum Change, IETF RFC 3309, September 2002 http://www.ietf.org/rfc/rfc3309.txt
[35] A. Jungmaier, E. Rescorla, and M. Tuexen, Transport Layer Security over Stream Control Transmission Protocol, IETF RFC 3436, December 2002 http://www.ietf.org/rfc/rfc3436.txt
Maguire
[email protected]
References
2005.05.01
For use in conjunction with TCP/IP Protocol Suite, by Behrouz A. Forouzan, 3rd Edition, McGraw-Hill. For this lecture: Chapter 14
1998, 1999, 2000, 2002, 2003, 2005 G.Q. Maguire Jr. All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the author. Last modified: 2005.05.18:21:30
Maguire
[email protected]
Dynamic_Routing.fm5
2005.05.18
Outline
Dynamic Routing Protocols
Routing
[Figure: router architecture. Control plane: routing table, access list, queuing priority, accounting data. Data plane: each packet passes a cache lookup, then the switching, security, queuing, and accounting tasks.]
The routing table tells us which output port to use based on the destination (and possibly the source) IP address. The data plane has to run at packet rates (i.e., in real-time). However, a router also performs a lot of other processing.
Routing Principles
Routing Mechanism: use the most specific route
  IP provides the mechanism to route packets
Routing Policy: what routes should be put in the routing table? <<< today's topic!
  Use a routing daemon to provide the routing policy
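The "most specific route" mechanism can be sketched as a longest-prefix match. The routing table below is hypothetical (the prefixes and next hops are invented for illustration), and a real IP stack uses a faster data structure than a linear scan.

```python
import ipaddress

# Hypothetical routing table: (prefix, next hop) pairs invented for illustration.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.0.2.1"),    # default route
    (ipaddress.ip_network("10.0.0.0/8"), "192.0.2.2"),
    (ipaddress.ip_network("10.1.0.0/16"), "192.0.2.3"),
]

def lookup(destination):
    """Return the next hop of the most specific (longest-prefix) matching route."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    if not matches:
        return None
    # The mechanism: among all matching routes, the longest prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]
```

The policy question (which entries end up in ROUTES) is left to the routing daemon; the lookup itself never changes.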
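Processing
[Figure: processing in the IP layer. Routing policy: the routing daemon, the route command, and the netstat command (above UDP and TCP) maintain the routing table, using routing table updates from adjacent routers and ICMP redirects. Routing mechanism: datagrams arrive from the network interfaces on the IP input queue, IP options are processed, and a datagram that is not for this host is forwarded (if forwarding is enabled). IP output calculates the next hop router (if necessary) from the routing table; routing happens in ipoutput, which hands the datagram to the interface output routines (radio_output, if_snd, lestart, radiostart).]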
Swedish University Network (SUNET) SUNET-KI Stockholm University - SU SUNET-KTH KTHNOC KTHNOC-SE
For statistics about the number of AS, etc.: http://www.cidr-report.org/
For a list of AS number to name mappings: http://www.cidr-report.org/autnums.html
To find out who is responsible for a given autonomous system, use a query of the form: http://www.ripe.net/perl/whois?AS2839
Routing Metrics
A measure of which route is better than another:
  Number of hops
  Bandwidth
  Delay
  Cost
  Load
It is possible that the metric uses some weighted combination of the above.
Routing Algorithms
Static vs. Dynamic
Single path vs. Multi-path
Flat vs. Hierarchical
Host-intelligent vs. Router-intelligent
Intradomain (interior) vs. Interdomain (exterior)
Link state vs. Distance vector
Issues:
  Initialization (how to get started)
  Sharing
  Updating
  When to share & Who to share with
[Figure fields: Command, Family, Reserved, All 0s]
Figure 58: RIP message format (see Forouzan figure 14.9 pg. 394)
A RIP message contains:
  a command: request or reply
  a version number (in this case 1)
  up to 25 instances (entries) containing:
    address family (2 = IP addresses)
    Network Address (allocated 14 bytes; for an IP address we only need 4 bytes, and they are aligned to a 4 byte boundary, hence the leading and trailing zeros)
    metric [hop count]
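As a rough illustration of this layout, the sketch below packs and unpacks a RIPv1 message with Python's struct module. It assumes the RFC 1058 entry layout (2-byte family, 2 zero bytes, 4-byte IPv4 address, 8 zero bytes, 4-byte metric); the function names are hypothetical.

```python
import socket
import struct

HEADER = struct.Struct("!BBH")      # command, version, must-be-zero
ENTRY = struct.Struct("!HH4s8xI")   # family, zero, IPv4 address, 8 zero bytes, metric

def build_rip_v1(command, routes):
    """Build a RIPv1 message from (dotted-quad address, metric) pairs."""
    if len(routes) > 25:
        raise ValueError("a RIP message carries at most 25 entries")
    body = b"".join(
        ENTRY.pack(2, 0, socket.inet_aton(ip), metric) for ip, metric in routes
    )
    return HEADER.pack(command, 1, 0) + body

def parse_rip_v1(data):
    """Return (command, version, [(family, address, metric), ...])."""
    command, version, _ = HEADER.unpack_from(data, 0)
    entries = []
    for offset in range(HEADER.size, len(data), ENTRY.size):
        family, _, addr, metric = ENTRY.unpack_from(data, offset)
        entries.append((family, socket.inet_ntoa(addr), metric))
    return command, version, entries
```

Each entry is 20 bytes, so a full message of 25 entries plus the 4-byte header is 504 bytes.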
RIP v1 operation
As carried out by the UNIX daemon routed, using UDP port 520.
Initialization:
  for all interfaces which are up {
    send a request packet out each interface asking for the other routers' complete routing table
    [command=1, address family=0 {== unspecified} (1), metric=16]
  }
Request received:
  if whole table requested, then send it all, 25 entries at a time
  else if a specific set of routes, then fill in the metric for each route we know,
  else set the metric to 16 [16 == infinity == we don't know a route to this address]
Response received:
  if valid (i.e., not 16), then update/add/delete/modify routing table
(1) Page 24 of RFC 1058 says "If there is exactly one entry in the request, with an address family identifier of 0 (meaning unspecified), and a metric of infinity (i.e., 16 for current implementations), this is a request to send the entire routing table." [38] - this is different from what is implied in Forouzan figure 14.10 pg. 395
When are routes sent?
Solicited response: send a response when a request is received.
Unsolicited response: if a metric for a route changes, then (trigger) send an update; else send all or part of the table every 30 seconds.
If a route has not been updated for 180 seconds (3 minutes = 6 update cycles), then set its metric to 16, and then after 60 seconds (1 minute) delete the route.
Metrics are in units of hops; thus this protocol leads to selection between routes based on the minimum number of hops.
Summary of RIPv1 Timers:
  Periodic timer - regular updates, random value [25..35], mean 30 s
  Expiration timer - routes not updated within 180 s expire
  Garbage collection timer - 120 s after expiring, entries are GC'd
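A minimal sketch of the expiration and garbage collection behaviour follows. The table layout and function name are hypothetical, and the 120 s value is taken from the timer summary (the text also mentions 60 s), so treat both constants as parameters rather than as the definitive values.

```python
EXPIRATION = 180   # seconds without an update before a route is marked unreachable
GARBAGE = 120      # further seconds before the expired route is deleted (assumed)
INFINITY = 16

def age_routes(table, now):
    """Apply the timers to table, a dict destination -> {"metric", "updated"}."""
    for dest in list(table):
        idle = now - table[dest]["updated"]
        if idle >= EXPIRATION + GARBAGE:
            del table[dest]                    # garbage collected
        elif idle >= EXPIRATION:
            table[dest]["metric"] = INFINITY   # expired: advertise as unreachable
    return table
```

Calling age_routes once per periodic update is enough for a sketch; a real daemon would keep a timer per route.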
Count to Infinity
[Figure: host C on network1; Router A connects network1 and network2; Router B connects network2 and network3; host D on network3; a looping packet circulates until its TTL expires]
Router A advertises it knows about routes to networks 1 and 2. Router B advertises it knows about routes to networks 2 and 3. After one update cycle A and B know about all 3 routes. If A's interface to Network1 goes down, then A learns from B that B knows a route to Network1; so A now thinks it can reach Network1 via B. So if D sends a packet for C, it will simply loop back and forth between routers A and B, until the TTL counts down to 0.
Split Horizon
To counter the count to infinity problem, the split horizon algorithm never sends information out on an interface that it learned from that interface.
RIPv1 implements Split Horizon with Poison Reverse Update: rather than not advertising routes to the source, we advertise them with a metric of 16 (i.e., unreachable), hence the source simply ignores them.
Unfortunately, split horizon only prevents loops between adjacent routers (so if there are three or more routers involved, the previous problem re-appears).
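The two variants can be sketched as a filter applied when building the update for one interface. The table layout (destination mapped to metric and the interface the route was learned on) and the interface names are hypothetical.

```python
INFINITY = 16

def build_update(table, out_interface, poison_reverse=True):
    """Build the advertisement for out_interface.

    table maps destination -> (metric, interface the route was learned on).
    """
    update = {}
    for dest, (metric, learned_on) in table.items():
        if learned_on == out_interface:
            if poison_reverse:
                update[dest] = INFINITY   # advertise back as unreachable
            # plain split horizon: say nothing about this route
        else:
            update[dest] = metric
    return update
```

With poison reverse the update is larger, but a neighbor learns immediately that the route is unusable via this router instead of waiting for a timeout.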
[Figure fields: Command; per entry: Family, Route tag, Network Address, Subnet mask, Next-hop address, Distance. In an authentication entry the Family field is 0xFFFF.]
Figure 59: RIPv2 message format (see Forouzan figures 14.13 pg. 397 and 14.14 pg. 398)
If Authentication type = 2, this is a clear text password to be used to authenticate this message.
Because RIP is generally the only routing protocol which all UNIX machines understand!
Relatively easy to configure
It is widely available, since it must exist if the device is capable of routing!
IGRP Metrics
a vector of metrics, each with a 24 bit value
composite metric is $\left(\frac{K_1}{B} + \frac{K_2}{D}\right) \cdot R$, where K1 and K2 are constants, B is the unloaded path bandwidth, D a topological delay, and R is reliability
also we pass the hop count and Maximum Transmission Unit values
K1 is the weight assigned to bandwidth (by default 10,000,000)
K2 is the weight assigned to delay (by default 100,000)
If up to 4 paths are within a defined variance of each other, Cisco's IOS (Internetwork Operating System) will split the traffic across them in inverse proportion to their metric.
Open Shortest Path First (OSPF)
link-state protocols converge faster than distance-vector protocols
can calculate a route per IP service type (i.e., TOS)
each interface can have a per TOS cost
if there are several equally good routes, can do load balancing
supports variable length subnet masks
enables point to point links to be unnumbered (i.e., they don't need an IP address)
uses clear text passwords
uses multicasting
OSPF uses the Shortest Path First algorithm (also known as Dijkstra's algorithm). OSPF networks are generally divided into areas such that cross-area communication is minimal. Some routers with multiple interfaces become area border routers (with one interface in one area and another interface in another area). The only way to get from one area to another area is via the backbone, which is area 0. Note: The backbone need not be contiguous.
Note that Forouzan refers to transient links - I think that this should be transit links. (Since transient implies that the link would be short lived!)
Link state advertisements are sent to all routers in a given area (via flooding), rather than just neighbors (as in the distance-vector approach); thus periodic updates are infrequent (every 1 to 2 hours). A key feature of OSPF is route aggregation, which minimizes the size of routing tables and the size of the topological database; in addition, it keeps protocol traffic to a minimum.
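To make the Shortest Path First computation concrete, here is a minimal sketch of Dijkstra's algorithm over a link-state database. The graph representation (router name mapped to a dict of neighbor costs) is an assumption for illustration, not OSPF's actual LSA format.

```python
import heapq

def shortest_paths(graph, source):
    """Return {router: (total cost, first hop from source)}.

    graph maps router -> {neighbor: link cost}.
    """
    best = {}
    queue = [(0, source, None)]   # (cost so far, router, first hop used)
    while queue:
        cost, node, first_hop = heapq.heappop(queue)
        if node in best:
            continue              # already settled via a cheaper path
        best[node] = (cost, first_hop)
        for neighbor, link_cost in graph[node].items():
            if neighbor not in best:
                hop = neighbor if node == source else first_hop
                heapq.heappush(queue, (cost + link_cost, neighbor, hop))
    return best
```

The first hop recorded for each destination is what would be installed in the routing table as the next hop.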
2. Synchronization of Databases
  Exchange of Link State Database between neighbors
  Get LSA headers
  Request the transfer of necessary LSAs
3. Flooding protocol
  When links change or when your knowledge is old
  Send Link State updates to neighbors and flood recursively
  If not seen before, propagate updates to all adjacent routers, except the router you received it from
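The flooding rule (propagate an update you have not seen before to every adjacent router except the one you received it from) can be sketched as below. The topology dict and LSA identifier are hypothetical, and acknowledgements and retransmissions are ignored.

```python
def flood(topology, origin, lsa_id):
    """Flood lsa_id from origin; return (routers that hold it, transmissions sent).

    topology maps router -> list of adjacent routers.
    """
    seen = {origin: {lsa_id}}
    pending = [(origin, None)]   # (router holding the LSA, neighbor it came from)
    sent = 0
    while pending:
        router, came_from = pending.pop()
        for neighbor in topology[router]:
            if neighbor == came_from:
                continue          # never send it back where it came from
            sent += 1
            if lsa_id not in seen.setdefault(neighbor, set()):
                seen[neighbor].add(lsa_id)
                pending.append((neighbor, router))   # new to neighbor: it floods on
    return set(seen), sent
```

In a triangle of three routers the LSA reaches everyone, at the price of a couple of redundant transmissions that the receivers simply drop.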
OSPF Packets
Common header
[Figure layout, 32 bits per row (bits 0-7, 8-15, 16-23, 24-31):
  Version | Type | Message length
  Source Router IP Address
  Area Identification
  Checksum | Authentication type
  Authentication (64 bits) (a)]
Figure 60: OSPF Common Header (see Forouzan figure 14.26 pg. 408)
(a) Note that the Authentication field is 64 bits; Forouzan figure 14.26 pg. 408 incorrectly shows it as 32 bits.
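A sketch of decoding this 24-byte common header with Python's struct module is shown below. The field order follows the figure; the dictionary keys and the function name are hypothetical.

```python
import socket
import struct

# version, type, message length, router address, area, checksum, auth type, 64-bit auth
OSPF_HEADER = struct.Struct("!BBH4s4sHH8s")

def parse_ospf_header(data):
    """Decode the OSPF common header from the start of data."""
    version, ptype, length, router, area, checksum, autype, auth = \
        OSPF_HEADER.unpack_from(data, 0)
    return {
        "version": version,
        "type": ptype,            # 1 = hello ... 5 = link state acknowledgement
        "length": length,
        "router": socket.inet_ntoa(router),
        "area": socket.inet_ntoa(area),
        "checksum": checksum,
        "autype": autype,
        "auth": auth,             # 8 bytes = 64 bits
    }
```

Note that OSPF_HEADER.size is 24, which is consistent with the 64 bit Authentication field.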
Hello packet
[Figure layout, 32 bits per row (bits 0-7, 8-15, 16-23, 24-31):
  Common header: Version | Type | Message length; Source Router IP Address; Area Identification; Checksum | Authentication type; Authentication (64 bits)
  Network Mask
  Hello interval (seconds) | All zeros | E | T | Priority
  Dead interval (seconds)
  Designated router IP address
  Backup designated router IP address
  Neighbor IP address (repeated for each neighbor)]
Figure 61: OSPF Hello packet (see Forouzan figure 14.44 pg. 419)
E = 1 indicates a stub area
T = 1 indicates the router supports multiple metrics
priority = 0 indicates that this router should not be considered as a designated or backup designated router
Dead interval is the time before a silent neighbor is assumed to be dead
list of neighboring routers (of the router which sent the hello packet)
Database Description packet
[Figure layout: Version | Type = 2 | Message length; Source Router IP Address; Area Identification; Checksum | Authentication type; Authentication (64 bits); Interface MTU | All zeros | flags; Database Description sequence number; LSA header (20 bytes), repeated]
Figure 62: OSPF Database Description packet (see Forouzan figure 14.45 pg. 420)
E = 1 indicates the advertising router is an autonomous boundary router (i.e., E = external)
B = 1 indicates the advertising router is an area border router
I = 1 initialization flag
M = 1 More flag
M/S flag: 0 = slave, 1 = Master
Database Description sequence number
LSA header(s) - gives information about the link, but without details; if details are desired they can be requested
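[Figure: LSA header; fields include the E and T flags, the advertising router, the link state checksum, and the length]
Advertising router - IP address of the router advertising this message
Link state checksum - a Fletcher's checksum of all but the age field
Length - length of the whole packet in bytes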
[Figure layout: Version | Type = 4 | Message length; Source Router IP Address; Area Identification; Checksum | Authentication type; Authentication (64 bits); Number of link state advertisements; Link state advertisement (LSA), repeated]
Figure 64: OSPF Link state update packet (see Forouzan figure 14.27 pg. 409)
[Figure layout: Version | Type = 3 | Message length; Source Router IP Address; Area Identification; Checksum | Authentication type; Authentication (64 bits); then, repeated: Link state type; Link state ID; Advertising router]
Figure 65: OSPF Link state request packet (see Forouzan figure 14.46 pg. 420)
To ask for information about a specific route (or routes); the reply is an update packet.
[Figure layout: Version | Type = 5 | Message length; Source Router IP Address; Area Identification; Checksum | Authentication type; Authentication (64 bits); LSA header]
Figure 66: OSPF Link state acknowledgement packet (see Forouzan figure 14.47 pg. 421)
[Figure: types of autonomous systems: stub AS, multihomed AS, transit AS]
BGP operation
BGP routers exchange information based on traffic which transits the AS, and derive a graph of AS connectivity, with loop pruning.
Routing policy decisions can be enforced as to what is allowed to transit whom: policy-based routing, based on economic/security/political considerations.
BGP does not implement the policy decisions, but allows the information on which such decisions can be made to propagate as necessary.
Uses TCP (port 179) to create a session between BGP routers: initially two systems exchange their entire BGP routing table, then they simply send updates as necessary.
BGP is a path-vector protocol, which enumerates the route to each destination (i.e., the sequence of AS numbers which a packet would have to pass through from a source to its destination) = a path vector
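Loop pruning with a path vector amounts to a membership test on the sequence of AS numbers. A minimal sketch, using private-use AS numbers chosen purely for illustration and hypothetical function names:

```python
def accept_route(my_as, as_path):
    """Reject a route whose AS path already contains our own AS number (a loop)."""
    return my_as not in as_path

def advertise(my_as, as_path):
    """Prepend our AS number before passing the route on to a neighboring AS."""
    return [my_as] + list(as_path)
```

For example, AS 64512 would accept the path [64513, 64514], re-advertise it as [64512, 64513, 64514], and reject any path in which 64512 already appears.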
BGP does not transmit metrics. However, each path is a list of attributes:
well-known attributes (which every router must understand)
  well-known mandatory attribute - must appear in the description of a route
  well-known discretionary attribute - may appear in the description of a route, but must be recognized
optional attributes
  optional transitive attribute - must be passed to the next router
  optional nontransitive attribute - the receiving router must discard it if it does not recognize it
For examples of the use of an attribute see [54] and [55].
BGP detects failures (either links or hosts) by sending keepalive messages to its neighbors. These are generally sent every 30 seconds; as they are only 19 bytes each, they consume only ~5 bits/second of bandwidth, but they imply very long lived TCP connections (semi-permanent connections).
A major feature of BGP version 4 is its ability to do aggregation - to handle CIDR and supernetting. For more information on aggregation see chapter 5 of [49].
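The ~5 bits/second figure for keepalives follows directly from the message size and interval; a small sketch of the arithmetic (the function name is hypothetical, and TCP/IP header overhead is deliberately ignored, as in the estimate above):

```python
def keepalive_rate(message_bytes=19, interval_seconds=30):
    """Average bandwidth in bits/second consumed by periodic keepalive messages."""
    return message_bytes * 8 / interval_seconds
```

With the defaults this gives 152 bits every 30 seconds, i.e., a little over 5 bits/second per BGP session.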
Allows the addresses assigned to a single organization to span multiple classed prefixes.
Envisioned a hierarchical Internet.
The CIDR addressing scheme and route aggregation have two major user impacts:
  you have to justify IP address assignments
  you get addresses from your ISP, i.e., renting them vs. being assigned them
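Route aggregation can be illustrated with Python's standard ipaddress module. The eight contiguous /24 blocks below are hypothetical; they stand in for the classed prefixes an ISP might hold and announce as a single CIDR prefix.

```python
import ipaddress

# Eight contiguous /24 networks (hypothetical): 10.1.8.0/24 ... 10.1.15.0/24
blocks = [ipaddress.ip_network(f"10.1.{i}.0/24") for i in range(8, 16)]

# collapse_addresses merges adjacent networks into the fewest covering prefixes
aggregate = list(ipaddress.collapse_addresses(blocks))
```

Here the eight routing table entries collapse into the single prefix 10.1.8.0/21, which is what keeps the size of inter-domain routing tables down.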
Two kinds of BGP sessions:
  External BGP (E-BGP) coordinates between border routers in different ASs
  Internal BGP (I-BGP) coordinates between BGP peers within an AS
Note that the I-BGP peers must form a full mesh, but this does not scale; hence the organization into sub-ASs, ...
BGP Messages
Open
Update
Keepalive
Notification
[Figure layout: Marker; Length | Type | Version; My autonomous system | Hold time; BGP identifier; Option length | Option (variable length)]
Figure 68: BGP Open messages (see Forouzan figures 14.53 pg. 427 and 14.52 pg. 426)
Marker (for authentication), Length, and Type are common to all BGP messages.
Version = 4
My autonomous system - the AS number
Hold time - maximum time to wait for a keepalive or update, otherwise the other party is considered to be dead
BGP identifier - identifies the router sending this message (typically its IP address)
Option length - zero if none
Option - options in the form (length of parameter, parameter value)
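As a sketch of how these fields fit together, the code below builds a BGP-4 Open message. It assumes the RFC 1771 sizes (16-byte marker, 2-byte length, 1-byte type = 1 for Open, then 1-byte version, 2-byte AS, 2-byte hold time, 4-byte identifier, 1-byte option length) and an all-ones marker, i.e., no authentication; the function name and sample values are hypothetical.

```python
import socket
import struct

MARKER = b"\xff" * 16                 # all ones when no authentication is used
HEADER = struct.Struct("!16sHB")      # marker, length, type (common to all messages)
OPEN = struct.Struct("!BHH4sB")       # version, my AS, hold time, BGP id, option length

def build_open(my_as, hold_time, bgp_id, options=b""):
    """Return a BGP Open message (type 1, version 4) as bytes."""
    body = OPEN.pack(4, my_as, hold_time, socket.inet_aton(bgp_id), len(options))
    body += options
    return HEADER.pack(MARKER, HEADER.size + len(body), 1) + body
```

The common header alone is 19 bytes, which is why a Keepalive (a header with no body) is a 19-byte message; an Open without options is 29 bytes.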
[Figure layout: Marker; Length | Type = 2; Unfeasible routes length; Withdrawn routes (variable length); Path attributes length; Path attributes (variable length); Network layer reachability information (variable length)]
Figure 69: BGP Update message (see Forouzan figures 14.54 pg. 428 and 14.52 pg. 426)
Unfeasible routes length (2 bytes) - length of the next field
Withdrawn routes - list of all routes that must be deleted
Path attributes length (2 bytes) - length of the next field
Path attributes - specifies the attributes of the path being announced
Network layer reachability information - prefix length and IP address (to support CIDR)
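The "prefix length and IP address" encoding can be sketched as follows, assuming the BGP-4 convention of one length byte (in bits) followed by only as many address bytes as the prefix needs. The function names are hypothetical.

```python
import ipaddress

def encode_nlri(prefix):
    """Encode an IPv4 prefix as a length byte plus the minimum number of address bytes."""
    net = ipaddress.ip_network(prefix)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]

def decode_nlri(data):
    """Decode one (length, prefix) pair from the start of data."""
    length = data[0]
    nbytes = (length + 7) // 8
    padded = data[1:1 + nbytes] + bytes(4 - nbytes)   # restore the trailing zero bytes
    return ipaddress.ip_network(f"{ipaddress.IPv4Address(padded)}/{length}")
```

A /21 such as 10.1.8.0/21 therefore takes 4 bytes on the wire (one length byte plus three address bytes), and a default route takes a single byte.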
[Figure layout: Marker; Length | Type = 3]
Figure 70: BGP Keepalive message (see Forouzan figures 14.55 pg. 429 and 14.52 pg. 426)
[Figure layout: Marker; Length | Type = 4 | Error code | Error subcode; Error data]
Figure 71: BGP Notification message (see Forouzan figures 14.56 pg. 429 and 14.52 pg. 426)
Error code (1 byte) - category of error
  1 = Message header error
  2 = Open Message error
  3 = Update Message error
  4 = Hold time expired
  5 = Finite state machine error
  6 = Cease
Error subcode (1 byte) - the particular error in this category
Error Data - information about this error
Interconnections of networks
Different networks have different users, policies, cost structures, etc.
But the value of a network is proportional to (number of users)^2 [Metcalfe's Law].
Therefore, network operators want to connect their networks to other networks.
Internet eXchange Points (IXs or IXPs)
  List of public internet exchange points: http://www.ep.net/ep-main.html
  For a discussion of why IXPs are important see [58]
  No internet exchange points => no internetworking!
  Cost advantages in peering
  QoS advantages
[Figure: four routers (R) attached to an FDDI ring]
Note that it need not be a physical ring, but was often an FDDI switch (such as the DEC Gigaswitch/FDDI).
SOF (Swedish Operators Forum)
North American Network Operators Group (NANOG) http://www.nanog.org/
NAPs today
Using GigE, switch fabrics, resilient packet ring (Spatial Reuse Protocol (SRP)) technology, e.g., Cisco's Dynamic Packet Transport (DPT), with dedicated fiber connections to/from members.
NAP managers are increasingly concerned about security, reliability, and accounting & statistics.
Various NAPs have different policies, methods of dividing costs, fees, co-location of operators' equipment at the NAP, etc.
entries are maintained by each service provider Internet Performance and Analysis Project (IPMA) IRR Java Interface
http://salamander.merit.edu/ipma/java/IRR.html
Flows
[Figure: NetFlow switching. The 1st packet of a flow passes through the switching, security, queuing, and accounting tasks and creates an entry in the NetFlow cache; later packets of the flow are handled via the cache, which also yields the NetFlow statistics.]
A flow is defined as a uni-directional stream of packets between a given source network-layer address and port number and a specific destination network-layer address and port number. Since many applications use well known transport-layer port numbers, it is possible to identify flows on a per user, per application basis. There is a well defined NetFlow Switching Developer's Interface which allows you to get the statistics concerning the NetFlow cache and the per flow data (the latter gives you essentially billing records). A general introduction to NetFlow Switching is available at
http://www.cisco.com/univercd/cc/td/doc/product/software/ios120/12cgcr/switch_c/xcprt3/xcovntfl.htm
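The flow definition above can be sketched as a cache keyed on (source address, source port, destination address, destination port). This is only an illustration of per-flow accounting; it is not Cisco's NetFlow implementation or its developer interface, and all names are hypothetical.

```python
from collections import defaultdict

def account(packets):
    """Accumulate per-flow packet and byte counts.

    packets is an iterable of (src_ip, src_port, dst_ip, dst_port, nbytes).
    """
    cache = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src, sport, dst, dport, nbytes in packets:
        entry = cache[(src, sport, dst, dport)]   # uni-directional flow key
        entry["packets"] += 1
        entry["bytes"] += nbytes
    return dict(cache)
```

Because the key is uni-directional, the two directions of one TCP connection show up as two separate flows.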
A Tag Edge router labels a packet based on its destination, then the Tag Switches make their switching decision based on this tag, without having to look at the contents of the packet. The Tag Edge routers and Tag Switch exchange tag data using Tag Distribution Protocol (TDP).
Basics of Tag switching:
1. Tag edge routers and tag switches use standard routing protocols to identify routes through the internetwork.
2. Using the tables generated by the routing protocols, the tag edge routers and switches assign and distribute tag information via the tag distribution protocol (TDP). When the Tag routers receive this TDP information they build a forwarding database.
3. When a tag edge router receives a packet it analyzes the network layer header, performs applicable network layer services, selects a route for the packet from its routing tables, applies a tag, and forwards the packet to the next hop tag switch.
4. The tag switch receives the tagged packet and switches the packet based solely on the tag.
5. The packet reaches the tag edge router at the egress point of the network, the tag is stripped off, and the packet is delivered as usual.
Tag Locations
in the Layer 2 header (e.g., in the VCI field for ATM cells)
in the Layer 3 header (e.g., in the flow label field in IPv6), or
in between the Layer 2 and Layer 3 headers
Creating tags
Since tag switching decouples the tag distribution mechanisms from the data flows, the tags can be created:
  when the first traffic is seen to a destination, or
  in advance, so that even the first packet can immediately be labelled.
See also: Multiprotocol Label Switching (MPLS) and Generalized Multi-Protocol Label Switching (GMPLS)
[Figure: data plane of a router: per-packet cache lookup, followed by the switching, security, queuing, and accounting tasks]
Earlier we looked at the routing step, but today much of the excitement is in the details of the other functions. For example, in order to support various QoS features you might want to use more sophisticated queue management: Weighted Round Robin, Fair Queuing, Weighted Fair Queuing, Random Early Detection (RED), Weighted RED, ...
Summary
This lecture we have discussed:
  IP routing
  Dynamic routing protocols: RIP, OSPF, BGP, CIDR
  Cisco's NetFlow Switching and Tag Switching
References
[37] J. Hawkinson and T. Bates, Guidelines for creation, selection, and registration of an Autonomous System (AS), IETF RFC 1930, March 1996 http://www.ietf.org/rfc/rfc1930.txt
[38] C. Hedrick, Routing Information Protocol, IETF RFC 1058, June 1988 http://www.ietf.org/rfc/rfc1058
[39] G. Malkin, RIP Version 2: Carrying Additional Information, IETF RFC 1388, January 1993 http://www.ietf.org/rfc/rfc1388
[40] G. Malkin, RIP Version 2, IETF RFC 2453, November 1998 http://www.ietf.org/rfc/rfc2453
[41] Charles L. Hedrick, An Introduction to IGRP, Cisco, Document ID: 26825, Technology White Paper, August 1991 http://www.cisco.com/warp/public/103/5.html
[44] OSPF Design Guide, Cisco, Technology White Paper, Document ID: 7039, August 26, 2004 http://www.cisco.com/warp/public/104/1.html
[45] K. Lougheed and Y. Rekhter, A Border Gateway Protocol 3 (BGP-3), IETF RFC 1267, October 1991 http://www.ietf.org/rfc/rfc1267.txt
[46] Y. Rekhter and T. Li (editors), A Border Gateway Protocol 4 (BGP-4), IETF RFC 1654, July 1994 http://www.ietf.org/rfc/rfc1654.txt
[47] Y. Rekhter and T. Li (editors), A Border Gateway Protocol 4 (BGP-4), IETF RFC 1771, March 1995 http://www.ietf.org/rfc/rfc1771.txt
[48] John W. Stewart III, BGP4: Inter-Domain Routing in the Internet, Addison-Wesley, 1999, ISBN: 0-201-37951-1
[49] Bassam Halabi, Internet Routing Architectures, Cisco Press, ISBN 1-56205-652-2
[50] R. Hinden (Editor), Applicability Statement for the Implementation of Classless Inter-Domain Routing (CIDR), IETF RFC 1517, September 1993 http://www.ietf.org/rfc/rfc1517.txt
[51] Y. Rekhter and T. Li (Editors), An Architecture for IP Address Allocation with CIDR, IETF RFC 1518, September 1993 http://www.ietf.org/rfc/rfc1518.txt
[52] V. Fuller, T. Li, J. Yu, and K. Varadhan, Classless Inter-Domain Routing (CIDR): an Address Assignment and Aggregation Strategy, IETF RFC 1519, September 1993 http://www.ietf.org/rfc/rfc1519.txt
[53] Y. Rekhter and C. Topolcic, Exchanging Routing Information Across Provider Boundaries in the CIDR Environment, IETF RFC 1520, September 1993 http://www.ietf.org/rfc/rfc1520.txt
[54] R. Chandra, P. Traina, and T. Li, BGP Communities Attribute, IETF RFC 1997, August 1996 http://www.ietf.org/rfc/rfc1997.txt
[55] E. Chen and T. Bates, An Application of the BGP Community Attribute in Multi-home Routing, IETF RFC 1998, August 1996 http://www.ietf.org/rfc/rfc1998.txt
[56] Iljitsch van Beijnum, web site http://www.bgpexpert.com/, last modified April 28, 2005 12:00:00 PM
[57] Iljitsch van Beijnum, BGP, O'Reilly, 1st Edition September 2002, ISBN 0-596-00254-8
[58] Internet Exchange Points: Their Importance to Development of the Internet and Strategies for their Deployment: The African Example, Global Internet Policy Initiative (GIPI), 6 June 2002 (revised 3 May 2004), http://www.internetpolicy.net/practices/ixp.pdf
For use in conjunction with TCP/IP Protocol Suite, by Behrouz A. Forouzan, 3rd Edition, McGraw-Hill. For this lecture: Chapters 10, 15
1998, 1999, 2000, 2002, 2003, 2005 G.Q. Maguire Jr. All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the author. Last modified: 2005.05.02:22:31
Maguire
[email protected]
IP_Multicast_and_RSVP.fm5
2005.05.02
Outline
Multicast IGMP RSVP
[Figure: filtering points going up the stack, each of which may discard: Interface, Device driver, IP]
Figure 75: Filtering which takes place as you go up the TCP/IP stack (see Stevens, Volume 1, figure 12.1, pg. 170)
Broadcasting
Limited Broadcast
IP address: 255.255.255.255
never forwarded by routers
What if you are multihomed? (i.e., attached to several networks)
  Most BSD systems just send on the first configured interface
  routed and rwhod - determine all interfaces on the host and send a copy on each (which is capable of broadcasting)
Net-directed Broadcast
IP address: netid.255.255.255 or net.id.255.255 or net.i.d.255 (depending on the class of the network)
routers must forward
Subnet-Directed Broadcast
IP address: netid | subnetid | hostID, where hostID = all ones
All-subnets-directed Broadcast
IP address: netid | subnetid | hostID, where hostID = all ones and subnetID = all ones
generally regarded as obsolete!
Other approaches to One-to-Many and Many-to-Many communication
[Figure: nodes connected to a central reflector]
All sites send to one site (the reflector); this overcomes the N^2 problem
The reflector sends copies to all sites
Problems:
  Does not scale well
  Multiple copies sent over the same link
  Central site must know all who participate
Behavior could be changed by explicitly building a tree of reflectors, but then you are moving over to Steve Deering's model.
Multicast Server
Figure 77: MBONE behaves as if there were a multicast server, but this functionality is distributed not centralized.
Core Problem
How to do efficient multipoint distribution (i.e., at most one copy of a packet crossing any particular link) without exposing topology to end-nodes
[Figure: N copies vs. 1 copy]
Applications
  Conference calls (without sending N copies for N recipients)
  Dissemination of information (stock prices, "radio stations", ...)
  Dissemination of one result for many similar requests (boot information, video)
  Unix tools:
[Figure layers: IGMP v1, v2, v3; Multicast Routing Protocols (PIM, CBT, DVMRP, MOSPF, MBGP, ...); Link-level Multicast (Ethernet)]
Figure 78: IP Multicast Service Model
Multicasting IP addresses
Multicast Group Addresses - Class D IP address
  High 4 bits are 1110 (binary), which corresponds to the range 224.0.0.0 through 239.255.255.255
host group = set of hosts listening to a given address
  membership is dynamic - hosts can enter and leave at will
  no restriction on the number of hosts in a host group
  a host need not belong in order to send to a given host group
  permanent host groups - assigned well known addresses by IANA
    224.0.0.1 - all systems on this subnet
    224.0.0.2 - all routers on this subnet
    224.0.0.4 - DVMRP routers
    224.0.0.9 - RIP-2 routers
    224.0.1.1 - Network Time Protocol (NTP) - see RFC 1305 and RFC 1769 (SNTP)
    224.0.1.2 - SGI's dogfight application
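The "high 4 bits are 1110" test is a one-line bit operation; a small sketch (the function name is hypothetical):

```python
import ipaddress

def is_class_d(address):
    """True if the top four bits of the IPv4 address are 1110 (224.0.0.0/4)."""
    return int(ipaddress.IPv4Address(address)) >> 28 == 0b1110
```

The standard library exposes the same check as ipaddress.IPv4Address(address).is_multicast.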
224.IN-ADDR.ARPA.
224.0.0.0 - 224.0.0.255 (224.0.0/24)    Local Network Control Block
224.0.1.0 - 224.0.1.255 (224.0.1/24)    Internetwork Control Block
224.0.2.0 - 224.0.255.0                 AD-HOC Block
224.1.0.0 - 224.1.255.255 (224.1/16)    ST Multicast Groups
224.2.0.0 - 224.2.255.255 (224.2/16)    SDP/SAP Block
224.3.0.0 - 224.251.255.255             Reserved
239.0.0.0/8                             Administratively Scoped
Internet Multicast Addresses
239.000.000.000-239.063.255.255   Reserved
239.064.000.000-239.127.255.255   Reserved
239.128.000.000-239.191.255.255   Reserved
239.192.000.000-239.251.255.255   Organization-Local Scope
239.252.0.0/16                    Site-Local Scope (reserved)
239.253.0.0/16                    Site-Local Scope (reserved)
239.254.0.0/16                    Site-Local Scope (reserved)
239.255.0.0/16                    Site-Local Scope
239.255.002.002                   rasadv
The multicast datagrams are delivered to all processes that belong to the same multicast group. To extend beyond a single subnet we use IGMP.
Problems
Unfortunately many links do not support link layer multicasts at all! For example:
  ATM
  Frame relay
  many cellular wireless standards
[Figure: IP header (20 bytes) followed by the IGMP message (8 bytes)]
Figure 79: Encapsulation of IGMP message in IP datagram (see Stevens, Vol. 1, figure 13.1, pg. 179)
IGMP message fields:
  4-bit IGMP version (1)
  4-bit IGMP type (1-2)
  Unused
  16-bit checksum
  32-bit group address (class D IP address)
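A sketch of building this 8-byte IGMP version 1 message, including the standard Internet checksum, is given below. It assumes the unused field is 8 bits wide (so that the fields total 8 bytes) and that type 1 is a query and type 2 a report; the function names are hypothetical.

```python
import socket
import struct

def internet_checksum(data):
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)   # fold the carries back in
    return ~total & 0xFFFF

def build_igmp_v1(msg_type, group):
    """Build an IGMPv1 message: version/type byte, unused byte, checksum, group address."""
    first = (1 << 4) | msg_type                    # 4-bit version = 1, 4-bit type
    addr = socket.inet_aton(group)
    csum = internet_checksum(struct.pack("!BBH4s", first, 0, 0, addr))
    return struct.pack("!BBH4s", first, 0, csum, addr)
```

A receiver can verify the message by running internet_checksum over all 8 bytes; a correct message yields 0.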
[Figure: demultiplexing of an incoming frame by the driver to ARP, RARP, or IP; incoming frame - accepted by matching address or multicast address]
Figure 80: IGMP - adapted from earlier figure (See Demultiplexing on page 30.)
A host sends a report when the first process joins a given group
Nothing is sent when processes leave (not even when the last leaves), but the host will no longer send a report for this group
An IGMP router sends queries (to address 224.0.0.1) periodically (one out each interface); the group address in the query is 0.0.0.0
In response to a query, a host sends an IGMP report for every group with at least one process
Routers
Note that routers have to listen to all 2^23 link layer multicast addresses! Hence they listen promiscuously to all LAN multicast traffic
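The figure of 2^23 link layer addresses comes from the standard mapping of a class D address onto an Ethernet multicast address: the low 23 bits of the group address are copied into the prefix 01:00:5e. A minimal sketch (the function name is hypothetical):

```python
import ipaddress

def multicast_mac(group):
    """Map an IPv4 multicast group to its Ethernet multicast address."""
    low23 = int(ipaddress.IPv4Address(group)) & 0x7FFFFF   # keep the low 23 bits
    mac = 0x01005E000000 | low23
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))
```

Since 28 bits of group address are squeezed into 23 bits, 32 different groups share each Ethernet address (e.g., 224.0.0.1 and 224.128.0.1), so the IP layer still has to filter.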
TTL generally set to 1, but you can perform an expanding ring search for a server by increasing the value Addresses in the special range 224.0.0.0 through 224.0.0.255 - should never be forwarded by routers - regardless of the TTL value
All-Hosts Group
all-hosts group address 224.0.0.1 - consists of all multicast capable hosts and routers on a given physical network; membership is never reported (sometimes this is called the all-systems multicast address)
All-Routers Group
[Figure: IGMP host state diagram with the states Non-Member, Delaying Member, and Member]
IGMP - ethereal
Protocol: IGMP (0x02)
Header checksum: 0x8284 (correct)
Source: 130.237.15.225 (130.237.15.225)
Destination: 239.255.255.250 (239.255.255.250)
Options: (4 bytes)
    Router Alert: Every router examines packet
Internet Group Management Protocol
    IGMP Version: 2
    Type: Membership Report (0x16)
    Max Response Time: 0.0 sec (0x00)
    Header checksum: 0xfa04 (correct)
    Multicast Address: 239.255.255.250 (239.255.255.250)
Protocol: IGMP (0x02)
Header checksum: 0x4c20 (correct)
Source: 211.105.145.186 (211.105.145.186)
Destination: 224.0.0.2 (224.0.0.2)
Options: (4 bytes)
    Router Alert: Every router examines packet
Internet Group Management Protocol
    IGMP Version: 2
    Type: Leave Group (0x17)
    Max Response Time: 0.0 sec (0x00)
    Header checksum: 0xff71 (correct)
    Multicast Address: 239.192.249.204 (239.192.249.204)
Multicast routing
Figure 82: Multicast routing: packet replicated by the routers -- not the hosts (figure shows autonomous systems AS1 to AS5)
Communicates with:
directly connected hosts via IGMP
other multicast routers with multicast routing protocols
Multicasting
Example: Transmitting a file from C to A, B, and D.
Using point-to-point transfer, some links will be used more than once to send the same file.
Using Multicast
[Figure and table: example network with nodes A, B, C, D, and E and numbered links, comparing the number of copies carried on each link for point-to-point transfer and for multicast]
Build the tree as members join
Figure 83: Taxonomy of Multicast Routing Protocols (see Forouzan figure 15.7 pg. 444)
Multicast Protocols
Source-based Tree: MOSPF, DVMRP, PIM-DM
Group-shared Tree: PIM-SM, CBT
(PIM-DM and PIM-SM are the two modes of PIM)
[Figure: example network with nodes A, B, C, D and numbered links]
Drawbacks
It does not take into account group membership.
It concentrates all traffic into a small subset of the network links.
However, it is expensive to store all this information (and most is unnecessary)
Cache only the active (S,G) pairs
Use a data-driven approach, i.e., only compute a new tree when a multicast datagram arrives for this group
1. When a multicast packet is received, note the source (S) and interface (I).
2. If I belongs to the shortest path toward S, forward to all interfaces except I.
Compute the shortest path from the source to the node rather than from the node to the source.
Check whether the local router is on the shortest path between a neighbor and the source before forwarding a packet to that neighbor. If this is not the case, then there is no point in forwarding a packet that will be immediately dropped by the next router.
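Steps 1 and 2 can be sketched as a single function. The sketch assumes the router already has a unicast table giving, for each source, the interface on its shortest path toward that source; the names `rpf_forward` and `iface_toward` are illustrative only, and the neighbor-side refinement described above is not modeled.

```python
def rpf_forward(iface_toward: dict, interfaces: list, source: str, in_iface: str) -> list:
    """Return the interfaces to forward on, or [] if the RPF check fails.

    iface_toward maps a source to the interface on the shortest path toward it.
    """
    if iface_toward.get(source) != in_iface:
        return []  # arrived on the wrong interface: drop the packet
    # Passed the check: flood on every interface except the arrival one.
    return [iface for iface in interfaces if iface != in_iface]
```

For example, with `iface_toward = {"S": "eth0"}` a packet from S arriving on eth0 is copied to the other interfaces, while the same packet arriving on eth1 is dropped.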
These trees have two interesting properties:
They guarantee the fastest possible delivery, as multicasting follows the shortest path from source to destination.
Better network utilization, since the packets are spread over multiple links.
Drawback
Group membership is not taken into account when building the tree.
A network can receive two or more copies of a multicast packet.
Reverse Path Multicasting (RPM) DVMRP is data-driven and uses source-based trees
Distance-Vector Multicast Routing Protocol (DVMRP) [64]
Steiner trees use fewer resources (links), but are very hard to compute (NP-complete).
In Steiner trees the routing changes widely if a new member joins the group; this leads to instability. Thus the Steiner tree is more a mathematical construct than a practical tool.
demand-driven). This is in contrast with RPF where the first packet is sent to the whole network.
The amount of state is less; it depends only on the number of the groups, not the number of pairs of sources and groups.
Group-shared multicast trees (*, G)
Routing is based on a spanning tree, thus CBT does not depend on multicast or unicast routing tables.
Disadvantages
The path between some sources and some receivers may be suboptimal.
Senders send multicast datagrams to the core router encapsulated in unicast datagrams.
The adjectives dense and sparse refer to the density of group members in the Internet. A group is said to be dense if the probability is high that the area contains at least one group member. It is said to be sparse if that probability is low.
See the IETF MBONE Deployment Working Group (MBONED) http://antc.uoregon.edu/MBONED/ and their charter http://www.ietf.org/html.charters/mboned-charter.html
MBONE Chronology
Nov. 1988    Small group proposes testbed net to DARPA. This becomes DARTNET
Nov. 1990    Routers and T1 lines start to work
Feb. 1991    First packet audio conference (using ISI's vt)
Apr. 1991    First multicast audio conference
Sept. 1991   First audio+video conference (hardware codec)
Mar. 1992    Deering & Casner broadcast San Diego IETF to 32 sites in 4 countries
Dec. 1992    Washington DC IETF - four channels of audio and video to 195 watchers in 12 countries
Jan. 1993    MBONE events go from one every 4 months to several a day
1994/1995    Telesys gk -- multicast from KTH/IT in Stockholm
July 1995    KTH/IT uses MBONE to multicast two parallel sessions from IETF meeting in Stockholm
... today    lots of users and "multicasters"
IETF meetings are now regularly multicast - so the number of participants that can attend is not limited by physical space or travel budgets.
MBONE growth
[Figure: MBONE growth, number of nodes (axis 0 to 1600) by year, 1992 to 1995]
http://www.caida.org/tools/measurement/mantra/
[Figures: Mantra measurements dated 01/21/2002, 11:30 PST (Pacific Standard Time) and 02/06/2003, 15:25:38 PST]
MBONE connections
MBONE is an overlay on the Internet
multicast routers were distinct from normal, unicast routers - but increasingly routers support multicasting
it is not trivial to get hooked up
requires cooperation from local and regional people
MBONE is changing:
Most router vendors now support IP multicast
MBONE will go away as a distinct entity once ubiquitous multicast is supported throughout the Internet.
Anyone hooked up to the Internet can participate in conferences
mrouted
mrouted: UNIX daemon, tunneling to other MBONE routers
See: Linux-Mrouted-MiniHOWTO: How to set up Linux for multicast routing by Bart Trojanowski <[email protected]>, v0.1, 30 October 1999
http://jukie.net/~bart/multicast/Linux-Mrouted-MiniHOWTO.html
and http://www.linuxdoc.org/HOWTO/Multicast-HOWTO-5.html
GLOP addressing
Traditionally multicast address allocation has been dynamic and done with the help of applications like SDR that use Session Announcement Protocol (SAP). GLOP is an example of a policy for allocating multicast addresses (it is still experimental in nature). It allocates the 233/8 range of multicast addresses amongst different ASes such that each AS is statically allocated a /24 block of multicast addresses. See [67]
Address layout (bit positions 0 to 31):
bits 0-7: 233
bits 8-23: 16 bit AS number
bits 24-31: local bits
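The layout above makes the GLOP block a direct function of the AS number. A minimal sketch follows, assuming a 16 bit AS number as in the layout; the function name `glop_block` is a hypothetical choice.

```python
import ipaddress


def glop_block(asn: int) -> ipaddress.IPv4Network:
    """Return the /24 multicast block statically assigned to a 16 bit AS."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError("GLOP addressing covers 16 bit AS numbers only")
    # First octet 233, then the AS number in the middle 16 bits, local bits zero.
    base = (233 << 24) | (asn << 8)
    return ipaddress.IPv4Network((base, 24))
```

For AS 5662 (0x161E, i.e. octets 22 and 30) this gives 233.22.30.0/24, leaving the last octet for the AS to assign locally.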
shows the multicast tunnels and routes for a router/mrouted.
traces the multicast path between two hosts.
displays receiver loss collected from RTCP messages.
monitors tree topology and loss statistics.
monitors multicast traffic on a local area network.
captures multicast group membership information.
collects information about protocol operation.
PIM MIB
list of PIM interfaces that are configured; the router's PIM neighbors; the set of rendezvous points and an association for the multicast address prefixes; the list of groups for which this particular router should advertise itself as the candidate rendezvous point; the reverse path table for active multicast groups; and component table with an entry per domain that the router is connected to.
CBT MIB
configuration of the router including interface configuration; router statistics for multicast groups; state about the set of group cores, either generated by automatic bootstrapping or by static mappings; and configuration information for border routers.
DVMRP MIB
interface configuration and statistics; peer router configuration states and statistics; the state of the DVMRP (Distance-Vector Multicast Routing Protocol) routing table; and information about key management for DVMRP routes.
Tunnel MIB
lists tunnels that might be supported by a router or host. The table supports tunnel types including Generic Routing Encapsulation (GRE) tunnels, IP-in-IP tunnels, minimal encapsulation tunnels, layer two tunnels (L2TP), and point-to-point tunnels (PPTP).
IGMP MIB
only deals with determining if packets should be forwarded over a particular leaf router interface; contains information about the set of router interfaces that are listening for IGMP messages, and a table with information about which interfaces currently have members listening to particular multicast groups.
HP Laboratories researchers investigating IP multicast network management are building a prototype integrated with HP OpenView -- intended for use by the network operators who are not experts in IP multicast; provides discovery, monitoring and fault detection capabilities.
Functionality
RSVP is a receiver-oriented protocol. The receiver is responsible for requesting reservations.
RSVP handles heterogeneous receivers. Hosts in the same multicast tree may have different capabilities and hence need different QoS.
RSVP adapts to changing group membership and changing routes.
RSVP maintains soft state in routers. The only permanent state is in the end systems. Each end system sends its RSVP control messages to refresh the router state. In the absence of refresh messages, RSVP state in the routers will time out and be deleted.
RSVP is not a routing protocol. A host sends IGMP messages to join a multicast group, but it uses RSVP to reserve resources along the delivery path(s) from that group.
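The soft-state idea can be illustrated with a small table whose entries disappear unless they are refreshed. This is a generic sketch, not RSVP's actual data structures or timer rules; the class name `SoftStateTable` and the single fixed `lifetime` are assumptions made for illustration.

```python
class SoftStateTable:
    """Reservation state that times out unless periodically refreshed."""

    def __init__(self, lifetime: float):
        self.lifetime = lifetime
        self._expires = {}  # reservation key -> absolute expiry time

    def refresh(self, key, now: float) -> None:
        """Install or refresh state; called when a control message arrives."""
        self._expires[key] = now + self.lifetime

    def expire(self, now: float) -> list:
        """Delete and return every entry whose refresh did not arrive in time."""
        dead = [key for key, t in self._expires.items() if t <= now]
        for key in dead:
            del self._expires[key]
        return dead

    def __contains__(self, key) -> bool:
        return key in self._expires
```

The time is passed in explicitly so the behavior is easy to follow: a reservation refreshed at t=0 and t=60 with a 90 second lifetime survives a sweep at t=100, but is removed at t=151 if no further refresh arrives. No explicit teardown is needed when a receiver leaves or a route changes.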
Resource Reservation
Interarrival variance reduction / jitter
Capacity assignment / admission control
Resource allocation (who gets the bandwidth?)
Jitter Control
If the network has enough capacity: average departure rate = receiver arrival rate
Then jitter is caused by queue waits due to competing traffic
Queue waits should be at most the amount of competing traffic in transit; the total amount of in-transit data should be at most the round trip propagation time (100 ms for a transcontinental path)
(64 kbit/sec => buffer = 8 kbyte/s * 0.1 sec = 800 bytes)
See: Jonathan Rosenberg, Lili Qiu, and Henning Schulzrinne, Integrating Packet FEC into Adaptive Voice Playout Buffer Algorithms on the Internet, INFOCOM, (3), 2000, pp. 1705-1714. See also http://citeseer.nj.nec.com/rosenberg00integrating.html
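The buffer estimate above is just rate times delay, converted from bits to bytes. A tiny helper makes the arithmetic explicit; `buffer_bytes` is a hypothetical name and integer milliseconds are used only to keep the result exact.

```python
def buffer_bytes(rate_bits_per_s: int, delay_ms: int) -> int:
    """Bytes needed to absorb delay_ms of data arriving at rate_bits_per_s."""
    # bits/s -> bytes/s is a division by 8; ms -> s is a division by 1000.
    return rate_bits_per_s * delay_ms // (8 * 1000)
```

With the slide's numbers, `buffer_bytes(64_000, 100)` gives 800 bytes.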
Capacity Assignment
End-nodes ask the network for bandwidth. Can get "yes" or "no" (busy signal)
Used to control available transmission capacity
[Figure: RSVP Path and Resv messages]
RSVP operation
[Figure 88: senders S1 and S2, receivers R1 and R2]
[Figure 89: RSVP in a host (application, RSVP, queue Q) and in a router (routing, RSVP, queue Q)]
RSVP Summary
RSVP supports multicast and unicast data delivery
RSVP adapts to changing group membership and routes
RSVP reserves resources for simplex data streams
RSVP is receiver oriented, i.e., the receiver is responsible for the initiation and maintenance of a flow
RSVP maintains a soft-state in routers, enabling them to gracefully support dynamic memberships and automatically adapt to routing changes
RSVP provides several reservation models
RSVP is transparent for routers that do not provide it
Further reading
IETF Routing Area, especially:
Inter-Domain Multicast Routing (idmr)
Multicast Extensions to OSPF (mospf)
IETF Transport Area, especially:
Differentiated Services (diffserv)
RSVP Admission Policy (rap)
Multicast-Address Allocation (malloc)
With lots of traditional broadcasters and others discovering multicast -- it is going to be an exciting area for the next few years.
Summary
This lecture we have discussed: Multicast, IGMP, RSVP
References
[59] Joe Abley, f.root-servers.net, NZNOG 2005, February 2005, Hamilton, NZ http://www.isc.org/pubs/pres/NZNOG/2005/F%20Root%20Server.pdf
[60] S. Deering, Host Extensions for IP Multicasting, IETF RFC 1112, August 1989 http://www.ietf.org/rfc/rfc1112.txt
[61] W. Fenner, Internet Group Management Protocol, Version 2, IETF RFC 2236, November 1997 http://www.ietf.org/rfc/rfc2236.txt
[62] B. Cain, S. Deering, I. Kouvelas, B. Fenner, and A. Thyagarajan, Internet Group Management Protocol, Version 3, IETF RFC 3376, October 2002 http://www.ietf.org/rfc/rfc3376.txt
[63] J. Moy, Multicast Extensions to OSPF, IETF RFC 1584, March 1994 http://www.ietf.org/rfc/rfc1584.txt
[64] D. Waitzman, C. Partridge, and S. Deering, Distance Vector Multicast Routing Protocol, IETF RFC 1075, November 1988 http://www.ietf.org/rfc/rfc1075.txt
[65] D. Estrin, D. Farinacci, A. Helmy, D. Thaler, S. Deering, M. Handley, V. Jacobson, C. Liu, P. Sharma, and L. Wei, Protocol Independent Multicast-Sparse Mode (PIM-SM): Protocol Specification, IETF RFC 2362, June 1998 http://www.ietf.org/rfc/rfc2362.txt
[66] A. Adams, J. Nicholas, and W. Siadak, Protocol Independent Multicast Dense Mode (PIM-DM): Protocol Specification (Revised), IETF RFC 3973, January 2005 http://www.ietf.org/rfc/rfc3973.txt
[67] D. Meyer and P. Lothberg, GLOP Addressing in 233/8, IETF RFC 3180, September 2001 http://www.ietf.org/rfc/rfc3180.txt
[68] T. Bates, Y. Rekhter, R. Chandra, and D. Katz, Multiprotocol Extensions for BGP-4, IETF RFC 2858, June 2000 http://www.ietf.org/rfc/rfc2858.txt
[69] Beau Williamson, Developing IP Multicast Networks, Cisco Press, 2000
[70] Internet Protocol Multicast, Cisco, Wed Feb 20 21:50:09 PST 2002 http://www.cisco.com/univercd/cc/td/doc/cisintwk/ito_doc/ipmulti.htm
[71] B. Fenner and D. Meyer (Editors), Multicast Source Discovery Protocol (MSDP), IETF RFC 3618, October 2003 http://www.ietf.org/rfc/rfc3618.txt
[72] T. Speakman, J. Crowcroft, J. Gemmell, D. Farinacci, S. Lin, D. Leshchiner, M. Luby, T. Montgomery, L. Rizzo, A. Tweedly, N. Bhaskar, R. Edmonstone, R. Sumanasekera and L. Vicisano, PGM Reliable Transport Protocol Specification, IETF RFC 3208, December 2001
[73] S. Bhattacharyya (Ed.), An Overview of Source-Specific Multicast (SSM), IETF RFC 3569, July 2003 http://www.ietf.org/rfc/rfc3569.txt
[74] D. Meyer, Administratively Scoped IP Multicast, IETF RFC 2365, July 1998 http://www.ietf.org/rfc/rfc2365.txt
[75] B. Quinn and K. Almeroth, IP Multicast Applications: Challenges and Solutions, IETF RFC 3170, September 2001 http://www.ietf.org/rfc/rfc3170.txt
[76] R. Braden (Ed.), L. Zhang, S. Berson, S. Herzog, and S. Jamin, Resource ReSerVation Protocol (RSVP) -- Version 1 Functional Specification, IETF RFC 2205, September 1997 http://www.ietf.org/rfc/rfc2205.txt
[77] Y. Snir, Y. Ramberg, J. Strassner, R. Cohen, and B. Moore, Policy Quality of Service (QoS) Information Model, IETF RFC 3644, November 2003 http://www.ietf.org/rfc/rfc3644.txt
2G1305 Internetworking/Internetteknik Spring 2005, Period 4 Module 9: Applications: Network Management and VoIP
Lecture notes of G. Q. Maguire Jr.
For use in conjunction with TCP/IP Protocol Suite, by Behrouz A. Forouzan, 3rd Edition, McGraw-Hill. For this lecture: Chapters
1998, 1999, 2000, 2002, 2003, 2005 G.Q.Maguire Jr. All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the author. Last modified: 2005.05.02:22:21
Applications.fm5
2005.05.02
Lecture 5: Outline
Network Management
SNMP
VoIP
ISO FCAPS Network Management Model
Fault management
Configuration management
Accounting management
Performance management
Security management
[Figure: network management: control and information collecting between a management system and managed devices, each with a management DB, directly or via an agent proxy]
[Figure: a management entity uses a network management protocol across the network to reach agents and an agent proxy, each with a management DB; labels: MIB, SMI]
SNMP
Version 1
Version 2 - in 1992-1993, the SNMPv2 Working Group developed a security model based on parties to an SNMP transaction - this was known as SNMPv2p. But the working group decided that a user-based security model was much simpler - and hence more likely to be deployed. In December 1995, the SNMPv2 Working Group was deactivated, but two prominent approaches emerged from independent groups:
SNMPv2u: early standardization of the security features and a minimal specification - to encourage rapid deployment of simple agents; deferred standardization of features for managing large networks
SNMPv2*: concurrent standardization of security and scalability features to ensure that the security design addressed issues of: proxy, trap destinations, discovery, and remote configuration of security. Focus was effective management of medium and large networks.
SNMPv3
In March 1997, the SNMPv3 Working Group was chartered to define a standard for SNMP security and administration. Target: April 1998 - all SNMPv3 specifications submitted to IESG for consideration as Proposed Standards. Based on "An Architecture for Describing SNMP Management Frameworks" (RFC 2271). Composed of multiple subsystems:
1. a message processing and control subsystem - Message Processing and Dispatching for SNMP (RFC 2272)
2. a security subsystem - based on a User-based Security Model (USM) (RFC 2274), provides SNMP message level security (Keyed-MD5 as the authentication protocol and the use of CBC-DES as the privacy protocol - but with support for others); defines a MIB for remotely monitoring/managing the configuration parameters for this security model
3. a local processing subsystem - responsible for processing the SNMP PDUs that operate on local instrumentation, applies access control [View-based Access Control Model (VACM) (RFC 2275)] and invokes method routines to access management information, and prepares a response to the received SNMP request.
4. SNMPv3 Applications (RFC 2273) - includes Proxy Forwarder Applications, which can forward SNMP requests to other SNMP entities, translate SNMP requests of one version into SNMP requests of another version or into operations of some non-SNMP management protocol, and support aggregated managed objects where the value of one managed object depends upon the values of multiple (remote) items.
SNMP
SNMPv1
only 5 commands: get-request, get-next-request, set-request, response, trap
Clear-text password
SNMPv2: 1992-1996
get-bulk-request
inform-request (for proxy)
trap
v2 MIB and M2M MIB
Authentication
SNMPv3: 1997
more security enhancement
View-based access control - so different managers can see different subsets of the information
remote configuration
Case Diagram
To understand the relationship between counters and to make sure that all the data paths for a packet are accounted for.
Figure 90: Case diagram of UDP group (W. R. Stevens, TCP/IP Illustrated, V.1, pg. 367)
SNMP Traps
The agent sends a trap to the manager to indicate that something has happened. The following trap types are defined:
0: coldStart
1: warmStart
2: linkDown
3: linkUp
4: authenticationFailure
5: egpNeighborLoss
6: enterpriseSpecific
Example: start the SNMP agent on sun (.33) and send traps to bsdi; tcpdump output:
1 0.0 sun.snmp > bsdi.snmp-trap: C=traps Trap(28) E:unix.1.2.5 [140.252.13.33] coldStart 20
2 18.86 (18.86) sun.snmp > bsdi.snmp-trap: C=traps Trap(29) E:unix.1.2.5 [140.252.13.33] authenticationFailure 1907
Annotations: snmp is port 161 and snmp-trap is port 162; Trap(28) is the PDU type (length); E: is the enterprise (sysObjectID); the bracketed address is the IP address of the agent; it is followed by the trap type and the timestamp.
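The generic trap numbering listed above lends itself to a small lookup, for example when decoding the trap type field of a received trap. The sketch below only encodes the seven values from the list; `GenericTrap` and `describe_trap` are illustrative names, not part of any SNMP library.

```python
from enum import IntEnum


class GenericTrap(IntEnum):
    """Generic SNMP trap types, numbered as in the list above."""
    coldStart = 0
    warmStart = 1
    linkDown = 2
    linkUp = 3
    authenticationFailure = 4
    egpNeighborLoss = 5
    enterpriseSpecific = 6


def describe_trap(code: int) -> str:
    """Return the trap name, or a marker for values outside the list."""
    try:
        return GenericTrap(code).name
    except ValueError:
        return f"unknown({code})"
```

In the tcpdump example, the first trap would decode from 0 to coldStart and the second from 4 to authenticationFailure.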
Remote MONitoring (RMON)
Notify manager of errors
provide alerts for network problems
collects statistical baseline data (i.e., what is "normal" on this LAN), and
acts as a remote network analyser.
RMON 2
access higher level protocol information,
Point-to-point traffic statistics broken down by higher layer protocols,
eases trouble-shooting, and
enables network capacity planning [and to solve problems before they become problems].
Probes can operate off-line, i.e., they operate even though they may not be in contact with the network management system. Probes are sold by lots of vendors.
RMON1 Statistics
Information collected by examining the MAC layer
Group   Name          Description
Tables
1       Statistics    Statistics for the segment to which the RMON probe is attached
2       History       History (Baselines) of the segment
4       Host          Per host statistics for each individual transmitting and receiving device.
5       Host Top N    Top N reports on base statistics.
6       Matrix        Statistics on all conversations (i.e., who talks to whom)
7       Filter        Match on any part of a frame, including errors (CRC, overruns, etc.)
8       Capture       Collect packets, based on filters, for later retrieval (as if you were a network analyzer)
3       Alarm         Alarms to monitor for user-defined events.
9       Event         Log file for use in conjunction with the Alarm or Filter Group.
Token rings
10      Token Ring    Ring Station Order, Ring Configuration and Source Routing Information.
::= { statistics 1 }
EtherStatsEntry
EtherStatsEntry ::= SEQUENCE {
    etherStatsIndex                  INTEGER (1..65535),
    etherStatsDataSource             OBJECT IDENTIFIER,
    etherStatsDropEvents             Counter,
    etherStatsOctets                 Counter,
    etherStatsPkts                   Counter,
    etherStatsBroadcastPkts          Counter,
    etherStatsMulticastPkts          Counter,
    etherStatsCRCAlignErrors         Counter,
    etherStatsUndersizePkts          Counter,
    etherStatsOversizePkts           Counter,
    etherStatsFragments              Counter,
    etherStatsJabbers                Counter,
    etherStatsCollisions             Counter,
    etherStatsPkts64Octets           Counter,
    etherStatsPkts65to127Octets      Counter,
    etherStatsPkts128to255Octets     Counter,
    etherStatsPkts256to511Octets     Counter,
    etherStatsPkts512to1023Octets    Counter,
    etherStatsPkts1024to1518Octets   Counter,
    etherStatsOwner                  OwnerString,
    etherStatsStatus                 EntryStatus
}
EtherHistoryEntry
EtherHistoryEntry ::= SEQUENCE {
    etherHistoryIndex            INTEGER (1..65535),
    etherHistorySampleIndex      INTEGER (1..2147483647),
    etherHistoryIntervalStart    TimeTicks,
    etherHistoryDropEvents       Counter,
    etherHistoryOctets           Counter,
    etherHistoryPkts             Counter,
    etherHistoryBroadcastPkts    Counter,
    etherHistoryMulticastPkts    Counter,
    etherHistoryCRCAlignErrors   Counter,
    etherHistoryUndersizePkts    Counter,
    etherHistoryOversizePkts     Counter,
    etherHistoryFragments        Counter,
    etherHistoryJabbers          Counter,
    etherHistoryCollisions       Counter,
    etherHistoryUtilization      INTEGER (0..10000)
}
HostEntry
HostEntry ::= SEQUENCE {
    hostAddress            OCTET STRING,
    hostCreationOrder      INTEGER (1..65535),
    hostIndex              INTEGER (1..65535),
    hostInPkts             Counter,
    hostOutPkts            Counter,
    hostInOctets           Counter,
    hostOutOctets          Counter,
    hostOutErrors          Counter,
    hostOutBroadcastPkts   Counter,
    hostOutMulticastPkts   Counter
}
The matrixDSTable - a similar set of statistics (MatrixDSEntry) indexed by destination and source.
RMON2
Information collected from network and higher layer (application) headers (defined by RFC2021)
Group   Name                             Description
Protocols
11      Protocol Directory               List of protocol types the probe is capable of monitoring
12      Protocol Distribution            Number of packets and octets by protocols on a network segment
Network layer
13      Address Mapping                  MAC addresses and corresponding network addresses
14      Network Layer Host               Amount of traffic sent to and from each network address
15      Network Layer Matrix             Amount of traffic between each pair of network addresses
        Network Layer Matrix Top N       Top N conversations over a user-defined period (packet or octet counts)
Higher layers
16      Application Layer Host           Amount of traffic, by protocol
17      Application Layer Matrix         Amount of traffic, by protocol, between each pair of network addresses.
        Application Layer Matrix Top N   Top N conversations over a user-defined period (packet or octet counts)
Group   Description
Probe itself
18      User-created custom History Tables based on supported OIDs.
19      Configuration of various operating parameters of the probe
20      Lists which groups and instances of a group a probe supports
Group
Description
FDDI MAC level and User Data Statistics for FDDI networks
Bandwidth utilization by protocols
Tracks MAC to IP address mappings; including when a host was first and last seen, when a new host appears on the segment
Generate traffic using user-defined packets (including packets with errors)
Works out response times and helps to pin-point WAN failures using ICMP echo-requests initiated from the central site.
SunSoft Solstice: Site Manager, Solstice Domain Manager, and Enterprise Manager -- http://www.sun.com/solstice/index.html
Aprisma Management Technologies Spectrum http://www.aprisma.com/
OpenView World Wide Web Interface DR-Web Manager and Agent SiteScope
Web Based Enterprise Management Initiative (WBEM)
Builds on: Intel's Wired for Management (WfM) effort ==> Distributed Management Task Force (formerly Desktop Management Task Force) and Desktop Management Interface (now DMI 2.0)
The DMI was designed to be:
independent of a specific computer or operating system
independent of a specific management protocol
easy for vendors to adopt
usable locally -- no network required
usable remotely using DCE/RPC, ONC/RPC, or TI/RPC
mappable to existing management protocols (e.g., CMIP, SNMP)
"The DMI procedural interfaces are specifically designed to be remotely accessible through the use of Remote Procedure Calls. The RPCs supported by the DMI include: DCE/RPC, ONC/RPC, and TI/RPC." -- DMI 2.0 Introduction
a service provider entity
two sets of APIs, one set for service providers and management applications to interact (Service Provider API for Components), and the other for service providers and components to interact (Component Provider API), and
a set of services for facilitating remote communication.
Inter-Domain Management, Open Group Technical Standard, C802 ISBN 1-85912-256-6 January 2000 524 pages.
They have specified such things as SNMP MIBs to CORBA-IDL conversion, CORBA-IDL to GDMO/ASN.1 conversion, and a CORBA/SNMP Gateway.
policy management
"Policy Agents: Licensed to Manage/Policy Based Management of Distributed Systems" by Morris Sloman, Department of Computing, Imperial College, London, U.K.
Cisco's CiscoAssure Policy Networking: Enabling Business Applications through Intelligent Networking
Applications
E-mail
E-mail was invented by Ray Tomlinson of BBN in 1972. His e-mail utility program permits listing, forwarding, and responding to e-mails. It was demonstrated at the International Computer Communication Conference (ICCC) that year. It became the first "killer application" of the Internet.
Telnet and FTP
Networked File systems (such as NFS)
X windowing system
Web browsers
The first graphical Web browser (called Mosaic) was introduced in 1993. It was developed at the National Center for Supercomputing at the University of Illinois.
Fixed IP terminal
Universal Service
from a myth to a legal requirement
an evolving service level - not a fixed service or service level!
special subsidies for schools, health care, libraries, etc.
1. The official citation for the new Act is: Telecommunications Act of 1996, Pub. L. No. 104-104, 110 Stat. 56 (1996).
2. http://www.fcc.gov:80/telecom.html
3. For informal background see WTO negotiations on basic Telecommunications - http://www.wto.org/wto/services/tel.htm
WorldCom - from a local player to a global player in ~5 years
WinStar - wireless bypass to >70% of the US population
Qwest - Internet telephony
Level3 - Internet telephony
Delta Three - Internet telephony, subsidiary of RSL Communications Ltd.
Net2Phone - Internet telephony
ITXC and GRIC - concentrating on interconnecting ISPs and selling minutes of voice
Lucent buys Prominet, Ericsson buys ACC, Alcatel buys DSC Communications and Packet Engines, Nortel + Bay Networks becomes Nortel Networks: "unified networks"
Latency
Figure 91: Usability of a voice circuit as a function of end-to-end delay (adapted from a drawing by Cisco) (a)
[Figure: usability (up to 1) versus delay (0 to 900 ms), with regions labelled Toll quality, Satellite, FAX relay/broadcast, and CB Radio; Internet telephony is marked "(past)" and "(now!)"]
a. http://www.packeteer.com/solutions/voip/sld006.htm
However:
Round-trip                                                       hops
Local LAN                                                        0
to northern Sweden (basil.cdt.luth.se)                           8
to Austria (freebee.tu-graz.ac.at)                               18
To server in US network                                          19
To my machine in the US (~30 ms is the ISDN link)                21
To KTH's subnet at Stanford University in the US (ssvl.stanford.edu)   20
[Figure: handsets in public cells, home, mobile, and office settings; DECT exchange; E-1 to PSTN; Ascend ISDN-PRI; GSM Electrum GSM system]
Voice Gateway
[Figure: voice gateway with an ISDN interface (2B+D or 30B+D or digital path), A/D and D/A converters, CPU, and a LAN adaptor attached to the LAN]
Terminate incoming synchronous voice calls, compress the voice, encapsulate it into packets, and send it as IP packets. Incoming IP voice packets are unpacked, decompressed, buffered, and then sent out as synchronous voice to the PSTN connection.
Global directory mapping
Translate between the names and IP addresses of the Internet world and the E.164 telephone numbering scheme of the PSTN network.
Authentication and billing
Voice representation
ITU G.723.1 algorithm for voice encoding/decoding or G.729 (CS-ACELP voice compression).
Signaling
Based on the H.323 standard on the LAN; conventional signaling will be used on telephone networks.
Fax Support
Both store-and-forward and real-time fax modes - with store-and-forward the system records the entire FAX before transmission.
Management
Full SNMP management capabilities via MIBs (Management Information Base) will be provided to control all functions of the Gateway. Extensive statistical data will be collected on dropped calls, lost/resent packets, and network delays.
Compatibility
De jure standards:
ITU G.723.1/G.729 and H.323
VoIP Forum IA 1.0
De facto standards:
Netscape's CoolTalk
Microsoft's NetMeeting
A protocol to keep your eyes on: Session Initiation Protocol (SIP) [RFC 2543], much simpler than H.323
Premises to Network
Surf&Call: a Web browser plug-in enables online customers to connect directly from a website to a live sales or support agent on a regular telephone.
NMS Communications (formerly ViaDSP, Inc.)
http://www.viadsp.com/ PacketTel Gateway - a carrier class gateway with real-time voice support, ITU G.723.1, G.729a; Hybrid echo cancellation, Silence suppression
3Com
licensed an H.323 toolkit from DataBeam Corp + Total Control HiPer Access System remote access device voice gateway. Note that they simply download different software into the DSPs, which normally act as a 56 kbps modem.
joined with Siemens to form a joint venture company (US$100M) to do Internet and LAN telephony
Packeteer's PacketShaper
Cisco 3800 supports even more CODECs:
ITU G.726 standard, 32k rate
ITU G.726 standard, 24k rate
ITU G.726 standard, 16k rate
ITU G.728 standard, 16k rate (default)
ITU G.729 standard, 8k rate
By using Voice Activity Detection (VAD) you only need to send traffic if there is something to send. An interesting aspect is that users worry when they hear absolute silence, so to help make them comfortable it is useful to play noise when there is nothing to output. Cisco provides a comfort-noise command to generate background noise to fill silent gaps during calls if VAD is activated.
Cisco 3600 series router can be used as the voice gateway with software such as Microsoft NetMeeting.
Cisco 3800 also supports fax-relay - at various rates either current voice rate or
http://www.cisco.com/univercd/cc/td/doc/product/software/ios113ed/113t/113t_1/voip/config.htm
Wireless LANs
The wireless workplace will soon be upon us (1)
Telia has strengthened its position within the area of radio-based data solutions through the acquisition of Global Cast Internetworking. The company will primarily enhance Telia Mobile's offering in wireless LANs and develop solutions that will lead to the introduction of the wireless office. A number of different alternatives to fixed data connections are currently under development and, later, wireless IP telephony will also be introduced. The acquisition means that Telia Mobile has secured the resources it needs to maintain its continued expansion and product development within the field of radio-based LAN solutions. Radio LANs are particularly suitable for use by small and medium-sized companies as well as by operators of public buildings such as airports and railway stations. Today's radio-LAN technology is based on inexpensive products that do not require frequency certification. They are easy to install and are often used to replace cabled data networks in, for example, large buildings. [emphasis added by Maguire]
1. Telia press announcement: 1999-01-25
Telia's HomeRun
http://www.homerun.telia.com/
A subscription based service to link you to your corporate network from airports, train stations, ferry terminals, hotels, conference centers, etc. Look for Telia's HomeRun logo:
AT&T VoIP phone: http://www.telephones.att.com/new_prod.html Deutsche Telekom running a pilot Internet telephony service using networking products from Ascend Communications and VocalTec.
1. Mary E. Thyfault, Equant To Roll Out Voice-Over-Frame Relay Service, InformationWeek Daily, 10/21/98.
Gatekeeper
To control an H.323 VoIP network Ericsson has introduced a product called H.323 Gatekeeper. It provides for control of:
How much traffic is allocated to voice, video, and data
Network bandwidth management
Routing when there are multiple H.323 Gateways
Network Subscriber Access management
Charging/Billing Systems
Adding new Services & Applications
Network Security and Subscriber Authentication
Gatekeeper uses RAS (Registration, Admission, and Status) for call signalling and its communication.
AT&T Kokusai Denshin Denwa (KDD) Co. Ltd. (Japan) Deutsche Telekom Telstra Corp. (Australia) Embratel (Brazil) Bezeq (Israel)
Economics
"Can Carriers Make Money On IP Telephony?" by Bart Stuck and Michael Weingarten, Business Communication Review, Volume 28, Number 8, August 1998, pp. 39-44 - http://www.bcr.com/bcrmag/08/98p39.htm
"What is the reality in the battle over packet-versus-circuit telephony, and what is hype? Looking at the potential savings by cost element, it is clear that in 1998, access arbitrage is the major economic driver behind VOIP. By 2003, we anticipate that switched-access arbitrage will diminish in importance, as the ESP exemption disappears and/or access rates drop to true underlying cost. However, we believe that the convergence between voice and data via packetized networks will offset the disappearance of a gap in switched access costs. As a result, VOIP will continue to enjoy a substantial advantage over circuit-switched voice. Indeed, as voice/data convergence occurs, we see standalone circuit-switched voice becoming economically nonviable."
VoIP Handsets
In addition to the WLAN VoIP handset, USB attached VoIP handsets are now starting to appear:
TigerJet Network http://www.tjnet.com/solutions/usb_handset.htm
Coming are VoIP cellular handsets
Conferences
VON (Voice on the Net)
Patents
Mixing voice and data in the LAN goes back to at least this patent:
4581735: Local area network packet protocol for combined voice and data transmission
INVENTORS: Flamm; Lois E., Chatham Township, Morris County, NJ
Limb; John O., Berkeley Heights, NJ
ASSIGNEES: AT&T Bell Laboratories, Murray Hill, NJ
ISSUED: Apr. 8, 1986
FILED: May 31, 1983
ABSTRACT: In order to control the transfer of packets of information among a plurality of stations, the instant communications system, station and protocol contemplate first and second oppositely directed signal paths. At least two stations are coupled to both the first and the second signal paths. A station reads one signal from a path and writes another signal
on the path. The one signal is read by an arrangement which electrically precedes the arrangement for writing the other signal. Packets are transmitted in a regular, cyclic sequence. A head station on a forward path writes a start cycle code for enabling each station to transmit one or more packets. If a station has a packet to transmit, it can read the busy field of a packet on the forward path. Responsive thereto, a logical interpretation may be made as to whether the forward path is busy or is not busy. If the path is not busy, the packet may be written on the path by overwriting any signal thereon including the busy field. If the path is busy, the station may defer the writing until the path is detected as not busy. In order to accommodate different types of traffic, the head station may write different start cycle codes. For example, a start-of-voice code may enable stations to transmit voice packets; a start-of-data code may enable stations to transmit data packets, etc. for the different types of traffic. Further, the start cycle codes may be written in a regular, e.g., periodic, fashion to mitigate deleterious effects, such as speech clipping. Still further, the last station on the forward path may write end cycle codes in packets on a reverse path for communicating control information to the head station. Responsive to the control information, the head station may modify the cycle to permit the respective stations to, for example, transmit more than one packet per cycle or to vary the number of packet time slots, which are allocated to each of the different types of traffic.
Deregulation Trends
replacing multiplexors with Routers/Switches/ << 1/10 circuit switch cost
Standard telco interfaces being replaced by datacom interfaces
New Alliances:
HP/AT&T Alliance - a specific application: electronic commerce
3Com/Siemens, Bay/Ericsson, Cabletron/Nortel, Alcatel integrating Cisco IOS software technology, Ericsson Radio Systems & Cisco Systems collaborate on wireless Internet services
Telecom (only) operators have no future
Telecom (only) companies have no future
VoIP details
Carry the speech frame inside an RTP packet
IPv4/6 header: 20/40 octets
IPv4/6 + UDP + RTP headers together: 40/60 octets
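The header overhead above can be turned into a rough bandwidth estimate. The sketch below is only illustrative: the 8-octet UDP and 12-octet RTP header sizes are the standard fixed sizes (they are not spelled out on the slide), and the 20-octet payload every 20 ms is an assumed example corresponding to an 8 kbit/s codec such as G.729.

```python
# Header sizes in octets. The IP sizes and the 40/60 totals match the slide;
# the UDP and RTP sizes are the standard fixed header lengths.
IP_HEADER = {"IPv4": 20, "IPv6": 40}
UDP_HEADER = 8
RTP_HEADER = 12


def header_overhead(ip_version: str) -> int:
    """Total IP + UDP + RTP header octets carried by every voice packet."""
    return IP_HEADER[ip_version] + UDP_HEADER + RTP_HEADER


def stream_bitrate(payload_octets: int, frame_ms: int, ip_version: str) -> int:
    """Bits per second at the IP layer for one direction of a call."""
    packet_bits = (payload_octets + header_overhead(ip_version)) * 8
    return packet_bits * 1000 // frame_ms


if __name__ == "__main__":
    # Assumed example: 20 octets of speech every 20 ms (8 kbit/s codec).
    for version in ("IPv4", "IPv6"):
        print(version, header_overhead(version), stream_bitrate(20, 20, version))
```

With these assumptions the headers are larger than the speech payload, which is one reason header compression is attractive on slow links.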
RTP: Real-Time Transport Protocol
P - whether zero padding follows the payload.
X - whether a header extension is present or not.
M - marker for the beginning of each frame.
PTYPE - type of payload.
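To make the P, X, M, and PTYPE bits concrete, here is a minimal sketch that decodes the 12-octet fixed RTP header. The function name and the returned dictionary keys are illustrative choices, not part of any library; the bit layout (version, P, X, CSRC count, M, payload type, sequence number, timestamp, SSRC) follows the RTP specification.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-octet fixed RTP header at the start of a packet."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header needs at least 12 octets")
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,              # 2 bits
        "padding": bool(byte0 & 0x20),      # P bit
        "extension": bool(byte0 & 0x10),    # X bit
        "csrc_count": byte0 & 0x0F,         # number of CSRC identifiers
        "marker": bool(byte1 & 0x80),       # M bit
        "payload_type": byte1 & 0x7F,       # PTYPE, 7 bits
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    # Hypothetical packet: version 2, marker set, payload type 18.
    sample = struct.pack("!BBHII", 0x80, 0x80 | 18, 1, 160, 0x1234)
    print(parse_rtp_header(sample))
```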
H.323 is the framework of a group of protocols for IP telephony (from ITU)
H.225 - Signaling used to establish a call
H.245 - Control and feedback during the call
T.120 - Exchange of data associated with a call
RTP - Real-time data transfer
RTCP - Real-time Control Protocol
SIP redirect server - directs the client to contact an alternate URI
Location server - knows the current binding (from REGISTER msgs)
SIP uses SDP (Session Description Protocol) to get information about a call, such as the media encoding, protocol port number, multicast addresses, etc.
SIP timeline
Alice invites Bob to a SIP session:
Alice -> Bob: INVITE
Bob -> Alice: 200 OK
Alice -> Bob: ACK
(session)
BYE
SIP Invite (1)
INVITE sip:[email protected] SIP/2.0
Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhds
To: Bob <sip:[email protected]>
From: Alice <sip:[email protected]>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 314159 INVITE
Contact: <sip:[email protected]>
Content-Type: application/sdp
Content-Length: 142
(Alice's SDP not shown)
SIP is a text-based protocol and uses the ISO 10646 character set in UTF-8 encoding (RFC 2279). The message body uses MIME and can use S/MIME for security. The generic form of a message is:
generic-message = start-line
                  message-header*
                  CRLF
                  [ message-body ]
1. Example from draft-ietf-sip-rfc2543bis-06.ps
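Because SIP is text-based, the generic message form can be split with a few string operations. The following is a minimal sketch, not a complete SIP parser: it ignores header line folding and compact header names, and the addresses in the sample message are hypothetical placeholders under example.com.

```python
def parse_sip_message(raw: str):
    """Split a SIP message into (start_line, headers, body).

    Headers are returned as a list of (name, value) pairs because
    some headers, such as Via, may legitimately appear several times.
    """
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return start_line, headers, body


# Hypothetical request used only to exercise the parser.
EXAMPLE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP host.example.com;branch=z9hG4bKabc\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

if __name__ == "__main__":
    print(parse_sip_message(EXAMPLE))
```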
Bob's response (1)
SIP/2.0 200 OK
Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhds
Via: SIP/2.0/UDP bigbox3.site3.atlanta.com;branch=z9hG4bK77ef4c2312983.1
Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bKnashds8
To: Bob <sip:[email protected]>;tag=a6c85cf
From: Alice <sip:[email protected]>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 314159 INVITE
Contact: <sip:[email protected]>
Content-Type: application/sdp
Content-Length: 131
(Bob's SDP not shown)
SIP Methods
Method - Purpose
Invite - Invites a user to join a call.
Bye - Terminates the call between two of the users on a call.
Options - Requests information on the capabilities of a server.
Ack - Confirms that a client has received a final response to an INVITE.
Register - Provides the map for address resolution; this lets a server know the location of a user.
Cancel - Ends a pending request, but does not end the call.
1xx Provisional - request received, continuing to process the request
2xx Success - the action was successfully received, understood, and accepted
3xx Redirection - further action needs to be taken in order to complete the request
4xx Client Error - the request contains bad syntax or cannot be fulfilled at this server
5xx Server Error - the server failed to fulfill an apparently valid request
6xx Global Failure - the request cannot be fulfilled at any server
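The class of a SIP response is determined by the first digit of its three-digit status code. A small sketch of that mapping follows; the function and table names are illustrative only.

```python
SIP_STATUS_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def sip_status_class(code: int) -> str:
    """Return the response class for a SIP status code such as 200 or 404."""
    if not 100 <= code <= 699:
        raise ValueError("SIP status codes range from 100 to 699")
    return SIP_STATUS_CLASSES[code // 100]


if __name__ == "__main__":
    for code in (180, 200, 302, 404, 503, 600):
        print(code, sip_status_class(code))
```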
ENUM
The IETF's E.164 Number Mapping standard uses the Domain Name System (DNS) to map standard International Telecommunication Union (ITU-T) international public telecommunications numbering plan (E.164) telephone numbers to a list of Uniform Resource Locators (URLs). SIP then uses those URLs to initiate sessions. For example, ENUM DNS converts a telephone number in E.164 format, e.g. +46812345, and returns e.g., a Uniform Resource Identifier (URI) SIP:[email protected]. Then a SIP client can make a connection to the SIP gateway telia.se passing the local part olle.svenson. ENUM can return a wide variety of URI types.
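The first step of an ENUM lookup is purely mechanical: the digits of the E.164 number are reversed, separated by dots, and placed under a DNS suffix. The sketch below assumes the e164.arpa suffix defined in the ENUM RFCs (it is not named on the slide) and stops before the actual DNS NAPTR query.

```python
def e164_to_enum_domain(number: str, suffix: str = "e164.arpa") -> str:
    """Turn an E.164 number such as +46812345 into its ENUM domain name."""
    digits = [ch for ch in number if ch.isdigit()]
    if not digits:
        raise ValueError("no digits found in E.164 number")
    # Reverse the digits so the most specific part comes first, as in DNS.
    return ".".join(reversed(digits)) + "." + suffix


if __name__ == "__main__":
    print(e164_to_enum_domain("+46812345"))
```

A resolver would then ask DNS for the NAPTR records of that name and pick a URI from the answers.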
Further Reading
IP Telephony (iptel)
PSTN and Internet Internetworking (pint)
Also important are the measures of delay, delay jitter, throughput, packet loss, etc. IP Performance Metrics (ippm) is attempting to specify how to measure and exchange information about measurements of these quantities.
A great set of references compiled by Prof. Raj Jain is available at: http://www.cis.ohio-state.edu/~jain/refs/ref_voip.htm
Summary
This lecture we have discussed:
Network Management
SNMP
VoIP (including RTP)
For use in conjunction with TCP/IP Protocol Suite, by Behrouz A. Forouzan, 3rd Edition, McGraw-Hill. For this lecture: Chapter 27
1998, 1999, 2000, 2002, 2003, 2005 G.Q.Maguire Jr. All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the author. Last modified: 2005.05.16:19:02
Lecture 6: Outline
IPv6
Growth
Currently IPv4 serves a market doubling every ~12 months.
In addition, new and very large markets are developing rapidly:
Nomadic Computing
Networked Entertainment
Device Control
Nomadic Computing
Wireless computers
supporting multimedia replacing pagers, cellular telephones,
Networked Entertainment
Your TV will be an Internet Host! [consider the network attached Personal Video Recorders (PVR), such as TiVo's DVR, SONICblue's ReplayTV, Sony's SVR-2000, Philips PTR, ...]
500 channels of television
large scale routing and addressing
auto-configuration
requires support for real-time data
SONICblue's ReplayTV 4000 - a networked Digital Video Recorder (DVR) (i.e., coder/decoder + very big disk) that takes advantage of your broadband Internet connection - enables you to capture and transfer videos.
Providing narrowcast content via broadband - all the time is primetime.
Device Control
Control everyday devices for
lighting, heating and cooling, motors, ...
new street light controllers already have IP addresses!
electrical outlets with addresses
networked vehicles (within the vehicle (1), between vehicles, and vehicles to infrastructure) (2)
1. On-Board Diagnostic systems (OBD-II), see slide 8 [79]
2. See InternetCAR, slide 4 (showing a Yokohama City bus) [79]
IPv6 features
Expanded Addressing Capabilities
128 bit address length
supports more levels of hierarchy
improved multicast routing by using a scope field
new cluster addresses to identify topological regions
[Figure: IPv6 header (total length = 40 bytes), including Source Address (128 bits) and Destination Address (128 bits)]
IPv6: 6 fields + 2 addresses versus IPv4: 10 fixed fields + 2 addresses + options
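The 40-byte fixed header is easy to build and decode by hand. The sketch below assumes the field layout of RFC 2460 [78] (version, traffic class, flow label, payload length, next header, hop limit, then the two addresses); the helper names are illustrative, and the sample addresses come from the 2001:db8::/32 documentation prefix.

```python
import ipaddress
import struct


def build_ipv6_header(traffic_class, flow_label, payload_length,
                      next_header, hop_limit, src, dst) -> bytes:
    """Pack the 40-octet IPv6 fixed header."""
    # First 32-bit word: version (4 bits) | class (8 bits) | flow label (20 bits).
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return (struct.pack("!IHBB", first_word, payload_length, next_header, hop_limit)
            + ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed)


def parse_ipv6_header(data: bytes) -> dict:
    """Decode the 40-octet IPv6 fixed header."""
    if len(data) < 40:
        raise ValueError("IPv6 fixed header is 40 octets")
    first_word, payload_length, next_header, hop_limit = struct.unpack("!IHBB", data[:8])
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "src": ipaddress.IPv6Address(data[8:24]),
        "dst": ipaddress.IPv6Address(data[24:40]),
    }


if __name__ == "__main__":
    header = build_ipv6_header(0, 0x12345, 20, 6, 64, "2001:db8::1", "2001:db8::2")
    print(len(header), parse_ipv6_header(header))
```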
Demultiplexing
Initially, it was assumed that by keeping the version field in the same position, IPv4 and IPv6 could be mixed over the same links with the same link drivers. However, IPv6 is now demultiplexed at the link layer: hence, IPv6 has been assigned the Ethernet type 0x86DD (instead of IPv4's 0x0800).
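Link-layer demultiplexing amounts to a lookup on the EtherType field of the frame. A minimal sketch for untagged Ethernet II frames follows (VLAN tags and 802.3 length fields are deliberately ignored); note that the registered EtherType for IPv4 is 0x0800.

```python
import struct

ETHERTYPES = {
    0x0800: "IPv4",
    0x86DD: "IPv6",
}


def demultiplex(frame: bytes) -> str:
    """Return the network-layer protocol named by the frame's EtherType."""
    if len(frame) < 14:
        raise ValueError("Ethernet header is 14 octets")
    # Octets 0-5: destination MAC, 6-11: source MAC, 12-13: EtherType.
    (ethertype,) = struct.unpack("!H", frame[12:14])
    return ETHERTYPES.get(ethertype, "other")


if __name__ == "__main__":
    ipv6_frame = bytes(12) + b"\x86\xdd" + bytes(40)
    print(demultiplex(ipv6_frame))
```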
Simplifications
IPv6 builds on 20 years of internetworking experience - which led to the following simplifications and benefits:
Simplification - Benefits
Use fixed format headers - Use extension headers instead, thus no need for a header length field, simpler to process
Eliminate header checksum - Eliminate need for recomputation of checksum at each hop (relies on link layer or higher layers to check the integrity of what is delivered)
No segmentation, thus you must do Path MTU discovery or only send small packets (1996: 536 octets, 1997: proposed 1500 octets) (for observed PMTUs see [81])
This is because we should have units of control based on the units of transmitted data. Instead use (labeled) flows
Quality-of-Service Capabilities
for packet streams
Flow characterized by flow id + source address + destination address
unique random flow id for each source
Header fields: CLASS (8 bits), FLOW ID (20 bits)
Class field:
D (1 bit) - Delay sensitive
Network-wide priority (3 bits) - Encodes the priority of traffic, can be used to provide Differentiated services
Reserved (4 bits) - Researchers would like to use two of these bits for congestion avoidance control: one bit which could be set by routers to indicate that congestion was experienced; the other bit could be used by the source to mark that it is ready to adapt.
Flow ID - indicates packets which should all be handled the same way. Originally specified in RFC 1809: Using the Flow Label Field in IPv6. Subsequently updated - see Chapter 6 of Huitema, 2nd edition; this change occurred because of Steve McCanne's SIGCOMM'96 paper [83]. Note that chapter 27 in Forouzan is incorrect!
Payload length
Payload length is the length of the data carried after the header. As the length field is 16 bits, the maximum packet size is 64 kilobytes; but there is a provision for "jumbograms" [via the Hop-by-Hop option header with option type 194]. See RFC 2675 [84].
IPv4 Protocol type and IPv6 Next Header type
Decimal  Keyword  Header type
0        HBH      Hop-by-hop options
3        GGP      Gateway-to-Gateway Protocol
5        ST       Stream
6        TCP      Transmission Control Protocol
17       UDP      User Datagram Protocol
43       RH       IPv6 Routing Header
44       FH       IPv6 Fragmentation Header
45       IDRP     Inter-domain Routing Protocol
50       ESP      Encapsulating Security Payload
51       AH       Authentication Header
58       ICMP     IPv6 ICMP
59       Null     No next Header (IPv6)
60                IPv6 Destination Options Header
88       IGRP     IGRP
89       OSPF     Open Shortest Path First
255               Reserved
Extension headers
Each header is a multiple of 8 octets long
Order (after IPv6 header):
Hop-by-hop option
Destination options header (1)
Routing header
Fragment header
Authentication header
Encapsulating security payload header
Destination options header (2)
Followed by the upper layer header (e.g., TCP, UDP, ...)
If we wanted to explicitly route the above packet, we simply add a routing header:
[IPv6 header, Next Header = Routing] [Routing header, Next Header = TCP] [TCP header + data]
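The chaining of headers through the Next Header field can be sketched as a simple loop. This is a simplified walker under stated assumptions: it handles only the hop-by-hop, routing, destination options, and fragment headers, it assumes the RFC 2460 [78] encodings (a length octet counted in 8-octet units excluding the first 8 octets, and a fixed 8-octet fragment header), and it does not attempt the authentication or ESP headers.

```python
# Extension headers that carry a "Hdr Ext Len" octet (in 8-octet units,
# not counting the first 8 octets).
VARIABLE_LENGTH = {0: "Hop-by-hop", 43: "Routing", 60: "Destination options"}
FRAGMENT = 44  # the fragment header is always exactly 8 octets


def walk_extension_headers(first_next_header: int, payload: bytes):
    """Follow the Next Header chain that starts right after the IPv6 header.

    Returns (names of extension headers seen, upper-layer protocol number,
    offset of the upper-layer header within payload).
    """
    chain = []
    next_header = first_next_header
    offset = 0
    while next_header in VARIABLE_LENGTH or next_header == FRAGMENT:
        if offset + 8 > len(payload):
            raise ValueError("truncated extension header")
        following = payload[offset]
        if next_header == FRAGMENT:
            length = 8
            chain.append("Fragment")
        else:
            length = (payload[offset + 1] + 1) * 8
            chain.append(VARIABLE_LENGTH[next_header])
        if offset + length > len(payload):
            raise ValueError("truncated extension header")
        offset += length
        next_header = following
    return chain, next_header, offset


if __name__ == "__main__":
    # Routing header carrying one address: Next Header = 6 (TCP), Hdr Ext Len = 2.
    routing = bytes([6, 2, 0, 1]) + bytes(4) + bytes(16)
    print(walk_extension_headers(43, routing + bytes(20)))
```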
Addressing
128 bits long
three types: unicast, multicast, anycast
Unicast - identifies exactly one interface
Multicast - identifies a group of interfaces; a packet sent to a multicast address will be delivered to all members of the group
Anycast - delivered to the nearest member of the group
2^96 times more addresses than IPv4 are available!!!
IPv6 addresses per m^2:
Earth: 511,263,971,197,990 m^2
665,570,793,348,866,943,898,599 / m^2
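Both figures follow directly from integer arithmetic on the address sizes. The short sketch below recomputes them with Python's arbitrary-precision integers; the surface area constant is the value quoted on the slide.

```python
EARTH_SURFACE_M2 = 511_263_971_197_990  # surface area quoted on the slide


def address_space_ratio() -> int:
    """How many times larger the IPv6 address space is than IPv4's."""
    return 2**128 // 2**32


def addresses_per_square_meter() -> int:
    """Whole IPv6 addresses available per square meter of the Earth."""
    return 2**128 // EARTH_SURFACE_M2


if __name__ == "__main__":
    print(address_space_ratio() == 2**96)
    print(f"{addresses_per_square_meter():,}")
```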
Address Allocation [94] and [99]
Prefix (binary)   Prefix (hex)  Fraction  Assignment
0000 0000         ::/8          1/256     Reserved
0000 0001         100::/8       1/256     Unassigned
0000 001          200::/7       1/128     Network Service Access Point (NSAP) Allocation - RFC 1888
0000 01           400::/6       1/64      Unassigned (first half was formerly Novell's IPX)
0000 1            800::/5       1/32      Unassigned
0001              1000::/4      1/16      Unassigned
001               2000::/3      1/8       Global Unicast - RFC 2374, see RFC 3587
010               4000::/3      1/8       Unassigned (formerly provider based unicast addresses)
011               6000::/3      1/8       Unassigned
100               8000::/3      1/8       Unassigned (formerly Geographic-based Unicast Addresses)
101               A000::/3      1/8       Unassigned
110               C000::/3      1/8       Unassigned
1110              E000::/4      1/16      Unassigned
1111 0            F000::/5      1/32      Unassigned
1111 10           F800::/6      1/64      Unassigned
1111 110          FC00::/7      1/128     Unassigned
1111 1110 0       FE00::/9      1/512     Unassigned
1111 1110 10      FE80::/10     1/1024    Link Local Use Addresses
1111 1110 11      FEC0::/10     1/1024    Reserved for IANA (was Site Local Use Addresses)
1111 1111         FF00::/8      1/256     Multicast Addresses
Thus the Regional Internet Registries are allocating addresses from 2000::/3. For a table of IPv6 unicast assignments see
http://www.iana.org/assignments/ipv6-unicast-address-assignments
For an analysis of use from the point of view of RIPE see [101]
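The prefix allocation can be checked mechanically with Python's standard ipaddress module. The sketch below covers only four of the allocated prefixes and uses labels taken from the allocation described in [94]; the function name and the sample addresses are illustrative.

```python
import ipaddress

# A few of the allocated prefixes; all other space is treated as "Other".
PREFIXES = [
    (ipaddress.ip_network("2000::/3"), "Global Unicast"),
    (ipaddress.ip_network("FE80::/10"), "Link Local Use"),
    (ipaddress.ip_network("FEC0::/10"), "Reserved (was Site Local Use)"),
    (ipaddress.ip_network("FF00::/8"), "Multicast"),
]


def classify(address: str) -> str:
    """Return the allocation block an IPv6 address falls into."""
    ip = ipaddress.IPv6Address(address)
    for network, label in PREFIXES:
        if ip in network:
            return label
    return "Other"


if __name__ == "__main__":
    for sample in ("2001:db8::1", "fe80::1", "ff02::1", "::1"):
        print(sample, classify(sample))
```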
Interface ID
Must be unique to the link, but there are some advantages to making it more globally unique. Hence, most will be based on the IEEE EUI-64 format, but with the u (unique) bit inverted. The u bit is the 7th most significant bit of a 64 bit EUI. The inversion was necessary because 0:0:0:0 is a valid EUI, but this would collide with one of the IPv6 special addresses. u=1 when the address comes from a valid EUI, and is 0 otherwise. To go from a 48 bit IEEE 802 address, you insert 0xFFFE in between the 3rd and 4th octets of the IEEE 802 address, i.e., 123456789abc becomes 123456FFFE789abc.
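The two steps (insert 0xFFFE, then invert the u bit) can be sketched as follows. The function name is illustrative; the u bit is the 0x02 bit of the first octet, so for the example address the EUI-64 123456FFFE789abc yields the interface ID 103456FFFE789abc.

```python
def mac_to_interface_id(mac: str) -> str:
    """Derive a 64-bit IPv6 interface ID from a 48-bit IEEE 802 address."""
    octets = bytearray.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit IEEE 802 address")
    # Step 1: insert 0xFFFE between the 3rd and 4th octets (EUI-64 form).
    eui64 = octets[:3] + b"\xff\xfe" + octets[3:]
    # Step 2: invert the u bit, the 7th most significant bit of the EUI.
    eui64[0] ^= 0x02
    return eui64.hex()


if __name__ == "__main__":
    print(mac_to_interface_id("12:34:56:78:9a:bc"))
```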
Link local addresses
Link local addresses are simply unique to a given link - they can be used by stations that have not yet been assigned a provider-based address.
Format: 1111111010 (10 bits) | 0 (54 bits) | Interface ID (64 bits)
Multicast Addresses
Format: 1111 1111 (8 bits) | Flags xxxT (4 bits) | Scope (4 bits) | group id (112 bits)
T == Transient
T=0 - well-known permanent - assigned by the IANA
T=1 - non-permanent
Scope
0 - reserved
1 - node local scope
2 - link local scope
3, 4 - unassigned
5 - site local scope
6, 7 - unassigned
8 - organization local scope
9, A, B, C, D - unassigned
E - global scope
F - reserved
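The flags and scope nibbles sit in the second octet of a multicast address, so they can be read with two shifts and masks. The sketch below uses the standard ipaddress module; the function name and the returned keys are illustrative, and ff15::abcd is simply a made-up transient, site-scope group.

```python
import ipaddress

SCOPES = {
    0x1: "node local",
    0x2: "link local",
    0x5: "site local",
    0x8: "organization local",
    0xE: "global",
}


def parse_multicast(address: str) -> dict:
    """Extract the T flag, scope, and group id from an IPv6 multicast address."""
    packed = ipaddress.IPv6Address(address).packed
    if packed[0] != 0xFF:
        raise ValueError("not a multicast address (first octet must be FF)")
    flags = packed[1] >> 4
    scope = packed[1] & 0x0F
    return {
        "transient": bool(flags & 0x1),  # T bit: 0 = permanent, 1 = non-permanent
        "scope": SCOPES.get(scope, "reserved or unassigned"),
        "group_id": int.from_bytes(packed[2:], "big"),
    }


if __name__ == "__main__":
    print(parse_multicast("ff02::1"))
    print(parse_multicast("ff15::abcd"))
```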
unassigned
Link Name
All DHCP agents on this link
All DHCP servers on this link
All DHCP relays on this link
All DHCP agents at this site
All DHCP servers at this site
All DHCP relays at this site
Session Announcement Protocol (SAP) v1 Announcements
Multimedia conferences:
FF0X:0:0:0:0:0:2:8000 .. FF0X:0:0:0:0:0:2:FFFF multimedia conferences
X=2 -- this link; X=5 -- this site Use SAP to announce the conference - repeatedly until the end of the conference.
Anycast
Sending a packet to a generic address to get a specific service from the nearest instance. This puts the burden of determining which instance to deliver it to on the routing system. Requires defining a router entry for each anycast address.
Subnet Anycast Address:
Subnet prefix (n bits) | 0 (128-n bits)
IPv6 Routing
all standard routing protocols
routing extensions:
Provider Selection
Host Mobility (route to current location)
Auto-Readdressing (route to new address)
Figure 92: IPv6 Routing Option (nodes shown: SRC, P1, PR, DEST); provider specifies: SRC, PR, P1, Dest; reply: Dest, PR, P1, SRC
Routing header
Next Header (8 bits) | Header Ext Length (8 bits) | Routing Type=0 (8 bits) | Segments Left (8 bits)
reserved (32 bits)
address[1] (128 bits)
address[2]
...
address[n]
Next Header identifies the next header in the chain of headers.
Header Ext. Length - number of 64 bit words (not including the first 64 bits).
Routing type=0 is the generic routing header which all IPv6 implementations must support.
Segments Left is the number of segments left in the list (between 0 and 23).
Fragment header
Next Header (8 bits) | Reserved (8 bits) | Fragment offset (13 bits) | RESERVED (2 bits) | M (1 bit)
Identification (32 bits)
Fragment offset - in units of 64 bit words; the field is the most significant 13 bits of a 16 bit word.
M == More fragments bit, set in all but the last fragment
Identification - a 32 bit number
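Since the offset occupies the top 13 bits of a 16-bit word and M the lowest bit, packing and unpacking the 8-octet fragment header takes only a shift and a mask. The helper names below are illustrative; the offset is converted to octets by multiplying by 8.

```python
import struct


def pack_fragment_header(next_header: int, offset_units: int,
                         more_fragments: bool, identification: int) -> bytes:
    """Build the 8-octet IPv6 fragment header."""
    if not 0 <= offset_units < 2**13:
        raise ValueError("fragment offset must fit in 13 bits")
    # Offset in the top 13 bits, 2 reserved bits, M in the lowest bit.
    offset_and_flags = (offset_units << 3) | int(more_fragments)
    return struct.pack("!BBHI", next_header, 0, offset_and_flags, identification)


def unpack_fragment_header(data: bytes) -> dict:
    """Decode the 8-octet IPv6 fragment header."""
    next_header, _reserved, offset_and_flags, identification = struct.unpack(
        "!BBHI", data[:8])
    return {
        "next_header": next_header,
        "offset_octets": (offset_and_flags >> 3) * 8,  # units of 64 bits
        "more_fragments": bool(offset_and_flags & 0x1),
        "identification": identification,
    }


if __name__ == "__main__":
    header = pack_fragment_header(6, 181, True, 0xDEADBEEF)
    print(unpack_fragment_header(header))
```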
Action tells what action must be taken if the processing node does not recognize the option.
Bits - Action
00 - Skip over this option
01 - Discard packet silently (i.e., without sending an ICMP report)
10 - Discard packet and send an ICMP report - even if destination is multicast
11 - Discard packet and send an ICMP report - only if destination is not multicast
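The action is encoded in the two most significant bits of the 8-bit option type, as defined in RFC 2460 [78]. A minimal sketch of the lookup follows; the function name and the short action strings are illustrative.

```python
ACTIONS = {
    0b00: "skip over this option",
    0b01: "discard packet silently",
    0b10: "discard packet, send ICMP report even if destination is multicast",
    0b11: "discard packet, send ICMP report only if destination is not multicast",
}


def unrecognized_option_action(option_type: int) -> str:
    """Action for an unrecognized option, from the top two bits of its type."""
    if not 0 <= option_type <= 0xFF:
        raise ValueError("option type is a single octet")
    return ACTIONS[(option_type >> 6) & 0b11]


if __name__ == "__main__":
    # 0 = Pad1, 1 = PadN, 194 = Jumbo payload
    for option_type in (0, 1, 194):
        print(option_type, unrecognized_option_action(option_type))
```

The Jumbo payload option type 194 is binary 11000010, so a node that does not understand it discards the packet and reports the problem unless the destination is multicast.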
intermediate relays on the way to the destination
Currently only two options are defined:
Pad1 == a null byte - for use in padding to a 64-bit boundary; note it does not have a null option length field after it - as it is the whole field
PadN - the length field says how many null bytes are needed to fill to a 64-bit boundary.
Currently three options are defined: Pad1, PadN, and Jumbo payload option (option type =194) - the option Data Length is 4 and is followed by a 32 bit Jumbo Payload Length value. See RFC 2113: Router Alert Option [86].
Security
Header Authentication with signatures
Must have support for Message Digest 5 (MD5) algorithm [88]
RFC 1810 [89] examines MD5 performance
Packet Encapsulation with e.g., DES
For more information see Chapter 5 of IPv6, 2nd edition, by Christian Huitema.
IPv6 ICMP [90]
Type - Meaning
1 - Destination Unreachable
2 - Packet too big
3 - Time exceeded
4 - Parameter problem
128 - Echo Request
129 - Echo Reply
130 - Group Membership Query
131 - Group Membership Report
132 - Group Membership Reduction
133 - Router Solicitation
134 - Router Advertisement
135 - Neighbor Solicitation
136 - Neighbor Advertisement
137 - Redirect
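The type numbers split into two ranges: in ICMPv6 [90] the high-order bit of the type distinguishes error messages (0 to 127) from informational messages (128 to 255). A small sketch of a lookup built from the table above; the names are illustrative.

```python
ICMPV6_TYPES = {
    1: "Destination Unreachable",
    2: "Packet too big",
    3: "Time exceeded",
    4: "Parameter problem",
    128: "Echo Request",
    129: "Echo Reply",
    130: "Group Membership Query",
    131: "Group Membership Report",
    132: "Group Membership Reduction",
    133: "Router Solicitation",
    134: "Router Advertisement",
    135: "Neighbor Solicitation",
    136: "Neighbor Advertisement",
    137: "Redirect",
}


def describe_icmpv6(message_type: int) -> tuple:
    """Return (name, kind) where kind is 'error' or 'informational'."""
    if not 0 <= message_type <= 255:
        raise ValueError("ICMPv6 type is a single octet")
    name = ICMPV6_TYPES.get(message_type, "unknown")
    kind = "error" if message_type < 128 else "informational"
    return name, kind


if __name__ == "__main__":
    for message_type in (1, 2, 128, 135):
        print(message_type, describe_icmpv6(message_type))
```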
For type 1 the code reveals the reason for discarding the datagram
The Group Membership Reduction is used when a node leaves a group. Reports are always sent to the same group address that is reported. Maximum response delay is the time in milliseconds that the responding report messages can be delayed. Responding stations are supposed to spread their responses uniformly over this range of delays (to prevent everyone from responding at once).
Minimal upgrade dependencies (must first upgrade DNS)
Easy addressing (upgraded routers can use IPv4 address)
FreeBit Co., Ltd.'s Feel6 - secure IPv6 over IPv4 [82], see slide 12 [79] and http://start.feel6.jp/
See also [92]
Why IPv6?
solves Internet scaling problem
eliminates the problem of running out of addresses
allows route aggregation - which allows the size of the routing tables in the backbone routers to decrease
flexible transition (interworks with IPv4)
meets the needs of new markets
new functionality
real-time flows
provider selection
host mobility
end-to-end security
auto-configuration - chapter 4, Plug and Play in IPv6, 2nd edition, by Christian Huitema - this is a very major advantage of IPv6. See also [98]
IPv6 networks
6Bone - http://www.6bone.net/ - a testbed for deployment of IPv6. Note the phase out of the 3FFE::/16 prefix: the prefix will be returned to the unassigned address pool on 6 June 2006 [93].
vBNS - http://www.vbns.net
6NET - http://www.6net.org/ - project co-funded by European Commission
Euro6IX: European IPv6 Internet Exchanges Backbone - http://www.euro6ix.org/main/index.php - project co-funded by European Commission
For some issues concerning IPv6 deployment see [100]
Further information
See:
http://www.ipv6.org/ http://www.ipv6forum.com/
Measurements of dual stack IPv6 implementations: http://mawi.wide.ad.jp/mawi/dualstack/ See also: [80] and [81].
Summary
This lecture we have discussed: IPv6
References
[78] S. Deering and R. Hinden, Internet Protocol, Version 6 (IPv6) Specification, IETF RFC 2460, December 1998 http://www.ietf.org/rfc/rfc2460.txt
[79] Jun Murai, WIDE report, 5th CAIDA-WIDE Workshop, Information Sciences Institute, Marina del Rey, CA, 15 March 2005 http://www.caida.org/projects/wide/0503/slides/murai.pdf
[80] Kenjiro Cho, Measuring IPv6 Network Quality (part 1), 5th CAIDA-WIDE Workshop, Information Sciences Institute, Marina del Rey, CA, 15 March 2005 http://www.caida.org/projects/wide/0503/slides/kenjiro-1.pdf
[81] Kenjiro Cho, Measuring IPv6 Network Quality (part 2), Internet Initiative Japan (IIJ) / WIDE, 5th CAIDA-WIDE Workshop, Information Sciences Institute, Marina del Rey, CA, 15 March 2005 http://www.caida.org/projects/wide/0503/slides/kenjiro-2.pdf
[82] Trying Out for Yourself: Smooth use of IPv6 from IPv4 by Feel6 Farm,
[83] S. McCanne, V. Jacobson, and M. Vetterli, Receiver-driven Layered Multicast, ACM SIGCOMM, August 1996, Stanford, CA, pp. 117-130. ftp://ftp.ee.lbl.gov/papers/mccanne-sigcomm96.ps.gz
[84] D. Borman, S. Deering, and R. Hinden, IPv6 Jumbograms, IETF RFC 2675, August 1999 http://www.ietf.org/rfc/rfc2675.txt
[85] G. Huston, A. Lord, and P. Smith, IPv6 Address Prefix Reserved for Documentation, IETF RFC 3849, July 2004 http://www.ietf.org/rfc/rfc3849.txt
[86] Dave Katz, IP Router Alert Option, IETF RFC 2113, February 1997 http://www.ietf.org/rfc/rfc2113.txt
[87] R. Gilligan, S. Thomson, J. Bound, and W. Stevens, Basic Socket Interface Extensions for IPv6, IETF RFC 2133, April 1997 http://www.ietf.org/rfc/rfc2133.txt
[88] R. Rivest, The MD5 Message-Digest Algorithm, IETF RFC 1321, April 1992 http://www.ietf.org/rfc/rfc1321.txt
[89] J. Touch, Report on MD5 Performance, IETF RFC 1810, June 1995 http://www.ietf.org/rfc/rfc1810.txt
[90] A. Conta and S. Deering, Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification, IETF RFC 2463, December 1998 http://www.ietf.org/rfc/rfc2463.txt
[91] S. Thomson, C. Huitema, V. Ksinant, and M. Souissi, DNS Extensions to Support IP Version 6, IETF RFC 3596, October 2003
[92] C. Huitema, R. Austein, S. Satapati, and R. van der Pol, Unmanaged Networks IPv6 Transition Scenarios, IETF RFC 3750, April 2004 http://www.ietf.org/rfc/rfc3750.txt
[93] R. Fink and R. Hinden, 6bone (IPv6 Testing Address Allocation) Phaseout, IETF RFC 3701, March 2004 http://www.ietf.org/rfc/rfc3701.txt
[94] R. Hinden and S. Deering, Internet Protocol Version 6 (IPv6) Addressing Architecture, IETF RFC 3513, April 2003 http://www.ietf.org/rfc/rfc3513.txt [95] R. Hinden, S. Deering, and E. Nordmark, IPv6 Global Unicast Address Format, IETF RFC 3587, August 2003 http://www.ietf.org/rfc/rfc3587.txt [96] APNIC, ARIN, and RIPE NCC, "IPv6 Address Allocation and Assignment Policy", Document ID: ripe-267, January 22, 2003
http://www.ripe.net/ripe/docs/ipv6policy.html
[97] IAB, IAB/IESG Recommendations on IPv6 Address Allocations to Sites, IETF RFC 3177, September 2001 http://www.ietf.org/rfc/rfc3177.txt [98] S. Thomson and T. Narten, IPv6 Stateless Address Autoconfiguration, IETF RFC 2462, December 1998 [99] Toshiyuki Hosaka, IPv6 Address Allocation and Policy: PART1 IPv6 Address Basics, Tech Tutorials, IPv6Style, NTT Communications, 18 November 2004 http://www.ipv6style.jp/en/tech/20041117/index.shtml
[101] Gert Döring, Impressions: An overview of the global IPv6 routing table, RIPE 50, Stockholm, SE, 3 May 2005, http://www.space.net/~gert/RIPE/R50-v6-table.pdf
[102] RIPE, Total number of allocated IPv6 prefixes per RIR on 13/05/2005, web page accessed 2005.05.14 http://www.ripe.net/rs/ipv6/stats/index.html
[103] J. Rajahalme, A. Conta, B. Carpenter, and S. Deering, IPv6 Flow Label Specification, IETF RFC 3697, March 2004 http://www.ietf.org/rfc/rfc3697.txt
For use in conjunction with TCP/IP Protocol Suite, by Behrouz A. Forouzan, 3rd Edition, McGraw-Hill. For this lecture: Chapter 24
© 1998, 1999, 2000, 2002, 2003, 2005 G.Q.Maguire Jr. All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the author. Last modified: 2005.05.16:19:02
Outline
Mobile IP
[Figure: Ethernet, Token Ring, and FDDI LANs interconnected by switches and routers (R) across a WAN; cellular networks (BTS, BSC, MSC, HLR/VLR) attached through an IWU; mobile hosts (MH) in ad hoc networks and PANs]
Figure 93: Mobility (WWAN, WLAN, PAN, ...) driving us towards Mobile Internet
Mobility
[Figure labels: Z; X at A1 and Y at B; then Y at B and X at A2]
Figure 94. X disconnects from location A1 and reconnects at location A2
What is X? X represents the identity (ID) of the node.
In an Ethernet it might be the MAC address; thus a node has a constant identity.
ccslab1.kth.se
130.237.15.254 130.237.216.25
Figure 95: Must update IP address related mappings after a move: an administrative nightmare
Objectives of Mobile IP
To provide mobility support for the Internet
To enable node mobility across changes in IP subnet
Allow change in location without change of IP address
Communication should be possible (even) while moving (if the interface/link supports it)
TCP/IP connections should survive movement
Active TCP and UDP port bindings should be maintained
Communication from Z to X
[Figure labels: Z; X at A1 and Y at B; then Y at B and X at A2]
Figure 96. Z is communicating with X at A1 and wants to continue when X reconnects at location A2
This would require that router R send packets from Z to X over a new path (route).
But X now has a new network address, since it is on a different network ().
1. An informal experiment conducted by John Ioannidis as part of his Mobile*IP research (and documented in an appendix of his thesis) indicated that almost all operating systems of the time did not correctly support source routing!
Identification
[Figure labels: Z; X at A1 and Y at B; then Y at B and X at A2]
Figure 97. How do we know it is the same X?
When X moves to its new location (A2):
Why should it get service?
How do we know it is the same X? (Or even that it is X?)
Establishing Identity
When a node arrives on a network it must identify itself
mechanism: typically via a challenge-response protocol
Who should it identify itself to?
Answer: The MSR (Mobility Support Router)
[Figure: X arrives at A2 and announces "I am X" to the local MSR]
[Figure sequence: X at A2 announces "I am X" to the MSR; the MSR answers "Welcome to" its network]
Getting Service
Once its identity is known, the policy question must be asked: Should X get service?
The question of authentication, authorization, and accounting (AAA) for mobile users is addressed in [107]. See also IEEE 802.1x Port Based Network Access Control
http://www.ieee802.org/1/pages/802.1x.html
Back to the original problem: Z wants to send a message to X
Initially X is located at A1, then it moves to A2.
[Figure labels: Z; C, R, X at A1; then C, R, X at A2]
[Figure labels: C, R, X at A1; then C, R, WAN, R, Redirect, X at A2]
Figure 102. X must send a redirect message to Z, to tell it its new address A2.
Z must be aware of where X currently is.
X must get a new local address A2 (How? perhaps DHCP)
[Figure labels: C, R, X at A1; then C, R, Redirect, X at A2]
Figure 103. X must send a redirect message to the Router, to tell it its new address A2 (rather than A1).
Router must now perform host-specific routing.
X must get a new local address A2 (How? perhaps DHCP)
[Figure labels: C, R, MSR, X at A1; then C, R, MSR, X at A2]
Figure 104. X must send a redirect message to a Mobility Support Router (MSR-), to tell it its new address A2 (rather than A1).
MSR- must now perform host-specific routing.
X must get a new local address A2 (How? perhaps DHCP)
Z is now completely unaware of the move.
Router R is now completely unaware of the move (except for twice the traffic over the link to/from ).
[Figure labels: C, R, MSR, X at A1; then C, R, several MSRs, X at A2]
Figure 105. X sends a message to MSR-, to get its new address A2 and says its old MSR was MSR-.
MSR- must now perform host-specific routing to MSR- (which can provide the local address A2)
Z is now completely unaware of the move - it always sends traffic to MSR-.
If X moves again, Z does not change where it sends traffic to, and traffic need not go via MSR- - it will go directly from MSR- to the MSR responsible for the new segment.
Alternative 4 continued
Initially X is located at A1, then it moves to A2, and then moves to A3.
[Figure labels: Z; C, R, MSR, X at A1; then C, R, several MSRs, X at A3]
Figure 106. X sends a message to MSR-, to get its new address A3 and says its old MSR was MSR-.
The traffic from MSR- to MSR- or MSR- to MSR- can be encapsulated, using for example IP in IP (written IP-IP) encapsulation. Thus none of the intervening routers needs to know about mobility.
How does Z know to send things to MSR-?
It does not know to do this! Z simply sends the packet to the network address of X. But what is the (real) network address of X?
[Figure labels, shown twice: Z, C, R, MSR, X at A1]
X's address is A1. A1 is an address on the network; MSA- intercepts packets addressed to A1 and forwards them if X is not currently present on the network.
Or
X's address is {Mobile-Network,X}. A1 is a temporary address on the network; MSA- routes {Mobile-Network,X} packets to A1 when X is local and to another MSR when it is non-local.
[Figure labels: Z, C, several MSRs, X with mobile address mx, cell b]
Figure 109. X moves from cell a to cell b, but is still reachable by cell a!
Mobile network address mx is reachable from both MSR- and MSR-. This could not occur in the wired case (unless there were multiple interfaces), since X would have to disconnect from one network to connect to the other. If the cell size is small, the movement between cells could be frequent (and caused by other events, such as a new user, a door moving, ...).
Wireless WANs
[Figure labels: Z, C, R, MSR, basestations BS-a (cell a) and BS-b (cell b), X with mobile address mx, before and after the move]
Figure 110. X moves from cell a to cell b, but may still be reachable by cell a - but both cells are part of the same network
Basestation-a and basestation-b are part of the same network, and it is up to this network to select which cell a mobile is in and which basestation will be used to communicate with it.
[Figure labels: C, several MSRs, X with mobile address mx, cell b]
Figure 111. X is moving from cell a to cell b
Mobile network address mx is partially reachable from both MSR- and MSR-, thus we will send packets via both MSR- and MSR-. This ensures:
Lower probability of packet loss (important if we must provide low latency and high reliability - such as is needed for voice and some other services)
Cost: increases traffic in both cells
A Mobile-IP(V4) Scenario
[Figure labels: IP in IP tunnel, Foreign Agent (FA), Foreign network, Mobile Node (MN), Correspondent Node (CN)]
The CN sends packets to the MN's home network (because that is where its IP address is logically located); the HA intercepts them and forwards them inside an IP-in-IP tunnel to the Care-of Address (CoA), where the FA forwards them to the MN. Traffic from the MN can go directly to the CN (unless there is ingress filtering): "triangle routing".
A Mobile-IP(V6) Scenario
[Figure labels: Internet, Home Agent (HA), Home network, Foreign network, Mobile Node (MN), IP in IP tunnel, binding list, cache, Correspondent Node (CN)]
The CN sends packets to the MN's home network (because that is where its IP address is logically located); the HA intercepts them and forwards them inside an IP-in-IP tunnel to the Care-of Address (CoA), which is the MN's address in the foreign network. However, the MN can tell the CN about its current address via a binding update (BU); now traffic can flow both ways directly between the CN and MN.
IP-in-IP Encapsulation
IP-in-IP vs. Minimal encapsulation - the major difference is that the first puts the whole IP packet inside another, while the latter tries to put only a minimal header inside along with the original data portion of the IP packet.
For details see:
IP Encapsulation within IP, RFC 2003 [105]
Minimal Encapsulation within IP, RFC 2004 [106]
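As a minimal sketch of the IP-in-IP case described above (not the RFC 2003 reference behaviour in full), the code below wraps a complete inner IPv4 packet in a new 20-byte outer header whose protocol field is 4. The function names and the documentation addresses used in the usage note are assumptions made for illustration; options, fragmentation, and TTL/TOS handling are deliberately left out.

```python
import socket
import struct

IPPROTO_IPIP = 4  # IP protocol number used for IP-in-IP


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over the 16-bit words of an IPv4 header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, proto: int,
                      payload_len: int, ttl: int = 64) -> bytes:
    """Build a 20-byte IPv4 header without options (hypothetical helper)."""
    version_ihl = (4 << 4) | 5          # IPv4, header length 5 words
    total_len = 20 + payload_len
    hdr = struct.pack("!BBHHHBBH4s4s", version_ihl, 0, total_len, 0, 0,
                      ttl, proto, 0,
                      socket.inet_aton(src), socket.inet_aton(dst))
    csum = ipv4_checksum(hdr)           # checksum field was zero while summing
    return hdr[:10] + struct.pack("!H", csum) + hdr[12:]


def encapsulate(inner_packet: bytes, tunnel_src: str, tunnel_dst: str) -> bytes:
    """Put the whole inner packet (header and data) behind an outer header."""
    outer = build_ipv4_header(tunnel_src, tunnel_dst,
                              IPPROTO_IPIP, len(inner_packet))
    return outer + inner_packet


def decapsulate(outer_packet: bytes) -> bytes:
    """Strip the outer header at the tunnel end point (FA or co-located MN)."""
    if outer_packet[9] != IPPROTO_IPIP:
        raise ValueError("not an IP-in-IP packet")
    header_len = (outer_packet[0] & 0x0F) * 4
    return outer_packet[header_len:]
```

For example, an HA at 192.0.2.1 tunnelling to a care-of address 203.0.113.5 would call `encapsulate(inner, "192.0.2.1", "203.0.113.5")`; the intervening routers see only the outer header, which is why they need not know about mobility.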
Tunneling IP Datagrams
Both home agents and foreign agents (v4) must support tunneling datagrams using IP-in-IP encapsulation and decapsulation. MNs that use a co-located CoA must also support decapsulation (v6).
[Figure: two IP-in-IP tunnel diagrams with the labels HA (Home Agent), FA (Foreign Agent), V4, and MN (Mobile Node)]
Agent Discovery
Why Agent Discovery? It provides methods an MN can use to determine whether it is currently at its home network or a foreign network. By:
Agent Advertisement
  periodic transmissions (beacons) sent by a mobility agent (rate limited to max. 1/s).
Agent Solicitation
  Sent by an MN to discover agents.
[Figure labels: MN, beacons, FA1, FA2, HA]
Care-of Address* {the number is determined by the length field; there must be at least 1 if the F bit is set}

Bit  Name  Meaning
0    R     Registration with this foreign agent (or another foreign agent on this link) is required; using a co-located care-of address is not permitted.
1    B     Busy. Foreign agent not accepting registrations from additional mobile nodes.
2    H     Agent offers service as a home agent.
3    F     Agent offers service as a foreign agent.
4    M     Agent implements receiving tunneled datagrams that use minimal encapsulation.
5    G     Agent implements receiving tunneled datagrams that use GRE encapsulation.
6    V     Agent supports Van Jacobson header compression over the link with any registered mobile node.
7          reserved (must be zero)
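The flag table above can be read as a bit mask. The sketch below decodes one flags byte, assuming the usual IETF convention that bit 0 is the most significant bit of the byte; the function name and the `reserved_ok` key are illustrative, not part of any standard API.

```python
# Flag names for bits 0..6 of the agent advertisement flags byte;
# bit 7 is reserved and must be zero.
FLAG_NAMES = ["R", "B", "H", "F", "M", "G", "V"]


def decode_agent_flags(flags_byte: int) -> dict:
    """Return a dict mapping each flag name to True/False.

    Assumes bit 0 is the most significant bit (mask 0x80).
    """
    if not 0 <= flags_byte <= 0xFF:
        raise ValueError("flags must fit in one byte")
    flags = {name: bool(flags_byte & (0x80 >> bit))
             for bit, name in enumerate(FLAG_NAMES)}
    # The reserved bit (bit 7, mask 0x01) must be zero.
    flags["reserved_ok"] = (flags_byte & 0x01) == 0
    return flags
```

An agent that offers both home agent and foreign agent service would, under this bit ordering, advertise `0x30` (H and F set).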
Registration Message Format

Bit  Name  Meaning
0    S     Simultaneous bindings; this is an additional address for the mobile.
1    B     Broadcast datagrams. Home agent to tunnel any broadcast packets it receives to the mobile.
2    D     Mobile is using a co-located care-of address and will decapsulate itself.
3    M     Mobile requests home agent to use Minimal encapsulation.
4    G     Mobile requests home agent to use GRE encapsulation.
5    V     Mobile node requests that agent use Van Jacobson header compression.
6-7        reserved (must be zero)
MN Requirements
An MN must have: home address, netmask, mobility security association for each HA.
For each pending registration, the MN maintains the following information:
link-layer address of the FA to which the Registration Request was sent
IP destination address of the Registration Request
Care-of address used in the registration
remaining lifetime of the registration
FA Requirements (v4)
Each FA must be configured with a care-of address. Must maintain a visitor list with the following information:
Link-layer source address of the mobile node
IP Source Address (the MN's Home Address)
UDP Source Port
Home Agent address
Requested registration Lifetime
Identification field
This visitor list acts much like a Visitor Location Register (VLR) in a cellular system.
HA Requirements
Each HA must have: the home address and mobility security association of each authorized MN that it is serving as a home agent.
Must create or modify its mobility binding list entry containing:
Mobile node's CoA (or CoAs in the case of simultaneous bindings)
Identification field from the Registration Request
Remaining Lifetime of the registration
The mobility binding list acts much like a Home Location Register (HLR) in a cellular system.
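A rough sketch of how a home agent might keep such a mobility binding list is shown below. The class and method names are hypothetical; authentication via the mobility security association and simultaneous bindings are omitted, and a lifetime of zero is treated as a deregistration.

```python
import time
from dataclasses import dataclass


@dataclass
class Binding:
    care_of_address: str
    identification: int
    expires_at: float      # absolute time at which the registration lapses


class BindingList:
    """Toy mobility binding list keyed by the MN's home address."""

    def __init__(self, authorized_home_addresses):
        self._authorized = set(authorized_home_addresses)
        self._bindings = {}

    def register(self, home_address, care_of_address, identification,
                 lifetime, now=None):
        """Create, refresh, or (lifetime == 0) delete a binding."""
        now = time.monotonic() if now is None else now
        if home_address not in self._authorized:
            return False                       # not an MN we serve
        if lifetime == 0:
            self._bindings.pop(home_address, None)
            return True
        self._bindings[home_address] = Binding(
            care_of_address, identification, now + lifetime)
        return True

    def lookup(self, home_address, now=None):
        """Return the current CoA, or None if absent or expired."""
        now = time.monotonic() if now is None else now
        binding = self._bindings.get(home_address)
        if binding is None or binding.expires_at <= now:
            self._bindings.pop(home_address, None)
            return None
        return binding.care_of_address
```

When `lookup` returns a care-of address the HA would tunnel intercepted packets there; when it returns `None` the MN is treated as being at home (or unreachable).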
Optimization Problem
[Figure labels: Home site, Internet, Foreign site, routers R2, R3, R4, shortest path, HA (Home Agent), CN, FA (Foreign Agent), Home network, MN, node moves]
We cannot follow the shortest path in Mobile IPv4 because the CN will always send traffic via our home network. However, we may be able to use the shortest path from the MN to the CN.
Problems of Mobile IP (RFC 2002)
Only provides basic macro mobility support
Not developed for cellular systems
No interface defined between cellular systems (e.g. between Mobile-IP/HLR/VLR)
No handover support
Weak in security
No key distribution mechanism
Route optimization problems
No QoS, real-time support (DiffServ, RSVP)
[Brace labels grouping items on the original slide: Cellular, Micro Mobility]
Security:
Route Optimization:
Route optimization for MIPv4, v6
Real-time QoS:
No solution yet
[Figure labels: HA, FA, MN, CN]
[Figure labels: Home AAA, FAAA, Internet, HA, PDGN, FA, PDSN, MS]
Cellular IP (CIP)
HAWAII extension is similar to Cellular IP.
[Figure sequence: a Cellular IP network of CIP nodes, each with a Paging Cache and a Route Cache, grouped into Paging Areas and attached via a gateway (GW) to the Internet, where RFC 2002 Mobile IP operates (HA, FA, CN); the MN appears at different positions in successive figures]
Hierarchical FA and Regional Tunneling
[Figure sequence: a hierarchy of FAs under a GFA, with HA and CN; the MN appears under different FAs in successive figures]
Thus DDNS does not really provide mobility, just connecting at different places.
Summary
This lecture we have discussed: Mobile IP
References
[104] B. Aboba and M. Beadles, The Network Access Identifier, IETF RFC 2486, January 1999 http://www.ietf.org/rfc/rfc2486.txt
[105] C. Perkins, IP Encapsulation within IP, IETF RFC 2003, October 1996 http://www.ietf.org/rfc/rfc2003.txt
[106] C. Perkins, Minimal Encapsulation within IP, IETF RFC 2004, October 1996 http://www.ietf.org/rfc/rfc2004.txt
[107] Juan Caballero Bayerri and Daniel Malmkvist, Experimental Study of a Network Access Server for a public WLAN access network, M.S. Thesis, KTH/IMIT, Jan. 2002
2G1305 Internetworking/Internetteknik Spring 2005, Period 4 Module 12: IPSec, VPNs, Firewalls, and NAT
Lecture notes of G. Q. Maguire Jr.
For use in conjunction with TCP/IP Protocol Suite, by Behrouz A. Forouzan, 3rd Edition, McGraw-Hill. For this lecture: Chapters 26 and 28
© 1998, 1999, 2000, 2002, 2003, 2005 G.Q.Maguire Jr. All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the author. Last modified: 2005.05.19:01:51
Lecture 6: Outline
IPSec, VPN, Firewalls & NAT Private networks
Private networks
Private Networks are designed to be used by a limited set of users (generally those inside an organization)
Intranet: a private network - access limited to those in an organization
Extranet: intranet + limited access to some resource by additional users from outside the organization
Addresses for Private IP networks
these should never be routed to outside the private network
they should never be advertised (outside the private network)
allocated (reserved) addresses:
Range Total addresses
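As a sketch of the reserved ranges, assuming the table refers to the three IPv4 blocks set aside for private use by RFC 1918 (that attribution is an assumption, as is the helper name `is_rfc1918`), Python's standard `ipaddress` module can list each range with its size and test whether an address falls inside one of them.

```python
import ipaddress

# The three blocks reserved for private IPv4 networks (RFC 1918).
RFC1918_BLOCKS = [ipaddress.ip_network(prefix)
                  for prefix in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_rfc1918(address: str) -> bool:
    """True if the IPv4 address lies in one of the private blocks."""
    ip = ipaddress.IPv4Address(address)
    return any(ip in block for block in RFC1918_BLOCKS)


# Print each range together with the total number of addresses it contains.
for block in RFC1918_BLOCKS:
    print(f"{block.network_address} to {block.broadcast_address}: "
          f"{block.num_addresses} addresses")
```

A border router could use a check like `is_rfc1918` to refuse to forward or advertise such addresses outside the private network.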
[Figure labels: SiteA, SiteB, Internet, leased line]
Figure 113: Hybrid network
[Figure labels: SiteA, SiteB, Internet, tunnel]
Figure 114: Virtual Private network
Authentication
Remote Authentication Dial-In User Service (RADIUS) http://www.gnu.org/software/radius/radius.html
FreeRADIUS http://www.freeradius.org/
DIAMETER http://www.diameter.org/
GSS-API
Generic Security Services Application Programming Interface (GSS-API) provides an abstract interface which provides security services for use in distributed applications but isolates callers from specific security mechanisms and implementations.
GSS-API peers establish a common security mechanism for security context establishment either through administrative action, or through negotiation.
GSS-API is specified in:
J. Linn, "Generic Security Service API v2", RFC 2078 [115]
J. Wray, "Generic Security Service API v2: C-bindings", RFC 2744 [116].
IPSec
IPSec in three parts:
encapsulating security payload (ESP) defines encryption of IP payloads,
authentication header (AH) defines the authentication method, and
the Internet Security Association and Key Management Protocol (ISAKMP) manages the exchange of secret keys between senders and recipients of ESP or AH packets.
ESP packet
Consists of:
a control header - contains a Security Parameters Index (SPI) and a sequence number field (the SPI + destination IP address uniquely identifies the Security Association (SA)).
a data payload - encrypted version of the user's original packet. It may also contain control information needed by the cryptographic algorithms (for example DES needs an initialization vector (IV)).
an optional authentication trailer - contains an Integrity Check Value (ICV), which is used to validate the authenticity of the packet.
ESP could use any one of several algorithms: DES, Triple DES, ...
See: RFC 2406: IP Encapsulating Security Payload (ESP) [109]
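To make the control header concrete, the following sketch pulls the 32-bit SPI and 32-bit sequence number off the front of an ESP packet and forms the (destination address, SPI) pair used to look up the Security Association. The function names are hypothetical, and no decryption or ICV checking is attempted.

```python
import struct


def parse_esp_header(esp_packet: bytes):
    """Split an ESP packet into (spi, sequence_number, remainder).

    The remainder still holds the encrypted payload and any
    authentication trailer; this sketch does not interpret it.
    """
    if len(esp_packet) < 8:
        raise ValueError("ESP packet shorter than its 8-byte control header")
    spi, sequence_number = struct.unpack("!II", esp_packet[:8])
    return spi, sequence_number, esp_packet[8:]


def security_association_key(destination_ip: str, spi: int):
    """The SPI plus the destination IP address identifies the SA."""
    return (destination_ip, spi)
```

A receiver could keep its SAs in a dictionary indexed by `security_association_key(dst, spi)` and use the sequence number for replay detection.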
AH header
For authentication purposes only; contains: an SPI, a sequence number, and an authentication value.
AH uses either:
Message Digest 5 (MD5) algorithm,
Secure Hash Algorithm 1 (SHA-1),
truncated HMAC (hashed message authentication code), or
...
For further information see: IP Authentication Header - RFC 2402 [110]
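A truncated HMAC can be sketched with Python's standard `hmac` module: compute the full HMAC and keep only the first 96 bits (12 bytes), the truncation length used by the HMAC-MD5-96 and HMAC-SHA-1-96 transforms. The function names are illustrative, and a real AH implementation computes the value over specific header fields plus the payload, which this sketch does not model.

```python
import hashlib
import hmac

ICV_LENGTH = 12  # 96 bits


def compute_icv(key: bytes, data: bytes, digestmod=hashlib.sha1) -> bytes:
    """Return the HMAC of data truncated to its first 96 bits."""
    return hmac.new(key, data, digestmod).digest()[:ICV_LENGTH]


def verify_icv(key: bytes, data: bytes, icv: bytes,
               digestmod=hashlib.sha1) -> bool:
    """Recompute the truncated HMAC and compare in constant time."""
    return hmac.compare_digest(compute_icv(key, data, digestmod), icv)
```

Passing `digestmod=hashlib.md5` gives the MD5-based variant; the receiver simply recomputes the value with the shared key and rejects the packet if `verify_icv` returns `False`.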
ISAKMP
ISAKMP is based on the Diffie-Hellman key exchange protocol; it assumes the identities of the two parties are known.
Using ISAKMP you can:
control the level of trust in the keys,
force SPIs to be changed at an appropriate frequency,
identify keyholders via digital certificates [requires using a certificate authority (CA)]
For further information see:
Internet Security Association and Key Management Protocol (ISAKMP) - RFC 2408 [111]
The Internet IP Security Domain of Interpretation for ISAKMP - RFC 2407 [112]
The OAKLEY Key Determination Protocol - RFC 2412 [113]
The Internet Key Exchange (IKE) - RFC 2409 [114]
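The underlying Diffie-Hellman idea can be illustrated with toy numbers. This is only a sketch of the bare exchange: the prime 23 and generator 5 are deliberately tiny illustration values (an assumption, far too small for real use), and the authentication of the two parties that ISAKMP relies on is not shown.

```python
import secrets

P = 23  # toy prime modulus (illustration only)
G = 5   # toy generator


def dh_keypair(private=None):
    """Return (private, public) where public = G**private mod P."""
    if private is None:
        private = secrets.randbelow(P - 3) + 2   # value in [2, P-2]
    return private, pow(G, private, P)


def dh_shared_secret(their_public: int, my_private: int) -> int:
    """Both sides arrive at the same value G**(a*b) mod P."""
    return pow(their_public, my_private, P)
```

Each side sends only its public value; `dh_shared_secret(B, a)` on one side and `dh_shared_secret(A, b)` on the other yield the same secret, from which session keys can then be derived.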
Transport vs. Tunnelling
Transport: payload data follows the normal IP header
Tunnelling: end-user's entire packet (IP headers and all) is placed within another packet with ESP or AH fields [thus it is encapsulated in another packet]; can hide the original source and destination address information
[Figure labels: AS1, tunnel, AS3, AS4, AS2]
Firewalls
exterior interior (often an Intranet)
Figure 116: Firewall an internet gateway
The firewall can provide packet by packet filtering of packets coming into the intranet or leaving the intranet. The firewall can decide which packets should be forwarded based on source, destination addresses, and port (or even deeper examination) using an explicitly defined policy.
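A first-match packet filter of the kind described above might be sketched as follows. The `Rule` fields, the action strings, and the default-drop choice are assumptions made for illustration; real firewalls also match on protocol, source port, interface, and connection state.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                      # "accept" or "drop"
    src: str = "0.0.0.0/0"           # source prefix to match
    dst: str = "0.0.0.0/0"           # destination prefix to match
    dst_port: Optional[int] = None   # None matches any port

    def matches(self, src_ip: str, dst_ip: str, dst_port: int) -> bool:
        return (ipaddress.ip_address(src_ip) in ipaddress.ip_network(self.src)
                and ipaddress.ip_address(dst_ip) in ipaddress.ip_network(self.dst)
                and (self.dst_port is None or self.dst_port == dst_port))


def filter_packet(rules, src_ip: str, dst_ip: str, dst_port: int,
                  default: str = "drop") -> str:
    """Apply the explicitly defined policy: the first matching rule wins."""
    for rule in rules:
        if rule.matches(src_ip, dst_ip, dst_port):
            return rule.action
    return default
```

With a policy such as `[Rule("accept", dst="192.0.2.80/32", dst_port=80)]`, only web traffic to that one (hypothetical) server is forwarded; everything else falls through to the default and is dropped.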
Linux firewall
For example, for the software firewall used in Linux systems called ipfwadm:
all ports are typically closed for inbound traffic,
all outbound traffic is IP masqueraded, i.e., appears to come from the gateway machine; and
for bi-directional services required by the users, holes may be punched through the firewall - these holes can reroute traffic to/from particular ports:
  to specific users or
  the most recent workstation to request a service.
Firewall Design
apply basics of security:
least privilege:
  don't make hosts do more than they have to (implies: specialize servers)
  use minimum privileges for the task in hand
fail safe
  even if things break it should not leave anything open
defence in depth
  use several discrete barriers - don't depend on a single firewall for all security
weakest links
  know the limitations of your defences - understand your weakest link
Firewalls should have sufficient performance to keep the pipes full - i.e., a firewall should not limit the amount of traffic flowing across the connection to the external network, only what flows across it!
[Figure labels: Internet, Proxy Server, manually enabled bypass, Intranet]
Often you need application level proxies (i.e., they understand details of the application protocol) - an example is to proxy RealAudio's streaming audio.
SOCKs
Permeo Technologies, Inc.'s SOCKS http://www.socks.nec.com/
In order to bridge a firewall we can use a proxy:
the proxy will appear to be all external hosts to those within the firewall
  for example, if a user attached to the intranet requests a webpage, the request is sent to the proxy host where the same request is duplicated and sent to the real destination. When data is returned, the proxy readdresses the returned data (with the user's intranet address) and sends it to the user.
widely used to provide proxies for commonly used external services (such as Telnet, FTP, and HTTP).
See: [123] and [124]
[Figure labels: hole, Internet, exterior, interior, Socks (Proxy) Server]
Figure 118: Firewall and internet gateway
Newping
http://ftp.cerias.purdue.edu/pub/tools/dos/socks.cstc/util/newping.c
a ping for SOCKS
it depends on the target host not blocking the service on the appropriate port (in this case time).
This version is primarily for checking "Is it alive?" rather than gathering statistics on the average response time of several echo requests.
Uses the time TCP port to verify that a host is up, rather than using ICMP; thus it is usable through a firewall that blocks ICMP.
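The idea of an "is it alive?" check over TCP rather than ICMP can be sketched as below. This is not the newping.c program itself: it connects directly instead of going through a SOCKS server, and the function name and timeout are assumptions. Port 37 is the well-known port of the time service.

```python
import socket

TIME_PORT = 37  # well-known TCP port of the time service


def is_alive(host: str, port: int = TIME_PORT, timeout: float = 3.0) -> bool:
    """Report whether a TCP connection to host:port can be opened.

    A refused or timed-out connection is reported as "not alive",
    although the host may simply not run (or may block) the service.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

As with the original tool, a `False` result is ambiguous: the host may be down, or it may only be filtering the time port.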
Their firewall features:
Source host checking (allowing only certain hosts to transmit through the firewall, or denying specific hosts)
Destination port checking
Packet contents (unwrapping encapsulated IP)
Regulating bandwidth allocated to a specific multicast group's traffic
Their Mbone gateway is based on a modified multicast routing daemon.
[Figure labels: hole, MBONE, Intranet, join, SOCKS+Mrouted-gw]
Figure 119: Firewall and internet gateway
U.S. DOE CIAC's Network Security Tools [126]
TCP Wrappers - allows monitoring and control over who connects to a host's TFTP, EXEC, FTP, RSH, TELNET, RLOGIN, FINGER, and SYSTAT ports + a library so that other programs can be controlled and monitored in the same fashion
xinetd - a replacement for inetd which supports access control based on the address of the remote host and the time of access + provides extensive logging capabilities
(cleverly) uses raw IP packets to determine what hosts are available on the network, what services (application name and version) are offered, what operating systems (and OS versions) they are running, what type of packet filters/firewalls are in use, ...
also has a link to Remote OS detection via TCP/IP Stack FingerPrinting by Fyodor <[email protected]> (www.insecure.org), October 18, 1998 - a means of identifying which OS the host is running by noting its TCP/IP behavior.
http://www.insecure.org/nmap/nmap_documentation.html
[Figure labels: Internet, Intranet]
NAT maps IP addresses on the inside to one or more addresses on the outside and vice versa. See RFC 3022 [136] and RFC 2766 [137]
Advantages:
  saves IPv4 addresses
  hides internal node structure from outside nodes
  the intranet does not have to be renumbered when you connect to another ISP
Disadvantage:
  Unfortunately this breaks many services because they use an IP address inside their data.
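The mapping NAT maintains can be sketched as a pair of lookup tables. The version below is the port-translating variant in which many inside addresses share one outside address; the class name, the starting port, and the addresses in the usage note are illustrative assumptions, and timeouts and protocol handling are omitted.

```python
class Napt:
    """Toy network address and port translator with one public address."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._outbound = {}   # (inside_ip, inside_port) -> public_port
        self._inbound = {}    # public_port -> (inside_ip, inside_port)

    def outbound(self, inside_ip: str, inside_port: int):
        """Translate the source of a packet leaving the intranet."""
        key = (inside_ip, inside_port)
        if key not in self._outbound:
            self._outbound[key] = self._next_port
            self._inbound[self._next_port] = key
            self._next_port += 1
        return self.public_ip, self._outbound[key]

    def inbound(self, public_port: int):
        """Translate the destination of a returning packet, if known."""
        return self._inbound.get(public_port)
```

Only the addresses in the IP and transport headers are rewritten here. An application that carries its inside address within its data (for example, 10.0.0.5 written into the payload) would still expose the untranslated value, which is the breakage noted above.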
[Figure labels: Internet, Intranet, DMZ, web server, DNS server, e-mail server, ftp server]
Figure 121: Example of a Firewall with a DMZ
Note that the various services may also be in different DMZs (see for example figure 4 on page 90 of [127]).
Security Organizations and Companies
Forum of Incident Response and Security Teams (FIRST), now: 170 members [118]
NIST Computer Security Resource Center [119]
Swedish Defense Material Administration, Electronics Systems Directorate [120]
...
Summary
This lecture we have discussed: Private networks IPSec Firewalls
Further information
[108] IETF Security Area http://sec.ietf.org/
[109] S. Kent and R. Atkinson, IP Encapsulating Security Payload (ESP), IETF RFC 2406, November 1998 http://www.ietf.org/rfc/rfc2406.txt
[110] S. Kent and R. Atkinson, IP Authentication Header, IETF RFC 2402, November 1998 http://www.ietf.org/rfc/rfc2402.txt
[111] D. Maughan, M. Schertler, M. Schneider, and J. Turner, Internet Security Association and Key Management Protocol (ISAKMP), IETF RFC 2408, November 1998 http://www.ietf.org/rfc/rfc2408.txt
[112] D. Piper, The Internet IP Security Domain of Interpretation for ISAKMP, IETF RFC 2407, November 1998 http://www.ietf.org/rfc/rfc2407.txt
[113] H. Orman, The OAKLEY Key Determination Protocol, IETF RFC 2412, November 1998 http://www.ietf.org/rfc/rfc2412.txt
[114] D. Harkins and D. Carrel, The Internet Key Exchange (IKE), IETF RFC 2409, November 1998 http://www.ietf.org/rfc/rfc2409.txt
[115] J. Linn, Generic Security Service Application Program Interface, Version 2, IETF RFC 2078, January 1997 http://www.ietf.org/rfc/rfc2078.txt
[116] J. Wray, Generic Security Service API Version 2: C-bindings, IETF RFC 2744, January 2000 http://www.ietf.org/rfc/rfc2744.txt
[117] Computer Emergency Response Team http://www.cert.org/
[118] Forum of Incident Response and Security Teams http://www.first.org/
[119] U.S. National Institute of Standards and Technology (NIST), Computer Security Division, Computer Security Resource Center http://csrc.nist.gov/
[120] Swedish Defense Material Administration http://www.fmv.se/
[121] David Crochemore, Response/Readiness: What R the new CERTS?, National Computer network Emergency Response technical Team/Coordination Center of China (CNCERT/CC) 2005 Annual Conference, Guilin, P.R. China, 30 March 2005 http://www.cert.org.cn/upload/2005AnnualConferenceCNCERT/1MainConference/10.DavidCrochemore-NGCERTOI.pdf
[122] Centre d'Expertise Gouvernemental de Réponse et de Traitement des Attaques informatiques (CERTA) http://www.certa.ssi.gouv.fr/
[123] M. Leech, M. Ganis, Y. Lee, R. Kuris, D. Koblas, and L. Jones, SOCKS Protocol Version 5, IETF RFC 1928, March 1996 http://www.ietf.org/rfc/rfc1928.txt
[124] P. McMahon, GSS-API Authentication Method for SOCKS Version 5, IETF RFC 1961, June 1996 http://www.ietf.org/rfc/rfc1961.txt
[125] Postfix http://www.postfix.org
[126] U.S. DOE's Computer Incident Advisory Capability http://ciac.llnl.gov/ciac/ToolsUnixNetSec.html
[127] Robert Malmgren, Praktisk nätsäkerhet, Internet Academy Press, Stockholm, Sweden, 2003, ISBN 91-85035-02-5
[128] Charlie Kaufman, Radia Perlman, and Mike Speciner, Network Security: Private Communication in a PUBLIC World, Prentice-Hall, 1995, ISBN 0-13-061466-1
[129] Simson Garfinkel, PGP: Pretty Good Privacy, O'Reilly & Associates, 1995, ISBN 1-56592-098-8
[130] Internet Mail Consortium, S/MIME and OpenPGP, Oct 15, 2004 http://www.imc.org/smime-pgpmime.html
Firewalls
[131] Bill Cheswick and Steve Bellovin, Firewalls and Internet Security: Repelling the Wily Hacker, Addison Wesley, 1994, ISBN: 0-201-63357-4
[132] D. Brent Chapman and Elizabeth Zwicky, Building Internet Firewalls, O'Reilly, 1995, ISBN: 1-56592-124-0
[133] Tony Mancill, Linux Routers: A Primer for Network Administrators, Prentice-Hall, 2001, ISBN 0-13-086113-8.
[136] P. Srisuresh and K. Egevang, Traditional IP Network Address Translator (Traditional NAT), IETF RFC 3022, January 2001
http://www.ietf.org/rfc/rfc3022.txt
[137]G. Tsirtsis and P. Srisuresh, Network Address Translation - Protocol Translation (NAT-PT), IETF RFC 2766, February 2000
http://www.ietf.org/rfc/rfc2766.txt
2G1305 Internetworking/Internetteknik Spring 2005, Period 4 Module 13: Future and Summary
Lecture notes of G. Q. Maguire Jr.
1998, 1999, 2000, 2002, 2003, 2005 G.Q.Maguire Jr. All rights reserved. No part of this course may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission of the author. Last modified: 2005.05.19:01:49
Maguire
[email protected]
Future_and_Summary.fm5
2005.05.19
Lecture 7: Outline
QoS
Interface trends
IP SANs (Storage Area Networks): iSCSI, ...
A glimpse into the future
Service Differentiation
Integrated Services (IntServ):
  RSVP: connection request
  All nodes IntServ-capable
  Scalability
  Complicated network management
Differentiated Services (DiffServ): end of "one-size-fits-all"
  Classes of Service
  QoS-based Routing
  Classes of Service at Gigabit rates
  New Pricing and Billing Policies
  New Resource Allocation Methods
See: [138]
Constraint-based Routing
QoS routing: selects network routes with sufficient resources for the requested QoS parameters, both to satisfy the QoS requirements of every admitted connection and to achieve network efficiency in resource utilization.
Policy-based Routing: e.g., Virtual Private Networks (VPN)
How can we combine this with IP mobility?
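One common way to realize the QoS routing idea above is to first prune every link that cannot carry the requested bandwidth and then run an ordinary shortest-path search on what remains. The sketch below illustrates this under stated assumptions: the function name qos_route, the LINKS topology, and the cost and bandwidth figures are all hypothetical, and a real router would also consider delay, jitter, and policy constraints.

```python
import heapq

def qos_route(links, src, dst, min_bw):
    """Return (cost, path) for the cheapest path that uses only links with
    available bandwidth >= min_bw, or None if no such path exists.

    links maps (node_a, node_b) -> (cost, available_bandwidth).
    """
    # Step 1: prune links that cannot satisfy the bandwidth constraint.
    graph = {}
    for (u, v), (cost, bw) in links.items():
        if bw >= min_bw:
            graph.setdefault(u, []).append((v, cost))
            graph.setdefault(v, []).append((u, cost))

    # Step 2: Dijkstra's shortest-path search on the pruned topology.
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, link_cost in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (cost + link_cost, nxt, path + [nxt]))
    return None  # request cannot be admitted

# Hypothetical topology: a cheap low-bandwidth path and a costlier fat path.
LINKS = {
    ("A", "B"): (1, 10),
    ("B", "D"): (1, 10),
    ("A", "C"): (2, 100),
    ("C", "D"): (2, 100),
}
```

With this topology a request for 5 units of bandwidth takes the cheap path A-B-D, a request for 50 units is forced onto A-C-D, and a request for 500 units is rejected.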
Performance
Routers:
  1/2 to 1 million packets per second (pps) for every gigabit per second of aggregate bandwidth
  more than 250,000 routes
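The packets-per-second figure follows from simple arithmetic: the packet rate is the link rate divided by the packet size in bits. The sketch below is only an illustration; the function name and the two packet sizes are assumptions chosen to show which average packet sizes correspond to the 1/2 to 1 million pps range.

```python
def packets_per_second(link_bps, packet_bytes):
    """Packet rate needed to fill a link with packets of a given size."""
    return link_bps / (packet_bytes * 8)

# Assumed average packet sizes; 1 Gbit/s link.
print(packets_per_second(1e9, 125))  # 1000000.0
print(packets_per_second(1e9, 250))  # 500000.0
```

Under these assumptions the range quoted above corresponds to average packet sizes of roughly 125 to 250 bytes.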
PC interfaces
Standard I/O ports of PCs:
  Accelerated Graphics Port (AGP)
  PCI
    Version 2.1 PCI bus - 64 bit, 66 MHz, can burst to 528 MByte/s
    PCI-X 2.0: High Performance, Backward Compatible PCI for the Future [139]
      PCI-X 533, offering up to 4.3 gigabytes per second of bandwidth
  10/100/1000 Ethernet
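The PCI burst figures can be checked by multiplying the bus width by the transfer rate: 64 bits at 66 MHz works out to 528 megabytes per second. The sketch below assumes a 64-bit data path for both buses and takes "533" to mean 533 million transfers per second; the function name is hypothetical.

```python
def bus_bandwidth_bytes_per_s(width_bits, transfers_per_s):
    """Peak (burst) bandwidth of a parallel bus in bytes per second."""
    return (width_bits / 8) * transfers_per_s

pci = bus_bandwidth_bytes_per_s(64, 66e6)     # 64-bit, 66 MHz PCI
pcix = bus_bandwidth_bytes_per_s(64, 533e6)   # assumed 64-bit PCI-X 533

print(f"{pci / 1e6:.0f} MByte/s")   # 528 MByte/s
print(f"{pcix / 1e9:.2f} GByte/s")  # 4.26 GByte/s
```

These are peak figures; sustained throughput is lower because of arbitration and protocol overhead.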
Fibre Channel
From the X3T11 standards activity
Topologies: Point-to-Point, Fabric, and Arbitrated loop
Addresses: Loops, LANs, and worldwide addresses
Fibre Channel Profiles
Fibre Channel products
  Disk drives
  Network interfaces
[Figure labels: Internet, IP Tunnel, tape library, FCIP (twice), FC JBOD (twice)]
Note that this approach simply interconnects the two Fibre Channel switches. The connection between the two switches is TCP, and it simply encapsulates an FCIP header and a Fibre Channel frame.
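The encapsulation idea can be sketched as a header followed by the unmodified Fibre Channel frame, written into the TCP byte stream. The header below (a 4-byte marker plus a 4-byte length) is a hypothetical simplification for illustration only; it is NOT the real FCIP header layout, and the function names are assumptions.

```python
import struct

MAGIC = b"FCIP"           # hypothetical marker, not the real FCIP header
HEADER = "!4sI"           # marker + frame length, network byte order
HEADER_SIZE = struct.calcsize(HEADER)

def encapsulate(fc_frame: bytes) -> bytes:
    """Prefix an opaque Fibre Channel frame with the simplified header."""
    return struct.pack(HEADER, MAGIC, len(fc_frame)) + fc_frame

def decapsulate(stream: bytes):
    """Split a TCP byte stream into complete frames.

    Returns (frames, leftover), where leftover holds any incomplete
    trailing data that must wait for more bytes from the connection.
    """
    frames = []
    offset = 0
    while offset + HEADER_SIZE <= len(stream):
        magic, length = struct.unpack_from(HEADER, stream, offset)
        if magic != MAGIC:
            raise ValueError("lost frame alignment in the byte stream")
        start = offset + HEADER_SIZE
        if start + length > len(stream):
            break  # frame not yet fully received
        frames.append(stream[start:start + length])
        offset = start + length
    return frames, stream[offset:]
```

Because TCP delivers a byte stream rather than messages, the receiver needs the length field to find frame boundaries; that is the only job this simplified header does.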
[Figure labels: Internet, router, tape library, iFCP Frames, iFCP Gateway, FC JBOD, iFCP JBOD]
Note that this approach interconnects Fibre Channel devices. The connection between the two switches is TCP, and it encapsulates an iFCP header and a Fibre Channel frame; note that iFCP devices can simply be attached to the internet or an intranet. This means that there has to be a mapping between Fibre Channel addresses and IP addresses.
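The address mapping mentioned above can be pictured as a two-way lookup table kept by a gateway. The sketch below is only an illustration of that idea: the class name, the example addresses (taken from the 192.0.2.0/24 documentation range), and the port number are hypothetical, and it assumes 24-bit Fibre Channel port addresses.

```python
class AddressMap:
    """Two-way table between Fibre Channel addresses and IP endpoints."""

    def __init__(self):
        self._fc_to_ip = {}
        self._ip_to_fc = {}

    def register(self, fc_addr, ip, port):
        # Assumption: Fibre Channel port addresses fit in 24 bits.
        if not 0 <= fc_addr <= 0xFFFFFF:
            raise ValueError("Fibre Channel address must fit in 24 bits")
        self._fc_to_ip[fc_addr] = (ip, port)
        self._ip_to_fc[(ip, port)] = fc_addr

    def to_ip(self, fc_addr):
        return self._fc_to_ip[fc_addr]

    def to_fc(self, ip, port):
        return self._ip_to_fc[(ip, port)]

# Hypothetical entries; the port value is a placeholder.
table = AddressMap()
table.register(0x010200, "192.0.2.10", 5000)
table.register(0x010300, "192.0.2.20", 5000)
```

A gateway would consult to_ip() when forwarding a frame toward a remote device and to_fc() when a frame arrives from the IP side.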
[Figure labels: Internet, router (twice), iSCSI Server, iFCP Frames, iSCSI JBOD (twice)]
Once a SCSI initiator has logged in to a SCSI target, it can simply issue SCSI commands, just as if the device were on a local SCSI chain! For more information see [141]
Clustering
Myricom, Inc. http://www.myri.com/
American National Standards Institute (ANSI) Standard: ANSI/VITA 26-1998
Started by:
  Prof. Charles L. Seitz - Caltech, now President and CEO
  Dr. Robert Felderman - Director of Software Development
  Mr. Glenn Brown - Engineer and programmer
Clusters are used to form high performance servers, using commodity networks and hosts.
For performance numbers see: http://www.myri.com/myrinet/overview/
Beowulf-class machines
Using large numbers of commodity machines to make high performance computational systems by interconnecting them with a network.
  LANL's Loki http://loki-www.lanl.gov
  LANL's Avalon http://cnls.lanl.gov/avalon/
  JPL's Hyglac http://hpc.jpl.nasa.gov/PS/HYGLAC/hyglac.html
  INRIA's PopC (Pile of PCs)
Internet2
http://www.internet2.org/
World class research
  Driven by computational physics, biology, chemistry, and scientific visualization, virtual experiments, and remote control of real experiments.
Networking R&D
  Focused on exploiting the capabilities of broadband networks: media integration, interactivity, real time collaboration, ...
Improve production Internet services and applications for all members of the academic community, both nationally and internationally.
Purpose: support national research objectives, distance education, lifelong learning, and related efforts.
http://www.hpcc.gov/white-house/internet/background.html
Gigapops
Who will be operating them?
Where will they be?
How many will there be?
What is the aggregate throughput that they will require?
What is the maximum per port throughput?
How many ports will they need to support?
Will they support "mixing"? (mixing is used to defeat traffic analysis)
Whose hardware and software will they use?
What is the required functionality?
ASICs: Vertex Networks, Inc., MMC Networks, Inc., Galileo Technology, TI, ...
Future networks
Terabit per second == 10^12 bit/s
  Readily achievable via combining multiple Gigabit per second streams using Wavelength Division Multiplexing (WDM).
Petabit per second == 10^15 bit/s
Differentiated Services: Classes of Service, Multimedia
Constraint-based Routing (QoS Routing)
Ad Hoc Networking
Auto-configuration (Plug and Play Internet)
Active Networking
Smart Networking
Knowledge-based Networking
Active Networks
Network nodes can perform customized computations on the messages flowing through them.
Can change, modify the contents of the messages.
Potentially Mobility Enabling Routing using active network concept
[Figure labels: Client, Server, Active Node, Node, Active Node]
Bottlenecks
[Figure labels: Server, Server1, Microcell, Macrocell, Femtocell, Picocell, Gbit/s, kbit/s .. Mbit/s, User]
[Figure labels: GPS source, Camera(s)]
VANs DANs
The computer/printer/telephone/... will all be part of a very local area network on your desk.
  wireless links: No longer will you have to plug your printer (PDA/...) into your computer
  active badges: No longer will you have to sign in/out of areas, write down people's names at meetings, ...; the system can provide this data based on the active badges
Olivetti and Xerox are exploring Teleporting your windows environment to the workstation nearest you, on command, if there are multiple choices probe each one (currently a beep is emitted to tell the user which).
BANs
Users will be carrying multiple devices which wish to communicate: thus there will be a need for a network between these devices which you carry around; and personal devices will wish to interact with fixed devices (such as Bankomat machines, vehicle control systems, diagnostic consoles (for a mechanic or repairman), ...) and other peripherals.
Movement
Figure 123: Where am I? What am I? Who am I? Where am I going? When will I be there? What should I become? Who should I become?
Location-dependent services
Predicting location to reduce latency, reduce power, hide position, ...
Adapting the radio to the available mode(s), purposely changing mode, ...
Reconfigure the electronics to adapt, for upgrades, for fault tolerance, ...
Reconfiguration vs. powering up and down fixed modules (what are the right modules, what is the right means of interconnect, what is the right packaging/connectors/..., needed speed of adaptation)
Right level of independence: a spectrum from Highly Independent to Very Dumb
GPS or from the network operator's knowledge [resolution: 100 m to sub-centimeter]
Indoor: IR and RF beacons, triangulation, knowing what you can see or hear
What can I do with this knowledge?
KTH students built a Java applet which gets data from a GPS unit and dynamically displays a list of the information available, as a function of where you are:
  if near a bus, subway, or train stop: you get transit information, potentially with a real-time schedule, since the system knows the current location of vehicles
  list of restaurants, shops, etc. where you are and in the direction you are headed
    the scope is based on your velocity vector, so if you move quickly it reduces detail, but increases the scope
  map information with updated position
How do I know who I'm with or what I'm near?
Badges:
  Olivetti put them on people, equipment, ...
  Xerox put them on electronic notepads, rooms, ...
  MIT Media Lab is putting them on people + lots of inanimate objects (clock, fish tank, ...)
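Returning to the KTH applet described above, where the scope of the displayed information depends on the user's velocity vector, the idea can be sketched as follows. This is not the students' code: the function name, the base radius, the look-ahead time, and the speed thresholds are all hypothetical values chosen for illustration.

```python
import math

def search_scope(vx, vy, base_radius_m=200.0, lookahead_s=60.0):
    """Return (radius_m, detail) for a velocity vector in metres per second.

    Faster movement widens the search radius and lowers the level of
    detail, mirroring "reduces detail, but increases the scope".
    """
    speed = math.hypot(vx, vy)
    radius = base_radius_m + speed * lookahead_s
    if speed < 2.0:        # roughly walking pace (assumed threshold)
        detail = "high"
    elif speed < 15.0:     # cycling or slow traffic (assumed threshold)
        detail = "medium"
    else:
        detail = "low"
    return radius, detail

print(search_scope(0.0, 0.0))   # (200.0, 'high')
print(search_scope(3.0, 4.0))   # (500.0, 'medium')
print(search_scope(30.0, 0.0))  # (2000.0, 'low')
```

The direction of the velocity vector could additionally be used to bias the search area toward where the user is headed; the sketch uses only the speed.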
Human centered
[Figure labels: Computer, Environment awareness, You]
Requirements
Systems: traditional computers, desktop workspaces, domestic appliances, building and automotive systems, doors, elevators (lifts), environmental control, seats and mirrors, etc.
Systems to provide sensor data
Systems to correlate the sensor information and provide it in a useful way to the computer systems:
  Spatial and temporal sensor fusion, 3D and 4D databases, Machine Learning, and Prediction (based on pattern extraction)
Agents and actuators to provide intelligent control of the environment
wireless/wired/mobile communications infrastructures to link it all together
  must assure privacy and security
Badge: just emits its ID periodically
Smart Badge: [an IP device] Location and Context Aware (i.e., a sensor platform)
Intelligent Badge: add local processing for local interaction by the user
Mark T. Smith - Hewlett-Packard Research Laboratories, Palo Alto, California, USA
Dr. H. W. Peter Beadle
  Formerly: University of Wollongong, Wollongong, Australia
  Currently: Director, Motorola Australian Research Centre, Botany, NSW, Australia
[Figure labels: Badge, Location Server, Internet, Application, Badge Server, Application]
Banks as intermediaries (if they have any future role)
Smart Badge 3
[Block diagram labels: Analog Sensors, Microphone & Speaker, UCB1200, Digital Sensors, StrongARM SA-1100, PCMCIA Buffers, IR XCVR TFDU6100, DC to DC Power Supply, PCMCIA Connector, Battery]
Badge 3
MEDIA
High integration (goal of MEDIA project)
         Before            After
Parts    radio, µP, LAN    radio + LAN
Chips    5, 1, 1           1+1
MR
Partners:
  Kungl Tekniska Högskolan (KTH/ELE/ESDlab and KTH/IT/CCSlab)
  Tampere University of Technology (TUT)
  GMD FOKUS (GMD)
  Technische Universität Braunschweig (UBR)
  Interuniversity Microelectronics Centre (IMEC)
  Ericsson Radio Systems AB (ERA)
See http://www.ele.kth.se/ESD/MEDIA for more information
Split the functions between access point and access point server
[Figure labels: to infrastructure, Analog, radio, ISDN/xDSL/LAN]
[Figure labels: Radio, TV, Softradio, PC, Gateway, Softradio]
~1 Gbit/sec/user
"Telepresence for work is the long-term killer application" -- Gordon Bell and James N. Gray [1]
[1] "The Revolution Yet to Happen" in Beyond Calculation: The Next Fifty Years of Computing, Eds. Denning and Metcalfe, Copernicus, 1997.
"Now, for the next 50 years, the web will drive electronic commerce into the information age, ubiquitous computers will disappear into the woodwork, and we'll start uploading ourselves into the Internet to become at last immortal." -- Robert M. Metcalfe, June 26, 1997 [1]
[1] Robert M. Metcalfe, Internet Futures, MIT Enterprise Forum, June 26, 1997.
Future Systems
GPS source
Audio I/O via combined mic./earphone or neural connection
.../femto/pico/micro/macro/... cellular infrastructure
Neural interconnection to visual cortex
Bionic Technologies, Inc.'s Intracortical Electrode Array: Acute microelectrode assembly (10x10 array, 100 active electrodes): $1,250.00
Figure 125: 10 x 10 silicon electrode array (each electrode: 1.5 mm long, 0.08 mm wide at base, 0.001 mm tip), built at the Univ. of Utah by Richard A. Normann, et al.; from Scientific American, March 1994, pg. 108.
Figure 126: (a) Capacitive coupling of data into a nerve and (b) using the charge in the nerve to control a transistor's gate for getting data out of the nerve
(a) Peter Fromherz and Alfred Stett, Silicon-Neuron Junction: Capacitive Stimulation of an Individual Neuron on a Silicon Chip, Phys. Rev. Lett. 75 (1995) 1670-1673
(b) P. Fromherz, A. Offenhäusser, T. Vetter, J. Weis, A Neuron-Silicon Junction: A Retzius-Cell of the Leech on an Insulated-Gate Field-Effect Transistor, Science 252 (1991) 1290-1293.
What is going to be your planning horizon?
What will be the depreciation time for your equipment/software/infrastructure/...?
How fast:
  can you change?
  should you change?
  will you change?
Summary
Telecom operators are reinventing themselves and their infrastructures
Things to watch: IPv6, IPsec, Mobile-IP, DHCP, the new domain name registries, appliances, ...
Low cost access points which exploit existing or easily installed infrastructure are key to creating a ubiquitous mobile infrastructure with effectively infinite bandwidth.
Smart Badge is a vehicle for exploring our ideas:
  Exploits hardware and software complexity by hiding it.
  Explores allowing devices and services to use each other in an extemporaneous way.
  Enables a large number of location and environment aware applications, most of which are service consuming.
Keep your eyes open for the increasing numbers of sensors which will be on the network.
Service is where the money is!
Personal Communication and Computation in the early 21st century: "Just Wear IT!"
Coming in 20-30 years: "Just implant IT!"
Remember: The internet will be what you make it.
Further Reading
[138] Kalevi Kilkki, Differentiated Services for the Internet, Macmillan Technical Publishing, 384 pages, June 1999, ISBN: 1578701325.
[139] PCI-SIG, PCI-X 2.0: High Performance, Backward Compatible PCI for the Future, May 19, 2005 http://www.pcisig.com/specifications/pcix_20
[140] USB.org, Universal Serial Bus Revision 2.0 specification, May 19, 2005 http://www.usb.org/developers/docs/usb_20_02212005.zip
[141] Tom Clark, IP SANs: A Guide to iSCSI, iFCP, and FCIP Protocols for Storage Area Networks, Addison-Wesley, 288 pages, 2002, ISBN: 0-201-75277-8
General: http://www.ietf.org/
Thanks
Best wishes on your written assignments (or projects) and on the exam.