Where Do You "Tube"? Uncovering Youtube Server Selection Strategy
I. INTRODUCTION
YouTube, which started in 2005 as a garage project to share videos online, has seen explosive growth in its popularity. Today it is indisputably the world's largest video-sharing site, serving millions of users across the world every day. To keep up with this ever-increasing popularity and demand, its delivery infrastructure has undergone continual expansion. As shown in a recent study [2], in 2008 YouTube used 6 large data centers located within the United States to serve videos to users (while the LimeLight CDN was used to push the most popular videos to users). However, these data centers were not enough to meet the increasing global demand, and sometime after it was bought by Google, YouTube started expanding its video distribution network using Google's infrastructure.
In recent work [1], Adhikari et al. used a reverse-engineering-based methodology to uncover the basic principles behind the design of the YouTube video delivery cloud. It showed that the YouTube video delivery cloud consists of three components: the video-id space, hierarchical logical video server DNS namespaces, and physical video cache servers. YouTube then uses a combination of static and dynamic load-balancing approaches to distribute the demand to its physical resources. As shown in that work, the video-id space consists of fixed-length unique identifiers, one for each YouTube video, which are mapped to the hierarchical video server namespaces using static hashing. In addition, YouTube DNS servers map these server hostnames to IP addresses corresponding to physical hosts in a client-location-aware manner.
This work is supported in part by the NSF grants CNS-0905037 and CNS-1017647, and the DTRA grant HDTRA1-09-1-0050.
In this paper, we use the same active measurement infrastructure as [1] to provide insights into the server selection strategy employed by the YouTube video delivery network. In particular, we (i) extract and chart the physical resources currently used by YouTube to serve videos, (ii) provide details on the various strategies YouTube uses to distribute video requests to its geographically distributed global cache servers, and (iii) show how these strategies interact with each other.
Our study shows that YouTube uses three different approaches to distribute the load among its servers.
a. Static load-sharing using a hash-based mechanism: As noted in [1], YouTube maps each video-id to a unique hostname in each of the namespaces of the hierarchical DNS-based host namespaces. This provides very coarse-grained load-sharing that allocates an equal number of videos to each of the hostnames in the primary namespace.
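YouTube's actual hash function and namespace sizes are not public; the following is only a minimal sketch of such a static, coordination-free mapping, assuming an MD5 hash over the video-id and an illustrative namespace size and hostname pattern (none of which come from [1]).

import hashlib

# Illustrative namespace size and hostname pattern; the real values are
# internal to YouTube and are assumed here only to make the sketch concrete.
PRIMARY_NAMESPACE_SIZE = 192

def video_id_to_hostname(video_id: str, namespace_size: int = PRIMARY_NAMESPACE_SIZE) -> str:
    """Map a fixed-length video-id to one logical hostname via static hashing.

    Every client computing the same hash obtains the same hostname, so the
    mapping needs no central coordination and spreads video-ids evenly
    (coarse-grained load-sharing) across the namespace.
    """
    digest = hashlib.md5(video_id.encode("ascii")).hexdigest()
    bucket = int(digest, 16) % namespace_size
    # Hypothetical hostname pattern for the primary namespace.
    return f"v{bucket + 1}.primary.cache.example.com"

if __name__ == "__main__":
    for vid in ["dQw4w9WgXcQ", "jNQXAC9IVRw"]:
        print(vid, "->", video_id_to_hostname(vid))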
b. Semi-dynamic approach using location-aware DNS resolutions: YouTube maps each DNS hostname to an IP address, which represents a physical video cache, based upon the user's location and the current demand. As seen in our experiments, YouTube directs users to a geographically close cache location during normal hours. However, during busy hours it uses DNS-based resolutions to direct users to slightly farther locations, which helps avoid geographical hot-spots. This is one of the new insights we gained by analyzing a large number of continuous DNS resolutions over more than a month.
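Our measurement scripts are not reproduced here; the snippet below is a minimal sketch of the kind of periodic DNS probing that reveals this behavior, using the standard Python resolver and a purely hypothetical cache hostname.

import socket
import time
from datetime import datetime

def poll_dns(hostname: str, interval_s: int = 300, rounds: int = 12) -> None:
    """Resolve `hostname` every `interval_s` seconds and log the A records.

    Comparing the logged IPs across time slots (5-minute intervals in our
    measurements) shows whether a logical server name maps to a fixed IP or
    rotates among several physical caches during busy hours.
    """
    for _ in range(rounds):
        try:
            _, _, ips = socket.gethostbyname_ex(hostname)
        except socket.gaierror as exc:
            ips = [f"error: {exc}"]
        print(f"{datetime.utcnow().isoformat()} {hostname} -> {sorted(ips)}")
        time.sleep(interval_s)

if __name__ == "__main__":
    # Hypothetical logical cache hostname; substitute a name observed in practice.
    poll_dns("v1.primary.cache.example.com", interval_s=300, rounds=3)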
c. Dynamic load-sharing using HTTP redirections: Finally, to further balance the load on different physical servers, YouTube caches use HTTP redirections to direct users from a busy server to a less busy video server. This helps smooth the skewed load distribution caused by the combination of video popularity and spontaneous video demand. The interesting and new observation we made by analyzing the redirection logs is that YouTube uses local intra-tier load-sharing before resorting to inter-tier load-sharing. In addition, none of these approaches requires any centralized coordination.
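As an illustration of how such redirection chains can be observed from a client, the sketch below issues a request and records each Location header without following redirects automatically. The video URL and cache behavior are assumptions for the example, not YouTube's documented interface.

import http.client
from urllib.parse import urlsplit

def redirect_chain(url: str, max_hops: int = 5) -> list:
    """Follow HTTP redirections hop by hop, recording every Location header.

    A busy cache answers with a redirect pointing at a less busy cache, so
    the chain of Location headers reveals intra-tier vs. inter-tier handoffs.
    Assumes absolute URLs in the Location header.
    """
    chain = [url]
    for _ in range(max_hops):
        parts = urlsplit(chain[-1])
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        location = resp.getheader("Location")
        conn.close()
        if resp.status in (301, 302, 303, 307) and location:
            chain.append(location)  # redirected to another cache server
        else:
            break  # 2xx (served locally) or non-redirect response
    return chain

if __name__ == "__main__":
    # Hypothetical video request on a cache server; substitute a real URL.
    for hop in redirect_chain("http://v1.primary.cache.example.com/videoplayback?id=example"):
        print(hop)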
Our findings also show that YouTube caches are present in more than 45 cities in 25 different countries around the world. These trends suggest that Google is aggressively pushing its content close to users by placing a large number of cache servers at various geographical locations around the world. Moreover, several of these caches are co-located with ISP PoPs, which not only helps reduce bandwidth costs for both the ISP and YouTube, but also improves performance for the ISP's users.
Fig. 1. [Figure; caption truncated. Recoverable label: Google 69.5%.]
Fig. 2. Three tiers of YouTube video cache locations (P=Primary, S=Secondary, T=Tertiary).
[Figure (axis labels only): number of IP addresses per cache location.]
Fig. 4. CDF plot showing which decile mapped IPs belong to in terms of ICMP ping latency (x-axis: percentile to which the mapped IPs belong).
Figure 6 shows an example plot for the second group. In this figure, the X-axis represents time, divided into intervals of 5 minutes each, and the Y-axis represents the mapped IP address. As seen in the figure, the mapped IP address changes in almost every interval. This suggests that YouTube uses 4 distinct physical servers to represent the instance of one particular primary logical server, and changes the mapped IP address to divide client requests among these physical servers. On the other hand, there are other locations where new IP addresses show up only at specific times of the day. Our analysis of the DNS resolution patterns for these locations shows that each logical server hostname maps to a fixed IP address for most of the day; however, during certain hours of the day we see a large number of distinct IP addresses for the same hostname. We show an example of such a location in Figure 7. In this figure, we see that the DNS servers primarily map the hostname to IP 3. However, at busy hours of the day, the DNS server also starts mapping the hostname to other IP addresses. In this week-long data, we can clearly see 7 specific periods (one every day) in which additional IP addresses other than IP 3 appear.
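The exact procedure we used to flag these busy-hour periods is not spelled out above; the sketch below is a minimal illustration, assuming a simple (timestamp, hostname, IP) log format of our own and the 5-minute time slots used in Figures 6 and 7. The hostnames and IPs are purely synthetic.

from collections import defaultdict
from typing import Iterable, Tuple

SLOT_SECONDS = 300  # 5-minute time slots, matching the plots in Figs. 6 and 7

def ips_per_slot(records: Iterable[Tuple[float, str, str]]):
    """Group (unix_time, hostname, ip) DNS records into 5-minute slots.

    Returns {hostname: {slot_index: set_of_ips}}. Slots in which a hostname
    resolves to extra IPs mark the busy-hour periods where additional
    physical servers appear behind the same logical name.
    """
    slots = defaultdict(lambda: defaultdict(set))
    for ts, host, ip in records:
        slots[host][int(ts // SLOT_SECONDS)].add(ip)
    return slots

def busy_slots(records, host):
    """Return slot indices in which `host` maps to more than one distinct IP."""
    per_slot = ips_per_slot(records)[host]
    return sorted(slot for slot, ips in per_slot.items() if len(ips) > 1)

if __name__ == "__main__":
    # Tiny synthetic log: IP .3 is dominant, .7 appears only in one slot.
    log = [(0, "v1.example.com", "10.0.0.3"),
           (300, "v1.example.com", "10.0.0.3"),
           (600, "v1.example.com", "10.0.0.3"),
           (600, "v1.example.com", "10.0.0.7")]
    print(busy_slots(log, "v1.example.com"))   # -> [2]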
Fig. 6. [Plot: IP address identifiers vs. time slot identifiers, each point indicating a 5-minute interval; caption truncated.]
Fig. 7. Example plot showing the hostname-to-IP mapping changing only during peak hours (IP address identifiers vs. time slot identifiers, each point indicating a 5-minute interval).
Fig. 8. [Plot: number of video requests served vs. hostnames from the primary namespace; caption truncated.]