The Cost of Push Notifications For Smartphones Using Tor Hidden Services
Abstract—Push notification services provide reliable, energy efficient, store-and-forward messaging between servers and clients. This mode of communication is widely used, and sufficiently compelling for mobile devices that push notification services are integrated into operating systems. Unfortunately, push notification services today allow the service provider to practice censorship, surveillance, and location tracking. We explore whether running a Tor hidden service from a smartphone offers a viable, privacy-aware alternative. We conduct empirical measurements in the lab as well as modelling using data from 2 014 handsets in the Device Analyzer dataset. We estimate the monthly median cost of cellular data required to support a Tor hidden service from a smartphone at 198 MiB. We further estimate that the network activity would cost at least 9.6% of total battery on a Nexus One device with a daily charging cycle and connected to the Internet via 3G. We explore four strategies for reducing cellular data costs which, when combined, could potentially reduce the total monthly median cost to 61 MiB.

I. INTRODUCTION

Push notification services provide reliable, energy efficient, store-and-forward messaging between servers and clients. This mode of communication is sufficiently compelling for mobile devices that push notification services are integrated into operating systems. For example, Google Cloud Messaging (GCM) is embedded into Android through the Google Play Services API. GCM is also available as a library for developers of iOS apps and developers of extensions for the Chrome web browser. Consequently, push notification services are widely used by apps to support both device-to-device communication (e.g. sending and receiving messages between users of a social media app) as well as supporting information dissemination (e.g. news apps and sports score apps).

Push notifications provide app writers and client device owners with four advantages: first, if the client device is switched off, or temporarily disconnected from the Internet, the push notification service will store messages and deliver them when the device is next online; second, push notification software on the client initiates a single long-lived TCP connection from the client device to the service, avoiding issues with NAT and firewalls as well as removing the need to poll servers periodically for updates; third, multiple messages destined for a variety of apps on a single client device can be coalesced temporally and multiplexed down a single TCP connection, saving battery life and improving performance; finally, an app server can achieve service fan-out by sending a single copy of a message to a push notification service and requesting that the message is delivered to many devices on a group or topic basis.

There are downsides to push notifications however. From a privacy perspective, a push notification service has the disadvantage that the service can see the sender and the recipient of every notification across a broad range of apps and thus may conduct surveillance and censorship. While data is encrypted between the app server and the notification service, and between the notification service and the handset, there is no requirement for it to be encrypted end-to-end. Therefore, app data can often be read by the push notification server. In addition, regardless of support for end-to-end encryption between an app server and a handset, metadata on which handsets use which apps, as well as the location of the user (e.g. via the handset’s IP address), are revealed to the notification service.

In this paper we explore the design space of more privacy-friendly designs for push notification services. We consider three broad options: use push notification services as deployed today; connect to a single push notification service via Tor; or run a separate push notification service per app and connect to each of these via Tor. In the latter two cases, connections via Tor could be made outbound from the phone to the service or to a Tor hidden service running on a smartphone. We discuss details of these designs, and the trade-offs they represent, in Section III after we review the background on Tor and its support for hidden services in Section II.

A key requirement for mobile devices is careful management of battery energy and cellular data usage. We therefore measure the data usage costs of using Tor and running a Tor hidden service on an Android handset. We do this by breaking down the costs of running a Tor hidden service into components that allow us to produce a model of data usage as a function of the connectivity profile of a handset. By using connectivity data, such as the availability of WiFi and cellular data from 2 014 handsets in the Device Analyzer [1]
dataset, we estimate the cellular data usage of running a Tor hidden service on an Android device. We find that running a Tor hidden service today costs the median Device Analyzer user 198 MiB per month in cellular data usage, and the top 10% of devices in excess of 362 MiB (and at €0.20 per MiB, €72.40 per month). Using EnergyBox [2] we estimate cellular data usage costs 9.6% of battery charge for a Nexus One device with a daily charge cycle.

The cellular data costs of running a Tor hidden service are significant. We therefore explore four strategies to reduce costs, making push notification services over Tor more attractive. By combining these strategies, we are able to reduce the costs of running a Tor hidden service for the median Device Analyzer device to 61 MiB per month.

II. BACKGROUND

To understand the costs and benefits of using Tor to improve the privacy of push notification services for mobile devices, we start with a brief summary of Tor. The Tor network is composed of clients, which generate and consume traffic, including smartphones, laptops, or desktops; and servers, called relays, which forward traffic to other relays and make connections to the public Internet on behalf of clients. To use the Tor network, clients download the latest network status document approximately every 90 minutes, which lists information about around 7 000 Tor relays currently available worldwide. The network status document is managed by a small number of more trusted servers, called directory authorities, who vote on a consensus of its contents once an hour. The directory authorities publish additional, relatively static, information on relays in relay descriptors every 18 hours. After downloading the network status document, clients download any relay descriptors mentioned in that document that the client does not already have. The client also downloads certificates of authorities where it does not already have a current one.

Clients use relays to build circuits through a sequence of (typically three) relays. Such circuits support an overlay network between the client and the final public Internet service required. The client applies layers of encryption in such a way that none of the relays, nor the final Internet service, is able to determine which devices on the Tor network are connecting to which Internet services. Because circuit construction takes time, clients proactively build circuits in anticipation of any requirement for data connectivity; circuits can also support multiple concurrent TCP streams. Tor clients periodically send keep-alive messages on idle open connections to prevent the connection from expiring at any intermediate routers. The default interval between keep-alive messages is currently 5 minutes. To improve the privacy properties of Tor, circuits are (at least partially) rebuilt every 10 minutes.

A. Tor hidden services

In addition to supporting clients connecting to services such as websites on the public Internet, Tor also allows Tor clients to publish hidden services. A hidden service is identified by an onion address, which represents the first 16 characters of a base32-encoded SHA1 hash of a public key generated by the client. Onion addresses are long-lived identifiers, distributed through some out-of-band mechanism between parties who wish to communicate. For example, Facebook offers a Tor hidden service at https://facebookcorewwwi.onion/.¹ An onion address allows Tor clients to establish circuits with the hidden service using the Tor rendezvous protocol, and therefore transfer data to and from the service over the Tor network. The design of Tor hidden services prevents any single relay from learning the IP address associated with an onion address, therefore providing anonymity to both a hidden service provider and its clients.

The rendezvous protocol is described below, where we assume that Bob wants to run a hidden service and Alice wants to connect to it. A reasonably detailed understanding of these steps is required in order to understand the network and energy costs presented in later sections of the paper.

Bob creates a hidden service:
1) Bob asks his Tor client to create a new hidden service. This generates a public-private key pair for the service. The public key of the service identifies the service and is used to generate an onion address.
2) Bob shares his onion address with Alice via an out-of-band mechanism.

Bob runs a hidden service:
3) Bob’s Tor client chooses a small number of (typically three) relays as introduction points. Bob then establishes a circuit to each introduction point and sends a single-use public key, or service key, and signs a message to prove he is the owner of this public key.² Bob’s Tor client must keep the circuits to the introduction points open while the service is running to receive connection requests from new clients.
4) Bob’s Tor client generates a service descriptor containing the public key, the service key, and the introduction points. The service descriptor is uploaded to a few (currently six) hidden service directories, chosen based on the descriptor ID, which is a hash of the service’s public key, the current date and time, and other deterministic data. Bob’s Tor client publishes a new descriptor once an hour, or whenever its content changes.

¹ Note: Facebook have spent considerable computational resource to find a public key whose base32-encoded SHA1 is memorable.
² Earlier versions used the public key of the hidden service instead of a single-use service key, but this allowed the introduction point to monitor Bob’s activity.
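The derivation of an onion address described above (the first 16 characters of a base32-encoded SHA1 hash of the service's public key) can be illustrated with a minimal Python sketch. The function name `onion_address` and the placeholder key bytes are hypothetical, and the sketch hashes whatever bytes it is given; it makes no claim about how a real Tor client serialises the key before hashing.

```python
import base64
import hashlib


def onion_address(public_key_bytes: bytes) -> str:
    """Return a 16-character onion label (plus '.onion') for the given key bytes.

    Illustration only: the caller decides how the public key is encoded.
    """
    digest = hashlib.sha1(public_key_bytes).digest()             # 20-byte SHA1 hash
    encoded = base64.b32encode(digest).decode("ascii").lower()   # 32 base32 characters
    return encoded[:16] + ".onion"                               # keep the first 16


# Hypothetical stand-in for an encoded public key (not a real key).
example_key = b"example public key bytes (placeholder, not a real key)"
address = onion_address(example_key)
print(address)
```

Because 16 base32 characters carry 80 bits, the address depends only on the first 10 bytes of the hash, which is why finding a memorable address (footnote 1) requires generating a large number of candidate keys.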
Alice connects to Bob’s hidden service:
5) Alice’s Tor client determines the set of hidden service directories responsible for Bob’s key using his onion address and the current time, and retrieves Bob’s service descriptor from one of them.
6) Alice’s Tor client establishes a rendezvous point. It does so by randomly choosing a Tor relay, building a circuit to it, and asking it to act as a rendezvous point, specifying a randomly chosen 20-byte rendezvous cookie.
7) Alice’s Tor client connects to one of Bob’s introduction points and requests an introduction to Bob by providing a hash of Bob’s service key. Alice also sends a rendezvous request, including the address of the rendezvous point, the rendezvous cookie, and the first part of a Diffie-Hellman key exchange, all encrypted under Bob’s temporary service key.
8) The introduction point forwards the rendezvous request to Bob. Bob checks the request is valid and not a replay.
9) Bob’s Tor client creates a new circuit to the rendezvous point chosen by Alice and asks the rendezvous relay to complete a circuit to Alice. Bob’s request contains the rendezvous cookie, the second part of the Diffie-Hellman exchange, and a handshake digest. The rendezvous point forwards the latter two to Alice’s Tor client. Alice’s Tor client checks that the handshake is valid, and both sides derive a new set of keys. A new circuit is now established between Bob and Alice.
10) Alice can now establish one or more TCP connections over her circuit with Bob.

III. PUSH NOTIFICATIONS OVER TOR

We now consider three overall designs: push notification services as deployed today; connection to a single push notification service via Tor; and running a separate push notification service per app and connecting to each of these via Tor.

Connecting to a single push notification service may be more energy efficient than using one push notification service per app since separate messages from multiple app servers (likely destined for a variety of apps on the same handset) can be coalesced into a batch for delivery in a single Tor circuit. The downside is that the push notification service learns the app servers (and therefore apps) communicating with a single handset, although it does not necessarily know the identity or location of the handset if such communication is sent over Tor.

Running a hidden service on a smartphone does not, at first glance, appear to provide much benefit over the use of an outbound Tor connection to a push notification service. Importantly, however, hidden services allow app developers to avoid using a push notification service at all if the aim of the app is to share data between client devices.

Mobile devices typically sit behind a NAT or firewall. Thus, direct phone-to-phone communication is often difficult or impossible. If every device operates a Tor hidden service, direct communication between two smartphones is now possible, as an onion address is globally unique and accessible. The downside to this approach is that both the sending and receiving smartphone need to be online simultaneously for data to flow. This requires careful scheduling of smartphones to wake from low-power states and both devices to have network connectivity at the same time. We note that an energy- and data-efficient solution is likely a prerequisite for mobile apps that use device-to-device communication (e.g. messaging apps). We therefore focus on data and energy issues of Tor hidden services. We leave the issue of scheduling communication between devices for future work, although such issues have been addressed before. For example, the PEN network supported direct peer-to-peer communication, with a scheduling algorithm that was more efficient than the more traditional (centralized) master-slave scheme [3, p. 21].

Both connecting to push notification services via Tor, and the use of Tor hidden services, inherit the anonymity properties of Tor, which is resistant to local adversaries who are able to control any local network. This means that a local adversary does not learn the endpoints of any connections. In addition, the app server may also be located behind a hidden service, providing anonymity for the app server too.

Regardless of whether we use a single push notification server, a push notification service per app, or phone-to-phone communication, our primary concern is that using Tor, and possibly running a Tor hidden service, may be significantly less energy-efficient, or may result in substantially more data usage, than traditional push notification services. Quantifying and improving the cost of Tor is a requirement in all three use-cases and is thus the focus of the remainder of this paper.

We note that the use of Tor to support push notifications may increase latency for message delivery, but we do not believe the typical latency times found on Tor will lead to large problems for push notifications. We therefore leave this analysis as an area of future work.

IV. EXPERIMENTAL METHOD

We present a series of experiments to measure data usage requirements and to estimate the energy consumption of using Tor and operating a Tor hidden service on smartphones. Our testbed consists of two Nexus 5X smartphones running CyanogenMod (Android 6.0.1). To support the creation and operation of Tor hidden services, we developed a simple custom app that uses the Tor Project’s Orbot Android app (version 15.1.2) to run a hidden service. Our app accepts connections to the hidden service and logs any data sent to it, allowing us to explore data transmission at various rates between the smartphone and another computer. To avoid problems with the phone going into deep sleep, we configured the phone to always stay awake. To provide a comparison with Google’s
Cloud Messaging (GCM) service, we installed and enabled the Google Play Services Framework when necessary.

We obtained full packet traces of all traffic on a Linux workstation by connecting the smartphones to a NETGEAR WiFi access point with an Ethernet uplink connected to a workstation. The workstation was configured to route data onto the wider Internet, allowing connections to and from the Tor network and GCM. The experiments were conducted between December 2016 and February 2017.

A. Measuring Tor traffic

To estimate the cost of using Tor for push notifications, we wanted to construct an empirical model of Tor traffic. Such a model is important for accurately estimating the data and energy costs an app might generate using any of the Tor-based push notification systems we discussed in Section III.

As discussed in Section II, the Tor client takes part in many different network activities which we break down into nine categories in order to build an empirical model: regular downloads of the network status; relay (micro) descriptor data; creating circuits to introduction points; regular uploads of hidden service descriptors; sending keep-alive messages along established connections to Tor relays; downloading authority certificates; measuring circuit timeouts; establishing and closing connections to a (first hop) Tor relay; and creating circuits, responding to connection requests, and data communication associated with a hidden service.

In this section we describe how we quantify the amount of network traffic in each of the above categories. We use this analysis in Section VI to derive an empirical model of Tor data usage and assess the real-world impact of using Tor with handsets in the Device Analyzer project.

Tor traffic is encrypted, and thus it is not straightforward to obtain a breakdown of traffic by category. We therefore instrumented the Tor source code to identify and log the purpose (thus category) of all network data sent or received by the Tor client. We used the log to associate this category with each packet in the network trace captured by the workstation.

Tor clients and routers communicate with one another via TLS connections with ephemeral keys. Traffic on these connections consists of 514-byte cells, which contain a header and a payload. Cells are either control cells, used to create, extend, or destroy a circuit, or are payload cells, containing encrypted data travelling over an existing circuit. The circuits themselves are used to support connectivity for client applications (e.g. allowing an app on the phone to make a TCP connection to a push notification service) and maintain connectivity to the Tor network (e.g. downloading network status; uploading a hidden service descriptor; sending a keep-alive message; and so on).

Our instrumented Tor client generally allows us to determine the purpose of each cell sent or received, but associating this with the network trace captured by the workstation is difficult because: multiple cells may be carried inside a single IP packet; a single cell may be split across IP packets; and TLS handshake messages, TLS headers, TCP headers and TCP re-transmissions introduce additional overhead that should be associated with the underlying category of use.

Accounting for the TCP header size and re-transmissions is relatively easy as these are visible in the packet trace. To account for TLS headers and overheads, we record the number of bytes read and written to the TLS stream and to the underlying TCP socket. We match the byte counts written to the TCP socket with the bytes sent in the network trace to determine which cells (or parts of cells) are contained within a specific network trace. The overheads resulting from TLS and TCP are assigned proportionally to the cells contained within the relevant packets.

Determining the purpose of each cell is generally straightforward since the cell header associates the cell with a specific circuit, and additional instrumentation allows us to record the current purposes of a circuit or of the stream associated with the cell. One complication is that the assignment of a purpose to a cell cannot be made directly after data is read from the underlying TLS connection, since only part of a cell may be returned. Additional bookkeeping is thus needed so that the purpose can be determined after complete cells have been received and parsed. Another difficulty is that many TCP streams can be multiplexed down a single circuit. For circuits that were used for more than one purpose, there can exist some traffic that cannot be assigned to a particular TCP stream (e.g. creating a new circuit); if the purpose cannot be uniquely inferred, the traffic cost is shared equally between all the purposes associated with the circuit.

Consequently, there are two approximations in our analysis that are small and therefore do not have a material impact on our analysis. First, since Tor preemptively builds circuits, some of these circuits may not have been used; we find unused circuits were responsible for only 0.1% of the total traffic. Second, when cells cannot be associated with a TCP stream, and their purpose cannot be inferred, we assign their cost equally to all purposes associated with a given circuit; this only affected 0.2% of the total traffic. Section V-A offers more details.

V. RESULTS

We now report on four experiments. First we measure the cost of maintaining a Tor hidden service for a fixed IP address and stable Internet connection. Second, we measure the additional cost of changing our IP address, a regular occurrence for a smartphone as it moves between cellular data and WiFi networks. Third, we explore the overhead of data transmission across the Tor network. These results allow us to produce a model of the cost of running a Tor hidden service, something we build on in Section VI. Finally, for comparison, we measure the overheads of using GCM.
A. Hidden service maintenance

We measured the network traffic induced by maintaining a Tor hidden service over a 48-hour period using our testbed. We recorded 32.5 MiB of Tor traffic, including IP headers across 46,790 packets, or an average of 693 KiB (975 packets) per hour. The large majority of the traffic volume in bytes was caused by network status consensus downloads (79.9%), with another 11.7% caused by hidden service descriptor uploads. Downloading relay descriptors caused 4.3% of the traffic, keep-alive messages 2.5%, and introduction circuits 0.2%. Establishing and closing connections to entry (first hop) relays was responsible for 0.8% of the traffic. 0.2% was used to measure circuit timeouts, another 0.2% to fetch authority certificates, and the remaining 0.1% was used to manage circuits that remained unused. Table I provides further detail.

At the time of writing, directory authorities vote on a new network status consensus every hour, which is valid for three hours. Clients download a new consensus at a randomly chosen time between 105 and 170.6 minutes after their current consensus becomes valid. We observed a total of 38 consensus downloads, with an average size of 699 ± 9 KiB. In addition, we saw one case where the directory server returned a “304 Not modified” status. In this case, the client retried the download after one minute, resulting in the same status code. When the client tried again at a different server 10 minutes later, it received a full consensus document again. This caused an additional 8 KiB of traffic. We also observed 336 hidden service descriptor uploads. Keep-alive messages are padded to the size of a cell; a keep-alive IP packet has a total size of 595 bytes and is answered by an ACK packet of 52 bytes. Both sides of the connection send a keep-alive packet, resulting in 4 packets and 1 294 bytes exchanged per idle connection every 5 minutes.

Type of traffic            KiB/h    KiB%   Pkts/h   Pkts%
Network status download      554   79.9%      694   71.2%
Relay descriptors             30    4.3%       47    4.9%
HS descriptor                 82   11.7%      144   14.8%
Keep-alive                    17    2.5%       54    5.6%
Introduction circuits          1    0.2%        3    0.3%
First-hop connections          6    0.8%       24    2.5%
Measure circuit timeout        2    0.2%        3    0.3%
Authority certificate          2    0.2%        3    0.3%
Unused circuits                1    0.1%        1    0.1%
Total                        693    100%      975    100%

Table I
AVERAGE NETWORK TRAFFIC GENERATED WHEN MAINTAINING A TOR HIDDEN SERVICE.

B. Network connectivity changes

Smartphones regularly change their network connectivity as they move between WiFi access points and connections via cellular data services. Whenever such device connectivity changes, connections to the Tor network must be re-established because the source IP address used to support the TCP connections underlying Tor circuits changes.

To estimate the total additional network traffic caused by network connectivity changes, we used the same setup as in the maintenance experiment in Section V-A, but forced a disconnect of the WiFi connection every 20 minutes, and a reconnect 5 seconds later. When Orbot detects that the network is down, Tor shuts down all connections and starts rebuilding connections when connectivity is back.

We then measured the amount of traffic generated over 48 hours and classified it as in Section V-A. Our experiments showed that network status document and relay descriptor downloads were not affected by connectivity changes. We therefore exclude traffic classified as one of these categories. The current implementation of Orbot chooses new introduction points after each reconnect, and re-uploads the hidden service descriptors. Based on Section V-A, which describes the traffic required for a set of hidden service descriptor uploads, we also exclude traffic related to them to get an estimate of the remaining traffic caused by a connectivity change. Ignoring traffic related to these three activities, we calculated the difference in total traffic compared to the idle connection (Section V-A): we measured 5 628 KiB of traffic, compared to 1 362 KiB for the idle connection. During the 48-hour period, the WiFi reconnected 143 times. We therefore estimate an average additional traffic per reconnect of 29.8 KiB, primarily for re-establishing connections, introduction circuits, and other circuits. Adding the approximately 70 KiB it takes to upload hidden service descriptors, a reconnect generates roughly 100 KiB of traffic.

C. Data transmission

We measured the overhead of transmitting data over the Tor network. To do so, we sent messages of three different sizes (1 B, 512 B and 1 KiB) at three different intervals (1 min, 8 min, 12 min) to the smartphone. We chose 8- and 12-minute intervals to explore the effect of circuit rebuilds, which currently occur every 10 minutes (Section II). For each message, we established a fresh TCP connection to the hidden service and sent a stream of bytes of the given length before closing the connection. For each combination, we sent messages for 4 hours. Table II shows how much traffic was generated on average by a single message for different message sizes and rates. We estimated this amount by counting all traffic not labeled as network status download, relay descriptor download, hidden service descriptor upload, certificate authority download, or measuring circuit timeout over the 4-hour period, subtracting the expected amount of traffic for the same categories for simply maintaining the hidden service as measured in Section V-A (429 bytes/1.4 packets per minute), and dividing by the number of messages sent. Note that we count keep-alive traffic, as receiving messages may reduce or increase the need for keep-alive messages.
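The per-message estimate described in Section V-C (count the traffic in the retained categories, subtract the idle baseline of 429 bytes per minute from Section V-A, and divide by the number of messages) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name is hypothetical, and the byte total in the usage example is invented purely to show the arithmetic.

```python
# Idle maintenance traffic in the retained categories, from Section V-A.
BASELINE_BYTES_PER_MIN = 429


def per_message_overhead_kib(counted_bytes: float,
                             duration_min: float,
                             n_messages: int) -> float:
    """Average additional traffic per message in KiB.

    counted_bytes: traffic observed in the retained categories over the run.
    duration_min:  length of the run in minutes.
    n_messages:    number of messages sent during the run.
    """
    if n_messages <= 0:
        raise ValueError("n_messages must be positive")
    expected_idle = BASELINE_BYTES_PER_MIN * duration_min  # maintenance-only traffic
    return (counted_bytes - expected_idle) / n_messages / 1024


# Hypothetical run: 4 hours with one message per minute (240 messages).
# The 800 000-byte total is an invented figure, not a measurement.
duration = 4 * 60
messages = duration // 1
estimate = per_message_overhead_kib(800_000, duration, messages)
print(f"{estimate:.2f} KiB per message")
```

Subtracting a per-minute baseline rather than a fixed amount means that runs of different lengths or message intervals remain comparable.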
Interval   1 B          512 B        1 KiB
1 min      2.7 (7.6)    3.2 (7.8)    3.8 (9.2)
8 min      5.7 (15.7)   6.1 (16.4)   7.2 (19.6)
12 min     9.1 (25.1)   9.9 (26.7)   9.7 (25.4)

Table II
THE AVERAGE ADDITIONAL NETWORK TRAFFIC IN KIB (NUMBER OF PACKETS IN BRACKETS) GENERATED PER MESSAGE OVER TOR FOR DIFFERENT MESSAGE SIZES AND DIFFERENT SENDING RATES.

D. Comparison with GCM

For comparison with Tor, we used our testbed to measure the costs of maintenance, connectivity changes, and message overhead of using GCM. To determine the traffic relevant to GCM, we filtered TCP traffic from the smartphone whose destination was mtalk.google.com, ports 5228–5230.

Push notifications over GCM require Google Play Services (PS) running on the handset. PS initiates and maintains a single open TCP connection to a GCM server to receive push notifications. To keep the connection alive, PS periodically sends keep-alive messages to a GCM server. The active keep-alive intervals can be determined by typing the code *#*#426#*#* in the Phone app. Using this technique, we experimentally confirmed that, for mobile data connections, PS currently uses a 28-minute interval. On WiFi, PS uses a proprietary adaptive algorithm to determine an interval of between 110 seconds and 29 minutes; in our case the interval was typically set to 19, 24, or 29 minutes.

There are no entries in the smartphone system log concerning keep-alive messages. Thus, to quantify data usage and packet count for keep-alive messages, we looked at the packet trace from the smartphone deployed with our testbed with PS installed and enabled. To ensure that PS connected to GCM and waited for push notifications, we wrote and launched a simple app that waits for incoming GCM messages. We observed a periodic burst of three or four packets with a total length between 224 and 278 bytes, which matched the WiFi heartbeat interval. From the 246 bursts we observed, the average total size was 238 ± 22 bytes (not counting duplicate packets). Alongside this periodic burst, we sometimes observed up to four additional packets containing duplicate TCP packets (up to 528 bytes in total). The contents of the packets were encrypted so we could not determine further details of the keep-alive message or the purpose of the retransmission. The average total size including duplicate packets was 258 ± 57 bytes.

We repeated the experiment described in Section V-B for GCM on a Nexus 5X handset. We again forced the phone to reconnect to WiFi every 20 minutes. We ran the experiment for 48 hours. We measured the amount of traffic within a minute after each reconnect and observed a burst of traffic, with an average size of 2.9 ± 1.5 KiB (16.8 ± 1.6 packets) in 141 out of 143 cases when the phone reconnected to WiFi. In two cases, we observed no additional traffic directly after a reconnect. We assume that the fact that we did not change the IP address might have resulted in PS not reconnecting in these cases.

To measure the traffic overhead when sending messages to the smartphone, we sent similar messages to our GCM-enabled app as we did over Tor in Section V-C. We used the same message sizes and intervals (1 min, 8 min, 12 min; 1 B, 512 B and 1 KiB) and we measured each combination for 2 hours. The average traffic per message did not significantly differ for different intervals. Per 1-byte message we observed on average 0.3 KiB, per 512-byte message 0.8 KiB, and per 1024-byte message 1.3 KiB of traffic.

VI. TOR HIDDEN SERVICE MODEL

In this section, we use the results of Section V to derive a model for the data usage of a hidden service on a smartphone. We use this model to evaluate the data usage and energy costs of using Tor to support a push notification service on real devices in Section VI-A. The model is based on the results from our measurements and therefore on the current state of the Tor network. Future work could take into account the changing nature of the Tor network and create a model that depends on parameters like the number of Tor relays that notably impact the amount of network traffic.

To estimate the total network traffic required to maintain the hidden service on a phone, we require knowledge of the connectivity profile of a device: the periods the device was connected to the Internet via WiFi or a cellular network, when the IP address of the handset changes, and when no network connectivity is available. Network traffic is generated by periodic network status and relay descriptor downloads, hidden service descriptor uploads, and the creation and maintenance of Tor circuits to introduction points. We look at each of these in turn.

The network status document is downloaded at regular intervals. Building on our analysis in Section V-A, we assume that the Tor client starts a network status download when either: a disconnected device connects to the Internet and has no valid network status document; or time t (chosen uniformly at random from the interval [105, 170.6] minutes) has passed since its current network status document became valid. We assume that a new network status document becomes valid on the hour, every hour (in UTC) and the client always downloads the most recent valid document. We assume that each consensus download produces 716 419 bytes of traffic, the average measured in Section V-A.

We assume that the client downloads a set of relay descriptors immediately after it downloads a network status document. We assume that this requires 30 356 · h bytes of traffic, where h is the number of hours that have passed since the last network status download. This is based on the average descriptor download traffic we observed per hour, and should give a good approximation, in particular because
the large majority of descriptor downloads happens shortly after a network status download.

We assume that each time the phone changes the way it connects to the Internet, it needs to rebuild its Tor connections and circuits (including the ones to the introduction points), which costs 30 548 bytes of traffic, the average measured in Section V-B.

We assume that the client uploads its hidden service descriptor immediately after it has (re-)established its connections to the introduction points, or 60 minutes after the last upload. We assume that each set of uploads (to six directories) incurs 71 504 bytes of traffic, the average measured in Section V-A.

Finally, we assume that for every 5 minutes the device is connected to the Internet, 1 474 bytes of keep-alive traffic is generated, the average measured in Section V-A. We do not include periodic changes of the introduction points, as these have a small impact on total traffic (0.2% during the experiment described in Section V-A) and in our analysis, the Tor client changes introduction points once a day. Similarly, for simplicity, we do not include other traffic in our model since the remaining traffic was only about 1% of the total traffic during our measurement.

A. Evaluation

To evaluate the energy and data usage costs of using Tor to support push notifications, we use our model from Section VI together with connectivity profile data of smartphones from the Device Analyzer project [1].

Device Analyzer is an Android app, available on the Google Play Store since May 2011 and installed on over 30 000 handsets. It gathers information on a wide variety of system statistics, including: app usage; metadata on calls placed and received; metadata on text messages sent and received; Bluetooth devices seen and connected to; WiFi access points seen and connected to; cell network coverage for calls and data; and battery and power usage. Data collected by the app is processed on the handset to obscure direct personal identifiers (e.g. phone numbers) before uploading data to a server at the University of Cambridge.

We analyzed traces from the 30 444 devices in the Device Analyzer dataset. We excluded all devices with less than 30 days’ worth of data. We further excluded all devices where Device Analyzer data collection was interrupted at any point, which had large jumps in their device clock, or where the device clock was obviously wrong or broken. For each device trace from the remaining 2 014 devices, we estimate the volume of cellular data required to maintain a Tor hidden service. We do this by assuming that cellular data is used when a cellular connection is available and a WiFi connection is not. Since the Tor client uses timing randomisation when downloading the network status document, we simulate the connectivity pattern of the device 40 times and take the average amount of traffic.

The baseline box plot shown in Figure 1 shows our estimate of the cost of running a Tor hidden service for 30 days on the 2 014 devices from the Device Analyzer dataset. An equivalent numeric summary is shown in Table III. Cellular data usage is high, with a median cost across all devices of 198 MiB. For 10% of the devices we estimate a cellular data usage of 362 MiB or more. These are high data volumes, and in most countries will require a significant data plan. For exposition purposes, assuming networks charge €0.20 per MiB, the maximum roaming charge mobile operators within the EU were allowed to charge after March 2014, 362 MiB represents €72.40 per month, a substantial price to pay for better privacy. By way of comparison, GCM maintenance, without any IP address changes, costs on average 258 bytes every time it needs to send a heartbeat (see Section V-D), or 0.44 MiB over 30 days for a heartbeat interval of 24 minutes. Even factoring in multiple network changes per day, at 2.9 KiB per change, total costs are still likely only a couple of MiB per month.

We use EnergyBox [2] to estimate the energy costs of maintaining a Tor hidden service on a smartphone. EnergyBox takes a packet trace, a smartphone model, and a connectivity profile (WiFi or cellular). To provide a reasonable lower-bound estimate for the energy costs, we reused the 48-hour packet trace collected for our experiment in Section V-A and assumed this trace was transferred over the cellular data network (3G) with a Nexus One device on the TeliaSonera network, the only device and network operator the EnergyBox authors provide an energy model for. The Nexus One was released in 2010, and we expect newer devices’ batteries to last longer. Over the 48-hour period, the estimated total energy costs were 5 346 J, or 2 673 J = 0.743 Wh per day. The Nexus One device has a battery capacity of 5.18 Wh, so, assuming a battery profile where the device is charging for 8 hours overnight and on battery for 16 hours, this represents 9.6% of total battery capacity, a significant amount. Note that this value only takes into account the energy required for network communication, and additional power is required to make a hidden service work, e.g., to keep the device awake when needed.

B. Reducing Tor data usage

The above numbers demonstrate that running a Tor hidden service on a smartphone generates several hundreds of megabytes of cellular data traffic per month on a typical device, an unacceptable volume for all but those with a generous data package. As the EnergyBox paper [2] demonstrates, there is a strong correlation between data volume and energy usage too. Therefore, we evaluate four strategies to reduce the amount of data Tor requires, thereby reducing energy usage. Note that these strategies may potentially impact user anonymity. We leave evaluating this for future work.
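As a rough illustration of the traffic model in Section VI, the following Python sketch applies the stated per-event costs to the simplest possible connectivity profile: a device that connects once and never changes network. It is not the simulation used for the evaluation. The function names, the always-connected profile, and the choice to charge no relay descriptor cost on the very first download are our assumptions; the byte counts and the [105, 170.6] minute interval are the ones given in the text. The GCM heartbeat figure is included for comparison.

```python
import random

# Per-event traffic costs in bytes, taken from the model described in the text.
CONSENSUS_BYTES = 716_419            # one network status (consensus) download
DESCRIPTOR_BYTES_PER_HOUR = 30_356   # relay descriptors, per hour since last download
REBUILD_BYTES = 30_548               # rebuilding connections and circuits
HS_UPLOAD_BYTES = 71_504             # one set of hidden service descriptor uploads
KEEPALIVE_BYTES = 1_474              # keep-alive traffic per 5 minutes online
GCM_HEARTBEAT_BYTES = 258            # one GCM heartbeat, duplicates included

MIB = 1024 * 1024


def tor_always_connected_bytes(days=30, rng=None):
    """Estimate traffic for a device that connects at t=0 and never changes network.

    Assumption (ours): the first download carries no descriptor cost, because
    the text defines that cost only relative to a previous download.
    """
    rng = rng or random.Random()
    total_min = days * 24 * 60
    total = float(REBUILD_BYTES)  # circuits are built once, at t=0

    # Network status downloads, each followed by a relay descriptor download.
    now, last = 0.0, None
    while now < total_min:
        total += CONSENSUS_BYTES
        if last is not None:
            total += DESCRIPTOR_BYTES_PER_HOUR * (now - last) / 60
        last = now
        valid_since = (now // 60) * 60  # documents become valid on the hour
        now = valid_since + rng.uniform(105, 170.6)

    # Hidden service descriptor uploads: at t=0, then every 60 minutes.
    total += HS_UPLOAD_BYTES * (total_min // 60)
    # Keep-alive traffic for every 5 minutes connected.
    total += KEEPALIVE_BYTES * (total_min // 5)
    return total


def gcm_heartbeat_bytes(days=30, interval_min=24):
    """Heartbeat-only GCM traffic with no network changes."""
    return GCM_HEARTBEAT_BYTES * (days * 24 * 60 / interval_min)


if __name__ == "__main__":
    tor = tor_always_connected_bytes(rng=random.Random(1))
    print(f"Tor, always connected: {tor / MIB:.0f} MiB per 30 days")
    print(f"GCM heartbeats:        {gcm_heartbeat_bytes() / MIB:.2f} MiB per 30 days")
```

A real device trace alternates between WiFi, cellular and no connectivity, so each change of network would add a rebuild and a descriptor upload, and only the periods on cellular would count towards the cellular estimate.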
costs. Strategy B in Figure 1 explores this option.

Strategy C: Defer fetching network status on cellular: Since mobile devices regularly move between WiFi and cellular data, it may make sense to delay downloading the network status document until just before expiry in the hope that WiFi connectivity appears before a download over the cellular network becomes necessary. Additionally, this may reduce the total number of network status document downloads. This strategy may cause spikes in download requests on directory mirrors in the Tor network if many clients adopt this policy. Nevertheless, strategy C in Figure 1 explores this option.

[Box plot: estimated traffic per month in MiB (y-axis, 0 to 2000) by strategy (x-axis: Baseline, A, B, C, B+C, D, A+B+C+D)]
Figure 1. Estimated traffic over the cellular network over 30 days for 2 014 devices in the Device Analyzer dataset. The leftmost box shows a data usage estimate for Orbot’s current behaviour; the remaining boxes estimate data usage with various data reduction strategies developed in Section VI-B.

[Plot: size in KiB (y-axis, 0 to 800) against time passed in hours (x-axis, 0 to 180); series: Full consensus, Consensus diff, Consensus diff stdev]

[Table column headings: Base, A, B, C, BC, D, ABCD]
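Strategy C amounts to a small scheduling rule, which the following Python sketch tries to make concrete. It is only an illustration: the function name, the `expiry_min` and `margin_min` parameters, and the 10-minute default margin are hypothetical and do not come from Tor or Orbot. The text says only that the download is delayed "until just before expiry" while the device is on cellular.

```python
def should_fetch_network_status(now_min, due_min, expiry_min, on_wifi, margin_min=10.0):
    """Decide whether to download the network status document now.

    now_min    -- current time in minutes
    due_min    -- time at which the normal schedule would trigger a download
    expiry_min -- time at which the current document stops being usable
                  (hypothetical parameter; the text gives no value)
    on_wifi    -- True if the device is currently connected via WiFi
    margin_min -- how long before expiry a cellular download is forced
                  (hypothetical parameter)
    """
    if now_min < due_min:
        return False  # not due yet under the normal schedule
    if on_wifi:
        return True   # WiFi traffic is not counted as a cost: fetch as soon as due
    # On cellular: keep waiting for WiFi until just before the document expires.
    return now_min >= expiry_min - margin_min
```

For example, with a download due at minute 120 and expiry at minute 180, a device on cellular at minute 130 would keep waiting, while the same device at minute 171 would download over the cellular network.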
[5] (2016) Tor Ticket #19522. HS intro circuit retry logic fails when network interface is down. https://trac.torproject.org/projects/tor/ticket/19522. Accessed 20 December 2016.

[6] (2015) Tor Ticket #16387. Improve reachability of hidden services on mobile phones. https://trac.torproject.org/projects/tor/ticket/16387. Accessed 20 December 2016.

[7] P. Palfrader. (2008) Provide diffs between consensuses. http://github.com/isislovecruft/torspec/blob/master/proposals/140-consensus-diffs.txt. Accessed 20 December 2016.

[8] D. Martí. (2014) [GSoC] Consensus diffs - Fourth report. https://lists.torproject.org/pipermail/tor-dev/2014-July/007163.html. Accessed 20 December 2016.

[9] J. McLachlan, A. Tran, N. Hopper, and Y. Kim, “Scalable Onion Routing with Torsk,” in Proceedings of the 16th ACM Conference on Computer and Communications Security. ACM, 2009, pp. 590–599.

[19] S. Doswell, N. Aslam, D. Kendall, and G. Sexton, “Please Slow Down!: The Impact on Tor Performance from Mobility,” in Proceedings of the Third ACM Workshop on Security and Privacy in Smartphones & Mobile Devices. ACM, 2013, pp. 87–92.

[20] S. Doswell, D. Kendall, N. Aslam, and G. Sexton, “A longitudinal approach to measuring the impact of mobility on low-latency anonymity networks,” in 2015 International Wireless Communications and Mobile Computing Conference (IWCMC). IEEE, 2015, pp. 108–113.

[21] Briar. https://briarproject.org/. Accessed 20 December 2016.

[22] Optimizing for Doze and App Standby. https://developer.android.com/training/monitoring-device-state/doze-standby.html. Accessed 20 December 2016.

[23] S. A. Kollmann and A. R. Beresford. (2017) Supporting data for “The Cost of Push Notifications for Smartphones using Tor Hidden Services”. https://doi.org/10.17863/CAM.7547.