Privacy Leakage over the Bitcoin Network
In the following, we present a method to deanonymize Bitcoin users by linking their pseudonyms (addresses) to the IP addresses of the underlying clients. This attack was first introduced in [6] and later expanded in [7].
Note that this attack can deanonymize users even when they operate behind network address translators (NATs) or firewalls. More specifically, the technique allows an adversary to distinguish connections and transactions pertaining to different users located behind the same NAT.
The main intuition behind this attack is that since entry nodes of any given client are not renewed by default (until the client restarts), each client can be safely and uniquely identified by the set of nodes that he or she connects to.
In terms of required resources, the attack only requires a few running instances of the Bitcoin client (each residing on a different IP) to establish a certain number of connections (following the Bitcoin protocol) and log the incoming transactions [6].
In a specific example offered in [6], an adversary equipped with no more than 50 connections to each Bitcoin server can disclose the sender’s IP address for around 11% of all transactions generated in the Bitcoin network. Experimental results have shown that deanonymization rates of up to 60% can also be reached, if the adversary were to mount a small DoS on the network (see [6] for more details). The overall cost of mounting such an attack on the full Bitcoin network is estimated to be around 1,500 EUR per month [6].
The attack proceeds in three steps. First, the attacker attempts to disconnect users from Tor or other anonymizing networks that these clients may be leveraging to connect to Bitcoin peers. Second, with clients now connecting directly, the adversary uses the information received from the network to infer the network's topology. Finally, the adversary combines the acquired network knowledge with the mechanism that Bitcoin uses to forward transactions in order to deanonymize transactions. In what follows, we detail these steps.
Phase 1: Disconnecting Clients from Tor: The Tor network [17] comprises a set of relays that are publicly available online, and which can be used by any party to send a message while avoiding traffic analysis attacks. To establish a connection to a service or a node through Tor, a user chooses a chain of three Tor relays, through which the messages to the target service or node will be routed. The final node in the chain, also known as the Tor exit node, appears to the service as the originator of this connection.
To prevent Bitcoin users from making use of Tor when transacting with Bitcoin, the adversary could exploit Bitcoin's built-in DoS protection. Recall that in Bitcoin, whenever a peer receives a malformed message, it increases the penalty score of the IP address from which the message came and bans that IP for 24 hours when the score reaches 100. To exploit this, the adversary can simply connect to various Bitcoin nodes through Tor and send malformed messages, so that all Tor exit nodes are banned by the majority of the Bitcoin nodes. Alternatively, the adversary could spoof the IP of the exit node and issue malformed messages from that IP, which would result in a 24-hour ban of the exit node.
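The penalty mechanism described above can be illustrated with a small model. This is a minimal sketch, not Bitcoin's actual implementation: the class `BanTracker`, its method names, and the per-message penalty of 20 are hypothetical; only the threshold of 100 and the 24-hour ban come from the text.

```python
from dataclasses import dataclass, field

BAN_THRESHOLD = 100          # score at which an IP is banned (from the text)
BAN_SECONDS = 24 * 60 * 60   # length of the ban: 24 hours (from the text)


@dataclass
class BanTracker:
    """Toy model of one node's per-IP penalty accounting (hypothetical)."""
    scores: dict = field(default_factory=dict)        # ip -> accumulated penalty
    banned_until: dict = field(default_factory=dict)  # ip -> ban expiry in seconds

    def is_banned(self, ip: str, now: float) -> bool:
        return self.banned_until.get(ip, 0.0) > now

    def report_malformed(self, ip: str, penalty: int, now: float) -> bool:
        """Add a penalty for a malformed message; return True if ip is banned."""
        if self.is_banned(ip, now):
            return True
        self.scores[ip] = self.scores.get(ip, 0) + penalty
        if self.scores[ip] >= BAN_THRESHOLD:
            self.banned_until[ip] = now + BAN_SECONDS
            self.scores[ip] = 0  # start from zero once the ban expires
        return self.is_banned(ip, now)


# The adversary sends malformed messages through one exit node.
# The penalty of 20 per message is an assumed value for illustration.
tracker = BanTracker()
exit_ip = "203.0.113.7"  # documentation-range address standing in for an exit node
for _ in range(5):
    tracker.report_malformed(exit_ip, penalty=20, now=0.0)

banned_during = tracker.is_banned(exit_ip, now=3600.0)            # one hour later
banned_after = tracker.is_banned(exit_ip, now=BAN_SECONDS + 1.0)  # after expiry
```

Because the ban is keyed on the source IP, every honest user whose traffic leaves Tor through the same exit node is locked out of that Bitcoin node for the duration of the ban.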
Phase 2: Inferring Network Topology: This phase assumes that the use of Tor has been temporarily deactivated using the strategy described in the previous paragraphs.
In this phase, the adversary targets Bitcoin clients that do not accept incoming connections and only exhibit the minimum (i.e., eight) outgoing connections to the rest of the network. The goal of the adversary is to learn the eight entry nodes of each targeted Bitcoin client.
The attack unfolds as follows. Whenever a client C establishes a connection to one of its entry nodes, it engages in the address discovery protocol described in Chapter 3 and advertises the external address that has the highest local score, IPC. If the adversary is already connected to one of those entry nodes, the address IPC will be forwarded to the adversary with some probability (which depends on the number of the attacker's connections).
This suggests that the attacker can shortlist the entry nodes of the target address IPC as follows:
- The attacker connects to a large number of Bitcoin server nodes; this set is assumed to be close to the set of all Bitcoin server nodes NS.
- The attacker logs the messages received from all connected servers and, for each advertised address, say IPC, logs the set of servers NIPC that forwarded it to the attacker's machines.
- The attacker designates NIPC as the entry node subset associated with address IPC.
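The shortlisting steps above amount to grouping logged address advertisements by the advertised address. The following is a minimal sketch under that reading; the function name `shortlist_entry_nodes`, the log format, and the sample addresses are assumptions made for illustration.

```python
from collections import defaultdict


def shortlist_entry_nodes(addr_log):
    """Group logged address advertisements by advertised address.

    addr_log: iterable of (server, advertised_address) pairs, one per
    advertisement received on the attacker's connections (assumed format).
    Returns a dict mapping each advertised address to the set of servers
    that forwarded it, i.e. the candidate entry node set for that address.
    """
    forwarders = defaultdict(set)
    for server, address in addr_log:
        forwarders[address].add(server)
    return dict(forwarders)


# Hypothetical log: servers s1..s3 forward the client address 192.0.2.10,
# while s4 forwards an unrelated address.
addr_log = [
    ("s1", "192.0.2.10"),
    ("s2", "192.0.2.10"),
    ("s4", "192.0.2.99"),
    ("s3", "192.0.2.10"),
    ("s1", "192.0.2.10"),  # repeated advertisements collapse into the set
]
candidates = shortlist_entry_nodes(addr_log)
```

In this toy log, `candidates["192.0.2.10"]` is the set of servers s1, s2, and s3. In practice such a set may also contain non-entry nodes, which is the source of error discussed next.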
Note that a node that announces address IPC to the adversary is not necessarily one of IPC's entry nodes. Moreover, because the client does not connect to all of its entry nodes simultaneously, the time gaps between announcements of the same address by different entry nodes may give the attacker a mistaken view of the network topology.
Assuming that the adversary knows the target address IPC before this address reconnects to the network, one can leverage the anti-flooding mechanism that Bitcoin has put in place to avoid advertising the same address multiple times [6].
Namely, the proposal in [6] has the adversary advertise IPC sufficiently early before IPC reconnects, so that when IPC does reconnect, the probability that its advertisement reaches the adversary's machines via a non-entry node is small.
Phase 3: Deanonymizing Bitcoin Transactions: After preventing nodes from using Tor, and after shortlisting certain servers as entry nodes for each victim address, deanonymization proceeds as follows:
1. The attacker obtains the list NS of Bitcoin servers, assuming that it is regularly refreshed. Here, the adversary first collects the entire list of peers by querying all of his or her neighbors/known peers with a getaddr message. The attacker then collects the advertised addresses and adds to NS every listed address that is online and publicly reachable. Reachability can be easily ascertained by trying to establish a TCP connection and exchange version messages.
2. The attacker composes the list C of Bitcoin clients to be deanonymized. Here, the attacker selects the set of nodes, identified by their IP addresses IPC, that he or she wants to consider in the deanonymization attack. The attack is agnostic to how the attacker constructs C. For example, the attacker might randomly select IPs advertised throughout the network, or obtain C as the set of IPs used by a particular user, retrieved through an out-of-band channel.
3. The attacker retrieves the entry nodes NIPC of each client IPC ∈ C when IPC connects to the network, as described above.
4. The attacker keeps monitoring the traffic from the servers in NIPC and, by mapping transactions to entry nodes, can ultimately map transactions to clients. More specifically, the attacker monitors inv messages with transaction hashes received over all established connections and, for each received transaction, collects the addresses of the Bitcoin servers that forwarded the associated inv message at each round of transaction advertisements. The attacker finally correlates the sets of servers that advertised each transaction at each round and extracts (entry node, transaction) pairs from the matches.
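The correlation in the last step can be sketched as a set-overlap match between the servers that first relayed a transaction and each client's shortlisted entry nodes. This is one plausible reading, not the exact procedure of [6]: the function names, the log format, and the parameters `first_k` and `min_overlap` (with their default values) are hypothetical, and the example uses four entry nodes per client for brevity rather than eight.

```python
from collections import defaultdict


def first_forwarders(inv_log, first_k):
    """Map each txid to the first `first_k` distinct servers that sent its inv.

    inv_log: iterable of (timestamp, server, txid) tuples (assumed format).
    """
    ordered = defaultdict(list)
    for _, server, txid in sorted(inv_log):  # earliest advertisements first
        seen = ordered[txid]
        if server not in seen and len(seen) < first_k:
            seen.append(server)
    return {txid: set(servers) for txid, servers in ordered.items()}


def attribute_transactions(inv_log, entry_nodes, first_k=10, min_overlap=3):
    """Attribute each transaction to the client whose entry nodes best match.

    entry_nodes: dict mapping client IP -> set of shortlisted entry nodes.
    A transaction is attributed only if the best overlap reaches
    `min_overlap` and no other client ties with it.
    """
    attributed = {}
    for txid, servers in first_forwarders(inv_log, first_k).items():
        overlaps = sorted(
            ((len(servers & nodes), client) for client, nodes in entry_nodes.items()),
            reverse=True,
        )
        if not overlaps:
            continue
        best_score, best_client = overlaps[0]
        unique = len(overlaps) == 1 or overlaps[1][0] < best_score
        if best_score >= min_overlap and unique:
            attributed[txid] = best_client
    return attributed


# Hypothetical data: two clients and two observed transactions.
entry_nodes = {
    "192.0.2.10": {"s1", "s2", "s3", "s4"},
    "192.0.2.20": {"s5", "s6", "s7", "s8"},
}
inv_log = [
    (0.1, "s1", "tx_a"), (0.2, "s3", "tx_a"), (0.3, "s4", "tx_a"),
    (0.9, "s6", "tx_a"),
    (0.1, "s5", "tx_b"), (0.5, "s9", "tx_b"),
]
pairs = attribute_transactions(inv_log, entry_nodes)
```

Here `tx_a` is first relayed by three of the entry nodes of 192.0.2.10 and is attributed to it, while `tx_b` matches only one entry node of any client and is left unattributed. Requiring a minimum overlap trades coverage for fewer false attributions.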
Eventually, the adversary creates a list List = (IPC, IdC, PKC), where IPC is the IP address of a peer or its ISP, IdC distinguishes clients sharing the same IP, and PKC is the address/pseudonym used in a transaction (hash of a public key).