MSc Proposal 2021-22


Name

Torpedo 2.0

Title

Privacy-preserving Deanonymization of Onion Services in the Dark Web with Encrypted Deep Learning

Advisor

Nuno Santos

Objectives

The Tor anonymity network provides a global infrastructure that supports cybercriminal activity in the Dark Web. Using Tor, sellers and buyers can meet anonymously and trade illegal goods and services on marketplace websites backed by Tor onion services. To boost police authorities' ability to investigate cybercrime, we are developing Torpedo, a distributed system for deanonymizing client access to onion services. With Torpedo, police authorities can issue global queries, e.g., to determine the real IP addresses of onion services or of clients that have accessed certain onion services. However, the current implementation of Torpedo requires ISPs to exchange raw traces of the Tor flows they collect, so that Torpedo can correlate flows using a traffic classifier based on convolutional neural networks (i.e., deep learning algorithms). Network traces are highly privacy-sensitive and are protected by strict data protection regulations, which hampers ISPs' ability to share them with each other. This limitation is a practical barrier to the widespread adoption of Torpedo.

In this project, we want to overcome the existing limitations of Torpedo by building a privacy-preserving Tor onion traffic correlation service. In other words, our goal is to enhance Torpedo so that traffic correlation can be performed without ISPs having to exchange their traces. To achieve this, our idea is to leverage recent advances in encrypted deep learning, which allow us to train and query a convolutional neural network (CNN) while keeping both the training and testing datasets entirely private. This is akin to processing over encrypted data, but applied to machine learning algorithms and, more specifically, to CNNs. Concretely, we want to apply this idea to Torpedo by using an existing framework for encrypted machine learning (named TF-Encrypted) to secure the Torpedo classifier. This task comes with some challenges: the existing Torpedo CNN will likely require a few adaptations, and we will need to develop specific optimizations to overcome the performance penalties introduced by TF-Encrypted. Furthermore, it will be necessary to guarantee the end-to-end security of our system and to provide strong authentication and accountability support by developing special-purpose security protocols involving the participating actors (i.e., ISPs and international police agencies). This work is expected to result in the publication of a paper in a flagship security conference.
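To give an intuition for how computation over private data can work, the sketch below illustrates additive secret sharing, a building block commonly used in secure multi-party computation. It is only an illustration under stated assumptions: it does not use the TF-Encrypted API, the names (`share`, `local_dot`, `flow_features`, `weights`) and the modulus are hypothetical, and the weights are assumed to be public. A real encrypted CNN additionally has to handle private weights, non-linear layers, and fixed-point encoding of real numbers.

```python
import secrets

# Illustrative modulus (an assumption); real frameworks choose their own ring.
MODULUS = 2**61 - 1


def share(value, n_parties=2):
    """Split an integer into additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recombine shares and map the result back to a signed integer."""
    total = sum(shares) % MODULUS
    return total - MODULUS if total > MODULUS // 2 else total


def share_vector(vec, n_parties=2):
    """Share each element; return one share vector per party."""
    per_element = [share(v, n_parties) for v in vec]
    return [list(column) for column in zip(*per_element)]


def local_dot(share_vec, public_weights):
    """Each party computes a dot product on its own shares only."""
    return sum(s * w for s, w in zip(share_vec, public_weights)) % MODULUS


# Hypothetical integer features of one flow, held by a single ISP.
flow_features = [12, -3, 7]
# Hypothetical public weights of a single linear unit.
weights = [2, 5, -1]

party_shares = share_vector(flow_features)
partial_results = [local_dot(s, weights) for s in party_shares]

# Only the recombined output is revealed, never the raw features.
result = reconstruct(partial_results)
print(result)  # 12*2 + (-3)*5 + 7*(-1) = 2
```

Each party sees only uniformly random-looking shares, yet the recombined partial results equal the dot product over the original features. This works because the computation is linear in the secret values; multiplying two secret values or evaluating activation functions requires additional protocol steps, which is one source of the performance penalties mentioned above.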

In summary, the main tasks of this project will be: 1) review the related work, 2) extend the current design of Torpedo to support privacy-preserving Tor flow correlation, 3) implement a prototype based on TF-Encrypted, 4) evaluate the system experimentally using synthetic onion service traffic, 5) write a scientific article, and 6) write a dissertation.

Requirements

Interest in distributed systems, security, and machine learning. Attendance of the forensics cyber-security course.

Location

IST-Alameda (INESC-ID) or IST-Tagus

Observations

This work will be performed in collaboration with researchers from FCUL, FCUP, INESC TEC, and the University of Waterloo.