Challenges in Net Neutrality Violation Detection: A Case Study of Wehe Tool and Improvements

11 Apr 2024

Authors:

(1) Vinod S. Khandkar and Manjesh K. Hanawal, Industrial Engineering and Operations Research, Indian Institute of Technology Bombay, Mumbai, India ({vinod.khandkar, mhanawal}@iitb.ac.in).

Abstract & Introduction

Related Work and Background

Challenges in TD Detection Measurement Setup Development

Case Study : Wehe - TD Detection Tool for Mobile Environment

Shortcoming of Wehe on HTTPS Traffic

TD Detection of HTTPS Traffic

Conclusion & References

Abstract

We consider the problem of detecting deliberate traffic discrimination on the Internet. Given the complex nature of the Internet, deliberate discrimination is not easy to detect, and tools developed so far suffer from various limitations. We study the challenges in detecting violations (focusing on HTTPS traffic) and discuss possible mitigation approaches. We focus on ‘Wehe,’ the most recent tool developed to detect net-neutrality violations. Wehe hosts traffic from all services of interest on a common server and replays it to mimic the behavior of traffic from the original servers. Despite Wehe’s vast utility and possible influence over policy decisions, its mechanisms have not yet been validated by others. In this work, we highlight critical weaknesses in Wehe whereby its replay traffic is not correctly classified as the intended services by network middleboxes. We validate this observation using a commercial traffic shaper. To overcome this weakness, we propose a new method in which the SNI parameter is set appropriately in the initial TLS handshake. Using commercial traffic shapers, we validate that setting the SNI causes the replay traffic to be correctly classified as the intended traffic by the middleboxes. Our new approach thus provides a more realistic method for detecting neutrality violations of HTTPS traffic.

Index Terms—Net neutrality, traffic differentiation, detection tools

I. INTRODUCTION

Net neutrality is a guiding principle promoting the “equal” treatment of all packets over the Internet. However, for economic benefit, ISPs may apply traffic differentiation to a specific service, user, content provider, or any other traffic group on the Internet without making any public declaration. This creates a need for tools that can detect such malicious activities over the Internet.

Traffic differentiation (TD) detection involves the combination of many elements, such as end-systems (user-client and server), probing traffic generation, expected network responses, and TD detection algorithms. These components and operations are interdependent. Hence, developers of TD detection tools face challenges such as crafting Internet traffic and conditioning the measured network response to suit their detection algorithm. Moreover, as HTTPS traffic becomes prevalent, it is unclear what policies middleboxes apply for discrimination, since the encrypted payload provides no signatures. Our first goal is to study the various challenges in designing a reliable TD detection mechanism for HTTPS traffic.

Several methods have been proposed in the literature to detect net neutrality violations, as documented in recent surveys [1], [2]. The literature covers various forms of discrimination and their detection, such as discrimination against content providers [3], end-users [4], and specific services like BitTorrent [5]. Our interest in this work is discrimination of services, specifically streaming services (both audio and video), which are potential candidates for discrimination due to their commercial value and high bandwidth requirements.

Recently, several tools have been developed to detect discrimination of services, such as NANO [6], ChkDiff [7], Netpolice [8], Shaperprobe [9], Packsen [10], Glasnost [11], Bonafide [12], and Wehe [13]. NANO is based on passive measurements, and all the others are based on either active or differential probes. Wehe is the latest tool and overcomes many of the drawbacks of the other tools. However, due to the Internet’s complexity, many issues persist that prevent its use as a reliable tool.

Wehe [13] follows a client-server architecture in which the content of services of interest (YouTube, Netflix, etc.) is copied (or sniffed) from the respective original servers and hosted on a common server (referred to as the replay server). To check whether a particular service is discriminated against, the client accesses the content of that service from the replay server. The server transfers the requested data with timing relations that mimic the original service’s data transfer characteristics over the Internet. The client accesses content from the replay server with and without a VPN and compares the two transfers to ascertain whether there was any difference in quality.
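
The timing-preserving replay described above can be illustrated with a minimal sketch. It is not Wehe's implementation: the trace format (capture timestamp, payload) and the names `replay_delays` and `replay` are assumptions made for illustration only.

```python
import time
from typing import Callable, List, Tuple

# Hypothetical trace format: (capture timestamp in seconds, payload bytes).
Packet = Tuple[float, bytes]


def replay_delays(trace: List[Packet]) -> List[float]:
    """Return the wait before each packet so that recorded gaps are preserved."""
    if not trace:
        return []
    delays = [0.0]  # the first packet is sent immediately
    for (prev_ts, _), (ts, _) in zip(trace, trace[1:]):
        # Clamp at zero in case the capture contains out-of-order timestamps.
        delays.append(max(0.0, ts - prev_ts))
    return delays


def replay(trace: List[Packet],
           send: Callable[[bytes], object],
           sleep: Callable[[float], object] = time.sleep) -> None:
    """Send each payload after waiting for its recorded inter-packet gap."""
    for delay, (_, payload) in zip(replay_delays(trace), trace):
        sleep(delay)
        send(payload)
```

In a real replay server, `send` would be something like a connected socket's `sendall`. Passing `sleep` as a parameter lets the schedule be checked without actually waiting.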

The basic idea of Wehe is that a connection over a VPN is encapsulated and will not be subjected to any discrimination, as the middleboxes cannot classify the traffic correctly. In contrast, content accessed without a VPN can be classified correctly due to the signatures induced by the timing relations and can be subjected to discrimination policies. Despite the Wehe tool’s vast utility and possible influence over policy decisions, its mechanisms have not yet been fully validated by other researchers.
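
As a rough illustration of comparing the two replays, the sketch below computes a two-sample Kolmogorov-Smirnov statistic over throughput samples from the direct and VPN runs. This section does not state which statistic Wehe uses, so the KS statistic, the `threshold` value, and the function names are illustrative assumptions rather than Wehe's method.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest gap between the empirical CDFs of two samples."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    if not a_sorted or not b_sorted:
        raise ValueError("both samples must be non-empty")
    largest = 0.0
    # Both empirical CDFs are step functions, so checking every
    # observed value is enough to find the maximum gap.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        largest = max(largest, abs(cdf_a - cdf_b))
    return largest


def looks_differentiated(direct: Sequence[float],
                         vpn: Sequence[float],
                         threshold: float = 0.3) -> bool:
    """Flag a run when the two throughput distributions differ strongly.

    The threshold is an arbitrary illustrative value, not a calibrated one.
    """
    return ks_statistic(direct, vpn) >= threshold
```

The check only says that the two distributions differ, not in which direction. A practical tool would also need to account for normal network variation before reporting discrimination.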

We investigate the performance of Wehe’s differentiation detection on traffic using the HTTPS protocol, which is predominantly used by all services for security reasons and to avoid detection. We noticed that Wehe uses port 80 (HTTP) for all communications. Middleboxes process all traffic on this port as non-encrypted traffic and may not classify it correctly, as they will not see any signatures (recall that the replayed traffic is HTTPS). On the other hand, when Wehe accesses traffic over port 443 (HTTPS), the traffic is not classified as the intended traffic, as we demonstrate using a commercial traffic shaper. As middleboxes do not see traffic from Wehe’s replay server as coming from the original server, they may not apply the intended traffic discrimination policies. Thus, Wehe may not detect deliberate discrimination, resulting in false negatives.

To overcome the limitations of Wehe for HTTPS traffic, we propose a new method in which traffic is accessed on port 443, but with a modification to the TLS handshake: we explicitly send the Server Name Indication (SNI) corresponding to the actual service. We demonstrate that HTTPS traffic gets correctly classified by the middleboxes with this modification. Our method ensures that the middleboxes classify traffic from replay servers in the same way as traffic from the original servers for each service, and apply the same discrimination policies (if any) to traffic from both replay and original servers. Thus, our method makes it possible to detect discrimination of HTTPS traffic in a realistic setting. Our contributions can be summarized as:

  1. We study various challenges associated with detecting discrimination of HTTPS traffic. We categorize these challenges by their source, e.g., protocol and operational environment.

  2. We demonstrate limitations of Wehe in detecting discrimination of HTTPS traffic. Specifically, using a commercial traffic shaper, we demonstrate that HTTPS traffic is not correctly classified in the Wehe setup.

  3. We propose a mechanism in which traffic of services accessed from the replay server is treated as if it originated from the actual servers. Thus, traffic from the replay server is also subjected to the same discrimination policies applied to the actual service.

  4. We validate, using a commercial traffic shaper, that the replay traffic of our mechanism is treated in the same way as traffic originating from the actual servers.
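
The SNI modification proposed above can be sketched with Python's standard `ssl` module. This is a minimal illustration, not the authors' implementation: the replay host name `replay.example.net`, the service name `www.youtube.com`, and both function names are assumptions chosen for the example. Certificate verification is disabled here only because a replay server would not hold the original service's certificate.

```python
import socket
import ssl


def _measurement_context() -> ssl.SSLContext:
    """TLS client context that skips certificate checks (measurement use only)."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # check_hostname must be turned off before verify_mode can be CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def open_replay_connection(replay_host: str, sni_name: str,
                           port: int = 443,
                           timeout: float = 10.0) -> ssl.SSLSocket:
    """Connect to the replay server but advertise the real service in the SNI."""
    raw = socket.create_connection((replay_host, port), timeout=timeout)
    try:
        # server_hostname sets the SNI extension sent in the ClientHello.
        return _measurement_context().wrap_socket(raw, server_hostname=sni_name)
    except Exception:
        raw.close()
        raise


def client_hello_bytes(sni_name: str) -> bytes:
    """Build a ClientHello in memory to inspect what an on-path device sees."""
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = _measurement_context().wrap_bio(incoming, outgoing,
                                          server_hostname=sni_name)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: no server reply is available in memory
    return outgoing.read()


# Hypothetical usage against a replay server:
# conn = open_replay_connection("replay.example.net", "www.youtube.com")
```

Because the SNI extension is sent in cleartext in the ClientHello, `client_hello_bytes("www.youtube.com")` contains the service name as plain bytes. A middlebox can match on that field even though the rest of the session is encrypted.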

The paper is organized as follows. Sec. II provides the necessary background and related work. Sec. III describes the identified challenges in developing a measurement setup for TD detection. Sec. IV describes the Wehe tool and its mechanisms in the context of the identified challenges, and Sec. V provides the validation results. Our proposed method is described in Sec. VI, and Sec. VII gives the conclusions.

This paper is available on arxiv under CC 4.0 license.