Why Wehe Struggles with TD Detection of HTTPS Traffic

11 Apr 2024

Authors:

(1) Vinod S. Khandkar and Manjesh K. Hanawal, Industrial Engineering and Operations Research, Indian Institute of Technology Bombay, Mumbai, India ({vinod.khandkar, mhanawal}@iitb.ac.in).

Abstract & Introduction

Related Work and Background

Challenges in TD Detection Measurement Setup Development

Case Study : Wehe - TD Detection Tool for Mobile Environment

Shortcoming of Wehe on HTTPS Traffic

TD Detection of HTTPS Traffic

Conclusion & References

VI. TD DETECTION OF HTTPS TRAFFIC

As described in Sec. V, the Wehe tool suffers from a critical weakness: its Original replay traffic is not classified as intended, because most modern streaming services are encrypted, as demonstrated in Sec. V-A. In this section, we propose a method that overcomes this shortcoming and makes Wehe useful for detecting discrimination of HTTPS traffic. Payload-based traffic classification techniques do not apply to encrypted traffic. Commercial traffic shapers primarily use the Server Name Indication (SNI)-based traffic classification technique to overcome the classification of encrypted

(a) Gmail video play

(b) YouTube replay traffic

Fig. 8. Traffic classification of Gmail video play and YouTube replay traffic with appropriate SNI

shown in Fig. 3. We repeated the experiment in Sec. III-A with the user-client performing the TLS handshake using the appropriate value of the SNI parameter, extracted from the original service’s network logs. Fig. 8(a) shows the outcome of the experiment: the Gmail traffic is classified correctly. Note that the same traffic was wrongly classified as YouTube traffic when transferred without SNI (refer to Fig. 2(b)).
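The SNI value is normally carried in cleartext in the TLS ClientHello, which is why it can be read from a service's network logs. As a minimal sketch (not the authors' tooling), the Python function below pulls the server name out of a single captured ClientHello record. The names `extract_sni` and `make_minimal_client_hello` are hypothetical; the latter only builds a synthetic record for demonstration, and the hostname used is a placeholder rather than a value taken from the paper.

```python
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name from one TLS ClientHello record, or None."""
    # Byte 0 is the record type (0x16 = handshake); byte 5 is the
    # handshake type (0x01 = ClientHello).
    if len(record) < 6 or record[0] != 0x16 or record[5] != 0x01:
        return None
    try:
        pos = 5 + 4                      # record header + handshake header
        pos += 2 + 32                    # client version + random
        pos += 1 + record[pos]           # session id
        pos += 2 + int.from_bytes(record[pos:pos + 2], "big")  # cipher suites
        pos += 1 + record[pos]           # compression methods
        ext_end = pos + 2 + int.from_bytes(record[pos:pos + 2], "big")
        pos += 2
        while pos + 4 <= min(ext_end, len(record)):
            ext_type = int.from_bytes(record[pos:pos + 2], "big")
            ext_len = int.from_bytes(record[pos + 2:pos + 4], "big")
            pos += 4
            if ext_type == 0:            # server_name extension
                # Layout: list length (2), name type (1), name length (2), name
                name_len = int.from_bytes(record[pos + 3:pos + 5], "big")
                return record[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, UnicodeDecodeError):
        return None
    return None


def make_minimal_client_hello(name: str) -> bytes:
    """Build a synthetic ClientHello (demo input only, not a full handshake)."""
    host = name.encode("ascii")
    entry = b"\x00" + len(host).to_bytes(2, "big") + host
    sni_ext = (b"\x00\x00" + (len(entry) + 2).to_bytes(2, "big")
               + len(entry).to_bytes(2, "big") + entry)
    body = (b"\x03\x03" + bytes(32)      # version + random
            + b"\x00"                    # empty session id
            + b"\x00\x02\x13\x01"        # one cipher suite
            + b"\x01\x00"                # null compression
            + len(sni_ext).to_bytes(2, "big") + sni_ext)
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + len(handshake).to_bytes(2, "big") + handshake


if __name__ == "__main__":
    # Placeholder host name, assumed for illustration.
    print(extract_sni(make_minimal_client_hello("www.youtube.com")))
```

In practice the input would be the first TLS record of a captured flow; this sketch assumes the ClientHello fits in one record and does not handle fragmentation.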

Based on the above validation, we propose to use SNI in Wehe’s replay traffic so that the replay traffic is classified correctly. To validate this point, we repeated the experiment in Sec. V-A with the appropriate SNI (extracted from the original YouTube service’s traffic) used in the TLS handshake. We used the same experimental setup as in Sec. V. Fig. 8(b) shows the outcome of this experiment. As seen, the traffic shaper correctly detected the YouTube replay traffic. This observation establishes that using a service-specific SNI in TLS handshake messages leads to the correct classification of replay traffic.
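A minimal sketch of the client side of this idea, using Python's standard `ssl` module rather than Wehe's actual code, is given here. The client connects to a replay server but places the original service's name in the SNI field through the `server_hostname` argument. The function names, the replay server address, and the host name are assumptions for illustration. Certificate verification is disabled only because a replay server cannot present a valid certificate for the original service's name.

```python
import socket
import ssl


def make_replay_context() -> ssl.SSLContext:
    """TLS client context for a replay experiment (verification disabled)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Assumption: the replay server does not own the service's certificate,
    # so host name checking and verification must be turned off.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def build_client_hello(sni: str) -> bytes:
    """Return the raw ClientHello bytes produced for the given SNI (offline)."""
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = make_replay_context().wrap_bio(incoming, outgoing, server_hostname=sni)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: no server has replied, only the ClientHello is written
    return outgoing.read()


def open_replay_connection(replay_host: str, port: int, sni: str) -> ssl.SSLSocket:
    """Connect to the replay server while advertising the original service's SNI."""
    raw = socket.create_connection((replay_host, port), timeout=10)
    return make_replay_context().wrap_socket(raw, server_hostname=sni)


if __name__ == "__main__":
    # Placeholder host name; the paper extracts the real value from traffic logs.
    hello = build_client_hello("www.youtube.com")
    print(b"www.youtube.com" in hello)
```

`build_client_hello` runs without any network access, which makes it easy to confirm that the chosen name actually appears in the handshake bytes a middle-box would inspect before running a live replay with `open_replay_connection`.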

A Wehe tool that employs our SNI-based mechanism to emulate the original service’s traffic can detect discrimination of HTTPS traffic. It can also work with a larger set of network middle-boxes, including those that fail to classify replay traffic correctly based on traffic characteristics alone.

This paper is available on arXiv under a CC 4.0 license.