Service detection, classification, and analysis from encrypted communication using Machine Learning

Project Description

The Internet has grown considerably over the past decade, and as new uses involve more and more personal data, privacy has become a major concern. To protect user privacy, over-the-top service providers began encrypting communication between clients and their servers, first with HTTPS and more recently with HTTP/2, which popular browsers support only over an encrypted connection. With HTTP/2 over TLS, the user data is encrypted, but the transport-layer headers and parts of the TLS handshake remain unencrypted, which still allows some information to be analyzed and traffic to be treated differentially. Internet players have since gone further: Google, notably, has proposed QUIC, a new protocol that also encrypts most of the transport headers. With QUIC, only the initial exchange between the client and the server can be read by an observer; all subsequent packets are encrypted.

For network operators, these new protocols make traffic analysis by network probes very complex, and most visibility into the traffic they carry is lost. New solutions are therefore needed to identify traffic, services, and applications, to measure their data volumes, and ultimately to detect potential problems. This project aims to devise solutions (architectures, techniques, and software) that address these issues. One aspect of the project is the use of machine learning techniques that learn the characteristics of network flows automatically, for classification, service identification, and measurement and analysis.
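To make the machine learning aspect concrete, the following is a minimal sketch of flow classification from metadata that stays observable when the payload is encrypted (packet sizes and arrival times). It is an illustration under stated assumptions, not the project's actual method: the feature set, the nearest-centroid classifier, the function names (flow_features, train, classify), the service labels, and the toy training flows are all hypothetical.

```python
from math import dist
from statistics import mean, pstdev


def flow_features(packets):
    """Build a feature vector from a non-empty list of (timestamp_s, size_bytes).

    Only metadata is used; the (encrypted) payload is never inspected.
    """
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    # Inter-arrival gaps; a single-packet flow gets one zero gap.
    gaps = [b - a for a, b in zip(times, times[1:])] or [0.0]
    return (mean(sizes), pstdev(sizes), mean(gaps), float(len(packets)))


def train(labelled_flows):
    """Fit a nearest-centroid model from (label, packets) pairs."""
    rows = [(label, flow_features(p)) for label, p in labelled_flows]
    width = len(rows[0][1])
    # Per-feature scale so that no single feature dominates the distance.
    scales = [max(abs(f[i]) for _, f in rows) or 1.0 for i in range(width)]
    by_label = {}
    for label, f in rows:
        by_label.setdefault(label, []).append([v / s for v, s in zip(f, scales)])
    centroids = {
        label: [mean(col) for col in zip(*vecs)]
        for label, vecs in by_label.items()
    }
    return scales, centroids


def classify(model, packets):
    """Return the label whose centroid is closest to the flow's features."""
    scales, centroids = model
    x = [v / s for v, s in zip(flow_features(packets), scales)]
    return min(centroids, key=lambda label: dist(x, centroids[label]))


# Made-up toy flows for illustration only (not measured data).
TRAINING = [
    ("video", [(0.00, 1400), (0.01, 1400), (0.02, 1350), (0.03, 1400)]),
    ("video", [(0.00, 1300), (0.02, 1400), (0.04, 1400), (0.06, 1380), (0.08, 1400)]),
    ("chat", [(0.0, 120), (1.5, 90), (3.2, 150)]),
    ("chat", [(0.0, 200), (2.0, 80), (5.0, 100)]),
]

model = train(TRAINING)
print(classify(model, [(0.0, 1420), (0.015, 1390), (0.03, 1400)]))
```

A nearest-centroid model is used here only because it fits in a few lines with the standard library; a real system would likely rely on richer features and a more capable learner, evaluated on labelled traffic captures.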

Sponsors and Partners

Orange