Publicly available datasets

Whenever possible, we share not only our code but also our datasets with the scientific community. This page points to them.

AppClassNet

AppClassNet is a carrier-grade dataset for traffic classification and application identification research, containing millions of labeled samples from hundreds of applications -- the networking equivalent of the ImageNet dataset!

Size of AppClassNet compared to other public datasets

IP-ID census

Maps of the IP-ID behaviors that are prevalent on the Internet. Censuses at IP/24 level, along with a training set with manual ground truth.

Synoptic of IP-ID host behaviors
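
To illustrate what "IP/24 level" means in practice, here is a minimal sketch that groups individual IPv4 addresses by their enclosing /24 prefix. The function name `group_by_slash24` and the sample addresses (taken from documentation ranges) are hypothetical and only serve the illustration. This is not the census tooling itself.

```python
import ipaddress
from collections import defaultdict

def group_by_slash24(addresses):
    """Group IPv4 address strings by their enclosing /24 prefix."""
    groups = defaultdict(list)
    for addr in addresses:
        # strict=False lets us pass a host address and get its /24 network back
        net = ipaddress.ip_network(f"{addr}/24", strict=False)
        groups[str(net)].append(addr)
    return dict(groups)

# Hypothetical sample hosts from documentation-reserved ranges
hosts = ["192.0.2.10", "192.0.2.200", "198.51.100.7"]
by_prefix = group_by_slash24(hosts)
```

Here the first two hosts fall into the same /24, so a census at this granularity would report one entry for both.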

Web QoE

A collection of award-winning datasets, including both automated collections and large-scale campaigns with real users.

Summary of available datasets

LEDBAT + BitTorrent

Over 20 GB of controlled experiments with different congestion control algorithms, showing the interplay between data-plane throughput and control-plane delay in the performance of distributed applications.

Download time distribution for different congestion control protocols

Anycast geolocation

Maps of anycast IPv4 enumeration and geolocation. Monthly censuses at IP/24 level, lists, Google Maps views and more!

Slideshow of anycast resources available on the website

Bufferbloat

A methodology to infer queuing delay from simple non-intrusive measurements, along with an Internet-wide measurement campaign for BitTorrent hosts and other targets.

Grab the data and learn how to remotely spy on queuing delays
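
As a rough intuition for how queuing delay can be inferred from delay samples, one common approach treats the smallest observed sample as the base (uncongested) delay and attributes any excess to queuing. The sketch below assumes this minimum-baseline heuristic. It is not necessarily the methodology used for this dataset, and the function name `queuing_delays` and the sample values are hypothetical.

```python
from typing import List

def queuing_delays(delay_ms: List[float]) -> List[float]:
    """Estimate per-sample queuing delay as the excess over the minimum delay."""
    if not delay_ms:
        return []
    # Assumption: the smallest sample was taken when the queues were empty
    base = min(delay_ms)
    return [sample - base for sample in delay_ms]

# Hypothetical delay samples towards one host, in milliseconds
samples = [20.0, 22.5, 80.0, 21.0, 150.0]
estimates = queuing_delays(samples)
```

The estimate is only as good as the baseline: if no sample was ever taken with empty queues, the queuing delay is underestimated.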