Prefix Top Lists Study

About this Study

Domain-based top lists such as the Alexa Top 1M portray the popularity of web domains. Even though their shortcomings (e.g., instability, no aggregation, lack of weights) have been pointed out, domain-based top lists are still an important element of Internet measurement studies. In this paper we present the concept of prefix top lists, which provide insights into the importance of the addresses behind domain-based top lists while ameliorating some of their shortcomings. With prefix top lists we aggregate domain-based top lists into network prefixes and apply a Zipf distribution to assign a weight to each prefix. We find that different domain-based top lists provide differing views on Internet prefixes. In addition, we observe only very small weight changes over time. We leverage prefix top lists to conduct an evaluation of the DNS and classify the deployment quality of domains. We show that popular domains adhere to name server recommendations for IPv4, but IPv6 compliance is still lacking. The Zipf weight aggregation allows us to create a single ranking that covers both the providers of highly popular domains and the providers used by many low-ranked domains. Finally, we provide these enhanced and more stable prefix top lists to fellow researchers, who can use them to obtain more representative measurement results.
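The aggregation step described above can be illustrated with a minimal Python sketch. It is not the code used in the paper. The Zipf exponent of 1, the equal split of a domain's weight across all prefixes it resolves to, and the function names (zipf_weights, longest_match, prefix_top_list) are assumptions made for illustration only.

```python
import ipaddress
from collections import defaultdict


def zipf_weights(n, s=1.0):
    """Normalized Zipf weights for ranks 1..n (exponent s=1 is an assumption)."""
    raw = [1.0 / rank ** s for rank in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def longest_match(ip, networks):
    """Return the most specific network containing ip, or None if none matches."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in networks
               if net.version == addr.version and addr in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)


def prefix_top_list(ranked_domains, domain_to_ips, prefixes):
    """Aggregate per-domain Zipf weights into per-prefix weights.

    ranked_domains: domains ordered by rank, most popular first.
    domain_to_ips:  hypothetical mapping from domain to resolved addresses.
    prefixes:       announced prefixes as strings, e.g. "192.0.2.0/24".
    """
    networks = [ipaddress.ip_network(p) for p in prefixes]
    weights = zipf_weights(len(ranked_domains))
    totals = defaultdict(float)
    for domain, weight in zip(ranked_domains, weights):
        hit = {longest_match(ip, networks)
               for ip in domain_to_ips.get(domain, [])}
        hit.discard(None)
        # Assumption: a domain's weight is split evenly over its prefixes.
        for net in hit:
            totals[str(net)] += weight / len(hit)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Calling prefix_top_list with a ranked domain list, a resolution mapping, and a list of prefixes returns (prefix, weight) pairs sorted by descending weight. A prefix hosting many low-ranked domains can therefore end up next to a prefix hosting a single highly ranked domain.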


Paper and Raw Data Download

Our paper has been accepted for IMC 2019.

Raw Data

We provide the data used in our paper on an archive server.
The prefixes that are referenced by non-RFC-compliant zones (no topological diversity in the nameserver IP addresses) can be found in a subfolder.
Please cite this study when using the data.

Plots

Daily Zipf weight changes for domain and prefix top lists (PTLs)

Coverage runup of discovered prefixes by prefix top list (PTL) per IP version over time

Cumulative Zipf weight of new prefixes per prefix top list (PTL) over time

The jump on June 27, 2019 is caused by previously unseen prefixes for Wikipedia.

Missing Topological Diversity for Nameservers (RFC 2182)

In the following two charts, IPv4 and IPv6 refer to zones whose nameservers support the corresponding IP version. A zone is compliant if its nameserver addresses for that IP version lie in at least two distinct norm prefixes. The ranks are normalized to 100% in order to make the lists comparable.
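The compliance check described above can be sketched in a few lines of Python. The prefix lengths used here (/24 for IPv4 and /48 for IPv6) and the names norm_prefix and is_diverse are assumptions for illustration. The study's own definition of a norm prefix may differ.

```python
import ipaddress

# Assumed prefix lengths per IP version; adjust to the definition in use.
NORM_PREFIX_LEN = {4: 24, 6: 48}


def norm_prefix(ip):
    """Map an address to its enclosing prefix of the assumed fixed length."""
    addr = ipaddress.ip_address(ip)
    length = NORM_PREFIX_LEN[addr.version]
    return ipaddress.ip_network(f"{addr}/{length}", strict=False)


def is_diverse(nameserver_ips, version):
    """Check whether a zone's nameservers span at least two prefixes.

    Returns None if the zone has no nameserver address of the given IP
    version, so that such zones can be left out of the evaluation.
    """
    prefixes = {norm_prefix(ip) for ip in nameserver_ips
                if ipaddress.ip_address(ip).version == version}
    if not prefixes:
        return None
    return len(prefixes) >= 2
```

A zone for which is_diverse returns False for a given IP version would count as lacking topological diversity for that version under these assumptions.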

Domains ranked by domain-based toplist

When ranked by the domain toplists, the non-compliant domains are spread about evenly across the whole list. In comparison, when they are ranked by the prefix toplist rank of their A/AAAA records, the first 20-30% of ranks contain only a small share of all violating domains.

Domains ranked by prefix rank of their A/AAAA records

TOP 25 Object Ranks

  • Alexa
    • Alexa IPv4
    • Alexa IPv6
  • Majestic
    • Majestic IPv4
    • Majestic IPv6
  • Umbrella
    • Umbrella IPv4
    • Umbrella IPv6

Contact

naab [AT] net.in.tum.de

sattler [AT] net.in.tum.de

Measurement Platform

TUM's GINO project.