Domain Names for Authoritative DNS Servers

Not So Popular: A Sample of Domain Names for Typical Web Sites
Paul Hoffman
DNS-OARC 34
February 2021
Motivation
When researchers measure the properties of the authoritative DNS servers on the Internet, they first need to define the types of authoritative servers they are sampling
Collecting domain names used for web servers is fairly easy, and is thus the basis of much research on authoritative name servers
The current collections of domain names against which one can do research are not that good for assessing things about “typical” domain names:
  Most popular web sites
  Extracts from the zone files of gTLDs
  Dumps from passive DNS collection systems
External links on Wikipedia pages
Wikipedia has wikis in almost every language
External links go to a large variety of real but not popular web pages:
  governments of small cities
  colleges and universities of all sizes
  obscure sports teams
  small regional music and movie studios
  personal sites of academics
Worldwide coverage
Collection and analysis
Retrieve the database of external links for each language Wikipedia from a mirror of the main Wikipedia site
Extract all the external links
Clean up the list of external links, limit to “http:” and “https:”
For each remaining URL, strip off the scheme and everything after the domain name
Cull the list of domain names so that only one copy of each domain name remains
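
The clean-up, stripping, and culling steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the talk: the function name `domain_names_from_links` is hypothetical, the input is assumed to be an iterable of URL strings already extracted from the link databases, and lowercasing the name and dropping a trailing dot are extra normalization choices that the slides do not mention.

```python
from urllib.parse import urlsplit


def domain_names_from_links(links):
    """Return the sorted, de-duplicated host names found in http(s) URLs.

    Hypothetical helper: the slides describe the steps, not this code.
    """
    names = set()
    for link in links:
        try:
            parts = urlsplit(link.strip())
        except ValueError:
            continue  # skip URLs that cannot be parsed at all
        # Limit to "http:" and "https:" (urlsplit lowercases the scheme).
        if parts.scheme not in ("http", "https"):
            continue
        # .hostname drops the scheme, userinfo, port, path, and query,
        # and lowercases the name.
        host = parts.hostname
        if host:
            # Assumption: treat "example.org." and "example.org" as the same name.
            names.add(host.rstrip("."))
    # A set keeps only one copy of each name; sort for stable output.
    return sorted(names)
```

For example, `domain_names_from_links(["https://Example.org/a", "http://example.org:8080/", "ftp://files.example.net/"])` keeps a single entry, `example.org`, because the two web URLs share a host name and the `ftp:` link is dropped.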
The dataset
750 databases were from 2020-01-01
After culling, 7.35 million unique domain names in the dataset
Use a random sample of 100,000 from that dataset for testing
  Needed to start with more than 100,000 because many names could not be resolved to an IPv4 address
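
One way to end up with a fixed-size sample when some names fail to resolve is to shuffle the full list and keep walking it until enough resolvable names have been found. The sketch below assumes that approach; the slides do not say how the oversampling was actually done. The names `resolves_to_ipv4` and `sample_resolvable` are hypothetical, and the resolver is passed in as a parameter so the sampling logic can be exercised without network access.

```python
import random
import socket


def resolves_to_ipv4(name):
    """Return True if the system resolver finds at least one IPv4 address."""
    try:
        return bool(socket.getaddrinfo(name, None, family=socket.AF_INET))
    except (socket.gaierror, UnicodeError):
        return False


def sample_resolvable(names, target, resolver=resolves_to_ipv4, seed=None):
    """Pick up to `target` names, in random order, that pass `resolver`.

    Shuffling the whole list and taking the first `target` names that
    resolve gives a uniform random sample of the resolvable names.
    """
    rng = random.Random(seed)
    shuffled = list(names)
    rng.shuffle(shuffled)
    kept = []
    for name in shuffled:
        if len(kept) == target:
            break
        if resolver(name):
            kept.append(name)
    return kept
```

If fewer than `target` names resolve, the function returns all of the names that did, so the caller can tell that the starting list was too small.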
Test results
17% also had IPv6 addresses
4% of the names with IPv4 addresses were signed with DNSSEC
Also timed TLS startup to help the work in the IETF DPRIVE working group
Short document from ICANN in the OCTO document series, hopefully within a few months
  More design detail, TCP/TLS numbers, more considerations for what is “typical”, ...
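
As a rough illustration of what timing TLS startup can look like, the sketch below measures only the TLS handshake, after the TCP connection is already open. The slides do not say which servers or ports were timed or how, so the default port 443, the timeout, and the names `tls_handshake_seconds` and `median_ms` are all assumptions made for this example.

```python
import socket
import ssl
import statistics
import time


def tls_handshake_seconds(host, port=443, timeout=5.0):
    """Return the seconds spent in the TLS handshake with host:port.

    The TCP connection is opened first so that only the handshake is timed.
    Port 443 is an assumption; the slides do not name a port.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        start = time.perf_counter()
        # wrap_socket() performs the handshake before it returns.
        with context.wrap_socket(sock, server_hostname=host):
            return time.perf_counter() - start


def median_ms(samples):
    """Summarize handshake timings (in seconds) as a median in milliseconds."""
    return statistics.median(samples) * 1000.0
```

A caller might collect several timings per name, for instance `median_ms([tls_handshake_seconds("example.org") for _ in range(5)])`, and treat an `OSError` (which includes `ssl.SSLError`) as a failed measurement rather than a data point.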
Questions
...and comments

Researchers measuring the properties of authoritative DNS servers first need to define the types of servers they sample. This talk builds a sample of domain names for typical, rather than popular, web sites from the external links on Wikipedia pages. After culling, the dataset holds 7.35 million unique domain names, from which a random sample of 100,000 was tested for IPv6 addresses, DNSSEC signing, and TLS startup time.


Uploaded on Sep 27, 2024

