Student's Summer Research Simplifies Domain Name Queries

July 15, 2010

When addressing a letter, you follow a standard format: name, street address, city, state, zip. But what if there were no rules to follow? What if you wrote the zip code first, or the street number after the state? The postal service would need to analyze each piece of mail separately, a time-consuming and inefficient task with a high probability of human error.

That is how security researchers and IT technicians feel when they query domain names (everything after the www. in a web address) to investigate cyber violations, such as phishing or spam. Yu-Lo Su, an INI Master of Science in Information Technology - Information Security (MSIT-IS) student from Taiwan, is spending his summer trying to change that.

Supervised by Professor Nicolas Christin, INI associate director and faculty member, Su is developing a module that takes the raw data from a domain name query and presents the information in a standard format.

Through a query/response protocol called WHOIS, researchers and technicians can find information about a domain name's owner, or 'registrant'. This information can include contact information for the registrant and the domain's IT technician, billing information, and the domain's start and expiration dates.
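To make the query/response exchange concrete, here is a minimal sketch in Python of a WHOIS lookup as RFC 3912 describes it: open a TCP connection to port 43, send the query followed by a carriage return and line feed, and read until the server closes the connection. The function names are illustrative assumptions and are not part of Su's module, which is written in Perl.

```python
import socket


def build_query(domain: str) -> bytes:
    """Build a WHOIS request: the query text followed by CRLF (RFC 3912)."""
    return domain.strip().encode("ascii") + b"\r\n"


def whois_query(domain: str, server: str, port: int = 43,
                timeout: float = 10.0) -> str:
    """Send one WHOIS query and return the server's raw text reply.

    `server` must be chosen by the caller; which server answers for a
    given domain depends on its registry and is not resolved here.
    """
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(build_query(domain))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    # WHOIS defines no character encoding, so decode leniently.
    return b"".join(chunks).decode("utf-8", errors="replace")
```

A call such as whois_query("example.com", "whois.verisign-grs.com") would return the raw reply as one block of text, assuming that server handles the domain. The layout of that text is whatever the server chooses to send.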

However, current query methods are inconsistent in the type, order and format of output information. Su's module will standardize this output data, making the information more accessible and user-friendly. According to Su, a consistent output format will immensely help security researchers who commonly conduct batch processing, querying large numbers of domain names simultaneously.

"Imagine querying 1,000 different domain names at the same time, and receiving a different field name for each entry, even though all fields describe the same type of information," Su said. "For example, 'Updated On', 'Last Updated', and 'Domain Last Updated Date' all refer to the same thing, but imagine sorting through 1,000 different ways to list that information. Wouldn't that be troublesome?"
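The kind of standardization Su describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Su's implementation (his module is written in Perl): the alias table, the canonical names such as updated_date, and the assumption that each line of raw output has the form "Label: value" are all hypothetical.

```python
import re

# Hypothetical alias table: differently worded labels that describe the
# same information all map to one canonical field name.
FIELD_ALIASES = {
    "updated on": "updated_date",
    "last updated": "updated_date",
    "domain last updated date": "updated_date",
    "created on": "created_date",
    "creation date": "created_date",
    "expires on": "expires_date",
    "expiration date": "expires_date",
}


def canonical_name(label: str) -> str:
    """Map a raw field label to its canonical name, ignoring case and spacing."""
    key = re.sub(r"\s+", " ", label).strip().lower()
    # Labels missing from the table are kept, with spaces turned into underscores.
    return FIELD_ALIASES.get(key, key.replace(" ", "_"))


def normalize(raw: str) -> dict:
    """Parse 'Label: value' lines into a dict keyed by canonical field names."""
    record = {}
    for line in raw.splitlines():
        label, sep, value = line.partition(":")  # split at the first colon only
        if not sep or not value.strip():
            continue  # skip lines that are not 'Label: value' pairs
        # Keep the first value seen for each canonical field.
        record.setdefault(canonical_name(label), value.strip())
    return record
```

With a table like this, a reply that says "Updated On: 2010-07-01" and another that says "Domain Last Updated Date: 2010-07-01" both produce a record with the same updated_date key, so a batch of 1,000 results can be sorted or compared on one field.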

To develop his module, Su is building upon an existing module from the Comprehensive Perl Archive Network (CPAN). With help from his sponsor at the Internet Corporation for Assigned Names and Numbers (ICANN) and his INI classroom knowledge, Su is using the programming language Perl to extend the current module’s capabilities.

Su will continue his work through the summer and possibly into the fall, until he graduates in December 2010. When his module is complete, Su will publicly post it on CPAN to benefit ICANN and other organizations, which can then implement the module or continue to build on its functionality. Time permitting, Su hopes to further extend his module to benefit global users as well.

"Right now, this module primarily concentrates on U.S. domain names," Su said. "Eventually, it should encompass international domains as well to truly benefit everyone."