Programming a client for the WHOIS protocol

I have a little task that involves programmatically determining whether DNS servers are set correctly for a domain. Since this project is written in Python, I first set out to see if there were any “whois” clients already available for Python. I eventually found rwhois.py, which is a whois client with recursive ability. I noticed it hasn’t changed since 2003, but thought that if it works, that shouldn’t be much of a problem.

My first run of the program resulted in an error. The client successfully found the registrar information for my domain, but failed to parse and display it. There was a “NoParser for: whois.godaddy.com” exception. I set out to analyze the rwhois.py client and the whois protocol and see if I couldn’t either fix it or come up with something for a replacement.

First of all, let me explain what I learned about the whois protocol. You could, of course, read the entire WHOIS protocol RFC (RFC 3912), but I think a quick summary will do the trick. WHOIS is a basic text-based protocol that operates over a TCP socket, much the way that a web browser sends and receives HTTP requests and responses from web servers. This protocol, however, doesn’t include all the markup. You simply connect to the WHOIS server on port 43 and send the domain that you’d like information about, followed by a line break; the server sends back some information and then closes the connection.

Here is a quick example for Google:

> telnet whois.markmonitor.com 43
google.com
---> snip <---
Created on..............: 1997-Sep-15.
Expires on..............: 2011-Sep-14.
Record last updated on..: 2006-Dec-29 14:38:40.
---> snip <---
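
The same exchange can be sketched in Python with a plain socket. This is only a minimal sketch, not the rwhois.py code: the function name `whois_query`, the 10-second timeout, and decoding the reply as UTF-8 with replacement characters are my own assumptions. The query is terminated with CRLF, and the reply is read until the server closes the connection.

```python
import socket


def whois_query(server, query, port=43, timeout=10.0):
    """Send one WHOIS query and return the server's full text reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        # A WHOIS request is a single line terminated by CRLF.
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    # The protocol does not define a text encoding; UTF-8 with
    # replacement characters is an assumption made for this sketch.
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling `whois_query("whois.markmonitor.com", "google.com")` should be the equivalent of the telnet session above, assuming the server is reachable from your network.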

That seems simple enough. The trick is knowing which WHOIS server to get the information from. In my example above, I queried whois.markmonitor.com. MarkMonitor is the domain registrar that Google used when they registered Google.com. Each top-level domain (TLD) has an authoritative server that is responsible for all the domains that fall under that TLD. My first thought was that there must be a list of authoritative WHOIS servers somewhere on the Internet. I checked the “whois” command on my Linux machine to see if it was using a built-in list or something else:


> strings `which whois` | grep whois
whois.corenic.net
whois.denic.de
whois.cat
whois.nic.ad.jp
whois.jprs.jp
whois.arin.net
whois.nic.mil
---> snipped <---

There were quite a lot of those. The whois client is using a list that is compiled in. Realizing that they have to update the program each time TLD information changes, I thought there must be a way to find an authoritative WHOIS server dynamically. I found that you can often do a DNS SRV lookup on _nicname._tcp.<the TLD> [1] to get the WHOIS server. I also found whois-servers.net, which allows you to do a DNS lookup on <the TLD>.whois-servers.net [2] and get pretty much the same thing.


> dig +short _nicname._tcp.uk srv
0 0 43 whois.nic.uk # worked (and the service is on port 43)
> dig +short _nicname._tcp.com srv
# uh oh, no WHOIS server found
> dig +short com.whois-servers.net
whois.verisign-grs.com.
199.7.51.74
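
The whois-servers.net lookup is easy to reproduce with Python's standard library. The sketch below is an illustration under assumptions: the helper names are my own, and it relies on <the TLD>.whois-servers.net being an alias for the real server, as in the dig output above. The SRV lookup is left out because the standard library has no SRV resolver.

```python
import socket


def whois_servers_alias(domain):
    """Build the <TLD>.whois-servers.net name for a domain such as 'google.com'."""
    tld = domain.rstrip(".").rsplit(".", 1)[-1].lower()
    return tld + ".whois-servers.net"


def find_whois_server(domain):
    """Resolve the alias and return the host name the resolver reports for it.

    This is typically the canonical name behind the alias. Raises OSError
    if the name does not resolve (for example, no alias exists for the TLD).
    """
    canonical, _aliases, _addresses = socket.gethostbyname_ex(
        whois_servers_alias(domain)
    )
    return canonical
```

For "google.com", `whois_servers_alias` builds "com.whois-servers.net", the same name passed to dig above; what `find_whois_server` returns depends on the live DNS data.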

Great, now you have a server to send the WHOIS query to. Notice that the server returned for the .com TLD is not the registrar's server that I queried in the example above for Google.com. The authority for the TLD returns a summary record and tells you which registrar is used. Sometimes the summary information is enough. To find all the information, however, you have to follow the registrar information and make additional queries.

Example:


> telnet whois.verisign-grs.com 43
google.com
---> snip <---
GOOGLE.COM.BR
GOOGLE.COM.BEYONDWHOIS.COM
GOOGLE.COM.AU
GOOGLE.COM.ACQUIRED.BY.CALITEC.NET
GOOGLE.COM

To single out one record, look it up with "xxx", where xxx is one of the of the records displayed above. If the records are the same, look them up with "=xxx" to receive a full display for each record.

Uh oh, I didn’t get the information I wanted back. Like the message says, you can use the equal sign to denote an exact match on the domain name:


> telnet whois.verisign-grs.com 43
=google.com
--->snip<---
Domain Name: GOOGLE.COM
Registrar: MARKMONITOR INC.
Whois Server: whois.markmonitor.com
Referral URL: http://www.markmonitor.com
Name Server: NS1.GOOGLE.COM
Name Server: NS2.GOOGLE.COM
--->snip<---

There. If you pipe the output of your query through grep "Whois Server", you can find the registrar that was used and make the additional WHOIS query to get the information you need. For me, I just need the name servers, so I’m OK at this point.
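
In Python, the same filtering can be done with regular expressions instead of grep. This is a minimal sketch, assuming the "Field: value" layout shown in the Verisign output above; the function names are hypothetical, and other registries may format their replies differently.

```python
import re

# Field names as they appear in the Verisign reply shown above;
# treating them as stable is an assumption.
_WHOIS_SERVER = re.compile(
    r"^[ \t]*Whois Server:[ \t]*(\S+)", re.IGNORECASE | re.MULTILINE
)
_NAME_SERVER = re.compile(
    r"^[ \t]*Name Server:[ \t]*(\S+)", re.IGNORECASE | re.MULTILINE
)


def referral_server(reply):
    """Return the registrar's WHOIS server named in the reply, or None."""
    match = _WHOIS_SERVER.search(reply)
    return match.group(1).lower() if match else None


def name_servers(reply):
    """Return the name servers listed in the reply, lower-cased, in order."""
    return [ns.lower() for ns in _NAME_SERVER.findall(reply)]


sample = """\
Domain Name: GOOGLE.COM
Registrar: MARKMONITOR INC.
Whois Server: whois.markmonitor.com
Referral URL: http://www.markmonitor.com
Name Server: NS1.GOOGLE.COM
Name Server: NS2.GOOGLE.COM
"""
print(referral_server(sample))  # whois.markmonitor.com
print(name_servers(sample))     # ['ns1.google.com', 'ns2.google.com']
```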

The rwhois.py client uses this exact method to return whois information. It has two problems.

  1. It maintains a static list of whois servers for TLDs. This requires that it be updated frequently as new TLDs are made available and as WHOIS servers change.
  2. It uses registrar-specific parsing code to display results. While this is handy, it is what caused it to fail for me, since it hadn’t been updated to know how to parse information returned by godaddy.com’s whois server.

I’m going to create a simple client that does a DNS lookup to find the appropriate WHOIS server, and then sends “=<the domain>” and tests the response with a regular expression to find my name servers. I think that ought to solve my problems with the rwhois.py client. If you need additional information, like administrative or technical contacts, you’d need to implement the recursive functionality.
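
The recursive functionality amounts to following each "Whois Server" referral until a reply names no new server. Here is a rough sketch, assuming the referral line looks like the Verisign one shown earlier. The name `follow_referrals`, the `query` callable (any function that takes a server and a query string and returns the reply text), and the hop limit are hypothetical choices of mine, not how rwhois.py does it.

```python
import re

_REFERRAL = re.compile(
    r"^[ \t]*Whois Server:[ \t]*(\S+)", re.IGNORECASE | re.MULTILINE
)


def follow_referrals(domain, first_server, query, max_hops=3):
    """Query first_server, then each referred server.

    Returns a list of (server, reply) pairs in the order queried.
    `query(server, text)` must return the reply text; it is passed in
    so the network code can be swapped out or faked.
    """
    results = []
    seen = set()
    server = first_server.lower()
    # Assumption: only the first (registry) server understands the "="
    # exact-match prefix, so it is used for the first hop only.
    text = "=" + domain
    while server and server not in seen and len(results) < max_hops:
        seen.add(server)
        reply = query(server, text)
        results.append((server, reply))
        match = _REFERRAL.search(reply)
        server = match.group(1).lower() if match else None
        text = domain
    return results
```

The `seen` set and `max_hops` are there so that a server referring back to itself, or a long chain of referrals, cannot keep the loop running forever.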

Hope you enjoyed!

References:
[1] http://www.circleid.com/posts/whois_server_address_registry/
[2] http://www.cctec.com/maillists/nanog/historical/9904/msg00217.html


4 Responses to Programming a client for the WHOIS protocol

  1. justatomato says:

    Great article! I’m trying to write a whois lib for my python application too. Were you able to finish it, and if so, could I take a peek at the code?

  2. Dennis says:

    I had a few partial implementations to test about as much as I have written about. Unfortunately, the project I was working on while I wrote this piece was placed on the back burner. I’ll probably end up coming back to it at some point however. I’ll definitely post the code if/when I get something together.

  3. Oyunlar says:

    Did anyone manage to get GoDaddy’s detailed whois results?

  4. Dale Hubbard says:

    I use rs.internic.net:43 for my code. If GoDaddy’s port 43 server is returned as the referral, it will sometimes refer you to their web interface. They also impose quotas per IP address. As described above, whois.verisign-grs.com is not a referrer and has no IP restrictions, so you can hit that as often as you like for basic stuff like NS and expiration date.
