On The Importance of Reverse DNS

I was creating a cache file for webalizer and analog and came to the obvious optimization: select only hosts which received a "200" status code in response to a request (I suppose I could add 304s as well, but I'm not sure that would add any value), then strip out any hosts which made fewer than 10 requests in a 30-day period. This reduced the IP address list from about 3500 entries across the sites I maintain to 470.
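For anyone wanting to try the same filtering, here is a minimal sketch in Python. It assumes the access log is in Common Log Format with the client address as the first field, and that the input has already been limited to the 30-day window. The names `LOG_LINE` and `frequent_hosts` are made up for this illustration and are not part of webalizer or analog.

```python
import re
from collections import Counter

# Matches the start of a Common Log Format line: host, ident, user,
# [timestamp], "request", status. Assumes the host is the first field.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) ')


def frequent_hosts(lines, min_requests=10, statuses=("200",)):
    """Return hosts with at least min_requests requests in the given statuses."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(2) in statuses:
            counts[match.group(1)] += 1
    return sorted(host for host, n in counts.items() if n >= min_requests)
```

Passing `statuses=("200", "304")` would count 304 responses as well.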

See updates below...

There appear to be about 140 unique organizations that hit the sites (mainly epcostello.net, frisket.org, and artific.com).

18% of the addresses do not reverse resolve.

Of the 387 addresses which did reverse resolve, 41 (11%) reverse resolve to a hostname which does not forward resolve back to the original address (that is: address 1.2.3.4 reverse resolves to something.example.com, but something.example.com itself does not resolve back to 1.2.3.4).
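The check described above can be sketched roughly as follows. This is only an illustration, not the script I actually used: the function names and the four result labels are invented here, and the lookups go through Python's standard `socket` module, so results depend on whatever resolver the machine is configured with. The resolver functions are passed in as parameters so the logic can be exercised without touching the network.

```python
import socket


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname):
    """Return the set of IPv4 addresses hostname resolves to (may be empty)."""
    try:
        return set(socket.gethostbyname_ex(hostname)[2])
    except OSError:
        return set()


def classify(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Classify ip as 'no-ptr', 'no-forward', 'mismatch' or 'confirmed'."""
    hostname = reverse(ip)
    if hostname is None:
        return "no-ptr"          # address does not reverse resolve
    addresses = forward(hostname)
    if not addresses:
        return "no-forward"      # PTR name does not resolve to anything
    # The PTR name resolves, but possibly to some other address.
    return "confirmed" if ip in addresses else "mismatch"
```

Running `classify` over the whole address list and tallying the labels would give the kind of percentages quoted above.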

MSN has a number of hosts which reverse to a phx.gbl top level domain (64.4.8.113 through .118). That is to say, 64.4.8.113 reverse maps to by1sch4041904.phx.gbl, not .msn.com or .msn.net. I sent a note to the POC for the 64.4.8 network, but it appears to have disappeared into a black hole. Either that, or they're intentionally reverse mapping to a nonsense domain.

Oddly, there are another 17 hosts which all reverse resolve to msnbot.msn.com, none of which are on the 64.4.8 network. That is to say: 17 hosts, across a number of networks and subnets, all reverse resolve to the same hostname, msnbot.msn.com, and this hostname itself does not resolve to anything.

crawler.bloglines.com does not forward resolve to [65.214.39.151], though 65.214.39.151 resolves back to crawler.bloglines.com.

MSN seems to have the largest number of IP and hostname mismatches or resolution failures.

One of the IBM gateways (I'm guessing in the Southbury, CT data center) reverse resolves [129.33.1.37] to bi01pt1.ct.us.ibm.com, which does not in turn resolve back to 129.33.1.37.

Nothing earth-shattering here: most web sites turn off name resolution these days, doing it only in post-processing or on a case-by-case basis within an application. And no one who is remotely sane turns on HostnameLookups double in their server configuration.

Where it does come into play is if you are using hostnames in access control lists. Unlikely on a totally public site, but if you have a protected area, a semi-private extranet, and you add an Allow from .ibm.com, then anyone who's using that 129.33.1.37 gateway will get bounced, at least from Apache-based servers, since mod_access will perform a double lookup (at least according to the documentation).
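To make the failure mode concrete, here is a rough Python approximation of a hostname-based allow decision with a double lookup. It is not Apache's code, just my reading of the documented behaviour. The function `allowed_from` and the two stand-in resolver functions are hypothetical, and the stand-in data simply mirrors the IBM gateway case above, assuming the forward lookup returns nothing.

```python
def allowed_from(ip, domain, reverse, forward):
    """Approximate a hostname-based 'Allow from <domain>' decision.

    reverse(ip) returns a hostname or None; forward(hostname) returns
    a set of addresses. Both are supplied by the caller.
    """
    hostname = reverse(ip)
    if hostname is None:
        return False
    # Double lookup: the PTR name must map back to the client address.
    if ip not in forward(hostname):
        return False
    hostname = hostname.lower().rstrip(".")
    suffix = domain.lower().lstrip(".")
    # Match whole name components only, so "notibm.com" is not ".ibm.com".
    return hostname == suffix or hostname.endswith("." + suffix)


# Stand-in resolvers (assumed data, no network access).
def fake_reverse(ip):
    return {"129.33.1.37": "bi01pt1.ct.us.ibm.com"}.get(ip)


def fake_forward(hostname):
    return set()  # assumption: the name does not forward resolve


print(allowed_from("129.33.1.37", ".ibm.com", fake_reverse, fake_forward))
# prints False
```

The gateway's PTR record does end in .ibm.com, but the request is still refused because the forward lookup never confirms the address.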

I suppose there are other situations where you might use the hostname to allow access for search engine spiders, where otherwise you might require some other form of authentication (e.g., set up a Satisfy any block, add allow rules for *.google.com, *.msn.com, and *.yahoo.com, and then a check for a cookie with a mix of Allow and SetEnvIf rules).

 

Update 8 July 2006

I noticed more search hits coming in to this article as well as some more comments and thought I'd post this update. I feel confident that the fake .phx.gbl top level domain name is being used by Microsoft/MSN, though I cannot understand why (if you're going to create PTR records, why not make them with your identifiable domain, since anyone can eventually determine who is assigned the address block).

I created the following table of addresses in the 64.4.8.0/24 network with the corresponding PTR records (reverse DNS records). This is a snapshot as of 0347Z on 8 July 2006.

If anyone from Microsoft network operations reads this, would you mind explaining why you're using a fake top level domain for your search engine robot? (It also appears to be used in some MSN/Hotmail mail headers, that's less of a concern or interest to me)

This is a table of the reverse DNS (PTR records) for addresses in the 64.4.8.0-64.4.8.255 range taken 7 July 2006. According to ARIN, the 64.4.0.0/18 network is assigned to Microsoft / Hotmail.

Most of the addresses do not reverse resolve to anything. Of those that do reverse resolve, only three [64.4.8.7, 64.4.8.8, 64.4.8.252] resolve to a hostname which in turn resolves to that IP address. One address [64.4.8.14] reverse resolves to oe.hotmail.com, which in turn resolves to [64.4.60.7].
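A sweep like this is easy to reproduce. The following is a minimal sketch, assuming Python's standard `ipaddress` and `socket` modules; `ptr_lookup` and `ptr_table` are names made up for this example, and the results today will almost certainly differ from the 2006 snapshot.

```python
import ipaddress
import socket


def ptr_lookup(ip):
    """Return the PTR hostname for ip, or None if it does not reverse resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def ptr_table(network, reverse=ptr_lookup):
    """Return (address, hostname) pairs for every address with a PTR record."""
    rows = []
    # Iterating a network yields every address in it, first to last.
    for address in ipaddress.ip_network(network):
        hostname = reverse(str(address))
        if hostname is not None:
            rows.append((str(address), hostname))
    return rows
```

Calling `ptr_table("64.4.8.0/24")` walks 64.4.8.0 through 64.4.8.255, and printing each pair as `[address] hostname` gives rows in the same shape as the table in this update.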

This address block appears to be used by MSN's 'bot (msnbot.msn.com), based on my access logs. msnbot also appears to use addresses in the 65.55.235.0/24 network, which also resolve to the fake .phx.gbl top level domain.

IP Address     Reverse DNS mapping
[64.4.8.3]     vlan300.bay-6nf-srch-4a.ntwk.msn.net
[64.4.8.4]     vlan300.bay-6nf-srch-4b.ntwk.msn.net
[64.4.8.7]     dc1.hotmail.com
[64.4.8.8]     dc2.hotmail.com
[64.4.8.14]    oe.hotmail.com
[64.4.8.110]   by1sch4041901.phx.gbl
[64.4.8.111]   by1sch4041902.phx.gbl
[64.4.8.112]   by1sch4041903.phx.gbl
[64.4.8.113]   by1sch4041904.phx.gbl
[64.4.8.114]   by1sch4041905.phx.gbl
[64.4.8.115]   by1sch4041906.phx.gbl
[64.4.8.116]   by1sch4041907.phx.gbl
[64.4.8.117]   by1sch4041908.phx.gbl
[64.4.8.118]   by1sch4041909.phx.gbl
[64.4.8.119]   by1sch4041910.phx.gbl
[64.4.8.120]   by1sch4041911.phx.gbl
[64.4.8.121]   by1sch4041912.phx.gbl
[64.4.8.122]   by1sch4041913.phx.gbl
[64.4.8.123]   by1sch4041914.phx.gbl
[64.4.8.124]   by1sch4041915.phx.gbl
[64.4.8.125]   by1sch4041916.phx.gbl
[64.4.8.126]   by1sch4041917.phx.gbl
[64.4.8.127]   by1sch4041918.phx.gbl
[64.4.8.128]   by1sch4041919.phx.gbl
[64.4.8.129]   by1sch4041920.phx.gbl
[64.4.8.130]   by1sch4040801.phx.gbl
[64.4.8.131]   by1sch4040802.phx.gbl
[64.4.8.132]   by1sch4040803.phx.gbl
[64.4.8.133]   by1sch4040804.phx.gbl
[64.4.8.134]   by1sch4040805.phx.gbl
[64.4.8.135]   by1sch4040806.phx.gbl
[64.4.8.136]   by1sch4040807.phx.gbl
[64.4.8.137]   by1sch4040808.phx.gbl
[64.4.8.138]   by1sch4040809.phx.gbl
[64.4.8.139]   by1sch4040810.phx.gbl
[64.4.8.140]   by1sch4040811.phx.gbl
[64.4.8.141]   by1sch4040812.phx.gbl
[64.4.8.142]   by1sch4040813.phx.gbl
[64.4.8.143]   by1sch4040814.phx.gbl
[64.4.8.144]   by1sch4040815.phx.gbl
[64.4.8.145]   by1sch4040816.phx.gbl
[64.4.8.146]   by1sch4040817.phx.gbl
[64.4.8.147]   by1sch4040818.phx.gbl
[64.4.8.148]   by1sch4040819.phx.gbl
[64.4.8.149]   by1sch4040820.phx.gbl
[64.4.8.212]   by1sch40408ms.phx.gbl
[64.4.8.215]   by1sch40419dg.phx.gbl
[64.4.8.216]   by1sch40419ms.phx.gbl
[64.4.8.220]   by1sch40301dg.phx.gbl
[64.4.8.221]   by1sch40302dg.phx.gbl
[64.4.8.222]   by1sch40408dg.phx.gbl
[64.4.8.223]   by1sch40304dg.phx.gbl
[64.4.8.224]   by1sch40305dg.phx.gbl
[64.4.8.225]   by1sch40306dg.phx.gbl
[64.4.8.226]   by1sch40307dg.phx.gbl
[64.4.8.227]   by1sch40308dg.phx.gbl
[64.4.8.228]   by1sch40309dg.phx.gbl
[64.4.8.229]   by1sch40310dg.phx.gbl
[64.4.8.252]   search.msn-int-tr.com

Update 6 September 2006

  • crawler.bloglines.com now forward and reverse resolves to 65.214.44.29.

Posted in Webmastery


Copyright 2002–2011 Artific Consulting LLC.

Unless otherwise noted, content is licensed for reuse under the Creative Commons Attribution-ShareAlike 3.0 License. Please read and understand the license before repurposing content from this site.