I was creating a cache file for webalizer and analog and came to the obvious optimization: select only hosts which received a "200" status code in response to a request (I suppose I could add 304s as well, but I'm not sure that would add any value), and then strip out any hosts which made fewer than 10 requests in a 30-day period. This reduced the IP address list from about 3500 entries across the sites I maintain to 470.
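The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the script actually used: it assumes the logs are in Common or Combined Log Format and that the lines passed in already cover the 30-day window. The function name `frequent_hosts` and its parameters are hypothetical.

```python
import re
from collections import Counter

# Captures the client host and the status code of a Common/Combined
# Log Format line (assumed format; adjust for custom LogFormat strings).
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) ')


def frequent_hosts(lines, min_requests=10, statuses=("200",)):
    """Return the sorted hosts that made at least min_requests requests
    answered with one of the given status codes."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(2) in statuses:
            counts[match.group(1)] += 1
    return sorted(host for host, n in counts.items() if n >= min_requests)
```

Passing `statuses=("200", "304")` would include the 304 responses as well.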
There appear to be about 140 unique organizations that hit the sites (mainly epcostello.net, frisket.org, and artific.com).
18% of the addresses do not reverse resolve.
Of the 387 addresses which did reverse resolve, 41 (11%) reverse resolve to a hostname which itself does not forward resolve back to the original address (that is: address 1.2.3.4 reverse resolves to something.example.com, but something.example.com itself does not resolve back to 1.2.3.4).
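The three cases counted above (no reverse record, a reverse record that does not resolve back, and a fully confirmed pair) can be told apart with a check along these lines. It is a sketch using Python's standard `socket` module, not the method actually used for these numbers; the function name `classify` and the injectable `reverse` and `forward` parameters are hypothetical and exist only so the logic can be exercised without live DNS.

```python
import socket


def classify(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Classify an IPv4 address as 'no-ptr', 'mismatch', or 'confirmed'.

    Returns a (status, hostname) tuple; hostname is None when there is
    no reverse record.
    """
    try:
        hostname = reverse(ip)[0]          # PTR lookup
    except OSError:                        # herror/gaierror are OSError subclasses
        return "no-ptr", None
    try:
        addresses = forward(hostname)[2]   # A-record lookup on the PTR name
    except OSError:
        return "mismatch", hostname        # the name resolves to nothing
    return ("confirmed" if ip in addresses else "mismatch"), hostname
```

Running `classify` over the filtered address list and tallying the first element of each result would give counts comparable to the percentages quoted here.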
MSN has a number of hosts which reverse to a phx.gbl top level domain (64.4.8.113 through .118). That is to say, 64.4.8.113 reverse maps to by1sch4041904.phx.gbl, not .msn.com or .msn.net.
I sent a note to the poc for the 64.4.8 network but it appeared to disappear into a black hole.
Or they're intentionally reverse mapping to a nonsense domain.
Oddly, there are another 17 hosts which all reverse resolve to msnbot.msn.com, none of which are on the 64.4.8 network.
That is to say: 17 hosts, across a number of networks and subnets, all reverse resolve to the same hostname, msnbot.msn.com, and this hostname itself does not resolve to anything.
crawler.bloglines.com does not forward resolve to [65.214.39.151], though 65.214.39.151 resolves back to crawler.bloglines.com.
MSN seems to have the largest number of IPs and hostname mismatches or resolution failures.
One of the IBM gateways (I'm guessing in the Southbury, CT data center) reverse resolves [129.33.1.37] to bi01pt1.ct.us.ibm.com, which does not in turn resolve back to 129.33.1.37.
Nothing earth shattering here...most web sites turn off name resolution these days, doing it only in post-processing, or on a specific basis within an application.
And no one who is remotely sane turns on HostnameLookups double in their server configurations.
Where it does come into play is if you are using hostnames in access control lists.
That is unlikely on a totally public site, but if you have a protected area or a semi-private extranet and you add an Allow from .ibm.com rule, then anyone who's using that 129.33.1.37 gateway will get bounced, at least from Apache-based servers, since mod_access will perform a double lookup (at least according to the documentation).
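To see why the gateway gets bounced, here is a rough Python sketch of a hostname-based access check with a double lookup. It is not Apache's implementation, only an illustration of the behavior the documentation describes; `allowed_by_domain` and its resolver parameters are hypothetical names.

```python
import socket


def allowed_by_domain(ip, suffix, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return True only if ip reverse resolves to a name ending in suffix
    AND that name forward resolves back to the same ip."""
    try:
        hostname = reverse(ip)[0].lower().rstrip(".")
        addresses = forward(hostname)[2]
    except OSError:
        # No PTR record, or the PTR name does not resolve: deny.
        return False
    if ip not in addresses:
        # The PTR name resolves, but not back to this client: deny.
        return False
    return hostname.endswith(suffix.lower())
```

Under this logic a client at 129.33.1.37 is denied for the suffix `.ibm.com` even though its reverse name ends in ibm.com, because the forward lookup does not return 129.33.1.37.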
I suppose there are other situations where you might use the hostname to allow access for search engine spiders, where otherwise you might require some other form of authentication (e.g., set up a Satisfy any block, add Allow from rules for .google.com, .msn.com, and .yahoo.com, and then check for a cookie with a mix of Allow and SetEnvIf rules).
I noticed more search hits coming in to this article as well as some more comments and thought I'd post this update. I feel confident that the fake .phx.gbl top level domain name is being used by Microsoft/MSN, though I cannot understand why (if you're going to create PTR records, why not make them with your identifiable domain, since anyone can eventually determine who is assigned the address block).
I created the following table of addresses in the 64.4.8.0/24 network with the corresponding PTR records (reverse DNS records). This is a snapshot as of 0347Z on 8 July 2006.
If anyone from Microsoft network operations reads this, would you mind explaining why you're using a fake top level domain for your search engine robot? (It also appears to be used in some MSN/Hotmail mail headers, though that's less of a concern or interest to me.)
| IP Addresses | Reverse DNS mappings |
|---|---|
| [64.4.8.3] | vlan300.bay-6nf-srch-4a.ntwk.msn.net |
| [64.4.8.4] | vlan300.bay-6nf-srch-4b.ntwk.msn.net |
| [64.4.8.7] | dc1.hotmail.com |
| [64.4.8.8] | dc2.hotmail.com |
| [64.4.8.14] | oe.hotmail.com |
| [64.4.8.110] | by1sch4041901.phx.gbl |
| [64.4.8.111] | by1sch4041902.phx.gbl |
| [64.4.8.112] | by1sch4041903.phx.gbl |
| [64.4.8.113] | by1sch4041904.phx.gbl |
| [64.4.8.114] | by1sch4041905.phx.gbl |
| [64.4.8.115] | by1sch4041906.phx.gbl |
| [64.4.8.116] | by1sch4041907.phx.gbl |
| [64.4.8.117] | by1sch4041908.phx.gbl |
| [64.4.8.118] | by1sch4041909.phx.gbl |
| [64.4.8.119] | by1sch4041910.phx.gbl |
| [64.4.8.120] | by1sch4041911.phx.gbl |
| [64.4.8.121] | by1sch4041912.phx.gbl |
| [64.4.8.122] | by1sch4041913.phx.gbl |
| [64.4.8.123] | by1sch4041914.phx.gbl |
| [64.4.8.124] | by1sch4041915.phx.gbl |
| [64.4.8.125] | by1sch4041916.phx.gbl |
| [64.4.8.126] | by1sch4041917.phx.gbl |
| [64.4.8.127] | by1sch4041918.phx.gbl |
| [64.4.8.128] | by1sch4041919.phx.gbl |
| [64.4.8.129] | by1sch4041920.phx.gbl |
| [64.4.8.130] | by1sch4040801.phx.gbl |
| [64.4.8.131] | by1sch4040802.phx.gbl |
| [64.4.8.132] | by1sch4040803.phx.gbl |
| [64.4.8.133] | by1sch4040804.phx.gbl |
| [64.4.8.134] | by1sch4040805.phx.gbl |
| [64.4.8.135] | by1sch4040806.phx.gbl |
| [64.4.8.136] | by1sch4040807.phx.gbl |
| [64.4.8.137] | by1sch4040808.phx.gbl |
| [64.4.8.138] | by1sch4040809.phx.gbl |
| [64.4.8.139] | by1sch4040810.phx.gbl |
| [64.4.8.140] | by1sch4040811.phx.gbl |
| [64.4.8.141] | by1sch4040812.phx.gbl |
| [64.4.8.142] | by1sch4040813.phx.gbl |
| [64.4.8.143] | by1sch4040814.phx.gbl |
| [64.4.8.144] | by1sch4040815.phx.gbl |
| [64.4.8.145] | by1sch4040816.phx.gbl |
| [64.4.8.146] | by1sch4040817.phx.gbl |
| [64.4.8.147] | by1sch4040818.phx.gbl |
| [64.4.8.148] | by1sch4040819.phx.gbl |
| [64.4.8.149] | by1sch4040820.phx.gbl |
| [64.4.8.212] | by1sch40408ms.phx.gbl |
| [64.4.8.215] | by1sch40419dg.phx.gbl |
| [64.4.8.216] | by1sch40419ms.phx.gbl |
| [64.4.8.220] | by1sch40301dg.phx.gbl |
| [64.4.8.221] | by1sch40302dg.phx.gbl |
| [64.4.8.222] | by1sch40408dg.phx.gbl |
| [64.4.8.223] | by1sch40304dg.phx.gbl |
| [64.4.8.224] | by1sch40305dg.phx.gbl |
| [64.4.8.225] | by1sch40306dg.phx.gbl |
| [64.4.8.226] | by1sch40307dg.phx.gbl |
| [64.4.8.227] | by1sch40308dg.phx.gbl |
| [64.4.8.228] | by1sch40309dg.phx.gbl |
| [64.4.8.229] | by1sch40310dg.phx.gbl |
| [64.4.8.252] | search.msn-int-tr.com |
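A table like the one above could be regenerated with a short sweep of the /24. The following is a sketch under the assumption that a plain reverse lookup per address is acceptable; it is not how the original snapshot was produced. `ptr_rows` and its `reverse` parameter are hypothetical names.

```python
import ipaddress
import socket


def ptr_rows(network, reverse=socket.gethostbyaddr):
    """Yield (address, hostname) for every host address in network
    that has a reverse DNS (PTR) record; addresses without one are skipped."""
    for address in ipaddress.ip_network(network).hosts():
        ip = str(address)
        try:
            yield ip, reverse(ip)[0]
        except OSError:
            continue
```

Each pair could then be printed as a table row, for example with `print(f"| [{ip}] | {name} |")` in a loop over `ptr_rows("64.4.8.0/24")`.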
crawler.bloglines.com now forward and reverse resolves to 65.214.44.29.
Copyright 2002–2008 Artific Consulting LLC.
Unless otherwise noted, content is licensed for reuse under the Creative Commons Attribution-ShareAlike 3.0 License.
Please read and understand the license before repurposing content from this site.
Comments
Comments are hosted through disqus effective November 2008.
From: Steven
Date: May 21, 2006 03:08 AM
From: KenCinJapan
Date: June 28, 2006 07:19 PM
From: menneke
Date: July 2, 2006 01:17 PM
In my web log files the domain sticks out because it has approximately the same number of page accesses as hits, a typical sign of a robot/spider.
From: OtisB
Date: July 13, 2006 03:26 PM
From: Nemo Noman
Date: September 6, 2006 06:33 AM
From: meinhard
Date: September 24, 2006 11:34 AM
hello phoenix dot global hunters, ;)
this is from the access_log of a new domain registered ten days ago:
..a crawler it is alright, but it does not follow any of the links and this was the only visit so far. about the odd tld, maybe ms corp is trying to introduce a new tld by itself?
what happened to steven's media campaign? i proposed that the issue should be included in wikipedia: http://en.wikipedia.org/wiki/Talk:Alternative_DNS_root
cheers
From: ErwinvA
Date: October 13, 2006 09:13 PM
For example, when I'm logged in to MSN I see this in my netstat:
TCP erwin:4484 by2msg2204912.phx.gbl:1863 ESTABLISHED
When I run 'netstat -an' I get this:
TCP 192.168.123.200:4484 207.46.111.86:1863 ESTABLISHED
When I lookup the IP via nslookup I get this:
Name: by2msg2204912.phx.gbl
Address: 207.46.111.86
My question remains; why? Why would Microsoft do this? It also comes back in every Hotmail E-mail header..
Greets,
Erwin
From: Kenyon
Date: October 26, 2006 07:46 PM
From: James
Date: November 20, 2006 10:16 AM
From: hermes lopez
Date: November 22, 2006 07:00 PM
From: Orb
Date: November 28, 2006 07:20 PM
From: Orvago
Date: December 4, 2006 09:03 PM
From: Rockdrala
Date: January 19, 2007 06:17 PM
From: Alejandro
Date: February 26, 2007 02:13 PM