I was creating a cache file for webalizer and analog and came to the obvious optimization: select only hosts which received a "200" status code in response to a request (I suppose I could add 304s as well, but I'm not sure that would add any value), and then strip out any hosts which made fewer than 10 requests in a 30 day period. This reduced the IP address list from about 3500 entries across the sites I maintain to 470.
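The filtering step described above can be sketched roughly as follows. This is only an illustration, not the script actually used: it assumes the logs are in Common or Combined Log Format, that the lines passed in already cover the 30 day window, and the names frequent_hosts and LOG_LINE are made up for the example.

```python
import re
from collections import Counter

# Start of a Common/Combined Log Format line:
# host ident user [timestamp] "request" status ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) ')


def frequent_hosts(lines, min_requests=10, statuses=("200",)):
    """Return the sorted hosts with at least min_requests matching responses.

    Only requests whose status code is in `statuses` are counted.
    The caller is assumed to pass in lines from the period of interest.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(2) in statuses:
            counts[match.group(1)] += 1
    return sorted(host for host, n in counts.items() if n >= min_requests)
```

Adding 304s would be a matter of passing statuses=("200", "304"). Requests containing escaped quotes will not match this simple pattern and are skipped.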

See updates below...

There appear to be about 140 unique organizations that hit the sites (mainly epcostello.net, frisket.org, and artific.com).

18% of the addresses do not reverse resolve.

Of the 387 addresses which did reverse resolve, 41 (11%) reverse resolve to a hostname which itself does not forward resolve back to that address (that is: address 1.2.3.4 reverse resolves to something.example.com, but something.example.com itself does not resolve back to 1.2.3.4).
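The check behind those numbers is a reverse lookup followed by a forward lookup of the result. A minimal sketch, assuming IPv4 only and the system resolver via Python's socket module; the function names and the three labels are invented for the example:

```python
import socket


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if it does not reverse resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def forward_lookup(hostname):
    """Return the set of IPv4 addresses for hostname (empty if it fails)."""
    try:
        return set(socket.gethostbyname_ex(hostname)[2])
    except OSError:
        return set()


def classify(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Classify ip as 'no-ptr', 'mismatch', or 'confirmed'.

    'mismatch' means the PTR hostname exists but does not resolve
    back to ip (either to nothing or to some other address).
    """
    hostname = reverse(ip)
    if hostname is None:
        return "no-ptr"
    return "confirmed" if ip in forward(hostname) else "mismatch"
```

The reverse and forward parameters exist so the logic can be exercised with canned data instead of live DNS.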

MSN has a number of hosts which reverse to a phx.gbl top level domain (64.4.8.113 through .118). That is to say, 64.4.8.113 reverse maps to by1sch4041904.phx.gbl, not .msn.com or .msn.net. I sent a note to the poc for the 64.4.8 network but it appeared to disappear into a black hole. Or they're intentionally reverse mapping to a nonsense domain.

Oddly, there are another 17 hosts which all reverse resolve to msnbot.msn.com, none of which are on the 64.4.8 network. That is to say: 17 hosts, across a number of networks and subnets, all reverse resolve to the same hostname, msnbot.msn.com, and this hostname itself does not resolve to anything.

crawler.bloglines.com does not forward resolve to [65.214.39.151], though 65.214.39.151 resolves back to crawler.bloglines.com.

MSN seems to have the largest number of IPs and hostname mismatches or resolution failures.

One of the IBM gateways (I'm guessing in the Southbury, CT data center) reverse resolves [129.33.1.37] to bi01pt1.ct.us.ibm.com, which does not in turn resolve back to 129.33.1.37.

Nothing earth shattering here...most web sites turn off name resolution these days, doing it only in post-processing, or on a specific basis within an application. And no one who is remotely sane turns on HostnameLookups double in their server configurations.

Where it does come into play is if you are using hostnames in access control lists. Unlikely on a totally public site, but if you have a protected area, a semi-private extranet, and you add an Allow from .ibm.com, then anyone who's using that 129.33.1.37 gateway will get bounced, at least from Apache-based servers, since mod_access will perform a double lookup (at least according to the documentation).
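To make the failure mode concrete, here is a rough imitation of a hostname-based allow rule that insists on a double lookup. It is not Apache's code and the matching is simplified; allow_from_domain and its reverse/forward arguments are hypothetical helpers supplied by the caller.

```python
def allow_from_domain(ip, suffix, reverse, forward):
    """Imitate a hostname-based allow rule that requires a double lookup.

    reverse(ip) returns a hostname or None; forward(hostname) returns a
    set of addresses. Both are caller-supplied stand-ins for DNS.
    """
    hostname = reverse(ip)
    if hostname is None:
        return False  # no PTR record, so nothing to match against
    if ip not in forward(hostname):
        return False  # PTR name does not resolve back to the client
    hostname = hostname.rstrip(".").lower()
    suffix = suffix.lower()
    if suffix.startswith("."):
        return hostname.endswith(suffix)
    return hostname == suffix
```

With data shaped like the IBM gateway case (a PTR of bi01pt1.ct.us.ibm.com and no matching forward record), the function returns False for ".ibm.com" even though the reverse name ends in the right domain.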

I suppose there are other situations where you might use the hostname to allow access for search engine spiders, where otherwise you might require some other form of authentication (e.g., set up a satisfy any block, add allow *.google.com, *.msn.com, *.yahoo.com and then a check for a cookie with a mix of Allow and SetEnvIf rules).

§

Update 8 July 2006

I noticed more search hits coming in to this article as well as some more comments and thought I'd post this update. I feel confident that the fake .phx.gbl top level domain name is being used by Microsoft/MSN, though I cannot understand why (if you're going to create PTR records, why not make them with your identifiable domain, since anyone can eventually determine who is assigned the address block?).

I created the following table of addresses in the 64.4.8.0/24 network with the corresponding PTR records (reverse DNS records). This is a snapshot as of 0347Z on 8 July 2006.

If anyone from Microsoft network operations reads this, would you mind explaining why you're using a fake top level domain for your search engine robot? (It also appears to be used in some MSN/Hotmail mail headers, but that's less of a concern or interest to me.)

This is a table of the reverse DNS (PTR records) for addresses in the 64.4.8.0-64.4.8.255 range taken 7 July 2006. According to ARIN, the 64.4.0.0/18 network is assigned to Microsoft / Hotmail.

Most of the addresses do not reverse resolve to anything. Of those that do reverse resolve, only three [64.4.8.7, 64.4.8.8, 64.4.8.252] resolve to a hostname which in turn resolves to that IP address. One address [64.4.8.14] reverse resolves to oe.hotmail.com, which in turn resolves to [64.4.60.7].

This address block appears to be used by MSN's 'bot (msnbot.msn.com), based on my access logs. msnbot also appears to use addresses in the 65.55.235.0/24 network, which also resolve to the fake .phx.gbl top level domain.

IP Addresses Reverse DNS mappings
[64.4.8.3] vlan300.bay-6nf-srch-4a.ntwk.msn.net
[64.4.8.4] vlan300.bay-6nf-srch-4b.ntwk.msn.net
[64.4.8.7] dc1.hotmail.com
[64.4.8.8] dc2.hotmail.com
[64.4.8.14] oe.hotmail.com
[64.4.8.110] by1sch4041901.phx.gbl
[64.4.8.111] by1sch4041902.phx.gbl
[64.4.8.112] by1sch4041903.phx.gbl
[64.4.8.113] by1sch4041904.phx.gbl
[64.4.8.114] by1sch4041905.phx.gbl
[64.4.8.115] by1sch4041906.phx.gbl
[64.4.8.116] by1sch4041907.phx.gbl
[64.4.8.117] by1sch4041908.phx.gbl
[64.4.8.118] by1sch4041909.phx.gbl
[64.4.8.119] by1sch4041910.phx.gbl
[64.4.8.120] by1sch4041911.phx.gbl
[64.4.8.121] by1sch4041912.phx.gbl
[64.4.8.122] by1sch4041913.phx.gbl
[64.4.8.123] by1sch4041914.phx.gbl
[64.4.8.124] by1sch4041915.phx.gbl
[64.4.8.125] by1sch4041916.phx.gbl
[64.4.8.126] by1sch4041917.phx.gbl
[64.4.8.127] by1sch4041918.phx.gbl
[64.4.8.128] by1sch4041919.phx.gbl
[64.4.8.129] by1sch4041920.phx.gbl
[64.4.8.130] by1sch4040801.phx.gbl
[64.4.8.131] by1sch4040802.phx.gbl
[64.4.8.132] by1sch4040803.phx.gbl
[64.4.8.133] by1sch4040804.phx.gbl
[64.4.8.134] by1sch4040805.phx.gbl
[64.4.8.135] by1sch4040806.phx.gbl
[64.4.8.136] by1sch4040807.phx.gbl
[64.4.8.137] by1sch4040808.phx.gbl
[64.4.8.138] by1sch4040809.phx.gbl
[64.4.8.139] by1sch4040810.phx.gbl
[64.4.8.140] by1sch4040811.phx.gbl
[64.4.8.141] by1sch4040812.phx.gbl
[64.4.8.142] by1sch4040813.phx.gbl
[64.4.8.143] by1sch4040814.phx.gbl
[64.4.8.144] by1sch4040815.phx.gbl
[64.4.8.145] by1sch4040816.phx.gbl
[64.4.8.146] by1sch4040817.phx.gbl
[64.4.8.147] by1sch4040818.phx.gbl
[64.4.8.148] by1sch4040819.phx.gbl
[64.4.8.149] by1sch4040820.phx.gbl
[64.4.8.212] by1sch40408ms.phx.gbl
[64.4.8.215] by1sch40419dg.phx.gbl
[64.4.8.216] by1sch40419ms.phx.gbl
[64.4.8.220] by1sch40301dg.phx.gbl
[64.4.8.221] by1sch40302dg.phx.gbl
[64.4.8.222] by1sch40408dg.phx.gbl
[64.4.8.223] by1sch40304dg.phx.gbl
[64.4.8.224] by1sch40305dg.phx.gbl
[64.4.8.225] by1sch40306dg.phx.gbl
[64.4.8.226] by1sch40307dg.phx.gbl
[64.4.8.227] by1sch40308dg.phx.gbl
[64.4.8.228] by1sch40309dg.phx.gbl
[64.4.8.229] by1sch40310dg.phx.gbl
[64.4.8.252] search.msn-int-tr.com
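A table like the one above can be produced by walking every address in the block and keeping the ones that have a PTR record. The following is a sketch of that idea rather than the tool actually used for the snapshot; ptr_table and ptr_or_none are names invented here, and the standard ipaddress and socket modules are assumed.

```python
import ipaddress
import socket


def ptr_or_none(ip):
    """Return the PTR hostname for ip, or None if it does not reverse resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def ptr_table(network, reverse=ptr_or_none):
    """Return (address, hostname) pairs for each address in network with a PTR.

    Addresses that do not reverse resolve are left out, as in the table.
    """
    rows = []
    for addr in ipaddress.ip_network(network):
        hostname = reverse(str(addr))
        if hostname is not None:
            rows.append((str(addr), hostname))
    return rows
```

Calling ptr_table("64.4.8.0/24") with the default resolver performs 256 live lookups, so it can be slow, and today's answers will not match the 2006 snapshot.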

Update: September 6, 2006

Comments

Comments are moderated, and are closed after 28 days.

From: Steven
Date: May 21, 2006 03:08 AM

Many people on the Internet have experienced strange things with those "xxxxx.phx.gbl"-domains, as part of sender-headers in email.

What concerns some of my friends, my wife and me much more is the fact that chat messages using the MSN-messenger protocol run through .phx.gbl domains in realtime, some kind of gateways, we think. It isn't a matter of the chat software you're using (i.e. Miranda or the original MSN-messenger client or some other hybrid chat tools), that's for sure, as well as the fact that those strange things started happening in late 2005. The earliest post I've found about this topic is from November 2005, and I have researched this very, very carefully (google, alltheweb, yahoo, several meta-searchengines) etc.

.PHX.GBL or just .GBL (in this special case) doesn't seem to be a valid top level domain (TLD) (!)

There's just quite a lot of people out there, who are asking questions about it, since a couple of months. I have found some postings that assume .phx.gbl to have something to do with spiders, msn-bots or spam .. (?) Many have tried to trace back .gbl-origins, just got errors and servers, belonging to the .ARPA network as results in between themselves and .phx.gbl

Whatever hides behind those domain(s), having its origin somewhere in the 64.-IP range, I find it should be clearly described what this is good for, why it is used, and where and when exactly it has been registered.

Is .gbl a short version of "global"?

Who is the owner? MSN or some contractor?

Why hasn't ANYBODY in the world written longer articles on the Internet about this already, not in German, English, French or Spanish? .. As I have mentioned: I checked this out carefully.

If there's no satisfying answers to all this SOON I'm going to set up a whole website about this and write at least 100 articles, starting on the world's largest online newspaper-forums in 4 different languages. Feel free and email me if you found out more.

Greets,
Steven. (May 20, 2005)

From: KenCinJapan
Date: June 28, 2006 07:19 PM

Interesting post!! I bumped into it when trying to find out what PHX.GBL was. Like the person that commented, I am surprised by the lack of info out there on this.

From: menneke
Date: July 2, 2006 01:17 PM

I'm really getting intrigued by this phx.gbl too. The earliest reference I found on the net is from a posting on http://www.experts-exchange.com/Networking/Q_21482770.html, but that is a paid service so I didn't get to see any answers. As you all found out already the domain seems to be used a lot in references to newsgroup and mailing list articles.

In my web log files the domain sticks out because it has approximately the same number of page accesses as hits, a typical sign of a robot/spider.

From: OtisB
Date: July 13, 2006 03:26 PM

I've also seen phx.gbl in message ids of messages sent through hotmail, eg:

Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC;
	 Thu, 13 Jul 2006 02:33:47 -0700
Message-ID: <BAY103-F4F8E2BA80472462866982AC6E0@phx.gbl>
Received: from nnn.nnn.nnn.nnn by by103fd.bay103.hotmail.msn.com with HTTP;
	Thu, 13 Jul 2006 09:33:43 GMT

From: Nemo Noman
Date: September 6, 2006 06:33 AM

I track visits to my website pages by IP on a page by page basis. A number of phx.gbl IPs record views, usually at a rate of one page per second. But the activity doesn't seem to be crawling. No following links, nor visiting new pages as opposed to old.

Glad to know I'm not the only one puzzled by this MS domain weirdness.

From: meinhard
Date: September 24, 2006 11:34 AM

hello phoenix dot global hunters, ;)

this is from the access_log of a new domain registered ten days ago:

bl1sch4061111.phx.gbl - - [24/Sep/2006:03:41:40 +0200] "GET /robots.txt HTTP/1.0" 200 - "-" "msnbot-media/1.0 (+http://search.msn.com/msnbot.htm)"
bl1sch4061111.phx.gbl - - [24/Sep/2006:03:41:41 +0200] "GET / HTTP/1.0" 200 11657 "-" "msnbot-media/1.0 (+http://search.msn.com/msnbot.htm)"

..a crawler it is alright, but it does not follow any of the links and this was the only visit so far. about the odd tld, maybe ms corp is trying to introduce a new tld by itself?

what happened to steven's media campaign? i proposed that the issue should be included in wikipedia: http://en.wikipedia.org/wiki/Talk:Alternative_DNS_root

cheers

From: ErwinvA
Date: October 13, 2006 09:13 PM

It looks like Microsoft has been toying with their DNS Resolving :-)

For Example, when I'm logged in to MSN I see this in my netstat:

TCP erwin:4484 by2msg2204912.phx.gbl:1863 ESTABLISHED

When I ask a 'netstat -an' I get this:

TCP 192.168.123.200:4484 207.46.111.86:1863 ESTABLISHED

When I lookup the IP via nslookup I get this:

Name: by2msg2204912.phx.gbl
Address: 207.46.111.86

My question remains; why? Why would Microsoft do this? It also comes back in every Hotmail E-mail header..

Greets,
Erwin

From: Kenyon [TypeKey Profile Page]
Date: October 26, 2006 07:46 PM

Looks like they're changed now, but I still see phx.gbl in hotmail email headers.
113.8.4.64.in-addr.arpa domain name pointer livebot-64-4-8-113.search.live.com.

From: James
Date: November 20, 2006 10:16 AM

I've seen this too - caught MSN Messenger connecting to one of these babies:

$ host 207.46.108.36
Name: by1msg4176104.phx.gbl
Address: 207.46.108.36

$ whois 207.46.108.36

OrgName: Microsoft Corp
OrgID: MSFT
Address: One Microsoft Way
City: Redmond
StateProv: WA
PostalCode: 98052
Country: US

NetRange: 207.46.0.0 - 207.46.255.255
CIDR: 207.46.0.0/16
NetName: MICROSOFT-GLOBAL-NET

etc. etc.

From: hermes lopez
Date: November 22, 2006 07:00 PM

I think the reason is to make it harder to block their addresses, in order to control their services like msn messenger.

From: Orb
Date: November 28, 2006 07:20 PM

Here's the line from my netstat...
TCP puter:1413 by2msg2204708.phx.gbl:1863 ESTABLISHED

This occurs whenever I login to MSN Messenger. I also get connected to..

TCP puter:1440 host-65-119-205-137.tvpath.com:http CLOSE_WAIT

Which was causing me some concern because who is tvpath.com. I checked whois.sc and don't understand the connection between tvpath and microsoft messenger but whenever i connected with messenger it wants to connect. so i entered 127.0.0.1 tvpath.com in the host file. see if that prevents it from connecting.

this is quite the ???

From: Orvago
Date: December 4, 2006 09:03 PM

Hi, i am looking into the phx.gbl domain too.

Just for reference, http://www.experts-exchange.com/Networking/Q_21482770.html doesn't give any answer. They used to have good answers, but here they ended up asserting it was fake, as there isn't a root record for this domain.

There is very little information about it. Yours is one of the few. Thank you. Worse, there's a lot of junk in searches, as it appears in mail headers on unrelated topics.

One answer suggested on another page was that it is a fake TLD used on the internal msn network (thus no resolving from outside). However, assuming that's right, i wonder why they chose this domain, instead of another much clearer and shorter one like .msn

And if .gbl means global, what does phl stand for? People Hating Love? If using a nonexistent TLD, why make it a second level under gbl? Might be that they want to create it using the method Microsoft has used for so many standards: do it, then present it as standard ? Would be quite odd...

PS: Please send me the news / that website.

From: Rockdrala
Date: January 19, 2007 06:17 PM

These addresses are being used by Turkish Hackers to hack websites. I have confirmation, even managed to get the root of one of the hackers and now have his msn addy, I am contacting microsoft as we speak. These addresses do not belong to microsoft, I'm assuming these script kiddies have their own lil phone system rig, thus the oddball pbx naming convention.

Beware of BodyguarD@msn.com

found on ips

by1msg4082317.phx.gbl:1863
by1msg4276308.phx.gbl.1863

by2msg2132816.phx.gbl:1863
by1msg4082115.phx.gbl:1863
by1msg4276308.phx.gbl:1863

using port 1863 i was able to confirm the turkish hacker.

He uses a cookie sploit to access php sql driven sites to insert his own admin information. While the table was easy enough to truncate and remake a super admin, he's affected over 40 websites.
