It's a new year and time for some dumb data analysis.
The most interesting thing to me this year is that most of the traffic
to this site and my other sites (notably epcostello.net) comes from
automated agents: search engines, random web crawlers, and SEO link injectors.
| 155127 | 200 |
| 64662 | 304 |
| 11400 | 301 |
| 6352 | 302 |
| 5705 | 404 |
| 5308 | 202 |
| 2784 | 401 |
| 1609 | 405 |
| 126 | 400 |
| 63 | 414 |
| 31 | 500 |
| 30 | 403 |
| 3 | 501 |
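A tally like the one above can be produced with a few lines of scripting. The following is only a minimal sketch, not the script actually used for this post: it assumes the server writes Apache/NCSA "combined" log lines, and the names `LOG_RE`, `count_statuses` and `format_rows` are made up for illustration.

```python
import re
from collections import Counter

# Assumption: Apache/NCSA "combined" log lines, e.g.
# 1.2.3.4 - - [01/Jan/2008:00:00:01 -0800] "GET / HTTP/1.1" 200 512 "-" "agent"
LOG_RE = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) ')


def count_statuses(lines):
    """Tally requests per HTTP status code, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.match(line)
        if match:
            counts[match.group("status")] += 1
    return counts


def format_rows(counts):
    """Render the tally as '| count | status |' rows, largest count first."""
    return [f"| {n} | {status} |" for status, n in counts.most_common()]
```

Passing an open log file to `count_statuses` and printing each row from `format_rows` gives output in the same shape as the table above.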
| 4551 | 66.249.73.200 | crawl-66-249-73-200.googlebot.com |
| 2234 | 81.52.143.16 | natcrawlbloc03.net.m1.fti.net |
| 1890 | 81.52.143.15 | natcrawlbloc01.net.m1.fti.net. |
| 1492 | 64.152.34.36 | jfk-lv3-n4.panthercdn.com |
| 1380 | 38.99.203.110 | Panscient_Data_Services.demarc.cogentco.com |
| 991 | 128.194.135.94 | web-crawler.irl.cs.tamu.edu |
| 885 | 216.240.154.103 | |
| 884 | 66.249.73.148 | crawl-66-249-73-148.googlebot.com |
| 800 | 64.92.162.210 | |
| 773 | 72.30.177.225 | wm509310.inktomisearch.com. |
| 21547 | /robots.txt |
| 8030 | /favicon.ico |
| 6801 | /articles/2005/12/27/a_practically_u/ |
| 6062 | /articles/nav-commenters.gif |
| 6055 | /g/Google_logo_transparent.png |
| 5582 | /d/4/js/ajax/ |
| 5290 | /202/2006/06/disabling_trackbacks_in_movabl/ |
| 4445 | / |
| 4217 | /g/feed-icon-16x16.png |
| 3993 | /g/by-sa-3.0-88x31.png |
| 21547 | /robots.txt |
| 6801 | /articles/2005/12/27/a_practically_u/ |
| 5290 | /202/2006/06/disabling_trackbacks_in_movabl/ |
| 4445 | / |
| 2812 | /202/ |
| 2664 | /202/2006/12/google_reader_annoyances/ |
| 2121 | /202/2006/10/social-bookmarking-and-attention/ |
| 1278 | /202/2006/11/bloglines_new_features_playlis/ |
| 912 | /articles/ |
| 890 | /202/2006/07/yet_another_spam_retaliation_t/ |
| 2353 | crawl-66-249-73-200.googlebot.com | [66.249.73.200] | |
| 1404 | natcrawlbloc03.net.m1.fti.net | [81.52.143.16] | |
| 1192 | natcrawlbloc01.net.m1.fti.net | [81.52.143.15] | |
| 770 | wm509310.inktomisearch.com | [72.30.177.225] | |
| 736 | ct501085.crawl.yahoo.net | [74.6.86.230] | |
| 688 | wm509458.inktomisearch.com | [74.6.74.202] | |
| 498 | crawl-66-249-73-148.googlebot.com | [66.249.73.148] | |
| 496 | livebot-65-55-213-74.search.live.com | [65.55.213.74] | |
| 491 | wm508816.inktomisearch.com | [74.6.69.173] | |
| 342 | lj512274.crawl.yahoo.net | [74.6.19.77] | |
| 342 | wm511001.inktomisearch.com | [72.30.252.135] | |
| 327 | natcrawlbloc02.net.s1.fti.net | [193.252.149.15] | |
| 288 | lm502044.crawl.yahoo.net | [72.30.226.173] | |
| 262 | ct501101.crawl.yahoo.net | [74.6.86.207] | |
| 233 | 67.110.56.45.ptr.us.xo.net | [67.110.56.45] | |
| 224 | crawl-66-249-73-132.googlebot.com | [66.249.73.132] | |
| 208 | wm511565.inktomisearch.com | [72.30.226.209] | |
| 206 | c02.entireweb.com | [89.150.197.130] | |
| 199 | wm509426.inktomisearch.com | [74.6.75.46] | |
| 197 | ip67-95-51-86.z51-95-67.customer.algx.net | [67.95.51.86] | |
Last time robots.txt changed: 23 March 2007
| 97488 | "-" |
| 533 | "http://neworder.box.sk/forum.php?page=last&did=multSecurity%20and%20Networking&thread=251392" |
| 244 | "http://www.zenatode.org.uk/ian/internet/hotmail.xhtml" |
| 212 | "http://my.yahoo.com/" |
| 142 | "http://www.google.com/search?hl=en&q=phx.gbl" |
| 122 | "http://www.stumbleupon.com/refer.php?url=http%3A%2F%2Fartific.com%2Farticles%2F2005%2F12%2F27%2Fa_practically_u%2F" |
| 117 | "http://www.google.com/search?q=phx.gbl&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a" |
| 88 | "http://www.google.com/search?hl=en&q=phx.gbl&btnG=Google+Search" |
| 68 | "http://neworder.box.sk/forum.php?did=multSecurity%20and%20Networking&thread=251392" |
| 55 | "http://www.dslreports.com/shownews/DNS-Hacks-Phishing-20-90182" |
Internal referrers and obviously junk referrers have been filtered out.
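The filtering step could look something like the sketch below. The post does not say which hosts or patterns were actually excluded, so `INTERNAL_HOSTS` and `JUNK_KEYWORDS` are assumptions chosen for illustration, as are the function names. Requests with no referrer (`"-"`) are kept, since they appear in the table above.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical filter lists: the real ones are not given in the post.
INTERNAL_HOSTS = {"artific.com", "www.artific.com",
                  "epcostello.net", "www.epcostello.net"}
JUNK_KEYWORDS = ("phentermine", "tramadol", "casino")


def is_external(referrer):
    """Return True for referrers worth counting: not internal, not junk."""
    if referrer == "-":
        return True  # no referrer was sent; these are kept in the tally
    try:
        host = (urlsplit(referrer).hostname or "").lower()
    except ValueError:
        return False  # malformed URL, treat as junk
    if not host or host in INTERNAL_HOSTS:
        return False
    lowered = referrer.lower()
    return not any(word in lowered for word in JUNK_KEYWORDS)


def top_referrers(referrers, limit=10):
    """Count the referrers that pass the filter, most frequent first."""
    return Counter(r for r in referrers if is_external(r)).most_common(limit)
```

A keyword list is a blunt instrument; it is used here only because it is easy to read, and a real filter would probably need hand-tuning against the log.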
| 1474 | q=phx.gbl |
| 260 | q=phx.gbl" |
| 122 | q=.phx.gbl |
| 107 | q=phx.gbl%3A1863 |
| 98 | q=phx%2egbl" |
| 76 | q=crawler.bloglines.com |
| 62 | q=phx.gbl+domain |
| 59 | q=gbl+domain |
| 53 | q=gbl+tld |
| 42 | q=%22phx.gbl%22 |
phx.gbl is a pseudo-domain used by Microsoft for a variety of services. I wrote about it in "On the Importance of Reverse DNS", which I now realize is still using the previous design system for this site.
| 1474 | q=phx.gbl |
| 260 | q=phx.gbl" |
| 122 | q=.phx.gbl |
| 107 | q=phx.gbl%3A1863 |
| 62 | q=phx.gbl+domain |
| 42 | q=%22phx.gbl%22 |
| 30 | q=.phx.gbl%3A1863 |
| 29 | q=@phx.gbl |
| 26 | q=%40phx.gbl |
| 23 | q=phx.gbl+netstat |
| 22 | q=phx.gbl+1863 |
| 22 | q=.phx.gbl" |
| 19 | q=phx.gbl%3A1863" |
| 18 | q=by2msg2204708.phx.gbl |
| 17 | q=what+is+phx.gbl |
| 15 | q=netstat+phx.gbl |
| 15 | q=by1msg4176104.phx.gbl |
| 14 | q=phx.gbl+msn |
| 13 | q=by2msg2204912.phx.gbl |
| 13 | q=%22phx.gbl%22" |
| 76 | q=crawler.bloglines.com |
| 38 | q=artific |
| 16 | q=infobackground |
| 15 | q=ed+costello |
| 15 | q=207.46.108.36 |
| 15 | q=202+Accepted |
| 13 | q=crawler.bloglines.com" |
| 13 | q=207.46.111.86 |
| 13 | q=202+accepted |
| 11 | q=spam+retaliation |
| 11 | q=InfoBackground |
| 10 | q=google+reader+rename+folder |
| 10 | q=Reverse+DNS |
| 9 | q=kb05474 |
| 9 | q=importance+of+reverse+dns |
| 9 | q=crawler.bloglines.com+ |
| 8 | q=nokia+espionage |
| 8 | q=iab+ad+units |
| 8 | q=hotmail+reverse+dns |
| 7 | q=tvpath.com |
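Search terms like these can be pulled out of the referrer URLs with a URL parser. This is a sketch under the assumption that the search engine puts the query in a `q` parameter; `search_query` and `top_queries` are illustrative names, not the code behind the tables above. Unlike those tables, which show the raw encoded text, this version decodes each value, so `q=phx.gbl%3A1863` is counted as `phx.gbl:1863` and `q=%22phx.gbl%22` as `"phx.gbl"`.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def search_query(referrer):
    """Return the decoded q= value from a referrer URL, or None if absent."""
    try:
        query = urlsplit(referrer).query
    except ValueError:
        return None  # malformed URL
    values = parse_qs(query).get("q")
    return values[0] if values else None


def top_queries(referrers, limit=20):
    """Count decoded search terms across referrers, most frequent first."""
    found = (search_query(r) for r in referrers)
    return Counter(q for q in found if q is not None).most_common(limit)
```

Matching is case-sensitive here, so `202 Accepted` and `202 accepted` would still be counted separately, as they are above.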
In March 2007 I wrote my own trackback endpoint in PHP, which logs all of the trackback data to a file instead of beating up my Movable Type installation and MySQL database.
Trackbacks Received since 21 March 2007: 10258
Number of Valid Trackbacks: 0
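The idea of a log-only endpoint can be sketched briefly. The original is PHP and is not shown in this post; what follows is a Python illustration with hypothetical names (`log_trackback`, `OK_RESPONSE`) and an assumed JSON-lines log format. The field names (`url`, `title`, `excerpt`, `blog_name`) and the XML reply follow the TrackBack specification. Replying with success to every ping is an assumption too; the post does not say what the real endpoint returns.

```python
import json
import time

# Success reply defined by the TrackBack specification.
OK_RESPONSE = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    "<response>\n<error>0</error>\n</response>\n"
)


def log_trackback(fields, remote_addr, log_path):
    """Append one trackback ping to log_path as a JSON line; return the XML reply.

    `fields` is the decoded POST form data as a dict. Nothing touches the
    blog software or its database; the ping is only written to a flat file.
    """
    record = {
        "received": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "remote_addr": remote_addr,
        "url": fields.get("url", ""),
        "title": fields.get("title", ""),
        "excerpt": fields.get("excerpt", ""),
        "blog_name": fields.get("blog_name", ""),
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return OK_RESPONSE
```

With one record per line, tallies by sender address or by title reduce to reading the file back and counting.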
| 447 | 207-234-131-237.ptr.primarydns.com | [207.234.131.237] | |
| 156 | movinglabs.com | [195.242.99.80] | |
| 150 | u15250532.onlinehome-server.com | [74.208.14.63] | |
| 144 | 218.189.232.72.static.reverse.ltdomains.com | [72.232.189.218] | |
| 124 | [206.123.73.15] | [206.123.73.15] | |
| 122 | server.camelotwealthcreation.com | [69.50.210.8] | |
| 113 | giantlogic.net | [208.101.35.52] | |
| 99 | 89-149-195-161.internetserviceteam.com | [89.149.195.161] | |
| 96 | u15251680.onlinehome-server.com | [74.208.14.215] | |
| 95 | 210.219.232.72.static.reverse.ltdomains.com | [72.232.219.210] | |
| 169 | "Tramadol." |
| 151 | "Phentermine." |
| 119 | "Xanax." |
| 94 | "Cialis." |
| 62 | "Lexapro." |
| 56 | "Ephedra." |
| 52 | "Valium." |
| 52 | "Ultram." |
| 47 | "Zoloft." |
| 43 | "Ambien." |
| 42 | "Fioricet." |
| 37 | "Percocet." |
| 37 | "Cheapphentermine." |
| 37 | "Adderall." |
| 34 | "Soma." |
Copyright 2002–2008 Artific Consulting LLC.
Unless otherwise noted, content is licensed for reuse under the Creative Commons Attribution-ShareAlike 3.0 License.
Please read and understand the license before repurposing content from this site.