Assertivenet Spider is Gigablast Spider
Introduction and Purpose
The purpose of this article is to provide evidence and information to counter the suggestion that Assertivenet is possibly used for malicious purposes.
On Saturday, March 11, 2006, I received a moderately urgent phone call from a client of mine, Hibiscus Florals (www.hibiscusflorals.com). The owner, Mark Morkowski, was concerned because he had been reviewing his website traffic statistics and had noticed that at numerous points throughout the day, a user or spider from “ASSERTIVENET” (IP 220.127.116.11) had visited the Hibiscus website.
Since this was rather unusual, Mark elected to investigate further by searching for more information on “Assertivenet” via the Google search engine. The first three results that he found appear below:
From this information, Mark and I gathered that the owner of the spider in question appeared to be a company called Assertive Networks, hosted by a company called “BC Hosting.” More information was not immediately available.
It is this lack of information that likely led some of the members of the PowerBASIC forums to block the IP range 66.154.* from accessing their various websites, and justifiably so. But this same lack of information led to further questions:
- What files was the Assertivenet spider accessing or trying to access? Was the spider crawling pages or, like some bots, was it looking for specific files that could be used for malicious purposes (e.g. files and scripts that could be manipulated for website attacks)?
- Why is the apparent owner of the Assertivenet spider a web hosting company (BC Hosting)?
- What is the intended purpose of the Assertivenet spider?
Additional Research – All Is Not As It Appears
At this point, I decided to look beyond what the website traffic statistics showed and what Mark’s initial search turned up. I needed to start by answering the questions I posed earlier, and in order to do so, I needed to access the raw log files for the Hibiscus website.
I opened the log files, searched for the specific IPs in question, and found a series of entries similar to these:
2006-03-11 03:47:34 18.104.22.168 - 22.214.171.124 80 GET /robots.txt - 200 0 400 285 78 HTTP/1.0 http://www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -
2006-03-11 03:47:34 126.96.36.199 - 188.8.131.52 80 GET /larger_image.asp PID=215 200 0 0 299 125 HTTP/1.0 http://www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -
2006-03-11 03:50:37 184.108.40.206 - 220.127.116.11 80 GET /larger_image.asp PID=195 200 0 0 299 31 HTTP/1.0 http://www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -
2006-03-11 07:47:05 18.104.22.168 - 22.214.171.124 80 GET /robots.txt - 200 0 400 285 78 HTTP/1.0 http://www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -
2006-03-11 07:47:05 126.96.36.199 - 188.8.131.52 80 GET /larger_image.asp PID=219 200 0 0 299 109 HTTP/1.0 http://www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -
The spider in this case actually belongs to a search engine called Gigablast, and is aptly named the Gigabot, as the user agent string in each log entry shows. The Gigabot only crawled pages and files, as other search engines do, and made no attempts whatsoever to access files and scripts of a known malicious nature.
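For readers who want to repeat this kind of check on their own logs, here is a minimal Python sketch that tallies the user agents and requested paths in entries like the ones shown. The field positions are an assumption inferred from the sample entries (space-separated, 18 fields, path in the eighth field and user agent in the seventeenth); they are not taken from any documented log configuration, and the function names are purely illustrative.

```python
from collections import Counter

# Two entries copied from the log excerpt in this article. The field layout
# used below is an assumption inferred from these entries.
SAMPLE_LOG = """\
2006-03-11 03:47:34 18.104.22.168 - 22.214.171.124 80 GET /robots.txt - 200 0 400 285 78 HTTP/1.0 http://www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -
2006-03-11 03:47:34 126.96.36.199 - 188.8.131.52 80 GET /larger_image.asp PID=215 200 0 0 299 125 HTTP/1.0 http://www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -
"""


def parse_entry(line):
    """Split one log line into named fields; return None if it looks malformed."""
    fields = line.split()
    if len(fields) < 18:
        return None
    return {
        "date": fields[0],
        "time": fields[1],
        "method": fields[6],
        "path": fields[7],
        "query": fields[8],
        "status": fields[9],
        "user_agent": fields[16],
    }


def summarize(log_text):
    """Count how often each user agent and each requested path appears."""
    agents = Counter()
    paths = Counter()
    for line in log_text.splitlines():
        entry = parse_entry(line)
        if entry is None:
            continue  # skip blank or unexpected lines
        agents[entry["user_agent"]] += 1
        paths[entry["path"]] += 1
    return agents, paths


if __name__ == "__main__":
    agents, paths = summarize(SAMPLE_LOG)
    print("User agents:", dict(agents))
    print("Requested paths:", dict(paths))
```

A summary like this makes it easy to see at a glance whether a visitor identifies itself as a known crawler and whether it is requesting ordinary pages or probing for files that should not be there. Keep in mind that a user agent string is supplied by the client and can be forged, so it is best treated as one piece of evidence rather than proof.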
Gigablast is a “Tier 2” search engine that has over 1,000,000,000 pages indexed as of the date of this article (March 13, 2006). While it is not on the same level in terms of popularity as the Big Three of Yahoo!, MSN, and Google, it has indexed a significantly large portion of the web, and can be useful for some searches. In particular, Gigablast has implemented a “Giga bits” feature whereby alternate searches are suggested based on the user’s original query in order to help narrow the query down and provide better relevancy.
I performed additional research and discovered that some IP addresses from the 66.154.* IP block do resolve to gigablast.com, e.g.:
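A check of this kind can be sketched in Python using a reverse DNS lookup. This is only an illustration under stated assumptions: it reads “66.154.*” as the CIDR block 66.154.0.0/16, the helper names are hypothetical, and what any given address resolves to today may differ from what it resolved to in March 2006.

```python
import ipaddress
import socket

# Assumption: "66.154.*" is interpreted as the CIDR block 66.154.0.0/16.
SPIDER_RANGE = ipaddress.ip_network("66.154.0.0/16")


def in_spider_range(ip):
    """Return True if the address falls inside the 66.154.* block."""
    return ipaddress.ip_address(ip) in SPIDER_RANGE


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the hostname an IP resolves to, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:  # covers socket.herror and socket.gaierror
        return None
    return hostname


def resolves_to_gigablast(ip, resolver=socket.gethostbyaddr):
    """Return True if the reverse lookup yields gigablast.com or a subdomain of it."""
    hostname = reverse_lookup(ip, resolver)
    if hostname is None:
        return False
    hostname = hostname.rstrip(".").lower()
    return hostname == "gigablast.com" or hostname.endswith(".gigablast.com")
```

The `resolver` parameter exists only so the lookup can be swapped out for testing; in normal use, `resolves_to_gigablast(ip)` performs a live lookup through `socket.gethostbyaddr`. Matching on the exact domain or a `.gigablast.com` suffix, rather than a plain substring, avoids being fooled by a lookalike hostname.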
Conclusion – The Gigabot is Safe
As you may well have gathered by now, the Gigabot is a perfectly safe spider that acts and operates in the same manner as other search engine spiders. There is no reason at this time to block the 66.154.* IP range that the bot uses; if anything, webmasters would gain from the potential free traffic that Gigablast would generate for their websites as a result of the Gigabot’s efforts.