The home page of OpenOffice.org, the well-known Microsoft Office competitor, is missing from Microsoft’s Bing search engine. While it sounds suspicious, the problem has nothing to do with Bing itself — it’s a technical problem on OpenOffice.org’s end.
Ian McAnerin noticed earlier today that OpenOffice.org doesn’t show up in Bing on searches for [open office] and [openoffice.org]. He wonders if Bing is “allowing its results to be unduly influenced by either money or corporate policy.” But further digging, with some help from SEL’s Vanessa Fox, shows that’s not the case.
To be clear: Pages from the openoffice.org domain do show up in Bing — a [site:openoffice.org] search proves that. But the home page itself is nowhere to be found.
The problem appears to be a technical misconfiguration on the openoffice.org servers, and it affects Yahoo’s index as well as Bing’s. When you navigate to openoffice.org as a user, you see the home page as you should. If you change the user agent to Googlebot (Vanessa used the User Agent Switcher Firefox plugin), you see the same nicely rendered home page.
However, if you change the user agent to either MSNbot or Yahoo Slurp, you see a 403 access denied error.
You can see this more clearly in the HTTP response from the server (using a tool such as the Live HTTP Headers Firefox plugin). Accessing the page as Googlebot returns the following (shortened for space; note the 304 rather than 200 response simply because Vanessa had visited the page before as Googlebot):
Host: www.openoffice.org
GET / HTTP/1.1
Host: openoffice.org
User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
If-Modified-Since: Mon, 29 Mar 2010 12:58:51 GMT
HTTP/1.0 304 Not Modified
Whereas accessing the page as MSNbot looks like this:
http://www.openoffice.org/
GET / HTTP/1.1
Host: openoffice.org
User-Agent: msnbot/1.1 (+http://search.msn.com/msnbot.htm)
HTTP/1.0 403 Forbidden
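The same comparison can be run without a browser plugin. What follows is a minimal sketch in Python, not the method Vanessa used: it requests a URL once per user agent and records the status code that comes back. The two user-agent strings are copied from the headers quoted above; the function names (status_for, compare_user_agents, blocked_agents) are hypothetical and chosen only for illustration. Other crawlers, such as Yahoo Slurp, could be added to the dictionary if you know their user-agent strings.

```python
import urllib.error
import urllib.request

# User-agent strings taken from the headers quoted in the article.
USER_AGENTS = {
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "msnbot": "msnbot/1.1 (+http://search.msn.com/msnbot.htm)",
}


def status_for(url, user_agent, timeout=10):
    """Return the HTTP status code the server sends for this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on error statuses such as 403; the code is still
        # exactly the piece of information we are after.
        return error.code


def compare_user_agents(url, user_agents=USER_AGENTS):
    """Map each crawler name to the status code it receives for url."""
    return {name: status_for(url, ua) for name, ua in user_agents.items()}


def blocked_agents(statuses):
    """List the crawler names that received a 403 Forbidden."""
    return sorted(name for name, code in statuses.items() if code == 403)


if __name__ == "__main__":
    statuses = compare_user_agents("http://www.openoffice.org/")
    print(statuses)
    print("Blocked:", blocked_agents(statuses))
```

If one crawler's name shows up in the blocked list while the others get a normal response, the server is treating that user agent differently, which is the situation described here. A server that blocks by IP address rather than user agent would not be caught by this check.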
How did the server get set up this way? Any number of explanations are possible. Sometimes a host notices overactive crawling from particular bots and blocks them. Site owners on shared hosting should always watch for this, because the result is that the site gets dropped from that search engine’s index. In this case, OpenOffice.org likely manages its own servers, but it may not be blocking Microsoft and Yahoo on purpose. A piece of server software could easily have been misconfigured.
A Microsoft spokesperson tells us: “We’re reaching out to them now to try and resolve the issue.”