15 September 2008
According to Google, TechCrunch predates Attila the Hun
You might not have enjoyed history at school, but I bet you knew enough that if I asked you which came first, TechCrunch or Attila the Hun, you’d at least guess at Attila. But now Google has rewritten history thanks to its News Archive.
Wikipedia tells me that Attila the Hun was born in 406 AD. Google News Archive tells me that TechCrunch was first talked about in 125 AD by a remarkably modern-sounding chap, webmaster-source.com. Apparently (see picture above) people were also talking about TechCrunch in 1849 and 1955.
And so it seems that Google News Archive is broken.
Google News has been in the news a bit recently for getting things wrong. In case you’ve not seen it, last Monday Google News picked up a story about United Airlines filing for bankruptcy. The story was old but had a posting date of September 7th 2008: it had been picked up as that day’s most viewed story (BBC News have a similar feature). Google assumed it was current news, and United Airlines stock fell 11% that day.
But now Google seems to be having even more trouble dating articles. In the example picture I posted above you can see that, because the original author wrote “125×125 ad” in the content, Google has spotted “125 ad” and taken that to be the posting date.
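To show how easily this kind of mistake can happen, here is a minimal sketch of a naive year matcher. I have no idea what Google actually runs, so the pattern and the function name `naive_year_mentions` are purely my own assumptions, made up for illustration.

```python
import re

# Hypothetical pattern: a 1-4 digit number followed by "AD" or "BC",
# matched case-insensitively. A guess at the kind of rule that could
# misfire, NOT Google's actual code.
NAIVE_YEAR = re.compile(r"(\d{1,4})\s*(ad|bc)\b", re.IGNORECASE)


def naive_year_mentions(text):
    """Return every (year, era) pair the naive pattern finds in the text."""
    return [(int(num), era.upper()) for num, era in NAIVE_YEAR.findall(text)]


# A sentence about banner sizes, like the one in the misdated article.
sample = "Place a 125×125 ad in the sidebar."
print(naive_year_mentions(sample))  # [(125, 'AD')]
```

Because the match ignores case and context, the banner size is read as a year. Requiring an upper-case “AD” would already avoid this particular case, although it would not help with ordinary four-digit years.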
Clearly this is a ludicrous thing to do, not least because the author is writing about banner ads! More generally, however, it would be ridiculous to assume an article is historical simply because there is a date in the content. How on earth is Google going to date articles on historical events with a method such as that?!
I’m noticing these errors on plenty of results. Today I searched for ‘Lehman Bros’ and was given an article which Google claimed was from 1850 simply because the article contained the phrase “founded in 1850”.
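A more cautious approach would be to trust an explicit posting date whenever one exists and to treat years in the body text only as a last resort. The sketch below is my own illustration, not anything Google has described: `guess_article_year`, its parameters, and the fallback rule (take the latest year mentioned, on the assumption that an article is rarely older than the most recent year it talks about) are all hypothetical.

```python
import re
from datetime import date
from typing import Optional

# Four-digit years from 1000 to 2099 appearing as whole words.
YEAR_IN_TEXT = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


def guess_article_year(text: str, published: Optional[date] = None) -> Optional[int]:
    """Guess the year an article was written.

    Prefers the explicit posting date. Only if that is missing does it
    fall back to years in the text, and then it picks the latest one
    rather than the earliest (an assumed heuristic, not a proven rule).
    """
    if published is not None:
        return published.year
    years = [int(y) for y in YEAR_IN_TEXT.findall(text)]
    return max(years) if years else None


article = "Lehman Brothers, founded in 1850, filed for bankruptcy in 2008."
print(guess_article_year(article))                       # 2008
print(guess_article_year(article, date(2008, 9, 15)))    # 2008
```

Even this fallback can be fooled, for example by an article that mentions a future year, so it should be treated as a low-confidence guess.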
In case you think I’m misunderstanding what Google News Archive is meant to do, let me quote the ‘About’ section of the help files:
News archive search provides an easy way to search and explore historical archives.
That seems clear to me, so I think it is fair enough to judge the News Archive by this statement. Even if this functionality is intentional, it makes the News Archive far less useful and is at best misleading. I’ve emailed Google to inform them of this, but I doubt they’ll respond.
[...] Original Dave Shaw [...]