Since the launch of the popular microblogging platform, Twitter users have generated a huge pile of content: hundreds of billions of 140-character messages expressing their feelings, experiences, thoughts and views. Finding anything in that archive was a pain until yesterday, when the platform launched the Tweet Index, a searchable database of every public tweet published by its users since 2006.
The new search service is fast and efficient, indexing approximately half a trillion documents and serving queries with a latency of under 100 milliseconds, Yi Zhuang writes on the official Twitter blog. The most important things Twitter’s engineers kept in mind when building the new service were modularity, scalability and cost effectiveness (running such a service can be quite costly, especially as Twitter’s real-time index is stored in RAM for quicker access). They also wanted to keep the interface as simple as possible and to allow for the incremental development of the database. Read the post for all the technical details.
So what does this mean from the average Twitter user’s point of view? With the previous search index, users could search only the tweets of the last few weeks. Usually that is all a user needs, but when it is not (and this happens more often than you would think), they had to scroll back through pages of content to find what they were looking for, and sometimes did not find it at all. The new Tweet Index offers much fuller and richer search results, searching through a collection of hundreds of billions of tweets published by users. The way Twitter has accomplished this is not only efficient but also unobtrusive for users. Searching Twitter will now let users learn more about important events in human history and in their own lives, the result of eight years of hard work by the service’s engineers.