August 24, 2012
Facebook Data Growing At 500 Terabytes Per Day
redOrbit Staff & Wire Reports - Your Universe Online
Facebook processes some 2.5 billion pieces of content and more than 500 terabytes of data every day, the social networking company said during a briefing with technology reporters on Wednesday at its Menlo Park, California headquarters.
“The world is getting hungrier and hungrier for data,” said Jay Parikh, Facebook's Vice President of Engineering.
“Big data really is about having insights and making an impact on your business. If you aren't taking advantage of the data you're collecting, then you just have a pile of data, you don't have big data.”
Facebook also gave reporters the first details on its new Project Prism, TechCrunch reported.
Facebook currently stores its entire user database in a single data center, with others employed for redundancy and other data. Whenever the main chunk of data gets too large for one data center, it has to move all of it to another place that's been enlarged to accommodate it, something that wastes valuable resources.
But Project Prism “lets us take this monolithic warehouse … and physically separate [it] but maintain one view of the data,” Parikh said.
This means the live dataset can be broken up and hosted across Facebook's data centers throughout the U.S. and in Sweden.
Facebook said it does not partition data internally or create barriers between different business units. Instead, it keeps the data in one place for easy access, which allows product developers to look at data across departments so they can launch new products, interpret user reactions and alter designs in near real time.
This also means an engineer who wants to identify trends or find statistics can easily obtain the data, write code and get results, Parikh said.
This keeps the creation and improvement of Facebook features as fast as possible.
For Facebook users who might be concerned about the company's employees being able to look into members' personal activity so easily, Parikh stressed that the company has a zero-tolerance policy when it comes to any abuse from this comprehensive access. Furthermore, access to users' data is logged and strictly monitored, and only those working on building products that require data access can view this data. There is also an intensive training process around acceptable use, Parikh said.
Any employee who pries where they're not supposed to is fired, he said.
“We have a zero-tolerance policy.”