Collecting Cookie Data And Still Protecting Privacy
Enid Burns for redOrbit.com – Your Universe Online
The browser cookie has long been debated as a troubling side effect of the Internet. Privacy advocates and consumers fear that data collected from cookies are used in nefarious ways. However, publishers typically use the data to understand a site’s visitors and target advertising.
A group of cryptology engineers at Saarland University in Germany is the latest to tackle the cookie issue, this time by using a cryptographic method that collects data while preserving privacy.
The problem: “Many website providers are able to collect data, but only a few manage to do so without invading users’ privacy,” explains Aniket Kate from Cryptographic Systems (CrypSys) Research Group at Saarland University, and leader of the research group creating the technology.
The sheer amount of data that providers gather is what makes a new approach necessary. “But this wealth of sensitive information allows them to reconstruct detailed profiles of each individual,” said Kate, in a university statement.
Kate’s group at Saarland University proposes a new way to collect data. The group is presenting its findings this week at the computer expo CeBIT in Hannover. The research is also described in a paper titled “PrivaDA: A Generic Framework for Privacy-preserving Data Aggregation.”
“Our PrivaDA framework is a replacement for the privacy-insensitive server-side user tracking over the Internet. In general, PrivaDA is almost independent of the cookies infrastructure: In particular, a visitor’s data/answers can still be generated by processing cookies in the visitor’s local environment or by directly asking the visitor for his/her answers (using say a webform). The key difference with the current tracking is that the visitor input should not reach the (analytic) server in the plain form. Answers from several visitor must to processed [sic] together in the PrivaDA framework, and only noisy statistical answers should reach the server,” Kate told RedOrbit.
One issue that has held back cryptographic methods in the past is a limit on the depth of data the technology can provide. The PrivaDA system is able to overcome some of those limitations.
PrivaDA is presented as “a novel design framework for distributed differential privacy that leverages recent advances in SMPCs on fixed and floating point arithmetics to overcome the previously mentioned limitations. In particular, PrivaDA supports a variety of perturbation mechanisms (e.g., the Laplace, discrete Laplace, and exponential mechanisms) and it constitutes the first generic technique to generate noise in a fully distributed manner while maintaining the optimal utility. Furthermore, PrivaDA does not suffer from the answer pollution problem and we demonstrate its practicality with a performance evaluation,” the PrivaDA paper’s introduction states.
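To make the Laplace mechanism named in the paper more concrete, here is a minimal Python sketch of a noisy count. It is not PrivaDA's protocol: a single trusted party adds the noise here, whereas the paper describes generating it in a distributed manner. The function names, the epsilon value and the sample answers are assumptions chosen for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential samples with mean
    # `scale` follows a Laplace distribution centered at 0.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(answers, epsilon, rng):
    """Count the True answers and perturb the result with Laplace noise.

    One visitor changes a count by at most 1 (sensitivity 1), so the
    noise scale is 1 / epsilon.
    """
    true_count = sum(1 for answer in answers if answer)
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: 600 of 1,000 visitors answered "yes".
rng = random.Random(0)
answers = [True] * 600 + [False] * 400
released = noisy_count(answers, epsilon=0.5, rng=rng)
```

A smaller epsilon means more noise and stronger privacy; the released value stays close to the true count when many visitors contribute.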
PrivaDA is not the first alternative developed to replace cookies. Several companies are working on device fingerprinting technology, which would allow website publishers to identify users by device, such as a computer, mobile phone or tablet. Several major Internet companies such as Google and Microsoft are working to come up with new technologies that will replace cookies, and address privacy concerns triggered by cookies.
“A variety of factors are conspiring against the continued existence of third-party cookies. Among them changing online privacy rules and norms, the fact that Microsoft, Google and other large publishers are looking for alternatives and the fact that mobile devices don’t support them– are all reasons that everyone is starting to look around for alternatives,” Greg Sterling, principal analyst at Sterling Market Intelligence, told RedOrbit.
Device fingerprinting is already being implemented by some web providers and publishers because it can track users across multiple devices, including mobile devices that cookies cannot track. A report released last fall by KU Leuven-iMinds researchers found that at least 145 of the web’s top 10,000 websites use device fingerprinting. While that number is expected to rise, other technologies, such as those using cryptographic methods, could just as easily come to dominate analytics on the Internet.
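As a rough illustration of the idea behind device fingerprinting, the Python sketch below hashes a set of device attributes into a stable identifier. The attribute names and values are hypothetical, and real fingerprinting scripts gather far more signals; this only shows why no cookie needs to be stored on the device.

```python
import hashlib
import json


def device_fingerprint(attributes):
    """Derive a stable identifier from a dictionary of device attributes."""
    # Sort the keys so the same attributes always serialize the same way.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attributes a site might read from a visiting browser.
visitor = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080",
    "timezone": "Europe/Berlin",
    "fonts": ["Arial", "Helvetica"],
}
fingerprint = device_fingerprint(visitor)
```

The same device yields the same identifier on every visit, which is what lets a publisher recognize it without cookies.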
The new approach developed by the CrypSys Research Group at Saarland University uses cryptography, a method of secure communication that encodes and decodes messages or data, to collect user data and keep it in a secure form. The data can be compiled into a user profile while remaining protected.
“It should be done locally on the user/visitor machine by directly asking queries to the visitor using (say) a web form or by running a local profile creation code that generates user profiles locally,” said Kate. “Instead our distributed system processes several such profiles together and then forwards the noisy statistical information about all visitor profiles to the server. The mechanism we employ to create noisy statistics is called ‘differential privacy’, and our system implement differential privacy mechanism as a secure multi-party computation.”
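The idea of processing several profiles together without any single party seeing one visitor's answer can be sketched with additive secret sharing, a common building block of secure multi-party computation. The Python sketch below is an assumption-laden toy, not the PrivaDA protocol: the modulus, the three computation parties and the function names are illustrative, and the step that adds differential privacy noise before the result is revealed is omitted.

```python
import secrets

PRIME = 2**61 - 1  # modulus for the shares; an arbitrary choice for this sketch


def share(value, n_parties):
    """Split an integer into n random shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def aggregate(all_shares):
    """Each party adds up the shares it received, one per visitor."""
    n_parties = len(all_shares[0])
    return [sum(s[p] for s in all_shares) % PRIME for p in range(n_parties)]


def reconstruct(partial_sums):
    """Combine the parties' partial sums into the overall total."""
    return sum(partial_sums) % PRIME


# Hypothetical yes/no answers from five visitors, split among three parties.
answers = [1, 0, 1, 1, 0]
all_shares = [share(a, 3) for a in answers]
partials = aggregate(all_shares)
total = reconstruct(partials)  # equals the sum of the answers
```

Each party sees only random-looking shares, so an individual answer stays hidden unless all parties collude; only the combined total is ever reconstructed.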
One limitation of the PrivaDA system may be that user data is not pieced together as fully as it is with tracking cookies. Privacy advocates might be pleased about this outcome, but it remains unclear whether publishers and advertisers will accept this method as a replacement for cookies. Data provided by PrivaDA will render individual statistics, but will be less able to provide the profile picture currently available to publishers and advertisers.
“As only the noisy statistics are provided to the server in the end, combining a limited number of data pieces (i.e. statistical outputs) does not reveal any significant amount of user profile to the colluding servers,” Kate said. “Some profile items such as gender, age does not have to answered [sic] each time, while some might have to be answered freshly.”
Advantages of the PrivaDA system include protecting the privacy of site visitors. Publishers and advertisers retain the ability to derive demographic and psychographic data from their analytics platforms, but the data are less tied to the individual site visitor.
“The key advantage of differentially private (noisy) statistics is that they protect individual’s privacy without hampering the useful [sic] of the statistical results. Publishers claim to collect visitors statistics to improve the user experience or the publishers site’s performance, and the noisy statistics suffice for this purpose. Thus, our system serves the purpose of the publisher/analytic firm while still protecting the individual user’s privacy,” said Kate.