Differential Privacy

Privacy by the Numbers: A New Approach to Safeguarding Data

In 1997, when Massachusetts began making health records of state employees available to medical researchers…William Weld, then the governor, assured the public that identifying individual patients in the records would be impossible.

Within days, an envelope from a graduate student at the Massachusetts Institute of Technology arrived at Weld’s office…[Latanya Sweeney](http://latanyasweeney.org/work/index.html) was able to pinpoint Weld’s records.

Differential privacy focuses on information-releasing algorithms, which take in questions about a database and spit out answers—not exact answers, but answers that have been randomly altered in a prescribed way. When the same question is asked of a pair of databases (A and B) that differ only with regard to a single individual (Person X), the algorithm should spit out essentially the same answers.
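To make the idea of "randomly altered" answers concrete, here is a minimal sketch of one common way to build such an algorithm: adding Laplace noise to a counting query. The article excerpt does not name a specific mechanism, so the choice of the Laplace mechanism, the parameter `epsilon`, and every name below (`noisy_count`, `database_a`, `database_b`, `person_x`) are assumptions made for illustration. The two toy databases differ only by Person X, so their true counts differ by exactly one, and the noise is scaled to blur a difference of that size.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon, rng):
    # Hypothetical helper: answer "how many records match?" with noise added.
    # Adding or removing one person changes a count by at most 1, so the
    # noise scale is 1 / epsilon (smaller epsilon means more noise).
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


def has_condition(record):
    return record["has_condition"]


# Toy data (invented): database B is database A plus one extra person, X.
database_a = [
    {"name": f"person_{i}", "has_condition": i % 3 == 0} for i in range(1000)
]
database_b = database_a + [{"name": "person_x", "has_condition": True}]

if __name__ == "__main__":
    rng = random.Random()
    epsilon = 0.5  # assumed privacy parameter, chosen only for the demo
    print("answer on A:", noisy_count(database_a, has_condition, epsilon, rng))
    print("answer on B:", noisy_count(database_b, has_condition, epsilon, rng))
```

Because the noise is typically larger than the one-person gap between the two true counts, a single answer gives little evidence about whether Person X is in the database, while the answer stays close enough to the truth to be useful for aggregate questions.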

Privacy is a nonrenewable resource [] once it gets consumed, it is gone.
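One way to picture privacy being "consumed" is a budget that each answered query draws down. The sketch below assumes the usual accounting in which each query carries a privacy cost `epsilon` and the costs of successive queries add up; the class name `PrivacyBudget` and its methods are hypothetical, not taken from the article or from any particular library.

```python
class PrivacyBudget:
    """Hypothetical tracker for a fixed, non-refillable privacy budget."""

    def __init__(self, total_epsilon):
        self.remaining = total_epsilon

    def spend(self, epsilon):
        # Each answered query uses up part of the budget; the costs add up.
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exhausted; query refused")
        self.remaining -= epsilon
        return self.remaining


if __name__ == "__main__":
    budget = PrivacyBudget(total_epsilon=1.0)  # assumed total, for the demo
    budget.spend(0.5)  # first query
    budget.spend(0.5)  # second query uses the rest
    try:
        budget.spend(0.1)  # nothing left, so this query is refused
    except RuntimeError as error:
        print(error)
```

There is deliberately no method for topping the budget up: once the total is spent, further questions about the same data go unanswered.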

The question of [] acceptable privacy loss is ultimately a problem for society, not for computer scientists—and each person may give a different answer. And although the prospect of putting a price on something as intangible as privacy may seem daunting, a relevant analog exists.

There’s another resource that has the same property—the hours of your life [] — once you use them, they’re gone. Yet because we have a currency and a market for labor, as a society we have figured out how to price people’s time. [I]magine the same thing happening for privacy.