Heartland Payment Systems, a credit card processor, may have had up
to 100 million records exposed to malicious hackers. Payment processors
CheckFree and RBS Worldpay, and employment site Monster.com have all
reported data breaches in recent months, as have universities and
government agencies. Experts at Wharton say that personal data is
increasingly a liability for companies, and suggest that part of the
solution may be minimizing the customer information these companies
keep.

Indeed, according to Wharton marketing professors Eric Bradlow and Peter Fader,
companies should deploy a technique called data minimization. The
concept: Keep the customer data a company needs for competitive
advantage and purge the rest. "I think there is a fear and paranoia
among companies that ... if they don't keep every little piece of
information on a customer, they [can't function]," says Bradlow.
"Companies continue to squirrel away data for a rainy day. We're not
saying throw data away meaninglessly, but use what you need for
forecasting and get rid of the rest."

The problem with the data hoarding approach is that companies can't
use most of the information they keep, adds Fader. Meanwhile, they
become data pack rats, chasing an illusory dream of one-to-one
marketing, which he says is a myth. "The best thing to do is aggregate
information so companies can predict something like, 'Among all people
who bought five times or more, how many times are they likely to buy in
the next year?'"

Fader and Bradlow discussed data minimization concepts when they
presented papers at the recent Wharton Information Security Best
Practices Conference. Their papers illustrate how companies can still
predict customer behavior even if they minimize the customer data they
keep.

However, data minimization isn't a panacea, argues Wharton operations and information management professor Eric Clemons.
Some industries -- such as insurance or credit card companies -- may
need to collect detailed customer data for competitive advantage.
Meanwhile, companies that serve as pack rats for customer data are
focusing on installing better defenses and procedures to protect
it.

"The dominant argument of the day is that more data improves the accuracy of targeting," says Andrea Matwyshyn,
a legal studies and business ethics professor at Wharton. "But there
are additional risks associated with storing that information. More may
not always be better."

Indeed, the cost of a data breach in 2008 was $202 per compromised
record, up 2.5% from $197 per record in 2007, according to the Ponemon
Institute, a Michigan group that researches and consults on privacy and
information security issues. Ponemon's estimates are based on
interviews with companies that have suffered breaches to customer
records that include credit card numbers and, in some cases, personally
identifiable information. Following a data breach, companies often must
hire security consultants, engage legal counsel and offer credit
monitoring services to affected customers. The Institute also found
that companies will lose customers in the year following a data breach.
For example, health care and financial firms lost 6.5% and 5.5% of
customers, respectively, after such incidents.

Fader and Bradlow argue that companies are taking on undue risk to
their reputations by hoarding data with little business benefit. While
companies generally disclose what data they keep in little-read privacy
statements, consumers can still be surprised when breaches occur.
"Companies are actively collecting data without realizing the work
involved," says Fader. "And given how companies are stretched thin,
they can't manage the data correctly. Keeping detailed data is a
blessing and a curse."

What to Keep?

The real challenge for companies is assessing what customer
information they need to retain, says Fader, who adds that firms may be
keeping an excessive amount of data because they can't pinpoint what
they actually want. "Data minimization involves more than just the
data. You can't minimize data until you know what to do with it. What
data elements do you need to predict customer behavior?"

The inability to answer those tough questions, says Bradlow, could
be one reason why companies default to storing as much data as possible
-- not the best strategy when it's clear that many companies don't know
what to do with this data even when they have it.

Fader and Bradlow recommend a simple approach to data minimization.
First, companies should figure out what information they need to track
consumer behavior. Then, aggregate that information -- including, for
example, grocery bills, shopping frequencies and e-commerce sales for a
retailer -- over a defined period such as two to four months. With that
aggregated information, a company can create histograms -- graphical
representations of aggregated data -- and throw away the original data.
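
The aggregate-then-purge idea can be sketched in a few lines of Python. This is only an illustration and not the method from the professors' papers: the record layout (customer ID, purchase amount), the function name purchase_histogram and the sample figures are all assumptions made up for the example. It counts how many customers made one purchase, two purchases and so on during a period, keeps only those counts, and discards the customer-level records.

```python
from collections import Counter


def purchase_histogram(transactions):
    """Summarize a period's transactions as a purchase-frequency histogram.

    transactions: iterable of (customer_id, amount) pairs. This layout is a
    hypothetical one chosen for the example.
    Returns a dict mapping "number of purchases" to "number of customers".
    """
    # Step 1: count purchases per customer for the period.
    purchases_per_customer = Counter(customer_id for customer_id, _ in transactions)
    # Step 2: count how many customers fall into each frequency bucket.
    # The result no longer contains any customer identifiers.
    buckets = Counter(purchases_per_customer.values())
    return dict(sorted(buckets.items()))


# Made-up sample data for one aggregation period.
transactions = [
    ("c1", 20.0), ("c1", 35.5),
    ("c2", 12.0),
    ("c3", 8.0), ("c3", 15.0), ("c3", 40.0),
    ("c4", 5.0),
]

histogram = purchase_histogram(transactions)

# Purge the customer-level records; only the aggregate is retained.
del transactions

print(histogram)
```

A real system would also have to delete the underlying database rows and backups; removing the in-memory list here merely stands in for that step.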

Fader suggests that histograms offer accuracy rates close to
individual targeting -- without the risk. Purging individual
information lowers costs because companies don't have to secure
information in transit, store and analyze data, or navigate a bevy of
regulations across the globe. "Maintaining data warehousing is costly
because the minute you keep data, you have to protect it," says
Bradlow. "Most firms realize they can't do one-to-one targeting, so why
not only keep data that's relevant?"

According to Matwyshyn, the discussion by Fader and Bradlow was an
eye-opener for privacy and legal experts at the Wharton security
conference. What remains to be seen is whether privacy experts,
marketers and security professionals can agree that data minimization
is an important step. "The key is that there is discussion on the
issue," says Matwyshyn. "Marketers and privacy experts may not be as
far apart as people presume."

Fader and Bradlow acknowledge that the argument for data
minimization is only just beginning. For data minimization to become
the norm, a company's management, privacy officers, legal counsel and
marketing team will have to reach consensus on customer data
collection. Legal and privacy experts are likely to support data
minimization, while marketers will argue for keeping all the data they
can collect.

Poaching Profitable Customers

In addition, data minimization practices will vary by industry.
Clemons says that data can be a competitive advantage for many
companies. For instance, Capital One used customer data to better
segment its most profitable customers and poach similar ones from
larger rivals. In this example, customer information led to varied
pricing models -- such as interest rates that varied by customer credit
ratings -- that maximized the profit from the top decile, or 10%, of
customers. "Under the uniform pricing models of the mid-1990s, the
top decile of customers produces 150 times more profit than
average," says Clemons. "Capital One found a way to attract the best
customers away from other issuers."

In a co-authored study, Clemons found that Capital One used what it
calls an information-based strategy that allows the company to try
varying approaches based on differences between itself and rivals. This
strategy allowed Capital One to deploy a mass customization model. That
model also generated returns, says Clemons. Capital One sustained
double-digit returns on equity and double-digit increases in sales and
profit growth due to its approach.

Clemons argues that storing customer data in bulk could lead to new
pricing strategies. He agrees that one-to-one marketing is illusory at
best, but a move to precision pricing -- or figuring out exactly what
an individual customer will pay -- may warrant being a data pack rat.
"I am not talking about pursuing some sort of illusory one-to-one
marketing relationship with customers," says Clemons. "I'm talking
about making the transition to precision pricing, which does indeed
require understanding your customer."

Meanwhile, there's another conundrum companies face: Data purged
today could be valuable tomorrow. "Ten years ago, one of my clients
wanted to purge his database. It was an insurance company, but once you
purge your database, you know no more about your customers than a new
entrant," says Clemons. "That was okay under existing pricing models,
but after any form of insurance deregulation, the information they were
purging would have been enormously valuable."

Ultimately, the choice to follow data minimization practices boils down to one question: What will a company do with the data?

"If you are collecting data just for its own purposes, follow a
minimization approach," suggests Matwyshyn. "If a company is doing
something else with data, like selling it, then there's no incentive to
minimize the risk."

Bradlow says data minimization has the potential to be one of the
key security tools used by companies, even if it remains largely an
academic concept today. Security professionals will buy [data
minimization]. Next, you have to convince the marketing world and begin
giving talks outside the ivory tower. I think firms will start buying