During the past two decades, recommender systems and targeted advertising have used personal data collected at scale in their quest to transform how we consume news, shop, and interact online. The more relevant and high-quality the data, the better the results—or at least that’s been the guiding assumption. That principle now appears to hold true for the new wave of GenAI products as well. But if data matters, then so do the questions of when, where, and to what effect.
For technology companies, data serves multiple purposes. It helps identify user preferences, improve services, and develop new products. For the dominant platforms that control vast amounts of data, it’s also often cited as a source of competitive advantage and market power.
For regulators, this poses a multidimensional challenge. Data-driven value encourages firms to innovate and compete more effectively. That’s good. But too much concentration of data in the hands of a few incumbents may entrench market power and discourage innovation by new entrants who lack comparable data resources. Also, even if data sharing can foster competition, how do we ensure that it doesn’t compromise user privacy?
We’ve been studying and writing about this topic a lot lately.
For example, just this summer we published “Trade-offs in Leveraging External Data Capabilities,” which analyzes a major field experiment in an online search market and shows that data held by large firms can create value for smaller firms even when it is shared in a privacy-preserving way. The paper suggests that regulations that enable data sharing between large and small online platforms can, to an extent, level the playing field without compromising individual privacy.
Similarly, in “Targeting, Retargeting, and the Effectiveness of Search Engine Advertising,” we describe partnering with a major motion-picture studio and Google’s advertising team to design 10 separate ad campaigns for movies. For the project, we experimentally varied which users saw the ads. The only advertisements that generated a positive return on investment for the studios, we found, were those that targeted users who Google knew had expressed a prior interest in the movie. That finding, we note in the paper, “highlights the strategic importance of advertising platforms that can directly observe customer behavior and can target advertisements to individual consumers based on their prior behavior.”
One last example: In “The Editor and the Algorithm,” we study how news platforms balance editorial decisions and algorithmic personalization. As in the previous study, we find that personal data improves targeting, but with a nuance: the benefits taper off after a point, with diminishing returns setting in as more data is added. This is an important insight. Data can be powerful, but there is a limit to how much incremental value it can generate.
Taken together, our recent work has shown that data can be a driver of both revenue and product innovation. This raises a natural concern that, when taken to an extreme, data-intensive strategies could undermine competition and privacy.
So, what’s the best approach to regulation?
Our findings can help policymakers design interventions that preserve the benefits of gathering and using data at scale while mitigating the risks. The first study we cited above suggests that privacy-preserving data sharing can be a viable tool to enhance competition without undermining user trust. But to preserve the larger player’s incentives to innovate, that player must be compensated fairly. The second study underscores that data generates value only when it is sufficiently fine-grained, and the third study confirms that baseline finding while showing that the returns to additional data diminish. Together, the latter two studies suggest that policies enabling access to a baseline level of personal data might allow entrants to compete effectively without requiring complete parity with incumbents.
You may have heard that data is so valuable and essential today that it has become “the new oil,” but our research suggests that’s a bad analogy. Data is a resource whose value depends on context, scale, and use. The challenge for policymakers is to design frameworks that unlock its positive potential for innovation, product development, and personalization while ensuring that it’s not used in ways that allow the excessive concentration of power and the invasion of privacy. This demands that we move beyond black-and-white narratives such as “Big data is good” or “Data collection is bad” and instead start thinking in terms of calibrated access, safeguards, and incentives. In the same vein, it is important to recognize that focusing on data alone will not solve all issues related to competition.
Policymakers in different jurisdictions are taking a first step by experimenting with different solutions on a case-by-case basis. In the European Union, some provisions of the Digital Markets Act (DMA) aim to curb the market power of large “gatekeeper” platforms by imposing data-sharing obligations. The United States has no direct equivalent to the DMA, but regulators and crucial judicial decisions have imposed data-sharing obligations on large technology companies, as seen in the most recent antitrust case against Google. Agencies like the FTC and DOJ are exploring whether data advantages constitute barriers to entry and are challenging practices they see as exclusionary.
As AI systems become more integrated into daily life and more dependent on large-scale data, these questions will only grow in importance. By combining empirical evidence on the economics of data with careful policy design, we can create an environment where innovation thrives, markets remain competitive, and individual privacy is respected.
Michael D. Smith is a Senior Adjunct Fellow at the Technology Policy Institute. He is also a Professor of Information Systems and Marketing and the Co-Director of IDEA, the Initiative for Digital Entertainment Analytics, at Carnegie Mellon University. He holds academic appointments at Carnegie Mellon University’s School of Information Systems and Management and the Tepper School of Business. Smith has received several notable awards, including the National Science Foundation’s prestigious CAREER Research Award, and he was recently selected as one of the top 100 “emerging engineering leaders in the United States” by the National Academy of Engineering. Smith received a Bachelor of Science in Electrical Engineering (summa cum laude) and a Master of Science in Telecommunications Science from the University of Maryland, and a Ph.D. in Management Science from the Sloan School of Management at MIT.


