Shane Greenstein on Co-Invention and the Geography of AI Innovation

Scott Wallsten: Hi and welcome back to Two Think Minimum, the podcast of the Technology Policy Institute. Today is Wednesday, November 5, 2025, and I’m TPI president Scott Wallsten.

Today’s podcast is a discussion with Shane Greenstein, who is back for his second appearance on the show.

Shane Greenstein is the Martin Marshall Professor of Business Administration at Harvard Business School. He studies how digital technologies diffuse through the economy and create value, from the early commercialization of the internet to today’s data center boom. Shane has written extensively on what he calls ‘co-invention’—the often-invisible adaptation work that happens after a core technology is invented to make it actually useful.

Today we’re talking about the billions of dollars flowing into AI data centers. Companies are making massive bets on hyperscale facilities, but there’s real uncertainty about whether the revenue will justify the investment. We discuss where data centers locate and why, how demand has evolved over the past decade, and whether we’re witnessing a transformative technology wave or just building bigger because we can.

Scott Wallsten: Shane, thanks for joining us again.

Shane Greenstein: Happy to be here.

Scott Wallsten: So, let’s talk about something that a lot of people care about right now, which is where data centers are located. Tell me a little bit about your work.

Shane Greenstein: Love this topic, by the way.

Scott Wallsten: Yeah, well, it’s a hot topic.

Shane Greenstein: Yeah, but it’s a nerdy topic, too.

Scott Wallsten: That’s true. Which is why we like it. So, tell us about your research there. In particular, starting from the first papers that you did on this, which, in reality, weren’t so long ago, but a lot has changed. Tell us about your research, the evolution, and what you think about it today.

Shane Greenstein: So the way to start is with the categories for understanding how data centers choose their locations. The categories haven’t changed, but the relative priorities of the firms making those choices have changed a lot. [Here Shane is discussing his 2025 paper coauthored with Tommy Pan Fang, “Where the Cloud Rests: The Economic Geography of Data Centers.”]

Scott Wallsten, narrating: That shift in priorities is what Shane means by co-invention—the innovation that happens after a technology exists, when companies adapt it to new uses and realities. Data centers aren’t a new idea, but how firms use them, balancing power, connectivity, and customer proximity, keeps evolving. That adaptation is where a lot of the real innovation happens.

Shane Greenstein: And so that’s kind of an interesting lesson.

Scott Wallsten: So what are the categories?

Shane Greenstein: Yeah, so the discipline of actually trying to figure this out and measure it taught me that lesson. Maybe that’s the way to say it. If you ask a firm what the primary determinants are of where they place their data center, it would be, first, what their customers want and what the use case is. Some use cases demand close proximity to the customer, and other use cases don’t. A second thing would be access to electricity, because it’s a core input. A third is access to the internet, because, again, it’s another core input. And then fourth is almost always access to water. But that’s usually not been a bottleneck, so it doesn’t usually get much attention. Access to land sometimes matters also, or has started to matter a great deal more. It mattered in the past, when customers wanted close proximity in a condensed urban area, and so it often wasn’t possible to be right next to them. That problem emerges a ton inside downtown San Francisco. So firms face that tension, and they make some trade-offs. Another problem that used to emerge was getting two sources of electricity. Electricity usually wasn’t in short supply from a single source, but most data centers wanted multiple sources in order to have backup, because the service they were offering typically required four-nines to five-nines reliability.

Scott Wallsten: When you say two sources, you don’t just mean diesel backup generators, you mean two real big sources, right?

Shane Greenstein: Two parts of the grid, yeah. Because if you’re going to offer—so, you know, the phrase is four-nines or five-nines reliability—four-nines reliability, for those who don’t know, is 99.99% uptime. So, if you’re going to offer that, you’ve got to make investments as a supplier to enable that to happen, and that means physically making investments to have a backup in the event of something going wrong. Electricity is pretty central to that. Like, one of the major stories inside the data center lore is the data center in Houston that remained open after the hurricane.

Scott Wallsten: Right. That hit it.

Shane Greenstein: And their bigger problem wasn’t electricity, because they had so many backup generators, and they had planned for a hurricane. They could keep the thing running. The bigger problem they had was getting food to their employees, because they couldn’t get a car down the street. That was the problem. And then internet connectivity matters, particularly for scale, so you have to have huge connectivity to support some of these places. Most of that’s been resolved for most of the stuff that’s already in place, but just to be clear, in the old world, 25 years before the present boom, a major concern of many buyers of services inside data centers was having high connectivity. So being able to house interconnection between different networks was a valuable service as well, because it meant the user of the data center was immediately going to get their data onto multiple networks, all interconnected in the basement, basically.
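
As an aside, it helps to make the arithmetic behind those reliability targets concrete. A quick sketch of how much downtime per year each number of nines actually permits:

```python
# Back-of-the-envelope: permitted downtime per year at each reliability level.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

for label, uptime in [("three nines", 0.999),
                      ("four nines", 0.9999),
                      ("five nines", 0.99999)]:
    downtime_minutes = (1 - uptime) * MINUTES_PER_YEAR
    print(f"{label}: {uptime:.5%} uptime -> "
          f"about {downtime_minutes:.1f} minutes of downtime per year")
```

The jump from four nines to five nines is the difference between roughly 53 minutes and 5 minutes of downtime a year, which is why redundant grid connections, not just diesel generators, become part of the investment.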

Shane Greenstein: Now, you might have thought data centers were analogous to the places where railroads used to meet. There, the loading and unloading of cargo generated complementary economic activity. It’s a well-known historical pattern that cities where multiple railroads met ended up getting a ton of other economic activity. And so you might forecast that for data centers, too, at least the ones that serve as interconnection points. And you do not see that. There just isn’t much evidence of that at all. Really, the only place in the U.S. where we see agglomeration of data centers, at least anything that looks like a network effect, is the one outside DC, in Ashburn, Virginia.

Scott Wallsten: In, yeah, in Loudoun County. So why is that? Why did Loudoun County data centers develop the way they did? Was it just a confluence of fiber, demand—why didn’t this happen in Palo Alto?

Shane Greenstein: Yeah, no, exactly. The hard question is why the dog doesn’t bark somewhere else. The easy question is, why did it happen, at least outside DC? That’s one of the earliest interconnection points. MAE-East was located just outside of Dulles Airport, so that’s part of it. Another part of it was the meeting of two major trunk lines for the internet in 1999. One of which went from the northeast down to Miami, and the stuff from the Northeast is pulling in transatlantic lines from Europe, and the stuff in Miami is going into South America, and so that was a major trunk travel point. And then there was another major travel point going east-west.

Scott Wallsten: From…

Shane Greenstein: DC, or New York. New York could have served this as well, and that’s going east-west, going just underneath Lake Michigan, just south of Chicago, keeps going all the way to San Francisco and Hawaii and Seattle.

Scott Wallsten: So, you answered what you said was the easier question, which was why it was there, but not the harder question of why it wasn’t in other places.

Shane Greenstein: Yeah, like Dallas is another place, and Chicago’s another place. Availability of land in the right places could have been one of the hindrances. Available electricity at scale is a potential explanation, specifically versus Chicago. Availability of land certainly explains why Palo Alto didn’t—

Scott Wallsten: Right.

Shane Greenstein: Right? Had this been done in 1950, yeah, there were a bunch of orange groves, but somebody had already converted all those.

Scott Wallsten: Right, those are all gone now.

Shane Greenstein: Yeah, water wasn’t a bottleneck. As I say, labor isn’t really a bottleneck for these things. Internet connectivity in any of those places was not a bottleneck, so I would speculate it was land and electricity, but I actually don’t know. I’m not sure anybody knows.

Scott Wallsten: Are companies approaching this with an experimental mindset? Or are they saying, nope, I just believe this is the way it’s going to turn out, and I’m going full steam ahead, I’m putting all my eggs in this basket?

Shane Greenstein: So there were, and I could keep going, there were a set of existing users who already had use cases, and they were making investments, and they’re still making investments, and they still think those investments will pay off. That’s a little different, in an interesting way, because their incentive is to generate—or some fraction is to generate new revenue, some fraction is to protect existing revenue. That’s rather distinct from what OpenAI is doing, which is primarily to generate new revenue. So there’s that. That’s distinctly different. And then there’s, you know, I can’t quite tell you what Oracle Cloud looks like, which they were doing for many years anyway, and then they’re trying to generate new business that way also. And then there’s a bunch of other things, but those are all slightly different, so I would want to be a little bit cautious.

Shane Greenstein: So there’s a lot of room for improvement. It’s still a question of who’s going to pay for that. I admit, I’m pretty much a skeptic at this point, because the amount of money going in is so high. I have a hard time seeing where we can generate enough new revenue to pay that back. But you have to be conscious that some of the motive here isn’t about generating new revenue, so that’s not quite the right question to ask, either.

Scott Wallsten, narrating: That’s one of the most provocative things Shane said. He’s skeptical the revenue will justify the investment. So I had to ask: is he putting his money where his mouth is?

Scott Wallsten: Right. Well, so how much of a skeptic are you? Have you done anything different with your own personal portfolio? Have you taken money out of the S&P 500?

Shane Greenstein: No.

Scott Wallsten: Okay.

Shane Greenstein: Yeah.

Scott Wallsten: Keep a diversified portfolio, though.

Shane Greenstein: Don’t put it all in bonds, or don’t put it all in stocks, and don’t put it all in real estate, you know? These are pretty old rules. They still seem to apply.

Scott Wallsten: Everything in plastics.

Shane Greenstein: I’m old enough to remember that, to know what that means.

Scott Wallsten, narrating: Listening to Shane, you get the sense that the payoff from all this AI infrastructure might come later as firms learn how to make these massive investments actually useful. It’s the same pattern he’s studied for years: the big breakthroughs grab attention, but the value shows up in the hard, messy work that follows.

Scott Wallsten: Right. Well, let’s come back to the demand point. In your paper, you separated data centers into third-party data centers and what you called cloud, which was basically single owner, like Amazon, Google, and Microsoft’s Azure. Has the nature of demand changed now, where data centers today, or the hyperscalers, the large investments we’re seeing, are divided between data centers that focus on training AI models and those that focus on using AI models? And do those have different impacts on where each kind would be located? Or how does the nature of demand today differ from, you know, 10 years ago?

Shane Greenstein: Yeah, okay, there we go. Some part of it persists, so the finance people still want to be close to their computing resources. So that persists. And the market analysts who are doing large datasets now, which has become a new activity in the past 10 years, with these terabytes of data they’re moving, still also want to be co-located, or closely located, because they don’t want bottlenecks in their transmission. So that demand still exists, as it did 10 years ago. Demand for backup still exists. But previously, there was an activity that people in the industry used to call server hugging. The user wanted to be close by because it was the crown jewels of their organization. And that demand has shifted. There’s more trust in the network and the infrastructure now. So many more firms are willing to do their backup on something like an AWS server that’s much further away. So that’s a definite shift.

Scott Wallsten: So is the issue that they are more likely to be willing to do it further away because they trust the infrastructure that carries the data between, or that they don’t need their own infrastructure anymore and they’re willing to send it to AWS or somebody else?

Shane Greenstein: Both. And then the additional thing is that some functions are now better provided by AWS. I mean, I’m not here to sell AWS to you. I could say Google Cloud or Azure, Oracle Cloud, I could give a lot of examples. Some security features, particularly, are better offered—Cloudflare being another example, right? Akamai, I could keep going. Some security features are better offered by these firms than your own security team could otherwise do internally. And so that’s another thing generating demand for outside management of your compute and storage. That’s partially also driving this. It still doesn’t come for free. It still means a large firm, or any firm that’s doing that, has to hire internal labor who knows how to work with those systems. But it can be a much smaller labor force, or technical labor force, than if you were doing it all yourself.

Scott Wallsten, narrating: And there was a nontechnical reason this change was attractive: where the expense shows up—operating budget versus capital budget.

Shane Greenstein: I did not anticipate how important this was when I started in data centers, but it turns out to really matter: whether the expense shows up on your capital budget or your operating budget.

Shane Greenstein: Yeah, okay, so if you ask a firm how they are paying for their AWS, it comes off the operating budget, because it’s an expense affiliated with day-to-day operations. If you were to have asked them when they ran it themselves where it came from, it came off the capital budget, because they were capitalizing the equipment and the setup, and then there would be a variable expense, which was their employees who ran it, and it was a cost center. So for many firms, it turns out, for budgeting reasons, they much prefer to have everything on their operating budgets. They can monitor it much better. They feel like they can manage it much better. It’s a statement about the way many organizations were managing their IT infrastructure prior to the growth of something like an AWS. There is a really strong preference. I’m just telling you the facts. It’s what I get from interview after interview.
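
To make the distinction concrete, here is a toy sketch of where the same computing spend lands under the two regimes Shane describes. All of the dollar figures are hypothetical placeholders, not numbers from the episode:

```python
# A toy illustration of the budgeting difference. The point is only
# *where* the spend lands, not the magnitudes, which are invented.
def self_run_budget(purchase, annual_staff):
    # Buying your own gear: the purchase is a capital-budget item,
    # and the team that runs it is a recurring operating expense.
    return {"capital": purchase, "operating_per_year": annual_staff}

def cloud_budget(annual_bill, annual_staff):
    # Renting from a cloud provider: no capital line at all; the bill
    # and the (smaller) team both come off the operating budget.
    return {"capital": 0, "operating_per_year": annual_bill + annual_staff}

print("self-run:", self_run_budget(purchase=8_000_000, annual_staff=1_500_000))
print("cloud:   ", cloud_budget(annual_bill=3_000_000, annual_staff=500_000))
```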

Scott Wallsten: Do you have any explanation as to why? I mean, to me, you know, I’m an economist, money is money, and so I don’t get it.

Shane Greenstein: Right. I would ask this too, and it was more about—we’re underestimating the headaches affiliated with managing the IT inside these organizations.

Scott Wallsten: Interesting.

Shane Greenstein: I think some of it is also, you can outsource the security part to somebody else, and they’re better.

Scott Wallsten: Right.

Scott Wallsten, narrating: And as demand changes, firms may need to change their business models. Are they experimenting with different monetization approaches?

Scott Wallsten: So last week I talked with Catherine Tucker, and we talked about monetization a bit, you know, she’s written about that.

Shane Greenstein: She knows something about that.

Scott Wallsten: Right, exactly. And so, you know, we don’t know exactly how things are going to be monetized in the future, and we talked about whether they could be running experiments on what the right way, the more efficient way, the better way to monetize it is. But it sounds like you’re talking about something else that we also don’t know the answer to, except that they’re building these incredibly expensive pieces of infrastructure.

Shane Greenstein: Yeah. There’s a couple different things going on. So, first of all, we’re in early times for a decadal or multi-decadal process. We have a couple applications that have generated revenue. Let’s be very precise. What are they? Coding assistants. People will pay for that. And a general purpose language assistant, whether you call it Claude or ChatGPT—people will pay for that.

Shane Greenstein: There are a number of other applications that we speculate people will be willing to pay for. Science-based ones, finance-based applications where people have always been willing to pay, logistics-based applications. Yeah, there’s always been a willingness to pay for better there.

Scott Wallsten: Okay, but that’s still, what will people pay for?

Shane Greenstein: Then there’s a set of use cases by established firms who can already plug it in. So, without question, Google plugged the large language models into their existing businesses and lines of business. They started with Translate, I believe. Like, language translation was where they started, but then they could quickly move it to auctions and to matching the type of auction to the type of bidders. I’m pretty sure Amazon’s also using it in how they decide which web pages to first show you.

Shane Greenstein: Microsoft—I couldn’t tell you what they’re doing. They got off to a great start with Copilot, GitHub Copilot. Right? It was a very specific application that they were able to generate after they had exclusive access to GPT-3, which they bought for a billion dollars. The access is what they bought. And now they’ve decided to put in $10 billion, or whatever they put in, to buy the access for something else, some option value in the future, which, again, I’m not sure I could tell you what option value they’re looking for. But GitHub Copilot was a pretty good one. Conditional on—they had the right people in the right place, and a billion dollars actually turned out not to be too much money for what they got for it.

Scott Wallsten: It seems almost quaint.

Shane Greenstein: Yeah, now, yeah. So that’s, yeah, I don’t know, but that’s still early times.

Scott Wallsten: We’ve talked about why firms moved to the cloud. Now the question is: why are we suddenly seeing these enormous hyperscale data centers? Are they the natural next step in demand, or are we just building bigger because we can?

Shane Greenstein: All these other things we just talked about—additional trust, so firms are willing to move bigger projects and larger amounts of their backup. There’s some of that. Some of it is just there’s more data, so that’s demand. If you run an Uber and you’re trying to do a simulation of Uber across the whole U.S., it’s a big data project, and you just need more. So there’s that.

Scott Wallsten: But that doesn’t necessarily mean you need more all in the same space.

Shane Greenstein: Correct. That doesn’t necessarily get you the hyperscalers that you’ve got now.

Scott Wallsten: OK, so it doesn’t automatically mean everything needs to live in one massive building. Then what explains the scale? Well, the other big force is supply—new tech and hardware stacks that make scale itself productive. That’s the piece that pushes you toward the huge, single-site builds.

Shane Greenstein: For sure, it’s a combination of the invention of “Attention Is All You Need,” that is, the invention of transformers, and the observation from those who implemented it that there are as yet no diminishing returns to scaling what you can do inside of a data center. I say as yet, because it’s an open question.

Scott Wallsten: Right.
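
For readers who want the reference unpacked: “Attention Is All You Need” is the 2017 paper that introduced the transformer architecture. Its core operation, scaled dot-product attention, can be sketched in a few lines of numpy; the toy dimensions below are illustrative only:

```python
# Scaled dot-product attention, the core of the transformer:
# softmax(Q K^T / sqrt(d)) V, in which every token attends to every other.
import numpy as np

def attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # (tokens, tokens)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))  # 4 tokens, dim 8
print(attention(Q, K, V).shape)  # (4, 8)
```

The score matrix is tokens by tokens, so the computation parallelizes naturally, which is one reason the architecture rewards exactly the kind of scale the hyperscalers are building.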

Shane Greenstein: Right? And there’s a third thing going on, which is a little different between Google and all the NVIDIA GPU users, so I just want to make sure we qualify this properly. But let me do the NVIDIA GPU user side first. NVIDIA, well prior to this growth, built a software platform to enable coders to go inside and use a GPU for parallel mathematical functions. They have, in addition to that, built through acquisition, as well as their own work, a series of tools to help GPUs in the same location work with each other to make the parallelization more efficient. So there’s a supply response also to what they perceive as demand for locating everything in the same place.
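
The software platform Shane is referring to is NVIDIA’s CUDA. Here is a minimal sketch of what using a GPU for parallel math looks like from the coder’s side, via PyTorch, which runs on top of CUDA and falls back to the CPU if no GPU is present:

```python
# One large matrix multiply, farmed out to whatever hardware is available.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)
c = a @ b  # one call; on a GPU, thousands of cores share this multiply
print(device, c.shape)
```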

Scott Wallsten: But does the same location matter because even the smallest amount of latency makes a difference? Because why does it matter if you’re in one building versus another building, as opposed to being in the same building, right?

Shane Greenstein: So, you know, you’ve got to go talk to somebody who’s an engineer, because there’s also a lot of speculation and debate about this, because this doesn’t come for free, right? So let’s give NVIDIA their side. By the way, we’re just going to park on the Google side of this for now.

Scott Wallsten: Okay.

Shane Greenstein: Right? They have a TPU, they have their own data centers that they’re engineering for their own use cases, so there’s that going on also. And on the NVIDIA side, after ChatGPT 3.5 showed up in November, December of 2022—I’m going to date it, because it is a dateable event.

Scott Wallsten: So part of this is technical, but part is just a market bet, especially after ChatGPT proved there was real consumer demand for this stuff.

Shane Greenstein: After it showed there was large-scale consumer demand for the thing these large language models provided—that was an unknown before then. There was investment after that event, and even in anticipation of such an event. There has been a tremendous amount of investment in making the large language models more general, giving them a greater number of functions. And doing that requires a larger amount of work on the algorithm, and then a larger amount of training and tuning after the estimation. So there is a belief by some in the industry that that is best done inside one building. And there is a speculation, which I’m in no position to evaluate, that three years from now, the buildings we’re making right now will still enable us to do things that we could not previously have done, and they will have so much value that someone will pay for them. There’s another view, which has been around for a long time, that parallelization doesn’t need to all be done inside one building.

Scott Wallsten: Right.

Shane Greenstein: Or, for that matter, in one GPU, because it isn’t. And that there are limits to the economies of scale that you get from putting it in one building, and that you can separate these things. And to give an example, Facebook, today called Meta, many years ago put its first data center in Prineville, Oregon. They are working on data center number 9 at this point in Prineville, Oregon. They’re all right next to each other. They’re all the same size and more or less the same configuration. They have made this assessment that, for their purposes, they can locate them next to each other and they can get efficiencies from doing things exactly the same way across them.

Scott Wallsten: Maybe they should have just put them all in the metaverse.

Shane Greenstein: Yeah, absolutely. You know, AWS also does this kind of thing. If their engineers want two buildings to work with each other, they put them within a certain distance of each other. And then they give users the ability to use resources from different regions. When you use AWS, you can request where the resources are coming from. So that’s the debate, and then that debate further splits on whether you think the value of large language models is highest when you put everything inside of one LLM, or whether we’re going to have a future of a lot of specialized large language models, each narrowly focused on delivering one function or another, in which case you do not need to do that.
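
That last point, requesting where the resources come from, is visible right in the cloud APIs. A minimal sketch using AWS’s boto3 library; the region names are real AWS regions, and everything else is illustrative:

```python
# The same client, pointed at two different regions. Everything a client
# creates or reads lives in the region it was constructed with.
import boto3

for region in ["us-east-1", "us-west-2"]:
    s3 = boto3.client("s3", region_name=region)
    print(region, "->", s3.meta.region_name)
```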

Scott Wallsten, narrating: So now we come back to Shane’s concept of co-invention. It’s not formal R&D, it’s buried in operations, and we don’t measure it well. But Shane argues it might be the most important part of innovation. And it has a geography—it doesn’t happen everywhere equally.

Shane Greenstein: The thing that I find most interesting is quite often the co-invention, that is the invention that takes place right after the core invention, to translate it and adapt it to use cases and value propositions, which is what we’ll end up talking a lot about today.

Scott Wallsten: Yeah, did any aspects of those kind of surprise you as you were going through your career? Like, you know, you’ve found results or other observations that sort of didn’t meet with what you had expected?

Shane Greenstein: Two things. It’s a good question, first of all. Two things. One, we all learned the schoolboy economics—technology’s invented, and then firms come along later and learn the science, and then somebody else, or maybe the same people, come along later and translate that into something useful. One of the earliest things you learn if you study this is that that is just a terrible theory of how technology gets created or translated into value. And despite enormous amounts of evidence to suggest it’s a terrible theory, it persists. So I would say I’m surprised at its persistence, because it’s so easily disproved. The world is not linear. It does not progress in this linear way. That would be one. The other surprise—in retrospect, but I guess with wisdom, I’m not surprised—is that what we measure tends to get more attention than what we don’t measure. And co-invention, which, again, we’re going to talk a little bit about today, is very difficult and challenging to measure. It doesn’t fall into a formal R&D activity. It’s often buried under operations. It shows up in ad hoc auditing categories inside firms, and so it’s not identified or measured as a separate activity, and it doesn’t get studied as a consequence. And with time, I’ve come to appreciate—I guess I shouldn’t be surprised by that, but it happens over and over and over again.

Scott Wallsten, narrating: So if co-invention is this invisible but crucial activity, where does it happen? Does it spread everywhere, or does it cluster? Shane and his co-authors studied this.

Scott Wallsten: All right, let’s talk a little bit about your paper that you had this year on innovation value distribution. And you talk about the differences between incremental co-invention and novel co-invention, and you conclude that incremental co-invention is geographically distributed, novel co-invention tends to cluster. So first, I mean, say more about that, because I didn’t explain anything. And how do you think that fits with AI and the kind of investment we’re seeing today? Because we see both distributed and concentration.

Shane Greenstein: So, well, this is co-authored work with Tim Bresnahan and Pai-Ling Yin, so they did an equal amount. And we were doing this for years, so, you know, it’s just coincidence that it happened to come out at the same time. There’s a similarity to the data center work—it’s backward-looking in the sense that it looked at the last decade. And it looks at a different part of the stack, if you want to call it that. So it’s looking at consumer computing technologies. Similar to what we were just talking about, there’s a general technical advance. And then it gets adapted to particular applications to generate value. So that framing was also true of that general wave of technical advance. I could say it more concretely: the technical advance was things like smartphone apps and engaging web pages with dynamically changing content in response to the things you were doing, which kept you there—and which we now, in retrospect, aren’t sure was all that great. But that was what the era was. And a lot of firms applied it. So, to ground this properly, incremental here meant you kept the business you had, saw this new technology, and adapted it to your existing business. So my favorite example is the terrestrial radio industry. There were thousands of terrestrial radio stations, still are, in the United States. Due to quirky regulatory rules, there are thousands of them distributed all over the country.

Scott Wallsten: Right, now it’s a radio station for every person.

Shane Greenstein: Yeah, and the radio stations all went online. They all had web pages, so you could listen to their radio station from your computer. Most of them did nothing more than that. If they were music stations, they’d post articles about various musicians in the genre that they were covering. If they were talk stations, they had pictures of their hosts and maybe some related articles. If it was sports, again, content related to that, depending on the format. That was all incremental. And a fraction of users went to those web pages. Great. Did it generate a lot of value? Well, no. It got them some ads. That’s what they got.

Shane Greenstein: So that’s one example of incremental. Large-value co-invention, what we call novel, goes by many names in popular conversations, so just to alert your listeners—it has a very different set of determinants. Incremental value, as you can see empirically across the United States, builds off an existing business. So if the existing businesses were geographically distributed, so too is the incremental co-invention. That’s just the way to understand its geography. Novelty, as you can see empirically if you read enough examples, requires some combination of domain knowledge and technical knowledge to generate some adaptation of the technical wave that nobody had ever previously thought of, and to deliver some service that was new to that firm, or sometimes even new to the world. And those combinations were actually quite hard to do. And, as we see in our data, it could have happened in a lot of places. What, in fact, did happen was that the most valuable applications of consumer computing technologies tended to be in media, and so the largest leaps in value creation tended to emerge in one of two kinds of locations. Either you had concentrations of technically adept people who understood how to adapt the new technologies to media, and they had to hire other people with domain knowledge, and the two of them together would work something out. Or you had somebody who was in the media industry who recognized there was this opportunity, and they had to hire someone with the technical skill.

Shane Greenstein: To adapt their ideas to the market. Given that supply, it’s no surprise that Seattle and San Francisco are major sources of novel co-invention of consumer computing technologies. But it does not get enough attention. One of the things we discovered was that New York and Los Angeles were sources of a lot of that also. It shouldn’t be surprising, because it’s media, and those are the two media capitals of the U.S. already. And then, you know, Washington, DC is actually sort of a distant fifth, and there are a couple other cities that are getting close. But that’s why you got concentration, because the successes required this combination.

Scott Wallsten: Do you think this will be different at all? Or does AI make any of this different?

Shane Greenstein: Adaptations of AI.

Scott Wallsten: Yeah.

Shane Greenstein: Technical skills, so that seems almost identical. So just, again, to be clear, we’re talking about adaptation. So the moment we’re at today is, like, the core inventions, and there are a thousand people in the United States who are doing this. I mean, it’s less than 10,000. Really, it’s just not that many people. But the adaptations that will translate that into widespread value, oh my god, that’s going to require a great deal more than those people. And, you know, the famous guy, Jeff Dean at Google, can’t invent everything.

Scott Wallsten: Right.

Shane Greenstein: Right.

Scott Wallsten: He’ll try.

Shane Greenstein: Yeah, and that’s going to involve a lot more people, but it’s still going to involve people with technical skill. I think it’s a pretty good bet you’re going to find a lot of them in Seattle, San Francisco.

Scott Wallsten: Right, the same places.

Shane Greenstein: The same places. It doesn’t have to necessarily be that way, you know? It doesn’t, but it’s a pretty good bet.

Scott Wallsten, narrating: That clustering happens because novel co-invention needs both technical skill AND domain knowledge. Shane gave me a concrete example of what happens when you have one without the other.

Shane Greenstein: And again, if it’s going to be adaptation of AI to use cases to create value, there’s got to be domain knowledge. Most of the coders I know just don’t know anything about architecture, so if they’re going to do AI in architecture, they’re going to need an architect, for God’s sake! If they’re going to do applications in medicine, they aren’t going to be able to do it. I’ve been watching—you and I and everyone else has been watching what’s been happening in x-rays for the last decade. Get a doctor on your team.

Shane Greenstein: They’re a lot more productive, the startups I watch, the ones that have somebody with some medical knowledge. That helps a great deal. Again, domain knowledge matters in application.

Scott Wallsten: Is that, you think, I mean, from having spoken with lots of executives and other people in these companies, is that generally a problem? That they’re not interacting with domain experts?

Shane Greenstein: A lot of startups don’t, yes. Yeah, I don’t know how specific you want me to get.

Scott Wallsten: I mean, as specific as you can be without revealing somebody’s secrets, unless you want to reveal them.

Shane Greenstein: I’ll use an HBS case that one of my colleagues wrote, and I know the entrepreneur. He’s come to talk to my class, and he’s a wonderful guy. He’s doing x-rays and dentistry. And, you know, he actually really understood the dental market for automated x-rays of your cavities and your gum problems and so on.

Shane Greenstein: Yeah. So he will readily admit, you know, you can take an undergrad with good training in neural networks, his dad is a dentist, his mom is a dentist, have him go collect 10,000 x-rays from their files, and he or she could make an algorithm. That student would have no problem.

Shane Greenstein: That would be pretty good at identifying 90% of the cavities in their customers. Is that going to generate any value? No! That’s just—come on! No! Why? Because it has to scale, it has to be embedded inside the standardized approaches, the equipment that all the dentists are using all over the U.S. It has to be embedded in all the standardized approaches that all the insurance firms are using to verify fraud. It requires regular, trustworthy updating. And, by the way, there are dozens, scores of things your dentist is doing with the x-rays, not just cavities. So it also requires a broad scope of activity. Now, it turns out, also in dentistry, it’s a wonderful application area, because most dentists learn to read x-rays in school, and they never get any more training after that. And they don’t like doing it anyway, and they don’t charge for it. They make money when they put the stuff in your teeth. So the point being that most dentists don’t resist having to automate analysis, and it enables them to do the rest of their job better. So that’s the more general point. It integrates into their workflow in a way that doesn’t generate resistance.
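
To underline how low the bar is for the model itself, here is roughly what that student’s project might look like: a minimal sketch assuming a hypothetical folder of labeled x-rays (xrays/cavity/ and xrays/no_cavity/). Everything Shane says actually creates the value (the scaling, the integration with dental equipment and insurance workflows, the trustworthy updating) is precisely what this code does not touch:

```python
# A small CNN cavity classifier, trained on a hypothetical labeled folder.
# All paths and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Grayscale(),             # x-rays are single-channel
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
])

# Hypothetical dataset layout: xrays/cavity/*.png and xrays/no_cavity/*.png
data = datasets.ImageFolder("xrays/", transform=transform)
loader = DataLoader(data, batch_size=32, shuffle=True)

model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 32 * 32, 2),         # two classes: cavity / no cavity
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):                  # a few passes over ~10,000 images
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```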

Scott Wallsten: So what does this mean for policy, especially when governments are racing to subsidize AI infrastructure?

Shane Greenstein: Let’s go to data centers, right? So what are going to be the issues in data centers?

Scott Wallsten: Well, I mean, you could start with the White House AI Action Plan, which directed federal agencies to give these companies loans and all kinds of federal subsidies and so on. They weren’t specific, and were really stupid. But that’s a place to start, the only sort of actual policies we’ve seen, but there are local…

Shane Greenstein: Yeah, there’s tons of local stuff, so probably the bigger financial thing is the local breaks. They come in a couple different forms. There are sales tax abatements, property tax abatements, and there are also tax abatements for capital that can be separate. The most common one is sales tax, which means that when you buy a new set of servers and racks, you’re saving yourself 5% or 6% in Illinois, or in any state that has the tax to begin with. That’s a pretty serious savings, and the reason the local municipalities or the counties like this is because, of course, you don’t see it. It’s just a tax that doesn’t happen. Property tax is a little easier to notice if somebody’s trying to figure out which businesses are paying for the land that they’re using. And then the interesting and harder question is, when localities pass on those taxes, do they get benefits?
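
Back-of-the-envelope, that abatement is serious money. The equipment figure below is purely hypothetical, just to show the scale:

```python
# A 5-6% sales tax abatement on a capital-intensive build, with an
# invented equipment spend to illustrate the magnitudes involved.
server_capex = 500_000_000  # hypothetical spend on servers and racks

for rate in (0.05, 0.06):
    print(f"{rate:.0%} abated on ${server_capex:,} -> "
          f"${server_capex * rate:,.0f} saved")
```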

Scott Wallsten: There are two parts to that. One is, does the local economy benefit in any way? And then, if we look at it more broadly, are we generating any kind of net benefits by creating these incentives?

Shane Greenstein: Yeah, so it’s pretty clear we get a construction benefit. That’s short—you know, 18 months to 2 years max. So you can calculate that one out in pretty good detail, actually, in advance of some of these plans. You can figure out how many people will be employed, how much money will circulate.

Scott Wallsten: Sure.

Shane Greenstein: Yeah. Then after that, most of these data centers don’t employ very many people. I do have a joke I usually tell: a data center is a very large building with a very small parking lot. And the people they do employ—electricians to do the electricity and air conditioning, specialists to manage the cooling and some of the water-cooling stuff—those are well-paying jobs, I don’t want to just dismiss that. There just aren’t many of them.

Scott Wallsten: Which is inherently, I mean, that’s okay. I mean, that doesn’t have to be…

Shane Greenstein: It’s okay, but you’ve got to recognize that that’s the income-generating side of it. We are not seeing much in the way of agglomeration. At least outside of Ashburn.

Scott Wallsten: Right.

Shane Greenstein: Right. You have to—that’s actually a hard thing to do. For some of the rural areas where this is happening, the data center, compared to the world where there wasn’t one, is the difference between having a schoolhouse and not. So it’s not a margin we often think about, but it’s there.

Scott Wallsten: What’s the additional tax revenue?

Shane Greenstein: Yeah.

Scott Wallsten: For the town.

Shane Greenstein: And a few extra people working at a time.

Scott Wallsten: Right, and also, I mean, additional tax revenues can be a big deal. I don’t want to minimize that either.

Shane Greenstein: Yeah, you know, think about Quincy, Washington, which is where Microsoft has some of it, and so does Amazon, on the eastern side of Washington state. That’s an apple-growing area, a cherry-growing area. So its primary economic activity prior to the growth of the data center industry there was agricultural and seasonal, and now the data center is year-round. That’s, again, another kind of way to think about it. It changes things, and a different kind of person is generating income also.

Scott Wallsten: So that explains why a town or region might want to engage in some kind of incentives. But, you know, let’s say you’re the president, and I mean by that, a sane president. Is there any reason to think that this is in the country’s interest, to have some kind of incentives?

Shane Greenstein: Well, there is a point of view that the frontier is uniquely affiliated with the hyperscalers. So remember, this is a debatable point.

Scott Wallsten: Right.

Shane Greenstein: Right, but there is certainly a point of view in the industry. OpenAI is certainly the largest firm committed to that point of view. And that view is that we must aggressively address it to stay ahead of all foreign rivals, named and unnamed.

Shane Greenstein: So if that was your point of view, which, as I said, isn’t crazy, and there are a lot of technical things pushing in that direction—yeah, then at the federal level, you might want to subsidize that, because you would like to have the frontier still located domestically. You might want the suppliers, like your chip suppliers, to be located domestically, because the scale would be under threat if it were, say, located somewhere off the coast of China. Just saying.

Scott Wallsten: Hypothetically.

Shane Greenstein: Hypothetically, yeah. You might, if you thought it was important for military advantage, that’s an open question, right? Yeah, you might want domestic supply.

Scott Wallsten: I mean, those are real issues, and as economists, we have a really hard time incorporating that kind of analysis into our thought process.

Shane Greenstein: As economists, we do have a very straightforward piece of advice. I mean, I have to say it—subsidies usually work better than tariffs.

Scott Wallsten: Right, yes.

Shane Greenstein: So, you know, we usually say, first things first, if you want to subsidize a local industry, it’s going to go badly. That’s, like, the first thing we’re going to say. And if you want to protect the local industry and get it to grow, typically it doesn’t go well. However, in spite of that, from time to time, we have succeeded as a country in doing that. You know, the internet being sort of the shining example, satellite being one of the bad examples, space being somewhere in between. So we have a bunch of examples. And, you know, broad-based subsidy works so much better.

Scott Wallsten: So you’re saying subsidies usually don’t work, but occasionally they do, and so we’re going to take that lesson and with that do the one thing we know is much, much worse, and that’s use tariffs.

Shane Greenstein: Yes. Yeah, but I’m saying that from the premise, like you said, that most economists are pretty skeptical this is ever going to work. It’s one of those things that in theory ought to work, and in practice it very rarely does. And the subsidy, by the way, operates on multiple levels. It’s a tricky one here, but it’s usually for technical skills—subsidize the locations that are training the people who end up being key to reaching the frontier. Subsidy for the capital investment, because this is a pretty capital-intensive activity. Potentially a long-term subsidy for some of the other inputs, like, say, electronics or integrated circuits. And then, you know, there are other inputs here too, like electricity and internet lines. You and I could talk a long time about subsidizing internet in low-density areas.

Scott Wallsten: Oh, yeah.

Shane Greenstein: Oh, God! Yeah, but we’re not going to do that.

Scott Wallsten: We’re not going to do that.

Shane Greenstein: Yes. So there are places where you could subsidize. And it might, if it were done in a rational way—yeah, I would contribute.

Scott Wallsten: Alright, I think we should stop there, even though I’ve still got pages of notes left to go through.

Shane Greenstein: Is that okay? That’s it.

Scott Wallsten: Yeah, no, it’s great. Shane, thank you so much. It’s always fun talking with you.

Shane Greenstein: Great pleasure.

Scott Wallsten, narrating: We’ve covered a lot, from the mystery of why Loudoun County won to whether anyone knows if these billion-dollar AI bets will pay off. The uncertainty is what makes this fascinating.

Shane Greenstein: I hope you’re gonna edit this.

Scott Wallsten: We’ll see, but I’m definitely gonna leave that part in.

Shane Greenstein is the MBA Class of 1957 Professor of Business Administration at Harvard Business School and co-chair of the Harvard Business School Digital Initiative. He teaches in the Technology, Operations and Management Unit. Greenstein is also co-director of the program on the economics of digitization at the National Bureau of Economic Research. Encompassing a wide array of questions about computing, communication, and Internet markets, Greenstein’s research extends from economic measurement and analysis to broader issues. His most recent book focuses on the development of the commercial Internet in the United States. He also publishes commentary on his blog, Digitopoly, and his work has been covered by media outlets ranging from The New York Times and The Wall Street Journal to Fast Company and PC World. Greenstein previously taught at the Kellogg School of Management, Northwestern University, and at the University of Illinois, Urbana-Champaign. He received his B.A. from the University of California, Berkeley and his Ph.D. from Stanford University, both in economics.

Scott Wallsten is President and Senior Fellow at the Technology Policy Institute and also a senior fellow at the Georgetown Center for Business and Public Policy. He is an economist with expertise in industrial organization and public policy, and his research focuses on competition, regulation, telecommunications, the economics of digitization, and technology policy. He was the economics director for the FCC's National Broadband Plan and has been a lecturer in Stanford University’s public policy program, director of communications policy studies and senior fellow at the Progress & Freedom Foundation, a senior fellow at the AEI – Brookings Joint Center for Regulatory Studies and a resident scholar at the American Enterprise Institute, an economist at The World Bank, a scholar at the Stanford Institute for Economic Policy Research, and a staff economist at the U.S. President’s Council of Economic Advisers. He holds a PhD in economics from Stanford University.
