Interoperations: Zachary Smith On The Future Of Hardware and Green Tech

Cloud computing veteran discusses the bottlenecks that hinder technology innovation and the challenges — and opportunities — emerging in the digital infrastructure industry.


For the first time since the dawn of the cloud era, hardware is a bottleneck to widespread innovation. We wanted to understand why and what can be done about it, so I sat down with Zachary Smith, who has spent two decades hacking on operating models for computing hardware, to learn how he thinks about the challenges and opportunities ahead for digital infrastructure.

Many people are coming to you to get your perspective on the future of the data center and related hardware industry. What are some specific things going on as people try to capitalize on digital infrastructure?

There's a great quote from Tim Hockin, who said, "It’s an exciting time for boring infrastructure." So true! In my opinion, we spent much of the past two decades benefiting from a “scale out” world of client-server technologies: faster chips, better software, and more scale. We solved a lot of scale problems with software like orchestration, Kubernetes, distributed memory systems, consensus algorithms, and load balancers that could distribute load across wide swaths of commodity infrastructure. But we’re making an architectural shift into a world where the computational asset is no longer a commodity; it's a differentiator and, in most cases, a hard requirement. Companies innovating with machine learning and AI must combine highly tuned software with specific hardware. Suddenly, hardware matters again. It reminds me that when a workload becomes big and important enough, the only option is to optimize the hardware around the software, not the other way around. Or, as Alan Kay famously said, “People who are really serious about software should make their own hardware.”

This creates a disconnect from the public cloud experience, where you can just get more of whatever you want whenever you want. AI infrastructure, in general, is extremely power-intensive for various reasons. It requires significantly more resources than are readily or generally available. That is why we’re seeing a delay factor and a lot of noise about GPUs, data centers, and power — ingredients that were successfully obfuscated over the last cycle by public clouds but have now come back into the foreground.

How much power do we have in existing data centers, and how much will we need for AI infrastructure?

We have a lot of data center capacity in the world today: thousands of facilities, millions of square feet (some the size of several football fields), collectively housing tens of millions of servers. These facilities consume a lot of electricity and generate heat nonstop, and they are mainly cooled by evaporating millions of gallons of fresh water each day, often in locations with constrained resources. And since this is widely considered critical infrastructure, it all needs to work 24/7/365, with generators on hand to provide backup power.

To meet the coming AI wave, some people estimate we’ll need four to five times as much data center infrastructure as we have today, all built over the next five years, which is mind-boggling. AI infrastructure is vastly more power-intensive than existing CPU-based architectures. Simultaneously, hardware is getting hotter, too, because we've moved to smaller process technologies, putting more transistors on each chip, and then gone from a single layer to multiple “3D” layers, which means even more transistors in the same amount of space, stacked on top of each other.

A few years ago, a company working on circular infrastructure did a study and estimated that around 70% of the carbon intensity of a server comes from just making it and moving it around: you have to find the metals, dig them up, and process them, and then there's the fab process, the supply chain, parts, copper, you name it. Roughly 20% of the carbon intensity comes from running the server over its whole life, and the remaining 10% is related to getting rid of it somehow. At some point, we will need to bring circularity to materials and business models to make this sustainable.
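The rough 70/20/10 split above works as a back-of-the-envelope lifecycle calculation. A minimal sketch, assuming a hypothetical server with a 2,000 kg CO2e lifetime footprint (the total and the percentages are illustrative, not measured data):

```python
# Back-of-the-envelope split of a server's lifetime carbon footprint,
# using the rough percentages cited above. Illustrative only.
PHASES = {
    "embodied": 0.70,     # mining, fabs, manufacturing, shipping
    "operational": 0.20,  # electricity over the server's service life
    "end_of_life": 0.10,  # disposal and recycling
}

def lifecycle_breakdown(total_kg_co2e: float) -> dict:
    """Split total lifetime emissions (kg CO2e) across the three phases."""
    return {phase: total_kg_co2e * share for phase, share in PHASES.items()}

# Hypothetical server with a 2,000 kg CO2e lifetime footprint:
for phase, kg in lifecycle_breakdown(2000).items():
    print(f"{phase}: {kg:.0f} kg CO2e")
```

The takeaway of the split: extending a server's service life amortizes the dominant embodied share over more useful years, which is the core argument for circularity.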

But I’m an optimist and believe innovation will help meet these challenges. I actually think we'll see this giant energy crunch and demand for data center infrastructure as an opportunity to help the world transition to renewables like green hydrogen. Imagine if data centers could be the number one customer for hydrogen? That could transform our energy infrastructure. In short, we have to move from "We have to build more" to "How do we solve this?"

I grew up in California in the eighties: It was all about reduce, reuse, recycle. I'm not sure we're working on the reduction right now. We’re expanding, not reducing. I think we need to figure out how we extend the life of these assets. How do we reuse more of their components? Why are we throwing away so much copper from the power supplies in a server every time we want to upgrade the chip? Can we make them modular to upgrade them over time and keep all the sheet metal and whatnot? There are a lot of places where we can start, especially at the scale we're all discussing, to reuse large portions of the infrastructure.

Who's incentivized to do that sort of circular manufacturing?

Unfortunately, very few companies are motivated by this right now. Chip companies get paid when they sell chips, server companies get paid when they sell servers, and data center companies often see circularity as a burden.

We need new business models that see circularity as an advantage, not a cost. First, companies need financial incentives to keep assets alive as long as possible, because longevity could generate more money. I like to think that if a chip company like Intel could make a penny per core for every hour its chips were being used, it would probably work a lot harder on optimizing software for existing chips already in the market versus feverishly working on shipping the next one.

Instead of leaning further into a “tick-tock” cycle and trying to sell the next H200 or H300, maybe work on optimizing the existing chip instead. Customers don’t love the idea of throwing everything away every few years, but if their only way to stay competitive is with new components, that’s likely what we’ll see.
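The "penny per core-hour" idea above can be sketched as a toy revenue model. All figures here (64 cores, 60% utilization, the 3- versus 5-year lifespans) are assumptions for illustration, not real pricing:

```python
# Toy model of a usage-based chip business: a vendor paid per core-hour
# has a direct incentive to keep silicon in service longer.
# All figures are illustrative assumptions.
CENTS_PER_CORE_HOUR = 1
HOURS_PER_YEAR = 24 * 365

def usage_revenue(cores: int, years: float, utilization: float) -> float:
    """Dollars earned from metered core-hours over a chip's service life."""
    core_hours = cores * HOURS_PER_YEAR * years * utilization
    return core_hours * CENTS_PER_CORE_HOUR / 100

# A hypothetical 64-core chip at 60% average utilization: extending its
# life from 3 to 5 years grows the vendor's metered revenue accordingly.
print(f"3-year life: ${usage_revenue(64, 3, 0.6):,.2f}")
print(f"5-year life: ${usage_revenue(64, 5, 0.6):,.2f}")
```

Under this model, two extra years of service are worth more to the vendor than nothing at all, which is exactly the incentive flip the interview describes: optimize what's already deployed rather than push the next SKU.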

We have significant market forces and realities at play, but at some point, I believe we're going to see positive movement here. It may come from a business model shift, from market dynamics, or from regulation. Digital infrastructure is simply too critical an industry not to be touched by more societal and governmental regulation.

What excites you right now in hardware innovation?

First, I’m excited about hardware innovation, as there's no way we’ll get a thousand times, ten thousand times, or a hundred thousand times better, faster, cheaper, or more efficient without it. I think that the fundamental hardware innovation we’re talking about is the kind of low-level technology that delivers a net benefit to our world. But we need to think more holistically about the resources we consume and how to create a sustainable relationship with technology.

And the second thing that excites me is open source: in software, hardware, and standards. I always go back to Linux, an amazing example of how an open, community-based approach can truly thrive, reduce friction in the market, and make massive innovation possible. We need to do the same thing for the foundational, physical layer that underpins our digital world. It will require us to work together — and I love that because it's not a winner-take-all mindset. There's so much valuable work that can be done to make the distribution layer of digital infrastructure more interoperable, and doing so in an open, community-driven way interests me.

I know you have ideas on how open source can play a role here. Is there a Linux-level innovation on the horizon?

Well, there's been some work related to open ISAs, like RISC-V, which I think is super interesting. At the data center layer, there is Open Compute, a project started by Meta (when it was called Facebook!) and Microsoft. They’ve done a great job with hyperscale hardware innovation, which represents a massive part of the supply chain. Another project, which I ran for a while, is called the SSI Alliance, which stands for Sustainable Scalable Infrastructure Alliance. It's part of the Linux Foundation and focused on interoperability between servers and rack-level infrastructure — the stuff that makes it much easier to adopt best practices around power, liquid cooling, reuse, and recycling. So there are some efforts, and we're seeing a positive shift from proprietary technology to open innovation.

Three or four years ago, I found it very difficult to get chip companies and data center companies to pay attention to the topic of sustainable infrastructure and open standards. I heard things like, "Why do we need to focus on this problem? It’s working fine. I'm not sure how it works, but the data centers are full, and all the servers got sold, so it must be fine."

Now we see this intense interest, and I hear things like: "Wait, we're starting to reach the thermal limits of our existing infrastructure. We're starting to not be able to get enough power. The air cooling doesn't work anymore." Overall, we've pushed the limit, and now chip companies and server OEMs are interested in different form factors, different cooling methodologies, and new ways to maintain the infrastructure, which is, in my opinion, a welcome change.

But we still need to figure out how we can interoperate better to meet the global scale that the market is demanding.

Is there going to be a culture clash here between the people who have been doing this for a long time and new people? Do you see that kind of thing taking place?

There’s definitely a generational gap, and I believe an alarm bell has been ringing in our industry for a while. Basically, there are not enough people who are aware of, invested in, and diving into the lower layers of the stack to drive the kind of innovation we need.

I’ll give you an example. When Packet was bought by Equinix — which is one of the largest operators of data centers in the world — I tried bringing a local high school class to a data center near Manhattan, but we had no way to bring them in due to a lack of government IDs. I suspect this issue may be resolved by now, but as a general rule, it's hard for folks to visit, to touch and feel, to understand with their own eyes what a data center is all about. It’s unlikely that folks will become interested in working in the data center and related hardware industry if they’ve never seen it.

In a way, this is a great success for the data center and cloud industry. We've made it work so well, in such an invisible manner, that people don't even worry about it. But it also means it’s harder for new people to come in and start experimenting. My hope is that with today’s interest in digital infrastructure, there will be a whole wave of entrepreneurs who know way more about data centers, power, internet infrastructure, server hardware, chips, and racks.

What are some companies that excite you?

I'm super interested in the fundamental technology of sustainable data centers. I’m a big fan, for example, of EdgeCloudLink, which is working on the hydrogen economy for data centers. It's got a one-megawatt test facility in Silicon Valley that I visited recently, and it's awesome! A data center that isn’t connected to the grid and whose only byproduct is water. How cool is that?

I'm also interested in companies doing cloud-style operations. A special skill set is needed to run physical, global infrastructure at scale because things break all the time. It’s even more difficult as the pace of innovation increases. As such, companies that are comfortable getting their hands dirty at the physical layer and investing in the teams to operate this type of infrastructure are rare and especially needed. You could say that, given my past, I have a soft spot for cloud operators!

Finally, I’m interested in open-source reinvention. I've been a bystander to the growth of several open-source companies like Grafana, where my former co-founder Raj Dutt is the founder and CEO. I think we're just starting to see a class of companies that are born in a cloud-first model with open source as one part of their distribution model, but are also natively offering cloud or SaaS services. I’m excited by how fast these companies can find their product fit and scale revenue.

What about companies like Lightmatter, the photonics business?

To be honest, I’ve had very little brain space for that. I have a few friends playing in the quantum field, but I honestly have to ask them to repeat things to me a few times before I understand it. I must be getting old!

It sounds like building community is important to you.

It is. When I got into business in New York in 2001 with my first company, the NYC tech scene was very different. It wasn’t as vibrant as it is now. But I’ve always felt that building community, even with competitors or other parts of the industry, is super important. For example, when I was running Voxel with Raj, we always talked to our competitors. They were our friends, and we worked together quite transparently. Fortunately, the industry was growing, and you could see how a rising tide could lift all boats.

I think we could use more of that today. I see a lot of new independent, siloed cloud providers who don’t even know each other. Personally, I would love to see these folks work together, even if their offerings are somewhat competitive.

As such, I’m interested in new ways to bring people together. It may not be the same conference format or meetup style of 2015, but I do think there is an unlock in bringing people together, sharing information, building relationships, and seeing what magic can happen.
