Industry Spotlight
Building Sustainable and High-Performance AI Infrastructure
Kasia Zabinska
Co-Founder & CSO, AlloComp


Interviewer
Robin Thomas
Founder, Clockhash
As AI adoption accelerates, infrastructure decisions are no longer background technical choices — they are business-defining commitments. What begins as an experimental model in the cloud can quickly become a long-term cost burden. GPU density rises. Energy consumption climbs. Performance bottlenecks emerge. And what felt convenient at the start becomes complex at scale.
Few leaders approach AI infrastructure with as much intentionality as Kasia Zabinska. As Co-Founder and CSO at AlloComp, Kasia specialises in designing high-density GPU systems that power modern AI — not just for performance, but for control, efficiency and long-term sustainability.
Rather than treating infrastructure as a DevOps afterthought, she frames it as product strategy — one that determines burn rate, reliability, compliance posture and competitive advantage. At a time when AI systems are becoming more power-hungry and geopolitically sensitive, Kasia is helping startups and organisations rethink where — and how — their AI should run.
“Cloud is a great tool. But it should be a deliberate choice, not just a cloud-first reflex.”
To start, could you briefly introduce yourself and AlloComp, and explain how your company helps startups and organisations build AI infrastructure that is both high-performance and sustainable?
I’m a co-founder of AlloComp, where we specialise in high-density GPU systems that power modern AI. But we do more than source hardware. We focus on optimising AI infrastructure for performance, control and efficiency. When systems are optimised, they use less power, need less cooling, suffer fewer failures and last longer. That is where sustainability really begins. We help teams compare options with a clear view of cost, control, performance and what scaling will look like over time. That might mean GPU-on-demand, a single workstation, or a liquid-cooled cluster in a colocation facility or on-premises. The key is making the setup fit the business, not the other way around.
AI adoption is accelerating rapidly. From your perspective, what infrastructure challenges do startups most underestimate when building AI-driven products?
Deciding where to run AI is a strategic decision with long-term consequences, and many startups underestimate that. A lot of teams approach infrastructure like renting a car at the airport: the default option appears, it looks convenient, so they click accept. Hyperscale cloud is powerful and often the right starting point. But once workloads become steady and GPU-heavy, and free credits disappear, costs can rise quickly and performance can become harder to predict.

There is far more to AI hosting than simply choosing between AWS, Azure or GCP. The real challenge is not picking a provider; it is asking better questions early. What will you need in three years? How fast will compute demand grow? Where does your data need to live? What does that mean for operating costs, control and data security? Cloud is a great tool. But it should be a deliberate choice, not just a cloud-first reflex. Teams that step back and design intentionally tend to avoid expensive corrections later.
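As a rough illustration of how those operating-cost questions can be worked through, here is a minimal break-even sketch. Every figure in it (the rental price, system cost, overhead and utilisation) is a hypothetical assumption for illustration, not a number from the interview.

```python
# Hypothetical back-of-envelope comparison: renting GPUs on demand vs. owning a system.
# Every price and utilisation figure below is an illustrative assumption, not a quote.

RENTAL_PER_GPU_HOUR = 3.00    # assumed on-demand price per GPU-hour (USD)
OWNED_SYSTEM_COST = 250_000   # assumed purchase price of an 8-GPU server (USD)
OWNED_HOURLY_OVERHEAD = 6.00  # assumed colocation, power and support cost per hour (USD)
GPUS = 8
UTILISATION = 0.7             # fraction of each month the GPUs are actually busy
HOURS_PER_MONTH = 730

def monthly_rental_cost():
    """Cost of renting the same capacity on demand for the busy hours in a month."""
    return RENTAL_PER_GPU_HOUR * GPUS * HOURS_PER_MONTH * UTILISATION

def months_to_break_even():
    """Months after which owning becomes cheaper than renting, under these assumptions."""
    monthly_owned = OWNED_HOURLY_OVERHEAD * HOURS_PER_MONTH
    monthly_saving = monthly_rental_cost() - monthly_owned
    return OWNED_SYSTEM_COST / monthly_saving

print(f"On-demand rental: ~${monthly_rental_cost():,.0f} per month")
print(f"Break-even for owning: ~{months_to_break_even():.0f} months")
```

Under these assumptions, owning pays for itself in roughly two and a half years; with different prices or lower utilisation the answer flips, which is exactly why the comparison is worth running early.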
“We cannot cheat physics.”
Liquid cooling is gaining attention in AI infrastructure. Can you explain why liquid cooling is becoming critical for high-performance AI workloads, and where it delivers the most impact?
We cannot cheat physics. GPU density has increased so dramatically over the past few years that traditional air cooling is simply hitting its limits. For context, a legacy data centre averages around 8 kW per rack. Today’s AI supercomputers exceed 100 kW per rack, with 600 kW racks already on the horizon. And almost all of that power turns into heat. The more compute you pack in, the more heat you have to remove.

Liquid cooling removes heat more efficiently than air, allowing higher-density systems, more stable performance and lower operating costs. Where it makes the biggest difference is in sustained, high-intensity workloads such as large model training, inference at scale or research clusters running around the clock. These systems are designed to operate flat out for long periods. We are even seeing new reliability metrics gaining traction, such as MTBI, Mean Time Before Interruption, which looks at how long a system can run before an unexpected disruption. That reflects how critical stability has become, because long training runs are expensive to interrupt.

That said, liquid cooling is not a universal answer. Many workloads run perfectly well on well-designed air-cooled systems. But as power density increases beyond a certain point, liquid cooling is what makes higher-density designs possible. For us, it is not about pushing a specific cooling method. It is about designing systems that can handle the performance and density that modern AI now demands.
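For a rough sense of the scale Kasia describes, the sketch below uses only the rack powers quoted above and the standard approximation that essentially all electrical power drawn by a rack ends up as heat that has to be removed.

```python
# Rough illustration of the heat problem, using the rack figures quoted above
# and the approximation that virtually all electrical power becomes heat.

LEGACY_RACK_KW = 8
AI_RACK_KW = 100
HOURS_PER_YEAR = 8760

heat_ratio = AI_RACK_KW / LEGACY_RACK_KW              # 12.5x more heat per rack
annual_heat_mwh = AI_RACK_KW * HOURS_PER_YEAR / 1000  # ~876 MWh of heat per rack per year

print(f"A {AI_RACK_KW} kW rack rejects ~{heat_ratio:.1f}x the heat of a legacy rack")
print(f"Run flat out, that is ~{annual_heat_mwh:.0f} MWh of heat per rack per year")
```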
We often talk about DevOps and cloud efficiency - but hardware plays a massive role underneath. How should founders think differently about infrastructure hardware when designing scalable AI systems?
Hardware is no longer a background detail. When you are training large models, running real-time inference at scale or fine-tuning multimodal systems, infrastructure becomes a defining factor in go-to-market speed, cost and reliability. The pace at which hardware is evolving is unprecedented. We are now running workloads that would have been hard to imagine only a few years ago. But while it is exciting to work on the fastest, newest supercomputers, not every workload needs that level of performance. In fact, some workloads do not require GPUs at all.

Scaling AI is not just about adding more GPUs. It is about building systems where all parts work well together. Compute, storage, networking, power and cooling must be balanced. When they are not, you get very expensive bottlenecks. I actually believe hardware constraints can be a positive force. They push teams to optimise software and to be more deliberate about architecture and efficiency. Just think about the computing power available on the rockets that went to the moon, and what was achieved with it. Constraints often drive innovation.

It is important to think ahead. Future-proofing matters. For example, investing in stronger networking at the start can make far more sense than trying to retrofit it once workloads grow. Designing for where you expect to be in two or three years, rather than where you are today, saves significant cost and disruption later.

And, at the risk of sounding like a broken record: remember, you have options. In some cases owning infrastructure can be more cost-effective over time. We are seeing increasing interest in compact high-performance systems as a stepping stone. NVIDIA recently introduced the DGX Spark, a compact system built on the same Grace Blackwell architecture that powers large-scale supercomputers. It is not about matching hyperscale cloud performance, but about giving teams access to serious capability at a very accessible price point.

Treat hardware as part of your product strategy, not just your DevOps stack. The earlier you do that, the fewer surprises you face as you scale.
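To put rough numbers on the networking point above, the sketch below estimates how long it takes to move a training dataset at different link speeds. The dataset size and link speeds are hypothetical, and the calculation ignores protocol overhead; it simply shows why an undersized network becomes one of those expensive bottlenecks.

```python
# Illustrative only: why networking capacity matters as datasets grow.
# Idealised transfer times, ignoring protocol overhead; dataset size is hypothetical.

DATASET_TB = 10  # assumed training dataset size

for link_gbps in (10, 100, 400):
    gigabits = DATASET_TB * 8_000      # 1 TB = 8,000 gigabits
    minutes = gigabits / link_gbps / 60
    print(f"{DATASET_TB} TB over {link_gbps} Gb/s: ~{minutes:.0f} minutes")
```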
“Sustainability in AI is not about adding something extra. It is about avoiding waste.”
Sustainability is often seen as a nice-to-have. In reality, for AI startups working with limited budgets, what’s the biggest mistake you see when choosing compute infrastructure in terms of long-term operational costs and environmental impact?
You’re right about the “nice to have” perception, and it is a puzzling one. Maybe it comes from other sectors where sustainability is seen as a premium add-on: buying sustainably grown carrots or organic cotton in a supermarket will likely cost you more. In computing, efficiency and sustainability usually mean the same thing.

Sustainability in AI is not about adding something extra. It is about avoiding waste. Use your GPUs well. Build only what you need. Keep systems stable and design with growth in mind. Optimise your code. Do that, and you reduce energy use, extend hardware life and protect your margins at the same time.

And we should absolutely ask how green our cloud providers really are, and not stop at “we use renewable energy”. Sustainability is also about efficiency and design. What is their PUE, essentially how much power goes to computing versus overhead like cooling? What cooling methods do they use, especially for high-density AI systems? How much water does the facility consume? How long does hardware stay in service before being replaced? Is waste heat reused? The more informed and demanding customers become, the more incentive data centre operators have to improve transparency and raise their standards.

For startups working with limited budgets, sustainability is not the expensive option. Waste is.
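For readers unfamiliar with the metric, PUE (power usage effectiveness) is total facility power divided by the power that actually reaches the IT equipment. The figures below are made up, purely to show how the number reads.

```python
# PUE (power usage effectiveness) = total facility power / IT equipment power.
# The figures below are hypothetical, purely to show how the metric reads.

it_power_mw = 1.2        # power reaching servers, storage and networking
facility_power_mw = 1.5  # everything the site draws, including cooling and losses

pue = facility_power_mw / it_power_mw
overhead_mw = facility_power_mw - it_power_mw
print(f"PUE = {pue:.2f}: {overhead_mw:.1f} MW of overhead for every {it_power_mw:.1f} MW of computing")
```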
Looking ahead to the next 2–3 years, what changes do you expect in how AI infrastructure is built, procured, and managed?
Density will keep rising. That will push liquid cooling into the mainstream, and we will likely see more focus on heat reuse, particularly in Europe, where energy costs and sustainability targets are shaping infrastructure decisions.

Pricing pressure is unlikely to ease, so purchasing will require more planning and creativity. Teams will plan further ahead, secure capacity earlier and become more resourceful in how they allocate compute.

Sovereign AI will become a much more important decision factor. More organisations will look for European providers, both in software and infrastructure, to maintain control over data and reduce geopolitical risk.

Workload placement will become more deliberate. Instead of defaulting everything to a cloud environment, teams will design a mix of cloud, colocation and on-premises systems based on cost, performance, compliance and proximity to data.

Regulators, investors and customers are already asking harder questions, and the pressure to measure and report on sustainability and energy use will only increase. I also expect to see more infrastructure deployed closer to users: smaller, high-density edge data centres will support low-latency use cases, data sovereignty requirements and regional development.

Overall, AI infrastructure will become more intentional, more regional and more responsible.
“Scaling AI is not just about adding more GPUs. It is about building systems where all parts work well together.”
You’re very active in tech communities and events. How important are collaborative ecosystems when it comes to building sustainable and scalable AI infrastructure?
AI infrastructure is simply too complex for any single vendor to solve alone. It spans chips, servers, cooling, data centres, networking, software, compliance and more. And it is evolving incredibly fast. No one company covers all of that, so collaboration is not optional, it is essential.

When we started AlloComp, we already had a strong ecosystem of specialists around us, and we continue to build it. We do not pretend to know everything, but we do know who to call. Each partner, customer and support organisation is solving a different piece of the puzzle. The real value appears when those pieces connect.

We see this in action at every AI FORWARD event we organise. When we bring together people who would not normally attend the same conferences, learning accelerates. The most recent event, the launch of the Boole Supercomputer, Ireland’s first liquid-cooled NVIDIA B200-class system, at CloudCIX in Cork, proved it. You might think data centres, servers and racks sound niche, yet more than 100 people showed up on a stormy, wet day because they wanted to see it up close. There is genuine curiosity in this space. The appetite to see, experience and understand what is possible is very real, and we are proud to help fuel it.

Innovation almost always happens at the intersection. Cooling experts learn from automotive engineers. One of my favourite examples: a machine vision specialist who once worked on quality control in a doughnut factory is now applying the same principles to optimise airport operations. The more these conversations happen, the smarter and more sustainable the systems become.

And if we zoom out for a moment, this matters at national and regional levels too. If Europe wants real control over its AI capabilities, that requires collaboration between universities, infrastructure providers, hardware vendors, software developers, energy companies, policy makers, local authorities and many more. Sovereign AI is not built by one company. It is built through coordinated effort. Ireland is very strong at collaboration, and that gives us a real opportunity to build more sustainable, sovereign AI infrastructure that attracts serious projects and talent.
“Those early infrastructure decisions tend to stick. Make sure they are yours, not just the default.”
If you could give one piece of advice to founders or CTOs building AI products today, what would it be?
Now, I will definitely sound like a broken record 😊. You’ve got options. Where your AI runs is not just a technical detail. It will shape your burn rate, reliability, flexibility, compliance, risk and competitive advantage, pretty much every aspect of your business for years to come. It is tempting to choose whatever is fastest and most familiar. But take a moment to pause and design deliberately. Think about where you want to be in three years, not just what is easiest today. Those early infrastructure decisions tend to stick. Make sure they are yours, not just the default.
Key Takeaways
AI infrastructure isn’t about chasing the fastest hardware or defaulting to hyperscale cloud—it’s about intentional design. Where and how AI runs shapes burn rate, reliability, sovereignty, and long-term flexibility far earlier than most teams expect. As density rises and sustainability pressures grow, balanced systems—not brute force scaling—become the real competitive advantage. The teams that pause to design deliberately avoid expensive corrections later.
Our Perspective
Kasia’s insights reflect a shift we increasingly see across AI-driven organisations: infrastructure is no longer an operational layer—it is product strategy. Decisions around workload placement, hardware balance, cooling, and energy efficiency directly influence cost structure, compliance posture, and competitive positioning. At ClockHash, we work with founders and CTOs facing these same questions—how to scale intelligently without locking themselves into costly defaults.
Wrapping Up
If conversations like this resonate, the Clockhash Industry Spotlight Series is where we share more stories from leaders navigating scale, execution, and real-world technical trade-offs. Follow along for future interviews exploring how modern teams build systems that last.
Subscribe to Industry Spotlight
Get the latest scaling stories delivered each month.