Andrew Feldman, CEO of Cerebras, discusses the company's innovative approach to AI compute, challenging NVIDIA with its wafer-scale engine and open-source strategies. He shares insights on sovereign AI, the shift to open-source models, and how Cerebras is redefining AI infrastructure.
Key takeaways
- Cerebras is focused on building systems, not just chips, to provide complete solutions for AI compute, offering both on-premise and cloud-based options.
- The company's wafer-scale engine allows for more on-chip memory, significantly increasing the speed and efficiency of AI inference compared to traditional GPUs.
- Inference is becoming increasingly important as AI applications move from novelty to practical use, driving the need for faster and more efficient compute solutions.
- Cerebras is strategically partnering with sovereign institutions like G42 in the UAE to build AI infrastructure and develop models tailored to specific regions and languages.
- The shift to open-source models is creating new opportunities for smaller companies to innovate and deliver AI-powered applications without relying on proprietary technologies.
- Ease of use is critical for widespread AI adoption, and Cerebras is making it simple for developers to switch to their platform with minimal code changes.
Who this episode is for
- AI founders
- AI operators
- AI investors
- Deep tech enthusiasts
- Individuals interested in the future of AI compute
Nataraj interviews Andrew Feldman, CEO of Cerebras, about the company's mission to revolutionize AI compute. Cerebras is challenging established giants like NVIDIA with its wafer-scale engine, designed for unprecedented AI compute power and efficiency. The conversation explores the current AI landscape, the shift to open-source models, and the future of sovereign AI.
The Genesis of Cerebras
Founded in 2015, Cerebras aimed to address the emerging need for specialized AI compute. Feldman and his team, with a background in building chips and systems, recognized the potential for a new computer architecture tailored to AI workloads. They predicted that AI would usher in a new era of compute, similar to how cell phones and networking equipment created new demands and opportunities.
Despite NVIDIA's dominance at the time, Cerebras believed it could build a new type of computer that would be orders of magnitude faster. This vision led to the development of the wafer-scale engine, a groundbreaking innovation in AI hardware.
Building the World's Largest AI Chip
Cerebras set out to build the largest chip in the history of the computer industry, an effort that demanded fundamental design work, engineering creativity, and outright invention. After three years and half a billion dollars, the company produced its first wafer-scale engine.
The larger size allowed Cerebras to keep more data on the chip, reducing the need for frequent data movement and minimizing power consumption. This resulted in significantly faster performance compared to traditional chips. The initial period was challenging, with 15 months of high spending and no working chip, but the team persevered and eventually achieved a breakthrough.
The Advantage of On-Chip Memory
Cerebras's architecture relies on SRAM, a type of memory far faster than the off-chip DRAM or HBM that GPUs depend on. By packing the wafer with SRAM, Cerebras gets both capacity and speed. This matters most for inference: generating each output token requires streaming the model's weights through the compute units, so throughput is limited by how quickly data can move.
GPU memory bandwidth is a well-known bottleneck for this workload. Cerebras's design addresses it by placing SRAM directly beside each compute core, so weights never have to cross a slow off-chip link, which yields significant improvements in inference speed.
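The bandwidth argument can be made concrete with a back-of-envelope calculation. All figures below are illustrative assumptions for the sketch, not measured vendor specs:

```python
# Back-of-envelope: decode speed when bound by memory bandwidth.
# Each generated token streams the model's weights past the compute
# units, so tokens/sec is capped at bandwidth / model size in bytes.

def tokens_per_second(bandwidth_gb_s: float, params_billions: float,
                      bytes_per_param: int = 2) -> float:
    """Upper bound on single-stream decode speed for a dense model
    (ignores compute time and KV-cache traffic)."""
    model_bytes = params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# A 70B-parameter model in 16-bit weights (140 GB of weights):
hbm_bound = tokens_per_second(3_350, 70)       # ~3.35 TB/s, HBM-class (assumed)
sram_bound = tokens_per_second(21_000_000, 70) # ~21 PB/s, on-chip SRAM (assumed)

print(f"HBM-bound:  ~{hbm_bound:.0f} tokens/sec")
print(f"SRAM-bound: ~{sram_bound:.0f} tokens/sec")
```

With these assumed numbers the off-chip design tops out near 24 tokens/sec per stream, while the on-chip memory ceiling is orders of magnitude higher, which is the core of the wafer-scale argument.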
Product Strategy and Market Approach
Cerebras sells complete computer systems, not just chips, to provide comprehensive solutions for AI compute. These systems can be deployed on-premise or accessed through the cloud, either through Cerebras's own cloud or through partner channels such as AWS Marketplace and Microsoft Azure Marketplace.
The company also offers forward-deployed engineering services to help customers accelerate the delivery of AI solutions. This comprehensive approach sets Cerebras apart from other chip companies and allows it to cater to a wide range of customer needs.
The CUDA Challenge and Open Source
While CUDA has been a significant barrier for new chip companies, it is becoming less relevant in the inference space. Developers can easily switch to Cerebras's platform with minimal code changes, making it an attractive option for those looking to improve inference performance.
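The "minimal code changes" claim can be sketched concretely. In a minimal sketch, a developer already calling an OpenAI-style chat-completions API only swaps the base URL and model name; the request shape is unchanged. The endpoint URL and model name below are illustrative assumptions, not confirmed values from the episode:

```python
import json

# Sketch: the same OpenAI-style request, retargeted at a Cerebras-hosted
# endpoint. Base URLs and model names are illustrative assumptions; check
# the provider's documentation for real values.

OPENAI_BASE = "https://api.openai.com/v1"
CEREBRAS_BASE = "https://api.cerebras.ai/v1"  # assumed OpenAI-compatible endpoint

def build_chat_request(base_url: str, model: str, prompt: str) -> dict:
    """Assemble the HTTP request an OpenAI-compatible client would send."""
    return {
        "url": f"{base_url}/chat/completions",
        "headers": {"Authorization": "Bearer <API_KEY>",
                    "Content-Type": "application/json"},
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

before = build_chat_request(OPENAI_BASE, "gpt-4o-mini", "Hello")
after = build_chat_request(CEREBRAS_BASE, "llama3.1-8b", "Hello")
# Only the URL and the "model" field differ: switching providers is a
# configuration change, not a rewrite.
```

This is why CUDA matters less at the inference layer: the developer programs against an HTTP API, not the chip.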
The rise of open-source models is further weakening CUDA's dominance, as developers can leverage these models without needing to know CUDA. This is creating new opportunities for innovation and making AI more accessible to a wider range of users.
Sovereign AI and Geopolitics
In practice, this means partnerships with sovereign institutions such as G42 in the UAE to build AI infrastructure and develop models tailored to specific regions and languages. These deals reflect the growing importance of sovereign AI as nations seek to build domestic AI capabilities.
Feldman believes that the US should sell chips and AI systems to its allies but not to its adversaries. This highlights the geopolitical considerations that are shaping the AI landscape and the importance of aligning AI development with national interests.