When Microsoft introduced Copilot+ PCs, a pressing question emerged: why can’t these AI applications run directly on GPUs? Nvidia responded at Computex 2024 by announcing a collaboration with Microsoft on an application programming interface (API) that lets developers run AI-accelerated apps on RTX graphics cards.
Key Developments
1. New API for GPU Utilization: Nvidia and Microsoft are creating an API that allows AI applications, including Small Language Models (SLMs) used in Copilot features like Recall and Live Captions, to run on GPUs. This development means these apps can leverage the superior AI processing power of GPUs rather than being limited to Neural Processing Units (NPUs).
2. Enhanced AI Capabilities: GPUs generally offer higher peak AI throughput than NPUs. For example, while Copilot+ PCs require an NPU rated for at least 40 tera operations per second (TOPS), even low-end GPUs can reach 100 TOPS, and higher-end models go well beyond that.
3. Broader Hardware Compatibility: This API opens up Copilot+ functionality to a wider range of PCs, including those that do not meet the NPU requirement. This move could significantly expand the availability and power of AI applications on consumer devices.
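To make the TOPS comparison concrete, here is a minimal Python sketch of how an application might choose between an NPU and a GPU by rated throughput. The 40 TOPS floor and the 100 TOPS figure come from the article; the `Accelerator` type, the `pick_accelerator` helper, and the device names are hypothetical and are not part of any Nvidia or Microsoft API. A real runtime would likely weigh more than peak TOPS.

```python
from dataclasses import dataclass

# NPU floor for Copilot+ PCs, as cited in the article.
COPILOT_PLUS_MIN_TOPS = 40


@dataclass
class Accelerator:
    name: str    # hypothetical device label
    kind: str    # "npu" or "gpu"
    tops: float  # rated peak TOPS (illustrative values only)


def pick_accelerator(devices, min_tops=COPILOT_PLUS_MIN_TOPS):
    """Return the highest-TOPS device that meets the floor, or None."""
    eligible = [d for d in devices if d.tops >= min_tops]
    return max(eligible, key=lambda d: d.tops, default=None)


# Illustrative devices using the figures quoted in the article.
devices = [
    Accelerator("example-npu", "npu", 40),
    Accelerator("example-low-end-gpu", "gpu", 100),
]

best = pick_accelerator(devices)
# 100 TOPS is 2.5 times the 40 TOPS floor.
print(best.name, best.tops / COPILOT_PLUS_MIN_TOPS)
```

With these numbers the GPU is selected, and a machine whose only accelerator falls below the floor gets `None`, which is the case the new API is meant to address.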
Retrieval-Augmented Generation (RAG)
The new API also brings retrieval-augmented generation (RAG) to the Copilot runtime. RAG lets an AI model look up specific local information at query time and use it as context, which yields more accurate and helpful responses. Nvidia demonstrated RAG with its Chat with RTX app earlier this year, showcasing the potential of this technology.
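The retrieval step in RAG can be sketched in a few lines of Python. This is a toy illustration, not the Copilot runtime or Chat with RTX implementation: it assumes simple word-count vectors and cosine similarity in place of a real embedding model, and the function names and sample notes are made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k local documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Prepend the retrieved text so the model can ground its answer."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Made-up local notes standing in for files on the user's PC.
local_notes = [
    "The quarterly budget review is scheduled for Friday at 10am.",
    "The Wi-Fi password for the guest network is printed on the router.",
    "Project Falcon ships its beta build next month.",
]

prompt = build_prompt("When is the budget review?", local_notes)
print(prompt)
```

The resulting prompt would then be passed to a language model. The model itself is omitted here, since the point is only that the answer is grounded in locally retrieved text rather than in the model's training data.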
Nvidia’s RTX AI Toolkit
In addition to the API, Nvidia announced the RTX AI Toolkit at Computex. This toolkit, launching in June, includes various tools and SDKs designed to help developers optimize AI models for specific applications. According to Nvidia, the RTX AI Toolkit can make AI models four times faster and three times smaller compared to open-source solutions.
Future Prospects
The collaboration between Nvidia and Microsoft marks a significant step forward in the development and deployment of AI applications. With the new API and tools like the RTX AI Toolkit, developers can create more powerful and efficient AI applications. As these tools become more widely available, we can expect a surge in AI applications by next year that take advantage of the hardware capabilities of modern GPUs.
By bringing hardware and software closer together, Nvidia and Microsoft are paving the way for a new era of AI-powered applications, making advanced AI functionality more accessible and practical for a broader range of users.