
DeepSeek Reportedly Spent Far More on Its V3 Model Than Disclosed
Section: News
DeepSeek has reportedly invested far more in developing its V3 model than initially disclosed. The company is said to have access to around 60,000 GPU accelerators, including H100 models that are subject to U.S. export restrictions.
Despite the U.S. ban on selling H100 accelerators to China, DeepSeek reportedly acquired approximately 10,000 of these units through imports. The previously cited cost for the V3 model, about $5.6 million, likely represents only a fraction of the total expenditure.
According to the technical documentation for the V3 model, DeepSeek operates a relatively small data center with 2,048 Nvidia H800 accelerators. Rental fees for these GPUs were assumed at $2 per GPU-hour; with an estimated total of 2.8 million GPU-hours across these GPUs, the calculation arrives at the previously cited figure of $5.6 million.
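The arithmetic behind that figure can be checked in a few lines; the $2 per GPU-hour rate and the 2.8 million GPU-hours are the report's stated inputs, while the wall-clock estimate assumes all 2,048 GPUs run in parallel:

```python
# Back-of-the-envelope check of the published V3 training cost,
# using the figures from the technical report.
NUM_GPUS = 2048        # Nvidia H800 accelerators in the data center
GPU_HOURS = 2.8e6      # total GPU-hours for the official training run
RATE_PER_HOUR = 2.0    # assumed rental rate in USD per GPU-hour

total_cost = GPU_HOURS * RATE_PER_HOUR
print(f"${total_cost:,.0f}")  # → $5,600,000

# Wall-clock duration if all GPUs run in parallel (an assumption):
days = GPU_HOURS / NUM_GPUS / 24
print(f"about {days:.0f} days")  # → about 57 days
```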
However, the developers have noted a significant caveat: the stated costs only account for the official training of DeepSeek V3 and exclude expenses related to prior research endeavors and experimental phases concerning architectural design, algorithms, or data usage.
Market analysts from Semianalysis have conducted a detailed assessment of the actual costs. They suggest that DeepSeek, through its parent company High-Flyer, has access to around 60,000 Nvidia accelerators, comprising 10,000 A100 units from the Ampere generation acquired prior to the implementation of U.S. export restrictions, 10,000 H100 units sourced from the gray market, 10,000 H800 accelerators tailored for the Chinese market, and 30,000 H20 units introduced in response to newer export limitations.
During a recent CNBC interview, Alexandr Wang, CEO of Scale AI, said that DeepSeek is using 50,000 H100 accelerators. That statement may rest on a conflation: the H100, H800, and H20 models, which together amount to 50,000 units in the estimate above, all belong to the Hopper generation, albeit in different configurations.
The H100 is the standard version sold in Western markets, while the H800 is a variant that Nvidia modified to limit NVLink communication between multiple GPUs in order to comply with export controls. The H20, designed in light of the more recent restrictions, has significantly reduced compute performance but retains full NVLink functionality. It also ships with the maximum memory configuration: 96 GB of High Bandwidth Memory (HBM3) with 4 TB/s of bandwidth.
Semianalysis further estimates that the infrastructure needed to run 60,000 GPUs would cost around $1.6 billion. Even amortized over several years, the hardware costs attributable to DeepSeek V3's development would remain considerable. Operating costs, and the salaries of the development teams, would come on top of that.
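To put the $1.6 billion estimate in perspective, a rough amortization sketch helps; the four-year depreciation window and the assumption of full utilization are illustrative choices, not figures from Semianalysis:

```python
# Rough amortization of Semianalysis's estimated $1.6B infrastructure cost.
# The 4-year straight-line depreciation and 100% utilization are
# assumptions made for illustration only.
CAPEX = 1.6e9        # estimated total infrastructure cost in USD
NUM_GPUS = 60_000    # estimated size of the GPU fleet
YEARS = 4            # assumed depreciation period

per_year = CAPEX / YEARS
per_gpu_hour = CAPEX / (NUM_GPUS * YEARS * 365 * 24)

print(f"${per_year:,.0f} per year")         # → $400,000,000 per year
print(f"${per_gpu_hour:.2f} per GPU-hour")  # → $0.76 per GPU-hour
```

Under these assumptions, hardware depreciation alone approaches the $2 per GPU-hour rental rate used in the official cost calculation, which illustrates how much the $5.6 million figure leaves out.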
According to DeepSeek, 96% of the cited $5.6 million cost pertains to pre-training, which encompasses the training of the core model. It is important to note that this figure does not reflect the earlier development efforts or innovations introduced in DeepSeek V2.
Among the advancements, the Multi-Head Latent Attention (MLA) caching technique reportedly took several months to develop. It compresses generated tokens for rapid access during new queries, minimizing the required storage. Another significant innovation is the DualPipe approach, which repurposes a portion of the streaming multiprocessors (SMs) in Nvidia GPUs as a virtual data processing unit (DPU). This allows data movement between AI accelerators to be managed independently, significantly reducing wait times compared with routing transfers through the CPU and thereby improving efficiency.
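The core of the MLA caching idea, storing a small latent vector per token instead of full keys and values and re-expanding it on read, can be sketched as follows; all dimensions and weight shapes here are illustrative assumptions, not DeepSeek's actual configuration:

```python
import numpy as np

# Minimal sketch of latent-KV caching in the spirit of Multi-Head Latent
# Attention (MLA): cache a compressed latent per token, reconstruct keys
# on demand. Sizes below are invented for illustration.
rng = np.random.default_rng(0)
d_model, d_latent, n_tokens = 1024, 64, 512

W_down = rng.standard_normal((d_model, d_latent)) / np.sqrt(d_model)
W_up_k = rng.standard_normal((d_latent, d_model)) / np.sqrt(d_latent)

hidden = rng.standard_normal((n_tokens, d_model))  # per-token activations

# Cache only the compressed latents: n_tokens x 64 instead of n_tokens x 1024.
kv_cache = hidden @ W_down

# On a new query, re-expand keys from the latent cache.
keys = kv_cache @ W_up_k

full_bytes = hidden.nbytes      # what an uncompressed KV cache would hold
latent_bytes = kv_cache.nbytes
print(f"cache shrunk {full_bytes // latent_bytes}x")  # → cache shrunk 16x
```

The saving scales with the ratio of model dimension to latent dimension, at the price of an extra matrix multiply when the cache is read.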
Notably, the technical documentation for the more powerful R1 model does not disclose the hardware used, raising doubts that a small data center would suffice for it. Recent reports suggest that DeepSeek may also be using Huawei AI accelerators for the R1 model.