
Chinese company DeepSeek has developed an AI chatbot similar to OpenAI's ChatGPT. According to a CNBC report, after its launch in the U.S. in January, it quickly surpassed ChatGPT to become the No. 1 free app on the Apple App Store. At the same time, DeepSeek claims that the AI model behind it cost less than USD 6 million to develop. This poses a disruptive challenge to the current market belief that AI development requires huge budgets and top-tier chips, and also caused chip giant NVIDIA's stock to fall by 16.86%, wiping out nearly USD 600 billion in market value.
This article will explore the key factors behind DeepSeek's breakthrough in the market, the impact on NVIDIA, and a comparison of DeepSeek's model applications with those of OpenAI and Google Gemini.
What Is DeepSeek?
DeepSeek is an emerging Chinese AI company that launched an AI chatbot similar to ChatGPT on January 10, currently available in both web and mobile app versions. Most notable is the newest model behind the chatbot, DeepSeek-V3, which rivals international AI leaders like OpenAI and Anthropic in cost, performance, and mathematical reasoning capabilities.
What's more, unlike the closed-model strategy adopted by OpenAI, DeepSeek focuses on low cost, high efficiency, and open-source technology, drawing the attention of developers and companies worldwide. Even Yann LeCun, Chief AI Scientist at Meta, has noted that "open-source models are surpassing proprietary models."
✦Further Reading: Analysis by iKala CEO Sega: How DeepSeek's Open-Source Strategy Could Reshape the AI Value Chain
Why Did DeepSeek's Success Impact NVIDIA's Stock?
DeepSeek's success has upended traditional assumptions about AI development and introduced uncertainty to the high-performance chip market, proving that highly efficient AI models do not necessarily require huge budgets and top-tier GPUs. Especially in a global AI industry still heavily dependent on NVIDIA GPUs (like the H100 and A100) for training, if DeepSeek-V3 can optimize math and reasoning applications with an open-source, low-cost, and high-efficiency core, it may reduce enterprises' need for expensive GPU servers and affect NVIDIA's market share.
Moreover, with U.S. restrictions on exporting AI chips to China, Chinese companies are accelerating the development of their own AI infrastructure. As a leader in open-source AI, DeepSeek is driving companies to adopt domestic AI chips—such as HUAWEI Ascend and Cambricon—potentially further weakening NVIDIA's competitive advantage in the Chinese market. As DeepSeek's technology continues to advance, it may redefine the landscape of the AI chip industry.
Differences Between DeepSeek R1 and V3 Models: Which Use Cases Are They Suitable For?
Recently, DeepSeek introduced two large language models: DeepSeek-V3 and DeepSeek-R1, each with distinct architectural designs, training methods, and application scenarios.
DeepSeek-V3: A High-Performance, Open-Source Base Model Optimizing AI Training and Inference Costs for NLP Applications
DeepSeek-V3 employs a Mixture-of-Experts (MoE) architecture with 671 billion parameters, of which only about 37 billion are activated per token, striking a balance between high performance and low cost. The model focuses on natural language processing (NLP) tasks, featuring excellent text generation, multi-language support, and conversational understanding, making it suitable for applications requiring large-scale language processing.
DeepSeek-R1: Focused on Reasoning Capabilities and Enhanced Through Reinforcement Learning
Based on the DeepSeek-V3 architecture, DeepSeek-R1 further strengthens reasoning ability, especially in solving mathematical problems, generating code, and performing logical reasoning. Trained via reinforcement learning, it is designed for tasks requiring advanced reasoning, such as research, academic applications, and complex problem-solving.
Overall, DeepSeek-V3 offers scalability and high performance for a wide range of NLP tasks, while DeepSeek-R1 targets specialized reasoning requirements, delivering superior problem-solving capabilities. The choice depends on specific use cases and requirements.
Why Are DeepSeek's Models Cost-Effective? How Do They Reduce AI Development Costs?
DeepSeek's most notable breakthrough is that its AI chatbot can match OpenAI's capabilities but at only a fraction of the cost. How do they achieve this?
1. Company Background and Flexible Organizational Structure
According to a report shared by independent tech journalist Tim Culpan, DeepSeek's success stems not only from technological breakthroughs but also from minimizing unnecessary computational resource consumption: using as little computation as possible to achieve performance comparable to other large AI models. This mirrors the strategies of quantitative hedge funds in financial markets (DeepSeek itself grew out of the Chinese quantitative fund High-Flyer): filter out the most valuable information, then use proprietary algorithms to infer market trends, thereby increasing efficiency and reducing operational costs.
The core strategy of the DeepSeek team lies in removing redundant data, optimizing computational processes, and leveraging open-source technology to reduce development costs. This enables them to offer powerful NLP and reasoning capabilities with fewer computational resources. Meanwhile, unlike a large company like Google with extensive bureaucratic processes, DeepSeek can rapidly implement ideas.
2. Technological Innovations in AI Development
Beyond company culture, the key lies in DeepSeek's unique technological innovations and resource optimization strategies. By optimizing computational efficiency through architectural design, DeepSeek reduces GPU resource requirements, enabling more cost-effective AI training and inference. Below are its core technological breakthroughs:
- Multi-Head Latent Attention (MLA): DeepSeek-V3 employs MLA to compress the KV Cache (Key-Value Cache) needed for each query, minimizing computational resource consumption during inference. Research indicates that MLA reduces KV Cache requirements by 93.3%, allowing DeepSeek models to handle longer contexts and achieve faster inference with the same computational resources.
- Mixture-of-Experts (MoE) Architecture: DeepSeek-V3 uses a Mixture-of-Experts model, in which each token is automatically routed to the most appropriate expert networks, reducing unnecessary computational overhead. With MoE, DeepSeek-V3 activates only a small subset of expert networks instead of running the entire set of model parameters, making inference more efficient and lowering computational costs compared to dense Transformer models.
- Multi-Token Prediction (MTP): DeepSeek's training process incorporates multi-token prediction, allowing the model to predict several upcoming tokens at each step instead of a single token at a time. This significantly boosts training efficiency, enabling performance comparable to GPT-4o in a shorter time while reducing GPU computing requirements and energy consumption.
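To make the MoE idea above concrete, here is a minimal top-k routing sketch in NumPy. The dimensions, the number of experts, and k are illustrative toy values, not DeepSeek-V3's actual configuration; the point is that only k expert networks run per token while the rest stay idle.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route a token embedding to its top-k experts (toy sketch).

    x:       (d,) token embedding
    gate_w:  (d, n_experts) gating weights
    experts: list of n_experts weight matrices, each (d, d)
    """
    logits = x @ gate_w                      # gating score per expert
    top_k = np.argsort(logits)[-k:]          # indices of the k best experts
    # softmax over only the selected experts' scores
    w = np.exp(logits[top_k] - logits[top_k].max())
    w /= w.sum()
    # only k expert networks are evaluated; the others are skipped entirely
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top_k))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts)
print(y.shape)  # (8,)
```

With 4 experts and k=2, half of the expert parameters are untouched for this token; at DeepSeek-V3's scale the same principle means only about 37B of 671B parameters are active per token.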
DeepSeek vs. OpenAI: From Model Distillation to an Open-Source Strategy Shift
As DeepSeek rises in the AI realm, the market frequently compares it to OpenAI. However, Reuters reports that DeepSeek may be using a distillation approach that allows smaller AI models to learn by imitating the behavior and decision-making patterns of larger models, thereby improving computational efficiency and reducing costs. This strategy has given DeepSeek an edge in reasoning capability and cost control.
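Conceptually, the distillation approach described above trains a small "student" model to match a larger "teacher" model's output distribution rather than learning from raw labels alone. The sketch below shows the standard soft-label distillation loss (a temperature-softened KL divergence); it illustrates the general technique only, since DeepSeek's actual training pipeline is not public.

```python
import numpy as np

def softmax(z, T=1.0):
    """Softmax with temperature T; T > 1 flattens the distribution."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """KL divergence between softened teacher and student distributions.

    Softening with T lets the student also learn from the teacher's
    'near miss' probabilities, not just its top answer.
    """
    p = softmax(teacher_logits, T)   # teacher's soft targets
    q = softmax(student_logits, T)   # student's predictions
    return float(np.sum(p * np.log(p / q))) * T * T  # conventional T^2 scaling

teacher = [2.0, 1.0, 0.1]
print(distillation_loss(teacher, teacher))          # 0.0 (perfect match)
print(distillation_loss(teacher, [0.1, 1.0, 2.0]))  # positive (mismatch)
```

Minimizing this loss over many prompts pushes the student toward the teacher's behavior at a fraction of the training cost, which is why the approach is attractive for efficiency but contentious when the teacher is another provider's closed model.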
Nonetheless, facing competition from DeepSeek, OpenAI CEO Sam Altman recently admitted in media reports that the company's closed strategy has been "on the wrong side of history" and that they will reconsider an open-source approach to maintain their technological competitiveness. Additionally, on February 1, OpenAI officially launched o3-mini, a lightweight reasoning model, making reasoning available to free-tier users for the first time, suggesting OpenAI is actively responding to the market changes DeepSeek has sparked, aiming to remain at the forefront of AI competition.
Below is a comparison table of DeepSeek vs. OpenAI:
| Comparison Item | DeepSeek | OpenAI |
|---|---|---|
| Technical Architecture | Open-source models (DeepSeek-V3 / R1), allowing developers to deploy and fine-tune freely | Closed model (GPT-4o), accessible only via API |
| Core Technology | Uses Mixture of Experts (MoE) model, reducing unnecessary computation and improving inference efficiency | Uses closed Transformer architecture, integrating multimodal capabilities (text, speech, image) |
| Language & Data Training | Primarily based on Chinese and Asian language data, optimized for the Asian market | More extensive training data, stronger English and multilingual comprehension capabilities |
| Inference & Computational Costs | Uses Multi-head Latent Attention (MLA) to shrink KV cache requirements and reduce inference costs | Stronger performance but higher operational costs, requiring more computing resources |
| Application Scenarios | Suitable for developers, internal enterprise training, digital assistants, and program support | Suitable for enterprise AI services, chat AI, creative AI, and various applications |
| Product & Integration | Open-source model, suitable for developers to train and deploy locally, but not fully production-ready | ChatGPT is fully commercialized and integrates with Microsoft Azure AI, suitable for B2B |
| Market Positioning | Focused on the Chinese market, attracting open-source developers | Global market leader, deeply integrated with Microsoft, enterprises, and IT sectors |
| Government & Regulatory Impact | Supported by the Chinese government, with open-source strategy enhancing acceptance | Strong brand influence, but faces regulatory and data governance challenges |
DeepSeek vs. OpenAI, Which Should Businesses Choose?
- DeepSeek: Suitable for businesses seeking on-premises deployment, low-cost AI training, and customizable AI solutions, especially in mathematics, reasoning, and Chinese NLP applications.
- OpenAI: Ideal for businesses requiring a comprehensive AI product, enterprise-grade API integration, and powerful multilingual capabilities, particularly for global markets and multimodal use cases.
Comparing DeepSeek R1 with Google Gemini: Research Indicates Google Has a Competitive Edge
Although the market mostly compares DeepSeek R1 with OpenAI o1, analysis from Semianalysis suggests that Google Gemini Flash 2.0 Thinking is also competitive in reasoning ability and cost control. Google launched its model a month before the release of R1, outperforming R1 in multiple benchmarks, indicating Google's edge in AI reasoning technology.
Although R1 quickly caught up with OpenAI o1 through inference optimization techniques, the chosen benchmarks might have been biased to highlight R1's strengths. Google Gemini Flash 2.0 Thinking, on the other hand, can operate at lower computational costs even when handling longer contexts, suggesting Google retains a cost advantage in inference.

Attention and Controversies Surrounding DeepSeek's Rapid Rise
1. Technological Origins and Intellectual Property
Reuters reported that DeepSeek might be using a distillation approach involving OpenAI's data during model training to rapidly enhance its capabilities. This could violate OpenAI's terms of service and may affect future policies on data distillation.
2. Content Moderation and Freedom of Speech
Bloomberg News reports that DeepSeek self-censors regarding sensitive topics in China, avoiding queries about events like the Tiananmen Square protests or the possibility of a Chinese invasion of Taiwan. Researchers warn this type of censorship could spread misinformation and shape international public opinion.
3. Data Privacy and Cybersecurity Concerns
Cybersecurity experts warn that DeepSeek collects users' chat logs, IP addresses, and keystroke patterns, which could be used for phishing or manipulative activities, posing risks to individuals and enterprises. Additionally, there are reports of DeepSeek transferring U.S. user data to servers in China, raising concerns about oversight under China's national security laws.
4. Restrictions and Investigations by Various Countries
As DeepSeek garners global attention, certain governments and enterprises remain wary of its use, even considering restrictions to reduce potential cybersecurity risks. According to CNA (Central News Agency), the U.S. Navy, citing security concerns, has advised personnel not to use DeepSeek's model; Taiwan's Ministry of Digital Affairs announced restrictions on using DeepSeek AI products in government agencies; and the UK, France, Germany, and others have launched their own investigations into DeepSeek.
Conclusion
The rapid rise of DeepSeek has reshaped market perceptions of AI model development, leveraging an open-source strategy, low-cost inference, and high-performance training to challenge international giants like OpenAI and Google. Its R1 and V3 models showcase outstanding reasoning and computational capabilities, driving greater accessibility of AI technology and impacting NVIDIA's market value and AI chip demand.
However, DeepSeek's emergence also comes with controversies over data privacy, content censorship, and international market competition, especially under tighter AI regulations. Its future developments merit close attention. With competitors like Google Gemini Flash 2.0 Thinking and OpenAI o3-mini entering the fray, the AI inference market will further accelerate its transformation.
DeepSeek's technological innovations open up new possibilities for the industry. Businesses and developers should maintain an open mind about its applications and explore opportunities in low-cost computing, enhanced reasoning, and corporate AI deployments. Contact iKala to learn more about AI applications!