OpenAI again accuses DeepSeek of illegally "distilling" its models.

OpenAI has again publicly accused the Chinese startup DeepSeek of illegally copying its product functionality through "model distillation," and has submitted related evidence to the U.S. House Select Committee on China. OpenAI claims that DeepSeek used distillation techniques to extract core capabilities from OpenAI's large models, enabling it to develop competitive products with comparable performance at extremely low cost.

According to an internal OpenAI memo, DeepSeek allegedly used "sophisticated and obfuscated methods" to bypass OpenAI's security safeguards and collect output data from OpenAI models to train its own R1 and subsequent models. OpenAI stated that this action not only harmed its business interests but also disrupted fair market competition.


Industry experts point out that while OpenAI's accusations against DeepSeek are extremely harsh, "distillation" is actually quite common in the AI industry. It allows developers to use the output of large models as training references with limited data and computing power, gaining significant advantages in cost and time. However, OpenAI's terms of service explicitly prohibit users from copying its model output to develop competing models. This means that if DeepSeek used OpenAI's model output without authorization, it may indeed have violated OpenAI's contractual terms.
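The distillation technique at the center of the dispute can be illustrated with a minimal sketch: a "student" model is trained to match the softened output distribution of a "teacher" model rather than hard labels. The function names and the example logits below are hypothetical, chosen only to show the core loss computation.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: a higher temperature softens the
    # distribution, exposing the teacher's relative preferences.
    z = logits / temperature
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL divergence between the teacher's softened outputs ("soft labels")
    # and the student's: the core training signal in distillation.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.sum(p * (np.log(p) - np.log(q))))

teacher = np.array([4.0, 1.0, 0.5])   # hypothetical teacher logits
student = np.array([3.0, 1.5, 0.2])   # hypothetical student logits
loss = distillation_loss(teacher, student)
```

Minimizing this loss over many prompts pulls the student's behavior toward the teacher's, which is why access to a large model's outputs alone can be enough to train a strong competitor.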

Despite facing these accusations, DeepSeek continues to perform strongly in the market. Since the release of its R1 model, DeepSeek has become one of OpenAI's main competitors in both domestic and international markets, thanks to its breakthroughs in mathematical reasoning, programming, and general dialogue capabilities. DeepSeek's R1 model reportedly cost only about one-thirtieth as much to train as OpenAI's latest large models, while its performance rivals that of OpenAI's most powerful reasoning model, o1.

DeepSeek's commitment to open source is considered an industry benchmark, evolving from simply releasing model weights to full-stack open source that encompasses the underlying infrastructure. DeepSeek has open-sourced not only the models themselves but also several low-level optimization tools, including FlashMLA, DualPipe, and EPLB (Expert Parallelism Load Balancer). These tools cover key aspects of model training, such as two-way pipeline parallelism, communication-computation overlap, and expert load balancing, helping developers reproduce and deploy the models more efficiently.
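To see why expert load balancing matters in mixture-of-experts training, consider the underlying scheduling problem: experts receive uneven amounts of traffic, and placing them naively leaves some GPUs idle while others are overloaded. The sketch below is a simplified greedy illustration of that problem, not DeepSeek's actual EPLB algorithm (which also handles expert replication and hardware topology); the expert names and load values are made up.

```python
import heapq

def balance_experts(expert_loads, num_gpus):
    # Greedy longest-processing-time assignment: place each expert,
    # heaviest first, on the currently least-loaded GPU.
    heap = [(0.0, gpu, []) for gpu in range(num_gpus)]
    heapq.heapify(heap)
    for expert, load in sorted(expert_loads.items(), key=lambda kv: -kv[1]):
        total, gpu, assigned = heapq.heappop(heap)
        assigned.append(expert)
        heapq.heappush(heap, (total + load, gpu, assigned))
    return {gpu: (total, assigned) for total, gpu, assigned in heap}

# Hypothetical per-expert token loads from routing statistics.
loads = {"e0": 9.0, "e1": 7.0, "e2": 4.0, "e3": 4.0, "e4": 3.0, "e5": 1.0}
placement = balance_experts(loads, num_gpus=2)
```

Even this toy heuristic evens out per-GPU work; a production balancer does the same job under real constraints such as inter-node bandwidth and replicated hot experts.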

DeepSeek's open-source strategy has spurred significant participation from the global developer community. Developers are not only replicating DeepSeek but also attempting to "stitch" it together with models like Claude and Gemini, leveraging their respective strengths to build hybrid AI systems. This demonstrates that DeepSeek has become a fundamental component of the global AI toolkit.

This article is a user submission and does not represent the views of this website.

The copyright of this content belongs to the original author. Please contact the original author for authorization before reprinting. For any copyright infringement issues, please contact copyright@jaketao.com
