Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out

Models & Benchmarks · Published: Jan 30, 2025 · Amara Okonkwo · ~10 min read

Author

Amara Okonkwo · Robotics & Embodied AI Editor

Humanoids, industrial robots, and what is demo vs. deployed.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out

I read the headlines about OpenAI and Anthropic’s coordinated strike against DeepSeek, but what actually matters is whether these models can run reliably on hardware that doesn’t cost a fortune. The industry loves a legal battle, but I’m watching the unit economics of distillation to see if this threat is real or just noise.

I think legal threats don’t fix broken supply chains or inflated inference costs. In the field, distillation is standard engineering practice, not necessarily IP theft. What I watch for is microsoft’s quick integration proves they care about capability more than copyright posturing.

Waking up to the news, I see that both OpenAI and the parent company of Claude have taken action against DeepSeek. This isn’t just a PR stunt; it signals a genuine anxiety in Silicon Valley about losing its moat.

According to The Financial Times, OpenAI stated that it has found evidence proving DeepSeek used their models for training, which is suspected of infringing on intellectual property rights. The core accusation involves “distillation”—a technique where outputs from larger, more expensive models are used to train smaller, cheaper ones. This allows competitors to achieve similar results in specific tasks at a lower cost, threatening the high-margin business models of the incumbents.

Microsoft has also begun investigating whether DeepSeek used OpenAI’s API, adding another layer of scrutiny to DeepSeek’s development practices.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 2

Upon the news breaking, a wave of mockery ensued online. The hypocrisy angle resonated with engineers who have spent years building on open weights and public data.

New York University professor Marcus was among the first to criticize:

OpenAI: We need free access to all artists’ and writers’ works to train our models so we can save money to sue DeepSeek for blatantly stealing from us!

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 3

Jason, founder and editor-in-chief of the well-known tech media outlet 404 Media, directly called out OpenAI in an article, implying that they are guilty of hypocrisy. The sentiment is clear: when you build a castle on sand, don’t complain when others find the foundation.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 4

Meanwhile, Dario Amodei, founder of Claude’s parent company Anthropic, wrote a lengthy post discussing DeepSeek. His take was more measured but still dismissive of the threat level.

He stated that claiming DeepSeek poses a threat is an exaggeration; it is merely “at the level we were 7–10 months ago.” He noted that Claude 3.5 Sonnet still remains far ahead in many internal and external evaluations, suggesting that DeepSeek’s leap forward is impressive but not yet game-changing for top-tier performance.

However, to maintain our lead, I suggest we might need to impose more constraints on ourselves?

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 5

Good heavens. To contain DeepSeek, competitors OpenAI and Anthropic have rarely joined forces. This rare alliance highlights the perceived urgency of the situation, even if their public statements downplay it.

In contrast, Microsoft’s approach has been more intriguing. While others were drafting cease-and-desist letters, they were looking at how to monetize the technology immediately.

Just hours after accusing DeepSeek of infringement, Microsoft integrated the DeepSeek model into its AI platform. This pragmatic move suggests that if you can’t beat them, you might as well sell their engine under your own brand.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 6

Netizens commented: As the saying goes, denial is the first step to acceptance. The market doesn’t care about your legal grievances; it cares about what works and how much it costs.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 7

The API Loophole and the Distillation Debate

I read the reports on Microsoft and OpenAI’s ongoing investigation into DeepSeek, and what stands out is not just the suspicion of API misuse from last autumn, but the specific allegation of data leakage. Under OpenAI’s terms of service, while anyone can register for the API, using that output to train models that compete directly with them is strictly forbidden.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 8

OpenAI told The Financial Times that they found evidence of model distillation, suspecting DeepSeek was responsible. Currently, OpenAI has refused to comment further and declined to provide details on the evidence.

I think aPI logs don’t lie about usage patterns, but proving intent is a legal minefield. In the field, distillation is standard engineering; treating it as theft ignores how the industry actually works.

Let’s look at what model distillation actually is, since it has sparked this controversy. It is a model compression technique that “distills” the knowledge of a complex, computationally expensive large model (the teacher model) into a smaller, more efficient model (the student model). The core goal is to allow the student model to retain as much performance as possible from the teacher while being lightweight.

In the paper Distilling the Knowledge in a Neural Network by Turing Award winner and “Father of Deep Learning” Geoffrey Hinton, it was pointed out:

Distillation is very effective for transferring knowledge from ensembles or large highly regularized models to smaller distilled models.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 9

For example, Together AI recently worked on distilling Llama 3 into Mamba, achieving up to a 1.6x increase in inference speed while maintaining stronger performance.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 10

An article on knowledge distillation from IBM also noted that in most cases, the most advanced LLMs place too high demands on computation and cost… Knowledge distillation has become an important method to transplant the advanced capabilities of large models into smaller (usually) open-source models. Therefore, it has become a key tool for democratizing generative AI.

What I watch for is we need efficient models in production; restricting distillation only helps incumbents hoard compute. I think the unit economics of running massive models without compression are unsustainable for most users.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 11

In the industry, terms of service for some open-source models allow distillation. For instance, Llama allows it, and DeepSeek previously stated in a paper that they used Llama.

Moreover, crucially, DeepSeek R1 is not merely a simple distilled model. Mark Chen, Chief Scientist at OpenAI, stated:

DeepSeek independently discovered some core concepts adopted by OpenAI during the implementation of o1.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 12

He also acknowledged DeepSeek’s work in cost control and mentioned the trend of distillation technology, noting that OpenAI is also actively exploring model compression and optimization technologies to reduce costs.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 13

So, to summarize: Model distillation technology is very common and recognized in both academia and industry, but it violates OpenAI’s terms of service.

Is this reasonable? Who knows. But the issue is that OpenAI itself has significant compliance issues. It is well known that when training its models, OpenAI scraped data from the entire internet, which includes not only freely available knowledge content but also a large amount of copyrighted articles and works. In December 2023, The New York Times sued Microsoft and OpenAI together for intellectual property infringement. This lawsuit has not yet reached a final verdict; throughout this past year, OpenAI has provided multiple explanations to the court regarding its actions.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 14

In the field, hypocrisy is a poor foundation for enforcing technical standards in a competitive market.

The Hypocrisy of the “Fair Use” Defense

I read DeepSeek’s rebuttal, and it reads less like a legal defense and more like a mirror held up to Silicon Valley’s biggest players. They argue that scraping public internet data is not just reasonable but necessary for innovation, citing long-standing fair use doctrines for non-commercial training. The core of their argument rests on the idea of scaling: they claim that any single piece of stolen content is too small to matter in the massive ocean of LLM training data.

What I watch for is scaling laws don’t erase copyright; they just make infringement harder to track. I think if the model’s value comes from specific proprietary data, “scaling” isn’t a shield.

The article points out the glaring contradiction: OpenAI is accused of using The New York Times’ data to train closed-source, commercial models while simultaneously threatening DeepSeek, which builds open-source alternatives. It’s a double standard that ignores the basic logic of AI development, as noted by 404 Media. The lineage of these technologies is clear—OpenAI stands on Google’s Transformer architecture, which itself rests on decades of academic research. To claim ownership over the foundational methods while policing others for using public data is intellectually dishonest.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 15

In the field, open-source models democratize access; closed ones monetize it. What I watch for is we should be debating data licensing, not pretending the current system is fair.

”DeepSeek Models Lead Only in Cost”

While OpenAI was busy stirring the pot, Anthropic jumped into the ring with its own take on DeepSeek. Founder Dario Amodei didn’t see a rival; he saw a ghost from his past. In a personal blog post, Amodei argued that DeepSeek’s latest model isn’t a leap forward but rather something comparable to what Anthropic had 7–10 months ago—just significantly cheaper.

The training for (Claude 3.5) Sonnet took place 9–12 months ago, while the DeepSeek models were trained in November/December, and Sonnet still performs significantly better in many internal and external evaluations.

Therefore, I believe the correct statement is that “DeepSeek generated a model that achieved performance close to Claude from 7 to 10 months ago at a lower cost (though not as low as advertised).”

Amodei also noted that DeepSeek’s total company investment costs are likely similar to Anthropic’s lab expenses, despite the cheaper training run. Sam Altman kept the narrative consistent. He admitted DeepSeek R1 is impressive for its price tag but insisted OpenAI “will clearly bring better models.” This feels like standard operating procedure. When V3 dropped previously, he sarcastically remarked: Relatively speaking, replicating what is proven to be useful is easy.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 16

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out — figure 17

So, what is the actual value of DeepSeek R1? Analyst Ming-Chi Kuo provided a reference in his latest blog post that highlights two accelerated trends. First, AI computing power can continue to grow through optimized training methods despite the slowing of Scaling Laws, which benefits new application exploration. For the past 1–2 years, investors in the AI server supply chain have relied on the thesis that scaling laws remain effective for sustainable growth. But as marginal returns diminish, the market is shifting focus toward efficiency pathways outside of pure scaling, with DeepSeek as the prime example.

I think cheaper training doesn’t fix a model that can’t handle real-world edge cases.

The second trend is a significant decline in API and token prices, which facilitates AI application diversification. Kuo believes the primary profit source remains “selling shovels” (infrastructure) rather than creating new business lines or adding value to existing ones. DeepSeek-R1’s pricing strategy drives down overall generative AI usage costs, boosting demand for compute and easing investor concerns about profitability. However, it remains to be seen if increased volume can offset the lower price per unit. Kuo noted that only large-scale deployers are hitting the slowdown in marginal returns from Scaling Laws. When efficiency accelerates again, Nvidia will remain the winner.

In the field, we need robots that work on a budget, not just cheaper chatbots.

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out

Author

Silicon Valley in Turmoil: DeepSeek Faces Backlash from OpenAI and Anthropic as U.S. Users Speak Out

The API Loophole and the Distillation Debate

The Hypocrisy of the “Fair Use” Defense

”DeepSeek Models Lead Only in Cost”

Comments

Related News

Latest Headlines