    DeepSeek Publishes New AI Training Method to Scale LLMs More Easily

    By Press Room, January 2, 2026 (3 min read)

    DeepSeek got the year rolling with a new idea for training AI. And analysts say it could have a massive impact on the industry.

    The Chinese AI startup published a research paper on Wednesday, describing a method to train large language models that could shape “the evolution of foundational models,” it said.

    The paper, co-authored by its founder Liang Wenfeng, introduces what DeepSeek calls “Manifold-Constrained Hyper-Connections,” or mHC, a training approach designed to scale models without them becoming unstable or breaking altogether.

    As language models grow, researchers often try to improve performance by allowing different parts of a model to share more information internally. However, this increases the risk of the information becoming unstable, the paper said.

    DeepSeek’s latest research enables models to share richer internal communication in a constrained manner, preserving training stability and computational efficiency even as models scale, it added.
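
    To make the idea of "constrained" internal communication concrete, here is a toy sketch. It is not DeepSeek's implementation, and the article does not describe the constraint itself. The sketch assumes, purely for illustration, that several parallel internal streams are mixed at every layer, and that the mixing weights are pushed toward a doubly stochastic matrix (every row and column sums to 1) with Sinkhorn-style normalization. The function names (`sinkhorn`, `mix`, `run`) and the scalar-stream setup are hypothetical.

```python
import random


def sinkhorn(matrix, iterations=50):
    """Alternately normalize columns and rows of a positive square matrix.

    The result approaches a doubly stochastic matrix. Rows are normalized
    last, so each row sums to 1 up to floating-point rounding.
    """
    m = [row[:] for row in matrix]
    n = len(m)
    for _ in range(iterations):
        for j in range(n):  # make each column sum to 1
            col_sum = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= col_sum
        for i in range(n):  # make each row sum to 1
            row_sum = sum(m[i])
            m[i] = [v / row_sum for v in m[i]]
    return m


def mix(weights, streams):
    """Each output stream is a weighted sum of all input streams."""
    return [sum(w * x for w, x in zip(row, streams)) for row in weights]


def run(depth, constrained, n=4, seed=0):
    """Mix n scalar streams through `depth` layers; return the largest magnitude."""
    rng = random.Random(seed)
    streams = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(depth):
        # Hypothetical stand-in for learned mixing weights.
        raw = [[rng.uniform(0.1, 1.0) for _ in range(n)] for _ in range(n)]
        weights = sinkhorn(raw) if constrained else raw
        streams = mix(weights, streams)
    return max(abs(x) for x in streams)


if __name__ == "__main__":
    print("unconstrained:", run(60, constrained=False))
    print("constrained:  ", run(60, constrained=True))
```

    In this toy setup, the unconstrained weights amplify the signal at every layer, so its magnitude grows rapidly with depth. The constrained weights turn each output into a weighted average of the inputs, so the magnitude cannot grow. That is the general flavor of keeping richer mixing stable as depth increases, under the assumptions stated above.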

    DeepSeek’s new method is a ‘striking breakthrough’

    Wei Sun, the principal analyst for AI at Counterpoint Research, told Business Insider on Friday that the approach is a “striking breakthrough.”

    DeepSeek combined various techniques to minimize the extra cost of training a model, Sun said. She added that even with a slight increase in cost, the new training method could yield much higher performance.

    Sun said the paper reads as a statement of DeepSeek’s internal capabilities. By redesigning the training stack end-to-end, the company is signaling that it can pair “rapid experimentation with highly unconventional research ideas.”

    DeepSeek can “once again, bypass compute bottlenecks and unlock leaps in intelligence,” she said, referring to its “Sputnik moment” in January 2025, when the company unveiled its R1 reasoning model.

    The launch shook the tech industry and the US stock market, showing that the R1 model could match top competitors, such as OpenAI’s o1, at a fraction of the cost.

    Lian Jye Su, the chief analyst at Omdia, a technology research and consulting firm, told Business Insider on Friday that the published research could have a ripple effect across the industry, with rival AI labs developing their own versions of the approach.

    “The willingness to share important findings with the industry while continuing to deliver unique value through new models showcases a newfound confidence in the Chinese AI industry,” Su said of DeepSeek’s paper. Openness is embraced as “a strategic advantage and key differentiator,” he added.

    Is the next DeepSeek model on the horizon?

    The paper comes as DeepSeek is reportedly working toward the release of its next flagship model, R2, following an earlier postponement.

    R2, which had been expected in mid-2025, was delayed after Liang expressed dissatisfaction with the model’s performance, according to a June report by The Information. The report said the launch was also complicated by shortages of advanced AI chips, a constraint that has increasingly shaped how Chinese labs train and deploy frontier models.

    While the paper does not mention R2, its timing has raised eyebrows. DeepSeek previously published foundational training research ahead of its R1 model launch.

    Su said DeepSeek’s track record suggests the new architecture will “definitely be implemented in their new model.”

    Sun, on the other hand, is more cautious. “There is most likely no standalone R2 coming,” Sun said. Since DeepSeek has already integrated earlier R1 updates in its V3 model, the technique could form the backbone of DeepSeek’s V4 model, she added.

    Business Insider’s Alistair Barr wrote in June that DeepSeek’s updates to its R1 model failed to generate much traction in the tech industry. Barr argued that distribution matters, and DeepSeek still lacks the broad reach enjoyed by leading AI labs — such as OpenAI and Google — particularly in Western markets.
