Chinese AI startup DeepSeek, in collaboration with Tsinghua University, has introduced a promising new method for enhancing reasoning abilities in large language models (LLMs). The technique, which merges Generative Reward Modelling (GRM) with Self-Principled Critique Tuning (SPCT), is designed to bring LLMs closer to human-aligned decision-making and problem-solving.
The resulting models, called DeepSeek-GRM, have demonstrated state-of-the-art performance on general queries, outperforming several existing approaches in reasoning and preference alignment. This research signals a broader trend in AI: moving beyond raw generative power to focus on how models think, critique, and align with human logic.
The new approach combines two complementary strategies:
Generative Reward Modelling (GRM): Rather than reducing feedback to a static scalar label, the reward model writes out its judgement of candidate responses in natural language, giving the LLM richer, preference-aligned feedback.
Self-Principled Critique Tuning (SPCT): This training phase teaches the model to generate its own evaluation principles for a given query, critique responses against those principles, and refine its judgements, allowing self-correction without constant external human oversight.
Together, these techniques aim to boost both the quality and trustworthiness of LLM responses—especially on open-ended or ambiguous queries where traditional models tend to stumble.
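To make the combination more concrete, here is a minimal, illustrative Python sketch of a generative reward model that states its own principles, critiques candidate answers, and only then assigns scores. This is not DeepSeek's implementation: the `call_llm` helper is a hypothetical stand-in for any LLM client, and the prompt format and parsing are assumptions chosen for readability.

```python
from dataclasses import dataclass


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns a canned judgement here."""
    return (
        "principles: factual accuracy; completeness\n"
        "critique: Response A explains the mechanism; Response B is vague.\n"
        "scores: A=8, B=4"
    )


@dataclass
class Judgement:
    principles: str
    critique: str
    scores: dict


def generative_reward(query: str, responses: dict) -> Judgement:
    """Ask the model to state principles, critique candidates, then score them."""
    prompt = (
        f"Query: {query}\n"
        + "\n".join(f"Response {name}: {text}" for name, text in responses.items())
        + "\nFirst state the evaluation principles for this query, then critique "
          "each response against them, then output 'scores: A=<int>, B=<int>'."
    )
    raw = call_llm(prompt)

    # Parse the generated text back into a structured judgement. A production
    # system would typically sample several judgements and aggregate the scores.
    lines = raw.splitlines()
    principles = next(l for l in lines if l.startswith("principles"))
    critique = next(l for l in lines if l.startswith("critique"))
    score_line = next(l for l in lines if l.startswith("scores"))
    scores = {}
    for part in score_line.split(":", 1)[1].split(","):
        name, value = part.split("=")
        scores[name.strip()] = int(value)
    return Judgement(principles, critique, scores)


if __name__ == "__main__":
    judgement = generative_reward(
        "Which response better explains photosynthesis?",
        {"A": "Plants convert light into chemical energy via chlorophyll...",
         "B": "Plants eat sunlight."},
    )
    print(judgement.scores)  # e.g. {'A': 8, 'B': 4}
```

Because the judgement is generated text rather than a single number, several critiques can be sampled for the same query and their scores combined, which is one way this style of reward model can be scaled up at inference time.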
The release of DeepSeek-GRM comes at a pivotal time, as the AI community anticipates DeepSeek’s next-generation model, DeepSeek-R2. While the company has yet to confirm a release date, expectations are high that R2 will build on the foundations laid by the R1 reasoning model and incorporate these latest advancements.
In the meantime, DeepSeek has continued refining its V3 model, improving reasoning accuracy and Chinese-language fluency, an area where the company already has a strong reputation.
DeepSeek is not just pushing the boundaries of AI research behind closed doors. The company has open-sourced several repositories, inviting developer contributions and signaling a commitment to community-driven innovation.
This openness, combined with technical collaboration from academic leaders like Tsinghua University, puts DeepSeek in a unique position: a startup operating with both scientific rigor and a rapidly growing developer ecosystem.
As reasoning becomes the new frontier for LLM development, DeepSeek’s latest method represents a meaningful leap forward. Whether or not DeepSeek-R2 launches soon, it’s clear the company is focused on solving one of AI’s most complex challenges—making models not just smarter, but more aligned with how humans think.
DeepSeek may be quietly shaping the next big wave in AI reasoning; time will tell.