Summary provided by Notion AI, powered by Gemini 3 Pro
This week in generative AI was marked by major model launches, fierce competition among leading AI companies, and significant feature updates across platforms.
OpenAI: GPT-5.2 Launch and Strategic Moves
OpenAI has officially released GPT-5.2, the latest iteration in its flagship large language model series. This release is widely seen as a direct response to Google’s recent advances with Gemini 3, intensifying the ongoing “AI arms race” between the two tech giants. According to OpenAI, GPT-5.2 demonstrates improved accuracy, reduced hallucinations, and enhanced performance on complex, real-world knowledge work tasks such as law, accounting, and coding. The model is now available to ChatGPT Plus, Pro, Team, and Enterprise subscribers, as well as developers via API, with three variants: GPT-5.2 Instant, Thinking, and Pro [1][2][3][4][5].
GPT-5.2’s release followed a period of internal urgency at OpenAI, reportedly triggered by a “code red” after Google’s Gemini 3 Pro launch. OpenAI claims GPT-5.2 outperforms Gemini and matches or exceeds human-level performance on 70% of professional benchmarks. The new model also addresses prior issues with “over-refusals” and latency, aiming to be more useful and responsive in production environments [4][5].
Additionally, OpenAI co-founded the Agentic AI Foundation (AAIF) under the Linux Foundation, alongside Anthropic and Block, with support from Google, Microsoft, AWS, Bloomberg, and Cloudflare. This initiative aims to advance open, interoperable standards for agentic AI systems, such as the new AGENTS.md format for project-specific agent instructions [6].
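To make the idea of project-specific agent instructions concrete, here is a minimal Python sketch of how a tool might locate and read an AGENTS.md file. The summary above only says the format exists; the nearest-file lookup rule and the function names `find_agents_file` and `load_agent_instructions` are assumptions for illustration, not part of any published specification or SDK.

```python
from pathlib import Path
from typing import Optional


def find_agents_file(start: Path, filename: str = "AGENTS.md") -> Optional[Path]:
    """Return the nearest `filename` at or above `start`, or None if absent.

    Assumption: the file closest to the working location takes precedence.
    """
    start = start.resolve()
    # If `start` is a file (or does not exist yet), begin in its parent folder.
    current = start if start.is_dir() else start.parent
    for directory in (current, *current.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


def load_agent_instructions(start: Path) -> str:
    """Read the nearest AGENTS.md as plain text; empty string if none exists."""
    path = find_agents_file(start)
    return path.read_text(encoding="utf-8") if path else ""
```

An agent editing `pkg/sub/main.py` would call `load_agent_instructions(Path("pkg/sub/main.py"))` and prepend the returned text to its working context. Treating the file as free-form text keeps the sketch independent of any particular section layout.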
OpenAI also made a previously paid ChatGPT feature, editing PDFs and images with Adobe Photoshop and Acrobat, free for all users, expanding the accessibility of advanced AI-powered creativity tools [7].
Google: Gemini 3 and Ecosystem Enhancements
Google’s Gemini 3 remains at the forefront of the AI landscape, with the company continuing to push the boundaries of multimodal reasoning and real-world problem-solving. This week, Google rolled out Gemini 3 Deep Think mode for AI Ultra subscribers, providing advanced iterative reasoning for complex math, science, and logic tasks. The new mode enables the model to explore multiple hypotheses simultaneously, further enhancing its capabilities in areas requiring deep analysis [8].
Google released the Gemini Live API on Vertex AI, enabling developers to build real-time, multimodal conversational agents that blend voice, vision, and text for highly contextual and human-like interactions. This release is powered by the Gemini 2.5 Flash Native Audio model, which supports low-latency voice and video agents suitable for demanding enterprise workflows [9].
Other ecosystem updates include improvements to AI Mode in Google Search, additional links in the Gemini app, and pilot AI features in Google News. These changes reflect Google’s efforts to integrate AI more deeply into its core products and information delivery platforms [10].
Anthropic: Claude Opus 4.5 and New Features
Anthropic’s Claude Opus 4.5 has garnered significant attention for its capabilities in autonomous coding and complex task management. Early reviews from developers and users highlight genuine amazement at the model’s performance, with many considering it a transformative tool for professional workflows. Released in late November, Claude Opus 4.5 continues to impress with its ability to handle nuanced, multi-step reasoning tasks [11].
This week, Anthropic rolled out several updates for Claude Code, including the launch of Claude Code on Android, a hotkey model switcher, and improved context window information in status lines. The company is also preparing a “Think Back 2025” experience, allowing users to reflect on their interactions with Claude throughout the year [12][13].
Other Industry Highlights
- LangChain launched the Poly agent for LangSmith, which automates the design, debugging, and improvement of AI agents, furthering the development of agentic AI systems [14].
- Mistral 3 debuted as a new contender in the LLM space, with updates targeting both performance and developer tooling [15].
- Model deployment and evaluation: Discussions and developments continue regarding model formats (e.g., GGUF for local deployment, ONNX for edge devices), cloud and on-premises hosting strategies, and evolving evaluation methodologies, including human and automated assessments of language quality, reasoning, safety, and robustness [16].
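As a rough illustration of the automated side of the evaluation work mentioned in the last item, the Python sketch below scores a model by exact-match accuracy on a handful of prompts. Everything in it is hypothetical: `toy_model` stands in for a real model call, the three test cases are invented, and exact match is only one simple metric among the many (human review, rubric grading, safety probes) that the item refers to.

```python
from typing import Callable, Iterable, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_accuracy(
    model: Callable[[str], str], cases: Iterable[Tuple[str, str]]
) -> float:
    """Fraction of prompts whose normalized output equals the expected answer."""
    cases = list(cases)
    if not cases:
        raise ValueError("no evaluation cases supplied")
    correct = sum(
        normalize(model(prompt)) == normalize(expected)
        for prompt, expected in cases
    )
    return correct / len(cases)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; answers from a lookup table."""
    answers = {"2 + 2 =": "4", "Capital of France?": "Paris"}
    return answers.get(prompt, "unknown")


cases = [
    ("2 + 2 =", "4"),
    ("Capital of France?", " paris "),   # matches after normalization
    ("Largest planet?", "Jupiter"),      # toy_model does not know this one
]
score = exact_match_accuracy(toy_model, cases)
print(f"exact-match accuracy: {score:.2f}")  # prints "exact-match accuracy: 0.67"
```

Passing the model in as a plain callable means the same harness could wrap a locally hosted model or a cloud endpoint without changes to the scoring code.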