Models & Benchmarks
Releases, papers, SOTA benchmarks
Independent · AI news · Analysis
Independent reporting on models, research, agents, and policy—written by our desk, sourced to primary links, open for comments.
Fresh coverage from models to policy — updated as the field moves.
I see Guangdong rising at CVPR: Kaiming He claims top honors while Guangdong Univ. of Tech shatters big tech's elite monopoly.
Geoffrey Hinton claims AI is conscious now. I read this at the AI Forward Agenda in June 2026. It’s hype, not hardware reality.
Tsinghua & BAAI's Brainμ model hits Science, mapping memory-sleep links. I see academic rigor, but commercial viability remains unproven.
I review AirLLM's method for running 70B models on 4GB VRAM. While the architecture is clever, I question its practical latency and batch processing efficiency compared to standard quantization techniques.
Fei-Fei Li launches a spatial intelligence ImageNet. This sets the standard for 3D AI benchmarks through 2026, shifting focus from 2D to volumetric understanding.
China launches a full-stack embodied AI sim using domestic GPUs, advancing the 2025-2026 agenda. Ops take: Local silicon cuts latency but risks supply chain fragility.
I read about MTP for Qwen3.6-35B-A3B via NEXTN speculative decoding in SGLang. Ops take: Local inference latency drops matter more than lab demos.
I review the May 2026 guide for deploying DeepSeek-V4 via SGLang. It details architecture, serving recipes, and batch 5 pitfalls. My read: Local deployment demands rigorous governance oversight.
Qwen3.6-35B-A3B open-sourced with Mixture-of-Experts for efficient local deployment, targeting low-latency and cost-effective edge inference workflows.
I read the claim that ChatGPT's free model now halves hallucinations and boosts memory. I note this lacks reproducibility data, making it hard to verify if the improvements are genuine or just marketing spin.
Musk’s lawsuit exposes the messy reality behind AI hype. I see a neighborhood squabble over AGI definitions and broken promises.
I read the release notes for GPT-5.5. The hype around Nvidia synergy feels like a distraction from the ongoing lawsuits. I'm skeptical of these benchmark claims.
Standout coverage our desk is following closely.
I analyze Qwen3.6-27B vs 35B-A3B specs to guide open-source adoption. I think benchmarking methodology remains opaque.