Summarized by Dodly:
AI's Wild Week: Robots Cook, Code Breaks, and Video Gets Real
Summary
AI continues its rapid advance, with new developments across 3D generation, image and video synthesis, language models, and robotics.

In 3D generation, RecGen reconstructs objects from a limited number of images, and Fizz Forge creates physically accurate 3D assets for simulations and games. For image creation, Hydream01 is a new top-performing open-source model that excels at text rendering and complex layouts, while CDM offers a fivefold speedup for diffusion models such as Stable Diffusion 3. In video generation, UniVidX produces videos with intrinsic properties such as albedo and normals, and Boach 1.0 generates high-quality, character-consistent videos up to 30 seconds long.

Google's Gemma 4 now runs up to 3.1x faster with multi-token prediction, and AlphaEvolve is autonomously discovering algorithms that improve genomics, electricity grids, and even AI hardware.

In robotics, Momo Act 2 offers faster, data-rich reasoning for manipulation tasks, and Genesis AI's Gene 26.5 demonstrates human-level dexterity in tasks such as cooking and lab experiments.

OpenAI has released new real-time voice models, including ones for translation and transcription, and Nvidia teased D-Rex for creating realistic, relightable digital human avatars.

However, a new benchmark called Program Bench shows that current AI models cannot yet rebuild entire programs from scratch: they scored zero percent on all tasks. Additionally, Zia 18B, a new small reasoning model, was trained entirely on AMD hardware and punches above its weight, while Lab OS integrates AI with real science labs for guided experimentation.