Tencent Released PrimitiveAnything: A New AI Framework That Reconstructs 3D Shapes Using Auto-Regressive Primitive Generation

Shape primitive abstraction, which breaks down complex 3D forms into simple, interpretable geometric units, is fundamental to human visual perception and has important implications for computer vision and graphics. Recent 3D generation methods built on meshes, point clouds, and neural fields have enabled high-fidelity content creation, but they often lack the semantic depth and interpretability needed for tasks such as robotic manipulation or scene understanding. Traditionally, primitive abstraction has been tackled with either optimization-based methods, which fit geometric primitives to shapes but often over-segment them into pieces with little semantic meaning, or learning-based methods, which train on small, category-specific datasets and therefore generalize poorly. Early approaches used basic primitives like cuboids and cylinders and later moved to more expressive forms like superquadrics. A major challenge remains: designing methods that abstract shapes in a way that aligns with human cognition while also generalizing across diverse object categories.

Inspired by recent breakthroughs in 3D content generation using large datasets and auto-regressive transformers, the authors propose reframing shape abstraction as a generative task. Prior works in auto-regressive modeling, such as MeshGPT and MeshAnything, have shown strong results in mesh generation by treating 3D shapes as sequences and incorporating innovations like compact tokenization and shape conditioning. Rather than relying on geometric fitting or direct parameter regression, the new approach constructs primitive assemblies one primitive at a time, mirroring how a person would build up a shape. According to the authors, this design captures both semantic structure and geometric accuracy more effectively than fitting or regression.

PrimitiveAnything is a framework developed by researchers from Tencent AIPD and Tsinghua University that redefines shape abstraction as a primitive assembly generation task. It introduces a decoder-only transformer, conditioned on shape features, that generates variable-length sequences of primitives. The framework employs a unified, ambiguity-free parameterization scheme that supports multiple primitive types while maintaining high geometric accuracy and learning efficiency. By learning directly from human-designed shape abstractions, PrimitiveAnything captures how people break complex shapes into simpler components. Its modular design supports easy integration of new primitive types, and experiments show it produces high-quality, perceptually aligned abstractions across diverse 3D shapes.

In more detail, the framework represents each primitive with a discrete, ambiguity-free parameterization of its type, translation, rotation, and scale. These attributes are encoded and fed into a transformer, which predicts the next primitive from the primitives generated so far and from shape features extracted from point clouds. A cascaded decoder models dependencies between attributes so that each primitive's type, translation, rotation, and scale are predicted coherently rather than independently. Training combines cross-entropy losses on the discrete attributes with a Chamfer Distance term for reconstruction accuracy, and uses Gumbel-Softmax to make sampling differentiable so that the reconstruction term can be trained through the discrete choices. Generation continues autoregressively until an end-of-sequence token signals completion, enabling flexible and human-like decomposition of complex 3D shapes.
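To make the discrete parameterization and the stop-at-end-of-sequence loop concrete, here is a minimal Python sketch. It is not the authors' implementation: the bin count, value ranges, type vocabulary, and every function name (`quantize`, `encode_primitive`, `generate`, `make_replay_predictor`) are assumptions chosen for illustration, a scripted stand-in takes the place of the transformer, and the sketch does not attempt to reproduce how the paper resolves rotation ambiguity.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

# Illustrative assumptions only; none of these values come from the paper.
NUM_BINS = 128
PRIMITIVE_TYPES = ["cuboid", "cylinder", "sphere"]


def quantize(value: float, lo: float, hi: float, bins: int = NUM_BINS) -> int:
    """Map a continuous value in [lo, hi] to an integer bin index."""
    clamped = min(max(value, lo), hi)
    idx = int((clamped - lo) / (hi - lo) * bins)
    return min(idx, bins - 1)  # value == hi falls into the last bin


def dequantize(idx: int, lo: float, hi: float, bins: int = NUM_BINS) -> float:
    """Return the center of bin idx in the original continuous range."""
    return lo + (idx + 0.5) * (hi - lo) / bins


@dataclass(frozen=True)
class PrimitiveToken:
    """One primitive as discrete attributes: type, translation, rotation, scale."""
    type_id: int
    translation: Tuple[int, ...]
    rotation: Tuple[int, ...]  # binned Euler angles (an assumption)
    scale: Tuple[int, ...]


def encode_primitive(ptype: str,
                     translation: Sequence[float],
                     rotation_deg: Sequence[float],
                     scale: Sequence[float]) -> PrimitiveToken:
    """Discretize one primitive using hypothetical value ranges."""
    return PrimitiveToken(
        type_id=PRIMITIVE_TYPES.index(ptype),
        translation=tuple(quantize(t, -1.0, 1.0) for t in translation),
        rotation=tuple(quantize(r, -180.0, 180.0) for r in rotation_deg),
        scale=tuple(quantize(s, 0.0, 2.0) for s in scale),
    )


Predictor = Callable[[List[PrimitiveToken]], Optional[PrimitiveToken]]


def generate(predict_next: Predictor, max_len: int = 16) -> List[PrimitiveToken]:
    """Autoregressive loop: extend the sequence until the predictor
    returns None (standing in for the end-of-sequence token) or
    max_len primitives have been produced."""
    sequence: List[PrimitiveToken] = []
    while len(sequence) < max_len:
        nxt = predict_next(sequence)
        if nxt is None:
            break
        sequence.append(nxt)
    return sequence


def make_replay_predictor(script: Sequence[PrimitiveToken]) -> Predictor:
    """Stand-in for the transformer: replays a fixed script, then stops."""
    def predict_next(prefix: List[PrimitiveToken]) -> Optional[PrimitiveToken]:
        return script[len(prefix)] if len(prefix) < len(script) else None
    return predict_next


# A two-primitive "table": a flat cuboid top on a cylindrical leg.
table = [
    encode_primitive("cuboid", (0.0, 0.5, 0.0), (0.0, 0.0, 0.0), (1.5, 0.1, 1.0)),
    encode_primitive("cylinder", (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.2, 1.0, 0.2)),
]
assembly = generate(make_replay_predictor(table))
```

In a real model, `predict_next` would be a forward pass that also conditions on point-cloud features; the loop structure and the variable sequence length are the parts this sketch is meant to show.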

The researchers introduce HumanPrim, a large-scale dataset comprising 120K 3D samples with manually annotated primitive assemblies. Their method is evaluated with geometric metrics (Chamfer Distance, Earth Mover's Distance, Hausdorff Distance, and Voxel-IoU) and with segmentation scores (RI, VOI, SC). Compared to existing optimization- and learning-based methods, it shows superior performance and better alignment with human abstraction patterns. Ablation studies confirm the importance of each design component. The framework also supports 3D content generation from text or image inputs. It offers user-friendly editing, high modeling quality, and storage savings of over 95%, making it well-suited for efficient and interactive 3D applications.
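For readers unfamiliar with two of these metrics, the following pure-Python sketch computes a symmetric Chamfer Distance and a point-based voxel IoU. Conventions vary between papers (squared versus unsquared distances, sums versus means, filled volumes versus surface samples), so the variant below is an assumption and may differ from the one used in the PrimitiveAnything evaluation. The function names, the grid bounds, and the default resolution are likewise illustrative.

```python
from typing import Sequence, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def _nearest_sq_dist(p: Point, cloud: Sequence[Point]) -> float:
    """Squared Euclidean distance from p to its nearest neighbor in cloud."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in cloud)


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer Distance: mean squared nearest-neighbor
    distance from a to b plus the same quantity from b to a.
    Brute force, O(len(a) * len(b)); fine for small clouds."""
    if not a or not b:
        raise ValueError("both point clouds must be non-empty")
    a_to_b = sum(_nearest_sq_dist(p, b) for p in a) / len(a)
    b_to_a = sum(_nearest_sq_dist(q, a) for q in b) / len(b)
    return a_to_b + b_to_a


def voxelize(points: Sequence[Point], resolution: int,
             lo: float = -1.0, hi: float = 1.0) -> Set[Voxel]:
    """Return the set of occupied cells in a resolution^3 grid over [lo, hi]^3.
    Points outside the bounds are clamped to the boundary cells."""
    def cell(x: float) -> int:
        clamped = min(max(x, lo), hi)
        idx = int((clamped - lo) / (hi - lo) * resolution)
        return min(idx, resolution - 1)
    return {(cell(x), cell(y), cell(z)) for x, y, z in points}


def voxel_iou(a: Sequence[Point], b: Sequence[Point],
              resolution: int = 32) -> float:
    """Intersection over union of the occupied voxel sets of two point sets."""
    va, vb = voxelize(a, resolution), voxelize(b, resolution)
    union = va | vb
    return len(va & vb) / len(union) if union else 1.0
```

Lower Chamfer Distance and higher voxel IoU both indicate that the generated primitive assembly covers the target shape more closely; the segmentation scores in the article measure a different property, namely how well the primitive boundaries match human part decompositions.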

In conclusion, PrimitiveAnything approaches 3D shape abstraction as a sequence generation task. By learning from human-designed primitive assemblies, the model captures intuitive decomposition patterns and achieves high-quality results across varied object categories, which points to strong generalization. The method also supports flexible 3D content creation using primitive-based representations. Because the representation is efficient and lightweight, PrimitiveAnything is well-suited for enabling user-generated content in applications such as gaming, where both performance and ease of manipulation are essential.



The post Tencent Released PrimitiveAnything: A New AI Framework That Reconstructs 3D Shapes Using Auto-Regressive Primitive Generation appeared first on MarkTechPost.
