PSHuman: Revolutionizing 3D Human Reconstruction from a Single Image


In a world where augmented reality and virtual avatars are becoming ubiquitous, the ability to create a photorealistic 3D model of a person from a single photo is a groundbreaking advancement. Enter PSHuman, an open-source project developed by Peng Li and collaborators that leverages cross-scale multiview diffusion to achieve this feat. Capable of generating detailed geometry and realistic 3D human appearances across various poses in just one minute, PSHuman is a game-changer for developers, 3D artists, and AI enthusiasts alike.

Explore the project on its official GitHub repository, where you’ll find the source code, pre-trained models, and even a demo on Hugging Face. Let’s dive into what makes PSHuman so extraordinary!

What is PSHuman?

PSHuman, short for Photorealistic Single-image 3D Human Reconstruction, is the official implementation of an AI model that transforms a single image of a clothed person into a fully textured 3D model. Unlike traditional methods requiring multiple views or expensive scans, PSHuman employs cross-scale multiview diffusion to generate consistent views and high-quality textured meshes.

Key Features of the Project

  • Photorealistic Quality: The results boast lifelike textures, clothing folds, and natural poses, as showcased in demo videos (e.g., result_clr_scale4_pexels-barbara-olsen-7869640.mp4 and result_clr_scale4_pexels-zdmit-6780091.mp4).
  • Speed: The entire process—from input image to rendered video—takes under a minute.
  • Versatility: Perfect for applications in virtual reality, virtual fashion, gaming, or animation.
  • Open and Accessible: Licensed as open-source, with models available on Hugging Face for easy testing.

The project builds on prior work like ECON and SIFU for human mesh recovery, and Era3D for consistent multiview generation. A SMPL-free version released on November 30, 2024, eliminates the need for SMPL parametric models, enhancing flexibility for varied poses.

Recent Updates: Rapid Evolution

The PSHuman team is moving fast! Here are the highlights:

  • November 30, 2024: Released the SMPL-free version, which performs robustly for multiview generation without SMPL constraints.
  • December 11, 2024: Deployed an interactive demo on Hugging Face, thanks to Sylvain Filoni. Try it now by uploading a photo and witnessing the magic!

These updates reflect the team’s commitment to making PSHuman accessible to researchers, developers, and hobbyists.

Installation and Usage: A Quick Guide for Beginners

PSHuman is straightforward to set up, but it demands a powerful GPU: over 40GB of VRAM is recommended for the 768-resolution model, with a 512-resolution version in development for RTX 4090 compatibility. Here’s how to get started:

1. Environment Setup

Create a Conda environment and install dependencies:

conda create -n pshuman python=3.10
conda activate pshuman

# PyTorch with CUDA 12.1
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121

# Kaolin (for 3D geometry)
pip install kaolin==0.17.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.1.0_cu121.html

# Other packages
pip install -r requirements.txt

Download the SMPLX models from OneDrive (link provided in the repo).
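Before moving on, it can help to verify that the downloaded model files are where the pipeline expects them. The sketch below uses only the standard library; the smplx_models directory and the SMPLX_NEUTRAL.npz filename are assumptions, so adjust them to match the layout described in the repo.

```python
# Sanity-check that the SMPL-X model files are in place before running
# inference. Directory and file names are assumptions -- adjust them to
# match the layout the PSHuman repo describes.
from pathlib import Path

def check_smplx_models(model_dir, required=("SMPLX_NEUTRAL.npz",)):
    """Return a list of missing files (an empty list means all present)."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).exists()]

missing = check_smplx_models("smplx_models")
if missing:
    print("Missing SMPL-X files:", ", ".join(missing))
else:
    print("All SMPL-X model files found.")
```

Running this before inference is cheaper than discovering a missing file halfway through a GPU job.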

2. Data Preparation

  • Remove the background from your image using Clipdrop or the provided script: python utils/remove_bg.py --path $DATA_PATH
  • Place the RGBA images in $DATA_PATH.
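Since PSHuman expects RGBA inputs, a quick check that background removal actually produced an alpha channel can save a failed run. This stdlib-only sketch inspects the PNG header directly (no imaging library needed); it assumes your processed images are PNGs.

```python
# Check whether a PNG byte stream declares an alpha channel, by reading
# the IHDR chunk directly: color type 6 means RGBA.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_is_rgba(data: bytes) -> bool:
    """True if the PNG header declares color type 6 (truecolor + alpha)."""
    if not data.startswith(PNG_SIGNATURE):
        return False
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte 'IHDR' tag,
    # width (4), height (4), bit depth (1 byte at offset 24),
    # color type (1 byte at offset 25).
    if data[12:16] != b"IHDR" or len(data) < 26:
        return False
    return data[25] == 6
```

Call it on the first few hundred bytes of each file in $DATA_PATH before launching inference.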

3. Inference

Run the main script to generate the textured mesh and video:

CUDA_VISIBLE_DEVICES=$GPU python inference.py --config configs/inference-768-6view.yaml \
    pretrained_model_name_or_path='pengHTYX/PSHuman_Unclip_768_6views' \
    validation_dataset.crop_size=740 \
    with_smpl=false \
    validation_dataset.root_dir=$DATA_PATH \
    seed=600 \
    num_views=7 \
    save_mode='rgb'

Tip: Adjust crop_size (720 or 740) and seed (42 or 600) to optimize results for your image.
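One way to act on that tip is to generate the command line for each crop_size/seed combination and run them in sequence. The helper below is hypothetical convenience code, not part of the repo; it only reuses the flags from the inference call above.

```python
# Generate one inference command per (crop_size, seed) combination.
# The flags mirror the inference call above; this helper itself is a
# hypothetical convenience, not part of the PSHuman repo.
from itertools import product

def sweep_commands(data_path, crop_sizes=(720, 740), seeds=(42, 600)):
    """Yield an inference command for every crop_size/seed pair."""
    base = (
        "python inference.py --config configs/inference-768-6view.yaml "
        "pretrained_model_name_or_path='pengHTYX/PSHuman_Unclip_768_6views' "
        "with_smpl=false num_views=7 save_mode='rgb' "
    )
    for crop, seed in product(crop_sizes, seeds):
        yield (base
               + f"validation_dataset.crop_size={crop} "
               + f"validation_dataset.root_dir={data_path} "
               + f"seed={seed}")

for cmd in sweep_commands("./example_data"):
    print(cmd)
```

Pipe the output into a shell, or run each command by hand and keep the result you like best.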

For training, refer to the paper for data preparation, then run bash scripts/train_768.sh after adjusting paths like data_common.root_dir.

Common Issues

  • Insufficient VRAM? The current model requires over 40GB. A lighter version is coming soon.
  • No releases published yet, but the code is stable.

Related Projects and Inspirations

PSHuman draws from stellar open-source projects:

  • ECON and SIFU: For single-image human mesh reconstruction.
  • Era3D and Unique3D: For consistent multiview image generation.
  • Continuous-Remeshing: For smooth inverse rendering.

These influences highlight the power of the open-source AI community.

Why PSHuman is a Game-Changer

In an era where AI is democratizing creation, PSHuman lowers the barriers to producing hyper-realistic 3D avatars. Imagine virtual try-ons for e-commerce, ethical deepfakes for filmmaking, or personalized medical simulations. While ethical questions around privacy arise, the open-source approach encourages responsible use.

Have you tried PSHuman?
