[COLING 2025] Idea23D: Collaborative LMM Agents Enable 3D Model Generation from Interleaved Multimodal Inputs

yisuanwang/Idea23D

Idea23D: Collaborative LMM Agents Enable 3D Model Generation from Interleaved Multimodal Inputs

2024.11: Idea-2-3D has been accepted by COLING 2025! See you in Abu Dhabi, UAE, from January 19 to 24, 2025!

2025.01: A Gradio demo is available at https://3389f4ca9cd69aae21.gradio.live

Junhao Chen*, Xiang Li*, Xiaojun Ye, Chao Li, Zhaoxin Fan, Hao Zhao


Introduction

Building on large multimodal models (LMMs), we developed Idea23D, a multimodal iterative self-refinement system that enhances any T2I model for automatic 3D model design and generation. It enables various new image creation functionalities together with better visual quality, while understanding high-level multimodal inputs.
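One way to picture an iterative self-refinement loop of this kind is sketched below. It is an illustration under assumptions, not the Idea23D implementation: the callables `draft_prompt`, `draw`, `lift_to_3d`, and `critique`, as well as the round limit and score threshold, are hypothetical stand-ins for the LMM, T2I, and I23D components.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Candidate:
    prompt: str
    image: object
    model_3d: object
    score: float
    feedback: str


def refine(
    idea: str,
    draft_prompt: Callable[[str, Optional[str]], str],     # hypothetical LMM call
    draw: Callable[[str], object],                          # hypothetical T2I call
    lift_to_3d: Callable[[object], object],                 # hypothetical I23D call
    critique: Callable[[str, object], Tuple[float, str]],   # hypothetical LMM judge
    max_rounds: int = 3,        # assumed limit, not taken from the project
    good_enough: float = 0.9,   # assumed threshold, not taken from the project
) -> Candidate:
    """Run draft -> draw -> lift -> critique rounds and keep the best candidate."""
    best = None
    feedback = None
    for _ in range(max_rounds):
        # The previous round's feedback steers the next prompt.
        prompt = draft_prompt(idea, feedback)
        image = draw(prompt)
        model_3d = lift_to_3d(image)
        score, feedback = critique(idea, model_3d)
        candidate = Candidate(prompt, image, model_3d, score, feedback)
        if best is None or score > best.score:
            best = candidate
        if score >= good_enough:
            break
    return best
```

Passing the stages in as plain callables keeps the loop independent of any particular backend, which mirrors the claim that the system can wrap any T2I model.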

Compatibility:

Run

The Gradio demo is coming soon, and you can also clone this repo to your local machine and run pipeline.py. The main dependencies we use include: python 3.10, torch==2.2.2+cu118, torchvision==0.17.2+cu118, transformers==4.47.0, tokenizers==0.21.0, numpy==1.26.4, diffusers==0.31.0, rembg==2.0.60, and openai==0.28.0. These are compatible with gpt4o, instantMesh, hunyuan3d, sdxl, InternVL2.5-78B, and llava-CoT-11B.

pip install -r requirements-local.txt
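After installing, it can help to confirm that the environment matches the versions listed above. The following is a minimal sketch, assuming the pins from this README; the helper name `check_pins` is hypothetical, and version strings (including local suffixes such as `+cu118`) are compared verbatim.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the dependency list in this README.
PINS = {
    "torch": "2.2.2+cu118",
    "torchvision": "0.17.2+cu118",
    "transformers": "4.47.0",
    "tokenizers": "0.21.0",
    "numpy": "1.26.4",
    "diffusers": "0.31.0",
    "rembg": "2.0.60",
    "openai": "0.28.0",
}


def check_pins(pins, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every listed package is installed at exactly the pinned version.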

You can add support for new LMM, T2I, and I23D components by modifying the content under tool/api. An example of generating a watermelon fish is provided in idea23d_pipeline.ipynb. Open Idea23D/idea23d_pipeline.ipynb and explore freely in the notebook.

from tool.api.I23Dapi import *
from tool.api.LMMapi import *
from tool.api.T2Iapi import *


# Initialize LMM, T2I, I23D
lmm = lmm_gpt4o(api_key = 'sk-xxx your openai api key')
# lmm = lmm_InternVL2_5_78B(model_path='OpenGVLab/InternVL2_5-78B', gpuid=[0,1,2,3], load_in_8bit=True)
# lmm = lmm_InternVL2_5_78B(model_path='OpenGVLab/InternVL2_5-78B', gpuid=[0,1,2,3], load_in_8bit=False)
# lmm = lmm_InternVL2_8B(model_path = 'OpenGVLab/InternVL2-8B', gpuid=0)
# lmm = lmm_llava_CoT_11B(model_path='Xkev/Llama-3.2V-11B-cot',gpuid=1)
# lmm = lmm_qwen2vl_7b(model_path='Qwen/Qwen2-VL-7B-Instruct', gpuid=1)



# t2i = text2img_sdxl_replicate(replicate_key='your api key')
# t2i = t2i_sdxl(sdxl_base_path='stabilityai/stable-diffusion-xl-base-1.0', sdxl_refiner_path='stabilityai/stable-diffusion-xl-refiner-1.0', gpuid=6)
t2i = t2i_flux(model_path='black-forest-labs/FLUX.1-dev', gpuid=2)


# i23d = i23d_TripoSR(model_path = 'stabilityai/TripoSR' ,gpuid=7)
i23d = i23d_InstantMesh(gpuid=3)
# i23d = i23d_Hunyuan3D(mv23d_cfg_path="Hunyuan3D-1/svrm/configs/svrm.yaml",
#         mv23d_ckt_path="weights/svrm/svrm.safetensors",
#         text2image_path="weights/hunyuanDiT")

If you want to test on the dataset, simply run the pipeline.py script, for example:

python pipeline.py --lmm gpt4o --t2i flux --i23d instantmesh
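Each of the three flags selects one backend per stage. Below is a minimal sketch of how such flags could be parsed and mapped to the constructor names shown earlier in this README. The mapping table, the `build_parser` and `resolve` helpers, and the restriction to one value per stage are assumptions for illustration; the real pipeline.py may accept more values and be organized differently.

```python
import argparse

# Assumed mapping from flag values to constructor names that appear in this README.
BACKENDS = {
    "lmm": {"gpt4o": "lmm_gpt4o"},
    "t2i": {"flux": "t2i_flux"},
    "i23d": {"instantmesh": "i23d_InstantMesh"},
}


def build_parser():
    """Create one required flag per pipeline stage."""
    parser = argparse.ArgumentParser(description="Select one backend per stage.")
    for stage, options in BACKENDS.items():
        parser.add_argument(
            f"--{stage}",
            choices=sorted(options),
            required=True,
            help=f"backend to use for the {stage} stage",
        )
    return parser


def resolve(argv):
    """Parse argv and return {stage: constructor name}."""
    args = build_parser().parse_args(argv)
    return {stage: BACKENDS[stage][getattr(args, stage)] for stage in BACKENDS}


selection = resolve(["--lmm", "gpt4o", "--t2i", "flux", "--i23d", "instantmesh"])
print(selection)
```

Using `choices` makes argparse reject an unknown backend name with a usage message before any model is loaded.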

Evaluation dataset

  1. Download the required dataset from Hugging Face.
  2. Place the downloaded dataset folder in the path Idea23D/dataset.
cd Idea23D
wget "https://huggingface.co/yisuanwang/Idea23D/resolve/main/dataset.zip?download=true" -O dataset.zip
unzip dataset.zip
rm dataset.zip

Ensure the directory structure matches the path settings in the code for smooth execution.
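The same download can also be scripted in Python. This is a minimal sketch, assuming the archive URL from the commands above and assuming the archive unpacks to a top-level dataset folder, as step 2 describes. The function names are hypothetical, and nothing is downloaded until `fetch_dataset` is called.

```python
import urllib.request
import zipfile
from pathlib import Path

DATASET_URL = (
    "https://huggingface.co/yisuanwang/Idea23D/resolve/main/dataset.zip?download=true"
)


def extract_dataset(zip_path, repo_root):
    """Unpack zip_path into repo_root and return the expected dataset folder."""
    repo_root = Path(repo_root)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(repo_root)
    dataset_dir = repo_root / "dataset"
    # Assumption: the archive contains a top-level "dataset" folder.
    if not dataset_dir.is_dir():
        raise FileNotFoundError(f"expected a folder at {dataset_dir}")
    return dataset_dir


def fetch_dataset(repo_root="."):
    """Download the archive into repo_root, unpack it, then delete the archive."""
    zip_path = Path(repo_root) / "dataset.zip"
    urllib.request.urlretrieve(DATASET_URL, zip_path)
    dataset_dir = extract_dataset(zip_path, repo_root)
    zip_path.unlink()  # mirrors `rm dataset.zip`
    return dataset_dir
```

Calling `fetch_dataset()` from inside the Idea23D folder should leave the data under Idea23D/dataset, and the explicit folder check fails early if the layout differs from what the code expects.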

To-Do List

1. Release code.

2. Support for more models, such as SD3.5, CraftsMan3D, and more.

Citations

@article{chen2024idea23d,
  title={Idea-2-3D: Collaborative LMM Agents Enable 3D Model Generation from Interleaved Multimodal Inputs}, 
  author={Junhao Chen and Xiang Li and Xiaojun Ye and Chao Li and Zhaoxin Fan and Hao Zhao},
  year={2024},
  eprint={2404.04363},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Acknowledgement

We have borrowed code extensively from the following repositories. Many thanks to the authors for sharing their code.

llava-v1.6-34b, llava-v1.6-mistral-7b, llava-CoT-11B, InternVL2.5-78B, Qwen-VL2-8B, llama-3.2V-11B, intern-VL2-8B, SD-XL 1.0 base+refiner, DALL·E, Deepfloyd IF, FLUX.1.dev, TripoSR, Zero123, Wonder3D, InstantMesh, LGM, Hunyuan3D, stable-fast-3d.
