
DeepSeek Unveils Janus-Pro: Unified AI Model for Vision Tasks

DeepSeek launches Janus-Pro, an advanced AI model combining multimodal understanding and generation in a unified framework.

by Philip Lee

Updated January 28, 2025

Beijing, China - DeepSeek unveiled Janus-Pro, a new artificial intelligence model that combines multimodal understanding and generation capabilities within a single framework.

The model is built on the DeepSeek-LLM architecture and comes in 1.5 billion and 7 billion parameter versions.

Janus-Pro incorporates SigLIP-L as its vision encoder, processing images at 384 x 384 resolution for understanding tasks.

For image generation, the system uses a specialized tokenizer with a downsampling rate of 16.
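
To make the downsampling figure concrete, the short Python sketch below computes how many tokens a 16x-downsampling tokenizer would produce for a square image. It assumes the 384 x 384 resolution quoted for understanding tasks also applies to generation, which this article does not state. The function name `token_grid` is hypothetical and is not part of DeepSeek's code.

```python
def token_grid(image_size: int, downsample: int) -> tuple[int, int]:
    """Return (tokens_per_side, total_tokens) for a square image.

    Assumption: each token covers a downsample x downsample pixel patch.
    """
    if image_size % downsample != 0:
        raise ValueError("image size must be divisible by the downsampling factor")
    side = image_size // downsample
    return side, side * side


side, total = token_grid(384, 16)
print(f"{side} x {side} grid, {total} tokens")  # 24 x 24 grid, 576 tokens
```

Under these assumptions, one image maps to a 24 x 24 grid, or 576 tokens.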

DeepSeek released the code under the MIT License, while the model itself is covered by the DeepSeek Model License.

The research team includes Xiaokang Chen, Zhiyu Wu, Xingchao Liu, Zizheng Pan, Wen Liu, Zhenda Xie, Xingkai Yu, and Chong Ruan.

The model aims to resolve previous limitations in multimodal AI by separating visual encoding pathways while maintaining a unified transformer architecture.
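
The idea of separate visual pathways feeding one shared transformer can be illustrated with a toy Python sketch. This is not DeepSeek's implementation: every function here is a hypothetical stand-in that does trivial arithmetic in place of a neural network, and the only point is the routing, where the task picks the encoder but the downstream model stays the same.

```python
from typing import Callable, Dict, List

Features = List[float]


def understanding_encoder(pixels: Features) -> Features:
    """Hypothetical stand-in for the semantic vision encoder."""
    return [p * 0.5 for p in pixels]


def generation_encoder(pixels: Features) -> Features:
    """Hypothetical stand-in for the image tokenizer used for generation."""
    return [float(round(p)) for p in pixels]


def shared_transformer(features: Features) -> float:
    """Hypothetical stand-in for the single transformer both pathways share."""
    return sum(features)


# Each task gets its own visual pathway; nothing else differs.
PATHWAYS: Dict[str, Callable[[Features], Features]] = {
    "understanding": understanding_encoder,
    "generation": generation_encoder,
}


def run(task: str, pixels: Features) -> float:
    """Encode with the task-specific pathway, then call the shared model."""
    if task not in PATHWAYS:
        raise ValueError(f"unknown task: {task}")
    return shared_transformer(PATHWAYS[task](pixels))


print(run("understanding", [1.0, 2.0]))
print(run("generation", [1.2, 2.7]))
```

Keeping the pathways in a lookup table means a task can change its encoder without touching the shared model, which is the flexibility the decoupled design is meant to provide.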

Performance metrics indicate Janus-Pro matches or exceeds task-specific models while offering greater flexibility in deployment.

The model and associated code are available through the project's GitHub repository.
