Model input

prompt

Input prompt

num_inference_steps

Number of denoising steps (minimum: 1; maximum: 50)

guidance_scale

Scale for classifier-free guidance (minimum: 1; maximum: 20)

scheduler

Choose a scheduler.

prior_cf_scale

prior_steps

width

Choose width. Lower the setting if out of memory.

height

Choose height. Lower the setting if out of memory.

batch_size

Choose batch size. Lower the setting if out of memory.
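
The parameters above can be collected into a single input payload and checked before a request is sent. The following is a minimal sketch, not the service's actual client: `build_input`, `RANGES`, and `KNOWN_PARAMS` are hypothetical names introduced for illustration, only the two ranges stated above are enforced, and the example values for width and height are assumptions.

```python
# Ranges copied from the parameter list above; the other parameters
# have no documented range, so they are passed through unchecked.
RANGES = {
    "num_inference_steps": (1, 50),
    "guidance_scale": (1, 20),
}

# Optional parameter names taken from the list above.
KNOWN_PARAMS = {
    "num_inference_steps", "guidance_scale", "scheduler",
    "prior_cf_scale", "prior_steps", "width", "height", "batch_size",
}


def build_input(prompt, **params):
    """Return a payload dict, raising ValueError on bad input (hypothetical helper)."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    payload = {"prompt": prompt}
    for name, value in params.items():
        if name not in KNOWN_PARAMS:
            raise ValueError(f"unknown parameter: {name}")
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(
                    f"{name} must be between {low} and {high}, got {value}"
                )
        payload[name] = value
    return payload


# Example values are assumptions, not documented defaults.
example = build_input(
    "a red cat, watercolor",
    num_inference_steps=30,
    guidance_scale=4,
    width=512,
    height=512,
)
```

Validating on the client side like this reports an out-of-range step count or guidance scale before any compute is spent on the request.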

Text-to-image

art-net-101

Kandinsky is an image-generation neural network developed by a team at Sberbank. The model works in 101 languages, most notably Russian, which makes it more accessible than Midjourney to users from Russia and the CIS.

Readme

The model uses a CLIP model as its text and image encoder, together with a diffusion image prior that maps between the latent spaces of the two CLIP modalities. This approach improves the model's visual quality and opens up new possibilities for blending images and for text-guided image manipulation.

kandinsky-2.2 was trained on the large-scale image-text dataset LAION HighRes and then fine-tuned on our internal datasets.