Model input
prompt
Input prompt
num_inference_steps
Number of denoising steps (minimum: 1; maximum: 50)
guidance_scale
Scale for classifier-free guidance (minimum: 1; maximum: 20)
scheduler
Choose a scheduler.
prior_cf_scale
prior_steps
width
Choose width. Lower the setting if out of memory.
height
Choose height. Lower the setting if out of memory.
batch_size
Choose batch size. Lower the setting if out of memory.
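
The inputs above can be assembled and range-checked before a request is sent. The following is a minimal sketch, not part of the model's own code: the helper `build_input`, the `LIMITS` table, and the example values for `width`, `height`, and `batch_size` are assumptions made for illustration. Only the two documented ranges (1 to 50 steps, guidance scale 1 to 20) come from the list above.

```python
# Hypothetical helper for illustration; not part of the model's API.
# Only the ranges stated in the input list are enforced.
LIMITS = {
    "num_inference_steps": (1, 50),  # documented minimum and maximum
    "guidance_scale": (1, 20),       # documented minimum and maximum
}


def build_input(prompt, **params):
    """Return an input dict after checking the documented ranges."""
    if not prompt:
        raise ValueError("prompt must be a non-empty string")
    for name, value in params.items():
        if name in LIMITS:
            low, high = LIMITS[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return {"prompt": prompt, **params}


example = build_input(
    "a red cat, 4k photo",   # illustrative prompt
    num_inference_steps=50,
    guidance_scale=4,
    width=512,               # assumed value; lower it if out of memory
    height=512,              # assumed value; lower it if out of memory
    batch_size=1,            # assumed value; lower it if out of memory
)
```

Parameters without a documented range (`scheduler`, `prior_cf_scale`, `prior_steps`, `width`, `height`, `batch_size`) are passed through unchanged, since the list above does not state their limits.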
Text-to-image

art-net-101

Kandinsky is an image-generation neural network developed by a team at Sberbank. The model accepts prompts in 101 languages, most notably Russian. Unlike Midjourney, its interface is designed to be intuitive for users from Russia and the CIS.

Readme

The model uses CLIP as its text and image encoder, together with a diffusion image prior that maps between the latent spaces of the two CLIP modalities. This approach improves the model's visual quality and opens up new possibilities for blending images and for text-guided image manipulation.
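
Because text and images end up in a shared embedding space, blending can be thought of as combining embedding vectors before decoding. The sketch below only illustrates that idea with a weighted average over plain Python lists. The function `blend_embeddings` is hypothetical, the vectors are toy values rather than real CLIP embeddings, and whether kandinsky-2.2 combines or normalizes embeddings in exactly this way is an assumption not confirmed by this page.

```python
# Illustrative only: toy vectors stand in for CLIP embeddings, and the
# weighted average is an assumed blending rule, not the model's own code.
def blend_embeddings(embeddings, weights):
    """Return the weighted average of equally sized embedding vectors."""
    if not embeddings or len(embeddings) != len(weights):
        raise ValueError("need one weight per embedding")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return [
        sum(w * e[i] for e, w in zip(embeddings, weights)) / total
        for i in range(dim)
    ]


text_embedding = [1.0, 0.0]   # toy stand-in for an encoded prompt
image_embedding = [0.0, 1.0]  # toy stand-in for an encoded image
blended = blend_embeddings([text_embedding, image_embedding], [1, 1])
```

In a real pipeline the blended vector would be passed to the decoder in place of a single embedding. Changing the weights shifts the result toward the text or toward the image.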

kandinsky-2.2 was trained on the large-scale image-text dataset LAION HighRes and then fine-tuned on our internal datasets.