DALL-E 2, Midjourney and Stable Diffusion - what is the difference?

Jan 09, 2023



A text-to-image model is a machine learning model that takes a natural language description as input and produces an image matching that description. Such models first appeared in the mid-2010s, built on the evolution of generative models such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs).


In recent years, more sophisticated models have appeared, among them DALL-E 2, Midjourney, and Stable Diffusion. We have already taken an in-depth look at Stable Diffusion, so now it is time to compare it to the others.



DALL-E 2


DALL-E 2 was launched by OpenAI in 2022 and is now open to everyone. It may be the easiest to get started with: simply go to the OpenAI website and register. After registration, you receive a monthly allowance of free credits, and more can be purchased.



Midjourney


Midjourney is accessible through Discord. Users are given a certain number of free credits to try out the model, with an option to purchase more. You gain access by joining the Midjourney Discord server, entering one of its newcomer channels, and sending the bot slash commands, as in the example below.
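
For instance, once inside a newcomer channel, a typical exchange looks something like this (the prompt text is purely illustrative):

/imagine prompt: an astronaut riding a horse, cinematic lighting

The Midjourney bot replies with a grid of four draft images, with buttons underneath to upscale any of them or generate new variations.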



DALL-E 2 vs. Midjourney vs. Stable Diffusion


Midjourney, DALL-E 2, and Stable Diffusion can all generate both stunning and "weak" images. Sometimes more work on the prompt is required; in other cases, one attempt is enough to get a good result. In the end, it is hard to say which neural network is the best, but there are clear differences.


Stable Diffusion is the most complex but also the most flexible (and potentially free) option on this list. Since it is an open-source project, you can run it on your own machine, provided it has a sufficiently powerful GPU.
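
For a sense of what running it locally involves, here is a minimal sketch in Python using the Hugging Face diffusers library. This is just one of several ways to run the model; the library, the model identifier, and the prompt are our illustration, not something prescribed by the project:

import torch
from diffusers import StableDiffusionPipeline

# Download the Stable Diffusion weights and move the pipeline to the GPU.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed model ID; others exist
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# Generate one image from a text prompt and save it to disk.
image = pipe("a lighthouse on a cliff at sunset, oil painting").images[0]
image.save("lighthouse.png")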


However, if you prefer a credit-based approach more similar to DALL-E 2, you can use the beta version of DreamStudio. Because Stable Diffusion is an open-source project with an active community behind it, there are many other ways to work with it, well described in this article.


If you need to generate an image here and now, use Midjourney. Joining its Discord server is all it takes, and you can get a good image even without any style settings, although sometimes you may need more than one attempt.


By default, Midjourney's distinctive style produces painterly, oil-painting-like images rather than the photos and drawings that DALL-E 2 and Stable Diffusion tend toward. Whether this is an advantage is up to you to decide.


Want to combine several different objects? DALL-E 2 can help with that. It can build complex compositions, which is useful when a designer needs a lot of references in a short amount of time. DALL-E 2 also lets you modify images as needed: if the neural network has drawn a landscape with an extra tree on the horizon, you can select the tree and erase it, and the model will fill in the gap. Keep in mind, though, that the service works on the credit system described above rather than offering unlimited free requests.
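
If you prefer to script this instead of using the website, the OpenAI Python library exposes both generation and editing. The sketch below assumes the v0.x library current at the time of writing; the file names, mask, and prompts are illustrative:

import openai

openai.api_key = "sk-..."  # your own API key

# Build a complex composition from a single prompt.
result = openai.Image.create(
    prompt="a red bicycle leaning against a blue door, a cat on the saddle",
    n=1,
    size="1024x1024",
)
print(result["data"][0]["url"])

# Edit an existing image: the transparent area of mask.png marks the region
# to repaint (for example, the extra tree on the horizon).
edited = openai.Image.create_edit(
    image=open("landscape.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt="a landscape with a clear, empty horizon",
    n=1,
    size="1024x1024",
)
print(edited["data"][0]["url"])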


Stable Diffusion offers a lot of flexibility and lets you play around with the settings. For example, to generate faster and reduce the load on your GPU, you can lower the Sampling Steps parameter. We recommend being careful with the Classifier-Free Guidance (CFG) Scale, though: pushing it to extreme values can leave you with images containing nothing but "glitches."
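
To make these two knobs concrete: in the diffusers pipeline sketched earlier, they correspond to the num_inference_steps and guidance_scale keyword arguments (those names belong to that library; graphical front ends label them Sampling Steps and CFG Scale):

# Fewer steps mean faster generation and less GPU load; a moderate
# guidance_scale keeps the image close to the prompt without "glitches".
image = pipe(
    "a lighthouse on a cliff at sunset",
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]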



Conclusion


We have already seen digital comics and digital art change because of these tools. More is sure to come, and the models compared above may be just the beginning of an emerging industry.

