Midjourney 6 vs 5: Compared — This Update Was Incredible for AI Image Generation
We got an early Christmas present from the Midjourney team with the surprise release of V6. But how big of an improvement is it from V5.2? Spoiler alert: IT'S INCREDIBLE.
John Angelo Yap
Updated September 18, 2024
A ship mid-journey in the universe, generated with Midjourney V6.1
Reading Time: 8 minutes
DALL-E. Meta. Firefly. Stable Diffusion. Like it or not, the AI image generation market is undeniably oversaturated now. However, there has always been one standout.
To me, it's obvious that Midjourney has been the best AI image generator in the business. However, I do recognize that it had some flaws, particularly with realistic images, long prompts, and text inside images.
That's why I'd been patiently waiting for Midjourney V6, and in December 2023, it finally came. I quickly hopped on Discord and started generating as many images as I could. Let me give you a quick spoiler: it was worth the wait.
Here are some of the best images I created using Midjourney V6, alongside the same prompts run through Midjourney V5.2:
Midjourney v5 and v6 Output Comparison
It's been almost a year since Midjourney V6 came out and let me tell you: it's been incredible. This has been, by far, my favorite image generation model. It somehow fixed every issue the previous versions had. Here are some of my favorite examples:
Portraits
a woman lying in bed with her eyes closed, golden hour, closeup
My biggest gripe with Midjourney was that it couldn't really generate realistic images on par with DALL-E or Meta. The release of V6 seems to have solved that problem. Its realism is on a whole different level now. No more waxy faces and exaggerated features. V6's output is so good that, even if you zoom in, you can see the imperfections that make us human.
Landscapes
landscape, an autumn in the lake during dusk, tranquility
Don't get me wrong: V5.2's image is pretty good, but it's not exactly the look I'm going for. I wanted a realistic lake image, especially since I never asked for an oil painting in the prompt, and that's what V6 gave me. This upgraded version can create authentic-looking images without sacrificing artistic quality. It's way better than DALL-E 3 on this front, in my opinion.
Product Renders
product photography, a perfume, studio lighting, shadow play, jasmine, soft
I'll admit: I'm not too sure about this one. The key difference is that the product images I'm getting from V5.2 look processed and market-ready, whereas V6's look more raw, like they were taken straight out of a camera. It may have something to do with the phrasing of my prompts. I've gotten used to writing cluttered V5 prompts, a habit I need to break now that V6 picks up on more nuance.
I will say this though: if you're a seasoned editor looking for detailed, well-shot raw images, V6 is a lot better than V5.2. This gives you a lot more room to work with since these renders are unprocessed.
Movie Stills (Animated)
animated movie still, a young girl following a magical cat to a tree,
inspired by hayao miyazaki, whimsical, magical realism, clean lines, detailed, 8k
This is a great time to talk about nuance. In my prompt, I specifically requested a still that looks like Hayao Miyazaki's work. V5.2's output didn't follow this at all, instead going for a generic 3D DreamWorks style of animation. On the other hand, V6 followed this instruction to a tee. It looks straight out of Howl's Moving Castle.
I also highly suggest you zoom in and look at the details in V6's output. The still is so much more vivid and full of life. It's genuinely mind-blowing how much Midjourney has improved over the last couple of years.
Movie Stills (Live Action)
film still, back shot of a man in a green jacket, symmetrical,
muted colors, directed by wes anderson
Midjourney V5 definitely had a problem with oversimplifying or overcomplicating prompts, especially ones with lots of context. Look at the example above: I kept the prompt minimal, but V5 still wasn't able to be creative with what it was given. V6 solves this problem by filling in the gaps of my prompt while retaining its original idea.
PS. Yes, I know. The guy is missing his right ear but hey, it was V6's first week when I made it. Here's what the same prompt produces now that V6.1 is out (something that we'll get to in a bit):
Flat Illustrations
logo for a shoe company, clean background, paul rand
I never really had any issue with generating logos with V5.2 but, after seeing these images side by side, I can tell in hindsight that there was room for improvement. V6's output retains the minimalism of V5.2 while adding its own spin, which gives the illustrations more identity.
Surrealism
the planets in the galaxy as hatching eggs of lovecraftian entities,
surrealism, cosmic, lovecraftian, ethereal, celestial bodies
I've always praised Midjourney's surrealist images as one of its strong points. However, V5.2 tends to crowd its output with so many subjects that you sometimes can't figure out what's going on, something that you can see above.
V6, with its improved nuance, manages to strike a balance between fulfilling the prompt and being creative. You can now clearly see what it's trying to portray, even with little to no information about the subject.
Text Generation
a restaurant in a quiet chic neighborhood with a neon sign that says "Closed",
night, streetlights
One of Midjourney's biggest promises before V6 came out was that it would fix text generation, which is a huge problem across all AI image generators. The only one I've tried that's decent on that end is DALL-E 3, but it looks like Midjourney's next in line.
It perfectly wrote "Closed" in the V6 image, even adding its own flair. As for V5.2, well, unless you've got a restaurant called "CORSTARB," I don't think it's cut out for text generation.
However, Midjourney V6 wasn’t perfect, as you can see here:
comic panel, panicked captain america yelling "Get out of here", speech bubbles, gritty
This shows that Midjourney V6 still couldn't reliably handle lettering, since the output is missing a word from my prompt. In my opinion, it works best with text that's only one or two words long. But hey, it's miles better than its competitors, except maybe DALL-E 3, especially now that it has GPT-4o to make it better.
Now, here comes the asterisk: V6.1 was actually released in August 2024, and it looks to be wayyy better at writing text in images. Here’s the same prompt using this new model:
Like I said: way better.
High Context
A detailed oil painting of an old sea captain, steering his ship through a storm.
Saltwater is splashing against his weathered face, determination in his eyes.
Twirling malevolent clouds are seen above and stern waves threaten to submerge
the ship while seagulls dive and twirl through the chaotic landscape.
Thunder and lights embark in the distance,
illuminating the scene with an eerie green glow
Just a heads-up, I borrowed this prompt from OpenAI's DALL-E 3 page. It's sometimes hard to think of elements to add to a prompt. This is also a prompt that OpenAI used to test DALL-E's nuance, so I could also test it with V5 and V6, and then compare.
V5.2 actually did a pretty good job, but still missed a couple of elements like the eerie green glow, seagulls, and thunder. V6 followed everything except seagulls, but there's still one solitary seagull in the background, so this one passes the smell test.
So, Did It Improve?
It did improve, by a lot.
I can't show you every test I've done yet (I'm reserving some for my next article), but it's already a hundred times better than V5.2 in my book. It managed to solve the text generation and nuance issues while simultaneously improving its creativity. Every image I've created so far with V6 is crisp, detailed, and accurate.
What else is there to ask for?
When Can We Expect Midjourney V7?
No official news yet about Midjourney V7, but we got the next best thing.
Midjourney V6.1 was released in August 2024. This model fixed some of the biggest issues with the base V6 model and made it a lot better. And get this: they added a ton of features on top of it, including something that we've been requesting for a long time now, a functioning web application.
As for Midjourney V7, I think it’s safe to assume it’s going to come out next year since the company’s apparently focusing on releasing V6.2 later this year instead.
The Bottom Line
When V5 came out, some said that it was a backward step from V4.
Gradually, the team listened to the community and improved its creativity, even adding some functionalities in the process. The result was Midjourney V5.2, which was already my favorite AI image generator in the market.
Midjourney V6 is a significant improvement on V5.2. It took everything that was already good about V5.2 and reworked the model to create more detailed and accurate images. Everything that I've complained about with V5.2 (nuance, text, realism) has been fixed, and then some.
The best thing is that we can only expect it to get better from here on out. The Midjourney team is already crowdsourcing users' opinions on images through A/B testing to improve its model.
Midjourney V6 was a turning point in AI image generation. And now they're building on it and perfecting it. What an amazing time we're living in.