Midjourney 6 vs 5 Compared Side By Side Shows How Incredible This Update Was for AI-Image Generation

We got an early Christmas present from the Midjourney team with the surprise release of V6. But how big of an improvement is it over V5.2? Spoiler alert: IT'S INCREDIBLE.
Updated January 11, 2024
A ship mid-journey in the universe, generated with Midjourney V6

DALL-E. Meta. Firefly. Stable Diffusion. Like it or not, the AI image generation market is oversaturated now. However, there has always been one standout.

To me, Midjourney has clearly been the best AI image generator in the business. However, I do recognize that it has had some flaws, particularly with realistic images, long prompts, and text inside images.

That's why I've been patiently waiting for Midjourney V6, and last night, it finally came. I quickly hopped on Discord and started generating as many images as I could. Let me tell you a quick spoiler: it was worth the wait.

Here are some of the best images I created using Midjourney V6 along with the same prompt but applied to Midjourney 5.2:

Midjourney v5 and v6 Output Comparison

It’s been a little over 24 hours since Midjourney v6 came out and let me tell you: the hype is real. This has been, by far, my favorite image generator. It somehow fixed every single one of my problems with the previous version. Here are some of my favorite examples:

Portraits

a woman lying in bed with her eyes closed, golden hour, closeup

My biggest gripe with Midjourney was that it couldn't really generate realistic images on par with DALL-E or Meta. The release of V6 seems to have solved that problem. Its realism is on a whole different level now. No more waxy faces and exaggerated features. V6's output is so good that, even if you zoom in, you can see the imperfections that make us human. This is an immense improvement.

Landscapes

landscape, an autumn in the lake during dusk, tranquility

Don't get me wrong: V5.2's image is pretty good, but it's not exactly the look I'm going for. I'm looking for realistic lake images, something that V6 was able to give me. This upgraded version can create authentic-looking images without sacrificing artistic quality. It's way better than DALL-E 3 on this front, in my opinion.

Product Renders

product photography, a perfume, studio lighting, shadow play, jasmine, soft

I'll admit: I'm not too sure about this one. The key difference is that the product images I'm getting from V5.2 look processed and market-ready, whereas V6's look more raw, like they were taken straight out of a camera. It may have something to do with the phrasing of my prompts, since I've gotten used to cluttered V5 prompts. That's a habit I need to work on, given V6's better grasp of nuance.

I will say this though: if you're a seasoned editor looking for detailed, well-shot raw images, V6 is a lot better than V5.2.

Movie Stills (Animated)

animated movie still, a young girl following a magical cat to a tree,
inspired by hayao miyazaki, whimsical, magical realism, clean lines, detailed, 8k

This is a great time to talk about nuance. In my prompt, I specifically requested a film still that looks like Hayao Miyazaki's work. V5.2's output didn't follow this at all, instead going for a generic 3D DreamWorks style of animation. On the other hand, V6 followed this instruction to a tee. It looks straight out of Howl's Moving Castle.

I also highly suggest zooming in and looking at the details in V6's output. The still is so much more vivid and full of life. It's genuinely mind-blowing how much Midjourney has improved over the last couple of months.

Movie Stills (Live Action)

film still, back shot of a man in a green jacket, symmetrical,
muted colors, directed by wes anderson

Midjourney V5 definitely had a problem with oversimplifying or overcomplicating prompts, especially ones with lots of context. Look at the example above: I kept it minimal, but V5 still wasn't able to be creative with the prompt it was given. V6 solves this problem by filling in the gaps of my prompt while retaining its original idea.

PS. Yes, I know. The guy is missing his right ear but hey, it's V6's first week!

Flat Illustrations

logo for a shoe company, clean background, paul rand

I never really had any issue with generating logos with V5.2 but, after seeing these images side by side, I can tell in hindsight that there was room for improvement. V6's output retains the minimalism of V5.2 while adding its own spin to the illustrations, which gives them more identity.

Surrealism

the planets in the galaxy as hatching eggs of lovecraftian entities,
surrealism, cosmic, lovecraftian, ethereal, celestial bodies

I've always praised Midjourney's surrealist images as one of its strong points. However, it has a tendency to overpopulate its outputs with so many subjects that you sometimes can't figure out what's going on, something you can see above.

V6, with its improved nuance, manages to strike a balance between fulfilling the prompt and being creative. You can now clearly see what it's trying to portray, even with little to no information about the subject.

Text Generation

a restaurant in a quiet chic neighborhood with a neon sign that says "Closed",
night, streetlights

One of Midjourney's biggest promises before V6 came out was that it would fix its text generation, which is a huge problem across all AI image generators. The only one I've tried that's decent on that end is DALL-E 3, but it looks like Midjourney's next in line.

It perfectly wrote "Closed" in the V6 image, even adding its own flair. As for V5.2, well, unless you've got a restaurant called "CORSTARB," I don't think it's cut out for text generation.

However, it's still not perfect, as you can see here:

comic panel, panicked captain america yelling "Get out of here", speech bubbles, gritty

This just shows that Midjourney still doesn't reliably recognize letters, since the output is missing a word from my prompt. In my opinion, this works best with one- or two-word text only. But hey, it's miles better than its competitors. Even DALL-E 3 isn't this good.

High Context

A detailed oil painting of an old sea captain, steering his ship through a storm.
Saltwater is splashing against his weathered face, determination in his eyes.
Twirling malevolent clouds are seen above and stern waves threaten to submerge
the ship while seagulls dive and twirl through the chaotic landscape.
Thunder and lights embark in the distance,
illuminating the scene with an eerie green glow

Just a heads-up, I borrowed this prompt from OpenAI's DALL-E 3 page. It's sometimes hard to think of elements to add to a prompt. This is also a prompt that OpenAI used to test DALL-E's nuance, so I could also test it with V5 and V6, and then compare.

V5.2 actually did a pretty good job, but still missed a couple of elements like the eerie green glow, seagulls, and thunder. V6 followed everything except seagulls, but there's still one solitary seagull in the background, so this one passes the smell test.
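If you want to rerun comparisons like these yourself, the tedious part is pasting every prompt twice, once per version. Here's a minimal sketch of how that could be scripted. It assumes you switch models with Midjourney's `--v` parameter (`--v 5.2` and `--v 6`); the helper names `with_version` and `comparison_pairs` are hypothetical and made up for this example, and the script only builds the text you'd paste after `/imagine` in Discord. It doesn't talk to Midjourney at all.

```python
# Hypothetical helper: builds prompt strings only; nothing here calls Midjourney.
# Assumption: the model version is selected with the "--v" parameter.

PROMPTS = {
    "portrait": "a woman lying in bed with her eyes closed, golden hour, closeup",
    "logo": "logo for a shoe company, clean background, paul rand",
}

VERSIONS = ("5.2", "6")


def with_version(prompt: str, version: str) -> str:
    """Return the prompt with a trailing '--v <version>' parameter appended."""
    # Drop surrounding whitespace and a dangling comma before adding the flag.
    cleaned = prompt.strip().rstrip(",")
    return f"{cleaned} --v {version}"


def comparison_pairs(prompts: dict, versions: tuple = VERSIONS) -> dict:
    """Map each prompt name to one versioned prompt string per model version."""
    return {
        name: [with_version(text, version) for version in versions]
        for name, text in prompts.items()
    }


if __name__ == "__main__":
    for name, variants in comparison_pairs(PROMPTS).items():
        print(name)
        for variant in variants:
            print("  /imagine prompt:", variant)
```

Keeping the prompt text identical and changing only the version flag is what makes the side-by-side fair: any difference in the output comes from the model, not from the wording.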

So, Did It Improve?

It did improve, by a lot.

I can't show you every test I've done yet (I'm reserving some for my next article) but it's already a hundred times better than V5.2 in my book. It largely solved the text generation and nuance issues while also improving its creativity. Every image I've created so far with V6 is crisp, detailed, and accurate.

What else is there to ask for?

The Bottom Line

When V5 came out, some said that it was a backward step from V4.

Gradually, the team listened to the community and improved its creativity, even adding some functionality in the process. The result was Midjourney V5.2, which was already my favorite AI image generator on the market.

Midjourney V6 is a significant improvement on V5.2. It took everything that was already good about V5.2 and tweaked the model to create more detailed and accurate images. Everything that I've complained about with V5.2 (nuance, text, realism) has been fixed, and then some.

The best thing is that we can only expect it to get better from here on out. The Midjourney team is already crowdsourcing user image opinions through A/B testing to improve its model.

Mark my words: Midjourney V6 is a turning point in AI image generation.

Want To Learn Even More?
If you enjoyed this article, subscribe to our free monthly newsletter
where we share tips & tricks on how to use tech & AI to grow and optimize your business, career, and life.
Written by John Angelo Yap
Hi, I'm Angelo. I'm currently an undergraduate student studying Software Engineering. Now, you might be wondering, what is a computer science student doing writing for Gold Penguin? I took up studying computer science because it was practical and because I was good at it. But, if I had the chance, I'd be writing for a career. Building worlds and adjectivizing nouns for no other reason other than they sound good. And that's why I'm here.