
Sora and Midjourney Compared Using The Same Prompt (AI Video vs Pictures)

Sora was just announced last week and we're already seeing some glowing reviews of the product. Using OpenAI's public showcase, let's compare its videos with what Midjourney produces from the same prompts.
Updated April 1, 2024
A robot with a clapperboard, generated with Midjourney

It's been months since the leadership debacle at OpenAI. After the dust settled, people eagerly waited for the company's next product. Would it be GPT-5? DALL-E 4? A new version of Codex? Turns out, the answer was something almost no one expected.

Last week, OpenAI unveiled Sora, and it had the whole internet talking. What's this new product? How good is it? How will it affect the future?

In this article, we're going to tackle all of those questions along with one extra: how does Sora fare in a head-to-head prompt comparison with Midjourney? While Sora is meant to generate videos, it's quite interesting to take a peek at how its output compares to Midjourney's when both are given the same prompt.

What is Sora?

We’re no strangers to Midjourney, so let’s focus on Sora: OpenAI’s latest diffusion model aimed at text-to-video generation. As of February 2024, it isn’t publicly available yet, but we’re already seeing glimpses of how amazing it is through OpenAI’s showcase on their website and the videos generated by them on Twitter.

Similar to DALL-E 3, it uses a transformer architecture to better interpret prompts and rewrite them into a form the model can work with. As for its creativity, it's able to generate both photorealistic and animated videos that bear an almost eerie resemblance to real footage.

I’ll be honest: I haven’t been this excited about an AI model since the early days of DALL-E. 

Sora vs. Midjourney: Direct Prompt Comparison

Midjourney can't generate videos, but that doesn't mean we can't compare its output with Sora. It's very interesting to see the subtle but important differences in how this model generates video rather than an image. Things like:

  • More realistic and hyper-detailed personal features
  • Colors are more "real" and don't look like oversaturated Lightroom edits
  • True cinematic shots that could be seen in a real movie (rather than images stitched together)

There's really just another element of realism I've never seen in any type of AI-generated content. It's kind of unnerving but still cool.

We've been consistently praising Midjourney in our past articles, so it stands to reason that it's just as good as, if not better than, Sora at this early stage, right? Not necessarily.

My first impression is that Sora will be the first hyper-realistic AI creation model. Even compared to DALL-E, made by OpenAI (which I don't think is as versatile and useful as Midjourney), Sora is just so much more impressive. Here's how it compares against Midjourney stills:

The Man in the Clouds

Prompt: A young man at his 20s is sitting on a piece of cloud in the sky, reading a book.

Her Eyes

Prompt: Extreme close up of a 24 year old woman’s eye blinking, standing in Marrakech during magic hour, cinematic film shot in 70mm, depth of field, vivid colors, cinematic.

Big Sur

Prompt: Drone view of waves crashing against the rugged cliffs along Big Sur’s garay point beach. The crashing blue waters create white-tipped waves, while the golden light of the setting sun illuminates the rocky shore. A small island with a lighthouse sits in the distance, and green shrubbery covers the cliff’s edge. The steep drop from the road down to the beach is a dramatic feat, with the cliff’s edges jutting out over the sea. This is a view that captures the raw beauty of the coast and the rugged landscape of the Pacific Coast Highway.

The Gold Rush

Prompt: Historical footage of California during the gold rush.

Paper Airplanes

Prompt: A flock of paper airplanes flutters through a dense jungle, weaving around trees as if they were migrating birds.

The Robot

Prompt: The story of a robot’s life in a cyberpunk setting.

The Wolf

Prompt: A beautiful silhouette animation shows a wolf howling at the moon, feeling lonely, until it finds its pack.

The Fluffy Monster

Prompt: Animated scene features a close-up of a short fluffy monster kneeling beside a melting red candle. The art style is 3D and realistic, with a focus on lighting and texture. The mood of the painting is one of wonder and curiosity, as the monster gazes at the flame with wide eyes and open mouth. Its pose and expression convey a sense of innocence and playfulness, as if it is exploring the world around it for the first time. The use of warm colors and dramatic lighting further enhances the cozy atmosphere of the image.

The Otter

Prompt: An adorable happy otter confidently stands on a surfboard wearing a yellow lifejacket, riding along turquoise tropical waters near lush tropical islands, 3D digital render art style.

The Bottom Line

This does feel a little like comparing apples and oranges, but that wasn't my intent in writing this. It's genuinely mind-boggling to me how close some of these are to each other in both looks and general quality, yet Sora's output has an additional layer of realism that has never really been a thing before.

If this is what we can expect from Sora, then the hype is definitely justified. DALL-E 2's everyday output wasn't as good as its highlighted samples, so we might expect the same here. And that's fine. This is still insane.

It truly feels like text-to-image was a peak that we've already conquered. Don't get me wrong, it will continue to improve, but all eyes are definitely on text-to-video generation now. Along with Runway and Pika Labs, Sora is leading the way to a new challenge in the AI space. And what's crazy is that what we're seeing is just a preview of what's to come.

Who knows what will happen in the next few years? Will deepfakes make a comeback for the worse? Or maybe a fully AI-generated film will be up for an Academy Award.

Whatever it is, I just hope that we're well-prepared for its consequences as a society.

Written by John Angelo Yap
Hi, I'm Angelo. I'm currently an undergraduate student studying Software Engineering. Now, you might be wondering, what is a computer science student doing writing for Gold Penguin? I took up studying computer science because it was practical and because I was good at it. But, if I had the chance, I'd be writing for a career. Building worlds and adjectivizing nouns for no other reason other than they sound good. And that's why I'm here.