Depending on how you look at it (and where you fall on the adoption curve), it has been a little more or a little less than a year since Generative AI entered our collective consciousness. Since then, the number of tools has exploded…for this post, I check out a few top Image Generation tools and see how they currently stack up.
The AI Image Challenge
For this ‘battle’, the task is quite simple:
- Chosen Location – Thailand
- Desired Image – Showcasing People and Location
- Objective – Create compelling visual content to use for destination marketing purposes.
- Success Criteria – Ease, Speed, Realism, Creativity, Usability
Sample Text to Image Prompts
I kept it simple and tried two prompts for each Generative Image tool:
- Thailand Nature, Thai People, Thai Food, Cinematic, Tourism Photography, 4k
- Beautiful Thai Woman with folded hands, welcome to Thailand, Cinematic, Centered Studio Portrait, 4k, Nature in background
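If you want to repeat this kind of comparison yourself, it helps to write the test matrix down before you start clicking around. Here is a minimal Python sketch of that idea. The tool names and prompts are the ones used in this post; the `build_test_runs` helper and the overall structure are my own illustration, not part of any tool's API, and nothing here actually calls an image generator.

```python
from itertools import product

# Tool names and prompts are taken from this post.
TOOLS = [
    "Bing Images (DALL-E)",
    "Stable Diffusion",
    "Leonardo.ai",
    "Midjourney",
    "Adobe Firefly",
]

PROMPTS = [
    "Thailand Nature, Thai People, Thai Food, Cinematic, Tourism Photography, 4k",
    "Beautiful Thai Woman with folded hands, welcome to Thailand, Cinematic, "
    "Centered Studio Portrait, 4k, Nature in background",
]


def build_test_runs(tools, prompts):
    """Hypothetical helper: return one (tool, prompt) pair per combination.

    Pairs are grouped by tool, with prompts in their listed order.
    """
    return list(product(tools, prompts))


if __name__ == "__main__":
    # Print a checklist of every run to perform by hand in each tool.
    for tool, prompt in build_test_runs(TOOLS, PROMPTS):
        print(f"{tool}: {prompt}")
```

Five tools and two prompts give ten runs in total, which makes it easy to check that every tool received exactly the same input.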
Bing Images, Powered by DALL-E
Here’s what Bing Images produced:
Generally pretty good, save a few artifacts and a bit of weirdness with the eyes in the first portrait.
Stable Diffusion
Here’s what Stable Diffusion cooked up:
Quick image generation and a couple of the destination shots are quite good, but the general quality of people portraits still doesn’t quite pass muster.
Leonardo.ai
Next, a test with Leonardo’s tool – they offer a few different models, and for this test I used “Deliberate 1.1”:
The different models add a bit of variety to the types of images you can produce. The first set of destination images isn’t too bad, but the people don’t render quite so well. The second set is far better, with some hand distortion but a couple of usable, winning images here.
Midjourney
Midjourney offers some pretty fantastic image quality, especially with the latest version, 5.1, and these tests were no exception:
The image of the woman with folded hands was on point…the only flaw being the hands / fingers, which Gen AI tools still seem to struggle with.
Adobe Firefly
Adobe Firefly, which will also power image generation by Google Bard (similar to how Microsoft has Bing Chat generate images using OpenAI’s DALL-E), had this to offer (using the Photo option…you can also select Art or Graphic):
For the first set of images, Firefly steers the focus mostly to food, for some reason…the second set of portraits looks surprisingly realistic, with the first one capturing the ask quite well.
The Verdict
Here’s how I think the five tools stack up:
It’s important to note that the above scores are based on personal testing on a pretty narrow scope…also, these are likely to change pretty quickly as these tools evolve in the coming weeks and months.
As you can see, Midjourney narrowly claims the top spot, with Adobe Firefly pretty close behind. I think if Midjourney made its interface more accessible for novice users (rather than running through Discord), it could probably extend its lead considerably.
It’ll be interesting to see which tools become the preferred choice for creators…I suspect it will come down to a combination of the above factors, overall distribution, cost, and the usage rights offered by each platform.
It’s fascinating to see what these AI tools are doing now. Making statues of Jesus and Hercules perform like Cirque du Soleil, Pop being chased by cops on the street, and so on. Your experiments offer great insights and those distorted hands looked funny 🙂