The Image Model Shift No One Is Taking Seriously Yet

Hey friends,

A few months ago, if you asked an AI to generate an image with words on it, you'd get nonsense. Letters that looked vaguely like a different language. Spelling that made you laugh.

I tried the same kind of prompt last week. Clean text. Right font feel. Properly placed inside the image.

That's the part most people are still missing about AI image generation. Not where it is today. How fast it's moving.

The Strawberry Trap

There's a habit a lot of people have fallen into. They test AI on something it can't do, decide it's broken, and stop paying attention.

The famous one is the strawberry test. Ask AI how many R's are in the word strawberry, and for a long time it would confidently get it wrong. People take that one example and conclude the whole technology is overhyped.

Here's what they're missing. The thing that has actually changed is that AI has now learned how to code, and the companies building these models are using AI to improve their own models. That feedback loop is the real story.

Think about how long it used to take to fix a basic bug in something like SAP or a Microsoft system. Sometimes years. Now look at the tools we're using day to day. I run Claude Cowork, and there's a meaningful update to it almost every single day. Not just new features. Improvements to existing ones.

When the people building the tools get faster at building, the tools themselves move at a different pace. That's where image generation is right now.

Where The Models Are At

Two of the biggest jumps in the last few months have come from ChatGPT's new image model and Google Gemini's image model, which Google calls Nano Banana. Both are at the point where, if you know how to prompt them properly, you can generate images that work in real business contexts.

I'm not going to do a side-by-side review here. There are plenty of comparisons online, and you can run your own tests in a few minutes. The point is not which one wins. The point is that the bar has moved.

This Is A Skill, Not A Toy

Image generation is not just a fun thing to play with on a Saturday. It is becoming a core skill for business professionals and leaders. The same way writing a clear brief is a skill, prompting an image model is a skill. The people who learn it will produce visual content that is faster to make, cheaper, and more relevant to their actual message than the people who don't.

The mistake most people make is treating image prompts like Google searches. They type three words, get a generic result, and walk away disappointed.

A good image prompt does five things. It names the subject clearly. It sets the style or medium. It describes the composition and framing. It calls out the lighting or mood. And it specifies the format or aspect ratio.

That's the framework. Subject, style, composition, mood, format. Five layers, in that order, and the output changes completely.
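
If it helps to see the five layers as a template, here is a minimal sketch in Python. It is my own illustration, not a feature of ChatGPT or Gemini: the ImagePrompt class and its render method are hypothetical names, and all it does is join the five layers, in order, into one prompt you can paste into whichever tool you use.

```python
from dataclasses import dataclass


@dataclass
class ImagePrompt:
    """Hypothetical helper: the five layers, in the order listed above."""
    subject: str      # who or what is in the image
    style: str        # style or medium, e.g. photorealistic, flat illustration
    composition: str  # composition and framing
    mood: str         # lighting or mood
    format: str       # format or aspect ratio

    def render(self) -> str:
        # Refuse to build a prompt if any of the five layers is blank.
        missing = [name for name, value in vars(self).items() if not value.strip()]
        if missing:
            raise ValueError(f"Missing prompt layers: {', '.join(missing)}")
        layers = [self.subject, self.style, self.composition, self.mood, self.format]
        # Tidy whitespace and make each layer end in exactly one full stop.
        return " ".join(layer.strip().rstrip(".") + "." for layer in layers)


prompt = ImagePrompt(
    subject="A small business owner in their late 40s at a wooden desk, looking at a laptop",
    style="Photorealistic image",
    composition="Shot at eye level, slight depth of field",
    mood="Natural light from a large window on the left, warm tones",
    format="16:9 aspect ratio",
)
print(prompt.render())
```

The only design choice here is that a blank layer raises an error instead of being skipped quietly. The three-word prompt is the habit you are trying to break, so the template makes you fill in all five.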

Three Prompts You Can Try This Week

Copy these into ChatGPT, Gemini, or any image model you use. Adjust the details to your business and see what comes back. I ran each one through ChatGPT exactly as written, with no changes to the prompt.

❝

Prompt 1: Realistic Photo Style

A photorealistic image of a small business owner in their late 40s sitting at a wooden desk in a bright, modern office, looking thoughtfully at a laptop screen. Natural light coming in from a large window on the left. Shot at eye level, slight depth of field, warm tones. 16:9 aspect ratio.

❝

Prompt 2: Illustrated or Animated Style

A clean flat-illustration style image of a person at a desk surrounded by floating icons representing different AI tools. Soft pastel colour palette, simple geometric shapes, no text in the image. Suitable for a blog header. 3:2 aspect ratio.

❝

Prompt 3: Business or Document Context

A clean, professional infographic-style image showing three stacked cards, each labelled with a single word: Plan, Act, Review. White background, modern sans-serif font, subtle drop shadows, blue and black colour scheme. Square format, 1:1.

Run the same prompt across two or three different models. You'll quickly see which one fits your style best. That's your starting point.

Stop Paying For Stock Images

Now here's the practical bit.

If your team is still spending hours fiddling with stock photo libraries, paying for big subscriptions, and settling for images that don't quite fit the message, there is a real opportunity sitting in front of you. AI-generated images, used with proper disclaimers and the right context, can replace a chunk of that work.

Not all of it. There are still cases where you want a real photographer, a real model, a real product shot. But the in-between work (blog headers, social posts, slide visuals, internal documents) is where AI image generation is already good enough to use today.

This is one of those quiet productivity gains that adds up across a year.

The Other Side Of This

I have to mention the flip side, because it is going to matter.

As these images get more realistic, the scope for fraud is going to grow with them. I have already seen examples of people generating fake receipts, fake documents, and fake supporting evidence with these tools. The output is convincing.

If your business deals with documents, invoices, claims, identity checks, or anything else where a visual artefact is treated as proof, you need to be thinking about this now. Not panicking about it. Just being aware that the same technology that lets you make a great social post also lets someone fabricate a receipt that passes a quick human review.

The same goes for scams. Voice cloning has been the headline for a while. Image and document generation is the next layer. Most businesses have not updated their verification processes for this yet.

A Calm Takeaway

The pattern here is the one I keep coming back to. People judge AI on what it cannot do today, and miss how fast the list of things it cannot do is shrinking.

Image generation is a good case study. A year ago, the output was a punchline. Today, it is genuinely useful in real work. A year from now, the gap between the businesses using it well and the businesses still buying stock photos will be wide.

The opportunity is not to become an image expert. It is to spend a few hours this week learning enough to get good at prompting one of these tools, so you can use it when it matters.

The people who do that will quietly move ahead. The people who keep telling the strawberry joke will not.

See you next week,

— Aamir

📲 Resources & Links

🎧 Listen to the Podcast Episode on: Spotify | Apple Podcasts | YouTube

📘 Book: The CEO Who Mocked AI (Until It Made Him Millions) by Aamir Qutub

🚀 Dumb Monkey AI Academy

📱 Dumb Monkey AI Academy App: Apple | Android