I’ll probably regret writing this. At the very least, I’ll cringe reading it in a few months. But here we are.

Lately, we’ve been getting a wave of client requests asking us to evaluate software they built using AI tools. These aren’t engineers. These are business folks using increasingly powerful AI products to try to build functioning systems. And to be completely honest, the results are both impressive and a bit alarming.

People are building whole applications on their own. Backends, frontends, user interfaces, even some database logic. Sometimes they even look good. These are smart people who don’t know how to code but have managed to produce working systems.

The problems show up immediately when we start reviewing the internals. The code is usually a mess. In many cases, it would be extremely difficult to maintain or extend. And if you need to move that code from the platform it was created in to a cloud provider like AWS, you’re going to hit a wall. These platforms wrap everything in layers of scaffolding that make portability a nightmare.

Security is worse. We’ve found plaintext credentials scattered across files. We’ve seen SQL injection vulnerabilities that shouldn’t even be possible in modern frameworks. We’ve seen structural issues that would get flagged in a freshman CS class.
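To make those two findings concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. It is not taken from any client's code: the `users` table, the function names, and the `APP_DB_PASSWORD` environment variable are all made up for illustration.

```python
import os
import sqlite3


def load_db_password():
    # Hypothetical variable name. The point is that the secret lives in the
    # environment, not in a source file that gets committed or shared.
    return os.environ.get("APP_DB_PASSWORD")


def find_user_unsafe(conn, name):
    # Vulnerable: user input is pasted straight into the SQL string,
    # so a crafted value can change the meaning of the query.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer: the value is bound as a parameter and is never parsed as SQL.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


def demo():
    # In-memory database with two illustrative rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    payload = "nobody' OR '1'='1"
    return find_user_unsafe(conn, payload), find_user_safe(conn, payload)


if __name__ == "__main__":
    unsafe_rows, safe_rows = demo()
    print("unsafe query returned", len(unsafe_rows), "rows")
    print("safe query returned", len(safe_rows), "rows")
```

With the crafted input, the string-built query matches every row in the table, while the parameterized version matches none. Most modern frameworks push you toward the second form by default, which is why seeing the first one in generated code is such a red flag.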

Despite all that, what people are creating are legitimate prototypes. They’re functional. They run. But when you’ve put a few weeks into building something, and you show it to a software engineer, it’s hard to hear that your shiny new thing needs to be rebuilt from scratch.

I want to be clear. I am not anti-AI. Almost everyone at my company uses AI tools every day. We use Copilot, Cursor, ChatGPT, Claude, and more. We build out frontends with tools like v0 and Lovable. These tools have changed how we work.

Some of our engineers report productivity improvements of 30 to 40 percent. That’s not a rounding error. That is a major shift. But they are still writing the code. They are reviewing it. They are checking for performance, clarity, security, and maintainability. They are not letting the tools decide architecture. They are using AI like they use autocomplete or linters, but with more power behind it.

This is also where expectations need to be adjusted. These systems will not save you 90 percent on development. They will not let you skip engineering altogether. But if they save you 30 percent, that’s a real gain. Imagine you’re building a house. The general contractor says it’s going to be $500,000. You tell them you already did the blueprints, filled out all the permits, and figured out the site plan using some AI tools. If they came back and said, “Alright, I’ll knock 30 percent off,” that would be the best deal of your life. That’s where we are today with AI-generated software. A solid start. A real value. Not a replacement.

For me personally, AI has made it fun to write code again. I haven’t been a working programmer in over a decade, and most modern toolchains are enough to scare me off. Now, with the right assistance, I can build something without getting stuck on Docker configs and dependency mismatches. That’s a big deal.

In the startup world, AI-first development is everywhere. Most of the current Y Combinator batch is using AI tools to write the bulk of their code. But those teams are highly technical. These are engineers using better tools, not tools replacing engineers.

So for non-developers using AI to build products, here are three things you should keep in mind:

  1. These tools are great for building prototypes. If you build something yourself, you will understand it better and will be a better partner to your engineering team. That matters.
  2. These tools can help you build usable frontend components. You probably won’t want to go live with them, but they can get you close enough to work with a real development team.
  3. If your app is small, non-critical, doesn’t store sensitive data, and lives entirely in its native platform, you can probably keep it running. That’s fine for internal use or personal projects.

One day, you’ll be able to speak an app into existence and deploy it with a voice command. It will be fast, secure, and beautiful. But today, you still need an experienced software engineer to check your work before you send real data through it. That’s just where we are right now.

The upside is huge. We can now get experts from other domains to build working prototypes and test ideas without needing an engineering team on day one. That’s powerful. But if your product is going to handle sensitive data or support real users, bring in someone who knows what they’re doing. Not because the AI is bad. Because the stakes are high.

I help companies turn their technical ideas into reality.

CEO @Sourcetoad and @OnDeck

Founder of Thankscrate and Data and Sons

Author of Herding Cats and Coders

Fan of judo, squash, whiskey, aggressive inline, and temperamental British sports cars.


Image generation comparison from February 2026

I spend a lot of time generating images these days for presentations. My typical workflow is fairly scientific: I ask Midjourney to produce a relatively cute image of a frog, a toad, a robot, or some other vaguely anthropomorphic creature doing something related to the slide I’m about to present.

Once I get the image, I expand the background by about 90% so the character ends up in the corner of the slide. That gives me a nice, relatively clean area to drop text on top. Sometimes I use Photoshop to do the expansion. Sometimes Midjourney cooperates. ChatGPT is actually pretty good at this too. Nano Banana is… enthusiastic. It tends to try a little too hard right now.

That’s fun and all. But the more interesting comparison isn’t cute amphibians. It’s boring enterprise diagrams.

Recently I had to generate some architecture visuals for an RFP response. Rather than suffer alone, I decided to turn it into a model comparison experiment.

Below is a slightly simplified (but very real-feeling) prompt I used. The company is fictional. The buzzwords are not:

Create a clean, executive-level architecture diagram titled “Closed-Loop Member Intelligence Platform.”

The layout should be 16:9 and structured left to right with a circular optimization loop surrounding the system.

On the left side, show multiple member touchpoints feeding into the platform:
- Website (class browsing, account login)
- Mobile App (workout tracking, push notifications)
- In-Club Kiosks (check-in terminals)
- Wearable Device Integrations (fitness trackers)

Label this section: “Member Interactions Across Digital & Physical Channels.”

All touchpoints should flow into a large central hub labeled:

“Unified Member Profile & Real-Time Event Engine”

Inside the central hub, include:

- Web SDK
- Mobile SDK
- API Gateway
- Event Streaming Layer
- Clickstream Data Capture
- CRM Data Sync
- Identity Resolution Engine

Include a small sub-caption:
“Event-level data unifies anonymous visitors and active members into a single dynamic profile.”

From the central hub, arrows should flow to a right-side activation layer labeled:

“Real-Time Engagement & Orchestration”

Include these outputs:

- Personalized Workout Recommendations
- Dynamic Class Availability Messaging
- Triggered Retention Offers
- Membership Upgrade Campaigns
- A/B Testing & Experimentation Engine

Surround the entire diagram with a circular arrow labeled:

“Continuous Optimization & Revenue Growth”

Along the circular loop, include metrics:

- Engagement
- Conversion
- Retention
- Lifetime Value

Design style should be modern, minimal, and suitable for an enterprise SaaS presentation.
Use neutral tones with one accent color to indicate data flow.
Avoid clutter.
Make the architecture clear and readable for both technical and executive audiences.

Here are the results.

ChatGPT

Clear winner for “looks like a human consultant made this at 11:30 p.m. before a board meeting.” The text was incredibly legible. The layout was balanced. The hierarchy made sense. It genuinely looked like something you’d expect in a mid-market SaaS pitch deck.

I even did a reverse image search on some of the icons. No exact matches. That suggests they were generated rather than assembled from some common icon pack. Which is pretty cool.

Claude

Claude did something interesting. Instead of just giving me a static diagram, it generated a React application that rendered the architecture visually inside its canvas. I should have guessed this is what that nerd would do… in fact I did guess, but whatever.

That has upsides. I can tweak the code. I can modify the layout. I can version control it. That’s appealing to the nerd in me.

But technically it failed the homework assignment. It wasn’t what I asked for. I asked for a diagram image. What I got was a React app that displayed a diagram that I had to screenshot.

That said, I actually liked the aesthetic. It felt a little more “me.” Slightly less textbook. Slightly more product-thinking.
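The appeal of a diagram that lives in code is easy to show with a small sketch. This is not what Claude produced (that was a React app). It is a stand-in technique: a few lines of Python that turn part of the prompt's structure into Graphviz DOT text. The `build_dot` and `quote` helpers are hypothetical names, and the sketch ignores the circular optimization loop and the styling instructions entirely.

```python
# Labels copied from the prompt; only the left-to-right flow is modeled.
TOUCHPOINTS = [
    "Website",
    "Mobile App",
    "In-Club Kiosks",
    "Wearable Device Integrations",
]
HUB = "Unified Member Profile & Real-Time Event Engine"
OUTPUTS = [
    "Personalized Workout Recommendations",
    "Dynamic Class Availability Messaging",
    "Triggered Retention Offers",
    "Membership Upgrade Campaigns",
    "A/B Testing & Experimentation Engine",
]


def quote(label):
    # DOT identifiers with spaces or symbols must be double-quoted.
    return '"' + label.replace('"', '\\"') + '"'


def build_dot(touchpoints, hub, outputs):
    # One edge per touchpoint into the hub, one edge per output out of it.
    lines = ["digraph platform {", "  rankdir=LR;", "  node [shape=box];"]
    for touchpoint in touchpoints:
        lines.append(f"  {quote(touchpoint)} -> {quote(hub)};")
    for output in outputs:
        lines.append(f"  {quote(hub)} -> {quote(output)};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_dot(TOUCHPOINTS, HUB, OUTPUTS))
```

Save the printed text as `platform.dot` and, if Graphviz is installed, `dot -Tpng platform.dot -o platform.png` renders it. Renaming a box or adding a touchpoint becomes a one-line change that shows up cleanly in version control, which is the same property that made the React output attractive.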

Gemini (Nano Banana)

The undisputed champion of 2026 in image generation, Nano Banana, was actually my least favorite of all the designs. There’s something really weird about the arrows on the outside ring of this diagram. Why are there two arrows between “Engagement” and “Conversion”? Why are they different sizes? I did find a couple of exact matches when searching for some of these icons, so there might be some assembly on top of generation going on here. But I can’t tell, because these icons are so universal that the matches could just be a coincidence.

Midjourney

Ah, Midjourney. My current favorite for keynote frogs.

Completely and utterly useless for generating readable diagrams.

It’s phenomenal at stylized imagery. I’ve tuned it so much over time that it practically knows my aesthetic preferences better than I do. It’s like it’s been trained specifically to make amphibians that align with my personality.

The Omni feature (object permanence) is genuinely impressive. If you’re telling a visual story and need a character to look consistent across multiple scenes, or you’re creating a children’s book to convince your six-year-old that haircuts are not a violation of human rights, it’s fantastic.

But enterprise architecture diagrams? Nope, sucksville.

Wrapping Up

I was pretty sure that Nano Banana was going to run away with this one. Everyone I know who works in banking, finance, or medicine has been telling me how great the model is for generating diagrams and process flows. They’ve been raving about how things that were not possible three months ago are now completely doable with this model. So it was a bit of a surprise that my personal favorite was good old-fashioned ChatGPT. For my own use, though, I’m probably going to use Claude to generate diagrams because they’re a lot easier for me to tweak once they’ve been generated.

That said, I think this experiment showed that when I do this kind of work in the future, I’m just going to load up the same prompt in three different models and pick the one I like the most. Some of it’s going to be personal taste, some of it’s going to be how well the model interpreted the prompt, and some of it’s going to be the state of that particular model on that given week.

And I’m going to stick to only using Midjourney for generating cute pictures of toads.