
We Lost Seven Figures Because a Developer Trusted AI — Here's What We Built to Fix It

A founder's firsthand account of how blind trust in AI-generated code led to a catastrophic security breach, what university research reveals about the AI literacy crisis, and the interactive tool built to prevent it from happening again.

Jahja Nur Zulbeari | 11 min read
Jahja Nur Zulbeari — Founder of Zulbera

I founded Zulbera at 16. By the time I was running a team of 12, we were among the first to integrate OpenAI’s API directly into our workflow — long before ChatGPT made AI mainstream. AI was our superpower. It made us faster, leaner, and more competitive than agencies three times our size.

Then one commit changed everything.

The Incident That Cost Us Seven Figures

A developer on our team was using AI tools to accelerate a client project. Standard workflow — prompt, generate, ship. The code worked. It passed basic testing. It looked clean.

What nobody caught: the AI-generated output included sensitive credentials and secret API keys, hardcoded in a way that was not immediately obvious. The code was pushed to the repository without a proper security review.

The breach that followed resulted in financial losses in the seven-figure range and caused serious damage to a key client relationship.

The root cause was not malice. It was not incompetence in the traditional sense. The developer simply trusted the AI output the way you trust a senior colleague’s pull request — implicitly. They could access AI tools fluently but could not evaluate the output critically. They accepted it and transmitted it to production.
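The offending commit is not public, but the pattern is common enough to sketch. Here is a hypothetical Python illustration (the key, variable names, and environment variable are all invented) of how a hardcoded credential can hide inside an innocent-looking default:

```python
import os

def load_api_key(env=os.environ):
    """Return the service API key from the environment.

    Risky pattern an assistant may emit: a working key baked into the
    source as a "fallback". The made-up key below would ship to every
    clone of the repository. A safer version has no default at all, so
    the app fails fast when the secret is not injected at deploy time.
    """
    return env.get("SERVICE_API_KEY", "sk-live-4f9d8c2ab7e1")  # hardcoded fallback!
```

In review, the tell is the second argument to `env.get`: a string literal sitting where only a deployment-injected value belongs.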

That gap between access and evaluation is what I now call the AI literacy crisis. And it is everywhere.

Nobody Teaches Developers How to Distrust AI

Here is the uncomfortable truth: the entire developer ecosystem is set up to make you trust AI output by default.

AI tools are integrated directly into your IDE. The suggestions appear inline, right where your own code would go. The output is syntactically correct, well-formatted, and often accompanied by confident explanations. There is no warning label. No friction. No moment where the system says, “You should probably verify this.”

Compare this to any other professional domain. A lawyer does not file a brief written by an intern without reviewing it. A doctor does not prescribe medication based on a suggestion without cross-referencing. An engineer does not approve structural calculations without independent verification.

But developers — including experienced ones — routinely push AI-generated code to production with zero review. Not because they are careless, but because nobody taught them that AI output is a fundamentally different kind of input that requires a fundamentally different kind of scrutiny.

Traditional code review catches bugs because reviewers understand that humans make mistakes. AI-generated code bypasses this instinct because the output feels authoritative. It does not hedge. It does not say “I think” or “you might want to check.” It presents vulnerable code with the same confidence as bulletproof code.

The Research: 57% of CS Students Blindly Trust AI

After the incident, I needed to understand whether this was a personal failure or a systemic one. So I partnered with FINKI — the Faculty of Computer Science and Engineering at Ss. Cyril and Methodius University in Skopje — to find out.

We designed a structured survey targeting CS and IT students, measuring their AI usage habits against their actual ability to critically evaluate AI-generated content. The framework was built on the four pillars of media literacy — Access, Analyze, Evaluate, and Transmit — applied to AI as a new form of media.

The results were alarming.

The Numbers

100% of respondents use AI tools weekly or more frequently. The access barrier is zero.

57% blindly trust AI-generated explanations without cross-referencing any other source.

36% rarely or never review AI-generated code before using it in their projects.

36% rarely or never check AI output for security vulnerabilities like exposed API keys or injection flaws.

86% believe AI can replace the need for deep programming knowledge — a dangerous misconception that directly enables the trust problem.

57% have already experienced a situation where AI-generated code caused a real problem in their work.

The composite AI literacy score across all respondents was 2.93 out of 5.00 — below the midpoint. The majority of students who use AI every single day cannot critically evaluate what it gives them.

Experience Does Not Fix This

This was the finding that hit hardest, because it mirrored exactly what happened at Zulbera.

We ran a one-way ANOVA to test whether programming experience predicts AI literacy. Three groups: 0-1 years, 2-3 years, and 4+ years of experience.

F-statistic: 0.8516. p-value: 0.4555.

Not significant. Not even close.

Students with four or more years of programming experience scored just as poorly on AI literacy as complete beginners. The developer who caused our breach was experienced too. Experience teaches you to spot human errors, but AI errors are a different category entirely — confident, fluent, and invisible to traditional code review instincts.
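For readers who want to check the mechanics, a one-way ANOVA reduces to comparing between-group variance against within-group variance. A minimal pure-Python sketch (the sample scores below are invented for illustration, not the study data):

```python
def one_way_anova_f(groups):
    """Compute the F-statistic for a one-way ANOVA across k groups."""
    all_vals = [x for g in groups for x in g]
    n, k = len(all_vals), len(groups)
    grand_mean = sum(all_vals) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: how far group means sit from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread inside each group
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Invented literacy scores for three experience bands (0-1, 2-3, 4+ years)
f_stat = one_way_anova_f([[2.8, 3.1, 2.6], [3.0, 2.7, 3.2], [2.9, 3.3, 2.5]])
```

A small F (and the correspondingly large p-value read off the F-distribution) means the group means are statistically indistinguishable, which is exactly what the study found.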

AI literacy is not a skill you develop by writing more code. It is a skill that must be trained deliberately.

What We Built: The AI Trust Test

After seeing the data, the response was obvious. We did not need another blog post telling developers to “be careful with AI.” We needed a tool that makes you experience the consequences of blind trust in a safe environment before you experience them in production.

That tool is trust.zulbera.com.

How It Works

The AI Trust Test is a free, interactive web app designed to train the Analyze and Evaluate pillars of AI literacy:

  1. You face 8-10 real-world scenarios. Each one presents AI-generated code, an AI-written explanation, or an AI-suggested solution pulled from realistic development contexts.

  2. For each scenario, you decide: Trust or Verify. You evaluate whether the output is safe to use as-is or whether it requires closer inspection.

  3. After all scenarios, the app reveals what you missed. Hardcoded API keys hidden in comments. SQL injection vulnerabilities in generated queries. Plain-text password storage in authentication functions. Factual errors in confident technical explanations. And at least one scenario that is perfectly safe — testing whether you have become over-suspicious.

  4. You receive a personal AI Literacy Score with specific, actionable recommendations for improving your critical evaluation habits.
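The scenario categories above are not abstract. SQL injection in a generated query, for instance, tends to look like this hypothetical sketch (table and function names are invented), where the vulnerable and safe versions differ by a single habit:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

def find_role_unsafe(name):
    # The pattern assistants often emit: string interpolation into SQL
    return conn.execute(f"SELECT role FROM users WHERE name = '{name}'").fetchall()

def find_role_safe(name):
    # Parameterized query: the driver treats the input as data, not SQL
    return conn.execute("SELECT role FROM users WHERE name = ?", (name,)).fetchall()

payload = "' OR '1'='1"
leaked = find_role_unsafe(payload)   # returns every row in the table
nothing = find_role_safe(payload)    # returns no rows
```

Both functions pass a quick manual test with a normal username, which is why the vulnerable version survives a skim review.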

The entire experience takes under ten minutes. But the lessons tend to stick, because you are not reading about hypothetical risks — you are making real decisions and seeing real consequences.

The Meta-Story

Here is the part I find most compelling: the AI Trust Test itself was built using AI tools. Our team vibe-coded the entire interactive application in roughly three days — a project that would have taken weeks before AI.

That is the thesis in action. AI is extraordinarily powerful when wielded by someone who verifies its output. The same tools that caused a seven-figure loss at Zulbera were used to build the solution — the difference was literacy.

The Five Rules Our Team Follows Now

After the breach and the research, we rebuilt our entire AI workflow around five rules. Every developer at Zulbera follows them. Every new hire learns them on day one.

Rule 1: Every AI Output Is an Untrusted Pull Request

No AI-generated code goes into production without the same review process you would apply to code written by an unknown contributor. Read every line. Question every assumption. Check every import.

Rule 2: Secret Scanning Runs Before Every Commit

Automated tools scan for credentials, API keys, tokens, and secrets before any code reaches the repository. This is a technical backstop, not a replacement for human review — but it catches the failures that get past human eyes.
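We rely on off-the-shelf scanners in CI, but the core idea fits in a few lines. A simplified sketch (the regex patterns below are illustrative; production tools such as gitleaks ship hundreds of tuned rules):

```python
import re

# Illustrative patterns only; real scanners use far larger rule sets
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                  # AWS access-key-ID shape
    re.compile(r"sk-[A-Za-z0-9]{20,}"),               # OpenAI-style key shape
    re.compile(r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{8,}"),
]

def scan_for_secrets(text):
    """Return (line_number, matched_text) pairs for likely secrets."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern in SECRET_PATTERNS:
            match = pattern.search(line)
            if match:
                hits.append((lineno, match.group(0)))
    return hits
```

Wired into a pre-commit hook, any nonzero result blocks the commit until a human looks at the flagged lines.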

Rule 3: AI Explanations Get Cross-Referenced

When AI provides a technical explanation — how an API works, what a library does, how a protocol operates — we verify against official documentation before acting on it. AI explanations are starting points for research, not conclusions.

Rule 4: Security-Critical Code Gets Extra Scrutiny

Authentication, authorization, payment processing, data encryption, and anything touching user data — these areas get an additional review cycle specifically focused on security when AI tools were involved in generating the code.
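One non-negotiable check in that extra cycle: passwords are salted and hashed, never stored or compared as plain text. A minimal sketch using Python's standard library (the iteration count is illustrative; tune it to current guidance):

```python
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 600_000):
    """Salted PBKDF2-SHA256; returns (salt, digest) for storage."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes,
                    iterations: int = 600_000) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    # Constant-time comparison avoids timing side channels
    return hmac.compare_digest(candidate, digest)
```

Generated auth code sometimes skips the salt or compares digests with `==`; both are exactly the kind of defect this extra review cycle exists to catch.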

Rule 5: We Train Continuously

AI literacy is not a one-time certification. Our team runs through the AI Trust Test periodically, and we share examples of AI failures we catch in real work. The goal is to maintain a healthy skepticism without killing the productivity benefits that make AI valuable.

Why This Matters Beyond Our Team

The AI literacy gap is not a Zulbera problem. It is not a FINKI problem. It is an industry problem.

Every company adopting AI tools — and by 2026, that is effectively every software company — is creating an environment where developers can access powerful generation tools without any training on critical evaluation. The speed benefit is real. The risk is equally real.

The 2.93 out of 5 literacy score from our research is a baseline, not an outlier. If anything, self-reported surveys overestimate actual literacy because of social desirability bias. The real number is likely worse.

Organizations that invest in AI literacy training now will avoid the breaches, the rework, and the seven-figure lessons that organizations ignoring this problem will learn the hard way. The tools exist. The data is clear. The question is whether you act before or after the incident.

Try It Yourself

Visit trust.zulbera.com, take the AI Trust Test, and find out where you actually stand. It takes less than ten minutes, it is free, and it might be the most useful ten minutes you spend this month.

Share your score. Challenge your team. Start the conversation.

Because the developers who will thrive in the AI era are not the ones who prompt the fastest. They are the ones who verify the smartest.

#ThinkBeforeYouPrompt

Jahja Nur Zulbeari

Founder & Technical Architect

Zulbera — Digital Infrastructure Studio
