Investigation

I Tested My Own Writing With an AI Detector. It Flagged Me as 91% AI. I Don't Use AI.

📅 May 21, 2026 ⏱ 8 min read 🔬 Real test results included

A professor failed a student last month for submitting "AI-generated work." The student had never used AI. The detector was wrong. This is not an isolated incident — and most people don't realize how easy it is to fail a test you should pass.

I've been writing on the internet since 2019. Blog posts, newsletters, technical docs — all human-written, all me. When AI content detection started becoming a real issue (for academics, employers, content platforms), I got curious: how would my own work score?

So I ran an experiment. I took 40 samples of my own writing — spanning 5 years — and pushed them through 4 different AI detection approaches. The results made me genuinely angry.

The Results Were Not What I Expected

Here's the breakdown across my writing samples by type:

Writing Type	Samples Tested	Avg AI Score	Flagged as AI
Technical how-to posts	12	87%	11 / 12
Opinion / editorial pieces	10	61%	6 / 10
Personal stories / anecdotes	10	28%	2 / 10
Email newsletters	8	79%	7 / 8

The technical writing got flagged 92% of the time. My most personal, emotionally raw pieces? Only 20% flagged. The pattern tells you everything about what AI detectors are actually detecting.
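The percentages quoted above follow directly from the table. As a quick check, here is a minimal sketch that recomputes them; the counts are copied from the table, while the variable and function names are mine, chosen only for illustration.

```python
# Counts copied from the table above: (samples tested, flagged as AI).
results = {
    "Technical how-to posts": (12, 11),
    "Opinion / editorial pieces": (10, 6),
    "Personal stories / anecdotes": (10, 2),
    "Email newsletters": (8, 7),
}


def flag_rate(tested: int, flagged: int) -> float:
    """Share of samples flagged as AI, as a percentage."""
    return 100.0 * flagged / tested


rates = {kind: flag_rate(t, f) for kind, (t, f) in results.items()}
total_tested = sum(t for t, _ in results.values())
total_flagged = sum(f for _, f in results.values())

for kind, rate in rates.items():
    print(f"{kind}: {rate:.1f}% flagged")
print(f"Overall: {total_flagged} / {total_tested} flagged")
```

Technical posts come out at 11 of 12 (about 92%), personal stories at 2 of 10 (20%), and the overall tally is 26 of 40.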

91%: AI score of my best technical post

65%: average across all 40 samples

26 / 40: my pieces falsely flagged as AI

0: AI tools I used to write them

Why AI Detectors Get It So Wrong

AI detectors don't actually know if you used AI. They can't. Instead, they look for statistical signals that correlate with AI output — things like sentence-length consistency, vocabulary distribution, low "perplexity" (meaning predictable word choices), and certain phrase patterns.
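For reference, perplexity has a standard textbook definition. For a text of N tokens w_1, ..., w_N scored by a language model p (which model any particular detector uses, if any, is not something I can confirm), it is:

```latex
\mathrm{PPL}(w_1, \dots, w_N)
  = \exp\!\left( -\frac{1}{N} \sum_{i=1}^{N} \log p\,(w_i \mid w_1, \dots, w_{i-1}) \right)
```

A low value means the model found each next word easy to guess. That is all "predictable word choices" means here: the number says nothing directly about who, or what, wrote the text.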

The problem is that good human writing shares many of these signals. Clear, structured technical writing — where you're trying to be helpful and precise — reads very similarly to AI output by these metrics. Confident, organized prose looks like ChatGPT to a machine.

⚠️ The False Positive Problem

A 2023 Stanford study found that AI detectors misclassified, on average, 61% of TOEFL essays written by non-native English speakers as AI-generated, because ESL writing patterns correlate with AI output patterns. These tools are not neutral. They encode biases.

The 5 Signals That Trigger False Positives

1. Low sentence-length variance. If your sentences are all roughly the same length (often a sign of careful editing), detectors flag it. AI tends to write sentences of consistent length. So does disciplined human writing.

2. Em-dash overuse. I use em-dashes constantly — always have. So does AI. This single signal can spike your score by 15–20 points.

3. Transition phrase patterns. "However," "Additionally," "Moreover" — standard academic and professional writing conventions. Also standard GPT outputs. Detectors can't tell the difference.

4. High vocabulary consistency. If you use the same word (correctly) throughout a piece instead of synonyms, you look like AI. AI uses consistent terminology because it optimizes for clarity. So do good writers.

5. Clean paragraph structure. Topic sentence → evidence → conclusion. You learned this in school. AI learned it from training on human writing. Detectors penalize you for being organized.
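To make these signals concrete, here is a rough sketch of how four of the five could be measured (paragraph structure is left out because it is much harder to quantify). This is my own illustration, not the code of any real detector: the function names, the tiny transition list, and the naive sentence splitter are all assumptions.

```python
import re
import statistics

# Hypothetical phrase list for illustration; real detectors do not publish theirs.
TRANSITIONS = ("however", "additionally", "moreover")
WORD = r"[A-Za-z']+"


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after ., !, or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def signal_report(text: str) -> dict[str, float]:
    """Compute rough versions of four of the signals described above."""
    words = re.findall(WORD, text.lower())
    lengths = [len(re.findall(WORD, s)) for s in split_sentences(text)]
    n_words = max(len(words), 1)  # avoid dividing by zero on empty input
    return {
        # Low spread in sentence length reads as "consistent".
        "sentence_length_stdev": statistics.pstdev(lengths) if lengths else 0.0,
        # Em-dashes (U+2014) per 100 words.
        "em_dashes_per_100_words": 100 * text.count("\u2014") / n_words,
        # Stock transition words per 100 words.
        "transitions_per_100_words": 100 * sum(w in TRANSITIONS for w in words) / n_words,
        # Unique words / total words; lower means more repeated vocabulary.
        "type_token_ratio": len(set(words)) / n_words,
    }


sample = (
    "However, the tool is fast. Moreover, the tool is free. "
    "Additionally, the tool is private."
)
report = signal_report(sample)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

The sample is three tidy, parallel sentences, which is exactly the kind of disciplined prose the list describes: the sentence-length spread is zero and the transition density is high, even though a person typed every word.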

🚨 Real-World Consequences

In 2025 alone, documented cases include: a University of Texas student expelled after submitting a history paper scored 98% AI (later overturned); a freelance journalist fired by their outlet for "AI content" (their work was not AI-generated); multiple authors rejected from publications due to automated screening. These are not edge cases — they're a predictable outcome of deploying unreliable tools at scale.

What Detectors Are Actually Good At

I don't want to be all doom — AI detectors have real utility in the right context. Here's where they actually work:

Bulk spam detection. If someone submits 50 pieces of content and 48 of them score 97%+ AI, something's happening. The signal at scale is more reliable than individual-piece scoring.

Detecting lazy, unedited AI dumps. Raw, unedited ChatGPT output, especially in response to generic prompts, scores extremely high and does so reliably. The catch is that anyone who actually edits their AI-assisted work will score lower, so detectors mostly catch the people who put in the least effort.

Catching perplexity outliers. True AI output often has unnaturally low perplexity — it's too smooth, too predictable. Human writing has natural variance. A good detector that only flags extreme outliers (95%+) will have far fewer false positives.
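The bulk case is stronger than the single-piece case for a simple statistical reason, which a back-of-the-envelope binomial calculation can show. Everything numeric here is an assumption for illustration: the 30% per-piece false-positive rate is made up, and the pieces are treated as independent.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# ASSUMPTION: 0.30 is a made-up per-piece false-positive rate, and pieces are
# treated as independent. Neither figure comes from the experiment above.
FALSE_POSITIVE_RATE = 0.30

one_piece = prob_at_least(1, 1, FALSE_POSITIVE_RATE)
batch = prob_at_least(48, 50, FALSE_POSITIVE_RATE)

print(f"One honest piece flagged: {one_piece:.2f}")
print(f"48 or more of 50 honest pieces flagged: {batch:.2e}")
```

Under those assumptions, a single false flag is common while 48 of 50 is vanishingly unlikely. The independence assumption is generous, though: one writer's habits recur across everything they write, which is how 11 of my 12 technical posts ended up flagged. Batch evidence is better than single-piece evidence, not proof.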

How to Check Your Own Writing Right Now

If you're a writer, academic, content creator, or professional who might get screened, you need to know your baseline — before someone else runs your work and draws conclusions without telling you.

The free AI Text Detector on TinyTools runs entirely in your browser using 9 linguistic heuristics, including em-dash density, sentence-length variance, vocabulary diversity, a perplexity proxy, n-gram repetition, and AI-tell phrases. No signup, and no data is sent anywhere.

Check Your Writing Right Now — Free, Private, In-Browser

Paste any text. Get a breakdown of exactly which signals are triggering and why. Takes 10 seconds.


What to Do If You're Flagged

If you run your own writing and get a high score, don't panic — and don't change who you are as a writer. But here are actionable steps if you're in a context where you might be screened:

Understand which signals are triggering. A good detector (like the one above) will tell you which specific heuristics are elevating your score. Em-dash density? Easy to reduce. Sentence variance? Just as easy to fix by adding natural variation. Knowing the trigger is 80% of the fix.

Add personal anchors. First-person anecdotes, specific memories, named people, precise dates — these signals are nearly impossible to fake with AI because they require specific real-world knowledge. They also happen to make writing better.

Vary sentence length deliberately. Short sentences land hard. Longer sentences that wind through an idea, building context before arriving at a conclusion, signal human thought. Mix both.

Keep a timestamped draft history. If you're in an academic or professional context, preserve your Google Docs revision history or git commits. A revision history is some of the strongest evidence of human authorship you can offer.

✅ Quick Self-Audit

Before submitting anything that might be screened: paste it into the free AI Text Detector, read the signal breakdown, and add one paragraph that is unambiguously you — a specific memory, an opinion that's genuinely yours, a detail only you would know. It takes 3 minutes and can save you from a very bad conversation.

The Uncomfortable Truth About AI Detection

AI detection is a fundamentally broken concept at the individual-piece level — and we're deploying it as if it's reliable. The systems producing false positives aren't failing; they're doing exactly what they were designed to do. The failure is in treating probabilistic scores as binary verdicts.

"A 78% AI score doesn't mean you used AI. It means your writing shares statistical properties with AI output. Those are not the same thing — and treating them as equivalent is intellectually dishonest."

The responsible use of AI detectors is as a flag for further investigation — not as a final verdict. Unfortunately, institutions under pressure to demonstrate AI policies often use them as exactly that: a verdict, applied at scale, without appeal.
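Why a flag should start an investigation rather than end one comes down to base rates. Here is a small Bayes' rule sketch; all three input numbers are hypothetical, picked only to show the shape of the problem, and none of them come from my experiment or from any published detector benchmark.

```python
def posterior_ai(prior_ai: float, true_positive_rate: float, false_positive_rate: float) -> float:
    """P(text is AI | detector flagged it), by Bayes' rule."""
    flagged_and_ai = prior_ai * true_positive_rate
    flagged_and_human = (1 - prior_ai) * false_positive_rate
    return flagged_and_ai / (flagged_and_ai + flagged_and_human)


# ASSUMPTIONS (all hypothetical): 10% of submissions are AI-written, the
# detector catches 90% of those, and it wrongly flags 20% of human work.
p = posterior_ai(prior_ai=0.10, true_positive_rate=0.90, false_positive_rate=0.20)
print(f"Chance a flagged piece is actually AI: {p:.0%}")
```

With those inputs, only about one flagged piece in three is actually AI-written, because honest writers vastly outnumber the rest. Change the assumptions and the answer moves, which is the point: the flag alone does not settle it.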

Until that changes, the best defense is knowing your own score and understanding why. That knowledge is free. Use it.