It works similarly to other online detection tools like Originality, Turnitin, and PassedAI, but it seems to break text down into a more detailed analysis and scrutinize it more closely before classifying it as AI-written or not.
I took a deep dive into the tool to see how it works and what you could use it for.
If you're not familiar with AI detection, it works by analyzing text for patterns to measure how predictable the writing is. The easier the text is for an AI model to predict, the more likely it is to overlap with actual AI-produced content.
In effect, these tools all reverse engineer the text: they check whether an AI model could have recreated what was entered, and with what accuracy.
What is GPTZero?
GPTZero was launched on January 3, 2023, by Edward Tian, who created the tool as his thesis project. The freemium AI-detection tool had a whopping 1.2 million users after 5 months and 2.5 million users as of this writing. But that's not because Tian was a 22-year-old computer science senior at Princeton. It's because of the tool's mission and the way it detects AI content.
The app's name and the "Humans Deserve the Truth" tagline speak for themselves. So yes, GPTZero aims to fight the misuse of ChatGPT by detecting content written by AI tools such as ChatGPT, GPT-3, GPT-4, and LLaMA, a large language model (LLM) by Meta AI. The company has also raised $3.5 million in funding.
Who is GPTZero for?
GPTZero was mainly designed for educators. It was created specifically to evaluate the work of students. Its model also promises to avoid false positives, meaning it tries not to flag human writing as AI-generated. Other people, such as publishers, editors, those who hire writers, and students themselves, can also use this tool.
How Does GPTZero Work?
Technically, GPTZero has only one feature – to detect AI content. But as I said earlier, what sets it apart is the way it breaks down the details of the result. You use GPTZero by pasting text into the paragraph box and submitting it for detection. It analyzes text based on 2 characteristics: "perplexity" and "burstiness".
Perplexity – How random your text is, measured by how predictable it is to a language model. GPTZero runs text through GPT-2 (345 million parameters). The exact range of perplexity is not quite known, but values closer to 0 are very likely to be artificially generated, while those closer to 100 have a higher chance of being human-written.
Burstiness – The occurrence of uncommon items appearing in random clusters over time (aka creative variability). In machine-generated content, perplexity is uniformly distributed and consistently low. Humans naturally include more variability in their writing, so their text has a lower chance of being predicted by patterns.
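To make those two definitions a bit more concrete, here's a minimal Python sketch. It uses the standard textbook formula for perplexity, and it treats burstiness as the spread of perplexity across sentences. That second part is my own assumption for illustration, since GPTZero doesn't publish its exact formula. The token probabilities are made-up numbers, not real GPT-2 output.

```python
import math
import statistics


def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


def burstiness(sentence_perplexities):
    """Assumed proxy (not GPTZero's formula): population standard
    deviation of the per-sentence perplexities."""
    return statistics.pstdev(sentence_perplexities)


# Hypothetical per-token probabilities a language model might assign.
# Each inner list stands for one sentence.
ai_like = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]
human_like = [[0.5, 0.5, 0.5, 0.5], [0.1, 0.1, 0.1, 0.1]]

ai_ppl = [perplexity(s) for s in ai_like]
human_ppl = [perplexity(s) for s in human_like]

print("AI-like:   ", [round(p, 2) for p in ai_ppl], round(burstiness(ai_ppl), 2))
print("Human-like:", [round(p, 2) for p in human_ppl], round(burstiness(human_ppl), 2))
```

In this toy example, the "AI-like" sample has the same low perplexity in every sentence, so its spread is zero. The "human-like" sample mixes a predictable sentence with a surprising one, so both its perplexity and its spread come out higher.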
Testing GPTZero with ChatGPT
According to the company, GPTZero was trained with an equal balance of human and AI-written articles. Moreover, the tool is designed to correctly classify 99% of human-written articles and 85% of AI-generated articles.
To put GPTZero to the test, I ran content from GPT-4, GPT-3, and my own sentences written without AI assistance through the tool. You can see the results and scores in the summary table below.
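It helps to work out what those two numbers imply. The sketch below is just arithmetic on the company's stated rates (99% and 85%). The function name and the 10% AI share in the second call are my own hypothetical choices, not figures from GPTZero.

```python
def detector_metrics(human_correct_rate, ai_correct_rate, ai_share=0.5):
    """Derive headline metrics from per-class accuracy and the share of AI texts."""
    human_share = 1.0 - ai_share
    true_pos = ai_correct_rate * ai_share                  # AI texts flagged as AI
    false_pos = (1.0 - human_correct_rate) * human_share   # human texts flagged as AI
    true_neg = human_correct_rate * human_share            # human texts passed as human
    return {
        "accuracy": true_pos + true_neg,
        "false_positive_rate": 1.0 - human_correct_rate,
        "precision": true_pos / (true_pos + false_pos),
    }


# Balanced mix of human and AI texts, as in the stated training setup.
balanced = detector_metrics(0.99, 0.85)

# Hypothetical classroom where only 10% of submissions are AI-written.
mostly_human = detector_metrics(0.99, 0.85, ai_share=0.1)

print(balanced)
print(mostly_human)
```

On a balanced mix, that works out to about 92% overall accuracy, with roughly 99% of "AI" flags being correct. If only 10% of the texts are AI-written, the share of correct flags drops to around 90%, which is why the false positive rate matters so much for educators.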
|Number of Characters||1,796||2,921||2,218||4,296||4,437||1,071|
|Result||Most likely human written||Most likely human written||Most likely human written||May include parts written by AI||Most likely human written||Most likely human written|
I'll go ahead and test GPTZero with an example I asked ChatGPT to write. This seems fairly predictable but also a tad bit creative. Let's see how it does.
In short, GPTZero does a decent job at detecting short-form AI content, but does a lot better at detecting long-form AI writing. GPTZero concluded that the short texts from GPT-4, GPT-3, and me were all likely to be written by a human. When I increased the GPT-3 sample to 4,296 characters, GPTZero detected it as partly written by AI.
GPTZero returned a text perplexity of 12, a very low number that indicates a higher probability of being generated by AI (this is correct). For burstiness, we got a score of 45. After scrolling down the page, you can click "Get GPTZero Result" to get a final score and a prediction. This is what the above paragraph received:
Ok, so it looks like it did a good job. I'm going to test an academic thesis paragraph now, and then I'll test something I wrote in a past blog post. I'm assuming the thesis will be confidently human-generated, and my writing will be somewhere in the middle. Here's the thesis:
With a huge perplexity (especially the average perplexity per sentence), GPTZero predicted a very low chance of this being AI-generated, showing strong signs of it being human-produced:
Now for my personal text from a recent blog post on detecting ChatGPT (ironically, a funny article to use for this example). Just as predicted, I landed in the middle between AI-generated content and a professional academic thesis abstract, but thankfully I still came out as human-generated 😎
Support and Community
GPTZero has a Facebook community called GPTZero Educators, which now has more than 4.3K members. But as expected, most of them are in the education industry. The topics are usually about how to help students avoid cheating via ChatGPT. If you need technical support, you can send them a message via their contact page. GPTZero also accepts requests for new features.
GPTZero has a free version (GPTZero Classic), which you can use without signing up. It has a limit of 5,000 characters (about 700-1,200 words) per input/document. Aside from the character limit, the only difference between the free and paid plans is that the latter has a “better” detector threshold designed for educators. It seems like they're putting most of their effort into these premium models.
| ||GPTZero Classic||GPTZero Educator||GPTZero Pro|
|Character limit per document||5,000||50,000||50,000|
|Upload files limit per batch||3||Unlimited||Unlimited|
|Number of words per month||Unlimited||1 million||2 million|
|AI detection model||Free||Finetuned||Premium with high limits|
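If you're on the free plan and your document is longer than 5,000 characters, one workaround is to split it into pieces before pasting. Here's a rough Python sketch of that idea. The helper name `split_for_limit` is my own, and keep in mind that scoring chunks separately may not give the same result as scoring the whole document at once.

```python
def split_for_limit(text, limit=5000):
    """Split text into chunks of at most `limit` characters, breaking on whitespace.

    Note: runs of whitespace (including paragraph breaks) are collapsed
    to single spaces in the output chunks.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks, current = [], ""
    for word in text.split():
        # A single word longer than the limit is hard-split.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


# Example: a 15,000-character document made of repeated filler words.
sample = "word " * 3000
pieces = split_for_limit(sample)
print(len(pieces), [len(p) for p in pieces])
```

Each piece stays under the Classic plan's character limit, so you can paste them in one at a time.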
Pros and Cons
Is GPTZero Groundbreaking for AI Detection?
I think it's a great tool for insight, but it's clearly not polished enough for everyday use. It's extremely impressive that a 22-year-old created an amazing company and vows to flag text as AI only when it really is. The issue in AI detection isn't finding AI content; it's more about not flagging people as using AI when they weren't.
Compared to other tools on the market, I think GPTZero has great potential to keep growing, especially within the education industry. It's definitely worth trying and has a very promising and authentic team behind the product. If you've used the tool as an educator, I'd love to hear your experience in the comments below!