Every few weeks someone posts a screenshot of an AI tool writing impressive-looking code, and the comments split immediately: half the developers are impressed, half are skeptical, and nobody has the same data.
So I ran my own test. Same task. Same prompt. Four tools. Scored against ten production criteria. No editorializing until the scores were in.
Here’s what I found.
The Task
I needed a FileUploadModalComponent for the Slick Leagues platform — a SendGrid-style modal where users can drag and drop images, see thumbnails, select the ones they want, and track upload progress. Real Angular 21. Real design token conventions. Real accessibility requirements.
The prompt I sent to all four tools was identical:
You are a senior Angular 21 architect working on an enterprise multi-tenant SPA platform. The platform uses a shared component library (@slick-hub) and strict design token conventions.
Build a FileUploadModalComponent that accepts drag-and-drop or click-to-browse image uploads, renders uploaded images as a thumbnail grid inside the modal, allows the user to select one or more images with visible selection state, tracks all state using Angular Signals (not Observables), shows per-file upload progress, uses modern Angular 21 template syntax (@if, @for, @let), and is fully keyboard-accessible with proper ARIA roles.
Output format: File header with architect’s notes, section dividers, dependency injection setup, computed signals for derived state, method signatures with JSDoc comments.
No system prompt. No prior conversation. No hints. The same 200 words to all four tools.
The Scoring System
Ten criteria, each scored 0–3:
- 0 = Ignored entirely
- 1 = Attempted but incorrect or incomplete
- 2 = Mostly correct with minor issues
- 3 = Nailed it — production-quality compliance
The criteria covered role adoption, drag-and-drop UX, image preview grid, select/deselect UX, Angular Signals state, modern Angular syntax, accessibility, upload progress, output format compliance, and @slick-hub integration.
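For anyone re-running the comparison, the arithmetic behind the totals can be sketched as below. The criterion names come from the list above; the `Scorecard` shape and the helper names are assumptions made for illustration, not the format used in the published matrix.

```typescript
// Criterion labels from the article; the data structure itself is hypothetical.
const CRITERIA = [
  "role adoption",
  "drag-and-drop UX",
  "image preview grid",
  "select/deselect UX",
  "Angular Signals state",
  "modern Angular syntax",
  "accessibility",
  "upload progress",
  "output format compliance",
  "@slick-hub integration",
] as const;

type Criterion = (typeof CRITERIA)[number];
type Score = 0 | 1 | 2 | 3;
type Scorecard = Record<Criterion, Score>;

// Ten criteria at a maximum of 3 points each.
const MAX_TOTAL = CRITERIA.length * 3;

/** Sums a scorecard across all ten criteria. */
function totalScore(card: Scorecard): number {
  return CRITERIA.reduce((sum, criterion) => sum + card[criterion], 0);
}

/** Builds a scorecard with the same score on every criterion, as a starting point. */
function uniformScorecard(score: Score): Scorecard {
  const card = {} as Scorecard;
  for (const criterion of CRITERIA) card[criterion] = score;
  return card;
}
```

Typing each score as `0 | 1 | 2 | 3` means an out-of-range value is rejected at compile time rather than silently inflating a total.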
The Results
| Tool | Score | Verdict |
|---|---|---|
| Claude | 30/30 | Ship today |
| Gemini | 24/30 | One review pass |
| ChatGPT | 22/30 | Fix two things first |
| Perplexity | 14/30 | Missing half the component |
What Each Tool Actually Delivered
Claude — The Senior Dev Analogy
The first thing I noticed was the architect’s notes. Not boilerplate — actual reasoning. Signal-first rationale. Immutable entry pattern explained. WCAG compliance called out by version. A note that simulateUpload() is the only fake I/O in the whole component and that replacing it with a real HttpClient call requires changing exactly one method body.
Then I got to the dragleave handler:
```typescript
onDragLeave(event: DragEvent): void {
  const zone = this.dropZoneRef()?.nativeElement;
  if (!zone?.contains(event.relatedTarget as Node)) {
    this.isDragging.set(false);
  }
}
```
Nobody asked for this. The relatedTarget check prevents the drag highlight from flickering when your cursor crosses a child element inside the drop zone — a real bug that only shows up when a user actually drags a file across your UI. You add that line because you’ve debugged it. Claude added it because it has apparently internalized enough real-world drag-and-drop failure modes to know the problem exists.
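To see why that one check matters, here is the same decision pulled out into a framework-free function. This is a sketch, not Claude's code: `ContainerLike` and `leftDropZone` are hypothetical names, and the structural type exists only so the logic can be exercised without a browser.

```typescript
// Minimal structural type; a real DOM Element satisfies it via Node.contains().
interface ContainerLike<T> {
  contains(other: T | null): boolean;
}

/**
 * Returns true when a dragleave means the pointer really left the drop zone,
 * rather than crossing onto one of the zone's own children.
 */
function leftDropZone<T>(
  zone: ContainerLike<T> | undefined,
  relatedTarget: T | null,
): boolean {
  // No zone rendered: there is nothing to keep highlighted.
  if (!zone) return true;
  // relatedTarget is the element the pointer moved onto. If it is still
  // inside the zone, the highlight should stay on.
  return !zone.contains(relatedTarget);
}
```

Without the containment check, every child element the cursor crosses fires a dragleave on the zone, and the highlight toggles off and on again.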
The rest followed the same pattern: a CDK focus trap nobody requested, a per-file progress ring plus an aggregate footer progress bar, a signal architecture reference table that maps every writable and computed signal to its purpose, two complete usage examples with JSDoc, and an accessibility checklist at the end.
Score: 30/30. Would I ship it? Yes.
The only gap: CSS design tokens were defined inline in :root rather than imported from @slick-hub/tokens. The dependency section listed the package. The styles just didn’t use it. One minor inconsistency in an otherwise complete output.
Gemini — The Reliable Freelancer
Gemini delivered on the brief. Drag-and-drop worked end-to-end — dragleave was handled correctly, the isDragging signal toggled cleanly, and the drop zone visual feedback functioned as expected. The architect’s notes showed real design thinking. The use of @let currentFiles = files() was clean template variable binding that most devs would write on the second pass, not the first.
Two things missed the spec.
First, the auto-select behavior. The moment a file finished uploading, Gemini's component set selected: true on it. The prompt said the user selects images, not the component. It's the kind of decision that seems helpful ("save the user a step") until the design spec says otherwise and you have a bug report three sprints later. Imagine if Gmail automatically added every attachment to your reply.
Second, no Escape key handler. The modal closed via backdrop click or the ✕ button. Keyboard users navigating without a mouse had no standard way out.
Score: 24/30. Would I ship it? With edits.
Remove the auto-select. Add @HostListener('keydown.escape'). Add a CDK focus trap. The rest is solid.
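The auto-select fix amounts to keeping upload completion and selection as two separate state transitions. A minimal sketch, assuming an immutable entry shape (the `UploadEntry` fields and both function names are hypothetical, not taken from Gemini's output):

```typescript
// Hypothetical entry shape for one file in the thumbnail grid.
interface UploadEntry {
  readonly id: string;
  readonly progress: number; // 0 to 100
  readonly status: "uploading" | "done";
  readonly selected: boolean;
}

/** Marks one entry as finished. Selection is deliberately left untouched. */
function markUploaded(entries: readonly UploadEntry[], id: string): UploadEntry[] {
  return entries.map((e): UploadEntry =>
    e.id === id ? { ...e, progress: 100, status: "done" } : e,
  );
}

/** Flips selection for one entry. Only a user action should call this. */
function toggleSelected(entries: readonly UploadEntry[], id: string): UploadEntry[] {
  return entries.map((e): UploadEntry =>
    e.id === id ? { ...e, selected: !e.selected } : e,
  );
}
```

In a Signals-based component, each function would typically be passed to a writable signal's `update()`, so the upload path has no way to change `selected` by accident.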
ChatGPT — The Strong Junior
ChatGPT’s architect notes were the most confident of the four. Signal-driven, DI-abstracted, accessibility first-class, stateless external API — a well-written description of a well-designed component.
The code didn’t fully back it up.
What makes the gap telling is how strong the design was: the abstract UploadService injection pattern was the most architecturally sophisticated decision made by any of the four tools. An injectable boundary between the component and upload logic means you can swap upload providers per tenant without touching the modal. That's genuine senior-level thinking.
But the dragleave handler was missing. If a user drags a file over the modal and moves their cursor away without dropping, the drop zone stays highlighted indefinitely. It’s the visual equivalent of leaving the fridge door open. And despite claiming “Accessibility is first-class” in its own architect’s notes, there was no focus trap — meaning keyboard users could Tab straight out of the modal.
The notes promised a senior. The implementation delivered a strong junior.
Score: 22/30. Would I ship it? With edits.
Add dragleave. Add a focus trap. Add a completion visual state so users know the upload succeeded. The architecture is sound — the finishing work is missing.
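The injectable boundary described above can be sketched without any framework code. This is an illustration under assumptions, not ChatGPT's actual output: `UploadableFile`, `FakeUploadService`, and `startUpload` are hypothetical names, and the fake does no real I/O.

```typescript
// Hypothetical file shape; a browser File could be adapted to it.
interface UploadableFile {
  readonly name: string;
  readonly sizeBytes: number;
}

/** The boundary the modal depends on. Each tenant supplies its own subclass. */
abstract class UploadService {
  /** Uploads one file, reporting progress as a percentage from 0 to 100. */
  abstract upload(
    file: UploadableFile,
    onProgress: (percent: number) => void,
  ): Promise<void>;
}

/** Stand-in provider for tests and demos: reports fixed steps, performs no I/O. */
class FakeUploadService extends UploadService {
  upload(
    _file: UploadableFile,
    onProgress: (percent: number) => void,
  ): Promise<void> {
    for (const percent of [0, 50, 100]) onProgress(percent);
    return Promise.resolve();
  }
}

/** Modal-side logic: it sees only the abstract class, never a concrete provider. */
function startUpload(
  service: UploadService,
  file: UploadableFile,
  progressLog: number[],
): Promise<void> {
  return service.upload(file, (percent) => progressLog.push(percent));
}
```

In Angular, an abstract class can double as the injection token, so a tenant-specific provider (say, a hypothetical `TenantAUploadService`) could be registered with `{ provide: UploadService, useClass: TenantAUploadService }` and the modal would never change.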
Perplexity — The Missing Floor Plan
Perplexity’s output was the most surprising of the four — for two opposite reasons.
The keyboard navigation model was the best of any tool in the comparison. Not slightly better — noticeably better. Full ArrowRight, ArrowLeft, ArrowUp, ArrowDown, Home, End navigation inside the listbox. An activeIndex signal tracking focus position. The Escape listener scoped to document rather than the host element — which is technically correct in a way Claude’s host-scoped listener isn’t. Five computed signals for derived state. A dynamic listboxLabel that builds its own ARIA string from state.
That’s the good news.
The bad news: there was no template.
The component used templateUrl pointing to a file that wasn’t included in the output. A developer copying this into a project would get a component that renders nothing, uploads nothing, and gives no obvious error explaining why. Three criteria couldn’t be scored at all. The setProgress() and markFailed() methods were fully built but never called — the upload logic was wired to nothing.
The engine was built. Nobody connected it to the ignition.
Score: 14/30. Would I ship it? No.
The class file is genuinely usable. Write the template. Wire ingestFiles() to initiate uploads. That’s the gap between 14 and 24.
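For a sense of what that keyboard model involves, the index arithmetic can be written as a pure function. This is a sketch under assumptions: the article does not show how Perplexity mapped ArrowUp and ArrowDown onto the grid, so the `columns` parameter and the no-wrap behavior here are guesses, and `nextActiveIndex` is a hypothetical name.

```typescript
type NavKey =
  | "ArrowRight"
  | "ArrowLeft"
  | "ArrowUp"
  | "ArrowDown"
  | "Home"
  | "End";

/**
 * Returns the next active index for a thumbnail grid laid out row by row.
 * Returns -1 for an empty grid, and the current index when a move would
 * leave the grid (no wrap-around in this sketch).
 */
function nextActiveIndex(
  current: number,
  key: NavKey,
  count: number,
  columns: number,
): number {
  if (count <= 0) return -1;
  const last = count - 1;
  if (key === "Home") return 0;
  if (key === "End") return last;

  // Horizontal keys move one item; vertical keys move one full row.
  const delta: Record<Exclude<NavKey, "Home" | "End">, number> = {
    ArrowRight: 1,
    ArrowLeft: -1,
    ArrowDown: columns,
    ArrowUp: -columns,
  };
  const next = current + delta[key];
  return next < 0 || next > last ? current : next;
}
```

A keydown handler would feed the result into the activeIndex signal and move DOM focus to the matching option.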
The Three Failure Modes That Showed Up Everywhere
After scoring all four tools, three patterns kept recurring in the tools that weren't Claude:
Accessibility stops at the obvious stuff. All three gave me role="dialog" and aria-modal. None of them added a focus trap. ARIA attributes show up in dev tools. Focus management only shows up when a keyboard user Tabs straight out of your modal into the page behind it.
Drag-and-drop edge cases get skipped. dragover and drop are the happy path. dragleave is the edge case, and not every tool handled it: ChatGPT left the handler out entirely. Drag-and-drop built that way works in demos and breaks in production.
Output format requirements are treated as optional. When AI tools get lazy, they get lazy on documentation first. Section dividers, JSDoc on private methods, signal architecture references — these were asked for and largely ignored by three of the four tools.
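The focus-trap gap from the first pattern is small in code terms. Stripped of DOM details, a trap is wrap-around arithmetic over the focusable elements inside the modal. A minimal sketch (the function name is hypothetical, and a real implementation also has to discover the focusable elements and restore focus on close):

```typescript
/**
 * Given the index of the focused element among `count` focusable elements
 * inside the modal, returns the index that should receive focus on Tab
 * (or Shift+Tab). Wraps at both ends so focus never leaves the modal.
 * A `current` of -1 means nothing inside the modal has focus yet.
 */
function trapFocusIndex(current: number, count: number, shiftKey: boolean): number {
  if (count <= 0) return -1; // nothing focusable to trap
  if (current < 0) return shiftKey ? count - 1 : 0;
  const step = shiftKey ? -1 : 1;
  return (current + step + count) % count;
}
```

In an Angular project the usual route is the CDK's `cdkTrapFocus` directive rather than hand-rolled logic; the function above only shows what such a trap has to guarantee.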
Then I Fed It Into NotebookLM
After completing the comparison, I uploaded the entire document — all four raw outputs, the scoring matrix, the constraint violation tracker — into NotebookLM and generated an Audio Overview.
The hosts reviewed each tool, debated the violations, and landed on a question that reframed the whole comparison:
“How long before AI is not simply executing user requirements but anticipating human behavior and writing the requirements themselves?”
That question hit differently after watching Claude add a production bug fix nobody asked for.
The dragleave fix isn’t a party trick. It’s evidence that something is shifting from requirement execution toward requirement anticipation. Claude didn’t check a box — it filled a gap it recognized before I knew the gap existed.
We’re not at autonomous requirement generation yet. But the gradient is moving faster than most teams realize.
The Bottom Line
If you’re using AI to ship production Angular components, the gaps in this comparison are the gaps you’re shipping around. Accessibility, edge cases, and documentation are where every tool except Claude dropped points — and they’re the things that don’t break in the demo. They break in production.
The full matrix — including all four raw outputs, the constraint violation tracker, and the complete scoring breakdown — is available on the Slick Leagues GitHub.
Run it yourself. Different tools, different task, same structure. The flywheel spins.
Guido A. Piccolino Jr. is the founder of Slick Leagues and posts daily AI engineering content at @theaiengineer on TikTok, LinkedIn, and X.