

A few days ago I built Cortexa: a full-stack AI chat application with streaming responses, conversation history, model switching, image upload, voice input, and web search grounding. The kind of app that sounds like a weekend project and takes six times longer.
This is not a tutorial. This is a debrief. What I actually built, what broke, what I got wrong, and what I'd change if I started over today.
Cortexa is an AI chat platform built on Next.js 16, NextAuth.js, MongoDB, and the Featherless AI API.
The app lets you select from multiple open-source LLMs, maintain conversation history across sessions, upload images, speak prompts via voice, and optionally ground responses with live web search. It was built specifically for developers who want access to uncensored and abliterated models — useful for security research and adversarial testing.
On paper that's a lot of features. In practice it meant a lot of surface area for things to go wrong.
The first decision that paid off was implementing streaming from day one rather than adding it later. In Next.js 16, you do this with the ReadableStream API and server-sent events. Getting this wired up before the UI was even close to finished meant I never had to refactor around it.
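Here's a minimal sketch of the kind of route handler that involves. It assumes an OpenAI-compatible streaming endpoint, and the paths, URL, and environment variable names are placeholders for illustration rather than Cortexa's actual code:

```ts
// app/api/chat/route.ts - illustrative sketch, not Cortexa's real handler.
// Assumes an OpenAI-compatible streaming endpoint; names and env vars are placeholders.
export async function POST(req: Request) {
  const { messages, model } = await req.json();

  // Ask the provider for a streamed completion (server-sent events).
  const upstream = await fetch("https://api.featherless.ai/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.FEATHERLESS_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages, stream: true }),
  });

  if (!upstream.ok || !upstream.body) {
    return new Response("Upstream error", { status: 502 });
  }
  const body = upstream.body;

  // Pipe the provider's SSE stream straight through to the browser.
  const stream = new ReadableStream({
    async start(controller) {
      const reader = body.getReader();
      try {
        while (true) {
          const { done, value } = await reader.read();
          if (done) break;
          controller.enqueue(value);
        }
      } finally {
        controller.close();
        reader.releaseLock();
      }
    },
  });

  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}
```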
Streaming is table stakes for AI chat now. Users feel the latency even when it's fast. A 1.5-second wait with no feedback feels broken. A 1.5-second wait with tokens appearing feels like thinking.
I chose Featherless AI for two reasons: first, because a friend gave me access to the premium tier, and second, because it provides access to open-source models, including abliterated and uncensored variants, through a unified API without requiring me to self-host anything. For the use case I was targeting (developers doing security research, not illegal activity), this was the right call.
One model, one API format, swap the model ID in the request body. The abstraction made model selection a UI feature rather than an engineering problem (I first learnt about abstraction back in my second year, in SEN201).
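To make that concrete, here's an illustrative sketch; the model IDs and helper name are made up, but the point stands that only one field changes per model:

```ts
// Sketch: the only thing that changes between models is the `model` field.
// The model ID comes straight from the UI dropdown; IDs here are illustrative.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

function buildChatRequest(modelId: string, messages: ChatMessage[]): RequestInit {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.FEATHERLESS_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: modelId, messages, stream: true }),
  };
}

// Swapping models is swapping one string; the rest of the request is identical.
const req = buildChatRequest("meta-llama/Meta-Llama-3-8B-Instruct", [
  { role: "user", content: "Hello" },
]);
```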
NextAuth.js with a credentials provider. No sprawling OAuth setup, no magic email links. Just Google sign-in or email and password with bcrypt, done. Authentication needed to work reliably and stay out of my way. It did.
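For anyone wiring up something similar, here is a minimal sketch of a NextAuth (v5-style) setup with a credentials provider and bcrypt. The user lookup helper and field names are placeholders, not Cortexa's real config:

```ts
// auth.ts - illustrative NextAuth v5-style setup, not Cortexa's actual config.
import NextAuth from "next-auth";
import Credentials from "next-auth/providers/credentials";
import Google from "next-auth/providers/google";
import { compare } from "bcryptjs";
import { findUserByEmail } from "./db"; // hypothetical MongoDB lookup helper

export const { handlers, auth, signIn, signOut } = NextAuth({
  providers: [
    Google, // Google sign-in; reads its client ID/secret from env vars
    Credentials({
      credentials: { email: {}, password: {} },
      async authorize(credentials) {
        const user = await findUserByEmail(String(credentials?.email));
        if (!user) return null;
        // bcrypt comparison against the stored hash
        const ok = await compare(String(credentials?.password), user.passwordHash);
        return ok ? { id: user.id, email: user.email } : null;
      },
    }),
  ],
});
```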
Early on, conversation history was stored as one flat document per conversation. This worked fine with five messages. It started creating visible read latency around fifty. The problem was I was loading the full conversation into memory on every request to maintain context.
The fix was pagination and context windowing — only send the last N messages as context to the LLM, store the rest in MongoDB but don't retrieve them unless the user scrolls back. Simple in theory. A few hours of refactoring in practice because I had not designed for it initially.
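A sketch of the windowing idea, assuming messages live one-per-document in their own MongoDB collection (collection and field names are illustrative, not Cortexa's schema):

```ts
// Sketch: context windowing with messages stored one document each.
import { MongoClient } from "mongodb";

const CONTEXT_WINDOW = 20; // last N messages sent to the LLM

async function getContext(client: MongoClient, conversationId: string) {
  const messages = client.db("cortexa").collection("messages");

  // Fetch only the most recent N messages, newest first...
  const recent = await messages
    .find({ conversationId })
    .sort({ createdAt: -1 })
    .limit(CONTEXT_WINDOW)
    .toArray();

  // ...then restore chronological order for the prompt.
  return recent.reverse().map(({ role, content }) => ({ role, content }));
}
```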
Lesson: Think about your data shape at 10x before you write your first query.
Vision-capable models are not uniform in what they accept. Image encoding, size limits, format support — these vary across models. I spent more time writing conditional logic for image handling than I did on the feature itself.
The right move would have been to gate image upload behind a "vision-compatible model" flag from the start, rather than building it generically and discovering the edge cases mid-flight.
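Something like this, where each model entry carries a capability flag that the upload UI checks before it even renders the attach button. The IDs, flags, and limits here are made up for illustration:

```ts
// Sketch: gate image upload behind a per-model capability flag.
interface ModelInfo {
  id: string;
  supportsVision: boolean;
  maxImageBytes?: number; // size limits vary per model
}

const MODELS: ModelInfo[] = [
  { id: "some-vision-model", supportsVision: true, maxImageBytes: 4 * 1024 * 1024 },
  { id: "some-text-only-model", supportsVision: false },
];

function canAttachImage(modelId: string, fileSize: number): boolean {
  const model = MODELS.find((m) => m.id === modelId);
  if (!model?.supportsVision) return false; // hide or disable the upload button
  if (model.maxImageBytes && fileSize > model.maxImageBytes) return false;
  return true;
}
```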
The Web Speech API is inconsistent. Chrome handles it well. Firefox is patchy. Safari on iOS has its own opinions. I ended up with a messy set of fallbacks and a "best effort" disclaimer in the UI, which is not the same as a working feature.
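The feature-detection side of those fallbacks looks roughly like this simplified sketch; the real handling needs more branches than shown here, and the types are loose because the API is still prefixed or missing in some browsers:

```ts
// Sketch: feature-detecting the Web Speech API before enabling voice input.
function getSpeechRecognitionCtor(): any {
  if (typeof window === "undefined") return null; // server render: no speech API
  const w = window as any;
  return w.SpeechRecognition ?? w.webkitSpeechRecognition ?? null;
}

function startDictation(onResult: (text: string) => void, onUnsupported: () => void) {
  const Ctor = getSpeechRecognitionCtor();
  if (!Ctor) {
    // Firefox / older Safari land here: hide the mic button instead of half-working
    onUnsupported();
    return;
  }
  const recognition = new Ctor();
  recognition.lang = "en-US";
  recognition.interimResults = true;
  recognition.onresult = (event: any) => {
    const transcript = Array.from(event.results as ArrayLike<any>)
      .map((r: any) => r[0].transcript)
      .join("");
    onResult(transcript);
  };
  recognition.onerror = () => onUnsupported();
  recognition.start();
}
```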
I should have either owned the problem properly (custom audio pipeline, Whisper API) or dropped voice from the initial version entirely. Half-shipped features cost you trust.
When a model returned a non-streaming error mid-response (a rate limit, a context overflow, malformed JSON from the provider), the user saw a broken UI. Crashed streams with no recovery path.
I added error boundaries late. They should have been in the architecture diagram before the first line of code.
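On the client, catching those failures can look something like this sketch, where a dropped stream degrades into an error message instead of a frozen UI. The function and message wording are illustrative:

```ts
// Sketch: reading the SSE stream with a recovery path instead of letting it crash.
async function readChatStream(
  res: Response,
  onToken: (chunk: string) => void,
  onError: (message: string) => void,
) {
  if (!res.ok || !res.body) {
    onError(`Request failed (${res.status})`); // rate limit, overflow, provider error
    return;
  }
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      onToken(decoder.decode(value, { stream: true }));
    }
  } catch {
    // Stream died mid-response: surface it and keep the UI alive.
    onError("The response was interrupted. You can retry the last message.");
  } finally {
    reader.releaseLock();
  }
}
```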
I used a two-model workflow throughout this build:
Claude would produce a detailed, scoped prompt — what the function does, what it takes in, what it returns, what edge cases to handle, what not to touch. That prompt went into Gemini in Antigravity. The output came back as a diff I could review.
This worked better than asking either model to do everything. Claude is precise about constraints. Gemini in Antigravity is fast at filling in implementation. The bottleneck became my own review speed, which is the right bottleneck to have.
1. Design the data model first. Not the UI. Not the API routes. The data model. Every slow refactor I did traced back to a schema decision I made too quickly.
2. Ship a smaller version. Streaming responses plus conversation history is already a useful app. Model selection, image upload, voice, and web search grounding are four separate features. I built six features in parallel and had six things partially working for longer than I should have.
3. Abstract the LLM provider from day one. I made assumptions about Featherless AI's API surface that forced me to change code when I tested different models. A thin provider interface would have isolated that (there's a sketch of what I mean after this list).
4. Write at least one integration test before assuming a feature works. "It works on my machine" is not a test. I caught several bugs only when demonstrating the app to someone else, which is a bad time to find bugs.
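On the provider abstraction point, a thin interface could look something like this sketch; the shape here is mine for illustration, not the one Cortexa ended up with:

```ts
// Sketch of a thin provider abstraction; the interface shape is illustrative.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface LLMProvider {
  listModels(): Promise<string[]>;
  streamChat(model: string, messages: ChatMessage[]): AsyncIterable<string>;
}

// One concrete implementation per provider; the rest of the app only sees LLMProvider.
class FeatherlessProvider implements LLMProvider {
  constructor(private apiKey: string) {}

  async listModels(): Promise<string[]> {
    // Placeholder: fetch and parse the provider's model list here.
    return [];
  }

  async *streamChat(model: string, messages: ChatMessage[]) {
    // Placeholder: call the provider's streaming endpoint and yield tokens as they arrive.
    yield `[${model}] streaming not implemented in this sketch`;
  }
}
```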
Building Cortexa taught me something I didn't expect: the gap between "I understand how this works" and "I can build a production-quality version of this" is larger than it looks.
I knew how streaming worked. I knew how auth worked. I knew how MongoDB worked. Combining them under time pressure, with real edge cases, while making product decisions simultaneously is a different skill. That's the skill you build by shipping, not by reading (lol, you're reading right now).
Cortexa isn't perfect. Some of the error handling is still rougher than I'd like. The voice feature is unreliable enough that I consider it decorative. But it ships, it works for its primary use case, and I understand it completely — every line of it.
That last part matters more than I thought it would.
To test Cortexa live, just visit cortexa.samkiel.dev
The full source isn't public yet — parts of it will be. If you're building something similar or have questions about the architecture, find me on X at @samkiel_dev.