Case Study

Manshverse

Multi-model AI platform. Smart routing across Groq and Gemini, deep reasoning engine, and 45+ historical personas.

React · Vite · Firebase · Groq API · Gemini API · UPI Gateway

The Problem

The AI wrapper market is incredibly saturated. I wanted to build an AI platform that went beyond simple chat completions by offering distinct, deeply crafted personas (like Einstein, Oppenheimer, and Sherlock) that actually retained their character nuances across long contexts. Furthermore, I needed to implement a cost-effective routing engine that could decide when to use lightning-fast models (via Groq) versus high-context reasoning models (via Gemini).

Architecture

The system is split into three core layers:

Intelligent Routing Engine

Instead of hardcoding a single LLM provider, Manshverse uses an abstraction layer that intercepts each user prompt and picks a provider for it. Short, latency-sensitive conversational queries go to Groq (Llama-3), which returns output at ~800 tokens/sec. Complex analytical queries that need a larger context window go to Gemini Pro. This hybrid approach cuts operational costs by 60% while keeping perceived latency low.
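A minimal sketch of how such a routing decision could look. The names (routePrompt, RouterOptions), the thresholds, and the keyword heuristic are all illustrative assumptions, not the rules Manshverse actually uses:

```typescript
// Hypothetical types and thresholds, chosen only to illustrate the idea.
type Provider = "groq" | "gemini";

interface RouteDecision {
  provider: Provider;
  reason: string;
}

interface RouterOptions {
  maxFastPromptChars: number; // longer prompts go to the high-context model
  maxFastHistoryTurns: number; // longer conversations go to the high-context model
}

const DEFAULT_OPTIONS: RouterOptions = {
  maxFastPromptChars: 600,
  maxFastHistoryTurns: 20,
};

// Keywords that hint at analytical work. A real router might use a classifier instead.
const ANALYTICAL_HINTS = /\b(analy[sz]e|compare|prove|derive|step[- ]by[- ]step)\b/i;

function routePrompt(
  prompt: string,
  historyTurns: number,
  options: RouterOptions = DEFAULT_OPTIONS,
): RouteDecision {
  if (prompt.length > options.maxFastPromptChars) {
    return { provider: "gemini", reason: "long prompt" };
  }
  if (historyTurns > options.maxFastHistoryTurns) {
    return { provider: "gemini", reason: "long conversation" };
  }
  if (ANALYTICAL_HINTS.test(prompt)) {
    return { provider: "gemini", reason: "analytical keywords" };
  }
  // Default: short conversational query, so prefer the low-latency provider.
  return { provider: "groq", reason: "short conversational query" };
}
```

Because the function returns a reason alongside the provider, routing decisions can be logged and the thresholds tuned against real traffic.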

Persona System & Memory

I engineered 45+ distinct system prompt frameworks. To maintain character consistency without blowing up the context window, I implemented a rolling context buffer using Firebase. The system retains the last 15 exchanges fully, while summarizing older context into a dense memory block that gets prepended to the system prompt dynamically.
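The rolling buffer described above can be sketched roughly as follows. This is an assumption-laden illustration: the type names, the addExchange and buildSystemPrompt helpers, and the synchronous Summarizer callback are hypothetical, and Firebase persistence is left out entirely. In practice the summarizer would likely be an asynchronous LLM call.

```typescript
interface Exchange {
  user: string;
  assistant: string;
}

interface MemoryState {
  summary: string; // dense memory block covering evicted exchanges
  recent: Exchange[]; // most recent exchanges, kept verbatim
}

// Hypothetical hook: folds evicted exchanges into the existing summary.
type Summarizer = (previousSummary: string, evicted: Exchange[]) => string;

const MAX_RECENT = 15; // the case study keeps the last 15 exchanges in full

function addExchange(
  state: MemoryState,
  exchange: Exchange,
  summarize: Summarizer,
  maxRecent: number = MAX_RECENT,
): MemoryState {
  const all = [...state.recent, exchange];
  const overflow = all.length - maxRecent;
  if (overflow <= 0) {
    return { summary: state.summary, recent: all };
  }
  // Oldest exchanges fall out of the verbatim window and into the summary.
  const evicted = all.slice(0, overflow);
  return {
    summary: summarize(state.summary, evicted),
    recent: all.slice(overflow),
  };
}

// Prepends the memory block to the persona's system prompt, as described above.
function buildSystemPrompt(personaPrompt: string, state: MemoryState): string {
  if (state.summary === "") {
    return personaPrompt;
  }
  return `Memory of earlier conversation:\n${state.summary}\n\n${personaPrompt}`;
}
```

Returning a new state object instead of mutating the old one keeps the buffer easy to store as a single document and easy to reason about in React state.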

Subscription & UPI Gateway

The platform is monetized via a custom billing system integrated with a UPI gateway for Indian users. I built a secure webhook listener that verifies payment signatures and automatically updates the user's Firebase custom claims, unlocking premium models and lifting rate limits without requiring a manual refresh.
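A rough sketch of the verify-then-upgrade flow. The gateway's real signing scheme is not described here, so the sketch takes the signing function as a parameter. The PaymentEvent shape, handleWebhook, and the Signer type are hypothetical. In a Node backend the signer would typically be backed by an HMAC from the platform's crypto library.

```typescript
// Hypothetical payload shape; the real gateway's fields are not documented here.
interface PaymentEvent {
  userId: string;
  plan: "free" | "premium";
  status: "success" | "failed";
}

interface ClaimUpdate {
  userId: string;
  claims: { plan: PaymentEvent["plan"] };
}

// Assumed signing hook, e.g. an HMAC over the raw request body.
type Signer = (secret: string, payload: string) => string;

// Compares every character so timing does not reveal where a mismatch occurs.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) {
    return false;
  }
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// Returns the claim update to apply, or null if the event must be ignored.
function handleWebhook(
  rawBody: string,
  signature: string,
  secret: string,
  sign: Signer,
): ClaimUpdate | null {
  // Verify against the raw body before parsing or trusting any field.
  if (!constantTimeEqual(sign(secret, rawBody), signature)) {
    return null;
  }
  const event = JSON.parse(rawBody) as PaymentEvent;
  if (event.status !== "success") {
    return null;
  }
  return { userId: event.userId, claims: { plan: event.plan } };
}
```

With the Firebase Admin SDK, the returned claims could then be applied with setCustomUserClaims, and the client can pick them up by forcing an ID token refresh. Whether Manshverse does exactly this is an assumption.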

The Hardest Problems

1. Token Streaming State

Handling Server-Sent Events (SSE) for streaming text across different provider APIs was challenging. Groq's stream format differs from Gemini's. I wrote a unified streaming adapter that standardizes the chunks into a consistent readable stream format so the React frontend only has to manage one type of state update.
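The adapter idea can be sketched as a pair of per-provider parsers behind one normalized chunk type. The payload shapes below are assumptions: an OpenAI-style delta payload ending in a [DONE] sentinel for Groq, and a candidates/parts payload for Gemini. They should be checked against each provider's current documentation. StreamChunk and normalizeSse are hypothetical names.

```typescript
// The single shape the frontend has to understand.
interface StreamChunk {
  text: string;
  done: boolean;
}

// Assumed Groq payload: OpenAI-compatible deltas, terminated by "[DONE]".
function parseGroqData(data: string): StreamChunk {
  if (data === "[DONE]") {
    return { text: "", done: true };
  }
  const json = JSON.parse(data) as {
    choices?: { delta?: { content?: string | null } }[];
  };
  return { text: json.choices?.[0]?.delta?.content ?? "", done: false };
}

// Assumed Gemini payload: text lives in candidates[0].content.parts[].text,
// and the final chunk carries a finishReason.
function parseGeminiData(data: string): StreamChunk {
  const json = JSON.parse(data) as {
    candidates?: {
      content?: { parts?: { text?: string }[] };
      finishReason?: string;
    }[];
  };
  const candidate = json.candidates?.[0];
  const parts = candidate?.content?.parts ?? [];
  const text = parts.map((p) => p.text ?? "").join("");
  return { text, done: candidate?.finishReason !== undefined };
}

// Converts a block of raw SSE text into normalized chunks.
function normalizeSse(provider: "groq" | "gemini", rawSse: string): StreamChunk[] {
  const chunks: StreamChunk[] = [];
  for (const line of rawSse.split(/\r?\n/)) {
    const match = /^data:\s*(.*)$/.exec(line);
    if (match === null) {
      continue; // skip blank lines, comments, and other SSE fields
    }
    const data = (match[1] || "").trim();
    if (data === "") {
      continue;
    }
    chunks.push(provider === "groq" ? parseGroqData(data) : parseGeminiData(data));
  }
  return chunks;
}
```

A production adapter would apply the same parsers incrementally to a network stream rather than to a complete string, but the React side would still only ever see StreamChunk values.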

2. Web/Android Code Sharing

I wanted to ship an Android app without rewriting the logic. I used a shared-codebase approach: the core AI networking and state management hooks are extracted into common modules, and the web app is compiled into an APK wrapped with native bridges for push notifications and local storage.
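One way to picture the bridge boundary: shared logic depends only on a small interface, and each platform supplies its own implementation. Everything named here (PlatformBridge, rememberPersona, restorePersona, createMemoryBridge) is hypothetical, and the in-memory bridge stands in for real web or native implementations so the sketch runs anywhere.

```typescript
// Hypothetical contract that both the web build and the Android wrapper implement.
interface PlatformBridge {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  registerForPush(onToken: (token: string) => void): void;
}

const LAST_PERSONA_KEY = "lastPersona";

// Shared logic: identical on web and Android because it only talks to the interface.
function rememberPersona(bridge: PlatformBridge, personaId: string): void {
  bridge.setItem(LAST_PERSONA_KEY, personaId);
}

function restorePersona(bridge: PlatformBridge, fallback: string): string {
  const saved = bridge.getItem(LAST_PERSONA_KEY);
  return saved === null ? fallback : saved;
}

// In-memory stand-in for a real bridge (browser storage on web, native storage on Android).
function createMemoryBridge(): PlatformBridge {
  const store: Record<string, string> = Object.create(null);
  return {
    getItem: (key) => store[key] ?? null,
    setItem: (key, value) => {
      store[key] = value;
    },
    registerForPush: () => {
      // No push service in this stand-in, so the callback is never invoked.
    },
  };
}
```

Keeping the interface this narrow means a new platform capability has to be added in one place before shared code can use it.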