Multilayered AI Voice Agents for automating automation?
Building TidyCalls: A Modern SaaS AI Agent for automating call center automation
Introduction
TidyCalls is a SaaS platform that uses AI to provide intelligent call screening and call management. It is built on a multilayered voice AI agent infrastructure and made approachable through modern web technologies, with a focus on both developer and user experience. It represents a new approach to handling business communications.
Philosophy & Core Principles
User-Centric Design
The project was built around three core user needs:
- Peace of Mind: Business owners need to focus without missing opportunities
- Intelligence: Call screening that understands context and intent, without a fixed script for how the conversation must go
- Transparency: Clear insights into call handling and decision-making with transcripts for later scrutiny
Technical Philosophy
- Server Components: Using Next.js 14's server components where they improve performance
- Progressive Enhancement: Starting with core functionality and enhancing based on capabilities
- Type Safety: End-to-end TypeScript for reliability and developer experience
- Managed Services: Preferring services where someone else is on call, for better uptime and operational efficiency
Technology Stack
Frontend
- Next.js 14: App Router for modern React patterns
- TypeScript: For type safety and better developer experience
- Tailwind CSS: Utility-first styling
- Shadcn/UI: Component library built on Radix UI
- Clerk: Authentication and user management
- Stripe: Billing and checkout
- v0: Design
Backend
- AWS: AWS SAM (Serverless Application Model) for serving web requests, and ECS for the continuous streaming service, a low-latency WebSocket proxy
- Python: The Lambdas behind API Gateway, which interface with AWS services and the AI models, are mostly written in Python
- Node.js: Runtime environment for the streaming proxy, capable of async processing of parallel streams
- Bedrock: The wonderful service that enables us to switch models easily while scaling and being very cheap to run
- Claude: AI models for call screening, virtual assistant, summary and decision making
- Telephonyx: Global telephony infrastructure connections to leading networks
- DynamoDB: Data persistence
- Polly and Transcribe: Voice Features
- GCP: Alternative Voice Features
- Stripe: Metered billing for telephony and consumption-based charges
Architecture
General end-to-end
The basic structure is that of a B2B SaaS website, with a React/Next.js frontend, Shadcn components and a custom backend behind an API. Authentication and payments are such pivotal services in a SaaS product that you really want to stand on the shoulders of giants. Hence, checkout and payments are handled via Stripe with their superior UX, and authentication is facilitated via Clerk, relieving the service from holding user credentials.
Media handling
In a real-time conversational agent, it's key to keep latencies low. This is why TidyCalls relies on a dedicated media pipeline that brings the voice right over to the AI. In the future, we'll be able to use voice and vision models to process the media directly, thereby further reducing the latencies while keeping costs under control.
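The latency point can be illustrated with a minimal sketch, assuming caller audio arrives as a stream of small frames. The names `AudioFrame`, `forwardFrames`, `forwardAll` and `fakeCall` are hypothetical and not taken from the TidyCalls codebase. The sketch only shows the general idea of forwarding each frame as soon as it arrives, for several calls at once, instead of buffering a whole utterance first.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not TidyCalls code.
interface AudioFrame {
  seq: number;         // position of the frame within the call
  payload: Uint8Array; // raw audio bytes for one short frame
}

// Forward each frame to the sink as soon as it arrives, without buffering
// the whole utterance. Resolves to the number of frames forwarded.
async function forwardFrames(
  source: AsyncIterable<AudioFrame>,
  sink: (frame: AudioFrame) => Promise<void>,
): Promise<number> {
  let forwarded = 0;
  for await (const frame of source) {
    await sink(frame);
    forwarded += 1;
  }
  return forwarded;
}

// Handle several calls concurrently, so one slow call does not block the rest.
async function forwardAll(
  calls: AsyncIterable<AudioFrame>[],
  sink: (frame: AudioFrame) => Promise<void>,
): Promise<number[]> {
  return Promise.all(calls.map((source) => forwardFrames(source, sink)));
}

// Simulated caller audio, only for trying the sketch out.
async function* fakeCall(frames: number): AsyncGenerator<AudioFrame> {
  for (let seq = 0; seq < frames; seq++) {
    yield { seq, payload: new Uint8Array(160) };
  }
}
```

In a real proxy the sink would presumably write to a WebSocket or a transcription stream; here it is just a callback so the flow can be followed end to end.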
Key Features
- Real-time Call Management
// Example of real-time call handling
async function handleIncomingCall(call: Call) {
  const context = await getCallContext(call);
  const decision = await AI.analyzeCall(context);
  return decision.shouldConnect ? connectCall() : screenCall();
}
- Dashboard Analytics
- Settings Management
interface Settings {
  phoneNumber: string;
  screeningPreferences: ScreeningPreferences;
  availability: AvailabilitySchedule;
  language: SupportedLanguage;
}
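To make the settings shape more concrete, here is a minimal sketch of how such settings could feed a rule-based check before any AI analysis runs. The article names `ScreeningPreferences`, `AvailabilitySchedule` and `SupportedLanguage` but does not define them, so the definitions below are assumptions, as are `Call`, `CallAction`, `isAvailable` and `preScreen`.

```typescript
// Hypothetical shapes: the article names these types but does not define them.
type SupportedLanguage = "en" | "fi";

interface ScreeningPreferences {
  blockAnonymous: boolean; // send calls without caller ID to screening
}

// Opening hours per weekday (0 = Sunday), in local 24-hour time.
type AvailabilitySchedule = Partial<
  Record<number, { fromHour: number; toHour: number }>
>;

interface Settings {
  phoneNumber: string;
  screeningPreferences: ScreeningPreferences;
  availability: AvailabilitySchedule;
  language: SupportedLanguage;
}

interface Call {
  from: string | null; // null when the caller ID is withheld
  receivedAt: Date;
}

type CallAction = "connect" | "screen";

// True when the call arrives inside the configured hours for that weekday.
function isAvailable(schedule: AvailabilitySchedule, at: Date): boolean {
  const slot = schedule[at.getDay()];
  if (!slot) return false;
  const hour = at.getHours();
  return hour >= slot.fromHour && hour < slot.toHour;
}

// Cheap pre-check based on settings alone, before any model is consulted.
function preScreen(call: Call, settings: Settings): CallAction {
  if (call.from === null && settings.screeningPreferences.blockAnonymous) {
    return "screen";
  }
  return isAvailable(settings.availability, call.receivedAt)
    ? "connect"
    : "screen";
}
```

A check like this would only decide the obvious cases; anything ambiguous would still go to the AI analysis step described above.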
Development Process
1. Planning & Setup
- Requirements gathering
- Technology selection
- Infrastructure setup
- CI/CD pipeline configuration
2. Core Development
- Authentication implementation
- Basic call handling
- Settings management
- Dashboard development
3. AI Integration
- Model selection and testing (Claude models via Bedrock)
- Prompt engineering
- Context management
- Response handling
4. Testing & Optimization
- Unit testing with Jest
- E2E testing with Playwright
- Performance optimization
- Security auditing
Performance Optimization
Server Components
// Example of optimized server component
// fetchCalls, Table and CallRow are assumed to be defined elsewhere
async function CallLog() {
  const calls = await fetchCalls();
  return (
    <Table>
      {calls.map((call) => (
        <CallRow key={call.id} call={call} />
      ))}
    </Table>
  );
}
Client-Side Optimization
- Route prefetching
- Image optimization
- Component code splitting
- State management optimization
Challenges & Solutions
- Real-time Updates
  - Challenge: Keeping dashboard data fresh
  - Solution: Implemented efficient polling with WebSocket fallback
- AI Response Time
  - Challenge: Quick call screening decisions
  - Solution: Parallel processing and caching strategies
- Type Safety
  - Challenge: Maintaining types across the stack
  - Solution: Shared type definitions and zod validation
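The last two solutions can be sketched together. This is a minimal illustration under stated assumptions, not the TidyCalls implementation: `ScreeningDecision`, `parseDecision`, `TtlCache` and `gatherContext` are hypothetical names, and the runtime check is written by hand so the sketch has no dependencies, where a zod schema would normally do that job.

```typescript
// Hypothetical sketch: none of these names come from the TidyCalls codebase.
interface ScreeningDecision {
  shouldConnect: boolean;
  reason: string;
}

// Hand-written runtime check for data crossing a service boundary.
// A schema library such as zod would normally replace this function.
function parseDecision(input: unknown): ScreeningDecision {
  if (typeof input !== "object" || input === null) {
    throw new Error("decision must be an object");
  }
  const record = input as Record<string, unknown>;
  const shouldConnect = record["shouldConnect"];
  const reason = record["reason"];
  if (typeof shouldConnect !== "boolean") {
    throw new Error("shouldConnect must be a boolean");
  }
  if (typeof reason !== "string") {
    throw new Error("reason must be a string");
  }
  return { shouldConnect, reason };
}

// Small time-based cache; the clock is injectable so expiry can be tested.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= this.now()) {
      this.entries.delete(key); // drop expired or missing entries
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Run independent lookups in parallel instead of one after another.
async function gatherContext<A, B>(
  loadCaller: () => Promise<A>,
  loadHistory: () => Promise<B>,
): Promise<{ caller: A; history: B }> {
  const [caller, history] = await Promise.all([loadCaller(), loadHistory()]);
  return { caller, history };
}
```

Keying the cache by caller number would let repeat callers skip a model round trip for as long as the entry lives; what TTL is appropriate depends on how quickly screening preferences change.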
Future Roadmap
- Enhanced AI Features
  - Multi-language support
  - Sentiment analysis
  - Custom AI training
- Integration Expansion
  - CRM integrations
  - Calendar synchronization
  - Mobile app development
Conclusion
TidyCalls demonstrates how modern web technologies and AI can be combined to solve real business problems. The project's success lies in its focus on user experience, performance, and maintainable code architecture.
Resources
To follow!
This post was written by Sami Pippuri, lead developer of TidyCalls. For more information, visit tidycalls.com.