Multilayered AI Voice Agents for automating automation?

Sami Pippuri
October 5, 2024

Building TidyCalls: A Modern SaaS AI Agent for automating call center automation

Introduction

TidyCalls is a SaaS platform that uses AI to provide intelligent call screening and management. It is built on a multilayered voice AI agent infrastructure and made approachable through modern web technologies, with a focus on both developer and user experience. It represents a new approach to handling business communications.

Philosophy & Core Principles

User-Centric Design

The project was built around three core user needs:

  1. Peace of Mind: Business owners need to focus without missing opportunities
  2. Intelligence: Call screening that understands context and intent, without a fixed formula of how the conversation can go
  3. Transparency: Clear insights into call handling and decision-making with transcripts for later scrutiny

Technical Philosophy

  • Server Components: Using Next.js 14 server components in the places where they improve performance
  • Progressive Enhancement: Starting with core functionality and enhancing based on capabilities
  • Type Safety: End-to-end TypeScript for reliability and developer experience
  • Managed Services: Using services where someone else is on call, for better uptime and operational efficiency

Technology Stack

Frontend

  • Next.js 14: App Router for modern React patterns
  • TypeScript: For type safety and better developer experience
  • Tailwind CSS: Utility-first styling
  • Shadcn/UI: Component library built on Radix UI
  • Clerk: Authentication and user management
  • Stripe: Billing and checkout
  • v0: Design

Backend

  • AWS: Serverless (SAM) for serving web requests, ECS for continuous streaming services
  • Python: Most of the Lambdas behind API Gateway, which interface with AWS services and AI, were written in Python
  • Node.js: Runtime environment for the streaming proxy, capable of async processing of parallel streams
  • Bedrock: The wonderful service that lets us switch models easily, scales with demand, and is very cheap to run
  • Claude: AI models for call screening, virtual assistant, summary and decision making
  • Telephonyx: Global telephony infrastructure connections to leading networks
  • DynamoDB: Data persistence
  • Polly and Transcribe: Voice Features
  • GCP: Alternative Voice Features
  • Stripe: Metered billing for telephony and consumption-based charges

Architecture

graph TD
  A[Client Browser] --> B[Next.js App Router]
  B --> C[Server Components]
  B --> D[Client Components]
  C --> E[API Routes]
  E --> F[Backend Logic]
  E --> G[Telephony]
  E --> H[Database]
  E --> I[Stripe]
  E --> J[Clerk]
  C --> J[Clerk]

graph TD
  A[Caller Phone] --> B[Telephony]
  B --> C[Audio Stream]
  C --> D[Backend Logic]
  D --> E[LLM]
  E --> D
  D --> C
  C --> B
  B --> F[Client Phone]
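
To make the call-path diagram more concrete, here is a minimal sketch of a single conversational turn: caller audio goes in, the backend asks the LLM for a reply, and agent audio comes back out. The names `VoicePipeline`, `Turn`, and `handleTurn` are hypothetical and not taken from the TidyCalls codebase. The three stages are injected as plain functions, so the sketch makes no claim about how Transcribe, Claude on Bedrock, or Polly are actually called.

```typescript
// Hypothetical stage signatures. The real speech and LLM services would sit
// behind these functions; none of them is called directly in this sketch.
type SpeechToText = (audio: Uint8Array) => Promise<string>;
type LanguageModel = (history: Turn[]) => Promise<string>;
type TextToSpeech = (text: string) => Promise<Uint8Array>;

interface Turn {
  role: "caller" | "agent";
  text: string;
}

interface VoicePipeline {
  transcribe: SpeechToText;
  respond: LanguageModel;
  synthesize: TextToSpeech;
}

// Handles one caller utterance: audio in, agent audio out.
// The transcript array is appended to in place, so later turns see the
// whole conversation and it can be stored for review afterwards.
async function handleTurn(
  pipeline: VoicePipeline,
  transcript: Turn[],
  callerAudio: Uint8Array,
): Promise<Uint8Array> {
  const callerText = await pipeline.transcribe(callerAudio);
  transcript.push({ role: "caller", text: callerText });

  const agentText = await pipeline.respond(transcript);
  transcript.push({ role: "agent", text: agentText });

  return pipeline.synthesize(agentText);
}
```

Injecting the stages keeps the loop testable with stubs and leaves room to swap the voice provider, which fits the stack's mention of GCP as an alternative for voice features.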

Key Features

  1. Real-time Call Management

    // Example of real-time call handling: gather context, let the AI decide,
    // then either connect the caller or hand the call over to screening
    async function handleIncomingCall(call: Call) {
      const context = await getCallContext(call);
      const decision = await AI.analyzeCall(context);
      return decision.shouldConnect ? connectCall(call) : screenCall(call);
    }
    
  2. Dashboard Analytics
    [Figure: dashboard screenshot]

  3. Settings Management

    interface Settings {
      phoneNumber: string;
      screeningPreferences: ScreeningPreferences;
      availability: AvailabilitySchedule;
      language: SupportedLanguage;
    }
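
The `Settings` interface above refers to types that the post does not define. The sketch below fills them in with assumed shapes, purely for illustration, and shows one way settings could feed a rule-based pre-check before any AI screening runs. `TimeWindow`, `isAvailable`, `routeCall`, the fields of `ScreeningPreferences`, and the language codes are all hypothetical.

```typescript
// Assumed shapes: the post names these types but does not show them.
type SupportedLanguage = "en" | "fi"; // hypothetical language codes

interface ScreeningPreferences {
  screenAnonymous: boolean; // always screen callers without a number
  alwaysConnect: string[];  // numbers that bypass screening
}

interface TimeWindow {
  startMinute: number; // minutes since midnight (UTC), inclusive
  endMinute: number;   // minutes since midnight (UTC), exclusive
}

// Keyed by day of week, 0 = Sunday, matching Date.getUTCDay().
type AvailabilitySchedule = { [dayOfWeek: number]: TimeWindow[] | undefined };

interface Settings {
  phoneNumber: string;
  screeningPreferences: ScreeningPreferences;
  availability: AvailabilitySchedule;
  language: SupportedLanguage;
}

// True when the given instant falls inside one of that day's windows.
function isAvailable(schedule: AvailabilitySchedule, at: Date): boolean {
  const minute = at.getUTCHours() * 60 + at.getUTCMinutes();
  const windows = schedule[at.getUTCDay()] ?? [];
  return windows.some((w) => minute >= w.startMinute && minute < w.endMinute);
}

type Routing = "connect" | "screen";

// Rule-based pre-check: allow-listed callers connect, anonymous callers are
// screened if configured, everyone else connects only during availability.
function routeCall(settings: Settings, caller: string | null, at: Date): Routing {
  const prefs = settings.screeningPreferences;
  if (caller !== null && prefs.alwaysConnect.includes(caller)) return "connect";
  if (caller === null && prefs.screenAnonymous) return "screen";
  return isAvailable(settings.availability, at) ? "connect" : "screen";
}
```

Working in UTC keeps the example deterministic. A real schedule would more likely be stored with the owner's time zone.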
    

Development Process

1. Planning & Setup

  • Requirements gathering
  • Technology selection
  • Infrastructure setup
  • CI/CD pipeline configuration

2. Core Development

  • Authentication implementation
  • Basic call handling
  • Settings management
  • Dashboard development

3. AI Integration

  • LLM selection and testing
  • Prompt engineering
  • Context management
  • Response handling

4. Testing & Optimization

  • Unit testing with Jest
  • E2E testing with Playwright
  • Performance optimization
  • Security auditing

Performance Optimization

Server Components

// Example of optimized server component
async function CallLog() {
  const calls = await fetchCalls();
  return (
    <Table>
      {calls.map((call) => (
        <CallRow key={call.id} call={call} />
      ))}
    </Table>
  );
}

Client-Side Optimization

  • Route prefetching
  • Image optimization
  • Component code splitting
  • State management optimization

Challenges & Solutions

  1. Real-time Updates

    • Challenge: Keeping dashboard data fresh
    • Solution: Implemented efficient polling with WebSocket fallback
  2. AI Response Time

    • Challenge: Quick call screening decisions
    • Solution: Parallel processing and caching strategies
  3. Type Safety

    • Challenge: Maintaining types across the stack
    • Solution: Shared type definitions and zod validation
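
As an illustration of the "parallel processing and caching" idea from the second challenge, the sketch below runs two independent lookups concurrently and caches the combined result for a short time. It is not the TidyCalls implementation: `TtlCache`, `CallContext`, `loadContext`, and the two lookup functions are hypothetical names, and the real context sources are not described in this post.

```typescript
// A small in-memory cache whose entries expire after ttlMs milliseconds.
// The clock is injectable so expiry can be tested without waiting.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // drop stale entries lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Hypothetical shape of what the screening prompt needs about a caller.
interface CallContext {
  callerProfile: string;
  recentCalls: string[];
}

// Returns cached context when fresh; otherwise runs both lookups at once.
async function loadContext(
  caller: string,
  cache: TtlCache<CallContext>,
  lookupProfile: (caller: string) => Promise<string>,
  lookupRecentCalls: (caller: string) => Promise<string[]>,
): Promise<CallContext> {
  const cached = cache.get(caller);
  if (cached !== undefined) return cached;

  // The lookups do not depend on each other, so they run concurrently.
  const [callerProfile, recentCalls] = await Promise.all([
    lookupProfile(caller),
    lookupRecentCalls(caller),
  ]);
  const context: CallContext = { callerProfile, recentCalls };
  cache.set(caller, context);
  return context;
}
```

With `Promise.all`, the wait is bounded by the slower lookup rather than the sum of both, and a repeat caller within the TTL skips the lookups entirely.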

Future Roadmap

  1. Enhanced AI Features

    • Multi-language support
    • Sentiment analysis
    • Custom AI training
  2. Integration Expansion

    • CRM integrations
    • Calendar synchronization
    • Mobile app development

Conclusion

TidyCalls demonstrates how modern web technologies and AI can be combined to solve real business problems. The project's success lies in its focus on user experience, performance, and maintainable code architecture.

Resources

To follow!


This post was written by Sami Pippuri, lead developer of TidyCalls. For more information, visit tidycalls.com.