Live translation

SpeechLab, an AI startup backed by Andrew Ng’s AI Fund, focuses on AI-powered speech-to-speech products.

Context: Single Product to Multiple

Our first product, AI dubbing, uses machine learning, text-to-speech, and natural language processing to help users localize video and audio content efficiently.

AI dubbing lets users generate lifelike multilingual voiceovers easily.

As a startup, we’re evolving from a single-product focus to a diverse multi-product portfolio. To address more real-time user scenarios, our next step is to develop a low-latency speech-to-speech product.

Problem Statement: Challenges in Multilingual Meetings

In today's globalized world, effective communication across language barriers remains a significant challenge. Traditional solutions have several limitations:

Key Pain Points

Language barriers in global context

Reduced participation due to language barriers in cross-language meetings and conferences.

Limited accessibility for multilingual events

Multiple interpreters needed to cover different languages, with advance booking required.

High costs of traditional solutions

Beyond these complexities, human interpreters are expensive and require substantial investment.

Other challenges:

Need for transcription alongside translation;

Requirement for real-time translation without disrupting the meeting;

Seamless integration with existing meeting platforms.

Define

Solution: AI live interpretation for multilingual virtual meetings.
Platform: Web-based application integrated with meeting platforms.

Target Users

• Global businesses with distributed teams
• Conference and virtual event organizers
• International content creators and livestream hosts

In short, it caters to business clients.

User Goals

Effortless communication across languages

Reliable high-quality real-time translation

Simple process that integrates with existing workflows

Flexible language selection and audio control

Business Goals

1. User growth. Multilingual support and integration with popular platforms increase user adoption and retention and expand product reach.
2. Revenue generation. Drive recurring revenue and cross-sell with the AI dubbing product through different plans.
3. User satisfaction. A cost-effective, multi-product offering improves key metrics and strengthens user loyalty.

Product & Design Process

In our startup, I took on the role of both the product designer and the product manager for this project.

Team Collaboration & Project Planning

To keep development collaborative and on track, I defined the product model and each team member's focus at each stage, along with a concept for a mobile app.

Devs

Handle platform integrations; Implement the user interface; Manage the backend to enable real-time audio processing.

LLM

Provide language model capabilities, including real-time speech-to-text processing, translation, and speech synthesis.

PM & Designer

Define the product vision, translate user needs into actionable tasks, and prioritize features; design everything from concept to finalized pixels.
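The speech-to-text, translation, and speech synthesis stages listed under the LLM role can be pictured as a chained pipeline that runs once per audio chunk. The sketch below is illustrative only and does not reflect SpeechLab's actual implementation: `live_translate`, `TranslatedChunk`, and the three stage callables are hypothetical names, and the stages are passed in as plain functions so that any real model could stand behind them.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class TranslatedChunk:
    """One unit of output: transcript, translated text, and synthesized audio."""
    source_text: str
    translated_text: str
    audio: bytes


def live_translate(
    chunks: Iterable[bytes],
    transcribe: Callable[[bytes], str],        # hypothetical speech-to-text stage
    translate: Callable[[str, str], str],      # hypothetical translation stage
    synthesize: Callable[[str, str], bytes],   # hypothetical speech synthesis stage
    target_lang: str,
) -> Iterator[TranslatedChunk]:
    """Run each incoming audio chunk through the three stages in order."""
    for chunk in chunks:
        text = transcribe(chunk)
        if not text.strip():
            # Skip silence or chunks with no recognized speech.
            continue
        translated = translate(text, target_lang)
        audio = synthesize(translated, target_lang)
        yield TranslatedChunk(text, translated, audio)
```

Because the function is a generator, each translated chunk can be delivered to listeners as soon as it is ready rather than after the whole talk, which is the property a low-latency product depends on.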

Phase 1: MVP Development

Single-speaker, audio-only live translation.

Goal: Validate core concept with minimal viable feature set.

Scope:
• One Core Flow: Live translation for Google Meet.
• Audio-Only: Provides interpretation audio alongside translated text.
• Single Language Direction: One presenter → multiple listeners.
• Basic Controls: Translation language & audio volume

Simple 2-Step Create Experience:

Easy 1-Step Join Process:

Attending live-translated meetings

In this audio-only MVP, guests can either listen to the interpretation alongside the transcript, or watch the video in the original meeting while listening to the translated audio.

User Value:
- Instant solution for basic translation needs
- Simple setup and intuitive, low-barrier workflow
- Google Meet integration for a familiar experience

Business Value:
- Accelerate market entry to gain a competitive edge
- Quick market validation with minimal features
- Build initial user base for feedback and core data

Iterations

Phase 2: Enhance Functionality

Broader feature support makes this a more comprehensive product.

Goal: Expand functionalities to meet more advanced user needs.

Phase 2 focused on expanding the product’s capabilities to address more meeting scenarios and improve the overall user experience, laying the groundwork for broader adoption. Key features include:

Support for multiple same-language presenters

Meeting recording/playback

Additional meeting platform support

Volume control for both original and translated audio
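Separate volume control for the original and translated audio amounts to mixing two streams with independent gains. The sketch below is a minimal illustration under stated assumptions, not the product's audio engine: `mix_audio` and its default gains are hypothetical, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
from typing import List, Sequence


def mix_audio(
    original: Sequence[float],
    translated: Sequence[float],
    original_gain: float = 0.2,    # assumed default: original speaker kept quiet
    translated_gain: float = 1.0,  # assumed default: interpretation at full volume
) -> List[float]:
    """Mix two sample streams, applying a separate gain to each."""
    length = max(len(original), len(translated))
    mixed = []
    for i in range(length):
        # Treat the shorter stream as silence once it runs out.
        o = original[i] if i < len(original) else 0.0
        t = translated[i] if i < len(translated) else 0.0
        sample = o * original_gain + t * translated_gain
        # Clamp to the valid range to avoid clipping artifacts.
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed
```

Setting `original_gain` to 0.0 gives listeners the interpretation alone, while a small non-zero value keeps the presenter faintly audible underneath it.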

At this point, users can find our product on the Zoom App Marketplace.

User Value:
- Support multiple same-language presenters
- Record meetings and enable playback
- Enhance user experience with better audio controls

Business Value:
- Boost user retention and platform engagement
- Enable cross-selling opportunities with the dubbing product
- Achieve competitive differentiation

We're working on this phase now. Try it out: SpeechLab Live

Phase 3: Complete Solution

Goal: Support complex use cases to meet almost all user needs.

Expected scope:
• Video meeting integration support
• Multiple presenters support
• Different language settings for each presenter
• Full control features


Thoughts

Product Decisions & Trade-offs

Platform Integration
- User Benefit: Familiar interface, minimal learning curve
- Business Benefit: Faster adoption, reduced development costs
Language Selection Flow
- User Benefit: Intuitive language switching
- Business Benefit: Scalable architecture for future expansion
Post-Meeting Integration
- User Benefit: Complete communication solution
- Business Benefit: Cross-product synergy

Key Innovations

Cross-Platform Integration; Language Flexibility; Post-Meeting Features.

Key Learnings

Product Strategy
- Balance between user needs and business goals is crucial
- Iterative development allows for quick adjustments
- Strong core experience drives organic growth
User Experience
- Simple onboarding drives adoption
- Integration with existing workflows is key
- User control builds trust
Business Development
- Solving real problems leads to natural growth
- Cross-product integration creates lasting value
- Data-driven decisions improve outcomes
