Capstone — Live Multilingual Travel Translator
Objective
Build an end-to-end API that takes spoken audio in one language and returns spoken audio translated into another language.
Architecture
- Input: User uploads an audio file (.wav or .mp3) via FastAPI.
- STT: Use Whisper to transcribe the audio to text.
- LLM: Use Claude/GPT to translate the text to the target language (e.g., English to Japanese).
- TTS: Use ElevenLabs (or OpenAI TTS) to generate audio of the translated text.
- Output: Return the generated audio file to the user.
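The five stages above can be sketched as a small pipeline object that a FastAPI route would call with the uploaded file's name and bytes. This is a minimal sketch, not a reference solution: `TranslatorPipeline`, `ALLOWED_EXTENSIONS`, and the `fake_*` functions are hypothetical names introduced here, and the stand-in stages only echo text so the example runs without API keys. In a real submission each stage would wrap the corresponding Whisper, Claude/GPT, and ElevenLabs (or OpenAI TTS) client call.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable

# Assumption: the endpoint accepts only the two formats named in the brief.
ALLOWED_EXTENSIONS = {".wav", ".mp3"}


@dataclass
class TranslatorPipeline:
    """Chains STT -> LLM translation -> TTS behind one call."""

    transcribe: Callable[[bytes], str]    # STT stage: audio bytes -> source text
    translate: Callable[[str, str], str]  # LLM stage: (text, target language) -> text
    synthesize: Callable[[str], bytes]    # TTS stage: translated text -> audio bytes

    def run(self, filename: str, audio: bytes, target_language: str) -> bytes:
        # Input: reject uploads that are not .wav or .mp3 before calling any API.
        suffix = Path(filename).suffix.lower()
        if suffix not in ALLOWED_EXTENSIONS:
            raise ValueError(f"unsupported audio format: {suffix or 'none'}")
        text = self.transcribe(audio)
        translated = self.translate(text, target_language)
        # Output: the bytes returned here are what the endpoint sends back.
        return self.synthesize(translated)


# Stand-in stages (hypothetical) so the sketch runs offline.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str, target_language: str) -> str:
    return f"[{target_language}] {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = TranslatorPipeline(fake_transcribe, fake_translate, fake_synthesize)
result = pipeline.run("greeting.wav", b"Where is the station?", "Japanese")
```

Passing the stages in as callables keeps the route handler thin and lets each provider be swapped or mocked in tests; whether to structure the project this way is a design choice, not a requirement of the brief.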
Deliverables
- FastAPI source code.
- A README documenting how to test the endpoint with curl.