Build a Voice Assistant Using Qwen API - Step by Step
Voice assistants are everywhere: smart speakers, mobile apps, even your car. Building one from scratch used to be complex and expensive. Not anymore. With the Qwen API, you can add natural language understanding to your own voice assistant project in minutes. In this Qwen API tutorial, I’ll walk you through creating a simple AI voice assistant that listens, thinks, and speaks, with Qwen’s language model handling the thinking.
By the end, you’ll have a working prototype you can run on your laptop or Raspberry Pi. And if you need cheap, reliable API tokens, I’ll show you where to get them. Let’s dive in.
What You’ll Need
- Python 3.8+ installed
- An API key for Qwen (get one from tai.shadie-oneapi.com)
- Microphone & speakers (built-in laptop ones work fine)
- Basic Python knowledge
How It Works
Our voice assistant will follow a simple pipeline:
- Speech-to-text – capture audio and transcribe it (we’ll use the speech_recognition library)
- Send to Qwen API – pass the transcribed text to Qwen and get a response
- Text-to-speech – convert the answer to spoken audio (using pyttsx3 or your OS TTS)
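The three stages can be sketched as one function that composes them. This is only an illustration: run_turn and the stand-in lambdas are hypothetical names, not part of any library, and they let the flow run without a microphone or an API key.

```python
from typing import Callable, Optional


def run_turn(
    transcribe: Callable[[], Optional[str]],
    think: Callable[[str], str],
    synthesize: Callable[[str], None],
) -> Optional[str]:
    """Run one listen -> think -> speak cycle and return the reply, if any."""
    text = transcribe()
    if not text:
        # Nothing was understood, so skip the model call and the speech output
        return None
    reply = think(text)
    synthesize(reply)
    return reply


# Stand-ins for the real stages (microphone, Qwen API, TTS engine)
spoken = []
reply = run_turn(
    transcribe=lambda: "hello",
    think=lambda text: f"You said: {text}",
    synthesize=spoken.append,
)
print(reply)
```

Keeping each stage behind a plain function makes it easy to swap in a different recognizer or TTS engine later without touching the loop.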
Step 1: Set Up Your Environment
Install the required packages:
pip install openai speechrecognition pyttsx3 pyaudio
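Note: The Qwen API is compatible with the OpenAI SDK, so we’ll use the openai package with a custom base URL. The code below uses the openai.chat.completions interface, which requires version 1.0 or later of the package.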
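If you would rather not hard-code the key in the script, one option is to read it from the environment and build an explicit client. The sketch below is an assumption-laden example: the variable names QWEN_API_KEY and QWEN_BASE_URL and the helpers load_qwen_settings and make_client are hypothetical, while the default base URL is the one used in this tutorial.

```python
import os


def load_qwen_settings(env=os.environ):
    """Read API settings from the environment instead of hard-coding them."""
    api_key = env.get("QWEN_API_KEY")  # hypothetical variable name
    if not api_key:
        raise RuntimeError("Set the QWEN_API_KEY environment variable first.")
    # Fall back to the endpoint used in this tutorial
    base_url = env.get("QWEN_BASE_URL", "https://api.tai.shadie-oneapi.com/v1/")
    return {"api_key": api_key, "base_url": base_url}


def make_client(settings):
    """Build an OpenAI SDK client pointed at the Qwen-compatible endpoint."""
    from openai import OpenAI  # imported lazily; needs openai>=1.0

    return OpenAI(api_key=settings["api_key"], base_url=settings["base_url"])
```

With a client built this way, client.chat.completions.create(...) takes the same arguments as the module-level call used in the rest of this tutorial.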
Step 2: Basic Voice Assistant Code
Here’s your first code example – a minimal voice loop that listens, queries Qwen, and speaks the answer.
import openai
import speech_recognition as sr
import pyttsx3

# Configure Qwen API (replace with your own key)
openai.api_key = "your-qwen-api-key"
openai.base_url = "https://api.tai.shadie-oneapi.com/v1/"

# Initialize speech engine
engine = pyttsx3.init()
recognizer = sr.Recognizer()


def listen():
    """Record one utterance and return its transcript, or None on failure."""
    with sr.Microphone() as source:
        print("Listening...")
        audio = recognizer.listen(source)
    try:
        text = recognizer.recognize_google(audio)
        print(f"You said: {text}")
        return text
    except (sr.UnknownValueError, sr.RequestError):
        # Speech was unintelligible, or the recognition service was unreachable
        return None


def ask_qwen(prompt):
    """Send a single user message to Qwen and return the reply text."""
    response = openai.chat.completions.create(
        model="qwen-turbo",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content


def speak(text):
    """Read the text aloud and block until playback finishes."""
    engine.say(text)
    engine.runAndWait()


# Main loop
if __name__ == "__main__":
    speak("Hello, I'm your Qwen voice assistant. Ask me anything.")
    while True:
        user_input = listen()
        if user_input:
            answer = ask_qwen(user_input)
            print(f"Qwen: {answer}")
            speak(answer)
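Run this script, ask something like “What’s the capital of Japan?” and watch it work. (The model has no access to live data, so questions such as today’s weather in Tokyo won’t get a reliable answer.) The recognize_google call sends your audio to Google’s free web speech service, which needs an internet connection and is fine for prototyping. For production, you might swap in a local model.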
Step 3: Adding Context and Memory
A real voice assistant should remember the conversation. Let’s upgrade the code to keep a history of messages.
import openai
import speech_recognition as sr
import pyttsx3

openai.api_key = "your-qwen-api-key"
openai.base_url = "https://api.tai.shadie-oneapi.com/v1/"

engine = pyttsx3.init()
recognizer = sr.Recognizer()

# Conversation history; the system message sets the assistant's tone
messages = [
    {"role": "system", "content": "You are a helpful voice assistant. Keep answers short and conversational."}
]


def listen():
    """Record one utterance and return its transcript, or None on failure."""
    with sr.Microphone() as source:
        print("Listening...")
        audio = recognizer.listen(source)
    try:
        text = recognizer.recognize_google(audio)
        print(f"You: {text}")
        return text
    except (sr.UnknownValueError, sr.RequestError):
        # Speech was unintelligible, or the recognition service was unreachable
        return None


def ask_qwen(user_text):
    """Send the whole conversation so far and record the reply in the history."""
    messages.append({"role": "user", "content": user_text})
    response = openai.chat.completions.create(
        model="qwen-turbo",
        messages=messages
    )
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    return reply


def speak(text):
    """Read the text aloud and block until playback finishes."""
    engine.say(text)
    engine.runAndWait()


if __name__ == "__main__":
    speak("Hi, I'm your Qwen assistant. Let's chat.")
    while True:
        user_input = listen()
        if user_input:
            # Saying any phrase that contains "exit" ends the session
            if "exit" in user_input.lower():
                speak("Goodbye!")
                break
            answer = ask_qwen(user_input)
            print(f"Assistant: {answer}")
            speak(answer)
Now your assistant can follow multi-turn conversations. Try: “Tell me a joke,” then “Make it funnier.”
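Because every turn is appended to the history, a long session keeps growing the request and will eventually hit the model’s context limit. One simple way to bound it, shown here as a sketch rather than the only approach, is to keep the system prompt plus the most recent messages. The helper trim_history and its max_messages parameter are hypothetical names introduced for this example.

```python
def trim_history(messages, max_messages=20):
    """Return the system prompt (if any) plus the most recent messages."""
    system = [m for m in messages if m["role"] == "system"][:1]
    rest = [m for m in messages if m["role"] != "system"]
    # A slice of [-0:] would keep everything, so handle zero explicitly
    recent = rest[-max_messages:] if max_messages > 0 else []
    return system + recent


history = [{"role": "system", "content": "Keep answers short."}]
for i in range(1, 4):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, max_messages=4)
print([m["content"] for m in trimmed])
```

Passing an even max_messages keeps user and assistant turns paired, so the trimmed history still starts with a user message after the system prompt.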
Step 4: Handling Errors and Edge Cases
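Voice assistants can be flaky. The listen() function already returns None when speech recognition fails, so the main loop simply listens again. Wrap the API call in a try/except as well, so a network or API error produces a spoken apology instead of crashing the loop: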
def ask_qwen(user_text):
    messages.append({"role": "user", "content": user_text})
    try:
        response = openai.chat.completions.create(
            model="qwen-turbo",
            messages=messages,
            timeout=10  # seconds
        )
    except Exception as e:
        # Drop the unanswered user turn so the history stays consistent
        messages.pop()
        return f"Sorry, I couldn't reach Qwen: {e}"
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    return reply
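For the speech side, a possible fallback is to ask the user to repeat themselves a few times before giving up. The sketch below assumes a listen-style function that returns None on failure and a speak-style function that takes a string, as in this tutorial. The helper listen_with_retries is a hypothetical name, and the demo uses fakes so it runs without audio hardware.

```python
def listen_with_retries(listen_fn, speak_fn, max_attempts=3):
    """Call listen_fn until it returns text, prompting between failed attempts."""
    for attempt in range(max_attempts):
        text = listen_fn()
        if text:
            return text
        if attempt < max_attempts - 1:
            speak_fn("Sorry, I didn't catch that. Could you repeat it?")
    return None


# Demo with fakes: two failed recognitions, then a successful one
results = iter([None, None, "turn on the lights"])
prompts = []
heard = listen_with_retries(lambda: next(results), prompts.append)
print(heard, len(prompts))
```

In the main loop, this would replace the direct call to listen(), for example user_input = listen_with_retries(listen, speak).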
Taking It Further
This Qwen API voice assistant is just the starting point. You can:
- Add wake-word detection (e.g., “Hey Qwen”) using porcupine or snowboy
- Use a local TTS engine like espeak for offline use
- Integrate with smart home APIs or calendar
- Run it on a Raspberry Pi for a dedicated device
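Before wiring in a dedicated wake-word engine, you can approximate the behavior by checking the transcript for a wake phrase. This is a stand-in, not real acoustic wake-word detection: the assistant still records and transcribes everything it hears, and the recognizer may spell “Qwen” differently than expected. The helper strip_wake_phrase is a hypothetical name for this sketch.

```python
def strip_wake_phrase(transcript, wake_phrase="hey qwen"):
    """Return the command following the wake phrase, or None if it is absent."""
    normalized = transcript.strip()
    if not normalized.lower().startswith(wake_phrase):
        return None
    # Remove the wake phrase plus any punctuation or spaces that follow it
    return normalized[len(wake_phrase):].lstrip(" ,.!?")


print(strip_wake_phrase("Hey Qwen, what time is it?"))
print(strip_wake_phrase("what time is it"))
```

In the main loop, you would pass only a non-empty result to ask_qwen and ignore everything else.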
Why Use Qwen for Your Voice Assistant?
Qwen models (like Qwen-Turbo and Qwen-Plus) are fast, support long contexts, and handle multiple languages. They’re perfect for voice applications where low latency matters. And with the OpenAI‑compatible API, you can reuse existing code and tools.
Getting an Affordable API Key
To build this AI voice assistant without breaking the bank, you need a reliable API provider. I recommend tai.shadie-oneapi.com. They offer Qwen API tokens at competitive prices, with no hidden fees and excellent uptime. Whether you’re prototyping or deploying at scale, their service keeps your costs low while delivering high‑speed access.
What’s Next?
Now you have a working voice assistant powered by Qwen. Play with the code, customize the prompt, and add your own features. The best way to learn is to build—so fire up your terminal and give it a try.
If you need Qwen API access, head over to tai.shadie-oneapi.com and grab a key. Happy coding!