Build a Voice Assistant Using Qwen API - Step by Step

📅 2026-06-04 · 5 min read

Voice assistants are everywhere: smart speakers, mobile apps, even your car. Building one from scratch used to be complex and expensive. With the Qwen API, you can add natural language understanding to your own voice assistant project with a few dozen lines of Python. In this Qwen API tutorial, I’ll walk you through creating a simple AI voice assistant that listens, thinks, and speaks, with Qwen’s language model handling the thinking part.

By the end, you’ll have a working prototype you can run on your laptop or Raspberry Pi. And if you need cheap, reliable API tokens, I’ll show you where to get them. Let’s dive in.

What You’ll Need

How It Works

Our voice assistant will follow a simple pipeline:

  1. Speech-to-text – capture audio and transcribe it (we’ll use the speech_recognition library)
  2. Send to Qwen API – pass the transcribed text to Qwen and get a response
  3. Text-to-speech – convert the answer to spoken audio (using pyttsx3 or your OS TTS)
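
The three stages above can be sketched as a single function that wires them together. This is only an illustration of the pipeline shape, not the tutorial’s final code: run_turn and its three callable parameters are hypothetical names, and the stand-in lambdas at the bottom replace the real microphone, Qwen call, and TTS engine so the sketch runs anywhere.

```python
from typing import Callable, Optional


def run_turn(
    listen: Callable[[], Optional[str]],
    think: Callable[[str], str],
    speak: Callable[[str], None],
) -> Optional[str]:
    """Run one listen -> think -> speak cycle.

    Returns the reply that was spoken, or None if nothing was heard.
    """
    heard = listen()        # stage 1: speech-to-text
    if not heard:
        return None         # nothing transcribed, skip this turn
    reply = think(heard)    # stage 2: send the text to the language model
    speak(reply)            # stage 3: text-to-speech
    return reply


# Stand-ins so the sketch runs without a microphone or an API key.
spoken = []
reply = run_turn(
    listen=lambda: "hello",
    think=lambda text: f"You said {text}",
    speak=spoken.append,
)
```

Keeping the stages as separate callables makes it easy to swap any one of them later, for example a different speech-to-text engine, without touching the other two.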

Step 1: Set Up Your Environment

Install the required packages:

pip install openai speechrecognition pyttsx3 pyaudio

Note: The Qwen API is compatible with the OpenAI SDK, so we’ll use the openai package with a custom base URL.
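
If you would rather not paste the key into your script, one option is to read it from the environment and build an explicit client. The sketch below assumes version 1.x of the openai package; QWEN_API_KEY and QWEN_BASE_URL are hypothetical variable names picked for this example (the SDK does not read them on its own), and the default base URL is simply the one used in this tutorial.

```python
import os
from typing import Mapping

# Assumed default: the base URL used elsewhere in this tutorial.
DEFAULT_BASE_URL = "https://api.tai.shadie-oneapi.com/v1/"


def load_qwen_config(env: Mapping[str, str] = os.environ) -> dict:
    """Read the API key and base URL from environment variables.

    QWEN_API_KEY and QWEN_BASE_URL are hypothetical names for this sketch.
    """
    api_key = env.get("QWEN_API_KEY")
    if not api_key:
        raise RuntimeError("Set QWEN_API_KEY before starting the assistant.")
    return {
        "api_key": api_key,
        "base_url": env.get("QWEN_BASE_URL", DEFAULT_BASE_URL),
    }


def make_client(env: Mapping[str, str] = os.environ):
    """Build an OpenAI SDK client that points at the Qwen-compatible endpoint."""
    # Imported here so load_qwen_config works even without the SDK installed.
    from openai import OpenAI

    return OpenAI(**load_qwen_config(env))
```

With a client built this way, the API calls would be written as client.chat.completions.create(...) rather than openai.chat.completions.create(...).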

Step 2: Basic Voice Assistant Code

Here’s your first code example – a minimal voice loop that listens, queries Qwen, and speaks the answer.

import openai
import speech_recognition as sr
import pyttsx3

# Configure Qwen API (replace with your own key)
openai.api_key = "your-qwen-api-key"
openai.base_url = "https://api.tai.shadie-oneapi.com/v1/"

# Initialize speech engine
engine = pyttsx3.init()
recognizer = sr.Recognizer()

def listen():
    with sr.Microphone() as source:
        print("Listening...")
        audio = recognizer.listen(source)
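    try:
        # Google's Web Speech API; needs an internet connection
        text = recognizer.recognize_google(audio)
        print(f"You said: {text}")
        return text
    except (sr.UnknownValueError, sr.RequestError):
        # Speech was unintelligible, or the recognition service was unreachable
        return None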

def ask_qwen(prompt):
    response = openai.chat.completions.create(
        model="qwen-turbo",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

def speak(text):
    engine.say(text)
    engine.runAndWait()

# Main loop
if __name__ == "__main__":
    speak("Hello, I'm your Qwen voice assistant. Ask me anything.")
    while True:
        user_input = listen()
        if user_input:
            answer = ask_qwen(user_input)
            print(f"Qwen: {answer}")
            speak(answer)

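Run this script, ask a question such as “What’s the capital of Japan?”, and listen for the spoken answer. A plain chat completion like this one has no live data source, so a question like “What’s the weather in Tokyo?” will not return a real forecast. The speech_recognition library’s recognize_google call uses Google’s free Web Speech API, which needs an internet connection and is fine for prototyping. For production, you might swap in a local model.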

Step 3: Adding Context and Memory

A real voice assistant should remember the conversation. Let’s upgrade the code to keep a history of messages.

import openai
import speech_recognition as sr
import pyttsx3

openai.api_key = "your-qwen-api-key"
openai.base_url = "https://api.tai.shadie-oneapi.com/v1/"

engine = pyttsx3.init()
recognizer = sr.Recognizer()

messages = [
    {"role": "system", "content": "You are a helpful voice assistant. Keep answers short and conversational."}
]

def listen():
    with sr.Microphone() as source:
        print("Listening...")
        audio = recognizer.listen(source)
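    try:
        text = recognizer.recognize_google(audio)
        print(f"You: {text}")
        return text
    except (sr.UnknownValueError, sr.RequestError):
        # Speech was unintelligible, or the recognition service was unreachable
        return None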

def ask_qwen(user_text):
    messages.append({"role": "user", "content": user_text})
    response = openai.chat.completions.create(
        model="qwen-turbo",
        messages=messages
    )
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    return reply

def speak(text):
    engine.say(text)
    engine.runAndWait()

if __name__ == "__main__":
    speak("Hi, I'm your Qwen assistant. Let's chat.")
    while True:
        user_input = listen()
        if user_input:
            if "exit" in user_input.lower():
                speak("Goodbye!")
                break
            answer = ask_qwen(user_input)
            print(f"Assistant: {answer}")
            speak(answer)

Now your assistant can follow multi-turn conversations. Try: “Tell me a joke,” then “Make it funnier.”
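
One thing to watch: the messages list grows with every turn, and the whole list is sent on each request. A simple way to bound it is to keep the system prompt plus only the most recent turns. The helper below is a minimal sketch under that assumption; trim_history and max_turns are hypothetical names, and the right limit depends on the model’s context window and your budget.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(messages: List[Message], max_turns: int = 10) -> List[Message]:
    """Keep system messages plus the last `max_turns` user/assistant pairs."""
    system = [m for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]
    # Guard against max_turns == 0, because dialogue[-0:] is the whole list.
    recent = dialogue[-2 * max_turns:] if max_turns > 0 else []
    return system + recent


history = [
    {"role": "system", "content": "You are a helpful voice assistant."},
    {"role": "user", "content": "Tell me a joke."},
    {"role": "assistant", "content": "Why did the chicken cross the road?"},
    {"role": "user", "content": "Make it funnier."},
    {"role": "assistant", "content": "To get to the punchline."},
    {"role": "user", "content": "One more."},
    {"role": "assistant", "content": "Knock knock."},
]
trimmed = trim_history(history, max_turns=2)
```

In the assistant above, you could call messages[:] = trim_history(messages) after each completed turn, once both the user message and the reply have been appended, so that the kept slice always starts with a user message.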

Step 4: Handling Errors and Edge Cases

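Voice assistants can be flaky: the network drops, the API times out, or the recognizer hears nothing. Add a try/except around the API call so a failed request produces a spoken apology instead of a crash. If the speech recognizer fails, listen() already returns None and the main loop simply listens again.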

def ask_qwen(user_text):
    messages.append({"role": "user", "content": user_text})
    try:
        response = openai.chat.completions.create(
            model="qwen-turbo",
            messages=messages,
            timeout=10  # seconds
        )
    except openai.OpenAIError as e:
        # Drop the unanswered user message so the history stays consistent
        messages.pop()
        print(f"Qwen API error: {e}")
        return "Sorry, I couldn't reach Qwen."
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    return reply
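
The block above guards the API call. On the recognizer side, a silent retry can leave the user wondering whether anything happened, so one option is to give spoken feedback and cap the number of attempts. The sketch below assumes a listen function that returns None on failure, as in this tutorial; listen_with_retries and max_attempts are hypothetical names.

```python
from typing import Callable, Optional


def listen_with_retries(
    listen_once: Callable[[], Optional[str]],
    speak: Callable[[str], None],
    max_attempts: int = 3,
) -> Optional[str]:
    """Call `listen_once` until it returns text or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        text = listen_once()
        if text:
            return text
        if attempt < max_attempts:
            # Tell the user the attempt failed before listening again.
            speak("Sorry, I didn't catch that. Please say it again.")
    speak("I still couldn't hear you. Let's try again later.")
    return None
```

In the main loop, user_input = listen_with_retries(listen, speak) would replace the direct call to listen().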

Taking It Further

This Qwen API voice assistant is just a starting point. You can:

Why Use Qwen for Your Voice Assistant?

Qwen models (like Qwen-Turbo and Qwen-Plus) are fast, support long contexts, and handle multiple languages. That makes them a good fit for voice applications, where every extra second of latency is noticeable. And because the API is OpenAI-compatible, you can reuse existing code and tools.
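
If latency matters, a common trick is to start speaking before the full answer has arrived. The sketch below shows only the text-handling half of that idea: it takes an iterable of text fragments and yields complete sentences as soon as they are available. It assumes the fragments come from a streamed response (with the OpenAI SDK that would mean passing stream=True and reading each chunk’s delta content); whether your provider supports streaming is an assumption to verify, and sentences_from_chunks is a hypothetical helper name.

```python
import re
from typing import Iterable, Iterator

# A sentence ends at '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_chunks(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences from a stream of partial text fragments."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last one is a finished sentence.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    # Flush whatever is left once the stream ends.
    if buffer.strip():
        yield buffer.strip()


sentences = list(sentences_from_chunks(["Hel", "lo. How", " are you?", " Fine."]))
```

Each yielded sentence could be passed straight to speak(), so the user hears the first sentence while the rest of the reply is still being generated.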

Getting an Affordable API Key

To build this AI voice assistant without breaking the bank, you need a reliable API provider. I recommend tai.shadie-oneapi.com. They offer Qwen API tokens at competitive prices, with no hidden fees and excellent uptime. Whether you’re prototyping or deploying at scale, their service keeps your costs low while delivering high-speed access.

What’s Next?

Now you have a working voice assistant powered by Qwen. Play with the code, customize the prompt, and add your own features. The best way to learn is to build—so fire up your terminal and give it a try.

If you need Qwen API access, head over to tai.shadie-oneapi.com and grab a key. Happy coding!
