MBZ Voice SDK is a developer tool that lets you add voice input, AI understanding (via Gemini), and spoken responses to any modern web app.
Simple commands, powerful results. Clone the repository and you're ready to go.
Integrate the SDK into your web application with just a few lines of code.
frontend.js
import { MBZVoiceAgent } from "./mbz-voice-sdk.js"

const agent = new MBZVoiceAgent({
  apiUrl: "http://localhost:8000/ask",
  lang: "en-US",
  speak: true
})

// Start listening
document.getElementById("listen-btn").onclick = () => {
  agent.listen()
}
Set up the backend with Python and FastAPI in minutes.
main.py
from fastapi import FastAPI, Request
import google.generativeai as genai
import os

app = FastAPI()

# Read the Gemini API key from the environment
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel("gemini-1.5-flash")

@app.post("/ask")
async def ask(request: Request):
    # The frontend posts JSON with the recognized speech under "query"
    data = await request.json()
    query = data.get("query")
    # Forward the query to Gemini and return its text reply
    response = model.generate_content(query)
    return {"answer": response.text}
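The handler above passes whatever arrives under "query" straight to Gemini, so a missing or empty field would only fail inside the model call. A minimal validation sketch, assuming you want to reject such requests early, might look like the following. The names `extract_query`, `InvalidQuery`, and the `max_chars` limit are hypothetical and not part of the SDK.

```python
from typing import Any, Mapping


class InvalidQuery(ValueError):
    """Raised when the request body has no usable "query" string."""


def extract_query(data: Mapping[str, Any], max_chars: int = 2000) -> str:
    """Return the trimmed query text, or raise InvalidQuery.

    max_chars is an arbitrary illustrative limit, not an SDK setting.
    """
    query = data.get("query")
    if not isinstance(query, str):
        raise InvalidQuery('"query" must be a string')
    query = query.strip()
    if not query:
        raise InvalidQuery('"query" must not be empty')
    if len(query) > max_chars:
        raise InvalidQuery(f'"query" is longer than {max_chars} characters')
    return query
```

Inside the `/ask` handler, one option is to call `extract_query(data)` and convert an `InvalidQuery` into `fastapi.HTTPException(status_code=400, detail=str(exc))`, so that the frontend receives a clear error instead of a server failure.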
Free Forever
Core Features
Accuracy Rate
Founded in Pakistan
Explore the powerful features that make MBZ Voice SDK the perfect choice for your voice-enabled applications
Advanced voice recognition with high accuracy
Gemini AI integration for intelligent responses
Natural text-to-speech capabilities
Privacy-focused design with local processing
Lightning-fast response times
Community-driven development
Easily capture user speech with browser-native voice recognition. Works in browsers that support built-in speech recognition.
Process speech with Google's Gemini AI, which brings contextual understanding and natural language processing to its responses to user queries.
Convert AI responses to natural-sounding speech with built-in text-to-speech for a better user experience.
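Requests are handled by a backend you host, and the SDK itself stores no data. The backend forwards each query to the Gemini API to generate a response; nothing else is shared with third parties without your explicit permission.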
Open source with an active community of developers. Contribute, suggest features, and help shape the future of the SDK.
See how the MBZ Voice SDK can power conversational interfaces in your applications
See how MBZ Voice SDK stacks up against other voice recognition solutions
Feature | MBZ Voice SDK (Pakistan's first) | Other voice SDKs (commercial) | Browser APIs (basic) |
---|---|---|---|
Voice Recognition | + | | |
AI Integration | + | | |
Text-to-Speech | | | |
Customizable | | | |
Easy Integration | + | | |
Local Processing | | | |
Price | Free | $$$ | Free |
Open Source | | | |
git clone https://github.com/ProMBZ/mbz-voice-sdk.git
cd mbz-voice-sdk/backend
pip install -r requirements.txt
# Create a .env file in the backend directory
GEMINI_API_KEY=your_api_key_here
uvicorn main:app --reload
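The main.py excerpt reads GEMINI_API_KEY with `os.getenv`, which only sees real environment variables, so something has to load the .env file first. Whether the repository's backend already does this (for example through python-dotenv) is not shown here. If it does not, a minimal standard-library sketch could look like this; `parse_env_lines` and `load_env_file` are hypothetical helper names.

```python
import os
from typing import Dict, Iterable


def parse_env_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blanks, comments, and lines without '='."""
    values: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and any surrounding quote characters from the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env") -> None:
    """Copy entries from the file into os.environ without overriding existing ones."""
    if not os.path.exists(path):
        return
    with open(path, encoding="utf-8") as handle:
        for key, value in parse_env_lines(handle).items():
            os.environ.setdefault(key, value)
```

Calling `load_env_file()` at the top of main.py, before `genai.configure(...)`, would make the key visible to `os.getenv`. This parser handles only simple `KEY=value` lines; a dedicated package such as python-dotenv covers more of the .env syntax.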
const agent = new MBZVoiceAgent({
  apiUrl: "/ask", // Backend API endpoint
  speak: true     // Enable text-to-speech
})
apiUrl: The endpoint URL for your backend API that processes voice input
Default: "/ask"
speak: Whether to enable text-to-speech for AI responses
Default: true
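Whatever URL `apiUrl` points at has to follow the same JSON contract as the `/ask` handler in main.py: a POST body with a `query` field, and a response with an `answer` field. A small Python client can exercise that contract without a browser. This is a sketch that assumes the backend is running locally on port 8000, as in the frontend example; `build_ask_request` and `ask` are hypothetical helper names, not part of the SDK.

```python
import json
import urllib.request


def build_ask_request(
    query: str, api_url: str = "http://localhost:8000/ask"
) -> urllib.request.Request:
    """Build the POST request with the JSON body the /ask handler reads."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(
    query: str,
    api_url: str = "http://localhost:8000/ask",
    timeout: float = 30.0,
) -> str:
    """Send the query to a running backend and return the "answer" text."""
    request = build_ask_request(query, api_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["answer"]
```

With the server started via `uvicorn main:app --reload`, calling `ask("What is the capital of Pakistan?")` from a Python shell is a quick way to confirm the endpoint and API key work before wiring up the frontend.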
Soon you'll be able to install MBZ Voice SDK directly from npm with a simple command.
npm install mbz-voice-sdk
The backend will soon be available as a Python package for easy installation.
pip install mbz-voice-sdk
MBZ Voice SDK is completely free to use in your projects. No hidden fees, no usage limits, no credit card required.
Use in commercial and personal projects with no restrictions
View, modify, and contribute to the source code on GitHub
You only need a Gemini API key for the AI functionality