Voice-powered accessibility agent making the web accessible to everyone, especially for blind and visually impaired users.
- Overview
- Key Features
- Quick Start
- Tech Stack
- Project Structure
- Installation
- Development
- Architecture
- Configuration
- API Integration
- Roadmap
- Contributing
- License
- Support
Cupcake is an accessibility application that lets blind and visually impaired users navigate and interact with websites using only their voice. It combines speech recognition, audio feedback, and content summarization to turn web browsing into a voice-controlled experience.
Mission: To break down digital barriers and create a truly inclusive web for everyone, regardless of visual ability.
- ✨ Complete Voice Control – No keyboard shortcuts to memorize, just speak naturally
- ✨ Intelligent Context Awareness – Understands page structure and content relevance
- ✨ Lightning Fast – Optimized for minimal latency and responsive interactions
- ✨ Privacy Focused – All API keys encrypted and stored locally
- ✨ Developer Friendly – Well-documented codebase with TypeScript strict mode
| Feature | Description |
|---|---|
| Voice Navigation | Control websites entirely through natural voice commands |
| Speech Recognition | Advanced accuracy with OpenAI Whisper technology |
| Audio Feedback | High-quality text-to-speech for all interactions |
| Smart Summaries | AI-powered page content summarization |
| Global Hotkeys | Quick activation with system-wide keyboard shortcuts |
| Session Persistence | Remembers your preferences and API keys securely |
| Real-Time Sync | WebSocket integration for live backend communication |
| Cross-Platform | Windows, macOS, and Linux support |
# 1. Clone the repository
git clone https://github.com/Manavarya09/cupcake.git
cd cupcake
# 2. Install dependencies
npm install
# 3. Set up environment
cp .env.example .env
# 4. Start development
npm run dev
That's it! Your Electron app with React frontend will launch automatically.
| Layer | Technology | Purpose |
|---|---|---|
| Desktop | Electron 33.3+ | Cross-platform desktop framework |
| Frontend | React 19.0+ | Modern UI library |
| Language | TypeScript 5.7+ | Type-safe JavaScript |
| Styling | Tailwind CSS 4.0+ | Utility-first CSS framework |
| Build | Vite 6.0+ | Lightning-fast bundler |
| Hotkeys | uiohook-napi 1.5+ | Global system hotkey capture |
| Service | Function |
|---|---|
| OpenAI Whisper | Speech-to-text recognition |
| Text-to-Speech | Voice synthesis & output |
| OpenClaw | AI backend services |
| WebSocket | Real-time bidirectional communication |
cupcake/
│
├── electron/ # Desktop application logic
│ ├── main/
│ │ ├── index.ts # Application entry point
│ │ ├── windowManager.ts # Window lifecycle management
│ │ ├── hotkeyManager.ts # Global hotkey handler
│ │ ├── ipcHandlers.ts # Inter-process communication
│ │ ├── managers/ # Business logic managers
│ │ │ ├── sessionManager.ts
│ │ │ ├── apiKeyManager.ts
│ │ │ └── openclawManager.ts
│ │ ├── services/ # External service wrappers
│ │ │ ├── whisperService.ts
│ │ │ ├── ttsService.ts
│ │ │ ├── summarizerService.ts
│ │ │ └── openclawClient.ts
│ │ └── utils/ # Helper utilities
│ │ ├── constants.ts
│ │ ├── deviceIdentity.ts
│ │ └── store.ts
│ └── preload/ # IPC security bridge
│
├── src/ # React frontend application
│ ├── App.tsx # Root component
│ ├── main.tsx # DOM entry point
│ ├── windows/ # Window components
│ │ ├── sightlineBar/ # Main control UI
│ │ ├── config/ # Settings window
│ │ └── borderOverlay/ # Visual overlay
│ └── lib/ # Frontend utilities
│ └── ipc.ts # IPC bridge client
│
├── cupcake-app/ # Marketing website
├── shared/ # Shared TypeScript types
├── build/ # Build configuration
├── scripts/ # Setup & build scripts
│
├── package.json # Project dependencies
├── tsconfig.json # TypeScript config
├── vite.config.ts # Build configuration
├── eslint.config.js # Code quality
└── README.md # This file
- Node.js 18+ (LTS recommended)
- npm 9+ or yarn/pnpm
- macOS 11+ / Windows 10+ / Linux (Ubuntu 18.04+)
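A setup script could check the Node.js requirement before installing anything. The following is a minimal sketch; `meetsMinimumNode` is a hypothetical helper and is not part of the Cupcake codebase.

```typescript
// Hypothetical helper (assumption: not part of the Cupcake codebase).
// Returns true when a Node.js version string such as "v20.11.1"
// has a major version at or above the required minimum.
function meetsMinimumNode(version: string, minimumMajor: number): boolean {
  const match = /^v?(\d+)\./.exec(version);
  if (match === null) {
    return false; // Unrecognised format: treat as unsupported.
  }
  return Number(match[1]) >= minimumMajor;
}
```

In a Node.js script this would typically be called as `meetsMinimumNode(process.version, 18)`.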
git clone https://github.com/Manavarya09/cupcake.git
cd cupcake
npm install
This installs all required packages including Electron, React, Vite, and service integrations.
# Create environment file from template
cp .env.example .env
# Edit with your API credentials
# Required: OpenAI API key for Whisper
# Optional: OpenClaw service endpoint
npm run build:electron
npm run bundle-openclaw
npm run dev
This command:
- Compiles TypeScript for Electron main process
- Starts Vite dev server (http://localhost:5173)
- Launches Electron with live reload
- Enables hot-module replacement (HMR)
Result: Code changes instantly reflect in the running application.
npm run build
Creates optimized bundles:
- dist-electron/ – Compiled main process
- dist/ – Optimized React bundle
npm run preview
Launch the production build locally for testing.
npm run dev # Development with hot reload
npm run build # Production build
npm run preview # Preview production build
npm run build:electron # Compile main process only
npm run electron:dev # Start Electron in dev mode
npm run bundle-openclaw # Bundle OpenClaw SDK
┌─────────────────────────────────────────────────────┐
│ Electron Main Process │
│ ┌──────────────────────────────────────────────┐ │
│ │ Window Manager │ │
│ │ - Creates/manages app windows │ │
│ │ - Routes IPC messages │ │
│ └──────────────────────────────────────────────┘ │
│ ┌──────────────────────────────────────────────┐ │
│ │ Hotkey Manager │ │
│ │ - Global keyboard shortcuts │ │
│ │ - Voice activation triggers │ │
│ └──────────────────────────────────────────────┘ │
│ ┌──────────────────────────────────────────────┐ │
│ │ Services Layer │ │
│ │ - Whisper (speech recognition) │ │
│ │ - TTS (text-to-speech) │ │
│ │ - Summarizer (content analysis) │ │
│ │ - OpenClaw Client (backend communication) │ │
│ └──────────────────────────────────────────────┘ │
│ ┌──────────────────────────────────────────────┐ │
│ │ Manager Layer │ │
│ │ - Session (user state) │ │
│ │ - API Keys (secure storage) │ │
│ │ - OpenClaw (service integration) │ │
│ └──────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
↕ IPC Bridge (Preload)
┌─────────────────────────────────────────────────────┐
│ React Frontend (Renderer Process) │
│ ┌──────────────────────────────────────────────┐ │
│ │ Cupcake Bar Window │ │
│ │ - Main control interface │ │
│ │ - Waveform visualization │ │
│ │ - Real-time feedback │ │
│ └──────────────────────────────────────────────┘ │
│ ┌──────────────────────────────────────────────┐ │
│ │ Config Window │ │
│ │ - Settings management │ │
│ │ - API key configuration │ │
│ └──────────────────────────────────────────────┘ │
│ ┌──────────────────────────────────────────────┐ │
│ │ Border Overlay │ │
│ │ - Visual accessibility indicators │ │
│ └──────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
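To make the diagram concrete, here is a rough sketch of how one voice turn might pass through the services layer: audio is transcribed, the transcript is mapped to a command, and a confirmation is spoken. The `SpeechToText` and `TextToSpeech` interfaces, the `VoiceCommand` type, and the recognised phrases are all assumptions made for illustration; the real signatures of `whisperService.ts` and `ttsService.ts` are not shown in this README.

```typescript
// Hypothetical interfaces: stand-ins for the real service wrappers,
// whose actual signatures are not documented here.
interface SpeechToText {
  transcribe(audio: Uint8Array): Promise<string>;
}

interface TextToSpeech {
  speak(text: string): Promise<void>;
}

type VoiceCommand =
  | { kind: "summarize" }
  | { kind: "scroll"; direction: "up" | "down" }
  | { kind: "unknown"; transcript: string };

// Maps a transcript to a command. The phrases are illustrative only.
function parseCommand(transcript: string): VoiceCommand {
  const text = transcript.trim().toLowerCase();
  if (text.includes("summarize") || text.includes("summary")) {
    return { kind: "summarize" };
  }
  const scroll = /\bscroll (up|down)\b/.exec(text);
  if (scroll !== null) {
    return { kind: "scroll", direction: scroll[1] as "up" | "down" };
  }
  return { kind: "unknown", transcript };
}

// One voice turn: audio in, command out, spoken confirmation.
async function handleVoiceTurn(
  audio: Uint8Array,
  stt: SpeechToText,
  tts: TextToSpeech,
): Promise<VoiceCommand> {
  const transcript = await stt.transcribe(audio);
  const command = parseCommand(transcript);
  const reply =
    command.kind === "unknown"
      ? "Sorry, I did not understand that."
      : `Running ${command.kind}.`;
  await tts.speak(reply);
  return command;
}
```

Passing the services in as parameters keeps the turn logic testable with fakes, independent of any network call.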
- Process Isolation – Preload script enforces strict boundaries
- No Remote Code – Never load or execute untrusted remote content
- Encrypted Storage – API keys stored securely with Electron Store
- Type Safety – TypeScript strict mode catches many type errors at compile time
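One common way to enforce the process boundary is for the preload script to forward only an allowlisted set of IPC channels. The sketch below shows that idea under stated assumptions: the channel names are hypothetical, and Electron's `ipcRenderer.invoke` is replaced by an injected function so the example runs without Electron installed.

```typescript
// Hypothetical channel names; the real ones live in the project's IPC code.
const ALLOWED_CHANNELS = ["session:get", "apiKey:set", "voice:start"] as const;
type Channel = (typeof ALLOWED_CHANNELS)[number];

// Stand-in for Electron's ipcRenderer.invoke, injected so the sketch
// does not depend on Electron at runtime.
type Invoke = (channel: string, ...args: unknown[]) => Promise<unknown>;

// Type guard: true only for channels on the allowlist.
function isAllowedChannel(channel: string): channel is Channel {
  return (ALLOWED_CHANNELS as readonly string[]).includes(channel);
}

// Builds the narrow API a preload script could expose to the renderer.
function createBridge(invoke: Invoke) {
  return {
    invoke(channel: string, ...args: unknown[]): Promise<unknown> {
      if (!isAllowedChannel(channel)) {
        return Promise.reject(new Error(`Blocked IPC channel: ${channel}`));
      }
      return invoke(channel, ...args);
    },
  };
}
```

In an actual preload script, the object returned by `createBridge` would usually be handed to `contextBridge.exposeInMainWorld` so the renderer never touches `ipcRenderer` directly.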
| File | Purpose | Scope |
|---|---|---|
| tsconfig.json | TypeScript compiler options | Root |
| tsconfig.app.json | App-specific TS config | React |
| electron/tsconfig.json | Electron TS configuration | Main process |
| vite.config.ts | Build tool configuration | Bundling |
| eslint.config.js | Code quality rules | Linting |
| .env.example | Environment template | Secrets |
# Required
VITE_DEV_SERVER_URL=http://localhost:5173
# API Keys (add your credentials)
OPENAI_API_KEY=sk-...
OPENCLAW_API_KEY=...
OPENCLAW_ENDPOINT=https://api.openclaw.io
# Optional Settings
LOG_LEVEL=info
ENABLE_METRICS=true
OpenAI Whisper:
- Purpose: Speech-to-text recognition
- Features: Multi-language support, high accuracy, offline capable
- API Key Required: Yes
Text-to-Speech:
- Purpose: Audio output generation
- Features: Customizable rate, tone, language support
- Configuration: Embedded service
OpenClaw:
- Purpose: AI-powered backend services
- Features: Custom commands, advanced processing
- Configuration: Optional
WebSocket:
- Purpose: Real-time bidirectional communication
- Features: Live updates, streaming data
- Configuration: Custom endpoint
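Long-lived WebSocket connections drop from time to time, and a common way to handle that is to reconnect with exponential backoff. The README does not say how Cupcake handles reconnects, so the following is only a sketch of the general pattern; `backoffDelayMs` and its default values are assumptions.

```typescript
// Hypothetical helper: delay in milliseconds before reconnect attempt
// number `attempt` (0-based). The delay doubles each attempt and is
// capped at `maxMs` so retries never wait indefinitely.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  const delay = baseMs * 2 ** Math.max(0, attempt);
  return Math.min(delay, maxMs);
}

// Builds the full schedule of delays for a bounded number of retries.
function backoffSchedule(retries: number): number[] {
  return Array.from({ length: retries }, (_, attempt) => backoffDelayMs(attempt));
}
```

A client would wait `backoffDelayMs(n)` after the n-th failed attempt before opening a new connection, and reset the counter once a connection succeeds.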
- Voice recognition & control
- Basic navigation commands
- TTS output
- Advanced page understanding
- Custom command learning
- Context-aware assistance
- Browser extension support
- Mobile companion app
- Multi-language support
- Plugin system
- User feedback integration
- Open-source community building
// Use TypeScript strict mode
// Prefer interfaces over types
// Write descriptive variable names
// Keep functions focused and small
- Type Safety – No any types; use proper generics
- Error Handling – Use try-catch for async operations
- Testing – Write unit tests for business logic
- Documentation – Comment complex algorithms
- Performance – Profile before optimizing
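As an illustration of these standards, the sketch below uses interfaces for object shapes, a generic instead of `any`, and try-catch around an async operation. The names `Success`, `Failure`, `Result`, and `toResult` are hypothetical and are not taken from the Cupcake codebase.

```typescript
// Interfaces describe the object shapes; the union needs a type alias.
interface Success<T> {
  ok: true;
  value: T;
}

interface Failure {
  ok: false;
  error: string;
}

type Result<T> = Success<T> | Failure;

// Normalises an unknown thrown value into a readable message.
function errorMessage(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

// Wraps an async operation so callers receive a typed Result
// rather than an unhandled rejection.
async function toResult<T>(operation: () => Promise<T>): Promise<Result<T>> {
  try {
    return { ok: true, value: await operation() };
  } catch (err: unknown) {
    return { ok: false, error: errorMessage(err) };
  }
}
```

A caller can then branch on `result.ok` and let the compiler narrow the type, which keeps error handling explicit at each call site.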
We follow conventional commits:
feat: add new feature
fix: resolve bug
chore: maintenance tasks
docs: documentation updates
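A commit hook could check subject lines against these prefixes. The following is a minimal sketch, assuming only the four types listed above are accepted and an optional lowercase scope is allowed. The Conventional Commits specification defines more types and a breaking-change marker, which this sketch omits. `isConventionalCommit` is a hypothetical helper.

```typescript
// The four prefixes listed in this README; the specification allows more.
const COMMIT_TYPES = ["feat", "fix", "chore", "docs"] as const;

// Matches "type: subject" or "type(scope): subject".
const COMMIT_PATTERN = new RegExp(
  `^(${COMMIT_TYPES.join("|")})(\\([a-z0-9-]+\\))?: \\S.*$`,
);

// Hypothetical helper: true when the subject line follows the pattern.
function isConventionalCommit(subjectLine: string): boolean {
  return COMMIT_PATTERN.test(subjectLine);
}
```

For example, `isConventionalCommit("feat: add voice shortcuts")` passes, while a subject without a recognised prefix is rejected.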
# Find process using the port
lsof -i :5173
# Kill the process
kill -9 <PID>
# Clear build cache and rebuild
rm -rf dist-electron
npm run build:electron
# Reinstall dependencies
rm -rf node_modules package-lock.json
npm install
# Restart dev server
npm run dev
- Close unnecessary browser tabs
- Reduce audio buffer size in settings
- Disable auto-summarization if not needed
- Check microphone settings
- Ensure OpenAI API key is valid
- Verify internet connection
- Try speaking more clearly
- Check system audio settings
- Verify TTS service endpoint
- Try restarting the application
We welcome contributions from the community! Here's how to get started:
- Check Issues for existing discussions
- Fork the repository
- Create a feature branch
git checkout -b feature/your-amazing-feature
- Write clean, tested code
- Follow our coding standards
- Update relevant documentation
- Create descriptive commit messages
git add .
git commit -m "feat: describe your amazing feature"
git push origin feature/your-amazing-feature
- Provide a clear description of changes
- Include screenshots/recordings if UI changes
- Link related issues
- Respond to review feedback
This project is licensed under the MIT License – see LICENSE file for details.
Users are free to use, modify, and distribute this software, provided they include the original license and copyright notice.
- Issues & Bugs – GitHub Issues
- Discussions – GitHub Discussions
- Documentation – Check the /docs folder for detailed guides
- Check if the bug already exists
- Create a new issue with:
- Clear title and description
- Steps to reproduce
- Expected vs. actual behavior
- Environment details (OS, Node version, etc.)
Create an issue with:
- Clear use case
- Expected behavior
- Suggested implementation (if any)
Cupcake stands on the shoulders of incredible open-source projects:
- Electron – Desktop magic
- React – UI excellence
- Vite – Lightning-fast builds
- TypeScript – Type safety
- OpenAI Whisper – Speech recognition
- Tailwind CSS – Beautiful styling
Cupcake is more than an app—it's a movement toward digital accessibility for all. By combining voice technology, AI, and thoughtful design, we're making the web a place where everyone can thrive, regardless of sight.
Made with care for accessibility. Built by developers. For everyone.
Questions? Open an issue and we'll help!