Metadata-Version: 2.1
Name: osagent
Version: 0.1.1
Summary: osagent
Author: XLANG
License: MIT
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: pip>=24.2
Requires-Dist: python-xlib>=0.33
Requires-Dist: lxml>=5.2.2
Requires-Dist: pyautogui>=0.9.54
Requires-Dist: requests>=2.32.3
Requires-Dist: flask>=3.0.3
Requires-Dist: numpy<2
Requires-Dist: gymnasium>=0.29.1
Requires-Dist: playwright>=1.45.1
Requires-Dist: pydrive>=1.3.1
Requires-Dist: requests-toolbelt>=1.0.0
Requires-Dist: rapidfuzz>=3.9.5
Requires-Dist: beautifulsoup4>=4.12.3
Requires-Dist: pandas>=2.2.2
Requires-Dist: formulas>=1.2.8
Requires-Dist: cssselect>=1.2.0
Requires-Dist: xmltodict>=0.13.0
Requires-Dist: openpyxl>=3.1.5
Requires-Dist: pypdf2>=3.0.1
Requires-Dist: PyMuPDF>=1.24.9
Requires-Dist: borb>=2.1.24
Requires-Dist: easyocr>=1.7.1
Requires-Dist: python-docx>=1.1.2
Requires-Dist: odfpy>=1.4.1
Requires-Dist: pdfplumber>=0.11.2
Requires-Dist: mutagen>=1.47.0
Requires-Dist: pypdf>=4.3.1
Requires-Dist: python-pptx>=0.6.23
Requires-Dist: pyacoustid>=1.3.0
Requires-Dist: ImageHash>=4.3.1
Requires-Dist: librosa>=0.10.2.post1
Requires-Dist: fastdtw>=0.3.4
Requires-Dist: psutil>=6.0.0
Requires-Dist: tqdm>=4.66.4
Requires-Dist: boto3>=1.34.151
Requires-Dist: backoff>=2.2.1
Requires-Dist: dashscope>=1.20.3
Requires-Dist: google-generativeai>=0.7.2
Requires-Dist: openai>=1.37.1
Requires-Dist: tiktoken>=0.7.0
Requires-Dist: groq>=0.9.0
Requires-Dist: sounddevice>=0.4.7
Requires-Dist: pygame>=2.6.0
Requires-Dist: pynput>=1.7.7
Provides-Extra: mac
Requires-Dist: pyobjc-framework-ApplicationServices>=10.3.1; extra == "mac"
Provides-Extra: win
Requires-Dist: pywinauto>=0.6.8; extra == "win"
Provides-Extra: all
Requires-Dist: osagent[mac,win]; extra == "all"

# OSAgent

OSAgent is a demo application that lets users operate their computers through voice commands. It supports Ubuntu, Windows, and macOS, can run in the background, and accepts voice input. It builds on models from the [OSWorld](https://os-world.github.io/) benchmark for quick integration.

## Features

- **Cross-Platform**: Works on Ubuntu, Windows, and macOS.
- **Background Operation**: Continuously runs without disrupting other tasks.
- **Voice Input**: Accepts and processes voice commands.
- **Advanced Model Integration**: Utilizes OSWorld benchmark models for enhanced performance.

## Installation and Running

To get started with OSAgent, follow these steps:

1. Set your OpenAI API key:
   ```shell
   export OPENAI_API_KEY="YOUR_KEY"
   ```

2. Install the OSAgent package:
   ```shell
   pip install osagent
   ```

3. Run the main agent module:
   ```shell
   python -m osagent.agents.main
   ```

**Note:** On macOS, you may receive popups requesting network, screenshot, or accessibility permissions. Please grant these permissions for OSAgent to function correctly.
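
The package metadata declares optional extras named `mac`, `win`, and `all`. The following is a minimal pre-flight sketch, not part of OSAgent itself, that suggests an install command for the current platform and checks that `OPENAI_API_KEY` is set. The helper names (`recommended_extra`, `install_command`, `api_key_is_set`) are hypothetical, and the mapping from platform to extra is an assumption inferred from the extras' dependencies (`pyobjc-framework-ApplicationServices` for `mac`, `pywinauto` for `win`).

```python
import os
import sys
from typing import Mapping, Optional


def recommended_extra(platform: str) -> Optional[str]:
    """Map a sys.platform value to an optional extra (assumed mapping)."""
    if platform == "darwin":
        return "mac"
    if platform == "win32":
        return "win"
    # Assumption: other platforms (e.g. Linux) need only the base dependencies.
    return None


def install_command(platform: str) -> str:
    """Build the pip command for the given platform."""
    extra = recommended_extra(platform)
    target = f"osagent[{extra}]" if extra else "osagent"
    return f'pip install "{target}"'


def api_key_is_set(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if OPENAI_API_KEY is present and non-empty."""
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if __name__ == "__main__":
    print(install_command(sys.platform))
    if not api_key_is_set():
        print("OPENAI_API_KEY is not set; export it before starting the agent.")
```

Quoting the requirement (`"osagent[mac]"`) keeps shells such as zsh from interpreting the square brackets.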

## Usage

By default, the program runs in voice command mode. Press and hold `shift + ctrl + x`, speak your instruction, then release the keys to process the command.
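
The press-and-hold behavior described above can be pictured as a small state machine. The sketch below is illustrative only and is not OSAgent's actual implementation: the `PushToTalk` class, its callbacks, and the string key names are all hypothetical. It starts a recording when every key in the combination is held and stops it as soon as any of them is released.

```python
from typing import Callable, Iterable, Set


class PushToTalk:
    """Track a key combination and fire callbacks on hold and release."""

    def __init__(
        self,
        combo: Iterable[str],
        on_start: Callable[[], None],
        on_stop: Callable[[], None],
    ) -> None:
        self.combo = frozenset(combo)
        self.pressed: Set[str] = set()
        self.active = False
        self.on_start = on_start
        self.on_stop = on_stop

    def press(self, key: str) -> None:
        """Record a key press; start once the full combination is held."""
        self.pressed.add(key)
        # The `active` flag makes auto-repeated press events harmless.
        if not self.active and self.combo.issubset(self.pressed):
            self.active = True
            self.on_start()

    def release(self, key: str) -> None:
        """Record a key release; stop when the combination is broken."""
        self.pressed.discard(key)
        if self.active and not self.combo.issubset(self.pressed):
            self.active = False
            self.on_stop()


if __name__ == "__main__":
    ptt = PushToTalk(
        {"shift", "ctrl", "x"},
        on_start=lambda: print("recording started"),
        on_stop=lambda: print("recording stopped, processing command"),
    )
    for key in ("shift", "ctrl", "x"):
        ptt.press(key)
    ptt.release("x")
```

In a real program, the `press` and `release` methods could be driven by a keyboard listener such as `pynput.keyboard.Listener(on_press=..., on_release=...)`, with the listener's key objects mapped to the names used here.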

For more detailed information, please refer to our [documentation](https://os-world.github.io/).

