Metadata-Version: 2.4
Name: quick-scraping
Version: 1.0.1
Summary: Toolkit for web scraping that combines Selenium and BeautifulSoup
Home-page: https://github.com/bruno-sardou/quick-scraping
Author: Bruno Sardou
Author-email: brunosardou@proton.me
Project-URL: Bug Tracker, https://github.com/bruno-sardou/quick-scraping/issues
Project-URL: Documentation, https://quick-scraping.readthedocs.io/
Project-URL: Source Code, https://github.com/bruno-sardou/quick-scraping
Keywords: web scraping,selenium,beautifulsoup,automation,html,parsing,extraction
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Internet :: WWW/HTTP :: Browsers
Classifier: Topic :: Internet :: WWW/HTTP :: Dynamic Content
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: beautifulsoup4>=4.9.0
Requires-Dist: selenium>=4.0.0
Requires-Dist: requests>=2.25.0
Requires-Dist: lxml>=4.6.0
Provides-Extra: dev
Requires-Dist: pytest>=6.0.0; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: flake8; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Provides-Extra: docs
Requires-Dist: sphinx>=4.0.0; extra == "docs"
Requires-Dist: sphinx-rtd-theme; extra == "docs"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license-file
Dynamic: project-url
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Quick Scraping

A Python library for web scraping and browser automation that combines Selenium and BeautifulSoup.

## Features

- **Selenium module**: Web browser automation
- **BeautifulSoup module**: HTML parsing and data extraction
- **Integration**: Functions for using both libraries together
- **Utilities**: Processing for text, CSV, JSON, and more
- **Logging**: Logging for every operation

## Installation

```bash
pip install quick-scraping
```

To install with optional dependencies:

```bash
# For development
pip install "quick-scraping[dev]"

# For documentation
pip install "quick-scraping[docs]"
```
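
One way to confirm the install worked is to query the installed distribution metadata. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is illustrative and not part of quick-scraping.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The distribution name comes from the package metadata (Name: quick-scraping).
    version = installed_version("quick-scraping")
    print(version or "quick-scraping is not installed in this environment")
```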

## Basic Usage

### Selenium Example

```python
from quick_Scraping.selenium_functions import SeleniumHelper

# Initialize the Selenium helper
with SeleniumHelper(browser_type="chrome", headless=True) as selenium:
    # Navigate to a URL
    selenium.navigate.to("https://www.exemplo.com")

    # Wait for an element, then click it
    selenium.element.wait_for_element("id", "meu-botao", timeout=10)
    selenium.interact.click("id", "meu-botao")

    # Get the page HTML
    html = selenium.driver.page_source
```
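
The `with` block above suggests that `SeleniumHelper` acts as a context manager that releases the browser when the block ends, even if scraping fails partway through. The following is a minimal sketch of that general pattern, not the library's implementation: `FakeDriver` and `BrowserSession` are hypothetical stand-ins so the example runs without a real browser.

```python
class FakeDriver:
    """Hypothetical stand-in for a browser driver; records whether quit() ran."""

    def __init__(self) -> None:
        self.closed = False

    def quit(self) -> None:
        self.closed = True


class BrowserSession:
    """Context manager that always releases its driver on exit."""

    def __init__(self, driver_factory=FakeDriver) -> None:
        self._driver_factory = driver_factory
        self.driver = None

    def __enter__(self) -> "BrowserSession":
        self.driver = self._driver_factory()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if self.driver is not None:
            self.driver.quit()
        return False  # do not swallow exceptions raised inside the block


def run_failing_session() -> FakeDriver:
    """Raise inside the block and return the driver to show it was still closed."""
    session = BrowserSession()
    try:
        with session:
            raise RuntimeError("simulated scraping failure")
    except RuntimeError:
        pass
    return session.driver
```

Because cleanup lives in `__exit__`, callers do not need their own `try`/`finally` around each scraping step.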

### BeautifulSoup Example

```python
from quick_Scraping.beautifulsoup_functions import HTMLParser, DataExtractor

# Initialize the HTML parser and the data extractor
parser = HTMLParser()
extractor = DataExtractor()

# Load HTML from a URL
soup = parser.load_from_url("https://www.exemplo.com")

# Extract structured data
links = extractor.extract_links(soup)
table_headers, table_data = extractor.extract_table(soup, table_selector="table.dados")

# Extract the full article content
article = extractor.extract_article_content(soup)
```
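
To make the shape of extracted links and table data concrete, here is a dependency-free sketch that uses the standard library's `html.parser` in place of BeautifulSoup. It is not how quick-scraping works internally: the class and function names are hypothetical, CSS selectors such as `table.dados` are ignored, and the first row found is simply treated as the header row.

```python
from html.parser import HTMLParser as StdHTMLParser
from typing import List, Tuple


class LinkAndTableCollector(StdHTMLParser):
    """Collect (text, href) pairs and table rows from an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self.rows: List[List[str]] = []
        self._href = None            # href of the <a> currently open, if any
        self._text: List[str] = []   # text chunks of the <a> currently open
        self._cell = None            # text chunks of the <th>/<td> currently open

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []
        elif tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td"):
            self._cell = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None
        elif tag in ("th", "td") and self._cell is not None:
            if self.rows:
                self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def extract_links_and_table(html: str):
    """Return (links, headers, data_rows) for the given HTML string."""
    collector = LinkAndTableCollector()
    collector.feed(html)
    collector.close()
    if collector.rows:
        headers, data = collector.rows[0], collector.rows[1:]
    else:
        headers, data = [], []
    return collector.links, headers, data


SAMPLE = """
<table class="dados">
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td><a href="/p/1">Widget</a></td><td>9.90</td></tr>
</table>
"""
```

Calling `extract_links_and_table(SAMPLE)` yields one link, a two-column header, and one data row. For real pages, BeautifulSoup handles malformed markup and selectors far better than this stand-in.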

### Integration Example

```python
from quick_Scraping.common import ScrapingHelper

# Initialize the integrated helper
scraper = ScrapingHelper()

# Configure data extraction
extraction_config = {
    "title": {"type": "text", "selector": "h1.title"},
    "products": {"type": "table", "selector": "table.products"},
    "links": {"type": "links", "selector": "a.product-link"}
}

# Navigate and extract data
results = scraper.extract_data_with_selenium(
    url="https://www.exemplo.com/produtos",
    extraction_config=extraction_config,
    wait_time=3
)

# Save the results
scraper.save_results(results, "produtos.json", format="json")
```
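
If you want to post-process results outside the library, a dictionary keyed like `extraction_config` can be written to disk with the standard library alone. This is a rough sketch under that assumption: `save_results_sketch` is a hypothetical helper, and the actual structure returned by `extract_data_with_selenium` and the behavior of `save_results` may differ.

```python
import csv
import json
from pathlib import Path
from typing import Any, Dict


def save_results_sketch(results: Dict[str, Any], path: str, fmt: str = "json") -> Path:
    """Write a results dict to disk as JSON, or as field/value rows in CSV."""
    target = Path(path)
    if fmt == "json":
        # ensure_ascii=False keeps accented characters readable in the file
        target.write_text(
            json.dumps(results, ensure_ascii=False, indent=2), encoding="utf-8"
        )
    elif fmt == "csv":
        with target.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(["field", "value"])
            for key, value in results.items():
                # Nested values are serialized as JSON so each field fits in one cell
                writer.writerow([key, json.dumps(value, ensure_ascii=False)])
    else:
        raise ValueError(f"unsupported format: {fmt!r}")
    return target
```

The CSV branch flattens each top-level field into a single row, which is lossy for tables; JSON is the safer default for nested results.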
