Metadata-Version: 2.4
Name: simpleynews
Version: 0.3.3
Summary: HTML-first Yahoo Finance news scraping with broader ticker coverage
Author-email: Alexander Warth <alexander.warth@mailbox.org>
License-Expression: Apache-2.0
Project-URL: Homepage, https://gitlab.com/CochainComplex/simpleynews
Project-URL: Repository, https://gitlab.com/CochainComplex/simpleynews
Keywords: finance,news,scraping,yahoo-finance,stocks
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: beautifulsoup4>=4.12
Requires-Dist: requests>=2.31
Provides-Extra: dev
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: isort>=5.13; extra == "dev"
Requires-Dist: pylint>=3.3; extra == "dev"
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: twine>=5.1; extra == "dev"
Dynamic: license-file

# SimpleYNews

SimpleYNews is a small Python package for scraping Yahoo Finance news from the
quote/news HTML page. The package is intentionally HTML-first so it can cover
news that is often missing from the narrower Yahoo Finance JSON endpoint that
libraries such as `yfinance` rely on.

## Important legal notice

Yahoo!, Y!Finance, and Yahoo Finance are trademarks of Yahoo, Inc. SimpleYNews
is not affiliated with, endorsed by, or vetted by Yahoo. Use of Yahoo's name is
solely for the purpose of accurately describing which service this tool
interacts with (nominative fair use).

**Terms of Service.** Users should be aware that Yahoo's Terms of Service
(Section 2(d)(ix)) state that users must not access or collect data from Yahoo
services using automated means without Yahoo's express prior permission, must
not interfere with the services, and must not build a competing or substitute
service from Yahoo content. Users are solely responsible for reviewing those
terms, determining whether their particular use is permitted, and ceasing use if
Yahoo objects or changes its access controls or terms.

**robots.txt.** Yahoo's `robots.txt` for `finance.yahoo.com` restricts
automated access and explicitly blocks numerous scraping and AI user-agents.
Under US law, `robots.txt` is not a legally binding instrument per se, but it
evidences the site operator's wishes. Under EU and German law, `robots.txt` may
qualify as a machine-readable opt-out from text and data mining under the DSM
Directive (2019/790) Art. 4 and the German Urheberrechtsgesetz (UrhG) Section
44b. This tool does not fetch or check `robots.txt` before making requests.
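
Because the tool performs no such check, users who want to consult `robots.txt` have to do so themselves before calling the library. The following is a minimal sketch using the standard library's `urllib.robotparser`. The helper name `is_allowed`, the user-agent string, and the sample rules are illustrative assumptions; they are neither part of SimpleYNews nor a copy of Yahoo's actual file.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access happens here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative rules only; retrieve the real file yourself before relying on this.
SAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_RULES, "my-research-script", "https://example.com/private/x"))
print(is_allowed(SAMPLE_RULES, "my-research-script", "https://example.com/news"))
```

A technical check of this kind only reports what the file says; it does not settle whether a given use is permitted.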

**Cookie consent and contractual risk.** When accessing Yahoo Finance pages from
jurisdictions subject to GDPR, Yahoo presents a cookie consent form. This tool
submits that form programmatically as part of HTTP session handling. Users
should be aware that such submission may constitute acceptance of Yahoo's Terms
of Service on behalf of the user under clickwrap contract principles (cf. CJEU,
*Ryanair Ltd v. PR Aviation BV*, C-30/14, 2015). Users should independently
assess whether programmatic interaction with consent mechanisms creates
contractual obligations and whether their intended use complies with any such
obligations.

**No circumvention.** This tool does not bypass login requirements, password
gates, CAPTCHAs, rate-limit enforcement mechanisms, or encryption. It accesses
only publicly reachable pages that require no authentication. It sends standard
HTTP requests comparable to those a web browser would make.

Because of the foregoing, SimpleYNews is an experimental tool intended only
for personal research and educational use. If you need a production-grade or
contractually clean solution, use a licensed data provider.

*Legal notices last updated: 2026-03-09.*

## Why this exists

- `yfinance` news coverage is thin or missing for some tickers, because it
  relies on a narrower JSON endpoint.
- Yahoo's page HTML often contains richer news data for non-US listings.
- This package keeps the familiar `Ticker(...).news` API while using a wider
  extraction strategy.

## Project scope

The defensible position for this project is narrow:

- It is a local library, not a hosted service.
- It returns article metadata and links, not full article bodies.
- It is meant for personal research workflows and compatibility experiments.
- It is not intended to mirror Yahoo Finance, replace Yahoo Finance, or resell
  Yahoo-derived data.
- It should be used conservatively, with low request volume and no bulk crawl.
- It does not bypass login requirements, password gates, CAPTCHAs, or
  encryption.

That scope does not eliminate legal or contractual risk. It only explains why
the project exists technically: the human-facing Yahoo Finance quote/news page
can expose broader coverage than the narrower endpoints used by other libraries.

## Statutory and regulatory landscape

Depending on jurisdiction, automated data collection may implicate one or more
of the following legal frameworks. This list is illustrative, not exhaustive.

**United States.** Computer Fraud and Abuse Act (18 U.S.C. Section 1030); DMCA
anti-circumvention provisions (17 U.S.C. Section 1201); federal copyright law
(17 U.S.C. Section 101 et seq.); state common-law torts including trespass to
chattels and unfair competition.

**European Union.** Database Directive (96/9/EC) sui generis database right;
Digital Single Market Directive (2019/790) Art. 3-4 text and data mining
exceptions and Art. 15 press publisher right; General Data Protection Regulation
(2016/679); ePrivacy Directive (2002/58/EC).

**Germany.** Urheberrechtsgesetz (UrhG) Sections 44b and 60d (text and data
mining); Sections 87a-87e (database protection); Gesetz gegen den unlauteren
Wettbewerb (UWG).

**United Kingdom.** Computer Misuse Act 1990; Copyright, Designs and Patents
Act 1988 (CDPA); Copyright and Rights in Databases Regulations 1997 (database
right); UK GDPR (retained EU law).

**Other jurisdictions.** Users in Canada, Australia, and other jurisdictions
should be aware that comparable computer misuse, copyright, database protection,
and privacy statutes may apply. This list covers only selected jurisdictions and
is not exhaustive.

The legality of automated data collection varies significantly across
jurisdictions. Conduct that is permissible in one country may be prohibited, or
even criminal, in another. Users are solely responsible for assessing and
complying with the laws applicable to their use.

## Data protection

News metadata returned by this tool may include personal data as defined by
GDPR Art. 4(1), such as journalist or author names and publisher identifiers.

- Under GDPR Art. 4(7), the end user who runs this tool — not the tool's
  developer — is the data controller who determines the purposes and means of
  processing.
- Users processing personal data of individuals in the EU/EEA must ensure a
  valid legal basis (Art. 6 GDPR) and comply with transparency obligations
  (Art. 13-14 GDPR), including informing data subjects when personal data has
  not been obtained directly from them.
- All data processing occurs locally on the user's machine. This tool does not
  transmit scraped data to the developer or any third party.
- Users operating at scale should consider whether a Data Protection Impact
  Assessment (DPIA) is required under Art. 35 GDPR.

## Responsible use

- Keep request volume low. Do not bulk-crawl Yahoo Finance.
- Do not redistribute, resell, or sublicense scraped data.
- Do not use scraped data to build a service that competes with or substitutes
  for Yahoo Finance.
- Respect any cease-and-desist communication or access restriction from Yahoo.
- Yahoo may change its website structure, access controls, or terms at any time
  without notice.
- Thumbnail URLs in results point to Yahoo-hosted images. Embedding or
  redistributing those images outside of personal, local use may constitute
  hotlinking or unauthorized use of Yahoo's content delivery resources.
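
One way to keep request volume low is to enforce a minimum pause between consecutive lookups. The sketch below shows how a caller might do that. `MinIntervalThrottle` is a hypothetical helper and not part of SimpleYNews; the clock and sleep functions are injectable only so the behavior can be tested without waiting.

```python
import time
from typing import Callable, Optional


class MinIntervalThrottle:
    """Block so that consecutive calls to wait() are at least min_interval apart."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> float:
        """Sleep if needed and return the number of seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        # Record the moment the caller is released, not the moment wait() was entered.
        self._last = now + delay
        return delay
```

Calling `throttle.wait()` before each `SimpleYNews.Ticker(...)` lookup, with an interval of several seconds (an arbitrary choice, not a documented limit), spaces requests out without changing any other code.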

## Extraction strategy

SimpleYNews parses the quote/news page in layers:

1. Embedded structured page state such as `root.App.main`
2. JSON-LD news metadata
3. DOM selectors as a final fallback
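
The layered order above can be read as trying each extractor in turn and keeping the first non-empty result. The sketch below illustrates that pattern under this assumption, using only the standard library. The function names, the simplified regular expression, and the sample HTML are illustrative and do not reproduce the package's actual parsers. Only the JSON-LD layer is filled in; the other two layers are stubs.

```python
import json
import re
from typing import Callable

# Simplified matcher for <script type="application/ld+json">...</script> blocks.
JSON_LD_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def from_page_state(html: str) -> list[dict]:
    """Layer 1 stub: embedded page state would be decoded here."""
    return []


def from_json_ld(html: str) -> list[dict]:
    """Layer 2: collect NewsArticle objects from JSON-LD script blocks."""
    items = []
    for raw in JSON_LD_RE.findall(html):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
        for node in data if isinstance(data, list) else [data]:
            if isinstance(node, dict) and node.get("@type") == "NewsArticle":
                items.append({"title": node.get("headline"), "link": node.get("url")})
    return items


def from_dom(html: str) -> list[dict]:
    """Layer 3 stub: DOM selectors would run here as the last resort."""
    return []


def extract_news(html: str, layers: list[Callable[[str], list[dict]]]) -> list[dict]:
    """Return the output of the first layer that yields any items."""
    for layer in layers:
        items = layer(html)
        if items:
            return items
    return []


SAMPLE_HTML = """
<script type="application/ld+json">
{"@type": "NewsArticle", "headline": "Example headline", "url": "https://example.com/a"}
</script>
"""

print(extract_news(SAMPLE_HTML, [from_page_state, from_json_ld, from_dom]))
```

Keeping each layer behind the same signature means a layer can be repaired or reordered when the page structure changes, without touching the others.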

For `.DE` tickers, the scraper tries `de.finance.yahoo.com` before the default
US site. For other tickers, it may fall back to regional Yahoo Finance sites if
the primary site returns no results.
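
The host preference can be pictured as an ordered candidate list that is walked until one host returns results. In this sketch, `candidate_hosts` and `first_results` are hypothetical helpers, and `finance.yahoo.com` as the default US host is an assumption. Regional fallbacks for non-`.DE` tickers are left out because the text does not name them.

```python
from typing import Callable

DEFAULT_HOST = "finance.yahoo.com"    # assumption: the default US site
GERMAN_HOST = "de.finance.yahoo.com"  # named in the text above


def candidate_hosts(symbol: str) -> list[str]:
    """Return the hosts to try, in order, for a ticker symbol."""
    if symbol.upper().endswith(".DE"):
        # German listings: regional site first, then the default site.
        return [GERMAN_HOST, DEFAULT_HOST]
    # Other tickers: default site only in this sketch.
    return [DEFAULT_HOST]


def first_results(symbol: str, fetch: Callable[[str, str], list]) -> list:
    """Call fetch(host, symbol) per candidate host; return the first non-empty result."""
    for host in candidate_hosts(symbol):
        items = fetch(host, symbol)
        if items:
            return items
    return []


print(candidate_hosts("BMW.DE"))
print(candidate_hosts("AAPL"))
```

The `fetch` callable is injected so the ordering logic can be exercised without any network access.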

## Installation

```bash
python -m pip install simpleynews
```

## Quick start

```python
from simpleynews import SimpleYNews

bmw = SimpleYNews.Ticker("BMW.DE")
for item in bmw.news:
    print(item["title"])
    print(item["link"])
    print(item["publisher"])
```

## Returned shape

Each item in `.news` is a dictionary with these keys:

- `uuid`
- `title`
- `link`
- `publisher`
- `providerPublishTime`
- `type`
- `relatedTickers`
- `thumbnail`
- `summary`
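
The list above gives key names but not value types. In the legacy `yfinance` news format that uses the same key names, `providerPublishTime` is a Unix timestamp in seconds. Assuming SimpleYNews follows that convention, a caller could convert it as sketched below. `published_at` is a hypothetical helper, and the sample item is hand-written with made-up values.

```python
from datetime import datetime, timezone
from typing import Optional


def published_at(item: dict) -> Optional[datetime]:
    """Convert providerPublishTime to a timezone-aware UTC datetime, if present."""
    raw = item.get("providerPublishTime")
    if raw is None:
        return None
    # Assumption: the value is Unix epoch seconds, as in yfinance's legacy format.
    return datetime.fromtimestamp(int(raw), tz=timezone.utc)


# Hand-written sample item; every value here is made up for illustration.
sample = {
    "uuid": "00000000-0000-0000-0000-000000000000",
    "title": "Example headline",
    "link": "https://example.com/article",
    "publisher": "Example Publisher",
    "providerPublishTime": 1700000000,
    "type": "STORY",
    "relatedTickers": ["BMW.DE"],
    "thumbnail": None,
    "summary": "",
}

print(published_at(sample).isoformat())
```

Using `item.get(...)` rather than `item[...]` keeps the helper tolerant of items where a field could not be extracted.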

## Development

Use either `pyenv` or a plain `venv`, but keep Python versions aligned with
`pyproject.toml` and `.python-version`.

### Option A: pyenv + venv (recommended)

```bash
pyenv install 3.11 -s
pyenv local 3.11
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"  # quoted so zsh does not glob the brackets
```

### Option B: system Python + venv

```bash
python3.11 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"  # quoted so zsh does not glob the brackets
```

### Quality checks

```bash
PYLINTHOME=.pylint.d python -m pylint simpleynews tests
python -m isort --check-only simpleynews tests
python -m pytest
python -m build
```

### Versioning and release

- Single source of truth: `simpleynews/__init__.py` (`__version__`).
- Packaging metadata reads the version dynamically from that file.
- Keep PyPI credentials outside the repository.
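
A release script can read the version from that file without importing the package, for example to compare it against a tag before uploading. The sketch below is one possible approach. `read_version` is a hypothetical helper, and it assumes the file contains a plain `__version__ = "..."` assignment at the start of a line.

```python
import re

# Matches a top-level assignment such as: __version__ = "0.3.3"
VERSION_RE = re.compile(r'^__version__\s*=\s*["\']([^"\']+)["\']', re.MULTILINE)


def read_version(source: str) -> str:
    """Extract the __version__ string from module source text."""
    match = VERSION_RE.search(source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


# In a real script the text would come from simpleynews/__init__.py, e.g.
# source = pathlib.Path("simpleynews/__init__.py").read_text(encoding="utf-8")
print(read_version('"""Package docstring."""\n__version__ = "0.3.3"\n'))
```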

### Release checklist

```bash
# 1) bump version in simpleynews/__init__.py
# 2) run checks
PYLINTHOME=.pylint.d python -m pylint simpleynews tests
python -m isort --check-only simpleynews tests
python -m pytest

# 3) rebuild artifacts
rm -f dist/*
python -m build
python -m twine check dist/*

# 4) upload using environment variables
TWINE_USERNAME=__token__ \
TWINE_PASSWORD=pypi-REPLACE_WITH_PROJECT_TOKEN \
python -m twine upload dist/*
```

If the upload fails with `File already exists`, bump the patch version and
retry; PyPI does not accept a second upload of an existing filename.

## Disclaimer

Yahoo!, Yahoo Finance, and related marks are owned by Yahoo, Inc. Review
Yahoo's terms before using this package or any data it returns:

- Yahoo Terms of Service:
  https://legal.yahoo.com/us/en/yahoo/terms/otos/index.html
- Yahoo Finance Community terms:
  https://legal.yahoo.com/us/en/yahoo/terms/product-atos/finance/index.html
- Yahoo Developer API Terms:
  https://legal.yahoo.com/us/en/yahoo/terms/product-atos/apiforydn/index.html
- Yahoo permissions guidance:
  https://legal.yahoo.com/us/en/yahoo/permissions/requests/index.html

These links are provided for reference and may change. Users should
independently verify current terms.

**No warranty of legality.** The developers make no representation or warranty
that operation of this tool is lawful in any jurisdiction. The tool is provided
"as is" without warranty of any kind, including warranties of legality,
accuracy, completeness, availability, or fitness for any purpose. Use is
entirely at your own risk.

**No warranty of accuracy.** News metadata returned by this tool may be
incomplete, delayed, or inaccurate. Do not use it as the basis for financial,
investment, or trading decisions.

**No legal advice.** Nothing in this project — including its source code,
documentation, and legal notices — constitutes legal advice. Users should
consult qualified legal counsel in their jurisdiction before using this tool or
relying on data obtained through it.

**Jurisdictional variation.** The legality of web scraping, automated data
collection, and data reuse varies significantly across jurisdictions. Conduct
that is permissible in one country may be prohibited, or even criminal, in
another.

**Indemnification.** To the maximum extent permitted by applicable law, users
agree to hold harmless and indemnify the developers and contributors of this
project from any claims, damages, liabilities, or expenses arising from the
user's use of this tool or data obtained through it. In jurisdictions where
blanket indemnification clauses in gratuitous transactions are unenforceable
(including under German Schenkungsrecht, BGB Section 516 et seq.), this clause
applies only to the extent permitted by mandatory law.

By using SimpleYNews, you accept these terms and agree to comply with all
applicable laws and the terms of any third-party services accessed through it.
SimpleYNews is not legal advice and must not be presented as an approved or
authorized Yahoo integration.

## License

Apache License 2.0. See [LICENSE](LICENSE).
