Metadata-Version: 2.1
Name: web2db
Version: 0.1.3
Summary: Fetch webpage full-text, persist link and full text to SQLITE3 db, resumable with tqdm progressbar.
Home-page: https://github.com/pushkarparanjpe/web2db
Author: Pushkar Paranjpe
Author-email: pushkarparanjpe@gmail.com
License: MIT
Platform: UNKNOWN
Description-Content-Type: text/markdown

# web2db


Fetches the full text of each input URL and persists the URL and its text to a SQLite3 database file.  
Fetching is resumable and shows a tqdm progress bar.  


### Install:  
```pip install web2db```


### Quickstart:  

```python
import web2db  
web2db.dump('data.db', urls=[
    'https://www.google.com',
    'https://www.yahoo.com',
    'https://www.msn.com'
])
```

Query the DB file:
```python
import web2db

# 'data.db' is the file written by web2db.dump() in the Quickstart
df = web2db.to_df('data.db')
print(df.shape)
print(df)
```

### SQL Schema:  
- Table `WebPages`: `(url text, fulltext text)`  
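
Since the output is a plain SQLite3 file, it can also be read with Python's standard `sqlite3` module. The sketch below assumes only the schema listed above. The helper name `page_lengths` and the in-memory stand-in database are hypothetical and are there so that the snippet runs on its own. In practice, connect to the file written by `web2db.dump` instead.

```python
import sqlite3


def page_lengths(conn):
    """Return (url, character count of fulltext) for every stored page."""
    rows = conn.execute(
        "SELECT url, length(fulltext) FROM WebPages ORDER BY url"
    )
    return rows.fetchall()


# Stand-in database that mimics the documented schema (assumption:
# a real file produced by web2db has the same table and columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE WebPages (url text, fulltext text)")
conn.executemany(
    "INSERT INTO WebPages VALUES (?, ?)",
    [
        ("https://example.com/a", "hello world"),
        ("https://example.com/b", "lorem ipsum dolor"),
    ],
)

lengths = page_lengths(conn)
print(lengths)
```

To run it against real data, replace the stand-in with `sqlite3.connect('data.db')`.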


### Features:
- Resumable webpage fetching
- Saves to a local SQLite3 DB file
- tqdm progress bar
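
One way resumable fetching could work is to skip every URL that already has a row in the database. The sketch below illustrates that idea under stated assumptions and is not necessarily how web2db implements it. `dump_resumable` and `fake_fetch` are hypothetical names, the fetcher is a stand-in that makes no network requests, and the progress bar is omitted to keep the example dependency-free.

```python
import sqlite3


def dump_resumable(conn, urls, fetch):
    """Fetch and store only the URLs not yet present in WebPages.

    Returns the number of URLs fetched during this call.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS WebPages (url text, fulltext text)"
    )
    # URLs stored by earlier (possibly interrupted) runs
    done = {row[0] for row in conn.execute("SELECT url FROM WebPages")}
    fetched = 0
    for url in urls:
        if url in done:
            continue  # already persisted, nothing to do
        text = fetch(url)
        conn.execute("INSERT INTO WebPages VALUES (?, ?)", (url, text))
        conn.commit()  # commit per page so an interruption loses little work
        done.add(url)
        fetched += 1
    return fetched


def fake_fetch(url):
    """Hypothetical fetcher used in place of a real HTTP request."""
    return "text of " + url


conn = sqlite3.connect(":memory:")
urls = ["https://example.com/a", "https://example.com/b"]

first = dump_resumable(conn, urls[:1], fake_fetch)  # simulates an interrupted run
second = dump_resumable(conn, urls, fake_fetch)     # resumes, skips the stored URL
total = conn.execute("SELECT count(*) FROM WebPages").fetchone()[0]
print(first, second, total)
```

Committing after each page trades some write speed for the guarantee that a crash costs at most the page in flight.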


