Metadata-Version: 2.4
Name: lazyq
Version: 0.0.4
Summary: A lightweight, chainable query pipeline for Python
Author-email: vikasAWA <awasthivikas02@gmail.com>
License-Expression: Apache-2.0
Project-URL: Repository, https://github.com/vikasAWA/lazyq
Project-URL: Documentation, https://vikasAWA.github.io/lazyq/
Keywords: nbdev
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Dynamic: license-file



<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## What is lazyq?

**lazyq** is a lightweight, chainable query pipeline for Python.

Instead of executing operations immediately, lazyq builds up a **map of
instructions** and only runs them when you actually need the results.
This makes it memory-efficient and great for working with large
datasets!

## Usage

### Installation

Install latest from the GitHub
[repository](https://github.com/vikasAWA/lazyq):

``` sh
$ pip install git+https://github.com/vikasAWA/lazyq.git
```

or from [conda](https://anaconda.org/vikasAWA/lazyq)

``` sh
$ conda install -c vikasAWA lazyq
```

or from [pypi](https://pypi.org/project/lazyq/)

``` sh
$ pip install lazyq
```

### Documentation

Documentation can be found hosted on this GitHub
[repository](https://github.com/vikasAWA/lazyq)’s
[pages](https://vikasAWA.github.io/lazyq/). Additionally you can find
package manager specific guidelines on
[conda](https://anaconda.org/vikasAWA/lazyq) and
[pypi](https://pypi.org/project/lazyq/) respectively.

## How to use

## Usage

lazyq lets you build queries step by step. Let’s explore with some
country data!

## Exploring with a list of dicts

Let’s start simple — here’s a list of countries. We’ll use
[`Query.from_iterable()`](https://vikasAWA.github.io/lazyq/core.html#query.from_iterable)
to wrap it.

``` python
from lazyq import *

countries = [
    {"name": "India", "continent": "Asia", "population": 1428000000, "gdp": 3750000000000, "area_km2": 3287263},
    {"name": "China", "continent": "Asia", "population": 1425000000, "gdp": 17700000000000, "area_km2": 9596960},
    {"name": "USA", "continent": "North America", "population": 331000000, "gdp": 25460000000000, "area_km2": 9833517},
    {"name": "Brazil", "continent": "South America", "population": 215000000, "gdp": 1920000000000, "area_km2": 8515767},
    {"name": "Nigeria", "continent": "Africa", "population": 218000000, "gdp": 477000000000, "area_km2": 923768},
    {"name": "Germany", "continent": "Europe", "population": 84000000, "gdp": 4070000000000, "area_km2": 357114},
    {"name": "Australia", "continent": "Oceania", "population": 26000000, "gdp": 1693000000000, "area_km2": 7692024},
    {"name": "Egypt", "continent": "Africa", "population": 105000000, "gdp": 387000000000, "area_km2": 1002450},
    {"name": "France", "continent": "Europe", "population": 68000000, "gdp": 2780000000000, "area_km2": 551695},
    {"name": "Canada", "continent": "North America", "population": 38000000, "gdp": 2140000000000, "area_km2": 9984670},
]
```

### **Query 1:** Let’s just get all country names.

``` python
# Build the query - nothing runs yet!
q = Query.from_iterable(countries).select('name')
print(q) # shows the pipeline map
```

    Query(select(name))

``` python
# lets run it
q.collect()
```

    [{'name': 'India'},
     {'name': 'China'},
     {'name': 'USA'},
     {'name': 'Brazil'},
     {'name': 'Nigeria'},
     {'name': 'Germany'},
     {'name': 'Australia'},
     {'name': 'Egypt'},
     {'name': 'France'},
     {'name': 'Canada'}]

``` python
# or you can use show() also 
q.show() # by default will show 5 values only. You can pass no. as argument
```

    {'name': 'India'}
    {'name': 'China'}
    {'name': 'USA'}
    {'name': 'Brazil'}
    {'name': 'Nigeria'}

### **Query 2:** Which continents have more than one country in our list?

Let’s group countries by continent, count how many are in each group,
then filter to only show continents with more than one country.

``` python
q = Query.from_iterable(countries)
q
```

    Query()

``` python
# first we can groupby continents
q.groupby('continent').collect(1) # will show only 1 value
```

    [('Asia',
      [{'name': 'India',
        'continent': 'Asia',
        'population': 1428000000,
        'gdp': 3750000000000,
        'area_km2': 3287263},
       {'name': 'China',
        'continent': 'Asia',
        'population': 1425000000,
        'gdp': 17700000000000,
        'area_km2': 9596960}])]

``` python
# we can use count() to count the number of items in a group
q.groupby('continent').count().collect()
```

    [('Asia', 2),
     ('North America', 2),
     ('South America', 1),
     ('Africa', 2),
     ('Europe', 2),
     ('Oceania', 1)]

``` python
# so to get the continents having more than 1 country we can use filter
q.groupby('continent').count().filter(lambda x: x[1] > 1).show()
```

    ('Asia', 2)
    ('North America', 2)
    ('Africa', 2)
    ('Europe', 2)

### **Query 3:** Top 3 richest countries by GDP

Let’s find the top 3 richest countries. We use `.sort()` to order by GDP
(highest first), then `.collect(3)` to grab only the top 3 — all in one
lazy chain!

``` python
q.sort('gdp', reverse=True).collect(3)
```

    [{'name': 'USA',
      'continent': 'North America',
      'population': 331000000,
      'gdp': 25460000000000,
      'area_km2': 9833517},
     {'name': 'China',
      'continent': 'Asia',
      'population': 1425000000,
      'gdp': 17700000000000,
      'area_km2': 9596960},
     {'name': 'Germany',
      'continent': 'Europe',
      'population': 84000000,
      'gdp': 4070000000000,
      'area_km2': 357114}]

``` python
# If you want to just select a particular key only. Use select()
q.sort('gdp', reverse=True).select('name').collect(3)
```

    [{'name': 'USA'}, {'name': 'China'}, {'name': 'Germany'}]

### **Query 4:** Countries with population over 200 million

We use `F('population')` to reference the population field and
`> 200_000_000` to build a condition. Only countries matching it pass
through the filter!

``` python
q.filter(F('population') > 200_000_000).collect()
```

    [{'name': 'India',
      'continent': 'Asia',
      'population': 1428000000,
      'gdp': 3750000000000,
      'area_km2': 3287263},
     {'name': 'China',
      'continent': 'Asia',
      'population': 1425000000,
      'gdp': 17700000000000,
      'area_km2': 9596960},
     {'name': 'USA',
      'continent': 'North America',
      'population': 331000000,
      'gdp': 25460000000000,
      'area_km2': 9833517},
     {'name': 'Brazil',
      'continent': 'South America',
      'population': 215000000,
      'gdp': 1920000000000,
      'area_km2': 8515767},
     {'name': 'Nigeria',
      'continent': 'Africa',
      'population': 218000000,
      'gdp': 477000000000,
      'area_km2': 923768}]

### **Query 5:** Total Population per Continent

We use `.groupby()` to group countries by continent, then
`.sum('population')` to add up the population in each group. Great for
aggregating data!

``` python
q.groupby('continent').sum('population').collect()
```

    [('Asia', 2853000000.0),
     ('North America', 369000000.0),
     ('South America', 215000000.0),
     ('Africa', 323000000.0),
     ('Europe', 152000000.0),
     ('Oceania', 26000000.0)]

### **Query 6:** Largest Country by Area in Each Continent

We use `.groupby()` then `.max('area_km2')` to find the biggest country
in each continent.

``` python
q.groupby('continent').max('area_km2').collect()
```

    [{'name': 'China',
      'continent': 'Asia',
      'population': 1425000000,
      'gdp': 17700000000000,
      'area_km2': 9596960},
     {'name': 'Canada',
      'continent': 'North America',
      'population': 38000000,
      'gdp': 2140000000000,
      'area_km2': 9984670},
     {'name': 'Brazil',
      'continent': 'South America',
      'population': 215000000,
      'gdp': 1920000000000,
      'area_km2': 8515767},
     {'name': 'Egypt',
      'continent': 'Africa',
      'population': 105000000,
      'gdp': 387000000000,
      'area_km2': 1002450},
     {'name': 'France',
      'continent': 'Europe',
      'population': 68000000,
      'gdp': 2780000000000,
      'area_km2': 551695},
     {'name': 'Australia',
      'continent': 'Oceania',
      'population': 26000000,
      'gdp': 1693000000000,
      'area_km2': 7692024}]

## 🐍 CS50 Language Popularity Survey CSV

Let’s load a CSV file and count how many respondents use each
programming language.

``` python
survey = Query.from_csv('../data/favorites.csv')
survey.collect(2)
```

    [{'Timestamp': '10/20/2025 9:45:26',
      'language': 'Python',
      'problem': 'Readability'},
     {'Timestamp': '10/20/2025 10:08:03',
      'language': 'Python',
      'problem': 'Mario'}]

### How many people use Python vs other languages?

``` python
survey.groupby('language').count().collect()
```

    [('Python', 190), ('Scratch', 24), ('C', 58)]

### Most Common Problem

Let’s find the most frequently mentioned problem using `groupby`,
`count`, and `sort`!

``` python
survey.groupby('problem').count().sort(1, reverse=True).show(1)
```

    ('Hello, World', 42)

### Filter by Language

Use `F('language') == 'C'` to filter only C programmers, then `select()`
to pick specific fields.

``` python
survey.filter(F('language') == 'C').select(['language', 'problem']).collect(4) # only showing 4
```

    [{'language': 'C', 'problem': 'Cash'},
     {'language': 'C', 'problem': 'Filter'},
     {'language': 'C', 'problem': 'DNA'},
     {'language': 'C', 'problem': 'Speller'}]

## Working with SQLite

lazyq can connect directly to a SQLite database. Use
[`Query.from_sqlite()`](https://vikasAWA.github.io/lazyq/core.html#query.from_sqlite)
to load a table, or
[`Query.from_sqlite_query()`](https://vikasAWA.github.io/lazyq/core.html#query.from_sqlite_query)
to run a custom SQL query and then chain lazyq operations on the result.

### IMDB shows

``` python
shows_db = Query.from_sqlite('../data/shows.db', table='shows')
```

``` python
shows_db.collect(3)
```

    [{'id': 62614, 'title': "Zeg 'ns Aaa", 'year': 1981, 'episodes': 227},
     {'id': 63881, 'title': 'Catweazle', 'year': 1970, 'episodes': 26},
     {'id': 63962, 'title': 'UFO', 'year': 1970, 'episodes': 26}]

### Explore Available Tables

Use `from_sqlite_query()` with a metadata query to list all tables in
your SQLite database!

``` python
tables = Query.from_sqlite_query('../data/shows.db', "SELECT name FROM sqlite_master WHERE type='table'").collect()
tables
```

    [{'name': 'genres'},
     {'name': 'people'},
     {'name': 'ratings'},
     {'name': 'shows'},
     {'name': 'stars'},
     {'name': 'writers'}]

### Top 5 Shows with Most Episodes

Load from SQLite, drop null episodes, sort and select only the fields we
need.

``` python
shows_db.dropna('episodes').sort('episodes', reverse=True).select(['title', 'episodes']).collect(5)
```

    [{'title': 'NRK Nyheter', 'episodes': 18593},
     {'title': 'The Young and the Restless', 'episodes': 13297},
     {'title': 'See the World by Train', 'episodes': 10674},
     {'title': 'WREG News 3 at 10PM', 'episodes': 10663},
     {'title': 'Barátok közt', 'episodes': 10456}]

### Top 5 Highest Rated Shows

Use `from_sqlite_query()` to JOIN tables in SQL, then chain lazyq
operations on the result!

``` python
Query.from_sqlite_query('../data/shows.db', """
    SELECT shows.title, ratings.rating
    FROM shows JOIN ratings ON shows.id = ratings.show_id
""").sort('rating', reverse=True).limit(5).collect()
```

    [{'title': 'Peanut Headz: Black History Toonz', 'rating': 10.0},
     {'title': 'Quase Anjos: Edição Especial', 'rating': 10.0},
     {'title': 'Living with Siblings', 'rating': 10.0},
     {'title': 'DIY Science Time', 'rating': 10.0},
     {'title': 'The Show with the Campervan', 'rating': 9.9}]

### Filter Shows by Year

Use `F('year') == 1970` to filter shows from a specific year directly
from SQLite!

``` python
shows_db.filter(F('year') == 1970).select(['title', 'year']).collect(5)
```

    [{'title': 'Catweazle', 'year': 1970},
     {'title': 'UFO', 'year': 1970},
     {'title': 'Ace of Wands', 'year': 1970},
     {'title': 'The Adventures of Don Quick', 'year': 1970},
     {'title': 'Albert and Victoria', 'year': 1970}]
