5  add.to() - One-to-Many & Many-to-One Patterns

5.1 Overview

Learn how to pass lists of DataFrames to add.to() as targets, references, or both. This page introduces three patterns for working with several datasets in a single call.

What you’ll learn:

  • One-to-Many: Single target with multiple reference DataFrames
  • Many-to-One: Multiple targets with a single reference DataFrame
  • Many-to-Many: Multiple targets with multiple references
  • When to use each pattern

Prerequisites:

  • Basic Lookup: Understanding of single DataFrame operations
  • Multiple Columns & Keys: Working with multiple columns


5.2 Example 1: One-to-Many Pattern

Business Context: You have a customer list and need to aggregate order amounts from multiple months (January and February). Each month’s orders are in separate DataFrames.

Code:

import additory as add
import pandas as pd

# Single target: Customer list
customers = pd.DataFrame({
    'customer_id': [1, 2, 3],
    'name': ['Alice', 'Bob', 'Charlie']
})

# Multiple references: Orders from different months
orders_jan = pd.DataFrame({
    'customer_id': [1, 1, 2],
    'amount': [100, 150, 200]
})

orders_feb = pd.DataFrame({
    'customer_id': [1, 3, 3],
    'amount': [175, 125, 225]
})

# One-to-many: single target, list of references
result = add.to(
    customers,
    bring_from=[orders_jan, orders_feb],    # List of reference DataFrames
    bring='amount',
    against='customer_id',
    strategy={'amount': 'sum'}              # Aggregate multiple matches
)

# Positional form (the first four arguments can be passed without names):
# result = add.to(customers, [orders_jan, orders_feb], 'amount', 'customer_id', strategy={'amount': 'sum'})

print(result)

Output:

   customer_id     name  amount
0            1    Alice     425
1            2      Bob     200
2            3  Charlie     350

Explanation:

  • Pass a list [orders_jan, orders_feb] to bring_from
  • additory combines data from both DataFrames
  • Alice (ID 1): 100 + 150 (Jan) + 175 (Feb) = 425
  • Bob (ID 2): 200 (Jan only) = 200
  • Charlie (ID 3): 125 + 225 (Feb only) = 350
  • The strategy parameter handles multiple matches (covered in detail on the next page)

Note: This also works with polars DataFrames.
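To see what the one-to-many call computes, the same totals can be reproduced with plain pandas. This is a sketch of the equivalent arithmetic, not additory's implementation; it assumes the 'sum' strategy adds every matching row across all reference DataFrames, as the output above suggests.

```python
import pandas as pd

customers = pd.DataFrame({
    'customer_id': [1, 2, 3],
    'name': ['Alice', 'Bob', 'Charlie']
})
orders_jan = pd.DataFrame({'customer_id': [1, 1, 2], 'amount': [100, 150, 200]})
orders_feb = pd.DataFrame({'customer_id': [1, 3, 3], 'amount': [175, 125, 225]})

# Stack the monthly references, then total the amounts per customer
all_orders = pd.concat([orders_jan, orders_feb], ignore_index=True)
totals = all_orders.groupby('customer_id', as_index=False)['amount'].sum()

# Left-join the totals onto the target so every customer row is kept
manual_result = customers.merge(totals, on='customer_id', how='left')
print(manual_result)
```

The left join keeps every row of the target, which matches the shape of the add.to() result shown above.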


5.3 Example 2: Many-to-One Pattern

Business Context: You have customer lists split by type (internal and external), and you need to add order amounts from a single orders DataFrame to both lists.

Code:

import additory as add
import pandas as pd

# Multiple targets: Different customer segments
customers_internal = pd.DataFrame({
    'customer_id': [1, 2],
    'name': ['Alice', 'Bob']
})

customers_external = pd.DataFrame({
    'customer_id': [3, 4],
    'name': ['Charlie', 'David']
})

# Single reference: All orders
orders = pd.DataFrame({
    'customer_id': [1, 2, 3, 4],
    'amount': [100, 200, 150, 175]
})

# Many-to-one: list of targets, single reference
results = add.to(
    [customers_internal, customers_external],    # List of target DataFrames
    bring_from=orders,
    bring='amount',
    against='customer_id'
)

# Positional form (the first four arguments can be passed without names):
# results = add.to([customers_internal, customers_external], orders, 'amount', 'customer_id')

# Results is a list of DataFrames
print("Internal customers:")
print(results[0])
print("\nExternal customers:")
print(results[1])

Output:

Internal customers:
   customer_id   name  amount
0            1  Alice     100
1            2    Bob     200

External customers:
   customer_id     name  amount
0            3  Charlie     150
1            4    David     175

Explanation:

  • Pass a list [customers_internal, customers_external] as the first argument
  • additory returns a list of DataFrames (same order as input)
  • Each target DataFrame gets matched against the same reference
  • results[0] contains internal customers with amounts
  • results[1] contains external customers with amounts
  • This is useful for processing segmented data consistently

Note: This also works with polars DataFrames.
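For comparison, the same per-target lookup can be written by hand in pandas. The sketch below is not additory's implementation; it only illustrates why the result is a list that follows the order of the input targets.

```python
import pandas as pd

customers_internal = pd.DataFrame({'customer_id': [1, 2], 'name': ['Alice', 'Bob']})
customers_external = pd.DataFrame({'customer_id': [3, 4], 'name': ['Charlie', 'David']})
orders = pd.DataFrame({'customer_id': [1, 2, 3, 4], 'amount': [100, 200, 150, 175]})

targets = [customers_internal, customers_external]

# Apply the same left join to each target; the list keeps the input order
manual_results = [
    target.merge(orders[['customer_id', 'amount']], on='customer_id', how='left')
    for target in targets
]

for label, frame in zip(['Internal', 'External'], manual_results):
    print(f"{label} customers:")
    print(frame)
```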


5.4 Example 3: Many-to-Many Pattern

Business Context: You have customer lists split by quarter (Q1 and Q2) and order data split by month (January and February). You need to match each customer list with all relevant orders.

Code:

import additory as add
import pandas as pd

# Multiple targets: Customers by quarter
customers_q1 = pd.DataFrame({
    'customer_id': [1, 2],
    'name': ['Alice', 'Bob']
})

customers_q2 = pd.DataFrame({
    'customer_id': [3, 4],
    'name': ['Charlie', 'David']
})

# Multiple references: Orders by month
orders_jan = pd.DataFrame({
    'customer_id': [1, 2],
    'amount': [100, 200]
})

orders_feb = pd.DataFrame({
    'customer_id': [3, 4],
    'amount': [150, 175]
})

# Many-to-many: list of targets, list of references
results = add.to(
    [customers_q1, customers_q2],           # List of targets
    bring_from=[orders_jan, orders_feb],    # List of references
    bring='amount',
    against='customer_id',
    strategy={'amount': 'sum'}
)

# Positional form (the first four arguments can be passed without names):
# results = add.to([customers_q1, customers_q2], [orders_jan, orders_feb], 'amount', 'customer_id', strategy={'amount': 'sum'})

print("Q1 customers:")
print(results[0])
print("\nQ2 customers:")
print(results[1])

Output:

Q1 customers:
   customer_id   name  amount
0            1  Alice     100
1            2    Bob     200

Q2 customers:
   customer_id     name  amount
0            3  Charlie     150
1            4    David     175

Explanation:

  • Pass lists for both targets and references
  • Each target DataFrame is matched against ALL reference DataFrames
  • additory returns a list of DataFrames (one per target)
  • Q1 customers get amounts from both Jan and Feb orders (where they exist)
  • Q2 customers get amounts from both Jan and Feb orders (where they exist)
  • In this data, Q1 customers appear only in the January orders and Q2 customers only in the February orders, so each sum has a single term
  • This pattern is useful for time-series or segmented data analysis

Note: This also works with polars DataFrames.
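The example data above never places one customer in both months, so the aggregation across references is not visible in the output. The pandas-only sketch below uses made-up data (customer 1 ordering in both January and February) and a hypothetical helper, lookup_sum, to illustrate the behavior described: each target is matched against all references combined. It mirrors the description on this page and is not additory's implementation.

```python
import pandas as pd

def lookup_sum(targets, references, key, value):
    """Hypothetical pandas-only stand-in: sum `value` per `key` across all references."""
    stacked = pd.concat(references, ignore_index=True)
    totals = stacked.groupby(key, as_index=False)[value].sum()
    # One result per target, in the same order as the input list
    return [t.merge(totals, on=key, how='left') for t in targets]

customers_q1 = pd.DataFrame({'customer_id': [1, 2], 'name': ['Alice', 'Bob']})
customers_q2 = pd.DataFrame({'customer_id': [3, 4], 'name': ['Charlie', 'David']})

# Illustrative data only: customer 1 appears in both months
orders_jan = pd.DataFrame({'customer_id': [1, 2], 'amount': [100, 200]})
orders_feb = pd.DataFrame({'customer_id': [1, 3, 4], 'amount': [50, 150, 175]})

q1, q2 = lookup_sum([customers_q1, customers_q2], [orders_jan, orders_feb],
                    key='customer_id', value='amount')
print(q1)  # Alice's amount combines her January and February orders
print(q2)
```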


5.5 Pattern Comparison

5.5.1 When to Use Each Pattern

Pattern       Target       Reference    Returns      Use Case
Basic         Single DF    Single DF    Single DF    Simple lookup
One-to-Many   Single DF    List of DFs  Single DF    Aggregate from multiple sources
Many-to-One   List of DFs  Single DF    List of DFs  Apply same reference to multiple targets
Many-to-Many  List of DFs  List of DFs  List of DFs  Complex multi-source matching
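The table reduces to two checks: whether the target is a list and whether the reference is a list. The helper below, describe_pattern, is a hypothetical illustration of that rule and is not part of additory.

```python
import pandas as pd

def describe_pattern(target, reference):
    """Name the pattern implied by the argument shapes (hypothetical helper)."""
    many_targets = isinstance(target, list)
    many_refs = isinstance(reference, list)
    if many_targets and many_refs:
        name = 'Many-to-Many'
    elif many_targets:
        name = 'Many-to-One'
    elif many_refs:
        name = 'One-to-Many'
    else:
        name = 'Basic'
    # Per the table, the return shape follows the target, not the reference
    returns = 'list of DataFrames' if many_targets else 'single DataFrame'
    return name, returns

df = pd.DataFrame({'id': [1]})
print(describe_pattern(df, [df, df]))
print(describe_pattern([df, df], df))
```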

5.5.2 Visual Guide

Basic (1:1)
Target ──> Reference
  ↓
Result

One-to-Many (1:N)
Target ──> [Ref1, Ref2, Ref3]
  ↓
Result

Many-to-One (N:1)
[Target1, Target2] ──> Reference
  ↓
[Result1, Result2]

Many-to-Many (N:N)
[Target1, Target2] ──> [Ref1, Ref2]
  ↓
[Result1, Result2]

5.6 Working with Result Lists

5.6.1 Accessing Individual Results

# df1, df2, df3 and ref_df are placeholder DataFrames that share an 'id' column
# Many-to-one or many-to-many returns a list
results = add.to(
    [df1, df2, df3],
    bring_from=ref_df,
    bring='amount',
    against='id'
)

# Access by index
first_result = results[0]
second_result = results[1]

# Iterate over results
for i, result in enumerate(results):
    print(f"Result {i}: {len(result)} rows")

5.6.2 Combining Results

# If you want to combine all results into one DataFrame
import pandas as pd

combined = pd.concat(results, ignore_index=True)
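A plain concat loses track of which target each row came from. One option, sketched below with stand-in DataFrames in place of real add.to() output, is to tag each result with a label before combining. The segment column name and the labels are assumptions chosen for illustration.

```python
import pandas as pd

# Stand-ins for the list returned by a many-to-one call
results = [
    pd.DataFrame({'customer_id': [1, 2], 'name': ['Alice', 'Bob'], 'amount': [100, 200]}),
    pd.DataFrame({'customer_id': [3, 4], 'name': ['Charlie', 'David'], 'amount': [150, 175]}),
]
labels = ['internal', 'external']

# Add a 'segment' column to each result, then stack them into one DataFrame
combined = pd.concat(
    [frame.assign(segment=label) for frame, label in zip(results, labels)],
    ignore_index=True
)
print(combined)
```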

5.7 Key Takeaways

  • Use lists [df1, df2] for targets or references to enable advanced patterns
  • One-to-Many: Single target + multiple references → Single result
  • Many-to-One: Multiple targets + single reference → List of results
  • Many-to-Many: Multiple targets + multiple references → List of results
  • When you pass a list of targets, you always get a list of results back
  • These patterns are essential for working with segmented or time-series data

5.8 Common Questions

Q: Can I mix pandas and polars DataFrames in the lists?
A: No, all DataFrames in a single operation must be the same type (all pandas or all polars).

Q: What if the reference DataFrames have overlapping data?
A: Use the strategy parameter to control how duplicates are handled (sum, mean, first, etc.). We cover this in detail on the next page.
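To get a feel for how the choice of strategy changes the result, the pandas sketch below applies three aggregations to the order data from Example 1. It assumes the strategy names behave like the pandas aggregations of the same name. That mapping is an assumption made for illustration, not a statement about additory's internals.

```python
import pandas as pd

orders_jan = pd.DataFrame({'customer_id': [1, 1, 2], 'amount': [100, 150, 200]})
orders_feb = pd.DataFrame({'customer_id': [1, 3, 3], 'amount': [175, 125, 225]})

# Stack both months, then compare aggregations side by side per customer
stacked = pd.concat([orders_jan, orders_feb], ignore_index=True)
comparison = stacked.groupby('customer_id')['amount'].agg(['sum', 'mean', 'first'])
print(comparison)
```

For customer 1, who has three orders, the three columns differ; for customer 2, who has one order, they are all equal.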

Q: Is there a limit to how many DataFrames I can include in a list?
A: No hard limit, but performance may degrade with very large lists. Consider combining DataFrames first if you have dozens of them.

Q: Do the DataFrames in a list need to have the same columns?
A: They need to have the key columns (against) and the columns you’re bringing (bring). Other columns can differ.
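A quick pre-check can catch a DataFrame that lacks a required column before the call. The helper below, missing_columns, is hypothetical and uses only pandas; the sample data is made up for illustration.

```python
import pandas as pd

def missing_columns(frames, required):
    """Return {position: [missing column names]} for frames lacking required columns."""
    problems = {}
    for i, df in enumerate(frames):
        missing = [col for col in required if col not in df.columns]
        if missing:
            problems[i] = missing
    return problems

# The extra 'region' column is fine; the absent 'amount' column is not
orders_jan = pd.DataFrame({'customer_id': [1], 'amount': [100], 'region': ['North']})
orders_feb = pd.DataFrame({'customer_id': [3], 'total': [125]})

print(missing_columns([orders_jan, orders_feb], ['customer_id', 'amount']))
```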


5.9 Next Steps


Version: 0.1.3
Last Updated: March 9, 2026