Parsing a custom database
This tutorial shows how to build a custom database in Excel and how to parse it using MARIO.
Parsing from Excel
Start by opening Excel or any equivalent software. Any custom MARIO-readable table (IOT or SUT) must follow these rules:
- It must be in .xlsx format
- It must have two sheets: the first contains the table, the second must be named “units” and contains the units of measure
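The file-level rules can be checked before handing the workbook to MARIO. The sketch below is not part of MARIO: `check_workbook_layout` is a hypothetical helper that takes the file name and the list of sheet names in workbook order (obtainable, for example, from `openpyxl.load_workbook(path).sheetnames`) and returns the problems it finds. It assumes the sheet name must match “units” exactly.

```python
def check_workbook_layout(filename, sheet_names):
    """Return a list of rule violations (empty if the layout looks valid).

    Hypothetical helper, not a MARIO function. `sheet_names` is the list of
    sheet names in workbook order.
    """
    problems = []
    if not filename.lower().endswith(".xlsx"):
        problems.append("file must be in .xlsx format")
    if len(sheet_names) != 2:
        problems.append(f"expected 2 sheets, found {len(sheet_names)}")
    elif sheet_names[1] != "units":
        # Assumption: the name is compared exactly, including case
        problems.append('second sheet must be named "units"')
    return problems


print(check_workbook_layout("custom_SUT.xlsx", ["table", "units"]))  # []
print(check_workbook_layout("custom_SUT.xls", ["units", "table"]))
```

The second call reports two problems: the wrong extension and the misplaced “units” sheet.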
For instance, the following example is for a SUT of 2 regions, 2 commodities and 2 activities.
Table sheet

[Figure: table sheet of the example SUT]
The structure is the same for IOTs and SUTs, with one difference: SUTs must differentiate between activities and commodities, while IOTs only need sectors. You will notice:
- There must be 3 levels of indices on both rows and columns
- The first level is always the name of the region, except for sets that are not defined by region, namely “Factor of production” and “Satellite account”. For these two sets, provide “-” instead
- The second level is always the name of the set (i.e. “Activity”, “Commodity”, “Consumption category”, “Factor of production”, “Satellite account”). For an IOT, provide “Sector” instead of “Activity” and “Commodity”
- The third level is a label giving the name of the item
- There must be no blank cells within the matrices

There are no particular rules for the order of the labels and sets: MARIO always sorts all indices alphabetically before doing any calculation.
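The index rules lend themselves to a quick sanity check. The following is a minimal sketch in plain Python, not a MARIO function: `check_index` and the two set constants are hypothetical names, and the check does not distinguish between IOT and SUT set names or between rows and columns.

```python
REGIONAL_SETS = {"Activity", "Commodity", "Sector", "Consumption category"}
NON_REGIONAL_SETS = {"Factor of production", "Satellite account"}


def check_index(entries):
    """Return a list of problems for a list of (region, set, label) tuples.

    Hypothetical helper, not a MARIO function.
    """
    problems = []
    for entry in entries:
        if len(entry) != 3:
            problems.append(f"{entry}: expected 3 index levels")
            continue
        region, set_name, label = entry
        if set_name in NON_REGIONAL_SETS:
            if region != "-":
                problems.append(f'{entry}: region must be "-" for {set_name}')
        elif set_name in REGIONAL_SETS:
            if region in ("", "-", None):
                problems.append(f"{entry}: a region name is required")
        else:
            problems.append(f"{entry}: unknown set name {set_name!r}")
        if label in ("", None):
            problems.append(f"{entry}: the label must not be blank")
    return problems


rows = [
    ("R1", "Commodity", "Goods"),
    ("R1", "Activity", "Production of Goods"),
    ("-", "Factor of production", "Labor"),
    ("R1", "Satellite account", "CO2"),  # wrong: region should be "-"
]
for problem in check_index(rows):
    print(problem)
```

Only the last entry is reported, because a satellite account is not defined by region.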
Units sheet
Regarding units of measure, this sheet must be named “units”, and the header of the units column (column C of the sheet) must be labelled “unit”, as in the following example:

[Figure: units sheet of the example SUT]
Again, the rules concern the indices: they must be provided for every label, without repeating the same label for multiple regions, since regions are not required in this sheet. MARIO can handle hybrid-unit databases.
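One way to picture the units sheet is as the table index with the region dropped and duplicates removed. The sketch below illustrates that idea in plain Python. It assumes that the first two columns of the sheet hold the set name and the label (the text above only fixes column C as “unit”); `units_rows` and the sample data are hypothetical.

```python
def units_rows(table_index, unit_of):
    """Build (set, label, unit) rows, one per label, ignoring regions.

    Hypothetical helper, not a MARIO function. `unit_of` maps
    (set, label) to a unit; a missing entry raises KeyError.
    """
    rows, seen = [], set()
    for _region, set_name, label in table_index:
        key = (set_name, label)
        if key in seen:
            continue  # same label in another region: list it only once
        seen.add(key)
        rows.append((set_name, label, unit_of[key]))
    return rows


table_index = [
    ("R1", "Commodity", "Goods"),
    ("R2", "Commodity", "Goods"),
    ("R1", "Commodity", "Services"),
    ("R2", "Commodity", "Services"),
    ("-", "Satellite account", "CO2"),
]
unit_of = {
    ("Commodity", "Goods"): "ton",     # hybrid units: physical ...
    ("Commodity", "Services"): "EUR",  # ... and monetary in one table
    ("Satellite account", "CO2"): "ton",
}
for row in units_rows(table_index, unit_of):
    print(row)
```

Five index entries collapse to three unit rows, since “Goods” and “Services” appear once regardless of how many regions the table has.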
Parsing a customized database
Once the customized database is prepared in Excel, provide the path, the type of table (SUT or IOT) and the mode (flows or coefficients), and MARIO will parse it with the “parse_from_excel” function:
import mario  # Import MARIO

path = 'custom_SUT.xlsx'  # Path to the custom Excel file

database = mario.parse_from_excel(
    path=path,
    table='SUT',
    mode='flows',
)
database.X
Region | Level | Item | production |
---|---|---|---|
R1 | Activity | Production of Goods | 1.0 |
R1 | Activity | Production of Services | 0.9 |
R2 | Activity | Production of Goods | 1.0 |
R2 | Activity | Production of Services | 1.2 |
R3 | Activity | Production of Goods | 1.0 |
R3 | Activity | Production of Services | 0.9 |
R1 | Commodity | Goods | 45.0 |
R1 | Commodity | Services | 31.4 |
R2 | Commodity | Goods | 66.0 |
R2 | Commodity | Services | 44.0 |
R3 | Commodity | Goods | 61.0 |
R3 | Commodity | Services | 44.0 |
The same structure applies to an IOT database. To see what the table should look like, you can load the test models and save them to Excel to take a closer look at the structure:
mario.load_test("IOT").to_excel("test_iot.xlsx")
Parsing from pd.DataFrames
You can also build a mario.Database from pd.DataFrames:
from mario import Database
import pandas as pd
import numpy as np

# Creating indices according to the MARIO format
regions = ['reg.1']
Z_levels = ['Sector']
sectors = ['sec.1', 'sec.2']
factors = ['Labor']
satellite = ['CO2']
Y_level = ['Consumption category']
demands = ['Households']

Z_index = pd.MultiIndex.from_product([regions, Z_levels, sectors])
Y_columns = pd.MultiIndex.from_product([regions, Y_level, demands])

# Creating matrices
Z = pd.DataFrame(
    data=np.array([
        [10, 70],
        [50, 10]]),
    index=Z_index,
    columns=Z_index,
)
Y = pd.DataFrame(
    data=np.array([
        [200],
        [80]]),
    index=Z_index,
    columns=Y_columns,
)
E = pd.DataFrame(
    data=np.array([
        [30, 20]]),
    index=satellite,
    columns=Z_index,
)
V = pd.DataFrame(
    data=np.array([
        [220, 60]]),
    index=factors,
    columns=Z_index,
)
EY = pd.DataFrame(
    data=np.array([[8]]),
    index=satellite,
    columns=Y_columns,
)
Z
|  |  |  | reg.1 / Sector / sec.1 | reg.1 / Sector / sec.2 |
|---|---|---|---|---|
| reg.1 | Sector | sec.1 | 10 | 70 |
| reg.1 | Sector | sec.2 | 50 | 10 |
Y
|  |  |  | reg.1 / Consumption category / Households |
|---|---|---|---|
| reg.1 | Sector | sec.1 | 200 |
| reg.1 | Sector | sec.2 | 80 |
You also need to define the units in a separate Python dict, as follows:
# units as a dict of pd.DataFrames
units = {
    'Sector': pd.DataFrame('EUR', index=sectors, columns=['unit']),
    'Satellite account': pd.DataFrame('Ton', index=satellite, columns=['unit']),
    'Factor of production': pd.DataFrame('EUR', index=factors, columns=['unit']),
}
units
{'Sector': unit
sec.1 EUR
sec.2 EUR,
'Satellite account': unit
CO2 Ton,
'Factor of production': unit
Labor EUR}
Now you can create a mario.Database object:
# Creating a mario database
data = Database(
    Z=Z,
    Y=Y,
    E=E,
    V=V,
    EY=EY,
    table='IOT',
    units=units,
    name='iot test',
)
data.z
Region | Level | Item | reg.1 / Sector / sec.1 | reg.1 / Sector / sec.2 |
---|---|---|---|---|
reg.1 | Sector | sec.1 | 0.035714 | 0.500000 |
reg.1 | Sector | sec.2 | 0.178571 | 0.071429 |
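The values of `data.z` can be reproduced by hand. The check below uses plain Python and assumes the usual input-output definitions, which are consistent with the numbers printed by MARIO here: total output `x` is the row sum of `Z` plus final demand `Y`, and each column of `Z` is divided by the output of the buying sector.

```python
Z = [[10, 70],
     [50, 10]]
Y = [200, 80]

# Total output of each sector: intermediate sales plus final demand
x = [sum(Z[i]) + Y[i] for i in range(2)]

# Technical coefficients: z[i][j] = Z[i][j] / x[j]
z = [[Z[i][j] / x[j] for j in range(2)] for i in range(2)]

print(x)  # [280, 140]
for row in z:
    print([round(value, 6) for value in row])
# [0.035714, 0.5]
# [0.178571, 0.071429]
```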
data.p
Database: to calculate p following matrices are need.
['w'].Trying to calculate dependencies.
Region | Level | Item | price index |
---|---|---|---|
reg.1 | Sector | sec.1 | 1.0 |
reg.1 | Sector | sec.2 | 1.0 |
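A price index of 1 for every sector is what the textbook Leontief price model gives for a balanced table. The sketch below reproduces the result in plain Python under the assumption that `p = v (I - z)^-1`, where `v` is value added per unit of output and `(I - z)^-1` is the Leontief inverse (presumably the `w` named in the log message). This is an assumption about the calculation, not a description of MARIO's internals.

```python
Z = [[10, 70],
     [50, 10]]
Y = [200, 80]
V = [220, 60]

x = [sum(Z[i]) + Y[i] for i in range(2)]                    # total output
z = [[Z[i][j] / x[j] for j in range(2)] for i in range(2)]  # technical coefficients
v = [V[j] / x[j] for j in range(2)]                         # value added per unit of output

# Leontief inverse w = (I - z)^-1, written out for the 2x2 case
a, b = 1 - z[0][0], -z[0][1]
c, d = -z[1][0], 1 - z[1][1]
det = a * d - b * c
w = [[d / det, -b / det],
     [-c / det, a / det]]

# Price index: row vector v times w
p = [sum(v[i] * w[i][j] for i in range(2)) for j in range(2)]
print([round(value, 6) for value in p])  # [1.0, 1.0]
```

The result is exactly 1 because the table is balanced: for each sector, intermediate inputs plus value added (10 + 50 + 220 and 70 + 10 + 60) equal total output (280 and 140).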