Kedro plugins¶
Note: This documentation is based on Kedro 0.16.5. If you spot anything that is incorrect, please create an issue or pull request.
Kedro plugins allow you to create new features for Kedro and inject additional commands into the CLI. Plugins are developed as separate Python packages that exist outside of any Kedro project.
Overview¶
Kedro uses setuptools, which is a collection of enhancements to the Python distutils to allow developers to build and distribute Python packages. Kedro uses various entry points in pkg_resources to provide plugin functionality.
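As a rough illustration of the entry-point mechanism, the sketch below lists whatever is registered under a given entry-point group. It uses the standard library's importlib.metadata rather than pkg_resources, which is a substitution made for this example and not necessarily how Kedro performs the lookup. The helper name find_plugins is hypothetical.

```python
from importlib.metadata import entry_points


def find_plugins(group):
    """Return {entry point name: "module:attribute"} for one entry-point group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions expose a select() method.
        selected = eps.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


# Prints an empty dict when no installed package declares this group.
print(find_plugins("kedro.project_commands"))
```

With the example plugin from the next section installed, the result would be expected to contain an entry named kedrojson.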
Example of a simple plugin¶
Here is a simple example of a plugin that prints the pipeline as JSON:
kedrojson/plugin.py
import click
from kedro.framework.cli import get_project_context


@click.group(name="JSON")
def commands():
    """ Kedro plugin for printing the pipeline in JSON format """
    pass


@commands.command()
def to_json():
    """ Display the pipeline in JSON format """
    context = get_project_context()
    print(context.pipeline.to_json())
The plugin provides the following entry_points config in setup.py:
entry_points={
    "kedro.project_commands": ["kedrojson = kedrojson.plugin:commands"],
}
Once the plugin is installed, you can run it as follows:
kedro to_json
Working with click¶
Commands must be provided as click Groups.
The click Group will be merged into the main CLI Group. In the process, the options on the group are lost, as is any processing that was done as part of its callback function.
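The effect of merging can be reproduced with plain click. In this sketch, click.CommandCollection stands in for Kedro's main CLI group; this is an assumption made for illustration, and Kedro's own merging code may differ. The --pretty option and the hello command are hypothetical.

```python
import click
from click.testing import CliRunner

calls = []  # records whether the plugin group's callback ever runs


@click.group(name="JSON")
@click.option("--pretty", is_flag=True)
def commands(pretty):
    """Plugin group: its option and callback are not carried over by the merge."""
    calls.append("group callback")


@commands.command(name="hello")
def hello():
    click.echo("hello from plugin")


# Stand-in for the main CLI group (assumption, for illustration only).
cli = click.CommandCollection(name="kedro", sources=[commands])

runner = CliRunner()
ok = runner.invoke(cli, ["hello"])                   # the command itself still works
rejected = runner.invoke(cli, ["--pretty", "hello"])  # the group option is gone
print(ok.output, calls, rejected.exit_code)
```

Because the group callback never runs, any setup a plugin needs is better placed inside the individual commands than on the group.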
ProjectContext¶
While running, plugins may request information about the current project by calling kedro.framework.cli.get_project_context().
This function provides access to the verbose flag via the key verbose and to anything returned by the project’s KedroContext. The returned instance of the ProjectContext(KedroContext) class must contain at least the following properties and methods:
- project_version: the version of Kedro the project was created with, or None if the project was not created with kedro new.
- project_path: the path to the directory where either .kedro.yml or pyproject.toml is located.
- config_loader: an instance of kedro.config.ConfigLoader.
- catalog: an instance of kedro.io.DataCatalog.
- pipeline: an instance of kedro.pipeline.Pipeline.
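A plugin could read these properties as in the sketch below. The describe_project helper is hypothetical, and a SimpleNamespace stands in for the real context so that the snippet runs without a Kedro project. In a plugin, the object would come from kedro.framework.cli.get_project_context().

```python
from types import SimpleNamespace


def describe_project(context):
    """Summarise a context using only project_path and project_version."""
    # project_version is None when the project was not created with `kedro new`.
    version = context.project_version or "an unknown version"
    return f"Project at {context.project_path}, created with Kedro {version}"


# Stand-in for illustration only; not a real ProjectContext.
fake_context = SimpleNamespace(project_path="/path/to/project", project_version=None)
print(describe_project(fake_context))
```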
Plugins may require additional keys to be added to ProjectContext in run.py.
Note: kedro.framework.cli.get_project_context(key), where key is get_config, create_catalog, create_pipeline, template_version, project_name or project_path, is deprecated as of Kedro 0.15.0, and will be removed in a future version.
Initialisation¶
If plugin initialisation needs to occur prior to Kedro starting, the plugin can declare the entry_point key kedro.init. This entry point must refer to a function that currently takes no arguments, but for future-proofing you should declare it with **kwargs.
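A minimal sketch of such a function follows. The function name init, the kedrojson package name in the comment and the log message are assumptions for illustration; only the kedro.init key and the **kwargs signature come from the text above.

```python
import logging

logger = logging.getLogger(__name__)


def init(**kwargs):
    """Hypothetical initialisation function for a plugin.

    Takes no required arguments today; **kwargs absorbs anything that a
    future version might pass in.
    """
    logger.debug("plugin initialised, ignoring arguments: %s", sorted(kwargs))


# Assumed matching entry in setup.py:
# entry_points={"kedro.init": ["kedrojson = kedrojson.plugin:init"]}
```

Declared this way, the function can be called both with no arguments and with keyword arguments it does not know about.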
global and project commands¶
Plugins may also add commands to the Kedro CLI, which supports two types of commands:
- global - available both inside and outside a Kedro project. Global commands use the entry_point key kedro.global_commands.
- project - available only when a Kedro project is detected in the current directory. Project commands use the entry_point key kedro.project_commands.
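A plugin that offers both kinds of command could declare two groups, one per entry point. The excerpt below is a sketch of the entry_points value only. The kedrojson package name follows the earlier example, while the attribute names global_commands and project_commands are hypothetical.

```python
# Value intended for the entry_points argument of setup() in setup.py.
# "global_commands" and "project_commands" are hypothetical click groups
# assumed to be defined in kedrojson/plugin.py.
entry_points = {
    "kedro.global_commands": ["kedrojson = kedrojson.plugin:global_commands"],
    "kedro.project_commands": ["kedrojson = kedrojson.plugin:project_commands"],
}
```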
Suggested command convention¶
We use the following command convention: kedro <plugin-name> <command>, with kedro <plugin-name> acting as a top-level command group. This is our suggested way of structuring your plugin, but it is not necessary for your plugin to work.
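One way to follow this convention is to nest a second click group, named after the plugin, inside the group exposed through the entry point. The sketch below assumes a plugin called json with a single show command; both names are hypothetical. click.CommandCollection is used as a stand-in for the main Kedro CLI, which is an assumption made for illustration.

```python
import click
from click.testing import CliRunner


@click.group(name="JSON")
def commands():
    """Outer group: only its commands survive the merge into the main CLI."""


@commands.group(name="json")
def json_group():
    """Top-level command group, reachable as `kedro json`."""


@json_group.command(name="show")
def show():
    """Reachable as `kedro json show`."""
    click.echo("showing pipeline as JSON")


# Stand-in for the main Kedro CLI (assumption, for illustration only).
cli = click.CommandCollection(name="kedro", sources=[commands])
result = CliRunner().invoke(cli, ["json", "show"])
print(result.output)
```

The outer group exists only to be merged, so its name does not appear on the command line; the inner group provides the <plugin-name> part.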
Hooks¶
You can develop hook implementations and have them automatically registered to the ProjectContext when the plugin is installed. To enable this for your custom plugin, simply add the following entry in your setup.py:
setup(
    ...
    entry_points={"kedro.hooks": ["plugin_name = plugin_name.plugin:hooks"]},
)
where plugin.py is the module where you declare hook implementations:
import logging

from kedro.framework.hooks import hook_impl


class MyHooks:
    @hook_impl
    def after_catalog_created(self, catalog):  # pylint: disable=unused-argument
        logging.info("Reached after_catalog_created hook")


hooks = MyHooks()
Note: Here, hooks should be an instance of the class defining the hooks.
Contributing process¶
When you are ready to submit your code:
- Create a separate repository using our naming convention for plugins (kedro-<plugin-name>)
- Choose a command approach: global and / or project commands:
  - All global commands should be provided as a single click group
  - All project commands should be provided as another click group
  - The click groups are declared through the pkg_resources entry_point system
- Include a README.md describing your plugin’s functionality and all dependencies that should be included
- Use GitHub tagging to tag your plugin as a kedro-plugin so that we can find it
Note: In future, we will feature a list of “Plugins by Contributors”. Your plugin needs to have an Apache 2.0 compatible license to be considered for this list.
Supported Kedro plugins¶
- Kedro-Docker, a tool for packaging and shipping Kedro projects within containers
- Kedro-Airflow, a tool for converting your Kedro project into an Airflow project
- Kedro-Viz, a tool for visualising your Kedro pipelines
Community-developed plugins¶
See the full list of plugins under the GitHub tag kedro-plugin.
- Kedro-Pandas-Profiling by Justin Malloy, a simple plugin that uses Pandas-Profiling to profile datasets in the Kedro catalog
- find-kedro by Waylon Walker, automatically constructs pipelines using pytest-style pattern matching
- kedro-static-viz by Waylon Walker, generates a static Kedro-Viz site (HTML, CSS, JS)