Packaging a project¶
In this section, you will learn how to build your project documentation, as well as how to bundle your project into a Python package for handover.
Add documentation to your project¶
While Kedro documentation can be found by running kedro docs from the command line, project-specific documentation can be generated by running kedro build-docs in the project’s root directory. This will create documentation based on the code structure of your project. Documentation will also include the docstrings defined in the project code. The resulting HTML files can be found in docs/build/html/.
kedro build-docs uses the Sphinx framework to build your project documentation, so if you want to customise it, please refer to docs/source/conf.py and the corresponding section of the Sphinx documentation.
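As a rough illustration of the kind of customisation conf.py allows, the fragment below sets a few standard Sphinx options. The option names (project, extensions, html_theme) are genuine Sphinx settings, but the values are assumptions chosen for illustration; the conf.py generated for your project may already define some of them differently.

```python
# docs/source/conf.py (fragment): illustrative values only, not the Kedro defaults.

# Project name shown in the generated pages (assumed name).
project = "kedro_spaceflights"

# Sphinx extensions: autodoc pulls docstrings from the code,
# napoleon parses Google- and NumPy-style docstrings.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.napoleon",
]

# Built-in Sphinx HTML theme; swap for any theme installed in your environment.
html_theme = "alabaster"
```

After editing conf.py, rerun kedro build-docs to regenerate the HTML output.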
Package your project¶
You can package your project by running kedro package from the command line. This will create one .egg file and one .whl file within the src/dist/ folder of your project. Both are Python packaging formats for binary distribution. For further information about packaging for Python, documentation is provided here.
After packaging your project, you can move the .egg and .whl files to your execution environment and install them using pip install <path/to/your/wheel/file>. For example, if you name your project kedro-spaceflights and your package kedro_spaceflights, this pip installation will allow you to run your Kedro project with python -m kedro_spaceflights.run. There is also an executable kedro-spaceflights located in the bin directory of your Python installation location.
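The install step can also be scripted. The sketch below is one possible approach, not part of Kedro: the helper names find_wheels and pip_install_command are hypothetical, and the default src/dist location is taken from the text above. It only builds the pip command and leaves running it to the caller.

```python
import sys
from pathlib import Path
from typing import List


def find_wheels(dist_dir: str = "src/dist") -> List[Path]:
    """Return the .whl files found in dist_dir, sorted by file name."""
    return sorted(Path(dist_dir).glob("*.whl"))


def pip_install_command(wheel: Path) -> List[str]:
    """Build (but do not run) the pip command that installs the given wheel."""
    # Using `sys.executable -m pip` targets the interpreter running this script.
    return [sys.executable, "-m", "pip", "install", str(wheel)]
```

The returned list can be passed to subprocess.run(command, check=True) in the execution environment.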
Please note that this packaging method only contains the Python source code of your Kedro pipeline, not any of the conf/, data/ and logs/ directories. To successfully run the packaged project, you still need to be inside a directory that contains these sub-directories. This allows you to distribute the same source code but run it with different configuration, data and logging locations in different environments.
Note: The data/ folder is optional if your pipeline(s) don’t load or save any local data.
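Because the package does not ship these directories, it can help to check for them before launching a run. The following is a minimal sketch under that assumption; missing_project_dirs and REQUIRED_DIRS are hypothetical names, not part of Kedro.

```python
from pathlib import Path
from typing import List

# Directories that are not shipped in the package but are needed at run time.
# data/ is left out because it is optional when no local data is loaded or saved.
REQUIRED_DIRS = ("conf", "logs")


def missing_project_dirs(run_dir: str = ".") -> List[str]:
    """Return the names of required sub-directories absent from run_dir."""
    root = Path(run_dir)
    return [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
```

An empty list means the directory looks suitable for running python -m kedro_spaceflights.run; otherwise the returned names tell you what to create or copy over.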
Manage project dependencies¶
Accounting for all Python package versions that your project relies on makes your Kedro project easier to reproduce. Use the kedro build-reqs CLI command to pin package versions. It works by taking a requirements.in file (or requirements.txt if the first one does not exist), resolving all package versions using pip-compile and freezing them by putting pinned versions back into requirements.txt. This significantly reduces the chance of dependency issues caused by downstream changes, because you always install the same package versions using kedro install.
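To make "pinned" concrete, the sketch below scans requirement lines and reports any that lack an exact == version. It is a simplified illustration, not what kedro build-reqs does internally: unpinned_requirements is a hypothetical helper, and it does not handle every requirement syntax (for example, URL requirements containing a # fragment).

```python
from typing import Iterable, List


def unpinned_requirements(lines: Iterable[str]) -> List[str]:
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw in lines:
        # Drop inline comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned
```

For instance, passing the lines of a requirements.txt file (open("requirements.txt").read().splitlines()) returns the entries that would still float to newer versions on a fresh install.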
Extend your project¶
- You can also check out Kedro-Docker, an officially supported Kedro plugin for packaging and shipping Kedro projects within Docker containers.
- We also support converting your Kedro project into an Airflow project with the Kedro-Airflow plugin.
What is next?¶
You have now successfully built a project along with its documentation and packaged it using one of the standard Python distribution formats. You may choose to open-source your project and make it available to a wider community of users and contributors. For further steps, we advise you to consult this GitHub guide, PyPI help, a Read the Docs tutorial, and a guide on Open Source Licenses & Standards.