Guidelines for contributing developers¶
This page explains the principles and development process that we ask contributing developers to follow.
Any contributions you make will be under the Apache 2.0 Software License
In short, when you submit code changes, your submissions are understood to be under the same Apache 2.0 License that covers the Kedro project. You should have permission to share the submitted code.
Note
You don’t need to contribute code to help the Kedro project. See our list of other ways you can contribute to Kedro.
Introduction¶
This guide is a practical description of:
How to set up your development environment to contribute to Kedro.
How to prepare a pull request against the Kedro repository.
Before you start: development set up¶
To work on the Kedro codebase, you will need to be set up with Git and Make.
Note
If your development environment is Windows, you can use the win_setup_conda and win_setup_env commands from the CircleCI configuration to guide you in the correct way to do this.
You will also need to create and activate a virtual environment. If this is unfamiliar to you, read through our prerequisites documentation.
Next, you’ll need to fork the Kedro source code from the GitHub repository:
Fork the project by clicking Fork in the top-right corner of the Kedro GitHub repository
Choose your target account
If you need further guidance, consult the GitHub documentation about forking a repo.
You are almost ready to go. In your terminal, navigate to the folder into which you cloned your fork of the Kedro code.
Run these commands to install everything you need to work with Kedro:
make install-test-requirements
make install-pre-commit
Once the above commands have executed successfully, do a sanity check to ensure that kedro works in your environment:
make test
Note
If the tests in tests/extras/datasets/spark are failing, and you are not planning to work on Spark-related features, then you can run a reduced test suite that excludes them. Do this by executing make test-no-spark.
Get started: areas of contribution¶
Once you are ready to contribute, a good place to start is to take a look at the good first issues and help wanted issues on GitHub.
We focus on three areas for contribution: core, extras and plugin:
core refers to the primary Kedro library. Read the core contribution process for details.
extras refers to features that could be added to core that do not introduce too many dependencies or require new Kedro CLI commands to be created, e.g. adding a new dataset to the kedro.extras.datasets data management module. All the datasets are placed under kedro.extras.datasets to separate heavy dependencies (e.g. Pandas) from Kedro core components. Read the extras contribution process for more information.
plugin refers to new functionality that requires a Kedro CLI command, e.g. adding in Airflow functionality. The plugin development documentation contains guidance on how to design and develop a Kedro plugin.
core contribution process¶
Typically, we only accept small contributions to the core Kedro library, but we accept new features as plugins or additions to the extras module.
To contribute:
Create a feature branch on your forked repository and push all your local changes to that feature branch.
Is your change non-breaking and backwards-compatible? Your feature branch should branch off from:
- main if you intend for it to be a non-breaking, backwards-compatible change.
- develop if you intend for it to be a breaking change.
Before you submit a pull request (PR), please ensure that unit tests, end-to-end (E2E) tests and linting are passing for your changes by running make test, make e2e-tests and make lint locally. Have a look at the section Running checks locally below.
Open a PR:
- For backwards compatible changes, open a PR against the quantumblacklabs:main branch from your feature branch.
- For changes that are NOT backwards compatible, open a PR against the quantumblacklabs:develop branch from your feature branch.
Await reviewer comments.
Update the PR according to the reviewer’s comments.
Your PR will be merged by the Kedro team once all the comments are addressed.
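The branching rule in the steps above can be sketched as a small helper. This is only an illustration, not part of the Kedro tooling: the function names are hypothetical, and the remote name upstream is an assumption about how you have configured your clone to point at the Kedro repository.

```python
from typing import List


def base_branch(breaking: bool) -> str:
    """Return the branch a change starts from and is merged back into."""
    # Breaking changes go to develop; everything else goes to main.
    return "develop" if breaking else "main"


def checkout_command(
    feature_branch: str, breaking: bool, remote: str = "upstream"
) -> List[str]:
    """Build (but do not run) the git command that creates the feature branch.

    The remote name is an assumption; use whatever name your clone gives
    to the Kedro repository.
    """
    start_point = f"{remote}/{base_branch(breaking)}"
    return ["git", "checkout", "-b", feature_branch, start_point]


def pr_target(breaking: bool) -> str:
    """Return the branch to open the pull request against."""
    return f"quantumblacklabs:{base_branch(breaking)}"
```

For example, checkout_command("feature/my-change", breaking=False) yields the arguments for git checkout -b feature/my-change upstream/main, and pr_target(False) gives quantumblacklabs:main.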
Note
We will work with you to complete your contribution but we reserve the right to take over abandoned PRs.
extras contribution process¶
You can add new work to extras if you do not need to create a new Kedro CLI command:
Create an issue describing your contribution.
Work in extras and create a feature branch on your forked repository and push all your local changes to that feature branch.
Before you submit a pull request, please ensure that unit tests, E2E tests and linting are passing for your changes by running make test, make e2e-tests and make lint locally. Have a look at the section Running checks locally below.
Include a README.md with instructions on how to use your contribution.
Is your change non-breaking and backwards-compatible?
- For backwards compatible changes, open a PR against the quantumblacklabs:main branch from your feature branch.
- For changes that are NOT backwards compatible, open a PR against the quantumblacklabs:develop branch from your feature branch.
Reference your issue in the PR description (e.g., Resolves #<issue-number>).
Await review comments, then update the PR according to the reviewer’s comments.
Your PR will be merged by the Kedro team once all the comments are addressed.
Note
We will work with you to complete your contribution but we reserve the right to take over abandoned PRs.
Note
There are two special considerations when contributing a dataset:
Add the dataset to kedro.extras.datasets.rst so it shows up in the API documentation.
Add the dataset to static/jsonschema/kedro-catalog-X.json for IDE validation.
Create a pull request¶
Create your pull request with a descriptive title. Before you submit it, consider the following:
You should aim for cross-platform compatibility on Windows, macOS and Linux
We use SemVer for versioning
We have designed our code to be compatible with Python 3.6 onwards and our style guidelines are (in cascading order):
PEP 8 conventions for all Python code
Google docstrings for code comments
PEP 484 type hints for all user-facing functions / class methods e.g.
from typing import Any, List

def count_truthy(elements: List[Any]) -> int:
    return sum(1 for elem in elements if elem)
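As a minimal sketch of how the three style guidelines combine, the count_truthy example could carry a Google-style docstring alongside its type hints. The docstring wording here is illustrative and not taken from the Kedro codebase.

```python
from typing import Any, List


def count_truthy(elements: List[Any]) -> int:
    """Count the truthy items in a list.

    Args:
        elements: Items to inspect; any Python object is allowed.

    Returns:
        The number of items that evaluate to ``True`` in a boolean context.
    """
    # Each item is tested for truthiness, so 0, "", None and empty
    # containers are not counted.
    return sum(1 for elem in elements if elem)
```

The Args and Returns sections follow the Google docstring layout, while the signature keeps the PEP 484 annotations.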
Ensure that your PR builds cleanly before you submit it, by running the CI/CD checks locally, as follows:
To run E2E tests you need to install the test requirements, which include behave.
We also use pre-commit hooks for the repository to run the checks automatically.
Note
If Spark/PySpark/Hive tests for datasets are failing, it might be because Spark does not support Java versions newer than 8. You can try using export JAVA_HOME=$(/usr/libexec/java_home -v 1.8), which works under macOS, or other workarounds.
PEP-8 Standards (pylint and flake8)¶
make lint
Unit tests, 100% coverage (pytest, pytest-cov)¶
You need the dependencies from test_requirements.txt installed.
make test
Note
We place conftest.py files in some test directories to make fixtures reusable by any tests in that directory. If you need to see which test fixtures are available and where they come from, you can issue the following command: pytest --fixtures path/to/the/test/location.py.
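To illustrate how a conftest.py fixture is shared, here is a minimal sketch. The fixture name, helper function and sample data are hypothetical and do not come from the Kedro test suite. The fixture and the test are shown in one block for brevity; in practice the fixture would sit in conftest.py and the test in a separate test file in the same directory.

```python
import pytest


def make_sample_rows():
    """Build the example data handed to tests (hypothetical data)."""
    return [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]


@pytest.fixture
def sample_rows():
    """Fixture that would live in conftest.py.

    pytest injects it into any test in the directory that names it
    as an argument; no import is needed in the test file.
    """
    return make_sample_rows()


def test_row_count(sample_rows):
    # pytest matches the argument name to the fixture above.
    assert len(sample_rows) == 2
```

Running pytest --fixtures against the directory would then list sample_rows together with its docstring.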
E2E tests (behave)¶
behave
Others¶
Our CI / CD also checks that kedro installs cleanly on a fresh Python virtual environment, a task which depends on successfully building the documentation:
make build-docs
Note
This command will only work on Unix-like systems and requires pandoc to be installed.
Hints on pre-commit usage¶
The checks will automatically run on all the changed files on each commit.
An even more extensive set of checks (including the heavy set of pylint checks) will run before the push.
The pre-commit/pre-push checks can be omitted by running with the --no-verify flag, as shown below:
git commit --no-verify <...>
git push --no-verify <...>
(The -n alias works for git commit, but not for git push.)
All checks will run during the CI build, so skipping checks on push will not allow you to merge code with failing checks.
You can uninstall the pre-commit hooks by running:
make uninstall-pre-commit
pre-commit will still be used by make lint, but will not install the git hooks.
Need help?¶
Working on your first pull request? You can learn how from these resources:
Please check the Q&A on GitHub discussions and ask any new questions about the development process there too!