Directory Structure:

└── nebari
    ├── .github
    │   ├── actions
    │   │   ├── health-check
    │   │   │   └── action.yml
    │   │   ├── init-local
    │   │   │   └── action.yml
    │   │   ├── publish-from-template
    │   │   │   ├── action.yml
    │   │   │   └── render_template.py
    │   │   └── setup-local
    │   │       └── action.yml
    │   ├── failed-workflow-issue-templates
    │   │   └── test-provider.md
    │   ├── ISSUE_TEMPLATE
    │   │   ├── bug-report.yml
    │   │   ├── config.yml
    │   │   ├── feature-request.yml
    │   │   ├── general-issue.yml
    │   │   ├── release-checklist.md
    │   │   └── testing-checklist.md
    │   ├── workflows
    │   │   ├── generate_cli_doc.yml
    │   │   ├── release-notes-sync.yaml
    │   │   ├── release.yaml
    │   │   ├── run-precommit.yaml
    │   │   ├── test_aws_integration.yaml
    │   │   ├── test_azure_integration.yaml
    │   │   ├── test_conda_build.yaml
    │   │   ├── test_gcp_integration.yaml
    │   │   ├── test_helm_charts.yaml
    │   │   ├── test_local_integration.yaml
    │   │   ├── test_local_upgrade.yaml
    │   │   ├── test-provider.yaml
    │   │   ├── test.yaml
    │   │   ├── trivy.yml
    │   │   └── typing.yaml
    │   ├── PULL_REQUEST_TEMPLATE.md
    │   └── release-notes-sync-config.yaml
    ├── scripts
    │   ├── aws-force-destroy.py
    │   ├── helm-validate.py
    │   └── keycloak-export.py
    ├── src
    │   ├── _nebari
    │   │   ├── provider
    │   │   │   ├── cicd
    │   │   │   │   ├── __init__.py
    │   │   │   │   ├── common.py
    │   │   │   │   ├── github.py
    │   │   │   │   ├── gitlab.py
    │   │   │   │   └── linter.py
    │   │   │   ├── cloud
    │   │   │   │   ├── __init__.py
    │   │   │   │   ├── amazon_web_services.py
    │   │   │   │   ├── azure_cloud.py
    │   │   │   │   ├── commons.py
    │   │   │   │   └── google_cloud.py
    │   │   │   ├── dns
    │   │   │   │   ├── __init__.py
    │   │   │   │   └── cloudflare.py
    │   │   │   ├── oauth
    │   │   │   │   ├── __init__.py
    │   │   │   │   └── auth0.py
    │   │   │   ├── __init__.py
    │   │   │   ├── git.py
    │   │   │   ├── helm.py
    │   │   │   ├── kubernetes.py
    │   │   │   ├── kustomize.py
    │   │   │   └── opentofu.py
    │   │   ├── stages
    │   │   │   ├── bootstrap
    │   │   │   │   └── __init__.py
    │   │   │   ├── infrastructure
    │   │   │   │   ├── template
    │   │   │   │   │   ├── aws
    │   │   │   │   │   │   ├── modules
    │   │   │   │   │   │   │   ├── accounting
    │   │   │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   │   │   └── variables.tf
    │   │   │   │   │   │   │   ├── efs
    │   │   │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   │   │   ├── outputs.tf
    │   │   │   │   │   │   │   │   └── variables.tf
    │   │   │   │   │   │   │   ├── kubernetes
    │   │   │   │   │   │   │   │   ├── files
    │   │   │   │   │   │   │   │   │   └── user_data.tftpl
    │   │   │   │   │   │   │   │   ├── autoscaling.tf
    │   │   │   │   │   │   │   │   ├── locals.tf
    │   │   │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   │   │   ├── outputs.tf
    │   │   │   │   │   │   │   │   ├── policy.tf
    │   │   │   │   │   │   │   │   └── variables.tf
    │   │   │   │   │   │   │   ├── network
    │   │   │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   │   │   ├── outputs.tf
    │   │   │   │   │   │   │   │   └── variables.tf
    │   │   │   │   │   │   │   └── registry
    │   │   │   │   │   │   │       ├── main.tf
    │   │   │   │   │   │   │       ├── outputs.tf
    │   │   │   │   │   │   │       └── variables.tf
    │   │   │   │   │   │   ├── locals.tf
    │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   ├── outputs.tf
    │   │   │   │   │   │   ├── variables.tf
    │   │   │   │   │   │   └── versions.tf
    │   │   │   │   │   ├── azure
    │   │   │   │   │   │   ├── modules
    │   │   │   │   │   │   │   ├── kubernetes
    │   │   │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   │   │   ├── outputs.tf
    │   │   │   │   │   │   │   │   └── variables.tf
    │   │   │   │   │   │   │   └── registry
    │   │   │   │   │   │   │       ├── main.tf
    │   │   │   │   │   │   │       └── variables.tf
    │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   ├── outputs.tf
    │   │   │   │   │   │   ├── providers.tf
    │   │   │   │   │   │   ├── variables.tf
    │   │   │   │   │   │   └── versions.tf
    │   │   │   │   │   ├── existing
    │   │   │   │   │   │   └── main.tf
    │   │   │   │   │   ├── gcp
    │   │   │   │   │   │   ├── modules
    │   │   │   │   │   │   │   ├── kubernetes
    │   │   │   │   │   │   │   │   ├── templates
    │   │   │   │   │   │   │   │   │   └── kubeconfig.yaml
    │   │   │   │   │   │   │   │   ├── locals.tf
    │   │   │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   │   │   ├── outputs.tf
    │   │   │   │   │   │   │   │   ├── service_account.tf
    │   │   │   │   │   │   │   │   └── variables.tf
    │   │   │   │   │   │   │   ├── network
    │   │   │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   │   │   └── variables.tf
    │   │   │   │   │   │   │   └── registry
    │   │   │   │   │   │   │       ├── main.tf
    │   │   │   │   │   │   │       └── variables.tf
    │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   ├── outputs.tf
    │   │   │   │   │   │   ├── variables.tf
    │   │   │   │   │   │   └── versions.tf
    │   │   │   │   │   └── local
    │   │   │   │   │       ├── main.tf
    │   │   │   │   │       ├── metallb.yaml
    │   │   │   │   │       ├── outputs.tf
    │   │   │   │   │       └── variables.tf
    │   │   │   │   └── __init__.py
    │   │   │   ├── kubernetes_ingress
    │   │   │   │   ├── template
    │   │   │   │   │   ├── modules
    │   │   │   │   │   │   └── kubernetes
    │   │   │   │   │   │       └── ingress
    │   │   │   │   │   │           ├── main.tf
    │   │   │   │   │   │           ├── outputs.tf
    │   │   │   │   │   │           └── variables.tf
    │   │   │   │   │   ├── locals.tf
    │   │   │   │   │   ├── main.tf
    │   │   │   │   │   ├── outputs.tf
    │   │   │   │   │   ├── variables.tf
    │   │   │   │   │   └── versions.tf
    │   │   │   │   └── __init__.py
    │   │   │   ├── kubernetes_initialize
    │   │   │   │   ├── template
    │   │   │   │   │   ├── modules
    │   │   │   │   │   │   ├── cluster-autoscaler
    │   │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   │   └── variables.tf
    │   │   │   │   │   │   ├── extcr
    │   │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   │   └── variables.tf
    │   │   │   │   │   │   ├── initialization
    │   │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   │   └── variables.tf
    │   │   │   │   │   │   ├── nvidia-installer
    │   │   │   │   │   │   │   ├── aws-nvidia-installer.tf
    │   │   │   │   │   │   │   ├── gcp-nvidia-installer.tf
    │   │   │   │   │   │   │   └── variables.tf
    │   │   │   │   │   │   └── traefik_crds
    │   │   │   │   │   │       └── main.tf
    │   │   │   │   │   ├── external-container-registry.tf
    │   │   │   │   │   ├── locals.tf
    │   │   │   │   │   ├── main.tf
    │   │   │   │   │   ├── variables.tf
    │   │   │   │   │   └── versions.tf
    │   │   │   │   └── __init__.py
    │   │   │   ├── kubernetes_keycloak
    │   │   │   │   ├── template
    │   │   │   │   │   ├── modules
    │   │   │   │   │   │   └── kubernetes
    │   │   │   │   │   │       └── keycloak-helm
    │   │   │   │   │   │           ├── main.tf
    │   │   │   │   │   │           ├── outputs.tf
    │   │   │   │   │   │           ├── values.yaml
    │   │   │   │   │   │           └── variables.tf
    │   │   │   │   │   ├── main.tf
    │   │   │   │   │   ├── outputs.tf
    │   │   │   │   │   ├── variables.tf
    │   │   │   │   │   └── versions.tf
    │   │   │   │   └── __init__.py
    │   │   │   ├── kubernetes_keycloak_configuration
    │   │   │   │   ├── template
    │   │   │   │   │   ├── main.tf
    │   │   │   │   │   ├── outputs.tf
    │   │   │   │   │   ├── permissions.tf
    │   │   │   │   │   ├── providers.tf
    │   │   │   │   │   ├── social_auth.tf
    │   │   │   │   │   ├── variables.tf
    │   │   │   │   │   └── versions.tf
    │   │   │   │   └── __init__.py
    │   │   │   ├── kubernetes_kuberhealthy
    │   │   │   │   ├── template
    │   │   │   │   │   └── values.yaml
    │   │   │   │   └── __init__.py
    │   │   │   ├── kubernetes_kuberhealthy_healthchecks
    │   │   │   │   ├── template
    │   │   │   │   │   └── base
    │   │   │   │   │       ├── conda-store-healthcheck.yaml
    │   │   │   │   │       ├── jupyterhub-healthcheck.yaml
    │   │   │   │   │       └── keycloak-healthcheck.yaml
    │   │   │   │   └── __init__.py
    │   │   │   ├── kubernetes_services
    │   │   │   │   ├── template
    │   │   │   │   │   ├── modules
    │   │   │   │   │   │   └── kubernetes
    │   │   │   │   │   │       ├── cephfs-mount
    │   │   │   │   │   │       │   ├── main.tf
    │   │   │   │   │   │       │   ├── outputs.tf
    │   │   │   │   │   │       │   └── variables.tf
    │   │   │   │   │   │       ├── forwardauth
    │   │   │   │   │   │       │   ├── main.tf
    │   │   │   │   │   │       │   ├── outputs.tf
    │   │   │   │   │   │       │   └── variables.tf
    │   │   │   │   │   │       ├── nfs-mount
    │   │   │   │   │   │       │   ├── main.tf
    │   │   │   │   │   │       │   ├── outputs.tf
    │   │   │   │   │   │       │   └── variables.tf
    │   │   │   │   │   │       ├── nfs-server
    │   │   │   │   │   │       │   ├── main.tf
    │   │   │   │   │   │       │   ├── output.tf
    │   │   │   │   │   │       │   └── variables.tf
    │   │   │   │   │   │       └── services
    │   │   │   │   │   │           ├── argo-workflows
    │   │   │   │   │   │           │   ├── main.tf
    │   │   │   │   │   │           │   ├── values.yaml
    │   │   │   │   │   │           │   ├── variables.tf
    │   │   │   │   │   │           │   └── versions.tf
    │   │   │   │   │   │           ├── conda-store
    │   │   │   │   │   │           │   ├── config
    │   │   │   │   │   │           │   │   └── conda_store_config.py
    │   │   │   │   │   │           │   ├── output.tf
    │   │   │   │   │   │           │   ├── server.tf
    │   │   │   │   │   │           │   ├── shared-pvc.tf
    │   │   │   │   │   │           │   ├── storage.tf
    │   │   │   │   │   │           │   ├── variables.tf
    │   │   │   │   │   │           │   └── worker.tf
    │   │   │   │   │   │           ├── dask-gateway
    │   │   │   │   │   │           │   ├── files
    │   │   │   │   │   │           │   │   ├── controller_config.py
    │   │   │   │   │   │           │   │   └── gateway_config.py
    │   │   │   │   │   │           │   ├── controller.tf
    │   │   │   │   │   │           │   ├── crds.tf
    │   │   │   │   │   │           │   ├── gateway.tf
    │   │   │   │   │   │           │   ├── main.tf
    │   │   │   │   │   │           │   ├── middleware.tf
    │   │   │   │   │   │           │   ├── outputs.tf
    │   │   │   │   │   │           │   └── variables.tf
    │   │   │   │   │   │           ├── jupyterhub
    │   │   │   │   │   │           │   ├── files
    │   │   │   │   │   │           │   │   ├── ipython
    │   │   │   │   │   │           │   │   │   └── ipython_config.py
    │   │   │   │   │   │           │   │   ├── jupyter
    │   │   │   │   │   │           │   │   │   ├── jupyter_jupyterlab_pioneer_config.py.tpl
    │   │   │   │   │   │           │   │   │   └── jupyter_server_config.py.tpl
    │   │   │   │   │   │           │   │   ├── jupyterhub
    │   │   │   │   │   │           │   │   │   ├── 01-theme.py
    │   │   │   │   │   │           │   │   │   ├── 02-spawner.py
    │   │   │   │   │   │           │   │   │   ├── 03-profiles.py
    │   │   │   │   │   │           │   │   │   └── 04-auth.py
    │   │   │   │   │   │           │   │   └── jupyterlab
    │   │   │   │   │   │           │   │       └── overrides.json
    │   │   │   │   │   │           │   ├── configmaps.tf
    │   │   │   │   │   │           │   ├── main.tf
    │   │   │   │   │   │           │   ├── middleware.tf
    │   │   │   │   │   │           │   ├── outputs.tf
    │   │   │   │   │   │           │   ├── values.yaml
    │   │   │   │   │   │           │   └── variables.tf
    │   │   │   │   │   │           ├── jupyterhub-ssh
    │   │   │   │   │   │           │   ├── main.tf
    │   │   │   │   │   │           │   ├── sftp.tf
    │   │   │   │   │   │           │   ├── ssh.tf
    │   │   │   │   │   │           │   └── variables.tf
    │   │   │   │   │   │           ├── keycloak-client
    │   │   │   │   │   │           │   ├── main.tf
    │   │   │   │   │   │           │   ├── outputs.tf
    │   │   │   │   │   │           │   ├── variables.tf
    │   │   │   │   │   │           │   └── versions.tf
    │   │   │   │   │   │           ├── minio
    │   │   │   │   │   │           │   ├── ingress.tf
    │   │   │   │   │   │           │   ├── main.tf
    │   │   │   │   │   │           │   ├── outputs.tf
    │   │   │   │   │   │           │   ├── values.yaml
    │   │   │   │   │   │           │   └── variables.tf
    │   │   │   │   │   │           ├── monitoring
    │   │   │   │   │   │           │   ├── dashboards
    │   │   │   │   │   │           │   │   └── Main
    │   │   │   │   │   │           │   │       ├── cluster_information.json
    │   │   │   │   │   │           │   │       ├── conda_store.json
    │   │   │   │   │   │           │   │       ├── jupyterhub_dashboard.json
    │   │   │   │   │   │           │   │       ├── keycloak.json
    │   │   │   │   │   │           │   │       ├── traefik.json
    │   │   │   │   │   │           │   │       └── usage_report.json
    │   │   │   │   │   │           │   ├── loki
    │   │   │   │   │   │           │   │   ├── main.tf
    │   │   │   │   │   │           │   │   ├── values_loki.yaml
    │   │   │   │   │   │           │   │   ├── values_minio.yaml
    │   │   │   │   │   │           │   │   ├── values_promtail.yaml
    │   │   │   │   │   │           │   │   └── variables.tf
    │   │   │   │   │   │           │   ├── main.tf
    │   │   │   │   │   │           │   ├── values.yaml
    │   │   │   │   │   │           │   ├── variables.tf
    │   │   │   │   │   │           │   └── versions.tf
    │   │   │   │   │   │           ├── postgresql
    │   │   │   │   │   │           │   ├── main.tf
    │   │   │   │   │   │           │   ├── outputs.tf
    │   │   │   │   │   │           │   ├── values.yaml
    │   │   │   │   │   │           │   └── variables.tf
    │   │   │   │   │   │           ├── redis
    │   │   │   │   │   │           │   ├── main.tf
    │   │   │   │   │   │           │   ├── outputs.tf
    │   │   │   │   │   │           │   ├── values.yaml
    │   │   │   │   │   │           │   └── variables.tf
    │   │   │   │   │   │           └── rook-ceph
    │   │   │   │   │   │               ├── cluster-values.yaml.tftpl
    │   │   │   │   │   │               ├── main.tf
    │   │   │   │   │   │               ├── operator-values.yaml
    │   │   │   │   │   │               ├── variables.tf
    │   │   │   │   │   │               └── versions.tf
    │   │   │   │   │   ├── argo-workflows.tf
    │   │   │   │   │   ├── conda-store.tf
    │   │   │   │   │   ├── dask_gateway.tf
    │   │   │   │   │   ├── forward-auth.tf
    │   │   │   │   │   ├── jupyterhub_ssh.tf
    │   │   │   │   │   ├── jupyterhub.tf
    │   │   │   │   │   ├── locals.tf
    │   │   │   │   │   ├── monitoring.tf
    │   │   │   │   │   ├── outputs.tf
    │   │   │   │   │   ├── providers.tf
    │   │   │   │   │   ├── rook-ceph.tf
    │   │   │   │   │   ├── variables.tf
    │   │   │   │   │   └── versions.tf
    │   │   │   │   └── __init__.py
    │   │   │   ├── nebari_tf_extensions
    │   │   │   │   ├── template
    │   │   │   │   │   ├── modules
    │   │   │   │   │   │   ├── helm-extensions
    │   │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   │   └── variables.tf
    │   │   │   │   │   │   └── nebariextension
    │   │   │   │   │   │       ├── ingress.tf
    │   │   │   │   │   │       ├── keycloak-config.tf
    │   │   │   │   │   │       ├── locals.tf
    │   │   │   │   │   │       ├── main.tf
    │   │   │   │   │   │       └── variables.tf
    │   │   │   │   │   ├── helm-extension.tf
    │   │   │   │   │   ├── nebari-config.tf
    │   │   │   │   │   ├── providers.tf
    │   │   │   │   │   ├── tf-extensions.tf
    │   │   │   │   │   ├── variables.tf
    │   │   │   │   │   └── versions.tf
    │   │   │   │   └── __init__.py
    │   │   │   ├── terraform_state
    │   │   │   │   ├── template
    │   │   │   │   │   ├── aws
    │   │   │   │   │   │   ├── modules
    │   │   │   │   │   │   │   └── terraform-state
    │   │   │   │   │   │   │       ├── main.tf
    │   │   │   │   │   │   │       ├── output.tf
    │   │   │   │   │   │   │       └── variables.tf
    │   │   │   │   │   │   └── main.tf
    │   │   │   │   │   ├── azure
    │   │   │   │   │   │   ├── modules
    │   │   │   │   │   │   │   └── terraform-state
    │   │   │   │   │   │   │       ├── main.tf
    │   │   │   │   │   │   │       └── variables.tf
    │   │   │   │   │   │   └── main.tf
    │   │   │   │   │   ├── existing
    │   │   │   │   │   │   └── main.tf
    │   │   │   │   │   ├── gcp
    │   │   │   │   │   │   ├── modules
    │   │   │   │   │   │   │   ├── gcs
    │   │   │   │   │   │   │   │   ├── main.tf
    │   │   │   │   │   │   │   │   └── variables.tf
    │   │   │   │   │   │   │   └── terraform-state
    │   │   │   │   │   │   │       ├── main.tf
    │   │   │   │   │   │   │       └── variables.tf
    │   │   │   │   │   │   └── main.tf
    │   │   │   │   │   └── local
    │   │   │   │   │       └── main.tf
    │   │   │   │   └── __init__.py
    │   │   │   ├── __init__.py
    │   │   │   ├── base.py
    │   │   │   └── tf_objects.py
    │   │   ├── subcommands
    │   │   │   ├── __init__.py
    │   │   │   ├── deploy.py
    │   │   │   ├── destroy.py
    │   │   │   ├── dev.py
    │   │   │   ├── info.py
    │   │   │   ├── init.py
    │   │   │   ├── keycloak.py
    │   │   │   ├── plugin.py
    │   │   │   ├── render.py
    │   │   │   ├── support.py
    │   │   │   ├── upgrade.py
    │   │   │   └── validate.py
    │   │   ├── __init__.py
    │   │   ├── cli.py
    │   │   ├── config_set.py
    │   │   ├── config.py
    │   │   ├── constants.py
    │   │   ├── deploy.py
    │   │   ├── deprecate.py
    │   │   ├── destroy.py
    │   │   ├── initialize.py
    │   │   ├── keycloak.py
    │   │   ├── render.py
    │   │   ├── upgrade.py
    │   │   ├── utils.py
    │   │   └── version.py
    │   └── nebari
    │       ├── __init__.py
    │       ├── __main__.py
    │       ├── hookspecs.py
    │       ├── plugins.py
    │       └── schema.py
    ├── tests
    │   ├── common
    │   │   ├── __init__.py
    │   │   ├── conda_store_utils.py
    │   │   ├── config_mod_utils.py
    │   │   ├── handlers.py
    │   │   ├── kube_api.py
    │   │   ├── navigator.py
    │   │   └── playwright_fixtures.py
    │   ├── tests_deployment
    │   │   ├── __init__.py
    │   │   ├── conftest.py
    │   │   ├── constants.py
    │   │   ├── keycloak_utils.py
    │   │   ├── test_conda_store_roles_loaded.py
    │   │   ├── test_dask_gateway.py
    │   │   ├── test_grafana_api.py
    │   │   ├── test_jupyterhub_api.py
    │   │   ├── test_jupyterhub_ssh.py
    │   │   ├── test_loki_deployment.py
    │   │   └── utils.py
    │   ├── tests_e2e
    │   │   └── playwright
    │   │       ├── .env.tpl
    │   │       ├── README.md
    │   │       └── test_playwright.py
    │   ├── tests_integration
    │   │   ├── __init__.py
    │   │   ├── conftest.py
    │   │   ├── deployment_fixtures.py
    │   │   ├── README.md
    │   │   ├── test_all_clouds.py
    │   │   ├── test_gpu.py
    │   │   └── test_preemptible.py
    │   ├── tests_unit
    │   │   ├── cli_validate
    │   │   │   ├── aws.error.kubernetes-version.yaml
    │   │   │   ├── aws.happy.yaml
    │   │   │   ├── azure.happy.yaml
    │   │   │   ├── gcp.happy.yaml
    │   │   │   ├── local.error.authentication-type-custom.yaml
    │   │   │   ├── local.error.extra-inputs.yaml
    │   │   │   ├── local.error.project_name.ends_with_special.yaml
    │   │   │   ├── local.error.project_name.starts_with_number.yaml
    │   │   │   ├── local.error.project_name.too_long.yaml
    │   │   │   ├── local.happy.auth0.yaml
    │   │   │   ├── local.happy.github.yaml
    │   │   │   ├── local.happy.project_name.with_numbers.yaml
    │   │   │   ├── local.happy.yaml
    │   │   │   ├── min.happy.jupyterlab.default_settings.yaml
    │   │   │   ├── min.happy.jupyterlab.gallery_settings.yaml
    │   │   │   ├── min.happy.monitoring.overrides.yaml
    │   │   │   └── min.happy.yaml
    │   │   ├── qhub-config-yaml-files-for-upgrade
    │   │   │   ├── qhub-config-aws-310-customauth.yaml
    │   │   │   ├── qhub-config-aws-310.yaml
    │   │   │   └── qhub-users-import.json
    │   │   ├── __init__.py
    │   │   ├── conftest.py
    │   │   ├── test_cli_deploy.py
    │   │   ├── test_cli_dev.py
    │   │   ├── test_cli_init_repository.py
    │   │   ├── test_cli_init.py
    │   │   ├── test_cli_keycloak.py
    │   │   ├── test_cli_plugin.py
    │   │   ├── test_cli_support.py
    │   │   ├── test_cli_upgrade.py
    │   │   ├── test_cli_validate.py
    │   │   ├── test_cli.py
    │   │   ├── test_commons.py
    │   │   ├── test_config_set.py
    │   │   ├── test_config.py
    │   │   ├── test_init.py
    │   │   ├── test_links.py
    │   │   ├── test_render.py
    │   │   ├── test_schema.py
    │   │   ├── test_stages.py
    │   │   ├── test_upgrade.py
    │   │   ├── test_utils.py
    │   │   └── utils.py
    │   ├── __init__.py
    │   ├── conftest.py
    │   └── utils.py
    ├── .cirun.yml
    ├── .pre-commit-config.yaml
    ├── CODE_OF_CONDUCT.md
    ├── CONTRIBUTING.md
    ├── pyproject.toml
    ├── pytest.ini
    ├── README.md
    ├── RELEASE.md
    └── SECURITY.md



---
File: nebari/.github/actions/health-check/action.yml
---

name: health-check
description: "Check health of Nebari deployment"

inputs:
  domain:
    description: Domain name
    required: true

runs:
  using: composite

  steps:
    - name: List kubernetes components
      shell: bash
      run: kubectl get --all-namespaces all,cm,secret,pv,pvc,ing

    - name: Check if JupyterHub login page is accessible
      shell: bash
      run: curl --insecure --include 'https://${{ inputs.domain }}/hub/home'



---
File: nebari/.github/actions/init-local/action.yml
---

name: init-local
description: "Initialize Nebari config for local deployment"

inputs:
  directory:
    description: "Path to directory to initialize in"
    required: false
    default: './local-deployment'

outputs:
  directory:
    description: "Path to config directory"
    value: ${{ steps.metadata.outputs.directory }}
  config:
    description: "Path to Nebari config"
    value: ${{ steps.metadata.outputs.config }}
  project:
    description: "Project name"
    value: ${{ steps.metadata.outputs.project }}
  domain:
    description: "Domain name"
    value: ${{ steps.metadata.outputs.domain }}

runs:
  using: composite

  steps:
    - shell: bash
      id: metadata
      run: |
        # Setup metadata
        DIRECTORY=$(realpath '${{ inputs.directory }}')
        mkdir --parents "${DIRECTORY}"
        echo "directory=${DIRECTORY}" | tee --append "${GITHUB_OUTPUT}"

        CONFIG="${DIRECTORY}/nebari-config.yaml"
        echo "config=${CONFIG}" | tee --append "${GITHUB_OUTPUT}"

        PROJECT='github-actions'
        echo "project=${PROJECT}" | tee --append "${GITHUB_OUTPUT}"

        DOMAIN='github-actions.nebari.dev'
        nslookup "${DOMAIN}"
        echo "domain=${DOMAIN}" | tee --append "${GITHUB_OUTPUT}"

    - shell: bash -l {0}
      id: init
      working-directory: ${{ steps.metadata.outputs.directory }}
      run: |
        nebari init local \
          --project-name '${{ steps.metadata.outputs.project }}' \
          --domain-name '${{ steps.metadata.outputs.domain }}' \
          --auth-provider password \
          --output '${{ steps.metadata.outputs.config }}'

    - shell: bash
      run: |
        # Update nebari config for CI

        # Change default JupyterLab theme
        cat >> '${{ steps.metadata.outputs.config }}' <<- EOM
        jupyterlab:
          default_settings:
            "@jupyterlab/apputils-extension:themes":
              theme: JupyterLab Dark
        EOM

        # Change default value for minio persistence size
        cat >> '${{ steps.metadata.outputs.config }}' <<- EOM
        monitoring:
          enabled: true
          overrides:
            minio:
              persistence:
                size: 1Gi
        EOM

    - shell: bash
      run: |
        # Display Nebari config
        cat '${{ steps.metadata.outputs.config }}'



---
File: nebari/.github/actions/publish-from-template/action.yml
---

name: publish-from-template
description: "Publish information from a template"

inputs:
  filename:
    description: Path to issue template. Usually in .github/failed-workflow-issue-templates
    required: true

runs:
  using: composite

  steps:
    - name: Render template
      # Outside scheduled runs, only render the template, to verify that rendering will work on a schedule
      if: github.event_name != 'schedule'
      shell: bash
      env: ${{ env }}
      run:
        python ${{ github.action_path }}/render_template.py ${{ inputs.filename }}

    - uses: JasonEtco/create-an-issue@v2
      # Render the template and create an issue only when the workflow runs on a schedule
      if: github.event_name == 'schedule'
      env: ${{ env }}
      with:
        filename: ${{ inputs.filename }}
        update_existing: false



---
File: nebari/.github/actions/publish-from-template/render_template.py
---

import os
import sys
from pathlib import Path

import jinja2


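def main(template_path):
    """Render the Jinja2 template at ``template_path`` and print it to stdout.

    The process environment is exposed to the template as ``env``.
    """
    loader = jinja2.FileSystemLoader(searchpath=template_path.parent)
    env = jinja2.Environment(loader=loader)
    template = env.get_template(template_path.name)
    print(template.render(env=os.environ))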


if __name__ == "__main__":
    template_path = Path(sys.argv[1])
    main(template_path)
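
A minimal sketch of what the rendering step above produces, assuming the same `jinja2` package is installed. The template text is adapted from the failed-workflow issue template in this repository (front matter omitted), and the values in `example_env` are hypothetical, chosen only for illustration. A plain dict stands in for `os.environ`; Jinja2 resolves `env.PROVIDER` by falling back to item lookup in both cases.

```python
import jinja2

# Template text adapted from the failed-workflow issue template (front matter omitted).
TEMPLATE = (
    "Provider CI jobs fail for {{ env.PROVIDER }} / {{ env.CICD }}\n"
    "See https://github.com/{{ env.REPO }}/actions/runs/{{ env.ID }} for details."
)


def render(template_text, env):
    """Render template_text with the mapping `env` exposed under the name `env`."""
    template = jinja2.Environment().from_string(template_text)
    return template.render(env=env)


# Hypothetical values; in CI these would come from the workflow environment.
example_env = {
    "PROVIDER": "aws",
    "CICD": "github-actions",
    "REPO": "nebari-dev/nebari",
    "ID": "12345",
}

rendered = render(TEMPLATE, example_env)
print(rendered)
```

With Jinja2's default settings, a variable missing from the mapping renders as an empty string rather than raising an error, so a dry run like this will not catch an unset environment variable unless the output is inspected.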



---
File: nebari/.github/actions/setup-local/action.yml
---

name: setup-local
description: "Setup runner for local deployment"

inputs:
  kubectl-version:
    description: "Version of kubectl to install"
    required: false
    default: "1.19.16"

runs:
  using: composite

  steps:
    - uses: azure/setup-kubectl@v4
      with:
        version: v${{ inputs.kubectl-version }}

    - shell: bash
      run: |
        # Enable docker permissions for user
        sudo docker ps
        sudo usermod -aG docker $USER && newgrp docker

        docker info
        docker ps

    - shell: bash
      run: |
        # Get routing table for docker pods
        ip route



---
File: nebari/.github/failed-workflow-issue-templates/test-provider.md
---

---
title: Provider CI jobs fail for {{ env.PROVIDER }} / {{ env.CICD }}
---

See https://github.com/{{ env.REPO }}/actions/runs/{{ env.ID }} for details.



---
File: nebari/.github/ISSUE_TEMPLATE/bug-report.yml
---

name: "Bug report 🐛"
description: "Create a report to help us reproduce and correct the bug"
title: "[BUG] - <title>"
labels: ["type: bug 🐛", "needs: triage 🚦"]

body:
  - type: markdown
    attributes:
      value: |
        # Welcome 👋

        Thanks for using Nebari and taking some time to contribute to this project.

        Please fill out each section below. This info allows Nebari maintainers to diagnose (and fix!) your issue as
        quickly as possible.
        Before submitting a bug, please make sure the issue hasn't already been addressed by searching through
        [the past issues](https://github.com/nebari-dev/nebari/issues).

        Useful links:

        - Documentation: https://www.nebari.dev
        - Contributing: https://www.nebari.dev/community/

  - type: textarea
    attributes:
      label: Describe the bug
      description: |
        A clear and concise description of what the bug is.
        We suggest using bullets (indicated by * or -).
      placeholder: Be as precise as you can.
    validations:
      required: true

  - type: textarea
    attributes:
      label: Expected behavior
      description: |
        A clear and concise description of what you expected to happen.
        We suggest using bullets (indicated by * or -).
    validations:
      required: true

  - type: input
    attributes:
      label: OS and architecture in which you are running Nebari
    validations:
      required: true

  - type: textarea
    attributes:
      label: How to reproduce the problem?
      description: |
        Please provide a minimal code example to reproduce the error.
        Be as succinct as possible, and provide detailed step-by-step instructions to reproduce the bug (using numbered items).
        If you have created a GitHub gist, you can paste the link in this box instead.
    validations:
      required: true

  - type: textarea
    attributes:
      label: Command output
      render: bash session
      description: |
        Provide the output of the steps above, including the commands
        themselves and any tracebacks/logs. If you're familiar with
        Markdown, this block will have triple backticks added automatically
        around it -- you don't have to add them.

        If you want to present output from multiple commands, please present
        that as a shell session (commands you run get prefixed with `$ `).
        Please also ensure that the "How to reproduce the problem?" section contains matching
        instructions for reproducing this.
    validations:
      required: false

  - type: textarea
    attributes:
      label: Versions and dependencies used
      description: |
        Describe your environment:
        - Conda version (use `conda --version`)
        - Kubernetes version (use `kubectl version`)
        - Nebari version
    validations:
      required: false

  - type: dropdown
    attributes:
      label: Compute environment
      description: Are you using a public cloud provider or testing locally with kind? Select the option that applies.
      multiple: false
      options:
        - "Azure"
        - "GCP"
        - "AWS"
        - "kind"
    validations:
      required: false

  - type: dropdown
    attributes:
      label: Integrations
      description: Is this issue related to any of the Nebari integrations?
      multiple: true
      options:
        - "Keycloak"
        - "conda-store"
        - "Dask"
        - "CDS dashboards"
        - "Grafana"
        - "Argo"

  - type: textarea
    attributes:
      label: Anything else?
      description: |
        Links? References? Anything that will give us more context about the issue you are encountering!

        Tip: You can attach images or log files by clicking this area to highlight it and then dragging files in.
    validations:
      required: false

  - type: markdown
    attributes:
      value: >
        Thanks for contributing 🎉!



---
File: nebari/.github/ISSUE_TEMPLATE/config.yml
---

blank_issues_enabled: false
contact_links:
  - name: Nebari Documentation
    url: https://www.nebari.dev/docs/
    about: Check out the Nebari documentation
  - name: Nebari Discussions - our user forum
    url: https://github.com/orgs/nebari-dev/discussions
    about: Ask questions, discuss RFDs, and help other Nebari users
  - name: Documentation issues 📖
    about: Did you find an error in our documentation? Report your findings here.
    url: https://github.com/nebari-dev/nebari-docs/issues/new/choose
  - name: (maintainers only) - Blank issue
    url: https://github.com/nebari-dev/nebari/issues/new
    about: For maintainers only - should be used sparingly



---
File: nebari/.github/ISSUE_TEMPLATE/feature-request.yml
---

name: "Feature request"
description: "Create a feature request to help us improve"
title: "[ENH] - <title>"
labels: ["type: enhancement"]

body:
  - type: markdown
    attributes:
      value: |
        Hi! Thanks for using Nebari and taking some time to contribute to this project.
  - type: textarea
    attributes:
      label: Feature description
      description: |
        Describe what you are proposing. Provide as much context as possible and link to related issues and/or pull requests.
        This section should contain "what" you are proposing.
        Are you having any problems? Briefly describe what your pain points are. For example: "I'm always frustrated when ..."
    validations:
      required: true

  - type: textarea
    attributes:
      label: Value and/or benefit
      description: |
        What is the value in adding this feature, and who will benefit from it? Include any information that could help us prioritize the issue.
        This section should contain "why" this issue should be resolved.
        ✨ If this is for a new feature or enhancement, consider adding [user stories](https://www.atlassian.com/agile/project-management/user-stories).
    validations:
      required: true

  - type: textarea
    attributes:
      label: Anything else?
      description: |
        Links? References? Anything that will give us more context about the issue you are encountering!
        Tip: You can attach images or log files by clicking this area to highlight it and then dragging files in.
    validations:
      required: false



---
File: nebari/.github/ISSUE_TEMPLATE/general-issue.yml
---

name: "General issue 💡"
description: "A general template for many kinds of issues."
title: "<title>"
labels: ["needs: triage 🚦"]

body:
  - type: markdown
    attributes:
      value: |
        # Welcome 👋

        Thanks for using Nebari and taking some time to contribute to this project.

        Please fill out each section below. This info allows Nebari maintainers to diagnose (and fix!) your issue as
        quickly as possible.
        Before submitting a bug, please make sure the issue hasn't already been addressed by searching through
        the past issues in this repository.

        Useful links:

        - Documentation: https://www.nebari.dev
        - Contributing: https://www.nebari.dev/community/

  - type: textarea
    attributes:
      label: Context
      description: |
        Describe what you are proposing. Provide as much context as possible and link to related issues and/or pull requests.
        This section should contain "what" you are proposing.
        Are you having any problems? Briefly describe what your pain points are.
    validations:
      required: true

  - type: textarea
    attributes:
      label: Value and/or benefit
      description: |
        What is the value of adding this feature, and who will benefit from it? Include any information that could help us prioritize the issue.
        This section should contain "why" this issue should be resolved.
        ✨ If this is for a new feature or enhancement, consider adding [user stories](https://www.atlassian.com/agile/project-management/user-stories).
    validations:
      required: true

  - type: textarea
    attributes:
      label: Anything else?
      description: |
        Links? References? Anything that will give us more context about the issue you are encountering!

        Tip: You can attach images or log files by clicking this area to highlight it and then dragging files in.
    validations:
      required: false



---
File: nebari/.github/ISSUE_TEMPLATE/release-checklist.md
---

---
name: Release Checklist
about: For maintainers only.
title: "[RELEASE] <version>"
labels:
  - "type: release 🏷"
assignees: ""
---

# Release Checklist

## Release details

Scheduled release date - <yyyy/mm/dd>

Release captain responsible - <@gh_username>

## Starting point - a new release is out

- [x] Create _this_ issue to track and discuss the upcoming release.
- [ ] Use the previous release issue for any final release-specific discussions, then close.
  - This can be a good time to debrief and discuss improvements to the release process.

## Looking forward - planning

- [ ] [Create milestone for next release](https://github.com/nebari-dev/nebari/milestones) (if it doesn't already exist) and link it back here.
- [ ] Triage `bugs` to determine what should be included in the release and add it to the milestone.
- [ ] Decide what new features, if any, will be included in the release and add them to the milestone.
  - This will be, in large part, determined by the roadmap.
  - Is there a focus for this release (i.e. UX/UI, stabilization, etc.)?

## Pre-release process

- [ ] Decide on a date for the release.
  - What outstanding issues need to be addressed?
  - Has documentation been updated appropriately?
  - Are there any breaking changes that should be highlighted?
  - Are there any upstream releases we are waiting on?
  - [Do we need to update the `dask` versions in `nebari-dask`?](https://github.com/conda-forge/nebari-dask-feedstock/blob/main/recipe/meta.yaml#L13-L16)
  - Will there be an accompanying blog post?
- [ ] Prepare for the release.
  - [ ] Update the [`nebari upgrade`](https://github.com/nebari-dev/nebari/blob/main/src/_nebari/upgrade.py) for this release
    - [ ] Add upgrade messaging including deprecation warnings, version specific warnings and so on.
  - [ ] Optionally, announce a merge freeze.
  - [ ] Release Candidate (RC) cycle.
    - Is this a hotfix?
      - [ ] Create a new branch off of the last version tag.
        - Use this branch to cut the pre-release and the "official" release.
      - [ ] `git cherry-pick` the commits that should be included.
    - [ ] [Cut RC via GHA release workflow (w/ "This is a pre-release" checked).](https://github.com/nebari-dev/nebari/releases/new)
    - [ ] Perform end-to-end testing. [Use the Testing Checklist template.](https://github.com/nebari-dev/nebari/issues/new?assignees=&labels=type%3A+release+%F0%9F%8F%B7&template=testing-checklist.md&title=Testing+checklist+for+<version>)
      - For minor releases, relying on the end-to-end integration tests might suffice.
    - [ ] End-user validation.
      - If possible, pull in volunteers to help test.
      - (Repeat steps if necessary)
  - [ ] [Update `RELEASE.md` notes.](https://github.com/nebari-dev/nebari/blob/main/RELEASE.md)
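
The hotfix steps in the RC cycle above (branch off the last version tag, then cherry-pick) can be sketched as a small helper that only builds the `git` commands. This is an illustration, not part of the release tooling: the function name, the tag, the branch name, and the commit hashes are all hypothetical.

```python
import shlex


def hotfix_commands(last_tag: str, commits: list[str], branch: str) -> list[list[str]]:
    """Return the git commands for a hotfix branch, in checklist order."""
    if not commits:
        raise ValueError("a hotfix needs at least one commit to cherry-pick")
    commands = [
        # Create the hotfix branch off of the last version tag.
        ["git", "switch", "--create", branch, last_tag],
    ]
    # Cherry-pick each commit that should be included, in the order given.
    commands += [["git", "cherry-pick", sha] for sha in commits]
    return commands


if __name__ == "__main__":
    # Hypothetical tag, branch name, and commit hashes, for illustration only.
    for argv in hotfix_commands("2024.1.1", ["abc1234", "def5678"], "release/2024.1.2"):
        print(shlex.join(argv))
```

The helper prints the commands instead of running them, so the release captain can review them before touching the repository.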

## Cut the official release

_If there were changes to the following packages, handle their releases before cutting a new release for Nebari_
- [ ] [Cut PyPI release for `nebari-workflow-controller`](https://github.com/nebari-dev/nebari-workflow-controller)
- [ ] [Cut PyPI release for `argo-jupyter-scheduler`](https://github.com/nebari-dev/argo-jupyter-scheduler)

_These steps must be actioned in the order they appear in this checklist._

- [ ] [Tag, build and push docker images](https://github.com/nebari-dev/nebari-docker-images/releases/new)
- [ ] [Update and cut release for `nebari-dask` meta package on Conda-Forge.](https://github.com/conda-forge/nebari-dask-feedstock)
- [ ] Update `CURRENT_RELEASE` (and any other tags) in the [`constants.py`](https://github.com/nebari-dev/nebari/blob/main/src/_nebari/constants.py#L1)
- [ ] [Cut PyPI release via GHA release workflow.](https://github.com/nebari-dev/nebari/releases/new)
  - Avoid prefixing the tag with `v`.
  - Copy release notes from `RELEASE.md`.
- [ ] [Merge automated release PR for `nebari` on Conda-Forge.](https://github.com/conda-forge/nebari-feedstock)
- [ ] Merge release branch into `main`
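
As a rough aid for the tagging step above, the sketch below mirrors the `sed 's/^v//'` call in the release workflow and checks a tag against a `YYYY.M.P` shape. That shape is an assumption inferred from the release branch filters in the workflows, not an official specification, and it does not cover pre-release suffixes. Both function names are hypothetical.

```python
import re

# Assumed final-release shape (YYYY.M.P); pre-release suffixes are not handled.
TAG_PATTERN = re.compile(r"\d{4}\.\d{1,2}\.\d{1,2}")


def normalize_tag(tag: str) -> str:
    """Drop a single leading `v`, like `sed 's/^v//'` in the release workflow."""
    return tag[1:] if tag.startswith("v") else tag


def is_release_tag(tag: str) -> bool:
    """Return True when the tag matches the assumed shape exactly, with no `v` prefix."""
    return TAG_PATTERN.fullmatch(tag) is not None


if __name__ == "__main__":
    # Hypothetical tags, for illustration only.
    for candidate in ("v2024.1.1", "2024.1.1"):
        print(candidate, normalize_tag(candidate), is_release_tag(candidate))
```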



---
File: nebari/.github/ISSUE_TEMPLATE/testing-checklist.md
---

---
name: Testing Checklist
about: For maintainers only.
title: "Testing checklist for <version>"
labels:
  - "type: release 🏷"
assignees: ""
---

# Testing Checklist

_Use/modify this checklist to capture the platform's core services that need to be manually tested._

## Manual testing: core services

If the integration tests for all of the cloud providers are successful, that is a good sign!
However, the following core services still need to be manually validated (until we can automate them).

At minimum, the following services will need to be tested:

- [ ] [Log in to Keycloak as the root user](https://www.nebari.dev/docs/how-tos/configuring-keycloak/#change-keycloak-root-password)
  - [ ] [Add a user](https://www.nebari.dev/docs/how-tos/configuring-keycloak/#adding-a-nebari-user)
- [ ] [Log into conda-store and create](https://www.nebari.dev/docs/tutorials/creating-new-environments)
  - [ ] a conda environment in a shared namespace and,
  - [ ] a conda environment in your personal namespace
- [ ] [Launch dask-gateway cluster, test auto-scaler and](https://www.nebari.dev/docs/tutorials/using_dask)
  - [ ] [Validate that the dask-labextension is working](https://www.nebari.dev/docs/tutorials/using_dask/#step-4---understand-dasks-diagnostic-tools)
- [ ] [Confirm that a notebook can be submitted via Jupyter-Scheduler](https://nebari.dev/docs/tutorials/jupyter-scheduler)
- [ ] [Open VS-Code extension](https://www.nebari.dev/docs/how-tos/using-vscode)
  - [ ] [Add the Python extension](https://www.nebari.dev/docs/how-tos/using-vscode#adding-extensions)
  - [ ] [Create a `.py` file and run it](https://www.nebari.dev/docs/how-tos/using-vscode#running-python-code)
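
For the final VS Code item, any small script will do. A minimal sketch (the file content and function name are only a suggestion) that also reports which interpreter ran it, which helps confirm the expected environment was picked up:

```python
import platform
import sys


def describe_interpreter() -> str:
    """Summarize the interpreter that is running this file."""
    return f"Python {platform.python_version()} at {sys.executable}"


if __name__ == "__main__":
    print(describe_interpreter())
```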



---
File: nebari/.github/workflows/generate_cli_doc.yml
---

name: Update API docs

on:
  pull_request:
    paths:
    - "src/_nebari/subcommands/**"
    - "src/_nebari/cli.py"
  push:
    branches:
      - main
    paths:
    - "src/_nebari/subcommands/**"
    - "src/_nebari/cli.py"
  workflow_dispatch:

jobs:
  update_api:
    permissions:
      contents: write
      pull-requests: write
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash -l {0}
        working-directory: ./docs-sphinx
    steps:
      - name: Check out repository 🛎️
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.10"

      - name: Install nebari and docs dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e "../[docs]"

      - name: Generate new API docs
        run: |
          make html

      - name: Copy cli doc
        run: |
          cp _build/html/cli.html cli.html

      - name: Look for changes to generated docs
        uses: tj-actions/verify-changed-files@v12
        id: verify-changed-files
        with:
          files: |
            docs-sphinx/cli.html

      - name: Create Pull Request in code repo
        id: create_pull_request
        uses: peter-evans/create-pull-request@v4
        if: steps.verify-changed-files.outputs.files_changed == 'true' && github.event_name != 'pull_request'
        with:
          token: ${{ secrets.NEBARI_SENSEI_API_DOCS_PR_OPENER }}
          commit-message: Update api docs
          committer: GitHub <noreply@github.com>
          author: ${{ github.actor }} <${{ github.actor }}@users.noreply.github.com>
          signoff: false
          branch: auto_cli_doc_update
          delete-branch: true
          title: '[AUTO] Update CLI doc'
          body: |
            Update CLI doc
            - Auto-generated by [create-pull-request][1]

            [1]: https://github.com/peter-evans/create-pull-request
          labels: |
            "area: documentation 📖"
          draft: false
          base: ${{ github.head_ref }}



---
File: nebari/.github/workflows/release-notes-sync.yaml
---

name: Sync release notes with nebari.dev/docs

on:
  release:
    types: [created]
  workflow_dispatch:

jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@master
      - name: Run Release File Sync ♻️
        uses: BetaHuhn/repo-file-sync-action@v1
        with:
          GH_PAT: ${{ secrets.NEBARI_SENSEI_API_DOCS_PR_OPENER }}
          CONFIG_PATH: .github/release-notes-sync-config.yaml
          COMMIT_BODY: "MAINT - Sync release notes :robot:"
          PR_LABELS: |
            type: file sync ♻️



---
File: nebari/.github/workflows/release.yaml
---

name: Test & Publish PyPi release

on:
  release:
    types: [created]

jobs:
  test-pypi:
    name: Test PyPi release
    runs-on: ubuntu-latest
    permissions:
      id-token: write  # IMPORTANT: this permission is mandatory for trusted publishing
    steps:
      - name: Set up python
        uses: actions/setup-python@v5
        with:
          python-version: "3.10"

      - name: Upgrade pip
        run: python -m pip install --upgrade pip build

      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Create tag
        # if present, remove leading `v`
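        run: |
          NEBARI_TAG="$(git describe --tags | sed 's/^v//')"
          echo "NEBARI_TAG=${NEBARI_TAG}" >> "$GITHUB_ENV"
          # Values written to GITHUB_ENV are only visible in later steps,
          # so print the shell variable here instead of env.NEBARI_TAG.
          echo "${NEBARI_TAG}"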

      - name: Build source and binary
        run: python -m build --sdist --wheel .

      - name: Publish to test PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          repository-url: https://test.pypi.org/legacy/

      - name: Sleep
        run: sleep 120

      - name: Test install from Test PyPI
        run: |
          pip install \
          --index-url https://test.pypi.org/simple/ \
          --extra-index-url https://pypi.org/simple \
          nebari==${{ env.NEBARI_TAG }}

  release-pypi:
    name: Publish Nebari on PyPi
    runs-on: ubuntu-latest
    needs: test-pypi
    permissions:
      id-token: write
      contents: read

    steps:
      - name: Set up python
        uses: actions/setup-python@v5
        with:
          python-version: "3.10"

      - name: Upgrade pip
        run: python -m pip install --upgrade pip build

      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Build source and binary
        run: python -m build --sdist --wheel .

      - name: Publish package
        uses: pypa/gh-action-pypi-publish@release/v1



---
File: nebari/.github/workflows/run-precommit.yaml
---

name: Run pre-commit

on:
  push:
    branches:
      - main
      - "release/[0-9]+.[0-9]+.[0-9]+"
  pull_request:

jobs:
  pre-commit:
    if: github.event.pull_request.merged == false
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash -l {0}
    steps:
      - name: Checkout repository 🔔
        uses: actions/checkout@v4.1.1

      - name: Setup python
        uses: actions/setup-python@v5
        with:
          python-version: '3.x'

      - name: Setup terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.5.7"


      - name: Run terraform pre-commit ⚡️
        uses: pre-commit/action@v3.0.1
        with:
          extra_args: --all-files terraform_fmt



---
File: nebari/.github/workflows/test_aws_integration.yaml
---

name: AWS Deployment

on:
  schedule:
    - cron: "0 0 * * MON"
  workflow_dispatch:
    inputs:
      image-tag:
        description: 'Nebari image tag created by the nebari-docker-images repo'
        required: true
        default: main
        type: string
      tf-log-level:
        description: 'Change Terraform log levels'
        required: false
        default: info
        type: choice
        options:
        - info
        - warn
        - debug
        - trace
        - error


env:
  AWS_DEFAULT_REGION: "us-west-2"
  NEBARI_IMAGE_TAG: ${{ github.event.inputs.image-tag || 'main' }}
  TF_LOG: ${{ github.event.inputs.tf-log-level || 'info' }}

jobs:
  test-aws-integration:
    runs-on: ubuntu-latest
    if: ${{ vars.SKIP_AWS_INTEGRATION_TEST != 'true' }}
    permissions:
      id-token: write
      contents: read
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install Nebari
        run: |
          pip install .[dev]
          playwright install

      - name: Authenticate to AWS
        uses: aws-actions/configure-aws-credentials@v1
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          role-session-name: github-action
          aws-region: ${{ env.AWS_DEFAULT_REGION }}

      - name: Integration Tests
        run: |
          pytest --version
          pytest tests/tests_integration/ -vvv -s --cloud aws
        env:
          NEBARI_SECRET__default_images__jupyterhub: "quay.io/nebari/nebari-jupyterhub:${{ env.NEBARI_IMAGE_TAG }}"
          NEBARI_SECRET__default_images__jupyterlab: "quay.io/nebari/nebari-jupyterlab:${{ env.NEBARI_IMAGE_TAG }}"
          NEBARI_SECRET__default_images__dask_worker: "quay.io/nebari/nebari-dask-worker:${{ env.NEBARI_IMAGE_TAG }}"
          CLOUDFLARE_TOKEN: ${{ secrets.CLOUDFLARE_TOKEN }}



---
File: nebari/.github/workflows/test_azure_integration.yaml
---

name: Azure Deployment

on:
  schedule:
    - cron: "0 0 * * MON"
  workflow_dispatch:
    inputs:
      image-tag:
        description: 'Nebari image tag created by the nebari-docker-images repo'
        required: true
        default: main
        type: string
      tf-log-level:
        description: 'Change Terraform log levels'
        required: false
        default: info
        type: choice
        options:
        - info
        - warn
        - debug
        - trace
        - error

env:
  NEBARI_IMAGE_TAG: ${{ github.event.inputs.image-tag || 'main' }}
  TF_LOG: ${{ github.event.inputs.tf-log-level || 'info' }}

jobs:
  test-azure-integration:
    runs-on: ubuntu-latest
    if: ${{ vars.SKIP_AZURE_INTEGRATION_TEST != 'true' }}
    permissions:
      id-token: write
      contents: read
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"

      - name: Install Nebari
        run: |
          pip install .[dev]
          conda install --quiet --yes conda-build
          playwright install

      - name: 'Azure login'
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.ARM_CLIENT_ID }}
          tenant-id: ${{ secrets.ARM_TENANT_ID }}
          subscription-id: ${{ secrets.ARM_SUBSCRIPTION_ID }}

      - name: Integration Tests
        run: |
          pytest --version
          pytest tests/tests_integration/ -vvv -s --cloud azure
        env:
          NEBARI_SECRET__default_images__jupyterhub: "quay.io/nebari/nebari-jupyterhub:${{ env.NEBARI_IMAGE_TAG }}"
          NEBARI_SECRET__default_images__jupyterlab: "quay.io/nebari/nebari-jupyterlab:${{ env.NEBARI_IMAGE_TAG }}"
          NEBARI_SECRET__default_images__dask_worker: "quay.io/nebari/nebari-dask-worker:${{ env.NEBARI_IMAGE_TAG }}"
          ARM_CLIENT_ID: ${{ secrets.ARM_CLIENT_ID }}
          ARM_TENANT_ID: ${{ secrets.ARM_TENANT_ID }}
          ARM_SUBSCRIPTION_ID: ${{ secrets.ARM_SUBSCRIPTION_ID }}
          ARM_USE_OIDC: "true"
          CLOUDFLARE_TOKEN: ${{ secrets.CLOUDFLARE_TOKEN }}



---
File: nebari/.github/workflows/test_conda_build.yaml
---

name: "Test Conda Build"

on:
  pull_request:
    paths:
      - ".github/workflows/test_conda_build.yaml"
      - "pyproject.toml"
  push:
    branches:
      - main
      - "release/[0-9]+.[0-9]+.[0-9]+"
    paths:
      - ".github/workflows/test_conda_build.yaml"
      - "pyproject.toml"

jobs:
  test-conda-build:
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash -el {0}
    concurrency:
      group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
      cancel-in-progress: true
    steps:
      - name: "Checkout Infrastructure"
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup miniconda
        uses: conda-incubator/setup-miniconda@v3
        with:
          auto-update-conda: true
          python-version: "3.10"
          channels: conda-forge
          activate-environment: nebari-dev

      - name: Install dependencies
        run: |
          conda install build grayskull conda-build conda-verify

      - name: Generate sdist
        run: |
          python -m build --sdist

      - name: Generate meta.yaml
        run: |
          python -m grayskull pypi dist/*.tar.gz

      - name: Build conda package
        run: |
          conda build nebari

      - name: Test conda package
        run: |
          conda install --use-local nebari
          nebari --version



---
File: nebari/.github/workflows/test_gcp_integration.yaml
---

name: GCP Deployment

on:
  schedule:
    - cron: "0 0 * * MON"
  workflow_dispatch:
    inputs:
      image-tag:
        description: 'Nebari image tag created by the nebari-docker-images repo'
        required: true
        default: main
        type: string
      tf-log-level:
        description: 'Change Terraform log levels'
        required: false
        default: info
        type: choice
        options:
        - info
        - warn
        - debug
        - trace
        - error

env:
  NEBARI_IMAGE_TAG: ${{ github.event.inputs.image-tag || 'main' }}
  TF_LOG: ${{ github.event.inputs.tf-log-level || 'info' }}

jobs:
  test-gcp-integration:
    runs-on: ubuntu-latest
    if: ${{ vars.SKIP_GCP_INTEGRATION_TEST != 'true' }}
    permissions:
      id-token: write
      contents: read
      pull-requests: write
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install Nebari
        run: |
          pip install .[dev]
          playwright install

      - name: 'Authenticate to GCP'
        uses: 'google-github-actions/auth@v1'
        with:
          workload_identity_provider: ${{ secrets.GCP_WORKFLOW_PROVIDER }}
          service_account: ${{ secrets.GCP_SERVICE_ACCOUNT }}

      - name: Set required environment variables
        run: |
          echo "GOOGLE_CREDENTIALS=${{ env.GOOGLE_APPLICATION_CREDENTIALS }}" >> $GITHUB_ENV

      - name: Integration Tests
        run: |
          pytest --version
          pytest tests/tests_integration/ -vvv -s --cloud gcp
        env:
          NEBARI_SECRET__default_images__jupyterhub: "quay.io/nebari/nebari-jupyterhub:${{ env.NEBARI_IMAGE_TAG }}"
          NEBARI_SECRET__default_images__jupyterlab: "quay.io/nebari/nebari-jupyterlab:${{ env.NEBARI_IMAGE_TAG }}"
          NEBARI_SECRET__default_images__dask_worker: "quay.io/nebari/nebari-dask-worker:${{ env.NEBARI_IMAGE_TAG }}"
          PROJECT_ID: ${{ secrets.GCP_PROJECT_ID }}
          CLOUDFLARE_TOKEN: ${{ secrets.CLOUDFLARE_TOKEN }}



---
File: nebari/.github/workflows/test_helm_charts.yaml
---

# Right now the trigger is set to run on every Monday at 13:00 UTC,
# or when the workflow file is modified. An additional manual trigger
# is also available.
name: "Validate Helm Charts downloads"

on:
  schedule:
    # Run every Monday at 13:00 UTC
    - cron: "0 13 * * 1"
  pull_request:
    paths:
      - ".github/workflows/test_helm_charts.yaml"
      - "scripts/helm-validate.py"
  push:
    paths:
      - ".github/workflows/test_helm_charts.yaml"
      - "scripts/helm-validate.py"
  workflow_dispatch:

jobs:
  test-helm-charts:
    name: "Helm Charts Validation"
    runs-on: ubuntu-latest
    steps:
      - name: "Checkout Infrastructure"
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install additional Python dependencies
        run: |
          pip install python-hcl2
          pip install tqdm
      - name: Install nebari
        run: |
          pip install .
      - name: Install Helm
        run: |
          curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
          chmod 700 get_helm.sh
          ./get_helm.sh
      - name: Test Helm installation
        run: |
          helm version
      - name: Test Helm Charts
        run: |
          python scripts/helm-validate.py



---
File: nebari/.github/workflows/test_local_integration.yaml
---

name: "Local Integration Tests"

env:
  TEST_USERNAME: "test-user"
  TEST_PASSWORD: "P@sswo3d"
  NEBARI_IMAGE_TAG: "main"

on:
  pull_request:
    paths:
      - ".github/workflows/test_local_integration.yaml"
      - "tests/**"
      - "scripts/**"
      - "src/**"
      - "pyproject.toml"
      - "pytest.ini"
      - ".cirun.yml"
  push:
    branches:
      - main
      - "release/[0-9]+.[0-9]+.[0-9]+"
    paths:
      - ".github/workflows/test_local_integration.yaml"
      - "tests/**"
      - "scripts/**"
      - "src/**"
      - "pyproject.toml"
      - "pytest.ini"
      - ".cirun.yml"
  workflow_call:
    inputs:
      pr_number:
        required: true
        type: string
  workflow_dispatch:

# With `cancel-in-progress: true`, a new run in a concurrency group cancels any pending or
# in-progress run in the same group, so only one run per group is active at a time.
concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  test-local-integration:
    runs-on: "cirun-runner--${{ github.run_id }}"
    defaults:
      run:
        shell: bash -l {0}
    steps:
      - name: "Checkout Infrastructure"
        uses: actions/checkout@main
        with:
          fetch-depth: 0

      - name: Setup runner for local deployment
        uses: ./.github/actions/setup-local

      - name: Checkout the branch from the PR that triggered the job
        if: ${{ github.event_name == 'issue_comment' }}
        run: |
          hub version
          hub pr checkout ${{ inputs.pr_number }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      - name: Set up Python
        uses: conda-incubator/setup-miniconda@v3
        env:
          CONDA: /home/runnerx/miniconda3
        with:
          auto-update-conda: true
          python-version: "3.11"
          miniconda-version: "latest"

      - name: Install JQ
        run: |
          sudo apt-get update
          sudo apt-get install jq -y

      - name: Install Nebari and playwright
        run: |
          pip install .[dev]
          playwright install

      - name: Initialize Nebari config for local deployment
        id: init
        uses: ./.github/actions/init-local

      - name: Deploy Nebari
        working-directory: ${{ steps.init.outputs.directory }}
        run: nebari deploy --config ${{ steps.init.outputs.config }} --disable-prompt

      - name: Health check
        uses: ./.github/actions/health-check
        with:
          domain: ${{ steps.init.outputs.domain }}

      - name: Create example-user
        working-directory: ${{ steps.init.outputs.directory }}
        run: |
          nebari keycloak adduser --user "${TEST_USERNAME}" "${TEST_PASSWORD}" --config ${{ steps.init.outputs.config }}
          nebari keycloak listusers --config ${{ steps.init.outputs.config }}

      - name: Await Workloads
        uses: jupyterhub/action-k8s-await-workloads@v3
        with:
          workloads: "" # all
          namespace: "dev"
          timeout: 300
          max-restarts: 3

      ### DEPLOYMENT TESTS
      - name: Deployment Pytests
        env:
          NEBARI_CONFIG_PATH: ${{ steps.init.outputs.config }}
          KEYCLOAK_USERNAME: ${{ env.TEST_USERNAME }}
          KEYCLOAK_PASSWORD: ${{ env.TEST_PASSWORD }}
        run: |
          pytest tests/tests_deployment/ -v -s

      ### USER-JOURNEY TESTS
      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - name: Playwright Tests
        env:
          KEYCLOAK_USERNAME: ${{ env.TEST_USERNAME }}
          KEYCLOAK_PASSWORD: ${{ env.TEST_PASSWORD }}
          NEBARI_FULL_URL: "https://${{ steps.init.outputs.domain }}/"
        working-directory: tests/tests_e2e/playwright
        run: |
          # create environment file
          envsubst < .env.tpl > .env
          # run playwright pytest tests in headed mode with the chromium browser
          xvfb-run pytest --browser chromium --slowmo 300 --headed

      - name: Save Playwright recording artifacts
        if: always()
        uses: actions/upload-artifact@v4.3.1
        with:
          name: e2e-playwright
          path: |
            ./tests/tests_e2e/playwright/videos/

      ### CLEANUP AFTER TESTS
      - name: Cleanup nebari deployment
        # Since this is not critical for most pull requests and takes more than half of the time
        # in the CI, it makes sense to only run on merge to main or workflow_dispatch to speed
        # up feedback cycle
        if: github.ref_name == 'main' || github.event_name == 'workflow_dispatch'
        working-directory: ${{ steps.init.outputs.directory }}
        run: nebari destroy --config ${{ steps.init.outputs.config }} --disable-prompt



---
File: nebari/.github/workflows/test_local_upgrade.yaml
---

name: "Local Upgrade Tests"

on:
  pull_request:
    paths:
      - ".github/actions/**"
      - ".github/workflows/test_local_upgrade.yaml"
  release:
    types:
      - prereleased
  workflow_dispatch:


concurrency:
  group: ${{ github.ref_name }}
  cancel-in-progress: true

env:
  NEBARI_IMAGE_TAG: "main"

jobs:
  test-local-upgrade:
    runs-on: "cirun-runner--${{ github.run_id }}"
    defaults:
      run:
        shell: bash -l {0}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup runner for local deployment
        uses: ./.github/actions/setup-local

      - name: Setup Python
        uses: conda-incubator/setup-miniconda@v3
        env:
          CONDA: /home/runnerx/miniconda3
        with:
          auto-update-conda: true
          python-version: "3.11"
          miniconda-version: "latest"

      - name: Install latest stable Nebari release
        run: pip install nebari

      - name: Initialize Nebari config for local deployment
        id: init
        uses: ./.github/actions/init-local

      - name: Extract old Nebari version
        run: |
          OLD_NEBARI_VERSION=$(grep 'nebari_version: ' ${{ steps.init.outputs.config }} | sed 's/nebari_version: //')
          echo "OLD_NEBARI_VERSION=${OLD_NEBARI_VERSION}" | tee --append "${GITHUB_ENV}"

      - name: Deploy Nebari
        working-directory: ${{ steps.init.outputs.directory }}
        run: nebari deploy --config ${{ steps.init.outputs.config }} --disable-prompt

      - name: Health check before upgrade
        id: health-check-before
        uses: ./.github/actions/health-check
        with:
          domain: ${{ steps.init.outputs.domain }}

      - name: Install current Nebari
        run: pip install --upgrade .

      - name: Upgrade Nebari config
        run: |
          git add --force ${{ steps.init.outputs.config }}
          nebari upgrade --config ${{ steps.init.outputs.config }} --attempt-fixes
          git diff
          nebari validate --config ${{ steps.init.outputs.config }}

      - name: Redeploy Nebari
        working-directory: ${{ steps.init.outputs.directory }}
        run: nebari deploy --config ${{ steps.init.outputs.config }} --disable-prompt

      - name: Health check after upgrade
        id: health-check-after
        uses: ./.github/actions/health-check
        with:
          domain: ${{ steps.init.outputs.domain }}



---
File: nebari/.github/workflows/test-provider.yaml
---

# This is the only workflow that requires cloud credentials and therefore will not run on PRs coming from forks.
name: "Test Nebari Provider"

on:
  schedule:
    - cron: "0 3 * * *"
  pull_request:
    paths:
      - ".github/workflows/test-provider.yaml"
      - ".github/failed-workflow-issue-templates/test-provider.md"
      - ".github/actions/publish-from-template"
      - "tests/**"
      - "scripts/**"
      - "src/**"
      - "pyproject.toml"
  push:
    branches:
      - main
      # Branch filters are glob patterns, not regular expressions
      - "release/[0-9]+.[0-9]+.[0-9]+"
    paths:
      - ".github/workflows/test-provider.yaml"
      - "tests/**"
      - "scripts/**"
      - "src/**"
      - "pyproject.toml"
  workflow_call:
    inputs:
      pr_number:
        required: true
        type: string

env:
  ARM_CLIENT_ID: ${{ secrets.ARM_CLIENT_ID }}
  ARM_TENANT_ID: ${{ secrets.ARM_TENANT_ID }}
  ARM_SUBSCRIPTION_ID: ${{ secrets.ARM_SUBSCRIPTION_ID }}
  PROJECT_ID: ${{ secrets.GCP_PROJECT_ID }}

jobs:
  test-render-providers:
    # Prevents the execution of this test under the following conditions:
    # 1. When the 'NO_PROVIDER_CREDENTIALS' GitHub variable is set, indicating the absence of provider credentials.
    # 2. For pull requests (PRs) originating from a fork, since GitHub does not expose the destination repository's secrets to workflows triggered from forks.
    # ref. https://github.com/nebari-dev/nebari/issues/2379
    if: |
      vars.NO_PROVIDER_CREDENTIALS == '' &&
      (github.event.pull_request.head.repo.full_name == github.repository || github.event_name != 'pull_request')
    name: "Test Nebari Provider"
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
      pull-requests: write
    strategy:
      matrix:
        provider:
          - aws
          - azure
          - gcp
          - local
          - existing
        cicd:
          - none
          - github-actions
          - gitlab-ci
      fail-fast: false
    steps:
      - name: "Checkout Infrastructure"
        uses: actions/checkout@v4

      - name: Checkout the branch from the PR that triggered the job
        if: ${{ github.event_name == 'issue_comment' }}
        run: hub pr checkout ${{ inputs.pr_number }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: 'Authenticate to GCP'
        if: ${{ matrix.provider == 'gcp' }}
        uses: 'google-github-actions/auth@v1'
        with:
          token_format: access_token
          create_credentials_file: 'true'
          workload_identity_provider: ${{ secrets.GCP_WORKFLOW_PROVIDER }}
          service_account: ${{ secrets.GCP_SERVICE_ACCOUNT }}

      - name: Set required environment variables
        if: ${{ matrix.provider == 'gcp' }}
        run: |
          echo "GOOGLE_CREDENTIALS=${{ env.GOOGLE_APPLICATION_CREDENTIALS }}" >> $GITHUB_ENV

      - name: 'Authenticate to AWS'
        if: ${{ matrix.provider == 'aws' }}
        uses: aws-actions/configure-aws-credentials@v1
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          role-session-name: github-action
          aws-region: us-east-1

      - name: 'Azure login'
        if: ${{ matrix.provider == 'azure' }}
        uses: azure/login@v1
        with:
          client-id: ${{ secrets.ARM_CLIENT_ID }}
          tenant-id: ${{ secrets.ARM_TENANT_ID }}
          subscription-id: ${{ secrets.ARM_SUBSCRIPTION_ID }}

      - name: Install Nebari
        run: |
          pip install --upgrade pip
          pip install .[dev]

      - name: Nebari Initialize
        run: |
          nebari init "${{ matrix.provider }}" --project "TestProvider" --domain "${{ matrix.provider }}.nebari.dev" --auth-provider password --disable-prompt --ci-provider ${{ matrix.cicd }}
          cat "nebari-config.yaml"

      - name: Nebari Render
        run: |
          nebari render -c "nebari-config.yaml" -o "nebari-${{ matrix.provider }}-${{ matrix.cicd }}-deployment"
          cp "nebari-config.yaml" "nebari-${{ matrix.provider }}-${{ matrix.cicd }}-deployment/nebari-config.yaml"

      - name: Nebari Render Artifact
        uses: actions/upload-artifact@master
        with:
          name: "nebari-${{ matrix.provider }}-${{ matrix.cicd }}-artifact"
          path: "nebari-${{ matrix.provider }}-${{ matrix.cicd }}-deployment"

      - if: failure() || github.event_name == 'pull_request'
        name: Publish information from template
        uses: ./.github/actions/publish-from-template
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PROVIDER: ${{ matrix.provider }}
          CICD: ${{ matrix.cicd }}
        with:
          filename: .github/failed-workflow-issue-templates/test-provider.md



---
File: nebari/.github/workflows/test.yaml
---

name: "Tests"

on:
  pull_request:
    paths:
      - ".github/workflows/test.yaml"
      - "tests/**"
      - "scripts/**"
      - "src/**"
      - "pyproject.toml"
      - "pytest.ini"
  push:
    branches:
      - main
      # Branch filters are glob patterns, not regular expressions
      - "release/[0-9]+.[0-9]+.[0-9]+"
    paths:
      - ".github/workflows/test.yaml"
      - "tests/**"
      - "scripts/**"
      - "src/**"
      - "pyproject.toml"
      - "pytest.ini"

jobs:
  test-general:
    name: "Pytest"
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash -el {0}
    strategy:
      matrix:
        python-version:
          - "3.10"
          - "3.11"
          - "3.12"
          - "3.13"
      fail-fast: false
    concurrency:
      group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}-${{ matrix.python-version }}
      cancel-in-progress: true
    steps:
      - name: "Checkout Infrastructure"
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup miniconda
        uses: conda-incubator/setup-miniconda@v3
        with:
          auto-update-conda: true
          python-version: ${{ matrix.python-version }}
          channels: conda-forge,defaults
          activate-environment: nebari-dev

      - name: Install Nebari
        run: |
          python --version
          pip install -e .[dev]

      - name: Test Nebari
        run: |
          pytest --version
          pytest --cov=src --cov-report=xml --cov-config=pyproject.toml tests/tests_unit

      - name: Report Coverage
        run: |
          coverage report -m



---
File: nebari/.github/workflows/trivy.yml
---

# This workflow uses actions that are not certified by GitHub.
# They are provided by a third-party and are governed by
# separate terms of service, privacy policy, and support
# documentation.

name: Code Scanning

on:
  push:
    branches: [ "main", "release/*" ]
  pull_request:
    # The branches below must be a subset of the branches above
    branches: [ "main" ]
  schedule:
    - cron: '19 23 * * 6'

permissions:
  contents: read

jobs:
  SAST:
    permissions:
      contents: read # for actions/checkout to fetch code
      security-events: write # for github/codeql-action/upload-sarif to upload SARIF results
      actions: read # only required for a private repository by github/codeql-action/upload-sarif to get the Action run status
    name: Trivy config Scan
    runs-on: "ubuntu-20.04"
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Run Trivy scanner in config mode
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'config'
          hide-progress: true
          format: 'sarif'
          output: 'trivy-results.sarif'
          ignore-unfixed: true
          severity: 'CRITICAL,HIGH'

      - name: Upload Trivy scan results to GitHub Security tab
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: 'trivy-results.sarif'



---
File: nebari/.github/workflows/typing.yaml
---

name: "Typing Check"

on:
  pull_request:
    paths:
      - ".github/workflows/typing.yaml"
      - "src/**"
      - "pyproject.toml"
  push:
    branches:
      - main
      # Branch filters are glob patterns, not regular expressions
      - "release/[0-9]+.[0-9]+.[0-9]+"
    paths:
      - ".github/workflows/typing.yaml"
      - "src/**"
      - "pyproject.toml"

jobs:
  typing-check:
    runs-on: ubuntu-latest
    concurrency:
      group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
      cancel-in-progress: true
    steps:
      - name: "Checkout Repository"
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: "pip"

      - name: Install Nebari and type stubs
        run: |
          python --version
          pip install -e .[dev]
          pip install types-Pygments types-requests types-six

      - name: Run MyPy
        continue-on-error: true
        run: |
          mypy



---
File: nebari/.github/PULL_REQUEST_TEMPLATE.md
---

<!--
Thanks for contributing a pull request! Please ensure you have taken a look at
the contribution guidelines: https://nebari.dev/community
-->

## Reference Issues or PRs

<!--
Example: Fixes #1234. See also #3456.
Please use keywords (e.g., Fixes) to create a link to the issues or pull requests
you resolved, so that they will automatically be closed when your pull request
is merged. See https://github.com/blog/1506-closing-issues-via-pull-requests
-->

## What does this implement/fix?

_Put an `x` in the boxes that apply_

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds a feature)
- [ ] Breaking change (fix or feature that would cause existing features not to work as expected)
- [ ] Documentation Update
- [ ] Code style update (formatting, renaming)
- [ ] Refactoring (no functional changes, no API changes)
- [ ] Build related changes
- [ ] Other (please describe):

## Testing

- [ ] Did you test the pull request locally?
- [ ] Did you add new tests?

## How to test this PR?

<!--
If relevant, please outline the steps required to test your contribution
and the expected outcomes from the proposed changes. Providing clear
testing instructions will help reviewers evaluate your contribution.
-->

## Any other comments?

<!--
Please be aware that we are a loose team of volunteers, so patience is necessary;
assistance handling other issues is very welcome.
We value all user contributions. If we are slow to review, either the pull request needs some benchmarking, tinkering,
convincing, etc., or the reviewers are likely busy. In either case,
we ask for your understanding during the
review process.
Thanks for contributing to Nebari 🙏🏼!
-->



---
File: nebari/.github/release-notes-sync-config.yaml
---

# Configuration for ./workflows/release-notes-sync
# Ref: https://github.com/BetaHuhn/repo-file-sync-action

group:
  repos: nebari-dev/nebari-docs
  files:
    - source: RELEASE.md
      dest: docs/docs/references/RELEASE.md



---
File: nebari/scripts/aws-force-destroy.py
---

import argparse
import logging
import time
from pathlib import Path

from _nebari.utils import check_cloud_credentials, load_yaml, timer

logging.basicConfig(level=logging.INFO)


def main():
    parser = argparse.ArgumentParser(description="Force Destroy AWS environment.")
    parser.add_argument("-c", "--config", help="nebari configuration", required=True)
    args = parser.parse_args()

    handle_force_destroy(args)


def handle_force_destroy(args):
    config_filename = Path(args.config)
    if not config_filename.is_file():
        raise ValueError(
            f"passed in configuration filename={config_filename} must exist"
        )

    config = load_yaml(config_filename)

    # Don't verify(config) in case the schema has changed - just pick out the important bits and tear down

    force_destroy_configuration(config)


def parse_arn(arn):
    # http://docs.aws.amazon.com/general/latest/gr/aws-arns-and-namespaces.html
    elements = arn.split(":", 5)
    result = {
        # "arn": elements[0],
        "partition": elements[1],
        "service": elements[2],
        "region": elements[3],
        "account": elements[4],
        "resource": elements[5],
        "resource_type": None,
        "arn": arn,  # Full ARN
    }
    if "/" in result["resource"]:
        result["resource_type"], result["resource"] = result["resource"].split("/", 1)
    elif ":" in result["resource"]:
        result["resource_type"], result["resource"] = result["resource"].split(":", 1)
    return result
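
# Illustrative examples of what parse_arn returns (the ARNs below are made up
# for demonstration and do not refer to real resources):
#
#   parse_arn("arn:aws:eks:us-east-1:123456789012:nodegroup/demo-dev/general/abc123")
#     -> service "eks", resource_type "nodegroup", resource "demo-dev/general/abc123"
#
#   parse_arn("arn:aws:s3:::demo-dev-terraform-state")
#     -> service "s3", resource_type None, resource "demo-dev-terraform-state"
#
# The f"{service}-{resource_type}" keys built further down follow from this,
# which is why S3 buckets are grouped under "s3-None".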


def force_destroy_configuration(config):
    logging.info("FORCE Removing all infrastructure (not using terraform).")

    with timer(logging, "destroying nebari"):
        # 01 Check we have cloud details we need
        check_cloud_credentials(config)

        if config.get("provider", "") != "aws":
            raise ValueError("force-destroy currently only available for AWS")

        project_name = config.get("project_name", "").strip()

        if project_name == "":
            raise ValueError("project_name cannot be blank")

        if "amazon_web_services" not in config:
            raise ValueError(
                "amazon_web_services section must exist in nebari-config.yaml"
            )

        region = config["amazon_web_services"].get("region", "").strip()

        if region == "":
            raise ValueError(
                "amazon_web_services.region must exist in nebari-config.yaml"
            )

        logging.info(f"Remove AWS project {project_name} in region {region}")

        env = config.get("namespace", "dev").strip()

        # 02 Remove all infrastructure
        try:
            import boto3
        except ImportError as exc:
            # Chain the original error so the failing import stays visible in the traceback
            raise ValueError(
                "Please ensure boto3 package is installed using: pip install boto3==1.17.98"
            ) from exc

        restag = boto3.client("resourcegroupstaggingapi", region_name=region)

        filter_params = dict(
            TagFilters=[
                {
                    "Key": "Owner",
                    "Values": [
                        "terraform",
                        "terraform-state",
                    ],
                },
                {
                    "Key": "Environment",
                    "Values": [
                        env,
                    ],
                },
                {
                    "Key": "Project",
                    "Values": [
                        project_name,
                    ],
                },
            ],
            ResourcesPerPage=50,
        )

        resources = []

        response = restag.get_resources(**filter_params)

        resources.extend(response["ResourceTagMappingList"])

        while "PaginationToken" in response and response["PaginationToken"]:
            token = response["PaginationToken"]
            response = restag.get_resources(**filter_params, PaginationToken=token)
            resources.extend(response["ResourceTagMappingList"])

        # Load Balancer and other K8s-generated resources will need to be queried separately:

        filter_params = dict(
            TagFilters=[
                {
                    "Key": f"kubernetes.io/cluster/{project_name}-{env}",
                    "Values": [
                        "owned",
                    ],
                }
            ],
            ResourcesPerPage=50,
        )

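        response = restag.get_resources(**filter_params)
        resources.extend(response["ResourceTagMappingList"])

        # Follow pagination here too, so that Kubernetes-owned resources beyond
        # the first page are also collected.
        while response.get("PaginationToken"):
            token = response["PaginationToken"]
            response = restag.get_resources(**filter_params, PaginationToken=token)
            resources.extend(response["ResourceTagMappingList"])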

        # IAM

        iam = boto3.resource("iam")
        for suffix in ("eks-cluster-role", "eks-node-group-role"):
            try:
                role = iam.Role(f"{project_name}-{env}-{suffix}")

                if role.tags is not None:
                    tags_dict = {t["Key"]: t.get("Value", "") for t in role.tags}

                    if (
                        tags_dict.get("Owner", "") == "terraform"
                        and tags_dict.get("Environment", "") == env
                        and tags_dict.get("Project", "") == project_name
                    ):
                        resources.append({"ResourceARN": role.arn})

            except iam.meta.client.exceptions.NoSuchEntityException:
                pass

        # Summarize resources

        type_groups = {}
        for r in resources:
            de_arned = parse_arn(r["ResourceARN"])
            t = f"{de_arned['service']}-{de_arned['resource_type']}"
            type_groups.setdefault(t, []).append(de_arned)
            logging.info(r["ResourceARN"])

        logging.info([(k, len(v)) for k, v in type_groups.items()])

        # Order
        priority_types = (
            "eks-nodegroup",
            "eks-cluster",
            "elasticloadbalancing-loadbalancer",
            "ec2-internet-gateway",
            "ec2-route-table",
            "elasticfilesystem-file-system",
            "ec2-subnet",
            "ec2-security-group",
            "ec2-vpc",
            "ecr-repository",
            "dynamodb-table",
            "s3-None",
            "resource-groups-group",
            "iam-role",
        )

        for pt in priority_types:
            logging.info(f"Inspect {pt}")
            for r in type_groups.get(pt, []):
                if pt == "eks-nodegroup":
                    nodegroup_resource = r["resource"].split("/")

                    cluster_name = nodegroup_resource[0]
                    nodegroup_name = nodegroup_resource[1]

                    logging.info(f"Delete {nodegroup_name} on cluster {cluster_name}")

                    client = boto3.client("eks", region_name=region)
                    client.delete_nodegroup(
                        clusterName=cluster_name, nodegroupName=nodegroup_name
                    )

                elif pt == "eks-cluster":
                    logging.info(f"Delete EKS cluster {r['resource']}")

                    client = boto3.client("eks", region_name=region)

                    response = client.list_nodegroups(clusterName=r["resource"])
                    while len(response["nodegroups"]) > 0:
                        logging.info("Nodegroups still present, sleep 10")
                        time.sleep(10)
                        response = client.list_nodegroups(clusterName=r["resource"])

                    client.delete_cluster(name=r["resource"])

                elif pt == "elasticloadbalancing-loadbalancer":
                    client = boto3.client("elb", region_name=region)

                    logging.info(f"Inspect Load balancer {r['resource']}")

                    logging.info(f"Delete Load balancer {r['resource']}")
                    response = client.delete_load_balancer(
                        LoadBalancerName=r["resource"]
                    )

                elif pt == "ec2-route-table":
                    logging.info(f"Inspect route table {r['resource']}")
                    ec2 = boto3.resource("ec2", region_name=region)
                    route_table = ec2.RouteTable(r["resource"])

                    for assoc in route_table.associations:
                        logging.info(f"Delete route table assoc {assoc.id}")
                        assoc.delete()

                    time.sleep(10)

                    logging.info(f"Delete route table {r['resource']}")
                    route_table.delete()

                elif pt == "ec2-subnet":
                    logging.info(f"Inspect subnet {r['resource']}")
                    ec2 = boto3.resource("ec2", region_name=region)
                    subnet = ec2.Subnet(r["resource"])

                    for ni in subnet.network_interfaces.all():
                        ni.load()
                        # But can only detach if attached...
                        if ni.attachment:
                            ni.detach(DryRun=False, Force=True)
                            ni.delete()

                    logging.info(f"Delete subnet {r['resource']}")
                    subnet.delete(DryRun=False)

                elif pt == "ec2-security-group":
                    logging.info(f"Inspect security group {r['resource']}")
                    ec2 = boto3.resource("ec2", region_name=region)
                    security_group = ec2.SecurityGroup(r["resource"])

                    for ipperms in security_group.ip_permissions_egress:
                        security_group.revoke_egress(
                            DryRun=False, IpPermissions=[ipperms]
                        )

                    for ipperms in security_group.ip_permissions:
                        security_group.revoke_ingress(
                            DryRun=False, IpPermissions=[ipperms]
                        )

                    logging.info(f"Delete security group {r['resource']}")
                    security_group.delete(DryRun=False)

                elif pt == "ec2-internet-gateway":
                    logging.info(f"Inspect internet gateway {r['resource']}")

                    ec2 = boto3.resource("ec2", region_name=region)
                    internet_gateway = ec2.InternetGateway(r["resource"])

                    for attach in internet_gateway.attachments:
                        logging.info(f"Inspect IG attachment {attach['VpcId']}")
                        if attach.get("State", "") == "available":
                            logging.info(f"Detach from VPC {attach['VpcId']}")
                            internet_gateway.detach_from_vpc(VpcId=attach["VpcId"])

                    time.sleep(10)

                    logging.info(f"Delete internet gateway {r['resource']}")
                    internet_gateway.delete(DryRun=False)

                elif pt == "elasticfilesystem-file-system":
                    client = boto3.client("efs", region_name=region)

                    logging.info(f"Delete efs {r['resource']}")

                    mts = client.describe_mount_targets(FileSystemId=r["resource"])

                    for mt in mts["MountTargets"]:
                        client.delete_mount_target(MountTargetId=mt["MountTargetId"])

                    response = client.delete_file_system(FileSystemId=r["resource"])

                    ## Should wait until this returns botocore.errorfactory.FileSystemNotFound:
                    # response = client.describe_file_systems(
                    #    FileSystemId=r['resource']
                    # )

                elif pt == "ec2-vpc":
                    logging.info(f"Inspect VPC {r['resource']}")

                    ec2 = boto3.resource("ec2", region_name=region)

                    vpc = ec2.Vpc(r["resource"])

                    # for cidr_assoc in vpc.cidr_block_association_set:
                    #    logging.info(cidr_assoc)
                    #    r = vpc.disassociate_subnet_cidr_block(
                    #        AssociationId=cidr_assoc['AssociationId']
                    #    )
                    #    logging.info(r)

                    logging.info(f"Delete VPC {r['resource']}")
                    vpc.delete()

                elif pt == "ecr-repository":
                    logging.info(f"Inspect ECR {r['resource']}")
                    client = boto3.client("ecr", region_name=region)

                    logging.info(f"Delete ecr {r['account']} / {r['resource']}")

                    response = client.delete_repository(
                        registryId=r["account"],
                        repositoryName=r["resource"],
                        force=True,
                    )

                elif pt == "s3-None":
                    logging.info(f"Inspect S3 {r['resource']}")
                    s3 = boto3.resource("s3", region_name=region)

                    logging.info(f"Delete s3 {r['resource']}")

                    bucket = s3.Bucket(r["resource"])

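                    # Empty the bucket (current objects and all versions) before
                    # deleting it; results are discarded so the loop variable `r`
                    # is not overwritten.
                    bucket.objects.all().delete()

                    bucket.object_versions.delete()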

                    response = bucket.delete()

                elif pt == "dynamodb-table":
                    logging.info(f"Inspect DynamoDB {r['resource']}")

                    client = boto3.client("dynamodb", region_name=region)

                    logging.info(f"Delete DynamoDB {r['resource']}")

                    response = client.delete_table(TableName=r["resource"])

                elif pt == "resource-groups-group":
                    logging.info(f"Inspect Resource Group {r['resource']}")

                    client = boto3.client("resource-groups", region_name=region)

                    logging.info(f"Delete Resource Group {r['resource']}")

                    response = client.delete_group(Group=r["arn"])

                elif pt == "iam-role":
                    logging.info(f"Inspect IAM Role {r['resource']}")
                    iam = boto3.resource("iam")
                    role = iam.Role(r["resource"])

                    for policy in role.attached_policies.all():
                        logging.info(f"Detach Role policy {policy.arn}")
                        response = role.detach_policy(PolicyArn=policy.arn)

                    logging.info(f"Delete IAM Role {r['resource']}")
                    role.delete()


if __name__ == "__main__":
    main()



---
File: nebari/scripts/helm-validate.py
---

import json
import logging
import os
import re
from pathlib import Path

import hcl2
from tqdm import tqdm

from _nebari.utils import deep_merge

# Configure logging
logging.basicConfig(level=logging.INFO)


class HelmChartIndexer:
    # Define regex patterns to extract variable names
    LOCAL_VAR_PATTERN = re.compile(r"local\.(.*[a-z])")
    VAR_PATTERN = re.compile(r"var\.(.*[a-z])")

    def __init__(self, stages_dir, skip_charts, debug=False):
        self.stages_dir = stages_dir
        self.skip_charts = skip_charts
        self.charts = {}
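        self.logger = logging.getLogger(__name__)
        # Honor the `debug` flag, which was previously accepted but unused
        if debug:
            self.logger.setLevel(logging.DEBUG)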

    def get_filepaths_that_contain_helm_release(self):
        """Get list of helm charts from nebari code-base"""
        # using pathlib to get list of files in the project root dir, look for all .tf files that
        # contain helm_release
        path = Path(__file__).parent.parent.absolute()
        path_tree = path.glob(f"{self.stages_dir}/**/main.tf")
        paths = []
        for file in path_tree:
            with open(file) as f:
                contents = f.read()
            if "helm_release" in contents:
                paths.append(file)
        self.logger.info(f"Found {len(paths)} files that contain helm_release")
        return paths

    def _argument_contains_variable_hook(self, argument):
        if "local." in argument or "var." in argument:
            return True
        return False

    def _clean_var_name(self, var_name, var_type):
        """Clean variable name"""
        if var_type == "local":
            # e.g. "${local.var_name}" -> "var_name"
            return self.LOCAL_VAR_PATTERN.findall(var_name)[0]
        if var_type == "var":
            # e.g. "${var.var_name}" -> "var_name"
            return self.VAR_PATTERN.findall(var_name)[0]

    def _load_variable_value(self, argument, parent_contents):
        if "local." in argument:
            var_name = self._clean_var_name(argument, "local")
            for local in parent_contents.get("locals", {}):
                if var_name in local:
                    return local[var_name]
            else:
                raise ValueError(f"Could not find local variable {var_name}")
        if "var." in argument:
            var_name = self._clean_var_name(argument, "var")
            for var in parent_contents.get("variable", {}):
                if var_name in var:
                    return var[var_name]["default"]
            else:
                raise ValueError(f"Could not find variable {var_name}")

    def retrieve_helm_information(self, filepath):
        parent_path = Path(filepath).parent

        if parent_path.name in self.skip_charts:
            self.logger.debug(f"Skipping {parent_path.name}")
            return self.charts

        self.logger.debug(f"Processing {parent_path.name}")
        parent_contents = {}

        for file in parent_path.glob("**/*.tf"):
            if file.as_posix().endswith("configmaps.tf"):
                # It should be safe to skip configmaps.tf files as they are not used to define helm_release resources
                # This was included as an exception to avoid a parsing error: on services/jupyterhub/configmaps.tf at line 8, column 5.
                continue
            with open(file, "r") as f:
                parent_contents = deep_merge(parent_contents, hcl2.load(f))

        for resource in parent_contents.get("resource", {}):
            if "helm_release" not in resource:
                continue
            for release_name, release_attrs in resource.get("helm_release", {}).items():
                self.logger.debug(f"Processing helm_release {release_name}")
                chart_name = release_attrs.get("chart", "")
                chart_version = release_attrs.get("version", "")
                chart_repository = release_attrs.get("repository", "")

                if self._argument_contains_variable_hook(chart_version):
                    self.logger.debug(
                        f"Spotted {chart_version} in {chart_name} chart metadata"
                    )
                    chart_version = self._load_variable_value(
                        chart_version, parent_contents
                    )

                if self._argument_contains_variable_hook(chart_repository):
                    self.logger.debug(
                        f"Spotted {chart_repository} in {chart_name} chart metadata"
                    )
                    chart_repository = self._load_variable_value(
                        chart_repository, parent_contents
                    )

                self.logger.debug(
                    f"Name: {chart_name} Version: {chart_version} Repository: {chart_repository}"
                )

                self.charts[chart_name] = {
                    "version": chart_version,
                    "repository": chart_repository,
                }

        if not self.charts:
            self.logger.debug("Could not find any helm_release under module resources")

        return self.charts

    def generate_helm_chart_index(self):
        """
        Generate an index of helm charts by searching for helm_release resources in Terraform files.

        Returns:
            A dictionary where the keys are the names of the charts and the values are dictionaries containing the chart's
            version and repository.

        Raises:
            ValueError: If no helm charts are found in the Terraform files.
        """
        paths = self.get_filepaths_that_contain_helm_release()
        helm_charts = {}
        for path in paths:
            helm_information = self.retrieve_helm_information(path)
            helm_charts.update(helm_information)

        if not helm_charts:
            raise ValueError("No helm charts found in the Terraform files.")

        with open("helm_charts.json", "w") as f:
            json.dump(helm_charts, f)

        return helm_charts


def pull_helm_chart(chart_index: dict, skip_charts: list) -> None:
    """
    Pull helm charts specified in `chart_index` and save them in the `helm_charts` directory.

    Args:
        chart_index: A dictionary containing chart names as keys and chart metadata (version and repository)
            as values.
        skip_charts: A list of chart names to skip.

    Raises:
        ValueError: If a chart could not be found in the `helm_charts` directory after pulling.
    """
    chart_dir = Path("helm_charts")
    chart_dir.mkdir(parents=True, exist_ok=True)

    os.chdir(chart_dir)

    for chart_name, chart_metadata in tqdm(
        chart_index.items(), desc="Downloading charts"
    ):
        chart_version = chart_metadata["version"]
        chart_repository = chart_metadata["repository"]

        if chart_name in skip_charts:
            continue

        os.system(f"helm repo add {chart_name} {chart_repository}")
        os.system(
            f"helm pull {chart_name} --version {chart_version} --repo {chart_repository} --untar"
        )

        chart_filename = Path(f"{chart_name}-{chart_version}.tgz")
        if not chart_filename.exists():
            raise ValueError(
                f"Could not find {chart_name}:{chart_version} directory in {chart_dir}."
            )

    print("All charts downloaded successfully!")
    # shutil.rmtree(Path(os.getcwd()).parent / chart_dir)


def add_workflow_job_summary(chart_index: dict):
    """
    Based on the chart index, add a summary of the workflow job to the action log.

    Args:
        chart_index (dict): A dictionary containing chart names as keys and chart metadata (version and repository)
            as values.
    """
    if "GITHUB_STEP_SUMMARY" in os.environ:
        with open(os.environ["GITHUB_STEP_SUMMARY"], "a") as f:
            f.write("\n\n## Helm Charts\n")
            for chart_name, chart_metadata in chart_index.items():
                chart_version = chart_metadata["version"]
                chart_repository = chart_metadata["repository"]
                f.write(f"- {chart_name} ({chart_version}) from {chart_repository}\n")


if __name__ == "__main__":
    # charts = generate_index_of_helm_charts()
    STAGES_DIR = "src/_nebari/stages"
    SKIP_CHARTS = ["helm-extensions"]

    charts = HelmChartIndexer(
        stages_dir=STAGES_DIR, skip_charts=SKIP_CHARTS
    ).generate_helm_chart_index()
    pull_helm_chart(charts, skip_charts=SKIP_CHARTS)
    add_workflow_job_summary(charts)



---
File: nebari/scripts/keycloak-export.py
---

import argparse
import json
import logging
import sys
from pathlib import Path

from _nebari.keycloak import get_keycloak_admin_from_config

logging.basicConfig(level=logging.INFO)


def main():
    parser = argparse.ArgumentParser(description="Export users and groups from Nebari.")
    parser.add_argument("-c", "--config", help="nebari configuration", required=True)
    args = parser.parse_args()

    handle_keycloak_export(args)


def handle_keycloak_export(args):
    config_filename = Path(args.config)
    if not config_filename.is_file():
        raise ValueError(
            f"passed in configuration filename={config_filename} must exist"
        )

    keycloak_admin = get_keycloak_admin_from_config(config_filename)

    realm = {"id": "nebari", "realm": "nebari"}

    def process_user(u):
        uid = u["id"]
        memberships = keycloak_admin.get_user_groups(uid)

        del u["id"]
        u["groups"] = [g["name"] for g in memberships]
        return u

    realm["users"] = [process_user(u) for u in keycloak_admin.get_users()]

    realm["groups"] = [
        {"name": g["name"], "path": g["path"]}
        for g in keycloak_admin.get_groups()
        if g["name"] not in {"users", "admin"}
    ]

    json.dump(realm, sys.stdout, indent=2)


if __name__ == "__main__":
    main()



---
File: nebari/src/_nebari/provider/cicd/__init__.py
---




---
File: nebari/src/_nebari/provider/cicd/common.py
---

import os


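def pip_install_nebari(nebari_version: str) -> str:
    """Return the pip command that installs Nebari.

    Installs the pinned release by default; if the NEBARI_GH_BRANCH environment
    variable is set, installs from that branch of the GitHub repository instead.
    """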
    nebari_gh_branch = os.environ.get("NEBARI_GH_BRANCH", None)
    pip_install = f"pip install nebari=={nebari_version}"
    # dev branches
    if nebari_gh_branch:
        pip_install = f"pip install git+https://github.com/nebari-dev/nebari.git@{nebari_gh_branch}"

    return pip_install



---
File: nebari/src/_nebari/provider/cicd/github.py
---

import base64
import os
from typing import Dict, List, Optional, Union

import requests
from nacl import encoding, public
from pydantic import BaseModel, ConfigDict, Field, RootModel

from _nebari.constants import LATEST_SUPPORTED_PYTHON_VERSION
from _nebari.provider.cicd.common import pip_install_nebari
from nebari import schema

GITHUB_BASE_URL = "https://api.github.com/"


def github_request(url, method="GET", json=None, authenticate=True):
    auth = None
    if authenticate:
        missing = []
        for name in ("GITHUB_USERNAME", "GITHUB_TOKEN"):
            if os.environ.get(name) is None:
                missing.append(name)
        if len(missing) > 0:
            raise ValueError(
                f"Environment variable(s) required for GitHub automation - {', '.join(missing)}"
            )
        auth = requests.auth.HTTPBasicAuth(
            os.environ["GITHUB_USERNAME"], os.environ["GITHUB_TOKEN"]
        )

    method_map = {
        "GET": requests.get,
        "PUT": requests.put,
        "POST": requests.post,
    }

    response = method_map[method](
        f"{GITHUB_BASE_URL}{url}",
        json=json,
        auth=auth,
    )
    response.raise_for_status()
    return response


def encrypt(public_key: str, secret_value: str) -> str:
    """Encrypt a Unicode string using the public key."""
    public_key = public.PublicKey(public_key.encode("utf-8"), encoding.Base64Encoder())
    sealed_box = public.SealedBox(public_key)
    encrypted = sealed_box.encrypt(secret_value.encode("utf-8"))
    return base64.b64encode(encrypted).decode("utf-8")


def get_repo_public_key(owner, repo):
    return github_request(f"repos/{owner}/{repo}/actions/secrets/public-key").json()


def update_secret(owner, repo, secret_name, secret_value):
    key = get_repo_public_key(owner, repo)
    encrypted_value = encrypt(key["key"], secret_value)

    return github_request(
        f"repos/{owner}/{repo}/actions/secrets/{secret_name}",
        method="PUT",
        json={"encrypted_value": encrypted_value, "key_id": key["key_id"]},
    )


def get_repository(owner, repo):
    return github_request(f"repos/{owner}/{repo}").json()


def get_repo_tags(owner, repo):
    return github_request(f"repos/{owner}/{repo}/tags", authenticate=False).json()


def create_repository(owner, repo, description, homepage, private=True):
    # Personal repositories are created via "user/repos"; any other owner is
    # treated as an organization.
    if owner == os.environ.get("GITHUB_USERNAME"):
        url = "user/repos"
    else:
        url = f"orgs/{owner}/repos"

    github_request(
        url,
        method="POST",
        json={
            "name": repo,
            "description": description,
            "homepage": homepage,
            "private": private,
        },
    )
    return f"git@github.com:{owner}/{repo}.git"


def gha_env_vars(config: schema.Main):
    env_vars = {
        "GITHUB_TOKEN": "${{ secrets.GITHUB_TOKEN }}",
    }

    if os.environ.get("NEBARI_GH_BRANCH"):
        env_vars["NEBARI_GH_BRANCH"] = "${{ secrets.NEBARI_GH_BRANCH }}"

    if config.provider == schema.ProviderEnum.aws:
        env_vars["AWS_ACCESS_KEY_ID"] = "${{ secrets.AWS_ACCESS_KEY_ID }}"
        env_vars["AWS_SECRET_ACCESS_KEY"] = "${{ secrets.AWS_SECRET_ACCESS_KEY }}"
        env_vars["AWS_DEFAULT_REGION"] = "${{ secrets.AWS_DEFAULT_REGION }}"
    elif config.provider == schema.ProviderEnum.azure:
        env_vars["ARM_CLIENT_ID"] = "${{ secrets.ARM_CLIENT_ID }}"
        env_vars["ARM_CLIENT_SECRET"] = "${{ secrets.ARM_CLIENT_SECRET }}"
        env_vars["ARM_SUBSCRIPTION_ID"] = "${{ secrets.ARM_SUBSCRIPTION_ID }}"
        env_vars["ARM_TENANT_ID"] = "${{ secrets.ARM_TENANT_ID }}"
    elif config.provider == schema.ProviderEnum.gcp:
        env_vars["GOOGLE_CREDENTIALS"] = "${{ secrets.GOOGLE_CREDENTIALS }}"
        env_vars["PROJECT_ID"] = "${{ secrets.PROJECT_ID }}"
    elif config.provider in [schema.ProviderEnum.local, schema.ProviderEnum.existing]:
        # create mechanism to allow for extra env vars?
        pass
    else:
        raise ValueError("Cloud Provider configuration not supported")

    return env_vars


### GITHUB-ACTIONS SCHEMA ###


class GHA_on_extras(BaseModel):
    branches: List[str]
    paths: List[str]


GHA_on = RootModel[Dict[str, GHA_on_extras]]
GHA_job_steps_extras = RootModel[Union[str, float, int]]


class GHA_job_step(BaseModel):
    name: str
    uses: Optional[str] = None
    with_: Optional[Dict[str, GHA_job_steps_extras]] = Field(alias="with", default=None)
    run: Optional[str] = None
    env: Optional[Dict[str, GHA_job_steps_extras]] = None
    model_config = ConfigDict(populate_by_name=True)


class GHA_job_id(BaseModel):
    name: str
    runs_on_: str = Field(alias="runs-on")
    permissions: Optional[Dict[str, str]] = None
    steps: List[GHA_job_step]
    model_config = ConfigDict(populate_by_name=True)


GHA_jobs = RootModel[Dict[str, GHA_job_id]]


class GHA(BaseModel):
    name: str
    on: GHA_on
    env: Optional[Dict[str, str]] = None
    jobs: GHA_jobs


class NebariOps(GHA):
    pass


class NebariLinter(GHA):
    pass


### GITHUB ACTION WORKFLOWS ###


def checkout_image_step():
    return GHA_job_step(
        name="Checkout Image",
        uses="actions/checkout@v3",
        with_={"token": GHA_job_steps_extras("${{ secrets.REPOSITORY_ACCESS_TOKEN }}")},
    )


def setup_python_step():
    return GHA_job_step(
        name="Set up Python",
        uses="actions/setup-python@v5",
        with_={"python-version": GHA_job_steps_extras(LATEST_SUPPORTED_PYTHON_VERSION)},
    )


def install_nebari_step(nebari_version):
    return GHA_job_step(name="Install Nebari", run=pip_install_nebari(nebari_version))


def gen_nebari_ops(config):
    env_vars = gha_env_vars(config)

    push = GHA_on_extras(branches=[config.ci_cd.branch], paths=["nebari-config.yaml"])
    on = GHA_on({"push": push})

    step1 = checkout_image_step()
    step2 = setup_python_step()
    step3 = install_nebari_step(config.nebari_version)
    gha_steps = [step1, step2, step3]

    for step in config.ci_cd.before_script:
        gha_steps.append(GHA_job_step(**step))

    step4 = GHA_job_step(
        name="Deploy Changes made in nebari-config.yaml",
        run=f"nebari deploy -c nebari-config.yaml --disable-prompt{' --skip-remote-state-provision' if os.environ.get('NEBARI_GH_BRANCH') else ''}",
    )
    gha_steps.append(step4)

    step5 = GHA_job_step(
        name="Push Changes",
        run=(
            "git config user.email 'nebari@quansight.com' ; "
            "git config user.name 'github action' ; "
            "git add ./.gitignore ./.github ./stages; "
            "git diff --quiet && git diff --staged --quiet || (git commit -m '${{ env.COMMIT_MSG }}') ; "
            f"git push origin {config.ci_cd.branch}"
        ),
        env={
            "COMMIT_MSG": GHA_job_steps_extras(
                "nebari-config.yaml automated commit: ${{ github.sha }}"
            )
        },
    )
    if config.ci_cd.commit_render:
        gha_steps.append(step5)

    for step in config.ci_cd.after_script:
        gha_steps.append(GHA_job_step(**step))

    job1 = GHA_job_id(
        name="nebari",
        runs_on_="ubuntu-latest",
        permissions={
            "id-token": "write",
            "contents": "read",
        },
        steps=gha_steps,
    )
    jobs = GHA_jobs({"build": job1})

    return NebariOps(
        name="nebari auto update",
        on=on,
        env=env_vars,
        jobs=jobs,
    )


def gen_nebari_linter(config):
    # Only emit a workflow-level `env` block when installing from a dev branch.
    env_vars = None
    if os.environ.get("NEBARI_GH_BRANCH"):
        env_vars = {"NEBARI_GH_BRANCH": "${{ secrets.NEBARI_GH_BRANCH }}"}

    pull_request = GHA_on_extras(
        branches=[config.ci_cd.branch], paths=["nebari-config.yaml"]
    )
    on = GHA_on({"pull_request": pull_request})

    step1 = checkout_image_step()
    step2 = setup_python_step()
    step3 = install_nebari_step(config.nebari_version)

    step4_envs = {
        "PR_NUMBER": GHA_job_steps_extras("${{ github.event.number }}"),
        "REPO_NAME": GHA_job_steps_extras("${{ github.repository }}"),
        "GITHUB_TOKEN": GHA_job_steps_extras("${{ secrets.REPOSITORY_ACCESS_TOKEN }}"),
    }

    step4 = GHA_job_step(
        name="Nebari Lintify",
        run="nebari validate --config nebari-config.yaml --enable-commenting",
        env=step4_envs,
    )

    job1 = GHA_job_id(
        name="nebari", runs_on_="ubuntu-latest", steps=[step1, step2, step3, step4]
    )
    jobs = GHA_jobs(
        {
            "nebari-validate": job1,
        }
    )

    return NebariLinter(
        name="nebari linter",
        on=on,
        env=env_vars,
        jobs=jobs,
    )



---
File: nebari/src/_nebari/provider/cicd/gitlab.py
---

from typing import Dict, List, Optional, Union

from pydantic import BaseModel, ConfigDict, Field, RootModel

from _nebari.constants import LATEST_SUPPORTED_PYTHON_VERSION
from _nebari.provider.cicd.common import pip_install_nebari

GLCI_extras = RootModel[Union[str, float, int]]


class GLCI_image(BaseModel):
    name: str
    entrypoint: Optional[str] = None


class GLCI_rules(BaseModel):
    if_: Optional[str] = Field(alias="if")
    changes: Optional[List[str]] = None
    model_config = ConfigDict(populate_by_name=True)


class GLCI_job(BaseModel):
    image: Optional[Union[str, GLCI_image]] = None
    variables: Optional[Dict[str, str]] = None
    before_script: Optional[List[str]] = None
    after_script: Optional[List[str]] = None
    script: List[str]
    rules: Optional[List[GLCI_rules]] = None


GLCI = RootModel[Dict[str, GLCI_job]]


def gen_gitlab_ci(config):
    render_vars = {
        "COMMIT_MSG": "nebari-config.yaml automated commit: {{ '$CI_COMMIT_SHA' }}",
    }

    script = [
        f"git checkout {config.ci_cd.branch}",
        pip_install_nebari(config.nebari_version),
        "nebari deploy --config nebari-config.yaml --disable-prompt --skip-remote-state-provision",
    ]

    commit_render_script = [
        "git config user.email 'nebari@quansight.com'",
        "git config user.name 'gitlab ci'",
        "git add .",
        "git diff --quiet && git diff --staged --quiet || (git commit -m '${COMMIT_MSG}'",
        f"git push origin {config.ci_cd.branch})",
    ]

    if config.ci_cd.commit_render:
        script += commit_render_script

    rules = [
        GLCI_rules(
            if_=f"$CI_COMMIT_BRANCH == '{config.ci_cd.branch}'",
            changes=["nebari-config.yaml"],
        )
    ]

    render_nebari = GLCI_job(
        image=f"python:{LATEST_SUPPORTED_PYTHON_VERSION}",
        variables=render_vars,
        before_script=config.ci_cd.before_script,
        after_script=config.ci_cd.after_script,
        script=script,
        rules=rules,
    )

    return GLCI(
        {
            "render-nebari": render_nebari,
        }
    )



---
File: nebari/src/_nebari/provider/cicd/linter.py
---

import json
import os
import textwrap
from pathlib import Path

import requests

from _nebari.schema import verify


def nebari_validate(config):
    # Gather the output of `nebari validate`.
    print("Validate: info: validating Nebari configuration in nebari-config.yaml")

    def parse_validation(message):
        # this will just separate things for now, but can be enhanced
        return str(message)

    try:
        verify(config)
        msg = "validate: info: successfully validated Nebari configuration"
        print(msg)
        return True, msg, 0

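    # Catch Exception rather than BaseException so that KeyboardInterrupt and
    # SystemExit are not reported as validation failures.
    except Exception as e: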
        msg = "validate: error: failed to validate Nebari configuration."
        print(msg)
        validate_comment = parse_validation(e)
        validate_comment_wrapper = f"\n```\n{validate_comment}\n``` "
        return False, validate_comment_wrapper, 1


def generate_lint_message(config):
    # prep for linting
    pr_config = Path("nebari-config.yaml")
    # lint/validate nebari-config.yaml
    all_pass, messages, validate_code = nebari_validate(config)

    pass_lint = textwrap.dedent(
        """
            This is an automatic response from the Nebari linter.
            I just wanted to let you know that I linted your `nebari-config.yaml` in your PR and I didn't find any
            problems.
            """
    )

    # it would be better to parse these messages first
    bad_lint = (
        textwrap.dedent(
            """
            This is an automatic response from the Nebari linter.
            I just wanted to let you know that I linted your `nebari-config.yaml` in your PR and found some errors:\n"""
        )
        + f"{messages}"
    )

    # A Path object is always truthy, so check the filesystem explicitly.
    if not pr_config.is_file():
        status = "no configuration file"
        message = textwrap.dedent(
            """
            This is an automatic response from the Nebari linter.
            I was trying to look for the `nebari-config.yaml` file to lint for you, but couldn't find any...
            """
        )

    elif all_pass:
        status = "Success"
        message = pass_lint
    else:
        status = "Failure"
        message = bad_lint

    lint = {
        "message": f"#### `nebari validate` {status} \n" + message,
        "code": validate_code,
    }
    return lint


def comment_on_pr(config):
    lint = generate_lint_message(config)
    message = lint["message"]
    exitcode = lint["code"]

    print(
        "If the comment was not published, the following would "
        "have been the message:\n{}".format(message)
    )

    # comment on PR
    owner, repo_name = os.environ["REPO_NAME"].split("/")
    pr_id = os.environ["PR_NUMBER"]

    token = os.environ["GITHUB_TOKEN"]
    url = f"https://api.github.com/repos/{owner}/{repo_name}/issues/{pr_id}/comments"

    payload = {"body": message}
    headers = {"Content-Type": "application/json", "Authorization": f"token {token}"}
    requests.post(url=url, headers=headers, data=json.dumps(payload))

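    # The `exit` builtin is injected by the `site` module and is not always
    # available; raising SystemExit is the equivalent, always-available form.
    raise SystemExit(exitcode)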



---
File: nebari/src/_nebari/provider/cloud/__init__.py
---




---
File: nebari/src/_nebari/provider/cloud/amazon_web_services.py
---

import functools
import os
import re
import time
from dataclasses import dataclass
from typing import Dict, List, Optional

import boto3
from botocore.exceptions import ClientError, EndpointConnectionError

from _nebari.constants import AWS_ENV_DOCS
from _nebari.provider.cloud.commons import filter_by_highest_supported_k8s_version
from _nebari.utils import check_environment_variables
from nebari import schema

MAX_RETRIES = 5
DELAY = 5


def check_credentials() -> None:
    required_variables = {"AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"}
    check_environment_variables(required_variables, AWS_ENV_DOCS)


@functools.lru_cache()
def aws_session(
    region: Optional[str] = None,
) -> boto3.Session:
    """Create a boto3 session."""
    check_credentials()
    aws_access_key_id = os.environ["AWS_ACCESS_KEY_ID"]
    aws_secret_access_key = os.environ["AWS_SECRET_ACCESS_KEY"]
    aws_session_token = os.environ.get("AWS_SESSION_TOKEN")

    if not region:
        raise ValueError(
            "Please specify `region` in the nebari-config.yaml or if initializing the nebari-config, set the region via the "
            "`--region` flag or via the AWS_DEFAULT_REGION environment variable.\n"
        )

    return boto3.Session(
        region_name=region,
        aws_access_key_id=aws_access_key_id,
        aws_secret_access_key=aws_secret_access_key,
        aws_session_token=aws_session_token,
    )


@functools.lru_cache()
def regions(region: str) -> Dict[str, str]:
    """Return dict of enabled regions for the AWS account.

    NOTE: This function attempts to call the EC2 describe_regions() API.
    If the API call fails, we catch the two most common exceptions:
      - EndpointConnectionError: This is raised when the region specified is invalid.
      - ClientError (AuthFailure): This is raised when the credentials are invalid, or when the specified region belongs to a different partition than the credentials (e.g. a standard region with AWS GovCloud credentials, or vice versa).
    """
    session = aws_session(region=region)
    try:
        client = session.client("ec2")
        regions = client.describe_regions()["Regions"]
        return {_["RegionName"]: _["RegionName"] for _ in regions}
    except EndpointConnectionError as e:
        print("Please double-check that the region specified is valid.", e)
        exit(1)
    except ClientError as e:
        if "AuthFailure" in str(e):
            print(
                "Please double-check that the AWS credentials are valid and have the correct permissions.",
                "If you're deploying into a non-standard partition (e.g. AWS GovCloud), please ensure the region specified exists in that partition.",
            )
            exit(1)
        else:
            raise e


@functools.lru_cache()
def zones(region: str) -> Dict[str, str]:
    """Return dict of enabled availability zones for the AWS region."""
    session = aws_session(region=region)
    client = session.client("ec2")

    response = client.describe_availability_zones()
    return {_["ZoneName"]: _["ZoneName"] for _ in response["AvailabilityZones"]}


@functools.lru_cache()
def kubernetes_versions(region: str) -> List[str]:
    """Return list of Kubernetes versions supported by the cloud provider, sorted from oldest to latest."""
    # The AWS SDK (boto3) currently doesn't offer an intuitive way to list available Kubernetes versions.
    # This implementation collects the Kubernetes versions that EKS addons declare compatibility with.
    # The result will therefore always be (at the very least) a subset of all Kubernetes versions still supported by AWS.
    session = aws_session(region=region)
    client = session.client("eks")

    supported_kubernetes_versions = list()
    available_addons = client.describe_addon_versions()
    # Default to empty lists so that a missing key skips the loop instead of
    # raising a TypeError when iterating over None.
    for addon in available_addons.get("addons", []):
        for eksbuild in addon.get("addonVersions", []):
            for k8sversion in eksbuild.get("compatibilities", []):
                supported_kubernetes_versions.append(
                    k8sversion.get("clusterVersion", None)
                )

    supported_kubernetes_versions = sorted(list(set(supported_kubernetes_versions)))
    return filter_by_highest_supported_k8s_version(supported_kubernetes_versions)


@functools.lru_cache()
def instances(region: str) -> Dict[str, str]:
    """Return dict of available instance types for the AWS region."""
    session = aws_session(region=region)
    client = session.client("ec2")
    paginator = client.get_paginator("describe_instance_types")
    instance_types = sorted(
        [j["InstanceType"] for i in paginator.paginate() for j in i["InstanceTypes"]]
    )
    return {t: t for t in instance_types}


@dataclass
class Kms_Key_Info:
    Arn: str
    KeyUsage: str
    KeySpec: str
    KeyManager: str


@functools.lru_cache()
def kms_key_arns(region: str) -> Dict[str, Kms_Key_Info]:
    """Return dict of available/enabled KMS key IDs and associated KeyMetadata for the AWS region."""
    session = aws_session(region=region)
    client = session.client("kms")
    kms_keys = {}
    # https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kms/client/list_keys.html
    for key in client.list_keys().get("Keys"):
        key_id = key["KeyId"]
        # https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kms/client/describe_key.html#:~:text=Response%20Structure
        key_data = client.describe_key(KeyId=key_id).get("KeyMetadata")
        if key_data.get("Enabled"):
            kms_keys[key_id] = Kms_Key_Info(
                Arn=key_data.get("Arn"),
                KeyUsage=key_data.get("KeyUsage"),
                KeySpec=key_data.get("KeySpec"),
                KeyManager=key_data.get("KeyManager"),
            )
    return kms_keys


def aws_get_vpc_id(name: str, namespace: str, region: str) -> Optional[str]:
    """Return VPC ID for the EKS cluster named `{name}-{namespace}`."""
    cluster_name = f"{name}-{namespace}"
    session = aws_session(region=region)
    client = session.client("ec2")
    response = client.describe_vpcs()

    for vpc in response["Vpcs"]:
        tags = vpc.get("Tags", [])
        for tag in tags:
            if tag["Key"] == "Name" and tag["Value"] == cluster_name:
                return vpc["VpcId"]
    return None


def set_asg_tags(asg_node_group_map: Dict[str, str], region: str) -> None:
    """Tag the autoscaling groups with their node group so that scaling up from zero nodes works."""
    session = aws_session(region=region)
    autoscaling_client = session.client("autoscaling")
    tags = []
    for asg_name, node_group in asg_node_group_map.items():
        tags.append(
            {
                "Key": "k8s.io/cluster-autoscaler/node-template/label/dedicated",
                "Value": node_group,
                "ResourceId": asg_name,
                "ResourceType": "auto-scaling-group",
                "PropagateAtLaunch": True,
            }
        )
    autoscaling_client.create_or_update_tags(Tags=tags)


def aws_get_asg_node_group_mapping(
    name: str, namespace: str, region: str
) -> Dict[str, str]:
    """Return a dictionary of autoscaling groups and their associated node groups."""
    asg_node_group_mapping = {}
    session = aws_session(region=region)
    eks = session.client("eks")
    node_groups_response = eks.list_nodegroups(
        clusterName=f"{name}-{namespace}",
    )
    node_groups = node_groups_response.get("nodegroups", [])
    for nodegroup in node_groups:
        response = eks.describe_nodegroup(
            clusterName=f"{name}-{namespace}", nodegroupName=nodegroup
        )
        node_group_name = response["nodegroup"]["nodegroupName"]
        auto_scaling_groups = response["nodegroup"]["resources"]["autoScalingGroups"]
        for auto_scaling_group in auto_scaling_groups:
            asg_node_group_mapping[auto_scaling_group["name"]] = node_group_name
    return asg_node_group_mapping


def aws_get_subnet_ids(name: str, namespace: str, region: str) -> List[str]:
    """Return list of subnet IDs for the EKS cluster named `{name}-{namespace}`."""
    session = aws_session(region=region)
    client = session.client("ec2")
    response = client.describe_subnets()

    subnet_ids = []
    required_tags = 0
    for subnet in response["Subnets"]:
        tags = subnet.get("Tags", [])
        for tag in tags:
            if (tag["Key"] == "Project" and tag["Value"] == name) or (
                tag["Key"] == "Environment" and tag["Value"] == namespace
            ):
                required_tags += 1
        if required_tags == 2:
            subnet_ids.append(subnet["SubnetId"])
        required_tags = 0

    return subnet_ids


def aws_get_route_table_ids(name: str, namespace: str, region: str) -> List[str]:
    """Return list of route table IDs for the EKS cluster named `{name}-{namespace}`."""
    cluster_name = f"{name}-{namespace}"
    session = aws_session(region=region)
    client = session.client("ec2")
    response = client.describe_route_tables()

    routing_table_ids = []
    for routing_table in response["RouteTables"]:
        tags = routing_table.get("Tags", [])
        for tag in tags:
            if tag["Key"] == "Name" and tag["Value"] == cluster_name:
                routing_table_ids.append(routing_table["RouteTableId"])

    return routing_table_ids


def aws_get_internet_gateway_ids(name: str, namespace: str, region: str) -> List[str]:
    """Return list of internet gateway IDs for the EKS cluster named `{name}-{namespace}`."""
    cluster_name = f"{name}-{namespace}"
    session = aws_session(region=region)
    client = session.client("ec2")
    response = client.describe_internet_gateways()

    internet_gateways = []
    for internet_gateway in response["InternetGateways"]:
        tags = internet_gateway.get("Tags", [])
        for tag in tags:
            if tag["Key"] == "Name" and tag["Value"] == cluster_name:
                internet_gateways.append(internet_gateway["InternetGatewayId"])

    return internet_gateways


def aws_get_security_group_ids(name: str, namespace: str, region: str) -> List[str]:
    """Return list of security group IDs for the EKS cluster named `{name}-{namespace}`."""
    cluster_name = f"{name}-{namespace}"
    session = aws_session(region=region)
    client = session.client("ec2")
    response = client.describe_security_groups()

    security_group_ids = []
    for security_group in response["SecurityGroups"]:
        tags = security_group.get("Tags", [])
        for tag in tags:
            if tag["Key"] == "Name" and tag["Value"] == cluster_name:
                security_group_ids.append(security_group["GroupId"])

    return security_group_ids


def aws_get_load_balancer_name(vpc_id: str, region: str) -> Optional[str]:
    """Return load balancer name for the VPC ID."""
    if not vpc_id:
        print("No VPC ID provided. Exiting...")
        return None

    session = aws_session(region=region)
    client = session.client("elb")
    response = client.describe_load_balancers()["LoadBalancerDescriptions"]

    for load_balancer in response:
        if load_balancer["VPCId"] == vpc_id:
            return load_balancer["LoadBalancerName"]
    return None


def aws_get_efs_ids(name: str, namespace: str, region: str) -> List[str]:
    """Return list of EFS IDs for the EKS cluster named `{name}-{namespace}`."""
    session = aws_session(region=region)
    client = session.client("efs")
    response = client.describe_file_systems()

    efs_ids = []
    required_tags = 0
    for efs in response["FileSystems"]:
        tags = efs.get("Tags", [])
        for tag in tags:
            if (tag["Key"] == "Project" and tag["Value"] == name) or (
                tag["Key"] == "Environment" and tag["Value"] == namespace
            ):
                required_tags += 1
        if required_tags == 2:
            efs_ids.append(efs["FileSystemId"])
        required_tags = 0

    return efs_ids


def aws_get_efs_mount_target_ids(efs_id: str, region: str) -> List[str]:
    """Return list of EFS mount target IDs for the EFS ID."""
    if not efs_id:
        print("No EFS ID provided. Exiting...")
        return []

    session = aws_session(region=region)
    client = session.client("efs")
    response = client.describe_mount_targets(FileSystemId=efs_id)

    mount_target_ids = []
    for mount_target in response["MountTargets"]:
        mount_target_ids.append(mount_target["MountTargetId"])

    return mount_target_ids


def aws_get_ec2_volume_ids(name: str, namespace: str, region: str) -> List[str]:
    """Return list of EC2 volume IDs for the EKS cluster named `{name}-{namespace}`."""
    cluster_name = f"{name}-{namespace}"
    session = aws_session(region=region)
    client = session.client("ec2")
    response = client.describe_volumes()

    volume_ids = []
    for volume in response["Volumes"]:
        tags = volume.get("Tags", [])
        for tag in tags:
            if tag["Key"] == "KubernetesCluster" and tag["Value"] == cluster_name:
                volume_ids.append(volume["VolumeId"])

    return volume_ids


def aws_get_iam_policy(
    region: Optional[str], name: Optional[str] = None, pattern: Optional[str] = None
) -> Optional[str]:
    """Return IAM policy ARN for the policy name or pattern."""
    session = aws_session(region=region)
    client = session.client("iam")
    response = client.list_policies(Scope="Local")

    for policy in response["Policies"]:
        if (name and policy["PolicyName"] == name) or (
            pattern and re.match(pattern, policy["PolicyName"])
        ):
            return policy["Arn"]
    return None


def aws_delete_load_balancer(name: str, namespace: str, region: str):
    """Delete load balancer for the EKS cluster named `{name}-{namespace}`."""
    vpc_id = aws_get_vpc_id(name, namespace, region=region)
    if not vpc_id:
        print("No VPC ID provided. Exiting...")
        return

    load_balancer_name = aws_get_load_balancer_name(vpc_id, region=region)
    if not load_balancer_name:
        print("No load balancer found. Exiting...")
        return

    session = aws_session(region=region)
    client = session.client("elb")

    try:
        client.delete_load_balancer(LoadBalancerName=load_balancer_name)
        print(f"Initiated deletion for load balancer {load_balancer_name}")
    except ClientError as e:
        if "ResourceNotFoundException" in str(e):
            print(f"Load balancer {load_balancer_name} not found. Exiting...")
            return
        else:
            raise e

    retries = 0
    while retries < MAX_RETRIES:
        try:
            client.describe_load_balancers(LoadBalancerNames=[load_balancer_name])
            print(f"Waiting for load balancer {load_balancer_name} to be deleted...")
            sleep_time = DELAY * (2**retries)
            time.sleep(sleep_time)
        except ClientError as e:
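            # The classic ELB API reports a missing load balancer with the
            # `LoadBalancerNotFound` error code.
            if any(
                code in str(e)
                for code in ("LoadBalancerNotFound", "ResourceNotFoundException")
            ):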
                print(f"Load balancer {load_balancer_name} deleted successfully")
                return
            else:
                raise e
        retries += 1
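

# Illustrative sketch, not part of the original module: the deletion helpers in
# this file repeat one pattern, "poll until the resource is gone, sleeping
# DELAY * 2**retries between checks". A generic helper could look like the
# function below. The name `_poll_with_backoff` and its parameters are
# hypothetical; `sleep` is injectable only so the sketch can be exercised
# without real waiting.
import time
from typing import Callable


def _poll_with_backoff(
    is_done: Callable[[], bool],
    max_retries: int,
    delay: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once `is_done()` is truthy, False if retries run out."""
    for attempt in range(max_retries):
        if is_done():
            return True
        # Same schedule as the loops in this file: delay, 2*delay, 4*delay, ...
        sleep(delay * (2**attempt))
    return False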


def aws_delete_efs_mount_targets(efs_id: str, region: str):
    """Delete all mount targets for the EFS ID."""
    if not efs_id:
        print("No EFS provided. Exiting...")
        return

    session = aws_session(region=region)
    client = session.client("efs")

    mount_target_ids = aws_get_efs_mount_target_ids(efs_id, region=region)
    for mount_target_id in mount_target_ids:
        try:
            client.delete_mount_target(MountTargetId=mount_target_id)
            print(f"Initiated deletion for mount target {mount_target_id}")
        except ClientError as e:
            if "MountTargetNotFound" in str(e):
                print(f"Mount target {mount_target_id} not found. Exiting...")
            else:
                raise e

    retries = 0
    while retries < MAX_RETRIES:
        mount_target_ids = aws_get_efs_mount_target_ids(efs_id, region=region)
        if len(mount_target_ids) == 0:
            print(f"All mount targets for EFS {efs_id} deleted successfully")
            return
        else:
            print(f"Waiting for mount targets for EFS {efs_id} to be deleted...")
            sleep_time = DELAY * (2**retries)
            time.sleep(sleep_time)
        retries += 1


def aws_delete_efs_file_system(efs_id: str, region: str):
    """Delete EFS file system for the EFS ID."""
    if not efs_id:
        print("No EFS provided. Exiting...")
        return

    session = aws_session(region=region)
    client = session.client("efs")

    try:
        client.delete_file_system(FileSystemId=efs_id)
        print(f"Initiated deletion for EFS {efs_id}")
    except ClientError as e:
        if "FileSystemNotFound" in str(e):
            print(f"EFS {efs_id} not found. Exiting...")
            return
        else:
            raise e

    retries = 0
    while retries < MAX_RETRIES:
        try:
            client.describe_file_systems(FileSystemId=efs_id)
            print(f"Waiting for EFS {efs_id} to be deleted...")
            sleep_time = DELAY * (2**retries)
            time.sleep(sleep_time)
        except ClientError as e:
            if "FileSystemNotFound" in str(e):
                print(f"EFS {efs_id} deleted successfully")
                return
            else:
                raise e
        retries += 1
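

# Illustrative sketch, not part of the original module: botocore's ClientError
# carries a structured `response["Error"]["Code"]` field, which is a less
# fragile thing to match on than a substring of `str(e)`. The helper name
# `_has_error_code` is hypothetical. It uses duck typing, so any exception
# exposing a `response` dict of that shape works.
def _has_error_code(exc: Exception, *codes: str) -> bool:
    """Return True if the exception's structured error code is one of `codes`."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") in codes


# Possible use inside an `except ClientError as e:` block:
#     if _has_error_code(e, "FileSystemNotFound"): ...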


def aws_delete_efs(name: str, namespace: str, region: str):
    """Delete EFS resources for the EKS cluster named `{name}-{namespace}`."""
    efs_ids = aws_get_efs_ids(name, namespace, region=region)
    for efs_id in efs_ids:
        aws_delete_efs_mount_targets(efs_id, region=region)
        aws_delete_efs_file_system(efs_id, region=region)


def aws_delete_subnets(name: str, namespace: str, region: str):
    """Delete all subnets for the EKS cluster named `{name}-{namespace}`."""
    session = aws_session(region=region)
    client = session.client("ec2")

    vpc_id = aws_get_vpc_id(name, namespace, region=region)
    subnet_ids = aws_get_subnet_ids(name, namespace, region=region)
    for subnet_id in subnet_ids:
        try:
            client.delete_subnet(SubnetId=subnet_id)
            print(f"Initiated deletion for subnet {subnet_id}")
        except ClientError as e:
            if "InvalidSubnetID.NotFound" in str(e):
                print(f"Subnet {subnet_id} not found. Exiting...")
            else:
                raise e

    retries = 0
    while retries < MAX_RETRIES:
        subnet_ids = aws_get_subnet_ids(name, namespace, region=region)
        if len(subnet_ids) == 0:
            print(f"All subnets for VPC {vpc_id} deleted successfully")
            return
        else:
            print(f"Waiting for subnets for VPC {vpc_id} to be deleted...")
            sleep_time = DELAY * (2**retries)
            time.sleep(sleep_time)
        retries += 1


def aws_delete_route_tables(name: str, namespace: str, region: str):
    """Delete all route tables for the EKS cluster named `{name}-{namespace}`."""
    session = aws_session(region=region)
    client = session.client("ec2")

    vpc_id = aws_get_vpc_id(name, namespace, region=region)
    route_table_ids = aws_get_route_table_ids(name, namespace, region=region)
    for route_table_id in route_table_ids:
        try:
            client.delete_route_table(RouteTableId=route_table_id)
            print(f"Initiated deletion for route table {route_table_id}")
        except ClientError as e:
            if "InvalidRouteTableID.NotFound" in str(e):
                print(f"Route table {route_table_id} not found. Exiting...")
            else:
                raise e

    retries = 0
    while retries < MAX_RETRIES:
        route_table_ids = aws_get_route_table_ids(name, namespace, region=region)
        if len(route_table_ids) == 0:
            print(f"All route tables for VPC {vpc_id} deleted successfully")
            return
        else:
            print(f"Waiting for route tables for VPC {vpc_id} to be deleted...")
            sleep_time = DELAY * (2**retries)
            time.sleep(sleep_time)
        retries += 1


def aws_delete_internet_gateways(name: str, namespace: str, region: str):
    """Delete all internet gateways for the EKS cluster named `{name}-{namespace}`."""
    session = aws_session(region=region)
    client = session.client("ec2")

    vpc_id = aws_get_vpc_id(name, namespace, region=region)
    internet_gateway_ids = aws_get_internet_gateway_ids(name, namespace, region=region)
    for internet_gateway_id in internet_gateway_ids:
        try:
            client.detach_internet_gateway(
                InternetGatewayId=internet_gateway_id, VpcId=vpc_id
            )
            client.delete_internet_gateway(InternetGatewayId=internet_gateway_id)
            print(
                f"Initiated deletion for internet gateway {internet_gateway_id} from VPC {vpc_id}"
            )
        except ClientError as e:
            if "InvalidInternetGatewayID.NotFound" in str(e):
                print(f"Internet gateway {internet_gateway_id} not found. Exiting...")
            else:
                raise e

    retries = 0
    while retries < MAX_RETRIES:
        internet_gateway_ids = aws_get_internet_gateway_ids(
            name, namespace, region=region
        )
        if len(internet_gateway_ids) == 0:
            print(f"All internet gateways for VPC {vpc_id} deleted successfully")
            return
        else:
            print(f"Waiting for internet gateways for VPC {vpc_id} to be detached...")
            sleep_time = DELAY * (2**retries)
            time.sleep(sleep_time)
        retries += 1


def aws_delete_security_groups(name: str, namespace: str, region: str):
    """Delete all security groups for the EKS cluster named `{name}-{namespace}`."""
    session = aws_session(region=region)
    client = session.client("ec2")

    vpc_id = aws_get_vpc_id(name, namespace, region=region)
    security_group_ids = aws_get_security_group_ids(name, namespace, region=region)
    for security_group_id in security_group_ids:
        try:
            client.delete_security_group(GroupId=security_group_id)
            print(f"Initiated deletion for security group {security_group_id}")
        except ClientError as e:
            if "InvalidGroupID.NotFound" in str(e):
                print(f"Security group {security_group_id} not found. Exiting...")
            else:
                raise e

    retries = 0
    while retries < MAX_RETRIES:
        security_group_ids = aws_get_security_group_ids(name, namespace, region=region)
        if len(security_group_ids) == 0:
            print(f"All security groups for VPC {vpc_id} deleted successfully")
            return
        else:
            print(f"Waiting for security groups for VPC {vpc_id} to be deleted...")
            sleep_time = DELAY * (2**retries)
            time.sleep(sleep_time)
        retries += 1


def aws_delete_vpc(name: str, namespace: str, region: str):
    """Delete VPC for the EKS cluster named `{name}-{namespace}`."""
    session = aws_session(region=region)
    client = session.client("ec2")

    vpc_id = aws_get_vpc_id(name, namespace, region=region)
    if vpc_id is None:
        print(f"No VPC found for cluster {name}-{namespace}. Exiting...")
        return

    try:
        client.delete_vpc(VpcId=vpc_id)
        print(f"Initiated deletion for VPC {vpc_id}")
    except ClientError as e:
        if "InvalidVpcID.NotFound" in str(e):
            print(f"VPC {vpc_id} not found. Exiting...")
        else:
            raise e

    retries = 0
    while retries < MAX_RETRIES:
        # Do not overwrite `vpc_id` here; the original ID is needed for logging.
        if aws_get_vpc_id(name, namespace, region=region) is None:
            print(f"VPC {vpc_id} deleted successfully")
            return
        else:
            print(f"Waiting for VPC {vpc_id} to be deleted...")
            sleep_time = DELAY * (2**retries)
            time.sleep(sleep_time)
        retries += 1


def aws_delete_dynamodb_table(name: str, region: str):
    """Delete DynamoDB table."""
    session = aws_session(region=region)
    client = session.client("dynamodb")

    try:
        client.delete_table(TableName=name)
        print(f"Initiated deletion for DynamoDB table {name}")
    except ClientError as e:
        if "ResourceNotFoundException" in str(e):
            print(f"DynamoDB table {name} not found. Exiting...")
        else:
            raise e

    retries = 0
    while retries < MAX_RETRIES:
        try:
            client.describe_table(TableName=name)
            print(f"Waiting for DynamoDB table {name} to be deleted...")
            sleep_time = DELAY * (2**retries)
            time.sleep(sleep_time)
        except ClientError as e:
            if "ResourceNotFoundException" in str(e):
                print(f"DynamoDB table {name} deleted successfully")
                return
            else:
                raise e
        retries += 1


def aws_delete_ec2_volumes(name: str, namespace: str, region: str):
    """Delete all EC2 volumes for the EKS cluster named `{name}-{namespace}`."""
    session = aws_session(region=region)
    client = session.client("ec2")

    volume_ids = aws_get_ec2_volume_ids(name, namespace, region=region)
    for volume_id in volume_ids:
        try:
            client.delete_volume(VolumeId=volume_id)
            print(f"Initiated deletion for volume {volume_id}")
        except ClientError as e:
            if "InvalidVolume.NotFound" in str(e):
                print(f"Volume {volume_id} not found. Exiting...")
            else:
                raise e

    retries = 0
    while retries < MAX_RETRIES:
        volume_ids = aws_get_ec2_volume_ids(name, namespace, region=region)
        if len(volume_ids) == 0:
            print("All volumes deleted successfully")
            return
        else:
            print("Waiting for volumes to be deleted...")
            sleep_time = DELAY * (2**retries)
            time.sleep(sleep_time)
        retries += 1


def aws_delete_s3_objects(
    bucket_name: str,
    endpoint: Optional[str] = None,
    region: Optional[str] = None,
):
    """
    Delete all objects in the S3 bucket.

    Parameters:
        bucket_name (str): S3 bucket name
        endpoint (str): S3 endpoint URL
        region (str): AWS region

    """
    session = aws_session(region=region)
    s3 = session.client("s3", endpoint_url=endpoint)

    try:
        s3_objects = s3.list_objects(Bucket=bucket_name)
        s3_objects = s3_objects.get("Contents")
        if s3_objects:
            for obj in s3_objects:
                s3.delete_object(Bucket=bucket_name, Key=obj["Key"])

    except ClientError as e:
        if e.response["Error"]["Code"] == "NoSuchBucket":
            print(f"Bucket {bucket_name} not found. Exiting...")
        else:
            raise e

    try:
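        versioned_objects = s3.list_object_versions(Bucket=bucket_name)
        # Delete stored object versions as well as delete markers; a versioned
        # bucket cannot be removed while either kind of entry remains.
        entries = versioned_objects.get("Versions", []) + versioned_objects.get(
            "DeleteMarkers", []
        )
        for version in entries:
            s3.delete_object(
                Bucket=bucket_name, Key=version["Key"], VersionId=version["VersionId"]
            )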
    except ClientError as e:
        if e.response["Error"]["Code"] == "NoSuchBucket":
            print(f"Bucket {bucket_name} not found. Exiting...")
        else:
            raise e

    retries = 0
    while retries < MAX_RETRIES:
        try:
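            # `Contents` is a top-level key of the response, not part of
            # `ResponseMetadata`; it is absent once the bucket is empty.
            objs = s3.list_objects(Bucket=bucket_name).get("Contents")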
            if objs is None:
                print("Bucket objects all deleted successfully")
                return
            sleep_time = DELAY * (2**retries)
            time.sleep(sleep_time)
        except ClientError as e:
            if e.response["Error"]["Code"] == "NoSuchBucket":
                print(f"Bucket {bucket_name} deleted successfully")
                return
            else:
                raise e
        retries += 1
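

# Illustrative sketch, not part of the original module: S3's `delete_objects`
# call accepts up to 1000 keys per request, so a versioned bucket can be
# emptied in batches instead of one `delete_object` call per entry. The helper
# name `_build_delete_batches` is hypothetical; it only shapes the payloads
# from a `list_object_versions`-style response and makes no AWS calls.
from typing import Any, Dict, List


def _build_delete_batches(
    response: Dict[str, Any], batch_size: int = 1000
) -> List[Dict[str, Any]]:
    """Return `Delete=` payloads covering all versions and delete markers."""
    entries = [
        {"Key": item["Key"], "VersionId": item["VersionId"]}
        for item in response.get("Versions", []) + response.get("DeleteMarkers", [])
    ]
    return [
        {"Objects": entries[start : start + batch_size], "Quiet": True}
        for start in range(0, len(entries), batch_size)
    ]


# Each payload would then be sent as:
#     s3.delete_objects(Bucket=bucket_name, Delete=payload)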


def aws_delete_s3_bucket(
    bucket_name: str,
    endpoint: Optional[str] = None,
    region: Optional[str] = None,
):
    """
    Delete S3 bucket.

    Parameters:
        bucket_name (str): S3 bucket name
        endpoint (str): S3 endpoint URL
        region (str): AWS region
    """
    aws_delete_s3_objects(bucket_name, endpoint, region)

    session = aws_session(region=region)
    s3 = session.client("s3", endpoint_url=endpoint)

    try:
        s3.delete_bucket(Bucket=bucket_name)
        print(f"Initiated deletion for bucket {bucket_name}")
    except ClientError as e:
        if e.response["Error"]["Code"] == "NoSuchBucket":
            print(f"Bucket {bucket_name} not found. Exiting...")
            return
        else:
            raise e

    retries = 0
    while retries < MAX_RETRIES:
        try:
            s3.head_bucket(Bucket=bucket_name)
            print(f"Waiting for bucket {bucket_name} to be deleted...")
            sleep_time = DELAY * (2**retries)
            time.sleep(sleep_time)
        except ClientError as e:
            if (e.response["Error"]["Code"] == "NoSuchBucket") or (
                e.response["Error"]["Code"] == "NotFound"
            ):
                print(f"Bucket {bucket_name} deleted successfully")
                return
            else:
                raise e
        retries += 1


def aws_delete_iam_role_policies(role_name: str, region: str):
    """Detach all managed policies attached to the IAM role."""
    session = aws_session(region=region)
    iam = session.client("iam")

    try:
        response = iam.list_attached_role_policies(RoleName=role_name)
        for policy in response["AttachedPolicies"]:
            # Attached (managed) policies are detached by ARN;
            # `delete_role_policy` only removes inline policies.
            iam.detach_role_policy(RoleName=role_name, PolicyArn=policy["PolicyArn"])
            print(
                f"Detached IAM policy {policy['PolicyName']} from IAM role {role_name}"
            )
    except ClientError as e:
        if "NoSuchEntity" in str(e):
            print(f"IAM role {role_name} not found. Exiting...")
        else:
            raise e


def aws_delete_iam_policy(name: str, region: str):
    """Delete IAM policy."""
    session = aws_session(region=region)
    iam = session.client("iam")

    try:
        iam.delete_policy(PolicyArn=name)
        print(f"Initiated deletion for IAM policy {name}")
    except ClientError as e:
        if "NoSuchEntity" in str(e):
            print(f"IAM policy {name} not found. Exiting...")
        else:
            raise e

    retries = 0
    while retries < MAX_RETRIES:
        try:
            iam.get_policy(PolicyArn=name)
            print(f"Waiting for IAM policy {name} to be deleted...")
            sleep_time = DELAY * (2**retries)
            time.sleep(sleep_time)
        except ClientError as e:
            if "NoSuchEntity" in str(e):
                print(f"IAM policy {name} deleted successfully")
                return
            else:
                raise e
        retries += 1


def aws_delete_iam_role(role_name: str, region: str):
    """Delete IAM role."""
    session = aws_session(region=region)
    iam = session.client("iam")

    try:
        attached_policies = iam.list_attached_role_policies(RoleName=role_name)
    except ClientError as e:
        if e.response["Error"]["Code"] == "NoSuchEntity":
            print(f"IAM role {role_name} not found. Exiting...")
            return
        else:
            raise e
    for policy in attached_policies["AttachedPolicies"]:
        iam.detach_role_policy(RoleName=role_name, PolicyArn=policy["PolicyArn"])
        print(f"Detached policy {policy['PolicyName']} from role {role_name}")

        if policy["PolicyArn"].startswith("arn:aws:iam::aws:policy"):
            continue

        policy_versions = iam.list_policy_versions(PolicyArn=policy["PolicyArn"])

        for version in policy_versions["Versions"]:
            if not version["IsDefaultVersion"]:
                iam.delete_policy_version(
                    PolicyArn=policy["PolicyArn"], VersionId=version["VersionId"]
                )
                print(
                    f"Deleted version {version['VersionId']} of policy {policy['PolicyName']}"
                )

        iam.delete_policy(PolicyArn=policy["PolicyArn"])
        print(f"Deleted policy {policy['PolicyName']}")

    iam.delete_role(RoleName=role_name)
    print(f"Deleted role {role_name}")


def aws_delete_node_groups(name: str, namespace: str, region: str):
    """Delete all node groups for the EKS cluster named `{name}-{namespace}`."""
    cluster_name = f"{name}-{namespace}"
    session = aws_session(region=region)
    eks = session.client("eks")
    try:
        response = eks.list_nodegroups(clusterName=cluster_name)
        node_groups = response.get("nodegroups", [])
    except ClientError as e:
        if "ResourceNotFoundException" in str(e):
            print(f"Cluster {cluster_name} not found. Exiting...")
            return
        else:
            raise e

    for node_group in node_groups:
        try:
            eks.delete_nodegroup(clusterName=cluster_name, nodegroupName=node_group)
            print(
                f"Initiated deletion for node group {node_group} belonging to cluster {cluster_name}"
            )
        except ClientError as e:
            if "ResourceNotFoundException" not in str(e):
                raise e

    retries = 0
    while retries < MAX_RETRIES:
        pending_deletion = []

        for node_group in node_groups:
            try:
                response = eks.describe_nodegroup(
                    clusterName=cluster_name, nodegroupName=node_group
                )
                if response["nodegroup"]["status"] == "DELETING":
                    pending_deletion.append(node_group)
            except ClientError as e:
                if "ResourceNotFoundException" in str(e):
                    pass
                else:
                    raise e

        if not pending_deletion:
            print("All node groups have been deleted successfully.")
            return

        if retries < MAX_RETRIES - 1:
            sleep_time = DELAY * (2**retries)
            print(
                f"{len(pending_deletion)} node groups still pending deletion. Retrying in {sleep_time} seconds..."
            )
            time.sleep(sleep_time)

        retries += 1
        pending_deletion.clear()

    print(f"Failed to confirm deletion of all node groups after {MAX_RETRIES} retries.")


def aws_delete_cluster(name: str, namespace: str, region: str):
    """Delete EKS cluster named `{name}-{namespace}`."""
    cluster_name = f"{name}-{namespace}"
    session = aws_session(region=region)
    eks = session.client("eks")

    try:
        eks.delete_cluster(name=cluster_name)
        print(f"Initiated deletion for cluster {cluster_name}")
    except ClientError as e:
        if "ResourceNotFoundException" in str(e):
            print(f"Cluster {cluster_name} not found. Exiting...")
            return
        else:
            raise e

    retries = 0
    while retries < MAX_RETRIES:
        try:
            response = eks.describe_cluster(name=cluster_name)
            if response["cluster"]["status"] == "DELETING":
                sleep_time = DELAY * (2**retries)
                print(
                    f"Cluster {cluster_name} still pending deletion. Retrying in {sleep_time} seconds..."
                )
                time.sleep(sleep_time)
            else:
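                # botocore's ClientError requires an error-response dict and an
                # operation name, so a plain message needs a different exception.
                raise RuntimeError(
                    f"Unexpected status for cluster {cluster_name}: {response['cluster']['status']}"
                )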
        except ClientError as e:
            if "ResourceNotFoundException" in str(e):
                print(f"Cluster {cluster_name} has been deleted successfully.")
                return
            else:
                raise e

        retries += 1

    print(
        f"Failed to confirm deletion of cluster {cluster_name} after {MAX_RETRIES} retries."
    )


def aws_cleanup(config: schema.Main):
    """Delete all Amazon Web Services resources created by Nebari"""

    name = config.project_name
    namespace = config.namespace
    region = config.amazon_web_services.region

    aws_delete_node_groups(name, namespace, region)
    aws_delete_cluster(name, namespace, region)

    aws_delete_load_balancer(name, namespace, region)

    aws_delete_efs(name, namespace, region)

    aws_delete_subnets(name, namespace, region)
    aws_delete_route_tables(name, namespace, region)
    aws_delete_internet_gateways(name, namespace, region)
    aws_delete_security_groups(name, namespace, region)
    aws_delete_vpc(name, namespace, region)

    aws_delete_ec2_volumes(name, namespace, region)

    dynamodb_table_name = f"{name}-{namespace}-terraform-state-lock"
    aws_delete_dynamodb_table(dynamodb_table_name, region)

    s3_bucket_name = f"{name}-{namespace}-terraform-state"
    # Pass the region by keyword; the second positional parameter is `endpoint`.
    aws_delete_s3_bucket(s3_bucket_name, region=region)

    iam_role_name = f"{name}-{namespace}-eks-cluster-role"
    iam_role_node_group_name = f"{name}-{namespace}-eks-node-group-role"
    iam_policy_name_regex = "^eks-worker-autoscaling-{name}-{namespace}(\\d+)$".format(
        name=name, namespace=namespace
    )
    iam_policy = aws_get_iam_policy(region, pattern=iam_policy_name_regex)
    if iam_policy:
        aws_delete_iam_role_policies(iam_role_node_group_name, region)
        aws_delete_iam_policy(iam_policy, region)
    aws_delete_iam_role(iam_role_name, region)
    aws_delete_iam_role(iam_role_node_group_name, region)


### PYDANTIC VALIDATORS ###


def validate_region(region: str) -> str:
    """Validate that the region is one of the enabled AWS regions"""
    available_regions = regions(region=region)
    if region not in available_regions:
        raise ValueError(
            f"Region {region} is not one of available regions {available_regions}"
        )
    return region


def validate_kubernetes_versions(region: str, kubernetes_version: str) -> str:
    """Validate that the Kubernetes version is available in the specified region"""
    available_versions = kubernetes_versions(region=region)
    if kubernetes_version not in available_versions:
        raise ValueError(
            f"Kubernetes version {kubernetes_version} is not one of available versions {available_versions}"
        )
    return kubernetes_version



---
File: nebari/src/_nebari/provider/cloud/azure_cloud.py
---

import functools
import logging
import os
import time
from typing import Dict

from azure.core.exceptions import ResourceNotFoundError
from azure.identity import DefaultAzureCredential
from azure.mgmt.containerservice import ContainerServiceClient
from azure.mgmt.resource import ResourceManagementClient

from _nebari.constants import AZURE_ENV_DOCS
from _nebari.provider.cloud.commons import filter_by_highest_supported_k8s_version
from _nebari.utils import (
    AZURE_TF_STATE_RESOURCE_GROUP_SUFFIX,
    check_environment_variables,
    construct_azure_resource_group_name,
)
from nebari import schema

logger = logging.getLogger("azure")
logger.setLevel(logging.ERROR)

DURATION = 10
RETRIES = 10


def check_credentials() -> DefaultAzureCredential:
    required_variables = {"ARM_CLIENT_ID", "ARM_SUBSCRIPTION_ID", "ARM_TENANT_ID"}
    check_environment_variables(required_variables, AZURE_ENV_DOCS)

    optional_variable = "ARM_CLIENT_SECRET"
    arm_client_secret = os.environ.get(optional_variable, None)
    if arm_client_secret:
        logger.info("Authenticating as a service principal.")
    else:
        logger.info(f"No {optional_variable} environment variable found.")
        logger.info("Allowing Azure SDK to authenticate using OIDC or other methods.")
    return DefaultAzureCredential()


@functools.lru_cache()
def initiate_container_service_client():
    subscription_id = os.environ.get("ARM_SUBSCRIPTION_ID", None)
    credentials = check_credentials()

    return ContainerServiceClient(
        credential=credentials, subscription_id=subscription_id
    )


@functools.lru_cache()
def initiate_resource_management_client():
    subscription_id = os.environ.get("ARM_SUBSCRIPTION_ID", None)
    credentials = check_credentials()

    return ResourceManagementClient(
        credential=credentials, subscription_id=subscription_id
    )


@functools.lru_cache()
def kubernetes_versions(region="Central US"):
    """Return the Kubernetes versions supported by the cloud provider, sorted from oldest to latest."""
    client = initiate_container_service_client()
    azure_location = region.replace(" ", "").lower()

    k8s_versions_list = client.container_services.list_orchestrators(
        azure_location, resource_type="managedClusters"
    ).as_dict()
    supported_kubernetes_versions = []

    for key in k8s_versions_list["orchestrators"]:
        if key["orchestrator_type"] == "Kubernetes":
            supported_kubernetes_versions.append(key["orchestrator_version"])

    supported_kubernetes_versions = sorted(supported_kubernetes_versions)
    return filter_by_highest_supported_k8s_version(supported_kubernetes_versions)


def delete_resource_group(resource_group_name: str):
    """Delete resource group and all resources within it."""

    client = initiate_resource_management_client()
    try:
        client.resource_groups.begin_delete(resource_group_name)
    except ResourceNotFoundError:
        logger.info(
            f"Resource group `{resource_group_name}` not found. Nothing to delete."
        )
        return

    retries = 0
    while retries < RETRIES:
        try:
            client.resource_groups.get(resource_group_name)
        except ResourceNotFoundError:
            logger.info(f"Resource group `{resource_group_name}` deleted successfully.")
            break
        logger.info(
            f"Waiting for resource group `{resource_group_name}` to be deleted..."
        )
        time.sleep(DURATION)
        retries += 1


def azure_cleanup(config: schema.Main):
    """Delete all resources on Azure created by Nebari"""

    # Deleting the AKS resource group automatically deletes the associated
    # node resource group.
    aks_resource_group = construct_azure_resource_group_name(
        project_name=config.project_name,
        namespace=config.namespace,
        base_resource_group_name=config.azure.resource_group_name,
    )

    state_resource_group = construct_azure_resource_group_name(
        project_name=config.project_name,
        namespace=config.namespace,
        base_resource_group_name=config.azure.resource_group_name,
        suffix=AZURE_TF_STATE_RESOURCE_GROUP_SUFFIX,
    )

    delete_resource_group(aks_resource_group)
    delete_resource_group(state_resource_group)


### PYDANTIC VALIDATORS ###


def validate_tags(tags: Dict[str, str]) -> Dict[str, str]:
    max_name_length = 512
    max_value_length = 256
    invalid_chars = "<>%&\\?/"

    for tag_name, tag_value in tags.items():
        if any(char in tag_name for char in invalid_chars):
            raise ValueError(
                f"Tag name '{tag_name}' contains invalid characters. Invalid characters are: `{invalid_chars}`"
            )

        if len(tag_name) > max_name_length:
            raise ValueError(
                f"Tag name '{tag_name}' exceeds maximum length of {max_name_length} characters."
            )

        if len(tag_value) > max_value_length:
            raise ValueError(
                f"Tag value '{tag_value}' for tag '{tag_name}' exceeds maximum length of {max_value_length} characters."
            )

    return tags



---
File: nebari/src/_nebari/provider/cloud/commons.py
---

import re

from _nebari.constants import HIGHEST_SUPPORTED_K8S_VERSION


def filter_by_highest_supported_k8s_version(k8s_versions_list):
    filtered_k8s_versions_list = []
    for k8s_version in k8s_versions_list:
        version = tuple(filter(None, re.search(r"(\d+)\.(\d+)", k8s_version).groups()))
        if version <= HIGHEST_SUPPORTED_K8S_VERSION:
            filtered_k8s_versions_list.append(k8s_version)
    return filtered_k8s_versions_list
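

# Illustrative sketch, not part of the original module: comparing version
# components as integers avoids lexicographic surprises such as "1.9" sorting
# after "1.10". The helper name `_version_key` is hypothetical, and it is not
# a drop-in replacement for the comparison above, whose behavior depends on
# how HIGHEST_SUPPORTED_K8S_VERSION is defined.
import re
from typing import Tuple


def _version_key(k8s_version: str) -> Tuple[int, int]:
    """Return (major, minor) as integers for a version such as '1.29.4'."""
    match = re.search(r"(\d+)\.(\d+)", k8s_version)
    if match is None:
        raise ValueError(f"Cannot parse Kubernetes version: {k8s_version!r}")
    return int(match.group(1)), int(match.group(2))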



---
File: nebari/src/_nebari/provider/cloud/google_cloud.py
---

import functools
import json
import os
from typing import List, Set

import google.api_core.exceptions
from google.auth import load_credentials_from_dict, load_credentials_from_file
from google.cloud import compute_v1, container_v1, iam_admin_v1, storage

from _nebari.constants import GCP_ENV_DOCS
from _nebari.provider.cloud.commons import filter_by_highest_supported_k8s_version
from _nebari.utils import check_environment_variables
from nebari import schema


def check_credentials() -> None:
    required_variables = {"GOOGLE_CREDENTIALS", "PROJECT_ID"}
    check_environment_variables(required_variables, GCP_ENV_DOCS)


@functools.lru_cache()
def load_credentials():
    check_credentials()
    credentials = os.environ["GOOGLE_CREDENTIALS"]
    project_id = os.environ["PROJECT_ID"]

    # Scopes need to be explicitly defined when using workload identity
    # federation.
    scopes = ["https://www.googleapis.com/auth/cloud-platform"]

    # Google credentials are stored as strings in GHA secrets so we need
    # to determine if the credentials are stored as a file or not before
    # reading them
    if credentials.endswith(".json"):
        loaded_credentials, _ = load_credentials_from_file(credentials, scopes=scopes)
    else:
        loaded_credentials, _ = load_credentials_from_dict(
            json.loads(credentials), scopes=scopes
        )

    return loaded_credentials, project_id


@functools.lru_cache()
def regions() -> Set[str]:
    """Return a set of available regions."""
    credentials, project_id = load_credentials()
    client = compute_v1.RegionsClient(credentials=credentials)
    response = client.list(project=project_id)

    return {region.name for region in response}


@functools.lru_cache()
def instances(region: str) -> set[str]:
    """Return a set of available compute instances in a region."""
    credentials, project_id = load_credentials()
    zones_client = compute_v1.services.region_zones.RegionZonesClient(
        credentials=credentials
    )
    instances_client = compute_v1.MachineTypesClient(credentials=credentials)
    zone_list = zones_client.list(project=project_id, region=region)
    zones = [zone for zone in zone_list]
    instance_set: set[str] = set()
    for zone in zones:
        instance_list = instances_client.list(project=project_id, zone=zone.name)
        for instance in instance_list:
            instance_set.add(instance.name)
    return instance_set


@functools.lru_cache()
def kubernetes_versions(region: str) -> List[str]:
    """Return the Kubernetes versions supported by the cloud provider, sorted from oldest to latest."""
    credentials, project_id = load_credentials()
    client = container_v1.ClusterManagerClient(credentials=credentials)
    response = client.get_server_config(
        name=f"projects/{project_id}/locations/{region}", timeout=300
    )
    supported_kubernetes_versions = response.valid_master_versions

    return filter_by_highest_supported_k8s_version(supported_kubernetes_versions)


def get_patch_version(full_version: str) -> str:
    return full_version.split("-")[0]


def get_minor_version(full_version: str) -> str:
    patch_version = get_patch_version(full_version)
    parts = patch_version.split(".")
    return f"{parts[0]}.{parts[1]}"


def cluster_exists(cluster_name: str, region: str) -> bool:
    """Check if a GKE cluster exists."""
    credentials, project_id = load_credentials()
    client = container_v1.ClusterManagerClient(credentials=credentials)

    try:
        client.get_cluster(
            name=f"projects/{project_id}/locations/{region}/clusters/{cluster_name}"
        )
    except google.api_core.exceptions.NotFound:
        return False
    return True


def bucket_exists(bucket_name: str) -> bool:
    """Check if a storage bucket exists."""
    credentials, _ = load_credentials()
    client = storage.Client(credentials=credentials)

    try:
        client.get_bucket(bucket_name)
    except google.api_core.exceptions.NotFound:
        return False
    return True


def service_account_exists(service_account_name: str) -> bool:
    """Check if a service account exists."""
    credentials, project_id = load_credentials()
    client = iam_admin_v1.IAMClient(credentials=credentials)

    service_account_path = client.service_account_path(project_id, service_account_name)
    try:
        client.get_service_account(name=service_account_path)
    except google.api_core.exceptions.NotFound:
        return False
    return True


def delete_cluster(cluster_name: str, region: str):
    """Delete a GKE cluster if it exists."""
    credentials, project_id = load_credentials()
    if not cluster_exists(cluster_name, region):
        print(
            f"Cluster {cluster_name} does not exist in project {project_id}, region {region}. Exiting gracefully."
        )
        return

    client = container_v1.ClusterManagerClient(credentials=credentials)
    try:
        client.delete_cluster(
            name=f"projects/{project_id}/locations/{region}/clusters/{cluster_name}"
        )
        print(f"Successfully deleted cluster {cluster_name}.")
    except google.api_core.exceptions.GoogleAPIError as e:
        print(f"Failed to delete cluster {cluster_name}. Error: {e}")


def delete_storage_bucket(bucket_name: str):
    """Delete a storage bucket if it exists."""
    credentials, project_id = load_credentials()

    if not bucket_exists(bucket_name):
        print(
            f"Bucket {bucket_name} does not exist in project {project_id}. Exiting gracefully."
        )
        return

    client = storage.Client(credentials=credentials)
    bucket = client.get_bucket(bucket_name)
    try:
        bucket.delete(force=True)
        print(f"Successfully deleted bucket {bucket_name}.")
    except google.api_core.exceptions.GoogleAPIError as e:
        print(f"Failed to delete bucket {bucket_name}. Error: {e}")


def delete_service_account(service_account_name: str):
    """Delete a service account if it exists."""
    credentials, project_id = load_credentials()

    if not service_account_exists(service_account_name):
        print(
            f"Service account {service_account_name} does not exist in project {project_id}. Exiting gracefully."
        )
        return

    client = iam_admin_v1.IAMClient(credentials=credentials)
    service_account_path = client.service_account_path(project_id, service_account_name)
    try:
        client.delete_service_account(name=service_account_path)
        print(f"Successfully deleted service account {service_account_name}.")
    except google.api_core.exceptions.GoogleAPIError as e:
        print(f"Failed to delete service account {service_account_name}. Error: {e}")


def gcp_cleanup(config: schema.Main):
    """Delete all GCP resources."""
    check_credentials()
    project_name = config.project_name
    namespace = config.namespace
    project_id = config.google_cloud_platform.project
    region = config.google_cloud_platform.region
    cluster_name = f"{project_name}-{namespace}"
    bucket_name = f"{project_name}-{namespace}-terraform-state"
    service_account_name = (
        f"{project_name}-{namespace}@{project_id}.iam.gserviceaccount.com"
    )

    delete_cluster(cluster_name, region)
    delete_storage_bucket(bucket_name)
    delete_service_account(service_account_name)



---
File: nebari/src/_nebari/provider/dns/__init__.py
---




---
File: nebari/src/_nebari/provider/dns/cloudflare.py
---

import logging
import os

import CloudFlare

logger = logging.getLogger(__name__)


def update_record(zone_name, record_name, record_type, record_address):
    for variable in {"CLOUDFLARE_TOKEN"}:
        if variable not in os.environ:
            raise ValueError(
                f"Cloudflare required environment variable={variable} not defined"
            )

    cf = CloudFlare.CloudFlare(token=os.environ["CLOUDFLARE_TOKEN"])

    record = {
        "name": record_name,
        "type": record_type,
        "content": record_address,
        "ttl": 1,
        "proxied": False,
    }

    zone_id = None
    for zone in cf.zones.get():
        if zone["name"] == zone_name:
            zone_id = zone["id"]
            break
    else:
        raise ValueError(f"Cloudflare zone {zone_name} not found")

    existing_record = cf.zones.dns_records.get(
        zone_id, params={"name": f"{record_name}.{zone_name}", "type": record_type}
    )
    if existing_record:
        logger.info(
            f"record name={record_name} type={record_type} address={record_address} already exists, updating"
        )
        cf.zones.dns_records.put(zone_id, existing_record[0]["id"], data=record)
    else:
        logger.info(
            f"record name={record_name} type={record_type} address={record_address} does not exist, creating"
        )
        cf.zones.dns_records.post(zone_id, data=record)



---
File: nebari/src/_nebari/provider/oauth/__init__.py
---




---
File: nebari/src/_nebari/provider/oauth/auth0.py
---

import logging
import os

from auth0.authentication import GetToken
from auth0.management import Auth0

logger = logging.getLogger(__name__)


def create_client(jupyterhub_endpoint: str, project_name: str, reuse_existing=True):
    for variable in {"AUTH0_DOMAIN", "AUTH0_CLIENT_ID", "AUTH0_CLIENT_SECRET"}:
        if variable not in os.environ:
            raise ValueError(f"Required environment variable={variable} not defined")

    get_token = GetToken(
        os.environ["AUTH0_DOMAIN"],
        os.environ["AUTH0_CLIENT_ID"],
        client_secret=os.environ["AUTH0_CLIENT_SECRET"],
    )
    token = get_token.client_credentials(
        f'https://{os.environ["AUTH0_DOMAIN"]}/api/v2/'
    )
    mgmt_api_token = token["access_token"]

    auth0 = Auth0(os.environ["AUTH0_DOMAIN"], mgmt_api_token)

    oauth_callback_url = (
        f"https://{jupyterhub_endpoint}/auth/realms/nebari/broker/auth0/endpoint"
    )

    for client in auth0.clients.all(
        fields=["name", "client_id", "client_secret", "callbacks"], include_fields=True
    ):
        if client["name"] == project_name and reuse_existing:
            if oauth_callback_url not in client["callbacks"]:
                logger.info(
                    f"updating existing application={project_name} client_id={client['client_id']} adding callback url={oauth_callback_url}"
                )
                auth0.clients.update(
                    client["client_id"],
                    {"callbacks": client["callbacks"] + [oauth_callback_url]},
                )

            return {
                "auth0_subdomain": ".".join(os.environ["AUTH0_DOMAIN"].split(".")[:-2]),
                "client_id": client["client_id"],
                "client_secret": client["client_secret"],
            }

    client = auth0.clients.create(
        {
            "name": project_name,
            "description": f"Nebari - {project_name} - {jupyterhub_endpoint}",
            "callbacks": [oauth_callback_url],
            "app_type": "regular_web",
        }
    )

    return {
        "auth0_subdomain": ".".join(os.environ["AUTH0_DOMAIN"].split(".")[:-2]),
        "client_id": client["client_id"],
        "client_secret": client["client_secret"],
    }



---
File: nebari/src/_nebari/provider/__init__.py
---




---
File: nebari/src/_nebari/provider/git.py
---

import configparser
import os
import subprocess
from pathlib import Path
from typing import Optional

from _nebari.utils import change_directory


def is_git_repo(path: Optional[Path] = None):
    path = path or Path.cwd()
    return ".git" in os.listdir(path)


def initialize_git(path: Optional[Path] = None):
    path = path or Path.cwd()
    with change_directory(path):
        subprocess.check_output(["git", "init"])
        # Ensure initial branch is called main
        subprocess.check_output(["git", "checkout", "-b", "main"])


def add_git_remote(
    remote_path: str, path: Optional[Path] = None, remote_name: str = "origin"
):
    path = path or Path.cwd()

    c = configparser.ConfigParser()
    with open(path / ".git/config") as f:
        c.read_file(f)
    if f'remote "{remote_name}"' in c:
        if c[f'remote "{remote_name}"']["url"] == remote_path:
            return  # no action needed
        else:
            raise ValueError(
                f"git add remote would change existing remote name={remote_name}"
            )

    with change_directory(path):
        subprocess.check_output(["git", "remote", "add", remote_name, remote_path])



---
File: nebari/src/_nebari/provider/helm.py
---

import logging
import os
import subprocess
import tempfile
from pathlib import Path

from _nebari import constants
from _nebari.utils import run_subprocess_cmd

logger = logging.getLogger(__name__)


class HelmException(Exception):
    pass


def download_helm_binary(version=constants.HELM_VERSION) -> Path:
    filename_directory = Path(tempfile.gettempdir()) / "helm" / version
    filename_path = filename_directory / "helm"

    if not filename_directory.is_dir():
        filename_directory.mkdir(parents=True)

    if not filename_path.is_file():
        logger.info(
            "downloading and extracting Helm binary version %s to path=%s",
            version,
            filename_path,
        )
        old_path = os.environ.get("PATH")
        new_path = f"{filename_directory}:{old_path}"
        install_script = subprocess.run(
            [
                "curl",
                "-s",
                "https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3",
            ],
            stdout=subprocess.PIPE,
            check=True,
        )
        # Install the requested version so it matches the cache directory name
        subprocess.run(
            [
                "bash",
                "-s",
                "--",
                "-v",
                version,
                "--no-sudo",
            ],
            input=install_script.stdout,
            check=True,
            env={"HELM_INSTALL_DIR": str(filename_directory), "PATH": new_path},
        )

    filename_path.chmod(0o555)
    return filename_path


def run_helm_subprocess(processargs, **kwargs) -> None:
    helm_path = download_helm_binary()
    logger.info("helm at %s", helm_path)
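    # run_subprocess_cmd returns an (exit_code, output) tuple, which is always
    # truthy, so check the exit code explicitly.
    exit_code, _ = run_subprocess_cmd([helm_path] + processargs, **kwargs)
    if exit_code != 0:
        raise HelmException("Helm returned an error")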


def version() -> str:
    helm_path = download_helm_binary()
    logger.info("checking helm=%s version", helm_path)

    version_output = subprocess.check_output([helm_path, "version"]).decode("utf-8")
    return version_output



---
File: nebari/src/_nebari/provider/kubernetes.py
---

# Copyright 2019 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


import os
import pathlib
import re

import yaml
from kubernetes import client
from kubernetes.dynamic.client import DynamicClient
from kubernetes.dynamic.resource import Resource

UPPER_FOLLOWED_BY_LOWER_RE = re.compile("(.)([A-Z][a-z]+)")
LOWER_OR_NUM_FOLLOWED_BY_UPPER_RE = re.compile("([a-z0-9])([A-Z])")


def create_from_directory(
    k8s_client, yaml_dir=None, verbose=False, namespace="default", apply=False, **kwargs
):
    """
    Perform an action from files from a directory. Pass True for verbose to
    print confirmation information.

    Input:
    k8s_client: an ApiClient object, initialized with the client args.
    yaml_dir: string. Contains the path to directory.
    verbose: If True, print confirmation from the create action.
        Default is False.
    namespace: string. Contains the namespace to create all
        resources inside. The namespace must preexist otherwise
        the resource creation will fail. If the API object in
        the yaml file already contains a namespace definition
        this parameter has no effect.

    Available parameters for creating <kind>:
    :param async_req bool
    :param bool include_uninitialized: If true, partially initialized
        resources are included in the response.
    :param str pretty: If 'true', then the output is pretty printed.
    :param str dry_run: When present, indicates that modifications
        should not be persisted. An invalid or unrecognized dryRun
        directive will result in an error response and no further
        processing of the request.
        Valid values are: - All: all dry run stages will be processed

    Returns:
        The list containing the created kubernetes API objects.

    Raises:
        OperationFailureError which holds list of `client.rest.ApiException`
        instances for each object that failed to create.
    """

    if not yaml_dir:
        raise ValueError("`yaml_dir` argument must be provided")
    elif not os.path.isdir(yaml_dir):  # noqa
        raise ValueError("`yaml_dir` argument must be a path to directory")

    files = [
        os.path.join(yaml_dir, i)  # noqa
        for i in os.listdir(yaml_dir)
        if os.path.isfile(os.path.join(yaml_dir, i))  # noqa
    ]
    if not files:
        raise ValueError("`yaml_dir` contains no files")

    failures = []
    k8s_objects_all = []

    for file in files:
        try:
            k8s_objects = create_from_yaml(
                k8s_client,
                file,
                verbose=verbose,
                namespace=namespace,
                apply=apply,
                **kwargs,
            )
            k8s_objects_all.append(k8s_objects)
        except OperationFailureError as failure:
            failures.extend(failure.api_exceptions)
    if failures:
        raise OperationFailureError(failures)
    return k8s_objects_all


def create_from_yaml(
    k8s_client,
    yaml_file=None,
    yaml_objects=None,
    verbose=False,
    namespace="default",
    apply=False,
    **kwargs,
):
    """
    Perform an action from a yaml file. Pass True for verbose to
    print confirmation information.
    Input:
    yaml_file: string. Contains the path to yaml file.
    k8s_client: an ApiClient object, initialized with the client args.
    yaml_objects: List[dict]. Optional list of YAML objects; used instead
        of reading the `yaml_file`. Default is None.
    verbose: If True, print confirmation from the create action.
        Default is False.
    namespace: string. Contains the namespace to create all
        resources inside. The namespace must preexist otherwise
        the resource creation will fail. If the API object in
        the yaml file already contains a namespace definition
        this parameter has no effect.

    Available parameters for creating <kind>:
    :param async_req bool
    :param bool include_uninitialized: If true, partially initialized
        resources are included in the response.
    :param str pretty: If 'true', then the output is pretty printed.
    :param str dry_run: When present, indicates that modifications
        should not be persisted. An invalid or unrecognized dryRun
        directive will result in an error response and no further
        processing of the request.
        Valid values are: - All: all dry run stages will be processed

    Returns:
        The created kubernetes API objects.

    Raises:
        OperationFailureError which holds list of `client.rest.ApiException`
        instances for each object that failed to create.
    """

    def create_with(objects, apply=apply):
        failures = []
        k8s_objects = []
        for yml_document in objects:
            if yml_document is None:
                continue
            try:
                created = create_from_dict(
                    k8s_client,
                    yml_document,
                    verbose,
                    namespace=namespace,
                    apply=apply,
                    **kwargs,
                )
                k8s_objects.append(created)
            except OperationFailureError as failure:
                failures.extend(failure.api_exceptions)
        if failures:
            raise OperationFailureError(failures)
        return k8s_objects

    class Loader(yaml.loader.SafeLoader):
        yaml_implicit_resolvers = yaml.loader.SafeLoader.yaml_implicit_resolvers.copy()
        if "=" in yaml_implicit_resolvers:
            yaml_implicit_resolvers.pop("=")

    if yaml_objects:
        yml_document_all = yaml_objects
        return create_with(yml_document_all)
    elif yaml_file:
        with open(os.path.abspath(yaml_file)) as f:  # noqa
            yml_document_all = yaml.load_all(f, Loader=Loader)
            return create_with(yml_document_all, apply)
    else:
        raise ValueError(
            "One of `yaml_file` or `yaml_objects` arguments must be provided"
        )


def create_from_dict(
    k8s_client, data, verbose=False, namespace="default", apply=False, **kwargs
):
    """
    Perform an action from a dictionary containing valid kubernetes
    API object (i.e. List, Service, etc).

    Input:
    k8s_client: an ApiClient object, initialized with the client args.
    data: a dictionary holding valid kubernetes objects
    verbose: If True, print confirmation from the create action.
        Default is False.
    namespace: string. Contains the namespace to create all
        resources inside. The namespace must preexist otherwise
        the resource creation will fail. If the API object in
        the yaml file already contains a namespace definition
        this parameter has no effect.

    Returns:
        The created kubernetes API objects.

    Raises:
        OperationFailureError which holds list of `client.rest.ApiException`
        instances for each object that failed to create.
    """
    # If it is a list type, will need to iterate its items
    api_exceptions = []
    k8s_objects = []

    if "List" in data["kind"]:
        # Could be "List" or "Pod/Service/...List"
        # This is a list type. iterate within its items
        kind = data["kind"].replace("List", "")
        for yml_object in data["items"]:
            # Mitigate cases when server returns a xxxList object
            # See kubernetes-client/python#586
            if kind != "":
                yml_object["apiVersion"] = data["apiVersion"]
                yml_object["kind"] = kind
            try:
                created = create_from_yaml_single_item(
                    k8s_client,
                    yml_object,
                    verbose,
                    namespace=namespace,
                    apply=apply,
                    **kwargs,
                )
                k8s_objects.append(created)
            except client.rest.ApiException as api_exception:
                api_exceptions.append(api_exception)
    else:
        # This is a single object. Call the single item method
        try:
            created = create_from_yaml_single_item(
                k8s_client, data, verbose, namespace=namespace, apply=apply, **kwargs
            )
            k8s_objects.append(created)
        except client.rest.ApiException as api_exception:
            api_exceptions.append(api_exception)

    # In case we have exceptions waiting for us, raise them
    if api_exceptions:
        raise OperationFailureError(api_exceptions)

    return k8s_objects


def create_from_yaml_single_item(
    k8s_client, yml_object, verbose=False, apply=False, **kwargs
):
    kind = yml_object["kind"]
    if apply:
        apply_client = DynamicClient(k8s_client).resources.get(
            api_version=yml_object["apiVersion"], kind=kind
        )
        resp = apply_client.server_side_apply(
            body=yml_object, field_manager="python-client", **kwargs
        )
        return resp
    group, _, version = yml_object["apiVersion"].partition("/")
    if version == "":
        version = group
        group = "core"
    # Take care for the case e.g. api_type is "apiextensions.k8s.io"
    # Only replace the last instance
    group = "".join(group.rsplit(".k8s.io", 1))
    # convert group name from DNS subdomain format to
    # python class name convention
    group = "".join(word.capitalize() for word in group.split("."))
    fcn_to_call = "{0}{1}Api".format(group, version.capitalize())
    k8s_api = getattr(client, fcn_to_call)(k8s_client)
    # Replace CamelCased action_type into snake_case
    kind = UPPER_FOLLOWED_BY_LOWER_RE.sub(r"\1_\2", kind)
    kind = LOWER_OR_NUM_FOLLOWED_BY_UPPER_RE.sub(r"\1_\2", kind).lower()
    # Expect the user to create namespaced objects more often
    if hasattr(k8s_api, "create_namespaced_{0}".format(kind)):
        # Decide which namespace we are going to put the object in,
        # if any. Without a namespace in the metadata, fall back to the
        # namespace passed in through kwargs.
        if "namespace" in yml_object["metadata"]:
            namespace = yml_object["metadata"]["namespace"]
            kwargs["namespace"] = namespace
        resp = getattr(k8s_api, "create_namespaced_{0}".format(kind))(
            body=yml_object, **kwargs
        )
    else:
        kwargs.pop("namespace", None)
        resp = getattr(k8s_api, "create_{0}".format(kind))(body=yml_object, **kwargs)
    if verbose:
        msg = "{0} created.".format(kind)
        if hasattr(resp, "status"):
            msg += " status='{0}'".format(str(resp.status))
        print(msg)
    return resp


def delete_from_yaml(
    k8s_client: client.ApiClient, yaml_file: pathlib.Path = None, verbose: bool = False
) -> None:
    """
    Delete all objects in a yaml file. Pass True for verbose to
    print confirmation information.
    Input:
    yaml_file: string. Contains the path to yaml file.
    k8s_client: an ApiClient object, initialized with the client args.

    Returns:
        None

    Raises:
        OperationFailureError which holds list of `client.rest.ApiException`
        instances for each object that failed to delete.
    """
    dynamic_client = DynamicClient(k8s_client)
    k8s_objects = parse_yaml_file(yaml_file)
    exceptions = []
    for object in k8s_objects:
        try:
            if verbose:
                print(f"Deleting {object.kind} {object.name}")
            if object.namespaced:
                dynamic_client.resources.get(
                    api_version=object.api_version, kind=object.kind
                ).delete(
                    name=object.name,
                    namespace=object.extra_args.get("namespace", "default"),
                )
            else:
                dynamic_client.resources.get(
                    api_version=object.api_version, kind=object.kind
                ).delete(name=object.name)
        except client.rest.ApiException as api_exception:
            if api_exception.reason == "Not Found":
                continue
            if verbose:
                print(f"Failed to delete {object.kind} {object.name}")
            exceptions.append(api_exception)
        except Exception as e:
            print(f"Warning, failed to delete {object.kind} {object.name}: {e}")
    if exceptions:
        raise OperationFailureError(exceptions)


def parse_yaml_file(yaml_file: pathlib.Path) -> list:
    """
    Parse a yaml file and return a list of `Resource` objects.
    Input:
    yaml_file: pathlib.Path. Contains the path to yaml file.

    Returns:
        A list of kubernetes objects in the yaml file.
    """

    class Loader(yaml.loader.SafeLoader):
        yaml_implicit_resolvers = yaml.loader.SafeLoader.yaml_implicit_resolvers.copy()
        if "=" in yaml_implicit_resolvers:
            yaml_implicit_resolvers.pop("=")

    with open(yaml_file.absolute()) as f:  # noqa
        yml_document_all = yaml.load_all(f, Loader=Loader)

        objects = []
        for doc in yml_document_all:
            object = Resource(
                api_version=doc["apiVersion"],
                prefix=doc["apiVersion"].split("/")[0],
                kind=doc["kind"],
                namespaced="namespace" in doc["metadata"],
                name=doc["metadata"]["name"],
                body=doc,
                namespace=doc["metadata"].get("namespace", None),
                annotations=doc["metadata"].get("annotations", None),
            )
            objects.append(object)
        return objects


class OperationFailureError(Exception):
    """
    Raised when one or more API calls fail while processing a yaml file.
    Holds the underlying `client.rest.ApiException` instances.
    """

    def __init__(self, api_exceptions):
        self.api_exceptions = api_exceptions

    def __str__(self):
        msg = ""
        for api_exception in self.api_exceptions:
            msg += "Error from server ({0}): {1}".format(
                api_exception.reason, api_exception.body
            )
        return msg



---
File: nebari/src/_nebari/provider/kustomize.py
---

import logging
import subprocess
import tempfile
from pathlib import Path

from _nebari import constants
from _nebari.utils import run_subprocess_cmd

logger = logging.getLogger(__name__)


class KustomizeException(Exception):
    pass


def download_kustomize_binary(version=constants.KUSTOMIZE_VERSION) -> Path:
    filename_directory = Path(tempfile.gettempdir()) / "kustomize" / version
    filename_path = filename_directory / "kustomize"

    if not filename_directory.is_dir():
        filename_directory.mkdir(parents=True)

    if not filename_path.is_file():
        logger.info(
            "downloading and extracting kustomize binary version %s to path=%s",
            version,
            filename_path,
        )
        install_script = subprocess.run(
            [
                "curl",
                "-s",
                "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh",
            ],
            stdout=subprocess.PIPE,
            check=True,
        )
        # Install the requested version so it matches the cache directory name
        subprocess.run(
            ["bash", "-s", version, str(filename_directory)],
            input=install_script.stdout,
            check=True,
        )

    filename_path.chmod(0o555)
    return filename_path


def run_kustomize_subprocess(processargs, **kwargs) -> None:
    kustomize_path = download_kustomize_binary()
    try:
        run_subprocess_cmd(
            [kustomize_path] + processargs, capture_output=True, **kwargs
        )
    except subprocess.CalledProcessError as e:
        raise KustomizeException("Kustomize returned an error: %s" % e.stderr)


def version() -> str:
    kustomize_path = download_kustomize_binary()
    logger.info("checking kustomize=%s version", kustomize_path)

    version_output = subprocess.check_output([kustomize_path, "version"]).decode(
        "utf-8"
    )
    return version_output



---
File: nebari/src/_nebari/provider/opentofu.py
---

import contextlib
import io
import json
import logging
import platform
import re
import subprocess
import sys
import tempfile
import urllib.request
import zipfile
from pathlib import Path
from typing import Any, Dict, List

from _nebari import constants
from _nebari.utils import deep_merge, run_subprocess_cmd, timer

logger = logging.getLogger(__name__)


class OpenTofuException(Exception):
    pass


def deploy(
    directory,
    tofu_init: bool = True,
    tofu_import: bool = False,
    tofu_apply: bool = True,
    tofu_destroy: bool = False,
    input_vars: Dict[str, Any] = {},
    state_imports: List[Any] = [],
):
    """Run OpenTofu operations against the infrastructure configuration in a directory.

    Parameters:
      directory: directory in which to run the tofu operations

      tofu_init: whether to run `tofu init`; default True

      tofu_import: whether to run `tofu import` for each entry in
        `state_imports`; default False

      tofu_apply: whether to run `tofu apply`; default True

      tofu_destroy: whether to run `tofu destroy`; default False

      input_vars: values for the "variable" blocks declared in the
        OpenTofu module

      state_imports: (addr, id) pairs to iterate through and attempt
        to `tofu import`

    Returns:
      The parsed JSON from `tofu output`.
    """
    with tempfile.NamedTemporaryFile(
        mode="w", encoding="utf-8", suffix=".tfvars.json"
    ) as f:
        json.dump(input_vars, f.file)
        f.file.flush()

        if tofu_init:
            init(directory)

        if tofu_import:
            for addr, id in state_imports:
                tfimport(
                    addr, id, directory=directory, var_files=[f.name], exist_ok=True
                )

        if tofu_apply:
            apply(directory, var_files=[f.name])

        if tofu_destroy:
            destroy(directory, var_files=[f.name])

        return output(directory)


def download_opentofu_binary(version=constants.OPENTOFU_VERSION):
    os_mapping = {
        "linux": "linux",
        "win32": "windows",
        "darwin": "darwin",
        "freebsd": "freebsd",
        "openbsd": "openbsd",
        "solaris": "solaris",
    }

    architecture_mapping = {
        "x86_64": "amd64",
        "i386": "386",
        "armv7l": "arm",
        "aarch64": "arm64",
        "arm64": "arm64",
    }

    download_url = f"https://github.com/opentofu/opentofu/releases/download/v{version}/tofu_{version}_{os_mapping[sys.platform]}_{architecture_mapping[platform.machine()]}.zip"

    filename_directory = Path(tempfile.gettempdir()) / "opentofu" / version
    filename_path = filename_directory / "tofu"

    if not filename_path.is_file():
        logger.info(
            f"downloading and extracting opentofu binary from url={download_url} to path={filename_path}"
        )
        with urllib.request.urlopen(download_url) as f:
            bytes_io = io.BytesIO(f.read())
        download_file = zipfile.ZipFile(bytes_io)
        download_file.extract("tofu", filename_directory)

    filename_path.chmod(0o555)
    return filename_path
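

# Illustrative sketch only, not part of the original module: shows how the
# release URL in `download_opentofu_binary` is assembled for one platform.
# `_example_opentofu_url` is a hypothetical helper, its mappings are a
# subset of the ones above, and "1.8.3" below is a placeholder version
# (an assumption, not necessarily `constants.OPENTOFU_VERSION`).
def _example_opentofu_url(version, sys_platform, machine):
    os_name = {"linux": "linux", "win32": "windows", "darwin": "darwin"}[sys_platform]
    arch = {"x86_64": "amd64", "aarch64": "arm64", "arm64": "arm64"}[machine]
    return (
        "https://github.com/opentofu/opentofu/releases/download/"
        f"v{version}/tofu_{version}_{os_name}_{arch}.zip"
    )


# Usage: _example_opentofu_url("1.8.3", "linux", "x86_64") ends with
# "/v1.8.3/tofu_1.8.3_linux_amd64.zip".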


def run_tofu_subprocess(processargs, **kwargs):
    tofu_path = download_opentofu_binary()
    logger.info(f"tofu at {tofu_path}")
    exit_code, output = run_subprocess_cmd([tofu_path] + processargs, **kwargs)
    if exit_code != 0:
        raise OpenTofuException("OpenTofu returned an error")
    return output


def version():
    tofu_path = download_opentofu_binary()
    logger.info(f"checking opentofu={tofu_path} version")

    version_output = subprocess.check_output([tofu_path, "--version"]).decode("utf-8")
    return re.search(r"(\d+)\.(\d+)\.(\d+)", version_output).group(0)


def init(directory=None, upgrade=True):
    logger.info(f"tofu init directory={directory}")
    with timer(logger, "tofu init"):
        command = ["init"]
        if upgrade:
            command.append("-upgrade")
        run_tofu_subprocess(command, cwd=directory, prefix="tofu")


def apply(directory=None, targets=None, var_files=None):
    targets = targets or []
    var_files = var_files or []

    logger.info(f"tofu apply directory={directory} targets={targets}")
    command = (
        ["apply", "-auto-approve"]
        + ["-target=" + _ for _ in targets]
        + ["-var-file=" + _ for _ in var_files]
    )
    with timer(logger, "tofu apply"):
        run_tofu_subprocess(command, cwd=directory, prefix="tofu")


def output(directory=None):
    tofu_path = download_opentofu_binary()

    logger.info(f"tofu={tofu_path} output directory={directory}")
    with timer(logger, "tofu output"):
        return json.loads(
            subprocess.check_output(
                [tofu_path, "output", "-json"], cwd=directory
            ).decode("utf8")[:-1]
        )


def tfimport(addr, id, directory=None, var_files=None, exist_ok=False):
    var_files = var_files or []

    logger.info(f"tofu import directory={directory} addr={addr} id={id}")
    command = ["import"] + ["-var-file=" + _ for _ in var_files] + [addr, id]
    logger.debug(str(command))
    with timer(logger, "tofu import"):
        try:
            run_tofu_subprocess(
                command,
                cwd=directory,
                prefix="tofu",
                strip_errors=True,
                timeout=30,
            )
        except OpenTofuException:
            if not exist_ok:
                raise


def show(directory=None, tofu_init: bool = True) -> dict:

    if tofu_init:
        init(directory)

    logger.info(f"tofu show directory={directory}")
    command = ["show", "-json"]
    with timer(logger, "tofu show"):
        # Any OpenTofuException raised by the subprocess propagates unchanged.
        return json.loads(
            run_tofu_subprocess(
                command,
                cwd=directory,
                prefix="tofu",
                strip_errors=True,
                capture_output=True,
            )
        )


def refresh(directory=None, var_files=None):
    var_files = var_files or []

    logger.info(f"tofu refresh directory={directory}")
    command = ["refresh"] + ["-var-file=" + _ for _ in var_files]

    with timer(logger, "tofu refresh"):
        run_tofu_subprocess(command, cwd=directory, prefix="tofu")


def destroy(directory=None, targets=None, var_files=None):
    targets = targets or []
    var_files = var_files or []

    logger.info(f"tofu destroy directory={directory} targets={targets}")
    command = (
        [
            "destroy",
            "-auto-approve",
        ]
        + ["-target=" + _ for _ in targets]
        + ["-var-file=" + _ for _ in var_files]
    )

    with timer(logger, "tofu destroy"):
        run_tofu_subprocess(command, cwd=directory, prefix="tofu")


def rm_local_state(directory=None):
    logger.info(f"rm local state file terraform.tfstate directory={directory}")
    tfstate_path = Path("terraform.tfstate")
    if directory:
        tfstate_path = directory / tfstate_path

    if tfstate_path.is_file():
        tfstate_path.unlink()


# ========== Terraform JSON ============
@contextlib.contextmanager
def tf_context(filename):
    try:
        tf_clear()
        yield
    finally:
        with open(filename, "w") as f:
            f.write(tf_render())
        tf_clear()


_TF_OBJECTS = {}


def tf_clear():
    global _TF_OBJECTS
    _TF_OBJECTS = {}


def tf_render():
    global _TF_OBJECTS
    return json.dumps(_TF_OBJECTS, indent=4)


def tf_render_objects(terraform_objects):
    return json.dumps(deep_merge(*terraform_objects), indent=4)


def register(f):
    def wrapper(*args, **kwargs):
        global _TF_OBJECTS
        obj = f(*args, **kwargs)
        _TF_OBJECTS = deep_merge(_TF_OBJECTS, obj)
        return obj

    return wrapper


@register
def Terraform(**kwargs):
    return {"terraform": kwargs}


@register
def RequiredProvider(_name, **kwargs):
    return {"terraform": {"required_providers": {_name: kwargs}}}


@register
def Provider(_name, **kwargs):
    return {"provider": {_name: kwargs}}


@register
def TerraformBackend(_name, **kwargs):
    return {"terraform": {"backend": {_name: kwargs}}}


@register
def Variable(_name, **kwargs):
    return {"variable": {_name: kwargs}}


@register
def Data(_resource_type, _name, **kwargs):
    return {"data": {_resource_type: {_name: kwargs}}}


@register
def Resource(_resource_type, _name, **kwargs):
    return {"resource": {_resource_type: {_name: kwargs}}}


@register
def Output(_name, **kwargs):
    return {"output": {_name: kwargs}}
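

# Illustrative sketch only, not part of the original module: the `@register`
# builders above accumulate their return values into one nested dict, which
# `tf_render` then serialises. The stand-alone demo below mimics that
# accumulation. `_example_merge` is a hypothetical stand-in for `deep_merge`,
# whose behaviour is assumed here to be a recursive dict merge, and the
# provider values are placeholders.
def _example_merge(left, right):
    """Recursively merge `right` into a shallow copy of `left`."""
    merged = dict(left)
    for key, value in right.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = _example_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def _example_registered_objects():
    """Merge the dict shapes produced by RequiredProvider, TerraformBackend and Provider."""
    objects = {}
    for obj in (
        {"terraform": {"required_providers": {"aws": {"source": "hashicorp/aws"}}}},
        {"terraform": {"backend": {"local": {}}}},
        {"provider": {"aws": {"region": "us-east-1"}}},
    ):
        objects = _example_merge(objects, obj)
    return objects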



---
File: nebari/src/_nebari/stages/bootstrap/__init__.py
---

import enum
import io
import pathlib
import typing
from inspect import cleandoc
from typing import Dict, List, Type

from _nebari.provider.cicd.github import gen_nebari_linter, gen_nebari_ops
from _nebari.provider.cicd.gitlab import gen_gitlab_ci
from nebari import schema
from nebari.hookspecs import NebariStage, hookimpl


def gen_gitignore():
    """
    Generate `.gitignore` file.
    Add files as needed.
    """
    filestoignore = """
        # ignore terraform state
        .terraform
        terraform.tfstate
        terraform.tfstate.backup
        .terraform.tfstate.lock.info

        # python
        __pycache__
    """
    return {pathlib.Path(".gitignore"): cleandoc(filestoignore)}
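

# Illustrative sketch only, not part of the original module: `cleandoc`
# removes the common leading indentation and the surrounding blank lines,
# which is why the indented literal in `gen_gitignore` renders flush-left.
# `_example_cleandoc` is a hypothetical helper added for demonstration.
def _example_cleandoc():
    from inspect import cleandoc  # local import keeps the sketch self-contained

    return cleandoc(
        """
        # python
        __pycache__
        """
    )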


def gen_cicd(config: schema.Main):
    """
    Use cicd schema to generate workflow files based on the
    `ci_cd` key in the `config`.

    For more detail on schema:
    GitHub-Actions - _nebari/provider/cicd/github.py
    GitLab-CI - _nebari/provider/cicd/gitlab.py
    """
    cicd_files = {}

    if config.ci_cd.type == CiEnum.github_actions:
        gha_dir = pathlib.Path(".github/workflows/")
        cicd_files[gha_dir / "nebari-ops.yaml"] = gen_nebari_ops(config)
        cicd_files[gha_dir / "nebari-linter.yaml"] = gen_nebari_linter(config)

    elif config.ci_cd.type == CiEnum.gitlab_ci:
        cicd_files[pathlib.Path(".gitlab-ci.yml")] = gen_gitlab_ci(config)

    else:
        raise ValueError(
            f"The ci_cd provider, {config.ci_cd.type.value}, is not supported. Supported providers include: `github-actions`, `gitlab-ci`."
        )

    return cicd_files


@schema.yaml_object(schema.yaml)
class CiEnum(str, enum.Enum):
    github_actions = "github-actions"
    gitlab_ci = "gitlab-ci"
    none = "none"

    @classmethod
    def to_yaml(cls, representer, node):
        return representer.represent_str(node.value)


class CICD(schema.Base):
    type: CiEnum = CiEnum.none
    branch: str = "main"
    commit_render: bool = True
    before_script: typing.List[typing.Union[str, typing.Dict]] = []
    after_script: typing.List[typing.Union[str, typing.Dict]] = []


class InputSchema(schema.Base):
    ci_cd: CICD = CICD()


class OutputSchema(schema.Base):
    pass


class BootstrapStage(NebariStage):
    name = "bootstrap"
    priority = 0

    input_schema = InputSchema
    output_schema = OutputSchema

    def render(self) -> Dict[str, str]:
        contents = {}
        if self.config.ci_cd.type != CiEnum.none:
            for fn, workflow in gen_cicd(self.config).items():
                stream = io.StringIO()
                schema.yaml.dump(
                    workflow.model_dump(
                        by_alias=True, exclude_unset=True, exclude_defaults=True
                    ),
                    stream,
                )
                contents.update({fn: stream.getvalue()})

        contents.update(gen_gitignore())
        return contents


@hookimpl
def nebari_stage() -> List[Type[NebariStage]]:
    return [BootstrapStage]



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/accounting/main.tf
---

resource "aws_resourcegroups_group" "main" {
  name        = var.project
  description = "project ${var.project} - environment ${var.environment}"

  resource_query {
    query = jsonencode({
      ResourceTypeFilters = ["AWS::AllSupported"]
      TagFilters = [
        {
          Key    = "Project"
          Values = [var.project]
        },
        {
          Key    = "Environment"
          Values = [var.environment]
        },
        {
          Key    = "Owner"
          Values = ["terraform", "terraform-state"]
        }
      ]
    })
  }

  tags = merge({
    Description = "AWS resources project=${var.project} and environment=${var.environment}"
  }, var.tags)

}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/accounting/variables.tf
---

variable "project" {
  description = "Project for resource group filter"
  type        = string
}

variable "environment" {
  description = "Environment for resource group filter"
  type        = string
}

variable "tags" {
  description = "Additional tags to apply to the resource group"
  type        = map(string)
  default     = {}
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/efs/main.tf
---

resource "aws_efs_file_system" "main" {
  creation_token = var.name

  encrypted = true

  throughput_mode = var.efs_throughput

  tags = merge({ Name = var.name }, var.tags)
}

resource "aws_efs_mount_target" "main" {
  count = length(var.efs_subnets)

  file_system_id = aws_efs_file_system.main.id

  subnet_id = var.efs_subnets[count.index]

  security_groups = var.efs_security_groups
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/efs/outputs.tf
---

output "credentials" {
  description = "EFS connection credentials"
  value = {
    dns_name = aws_efs_file_system.main.dns_name
  }
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/efs/variables.tf
---

variable "name" {
  description = "Prefix name to assign to efs resource"
  type        = string
}

variable "tags" {
  description = "Additional tags to apply to resource"
  type        = map(string)
  default     = {}
}

variable "efs_throughput" {
  description = "Throughput mode for EFS filesystem (bursting|provisioned)"
  type        = string
  default     = "bursting"
}

variable "efs_subnets" {
  description = "AWS VPC subnets to use for efs"
  type        = list(string)
}

variable "efs_security_groups" {
  description = "AWS security groups"
  type        = list(string)
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/kubernetes/files/user_data.tftpl
---

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="//"

%{ if node_pre_bootstrap_command != null }
--//
Content-Type: text/x-shellscript; charset="us-ascii"

${node_pre_bootstrap_command}
%{ endif }

%{ if include_bootstrap_cmd }
--//
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
set -ex

/etc/eks/bootstrap.sh ${cluster_name} --b64-cluster-ca ${cluster_cert_authority} --apiserver-endpoint ${cluster_endpoint}
%{ endif }

--//--



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/kubernetes/autoscaling.tf
---

resource "aws_iam_policy" "worker_autoscaling" {
  name_prefix = "eks-worker-autoscaling-${var.name}"
  description = "EKS worker node autoscaling policy for cluster ${var.name}"
  policy      = data.aws_iam_policy_document.worker_autoscaling.json
}

data "aws_iam_policy_document" "worker_autoscaling" {
  statement {
    sid    = "eksWorkerAutoscalingAll"
    effect = "Allow"

    actions = [
      "autoscaling:DescribeAutoScalingGroups",
      "autoscaling:DescribeAutoScalingInstances",
      "autoscaling:DescribeLaunchConfigurations",
      "autoscaling:DescribeTags",
      "ec2:DescribeLaunchTemplateVersions",
    ]

    resources = ["*"]
  }

  statement {
    sid    = "eksWorkerAutoscalingOwn"
    effect = "Allow"

    actions = [
      "autoscaling:SetDesiredCapacity",
      "autoscaling:TerminateInstanceInAutoScalingGroup",
      "autoscaling:UpdateAutoScalingGroup",
    ]

    resources = ["*"]

    condition {
      test     = "StringEquals"
      variable = "autoscaling:ResourceTag/kubernetes.io/cluster/${var.name}"
      values   = ["owned"]
    }

    condition {
      test     = "StringEquals"
      variable = "autoscaling:ResourceTag/k8s.io/cluster-autoscaler/enabled"
      values   = ["true"]
    }
  }
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/kubernetes/locals.tf
---

locals {
  cluster_policies = concat([
    "arn:${local.partition}:iam::aws:policy/AmazonEKSClusterPolicy",
    "arn:${local.partition}:iam::aws:policy/AmazonEKSServicePolicy",
    "arn:${local.partition}:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy",
  ], var.cluster_additional_policies)

  node_group_policies = concat([
    "arn:${local.partition}:iam::aws:policy/AmazonEKSWorkerNodePolicy",
    "arn:${local.partition}:iam::aws:policy/AmazonEKS_CNI_Policy",
    "arn:${local.partition}:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy",
    aws_iam_policy.worker_autoscaling.arn
  ], var.node_group_additional_policies)

  gpu_node_group_names = [for node_group in var.node_groups : node_group.name if node_group.gpu == true]

  partition = data.aws_partition.current.partition
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/kubernetes/main.tf
---

data "aws_partition" "current" {}

resource "aws_eks_cluster" "main" {
  name     = var.name
  role_arn = aws_iam_role.cluster.arn
  version  = var.kubernetes_version

  vpc_config {
    security_group_ids = var.cluster_security_groups
    subnet_ids         = var.cluster_subnets
    #trivy:ignore:AVD-AWS-0040
    endpoint_public_access  = var.endpoint_public_access
    endpoint_private_access = var.endpoint_private_access
    public_access_cidrs     = var.public_access_cidrs
  }

  # Only set encryption_config if eks_kms_arn is not null
  dynamic "encryption_config" {
    for_each = var.eks_kms_arn != null ? [1] : []
    content {
      provider {
        key_arn = var.eks_kms_arn
      }
      resources = ["secrets"]
    }
  }

  depends_on = [
    aws_iam_role_policy_attachment.cluster-policy,
    aws_iam_role_policy_attachment.cluster_encryption,
  ]

  tags = merge({ Name = var.name }, var.tags)
}

## aws_launch_template user_data invocation
## When using a custom AMI, the /etc/eks/bootstrap.sh command and its args must be included explicitly;
## on the default AWS EKS node AMI, the bootstrap command is appended automatically
resource "aws_launch_template" "main" {
  for_each = {
    for node_group in var.node_groups :
    node_group.name => node_group
    if node_group.launch_template != null
  }

  name_prefix = "eks-${var.name}-${each.value.name}-"
  image_id    = each.value.launch_template.ami_id

  vpc_security_group_ids = var.cluster_security_groups


  metadata_options {
    http_tokens            = "required"
    http_endpoint          = "enabled"
    instance_metadata_tags = "enabled"
  }

  block_device_mappings {
    device_name = "/dev/xvda"
    ebs {
      volume_size = 50
      volume_type = "gp2"
    }
  }

  # https://docs.aws.amazon.com/eks/latest/userguide/launch-templates.html#launch-template-basics
  user_data = base64encode(
    templatefile(
      "${path.module}/files/user_data.tftpl",
      {
        node_pre_bootstrap_command = each.value.launch_template.pre_bootstrap_command
        # This will ensure the bootstrap user data is used to join the node
        include_bootstrap_cmd  = each.value.launch_template.ami_id != null
        cluster_name           = aws_eks_cluster.main.name
        cluster_cert_authority = aws_eks_cluster.main.certificate_authority[0].data
        cluster_endpoint       = aws_eks_cluster.main.endpoint
      }
    )
  )
}


resource "aws_eks_node_group" "main" {
  count = length(var.node_groups)

  cluster_name    = aws_eks_cluster.main.name
  node_group_name = var.node_groups[count.index].name
  node_role_arn   = aws_iam_role.node-group.arn
  subnet_ids      = var.node_groups[count.index].single_subnet ? [element(var.cluster_subnets, 0)] : var.cluster_subnets

  instance_types = [var.node_groups[count.index].instance_type]
  ami_type       = var.node_groups[count.index].ami_type
  disk_size      = var.node_groups[count.index].launch_template == null ? 50 : null

  scaling_config {
    min_size     = var.node_groups[count.index].min_size
    desired_size = var.node_groups[count.index].desired_size
    max_size     = var.node_groups[count.index].max_size
  }

  # Only set launch_template if its node_group counterpart parameter is not null
  dynamic "launch_template" {
    for_each = var.node_groups[count.index].launch_template != null ? [0] : []
    content {
      id      = aws_launch_template.main[var.node_groups[count.index].name].id
      version = aws_launch_template.main[var.node_groups[count.index].name].latest_version
    }
  }

  labels = {
    "dedicated" = var.node_groups[count.index].name
  }

  lifecycle {
    ignore_changes = [
      scaling_config[0].desired_size,
    ]
  }

  # Ensure that IAM Role permissions are created before and deleted
  # after EKS Node Group handling.  Otherwise, EKS will not be able to
  # properly delete EC2 Instances and Elastic Network Interfaces.
  depends_on = [
    aws_iam_role_policy_attachment.node-group-policy,
  ]

  tags = merge({
    "k8s.io/cluster-autoscaler/node-template/label/dedicated" = var.node_groups[count.index].name
    propagate_at_launch                                       = true
  }, var.tags)
}

data "aws_eks_cluster_auth" "main" {
  name = aws_eks_cluster.main.name
}

resource "aws_eks_addon" "aws-ebs-csi-driver" {
  # required for Kubernetes v1.23+ on AWS
  addon_name                  = "aws-ebs-csi-driver"
  cluster_name                = aws_eks_cluster.main.name
  resolve_conflicts_on_create = "OVERWRITE"
  resolve_conflicts_on_update = "OVERWRITE"

  configuration_values = jsonencode({
    controller = {
      nodeSelector = {
        "eks.amazonaws.com/nodegroup" = "general"
      }
    }
    defaultStorageClass = {
      enabled = true
    }
  })

  # Ensure cluster and node groups are created
  depends_on = [
    aws_eks_cluster.main,
    aws_eks_node_group.main,
  ]
}

resource "aws_eks_addon" "coredns" {
  addon_name                  = "coredns"
  cluster_name                = aws_eks_cluster.main.name
  resolve_conflicts_on_create = "OVERWRITE"
  resolve_conflicts_on_update = "OVERWRITE"


  configuration_values = jsonencode({
    nodeSelector = {
      "eks.amazonaws.com/nodegroup" = "general"
    }
  })

  # Ensure cluster and node groups are created
  depends_on = [
    aws_eks_cluster.main,
    aws_eks_node_group.main,
  ]
}

data "tls_certificate" "this" {
  url = aws_eks_cluster.main.identity[0].oidc[0].issuer
}

resource "aws_iam_openid_connect_provider" "oidc_provider" {
  client_id_list  = ["sts.${data.aws_partition.current.dns_suffix}"]
  thumbprint_list = data.tls_certificate.this.certificates[*].sha1_fingerprint
  url             = aws_eks_cluster.main.identity[0].oidc[0].issuer

  tags = merge(
    { Name = "${var.name}-eks-irsa" },
    var.tags
  )
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/kubernetes/outputs.tf
---

output "credentials" {
  description = "AWS eks credentials"
  sensitive   = true
  value = {
    endpoint               = aws_eks_cluster.main.endpoint
    token                  = data.aws_eks_cluster_auth.main.token
    cluster_ca_certificate = base64decode(aws_eks_cluster.main.certificate_authority[0].data)
  }
}

output "node_groups_arn" {
  value = aws_eks_node_group.main[*].arn
}

output "cluster_oidc_issuer_url" {
  description = "The URL on the EKS cluster for the OpenID Connect identity provider"
  value       = aws_eks_cluster.main.identity[0].oidc[0].issuer
}

output "oidc_provider_arn" {
  description = "The ARN of the OIDC Provider"
  value       = aws_iam_openid_connect_provider.oidc_provider.arn
}

# https://github.com/terraform-aws-modules/terraform-aws-eks/blob/16f46db94b7158fd762d9133119206aaa7cf6d63/examples/self_managed_node_group/main.tf
output "kubeconfig" {
  description = "Kubernetes connection configuration kubeconfig"
  value = yamlencode({
    apiVersion      = "v1"
    kind            = "Config"
    current-context = "terraform"
    clusters = [{
      name = aws_eks_cluster.main.name
      cluster = {
        certificate-authority-data = aws_eks_cluster.main.certificate_authority[0].data
        server                     = aws_eks_cluster.main.endpoint
      }
    }]
    contexts = [{
      name = "terraform"
      context = {
        cluster = aws_eks_cluster.main.name
        user    = "terraform"
      }
    }]
    users = [{
      name = "terraform"
      user = {
        token = data.aws_eks_cluster_auth.main.token
      }
    }]
  })
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/kubernetes/policy.tf
---

# =======================================================
# Kubernetes Cluster Roles
# Sets up the policies that were previously created with eksctl
# =======================================================

# =======================================================
# Kubernetes Cluster Policies
# =======================================================

resource "aws_iam_role" "cluster" {
  name = "${var.name}-eks-cluster-role"

  assume_role_policy = jsonencode({
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "eks.amazonaws.com"
        }
    }]
    Version = "2012-10-17"
  })
  permissions_boundary = var.permissions_boundary
  tags                 = var.tags
}

resource "aws_iam_role_policy_attachment" "cluster-policy" {
  count = length(local.cluster_policies)

  policy_arn = local.cluster_policies[count.index]
  role       = aws_iam_role.cluster.name
}

data "aws_iam_policy_document" "cluster_encryption" {
  count = var.eks_kms_arn != null ? 1 : 0
  statement {
    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:ListGrants",
      "kms:DescribeKey"
    ]
    resources = [var.eks_kms_arn]
  }
}

resource "aws_iam_policy" "cluster_encryption" {
  count       = var.eks_kms_arn != null ? 1 : 0
  name        = "${var.name}-eks-encryption-policy"
  description = "IAM policy for EKS cluster encryption"
  policy      = data.aws_iam_policy_document.cluster_encryption[count.index].json
}

# Grant the EKS Cluster role KMS permissions if a key-arn is specified
resource "aws_iam_role_policy_attachment" "cluster_encryption" {
  count      = var.eks_kms_arn != null ? 1 : 0
  policy_arn = aws_iam_policy.cluster_encryption[count.index].arn
  role       = aws_iam_role.cluster.name
}

# =======================================================
# Kubernetes Node Group Policies
# =======================================================

resource "aws_iam_role" "node-group" {
  name = "${var.name}-eks-node-group-role"

  assume_role_policy = jsonencode({
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
    }]
    Version = "2012-10-17"
  })
  permissions_boundary = var.permissions_boundary
  tags                 = var.tags
}

resource "aws_iam_role_policy_attachment" "node-group-policy" {
  count = length(local.node_group_policies)

  policy_arn = local.node_group_policies[count.index]
  role       = aws_iam_role.node-group.name
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/kubernetes/variables.tf
---

variable "name" {
  description = "Prefix name for EKS cluster"
  type        = string
}

variable "tags" {
  description = "Additional tags for EKS cluster"
  type        = map(string)
  default     = {}
}

variable "cluster_subnets" {
  description = "AWS VPC subnets to use for EKS cluster"
  type        = list(string)
}

variable "region" {
  description = "AWS region for EKS cluster"
  type        = string
}

variable "kubernetes_version" {
  description = "AWS kubernetes version for EKS cluster"
  type        = string
}

variable "cluster_security_groups" {
  description = "AWS security groups to use for EKS cluster"
  type        = list(string)
}

variable "cluster_additional_policies" {
  description = "Additional policies to add to cluster"
  type        = list(string)
  default     = []
}

variable "node_group_additional_policies" {
  description = "Additional policies to add to each node group"
  type        = list(string)
  default     = []
}

variable "node_groups" {
  description = "Node groups to add to EKS Cluster"
  type = list(object({
    name            = string
    instance_type   = string
    gpu             = bool
    min_size        = number
    desired_size    = number
    max_size        = number
    single_subnet   = bool
    launch_template = map(any)
    ami_type        = string
  }))
}

variable "node_group_instance_type" {
  description = "AWS instance types to use for kubernetes nodes"
  type        = string
  default     = "m5.large"
}

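variable "endpoint_public_access" {
  description = "Whether the EKS cluster API server endpoint is publicly accessible"
  type        = bool
  default     = true
}

variable "endpoint_private_access" {
  description = "Whether the EKS cluster API server endpoint is accessible from within the VPC"
  type        = bool
  default     = false
}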

variable "eks_kms_arn" {
  description = "kms key arn for EKS cluster encryption_config"
  type        = string
  default     = null
}

variable "public_access_cidrs" {
  description = "CIDR blocks allowed to reach the public EKS API server endpoint"
  type        = list(string)
  default     = ["0.0.0.0/0"]
}

variable "permissions_boundary" {
  description = "ARN of the policy that is used to set the permissions boundary for the role"
  type        = string
  default     = null
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/network/main.tf
---

resource "aws_vpc" "main" {
  cidr_block = var.vpc_cidr_block

  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = merge({ Name = var.name }, var.tags, var.vpc_tags)
}

resource "aws_subnet" "main" {
  count = length(var.aws_availability_zones)

  availability_zone       = var.aws_availability_zones[count.index]
  cidr_block              = cidrsubnet(var.vpc_cidr_block, var.vpc_cidr_newbits, count.index)
  vpc_id                  = aws_vpc.main.id
  map_public_ip_on_launch = true

  tags = merge({ Name = "${var.name}-subnet-${count.index}" }, var.tags, var.subnet_tags)

  lifecycle {
    ignore_changes = [
      availability_zone
    ]
  }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = merge({ Name = var.name }, var.tags)
}

resource "aws_route_table" "main" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = merge({ Name = var.name }, var.tags)
}

resource "aws_route_table_association" "main" {
  count = length(var.aws_availability_zones)

  subnet_id      = aws_subnet.main[count.index].id
  route_table_id = aws_route_table.main.id
}

resource "aws_security_group" "main" {
  name        = var.name
  description = "Main security group for infrastructure deployment"

  vpc_id = aws_vpc.main.id

  ingress {
    description = "Allow all ports and protocols to enter the security group"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = [var.vpc_cidr_block]
  }

  #trivy:ignore:AVD-AWS-0104
  egress {
    description = "Allow all ports and protocols to exit the security group"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = merge({ Name = var.name }, var.tags, var.security_group_tags)
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/network/outputs.tf
---

output "security_group_id" {
  description = "AWS security group id"
  value       = aws_security_group.main.id
}

output "subnet_ids" {
  description = "AWS VPC subnet ids"
  value       = aws_subnet.main[*].id
}

output "vpc_id" {
  description = "AWS VPC id"
  value       = aws_vpc.main.id
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/network/variables.tf
---

variable "name" {
  description = "Prefix name to give to network resources"
  type        = string
}

variable "tags" {
  description = "Additional tags to apply to all network resources"
  type        = map(string)
  default     = {}
}

variable "vpc_tags" {
  description = "Additional tags to apply to vpc network resource"
  type        = map(string)
  default     = {}
}

variable "subnet_tags" {
  description = "Additional tags to apply to subnet network resources"
  type        = map(string)
  default     = {}
}

variable "security_group_tags" {
  description = "Additional tags to apply to security group network resource"
  type        = map(string)
  default     = {}
}

variable "aws_availability_zones" {
  description = "AWS Availability zones to operate infrastructure"
  type        = list(string)
}

variable "vpc_cidr_block" {
  description = "CIDR block of the VPC; all subnets are allocated inside it"
  type        = string
}

variable "vpc_cidr_newbits" {
  description = "Number of bits added to the VPC CIDR prefix for each subnet (supports up to 2^N subnets)"
  type        = number
  default     = 2
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/registry/main.tf
---

resource "aws_ecr_repository" "main" {
  name = var.name

  image_scanning_configuration {
    scan_on_push = true
  }

  tags = merge({ Name = var.name }, var.tags)
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/registry/outputs.tf
---

output "credentials" {
  description = "ECR credentials"
  value = {
    arn            = aws_ecr_repository.main.arn
    repository_url = aws_ecr_repository.main.repository_url
  }
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/modules/registry/variables.tf
---

variable "name" {
  description = "Prefix AWS registry name"
  type        = string
}

variable "tags" {
  description = "AWS ECR Registry tags"
  type        = map(string)
  default     = {}
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/locals.tf
---

locals {
  additional_tags = merge(
    {
      Project     = var.name
      Owner       = "terraform"
      Environment = var.environment
    },
    var.tags,
  )
  cluster_name = "${var.name}-${var.environment}"
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/main.tf
---

data "aws_availability_zones" "awszones" {
  filter {
    name   = "opt-in-status"
    values = ["opt-in-not-required"]
  }
}

data "aws_partition" "current" {}

locals {
  # Only override_network if both existing_subnet_ids and existing_security_group_id are not null.
  override_network  = (var.existing_subnet_ids != null) && (var.existing_security_group_id != null)
  subnet_ids        = local.override_network ? var.existing_subnet_ids : module.network[0].subnet_ids
  security_group_id = local.override_network ? var.existing_security_group_id : module.network[0].security_group_id
  partition         = data.aws_partition.current.partition
}
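
# Illustrative sketch of the override logic above (hypothetical IDs, not real resources).
# Supplying both inputs, for example in a .tfvars file:
#   existing_subnet_ids        = ["subnet-aaaa1111", "subnet-bbbb2222"]
#   existing_security_group_id = "sg-cccc3333"
# makes local.override_network true, so module.network is created with count = 0 and
# the EFS and Kubernetes modules receive these IDs. If either input is null, the
# network module is created and its subnet and security group outputs are used instead.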

# ==================== ACCOUNTING ======================
module "accounting" {
  source = "./modules/accounting"

  project     = var.name
  environment = var.environment

  tags = local.additional_tags
}


# ======================= NETWORK ======================
module "network" {
  count = local.override_network ? 0 : 1

  source = "./modules/network"

  name = local.cluster_name

  tags = local.additional_tags

  vpc_tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
  }

  subnet_tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
  }

  security_group_tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "owned"
  }

  vpc_cidr_block         = var.vpc_cidr_block
  aws_availability_zones = length(var.availability_zones) >= 2 ? var.availability_zones : slice(sort(data.aws_availability_zones.awszones.names), 0, 2)
}
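
# Illustrative sketch of the availability-zone fallback above (hypothetical zone names).
# If var.availability_zones has fewer than two entries and the region reports
#   ["us-west-2c", "us-west-2a", "us-west-2b"]
# then slice(sort(...), 0, 2) yields
#   ["us-west-2a", "us-west-2b"]
# A single user-supplied zone is therefore ignored in favor of the first two sorted zones.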


# ==================== REGISTRIES =====================
module "registry-jupyterlab" {
  source = "./modules/registry"

  name = "${local.cluster_name}-jupyterlab"
  tags = local.additional_tags
}


# ====================== EFS =========================
module "efs" {
  count  = var.efs_enabled ? 1 : 0
  source = "./modules/efs"

  name = "${local.cluster_name}-jupyterhub-shared"
  tags = local.additional_tags

  efs_subnets         = local.subnet_ids
  efs_security_groups = [local.security_group_id]
}

moved {
  from = module.efs
  to   = module.efs[0]
}

# ==================== KUBERNETES =====================
module "kubernetes" {
  source = "./modules/kubernetes"

  name               = local.cluster_name
  tags               = local.additional_tags
  region             = var.region
  kubernetes_version = var.kubernetes_version

  cluster_subnets         = local.subnet_ids
  cluster_security_groups = [local.security_group_id]

  node_group_additional_policies = [
    "arn:${local.partition}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  ]

  node_groups = var.node_groups

  # "public" enables only the public endpoint and "private" only the private one;
  # any other value enables both.
  endpoint_public_access  = var.eks_endpoint_access != "private"
  endpoint_private_access = var.eks_endpoint_access != "public"
  eks_kms_arn             = var.eks_kms_arn
  public_access_cidrs     = var.eks_public_access_cidrs
  permissions_boundary    = var.permissions_boundary
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/outputs.tf
---

output "kubernetes_credentials" {
  description = "Parameters needed to connect to kubernetes cluster"
  sensitive   = true
  value = {
    host                   = module.kubernetes.credentials.endpoint
    cluster_ca_certificate = module.kubernetes.credentials.cluster_ca_certificate
    token                  = module.kubernetes.credentials.token
  }
}

resource "local_file" "kubeconfig" {
  count = var.kubeconfig_filename != null ? 1 : 0

  content  = module.kubernetes.kubeconfig
  filename = var.kubeconfig_filename
}

output "kubeconfig_filename" {
  description = "filename for nebari kubeconfig"
  value       = var.kubeconfig_filename
}

output "nfs_endpoint" {
  description = "Endpoint for nfs server"
  value       = length(module.efs) == 1 ? module.efs[0].credentials.dns_name : null
}

output "cluster_oidc_issuer_url" {
  description = "The URL on the EKS cluster for the OpenID Connect identity provider"
  value       = module.kubernetes.cluster_oidc_issuer_url
}

output "oidc_provider_arn" {
  description = "The ARN of the OIDC Provider"
  value       = module.kubernetes.oidc_provider_arn
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/variables.tf
---

variable "name" {
  description = "Prefix name to assign to Nebari resources"
  type        = string
}

variable "environment" {
  description = "Environment to create Kubernetes resources"
  type        = string
}

variable "existing_subnet_ids" {
  description = "Existing subnet IDs to use for Kubernetes resources"
  type        = list(string)
}

variable "existing_security_group_id" {
  description = "Existing security group ID to use for Kubernetes resources"
  type        = string
}

variable "region" {
  description = "AWS region for EKS cluster"
  type        = string
}

variable "kubernetes_version" {
  description = "AWS kubernetes version for EKS cluster"
  type        = string
}

variable "node_groups" {
  description = "AWS node groups"
  type = list(object({
    name            = string
    instance_type   = string
    gpu             = bool
    min_size        = number
    desired_size    = number
    max_size        = number
    single_subnet   = bool
    launch_template = map(any)
    ami_type        = string
  }))
}

variable "availability_zones" {
  description = "AWS availability zones within AWS region"
  type        = list(string)
}

variable "vpc_cidr_block" {
  description = "VPC cidr block for infrastructure"
  type        = string
}

variable "kubeconfig_filename" {
  description = "Kubernetes kubeconfig written to filesystem"
  type        = string
}

variable "eks_endpoint_access" {
  description = "EKS cluster api server endpoint access setting"
  type        = string
  default     = "public"
}

variable "eks_endpoint_private_access" {
  description = "Enable private access to the EKS cluster API server endpoint"
  type        = bool
  default     = false
}

variable "eks_kms_arn" {
  description = "kms key arn for EKS cluster encryption_config"
  type        = string
  default     = null
}

variable "eks_public_access_cidrs" {
  description = "CIDR blocks allowed to access the public EKS cluster API server endpoint"
  type        = list(string)
  default     = ["0.0.0.0/0"]
}

variable "permissions_boundary" {
  description = "ARN of the policy that is used to set the permissions boundary for the role"
  type        = string
  default     = null
}

variable "tags" {
  description = "Additional tags to add to resources"
  type        = map(string)
  default     = {}
}

variable "efs_enabled" {
  description = "Enable EFS"
  type        = bool
}



---
File: nebari/src/_nebari/stages/infrastructure/template/aws/versions.tf
---

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.33.0"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/infrastructure/template/azure/modules/kubernetes/main.tf
---

# https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster
resource "azurerm_kubernetes_cluster" "main" {
  name                = var.name
  location            = var.location
  resource_group_name = var.resource_group_name
  tags                = var.tags
  api_server_access_profile {
    authorized_ip_ranges = var.authorized_ip_ranges
  }

  # To enable Azure AD Workload Identity oidc_issuer_enabled must be set to true.
  oidc_issuer_enabled       = var.workload_identity_enabled
  workload_identity_enabled = var.workload_identity_enabled

  # DNS prefix specified when creating the managed cluster. Changing this forces a new resource to be created.
  dns_prefix = "Nebari" # required

  # Azure requires that a new, non-existent Resource Group is used, as otherwise the provisioning of the Kubernetes Service will fail.
  node_resource_group     = var.node_resource_group_name
  private_cluster_enabled = var.private_cluster_enabled
  # https://learn.microsoft.com/en-ie/azure/governance/policy/concepts/policy-for-kubernetes
  azure_policy_enabled = var.azure_policy_enabled


  dynamic "network_profile" {
    for_each = var.network_profile != null ? [var.network_profile] : []
    content {
      network_plugin = network_profile.value.network_plugin
      network_policy = network_profile.value.network_policy
      service_cidr   = network_profile.value.service_cidr
      dns_service_ip = network_profile.value.dns_service_ip
    }
  }

  kubernetes_version = var.kubernetes_version
  default_node_pool {
    vnet_subnet_id       = var.vnet_subnet_id
    name                 = var.node_groups[0].name
    vm_size              = var.node_groups[0].instance_type
    auto_scaling_enabled = true
    min_count            = var.node_groups[0].min_size
    max_count            = var.node_groups[0].max_size
    max_pods             = var.max_pods

    orchestrator_version = var.kubernetes_version
    node_labels = {
      "azure-node-pool" = var.node_groups[0].name
    }
    tags = var.tags

    # temporary_name_for_rotation must be <= 12 characters (up to 9 from the pool name plus "tmp")
    temporary_name_for_rotation = "${substr(var.node_groups[0].name, 0, 9)}tmp"
  }

  sku_tier = "Free" # "Free" (default), "Standard", or "Premium"

  identity {
    type = "SystemAssigned" # "UserAssigned" or "SystemAssigned". SystemAssigned identity lifecycles are tied to the AKS cluster.
  }

  lifecycle {
    ignore_changes = [
      # Ignore changes here; otherwise the AKS cluster unsets this default value on every deploy.
      # https://github.com/hashicorp/terraform-provider-azurerm/issues/24020#issuecomment-1887670287
      default_node_pool[0].upgrade_settings,
    ]
  }

}

# https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster_node_pool
resource "azurerm_kubernetes_cluster_node_pool" "node_group" {
  for_each = { for i, group in var.node_groups : i => group if i != 0 }

  name                  = each.value.name
  kubernetes_cluster_id = azurerm_kubernetes_cluster.main.id
  vm_size               = each.value.instance_type
  auto_scaling_enabled  = true
  mode                  = "User" # "System" or "User", only "User" nodes can scale down to 0
  min_count             = each.value.min_size
  max_count             = each.value.max_size
  max_pods              = var.max_pods
  node_labels = {
    "azure-node-pool" = each.value.name
  }
  orchestrator_version = var.kubernetes_version
  tags                 = var.tags
  vnet_subnet_id       = var.vnet_subnet_id
}



---
File: nebari/src/_nebari/stages/infrastructure/template/azure/modules/kubernetes/outputs.tf
---

output "credentials" {
  description = "Credentials required for connecting to kubernetes cluster"
  sensitive   = true
  value = {
    # see bottom of https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster
    endpoint               = azurerm_kubernetes_cluster.main.kube_config[0].host
    username               = azurerm_kubernetes_cluster.main.kube_config[0].username
    password               = azurerm_kubernetes_cluster.main.kube_config[0].password
    client_certificate     = base64decode(azurerm_kubernetes_cluster.main.kube_config[0].client_certificate)
    client_key             = base64decode(azurerm_kubernetes_cluster.main.kube_config[0].client_key)
    cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.main.kube_config[0].cluster_ca_certificate)
  }
}

output "kubeconfig" {
  description = "Kubernetes connection kubeconfig"
  sensitive   = true
  value       = azurerm_kubernetes_cluster.main.kube_config_raw
}

output "cluster_oidc_issuer_url" {
  description = "The OpenID Connect issuer URL that is associated with the AKS cluster"
  value       = azurerm_kubernetes_cluster.main.oidc_issuer_url
}

output "resource_group_name" {
  description = "The name of the resource group in which the AKS cluster is created"
  value       = azurerm_kubernetes_cluster.main.resource_group_name
}



---
File: nebari/src/_nebari/stages/infrastructure/template/azure/modules/kubernetes/variables.tf
---

variable "name" {
  description = "Prefix name to assign to azure kubernetes cluster"
  type        = string
}

# `az account list-locations`
variable "location" {
  description = "Location for Azure Kubernetes cluster"
  type        = string
}

variable "resource_group_name" {
  description = "name of nebari resource group"
  type        = string
}

variable "node_resource_group_name" {
  description = "name of new resource group for AKS nodes"
  type        = string
}

variable "kubernetes_version" {
  description = "Version of Kubernetes"
  type        = string
}

variable "environment" {
  description = "Environment to create Kubernetes resources"
  type        = string
}


variable "node_groups" {
  description = "Node pools to add to Azure Kubernetes Cluster"
  type        = list(map(any))
}

variable "vnet_subnet_id" {
  description = "The ID of a Subnet where the Kubernetes Node Pool should exist. Changing this forces a new resource to be created."
  type        = string
  default     = null
}

variable "private_cluster_enabled" {
  description = "Should this Kubernetes Cluster have its API server only exposed on internal IP addresses? This provides a Private IP Address for the Kubernetes API on the Virtual Network where the Kubernetes Cluster is located. Defaults to false. Changing this forces a new resource to be created."
  default     = false
  type        = bool
}

variable "tags" {
  description = "Tags to apply to all resources"
  type        = map(string)
  default     = {}
}

variable "network_profile" {
  description = "Network profile"
  type = object({
    network_plugin = string
    network_policy = string
    service_cidr   = string
    dns_service_ip = string
  })
  default = null
}

variable "max_pods" {
  description = "Maximum number of pods that can run on a node"
  type        = number
  default     = 60
}

variable "workload_identity_enabled" {
  description = "Enable Workload Identity"
  type        = bool
  default     = false
}

variable "authorized_ip_ranges" {
  description = "IP ranges allowed to access the Kubernetes API server; defaults to 0.0.0.0/0"
  type        = list(string)
  default     = ["0.0.0.0/0"]
}

variable "azure_policy_enabled" {
  description = "Enable Azure Policy"
  type        = bool
  default     = false
}



---
File: nebari/src/_nebari/stages/infrastructure/template/azure/modules/registry/main.tf
---

resource "azurerm_container_registry" "container_registry" {
  name                = var.name
  resource_group_name = var.resource_group_name
  location            = var.location
  sku                 = "Standard"
  tags                = var.tags
}



---
File: nebari/src/_nebari/stages/infrastructure/template/azure/modules/registry/variables.tf
---

variable "name" {
  description = "Prefix name to azure container registry"
  type        = string
}

variable "location" {
  description = "Location of nebari resource group"
  type        = string
}

variable "resource_group_name" {
  description = "name of nebari resource group"
  type        = string
}

variable "tags" {
  description = "Tags to apply to all resources"
  type        = map(string)
  default     = {}
}



---
File: nebari/src/_nebari/stages/infrastructure/template/azure/main.tf
---

resource "azurerm_resource_group" "resource_group" {
  name     = var.resource_group_name
  location = var.region
  tags     = var.tags
}


module "registry" {
  source = "./modules/registry"

  name                = "${var.name}${var.environment}"
  location            = var.region
  resource_group_name = azurerm_resource_group.resource_group.name
  tags                = var.tags
}


module "kubernetes" {
  source = "./modules/kubernetes"

  name                = "${var.name}-${var.environment}"
  environment         = var.environment
  location            = var.region
  resource_group_name = azurerm_resource_group.resource_group.name
  # Azure requires that a new, non-existent Resource Group is used, as otherwise
  # the provisioning of the Kubernetes Service will fail.
  node_resource_group_name = var.node_resource_group_name
  kubernetes_version       = var.kubernetes_version
  tags                     = var.tags
  max_pods                 = var.max_pods
  authorized_ip_ranges     = var.authorized_ip_ranges

  network_profile = var.network_profile

  node_groups = [
    for name, config in var.node_groups : {
      name          = name
      auto_scale    = true
      instance_type = config.instance
      min_size      = config.min_nodes
      max_size      = config.max_nodes
    }
  ]
  vnet_subnet_id            = var.vnet_subnet_id
  private_cluster_enabled   = var.private_cluster_enabled
  workload_identity_enabled = var.workload_identity_enabled
  azure_policy_enabled      = var.azure_policy_enabled
}
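
# Illustrative sketch of the node_groups reshaping above (hypothetical values).
# An input such as
#   node_groups = {
#     general = { instance = "Standard_D8_v3", min_nodes = 1, max_nodes = 1 }
#     user    = { instance = "Standard_D4_v3", min_nodes = 0, max_nodes = 5 }
#   }
# is turned by the for expression into a list, iterated in lexical key order:
#   [
#     { name = "general", auto_scale = true, instance_type = "Standard_D8_v3", min_size = 1, max_size = 1 },
#     { name = "user", auto_scale = true, instance_type = "Standard_D4_v3", min_size = 0, max_size = 5 },
#   ]
# modules/kubernetes uses element 0 (here "general") for default_node_pool and
# creates a separate azurerm_kubernetes_cluster_node_pool for each remaining element.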



---
File: nebari/src/_nebari/stages/infrastructure/template/azure/outputs.tf
---

output "kubernetes_credentials" {
  description = "Parameters needed to connect to kubernetes cluster"
  sensitive   = true
  value = {
    username               = module.kubernetes.credentials.username
    password               = module.kubernetes.credentials.password
    client_certificate     = module.kubernetes.credentials.client_certificate
    client_key             = module.kubernetes.credentials.client_key
    cluster_ca_certificate = module.kubernetes.credentials.cluster_ca_certificate
    host                   = module.kubernetes.credentials.endpoint
  }
}

resource "local_file" "kubeconfig" {
  count = var.kubeconfig_filename != null ? 1 : 0

  content  = module.kubernetes.kubeconfig
  filename = var.kubeconfig_filename
}

output "kubeconfig_filename" {
  description = "filename for nebari kubeconfig"
  value       = var.kubeconfig_filename
}

output "cluster_oidc_issuer_url" {
  description = "The OpenID Connect issuer URL that is associated with the AKS cluster"
  value       = module.kubernetes.cluster_oidc_issuer_url
}

output "resource_group_name" {
  description = "The name of the resource group in which the AKS cluster is created"
  value       = module.kubernetes.resource_group_name
}



---
File: nebari/src/_nebari/stages/infrastructure/template/azure/providers.tf
---

provider "azurerm" {
  features {}
}



---
File: nebari/src/_nebari/stages/infrastructure/template/azure/variables.tf
---

variable "name" {
  description = "Prefix name to assign to nebari resources"
  type        = string
}

variable "environment" {
  description = "Environment to create Kubernetes resources"
  type        = string
}

variable "region" {
  description = "Azure region"
  type        = string
}

variable "kubernetes_version" {
  description = "Azure kubernetes version"
  type        = string
}

variable "node_groups" {
  description = "Azure node groups"
  type = map(object({
    instance  = string
    min_nodes = number
    max_nodes = number
  }))
}

variable "kubeconfig_filename" {
  description = "Kubernetes kubeconfig written to filesystem"
  type        = string
}

variable "resource_group_name" {
  description = "Specifies the Resource Group where the Managed Kubernetes Cluster should exist"
  type        = string
}

variable "node_resource_group_name" {
  description = "The name of the Resource Group where the Kubernetes Nodes should exist"
  type        = string
}

variable "vnet_subnet_id" {
  description = "The ID of a Subnet where the Kubernetes Node Pool should exist. Changing this forces a new resource to be created."
  type        = string
}

variable "private_cluster_enabled" {
  description = "Should this Kubernetes Cluster have its API server only exposed on internal IP addresses? This provides a Private IP Address for the Kubernetes API on the Virtual Network where the Kubernetes Cluster is located. Defaults to false. Changing this forces a new resource to be created."
  default     = false
  type        = bool
}

variable "tags" {
  description = "Tags to apply to all resources"
  type        = map(string)
  default     = {}
}

variable "network_profile" {
  description = "Network profile"
  type = object({
    network_plugin = string
    network_policy = string
    service_cidr   = string
    dns_service_ip = string
  })
  default = null
}

variable "max_pods" {
  description = "Maximum number of pods that can run on a node"
  type        = number
  default     = 60
}

variable "workload_identity_enabled" {
  description = "Enable Workload Identity"
  type        = bool
  default     = false
}

variable "authorized_ip_ranges" {
  description = "IP ranges allowed to access the Kubernetes API server; defaults to 0.0.0.0/0"
  type        = list(string)
  default     = ["0.0.0.0/0"]
}

variable "azure_policy_enabled" {
  description = "Enable Azure Policy"
  type        = bool
  default     = false
}



---
File: nebari/src/_nebari/stages/infrastructure/template/azure/versions.tf
---

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "=4.7.0"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/infrastructure/template/existing/main.tf
---

variable "kube_context" {
  description = "Optional kubernetes context to use to connect to kubernetes cluster"
  type        = string
}

output "kubernetes_credentials" {
  description = "Parameters needed to connect to kubernetes cluster locally"
  value = {
    config_path    = pathexpand("~/.kube/config")
    config_context = var.kube_context
  }
}

output "kubeconfig_filename" {
  description = "filename for nebari kubeconfig"
  value       = pathexpand("~/.kube/config")
}



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/modules/kubernetes/templates/kubeconfig.yaml
---

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ${cluster_ca_certificate}
    server: https://${endpoint}
  name: ${context}
contexts:
- context:
    cluster: ${context}
    user: ${context}
  name: ${context}
current-context: ${context}
kind: Config
preferences: {}
users:
- name: ${context}
  user:
    token: ${token}



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/modules/kubernetes/locals.tf
---

locals {
  node_group_service_account_roles = concat(var.additional_node_group_roles, [
    "roles/logging.logWriter",
    "roles/monitoring.metricWriter",
    "roles/monitoring.viewer",
    "roles/stackdriver.resourceMetadata.writer"
  ])

  node_group_oauth_scopes = concat(var.additional_node_group_oauth_scopes, [
    "https://www.googleapis.com/auth/logging.write",
    "https://www.googleapis.com/auth/monitoring"
  ])

  merged_node_groups = [for node_group in var.node_groups : merge(var.node_group_defaults, node_group)]

}
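
# Illustrative sketch of merged_node_groups above (hypothetical input, not part of the module).
# With the default value of var.node_group_defaults, a node group given as
#   { name = "user", instance_type = "n1-standard-2", min_size = 0, max_size = 2 }
# is expanded to
#   {
#     name               = "user"
#     instance_type      = "n1-standard-2"
#     min_size           = 0
#     max_size           = 2
#     labels             = { app = "nebari" }
#     preemptible        = false
#     guest_accelerators = []
#   }
# because merge() lets keys from the later argument (the node group) override the defaults.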



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/modules/kubernetes/main.tf
---

data "google_client_config" "main" {
}

resource "google_container_cluster" "main" {
  name               = var.name
  location           = var.location
  min_master_version = var.kubernetes_version

  node_locations = var.availability_zones

  # We can't create a cluster with no node pool defined, but we want to only use
  # separately managed node pools. So we create the smallest possible default
  # node pool and immediately delete it.
  remove_default_node_pool = true
  initial_node_count       = 1

  master_auth {
    client_certificate_config {
      issue_client_certificate = true
    }
  }

  release_channel {
    channel = var.release_channel
  }

  networking_mode = var.networking_mode
  network         = var.network
  subnetwork      = var.subnetwork

  dynamic "ip_allocation_policy" {
    for_each = var.ip_allocation_policy == null ? [] : [1]
    content {
      cluster_secondary_range_name  = var.ip_allocation_policy.cluster_secondary_range_name
      services_secondary_range_name = var.ip_allocation_policy.services_secondary_range_name
      cluster_ipv4_cidr_block       = var.ip_allocation_policy.cluster_ipv4_cidr_block
      services_ipv4_cidr_block      = var.ip_allocation_policy.services_ipv4_cidr_block
    }
  }

  dynamic "master_authorized_networks_config" {
    for_each = var.master_authorized_networks_config == null ? [] : [1]
    content {
      cidr_blocks {
        cidr_block   = var.master_authorized_networks_config.cidr_blocks.cidr_block
        display_name = var.master_authorized_networks_config.cidr_blocks.display_name
      }
    }
  }

  dynamic "private_cluster_config" {
    for_each = var.private_cluster_config == null ? [] : [1]
    content {
      enable_private_nodes    = var.private_cluster_config.enable_private_nodes
      enable_private_endpoint = var.private_cluster_config.enable_private_endpoint
      master_ipv4_cidr_block  = var.private_cluster_config.master_ipv4_cidr_block
    }
  }

  lifecycle {
    ignore_changes = [
      node_locations
    ]
  }
}

resource "google_container_node_pool" "main" {
  count = length(local.merged_node_groups)

  name     = local.merged_node_groups[count.index].name
  location = var.location
  cluster  = google_container_cluster.main.name
  version  = var.kubernetes_version

  initial_node_count = local.merged_node_groups[count.index].min_size

  autoscaling {
    min_node_count = local.merged_node_groups[count.index].min_size
    max_node_count = local.merged_node_groups[count.index].max_size
  }

  management {
    auto_repair  = true
    auto_upgrade = false
  }

  node_config {
    preemptible  = local.merged_node_groups[count.index].preemptible
    machine_type = local.merged_node_groups[count.index].instance_type
    image_type   = var.node_group_image_type

    service_account = google_service_account.main.email

    oauth_scopes = local.node_group_oauth_scopes

    metadata = {
      disable-legacy-endpoints = "true"
    }
    labels = merge(local.merged_node_groups[count.index].labels, var.labels)
    dynamic "guest_accelerator" {
      for_each = local.merged_node_groups[count.index].guest_accelerators

      content {
        type  = guest_accelerator.value.name
        count = guest_accelerator.value.count
      }
    }
    tags = var.tags
  }

  lifecycle {
    ignore_changes = [
      node_config[0].taint
    ]
  }
}



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/modules/kubernetes/outputs.tf
---

output "credentials" {
  description = "Credentials required for connecting to kubernetes cluster"
  sensitive   = true
  value = {
    endpoint               = "https://${google_container_cluster.main.endpoint}"
    token                  = data.google_client_config.main.access_token
    cluster_ca_certificate = base64decode(google_container_cluster.main.master_auth[0].cluster_ca_certificate)
  }
}


# https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/blob/9172b3eaeeb806caca29aa1e3e83e58a26df57b1/modules/auth/main.tf
data "google_client_config" "provider" {}

output "kubeconfig" {
  description = "Kubeconfig for connecting to kubernetes cluster"
  sensitive   = true
  value = templatefile("${path.module}/templates/kubeconfig.yaml", {
    context                = google_container_cluster.main.name
    cluster_ca_certificate = google_container_cluster.main.master_auth[0].cluster_ca_certificate
    endpoint               = google_container_cluster.main.endpoint
    token                  = data.google_client_config.provider.access_token
  })
}



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/modules/kubernetes/service_account.tf
---

resource "google_service_account" "main" {
  account_id   = var.name
  display_name = "${var.name} kubernetes node-group service account"
}

resource "google_project_iam_member" "main" {
  for_each = toset(local.node_group_service_account_roles)

  role   = each.value
  member = "serviceAccount:${google_service_account.main.email}"

  project = var.project_id
}



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/modules/kubernetes/variables.tf
---

variable "name" {
  description = "Prefix name for GCP Kubernetes cluster"
  type        = string
}

variable "availability_zones" {
  description = "Zones for Kubernetes cluster to be deployed in"
  type        = list(string)
}

variable "location" {
  description = "Location for GCP Kubernetes cluster"
  type        = string
}

variable "project_id" {
  description = "GCP project_id"
  type        = string
}

variable "additional_node_group_roles" {
  description = "Additional roles to apply to each node group"
  type        = list(string)
  default     = []
}

variable "additional_node_group_oauth_scopes" {
  description = "Additional oauth scopes to apply to each node group"
  type        = list(string)
  default     = []
}

variable "kubernetes_version" {
  description = "Kubernetes version for GKE cluster"
  type        = string
}

variable "release_channel" {
  description = "The cadence of GKE version upgrades"
  type        = string
}

variable "node_groups" {
  description = "Node groups to add to GCP Kubernetes Cluster"
  type        = any
  default = [
    {
      name          = "general"
      instance_type = "n1-standard-2"
      min_size      = 1
      max_size      = 1
      labels        = {}
    },
    {
      name          = "user"
      instance_type = "n1-standard-2"
      min_size      = 0
      max_size      = 2
      labels        = {}
    },
    {
      name          = "worker"
      instance_type = "n1-standard-2"
      min_size      = 0
      max_size      = 5
      labels        = {}
    }
  ]
}

variable "node_group_defaults" {
  description = "Node group default values"
  type = object({
    name          = string
    instance_type = string
    min_size      = number
    max_size      = number
    labels        = map(string)
    preemptible   = bool
    # The node pool reads guest_accelerator.value.name, so the attribute is "name".
    guest_accelerators = list(object({
      name  = string
      count = number
    }))

  })
  default = {
    name          = "node-group-default"
    instance_type = "n1-standard-2"
    min_size      = 0
    max_size      = 1
    labels        = { app : "nebari" }
    preemptible   = false
    # https://www.terraform.io/docs/providers/google/r/container_cluster.html#guest_accelerator
    guest_accelerators = []
  }
}

variable "networking_mode" {
  description = "Determines whether alias IPs or routes will be used for pod IPs in the cluster. Options are VPC_NATIVE or ROUTES."
  type        = string
  default     = "ROUTES"
}

variable "network" {
  description = "Name of the VPC network, where the cluster should be deployed"
  type        = string
  default     = "default"
}

variable "subnetwork" {
  description = "Name of the subnet for deploying cluster into"
  type        = string
  default     = null
}

variable "ip_allocation_policy" {
  description = "Configuration of cluster IP allocation for VPC-native clusters."
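  # Attributes are read directly (var.ip_allocation_policy.<attribute>), so this is an object, not a map of objects.
  type = object({
    cluster_secondary_range_name  = string
    services_secondary_range_name = string
    cluster_ipv4_cidr_block       = string
    services_ipv4_cidr_block      = string
  })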
  default = null
}

variable "master_authorized_networks_config" {
  description = "The desired configuration options for master authorized networks"
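  # Attributes are read directly (var.master_authorized_networks_config.cidr_blocks.<attribute>), so these are objects, not maps.
  type = object({
    cidr_blocks = object({
      cidr_block   = string
      display_name = string
    })
  })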
  default = null
}

variable "private_cluster_config" {
  description = "Configuration for private clusters, clusters with private nodes."
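  # Attributes are read directly (var.private_cluster_config.<attribute>), so this is an object, not a map of objects.
  type = object({
    enable_private_nodes    = bool
    enable_private_endpoint = bool
    master_ipv4_cidr_block  = string
  })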
  default = null
}

variable "tags" {
  description = "Google Cloud Platform tags to assign to resources"
  type        = list(string)
  default     = []
}

variable "labels" {
  description = "Google Cloud Platform labels to assign to resources"
  type        = map(string)
  default     = {}
}

variable "node_group_image_type" {
  description = "The image type to use for the node groups"
  type        = string
  default     = null

  validation {
    # Only 2 values are valid according to docs
    # https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_cluster#image_type
    condition     = var.node_group_image_type == null || contains(["COS_CONTAINERD", "UBUNTU_CONTAINERD"], var.node_group_image_type)
    error_message = "Allowed values for node_group_image_type are \"COS_CONTAINERD\" or \"UBUNTU_CONTAINERD\"."
  }
}



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/modules/network/main.tf
---

resource "google_compute_network" "main" {
  name        = var.name
  description = "VPC Gateway for ${var.name}"
}

data "google_compute_subnetwork" "main" {
  name   = var.name
  region = var.region
}



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/modules/network/variables.tf
---

variable "name" {
  description = "Prefix name to give to network resources"
  type        = string
}

variable "region" {
  description = "GCP region in which to operate infrastructure"
  type        = string
}



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/modules/registry/main.tf
---

resource "google_artifact_registry_repository" "registry" {
  # https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/artifact_registry_repository#argument-reference
  repository_id = var.repository_id
  location      = var.location
  format        = var.format
}



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/modules/registry/variables.tf
---

variable "location" {
  # https://cloud.google.com/artifact-registry/docs/docker/pushing-and-pulling
  description = "Location of registry"
  type        = string
}

variable "format" {
  # https://cloud.google.com/artifact-registry/docs/reference/rest/v1/projects.locations.repositories#Format
  description = "The format of packages that are stored in the repository"
  type        = string
  default     = "DOCKER"
}

variable "repository_id" {
  description = "Name of repository"
  type        = string
}



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/main.tf
---

data "google_compute_zones" "gcpzones" {
  region = var.region
}


module "registry-jupyterhub" {
  source = "./modules/registry"

  repository_id = "${var.name}-${var.environment}"
  location      = var.region
}


module "kubernetes" {
  source = "./modules/kubernetes"

  name       = "${var.name}-${var.environment}"
  location   = var.region
  project_id = var.project_id

  availability_zones = length(var.availability_zones) >= 1 ? var.availability_zones : [data.google_compute_zones.gcpzones.names[0]]

  additional_node_group_roles = [
    "roles/storage.objectViewer",
    "roles/storage.admin"
  ]

  additional_node_group_oauth_scopes = [
    "https://www.googleapis.com/auth/cloud-platform"
  ]

  node_groups                       = var.node_groups
  network                           = var.network
  subnetwork                        = var.subnetwork
  ip_allocation_policy              = var.ip_allocation_policy
  master_authorized_networks_config = var.master_authorized_networks_config
  private_cluster_config            = var.private_cluster_config
  kubernetes_version                = var.kubernetes_version
  release_channel                   = var.release_channel
  tags                              = var.tags
  labels                            = var.labels
  node_group_image_type             = var.node_group_image_type
}



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/outputs.tf
---

output "kubernetes_credentials" {
  description = "Parameters needed to connect to kubernetes cluster"
  sensitive   = true
  value = {
    host                   = module.kubernetes.credentials.endpoint
    cluster_ca_certificate = module.kubernetes.credentials.cluster_ca_certificate
    token                  = module.kubernetes.credentials.token
  }
}

resource "local_file" "kubeconfig" {
  count = var.kubeconfig_filename != null ? 1 : 0

  content  = module.kubernetes.kubeconfig
  filename = var.kubeconfig_filename
}

output "kubeconfig_filename" {
  description = "filename for nebari kubeconfig"
  value       = var.kubeconfig_filename
}



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/variables.tf
---

variable "name" {
  description = "Prefix name to assign to nebari resources"
  type        = string
}

variable "environment" {
  description = "Environment in which to create Kubernetes resources"
  type        = string
}

variable "region" {
  description = "Google Cloud Platform region"
  type        = string
}

variable "project_id" {
  description = "Google project_id"
  type        = string
}

variable "availability_zones" {
  description = "Availability zones to use for nebari deployment"
  type        = list(string)
}

variable "node_groups" {
  description = "GCP node groups"
  type        = any
}

variable "kubeconfig_filename" {
  description = "Kubernetes kubeconfig written to filesystem"
  type        = string
}

variable "tags" {
  description = "Google Cloud Platform tags to assign to resources"
  type        = list(string)
  default     = []
}

variable "labels" {
  description = "Google Cloud Platform labels to assign to resources"
  type        = map(string)
  default     = {}
}


variable "kubernetes_version" {
  description = "Kubernetes version for GKE cluster"
  type        = string
}

variable "release_channel" {
  description = "The cadence of GKE version upgrades"
  type        = string
}

variable "networking_mode" {
  description = "Determines whether alias IPs or routes will be used for pod IPs in the cluster. Options are VPC_NATIVE or ROUTES."
  type        = string
}

variable "network" {
  description = "Name of the VPC network, where the cluster should be deployed"
  type        = string
}

variable "subnetwork" {
  description = "Name of the subnet for deploying cluster into"
  type        = string
}

variable "ip_allocation_policy" {
  description = "Configuration of cluster IP allocation for VPC-native clusters."
  type = map(object({
    cluster_secondary_range_name  = string
    services_secondary_range_name = string
    cluster_ipv4_cidr_block       = string
    services_ipv4_cidr_block      = string
  }))
}

variable "master_authorized_networks_config" {
  description = "The desired configuration options for master authorized networks"
  type = map(object({
    cidr_blocks = map(object({
      cidr_block   = string
      display_name = string
    }))
  }))
}

variable "private_cluster_config" {
  description = "Configuration for private clusters, clusters with private nodes."
  type = map(object({
    enable_private_nodes    = bool
    enable_private_endpoint = bool
    master_ipv4_cidr_block  = string
  }))
}

variable "node_group_image_type" {
  description = "The image type to use for the node groups"
  type        = string
  default     = null

  validation {
    # Only 2 values are valid according to docs
    # https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_cluster#image_type
    condition     = var.node_group_image_type == null || contains(["COS_CONTAINERD", "UBUNTU_CONTAINERD"], var.node_group_image_type)
    error_message = "Allowed values for node_group_image_type are \"COS_CONTAINERD\" or \"UBUNTU_CONTAINERD\"."
  }
}



---
File: nebari/src/_nebari/stages/infrastructure/template/gcp/versions.tf
---

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "6.14.1"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/infrastructure/template/local/main.tf
---

terraform {
  required_providers {
    kind = {
      source  = "registry.terraform.io/tehcyx/kind"
      version = "0.4.0"
    }
    docker = {
      source  = "kreuzwerker/docker"
      version = "2.16.0"
    }
    kubectl = {
      source  = "gavinbunney/kubectl"
      version = ">= 1.7.0"
    }
  }
}

provider "kind" {

}

provider "docker" {

}

provider "kubernetes" {
  host                   = kind_cluster.default.endpoint
  cluster_ca_certificate = kind_cluster.default.cluster_ca_certificate
  client_key             = kind_cluster.default.client_key
  client_certificate     = kind_cluster.default.client_certificate
}

provider "kubectl" {
  load_config_file       = false
  host                   = kind_cluster.default.endpoint
  cluster_ca_certificate = kind_cluster.default.cluster_ca_certificate
  client_key             = kind_cluster.default.client_key
  client_certificate     = kind_cluster.default.client_certificate
}

resource "kind_cluster" "default" {
  name           = "test-cluster"
  wait_for_ready = true

  kind_config {
    kind        = "Cluster"
    api_version = "kind.x-k8s.io/v1alpha4"

    node {
      role  = "general"
      image = "kindest/node:v1.29.2"
    }
  }
}

resource "kubernetes_namespace" "metallb" {
  metadata {
    name = "metallb-system"
  }
}

data "kubectl_path_documents" "metallb" {
  pattern = "${path.module}/metallb.yaml"
}

resource "kubectl_manifest" "metallb" {
  for_each   = toset(data.kubectl_path_documents.metallb.documents)
  yaml_body  = each.value
  wait       = true
  depends_on = [kubernetes_namespace.metallb]
}

resource "kubectl_manifest" "load-balancer" {
  yaml_body = yamlencode({
    apiVersion = "v1"
    kind       = "ConfigMap"
    metadata = {
      namespace = kubernetes_namespace.metallb.metadata.0.name
      name      = "config"
    }
    data = {
      config = yamlencode({
        address-pools = [{
          name     = "default"
          protocol = "layer2"
          addresses = [
            "${local.metallb_ip_min}-${local.metallb_ip_max}"
          ]
        }]
      })
    }
  })

  depends_on = [kubectl_manifest.metallb]
}

data "docker_network" "kind" {
  name = "kind"

  depends_on = [kind_cluster.default]
}

locals {
  metallb_ip_min = cidrhost([
    for network in data.docker_network.kind.ipam_config : network if network.gateway != ""
  ][0].subnet, 356)

  metallb_ip_max = cidrhost([
    for network in data.docker_network.kind.ipam_config : network if network.gateway != ""
  ][0].subnet, 406)
}
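
# Illustration only, not part of the Terraform module: a minimal Python sketch
# of how the two cidrhost() calls above turn the kind network subnet into the
# MetalLB address range. The helper name `cidrhost` and the subnet
# 172.18.0.0/16 are assumptions for this example; the real subnet is read from
# data.docker_network.kind at apply time.

```python
import ipaddress


def cidrhost(prefix: str, hostnum: int) -> str:
    """Rough equivalent of Terraform's cidrhost() for non-negative host numbers."""
    network = ipaddress.ip_network(prefix)
    if not 0 <= hostnum < network.num_addresses:
        raise ValueError(f"host number {hostnum} does not fit in {prefix}")
    # Indexing a network returns the address at that offset from its base.
    return str(network[hostnum])


# Assumed example subnet for the Docker "kind" network.
subnet = "172.18.0.0/16"
metallb_range = f"{cidrhost(subnet, 356)}-{cidrhost(subnet, 406)}"
print(metallb_range)  # 172.18.1.100-172.18.1.150
```

# Host numbers 356 and 406 only fit in a subnet with more than 406 addresses,
# so in this sketch a /24 network would raise instead of producing a range.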



---
File: nebari/src/_nebari/stages/infrastructure/template/local/metallb.yaml
---

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: metallb
  name: controller
  namespace: metallb-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: metallb
  name: speaker
  namespace: metallb-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app: metallb
  name: metallb-system:controller
rules:
- apiGroups:
  - ''
  resources:
  - services
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ''
  resources:
  - services/status
  verbs:
  - update
- apiGroups:
  - ''
  resources:
  - events
  verbs:
  - create
  - patch
- apiGroups:
  - policy
  resourceNames:
  - controller
  resources:
  - podsecuritypolicies
  verbs:
  - use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app: metallb
  name: metallb-system:speaker
rules:
- apiGroups:
  - ''
  resources:
  - services
  - endpoints
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups: ["discovery.k8s.io"]
  resources:
  - endpointslices
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ''
  resources:
  - events
  verbs:
  - create
  - patch
- apiGroups:
  - policy
  resourceNames:
  - speaker
  resources:
  - podsecuritypolicies
  verbs:
  - use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    app: metallb
  name: config-watcher
  namespace: metallb-system
rules:
- apiGroups:
  - ''
  resources:
  - configmaps
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    app: metallb
  name: pod-lister
  namespace: metallb-system
rules:
- apiGroups:
  - ''
  resources:
  - pods
  verbs:
  - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    app: metallb
  name: controller
  namespace: metallb-system
rules:
- apiGroups:
  - ''
  resources:
  - secrets
  verbs:
  - create
- apiGroups:
  - ''
  resources:
  - secrets
  resourceNames:
  - memberlist
  verbs:
  - list
- apiGroups:
  - apps
  resources:
  - deployments
  resourceNames:
  - controller
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app: metallb
  name: metallb-system:controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: metallb-system:controller
subjects:
- kind: ServiceAccount
  name: controller
  namespace: metallb-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app: metallb
  name: metallb-system:speaker
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: metallb-system:speaker
subjects:
- kind: ServiceAccount
  name: speaker
  namespace: metallb-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    app: metallb
  name: config-watcher
  namespace: metallb-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: config-watcher
subjects:
- kind: ServiceAccount
  name: controller
- kind: ServiceAccount
  name: speaker
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    app: metallb
  name: pod-lister
  namespace: metallb-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-lister
subjects:
- kind: ServiceAccount
  name: speaker
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    app: metallb
  name: controller
  namespace: metallb-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: controller
subjects:
- kind: ServiceAccount
  name: controller
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: metallb
    component: speaker
  name: speaker
  namespace: metallb-system
spec:
  selector:
    matchLabels:
      app: metallb
      component: speaker
  template:
    metadata:
      annotations:
        prometheus.io/port: '7472'
        prometheus.io/scrape: 'true'
      labels:
        app: metallb
        component: speaker
    spec:
      containers:
      - args:
        - --port=7472
        - --config=config
        - --log-level=info
        env:
        - name: METALLB_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: METALLB_HOST
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: METALLB_ML_BIND_ADDR
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        # needed when another software is also using memberlist / port 7946
        # when changing this default you also need to update the container ports definition
        # and the PodSecurityPolicy hostPorts definition
        #- name: METALLB_ML_BIND_PORT
        #  value: "7946"
        - name: METALLB_ML_LABELS
          value: "app=metallb,component=speaker"
        - name: METALLB_ML_SECRET_KEY
          valueFrom:
            secretKeyRef:
              name: memberlist
              key: secretkey
        image: quay.io/metallb/speaker:v0.12.1
        name: speaker
        ports:
        - containerPort: 7472
          name: monitoring
        - containerPort: 7946
          name: memberlist-tcp
        - containerPort: 7946
          name: memberlist-udp
          protocol: UDP
        livenessProbe:
          httpGet:
            path: /metrics
            port: monitoring
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /metrics
            port: monitoring
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 3
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_RAW
            drop:
            - ALL
          readOnlyRootFilesystem: true
      hostNetwork: true
      nodeSelector:
        kubernetes.io/os: linux
      serviceAccountName: speaker
      terminationGracePeriodSeconds: 2
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: metallb
    component: controller
  name: controller
  namespace: metallb-system
spec:
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      app: metallb
      component: controller
  template:
    metadata:
      annotations:
        prometheus.io/port: '7472'
        prometheus.io/scrape: 'true'
      labels:
        app: metallb
        component: controller
    spec:
      containers:
      - args:
        - --port=7472
        - --config=config
        - --log-level=info
        env:
        - name: METALLB_ML_SECRET_NAME
          value: memberlist
        - name: METALLB_DEPLOYMENT
          value: controller
        image: quay.io/metallb/controller:v0.12.1
        name: controller
        ports:
        - containerPort: 7472
          name: monitoring
        livenessProbe:
          httpGet:
            path: /metrics
            port: monitoring
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /metrics
            port: monitoring
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 3
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - all
          readOnlyRootFilesystem: true
      nodeSelector:
        kubernetes.io/os: linux
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
        fsGroup: 65534
      serviceAccountName: controller
      terminationGracePeriodSeconds: 0



---
File: nebari/src/_nebari/stages/infrastructure/template/local/outputs.tf
---

output "kubernetes_credentials" {
  description = "Parameters needed to connect to kubernetes cluster locally"
  sensitive   = true
  value = {
    config_path            = var.kubeconfig_filename
    host                   = kind_cluster.default.endpoint
    cluster_ca_certificate = kind_cluster.default.cluster_ca_certificate
    client_key             = kind_cluster.default.client_key
    client_certificate     = kind_cluster.default.client_certificate
  }
}

resource "local_file" "default" {
  content  = kind_cluster.default.kubeconfig
  filename = var.kubeconfig_filename
}

output "kubeconfig_filename" {
  description = "filename for nebari kubeconfig"
  value       = var.kubeconfig_filename
}



---
File: nebari/src/_nebari/stages/infrastructure/template/local/variables.tf
---

variable "kubeconfig_filename" {
  description = "Kubernetes kubeconfig written to filesystem"
  type        = string
}

variable "kube_context" {
  description = "Optional kubernetes context to use to connect to kubernetes cluster"
  type        = string
}



---
File: nebari/src/_nebari/stages/infrastructure/__init__.py
---

import contextlib
import enum
import inspect
import os
import pathlib
import re
import sys
import tempfile
import warnings
from typing import Annotated, Any, Dict, List, Literal, Optional, Tuple, Type, Union

from pydantic import ConfigDict, Field, field_validator, model_validator

from _nebari import constants
from _nebari.provider import opentofu
from _nebari.provider.cloud import amazon_web_services, azure_cloud, google_cloud
from _nebari.stages.base import NebariTerraformStage
from _nebari.stages.kubernetes_services import SharedFsEnum
from _nebari.stages.tf_objects import NebariTerraformState
from _nebari.utils import (
    AZURE_NODE_RESOURCE_GROUP_SUFFIX,
    construct_azure_resource_group_name,
    modified_environ,
)
from nebari import schema
from nebari.hookspecs import NebariStage, hookimpl


def get_kubeconfig_filename():
    return str(pathlib.Path(tempfile.gettempdir()) / "NEBARI_KUBECONFIG")


class LocalInputVars(schema.Base):
    kubeconfig_filename: str = get_kubeconfig_filename()
    kube_context: Optional[str] = None


class ExistingInputVars(schema.Base):
    kube_context: str


class GCPNodeGroupInputVars(schema.Base):
    name: str
    instance_type: str
    min_size: int
    max_size: int
    labels: Dict[str, str]
    preemptible: bool
    guest_accelerators: List["GCPGuestAccelerator"]


class GCPPrivateClusterConfig(schema.Base):
    enable_private_nodes: bool
    enable_private_endpoint: bool
    master_ipv4_cidr_block: str


@schema.yaml_object(schema.yaml)
class GCPNodeGroupImageTypeEnum(str, enum.Enum):
    UBUNTU_CONTAINERD = "UBUNTU_CONTAINERD"
    COS_CONTAINERD = "COS_CONTAINERD"

    @classmethod
    def to_yaml(cls, representer, node):
        return representer.represent_str(node.value)


class GCPInputVars(schema.Base):
    name: str
    environment: str
    region: str
    project_id: str
    availability_zones: List[str]
    node_groups: List[GCPNodeGroupInputVars]
    kubeconfig_filename: str = get_kubeconfig_filename()
    tags: List[str]
    kubernetes_version: str
    release_channel: str
    networking_mode: str
    network: str
    subnetwork: Optional[str] = None
    ip_allocation_policy: Optional[Dict[str, str]] = None
    master_authorized_networks_config: Optional[Dict[str, str]] = None
    private_cluster_config: Optional[GCPPrivateClusterConfig] = None
    node_group_image_type: Optional[GCPNodeGroupImageTypeEnum] = None


class AzureNodeGroupInputVars(schema.Base):
    instance: str
    min_nodes: int
    max_nodes: int


class AzureInputVars(schema.Base):
    name: str
    environment: str
    region: str
    authorized_ip_ranges: List[str] = ["0.0.0.0/0"]
    kubeconfig_filename: str = get_kubeconfig_filename()
    kubernetes_version: str
    node_groups: Dict[str, AzureNodeGroupInputVars]
    resource_group_name: str
    node_resource_group_name: str
    vnet_subnet_id: Optional[str] = None
    private_cluster_enabled: bool
    tags: Dict[str, str] = {}
    max_pods: Optional[int] = None
    network_profile: Optional[Dict[str, str]] = None
    azure_policy_enabled: Optional[bool] = None
    workload_identity_enabled: bool = False


class AWSAmiTypes(str, enum.Enum):
    AL2_x86_64 = "AL2_x86_64"
    AL2_x86_64_GPU = "AL2_x86_64_GPU"
    CUSTOM = "CUSTOM"


class AWSNodeLaunchTemplate(schema.Base):
    pre_bootstrap_command: Optional[str] = None
    ami_id: Optional[str] = None


class AWSNodeGroupInputVars(schema.Base):
    name: str
    instance_type: str
    gpu: bool = False
    min_size: int
    desired_size: int
    max_size: int
    single_subnet: bool
    permissions_boundary: Optional[str] = None
    ami_type: Optional[AWSAmiTypes] = None
    launch_template: Optional[AWSNodeLaunchTemplate] = None


def construct_aws_ami_type(
    gpu_enabled: bool, launch_template: Optional[AWSNodeLaunchTemplate]
) -> str:
    """
    Select the Amazon Machine Image (AMI) type for an AWS node group. A launch
    template that sets `ami_id` takes priority over the GPU flag.

    Returns the AMI type (str) determined by the following rules:
        - "CUSTOM" if a `launch_template` is provided and it sets a non-empty `ami_id`.
        - "AL2_x86_64_GPU" if `gpu_enabled` is True and `launch_template` is None
          or does not set `ami_id`.
        - "AL2_x86_64" otherwise (the default).
    """

    if launch_template and getattr(launch_template, "ami_id", None):
        return "CUSTOM"

    if gpu_enabled:
        return "AL2_x86_64_GPU"

    return "AL2_x86_64"
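

# Illustration only: a self-contained sketch of the selection order used by
# construct_aws_ami_type. `LaunchTemplate` and `pick_ami_type` are hypothetical
# stand-ins so the example runs without the nebari package; the AMI id is made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LaunchTemplate:
    """Hypothetical stand-in for AWSNodeLaunchTemplate."""

    pre_bootstrap_command: Optional[str] = None
    ami_id: Optional[str] = None


def pick_ami_type(gpu_enabled: bool, launch_template: Optional[LaunchTemplate]) -> str:
    # A launch template with an AMI id wins over the GPU flag.
    if launch_template and getattr(launch_template, "ami_id", None):
        return "CUSTOM"
    if gpu_enabled:
        return "AL2_x86_64_GPU"
    return "AL2_x86_64"


custom = pick_ami_type(True, LaunchTemplate(ami_id="ami-0123456789abcdef0"))
gpu_no_ami = pick_ami_type(True, LaunchTemplate(pre_bootstrap_command="echo hi"))
cpu_default = pick_ami_type(False, None)
print(custom, gpu_no_ami, cpu_default)  # CUSTOM AL2_x86_64_GPU AL2_x86_64
```

# Note that a launch template without `ami_id` does not count as custom, so the
# GPU flag still decides the result in that case.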


class AWSInputVars(schema.Base):
    name: str
    environment: str
    existing_security_group_id: Optional[str] = None
    existing_subnet_ids: Optional[List[str]] = None
    region: str
    kubernetes_version: str
    eks_endpoint_access: Optional[
        Literal["private", "public", "public_and_private"]
    ] = "public"
    eks_kms_arn: Optional[str] = None
    node_groups: List[AWSNodeGroupInputVars]
    availability_zones: List[str]
    vpc_cidr_block: str
    permissions_boundary: Optional[str] = None
    kubeconfig_filename: str = get_kubeconfig_filename()
    tags: Dict[str, str] = {}
    efs_enabled: bool


def _calculate_asg_node_group_map(config: schema.Main):
    if config.provider == schema.ProviderEnum.aws:
        return amazon_web_services.aws_get_asg_node_group_mapping(
            config.project_name,
            config.namespace,
            config.amazon_web_services.region,
        )
    else:
        return {}


def _calculate_node_groups(config: schema.Main):
    if config.provider == schema.ProviderEnum.aws:
        return {
            group: {"key": "eks.amazonaws.com/nodegroup", "value": group}
            for group in ["general", "user", "worker"]
        }
    elif config.provider == schema.ProviderEnum.gcp:
        return {
            group: {"key": "cloud.google.com/gke-nodepool", "value": group}
            for group in ["general", "user", "worker"]
        }
    elif config.provider == schema.ProviderEnum.azure:
        return {
            group: {"key": "azure-node-pool", "value": group}
            for group in ["general", "user", "worker"]
        }
    elif config.provider == schema.ProviderEnum.existing:
        return config.existing.model_dump()["node_selectors"]
    else:
        return config.local.model_dump()["node_selectors"]


def node_groups_to_dict(node_groups):
    return {ng_name: ng.model_dump() for ng_name, ng in node_groups.items()}


@contextlib.contextmanager
def kubernetes_provider_context(kubernetes_credentials: Dict[str, str]):
    credential_mapping = {
        "config_path": "KUBE_CONFIG_PATH",
        "config_context": "KUBE_CTX",
        "username": "KUBE_USER",
        "password": "KUBE_PASSWORD",
        "client_certificate": "KUBE_CLIENT_CERT_DATA",
        "client_key": "KUBE_CLIENT_KEY_DATA",
        "cluster_ca_certificate": "KUBE_CLUSTER_CA_CERT_DATA",
        "host": "KUBE_HOST",
        "token": "KUBE_TOKEN",
    }

    credentials = {
        credential_mapping[k]: v
        for k, v in kubernetes_credentials.items()
        if v is not None
    }
    with modified_environ(**credentials):
        yield
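

# Illustration only: a sketch of what kubernetes_provider_context does with its
# input. `temporary_environ` is a hypothetical stand-in for
# _nebari.utils.modified_environ (whose implementation is not shown here), the
# mapping is a three-key subset, and the credential values are made up.

```python
import contextlib
import os
from typing import Dict, Iterator, Optional

CREDENTIAL_MAPPING = {
    "config_path": "KUBE_CONFIG_PATH",
    "host": "KUBE_HOST",
    "token": "KUBE_TOKEN",
}


@contextlib.contextmanager
def temporary_environ(**updates: str) -> Iterator[None]:
    """Set environment variables for the duration of the block, then restore."""
    saved = {key: os.environ.get(key) for key in updates}
    os.environ.update(updates)
    try:
        yield
    finally:
        for key, old in saved.items():
            if old is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old


def to_env(credentials: Dict[str, Optional[str]]) -> Dict[str, str]:
    # Entries whose value is None are skipped rather than exported.
    return {CREDENTIAL_MAPPING[k]: v for k, v in credentials.items() if v is not None}


creds = {"host": "https://127.0.0.1:6443", "token": None, "config_path": "/tmp/kubeconfig"}
before = os.environ.get("KUBE_HOST")
with temporary_environ(**to_env(creds)):
    inside = os.environ.get("KUBE_HOST")
    token_inside = os.environ.get("KUBE_TOKEN")
after = os.environ.get("KUBE_HOST")
```

# As in the function above, a credential key that is missing from the mapping
# raises KeyError, so callers are expected to pass only the known keys.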


class KeyValueDict(schema.Base):
    key: str
    value: str


class GCPIPAllocationPolicy(schema.Base):
    cluster_secondary_range_name: str
    services_secondary_range_name: str
    cluster_ipv4_cidr_block: str
    services_ipv4_cidr_block: str


class GCPCIDRBlock(schema.Base):
    cidr_block: str
    display_name: str


class GCPMasterAuthorizedNetworksConfig(schema.Base):
    cidr_blocks: List[GCPCIDRBlock]


class GCPGuestAccelerator(schema.Base):
    """
    See general information regarding GPU support at:
    # TODO: replace with nebari.dev new URL
    https://docs.nebari.dev/en/stable/source/admin_guide/gpu.html?#add-gpu-node-group
    """

    name: str
    count: Annotated[int, Field(ge=1)] = 1


class GCPNodeGroup(schema.Base):
    instance: str
    min_nodes: Annotated[int, Field(ge=0)] = 0
    max_nodes: Annotated[int, Field(ge=1)] = 1
    preemptible: bool = False
    labels: Dict[str, str] = {}
    guest_accelerators: List[GCPGuestAccelerator] = []


DEFAULT_GCP_NODE_GROUPS = {
    "general": GCPNodeGroup(instance="e2-standard-8", min_nodes=1, max_nodes=1),
    "user": GCPNodeGroup(instance="e2-standard-4", min_nodes=0, max_nodes=5),
    "worker": GCPNodeGroup(instance="e2-standard-4", min_nodes=0, max_nodes=5),
}


class GoogleCloudPlatformProvider(schema.Base):
    # If you pass a major and minor version without a patch version
    # yaml will pass it as a float, so we need to coerce it to a string
    model_config = ConfigDict(coerce_numbers_to_str=True)
    region: str
    project: str
    kubernetes_version: str
    availability_zones: Optional[List[str]] = []
    release_channel: str = constants.DEFAULT_GKE_RELEASE_CHANNEL
    node_groups: Dict[str, GCPNodeGroup] = DEFAULT_GCP_NODE_GROUPS
    tags: Optional[List[str]] = []
    networking_mode: str = "ROUTES"
    network: str = "default"
    subnetwork: Optional[str] = None
    ip_allocation_policy: Optional[GCPIPAllocationPolicy] = None
    master_authorized_networks_config: Optional[GCPCIDRBlock] = None
    private_cluster_config: Optional[GCPPrivateClusterConfig] = None

    @field_validator("kubernetes_version", mode="before")
    @classmethod
    def transform_version_to_str(cls, value) -> str:
        """Transforms the version to a string if it is not already."""
        return str(value)

    @model_validator(mode="before")
    @classmethod
    def _check_input(cls, data: Any) -> Any:
        available_regions = google_cloud.regions()
        if data["region"] not in available_regions:
            raise ValueError(
                f"Google Cloud region={data['region']} is not one of {available_regions}"
            )

        available_kubernetes_versions = google_cloud.kubernetes_versions(data["region"])
        if not any(
            v.startswith(str(data["kubernetes_version"]))
            for v in available_kubernetes_versions
        ):
            raise ValueError(
                f"\nInvalid `kubernetes-version` provided: {data['kubernetes_version']}.\nPlease select from one of the following supported Kubernetes versions: {available_kubernetes_versions} or omit flag to use latest Kubernetes version available."
            )

        # check if instances are valid
        available_instances = google_cloud.instances(data["region"])
        if "node_groups" in data:
            for _, node_group in data["node_groups"].items():
                instance = (
                    node_group["instance"]
                    if hasattr(node_group, "__getitem__")
                    else node_group.instance
                )
                if instance not in available_instances:
                    raise ValueError(
                        f"Google Cloud Platform instance {instance} is not one of the available instance types: {available_instances}"
                    )

        return data
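
# Illustration only: a sketch of the prefix match that _check_input applies to
# `kubernetes_version`. The helper name and the version strings are made up for
# this example; real values come from google_cloud.kubernetes_versions().

```python
from typing import List, Union


def version_is_available(requested: Union[str, float], available: List[str]) -> bool:
    # Same test as the validator: any available version starting with the request.
    return any(v.startswith(str(requested)) for v in available)


# Hypothetical sample of GKE version strings.
available = ["1.28.15-gke.1000", "1.29.4-gke.1043002", "1.30.1-gke.1329003"]

exact_minor = version_is_available("1.29", available)  # True
yaml_float = version_is_available(1.3, available)      # True: "1.3" prefixes "1.30.1-..."
too_new = version_is_available("1.31", available)      # False
print(exact_minor, yaml_float, too_new)
```

# The float case mirrors the comment on `coerce_numbers_to_str`: an unquoted
# YAML value such as 1.30 arrives as the float 1.3, and the prefix match still
# accepts it. The same looseness means a short prefix like "1.2" would match too.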


class AzureNodeGroup(schema.Base):
    instance: str
    min_nodes: int
    max_nodes: int


DEFAULT_AZURE_NODE_GROUPS = {
    "general": AzureNodeGroup(instance="Standard_D8_v3", min_nodes=1, max_nodes=1),
    "user": AzureNodeGroup(instance="Standard_D4_v3", min_nodes=0, max_nodes=5),
    "worker": AzureNodeGroup(instance="Standard_D4_v3", min_nodes=0, max_nodes=5),
}


class AzureProvider(schema.Base):
    region: str
    kubernetes_version: Optional[str] = None
    storage_account_postfix: str
    authorized_ip_ranges: Optional[List[str]] = ["0.0.0.0/0"]
    resource_group_name: Optional[str] = None
    node_groups: Dict[str, AzureNodeGroup] = DEFAULT_AZURE_NODE_GROUPS
    vnet_subnet_id: Optional[str] = None
    private_cluster_enabled: bool = False
    tags: Optional[Dict[str, str]] = {}
    network_profile: Optional[Dict[str, str]] = None
    max_pods: Optional[int] = None
    workload_identity_enabled: bool = False
    azure_policy_enabled: Optional[bool] = None

    @model_validator(mode="before")
    @classmethod
    def _check_credentials(cls, data: Any) -> Any:
        azure_cloud.check_credentials()
        return data

    @field_validator("kubernetes_version")
    @classmethod
    def _validate_kubernetes_version(cls, value: Optional[str]) -> str:
        available_kubernetes_versions = azure_cloud.kubernetes_versions()
        if value is None:
            value = available_kubernetes_versions[-1]
        elif value not in available_kubernetes_versions:
            raise ValueError(
                f"\nInvalid `kubernetes-version` provided: {value}.\nPlease select from one of the following supported Kubernetes versions: {available_kubernetes_versions} or omit flag to use latest Kubernetes version available."
            )
        return value

    @field_validator("resource_group_name")
    @classmethod
    def _validate_resource_group_name(cls, value):
        if value is None:
            return value
        length = len(value) + len(AZURE_NODE_RESOURCE_GROUP_SUFFIX)
        if length < 1 or length > 90:
            raise ValueError(
                f"Azure Resource Group name must be between 1 and 90 characters long, when combined with the suffix `{AZURE_NODE_RESOURCE_GROUP_SUFFIX}`."
            )
        if not re.match(r"^[\w\-\.\(\)]+$", value):
            raise ValueError(
                "Azure Resource Group name can only contain alphanumerics, underscores, parentheses, hyphens, and periods."
            )
        if value[-1] == ".":
            raise ValueError("Azure Resource Group name can't end with a period.")

        return value

    @field_validator("tags")
    @classmethod
    def _validate_tags(
        cls, value: Optional[Dict[str, str]]
    ) -> Optional[Dict[str, str]]:
        return value if value is None else azure_cloud.validate_tags(value)


class AWSNodeGroup(schema.Base):
    instance: str
    min_nodes: int = 0
    max_nodes: int
    gpu: bool = False
    single_subnet: bool = False
    permissions_boundary: Optional[str] = None
    # Disabled as part of 2024.11.1 until #2832 is resolved
    # launch_template: Optional[AWSNodeLaunchTemplate] = None

    @model_validator(mode="before")
    @classmethod
    def check_launch_template(cls, values):
        if "launch_template" in values:
            raise ValueError(
                "The 'launch_template' field is currently unavailable and has been removed from the configuration schema.\nPlease omit this field until it is reintroduced in a future update.",
            )
        return values


DEFAULT_AWS_NODE_GROUPS = {
    "general": AWSNodeGroup(instance="m5.2xlarge", min_nodes=1, max_nodes=1),
    "user": AWSNodeGroup(
        instance="m5.xlarge", min_nodes=0, max_nodes=5, single_subnet=False
    ),
    "worker": AWSNodeGroup(
        instance="m5.xlarge", min_nodes=0, max_nodes=5, single_subnet=False
    ),
}


class AmazonWebServicesProvider(schema.Base):
    region: str
    kubernetes_version: str
    availability_zones: Optional[List[str]]
    node_groups: Dict[str, AWSNodeGroup] = DEFAULT_AWS_NODE_GROUPS
    eks_endpoint_access: Optional[
        Literal["private", "public", "public_and_private"]
    ] = "public"
    eks_kms_arn: Optional[str] = None
    existing_subnet_ids: Optional[List[str]] = None
    existing_security_group_id: Optional[str] = None
    vpc_cidr_block: str = "10.10.0.0/16"
    permissions_boundary: Optional[str] = None
    tags: Optional[Dict[str, str]] = {}

    @model_validator(mode="before")
    @classmethod
    def _check_input(cls, data: Any) -> Any:
        amazon_web_services.check_credentials()

        # check if region is valid
        available_regions = amazon_web_services.regions(data["region"])
        if data["region"] not in available_regions:
            raise ValueError(
                f"Amazon Web Services region={data['region']} is not one of {available_regions}"
            )

        # check if kubernetes version is valid
        available_kubernetes_versions = amazon_web_services.kubernetes_versions(
            data["region"]
        )
        if len(available_kubernetes_versions) == 0:
            raise ValueError("Request to AWS for available Kubernetes versions failed.")
        # use .get() so an omitted kubernetes_version falls back to the latest version
        if data.get("kubernetes_version") is None:
            data["kubernetes_version"] = available_kubernetes_versions[-1]
        elif data["kubernetes_version"] not in available_kubernetes_versions:
            raise ValueError(
                f"\nInvalid `kubernetes-version` provided: {data['kubernetes_version']}.\nPlease select from one of the following supported Kubernetes versions: {available_kubernetes_versions} or omit flag to use latest Kubernetes version available."
            )

        # check if availability zones are valid
        available_zones = amazon_web_services.zones(data["region"])
        if "availability_zones" not in data:
            data["availability_zones"] = sorted(available_zones)[:2]
        else:
            for zone in data["availability_zones"]:
                if zone not in available_zones:
                    raise ValueError(
                        f"Amazon Web Services availability zone={zone} is not one of {available_zones}"
                    )

        # check if instances are valid
        available_instances = amazon_web_services.instances(data["region"])
        if "node_groups" in data:
            for _, node_group in data["node_groups"].items():
                instance = (
                    node_group["instance"]
                    if hasattr(node_group, "__getitem__")
                    else node_group.instance
                )
                if instance not in available_instances:
                    raise ValueError(
                        f"Amazon Web Services instance {instance} not one of available instance types={available_instances}"
                    )

        # check if kms key is valid
        available_kms_keys = amazon_web_services.kms_key_arns(data["region"])
        if "eks_kms_arn" in data and data["eks_kms_arn"] is not None:
            key_id = [
                kid for kid in available_kms_keys.keys() if kid in data["eks_kms_arn"]
            ]
            # Raise error if key_id is not found in available_kms_keys
            if (
                len(key_id) != 1
                or available_kms_keys[key_id[0]].Arn != data["eks_kms_arn"]
            ):
                raise ValueError(
                    f"Amazon Web Services KMS Key with ARN {data['eks_kms_arn']} not one of available/enabled keys={[v.Arn for v in available_kms_keys.values() if v.KeyManager=='CUSTOMER' and v.KeySpec=='SYMMETRIC_DEFAULT']}"
                )
            key_id = key_id[0]
            # Raise error if key is not a customer managed key
            if available_kms_keys[key_id].KeyManager != "CUSTOMER":
                raise ValueError(
                    f"Amazon Web Services KMS Key with ID {key_id} is not a customer managed key"
                )
            # Symmetric KMS keys with Encrypt and decrypt key-usage have the SYMMETRIC_DEFAULT key-spec
            # EKS cluster encryption requires a Symmetric key that is set to encrypt and decrypt data
            if available_kms_keys[key_id].KeySpec != "SYMMETRIC_DEFAULT":
                if available_kms_keys[key_id].KeyUsage == "GENERATE_VERIFY_MAC":
                    raise ValueError(
                        f"Amazon Web Services KMS Key with ID {key_id} does not have KeyUsage set to 'Encrypt and decrypt' data"
                    )
                elif available_kms_keys[key_id].KeyUsage != "ENCRYPT_DECRYPT":
                    raise ValueError(
                        f"Amazon Web Services KMS Key with ID {key_id} is not of type Symmetric, and KeyUsage not set to 'Encrypt and decrypt' data"
                    )
                else:
                    raise ValueError(
                        f"Amazon Web Services KMS Key with ID {key_id} is not of type Symmetric"
                    )

        return data


class LocalProvider(schema.Base):
    kube_context: Optional[str] = None
    node_selectors: Dict[str, KeyValueDict] = {
        "general": KeyValueDict(key="kubernetes.io/os", value="linux"),
        "user": KeyValueDict(key="kubernetes.io/os", value="linux"),
        "worker": KeyValueDict(key="kubernetes.io/os", value="linux"),
    }


class ExistingProvider(schema.Base):
    kube_context: Optional[str] = None
    node_selectors: Dict[str, KeyValueDict] = {
        "general": KeyValueDict(key="kubernetes.io/os", value="linux"),
        "user": KeyValueDict(key="kubernetes.io/os", value="linux"),
        "worker": KeyValueDict(key="kubernetes.io/os", value="linux"),
    }


provider_enum_model_map = {
    schema.ProviderEnum.local: LocalProvider,
    schema.ProviderEnum.existing: ExistingProvider,
    schema.ProviderEnum.gcp: GoogleCloudPlatformProvider,
    schema.ProviderEnum.aws: AmazonWebServicesProvider,
    schema.ProviderEnum.azure: AzureProvider,
}

provider_enum_name_map: Dict[schema.ProviderEnum, str] = {
    schema.ProviderEnum.local: "local",
    schema.ProviderEnum.existing: "existing",
    schema.ProviderEnum.gcp: "google_cloud_platform",
    schema.ProviderEnum.aws: "amazon_web_services",
    schema.ProviderEnum.azure: "azure",
}

provider_name_abbreviation_map: Dict[str, str] = {
    value: key.value for key, value in provider_enum_name_map.items()
}

provider_enum_default_node_groups_map: Dict[schema.ProviderEnum, Any] = {
    schema.ProviderEnum.gcp: node_groups_to_dict(DEFAULT_GCP_NODE_GROUPS),
    schema.ProviderEnum.aws: node_groups_to_dict(DEFAULT_AWS_NODE_GROUPS),
    schema.ProviderEnum.azure: node_groups_to_dict(DEFAULT_AZURE_NODE_GROUPS),
}


class InputSchema(schema.Base):
    local: Optional[LocalProvider] = None
    existing: Optional[ExistingProvider] = None
    google_cloud_platform: Optional[GoogleCloudPlatformProvider] = None
    amazon_web_services: Optional[AmazonWebServicesProvider] = None
    azure: Optional[AzureProvider] = None

    @model_validator(mode="before")
    @classmethod
    def check_provider(cls, data: Any) -> Any:
        if "provider" in data:
            provider: str = data["provider"]
            if hasattr(schema.ProviderEnum, provider):
                # TODO: all cloud providers have required fields, but local and existing don't.
                #  And there is no way to initialize a model without user input here.
                #  We preserve the original behavior here, but we should find a better way to do this.
                if provider in ["local", "existing"] and provider not in data:
                    data[provider] = provider_enum_model_map[provider]()
            else:
                # if the provider field is invalid, it won't be set when this validator is called
                # so we need to check for it explicitly here, and set mode to "before"
                # TODO: this is a workaround, check if there is a better way to do this in Pydantic v2
                raise ValueError(
                    f"'{provider}' is not a valid enumeration member; permitted: local, existing, aws, gcp, azure"
                )
            set_providers = {
                provider
                for provider in provider_name_abbreviation_map.keys()
                if provider in data and data[provider]
            }
            expected_provider_config = provider_enum_name_map[provider]
            extra_provider_config = set_providers - {expected_provider_config}
            if extra_provider_config:
                warnings.warn(
                    f"Provider is set to {getattr(provider, 'value', provider)}, but configuration defined for other providers: {extra_provider_config}"
                )

        else:
            set_providers = [
                provider
                for provider in provider_name_abbreviation_map.keys()
                if provider in data
            ]
            num_providers = len(set_providers)
            if num_providers > 1:
                raise ValueError(f"Multiple providers set: {set_providers}")
            elif num_providers == 1:
                data["provider"] = provider_name_abbreviation_map[set_providers[0]]
            elif num_providers == 0:
                data["provider"] = schema.ProviderEnum.local.value

        return data


class NodeSelectorKeyValue(schema.Base):
    key: str
    value: str


class KubernetesCredentials(schema.Base):
    host: str
    cluster_ca_certificate: str
    token: Optional[str] = None
    username: Optional[str] = None
    password: Optional[str] = None
    client_certificate: Optional[str] = None
    client_key: Optional[str] = None
    config_path: Optional[str] = None
    config_context: Optional[str] = None


class OutputSchema(schema.Base):
    node_selectors: Dict[str, NodeSelectorKeyValue]
    kubernetes_credentials: KubernetesCredentials
    kubeconfig_filename: str
    nfs_endpoint: Optional[str] = None


class KubernetesInfrastructureStage(NebariTerraformStage):
    """Generalized method to provision infrastructure.

    After successful deployment the following properties are set on
    `stage_outputs[directory]`.
      - `kubernetes_credentials` which are sufficient credentials to
        connect with the kubernetes provider
      - `kubeconfig_filename` which is a path to a kubeconfig that can
        be used to connect to a kubernetes cluster
      - at least one node running such that resources in the
        node_group.general can be scheduled

    At a high level this stage is expected to provision a kubernetes
    cluster on a given provider.
    """

    name = "02-infrastructure"
    priority = 20

    input_schema = InputSchema
    output_schema = OutputSchema

    @property
    def template_directory(self):
        return (
            pathlib.Path(inspect.getfile(self.__class__)).parent
            / "template"
            / self.config.provider.value
        )

    @property
    def stage_prefix(self):
        return pathlib.Path("stages") / self.name / self.config.provider.value

    def state_imports(self) -> List[Tuple[str, str]]:
        if self.config.provider == schema.ProviderEnum.azure:
            if self.config.azure.resource_group_name is None:
                return []

            subscription_id = os.environ["ARM_SUBSCRIPTION_ID"]
            resource_group_name = construct_azure_resource_group_name(
                project_name=self.config.project_name,
                namespace=self.config.namespace,
                base_resource_group_name=self.config.azure.resource_group_name,
            )
            resource_url = (
                f"/subscriptions/{subscription_id}/resourceGroups/{resource_group_name}"
            )
            return [
                (
                    "azurerm_resource_group.resource_group",
                    resource_url,
                )
            ]
        # no existing state needs to be imported for the other providers
        return []

    def tf_objects(self) -> List[Dict]:
        if self.config.provider == schema.ProviderEnum.gcp:
            return [
                opentofu.Provider(
                    "google",
                    project=self.config.google_cloud_platform.project,
                    region=self.config.google_cloud_platform.region,
                ),
                NebariTerraformState(self.name, self.config),
            ]
        elif self.config.provider == schema.ProviderEnum.azure:
            return [
                NebariTerraformState(self.name, self.config),
            ]
        elif self.config.provider == schema.ProviderEnum.aws:
            return [
                opentofu.Provider("aws", region=self.config.amazon_web_services.region),
                NebariTerraformState(self.name, self.config),
            ]
        else:
            return []

    def input_vars(self, stage_outputs: Dict[str, Dict[str, Any]]):
        if self.config.provider == schema.ProviderEnum.local:
            return LocalInputVars(
                kube_context=self.config.local.kube_context
            ).model_dump()
        elif self.config.provider == schema.ProviderEnum.existing:
            return ExistingInputVars(
                kube_context=self.config.existing.kube_context
            ).model_dump()
        elif self.config.provider == schema.ProviderEnum.gcp:
            return GCPInputVars(
                name=self.config.escaped_project_name,
                environment=self.config.namespace,
                region=self.config.google_cloud_platform.region,
                project_id=self.config.google_cloud_platform.project,
                availability_zones=self.config.google_cloud_platform.availability_zones,
                node_groups=[
                    GCPNodeGroupInputVars(
                        name=name,
                        labels=node_group.labels,
                        instance_type=node_group.instance,
                        min_size=node_group.min_nodes,
                        max_size=node_group.max_nodes,
                        preemptible=node_group.preemptible,
                        guest_accelerators=node_group.guest_accelerators,
                    )
                    for name, node_group in self.config.google_cloud_platform.node_groups.items()
                ],
                tags=self.config.google_cloud_platform.tags,
                kubernetes_version=self.config.google_cloud_platform.kubernetes_version,
                release_channel=self.config.google_cloud_platform.release_channel,
                networking_mode=self.config.google_cloud_platform.networking_mode,
                network=self.config.google_cloud_platform.network,
                subnetwork=self.config.google_cloud_platform.subnetwork,
                ip_allocation_policy=self.config.google_cloud_platform.ip_allocation_policy,
                master_authorized_networks_config=self.config.google_cloud_platform.master_authorized_networks_config,
                private_cluster_config=self.config.google_cloud_platform.private_cluster_config,
                node_group_image_type=(
                    GCPNodeGroupImageTypeEnum.UBUNTU_CONTAINERD
                    if self.config.storage.type == SharedFsEnum.cephfs
                    else GCPNodeGroupImageTypeEnum.COS_CONTAINERD
                ),
            ).model_dump()
        elif self.config.provider == schema.ProviderEnum.azure:
            return AzureInputVars(
                name=self.config.escaped_project_name,
                environment=self.config.namespace,
                region=self.config.azure.region,
                kubernetes_version=self.config.azure.kubernetes_version,
                authorized_ip_ranges=self.config.azure.authorized_ip_ranges,
                node_groups={
                    name: AzureNodeGroupInputVars(
                        instance=node_group.instance,
                        min_nodes=node_group.min_nodes,
                        max_nodes=node_group.max_nodes,
                    )
                    for name, node_group in self.config.azure.node_groups.items()
                },
                resource_group_name=construct_azure_resource_group_name(
                    project_name=self.config.project_name,
                    namespace=self.config.namespace,
                    base_resource_group_name=self.config.azure.resource_group_name,
                ),
                node_resource_group_name=construct_azure_resource_group_name(
                    project_name=self.config.project_name,
                    namespace=self.config.namespace,
                    base_resource_group_name=self.config.azure.resource_group_name,
                    suffix=AZURE_NODE_RESOURCE_GROUP_SUFFIX,
                ),
                vnet_subnet_id=self.config.azure.vnet_subnet_id,
                private_cluster_enabled=self.config.azure.private_cluster_enabled,
                tags=self.config.azure.tags,
                network_profile=self.config.azure.network_profile,
                max_pods=self.config.azure.max_pods,
                workload_identity_enabled=self.config.azure.workload_identity_enabled,
                azure_policy_enabled=self.config.azure.azure_policy_enabled,
            ).model_dump()
        elif self.config.provider == schema.ProviderEnum.aws:
            return AWSInputVars(
                name=self.config.escaped_project_name,
                environment=self.config.namespace,
                eks_endpoint_access=self.config.amazon_web_services.eks_endpoint_access,
                eks_kms_arn=self.config.amazon_web_services.eks_kms_arn,
                existing_subnet_ids=self.config.amazon_web_services.existing_subnet_ids,
                existing_security_group_id=self.config.amazon_web_services.existing_security_group_id,
                region=self.config.amazon_web_services.region,
                kubernetes_version=self.config.amazon_web_services.kubernetes_version,
                node_groups=[
                    AWSNodeGroupInputVars(
                        name=name,
                        instance_type=node_group.instance,
                        gpu=node_group.gpu,
                        min_size=node_group.min_nodes,
                        desired_size=node_group.min_nodes,
                        max_size=node_group.max_nodes,
                        single_subnet=node_group.single_subnet,
                        permissions_boundary=node_group.permissions_boundary,
                        launch_template=None,
                        ami_type=construct_aws_ami_type(
                            gpu_enabled=node_group.gpu,
                            launch_template=None,
                        ),
                    )
                    for name, node_group in self.config.amazon_web_services.node_groups.items()
                ],
                availability_zones=self.config.amazon_web_services.availability_zones,
                vpc_cidr_block=self.config.amazon_web_services.vpc_cidr_block,
                permissions_boundary=self.config.amazon_web_services.permissions_boundary,
                tags=self.config.amazon_web_services.tags,
                efs_enabled=self.config.storage.type == SharedFsEnum.efs,
            ).model_dump()
        else:
            raise ValueError(f"Unknown provider: {self.config.provider}")

    def check(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        from kubernetes import client, config
        from kubernetes.client.rest import ApiException

        config.load_kube_config(
            config_file=stage_outputs["stages/02-infrastructure"][
                "kubeconfig_filename"
            ]["value"]
        )

        try:
            api_instance = client.CoreV1Api()
            result = api_instance.list_namespace()
        except ApiException:
            print(
                f"ERROR: After stage={self.name} unable to connect to kubernetes cluster"
            )
            sys.exit(1)

        if len(result.items) < 1:
            print(
                f"ERROR: After stage={self.name} no namespaces found within kubernetes cluster"
            )
            sys.exit(1)

        print(f"After stage={self.name} kubernetes cluster successfully provisioned")

    def set_outputs(
        self, stage_outputs: Dict[str, Dict[str, Any]], outputs: Dict[str, Any]
    ):
        outputs["node_selectors"] = _calculate_node_groups(self.config)
        super().set_outputs(stage_outputs, outputs)

    def post_deploy(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        asg_node_group_map = _calculate_asg_node_group_map(self.config)
        if asg_node_group_map:
            amazon_web_services.set_asg_tags(
                asg_node_group_map, self.config.amazon_web_services.region
            )

    @contextlib.contextmanager
    def deploy(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        with super().deploy(stage_outputs, disable_prompt):
            with kubernetes_provider_context(
                stage_outputs["stages/" + self.name]["kubernetes_credentials"]["value"]
            ):
                yield

    @contextlib.contextmanager
    def destroy(
        self, stage_outputs: Dict[str, Dict[str, Any]], status: Dict[str, bool]
    ):
        with super().destroy(stage_outputs, status):
            with kubernetes_provider_context(
                stage_outputs["stages/" + self.name]["kubernetes_credentials"]["value"]
            ):
                yield


@hookimpl
def nebari_stage() -> List[Type[NebariStage]]:
    return [KubernetesInfrastructureStage]



---
File: nebari/src/_nebari/stages/kubernetes_ingress/template/modules/kubernetes/ingress/main.tf
---

locals {
  default_cert = [
    "--entrypoints.websecure.http.tls.certResolver=default",
    "--entrypoints.minio.http.tls.certResolver=default",
  ]
  certificate-settings = {
    lets-encrypt = [
      "--entrypoints.websecure.http.tls.certResolver=letsencrypt",
      "--entrypoints.minio.http.tls.certResolver=letsencrypt",
      "--certificatesresolvers.letsencrypt.acme.tlschallenge",
      "--certificatesresolvers.letsencrypt.acme.email=${var.acme-email}",
      "--certificatesresolvers.letsencrypt.acme.storage=/mnt/acme-certificates/acme.json",
      "--certificatesresolvers.letsencrypt.acme.caserver=${var.acme-server}",
    ]
    self-signed = local.default_cert
    existing    = local.default_cert
    disabled    = []
  }
  add-certificate = local.certificate-settings[var.certificate-service]
}


resource "kubernetes_service_account" "main" {
  metadata {
    name      = "${var.name}-traefik-ingress"
    namespace = var.namespace
  }
}

resource "kubernetes_persistent_volume_claim" "traefik_certs_pvc" {
  metadata {
    name      = "traefik-ingress-certs"
    namespace = var.namespace
  }
  spec {
    access_modes = ["ReadWriteOnce"]
    resources {
      requests = {
        storage = "5Gi"
      }
    }
  }
  wait_until_bound = false
}


resource "kubernetes_cluster_role" "main" {
  metadata {
    name = "${var.name}-traefik-ingress"
  }

  rule {
    api_groups = [""]
    resources  = ["services", "endpoints", "secrets"]
    verbs      = ["get", "list", "watch"]
  }

  rule {
    api_groups = ["extensions", "networking.k8s.io"]
    resources  = ["ingresses", "ingressclasses"]
    verbs      = ["get", "list", "watch"]
  }

  rule {
    api_groups = ["extensions"]
    resources  = ["ingresses/status"]
    verbs      = ["update"]
  }

  rule {
    api_groups = ["traefik.containo.us"]
    resources  = ["ingressroutes", "ingressroutetcps", "ingressrouteudps", "middlewares", "middlewaretcps", "tlsoptions", "tlsstores", "traefikservices", "serverstransports"]
    verbs      = ["get", "list", "watch"]
  }
}


resource "kubernetes_cluster_role_binding" "main" {
  metadata {
    name = "${var.name}-traefik-ingress"
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "ClusterRole"
    name      = kubernetes_cluster_role.main.metadata.0.name
  }
  subject {
    kind      = "ServiceAccount"
    name      = kubernetes_service_account.main.metadata.0.name
    namespace = var.namespace
  }
}


resource "kubernetes_service" "main" {
  wait_for_load_balancer = true

  metadata {
    name        = "${var.name}-traefik-ingress"
    namespace   = var.namespace
    annotations = var.load-balancer-annotations
  }

  spec {
    selector = {
      "app.kubernetes.io/component" = "traefik-ingress"
    }

    port {
      name        = "http"
      protocol    = "TCP"
      port        = 80
      target_port = 80
    }

    port {
      name        = "https"
      protocol    = "TCP"
      port        = 443
      target_port = 443
    }

    port {
      name        = "ssh"
      protocol    = "TCP"
      port        = 8022
      target_port = 8022
    }

    port {
      name        = "sftp"
      protocol    = "TCP"
      port        = 8023
      target_port = 8023
    }

    port {
      name        = "minio"
      protocol    = "TCP"
      port        = 9080
      target_port = 9080
    }

    port {
      name        = "tcp"
      protocol    = "TCP"
      port        = 8786
      target_port = 8786
    }

    type             = "LoadBalancer"
    load_balancer_ip = var.load-balancer-ip
  }
}

resource "kubernetes_service" "traefik_internal" {
  wait_for_load_balancer = true

  metadata {
    name      = "${var.name}-traefik-internal"
    namespace = var.namespace
    annotations = {
      "prometheus.io/scrape" = "true"
      "prometheus.io/path"   = "/metrics"
      "prometheus.io/port"   = "9000"
    }
    labels = {
      "app.kubernetes.io/component" = "traefik-internal-service"
      "app.kubernetes.io/part-of"   = "traefik-ingress"
    }
  }

  spec {
    selector = {
      "app.kubernetes.io/component" = "traefik-ingress"
    }

    port {
      name        = "http"
      protocol    = "TCP"
      port        = 9000
      target_port = 9000
    }

    type = "ClusterIP"
  }
}

resource "kubernetes_deployment" "main" {
  metadata {
    name      = "${var.name}-traefik-ingress"
    namespace = var.namespace
  }

  spec {
    replicas = 1

    selector {
      match_labels = {
        "app.kubernetes.io/component" = "traefik-ingress"
      }
    }

    template {
      metadata {
        labels = {
          "app.kubernetes.io/component" = "traefik-ingress"
        }
      }

      spec {
        service_account_name             = kubernetes_service_account.main.metadata.0.name
        termination_grace_period_seconds = 60

        affinity {
          node_affinity {
            required_during_scheduling_ignored_during_execution {
              node_selector_term {
                match_expressions {
                  key      = var.node-group.key
                  operator = "In"
                  values   = [var.node-group.value]
                }
              }
            }
          }
        }

        container {
          image = "${var.traefik-image.image}:${var.traefik-image.tag}"
          name  = var.name

          volume_mount {
            mount_path = "/mnt/acme-certificates"
            name       = "acme-certificates"
          }
          security_context {
            capabilities {
              drop = ["ALL"]
              add  = ["NET_BIND_SERVICE"]
            }
          }

          args = concat([
            # Do not send usage stats
            "--global.checknewversion=false",
            "--global.sendanonymoususage=false",
            # Allow access to the dashboard directly through the port.
            # TODO: eventually needs to be tied into Traefik middleware
            # security, possibly using JupyterHub auth. This is not a
            # security risk at the moment since this port is not
            # externally accessible.
            "--api.insecure=true",
            "--api.dashboard=true",
            "--ping=true",
            # Start the Traefik Kubernetes Ingress Controller
            "--providers.kubernetesingress=true",
            "--providers.kubernetesingress.namespaces=${var.namespace}",
            "--providers.kubernetesingress.ingressclass=traefik",
            # Start the Traefik Kubernetes CRD Controller Provider
            "--providers.kubernetescrd",
            "--providers.kubernetescrd.namespaces=${var.namespace}",
            "--providers.kubernetescrd.throttleduration=2s",
            "--providers.kubernetescrd.allowcrossnamespace=false",
            # Define the entrypoint ports (web, websecure, ssh, sftp, tcp, traefik).
            "--entryPoints.web.address=:80",
            "--entryPoints.websecure.address=:443",
            "--entrypoints.ssh.address=:8022",
            "--entrypoints.sftp.address=:8023",
            "--entryPoints.tcp.address=:8786",
            "--entryPoints.traefik.address=:9000",
            # Define the entrypoint port for Minio
            "--entryPoints.minio.address=:9080",
            # Redirect http -> https
            "--entrypoints.web.http.redirections.entryPoint.to=websecure",
            "--entrypoints.web.http.redirections.entryPoint.scheme=https",
            # Enable Prometheus Monitoring of Traefik
            "--metrics.prometheus=true",
            # Set the log level from var.loglevel. DEBUG is useful to work out why
            # something might not be working; fetch the logs of the pod.
            "--log.level=${var.loglevel}",
            ],
            local.add-certificate,
            var.additional-arguments,
          )

          port {
            name           = "http"
            container_port = 80
          }

          port {
            name           = "https"
            container_port = 443
          }

          port {
            name           = "ssh"
            container_port = 8022
          }

          port {
            name           = "sftp"
            container_port = 8023
          }

          port {
            name           = "tcp"
            container_port = 8786
          }

          port {
            name           = "traefik"
            container_port = 9000
          }

          port {
            name           = "minio"
            container_port = 9080
          }

          liveness_probe {
            http_get {
              path = "/ping"
              port = "traefik"
            }

            initial_delay_seconds = 10
            timeout_seconds       = 2
            period_seconds        = 10
            failure_threshold     = 3
            success_threshold     = 1
          }

          readiness_probe {
            http_get {
              path = "/ping"
              port = "traefik"
            }

            initial_delay_seconds = 10
            timeout_seconds       = 2
            period_seconds        = 10
            failure_threshold     = 1
            success_threshold     = 1
          }
        }
        volume {
          name = "acme-certificates"
          persistent_volume_claim {
            claim_name = kubernetes_persistent_volume_claim.traefik_certs_pvc.metadata.0.name
          }
        }
      }
    }
  }
}


resource "kubernetes_manifest" "tlsstore_default" {
  count = var.certificate-secret-name != null ? 1 : 0
  manifest = {
    "apiVersion" = "traefik.containo.us/v1alpha1"
    "kind"       = "TLSStore"
    "metadata" = {
      "name"      = "default"
      "namespace" = var.namespace
    }
    "spec" = {
      "defaultCertificate" = {
        "secretName" = var.certificate-secret-name
      }
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_ingress/template/modules/kubernetes/ingress/outputs.tf
---

locals {
  ingress = kubernetes_service.main.status.0.load_balancer.0.ingress
}

output "endpoint" {
  description = "traefik load balancer endpoint"
  # handles the case when ingress is an empty list
  value = length(local.ingress) == 0 ? null : local.ingress.0
}
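
The `endpoint` output above is `null` when the load balancer has no ingress entry yet, and otherwise an object that may carry a `hostname`, an `ip`, or both. A minimal Python sketch of how a consumer could pick a usable host from that value defensively follows. The helper name `pick_host` is hypothetical and not part of the repository, and it strips all surrounding whitespace rather than only newlines.

```python
from typing import Mapping, Optional


def pick_host(endpoint: Optional[Mapping[str, Optional[str]]]) -> Optional[str]:
    """Return the hostname if set, else the IP, else None.

    `endpoint` mirrors the shape of the Terraform `endpoint` output:
    None when no ingress exists, otherwise a mapping with optional
    "hostname" and "ip" keys (hypothetical helper, for illustration only).
    """
    if not endpoint:
        return None
    # Prefer the hostname (typical on AWS), fall back to the IP (GCP, Azure).
    host = endpoint.get("hostname") or endpoint.get("ip")
    return host.strip() if host else None


if __name__ == "__main__":
    print(pick_host(None))
    print(pick_host({"hostname": "", "ip": "203.0.113.10"}))
    print(pick_host({"hostname": "lb.example.com\n", "ip": ""}))
```

Returning `None` instead of raising lets the caller decide whether a missing load balancer address is fatal at that point in the deployment.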



---
File: nebari/src/_nebari/stages/kubernetes_ingress/template/modules/kubernetes/ingress/variables.tf
---

variable "name" {
  description = "name prefix to assign to traefik"
  type        = string
  default     = "nebari"
}

variable "namespace" {
  description = "namespace to deploy traefik"
  type        = string
}

variable "node-group" {
  description = "Node group to associate ingress deployment"
  type = object({
    key   = string
    value = string
  })

}

variable "traefik-image" {
  description = "traefik image to use"
  type = object({
    image = string
    tag   = string
  })
}

variable "loglevel" {
  description = "traefik log level"
  default     = "WARN"
}

variable "acme-email" {
  description = "ACME server email"
  default     = "costrouchov@quansight.com"
}

variable "acme-server" {
  description = "ACME server"
  # for testing use the Let's Encrypt staging server
  #  - staging:    https://acme-staging-v02.api.letsencrypt.org/directory
  #  - production: https://acme-v02.api.letsencrypt.org/directory
  default = "https://acme-staging-v02.api.letsencrypt.org/directory"
}

variable "certificate-secret-name" {
  description = "Kubernetes secret used for certificate"
  type        = string
  default     = null
}

variable "load-balancer-ip" {
  description = "IP Address of the load balancer"
  type        = string
  default     = null
}

variable "load-balancer-annotations" {
  description = "Annotations for the load balancer"
  type        = map(string)
  default     = null
}

variable "certificate-service" {
  description = "The certificate service to use"
  type        = string
  default     = "self-signed"
}

variable "additional-arguments" {
  description = "Additional command line arguments to supply to traefik ingress"
  type        = list(string)
  default     = []
}



---
File: nebari/src/_nebari/stages/kubernetes_ingress/template/locals.tf
---

locals {
  additional_tags = {
    Project     = var.name
    Owner       = "terraform"
    Environment = var.environment
  }

  cluster_name = "${var.name}-${var.environment}"
}



---
File: nebari/src/_nebari/stages/kubernetes_ingress/template/main.tf
---

module "kubernetes-ingress" {
  source = "./modules/kubernetes/ingress"

  namespace = var.environment

  node-group = var.node_groups.general

  traefik-image = var.traefik-image

  certificate-service       = var.certificate-service
  acme-email                = var.acme-email
  acme-server               = var.acme-server
  certificate-secret-name   = var.certificate-secret-name
  load-balancer-annotations = var.load-balancer-annotations
  load-balancer-ip          = var.load-balancer-ip
  additional-arguments      = var.additional-arguments
}



---
File: nebari/src/_nebari/stages/kubernetes_ingress/template/outputs.tf
---

output "load_balancer_address" {
  description = "traefik load balancer address"
  value       = module.kubernetes-ingress.endpoint
}



---
File: nebari/src/_nebari/stages/kubernetes_ingress/template/variables.tf
---

variable "name" {
  description = "Prefix name to assign to ingress kubernetes resources"
  type        = string
}

variable "environment" {
  description = "Kubernetes namespace to deploy ingress resources"
  type        = string
}

variable "node_groups" {
  description = "Node group selectors for kubernetes resources"
  type = map(object({
    key   = string
    value = string
  }))
}

variable "traefik-image" {
  description = "traefik image to use"
  type = object({
    image = string
    tag   = string
  })
}

variable "acme-email" {
  description = "ACME server email"
  default     = "nebari@example.com"
}

variable "acme-server" {
  description = "ACME server"
  # for testing use the Let's Encrypt staging server
  #  - staging:    https://acme-staging-v02.api.letsencrypt.org/directory
  #  - production: https://acme-v02.api.letsencrypt.org/directory
  default = "https://acme-staging-v02.api.letsencrypt.org/directory"
}

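variable "certificate-secret-name" {
  description = "Kubernetes secret used for certificate"
  type        = string
  # null (not "") so the ingress module's `!= null` check skips the TLSStore when unset
  default = null
}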


variable "load-balancer-ip" {
  description = "IP Address of the load balancer"
  type        = string
  default     = null
}


variable "load-balancer-annotations" {
  description = "Annotations for the load balancer"
  type        = map(string)
  default     = null
}


variable "certificate-service" {
  description = "The certificate service to use"
  type        = string
  default     = "self-signed"
}


variable "additional-arguments" {
  description = "Additional command line arguments to supply to traefik ingress"
  type        = list(string)
  default     = []
}



---
File: nebari/src/_nebari/stages/kubernetes_ingress/template/versions.tf
---

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "2.1.2"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.35.1"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/kubernetes_ingress/__init__.py
---

import enum
import logging
import socket
import sys
import time
from typing import Any, Dict, List, Optional, Type

from _nebari import constants
from _nebari.provider.dns.cloudflare import update_record
from _nebari.stages.base import NebariTerraformStage
from _nebari.stages.tf_objects import (
    NebariHelmProvider,
    NebariKubernetesProvider,
    NebariTerraformState,
)
from nebari import schema
from nebari.hookspecs import NebariStage, hookimpl

logger = logging.getLogger(__name__)

# check and retry settings
NUM_ATTEMPTS = 10
TIMEOUT = 10  # seconds


def provision_ingress_dns(
    stage_outputs: Dict[str, Dict[str, Any]],
    config: schema.Main,
    dns_provider: str,
    dns_auto_provision: bool,
    disable_prompt: bool = True,
):
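    """Point the configured domain at the ingress load balancer.

    With Cloudflare auto-provisioning enabled, an A record (GCP, Azure) or a
    CNAME record (AWS) is written via `update_record`. Otherwise, unless
    prompts are disabled, the user is asked to update DNS manually.
    """
    directory = "stages/04-kubernetes-ingress"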

    ip_or_name = stage_outputs[directory]["load_balancer_address"]["value"]
    ip_or_hostname = ip_or_name["hostname"] or ip_or_name["ip"]

    if dns_auto_provision and dns_provider == "cloudflare":
        record_name, zone_name = (
            config.domain.split(".")[:-2],
            config.domain.split(".")[-2:],
        )
        record_name = ".".join(record_name)
        zone_name = ".".join(zone_name)
        if config.provider in {
            schema.ProviderEnum.gcp,
            schema.ProviderEnum.azure,
        }:
            update_record(zone_name, record_name, "A", ip_or_hostname)

        elif config.provider == schema.ProviderEnum.aws:
            update_record(zone_name, record_name, "CNAME", ip_or_hostname)
        else:
            logger.info(
                f"Couldn't update the DNS record for cloud provider: {config.provider}"
            )
    elif not disable_prompt:
        input(
            f'Update the DNS record for "{config.domain}" to point to '
            f"{ip_or_hostname} [Press Enter when complete]"
        )


def check_ingress_dns(stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool):
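    """Poll DNS until the deployment domain resolves to the ingress load balancer IP.

    Exits the process after five failed polling rounds.
    """
    directory = "stages/04-kubernetes-ingress"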

    ip_or_name = stage_outputs[directory]["load_balancer_address"]["value"]
    ip = socket.gethostbyname(ip_or_name["hostname"] or ip_or_name["ip"])
    domain_name = stage_outputs[directory]["domain"]

    def _attempt_dns_lookup(
        domain_name, ip, num_attempts=NUM_ATTEMPTS, timeout=TIMEOUT
    ):
        for i in range(num_attempts):
            try:
                _, _, resolved_ips = socket.gethostbyname_ex(domain_name)
                if ip in resolved_ips:
                    print(
                        f"DNS configured domain={domain_name} matches ingress ip={ip}"
                    )
                    return True
                else:
                    print(
                        f"Attempt {i+1} polling DNS domain={domain_name} does not match ip={ip} instead got {resolved_ips}"
                    )
            except socket.gaierror:
                print(
                    f"Attempt {i+1} polling DNS domain={domain_name} record does not exist"
                )
            time.sleep(timeout)
        return False

    attempt = 0
    while not _attempt_dns_lookup(domain_name, ip):
        if disable_prompt:
            sleeptime = 60 * (2**attempt)
            print(f"Will attempt to poll DNS again in {sleeptime} seconds...")
            time.sleep(sleeptime)
        else:
            input(
                f"After attempting to poll the DNS, the record for domain={domain_name} appears not to exist, "
                "has recently been updated, or has yet to fully propagate. This non-deterministic behavior is likely due to "
                "DNS caching and will likely resolve itself in a few minutes.\n\n\tTo poll the DNS again [Press Enter].\n\n"
                "...otherwise kill the process and run the deployment again later..."
            )

        attempt += 1
        if attempt == 5:
            print(
                f"ERROR: After stage directory={directory} DNS domain={domain_name} does not point to ip={ip}"
            )
            sys.exit(1)


@schema.yaml_object(schema.yaml)
class CertificateEnum(str, enum.Enum):
    letsencrypt = "lets-encrypt"
    selfsigned = "self-signed"
    existing = "existing"
    disabled = "disabled"

    @classmethod
    def to_yaml(cls, representer, node):
        return representer.represent_str(node.value)


class Certificate(schema.Base):
    type: CertificateEnum = CertificateEnum.selfsigned
    # existing
    secret_name: Optional[str] = None
    # lets-encrypt
    acme_email: Optional[str] = None
    acme_server: str = "https://acme-v02.api.letsencrypt.org/directory"


class DnsProvider(schema.Base):
    provider: Optional[str] = None
    auto_provision: Optional[bool] = False


class Ingress(schema.Base):
    terraform_overrides: Dict = {}


class InputSchema(schema.Base):
    domain: Optional[str] = None
    certificate: Certificate = Certificate()
    ingress: Ingress = Ingress()
    dns: DnsProvider = DnsProvider()


class IngressEndpoint(schema.Base):
    ip: str
    hostname: str


class OutputSchema(schema.Base):
    load_balancer_address: List[IngressEndpoint]
    domain: str


class KubernetesIngressStage(NebariTerraformStage):
    name = "04-kubernetes-ingress"
    priority = 40

    input_schema = InputSchema
    output_schema = OutputSchema

    def tf_objects(self) -> List[Dict]:
        return [
            NebariTerraformState(self.name, self.config),
            NebariKubernetesProvider(self.config),
            NebariHelmProvider(self.config),
        ]

    def input_vars(self, stage_outputs: Dict[str, Dict[str, Any]]):
        cert_type = self.config.certificate.type
        cert_details = {"certificate-service": cert_type}
        if cert_type == "lets-encrypt":
            cert_details["acme-email"] = self.config.certificate.acme_email
            cert_details["acme-server"] = self.config.certificate.acme_server
        elif cert_type == "existing":
            cert_details["certificate-secret-name"] = (
                self.config.certificate.secret_name
            )

        return {
            **{
                "traefik-image": {
                    "image": "traefik",
                    "tag": constants.DEFAULT_TRAEFIK_IMAGE_TAG,
                },
                "name": self.config.project_name,
                "environment": self.config.namespace,
                "node_groups": stage_outputs["stages/02-infrastructure"][
                    "node_selectors"
                ],
                **self.config.ingress.terraform_overrides,
            },
            **cert_details,
        }

    def set_outputs(
        self, stage_outputs: Dict[str, Dict[str, Any]], outputs: Dict[str, Any]
    ):
        ip_or_name = outputs["load_balancer_address"]["value"]
        host = ip_or_name["hostname"] or ip_or_name["ip"]
        host = host.strip("\n")

        if self.config.domain is None:
            outputs["domain"] = host
        else:
            outputs["domain"] = self.config.domain

        super().set_outputs(stage_outputs, outputs)

    def post_deploy(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        if self.config.dns and self.config.dns.provider:
            provision_ingress_dns(
                stage_outputs,
                self.config,
                dns_provider=self.config.dns.provider,
                dns_auto_provision=self.config.dns.auto_provision,
                disable_prompt=disable_prompt,
            )

    def check(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        def _attempt_tcp_connect(
            host, port, num_attempts=NUM_ATTEMPTS, timeout=TIMEOUT
        ):
            for i in range(num_attempts):
                s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                try:
                    # normalize hostname to ip address
                    ip = socket.gethostbyname(host)
                    s.settimeout(5)
                    result = s.connect_ex((ip, port))
                    if result == 0:
                        print(
                            f"Attempt {i+1} succeeded to connect to tcp://{ip}:{port}"
                        )
                        return True
                    print(f"Attempt {i+1} failed to connect to tcp://{ip}:{port}")
                except socket.gaierror:
                    print(f"Attempt {i+1} failed to get IP for {host}...")
                finally:
                    s.close()

                time.sleep(timeout)

            return False

        tcp_ports = {
            80,  # http
            443,  # https
            8022,  # jupyterhub-ssh ssh
            8023,  # jupyterhub-ssh sftp
            9080,  # minio
            8786,  # dask-scheduler
        }
        ip_or_name = stage_outputs["stages/" + self.name]["load_balancer_address"][
            "value"
        ]
        host = ip_or_name["hostname"] or ip_or_name["ip"]
        host = host.strip("\n")

        for port in tcp_ports:
            if not _attempt_tcp_connect(host, port):
                print(
                    f"ERROR: After stage={self.name} unable to connect to ingress host={host} port={port}"
                )
                sys.exit(1)

        print(
            f"After stage={self.name} kubernetes ingress available on tcp ports={tcp_ports}"
        )

        check_ingress_dns(stage_outputs, disable_prompt=disable_prompt)


@hookimpl
def nebari_stage() -> List[Type[NebariStage]]:
    return [KubernetesIngressStage]
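
`provision_ingress_dns` above derives the Cloudflare zone from the last two labels of `config.domain` and treats everything before them as the record name. A minimal standalone sketch of that split follows. The function name `split_domain` is hypothetical; it only reproduces the slicing logic and does not call any DNS API.

```python
from typing import Tuple


def split_domain(domain: str) -> Tuple[str, str]:
    """Split a domain into (record_name, zone_name).

    The zone is assumed to be the last two dot-separated labels, which is
    the same assumption the stage code makes (hypothetical helper).
    """
    labels = domain.split(".")
    record_name = ".".join(labels[:-2])
    zone_name = ".".join(labels[-2:])
    return record_name, zone_name


if __name__ == "__main__":
    for name in ("nebari.example.com", "a.b.example.com", "nebari.example.co.uk"):
        print(name, "->", split_domain(name))
```

The last case shows the limit of the two-label assumption: for a domain under a multi-label public suffix such as `co.uk`, the computed zone is `co.uk` rather than `example.co.uk`, so such domains would likely need manual DNS configuration.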



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/modules/cluster-autoscaler/main.tf
---

resource "helm_release" "autoscaler" {
  name      = "cluster-autoscaler"
  namespace = var.namespace

  repository = "https://kubernetes.github.io/autoscaler"
  chart      = "cluster-autoscaler"
  version    = "9.19.0"

  values = concat([
    jsonencode({
      rbac = {
        create = true
      }

      cloudProvider = "aws"
      awsRegion     = var.aws_region

      autoDiscovery = {
        clusterName = var.cluster-name
        enabled     = true
      }

      affinity = {
        nodeAffinity = {
          requiredDuringSchedulingIgnoredDuringExecution = {
            nodeSelectorTerms = [
              {
                matchExpressions = [
                  {
                    key      = "eks.amazonaws.com/nodegroup"
                    operator = "In"
                    values   = ["general"]
                  }
                ]
              }
            ]
          }
        }
      }
    })
  ], var.overrides)
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/modules/cluster-autoscaler/variables.tf
---

variable "namespace" {
  description = "Namespace for helm chart resource"
  type        = string
}

variable "cluster-name" {
  description = "Cluster name for kubernetes cluster"
  type        = string
}

variable "aws_region" {
  description = "AWS region in which the cluster autoscaler runs"
  type        = string
}

variable "overrides" {
  description = "Helm overrides to apply"
  type        = list(string)
  default     = []
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/modules/extcr/main.tf
---

resource "kubernetes_secret" "customer_extcr_key" {
  metadata {
    name      = "customer-extcr-key"
    namespace = var.namespace
  }

  data = {
    "access-key-id"     = var.access_key_id
    "secret-access-key" = var.secret_access_key
    "extcr-account"     = var.extcr_account
    "extcr-region"      = var.extcr_region
  }
}

resource "kubernetes_manifest" "role_extcr_cred_updater" {
  manifest = {
    "apiVersion" = "rbac.authorization.k8s.io/v1"
    "kind"       = "Role"
    "metadata" = {
      "name"      = "extcr-cred-updater"
      "namespace" = var.namespace
    }
    "rules" = [
      {
        "apiGroups" = [
          "",
        ]
        "resources" = [
          "secrets",
        ]
        "verbs" = [
          "get",
          "create",
          "delete",
        ]
      },
      {
        "apiGroups" = [
          "",
        ]
        "resources" = [
          "serviceaccounts",
        ]
        "verbs" = [
          "get",
          "patch",
        ]
      },
    ]
  }
}

resource "kubernetes_manifest" "serviceaccount_extcr_cred_updater" {
  manifest = {
    "apiVersion" = "v1"
    "kind"       = "ServiceAccount"
    "metadata" = {
      "name"      = "extcr-cred-updater"
      "namespace" = var.namespace
    }
  }
}

resource "kubernetes_manifest" "rolebinding_extcr_cred_updater" {
  manifest = {
    "apiVersion" = "rbac.authorization.k8s.io/v1"
    "kind"       = "RoleBinding"
    "metadata" = {
      "name"      = "extcr-cred-updater"
      "namespace" = var.namespace
    }
    "roleRef" = {
      "apiGroup" = "rbac.authorization.k8s.io"
      "kind"     = "Role"
      "name"     = "extcr-cred-updater"
    }
    "subjects" = [
      {
        "kind" = "ServiceAccount"
        "name" = "extcr-cred-updater"
      },
    ]
  }
}

resource "kubernetes_manifest" "job_extcr_cred_updater" {
  manifest = {
    "apiVersion" = "batch/v1"
    "kind"       = "Job"
    "metadata" = {
      "name"      = "extcr-cred-updater"
      "namespace" = var.namespace
    }
    "spec" = {
      "backoffLimit" = 4
      "template" = {
        "spec" = {
          "containers" = [
            {
              "command" = [
                "/bin/sh",
                "-c",
                <<-EOT
                DOCKER_REGISTRY_SERVER=https://$${AWS_ACCOUNT}.dkr.ecr.$${AWS_REGION}.amazonaws.com
                DOCKER_USER=AWS
                DOCKER_PASSWORD=`aws ecr get-login --region $${AWS_REGION} --registry-ids $${AWS_ACCOUNT} | cut -d' ' -f6`
                kubectl delete secret extcrcreds || true
                kubectl create secret docker-registry extcrcreds \
                --docker-server=$DOCKER_REGISTRY_SERVER \
                --docker-username=$DOCKER_USER \
                --docker-password=$DOCKER_PASSWORD \
                --docker-email=no@email.local
                kubectl patch serviceaccount default -p '{"imagePullSecrets":[{"name":"extcrcreds"}]}'

                EOT
                ,
              ]
              "env" = [
                {
                  "name" = "AWS_ACCESS_KEY_ID"
                  "valueFrom" = {
                    "secretKeyRef" = {
                      "key"  = "access-key-id"
                      "name" = "customer-extcr-key"
                    }
                  }
                },
                {
                  "name" = "AWS_SECRET_ACCESS_KEY"
                  "valueFrom" = {
                    "secretKeyRef" = {
                      "key"  = "secret-access-key"
                      "name" = "customer-extcr-key"
                    }
                  }
                },
                {
                  "name" = "AWS_ACCOUNT"
                  "valueFrom" = {
                    "secretKeyRef" = {
                      "key"  = "extcr-account"
                      "name" = "customer-extcr-key"
                    }
                  }
                },
                {
                  "name" = "AWS_REGION"
                  "valueFrom" = {
                    "secretKeyRef" = {
                      "key"  = "extcr-region"
                      "name" = "customer-extcr-key"
                    }
                  }
                },
              ]
              "image" = "xynova/aws-kubectl"
              "name"  = "kubectl"
            },
          ]
          "restartPolicy"                 = "Never"
          "serviceAccountName"            = "extcr-cred-updater"
          "terminationGracePeriodSeconds" = 0
        }
      }
    }
  }
}

resource "kubernetes_manifest" "cronjob_extcr_cred_updater" {
  manifest = {
    "apiVersion" = "batch/v1"
    "kind"       = "CronJob"
    "metadata" = {
      "name"      = "extcr-cred-updater"
      "namespace" = var.namespace
    }
    "spec" = {
      "failedJobsHistoryLimit" = 1
      "jobTemplate" = {
        "spec" = {
          "backoffLimit" = 4
          "template" = {
            "spec" = {
              "containers" = [
                {
                  "command" = [
                    "/bin/sh",
                    "-c",
                    <<-EOT
                    DOCKER_REGISTRY_SERVER=https://$${AWS_ACCOUNT}.dkr.ecr.$${AWS_REGION}.amazonaws.com
                    DOCKER_USER=AWS
                    DOCKER_PASSWORD=`aws ecr get-login --region $${AWS_REGION} --registry-ids $${AWS_ACCOUNT} | cut -d' ' -f6`
                    kubectl delete secret extcrcreds || true
                    kubectl create secret docker-registry extcrcreds \
                    --docker-server=$DOCKER_REGISTRY_SERVER \
                    --docker-username=$DOCKER_USER \
                    --docker-password=$DOCKER_PASSWORD \
                    --docker-email=no@email.local
                    kubectl patch serviceaccount default -p '{"imagePullSecrets":[{"name":"extcrcreds"}]}'
                    EOT
                    ,
                  ]
                  "env" = [
                    {
                      "name" = "AWS_ACCESS_KEY_ID"
                      "valueFrom" = {
                        "secretKeyRef" = {
                          "key"  = "access-key-id"
                          "name" = "customer-extcr-key"
                        }
                      }
                    },
                    {
                      "name" = "AWS_SECRET_ACCESS_KEY"
                      "valueFrom" = {
                        "secretKeyRef" = {
                          "key"  = "secret-access-key"
                          "name" = "customer-extcr-key"
                        }
                      }
                    },
                    {
                      "name" = "AWS_ACCOUNT"
                      "valueFrom" = {
                        "secretKeyRef" = {
                          "key"  = "extcr-account"
                          "name" = "customer-extcr-key"
                        }
                      }
                    },
                    {
                      "name" = "AWS_REGION"
                      "valueFrom" = {
                        "secretKeyRef" = {
                          "key"  = "extcr-region"
                          "name" = "customer-extcr-key"
                        }
                      }
                    },
                  ]
                  "image" = "xynova/aws-kubectl"
                  "name"  = "kubectl"
                },
              ]
              "restartPolicy"                 = "Never"
              "serviceAccountName"            = "extcr-cred-updater"
              "terminationGracePeriodSeconds" = 0
            }
          }
        }
      }
      "schedule"                   = "0 */8 * * *"
      "successfulJobsHistoryLimit" = 1
    }
  }
}
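
The CronJob `schedule` field uses standard five-field cron syntax (minute, hour, day of month, month, day of week). A minimal Python sketch of how the minute and hour fields combine follows, assuming the three day and month fields are `*`. It supports only `*`, `*/N`, and a plain integer, and both function names are hypothetical. It illustrates that a `*` in the minute field fires on every minute of each matching hour, whereas `0` fires once per matching hour.

```python
from typing import Set


def expand_field(field: str, low: int, high: int) -> Set[int]:
    """Expand one cron field; supports only '*', '*/N', and a single integer."""
    if field == "*":
        return set(range(low, high + 1))
    if field.startswith("*/"):
        step = int(field[2:])
        return set(range(low, high + 1, step))
    return {int(field)}


def runs_per_day(schedule: str) -> int:
    """Count daily firings, assuming the day and month fields are all '*'."""
    minute, hour, _dom, _month, _dow = schedule.split()
    minutes = expand_field(minute, 0, 59)
    hours = expand_field(hour, 0, 23)
    return len(minutes) * len(hours)


if __name__ == "__main__":
    print(runs_per_day("* */8 * * *"))  # every minute of hours 0, 8, 16
    print(runs_per_day("0 */8 * * *"))  # minute 0 of hours 0, 8, 16
```

For a credential refresh that only needs to happen every eight hours, the second form produces three runs per day instead of 180.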



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/modules/extcr/variables.tf
---

variable "namespace" {
  description = "namespace to deploy extcr"
  type        = string
}

variable "access_key_id" {
  description = "Customer's access key id for external container registry"
  type        = string
}

variable "secret_access_key" {
  description = "Customer's secret access key for external container registry"
  type        = string
}

variable "extcr_account" {
  description = "AWS Account of the external container registry"
  type        = string
}

variable "extcr_region" {
  description = "AWS Region of the external container registry"
  type        = string
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/modules/initialization/main.tf
---

resource "kubernetes_namespace" "main" {
  metadata {
    labels = merge({}, var.labels)

    name = var.namespace
  }
}


resource "kubernetes_secret" "main" {
  count = length(var.secrets)

  metadata {
    name      = var.secrets[count.index].name
    namespace = var.namespace
    labels    = merge({}, var.labels)
  }

  data = var.secrets[count.index].data

  type = "Opaque"
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/modules/initialization/variables.tf
---

variable "namespace" {
  description = "Namespace for all resources deployed"
  type        = string
}

variable "labels" {
  description = "Additional labels to apply to all resources deployed"
  type        = map(string)
  default     = {}
}

variable "secrets" {
  description = "List of secrets (each a name plus a map of key/value data) to store as kubernetes secrets"
  type = list(object({
    name = string
    data = map(string)
  }))
  default = []
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/modules/nvidia-installer/aws-nvidia-installer.tf
---

resource "kubernetes_daemonset" "aws_nvidia_installer" {
  count = var.gpu_enabled && (var.cloud_provider == "aws") ? 1 : 0
  metadata {
    name      = "nvidia-device-plugin-daemonset-1.12"
    namespace = "kube-system"
  }

  spec {
    selector {
      match_labels = {
        name = "nvidia-device-plugin-ds"
      }
    }

    template {
      metadata {
        labels = {
          name = "nvidia-device-plugin-ds"
        }
      }

      spec {
        affinity {
          node_affinity {
            required_during_scheduling_ignored_during_execution {
              node_selector_term {
                match_expressions {
                  key      = "eks.amazonaws.com/nodegroup"
                  operator = "In"
                  values   = var.gpu_node_group_names
                }
              }
            }
          }
        }

        volume {
          name = "device-plugin"

          host_path {
            path = "/var/lib/kubelet/device-plugins"
          }
        }

        container {
          name  = "nvidia-device-plugin-ctr"
          image = "nvidia/k8s-device-plugin:1.11"

          volume_mount {
            name       = "device-plugin"
            mount_path = "/var/lib/kubelet/device-plugins"
          }

          security_context {
            capabilities {
              drop = ["ALL"]
            }
          }
        }

        toleration {
          key      = "CriticalAddonsOnly"
          operator = "Exists"
        }

        toleration {
          key      = "nvidia.com/gpu"
          operator = "Exists"
          effect   = "NoSchedule"
        }
      }
    }

    strategy {
      type = "RollingUpdate"
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/modules/nvidia-installer/gcp-nvidia-installer.tf
---

# source https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#installing_drivers
resource "kubernetes_daemonset" "gcp_nvidia_installer" {
  count = var.gpu_enabled && (var.cloud_provider == "gcp") ? 1 : 0

  metadata {
    name      = "nvidia-driver-installer"
    namespace = "kube-system"
    labels = {
      "k8s-app" = "nvidia-driver-installer"
    }
  }

  spec {
    selector {
      match_labels = {
        "k8s-app" = "nvidia-driver-installer"
      }
    }

    template {
      metadata {
        labels = {
          name      = "nvidia-driver-installer"
          "k8s-app" = "nvidia-driver-installer"
        }
      }

      spec {
        affinity {
          node_affinity {
            required_during_scheduling_ignored_during_execution {
              node_selector_term {
                match_expressions {
                  key      = "cloud.google.com/gke-accelerator"
                  operator = "Exists"
                }
              }
            }
          }
        }
        toleration {
          operator = "Exists"
        }
        host_network = true
        host_pid     = true
        volume {
          name = "dev"
          host_path {
            path = "/dev"
          }
        }
        volume {
          name = "vulkan-icd-mount"
          host_path {
            path = "/home/kubernetes/bin/nvidia/vulkan/icd.d"
          }
        }
        volume {
          name = "nvidia-install-dir-host"
          host_path {
            path = "/home/kubernetes/bin/nvidia"
          }
        }
        volume {
          name = "root-mount"
          host_path {
            path = "/"
          }
        }
        volume {
          name = "cos-tools"
          host_path {
            path = "/var/lib/cos-tools"
          }
        }
        init_container {
          image   = "cos-nvidia-installer:fixed"
          name    = "nvidia-driver-installer"
          command = ["/cos-gpu-installer", "install", "--version=latest"]
          resources {
            requests = {
              cpu = 0.15
            }
          }
          security_context {
            privileged = true
          }
          env {
            name  = "NVIDIA_INSTALL_DIR_HOST"
            value = "/home/kubernetes/bin/nvidia"
          }
          env {
            name  = "NVIDIA_INSTALL_DIR_CONTAINER"
            value = "/usr/local/nvidia"
          }
          env {
            name  = "VULKAN_ICD_DIR_HOST"
            value = "/home/kubernetes/bin/nvidia/vulkan/icd.d"
          }
          env {
            name  = "VULKAN_ICD_DIR_CONTAINER"
            value = "/etc/vulkan/icd.d"
          }
          env {
            name  = "ROOT_MOUNT_DIR"
            value = "/root"
          }
          env {
            name  = "COS_TOOLS_DIR_HOST"
            value = "/var/lib/cos-tools"
          }
          env {
            name  = "COS_TOOLS_DIR_CONTAINER"
            value = "/build/cos-tools"
          }
          volume_mount {
            name       = "nvidia-install-dir-host"
            mount_path = "/usr/local/nvidia"
          }
          volume_mount {
            name       = "vulkan-icd-mount"
            mount_path = "/etc/vulkan/icd.d"
          }
          volume_mount {
            name       = "dev"
            mount_path = "/dev"
          }
          volume_mount {
            name       = "root-mount"
            mount_path = "/root"
          }
          volume_mount {
            name       = "cos-tools"
            mount_path = "/build/cos-tools"
          }
        }
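        # The driver is installed by the init container above; the pause
        # container only keeps the pod running once installation has finished.
        container {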
          image = "gcr.io/google-containers/pause:2.0"
          name  = "pause"
        }
      }
    }

    strategy {
      type = "RollingUpdate"
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/modules/nvidia-installer/variables.tf
---

variable "gpu_node_group_names" {
  description = "Names of node groups with GPU"
  default     = []
}

variable "gpu_enabled" {
  description = "Enable GPU support"
  type        = bool
  default     = false
}

variable "cloud_provider" {
  description = "Name of the cloud provider"
  type        = string
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/modules/traefik_crds/main.tf
---

resource "kubernetes_manifest" "ingress_route" {
  manifest = {
    apiVersion = "apiextensions.k8s.io/v1"
    kind       = "CustomResourceDefinition"
    metadata = {
      name = "ingressroutes.traefik.containo.us"
    }
    spec = {
      group = "traefik.containo.us"
      names = {
        kind     = "IngressRoute"
        plural   = "ingressroutes"
        singular = "ingressroute"
      }
      scope = "Namespaced"
      versions = [
        {
          name    = "v1alpha1"
          served  = true
          storage = true
          schema = {
            openAPIV3Schema = {
              type = "object"
              properties = {
                spec = {
                  type = "object"
                  properties = {
                    routes = {
                      type = "array"
                      items = {
                        type     = "object"
                        required = ["match", "kind"]
                        properties = {
                          match = {
                            type = "string"
                          }
                          kind = {
                            type = "string"
                            enum = ["Rule"]
                          }
                          priority = {
                            type = "integer"
                          }
                          services = {
                            type = "array"
                            items = {
                              type     = "object"
                              required = ["name", "port"]
                              properties = {
                                name = {
                                  type = "string"
                                }
                                kind = {
                                  type = "string"
                                  enum = ["Service", "TraefikService"]
                                }
                                namespace = {
                                  type = "string"
                                }
                                sticky = {
                                  type = "object"
                                  properties = {
                                    cookie = {
                                      type = "object"
                                      properties = {
                                        name = {
                                          type = "string"
                                        }
                                        secure = {
                                          type = "boolean"
                                        }
                                        httpOnly = {
                                          type = "boolean"
                                        }
                                        sameSite = {
                                          type = "string"
                                          enum = ["None", "Lax", "Strict"]
                                        }
                                      }
                                    }
                                  }
                                }
                                port = {
                                  x-kubernetes-int-or-string = true
                                  pattern                    = "^[1-9]\\d*$"
                                }
                                scheme = {
                                  type = "string"
                                  enum = ["http", "https", "h2c"]
                                }
                                strategy = {
                                  type = "string"
                                  enum = ["RoundRobin"]
                                }
                                passHostHeader = {
                                  type = "boolean"
                                }
                                responseForwarding = {
                                  type = "object"
                                  properties = {
                                    flushInterval = {
                                      type = "string"
                                    }
                                  }
                                }
                                weight = {
                                  type = "integer"
                                }
                              }
                            }
                          }
                          middlewares = {
                            type = "array"
                            items = {
                              type     = "object"
                              required = ["name", "namespace"]
                              properties = {
                                name = {
                                  type = "string"
                                }
                                namespace = {
                                  type = "string"
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                    entryPoints = {
                      type = "array"
                      items = {
                        type = "string"
                      }
                    }
                    tls = {
                      type = "object"
                      properties = {
                        secretName = {
                          type = "string"
                        }
                        options = {
                          type     = "object"
                          required = ["name", "namespace"]
                          properties = {
                            name = {
                              type = "string"
                            }
                            namespace = {
                              type = "string"
                            }
                          }
                        }
                        store = {
                          type     = "object"
                          required = ["name", "namespace"]
                          properties = {
                            name = {
                              type = "string"
                            }
                            namespace = {
                              type = "string"
                            }
                          }
                        }
                        certResolver = {
                          type = "string"
                        }
                        domains = {
                          type = "array"
                          items = {
                            type = "object"
                            properties = {
                              main = {
                                type = "string"
                              }
                              sans = {
                                type = "array"
                                items = {
                                  type = "string"
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}
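
# Illustrative sketch only (kept commented out so nothing extra is applied):
# a minimal IngressRoute object that should satisfy the schema above. Each
# route sets the required "match" and "kind" fields, and each service sets the
# required "name" and "port" fields. The resource name, namespace, entry point,
# host, and service name are hypothetical placeholders, not values taken from
# this repository.
#
# resource "kubernetes_manifest" "example_ingress_route" {
#   manifest = {
#     apiVersion = "traefik.containo.us/v1alpha1"
#     kind       = "IngressRoute"
#     metadata = {
#       name      = "example"
#       namespace = "default"
#     }
#     spec = {
#       entryPoints = ["websecure"]
#       routes = [
#         {
#           kind  = "Rule"
#           match = "Host(`example.com`)"
#           services = [
#             {
#               name = "example-service"
#               port = 80
#             }
#           ]
#         }
#       ]
#     }
#   }
# }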


resource "kubernetes_manifest" "ingress_route_tcp" {
  manifest = {
    apiVersion = "apiextensions.k8s.io/v1"
    kind       = "CustomResourceDefinition"
    metadata = {
      name = "ingressroutetcps.traefik.containo.us"
    }
    spec = {
      group = "traefik.containo.us"
      names = {
        kind     = "IngressRouteTCP"
        plural   = "ingressroutetcps"
        singular = "ingressroutetcp"
      }
      scope = "Namespaced"
      versions = [
        {
          name    = "v1alpha1"
          served  = true
          storage = true
          schema = {
            openAPIV3Schema = {
              type = "object"
              properties = {
                spec = {
                  type = "object"
                  properties = {
                    routes = {
                      type = "array"
                      items = {
                        type = "object"
                        properties = {
                          match = {
                            type = "string"
                          }
                          services = {
                            type = "array"
                            items = {
                              type     = "object"
                              required = ["name", "port"]
                              properties = {
                                name = {
                                  type = "string"
                                }
                                namespace = {
                                  type = "string"
                                }
                                port = {
                                  x-kubernetes-int-or-string = true
                                  pattern                    = "^[1-9]\\d*$"
                                }
                                weight = {
                                  type = "integer"
                                }
                                terminationDelay = {
                                  type = "integer"
                                }
                                proxyProtocol = {
                                  type     = "object"
                                  required = ["version"]
                                  properties = {
                                    version = {
                                      type    = "integer"
                                      minimum = 1
                                      maximum = 2
                                    }
                                  }
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                    entryPoints = {
                      type = "array"
                      items = {
                        type = "string"
                      }
                    }
                    tls = {
                      type = "object"
                      properties = {
                        secretName = {
                          type = "string"
                        }
                        passthrough = {
                          type = "boolean"
                        }
                        options = {
                          type     = "object"
                          required = ["name", "namespace"]
                          properties = {
                            name = {
                              type = "string"
                            }
                            namespace = {
                              type = "string"
                            }
                          }
                        }
                        store = {
                          type     = "object"
                          required = ["name", "namespace"]
                          properties = {
                            name = {
                              type = "string"
                            }
                            namespace = {
                              type = "string"
                            }
                          }
                        }
                        certResolver = {
                          type = "string"
                        }
                        domains = {
                          type = "array"
                          items = {
                            type     = "object"
                            required = ["main"]
                            properties = {
                              main = {
                                type = "string"
                              }
                              sans = {
                                type = "array"
                                items = {
                                  type = "string"
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}


resource "kubernetes_manifest" "ingress_route_udp" {
  manifest = {
    apiVersion = "apiextensions.k8s.io/v1"
    kind       = "CustomResourceDefinition"
    metadata = {
      name = "ingressrouteudps.traefik.containo.us"
    }
    spec = {
      group = "traefik.containo.us"
      names = {
        kind     = "IngressRouteUDP"
        plural   = "ingressrouteudps"
        singular = "ingressrouteudp"
      }
      scope = "Namespaced"
      versions = [
        {
          name    = "v1alpha1"
          served  = true
          storage = true
          schema = {
            openAPIV3Schema = {
              type = "object"
              properties = {
                spec = {
                  type = "object"
                  properties = {
                    routes = {
                      type = "array"
                      items = {
                        type = "object"
                        properties = {
                          services = {
                            type = "array"
                            items = {
                              type     = "object"
                              required = ["name"]
                              properties = {
                                name = {
                                  type = "string"
                                }
                                namespace = {
                                  type = "string"
                                }
                                port = {
                                  x-kubernetes-int-or-string = true
                                  pattern                    = "^[1-9]\\d*$"
                                }
                                weight = {
                                  type = "integer"
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                    entryPoints = {
                      type = "array"
                      items = {
                        type = "string"
                      }
                    }
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}


resource "kubernetes_manifest" "middleware" {
  manifest = {
    apiVersion = "apiextensions.k8s.io/v1"
    kind       = "CustomResourceDefinition"
    metadata = {
      name = "middlewares.traefik.containo.us"
    }
    spec = {
      group = "traefik.containo.us"
      names = {
        kind     = "Middleware"
        plural   = "middlewares"
        singular = "middleware"
      }
      scope = "Namespaced"
      versions = [
        {
          name    = "v1alpha1"
          served  = true
          storage = true
          schema = {
            openAPIV3Schema = {
              type = "object"
              properties = {
                spec = {
                  type = "object"
                  properties = {
                    addPrefix = {
                      type = "object"
                      properties = {
                        prefix = {
                          type = "string"
                        }
                      }
                    }
                    stripPrefix = {
                      type = "object"
                      properties = {
                        prefixes = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                        forceSlash = {
                          type = "boolean"
                        }
                      }
                    }
                    stripPrefixRegex = {
                      type = "object"
                      properties = {
                        regex = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                      }
                    }
                    replacePath = {
                      type = "object"
                      properties = {
                        path = {
                          type = "string"
                        }
                      }
                    }
                    replacePathRegex = {
                      type = "object"
                      properties = {
                        regex = {
                          type = "string"
                        }
                        replacement = {
                          type = "string"
                        }
                      }
                    }
                    chain = {
                      type = "object"
                      properties = {
                        middlewares = {
                          type = "array"
                          items = {
                            type     = "object"
                            required = ["name", "namespace"]
                            properties = {
                              name = {
                                type = "string"
                              }
                              namespace = {
                                type = "string"
                              }
                            }
                          }
                        }
                      }
                    }
                    ipWhiteList = {
                      type = "object"
                      properties = {
                        sourceRange = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                        ipStrategy = {
                          type = "object"
                          properties = {
                            depth = {
                              type = "integer"
                            }
                            excludedIPs = {
                              type = "array"
                              items = {
                                type = "string"
                              }
                            }
                          }
                        }
                      }
                    }
                    headers = {
                      type = "object"
                      properties = {
                        customRequestHeaders = {
                          additionalProperties = {
                            type = "string"
                          }
                          type = "object"
                        }
                        customResponseHeaders = {
                          additionalProperties = {
                            type = "string"
                          }
                          type = "object"
                        }
                        accessControlAllowCredentials = {
                          type = "boolean"
                        }
                        accessControlAllowHeaders = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                        accessControlAllowMethods = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                        accessControlAllowOrigin = {
                          type = "string"
                        }
                        accessControlAllowOriginList = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                        accessControlExposeHeaders = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                        accessControlMaxAge = {
                          type = "integer"
                        }
                        addVaryHeader = {
                          type = "boolean"
                        }
                        allowedHosts = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                        hostsProxyHeaders = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                        sslRedirect = {
                          type = "boolean"
                        }
                        sslTemporaryRedirect = {
                          type = "boolean"
                        }
                        sslHost = {
                          type = "string"
                        }
                        sslProxyHeaders = {
                          additionalProperties = {
                            type = "string"
                          }
                          type = "object"
                        }
                        sslForceHost = {
                          type = "boolean"
                        }
                        stsSeconds = {
                          type = "integer"
                        }
                        stsIncludeSubdomains = {
                          type = "boolean"
                        }
                        stsPreload = {
                          type = "boolean"
                        }
                        forceSTSheader = {
                          type = "boolean"
                        }
                        frameDeny = {
                          type = "boolean"
                        }
                        customFrameOptionsValue = {
                          type = "string"
                        }
                        contentTypeNosniff = {
                          type = "boolean"
                        }
                        browserXssFilter = {
                          type = "boolean"
                        }
                        customBrowserXSSValue = {
                          type = "string"
                        }
                        contentSecurityPolicy = {
                          type = "string"
                        }
                        publicKey = {
                          type = "string"
                        }
                        referrerPolicy = {
                          type = "string"
                        }
                        featurePolicy = {
                          type = "string"
                        }
                        isDevelopment = {
                          type = "boolean"
                        }
                      }
                    }
                    errors = {
                      type = "object"
                      properties = {
                        status = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                        service = {
                          type = "object"
                          properties = {
                            sticky = {
                              type = "object"
                              properties = {
                                cookie = {
                                  type = "object"
                                  properties = {
                                    name = {
                                      type = "string"
                                    }
                                    secure = {
                                      type = "boolean"
                                    }
                                    httpOnly = {
                                      type = "boolean"
                                    }
                                  }
                                }
                              }
                            }
                            namespace = {
                              type = "string"
                            }
                            kind = {
                              type = "string"
                            }
                            name = {
                              type = "string"
                            }
                            weight = {
                              type = "integer"
                            }
                            responseForwarding = {
                              type = "object"
                              properties = {
                                flushInterval = {
                                  type = "string"
                                }
                              }
                            }
                            passHostHeader = {
                              type = "boolean"
                            }
                            healthCheck = {
                              type = "object"
                              properties = {
                                path = {
                                  type = "string"
                                }
                                host = {
                                  type = "string"
                                }
                                scheme = {
                                  type = "string"
                                }
                                intervalSeconds = {
                                  type = "integer"
                                }
                                timeoutSeconds = {
                                  type = "integer"
                                }
                                headers = {
                                  type = "object"
                                }
                              }
                            }
                            strategy = {
                              type = "string"
                            }
                            scheme = {
                              type = "string"
                            }
                            port = {
                              type = "integer"
                            }
                          }
                        }
                        query = {
                          type = "string"
                        }
                      }
                    }
                    rateLimit = {
                      type = "object"
                      properties = {
                        average = {
                          type = "integer"
                        }
                        burst = {
                          type = "integer"
                        }
                        sourceCriterion = {
                          type = "object"
                          properties = {
                            ipStrategy = {
                              type = "object"
                              properties = {
                                depth = {
                                  type = "integer"
                                }
                                excludedIPs = {
                                  type = "array"
                                  items = {
                                    type = "string"
                                  }
                                }
                              }
                            }
                            requestHeaderName = {
                              type = "string"
                            }
                            requestHost = {
                              type = "boolean"
                            }
                          }
                        }
                      }
                    }
                    redirectRegex = {
                      type = "object"
                      properties = {
                        regex = {
                          type = "string"
                        }
                        replacement = {
                          type = "string"
                        }
                        permanent = {
                          type = "boolean"
                        }
                      }
                    }
                    redirectScheme = {
                      type = "object"
                      properties = {
                        scheme = {
                          type = "string"
                        }
                        port = {
                          type = "string"
                        }
                        permanent = {
                          type = "boolean"
                        }
                      }
                    }
                    basicAuth = {
                      type = "object"
                      properties = {
                        secret = {
                          type = "string"
                        }
                        realm = {
                          type = "string"
                        }
                        removeHeader = {
                          type = "boolean"
                        }
                        headerField = {
                          type = "string"
                        }
                      }
                    }
                    digestAuth = {
                      type = "object"
                      properties = {
                        secret = {
                          type = "string"
                        }
                        removeHeader = {
                          type = "boolean"
                        }
                        realm = {
                          type = "string"
                        }
                        headerField = {
                          type = "string"
                        }
                      }
                    }
                    forwardAuth = {
                      type = "object"
                      properties = {
                        address = {
                          type = "string"
                        }
                        trustForwardHeader = {
                          type = "boolean"
                        }
                        authResponseHeaders = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                        tls = {
                          type = "object"
                          properties = {
                            caSecret = {
                              type = "string"
                            }
                            caOptional = {
                              type = "boolean"
                            }
                            certSecret = {
                              type = "string"
                            }
                            insecureSkipVerify = {
                              type = "boolean"
                            }
                          }
                        }
                      }
                    }
                    inFlightReq = {
                      type = "object"
                      properties = {
                        amount = {
                          type = "integer"
                        }
                        sourceCriterion = {
                          type = "object"
                          properties = {
                            ipStrategy = {
                              type = "object"
                              properties = {
                                depth = {
                                  type = "integer"
                                }
                                excludedIPs = {
                                  type = "array"
                                  items = {
                                    type = "string"
                                  }
                                }
                              }
                            }
                            requestHeaderName = {
                              type = "string"
                            }
                            requestHost = {
                              type = "boolean"
                            }
                          }
                        }
                      }
                    }
                    buffering = {
                      type = "object"
                      properties = {
                        maxRequestBodyBytes = {
                          type = "integer"
                        }
                        memRequestBodyBytes = {
                          type = "integer"
                        }
                        maxResponseBodyBytes = {
                          type = "integer"
                        }
                        memResponseBodyBytes = {
                          type = "integer"
                        }
                        retryExpression = {
                          type = "string"
                        }
                      }
                    }
                    circuitBreaker = {
                      type = "object"
                      properties = {
                        expression = {
                          type = "string"
                        }
                      }
                    }
                    compress = {
                      type = "object"
                      properties = {
                        excludedContentTypes = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                      }
                    }
                    passTLSClientCert = {
                      type = "object"
                      properties = {
                        pem = {
                          type = "boolean"
                        }
                        info = {
                          type = "object"
                          properties = {
                            notAfter = {
                              type = "boolean"
                            }
                            notBefore = {
                              type = "boolean"
                            }
                            sans = {
                              type = "boolean"
                            }
                            subject = {
                              type = "object"
                              properties = {
                                country = {
                                  type = "boolean"
                                }
                                province = {
                                  type = "boolean"
                                }
                                locality = {
                                  type = "boolean"
                                }
                                organization = {
                                  type = "boolean"
                                }
                                commonName = {
                                  type = "boolean"
                                }
                                serialNumber = {
                                  type = "boolean"
                                }
                                domainComponent = {
                                  type = "boolean"
                                }
                              }
                            }
                            issuer = {
                              type = "object"
                              properties = {
                                country = {
                                  type = "boolean"
                                }
                                province = {
                                  type = "boolean"
                                }
                                locality = {
                                  type = "boolean"
                                }
                                organization = {
                                  type = "boolean"
                                }
                                commonName = {
                                  type = "boolean"
                                }
                                serialNumber = {
                                  type = "boolean"
                                }
                                domainComponent = {
                                  type = "boolean"
                                }
                              }
                            }
                            serialNumber = {
                              type = "boolean"
                            }
                          }
                        }
                      }
                    }
                    retry = {
                      type = "object"
                      properties = {
                        attempts = {
                          type = "integer"
                        }
                      }
                    }
                    contentType = {
                      type = "object"
                      properties = {
                        autoDetect = {
                          type = "boolean"
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}

resource "kubernetes_manifest" "middlewaretcp" {
  manifest = {
    apiVersion = "apiextensions.k8s.io/v1"
    kind       = "CustomResourceDefinition"
    metadata = {
      name = "middlewaretcps.traefik.containo.us"
    }
    spec = {
      group = "traefik.containo.us"
      names = {
        kind     = "MiddlewareTCP"
        listKind = "MiddlewareTCPList"
        plural   = "middlewaretcps"
        singular = "middlewaretcp"
      }
      scope = "Namespaced"
      versions = [
        {
          name    = "v1alpha1"
          served  = true
          storage = true
          schema = {
            openAPIV3Schema = {
              type = "object"
              properties = {
                spec = {
                  type = "object"
                  properties = {
                    inFlightConn = {
                      type = "object"
                      properties = {
                        amount = {
                          type = "integer"
                        }
                      }
                    }
                    ipWhiteList = {
                      type = "object"
                      properties = {
                        sourceRange = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}


resource "kubernetes_manifest" "serverstransports" {
  manifest = {
    apiVersion = "apiextensions.k8s.io/v1"
    kind       = "CustomResourceDefinition"
    metadata = {
      name = "serverstransports.traefik.containo.us"
    }
    spec = {
      group = "traefik.containo.us"
      names = {
        kind     = "ServersTransport"
        plural   = "serverstransports"
        singular = "serverstransport"
      }
      scope = "Namespaced"
      versions = [
        {
          name    = "v1alpha1"
          served  = true
          storage = true
          schema = {
            openAPIV3Schema = {
              type = "object"
              properties = {
                spec = {
                  type = "object"
                  properties = {
                    serverName = {
                      type = "string"
                    }
                    insecureSkipVerify = {
                      type = "boolean"
                    }
                    rootCAsSecrets = {
                      type = "array"
                      items = {
                        type = "string"
                      }
                    }
                    certificatesSecrets = {
                      type = "array"
                      items = {
                        type = "string"
                      }
                    }
                    maxIdleConnsPerHost = {
                      type = "integer"
                    }
                    forwardingTimeouts = {
                      type = "object"
                      properties = {
                        dialTimeout = {
                          x-kubernetes-int-or-string = true
                          pattern                    = "^[1-9](\\d+)?(ns|us|µs|μs|ms|s|m|h)?$"
                        }
                        responseHeaderTimeout = {
                          x-kubernetes-int-or-string = true
                          pattern                    = "^[1-9](\\d+)?(ns|us|µs|μs|ms|s|m|h)?$"
                        }
                        idleConnTimeout = {
                          x-kubernetes-int-or-string = true
                          pattern                    = "^[1-9](\\d+)?(ns|us|µs|μs|ms|s|m|h)?$"
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}


resource "kubernetes_manifest" "tls_option" {
  manifest = {
    apiVersion = "apiextensions.k8s.io/v1"
    kind       = "CustomResourceDefinition"
    metadata = {
      name = "tlsoptions.traefik.containo.us"
    }
    spec = {
      group = "traefik.containo.us"
      names = {
        kind     = "TLSOption"
        plural   = "tlsoptions"
        singular = "tlsoption"
      }
      scope = "Namespaced"
      versions = [
        {
          name    = "v1alpha1"
          served  = true
          storage = true
          schema = {
            openAPIV3Schema = {
              type = "object"
              properties = {
                spec = {
                  type = "object"
                  properties = {
                    minVersion = {
                      type = "string"
                    }
                    maxVersion = {
                      type = "string"
                    }
                    cipherSuites = {
                      type = "array"
                      items = {
                        type = "string"
                      }
                    }
                    curvePreferences = {
                      type = "array"
                      items = {
                        type = "string"
                      }
                    }
                    clientAuth = {
                      type = "object"
                      properties = {
                        clientAuthType = {
                          type = "string"
                          enum = ["NoClientCert", "RequestClientCert", "VerifyClientCertIfGiven", "RequireAndVerifyClientCert"]
                        }
                        secretNames = {
                          type = "array"
                          items = {
                            type = "string"
                          }
                        }
                        sniStrict = {
                          type = "boolean"
                        }
                        preferServerCipherSuites = {
                          type = "boolean"
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}


resource "kubernetes_manifest" "tls_stores" {
  manifest = {
    apiVersion = "apiextensions.k8s.io/v1"
    kind       = "CustomResourceDefinition"
    metadata = {
      name = "tlsstores.traefik.containo.us"
    }
    spec = {
      group = "traefik.containo.us"
      names = {
        kind     = "TLSStore"
        plural   = "tlsstores"
        singular = "tlsstore"
      }
      scope = "Namespaced"
      versions = [
        {
          name    = "v1alpha1"
          served  = true
          storage = true
          schema = {
            openAPIV3Schema = {
              type = "object"
              properties = {
                spec = {
                  type = "object"
                  properties = {
                    defaultCertificate = {
                      type = "object"
                      properties = {
                        secretName = {
                          type = "string"
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}


resource "kubernetes_manifest" "traefik_service" {
  manifest = {
    apiVersion = "apiextensions.k8s.io/v1"
    kind       = "CustomResourceDefinition"
    metadata = {
      name = "traefikservices.traefik.containo.us"
    }
    spec = {
      group = "traefik.containo.us"
      names = {
        kind     = "TraefikService"
        plural   = "traefikservices"
        singular = "traefikservice"
      }
      scope = "Namespaced"
      versions = [
        {
          name    = "v1alpha1"
          served  = true
          storage = true
          schema = {
            openAPIV3Schema = {
              type = "object"
              properties = {
                spec = {
                  type = "object"
                  properties = {
                    weighted = {
                      type = "object"
                      properties = {
                        services = {
                          type = "array"
                          items = {
                            type = "object"
                            properties = {
                              sticky = {
                                type = "object"
                                properties = {
                                  cookie = {
                                    type = "object"
                                    properties = {
                                      name = {
                                        type = "string"
                                      }
                                      secure = {
                                        type = "boolean"
                                      }
                                      httpOnly = {
                                        type = "boolean"
                                      }
                                    }
                                  }
                                }
                              }
                              namespace = {
                                type = "string"
                              }
                              kind = {
                                type = "string"
                              }
                              name = {
                                type = "string"
                              }
                              weight = {
                                type = "integer"
                              }
                              responseForwarding = {
                                type = "object"
                                properties = {
                                  flushInterval = {
                                    type = "string"
                                  }
                                }
                              }
                              passHostHeader = {
                                type = "boolean"
                              }
                              healthCheck = {
                                type = "object"
                                properties = {
                                  path = {
                                    type = "string"
                                  }
                                  host = {
                                    type = "string"
                                  }
                                  scheme = {
                                    type = "string"
                                  }
                                  intervalSeconds = {
                                    type = "integer"
                                  }
                                  timeoutSeconds = {
                                    type = "integer"
                                  }
                                  headers = {
                                    type = "object"
                                  }
                                }
                              }
                              strategy = {
                                type = "string"
                              }
                              scheme = {
                                type = "string"
                              }
                              port = {
                                type = "integer"
                              }
                            }
                          }
                        }
                        sticky = {
                          type = "object"
                          properties = {
                            cookie = {
                              type = "object"
                              properties = {
                                name = {
                                  type = "string"
                                }
                                secure = {
                                  type = "boolean"
                                }
                                httpOnly = {
                                  type = "boolean"
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                    mirroring = {
                      type = "object"
                      properties = {
                        weight = {
                          type = "integer"
                        }
                        responseForwarding = {
                          type = "object"
                          properties = {
                            flushInterval = {
                              type = "string"
                            }
                          }
                        }
                        passHostHeader = {
                          type = "boolean"
                        }
                        healthCheck = {
                          type = "object"
                          properties = {
                            path = {
                              type = "string"
                            }
                            host = {
                              type = "string"
                            }
                            scheme = {
                              type = "string"
                            }
                            intervalSeconds = {
                              type = "integer"
                            }
                            timeoutSeconds = {
                              type = "integer"
                            }
                            headers = {
                              type = "object"
                            }
                          }
                        }
                        strategy = {
                          type = "string"
                        }
                        scheme = {
                          type = "string"
                        }
                        port = {
                          type = "integer"
                        }
                        sticky = {
                          type = "object"
                          properties = {
                            cookie = {
                              type = "object"
                              properties = {
                                name = {
                                  type = "string"
                                }
                                secure = {
                                  type = "boolean"
                                }
                                httpOnly = {
                                  type = "boolean"
                                }
                              }
                            }
                          }
                        }
                        namespace = {
                          type = "string"
                        }
                        kind = {
                          type = "string"
                        }
                        name = {
                          type = "string"
                        }
                        mirrors = {
                          type = "array"
                          items = {
                            type = "object"
                            properties = {
                              name = {
                                type = "string"
                              }
                              kind = {
                                type = "string"
                              }
                              namespace = {
                                type = "string"
                              }
                              sticky = {
                                type = "object"
                                properties = {
                                  cookie = {
                                    type = "object"
                                    properties = {
                                      name = {
                                        type = "string"
                                      }
                                      secure = {
                                        type = "boolean"
                                      }
                                      httpOnly = {
                                        type = "boolean"
                                      }
                                    }
                                  }
                                }
                              }
                              port = {
                                type = "integer"
                              }
                              scheme = {
                                type = "string"
                              }
                              strategy = {
                                type = "string"
                              }
                              healthCheck = {
                                type = "object"
                                properties = {
                                  path = {
                                    type = "string"
                                  }
                                  host = {
                                    type = "string"
                                  }
                                  scheme = {
                                    type = "string"
                                  }
                                  intervalSeconds = {
                                    type = "integer"
                                  }
                                  timeoutSeconds = {
                                    type = "integer"
                                  }
                                  headers = {
                                    type = "object"
                                  }
                                }
                              }
                              passHostHeader = {
                                type = "boolean"
                              }
                              responseForwarding = {
                                type = "object"
                                properties = {
                                  flushInterval = {
                                    type = "string"
                                  }
                                  weight = {
                                    type = "integer"
                                  }
                                  percent = {
                                    type = "integer"
                                  }
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/external-container-registry.tf
---

module "external-container-reg" {
  count = var.external_container_reg.enabled ? 1 : 0

  source = "./modules/extcr"

  namespace         = var.environment
  access_key_id     = var.external_container_reg.access_key_id
  secret_access_key = var.external_container_reg.secret_access_key
  extcr_account     = var.external_container_reg.extcr_account
  extcr_region      = var.external_container_reg.extcr_region

  depends_on = [module.kubernetes-initialization]
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/locals.tf
---

locals {
  additional_tags = {
    Project     = var.name
    Owner       = "terraform"
    Environment = var.environment
  }

  cluster_name = "${var.name}-${var.environment}"
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/main.tf
---

module "kubernetes-initialization" {
  source = "./modules/initialization"

  namespace = var.environment
  secrets   = []
}

module "kubernetes-autoscaling" {
  count = var.cloud_provider == "aws" ? 1 : 0

  source = "./modules/cluster-autoscaler"

  namespace = var.environment

  aws_region   = var.aws_region
  cluster-name = local.cluster_name
}

module "traefik-crds" {
  source = "./modules/traefik_crds"
}

module "nvidia-driver-installer" {
  count = var.gpu_enabled ? 1 : 0

  source = "./modules/nvidia-installer"

  cloud_provider       = var.cloud_provider
  gpu_enabled          = var.gpu_enabled
  gpu_node_group_names = var.gpu_node_group_names
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/variables.tf
---

variable "name" {
  description = "Prefix name to assign to nebari resources"
  type        = string
}

variable "environment" {
  description = "Namespace in which to create Kubernetes resources"
  type        = string
}

variable "cloud_provider" {
  description = "Cloud provider being used in deployment"
  type        = string
}

variable "aws_region" {
  description = "AWS region if cloud provider is AWS"
  type        = string
}

variable "external_container_reg" {
  description = "External container registry"
}

variable "gpu_enabled" {
  description = "Enable GPU support"
  type        = bool
}

variable "gpu_node_group_names" {
  description = "Names of node groups with GPU"
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/template/versions.tf
---

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "2.1.2"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.35.1"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/kubernetes_initialize/__init__.py
---

import sys
from typing import Any, Dict, List, Optional, Type

from pydantic import model_validator

from _nebari.stages.base import NebariTerraformStage
from _nebari.stages.tf_objects import (
    NebariHelmProvider,
    NebariKubernetesProvider,
    NebariTerraformState,
)
from nebari import schema
from nebari.hookspecs import NebariStage, hookimpl


class ExtContainerReg(schema.Base):
    enabled: bool = False
    access_key_id: Optional[str] = None
    secret_access_key: Optional[str] = None
    extcr_account: Optional[str] = None
    extcr_region: Optional[str] = None

    @model_validator(mode="after")
    def enabled_must_have_fields(self):
        """Require every registry credential field to be non-blank when enabled."""
        if self.enabled:
            for fldname in (
                "access_key_id",
                "secret_access_key",
                "extcr_account",
                "extcr_region",
            ):
                value = getattr(self, fldname)
                if value is None or value.strip() == "":
                    raise ValueError(
                        f"external_container_reg must contain a non-blank {fldname} when enabled is true"
                    )
        return self


class InputVars(schema.Base):
    name: str
    environment: str
    cloud_provider: str
    aws_region: Optional[str] = None
    external_container_reg: Optional[ExtContainerReg] = None
    gpu_enabled: bool = False
    gpu_node_group_names: List[str] = []


class InputSchema(schema.Base):
    external_container_reg: ExtContainerReg = ExtContainerReg()


class OutputSchema(schema.Base):
    pass


class KubernetesInitializeStage(NebariTerraformStage):
    name = "03-kubernetes-initialize"
    priority = 30

    input_schema = InputSchema
    output_schema = OutputSchema

    def tf_objects(self) -> List[Dict]:
        return [
            NebariTerraformState(self.name, self.config),
            NebariKubernetesProvider(self.config),
            NebariHelmProvider(self.config),
        ]

    def input_vars(self, stage_outputs: Dict[str, Dict[str, Any]]):
        input_vars = InputVars(
            name=self.config.project_name,
            environment=self.config.namespace,
            cloud_provider=self.config.provider.value,
            external_container_reg=self.config.external_container_reg.model_dump(),
        )

        if self.config.provider == schema.ProviderEnum.gcp:
            input_vars.gpu_enabled = any(
                node_group.guest_accelerators
                for node_group in self.config.google_cloud_platform.node_groups.values()
            )

        elif self.config.provider == schema.ProviderEnum.aws:
            input_vars.gpu_enabled = any(
                node_group.gpu
                for node_group in self.config.amazon_web_services.node_groups.values()
            )
            input_vars.gpu_node_group_names = list(
                self.config.amazon_web_services.node_groups.keys()
            )
            input_vars.aws_region = self.config.amazon_web_services.region

        return input_vars.model_dump()

    def check(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        from kubernetes import client, config
        from kubernetes.client.rest import ApiException

        config.load_kube_config(
            config_file=stage_outputs["stages/02-infrastructure"][
                "kubeconfig_filename"
            ]["value"]
        )

        try:
            api_instance = client.CoreV1Api()
            result = api_instance.list_namespace()
        except ApiException:
            print(
                f"ERROR: After stage={self.name} unable to connect to kubernetes cluster"
            )
            sys.exit(1)

        namespaces = {ns.metadata.name for ns in result.items}
        if self.config.namespace not in namespaces:
            print(
                f"ERROR: After stage={self.name} namespace={self.config.namespace} not provisioned within kubernetes cluster"
            )
            sys.exit(1)

        print(f"After stage={self.name} kubernetes initialized successfully")


@hookimpl
def nebari_stage() -> List[Type[NebariStage]]:
    return [KubernetesInitializeStage]



---
File: nebari/src/_nebari/stages/kubernetes_keycloak/template/modules/kubernetes/keycloak-helm/main.tf
---

resource "helm_release" "keycloak" {
  name      = "keycloak"
  namespace = var.namespace

  repository = "https://codecentric.github.io/helm-charts"
  chart      = "keycloak"
  version    = "15.0.2"

  values = concat([
    # https://github.com/codecentric/helm-charts/blob/keycloak-15.0.2/charts/keycloak/values.yaml
    file("${path.module}/values.yaml"),
    jsonencode({
      nodeSelector = {
        "${var.node_group.key}" = var.node_group.value
      }
      postgresql = {
        primary = {
          nodeSelector = {
            "${var.node_group.key}" = var.node_group.value
          }
        }
      }
    })
  ], var.overrides)

  set_sensitive {
    name  = "nebari_bot_password"
    value = var.nebari-bot-password
  }

  set_sensitive {
    name  = "initial_root_password"
    value = var.initial_root_password
  }
}


resource "kubernetes_manifest" "keycloak-http" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "IngressRoute"
    metadata = {
      name      = "keycloak-http"
      namespace = var.namespace
    }
    spec = {
      entryPoints = ["websecure"]
      routes = [
        {
          kind  = "Rule"
          match = "Host(`${var.external-url}`) && PathPrefix(`/auth`)"
          services = [
            {
              name = "keycloak-headless"
              # Really not sure why service port 80 works here (the Keycloak container listens on 8080)
              port      = 80
              namespace = var.namespace
            }
          ]
        }
      ]
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak/template/modules/kubernetes/keycloak-helm/outputs.tf
---

output "credentials" {
  description = "keycloak admin credentials"
  sensitive   = true
  value = {
    url       = "https://${var.external-url}"
    client_id = "admin-cli"
    realm     = "master"
    username  = "nebari-bot"
    password  = var.nebari-bot-password
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak/template/modules/kubernetes/keycloak-helm/values.yaml
---

# https://github.com/codecentric/helm-charts/blob/keycloak-15.0.2/charts/keycloak/values.yaml
ingress:
  # Helm chart (14.0 anyway) will only define Ingress records, not IngressRoute as required by Traefik, so
  # we will need to define our own IngressRoute elsewhere.
  enabled: false

image:
  repository: quay.io/keycloak/keycloak

imagePullSecrets:
  - name: "extcrcreds"

extraEnv: |
  - name: PROXY_ADDRESS_FORWARDING
    value: "true"

startupScripts:
  keycloak.cli: |
    {{- .Files.Get "scripts/keycloak.cli" | nindent 2 }}

  nebariadminuser.sh: |
    /opt/jboss/keycloak/bin/add-user-keycloak.sh -r master -u root -p "{{ .Values.initial_root_password }}"
    /opt/jboss/keycloak/bin/add-user-keycloak.sh -r master -u nebari-bot -p "{{ .Values.nebari_bot_password }}"

extraInitContainers: |
  - command:
    - sh
    - -c
    - |
      if [ ! -f /data/keycloak-metrics-spi-2.5.3.jar ]; then
        wget https://github.com/aerogear/keycloak-metrics-spi/releases/download/2.5.3/keycloak-metrics-spi-2.5.3.jar -P /data/ &&
        export SHA256SUM=9b3f52f842a66dadf5ff3cc3a729b8e49042d32f84510a5d73d41a2e39f29a96 &&
        if ! (echo "$SHA256SUM  /data/keycloak-metrics-spi-2.5.3.jar" | sha256sum -c)
          then
            echo "Error: Checksum not verified" && exit 1
          else
            chown 1000:1000 /data/keycloak-metrics-spi-2.5.3.jar &&
            chmod 777 /data/keycloak-metrics-spi-2.5.3.jar
        fi
      else
        echo "File /data/keycloak-metrics-spi-2.5.3.jar already exists. Skipping download."
      fi
    image: busybox:1.36
    name: initialize-spi-metrics-jar
    securityContext:
      runAsUser: 0
    volumeMounts:
      - name: metrics-plugin
        mountPath: /data

extraVolumeMounts: |
  - name: metrics-plugin
    mountPath: /opt/jboss/keycloak/providers/

extraVolumes: |
  - name: metrics-plugin
    emptyDir: {}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak/template/modules/kubernetes/keycloak-helm/variables.tf
---

variable "namespace" {
  description = "Namespace for Keycloak deployment"
  type        = string
}

variable "external-url" {
  description = "External public URL at which the cluster is accessible"
  type        = string
}

variable "overrides" {
  description = "Keycloak helm chart list of overrides"
  type        = list(string)
  default     = []
}

variable "nebari-bot-password" {
  description = "nebari-bot password for keycloak"
  type        = string
}

variable "initial_root_password" {
  description = "initial root password for keycloak"
  type        = string
}

variable "node_group" {
  description = "Node selector key/value pair used to bind resources to the general node group"
  type = object({
    key   = string
    value = string
  })
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak/template/main.tf
---

resource "random_password" "keycloak-nebari-bot-password" {
  length  = 32
  special = false
}

module "kubernetes-keycloak-helm" {
  source = "./modules/kubernetes/keycloak-helm"

  namespace = var.environment

  external-url = var.endpoint

  nebari-bot-password = random_password.keycloak-nebari-bot-password.result

  initial_root_password = var.initial_root_password

  overrides = var.overrides

  node_group = var.node_group
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak/template/outputs.tf
---

output "keycloak_credentials" {
  description = "keycloak admin credentials"
  sensitive   = true
  value       = module.kubernetes-keycloak-helm.credentials
}

# At this point this might be redundant, see `nebari-bot-password` in ./modules/kubernetes/keycloak-helm/variables.tf
output "keycloak_nebari_bot_password" {
  description = "keycloak nebari-bot credentials"
  sensitive   = true
  value       = random_password.keycloak-nebari-bot-password.result
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak/template/variables.tf
---

variable "name" {
  description = "Prefix name to assign to keycloak kubernetes resources"
  type        = string
}

variable "environment" {
  description = "Kubernetes namespace to deploy keycloak"
  type        = string
}

variable "endpoint" {
  description = "nebari cluster endpoint"
  type        = string
}

variable "initial_root_password" {
  description = "Keycloak root user password"
  type        = string
}

variable "overrides" {
  # https://github.com/codecentric/helm-charts/blob/master/charts/keycloak/values.yaml
  description = "Keycloak helm chart overrides"
  type        = list(string)
}

variable "node_group" {
  description = "Node selector key/value pair used to bind resources to the general node group"
  type = object({
    key   = string
    value = string
  })
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak/template/versions.tf
---

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "2.1.2"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.35.1"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak/__init__.py
---

import contextlib
import enum
import json
import os
import secrets
import string
import sys
import time
from typing import Any, Dict, List, Optional, Type, Union

from pydantic import Field, ValidationInfo, field_validator

from _nebari.stages.base import NebariTerraformStage
from _nebari.stages.tf_objects import (
    NebariHelmProvider,
    NebariKubernetesProvider,
    NebariTerraformState,
)
from _nebari.utils import modified_environ
from nebari import schema
from nebari.hookspecs import NebariStage, hookimpl

NUM_ATTEMPTS = 10  # maximum number of connection attempts
TIMEOUT = 10  # seconds to sleep between attempts


class InputVars(schema.Base):
    name: str
    environment: str
    endpoint: str
    initial_root_password: str
    overrides: List[str]
    node_group: Dict[str, str]


@contextlib.contextmanager
def keycloak_provider_context(keycloak_credentials: Dict[str, str]):
    credential_mapping = {
        "client_id": "KEYCLOAK_CLIENT_ID",
        "url": "KEYCLOAK_URL",
        "username": "KEYCLOAK_USER",
        "password": "KEYCLOAK_PASSWORD",
        "realm": "KEYCLOAK_REALM",
    }

    credentials = {credential_mapping[k]: v for k, v in keycloak_credentials.items()}
    with modified_environ(**credentials):
        yield


@schema.yaml_object(schema.yaml)
class AuthenticationEnum(str, enum.Enum):
    password = "password"
    github = "GitHub"
    auth0 = "Auth0"

    @classmethod
    def to_yaml(cls, representer, node):
        return representer.represent_str(node.value)


class GitHubConfig(schema.Base):
    client_id: str = Field(
        default_factory=lambda: os.environ.get("GITHUB_CLIENT_ID"),
        validate_default=True,
    )
    client_secret: str = Field(
        default_factory=lambda: os.environ.get("GITHUB_CLIENT_SECRET"),
        validate_default=True,
    )

    @field_validator("client_id", "client_secret", mode="before")
    @classmethod
    def validate_credentials(cls, value: Optional[str], info: ValidationInfo) -> str:
        variable_mapping = {
            "client_id": "GITHUB_CLIENT_ID",
            "client_secret": "GITHUB_CLIENT_SECRET",
        }
        if value is None:
            raise ValueError(
                f"Missing the following required environment variable: {variable_mapping[info.field_name]}"
            )
        return value


class Auth0Config(schema.Base):
    client_id: str = Field(
        default_factory=lambda: os.environ.get("AUTH0_CLIENT_ID"),
        validate_default=True,
    )
    client_secret: str = Field(
        default_factory=lambda: os.environ.get("AUTH0_CLIENT_SECRET"),
        validate_default=True,
    )
    auth0_subdomain: str = Field(
        default_factory=lambda: os.environ.get("AUTH0_DOMAIN"),
        validate_default=True,
    )

    @field_validator("client_id", "client_secret", "auth0_subdomain", mode="before")
    @classmethod
    def validate_credentials(cls, value: Optional[str], info: ValidationInfo) -> str:
        variable_mapping = {
            "client_id": "AUTH0_CLIENT_ID",
            "client_secret": "AUTH0_CLIENT_SECRET",
            "auth0_subdomain": "AUTH0_DOMAIN",
        }
        if value is None:
            raise ValueError(
                f"Missing the following required environment variable: {variable_mapping[info.field_name]}"
            )
        return value


class BaseAuthentication(schema.Base):
    type: AuthenticationEnum


class PasswordAuthentication(BaseAuthentication):
    type: AuthenticationEnum = AuthenticationEnum.password


class Auth0Authentication(BaseAuthentication):
    type: AuthenticationEnum = AuthenticationEnum.auth0
    config: Auth0Config = Field(default_factory=lambda: Auth0Config())


class GitHubAuthentication(BaseAuthentication):
    type: AuthenticationEnum = AuthenticationEnum.github
    config: GitHubConfig = Field(default_factory=lambda: GitHubConfig())


Authentication = Union[
    PasswordAuthentication, Auth0Authentication, GitHubAuthentication
]


def random_secure_string(
    length: int = 16, chars: str = string.ascii_lowercase + string.digits
):
    return "".join(secrets.choice(chars) for _ in range(length))


class Keycloak(schema.Base):
    initial_root_password: str = Field(default_factory=random_secure_string)
    overrides: Dict = {}
    realm_display_name: str = "Nebari"


auth_enum_to_model = {
    AuthenticationEnum.password: PasswordAuthentication,
    AuthenticationEnum.auth0: Auth0Authentication,
    AuthenticationEnum.github: GitHubAuthentication,
}

auth_enum_to_config = {
    AuthenticationEnum.auth0: Auth0Config,
    AuthenticationEnum.github: GitHubConfig,
}


class Security(schema.Base):
    authentication: Authentication = PasswordAuthentication()
    shared_users_group: bool = True
    keycloak: Keycloak = Keycloak()

    @field_validator("authentication", mode="before")
    @classmethod
    def validate_authentication(cls, value: Optional[Dict]) -> Authentication:
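        if value is None:
            return PasswordAuthentication()
        if isinstance(value, BaseAuthentication):
            # Already a model instance; the dict-style membership checks below do not apply
            return value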
        if "type" not in value:
            raise ValueError(
                "Authentication type must be specified if authentication is set"
            )
        auth_type = value["type"] if hasattr(value, "__getitem__") else value.type
        if auth_type in auth_enum_to_model:
            if auth_type == AuthenticationEnum.password:
                return auth_enum_to_model[auth_type]()
            else:
                if "config" in value:
                    config_dict = (
                        value["config"]
                        if hasattr(value, "__getitem__")
                        else value.config
                    )
                    config = auth_enum_to_config[auth_type](**config_dict)
                else:
                    config = auth_enum_to_config[auth_type]()
                return auth_enum_to_model[auth_type](config=config)
        else:
            raise ValueError(f"Unsupported authentication type {auth_type}")


class InputSchema(schema.Base):
    security: Security = Security()


class KeycloakCredentials(schema.Base):
    url: str
    client_id: str
    realm: str
    username: str
    password: str


class OutputSchema(schema.Base):
    keycloak_credentials: KeycloakCredentials
    keycloak_nebari_bot_password: str


class KubernetesKeycloakStage(NebariTerraformStage):
    name = "05-kubernetes-keycloak"
    priority = 50

    input_schema = InputSchema
    output_schema = OutputSchema

    def tf_objects(self) -> List[Dict]:
        return [
            NebariTerraformState(self.name, self.config),
            NebariKubernetesProvider(self.config),
            NebariHelmProvider(self.config),
        ]

    def input_vars(self, stage_outputs: Dict[str, Dict[str, Any]]):
        return InputVars(
            name=self.config.project_name,
            environment=self.config.namespace,
            endpoint=stage_outputs["stages/04-kubernetes-ingress"]["domain"],
            initial_root_password=self.config.security.keycloak.initial_root_password,
            overrides=[json.dumps(self.config.security.keycloak.overrides)],
            node_group=stage_outputs["stages/02-infrastructure"]["node_selectors"][
                "general"
            ],
        ).model_dump()

    def check(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        from keycloak import KeycloakAdmin
        from keycloak.exceptions import KeycloakError

        keycloak_url = f"{stage_outputs['stages/' + self.name]['keycloak_credentials']['value']['url']}/auth/"

        def _attempt_keycloak_connection(
            keycloak_url,
            username,
            password,
            realm_name,
            client_id,
            verify=False,
            num_attempts=NUM_ATTEMPTS,
            timeout=TIMEOUT,
        ):
            for i in range(num_attempts):
                try:
                    KeycloakAdmin(
                        keycloak_url,
                        username=username,
                        password=password,
                        realm_name=realm_name,
                        client_id=client_id,
                        verify=verify,
                    )
                    print(
                        f"Attempt {i+1} succeeded connecting to keycloak master realm"
                    )
                    return True
                except KeycloakError:
                    print(f"Attempt {i+1} failed connecting to keycloak master realm")
                time.sleep(timeout)
            return False

        if not _attempt_keycloak_connection(
            keycloak_url,
            stage_outputs["stages/" + self.name]["keycloak_credentials"]["value"][
                "username"
            ],
            stage_outputs["stages/" + self.name]["keycloak_credentials"]["value"][
                "password"
            ],
            stage_outputs["stages/" + self.name]["keycloak_credentials"]["value"][
                "realm"
            ],
            stage_outputs["stages/" + self.name]["keycloak_credentials"]["value"][
                "client_id"
            ],
            verify=False,
        ):
            print(
                f"ERROR: unable to connect to keycloak master realm at url={keycloak_url} with nebari-bot credentials"
            )
            sys.exit(1)

        print("Keycloak service successfully started")

    @contextlib.contextmanager
    def deploy(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        with super().deploy(stage_outputs, disable_prompt):
            with keycloak_provider_context(
                stage_outputs["stages/" + self.name]["keycloak_credentials"]["value"]
            ):
                yield

    @contextlib.contextmanager
    def destroy(
        self, stage_outputs: Dict[str, Dict[str, Any]], status: Dict[str, bool]
    ):
        with super().destroy(stage_outputs, status):
            with keycloak_provider_context(
                stage_outputs["stages/" + self.name]["keycloak_credentials"]["value"]
            ):
                yield


@hookimpl
def nebari_stage() -> List[Type[NebariStage]]:
    return [KubernetesKeycloakStage]



---
File: nebari/src/_nebari/stages/kubernetes_keycloak_configuration/template/main.tf
---

resource "keycloak_realm" "main" {

  realm        = var.realm
  display_name = var.realm_display_name

  direct_grant_flow    = "direct grant"
  enabled              = true
  browser_flow         = "browser"
  revoke_refresh_token = false
  user_managed_access  = false
  ssl_required         = "external"
  registration_flow    = "registration"

  refresh_token_max_reuse    = 0
  reset_credentials_flow     = "reset credentials"
  client_authentication_flow = "clients"
  docker_authentication_flow = "docker auth"

  offline_session_max_lifespan_enabled = false

  web_authn_policy {
  }

  web_authn_passwordless_policy {
  }

  lifecycle {
    ignore_changes = [
      # We want user to have control over attributes we are not managing
      # If attribute is added above remove it from this list
      # https://registry.terraform.io/providers/mrparkers/keycloak/latest/docs/resources/realm
      attributes,
      registration_allowed,
      registration_email_as_username,
      edit_username_allowed,
      reset_password_allowed,
      remember_me,
      verify_email,
      login_with_email_allowed,
      login_theme,
      account_theme,
      admin_theme,
      email_theme,
      sso_session_idle_timeout,
      sso_session_max_lifespan,
      sso_session_idle_timeout_remember_me,
      sso_session_max_lifespan_remember_me,
      offline_session_idle_timeout,
      offline_session_max_lifespan,
      access_token_lifespan,
      access_token_lifespan_for_implicit_flow,
      access_code_lifespan,
      access_code_lifespan_login,
      access_code_lifespan_user_action,
      action_token_generated_by_user_lifespan,
      action_token_generated_by_admin_lifespan,
      oauth2_device_code_lifespan,
      oauth2_device_polling_interval,
      smtp_server,
      internationalization,
      security_defenses,
      password_policy,
      otp_policy,
      default_default_client_scopes,
      default_optional_client_scopes,
    ]
  }

}

resource "keycloak_group" "groups" {
  for_each   = var.keycloak_groups
  realm_id   = keycloak_realm.main.id
  name       = each.key
  attributes = {}

  lifecycle {
    ignore_changes = [
      attributes,
    ]
  }
}

resource "keycloak_default_groups" "default" {
  realm_id = keycloak_realm.main.id
  group_ids = [
    for g in var.default_groups :
    keycloak_group.groups[g].id
  ]
}

data "keycloak_realm" "master" {
  realm = "master"
}

resource "random_password" "keycloak-view-only-user-password" {
  length  = 32
  special = false
}

resource "keycloak_user" "read-only-user" {
  realm_id = data.keycloak_realm.master.id
  username = "read-only-user"
  initial_password {
    value     = random_password.keycloak-view-only-user-password.result
    temporary = false
  }
}

resource "keycloak_user_roles" "user_roles" {
  realm_id = data.keycloak_realm.master.id
  user_id  = keycloak_user.read-only-user.id

  role_ids = [
    data.keycloak_role.view-users.id,
  ]
  exhaustive = true
}

# needed for keycloak monitoring to function
resource "keycloak_realm_events" "realm_events" {
  realm_id = keycloak_realm.main.id

  events_enabled = true

  admin_events_enabled         = true
  admin_events_details_enabled = true

  # When omitted or left empty, keycloak will enable all event types
  enabled_event_types = []

  events_listeners = [
    "jboss-logging", "metrics-listener",
  ]
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak_configuration/template/outputs.tf
---

output "realm_id" {
  description = "Realm id used for nebari resources"
  value       = keycloak_realm.main.id
}

output "keycloak-read-only-user-credentials" {
  description = "Credentials for user that can read users/groups, but not modify them"
  sensitive   = true
  value = {
    username  = keycloak_user.read-only-user.username
    password  = random_password.keycloak-view-only-user-password.result
    client_id = "admin-cli"
    realm     = data.keycloak_realm.master.realm
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak_configuration/template/permissions.tf
---

data "keycloak_openid_client" "realm_management" {
  realm_id  = keycloak_realm.main.id
  client_id = "realm-management"
}

data "keycloak_role" "manage-users" {
  realm_id  = keycloak_realm.main.id
  client_id = data.keycloak_openid_client.realm_management.id
  name      = "manage-users"
}

data "keycloak_openid_client" "nebari-realm" {
  depends_on = [
    keycloak_realm.main,
  ]
  realm_id  = data.keycloak_realm.master.id
  client_id = "${var.realm}-realm"
}

data "keycloak_role" "view-users" {
  realm_id  = data.keycloak_realm.master.id
  client_id = data.keycloak_openid_client.nebari-realm.id
  name      = "view-users"
}


data "keycloak_role" "query-users" {
  realm_id  = keycloak_realm.main.id
  client_id = data.keycloak_openid_client.realm_management.id
  name      = "query-users"
}

data "keycloak_role" "query-groups" {
  realm_id  = keycloak_realm.main.id
  client_id = data.keycloak_openid_client.realm_management.id
  name      = "query-groups"
}

data "keycloak_role" "realm-admin" {
  realm_id  = keycloak_realm.main.id
  client_id = data.keycloak_openid_client.realm_management.id
  name      = "realm-admin"
}

resource "keycloak_group_roles" "admin_roles" {
  realm_id = keycloak_realm.main.id
  group_id = keycloak_group.groups["admin"].id
  role_ids = [
    data.keycloak_role.query-users.id,
    data.keycloak_role.query-groups.id,
    data.keycloak_role.manage-users.id
  ]

  exhaustive = false
}

resource "keycloak_group_roles" "superadmin_roles" {
  realm_id = keycloak_realm.main.id
  group_id = keycloak_group.groups["superadmin"].id
  role_ids = [data.keycloak_role.realm-admin.id]

  exhaustive = false
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak_configuration/template/providers.tf
---

provider "keycloak" {
  tls_insecure_skip_verify = true
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak_configuration/template/social_auth.tf
---

resource "keycloak_authentication_flow" "flow" {
  realm_id    = keycloak_realm.main.id
  alias       = "detect-existing"
  provider_id = "basic-flow"
  description = ""
}

resource "keycloak_authentication_execution" "idp-detect-existing-broker-user" {
  realm_id          = keycloak_realm.main.id
  parent_flow_alias = keycloak_authentication_flow.flow.alias
  authenticator     = "idp-detect-existing-broker-user"
  requirement       = "REQUIRED"
}

resource "keycloak_authentication_execution" "idp-auto-link" {
  realm_id          = keycloak_realm.main.id
  parent_flow_alias = keycloak_authentication_flow.flow.alias
  authenticator     = "idp-auto-link"
  requirement       = "REQUIRED"

  # This is the only way to encourage Keycloak Provider to set the
  # auth execution priority order:
  # https://github.com/mrparkers/terraform-provider-keycloak/pull/138
  depends_on = [
    keycloak_authentication_execution.idp-detect-existing-broker-user
  ]
}


resource "keycloak_oidc_identity_provider" "github_identity_provider" {
  count = var.authentication.type == "GitHub" ? 1 : 0

  realm             = keycloak_realm.main.id
  alias             = "github"
  provider_id       = "github"
  authorization_url = "https://github.com/login/oauth/authorize"
  client_id         = var.authentication.config.client_id
  client_secret     = var.authentication.config.client_secret
  token_url         = "https://github.com/login/oauth/access_token"
  default_scopes    = "user:email"
  store_token       = false
  sync_mode         = "IMPORT"
  trust_email       = true

  first_broker_login_flow_alias = keycloak_authentication_flow.flow.alias

  extra_config = {
    "clientAuthMethod" = "client_secret_post"
  }
}

resource "keycloak_oidc_identity_provider" "auth0_identity_provider" {
  count = var.authentication.type == "Auth0" ? 1 : 0

  realm             = keycloak_realm.main.id
  alias             = "auth0"
  provider_id       = "oidc"
  authorization_url = "https://${var.authentication.config.auth0_subdomain}.auth0.com/authorize"
  client_id         = var.authentication.config.client_id
  client_secret     = var.authentication.config.client_secret
  token_url         = "https://${var.authentication.config.auth0_subdomain}.auth0.com/oauth/token"
  user_info_url     = "https://${var.authentication.config.auth0_subdomain}.auth0.com/userinfo"
  default_scopes    = "openid email profile"
  store_token       = false
  sync_mode         = "IMPORT"
  trust_email       = true

  first_broker_login_flow_alias = keycloak_authentication_flow.flow.alias

  extra_config = {
    "clientAuthMethod" = "client_secret_post"
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak_configuration/template/variables.tf
---

variable "realm" {
  description = "Keycloak realm to use for nebari"
  type        = string
}

variable "realm_display_name" {
  description = "Keycloak realm display name for nebari"
  type        = string
}

variable "keycloak_groups" {
  description = "Permission groups in keycloak used for granting access to services"
  type        = set(string)
  default     = []
}

variable "authentication" {
  description = "Authentication configuration for keycloak"
  type        = any
}

variable "default_groups" {
  description = "Set of groups that should exist by default"
  type        = set(string)
  default     = []
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak_configuration/template/versions.tf
---

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "2.1.2"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.35.1"
    }
    keycloak = {
      source  = "mrparkers/keycloak"
      version = "3.7.0"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/kubernetes_keycloak_configuration/__init__.py
---

import sys
import time
from typing import Any, Dict, List, Type

from _nebari.stages.base import NebariTerraformStage
from _nebari.stages.kubernetes_keycloak import Authentication
from _nebari.stages.tf_objects import NebariTerraformState
from nebari import schema
from nebari.hookspecs import NebariStage, hookimpl

NUM_ATTEMPTS = 10  # maximum number of connection attempts
TIMEOUT = 10  # seconds to sleep between attempts


class InputVars(schema.Base):
    realm: str = "nebari"
    realm_display_name: str
    authentication: Authentication
    keycloak_groups: List[str] = ["superadmin", "admin", "developer", "analyst"]
    default_groups: List[str] = ["analyst"]


class KubernetesKeycloakConfigurationStage(NebariTerraformStage):
    name = "06-kubernetes-keycloak-configuration"
    priority = 60

    def tf_objects(self) -> List[Dict]:
        return [
            NebariTerraformState(self.name, self.config),
        ]

    def input_vars(self, stage_outputs: Dict[str, Dict[str, Any]]):
        input_vars = InputVars(
            realm_display_name=self.config.security.keycloak.realm_display_name,
            authentication=self.config.security.authentication,
        )

        users_group = ["users"] if self.config.security.shared_users_group else []

        input_vars.keycloak_groups += users_group
        input_vars.default_groups += users_group

        return input_vars.model_dump()

    def check(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        directory = "stages/05-kubernetes-keycloak"

        from keycloak import KeycloakAdmin
        from keycloak.exceptions import KeycloakError

        keycloak_url = (
            f"{stage_outputs[directory]['keycloak_credentials']['value']['url']}/auth/"
        )

        def _attempt_keycloak_connection(
            keycloak_url,
            username,
            password,
            realm_name,
            client_id,
            nebari_realm,
            verify=False,
            num_attempts=NUM_ATTEMPTS,
            timeout=TIMEOUT,
        ):
            for i in range(num_attempts):
                try:
                    realm_admin = KeycloakAdmin(
                        keycloak_url,
                        username=username,
                        password=password,
                        realm_name=realm_name,
                        client_id=client_id,
                        verify=verify,
                    )
                    existing_realms = {
                        realm["id"] for realm in realm_admin.get_realms()
                    }
                    if nebari_realm in existing_realms:
                        print(
                            f"Attempt {i+1} succeeded connecting to keycloak and nebari realm={nebari_realm} exists"
                        )
                        return True
                    else:
                        print(
                            f"Attempt {i+1} succeeded connecting to keycloak but nebari realm did not exist"
                        )
                except KeycloakError:
                    print(f"Attempt {i+1} failed connecting to keycloak master realm")
                time.sleep(timeout)
            return False

        if not _attempt_keycloak_connection(
            keycloak_url,
            stage_outputs[directory]["keycloak_credentials"]["value"]["username"],
            stage_outputs[directory]["keycloak_credentials"]["value"]["password"],
            stage_outputs[directory]["keycloak_credentials"]["value"]["realm"],
            stage_outputs[directory]["keycloak_credentials"]["value"]["client_id"],
            nebari_realm=stage_outputs["stages/06-kubernetes-keycloak-configuration"][
                "realm_id"
            ]["value"],
            verify=False,
        ):
            print(
                "ERROR: unable to connect to keycloak master realm and ensure that nebari realm exists"
            )
            sys.exit(1)

        print("Keycloak service successfully started with nebari realm")


@hookimpl
def nebari_stage() -> List[Type[NebariStage]]:
    return [KubernetesKeycloakConfigurationStage]



---
File: nebari/src/_nebari/stages/kubernetes_kuberhealthy/template/values.yaml
---

prometheus:
  enabled: true
  serviceMonitor:
    enabled: true



---
File: nebari/src/_nebari/stages/kubernetes_kuberhealthy/__init__.py
---

import contextlib
from typing import Any, Dict, List, Type

from _nebari.stages.base import NebariKustomizeStage
from nebari import schema
from nebari.hookspecs import NebariStage, hookimpl


class InputSchema(schema.Base):
    pass


class OutputSchema(schema.Base):
    pass


class KuberHealthyStage(NebariKustomizeStage):
    name = "10-kubernetes-kuberhealthy"
    priority = 100

    input_schema = InputSchema
    output_schema = OutputSchema

    @property
    def kustomize_vars(self):
        return {
            "namespace": self.config.namespace,
            "kuberhealthy_helm_version": self.config.monitoring.healthchecks.kuberhealthy_helm_version,
        }

    @contextlib.contextmanager
    def deploy(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        # Deploy only when health checks are enabled; otherwise tear down any
        # previously deployed resources for this stage.
        if self.config.monitoring.healthchecks.enabled:
            with super().deploy(stage_outputs, disable_prompt):
                yield
        else:
            with self.destroy(stage_outputs, {}):
                yield


@hookimpl
def nebari_stage() -> List[Type[NebariStage]]:
    return [KuberHealthyStage]



---
File: nebari/src/_nebari/stages/kubernetes_kuberhealthy_healthchecks/template/base/conda-store-healthcheck.yaml
---

apiVersion: comcast.github.io/v1
kind: KuberhealthyCheck
metadata:
  name: conda-store-http-check
  namespace: dev
spec:
  runInterval: 5m
  timeout: 10m
  podSpec:
    containers:
      - name: https
        image: kuberhealthy/http-check:v1.5.0
        imagePullPolicy: IfNotPresent
        env:
          - name: COUNT #### default: "0"
            value: "5"
          - name: SECONDS #### default: "0"
            value: "1"
          - name: PASSING_PERCENT #### default: "100"
            value: "80"
        resources:
          requests:
            cpu: 15m
            memory: 15Mi
          limits:
            cpu: 25m
    restartPolicy: Always
    terminationGracePeriodSeconds: 5



---
File: nebari/src/_nebari/stages/kubernetes_kuberhealthy_healthchecks/template/base/jupyterhub-healthcheck.yaml
---

apiVersion: comcast.github.io/v1
kind: KuberhealthyCheck
metadata:
  name: jupyterhub-http-check
  namespace: dev
spec:
  runInterval: 5m
  timeout: 10m
  podSpec:
    containers:
      - name: https
        image: kuberhealthy/http-check:v1.5.0
        imagePullPolicy: IfNotPresent
        env:
          - name: COUNT #### default: "0"
            value: "5"
          - name: SECONDS #### default: "0"
            value: "1"
          - name: PASSING_PERCENT #### default: "100"
            value: "80"
        resources:
          requests:
            cpu: 15m
            memory: 15Mi
          limits:
            cpu: 25m
    restartPolicy: Always
    terminationGracePeriodSeconds: 5



---
File: nebari/src/_nebari/stages/kubernetes_kuberhealthy_healthchecks/template/base/keycloak-healthcheck.yaml
---

apiVersion: comcast.github.io/v1
kind: KuberhealthyCheck
metadata:
  name: keycloak-http-check
  namespace: dev
spec:
  runInterval: 5m
  timeout: 10m
  podSpec:
    containers:
      - name: https
        image: kuberhealthy/http-check:v1.5.0
        imagePullPolicy: IfNotPresent
        env:
          - name: COUNT #### default: "0"
            value: "5"
          - name: SECONDS #### default: "0"
            value: "1"
          - name: PASSING_PERCENT #### default: "100"
            value: "80"
        resources:
          requests:
            cpu: 15m
            memory: 15Mi
          limits:
            cpu: 25m
    restartPolicy: Always
    terminationGracePeriodSeconds: 5



---
File: nebari/src/_nebari/stages/kubernetes_kuberhealthy_healthchecks/__init__.py
---

import contextlib
from typing import Any, Dict, List, Type

from _nebari.stages.base import NebariKustomizeStage
from nebari import schema
from nebari.hookspecs import NebariStage, hookimpl


class InputSchema(schema.Base):
    pass


class OutputSchema(schema.Base):
    pass


class KuberHealthyStage(NebariKustomizeStage):
    name = "11-kubernetes-kuberhealthy-healthchecks"
    priority = 110

    input_schema = InputSchema
    output_schema = OutputSchema

    @property
    def kustomize_vars(self):
        return {
            "namespace": self.config.namespace,
        }

    @contextlib.contextmanager
    def deploy(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        # Deploy only when health checks are enabled; otherwise tear down any
        # previously deployed resources for this stage.
        if self.config.monitoring.healthchecks.enabled:
            with super().deploy(stage_outputs, disable_prompt):
                yield
        else:
            with self.destroy(stage_outputs, {}):
                yield


@hookimpl
def nebari_stage() -> List[Type[NebariStage]]:
    return [KuberHealthyStage]



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/cephfs-mount/main.tf
---

resource "kubernetes_persistent_volume_claim" "main" {
  metadata {
    name      = var.ceph-pvc-name
    namespace = var.namespace
  }

  spec {
    access_modes       = ["ReadWriteMany"]
    storage_class_name = "ceph-filesystem-retain" # kubernetes_storage_class.main.metadata.0.name  # Get this from a terraform output
    resources {
      requests = {
        storage = "${var.fs_capacity}Gi"
      }
    }
  }

  # Hack to avoid timeout while CephCluster is being created
  timeouts {
    create = "10m"
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/cephfs-mount/outputs.tf
---

output "persistent_volume_claim" {
  description = "Name of persistent volume claim"
  value = {
    pvc = {
      name = kubernetes_persistent_volume_claim.main.metadata.0.name
      id   = kubernetes_persistent_volume_claim.main.metadata.0.uid
    }
    namespace = var.namespace
    kind      = "persistentvolumeclaim"
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/cephfs-mount/variables.tf
---

variable "name" {
  description = "Prefix name for cephfs mount kubernetes resource"
  type        = string
}

variable "namespace" {
  description = "Namespace to deploy cephfs storage mount"
  type        = string
}

variable "fs_capacity" {
  description = "Capacity of the CephFS mount in Gi"
  type        = number
  default     = 10
}

variable "ceph-pvc-name" {
  description = "Name of the persistent volume claim"
  type        = string
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/forwardauth/main.tf
---

module "forwardauth-openid-client" {
  source = "../services/keycloak-client"

  realm_id     = var.realm_id
  client_id    = "forwardauth"
  external-url = var.external-url
  callback-url-paths = [
    "https://${var.external-url}${var.callback-url-path}"
  ]
}


resource "kubernetes_service" "forwardauth-service" {
  metadata {
    name      = "forwardauth-service"
    namespace = var.namespace
  }
  spec {
    selector = {
      app = kubernetes_deployment.forwardauth-deployment.spec.0.template.0.metadata[0].labels.app
    }
    port {
      port        = 4181
      target_port = 4181
    }

    type = "ClusterIP"
  }
}

resource "random_password" "forwardauth_cookie_secret" {
  length  = 32
  special = false
}

resource "kubernetes_deployment" "forwardauth-deployment" {
  metadata {
    name      = "forwardauth-deployment"
    namespace = var.namespace
  }

  spec {
    replicas = 1

    selector {
      match_labels = {
        app = "forwardauth-pod"
      }
    }

    template {
      metadata {
        labels = {
          app = "forwardauth-pod"
        }
      }

      spec {
        node_selector = {
          "${var.node-group.key}" = var.node-group.value
        }
        dynamic "volume" {
          for_each = var.cert_secret_name == null ? [] : [1]
          content {
            name = "cert-volume"
            secret {
              secret_name = var.cert_secret_name
              items {
                key  = "tls.crt"
                path = "tls.crt"
              }
            }
          }
        }
        container {
          # image = "thomseddon/traefik-forward-auth:2.2.0"
          # Use PR #159 https://github.com/thomseddon/traefik-forward-auth/pull/159
          image = "maxisme/traefik-forward-auth:sha-a98e568"
          name  = "forwardauth-container"

          env {
            name  = "USER_ID_PATH"
            value = "preferred_username"
          }

          env {
            name  = "PROVIDERS_GENERIC_OAUTH_AUTH_URL"
            value = module.forwardauth-openid-client.config.authentication_url
          }

          env {
            name  = "PROVIDERS_GENERIC_OAUTH_TOKEN_URL"
            value = module.forwardauth-openid-client.config.token_url
          }

          env {
            name  = "PROVIDERS_GENERIC_OAUTH_USER_URL"
            value = module.forwardauth-openid-client.config.userinfo_url
          }

          env {
            name  = "PROVIDERS_GENERIC_OAUTH_CLIENT_ID"
            value = module.forwardauth-openid-client.config.client_id
          }

          env {
            name  = "PROVIDERS_GENERIC_OAUTH_CLIENT_SECRET"
            value = module.forwardauth-openid-client.config.client_secret
          }

          env {
            name  = "SECRET"
            value = random_password.forwardauth_cookie_secret.result
          }

          env {
            name  = "DEFAULT_PROVIDER"
            value = "generic-oauth"
          }

          env {
            name  = "URL_PATH"
            value = var.callback-url-path
          }

          env {
            name  = "LOG_LEVEL"
            value = "trace"
          }
          env {
            name  = "AUTH_HOST"
            value = var.external-url
          }

          env {
            name  = "COOKIE_DOMAIN"
            value = var.external-url
          }

          dynamic "env" {
            for_each = var.cert_secret_name == null ? [] : [1]
            content {
              name  = "SSL_CERT_FILE"
              value = "/config/tls.crt"
            }
          }

          port {
            container_port = 4181
          }

          dynamic "volume_mount" {
            for_each = var.cert_secret_name == null ? [] : [1]
            content {
              name       = "cert-volume"
              mount_path = "/config"
              read_only  = true
            }
          }
        }

      }
    }
  }
}

resource "kubernetes_manifest" "forwardauth-middleware" {
  # This version of the middleware is primarily for the forwardauth service
  # itself, so the callback _oauth url can be centralised (not just under, for example, /someservice/_oauth).
  # This middleware is in the root namespace; someservice may have its own.
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "Middleware"
    metadata = {
      name      = var.forwardauth_middleware_name
      namespace = var.namespace
    }
    spec = {
      forwardAuth = {
        address = "http://${kubernetes_service.forwardauth-service.metadata.0.name}:4181"
        authResponseHeaders = [
          "X-Forwarded-User"
        ]
      }
    }
  }
}

resource "kubernetes_manifest" "forwardauth-ingressroute" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "IngressRoute"
    metadata = {
      name      = "forwardauth"
      namespace = var.namespace
    }
    spec = {
      entryPoints = ["websecure"]
      routes = [
        {
          kind  = "Rule"
          match = "Host(`${var.external-url}`) && PathPrefix(`${var.callback-url-path}`)"

          middlewares = [
            {
              name      = kubernetes_manifest.forwardauth-middleware.manifest.metadata.name
              namespace = var.namespace
            }
          ]

          services = [
            {
              name = kubernetes_service.forwardauth-service.metadata.0.name
              port = 4181
            }
          ]
        }
      ]
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/forwardauth/outputs.tf
---

output "forward-auth-middleware" {
  description = "middleware name for use with forward auth"
  value = {
    name = kubernetes_manifest.forwardauth-middleware.manifest.metadata.name
  }
}

output "forward-auth-service" {
  description = "service name for use with forward auth"
  value = {
    name = kubernetes_service.forwardauth-service.metadata.0.name
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/forwardauth/variables.tf
---

variable "namespace" {
  description = "Namespace to deploy forwardauth"
  type        = string
}

variable "external-url" {
  description = "External domain where nebari is accessible."
  type        = string
}

variable "realm_id" {
  description = "Keycloak realm for forwardauth"
  type        = string
}

variable "callback-url-path" {
  description = "Callback url path for forwardauth"
  type        = string
  default     = "/_oauth"
}

variable "node-group" {
  description = "Node key value pair for bound general resources"
  type = object({
    key   = string
    value = string
  })
}

variable "forwardauth_middleware_name" {
  description = "Name of the traefik forward auth middleware"
  type        = string
}

variable "cert_secret_name" {
  description = "Name of the secret containing the certificate"
  type        = string
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/nfs-mount/main.tf
---

resource "kubernetes_storage_class" "main" {
  metadata {
    name = "${var.name}-${var.namespace}-share"
  }
  storage_provisioner = "kubernetes.io/fake-nfs"
}


resource "kubernetes_persistent_volume" "main" {
  metadata {
    name = "${var.name}-${var.namespace}-share"
  }
  spec {
    capacity = {
      storage = "${var.nfs_capacity}Gi"
    }
    storage_class_name = kubernetes_storage_class.main.metadata.0.name
    access_modes       = ["ReadWriteMany"]
    persistent_volume_source {
      nfs {
        path   = "/"
        server = var.nfs_endpoint
      }
    }
  }
}


resource "kubernetes_persistent_volume_claim" "main" {
  metadata {
    name      = var.nfs-pvc-name
    namespace = var.namespace
  }

  spec {
    access_modes       = ["ReadWriteMany"]
    storage_class_name = kubernetes_storage_class.main.metadata.0.name
    resources {
      requests = {
        storage = "${var.nfs_capacity}Gi"
      }
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/nfs-mount/outputs.tf
---

output "persistent_volume_claim" {
  description = "Name of persistent volume claim"
  value = {
    pvc = {
      name = kubernetes_persistent_volume_claim.main.metadata.0.name
      id   = kubernetes_persistent_volume_claim.main.metadata.0.uid
    }
    namespace = var.namespace
    kind      = "persistentvolumeclaim"
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/nfs-mount/variables.tf
---

variable "name" {
  description = "Prefix name for nfs mount kubernetes resource"
  type        = string
}

variable "namespace" {
  description = "Namespace to deploy nfs storage mount"
  type        = string
}

variable "nfs_capacity" {
  description = "Capacity of NFS server mount in Gi"
  type        = number
  default     = 10
}

variable "nfs_endpoint" {
  description = "Endpoint of nfs server"
  type        = string
}

variable "nfs-pvc-name" {
  description = "Name of the persistent volume claim"
  type        = string
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/nfs-server/main.tf
---

resource "kubernetes_persistent_volume_claim" "main" {
  metadata {
    name      = "${var.name}-nfs-storage"
    namespace = var.namespace
  }

  spec {
    access_modes = ["ReadWriteOnce"]
    resources {
      requests = {
        storage = "${var.nfs_capacity}Gi"
      }
    }
  }
}


resource "kubernetes_service" "main" {
  metadata {
    name      = "${var.name}-nfs"
    namespace = var.namespace
  }

  spec {
    selector = {
      role = "${var.name}-nfs"
    }

    port {
      name = "nfs"
      port = 2049
    }

    port {
      name = "mountd"
      port = 20048
    }

    port {
      name = "rpcbind"
      port = 111
    }
  }
}


resource "kubernetes_deployment" "main" {
  metadata {
    name      = "${var.name}-nfs"
    namespace = var.namespace
    labels = {
      role = "${var.name}-nfs"
    }
  }

  spec {
    replicas = 1

    selector {
      match_labels = {
        role = "${var.name}-nfs"
      }
    }

    template {
      metadata {
        labels = {
          role = "${var.name}-nfs"
        }
      }

      spec {
        affinity {
          node_affinity {
            required_during_scheduling_ignored_during_execution {
              node_selector_term {
                match_expressions {
                  key      = var.node-group.key
                  operator = "In"
                  values = [
                    var.node-group.value
                  ]
                }
              }
            }
          }
        }

        container {
          name  = "nfs-server"
          image = "gcr.io/google_containers/volume-nfs:0.8"

          port {
            name           = "nfs"
            container_port = 2049
          }

          port {
            name           = "mountd"
            container_port = 20048
          }

          port {
            name           = "rpcbind"
            container_port = 111
          }

          security_context {
            privileged = true
          }

          volume_mount {
            mount_path = "/exports"
            name       = "nfs-export-fast"
          }
        }

        volume {
          name = "nfs-export-fast"
          persistent_volume_claim {
            claim_name = "${var.name}-nfs-storage"
          }
        }
      }
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/nfs-server/output.tf
---

output "endpoint" {
  description = "DNS name of the nfs server endpoint"
  value       = "${var.name}-nfs.${var.namespace}.svc.cluster.local"
}

output "endpoint_ip" {
  description = "IP Address of nfs server"
  value       = kubernetes_service.main.spec.0.cluster_ip
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/nfs-server/variables.tf
---

variable "name" {
  description = "Prefix name for nfs server kubernetes resource"
  type        = string
}

variable "namespace" {
  description = "Namespace to deploy nfs server"
  type        = string
}

variable "nfs_capacity" {
  description = "Capacity of NFS server deployment in Gi"
  type        = number
  default     = 10
}

variable "node-group" {
  description = "Node key value pair for bound general resources"
  type = object({
    key   = string
    value = string
  })
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/argo-workflows/main.tf
---

locals {
  name                  = "argo-workflows"
  argo-workflows-prefix = "argo"
  # roles
  admin     = "argo-admin"
  developer = "argo-developer"
  viewer    = "argo-viewer"
}

resource "helm_release" "argo-workflows" {
  name       = local.name
  namespace  = var.namespace
  repository = "https://argoproj.github.io/argo-helm"
  chart      = "argo-workflows"
  version    = "0.22.9"

  values = concat([
    file("${path.module}/values.yaml"),

    jsonencode({
      singleNamespace = true # Restrict Argo to operate only in a single namespace (the namespace of the Helm release)

      controller = {
        metricsConfig = {
          enabled = true # enable prometheus
        }
        workflowNamespaces = [
          var.namespace
        ]
        nodeSelector = {
          "${var.node-group.key}" = var.node-group.value
        }
      }

      server = {
        # `sso` for OIDC/OAuth
        extraArgs = ["--auth-mode=sso", "--auth-mode=client", "--insecure-skip-verify"]
        # to enable TLS, `secure = true`
        secure   = false
        baseHref = "/${local.argo-workflows-prefix}/"

        sso = {
          insecureSkipVerify = true
          issuer             = "https://${var.external-url}/auth/realms/${var.realm_id}"
          clientId = {
            name = "argo-server-sso"
            key  = "argo-oidc-client-id"
          }
          clientSecret = {
            name = "argo-server-sso"
            key  = "argo-oidc-client-secret"
          }
          # The OIDC redirect URL. Should be in the form <argo-root-url>/oauth2/callback.
          redirectUrl = "https://${var.external-url}/${local.argo-workflows-prefix}/oauth2/callback"
          rbac = {
            # https://argoproj.github.io/argo-workflows/argo-server-sso/#sso-rbac
            enabled         = true
            secretWhitelist = []
          }
          customGroupClaimName = "roles"
          scopes               = ["roles"]
        }
        nodeSelector = {
          "${var.node-group.key}" = var.node-group.value
        }
      }

    })
  ], var.overrides)
}

resource "kubernetes_secret" "argo-oidc-secret" {
  metadata {
    name      = "argo-server-sso"
    namespace = var.namespace
  }
  data = {
    "argo-oidc-client-id"     = module.argo-workflow-openid-client.config.client_id
    "argo-oidc-client-secret" = module.argo-workflow-openid-client.config.client_secret
  }
}

module "argo-workflow-openid-client" {
  source = "../keycloak-client"

  realm_id     = var.realm_id
  client_id    = "argo-server-sso"
  external-url = var.external-url
  role_mapping = {
    "admin"     = [local.admin]
    "developer" = [local.developer]
    "analyst"   = [local.viewer]
  }

  callback-url-paths = [
    "https://${var.external-url}/${local.argo-workflows-prefix}/oauth2/callback"
  ]
}

resource "kubernetes_manifest" "argo-workflows-middleware-stripprefix" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "Middleware"
    metadata = {
      name      = "nebari-argo-workflows-stripprefix"
      namespace = var.namespace
    }
    spec = {
      stripPrefix = {
        prefixes = [
          "/${local.argo-workflows-prefix}/"
        ]
        forceSlash = false
      }
    }
  }
}

resource "kubernetes_manifest" "argo-workflows-ingress-route" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "IngressRoute"
    metadata = {
      name      = "argo-workflows"
      namespace = var.namespace
    }
    spec = {
      entryPoints = ["websecure"]
      routes = [
        {
          kind  = "Rule"
          match = "Host(`${var.external-url}`) && Path(`/${local.argo-workflows-prefix}/validate`)"
          middlewares = concat(
            [{
              name      = kubernetes_manifest.argo-workflows-middleware-stripprefix.manifest.metadata.name
              namespace = var.namespace
            }]
          )
          services = [
            {
              name      = "wf-admission-controller"
              port      = 8080
              namespace = var.namespace
            }
          ]
        },
        {
          kind  = "Rule"
          match = "Host(`${var.external-url}`) && Path(`/${local.argo-workflows-prefix}/mutate`)"
          middlewares = concat(
            [{
              name      = kubernetes_manifest.argo-workflows-middleware-stripprefix.manifest.metadata.name
              namespace = var.namespace
            }]
          )
          services = [
            {
              name      = "wf-admission-controller"
              port      = 8080
              namespace = var.namespace
            }
          ]
        },
        {
          kind  = "Rule"
          match = "Host(`${var.external-url}`) && PathPrefix(`/${local.argo-workflows-prefix}`)"

          middlewares = concat(
            [{
              name      = kubernetes_manifest.argo-workflows-middleware-stripprefix.manifest.metadata.name
              namespace = var.namespace
            }]
          )

          services = [
            {
              name      = "${local.name}-server"
              port      = 2746
              namespace = var.namespace
            }
          ]
        },
      ]
    }
  }
}

resource "kubernetes_service_account_v1" "argo-admin-sa" {
  metadata {
    name      = local.admin
    namespace = var.namespace
    annotations = {
      "workflows.argoproj.io/rbac-rule" : "'${local.admin}' in groups"
      "workflows.argoproj.io/rbac-rule-precedence" : "11"
    }
  }
}

resource "kubernetes_secret_v1" "argo-admin-sa-token" {
  metadata {
    name      = "${local.admin}.service-account-token"
    namespace = var.namespace
    annotations = {
      "kubernetes.io/service-account.name" = kubernetes_service_account_v1.argo-admin-sa.metadata[0].name

    }
  }
  type = "kubernetes.io/service-account-token"
}

resource "kubernetes_cluster_role_binding" "argo-admin-rb" {
  metadata {
    name = local.admin
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "ClusterRole"
    name      = "argo-workflows-admin" # role deployed as part of helm chart
  }
  subject {
    kind      = "ServiceAccount"
    name      = kubernetes_service_account_v1.argo-admin-sa.metadata.0.name
    namespace = var.namespace
  }
}

resource "kubernetes_service_account_v1" "argo-developer-sa" {
  metadata {
    name      = local.developer
    namespace = var.namespace
    annotations = {
      "workflows.argoproj.io/rbac-rule" : "'${local.developer}' in groups"
      "workflows.argoproj.io/rbac-rule-precedence" : "10"
    }
  }
}

resource "kubernetes_secret_v1" "argo_dev_sa_token" {
  metadata {
    name      = "${local.developer}.service-account-token"
    namespace = var.namespace
    annotations = {
      "kubernetes.io/service-account.name" = kubernetes_service_account_v1.argo-developer-sa.metadata[0].name
    }
  }
  type = "kubernetes.io/service-account-token"
}

resource "kubernetes_cluster_role_binding" "argo-developer-rb" {
  metadata {
    name = local.developer
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "ClusterRole"
    name      = "argo-workflows-edit" # role deployed as part of helm chart
  }
  subject {
    kind      = "ServiceAccount"
    name      = kubernetes_service_account_v1.argo-developer-sa.metadata.0.name
    namespace = var.namespace
  }
}


resource "kubernetes_service_account_v1" "argo-view-sa" {
  metadata {
    name      = local.viewer
    namespace = var.namespace
    annotations = {
      "workflows.argoproj.io/rbac-rule" : "'${local.viewer}' in groups"
      "workflows.argoproj.io/rbac-rule-precedence" : "9"
    }
  }
}

resource "kubernetes_secret_v1" "argo-viewer-sa-token" {
  metadata {
    name      = "${local.viewer}.service-account-token"
    namespace = var.namespace
    annotations = {
      "kubernetes.io/service-account.name" = kubernetes_service_account_v1.argo-view-sa.metadata[0].name
    }
  }
  type = "kubernetes.io/service-account-token"
}

resource "kubernetes_cluster_role_binding" "argo-view-rb" {
  metadata {
    name = "argo-view"
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "ClusterRole"
    name      = "argo-workflows-view" # role deployed as part of helm chart
  }
  subject {
    kind      = "ServiceAccount"
    name      = kubernetes_service_account_v1.argo-view-sa.metadata.0.name
    namespace = var.namespace
  }
}

# Workflow Admission Controller
resource "kubernetes_role" "pod_viewer" {

  metadata {
    name      = "nebari-pod-viewer"
    namespace = var.namespace
  }

  rule {
    api_groups = [""]
    resources  = ["pods"]
    verbs      = ["get", "list"]
  }
}

resource "kubernetes_role" "workflow_viewer" {

  metadata {
    name      = "nebari-workflow-viewer"
    namespace = var.namespace
  }

  rule {
    api_groups = ["argoproj.io"]
    resources  = ["workflows", "cronworkflows"]
    verbs      = ["get", "list"]
  }
}

resource "kubernetes_service_account" "wf-admission-controller" {
  metadata {
    name      = "wf-admission-controller-sa"
    namespace = var.namespace
  }
}

resource "kubernetes_role_binding" "wf-admission-controller-pod-viewer" {
  metadata {
    name      = "wf-admission-controller-pod-viewer"
    namespace = var.namespace
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "Role"
    name      = kubernetes_role.pod_viewer.metadata.0.name
  }

  subject {
    kind      = "ServiceAccount"
    name      = kubernetes_service_account.wf-admission-controller.metadata.0.name
    namespace = var.namespace
  }
}

resource "kubernetes_role_binding" "wf-admission-controller-wf-viewer" {
  metadata {
    name      = "wf-admission-controller-wf-viewer"
    namespace = var.namespace
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "Role"
    name      = kubernetes_role.workflow_viewer.metadata.0.name
  }

  subject {
    kind      = "ServiceAccount"
    name      = kubernetes_service_account.wf-admission-controller.metadata.0.name
    namespace = var.namespace
  }
}


resource "kubernetes_secret" "keycloak-read-only-user-credentials" {
  metadata {
    name      = "keycloak-read-only-user-credentials"
    namespace = var.namespace
  }

  data = {
    username  = var.keycloak-read-only-user-credentials["username"]
    password  = var.keycloak-read-only-user-credentials["password"]
    client_id = var.keycloak-read-only-user-credentials["client_id"]
    realm     = var.keycloak-read-only-user-credentials["realm"]
  }

  type = "Opaque"
}


resource "kubernetes_manifest" "mutatingwebhookconfiguration_admission_controller" {
  count = var.nebari-workflow-controller ? 1 : 0

  manifest = {
    "apiVersion" = "admissionregistration.k8s.io/v1"
    "kind"       = "MutatingWebhookConfiguration"
    "metadata" = {
      "name" = "wf-admission-controller"
    }
    "webhooks" = [
      {
        "admissionReviewVersions" = [
          "v1",
          "v1beta1",
        ]

        "clientConfig" = {
          "url" = "https://${var.external-url}/${local.argo-workflows-prefix}/mutate"
        }

        "name" = "wf-mutating-admission-controller.${var.namespace}.svc"
        "rules" = [
          {
            "apiGroups" = [
              "argoproj.io",
            ]
            "apiVersions" = [
              "v1alpha1",
            ]
            "operations" = [
              "CREATE",
            ]
            "resources" = [
              "workflows",
              "cronworkflows",
            ]
          },
        ]
        "sideEffects" = "None"
      },
    ]
  }
}

resource "kubernetes_manifest" "validatingwebhookconfiguration_admission_controller" {
  count = var.nebari-workflow-controller ? 1 : 0
  manifest = {
    "apiVersion" = "admissionregistration.k8s.io/v1"
    "kind"       = "ValidatingWebhookConfiguration"
    "metadata" = {
      "name" = "wf-admission-controller"
    }
    "webhooks" = [
      {
        "admissionReviewVersions" = [
          "v1",
          "v1beta1",
        ]
        "clientConfig" = {
          "url" = "https://${var.external-url}/${local.argo-workflows-prefix}/validate"
        }
        "name" = "wf-validating-admission-controller.${var.namespace}.svc"
        "rules" = [
          {
            "apiGroups" = [
              "argoproj.io",
            ]
            "apiVersions" = [
              "v1alpha1",
            ]
            "operations" = [
              "CREATE",
            ]
            "resources" = [
              "workflows",
            ]
          },
        ]
        "sideEffects" = "None"
      },
    ]
  }
}

resource "kubernetes_manifest" "deployment_admission_controller" {
  count = var.nebari-workflow-controller ? 1 : 0
  manifest = {
    "apiVersion" = "apps/v1"
    "kind"       = "Deployment"
    "metadata" = {
      "name"      = "nebari-workflow-controller"
      "namespace" = var.namespace
    }
    "spec" = {
      "replicas" = 1
      "selector" = {
        "matchLabels" = {
          "app" = "nebari-workflow-controller"
        }
      }
      "template" = {
        "metadata" = {
          "labels" = {
            "app" = "nebari-workflow-controller"
          }
        }
        "spec" = {
          serviceAccountName           = kubernetes_service_account.wf-admission-controller.metadata.0.name
          automountServiceAccountToken = true
          "containers" = [
            {
              command = ["bash", "-c"]
              args    = ["python -m nebari_workflow_controller"]

              "env" = [
                {
                  "name" = "KEYCLOAK_USERNAME"
                  "valueFrom" = {
                    "secretKeyRef" = {
                      "key"  = "username"
                      "name" = "keycloak-read-only-user-credentials"
                    }
                  }
                },
                {
                  "name" = "KEYCLOAK_PASSWORD"
                  "valueFrom" = {
                    "secretKeyRef" = {
                      "key"  = "password"
                      "name" = "keycloak-read-only-user-credentials"
                    }
                  }
                },
                {
                  "name"  = "KEYCLOAK_URL"
                  "value" = "https://${var.external-url}/auth/"
                },
                {
                  "name"  = "NAMESPACE"
                  "value" = var.namespace
                },
              ]
              "volumeMounts" = [
                {
                  "mountPath" = "/etc/argo"
                  "name"      = "valid-argo-roles"
                  "readOnly"  = true
                },
              ]
              "image" = "quay.io/nebari/nebari-workflow-controller:${var.workflow-controller-image-tag}"
              "name"  = "admission-controller"
            },
          ]
          "volumes" = [
            {
              "name" = "valid-argo-roles"
              "configMap" = {
                "name" = "valid-argo-roles"
              }
            },
          ]
          affinity = {
            nodeAffinity = {
              requiredDuringSchedulingIgnoredDuringExecution = {
                nodeSelectorTerms = [
                  {
                    matchExpressions = [
                      {
                        key      = var.node-group.key
                        operator = "In"
                        values   = [var.node-group.value]
                      }
                    ]
                  }
                ]
              }
            }
          }
        }
      }
    }
  }
}

resource "kubernetes_manifest" "service_admission_controller" {
  manifest = {
    "apiVersion" = "v1"
    "kind"       = "Service"
    "metadata" = {
      "name"      = "wf-admission-controller"
      "namespace" = var.namespace
    }
    "spec" = {
      "ports" = [
        {
          "name"       = "wf-admission-controller"
          "port"       = 8080
          "targetPort" = 8080
        },
      ]
      "selector" = {
        "app" = "nebari-workflow-controller"
      }
    }
  }
}

resource "kubernetes_config_map" "valid-argo-roles" {
  metadata {
    name      = "valid-argo-roles"
    namespace = var.namespace
  }

  data = {
    "valid-argo-roles" = jsonencode([local.admin, local.developer])
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/argo-workflows/values.yaml
---

# https://github.com/argoproj/argo-helm/blob/argo-workflows-0.22.9/charts/argo-workflows/values.yaml



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/argo-workflows/variables.tf
---

variable "namespace" {
  description = "Namespace in which to deploy the Argo server"
  type        = string
  default     = "dev"
}

variable "argo-workflows-namespace" {
  description = "Namespace in which to deploy Argo Workflows"
  type        = string
  default     = "dev"
}

variable "node-group" {
  description = "Node key value pair for bound resources"
  type = object({
    key   = string
    value = string
  })
}

variable "external-url" {
  description = "External URL where the JupyterHub cluster is accessible"
  type        = string
}


variable "overrides" {
  description = "Argo Workflows helm chart overrides"
  type        = list(string)
  default     = []
}

variable "realm_id" {
  description = "Keycloak realm to use for deploying openid client"
  type        = string
}

variable "keycloak-read-only-user-credentials" {
  sensitive   = true
  description = "Keycloak credentials (username, password, client_id, realm) for the read-only user"
  type        = map(string)
  default     = {}
}

variable "workflow-controller-image-tag" {
  description = "Image tag for nebari-workflow-controller"
  type        = string
}

variable "nebari-workflow-controller" {
  description = "Nebari Workflow Controller enabled"
  type        = bool
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/argo-workflows/versions.tf
---

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "2.1.2"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.35.1"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/conda-store/config/conda_store_config.py
---

import dataclasses
import json
import logging
import re
import tempfile
import typing
import urllib
import urllib.parse
import urllib.request
from pathlib import Path

import requests
from conda_store_server import api
from conda_store_server._internal.server.dependencies import get_conda_store
from conda_store_server.server import schema as auth_schema
from conda_store_server.server.auth import GenericOAuthAuthentication
from conda_store_server.storage import S3Storage


def conda_store_config(path="/var/lib/conda-store/config.json"):
    with open(path) as f:
        return json.load(f)


config = conda_store_config()

# ==================================
#      conda-store settings
# ==================================
c.CondaStore.storage_class = S3Storage
c.CondaStore.store_directory = "/home/conda/"
c.CondaStore.database_url = f"postgresql+psycopg2://{config['postgres-username']}:{config['postgres-password']}@{config['postgres-service']}/conda-store"
c.CondaStore.redis_url = (
    f"redis://:{config['redis-password']}@{config['redis-service']}:6379/0"
)
c.CondaStore.default_uid = 1000
c.CondaStore.default_gid = 100
c.CondaStore.default_permissions = "555"
c.CondaStore.conda_included_packages = ["ipykernel"]

c.S3Storage.internal_endpoint = f"{config['minio-service']}:9000"
c.S3Storage.internal_secure = False
c.S3Storage.external_endpoint = f"{config['external-url']}:9080"
c.S3Storage.external_secure = True
c.S3Storage.access_key = config["minio-username"]
c.S3Storage.secret_key = config["minio-password"]
c.S3Storage.region = "us-east-1"  # minio region default
c.S3Storage.bucket_name = "conda-store"

c.CondaStore.default_namespace = "global"
c.CondaStore.filesystem_namespace = config["default-namespace"]
c.CondaStore.conda_allowed_channels = []  # allow all channels
c.CondaStore.conda_indexed_channels = [
    "main",
    "conda-forge",
    "https://repo.anaconda.com/pkgs/main",
]
c.RBACAuthorizationBackend.role_mappings_version = 2

# ==================================
#        server settings
# ==================================
c.CondaStoreServer.log_level = logging.INFO
c.CondaStoreServer.log_format = (
    "%(asctime)s %(levelname)9s %(name)s:%(lineno)4s: %(message)s"
)
c.CondaStoreServer.enable_ui = True
c.CondaStoreServer.enable_api = True
c.CondaStoreServer.enable_registry = True
c.CondaStoreServer.enable_metrics = True
c.CondaStoreServer.address = "0.0.0.0"
c.CondaStoreServer.port = 5000
c.CondaStoreServer.behind_proxy = True
# This MUST start with `/`
c.CondaStoreServer.url_prefix = "/conda-store"

# ==================================
#         auth settings
# ==================================

c.GenericOAuthAuthentication.access_token_url = config["openid-config"]["token_url"]
c.GenericOAuthAuthentication.authorize_url = config["openid-config"][
    "authentication_url"
]
c.GenericOAuthAuthentication.user_data_url = config["openid-config"]["userinfo_url"]
c.GenericOAuthAuthentication.oauth_callback_url = (
    f"https://{config['external-url']}/conda-store/oauth_callback"
)
c.GenericOAuthAuthentication.client_id = config["openid-config"]["client_id"]
c.GenericOAuthAuthentication.client_secret = config["openid-config"]["client_secret"]
c.GenericOAuthAuthentication.access_scope = "profile"
c.GenericOAuthAuthentication.user_data_key = "preferred_username"
c.GenericOAuthAuthentication.tls_verify = False

CONDA_STORE_ROLE_PERMISSIONS_ORDER = ["viewer", "developer", "admin"]


@dataclasses.dataclass
class CondaStoreNamespaceRole:
    namespace: str
    role: str


@dataclasses.dataclass
class KeyCloakCondaStoreRoleScopes:
    scopes: str
    log: logging.Logger

    def _validate_role(self, role):
        valid = role in CONDA_STORE_ROLE_PERMISSIONS_ORDER
        self.log.info(f"role: {role} is {'valid' if valid else 'invalid'}")
        return valid

    def parse_role_and_namespace(
        self, text
    ) -> typing.Optional[CondaStoreNamespaceRole]:
        # Matches "<role>!namespace=<namespace>", e.g. "viewer!namespace=scipy"
        pattern = r"^(\w+)!namespace=([^!]+)$"

        # Perform the regex search
        match = re.search(pattern, text)

        # Extract the permission and namespace if there is a match
        if match and self._validate_role(match.group(1)):
            return CondaStoreNamespaceRole(
                namespace=match.group(2), role=match.group(1)
            )
        else:
            return None

    def parse_scope(self) -> typing.List[CondaStoreNamespaceRole]:
        """Parse scopes from a keycloak role's attribute and return a list of
        role/namespace pairs. Return [] if the syntax of any scope is invalid.

        Example:
            Given scopes as "viewer!namespace=scipy,admin!namespace=pycon", the function will
            return [CondaStoreNamespaceRole(namespace="scipy", role="viewer"),
            CondaStoreNamespaceRole(namespace="pycon", role="admin")]
        """
        if not self.scopes:
            self.log.info(f"No scope found: {self.scopes}, skipping role")
            return []
        scope_list = self.scopes.split(",")
        parsed_scopes = []
        self.log.info(f"Scopes to parse: {scope_list}")
        for scope_text in scope_list:
            parsed_scope = self.parse_role_and_namespace(scope_text)
            if not parsed_scope:
                self.log.info(f"Unable to parse: {scope_text}, skipping keycloak role")
                return []
            parsed_scopes.append(parsed_scope)
        return parsed_scopes


class KeyCloakAuthentication(GenericOAuthAuthentication):
    conda_store_api_url = f"https://{config['external-url']}/conda-store/api/v1"
    access_token_url = config["token_url_internal"]
    realm_api_url = config["realm_api_url_internal"]
    service_account_token = config["service-tokens-mapping"][
        "conda-store-service-account"
    ]

    def _get_conda_store_client_id(self, token: str) -> str:
        # Get the clients list to find the "id" of the "conda_store" client.
        self.log.info("Getting conda store client id")
        clients_data = self._fetch_api(endpoint="clients/", token=token)
        conda_store_clients = [
            client for client in clients_data if client["clientId"] == "conda_store"
        ]
        self.log.info(f"conda store clients: {conda_store_clients}")
        assert len(conda_store_clients) == 1
        conda_store_client_id = conda_store_clients[0]["id"]
        return conda_store_client_id

    async def _delete_conda_store_roles(self, request, namespace: str, username: str):
        self.log.info(
            f"Delete all conda-store roles on namespace: {namespace} for user: {username}"
        )
        conda_store = await get_conda_store(request)
        with conda_store.session_factory() as db:
            api.delete_namespace_role(db, namespace, other=username)
            db.commit()

    async def _create_conda_store_role(
        self, request, namespace_role: CondaStoreNamespaceRole, username: str
    ):
        self.log.info(
            f"Creating conda-store roles on namespace: {namespace_role.namespace} for user: {username}"
        )
        conda_store = await get_conda_store(request)
        with conda_store.session_factory() as db:
            api.create_namespace_role(
                db, namespace_role.namespace, username, namespace_role.role
            )
            db.commit()

    def _get_keycloak_token(self) -> str:
        body = urllib.parse.urlencode(
            {
                "client_id": self.client_id,
                "client_secret": self.client_secret,
                "grant_type": "client_credentials",
            }
        )
        self.log.info(f"Getting token from access token url: {self.access_token_url}")
        req = urllib.request.Request(self.access_token_url, data=body.encode())
        response = urllib.request.urlopen(req)
        data = json.loads(response.read())
        return data["access_token"]  # type: ignore[no-any-return]

    def _fetch_api(self, endpoint: str, token: str):
        request_url = f"{self.realm_api_url}/{endpoint}"
        req = urllib.request.Request(
            request_url,
            method="GET",
            headers={"Authorization": f"Bearer {token}"},
        )
        self.log.info(f"Making request to: {request_url}")
        with urllib.request.urlopen(req) as response:
            data = json.loads(response.read())
        return data

    async def _remove_current_bindings(self, request, username):
        """Remove current roles for the user to make sure only the roles defined in
        keycloak are applied:
        - to avoid inconsistency in user roles
        - single source of truth
        - roles that are added in keycloak and then later removed from keycloak are actually removed from conda-store.
        """
        entity_bindings = self._get_current_entity_bindings(username)
        self.log.info("Remove current role bindings for the user")
        for entity, role in entity_bindings.items():
            if entity not in {"default/*", "filesystem/*"}:
                namespace = entity.split("/")[0]
                self.log.info(
                    f"Removing current role {role} on namespace {namespace} "
                    f"for user {username}"
                )
                await self._delete_conda_store_roles(request, namespace, username)

    async def _apply_roles_from_keycloak(self, request, user_data):
        token = self._get_keycloak_token()
        conda_store_client_id = self._get_conda_store_client_id(token)
        conda_store_client_roles = self._get_conda_store_client_roles_for_user(
            user_data["sub"], conda_store_client_id, token
        )
        await self._remove_current_bindings(request, user_data["preferred_username"])
        await self._apply_conda_store_roles_from_keycloak(
            request, conda_store_client_roles, user_data["preferred_username"]
        )

    def _filter_duplicate_namespace_roles_with_max_permissions(
        self, namespace_roles: typing.List[CondaStoreNamespaceRole]
    ):
        """Filter duplicate roles from keycloak so that, for each namespace, only the
        role with the highest permissions is applied.

        Example:
            role 1: namespace: foo, role: viewer
            role 2: namespace: foo, role: admin
        We need to apply only the role 2 as that one has higher permissions.
        """
        self.log.info("Filtering duplicate roles for same namespace")
        namespace_role_mapping: typing.Dict[str, CondaStoreNamespaceRole] = {}
        for namespace_role in namespace_roles:
            namespace = namespace_role.namespace
            new_role = namespace_role.role

            existing_role: CondaStoreNamespaceRole = namespace_role_mapping.get(
                namespace
            )
            if not existing_role:
                # Add if not already added
                namespace_role_mapping[namespace] = namespace_role
            else:
                # Replace only if this role has higher permissions than the existing one
                new_role_priority = CONDA_STORE_ROLE_PERMISSIONS_ORDER.index(new_role)
                existing_role_priority = CONDA_STORE_ROLE_PERMISSIONS_ORDER.index(
                    existing_role.role
                )
                if new_role_priority > existing_role_priority:
                    # Store the full CondaStoreNamespaceRole, not just the role name
                    namespace_role_mapping[namespace] = namespace_role
        return list(namespace_role_mapping.values())

    def _get_permissions_from_keycloak_role(
        self, keycloak_role
    ) -> typing.List[CondaStoreNamespaceRole]:
        self.log.info(f"Getting permissions from keycloak role: {keycloak_role}")
        role_attributes = keycloak_role["attributes"]
        # "scopes" is a list holding a single string, e.g. ["viewer!namespace=pycon,developer!namespace=scipy"]
        scopes = role_attributes.get("scopes", [""])[0]
        k_cstore_scopes = KeyCloakCondaStoreRoleScopes(scopes=scopes, log=self.log)
        return k_cstore_scopes.parse_scope()

    async def _apply_conda_store_roles_from_keycloak(
        self, request, conda_store_client_roles, username
    ):
        self.log.info(
            f"Apply conda store roles from keycloak roles: {conda_store_client_roles}, user: {username}"
        )
        role_permissions: typing.List[CondaStoreNamespaceRole] = []
        for conda_store_client_role in conda_store_client_roles:
            role_permissions += self._get_permissions_from_keycloak_role(
                conda_store_client_role
            )

        self.log.info("Filtering duplicate namespace role for max permissions")
        filtered_namespace_role: typing.List[CondaStoreNamespaceRole] = (
            self._filter_duplicate_namespace_roles_with_max_permissions(
                role_permissions
            )
        )
        self.log.info(f"Final role permissions to apply: {filtered_namespace_role}")
        for namespace_role in filtered_namespace_role:
            if namespace_role.namespace.lower() == username.lower():
                self.log.info("Role for given user's namespace, skipping")
                continue
            try:
                await self._delete_conda_store_roles(
                    request, namespace_role.namespace, username
                )
                await self._create_conda_store_role(request, namespace_role, username)
            except ValueError as e:
                self.log.error(
                    f"Failed to add permissions for namespace: {namespace_role.namespace} to user: {username}"
                )
                self.log.exception(e)

    def _get_keycloak_conda_store_roles_with_attributes(
        self, roles: typing.List[dict], client_id: str, token: str
    ):
        """Fetch each role by id to retrieve its attributes."""
        roles_rich = []
        for role in roles:
            # If this takes too much time, which isn't the case right now, we can
            # also do multi-threaded requests
            role_rich = self._fetch_api(
                endpoint=f"roles-by-id/{role['id']}?client={client_id}", token=token
            )
            roles_rich.append(role_rich)
        return roles_rich

    def _get_conda_store_client_roles_for_user(
        self, user_id, conda_store_client_id, token
    ):
        """Get roles for the client named 'conda_store' for the given user_id."""
        self.log.info(
            f"Get conda store client roles for user: {user_id}, conda_store_client_id: {conda_store_client_id}"
        )
        user_roles = self._fetch_api(
            endpoint=f"users/{user_id}/role-mappings/clients/{conda_store_client_id}/composite",
            token=token,
        )
        client_roles_rich = self._get_keycloak_conda_store_roles_with_attributes(
            user_roles, client_id=conda_store_client_id, token=token
        )
        self.log.info(f"conda store client roles: {client_roles_rich}")
        return client_roles_rich

    def _get_current_entity_bindings(self, username):
        entity = auth_schema.AuthenticationToken(
            primary_namespace=username, role_bindings={}
        )
        self.log.info(f"entity: {entity}")
        entity_bindings = self.authorization.get_entity_bindings(entity)
        self.log.info(f"current entity_bindings: {entity_bindings}")
        return entity_bindings

    async def authenticate(self, request):
        oauth_access_token = self._get_oauth_token(request)
        if oauth_access_token is None:
            return None  # authentication failed

        response = requests.get(
            self.user_data_url,
            headers={"Authorization": f"Bearer {oauth_access_token}"},
            verify=self.tls_verify,
        )
        response.raise_for_status()
        user_data = response.json()
        username = user_data["preferred_username"]

        try:
            await self._apply_roles_from_keycloak(request, user_data=user_data)
        except Exception as e:
            self.log.error("Adding roles from keycloak failed")
            self.log.exception(e)

        # superadmin gets access to everything
        if "conda_store_superadmin" in user_data.get("roles", []):
            return auth_schema.AuthenticationToken(
                primary_namespace=username,
                role_bindings={"*/*": {"admin"}},
            )

        role_mappings = {
            "conda_store_admin": "admin",
            "conda_store_developer": "developer",
            "conda_store_viewer": "viewer",
        }
        roles = {
            role_mappings[role]
            for role in user_data.get("roles", [])
            if role in role_mappings
        }
        default_namespace = config["default-namespace"]
        self.log.info(f"default_namespace: {default_namespace}")
        namespaces = {username, "global", default_namespace}
        self.log.info(f"namespaces: {namespaces}")
        role_bindings = {
            f"{username}/*": {"admin"},
            f"{default_namespace}/*": {"viewer"},
            "global/*": roles,
        }

        for group in user_data.get("groups", []):
            # Use only the base name of Keycloak groups
            group_name = Path(group).name
            namespaces.add(group_name)
            role_bindings[f"{group_name}/*"] = roles

        conda_store = await get_conda_store(request)
        with conda_store.session_factory() as db:
            for namespace in namespaces:
                _namespace = api.get_namespace(db, name=namespace)
                if _namespace is None:
                    api.ensure_namespace(db, name=namespace)

        return auth_schema.AuthenticationToken(
            primary_namespace=username,
            role_bindings=role_bindings,
        )


c.CondaStoreServer.authentication_class = KeyCloakAuthentication
c.AuthenticationBackend.predefined_tokens = {
    service_token: service_permissions
    for service_token, service_permissions in config["service-tokens"].items()
}

# ==================================
#         worker settings
# ==================================
c.CondaStoreWorker.log_level = logging.INFO
c.CondaStoreWorker.watch_paths = ["/opt/environments"]
c.CondaStoreWorker.concurrency = 4

# Template used to form the directory for symlinking conda environment builds.
c.CondaStore.environment_directory = "/home/conda/{namespace}/envs/{namespace}-{name}"

# extra-settings to apply simply as `c.Class.key = value`
conda_store_settings = config["extra-settings"]
for classname, attributes in conda_store_settings.items():
    for attribute, value in attributes.items():
        setattr(getattr(c, classname), attribute, value)

# run arbitrary python code
# compiling makes debugging easier: https://stackoverflow.com/a/437857
extra_config_filename = Path(tempfile.gettempdir()) / "extra-config.py"
extra_config = config.get("extra-config", "")
with open(extra_config_filename, "w") as f:
    f.write(extra_config)
exec(compile(source=extra_config, filename=extra_config_filename, mode="exec"))



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/conda-store/output.tf
---

output "endpoint" {
  description = "DNS name of the conda-store NFS server endpoint"
  value       = "${kubernetes_service.nfs.metadata.0.name}.${var.namespace}.svc.cluster.local"
}

output "endpoint_ip" {
  description = "IP Address of conda-store nfs server"
  value       = kubernetes_service.nfs.spec.0.cluster_ip
}

output "service_name" {
  description = "Kubernetes service name for accessing conda-store server"
  value       = "${kubernetes_service.server.metadata.0.name}:${kubernetes_service.server.spec.0.port.0.port}"
}

output "service-tokens" {
  description = "Service tokens for conda-store"
  value       = { for k, _ in var.services : k => base64encode(random_password.conda_store_service_token[k].result) }
}

output "pvc" {
  description = "Shared PVC name for conda-store"
  value       = local.shared-pvc
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/conda-store/server.tf
---

resource "random_password" "conda_store_service_token" {
  for_each = var.services

  length  = 32
  special = false
}

resource "kubernetes_secret" "conda-store-secret" {
  metadata {
    name      = "conda-store-secret"
    namespace = var.namespace
  }

  data = {
    "config.json" = jsonencode({
      external-url           = var.external-url
      minio-username         = module.minio.root_username
      minio-password         = module.minio.root_password
      minio-service          = module.minio.service
      redis-password         = module.redis.root_password
      redis-service          = module.redis.service
      postgres-username      = module.postgresql.root_username
      postgres-password      = module.postgresql.root_password
      postgres-service       = module.postgresql.service
      openid-config          = module.conda-store-openid-client.config
      extra-settings         = var.extra-settings
      extra-config           = var.extra-config
      default-namespace      = var.default-namespace-name
      token_url_internal     = "http://keycloak-http.${var.namespace}.svc/auth/realms/${var.realm_id}/protocol/openid-connect/token"
      realm_api_url_internal = "http://keycloak-http.${var.namespace}.svc/auth/admin/realms/${var.realm_id}"
      service-tokens = {
        for service, value in var.services : base64encode(random_password.conda_store_service_token[service].result) => value
      }
      # So that the mapping can be used in conda-store config itself
      service-tokens-mapping = {
        for service, _ in var.services : service => base64encode(random_password.conda_store_service_token[service].result)
      }
    })
  }
}


resource "kubernetes_config_map" "conda-store-config" {
  metadata {
    name      = "conda-store-config"
    namespace = var.namespace
  }

  data = {
    "conda_store_config.py" = file("${path.module}/config/conda_store_config.py")
  }
}


module "conda-store-openid-client" {
  source = "../keycloak-client"

  realm_id     = var.realm_id
  client_id    = "conda_store"
  external-url = var.external-url
  role_mapping = {
    "superadmin" = ["conda_store_superadmin"]
    "admin"      = ["conda_store_admin"]
    "developer"  = ["conda_store_developer"]
    "analyst"    = ["conda_store_developer"]
  }
  callback-url-paths = [
    "https://${var.external-url}/conda-store/oauth_callback"
  ]
  service-accounts-enabled = true
  service-account-roles = [
    "view-realm", "view-users", "view-clients"
  ]
}


resource "kubernetes_service" "server" {
  metadata {
    name      = "${var.name}-conda-store-server"
    namespace = var.namespace
    labels = {
      app       = "conda-store"
      component = "conda-store-server"
    }
  }

  spec {
    selector = {
      role = "${var.name}-conda-store-server"
    }

    port {
      name = "conda-store-server"
      port = 5000
    }
  }
}


resource "kubernetes_deployment" "server" {
  metadata {
    name      = "${var.name}-conda-store-server"
    namespace = var.namespace
    labels = {
      role = "${var.name}-conda-store-server"
    }
  }

  spec {
    replicas = 1

    selector {
      match_labels = {
        role = "${var.name}-conda-store-server"
      }
    }

    template {
      metadata {
        labels = {
          role = "${var.name}-conda-store-server"
        }

        annotations = {
          # This lets us autorestart when the config changes!
          "checksum/config-map" = sha256(jsonencode(kubernetes_config_map.conda-store-config.data))
          "checksum/secret"     = sha256(jsonencode(kubernetes_secret.conda-store-secret.data))
        }
      }

      spec {
        affinity {
          node_affinity {
            required_during_scheduling_ignored_during_execution {
              node_selector_term {
                match_expressions {
                  key      = var.node-group.key
                  operator = "In"
                  values = [
                    var.node-group.value
                  ]
                }
              }
            }
          }
        }

        container {
          name  = "conda-store-server"
          image = "${var.conda-store-image}:${var.conda-store-image-tag}"

          args = [
            "conda-store-server",
            "--config",
            "/etc/conda-store/conda_store_config.py"
          ]

          volume_mount {
            name       = "config"
            mount_path = "/etc/conda-store"
          }

          volume_mount {
            name       = "secret"
            mount_path = "/var/lib/conda-store/"
          }

          volume_mount {
            name       = "home-volume"
            mount_path = "/home/conda"
          }
        }

        volume {
          name = "config"
          config_map {
            name = kubernetes_config_map.conda-store-config.metadata.0.name
          }
        }

        volume {
          name = "secret"
          secret {
            secret_name = kubernetes_secret.conda-store-secret.metadata.0.name
          }
        }

        volume {
          name = "home-volume"
          empty_dir {
            size_limit = "1Mi"
          }
        }
      }
    }
  }
}


resource "kubernetes_manifest" "jupyterhub" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "IngressRoute"
    metadata = {
      name      = "conda-store-server"
      namespace = var.namespace
    }
    spec = {
      entryPoints = ["websecure"]
      routes = [
        {
          kind  = "Rule"
          match = "Host(`${var.external-url}`) && PathPrefix(`/conda-store`)"
          services = [
            {
              name = kubernetes_service.server.metadata.0.name
              port = 5000
            }
          ]
        }
      ]
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/conda-store/shared-pvc.tf
---

module "conda-store-nfs-mount" {
  count  = var.conda-store-fs == "nfs" ? 1 : 0
  source = "../../../../modules/kubernetes/nfs-mount"

  name         = "conda-store"
  namespace    = var.namespace
  nfs_capacity = var.nfs_capacity
  nfs_endpoint = kubernetes_service.nfs.spec.0.cluster_ip
  nfs-pvc-name = local.conda-store-pvc-name

  depends_on = [
    kubernetes_deployment.worker,
  ]
}


locals {
  conda-store-pvc-name     = "conda-store-${var.namespace}-share"
  new-pvc-name             = "nebari-conda-store-storage"
  create-pvc               = var.conda-store-fs == "nfs"
  enable-nfs-server-worker = var.conda-store-fs == "nfs"
  pvc-name                 = var.conda-store-fs == "nfs" ? local.new-pvc-name : local.conda-store-pvc-name
  shared-pvc               = var.conda-store-fs == "nfs" ? module.conda-store-nfs-mount[0].persistent_volume_claim.pvc : module.conda-store-cephfs-mount[0].persistent_volume_claim.pvc
}



module "conda-store-cephfs-mount" {
  count  = var.conda-store-fs == "cephfs" ? 1 : 0
  source = "../../../../modules/kubernetes/cephfs-mount"

  name          = "conda-store"
  namespace     = var.namespace
  fs_capacity   = var.nfs_capacity # conda-store-filesystem-storage
  ceph-pvc-name = local.conda-store-pvc-name
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/conda-store/storage.tf
---

module "minio" {
  source = "../minio"

  name         = "nebari-conda-store"
  namespace    = var.namespace
  external-url = var.external-url

  node-group = var.node-group

  storage = var.minio_capacity

  buckets = [
    "conda-store"
  ]
}


module "postgresql" {
  source = "../postgresql"

  name      = "nebari-conda-store"
  namespace = var.namespace

  node-group = var.node-group

  database = "conda-store"
}


module "redis" {
  source = "../redis"

  name      = "nebari-conda-store"
  namespace = var.namespace

  node-group = var.node-group
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/conda-store/variables.tf
---

variable "name" {
  description = "Prefix name for conda-store server kubernetes resource"
  type        = string
}

variable "namespace" {
  description = "Namespace to deploy conda-store server"
  type        = string
}

variable "nfs_capacity" {
  description = "Capacity of conda-store filesystem"
  type        = string
  default     = "10Gi"
}

variable "minio_capacity" {
  description = "Capacity of conda-store object storage"
  type        = string
  default     = "10Gi"
}

variable "environments" {
  description = "conda environments for conda-store to build"
  type        = map(any)
  default     = {}
}

variable "node-group" {
  description = "Node key-value pair used to bind general resources"
  type = object({
    key   = string
    value = string
  })
}

variable "conda-store-image" {
  description = "Conda-Store image"
  type        = string
  default     = "quansight/conda-store-server"
}

variable "conda-store-image-tag" {
  description = "Version of conda-store to use"
  type        = string
}

variable "external-url" {
  description = "External URL at which the jupyterhub cluster is accessible"
  type        = string
}

variable "realm_id" {
  description = "Keycloak realm to use for deploying openid client"
  type        = string
}

variable "extra-settings" {
  description = "Additional traitlets settings to apply before extra-config traitlets code is run"
  type        = map(any)
  default     = {}
}

variable "extra-config" {
  description = "Additional traitlets configuration code to be run"
  type        = string
  default     = ""
}

variable "default-namespace-name" {
  description = "Name of the default conda-store namespace"
  type        = string
}

variable "services" {
  description = "Map of service tokens and scopes for conda-store"
  type        = map(any)
}

variable "conda-store-fs" {
  type        = string
  description = "Use NFS or Ceph"

  validation {
    condition     = contains(["cephfs", "nfs"], var.conda-store-fs)
    error_message = "Allowed values for conda-store-fs are \"cephfs\" or \"nfs\"."
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/conda-store/worker.tf
---

resource "kubernetes_service" "nfs" {
  metadata {
    name      = "${var.name}-conda-store-nfs"
    namespace = var.namespace
  }

  spec {
    selector = {
      role = "${var.name}-conda-store-worker"
    }

    port {
      name = "nfs"
      port = 2049
    }

    port {
      name = "mountd"
      port = 20048
    }

    port {
      name = "rpcbind"
      port = 111
    }
  }
}


resource "kubernetes_persistent_volume_claim" "main" {
  count = local.create-pvc ? 1 : 0

  metadata {
    name      = "${var.name}-conda-store-storage"
    namespace = var.namespace
  }

  spec {
    access_modes = ["ReadWriteOnce"]
    resources {
      requests = {
        # var.nfs_capacity already carries its unit (default "10Gi")
        storage = var.nfs_capacity
      }
    }
  }
}


resource "kubernetes_config_map" "conda-store-environments" {
  metadata {
    name      = "conda-environments"
    namespace = var.namespace
  }

  data = var.environments
}


resource "kubernetes_deployment" "worker" {
  metadata {
    name      = "${var.name}-conda-store-worker"
    namespace = var.namespace
    labels = {
      role = "${var.name}-conda-store-worker"
    }
  }

  spec {
    replicas = 1

    selector {
      match_labels = {
        role = "${var.name}-conda-store-worker"
      }
    }

    template {
      metadata {
        labels = {
          role = "${var.name}-conda-store-worker"
        }

        annotations = {
          # This lets us autorestart when the config changes!
          "checksum/config-map"         = sha256(jsonencode(kubernetes_config_map.conda-store-config.data))
          "checksum/secret"             = sha256(jsonencode(kubernetes_secret.conda-store-secret.data))
          "checksum/conda-environments" = sha256(jsonencode(kubernetes_config_map.conda-store-environments.data))
        }
      }

      spec {
        affinity {
          node_affinity {
            required_during_scheduling_ignored_during_execution {
              node_selector_term {
                match_expressions {
                  key      = var.node-group.key
                  operator = "In"
                  values = [
                    var.node-group.value
                  ]
                }
              }
            }
          }
        }

        container {
          name  = "conda-store-worker"
          image = "${var.conda-store-image}:${var.conda-store-image-tag}"

          args = [
            "conda-store-worker",
            "--config",
            "/etc/conda-store/conda_store_config.py"
          ]

          volume_mount {
            name       = "config"
            mount_path = "/etc/conda-store"
          }

          volume_mount {
            name       = "environments"
            mount_path = "/opt/environments"
          }

          volume_mount {
            name       = "storage"
            mount_path = "/home/conda"
          }

          volume_mount {
            name       = "secret"
            mount_path = "/var/lib/conda-store/"
          }
        }

        dynamic "container" {
          for_each = local.enable-nfs-server-worker ? [1] : []
          content {
            name  = "nfs-server"
            image = "gcr.io/google_containers/volume-nfs:0.8"

            port {
              name           = "nfs"
              container_port = 2049
            }

            port {
              name           = "mountd"
              container_port = 20048
            }

            port {
              name           = "rpcbind"
              container_port = 111
            }

            security_context {
              privileged = true
            }

            volume_mount {
              mount_path = "/exports"
              name       = "storage"
            }
          }
        }

        volume {
          name = "config"
          config_map {
            name = kubernetes_config_map.conda-store-config.metadata.0.name
          }
        }

        volume {
          name = "secret"
          secret {
            secret_name = kubernetes_secret.conda-store-secret.metadata.0.name
          }
        }

        volume {
          name = "environments"
          config_map {
            name = kubernetes_config_map.conda-store-environments.metadata.0.name
          }
        }

        volume {
          name = "storage"
          persistent_volume_claim {
            # On AWS the PVC gets stuck in a provisioning state if we
            # reference it directly; this may no longer be an issue in
            # the future.
            # claim_name = kubernetes_persistent_volume_claim.main.metadata.0.name
            claim_name = local.pvc-name
          }
        }
        security_context {
          run_as_group = 0
          run_as_user  = 0
        }
      }
    }
  }
  depends_on = [
    module.conda-store-cephfs-mount
  ]

  lifecycle {
    replace_triggered_by = [
      null_resource.pvc
    ]
  }
}

resource "null_resource" "pvc" {
  triggers = {
    pvc = var.conda-store-fs
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/dask-gateway/files/controller_config.py
---

import json


def dask_gateway_config(path="/var/lib/dask-gateway/config.json"):
    with open(path) as f:
        return json.load(f)


config = dask_gateway_config()

c.KubeController.address = ":8000"
c.KubeController.api_url = f'http://{config["gateway_service_name"]}.{config["gateway_service_namespace"]}:8000/api'
c.KubeController.gateway_instance = config["gateway_service_name"]
c.KubeController.proxy_prefix = config["gateway"]["prefix"]
c.KubeController.proxy_web_middlewares = [
    {
        "name": config["gateway_cluster_middleware_name"],
        "namespace": config["gateway_cluster_middleware_namespace"],
    }
]
c.KubeController.log_level = config["controller"]["loglevel"]
c.KubeController.completed_cluster_max_age = config["controller"][
    "completedClusterMaxAge"
]
c.KubeController.completed_cluster_cleanup_period = config["controller"][
    "completedClusterCleanupPeriod"
]
c.KubeController.backoff_base_delay = config["controller"]["backoffBaseDelay"]
c.KubeController.backoff_max_delay = config["controller"]["backoffMaxDelay"]
c.KubeController.k8s_api_rate_limit = config["controller"]["k8sApiRateLimit"]
c.KubeController.k8s_api_rate_limit_burst = config["controller"]["k8sApiRateLimitBurst"]

c.KubeController.proxy_web_entrypoint = "websecure"
c.KubeController.proxy_tcp_entrypoint = "tcp"



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/dask-gateway/files/gateway_config.py
---

import functools
import json
from pathlib import Path

import urllib3
from aiohttp import web
from dask_gateway_server.auth import JupyterHubAuthenticator
from dask_gateway_server.options import Mapping, Options, Select


def dask_gateway_config(path="/var/lib/dask-gateway/config.json"):
    with open(path) as f:
        return json.load(f)


config = dask_gateway_config()


c.DaskGateway.log_level = config["gateway"]["loglevel"]

# Configure addresses
c.DaskGateway.address = ":8000"
c.KubeBackend.api_url = f'http://{config["gateway_service_name"]}.{config["gateway_service_namespace"]}:8000/api'

c.DaskGateway.backend_class = "dask_gateway_server.backends.kubernetes.KubeBackend"
c.KubeBackend.gateway_instance = config["gateway_service_name"]

# ========= Dask Cluster Default Configuration =========
c.KubeClusterConfig.image = (
    f"{config['cluster-image']['name']}:{config['cluster-image']['tag']}"
)
c.KubeClusterConfig.image_pull_policy = config["cluster"]["image_pull_policy"]
c.KubeClusterConfig.environment = config["cluster"]["environment"]
c.KubeClusterConfig.idle_timeout = config["cluster"]["idle_timeout"]

c.KubeClusterConfig.scheduler_cores = config["cluster"]["scheduler_cores"]
c.KubeClusterConfig.scheduler_cores_limit = config["cluster"]["scheduler_cores_limit"]
c.KubeClusterConfig.scheduler_memory = config["cluster"]["scheduler_memory"]
c.KubeClusterConfig.scheduler_memory_limit = config["cluster"]["scheduler_memory_limit"]
c.KubeClusterConfig.scheduler_extra_container_config = config["cluster"][
    "scheduler_extra_container_config"
]
c.KubeClusterConfig.scheduler_extra_pod_config = config["cluster"][
    "scheduler_extra_pod_config"
]

c.KubeClusterConfig.worker_cores = config["cluster"]["worker_cores"]
c.KubeClusterConfig.worker_cores_limit = config["cluster"]["worker_cores_limit"]
c.KubeClusterConfig.worker_memory = config["cluster"]["worker_memory"]
c.KubeClusterConfig.worker_memory_limit = config["cluster"]["worker_memory_limit"]
c.KubeClusterConfig.worker_threads = config["cluster"].get(
    "worker_threads", config["cluster"]["worker_cores"]
)
c.KubeClusterConfig.worker_extra_container_config = config["cluster"][
    "worker_extra_container_config"
]
c.KubeClusterConfig.worker_extra_pod_config = config["cluster"][
    "worker_extra_pod_config"
]


# ============ Authentication =================
class NebariAuthentication(JupyterHubAuthenticator):
    async def authenticate(self, request):
        user = await super().authenticate(request)
        url = f"{self.jupyterhub_api_url}/users/{user.name}"
        kwargs = {
            "headers": {"Authorization": "token %s" % self.jupyterhub_api_token},
            "ssl": self.ssl_context,
        }
        resp = await self.session.get(url, **kwargs)
        data = (await resp.json())["auth_state"]["oauth_user"]

        if (
            "dask_gateway_developer" not in data["roles"]
            and "dask_gateway_admin" not in data["roles"]
        ):
            raise web.HTTPInternalServerError(
                reason="Permission failure user does not have required dask_gateway roles"
            )

        user.admin = "dask_gateway_admin" in data["roles"]
        user.groups = [Path(group).name for group in data["groups"]]
        return user


c.DaskGateway.authenticator_class = NebariAuthentication
c.JupyterHubAuthenticator.jupyterhub_api_url = config["jupyterhub_api_url"]
c.JupyterHubAuthenticator.jupyterhub_api_token = config["jupyterhub_api_token"]


# ==================== Profiles =======================
def list_dask_environments():
    necessary_dask_packages = {"dask", "distributed", "dask-gateway"}
    token = config["conda-store-api-token"]
    conda_store_service_name, conda_store_service_port = config[
        "conda-store-service-name"
    ].split(":")
    conda_store_endpoint = f"{conda_store_service_name}.{config['conda-store-namespace']}.svc:{conda_store_service_port}"
    environment_endpoint = "/conda-store/api/v1/environment/"
    query_params = f"?packages={'&packages='.join(necessary_dask_packages)}"

    url = "http://" + conda_store_endpoint + environment_endpoint + query_params

    http = urllib3.PoolManager()
    response = http.request("GET", url, headers={"Authorization": f"Bearer {token}"})

    # parse response
    j = json.loads(response.data.decode("UTF-8"))
    return [
        (conda_env["namespace"]["name"], conda_env["name"])
        for conda_env in j.get("data", [])
    ]


def base_node_group(options):
    key = config["worker-node-group"]["key"]
    if config.get("provider", "") == "aws":
        key = "dedicated"
    default_node_group = {key: config["worker-node-group"]["value"]}

    # check `worker_extra_pod_config` first
    worker_node_group = (
        config["profiles"][options.profile]
        .get("worker_extra_pod_config", {})
        .get("nodeSelector")
    )
    worker_node_group = (
        default_node_group if worker_node_group is None else worker_node_group
    )

    # check `scheduler_extra_pod_config` first
    scheduler_node_group = (
        config["profiles"][options.profile]
        .get("scheduler_extra_pod_config", {})
        .get("nodeSelector")
    )
    scheduler_node_group = (
        default_node_group if scheduler_node_group is None else scheduler_node_group
    )

    return {
        "scheduler_extra_pod_config": {"nodeSelector": scheduler_node_group},
        "worker_extra_pod_config": {"nodeSelector": worker_node_group},
    }


def base_conda_store_mounts(namespace, name):
    conda_store_pvc_name = config["conda-store-pvc"]
    conda_store_mount = Path(config["conda-store-mount"])

    return {
        "scheduler_extra_pod_config": {
            "volumes": [
                {
                    "name": "conda-store",
                    "persistentVolumeClaim": {
                        "claimName": conda_store_pvc_name,
                    },
                }
            ]
        },
        "scheduler_extra_container_config": {
            "volumeMounts": [
                {
                    "mountPath": str(conda_store_mount / namespace),
                    "name": "conda-store",
                    "subPath": namespace,
                }
            ]
        },
        "worker_extra_pod_config": {
            "volumes": [
                {
                    "name": "conda-store",
                    "persistentVolumeClaim": {
                        "claimName": conda_store_pvc_name,
                    },
                }
            ]
        },
        "worker_extra_container_config": {
            "volumeMounts": [
                {
                    "mountPath": str(conda_store_mount / namespace),
                    "name": "conda-store",
                    "subPath": namespace,
                }
            ]
        },
        "worker_cmd": "/opt/conda-run-worker",
        "scheduler_cmd": "/opt/conda-run-scheduler",
        "environment": {
            "CONDA_ENVIRONMENT": str(conda_store_mount / namespace / "envs" / name),
            "BOKEH_RESOURCES": "cdn",
        },
    }


def base_username_mount(username, uid=1000, gid=100):
    return {
        "scheduler_extra_pod_config": {"volumes": [{"name": "home", "emptyDir": {}}]},
        "scheduler_extra_container_config": {
            "securityContext": {"runAsUser": uid, "runAsGroup": gid, "fsGroup": gid},
            "workingDir": f"/home/{username}",
            "volumeMounts": [
                {
                    "mountPath": f"/home/{username}",
                    "name": "home",
                }
            ],
        },
        "worker_extra_pod_config": {"volumes": [{"name": "home", "emptyDir": {}}]},
        "worker_extra_container_config": {
            "securityContext": {"runAsUser": uid, "runAsGroup": gid, "fsGroup": gid},
            "workingDir": f"/home/{username}",
            "volumeMounts": [
                {
                    "mountPath": f"/home/{username}",
                    "name": "home",
                }
            ],
        },
        "environment": {
            "HOME": f"/home/{username}",
        },
    }


def worker_profile(options, user):
    namespace, name = options.conda_environment.split("/")
    return functools.reduce(
        deep_merge,
        [
            base_node_group(options),
            base_conda_store_mounts(namespace, name),
            base_username_mount(user.name),
            config["profiles"][options.profile],
            {"environment": {**options.environment_vars}},
        ],
        {},
    )


def user_options(user):
    default_namespace = config["default-conda-store-namespace"]
    allowed_namespaces = set(
        [default_namespace, "global", user.name] + list(user.groups)
    )
    conda_environments = []
    for namespace, name in list_dask_environments():
        if namespace not in allowed_namespaces:
            continue
        conda_environments.append(f"{namespace}/{namespace}-{name}")

    args = []
    if conda_environments:
        args += [
            Select(
                "conda_environment",
                conda_environments,
                default=conda_environments[0],
                label="Environment",
            )
        ]
    if config["profiles"]:
        args += [
            Select(
                "profile",
                list(config["profiles"].keys()),
                default=list(config["profiles"].keys())[0],
                label="Cluster Profile",
            )
        ]

    args += [
        Mapping("environment_vars", {}, label="Environment Variables"),
    ]

    return Options(
        *args,
        handler=worker_profile,
    )


c.Backend.cluster_options = user_options


# ============== utils ============
def deep_merge(d1, d2):
    """Deep merge two dictionaries.
    >>> value_1 = {
        'a': [1, 2],
        'b': {'c': 1, 'z': [5, 6]},
        'e': {'f': {'g': {}}},
        'm': 1,
    }

    >>> value_2 = {
        'a': [3, 4],
        'b': {'d': 2, 'z': [7]},
        'e': {'f': {'h': 1}},
        'm': [1],
    }

    >>> print(deep_merge(value_1, value_2))
    {'m': 1, 'e': {'f': {'g': {}, 'h': 1}}, 'b': {'d': 2, 'c': 1, 'z': [5, 6, 7]}, 'a': [1, 2, 3, 4]}
    """
    if isinstance(d1, dict) and isinstance(d2, dict):
        d3 = {}
        for key in d1.keys() | d2.keys():
            if key in d1 and key in d2:
                d3[key] = deep_merge(d1[key], d2[key])
            elif key in d1:
                d3[key] = d1[key]
            elif key in d2:
                d3[key] = d2[key]
        return d3
    elif isinstance(d1, list) and isinstance(d2, list):
        return [*d1, *d2]
    else:  # if they don't match use left one
        return d1



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/dask-gateway/controller.tf
---

resource "kubernetes_config_map" "controller" {
  metadata {
    name      = "${var.name}-daskgateway-controller"
    namespace = var.namespace
  }

  data = {
    "dask_gateway_config.py" = file("${path.module}/files/controller_config.py")
  }
}

resource "kubernetes_service_account" "controller" {
  metadata {
    name      = "${var.name}-daskgateway-controller"
    namespace = var.namespace
  }
}


resource "kubernetes_cluster_role" "controller" {
  metadata {
    name = "${var.name}-daskgateway-controller"
  }

  rule {
    api_groups = ["gateway.dask.org"]
    resources  = ["daskclusters", "daskclusters/status"]
    verbs      = ["*"]
  }

  rule {
    api_groups = ["traefik.containo.us"]
    resources  = ["ingressroutes", "ingressroutetcps"]
    verbs      = ["get", "create", "delete"]
  }

  rule {
    api_groups = [""]
    resources  = ["pods"]
    verbs      = ["get", "list", "watch", "create", "delete"]
  }

  rule {
    api_groups = [""]
    resources  = ["endpoints"]
    verbs      = ["get", "list", "watch"]
  }

  rule {
    api_groups = [""]
    resources  = ["secrets", "services"]
    verbs      = ["create", "delete"]
  }
}


resource "kubernetes_cluster_role_binding" "controller" {
  metadata {
    name = "${var.name}-daskgateway-controller"
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "ClusterRole"
    name      = kubernetes_cluster_role.controller.metadata.0.name
  }
  subject {
    kind      = "ServiceAccount"
    name      = kubernetes_service_account.controller.metadata.0.name
    namespace = var.namespace
  }
}


resource "kubernetes_deployment" "controller" {
  metadata {
    name      = "${var.name}-daskgateway-controller"
    namespace = var.namespace
  }

  spec {
    replicas = 1

    strategy {
      type = "Recreate"
    }

    selector {
      match_labels = {
        "app.kubernetes.io/component" = "dask-gateway-controller"
      }
    }

    template {
      metadata {
        labels = {
          "app.kubernetes.io/component" = "dask-gateway-controller"
        }

        annotations = {
          # This lets us autorestart when the secret changes!
          "checksum/config-map" = sha256(jsonencode(kubernetes_config_map.controller.data))
          "checksum/secret"     = sha256(jsonencode(kubernetes_secret.gateway.data))
        }
      }

      spec {
        affinity {
          node_affinity {
            required_during_scheduling_ignored_during_execution {
              node_selector_term {
                match_expressions {
                  key      = var.general-node-group.key
                  operator = "In"
                  values   = [var.general-node-group.value]
                }
              }
            }
          }
        }

        service_account_name            = kubernetes_service_account.controller.metadata.0.name
        automount_service_account_token = true

        volume {
          name = "configmap"
          config_map {
            name = kubernetes_config_map.controller.metadata.0.name
          }
        }

        volume {
          name = "secret"
          secret {
            secret_name = kubernetes_secret.gateway.metadata.0.name
          }
        }

        container {
          image = "${var.controller-image.name}:${var.controller-image.tag}"
          name  = "${var.name}-daskgateway-controller"

          command = [
            "dask-gateway-server",
            "kube-controller",
            "--config",
            "/etc/dask-gateway/dask_gateway_config.py"
          ]

          volume_mount {
            name       = "configmap"
            mount_path = "/etc/dask-gateway/"
          }

          volume_mount {
            name       = "secret"
            mount_path = "/var/lib/dask-gateway/"
          }

          port {
            name           = "api"
            container_port = 8000
          }
        }
      }
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/dask-gateway/crds.tf
---

resource "kubernetes_manifest" "main" {
  manifest = {
    apiVersion = "apiextensions.k8s.io/v1"
    kind       = "CustomResourceDefinition"
    metadata = {
      name = "daskclusters.gateway.dask.org"
    }
    spec = {
      group = "gateway.dask.org"
      names = {
        kind     = "DaskCluster"
        listKind = "DaskClusterList"
        plural   = "daskclusters"
        singular = "daskcluster"
      }
      scope = "Namespaced"
      versions = [{
        name    = "v1alpha1"
        served  = true
        storage = true
        subresources = {
          status = {}
        }

        # NOTE: While we define a schema, it is a dummy schema that doesn't
        #       validate anything. We just have it to comply with the schema of
        #       a CustomResourceDefinition that requires it.
        #
        #       A decision has been made to not implement an actual schema at
        #       this point in time due to the additional maintenance work it
        #       would require.
        #
        #       Reference: https://github.com/dask/dask-gateway/issues/434
        #
        schema = {
          openAPIV3Schema = {
            type = "object"
            # FIXME: Make this an actual schema instead of this dummy schema that
            #        is a workaround to meet the requirement of having a schema.
            x-kubernetes-preserve-unknown-fields = true
          }
        }
      }]
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/dask-gateway/gateway.tf
---

resource "kubernetes_secret" "gateway" {
  metadata {
    name      = "${var.name}-daskgateway-gateway"
    namespace = var.namespace
  }

  data = {
    "config.json" = jsonencode({
      jupyterhub_api_token                 = var.jupyterhub_api_token
      jupyterhub_api_url                   = var.jupyterhub_api_url
      gateway_service_name                 = kubernetes_service.gateway.metadata.0.name
      gateway_service_namespace            = kubernetes_service.gateway.metadata.0.namespace
      gateway_cluster_middleware_name      = kubernetes_manifest.chain-middleware.manifest.metadata.name
      gateway_cluster_middleware_namespace = kubernetes_manifest.chain-middleware.manifest.metadata.namespace
      gateway                              = var.gateway
      controller                           = var.controller
      cluster                              = var.cluster
      cluster-image                        = var.cluster-image
      profiles                             = var.profiles
      default-conda-store-namespace        = var.default-conda-store-namespace
      conda-store-pvc                      = var.conda-store-pvc.name
      conda-store-mount                    = var.conda-store-mount
      worker-node-group                    = var.worker-node-group
      conda-store-api-token                = var.conda-store-api-token
      conda-store-service-name             = var.conda-store-service-name
      conda-store-namespace                = var.namespace
      provider                             = var.cloud-provider
    })
  }
}


resource "kubernetes_config_map" "gateway" {
  metadata {
    name      = "${var.name}-daskgateway-gateway"
    namespace = var.namespace
  }

  data = {
    "dask_gateway_config.py" = file("${path.module}/files/gateway_config.py")
  }
}


resource "kubernetes_service_account" "gateway" {
  metadata {
    name      = "${var.name}-daskgateway-gateway"
    namespace = var.namespace
  }
}


resource "kubernetes_cluster_role" "gateway" {
  metadata {
    name = "${var.name}-daskgateway-gateway"
  }

  rule {
    api_groups = [""]
    resources  = ["secrets"]
    verbs      = ["get"]
  }

  rule {
    api_groups = ["gateway.dask.org"]
    resources  = ["daskclusters"]
    verbs      = ["*"]
  }
}


resource "kubernetes_cluster_role_binding" "gateway" {
  metadata {
    name = "${var.name}-daskgateway-gateway"
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "ClusterRole"
    name      = kubernetes_cluster_role.gateway.metadata.0.name
  }
  subject {
    kind      = "ServiceAccount"
    name      = kubernetes_service_account.gateway.metadata.0.name
    namespace = var.namespace
  }
}


resource "kubernetes_service" "gateway" {
  metadata {
    name      = "${var.name}-dask-gateway-gateway-api"
    namespace = var.namespace
  }

  spec {
    selector = {
      "app.kubernetes.io/component" = "dask-gateway-gateway"
    }

    port {
      name        = "api"
      protocol    = "TCP"
      port        = 8000
      target_port = 8000
    }

    type = "ClusterIP"
  }
}


resource "kubernetes_deployment" "gateway" {
  metadata {
    name      = "${var.name}-daskgateway-gateway"
    namespace = var.namespace
  }

  spec {
    replicas = 1

    selector {
      match_labels = {
        "app.kubernetes.io/component" = "dask-gateway-gateway"
      }
    }

    template {
      metadata {
        labels = {
          "app.kubernetes.io/component" = "dask-gateway-gateway"
        }

        annotations = {
          # Changing either checksum alters the pod template, so the
          # deployment restarts when the config map or the secret changes.
          "checksum/config-map" = sha256(jsonencode(kubernetes_config_map.gateway.data))
          "checksum/secret"     = sha256(jsonencode(kubernetes_secret.gateway.data))
        }
      }

      spec {
        affinity {
          node_affinity {
            required_during_scheduling_ignored_during_execution {
              node_selector_term {
                match_expressions {
                  key      = var.general-node-group.key
                  operator = "In"
                  values   = [var.general-node-group.value]
                }
              }
            }
          }
        }

        volume {
          name = "configmap"
          config_map {
            name = kubernetes_config_map.gateway.metadata.0.name
          }
        }

        volume {
          name = "secret"
          secret {
            secret_name = kubernetes_secret.gateway.metadata.0.name
          }
        }

        volume {
          name = "conda-store"
          persistent_volume_claim {
            claim_name = var.conda-store-pvc.name
          }
        }

        service_account_name            = kubernetes_service_account.gateway.metadata.0.name
        automount_service_account_token = true

        container {
          image = "${var.gateway-image.name}:${var.gateway-image.tag}"
          name  = var.name

          command = [
            "dask-gateway-server",
            "--config",
            "/etc/dask-gateway/dask_gateway_config.py"
          ]

          volume_mount {
            name       = "configmap"
            mount_path = "/etc/dask-gateway/"
          }

          volume_mount {
            name       = "secret"
            mount_path = "/var/lib/dask-gateway/"
          }

          volume_mount {
            name       = "conda-store"
            mount_path = var.conda-store-mount
          }

          port {
            name           = "api"
            container_port = 8000
          }

          resources {
            limits = {
              cpu    = "0.5"
              memory = "512Mi"
            }
            requests = {
              cpu    = "250m"
              memory = "50Mi"
            }
          }

          liveness_probe {
            http_get {
              path = "/api/health"
              port = "api"
            }

            initial_delay_seconds = 5
            timeout_seconds       = 2
            period_seconds        = 10
            failure_threshold     = 6
          }

          readiness_probe {
            http_get {
              path = "/api/health"
              port = "api"
            }

            initial_delay_seconds = 5
            timeout_seconds       = 2
            period_seconds        = 10
            failure_threshold     = 3
          }
        }
      }
    }
  }

  lifecycle {
    replace_triggered_by = [null_resource.conda-store-pvc]
  }
}

resource "null_resource" "conda-store-pvc" {
  triggers = {
    pvc = var.conda-store-pvc.id
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/dask-gateway/main.tf
---

resource "kubernetes_config_map" "dask-etc" {
  metadata {
    name      = "dask-etc"
    namespace = var.namespace
  }

  data = {
    "gateway.yaml" = jsonencode({
      gateway = {
        address        = "http://${kubernetes_service.gateway.metadata.0.name}.${kubernetes_service.gateway.metadata.0.namespace}:8000"
        public_address = "https://${var.external-url}/gateway"
        proxy_address  = "tcp://${var.external-url}:8786"

        auth = {
          type = "jupyterhub"
        }
      }
    })
    "dashboard.yaml" = jsonencode({})
  }
}

resource "kubernetes_manifest" "dask-gateway" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "IngressRoute"
    metadata = {
      name      = "dask-gateway"
      namespace = var.namespace
    }
    spec = {
      entryPoints = ["websecure"]
      routes = [
        {
          kind  = "Rule"
          match = "Host(`${var.external-url}`) && PathPrefix(`/gateway/`)"

          middlewares = [
            {
              name      = "nebari-dask-gateway-gateway-api"
              namespace = var.namespace
            }
          ]

          services = [
            {
              name = kubernetes_service.gateway.metadata.0.name
              port = 8000
            }
          ]
        }
      ]
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/dask-gateway/middleware.tf
---

resource "kubernetes_manifest" "gateway-middleware" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "Middleware"
    metadata = {
      name      = "nebari-dask-gateway-gateway-api"
      namespace = var.namespace
    }
    spec = {
      stripPrefixRegex = {
        regex = [
          "/gateway"
        ]
      }
    }
  }
}

# Create one chain middleware for the IngressRoutes that will be
# dynamically created by Dask Gateway. The chain combines
# traefik-forward-auth and the stripprefix middleware defined below.

resource "kubernetes_manifest" "chain-middleware" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "Middleware"
    metadata = {
      name      = "nebari-dask-gateway-chain" # Updated name to -chain from -cluster to avoid upgrade confusion
      namespace = var.namespace
    }
    spec = {
      chain = {
        middlewares = [
          {
            name      = var.forwardauth_middleware_name
            namespace = var.namespace
          },
          {
            name      = kubernetes_manifest.cluster-middleware-stripprefix.manifest.metadata.name
            namespace = var.namespace
          }
        ]
      }
    }
  }
}

resource "kubernetes_manifest" "cluster-middleware-stripprefix" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "Middleware"
    metadata = {
      name      = "nebari-dask-gateway-cluster-stripprefix"
      namespace = var.namespace
    }
    spec = {
      stripPrefixRegex = {
        regex = [
          "/gateway/clusters/[a-zA-Z0-9.-]+"
        ]
      }
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/dask-gateway/outputs.tf
---

output "config" {
  description = "dask gateway /etc/dask/dask-gateway.yaml configuration"
  value = {
    gateway = {
      address        = "http://${kubernetes_service.gateway.metadata.0.name}.${kubernetes_service.gateway.metadata.0.namespace}:8000"
      public_address = "https://${var.external-url}/gateway"
      proxy_address  = "tcp://${var.external-url}:8786"

      auth = {
        type = "jupyterhub"
      }
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/dask-gateway/variables.tf
---

variable "name" {
  description = "name prefix to assign to dask-gateway"
  type        = string
  default     = "nebari"
}

variable "namespace" {
  description = "namespace to deploy dask-gateway"
  type        = string
}

variable "jupyterhub_api_token" {
  description = "jupyterhub api token for dask-gateway"
  type        = string
}

variable "jupyterhub_api_url" {
  description = "jupyterhub api url for dask-gateway"
  type        = string
}

variable "external-url" {
  description = "External public URL at which the dask-gateway cluster is accessible"
  type        = string
}

variable "gateway-image" {
  description = "dask gateway image to use for gateway"
  type = object({
    name = string
    tag  = string
  })
  default = {
    name = "ghcr.io/dask/dask-gateway-server"
    tag  = "2022.4.0"
  }
}

variable "controller-image" {
  description = "dask gateway image to use for controller"
  type = object({
    name = string
    tag  = string
  })
  default = {
    name = "ghcr.io/dask/dask-gateway-server"
    tag  = "2022.4.0"
  }
}

variable "cluster-image" {
  description = "default dask gateway image to use for cluster"
  type = object({
    name = string
    tag  = string
  })
  default = {
    name = "ghcr.io/dask/dask-gateway"
    tag  = "2022.4.0"
  }
}

variable "general-node-group" {
  description = "Node group key value pair for bound general resources"
  type = object({
    key   = string
    value = string
  })
}

variable "worker-node-group" {
  description = "Node group key value pair for bound worker resources"
  type = object({
    key   = string
    value = string
  })
}

variable "dask-etc-configmap-name" {
  description = "Name for dask-etc configuration resource"
  type        = string
}

variable "gateway" {
  description = "gateway configuration"
  type = object({
    loglevel = string
    # Path prefix to serve dask-gateway api requests under. This prefix
    # will be added to all routes the gateway manages in the traefik
    # proxy.
    prefix = string
  })
  default = {
    loglevel = "INFO"
    prefix   = "/gateway"
  }
}

variable "controller" {
  description = "controller configuration"
  type = object({
    loglevel = string
    # Max time (in seconds) to keep around records of completed clusters.
    # Default is 24 hours.
    completedClusterMaxAge = number
    # Time (in seconds) between cleanup tasks removing records of completed
    # clusters. Default is 10 minutes.
    completedClusterCleanupPeriod = number
    # Base delay (in seconds) for backoff when retrying after failures.
    backoffBaseDelay = number
    # Max delay (in seconds) for backoff when retrying after failures.
    backoffMaxDelay = number
    # Limit on the average number of k8s api calls per second.
    k8sApiRateLimit = number
    # Limit on the maximum number of k8s api calls per second.
    k8sApiRateLimitBurst = number
  })
  default = {
    loglevel                      = "INFO"
    completedClusterMaxAge        = 86400
    completedClusterCleanupPeriod = 600
    backoffBaseDelay              = 0.1
    backoffMaxDelay               = 300
    k8sApiRateLimit               = 50
    k8sApiRateLimitBurst          = 100
  }
}

variable "cluster" {
  description = "dask gateway cluster defaults"
  type = object({
    # scheduler configuration
    scheduler_cores                  = number
    scheduler_cores_limit            = number
    scheduler_memory                 = string
    scheduler_memory_limit           = string
    scheduler_extra_container_config = any
    scheduler_extra_pod_config       = any
    # worker configuration
    worker_cores                  = number
    worker_cores_limit            = number
    worker_memory                 = string
    worker_memory_limit           = string
    worker_extra_container_config = any
    worker_extra_pod_config       = any
    # additional fields
    idle_timeout      = number
    image_pull_policy = string
    environment       = map(string)
  })
  default = {
    # scheduler configuration
    scheduler_cores                  = 1
    scheduler_cores_limit            = 1
    scheduler_memory                 = "2 G"
    scheduler_memory_limit           = "2 G"
    scheduler_extra_container_config = {}
    scheduler_extra_pod_config       = {}
    # worker configuration
    worker_cores                  = 1
    worker_cores_limit            = 1
    worker_memory                 = "2 G"
    worker_memory_limit           = "2 G"
    worker_extra_container_config = {}
    worker_extra_pod_config       = {}
    # additional fields
    idle_timeout      = 1800 # 30 minutes
    image_pull_policy = "IfNotPresent"
    environment       = {}
  }
}

variable "profiles" {
  description = "Dask Gateway Profiles"
  default     = []
}

variable "conda-store-pvc" {
  description = "Name and id of the persistent volume claim to use for conda-store directory"
  type = object({
    name = string
    id   = string
  })
}

variable "conda-store-mount" {
  description = "Mount directory for conda-store environments"
  type        = string
}

variable "default-conda-store-namespace" {
  description = "Default conda-store namespace"
  type        = string
}

variable "conda-store-api-token" {
  description = "Service token for conda-store api"
  type        = string
}

variable "conda-store-service-name" {
  description = "internal service-name:port where conda-store can be reached"
  type        = string
}

variable "cloud-provider" {
  description = "Name of the cloud provider to deploy to."
  type        = string
}

variable "forwardauth_middleware_name" {
  type = string
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/files/ipython/ipython_config.py
---

import os

# Disable history manager, we don't really use it and by default it
# puts an sqlite file on NFS, which is not something we want to do
c.HistoryManager.enabled = False


# Change default umask for all subprocesses of the notebook server if
# set in the environment
if "NB_UMASK" in os.environ:
    os.umask(int(os.environ["NB_UMASK"], 8))



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/files/jupyter/jupyter_jupyterlab_pioneer_config.py.tpl
---

import logging
import json


default_log_format = "%(asctime)s %(levelname)9s %(lineno)4s %(module)s: %(message)s"
log_format = "${log_format}"

logging.basicConfig(
    level=logging.INFO,
    format=log_format if log_format else default_log_format
)

logger = logging.getLogger(__name__)

CUSTOM_EXPORTER_NAME = "MyCustomExporter"


def my_custom_exporter(args):
    """Custom exporter to log JupyterLab events to command line."""
    logger.info(json.dumps(args.get("data")))
    return {
        "exporter": CUSTOM_EXPORTER_NAME,
        "message": ""
    }


c.JupyterLabPioneerApp.exporters = [
    {
        # sends telemetry data to the browser console
        "type": "console_exporter",
    },
    {
        # sends telemetry data (json) to the python console jupyter is running on
        "type": "custom_exporter",
        "args": {
            "id": CUSTOM_EXPORTER_NAME
            # add additional args for your exporter function here
        },
    }
]

c.JupyterLabPioneerApp.custom_exporter = {
    CUSTOM_EXPORTER_NAME: my_custom_exporter,
}

c.JupyterLabPioneerApp.activeEvents = [
    {"name": "ActiveCellChangeEvent", "logWholeNotebook": False},
    {"name": "CellAddEvent", "logWholeNotebook": False},
    {"name": "CellEditEvent", "logWholeNotebook": False},
    {"name": "CellExecuteEvent", "logWholeNotebook": False},
    {"name": "CellRemoveEvent", "logWholeNotebook": False},
    {"name": "ClipboardCopyEvent", "logWholeNotebook": False},
    {"name": "ClipboardCutEvent", "logWholeNotebook": False},
    {"name": "ClipboardPasteEvent", "logWholeNotebook": False},
    {"name": "NotebookHiddenEvent", "logWholeNotebook": False},
    {"name": "NotebookOpenEvent", "logWholeNotebook": False},
    {"name": "NotebookSaveEvent", "logWholeNotebook": False},
    {"name": "NotebookScrollEvent", "logWholeNotebook": False},
    {"name": "NotebookVisibleEvent", "logWholeNotebook": False},
]



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/files/jupyter/jupyter_server_config.py.tpl
---

# To help jupyterhub-idle-culler cull user servers, we configure the kernel manager to cull
# idle kernels that would otherwise make the user servers report themselves as active which
# is part of what jupyterhub-idle-culler considers.

# Extra config available at:
# https://zero-to-jupyterhub.readthedocs.io/en/1.x/jupyterhub/customizing/user-management.html#culling-user-pods

# Refuse to serve content from handlers missing authentication guards, unless
# the handler is explicitly allow-listed with `@allow_unauthenticated`; this
# prevents accidental exposure of information by extensions installed in the
# single-user server when their handlers are missing authentication guards.
c.ServerApp.allow_unauthenticated_access = False

# Enable Show Hidden Files menu option in View menu
c.ContentsManager.allow_hidden = True
c.FileContentsManager.allow_hidden = True

# Set the preferred path for the frontend to start in
c.FileContentsManager.preferred_dir = "${jupyterlab_preferred_dir}"

# Timeout (in seconds) in which a terminal has been inactive and ready to
# be culled.
c.TerminalManager.cull_inactive_timeout = ${terminal_cull_inactive_timeout} * 60

# The interval (in seconds) on which to check for terminals exceeding the
# inactive timeout value.
c.TerminalManager.cull_interval = ${terminal_cull_interval} * 60

# cull_idle_timeout: timeout (in seconds) after which an idle kernel is
# considered ready to be culled
c.MappingKernelManager.cull_idle_timeout = ${kernel_cull_idle_timeout} * 60

# cull_interval: the interval (in seconds) on which to check for idle
# kernels exceeding the cull timeout value
c.MappingKernelManager.cull_interval = ${kernel_cull_interval} * 60

# cull_connected: whether to consider culling kernels which have one
# or more connections
c.MappingKernelManager.cull_connected = ${kernel_cull_connected}

# cull_busy: whether to consider culling kernels which are currently
# busy running some code
c.MappingKernelManager.cull_busy = ${kernel_cull_busy}

# Shut down the server after N seconds with no kernels or terminals
# running and no activity.
c.ServerApp.shutdown_no_activity_timeout = ${server_shutdown_no_activity_timeout} * 60

###############################################################################
# JupyterHub idle culler total timeout corresponds (approximately) to:
# max(cull_idle_timeout, cull_inactive_timeout) + shutdown_no_activity_timeout

from argo_jupyter_scheduler.executor import ArgoExecutor
from argo_jupyter_scheduler.scheduler import ArgoScheduler

c.Scheduler.execution_manager_class=ArgoExecutor
c.SchedulerApp.scheduler_class=ArgoScheduler
c.SchedulerApp.scheduler_class.use_conda_store_env=True



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/files/jupyterhub/01-theme.py
---

from nebari_jupyterhub_theme import theme_extra_handlers, theme_template_paths

c.JupyterHub.extra_handlers.extend(theme_extra_handlers)
c.JupyterHub.template_paths.extend(theme_template_paths)

import z2jh  # noqa: E402

jupyterhub_theme = z2jh.get_config("custom.theme")

c.JupyterHub.template_vars = {
    **jupyterhub_theme,
}

if z2jh.get_config("custom.jhub-apps-enabled"):
    from jhub_apps import themes

    c.JupyterHub.template_vars = {**themes.DEFAULT_THEME, **jupyterhub_theme}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/files/jupyterhub/02-spawner.py
---

import inspect
import json

import kubernetes.client.models
from tornado import gen

kubernetes.client.models.V1EndpointPort = (
    kubernetes.client.models.CoreV1EndpointPort
)  # noqa: E402

import z2jh  # noqa: E402
from kubespawner import KubeSpawner  # noqa: E402

# conda-store default page size
DEFAULT_PAGE_SIZE_LIMIT = 100


@gen.coroutine
def get_username_hook(spawner):
    auth_state = yield spawner.user.get_auth_state()
    username = auth_state["oauth_user"]["preferred_username"]

    spawner.environment.update(
        {
            "PREFERRED_USERNAME": username,
        }
    )


def get_total_records(url: str, token: str) -> int:
    import urllib3

    http = urllib3.PoolManager()
    response = http.request("GET", url, headers={"Authorization": f"Bearer {token}"})
    decoded_response = json.loads(response.data.decode("UTF-8"))
    return decoded_response.get("count", 0)


def generate_paged_urls(base_url: str, total_records: int, page_size: int) -> list[str]:
    import math

    urls = []
    # pages starts at 1
    for page in range(1, math.ceil(total_records / page_size) + 1):
        urls.append(f"{base_url}?size={page_size}&page={page}")

    return urls


# TODO: this should get unit tests. Currently, since this is not a python module,
# adding tests in a traditional sense is not possible. See https://github.com/soapy1/nebari/tree/try-unit-test-spawner
# for a demo of one approach to adding tests.
def get_conda_store_environments(user_info: dict):
    import os

    import urllib3

    # Check for the environment variable `CONDA_STORE_API_PAGE_SIZE_LIMIT`. Fall
    # back to using the default page size limit if not set. Environment values
    # are strings, so convert to int before the value is used in arithmetic.
    page_size = int(
        os.environ.get("CONDA_STORE_API_PAGE_SIZE_LIMIT", DEFAULT_PAGE_SIZE_LIMIT)
    )

    external_url = z2jh.get_config("custom.conda-store-service-name")
    token = z2jh.get_config("custom.conda-store-jhub-apps-token")
    endpoint = "conda-store/api/v1/environment"

    base_url = f"http://{external_url}/{endpoint}/"
    http = urllib3.PoolManager()

    # get total number of records from the endpoint
    total_records = get_total_records(base_url, token)

    # will contain all the environment info returned from the api
    env_data = []

    # generate a list of urls to hit to build the response
    urls = generate_paged_urls(base_url, total_records, page_size)

    # get content from urls
    for url in urls:
        response = http.request(
            "GET", url, headers={"Authorization": f"Bearer {token}"}
        )
        decoded_response = json.loads(response.data.decode("UTF-8"))
        env_data += decoded_response.get("data", [])

    # Return "<namespace>-<name>" for every conda environment found
    # (user_info is currently unused, so no per-user filtering happens here)
    return [f"{env['namespace']['name']}-{env['name']}" for env in env_data]


c.Spawner.pre_spawn_hook = get_username_hook

c.JupyterHub.allow_named_servers = False
c.JupyterHub.spawner_class = KubeSpawner

if z2jh.get_config("custom.jhub-apps-enabled"):
    from jhub_apps import theme_template_paths
    from jhub_apps.configuration import install_jhub_apps

    domain = z2jh.get_config("custom.external-url")
    hub_url = f"https://{domain}"
    c.JupyterHub.bind_url = hub_url
    c.JupyterHub.default_url = "/hub/home"
    c.Spawner.debug = True

    c.JAppsConfig.conda_envs = get_conda_store_environments
    c.JAppsConfig.jupyterhub_config_path = (
        "/usr/local/etc/jupyterhub/jupyterhub_config.py"
    )
    c.JAppsConfig.hub_host = "hub"
    c.JAppsConfig.service_workers = 4

    jhub_apps_overrides = json.loads(z2jh.get_config("custom.jhub-apps-overrides"))
    for config_key, config_value in jhub_apps_overrides.items():
        setattr(c.JAppsConfig, config_key, config_value)

    def service_for_jhub_apps(name, url):
        return {
            "name": name,
            "display": True,
            "info": {
                "name": name,
                "url": url,
                "external": True,
            },
        }

    c.JupyterHub.services.extend(
        [
            service_for_jhub_apps(name="Argo", url="/argo"),
            service_for_jhub_apps(name="Users", url="/auth/admin/nebari/console/"),
            service_for_jhub_apps(name="Environments", url="/conda-store"),
            service_for_jhub_apps(name="Monitoring", url="/monitoring"),
        ]
    )

    c.JupyterHub.template_paths = theme_template_paths

    kwargs = {}
    jhub_apps_signature = inspect.signature(install_jhub_apps)
    if "oauth_no_confirm" in jhub_apps_signature.parameters:
        kwargs["oauth_no_confirm"] = True

    c = install_jhub_apps(c, spawner_to_subclass=KubeSpawner, **kwargs)



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/files/jupyterhub/03-profiles.py
---

import ast
import copy
import functools
import json
from pathlib import Path

import z2jh
from tornado import gen


def base_profile_home_mounts(username):
    """Configure the home directory mount for user.

    Ensure that user directory exists and user has permissions to
    read/write/execute.

    """
    home_pvc_name = z2jh.get_config("custom.home-pvc")
    skel_mount = z2jh.get_config("custom.skel-mount")
    pvc_home_mount_path = "home/{username}"
    pod_home_mount_path = "/home/{username}"

    extra_pod_config = {
        "volumes": [
            {
                "name": "home",
                "persistentVolumeClaim": {
                    "claimName": home_pvc_name,
                },
            },
            {
                "name": "skel",
                "configMap": {
                    "name": skel_mount["name"],
                },
            },
        ]
    }

    extra_container_config = {
        "volumeMounts": [
            {
                "mountPath": pod_home_mount_path.format(username=username),
                "name": "home",
                "subPath": pvc_home_mount_path.format(username=username),
            },
        ]
    }

    MKDIR_OWN_DIRECTORY = (
        "mkdir -p /mnt/{path} && chmod 777 /mnt/{path} && "
        # Copy skel files/folders not starting with '..' to user home directory.
        # Filtering out ..* removes some unneeded folders (k8s configmap mount implementation details).
        "find /etc/skel/. -maxdepth 1 -not -name '.' -not -name '..*' -exec "
        "cp -rL {escaped_brackets} /mnt/{path} \\;"
    )
    command = MKDIR_OWN_DIRECTORY.format(
        # have to escape the brackets since this string will be formatted later by KubeSpawner
        escaped_brackets="{{}}",
        path=pvc_home_mount_path.format(username=username),
    )
    init_containers = [
        {
            "name": "initialize-home-mount",
            "image": "busybox:1.31",
            "command": ["sh", "-c", command],
            "securityContext": {"runAsUser": 0},
            "volumeMounts": [
                {
                    "mountPath": f"/mnt/{pvc_home_mount_path.format(username=username)}",
                    "name": "home",
                    "subPath": pvc_home_mount_path.format(username=username),
                },
                {"mountPath": "/etc/skel", "name": "skel"},
            ],
        }
    ]
    return {
        "extra_pod_config": extra_pod_config,
        "extra_container_config": extra_container_config,
        "init_containers": init_containers,
    }


def base_profile_shared_mounts(groups_to_volume_mount):
    """Configure the group directory mounts for user.

    Ensure that the {shared}/{group} directory exists for each group in scope
    and that the user has permissions to read/write/execute. Kubernetes does not
    allow the same pvc to be declared as a volume twice, so the "shared" volume
    is only added when the home and shared pvc differ.

    """
    home_pvc_name = z2jh.get_config("custom.home-pvc")
    shared_pvc_name = z2jh.get_config("custom.shared-pvc")

    pvc_shared_mount_path = "shared/{group}"
    pod_shared_mount_path = "/shared/{group}"

    extra_pod_config = {"volumes": []}
    if home_pvc_name != shared_pvc_name:
        extra_pod_config["volumes"].append(
            {"name": "shared", "persistentVolumeClaim": {"claimName": shared_pvc_name}}
        )

    extra_container_config = {"volumeMounts": []}

    MKDIR_OWN_DIRECTORY = "mkdir -p /mnt/{path} && chmod 777 /mnt/{path}"
    command = " && ".join(
        [
            MKDIR_OWN_DIRECTORY.format(path=pvc_shared_mount_path.format(group=group))
            for group in groups_to_volume_mount
        ]
    )

    init_containers = [
        {
            "name": "initialize-shared-mounts",
            "image": "busybox:1.31",
            "command": ["sh", "-c", command],
            "securityContext": {"runAsUser": 0},
            "volumeMounts": [],
        }
    ]

    for group in groups_to_volume_mount:
        extra_container_config["volumeMounts"].append(
            {
                "mountPath": pod_shared_mount_path.format(group=group),
                "name": "shared" if home_pvc_name != shared_pvc_name else "home",
                "subPath": pvc_shared_mount_path.format(group=group),
            }
        )
        init_containers[0]["volumeMounts"].append(
            {
                "mountPath": f"/mnt/{pvc_shared_mount_path.format(group=group)}",
                "name": "shared" if home_pvc_name != shared_pvc_name else "home",
                "subPath": pvc_shared_mount_path.format(group=group),
            }
        )

    return {
        "extra_pod_config": extra_pod_config,
        "extra_container_config": extra_container_config,
        "init_containers": init_containers,
    }


def profile_conda_store_mounts(username, groups):
    """Configure the conda_store environment directories mounts for
    user.

    Ensure that a directory exists for each conda-store namespace
    (user, default, global, and groups) with mode 755.

    """
    conda_store_pvc_name = z2jh.get_config("custom.conda-store-pvc")
    conda_store_mount = Path(z2jh.get_config("custom.conda-store-mount"))
    default_namespace = z2jh.get_config("custom.default-conda-store-namespace")

    extra_pod_config = {
        "volumes": [
            {
                "name": "conda-store",
                "persistentVolumeClaim": {
                    "claimName": conda_store_pvc_name,
                },
            }
        ]
    }

    conda_store_namespaces = [username, default_namespace, "global"] + groups
    extra_container_config = {
        "volumeMounts": [
            {
                "mountPath": str(conda_store_mount / namespace),
                "name": "conda-store",
                "subPath": namespace,
            }
            for namespace in conda_store_namespaces
        ]
    }

    MKDIR_OWN_DIRECTORY = "mkdir -p /mnt/{path} && chmod 755 /mnt/{path}"
    command = " && ".join(
        [
            MKDIR_OWN_DIRECTORY.format(path=namespace)
            for namespace in conda_store_namespaces
        ]
    )
    init_containers = [
        {
            "name": "initialize-conda-store-mounts",
            "image": "busybox:1.31",
            "command": ["sh", "-c", command],
            "securityContext": {"runAsUser": 0},
            "volumeMounts": [
                {
                    "mountPath": f"/mnt/{namespace}",
                    "name": "conda-store",
                    "subPath": namespace,
                }
                for namespace in conda_store_namespaces
            ],
        }
    ]
    return {
        "extra_pod_config": extra_pod_config,
        "extra_container_config": extra_container_config,
        "init_containers": init_containers,
    }


def base_profile_extra_mounts():
    extra_mounts = z2jh.get_config("custom.extra-mounts")

    extra_pod_config = {
        "volumes": [
            (
                {
                    "name": volume["name"],
                    "persistentVolumeClaim": {"claimName": volume["name"]},
                }
                if volume["kind"] == "persistentvolumeclaim"
                else {"name": volume["name"], "configMap": {"name": volume["name"]}}
            )
            for mount_path, volume in extra_mounts.items()
        ]
    }

    extra_container_config = {
        "volumeMounts": [
            {
                "name": volume["name"],
                "mountPath": mount_path,
            }
            for mount_path, volume in extra_mounts.items()
        ]
    }
    return {
        "extra_pod_config": extra_pod_config,
        "extra_container_config": extra_container_config,
    }


def configure_user_provisioned_repositories(username):
    # Define paths and configurations
    pvc_home_mount_path = f"home/{username}"

    git_repos_provision_pvc = z2jh.get_config("custom.initial-repositories")
    git_clone_update_config = {
        "name": "git-clone-update",
        "configMap": {"name": "git-clone-update", "defaultMode": 511},
    }

    # Convert the string configuration to a list of dictionaries
    def string_to_objects(input_string):
        try:
            result = ast.literal_eval(input_string)
            if isinstance(result, list) and all(
                isinstance(item, dict) for item in result
            ):
                return result
            else:
                raise ValueError(
                    "Input string does not contain a list of dictionaries."
                )
        except (ValueError, SyntaxError):
            # Re-raise with the offending input so a misconfigured value is easy to spot
            raise ValueError(f"Invalid input string format: {input_string}")

    git_repos_provision_pvc = string_to_objects(git_repos_provision_pvc)

    if not git_repos_provision_pvc:
        return {}

    # Define the extra pod configuration for the volumes
    extra_pod_config = {"volumes": [git_clone_update_config]}

    extras_git_clone_cp_path = f"/mnt/{pvc_home_mount_path}/.git-clone-update.sh"

    BASH_EXECUTION = "./.git-clone-update.sh"

    for local_repo_pair in git_repos_provision_pvc:
        for path, remote_url in local_repo_pair.items():
            BASH_EXECUTION += f" '{path} {remote_url}'"

    EXEC_OWNERSHIP_CHANGE = " && ".join(
        [
            f"cp /mnt/extras/git-clone-update.sh {extras_git_clone_cp_path}",
            f"chmod 777 {extras_git_clone_cp_path}",
            f"chown -R 1000:100 {extras_git_clone_cp_path}",
            f"cd /mnt/{pvc_home_mount_path}",
            BASH_EXECUTION,
            f"rm -f {extras_git_clone_cp_path}",
        ]
    )

    # Define init containers configuration
    init_containers = [
        {
            "name": "pre-populate-git-repos",
            "image": "bitnami/git",
            "command": ["sh", "-c", EXEC_OWNERSHIP_CHANGE],
            "securityContext": {"runAsUser": 0},
            "volumeMounts": [
                {
                    "mountPath": f"/mnt/{pvc_home_mount_path}",
                    "name": "home",
                    "subPath": pvc_home_mount_path,
                },
                {"mountPath": "/mnt/extras", "name": "git-clone-update"},
            ],
        }
    ]

    return {
        "extra_pod_config": extra_pod_config,
        "init_containers": init_containers,
    }


def configure_user(username, groups, uid=1000, gid=100):
    environment = {
        # nss_wrapper
        # https://cwrap.org/nss_wrapper.html
        "LD_PRELOAD": "libnss_wrapper.so",
        "NSS_WRAPPER_PASSWD": "/tmp/passwd",
        "NSS_WRAPPER_GROUP": "/tmp/group",
        # umask 0002: new directories default to 775 and new files to 664
        "NB_UMASK": "0002",
        # set default shell to bash
        "SHELL": "/bin/bash",
        # set home directory to username
        "HOME": f"/home/{username}",
        # Disable global usage of pip
        "PIP_REQUIRE_VIRTUALENV": "true",
    }

    etc_passwd, etc_group = generate_nss_files(
        users=[{"username": username, "uid": uid, "gid": gid}],
        groups=[{"groupname": "users", "gid": gid}],
    )

    jupyter_config = json.dumps(
        {
            # nb_conda_kernels configuration
            # https://github.com/Anaconda-Platform/nb_conda_kernels
            "CondaKernelSpecManager": {"name_format": "{environment}"}
        }
    )

    # condarc to add all the namespaces user has access to
    default_namespace = z2jh.get_config("custom.default-conda-store-namespace")
    condarc = json.dumps(
        {
            "envs_dirs": [
                f"/home/conda/{_}/envs"
                for _ in [
                    username,
                    default_namespace,
                    "global",
                ]
                + groups
            ]
        }
    )

    command = " && ".join(
        [
            # nss_wrapper
            # https://cwrap.org/nss_wrapper.html
            f"echo '{etc_passwd}' > /tmp/passwd",
            f"echo '{etc_group}' > /tmp/group",
            # mount the shared directories for user only if there are
            # shared folders (groups) that the user is a member of
            # else ensure that the `shared` folder symlink does not exist
            (
                f"ln -sfn /shared /home/{username}/shared"
                if groups
                else f"rm -f /home/{username}/shared"
            ),
            # conda-store environment configuration
            f"printf '{condarc}' > /home/{username}/.condarc",
            # jupyter configuration
            f"mkdir -p /home/{username}/.jupyter && printf '{jupyter_config}' > /home/{username}/.jupyter/jupyter_config.json",
        ]
    )
    lifecycle_hooks = {"postStart": {"exec": {"command": ["/bin/sh", "-c", command]}}}

    extra_container_config = {
        "workingDir": f"/home/{username}",
    }

    return {
        "environment": environment,
        "lifecycle_hooks": lifecycle_hooks,
        "uid": uid,
        "gid": gid,
        "fs_gid": gid,
        "notebook_dir": f"/home/{username}",
        "extra_container_config": extra_container_config,
    }


def profile_argo_token(groups):
    # TODO: create a more robust check of the user's Argo Workflows role

    if not z2jh.get_config("custom.argo-workflows-enabled"):
        return {}

    domain = z2jh.get_config("custom.external-url")
    namespace = z2jh.get_config("custom.namespace")

    ADMIN = "admin"
    DEVELOPER = "developer"
    ANALYST = "analyst"

    base = "argo-"
    argo_sa = None

    if ANALYST in groups:
        argo_sa = base + "viewer"
    if DEVELOPER in groups:
        argo_sa = base + "developer"
    if ADMIN in groups:
        argo_sa = base + "admin"
    if not argo_sa:
        return {}

    return {
        "ARGO_BASE_HREF": "/argo",
        "ARGO_SERVER": f"{domain}:443",
        "ARGO_NAMESPACE": namespace,
        "ARGO_TOKEN": "Bearer $(HERA_TOKEN)",
        "ARGO_HTTP1": "true",  # Maybe due to traefik config, but `argo list` returns 404 without this set.  Try removing after upgrading argo past v3.4.4.
        # Hera token is needed for versions of hera released before https://github.com/argoproj-labs/hera/pull/1053 was merged
        "HERA_TOKEN": {
            "valueFrom": {
                "secretKeyRef": {
                    "name": f"{argo_sa}.service-account-token",
                    "key": "token",
                }
            }
        },
    }


def profile_conda_store_viewer_token():
    return {
        "CONDA_STORE_TOKEN": {
            "valueFrom": {
                "secretKeyRef": {
                    "name": "argo-workflows-conda-store-token",
                    "key": "conda-store-api-token",
                }
            }
        },
        "CONDA_STORE_SERVICE": {
            "valueFrom": {
                "secretKeyRef": {
                    "name": "argo-workflows-conda-store-token",
                    "key": "conda-store-service-name",
                }
            }
        },
        "CONDA_STORE_SERVICE_NAMESPACE": {
            "valueFrom": {
                "secretKeyRef": {
                    "name": "argo-workflows-conda-store-token",
                    "key": "conda-store-service-namespace",
                }
            }
        },
    }


def render_profile(
    profile, username, groups, keycloak_profilenames, groups_to_volume_mount
):
    """Render each profile for user.

    If the profile is not available to the given username and groups, return
    None. Otherwise the profile is transformed into a kubespawner profile.

    {
        display_name: "<heading for profile>",
        description: "<longer description of profile>",
        default: "<only one profile can be default>",
        kubespawner_override: {
            # https://jupyterhub-kubespawner.readthedocs.io/en/latest/spawner.html
            ...
        }
    }
    """
    access = profile.get("access", "all")

    if access == "yaml":
        # check that the username or one of the user's groups is allowed for the profile
        # profile.groups and profile.users can be None or empty lists, or may not be members of profile at all
        user_not_in_users = username not in set(profile.get("users", []) or [])
        user_not_in_groups = (
            set(groups) & set(profile.get("groups", []) or [])
        ) == set()
        if user_not_in_users and user_not_in_groups:
            return None
    elif access == "keycloak":
        # Keycloak mapper should provide the 'jupyterlab_profiles' attribute from groups/user
        if profile.get("display_name", None) not in keycloak_profilenames:
            return None

    profile = copy.copy(profile)
    profile_kubespawner_override = profile.get("kubespawner_override")
    profile["kubespawner_override"] = functools.reduce(
        deep_merge,
        [
            base_profile_home_mounts(username),
            base_profile_shared_mounts(groups_to_volume_mount),
            profile_conda_store_mounts(username, groups),
            base_profile_extra_mounts(),
            configure_user(username, groups),
            configure_user_provisioned_repositories(username),
            profile_kubespawner_override,
        ],
        {},
    )

    # We need to merge any env vars from the spawner with any overrides from the profile
    # This is mainly to ensure JUPYTERHUB_ANYONE/GROUP is passed through from the spawner
    # to control dashboard access.
    envvars_fixed = {**(profile["kubespawner_override"].get("environment", {}))}

    def preserve_envvars(spawner):
        # This adds in JUPYTERHUB_ANYONE/GROUP rather than overwrite all env vars,
        # if set in the spawner for a dashboard to control access.
        return {
            **envvars_fixed,
            **spawner.environment,
            **profile_argo_token(groups),
            **profile_conda_store_viewer_token(),
        }

    profile["kubespawner_override"]["environment"] = preserve_envvars

    return profile


@gen.coroutine
def render_profiles(spawner):
    # jupyterhub does not yet manage groups but it will soon
    # so for now we rely on auth_state from the keycloak
    # userinfo request to have the groups in the key
    # "auth_state.oauth_user.groups"
    auth_state = yield spawner.user.get_auth_state()

    username = auth_state["oauth_user"]["preferred_username"]

    # only return the lowest level group name
    # e.g. /projects/myproj -> myproj
    # and /developers -> developers
    groups = [Path(group).name for group in auth_state["oauth_user"]["groups"]]
    groups_with_permission_to_mount = [
        Path(group).name
        for group in auth_state.get("groups_with_permission_to_mount", [])
    ]

    keycloak_profilenames = auth_state["oauth_user"].get("jupyterlab_profiles", [])

    # fetch available profiles and render additional attributes
    profile_list = z2jh.get_config("custom.profiles")
    return list(
        filter(
            None,
            [
                render_profile(
                    p,
                    username,
                    groups,
                    keycloak_profilenames,
                    groups_with_permission_to_mount,
                )
                for p in profile_list
            ],
        )
    )


c.KubeSpawner.args = ["--debug"]
c.KubeSpawner.environment = {
    **c.KubeSpawner.environment,
    "JUPYTERHUB_SINGLEUSER_APP": "jupyter_server.serverapp.ServerApp",
}
c.KubeSpawner.profile_list = render_profiles


# Utils
def deep_merge(d1, d2):
    """Deep merge two dictionaries.
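
    Dictionaries are merged recursively and lists are concatenated. For any
    other combination of types the left value wins.

    >>> value_1 = {
    ...     'a': [1, 2],
    ...     'b': {'c': 1, 'z': [5, 6]},
    ...     'e': {'f': {'g': {}}},
    ...     'm': 1,
    ... }
    >>> value_2 = {
    ...     'a': [3, 4],
    ...     'b': {'d': 2, 'z': [7]},
    ...     'e': {'f': {'h': 1}},
    ...     'm': [1],
    ... }
    >>> deep_merge(value_1, value_2) == {
    ...     'a': [1, 2, 3, 4],
    ...     'b': {'c': 1, 'd': 2, 'z': [5, 6, 7]},
    ...     'e': {'f': {'g': {}, 'h': 1}},
    ...     'm': 1,
    ... }
    True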
    """
    if isinstance(d1, dict) and isinstance(d2, dict):
        d3 = {}
        for key in d1.keys() | d2.keys():
            if key in d1 and key in d2:
                d3[key] = deep_merge(d1[key], d2[key])
            elif key in d1:
                d3[key] = d1[key]
            elif key in d2:
                d3[key] = d2[key]
        return d3
    elif isinstance(d1, list) and isinstance(d2, list):
        return [*d1, *d2]
    else:  # if they don't match use left one
        return d1


def generate_nss_files(users, groups):
    etc_passwd = []
    passwd_format = "{username}:x:{uid}:{gid}:{username}:/home/{username}:/bin/bash"
    for user in users:
        etc_passwd.append(passwd_format.format(**user))

    etc_group = []
    group_format = "{groupname}:x:{gid}:"
    for group in groups:
        etc_group.append(group_format.format(**group))

    return "\n".join(etc_passwd), "\n".join(etc_group)
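

The merge order used in `render_profile` has a consequence that is easy to miss: because `deep_merge` keeps the left value when two non-dict, non-list values collide, a scalar set by an earlier layer is not replaced by a later one. The following standalone sketch restates the merge rules and applies them to two made-up layers. `base_layer` and `profile_layer` are hypothetical stand-ins, not values taken from this file.

```python
import functools


def deep_merge(d1, d2):
    """Same rules as the helper above: merge dicts, concatenate lists, else keep the left value."""
    if isinstance(d1, dict) and isinstance(d2, dict):
        merged = {}
        for key in d1.keys() | d2.keys():
            if key in d1 and key in d2:
                merged[key] = deep_merge(d1[key], d2[key])
            elif key in d1:
                merged[key] = d1[key]
            else:
                merged[key] = d2[key]
        return merged
    if isinstance(d1, list) and isinstance(d2, list):
        return [*d1, *d2]
    return d1


# Hypothetical layers standing in for the dictionaries merged in render_profile.
base_layer = {
    "uid": 1000,
    "extra_pod_config": {"volumes": [{"name": "home"}]},
}
profile_layer = {
    "uid": 2000,  # collides with the scalar already set by base_layer
    "cpu_limit": 2,
    "extra_pod_config": {"volumes": [{"name": "scratch"}]},
}

# A trailing None mimics a profile that defines no kubespawner_override.
merged = functools.reduce(deep_merge, [base_layer, profile_layer, None], {})

print(merged["uid"])  # the earlier layer wins on scalar conflicts
print(merged["cpu_limit"])  # keys present in only one layer are kept
print([v["name"] for v in merged["extra_pod_config"]["volumes"]])  # lists are concatenated
```

Under these rules a `None` entry in the list is harmless, since the accumulated dictionary on the left is returned unchanged.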



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/files/jupyterhub/04-auth.py
---

import asyncio
import json
import os
import time
import urllib.parse
from functools import reduce

from jupyterhub import scopes
from jupyterhub.traitlets import Callable
from oauthenticator.generic import GenericOAuthenticator
from traitlets import Bool, Unicode, Union


class KeyCloakOAuthenticator(GenericOAuthenticator):
    """
    Since `oauthenticator` 16.3 `GenericOAuthenticator` supports group management.
    This subclass adds role management on top of it, building on the new `manage_roles`
    feature added in JupyterHub 5.0 (https://github.com/jupyterhub/jupyterhub/pull/4748).
    """

    claim_roles_key = Union(
        [Unicode(os.environ.get("OAUTH2_ROLES_KEY", "groups")), Callable()],
        config=True,
        help="""As `claim_groups_key` but for roles.""",
    )

    realm_api_url = Unicode(
        config=True, help="""The keycloak REST API URL for the realm."""
    )

    reset_managed_roles_on_startup = Bool(True)

    async def update_auth_model(self, auth_model):
        """Updates and returns the auth_model dict.
        This function is called every time a user authenticates with JupyterHub, that is,
        every time a user logs in to Nebari.

        It fetches the roles and their corresponding scopes from Keycloak and
        returns the updated auth model, which updates the user's roles/scopes.
        Changes to a user's roles/scopes take effect only after the user logs
        in to Nebari again.
        """
        start = time.time()
        self.log.info("Updating user auth model")
        auth_model = await super().update_auth_model(auth_model)
        user_id = auth_model["auth_state"]["oauth_user"]["sub"]
        token = await self._get_token()

        jupyterhub_client_id = await self._get_jupyterhub_client_id(token=token)
        user_info = auth_model["auth_state"][self.user_auth_state_key]
        user_roles_from_claims = self._get_user_roles(user_info=user_info)
        keycloak_api_call_start = time.time()
        user_roles = await self._get_client_roles_for_user(
            user_id=user_id, client_id=jupyterhub_client_id, token=token
        )
        user_roles_rich = await self._get_roles_with_attributes(
            roles=user_roles, client_id=jupyterhub_client_id, token=token
        )

        # Include which groups have permission to mount shared directories (used by
        # profiles.py)
        auth_model["auth_state"]["groups_with_permission_to_mount"] = (
            await self.get_client_groups_with_mount_permissions(
                user_groups=auth_model["auth_state"]["oauth_user"]["groups"],
                user_roles=user_roles_rich,
                client_id=jupyterhub_client_id,
                token=token,
            )
        )

        keycloak_api_call_time_taken = time.time() - keycloak_api_call_start
        user_roles_rich_names = {role["name"] for role in user_roles_rich}

        user_roles_non_jhub_client = [
            {"name": role}
            for role in user_roles_from_claims - user_roles_rich_names
        ]

        auth_model["roles"] = [
            {
                "name": role["name"],
                "description": role.get("description"),
                "scopes": self._get_scope_from_role(role),
            }
            for role in [*user_roles_rich, *user_roles_non_jhub_client]
        ]

        # note: because the roles check is comprehensive, we need to re-add the admin and user roles
        if auth_model["admin"]:
            auth_model["roles"].append({"name": "admin"})

        if await self.check_allowed(auth_model["name"], auth_model):
            auth_model["roles"].append({"name": "user"})

        execution_time = time.time() - start

        self.log.info(
            f"Auth model update complete, time taken: {execution_time}s "
            f"time taken for keycloak api call: {keycloak_api_call_time_taken}s "
            f"delta between full execution and keycloak call: {execution_time - keycloak_api_call_time_taken}s"
        )
        return auth_model

    async def _get_jupyterhub_client_roles(self, jupyterhub_client_id, token):
        """Get roles for the client named 'jupyterhub'."""
        # Includes roles like "jupyterhub_admin", "jupyterhub_developer", "dask_gateway_developer"

        client_roles = await self._fetch_api(
            endpoint=f"clients/{jupyterhub_client_id}/roles", token=token
        )
        client_roles_rich = await self._get_roles_with_attributes(
            client_roles, client_id=jupyterhub_client_id, token=token
        )
        return client_roles_rich

    async def _get_jupyterhub_client_id(self, token):
        # Get the clients list to find the "id" of "jupyterhub" client.
        clients_data = await self._fetch_api(endpoint="clients/", token=token)
        jupyterhub_clients = [
            client for client in clients_data if client["clientId"] == "jupyterhub"
        ]
        assert len(jupyterhub_clients) == 1
        jupyterhub_client_id = jupyterhub_clients[0]["id"]
        return jupyterhub_client_id

    async def load_managed_roles(self):
        self.log.info("Loading managed roles")
        if not self.manage_roles:
            raise ValueError(
                "Managed roles can only be loaded when `manage_roles` is True"
            )
        token = await self._get_token()
        jupyterhub_client_id = await self._get_jupyterhub_client_id(token=token)
        client_roles_rich = await self._get_jupyterhub_client_roles(
            jupyterhub_client_id=jupyterhub_client_id, token=token
        )

        # Includes roles like "default-roles-nebari", "offline_access", "uma_authorization"
        realm_roles = await self._fetch_api(endpoint="roles", token=token)
        roles = {
            role["name"]: {
                "name": role["name"],
                # tolerate roles without a description, as update_auth_model does
                "description": role.get("description"),
                "scopes": self._get_scope_from_role(role),
            }
            for role in [*realm_roles, *client_roles_rich]
        }

        # we could use either `name` (e.g. "developer") or `path` ("/developer");
        # since the default claim key returns `path`, it seems preferable.
        for realm_role in realm_roles:
            role_name = realm_role["name"]
            role = roles[role_name]
            # fetch role assignments to groups
            role.update(
                await self._get_users_and_groups_for_role(
                    role_name,
                    token=token,
                )
            )

        for client_role in client_roles_rich:
            role_name = client_role["name"]
            role = roles[role_name]
            # fetch role assignments to groups
            role.update(
                await self._get_users_and_groups_for_role(
                    role_name,
                    token=token,
                    client_id=jupyterhub_client_id,
                )
            )

        return list(roles.values())

    async def get_client_groups_with_mount_permissions(
        self, user_groups, user_roles, client_id, token
    ):
        """
        Asynchronously retrieves the list of client groups with mount permissions
        that the user belongs to.
        """

        roles_with_permission = []
        groups_with_permission_to_mount = set()

        # Filter roles with the shared-directory component and scope
        for role in user_roles:
            attributes = role.get("attributes", {})

            role_component = attributes.get("component", [None])[0]
            role_scopes = attributes.get("scopes", [None])[0]

            if (
                role_component == "shared-directory"
                and role_scopes == "write:shared-mount"
            ):
                role_name = role.get("name")
                roles_with_permission.append(role_name)

        # Fetch groups for all relevant roles concurrently
        group_fetch_tasks = [
            self._fetch_api(
                endpoint=f"clients/{client_id}/roles/{role_name}/groups",
                token=token,
            )
            for role_name in roles_with_permission
        ]

        all_role_groups = await asyncio.gather(*group_fetch_tasks)

        # Collect group names with permissions
        for role_groups in all_role_groups:
            groups_with_permission_to_mount |= {group["path"] for group in role_groups}

        return list(groups_with_permission_to_mount & set(user_groups))

    async def _get_users_and_groups_for_role(
        self, role_name, token, client_id=None, group_name_key="path"
    ):
        """
        Asynchronously fetches the groups and users assigned to a specified role.

        Returns:
            dict: The groups (identified by path or name, depending on
            `group_name_key`) and usernames mapped to the role, for example:

            {
                "groups": ["/group1", "/group2"],
                "users": ["user1", "user2"],
            }
        """
        # Prepare endpoints
        group_endpoint = f"roles/{role_name}/groups"
        user_endpoint = f"roles/{role_name}/users"

        if client_id:
            group_endpoint = f"clients/{client_id}/roles/{role_name}/groups"
            user_endpoint = f"clients/{client_id}/roles/{role_name}/users"

        # fetch role assignments to groups and users concurrently
        groups, users = await asyncio.gather(
            *[
                self._fetch_api(endpoint=group_endpoint, token=token),
                self._fetch_api(endpoint=user_endpoint, token=token),
            ]
        )

        # Process results
        return {
            "groups": [group[group_name_key] for group in groups],
            "users": [user["username"] for user in users],
        }

    def _get_scope_from_role(self, role):
        """Return scopes from role if the component is jupyterhub"""
        role_scopes = role.get("attributes", {}).get("scopes", [])
        component = role.get("attributes", {}).get("component", [])
        # Attributes are returned as a single-element array, unless `##` delimiter is used in Keycloak
        # See this: https://stackoverflow.com/questions/68954733/keycloak-client-role-attribute-array
        if component == ["jupyterhub"] and role_scopes:
            return self.validate_scopes(role_scopes[0].split(","))
        else:
            return []

    def validate_scopes(self, role_scopes):
        """Validate role scopes to sanity check user provided scopes from keycloak"""
        self.log.info(f"Validating role scopes: {role_scopes}")
        try:
            # This is not a public function, but there isn't any alternative
            # method to verify scopes, and we do need this sanity check
            # because invalid scopes could cause the hub pod to fail
            scopes._check_scopes_exist(role_scopes)
            return role_scopes
        except scopes.ScopeNotFound as e:
            self.log.error(f"Invalid scopes, skipping: {role_scopes} ({e})")
        return []

    async def _get_roles_with_attributes(self, roles: list, client_id: str, token: str):
        """Fetch each role by id to retrieve its attributes."""
        roles_rich = []
        for role in roles:
            # If this takes too much time, which isn't the case right now, the
            # requests could be issued concurrently (e.g. with asyncio.gather)
            role_rich = await self._fetch_api(
                endpoint=f"roles-by-id/{role['id']}?client={client_id}", token=token
            )
            roles_rich.append(role_rich)
        return roles_rich

    async def _get_client_roles_for_user(self, user_id, client_id, token):
        user_roles = await self._fetch_api(
            endpoint=f"users/{user_id}/role-mappings/clients/{client_id}/composite",
            token=token,
        )
        return user_roles

    def _get_user_roles(self, user_info):
        if callable(self.claim_roles_key):
            return set(self.claim_roles_key(user_info))
        try:
            return set(reduce(dict.get, self.claim_roles_key.split("."), user_info))
        except TypeError:
            self.log.error(
                f"The claim_roles_key {self.claim_roles_key} does not exist in the user token"
            )
            return set()

    async def _get_token(self) -> str:
        http = self.http_client

        body = urllib.parse.urlencode(
            {
                "client_id": self.client_id,
                "client_secret": self.client_secret,
                "grant_type": "client_credentials",
            }
        )
        response = await http.fetch(
            self.token_url,
            method="POST",
            body=body,
        )
        data = json.loads(response.body)
        return data["access_token"]  # type: ignore[no-any-return]

    async def _fetch_api(self, endpoint: str, token: str):
        response = await self.http_client.fetch(
            f"{self.realm_api_url}/{endpoint}",
            method="GET",
            headers={"Authorization": f"Bearer {token}"},
        )
        return json.loads(response.body)


c.JupyterHub.authenticator_class = KeyCloakOAuthenticator
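

The dotted-key lookup in `_get_user_roles` walks nested dictionaries with `reduce(dict.get, ...)` and relies on `TypeError` to detect a missing segment. The standalone sketch below shows that behaviour in isolation. The function name `roles_from_claim` and the sample `user_info` payload are illustrative assumptions, not part of the authenticator or of any real Keycloak response.

```python
from functools import reduce


def roles_from_claim(user_info, claim_key):
    """Resolve a dotted claim key against nested user info and return a set of role names."""
    try:
        return set(reduce(dict.get, claim_key.split("."), user_info))
    except TypeError:
        # Raised when a path segment is missing: either dict.get is applied
        # to None part-way through the path, or set(None) is attempted.
        return set()


# Hypothetical userinfo payload; the key names are illustrative only.
user_info = {
    "preferred_username": "alice",
    "groups": ["/developer", "/analyst"],
    "resource_access": {"jupyterhub": {"roles": ["jupyterhub_developer"]}},
}

flat = roles_from_claim(user_info, "groups")
nested = roles_from_claim(user_info, "resource_access.jupyterhub.roles")
missing = roles_from_claim(user_info, "resource_access.grafana.roles")

print(sorted(flat))
print(sorted(nested))
print(missing)
```

A missing key therefore degrades to an empty role set rather than an exception, which matches the fallback branch in the method above (the real method also logs an error).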



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/files/jupyterlab/overrides.json
---

{
    "dask-labextension:plugin": {
        "browserDashboardCheck": true
    },
    "jupyterlab-conda-store:plugin": {
        "apiUrl": "/conda-store/",
        "authMethod": "cookie",
        "loginUrl": "/conda-store/login?next=",
        "authToken": "",
        "addMainMenuItem": false
    },
    "@jupyterlab/apputils-extension:notification": {
        "checkForUpdates": false,
        "fetchNews": "false"
    },
    "@jupyterlab/mainmenu-extension:plugin": {
        "menus": [
            {
                "id": "jp-mainmenu-file",
                "items": [
                    {
                        "command": "help:open",
                        "rank": 0,
                        "args": {
                            "url": "/hub/home",
                            "text": "Home",
                            "newBrowserTab": true
                        }
                    },
                    {
                        "type": "submenu",
                        "submenu": {
                            "id": "jp-mainmenu-file-new"
                        },
                        "rank": 0.5
                    },
                    {
                        "command": "hub:control-panel",
                        "disabled": true
                    },
                    {
                        "command": "hub:logout",
                        "disabled": true
                    }
                ]
            },
            {
                "id": "jp-mainmenu-services",
                "disabled": false,
                "label": "Services",
                "rank": 1000,
                "items": [
                    {
                        "command": "nebari:run-first-enabled",
                        "args": {
                            "commands": [
                                {
                                  "id": "condastore:open",
                                  "label": "Environment Management"
                                },
                                {
                                  "id": "help:open",
                                  "args": {
                                    "url": "/conda-store",
                                    "text": "Environment Management",
                                    "newBrowserTab": true
                                  }
                                }
                            ]
                        },
                        "rank": 1
                    },
                    {
                        "command": "help:open",
                        "rank": 2,
                        "args": {
                            "url": "/auth/admin/nebari/console",
                            "text": "User Management",
                            "newBrowserTab": true
                        }
                    },
                    {
                        "command": "help:open",
                        "rank": 3,
                        "args": {
                            "url": "/monitoring",
                            "text": "Monitoring",
                            "newBrowserTab": true
                        }
                    },
                    {
                        "command": "help:open",
                        "rank": 4,
                        "args": {
                            "url": "/argo",
                            "text": "Argo Workflows",
                            "newBrowserTab": true
                        }
                    },
                    {
                        "command": "nebari:open-proxy",
                        "rank": 5,
                        "args": {
                            "name": "vscode"
                        }
                    }
                ]
            },
            {
                "id": "jp-mainmenu-help",
                "rank": 1001,
                "items": [
                    {
                        "command": "help:open",
                        "rank": 1001,
                        "args": {
                            "url": "https://www.nebari.dev/docs/welcome/",
                            "text": "Nebari documentation",
                            "newBrowserTab": true
                        }
                    }
                ]
            }
        ]
    }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/configmaps.tf
---

locals {
  jupyter-notebook-config-py-template = templatefile("${path.module}/files/jupyter/jupyter_server_config.py.tpl", {
    terminal_cull_inactive_timeout      = var.idle-culler-settings.terminal_cull_inactive_timeout
    terminal_cull_interval              = var.idle-culler-settings.terminal_cull_interval
    kernel_cull_idle_timeout            = var.idle-culler-settings.kernel_cull_idle_timeout
    kernel_cull_interval                = var.idle-culler-settings.kernel_cull_interval
    kernel_cull_connected               = var.idle-culler-settings.kernel_cull_connected ? "True" : "False" # for Python compatible boolean values
    kernel_cull_busy                    = var.idle-culler-settings.kernel_cull_busy ? "True" : "False"      # for Python compatible boolean values
    server_shutdown_no_activity_timeout = var.idle-culler-settings.server_shutdown_no_activity_timeout
    jupyterlab_preferred_dir            = var.jupyterlab-preferred-dir != null ? var.jupyterlab-preferred-dir : ""
    }
  )
}

locals {
  jupyterlab-overrides-json-object = merge(
    jsondecode(file("${path.module}/files/jupyterlab/overrides.json")),
    var.jupyterlab-default-settings
  )
}

locals {
  jupyter-pioneer-config-py-template = templatefile("${path.module}/files/jupyter/jupyter_jupyterlab_pioneer_config.py.tpl", {
    log_format = var.jupyterlab-pioneer-log-format != null ? var.jupyterlab-pioneer-log-format : ""
    }
  )
}


resource "local_file" "jupyter_server_config_py" {
  content  = local.jupyter-notebook-config-py-template
  filename = "${path.module}/files/jupyter/jupyter_server_config.py"

  provisioner "local-exec" {
    # check the syntax of the config file without running it
    command = "python -m py_compile ${self.filename}"
  }
}

resource "local_file" "jupyter_jupyterlab_pioneer_config_py" {
  content  = local.jupyter-pioneer-config-py-template
  filename = "${path.module}/files/jupyter/jupyter_jupyterlab_pioneer_config.py"

  provisioner "local-exec" {
    # check the syntax of the config file without running it
    command = "python -m py_compile ${self.filename}"
  }
}

resource "local_sensitive_file" "jupyter_gallery_config_json" {
  content = jsonencode({
    "GalleryManager" = var.jupyterlab-gallery-settings
  })
  filename = "${path.module}/files/jupyter/jupyter_gallery_config.json"
}


resource "local_file" "overrides_json" {
  content  = jsonencode(local.jupyterlab-overrides-json-object)
  filename = "${path.module}/files/jupyterlab/overrides.json"
}

resource "local_file" "page_config_json" {
  content = jsonencode({
    "disabledExtensions" : {
      "jupyterlab-jhub-apps" : !var.jhub-apps-enabled
    },
    # `lockedExtensions` is an empty dict to signify that `jupyterlab-jhub-apps` is not being disabled and locked (but only disabled)
    # which means users are still allowed to disable the jupyterlab-jhub-apps extension (if they have write access to page_config).
    "lockedExtensions" : {}
  })
  filename = "${path.module}/files/jupyterlab/page_config.json"
}

resource "kubernetes_config_map" "etc-ipython" {
  metadata {
    name      = "etc-ipython"
    namespace = var.namespace
  }

  data = {
    for filename in fileset("${path.module}/files/ipython", "*") :
    filename => file("${path.module}/files/ipython/${filename}")
  }
}


locals {
  etc-jupyter-config-data = merge(
    {
      "jupyter_server_config.py"    = local_file.jupyter_server_config_py.content,
      "jupyter_gallery_config.json" = local_sensitive_file.jupyter_gallery_config_json.content,
    },
    var.jupyterlab-pioneer-enabled ? {
      # quotes are required here, as terraform would otherwise treat py as a property of
      # a defined resource jupyter_jupyterlab_pioneer_config
      "jupyter_jupyterlab_pioneer_config.py" = local_file.jupyter_jupyterlab_pioneer_config_py.content
    } : {}
  )
}

locals {
  etc-jupyterlab-settings = {
    "overrides.json" = local_file.overrides_json.content
  }
  etc-jupyterlab-page-config = {
    "page_config.json" = local_file.page_config_json.content
  }
}

resource "kubernetes_config_map" "etc-jupyter" {
  depends_on = [
    local_file.jupyter_server_config_py,
    local_file.jupyter_jupyterlab_pioneer_config_py,
    local_sensitive_file.jupyter_gallery_config_json
  ]

  metadata {
    name      = "etc-jupyter"
    namespace = var.namespace
  }

  data = local.etc-jupyter-config-data
}


resource "kubernetes_config_map" "etc-skel" {
  metadata {
    name      = "etc-skel"
    namespace = var.namespace
  }

  data = {
    for filename in fileset("${path.module}/files/skel", "*") :
    filename => file("${path.module}/files/skel/${filename}")
  }
}


resource "kubernetes_config_map" "jupyterlab-settings" {
  depends_on = [
    local_file.overrides_json
  ]

  metadata {
    name      = "jupyterlab-settings"
    namespace = var.namespace
  }

  data = local.etc-jupyterlab-settings
}


resource "kubernetes_config_map" "jupyterlab-page-config" {
  depends_on = [
    local_file.page_config_json
  ]

  metadata {
    name      = "jupyterlab-page-config"
    namespace = var.namespace
  }

  data = local.etc-jupyterlab-page-config
}

resource "kubernetes_config_map" "git_clone_update" {
  metadata {
    name      = "git-clone-update"
    namespace = var.namespace
  }

  data = {
    "git-clone-update.sh" = file("${path.module}/files/extras/git_clone_update.sh")
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/main.tf
---

resource "random_password" "service_token" {
  for_each = var.services

  length  = 32
  special = false
}

resource "random_password" "proxy_secret_token" {
  length  = 32
  special = false
}

resource "random_password" "jhub_apps_jwt_secret" {
  length  = 32
  special = false
}

locals {
  jhub_apps_secrets_name           = "jhub-apps-secrets"
  jhub_apps_env_var_name           = "JHUB_APP_JWT_SECRET_KEY"
  singleuser_nodeselector_key      = var.cloud-provider == "aws" ? "dedicated" : var.user-node-group.key
  userscheduler_nodeselector_key   = var.cloud-provider == "aws" ? "dedicated" : var.general-node-group.key
  userscheduler_nodeselector_value = var.general-node-group.value
}

resource "kubernetes_secret" "jhub_apps_secrets" {
  metadata {
    name      = local.jhub_apps_secrets_name
    namespace = var.namespace
  }

  data = {
    jwt_secret_key = random_password.jhub_apps_jwt_secret.result
  }

  type = "Opaque"
}

locals {
  jupyterhub_env_vars = [
    {
      name = local.jhub_apps_env_var_name,
      valueFrom : {
        secretKeyRef : {
          name : local.jhub_apps_secrets_name
          key : "jwt_secret_key"
        }
      }
    }
  ]
}


resource "helm_release" "jupyterhub" {
  name      = "jupyterhub-${var.namespace}"
  namespace = var.namespace

  repository = "https://jupyterhub.github.io/helm-chart/"
  chart      = "jupyterhub"
  version    = "4.0.0-0.dev.git.6707.h109668fd"

  values = concat([
    file("${path.module}/values.yaml"),
    jsonencode({
      # custom values can be accessed via z2jh.get_config('custom.<path>')
      custom = {
        namespace                     = var.namespace
        external-url                  = var.external-url
        theme                         = var.theme
        profiles                      = var.profiles
        argo-workflows-enabled        = var.argo-workflows-enabled
        home-pvc                      = var.home-pvc.name
        shared-pvc                    = var.shared-pvc.name
        conda-store-pvc               = var.conda-store-pvc
        conda-store-mount             = var.conda-store-mount
        default-conda-store-namespace = var.default-conda-store-namespace
        conda-store-service-name      = var.conda-store-service-name
        conda-store-jhub-apps-token   = var.conda-store-jhub-apps-token
        jhub-apps-enabled             = var.jhub-apps-enabled
        jhub-apps-overrides           = var.jhub-apps-overrides
        initial-repositories          = var.initial-repositories
        skel-mount = {
          name      = kubernetes_config_map.etc-skel.metadata.0.name
          namespace = kubernetes_config_map.etc-skel.metadata.0.namespace
        }
        extra-mounts = merge(
          var.extra-mounts,
          {
            "/etc/ipython" = {
              name      = kubernetes_config_map.etc-ipython.metadata.0.name
              namespace = kubernetes_config_map.etc-ipython.metadata.0.namespace
              kind      = "configmap"
            }

            "/etc/jupyter" = {
              name      = kubernetes_config_map.etc-jupyter.metadata.0.name
              namespace = kubernetes_config_map.etc-jupyter.metadata.0.namespace
              kind      = "configmap"
            }

            "/opt/conda/envs/default/share/jupyter/lab/settings" = {
              name      = kubernetes_config_map.jupyterlab-settings.metadata.0.name
              namespace = kubernetes_config_map.jupyterlab-settings.metadata.0.namespace
              kind      = "configmap"
            }

            "/etc/jupyter/labconfig" = {
              name      = kubernetes_config_map.jupyterlab-page-config.metadata.0.name
              namespace = kubernetes_config_map.jupyterlab-page-config.metadata.0.namespace
              kind      = "configmap"
            }
          }
        )
        environments = var.conda-store-environments
      }

      hub = {
        image = var.jupyterhub-image
        nodeSelector = {
          "${var.general-node-group.key}" = var.general-node-group.value
        }

        extraVolumes = [{
          name = "conda-store-shared"
          persistentVolumeClaim = {
            claimName = var.conda-store-pvc
          }
        }]

        extraVolumeMounts = [{
          mountPath = var.conda-store-mount
          name      = "conda-store-shared"
        }]

        extraConfig = {
          "01-theme.py"    = file("${path.module}/files/jupyterhub/01-theme.py")
          "02-spawner.py"  = file("${path.module}/files/jupyterhub/02-spawner.py")
          "03-profiles.py" = file("${path.module}/files/jupyterhub/03-profiles.py")
          "04-auth.py"     = file("${path.module}/files/jupyterhub/04-auth.py")
        }

        services = {
          for service in var.services : service => {
            name      = service
            admin     = true
            api_token = random_password.service_token[service].result
          }
        }

        # for simple key value configuration with jupyterhub traitlets
        # this hub.config property should be used
        config = {
          Authenticator = {
            enable_auth_state = true
          }
          KeyCloakOAuthenticator = {
            client_id            = module.jupyterhub-openid-client.config.client_id
            client_secret        = module.jupyterhub-openid-client.config.client_secret
            oauth_callback_url   = "https://${var.external-url}/hub/oauth_callback"
            authorize_url        = module.jupyterhub-openid-client.config.authentication_url
            token_url            = module.jupyterhub-openid-client.config.token_url
            userdata_url         = module.jupyterhub-openid-client.config.userinfo_url
            realm_api_url        = module.jupyterhub-openid-client.config.realm_api_url
            login_service        = "Keycloak"
            username_claim       = "preferred_username"
            claim_groups_key     = "groups"
            claim_roles_key      = "roles"
            allowed_groups       = ["/analyst", "/developer", "/admin", "jupyterhub_admin", "jupyterhub_developer"]
            admin_groups         = ["/admin", "jupyterhub_admin"]
            manage_groups        = true
            manage_roles         = true
            refresh_pre_spawn    = true
            validate_server_cert = false

            # deprecated, to be removed (replaced by validate_server_cert)
            tls_verify = false
            # deprecated, to be removed (replaced by username_claim)
            username_key = "preferred_username"
          }
        }
      }

      proxy = {
        chp = {
          nodeSelector = {
            "${var.general-node-group.key}" = var.general-node-group.value
          }
        }
      }

      singleuser = {
        image = var.jupyterlab-image
        nodeSelector = {
          "${local.singleuser_nodeselector_key}" = var.user-node-group.value
        }
      }

      scheduling = {
        userScheduler = {
          nodeSelector = {
            "${local.userscheduler_nodeselector_key}" = local.userscheduler_nodeselector_value
          }
        }
      }
    })],
    var.overrides,
    [jsonencode({
      hub = {
        extraEnv = concat([
          {
            name  = "OAUTH_LOGOUT_REDIRECT_URL",
            value = format("%s?redirect_uri=%s", "https://${var.external-url}/auth/realms/${var.realm_id}/protocol/openid-connect/logout", urlencode(var.jupyterhub-logout-redirect-url))
          },
          ],
          local.jupyterhub_env_vars,
          jsondecode(var.jupyterhub-hub-extraEnv)
        )
      }
    })]
  )

  set {
    name  = "proxy.secretToken"
    value = random_password.proxy_secret_token.result
  }

  depends_on = [
    var.home-pvc,
    var.shared-pvc,
  ]

  lifecycle {
    replace_triggered_by = [
      null_resource.home-pvc,
    ]
  }

}

# hack to force the helm release to be replaced when the home pvc changes
resource "null_resource" "home-pvc" {
  triggers = {
    home-pvc = var.home-pvc.id
  }
}

resource "kubernetes_manifest" "jupyterhub" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "IngressRoute"
    metadata = {
      name      = "jupyterhub"
      namespace = var.namespace
    }
    spec = {
      entryPoints = ["websecure"]
      routes = [
        {
          kind  = "Rule"
          match = "Host(`${var.external-url}`) && (Path(`/`) || PathPrefix(`/hub`) || PathPrefix(`/user`) || PathPrefix(`/services`))"
          services = [
            {
              name = "proxy-public"
              port = 80
            }
          ]
          middlewares = [
            {
              name      = kubernetes_manifest.jupyterhub-proxy-add-slash.manifest.metadata.name
              namespace = var.namespace
            }
          ]
        },
        {
          kind  = "Rule"
          match = "Host(`${var.external-url}`) && (PathPrefix(`/home`) || PathPrefix(`/token`) || PathPrefix(`/admin`))"
          middlewares = [
            {
              name      = kubernetes_manifest.jupyterhub-middleware-addprefix.manifest.metadata.name
              namespace = var.namespace
            }
          ]
          services = [
            {
              name = "proxy-public"
              port = 80
            }
          ]
        }
      ]
    }
  }
}


module "jupyterhub-openid-client" {
  source = "../keycloak-client"

  realm_id     = var.realm_id
  client_id    = "jupyterhub"
  external-url = var.external-url
  role_mapping = {
    "admin"     = ["jupyterhub_admin", "dask_gateway_admin"]
    "developer" = ["jupyterhub_developer", "dask_gateway_developer"]
    "analyst"   = ["jupyterhub_developer"]
  }
  client_roles = [
    {
      "name" : "allow-app-sharing-role",
      "description" : "Allow app sharing for apps created via JupyterHub App Launcher (jhub-apps)",
      "groups" : [],
      "attributes" : {
        # grants permissions to share server
        # grants permissions to read other users' names
        # grants permissions to read other groups' names
        # The latter two are required for sharing with a group or user
        "scopes" : "shares,read:users:name,read:groups:name"
        "component" : "jupyterhub"
      }
    },
    {
      "name" : "allow-read-access-to-services-role",
      "description" : "Allow read access to services, such that they are visible on the home page e.g. conda-store",
      # Adding it to analyst group such that it's applied to every user.
      "groups" : ["analyst"],
      "attributes" : {
        # grants permissions to read services
        "scopes" : "read:services",
        "component" : "jupyterhub"
      }
    },
    {
      "name" : "allow-group-directory-creation-role",
      "description" : "Grants a group the ability to manage the creation of its corresponding mounted directory.",
      "groups" : ["admin", "analyst", "developer"],
      "attributes" : {
        # grants permissions to mount group folder to shared dir
        "scopes" : "write:shared-mount",
        "component" : "shared-directory"
      }
    },
  ]
  callback-url-paths = [
    "https://${var.external-url}/hub/oauth_callback",
    var.jupyterhub-logout-redirect-url
  ]
  jupyterlab_profiles_mapper = true
  service-accounts-enabled   = true
  service-account-roles = [
    "view-realm", "view-users", "view-clients"
  ]
}


resource "kubernetes_secret" "argo-workflows-conda-store-token" {
  metadata {
    name      = "argo-workflows-conda-store-token"
    namespace = var.namespace
  }

  data = {
    "conda-store-api-token"         = var.conda-store-argo-workflows-jupyter-scheduler-token
    "conda-store-service-name"      = var.conda-store-service-name
    "conda-store-service-namespace" = var.namespace
  }

  type = "Opaque"
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/middleware.tf
---

resource "kubernetes_manifest" "jupyterhub-middleware-addprefix" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "Middleware"
    metadata = {
      name      = "nebari-jupyterhub-add-prefix"
      namespace = var.namespace
    }
    spec = {
      addPrefix = {
        prefix = "/hub"
      }
    }
  }
}

resource "kubernetes_manifest" "jupyterhub-proxy-add-slash" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "Middleware"
    metadata = {
      name      = "nebari-jupyterhub-proxy-add-slash"
      namespace = var.namespace
    }
    spec = {
      redirectRegex = {
        regex       = "^https://${var.external-url}/user/([^/]+)/proxy/(\\d+)$"
        replacement = "https://${var.external-url}/user/$${1}/proxy/$${2}/"
        permanent   = true
      }
    }
  }
}
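
# Illustrative example for the redirect above (the host name is a placeholder,
# not a real deployment): with external-url = "nebari.example.com", a request to
#   https://nebari.example.com/user/alice/proxy/8888
# matches the regex (capture groups: "alice" and "8888") and is permanently
# redirected to
#   https://nebari.example.com/user/alice/proxy/8888/
# A URL that already ends in "/" or has a non-numeric port segment does not match
# and is left untouched.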



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/outputs.tf
---

output "internal_jupyterhub_url" {
  description = "internal url for jupyterhub"
  value       = "http://proxy-public.${var.namespace}:80"
}


output "services" {
  description = "Jupyterhub registered services"
  value = {
    for service in var.services : service => {
      name      = service
      api_token = random_password.service_token[service].result
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/values.yaml
---

# https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/1.2.0/jupyterhub/values.yaml
hub:
  db:
    type: sqlite-pvc
    pvc:
      storage: 1Gi
  baseUrl: "/"

  networkPolicy:
    ingress:
      - ports:
          - port: 10202
        from:
          - podSelector:
              matchLabels:
                hub.jupyter.org/network-access-hub: "true"

  service:
    extraPorts:
      - port: 10202
        targetPort: 10202
        name: jhub-apps

proxy:
  secretToken: "<placeholder>"
  service:
    type: ClusterIP
  chp:
    networkPolicy:
      egressAllowRules:
        cloudMetadataServer: false
        dnsPortsPrivateIPs: false
        nonPrivateIPs: false
        privateIPs: false

      egress:
        - ports:
          - port: 53
            protocol: UDP
          - port: 53
            protocol: TCP
          - port: 10202
            protocol: TCP
        - to:
          - ipBlock:
              cidr: 0.0.0.0/0

scheduling:
  userScheduler:
    enabled: true
  podPriority:
    enabled: true
  userPlaceholder:
    enabled: false
    replicas: 1

imagePullSecrets:
  - extcrcreds

singleuser:
  defaultUrl: "/lab"
  startTimeout: 600  # 10 minutes
  profileList: []
  storage:
    type: static
    extraVolumeMounts:
      - mountPath: "/home/shared"
        name: home
        subPath: "home/shared"
  cpu:
    limit: 1
    guarantee: 1
  memory:
    limit: "1G"
    guarantee: "1G"
  networkPolicy:
    enabled: false

# cull relates to the jupyterhub-idle-culler service, responsible for evicting
# inactive singleuser pods.
#
# The configuration below, except for enabled, corresponds to command-line flags
# for jupyterhub-idle-culler as documented here:
# https://github.com/jupyterhub/jupyterhub-idle-culler#as-a-standalone-script
#
cull:
  enabled: true
  users: false
  removeNamedServers: false
  timeout: 1800  # 30 minutes
  every: 600  # 10 minutes
  concurrency: 10
  maxAge: 0



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/variables.tf
---

variable "name" {
  description = "name for nebari deployment"
  type        = string
}

variable "namespace" {
  description = "Namespace for jupyterhub deployment"
  type        = string
}

variable "overrides" {
  description = "Jupyterhub helm chart list of overrides"
  type        = list(string)
  default     = []
}

variable "jupyterhub-image" {
  description = "Docker image to use for jupyterhub hub"
  type = object({
    name = string
    tag  = string
  })
}

variable "jupyterlab-image" {
  description = "Docker image to use for jupyterlab users"
  type = object({
    name = string
    tag  = string
  })
}

variable "general-node-group" {
  description = "Node group key value pair for bound general resources"
  type = object({
    key   = string
    value = string
  })
}

variable "user-node-group" {
  description = "Node group key value pair for bound user resources"
  type = object({
    key   = string
    value = string
  })
}

variable "home-pvc" {
  description = "Persistent volume claim (name and id) to use for home directories; uses /home/{username}"
  type = object({
    name = string
    id   = string
  })
}

variable "shared-pvc" {
  description = "Persistent volume claim (name and id) to use for shared directories; uses /share/{group}"
  type = object({
    name = string
    id   = string
  })
}

variable "conda-store-pvc" {
  description = "Name for persistent volume claim to use for conda-store directory"
  type        = string
}

variable "conda-store-mount" {
  description = "Mount directory for conda-store environments"
  type        = string
}

variable "extra-mounts" {
  description = "Name of additional configmaps and pvcs to be mounted within jupyterlab image"
  default     = {}
}

variable "external-url" {
  description = "External URL at which the JupyterHub cluster is accessible"
  type        = string
}

variable "realm_id" {
  description = "Keycloak realm to use for deploying openid client"
  type        = string
}

variable "services" {
  description = "Set of services that use the jupyterhub api"
  type        = set(string)
}

variable "theme" {
  description = "JupyterHub theme"
  type        = map(any)
  default     = {}
}

variable "profiles" {
  description = "JupyterHub profiles"
  default     = []
}

variable "conda-store-service-name" {
  description = "Name of conda-store service"
  type        = string
}

variable "conda-store-jhub-apps-token" {
  description = "Token for conda-store to be used by jhub apps for fetching conda environments dynamically."
  type        = string
}

variable "conda-store-environments" {
  description = "conda environments from conda-store in filesystem namespace"
  type        = any
  default     = {}
}

variable "jhub-apps-enabled" {
  description = "Enable/Disable JupyterHub Apps extension to spin up apps, dashboards, etc"
  type        = bool
}

variable "jhub-apps-overrides" {
  description = "jhub-apps configuration overrides"
  type        = string
}

variable "conda-store-argo-workflows-jupyter-scheduler-token" {
  description = "Token for argo-workflows-jupyter-scheduler to use conda-store"
  type        = string
}

variable "jupyterhub-logout-redirect-url" {
  description = "Next redirect destination following a Keycloak logout"
  type        = string
  default     = ""
}

variable "jupyterhub-hub-extraEnv" {
  description = "Extracted overrides to merge with jupyterhub.hub.extraEnv"
  type        = string
  default     = "[]"
}
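
# Illustrative example (the env var name and value below are hypothetical): since
# main.tf passes this string through jsondecode() and concatenates the result onto
# hub.extraEnv, it is expected to hold a JSON list of env var objects, e.g.
#
#   jupyterhub-hub-extraEnv = jsonencode([
#     { name = "EXAMPLE_FLAG", value = "1" }
#   ])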

variable "default-conda-store-namespace" {
  description = "Default conda-store namespace"
  type        = string
}

variable "idle-culler-settings" {
  description = "Idle culler timeout settings (in minutes)"
  type = object({
    kernel_cull_busy                    = bool
    kernel_cull_connected               = bool
    kernel_cull_idle_timeout            = number
    kernel_cull_interval                = number
    server_shutdown_no_activity_timeout = number
    terminal_cull_inactive_timeout      = number
    terminal_cull_interval              = number
  })
}
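
# Illustrative example (the numbers are placeholders, not project defaults):
#
#   idle-culler-settings = {
#     kernel_cull_busy                    = false
#     kernel_cull_connected               = true
#     kernel_cull_idle_timeout            = 15
#     kernel_cull_interval                = 5
#     server_shutdown_no_activity_timeout = 15
#     terminal_cull_inactive_timeout      = 15
#     terminal_cull_interval              = 5
#   }
#
# The two booleans are rendered as "True"/"False" strings in configmaps.tf so the
# generated jupyter_server_config.py receives Python-compatible literals.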

variable "argo-workflows-enabled" {
  description = "Enable Argo Workflows"
  type        = bool
}

variable "jupyterlab-default-settings" {
  description = "Default settings for JupyterLab to be placed in overrides.json"
  type        = map(any)
}

variable "jupyterlab-gallery-settings" {
  description = "Server-side settings for jupyterlab-gallery extension"
  type = object({
    title                         = optional(string)
    destination                   = optional(string)
    hide_gallery_without_exhibits = optional(bool)
    exhibits = list(object({
      git         = string
      title       = string
      homepage    = optional(string)
      description = optional(string)
      icon        = optional(string)
      account     = optional(string)
      token       = optional(string)
      branch      = optional(string)
      depth       = optional(number)
    }))
  })
}
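
# Illustrative example (the title and repository url are hypothetical): only
# `exhibits`, and within each exhibit `git` and `title`, are required; attributes
# declared with optional() may be omitted.
#
#   jupyterlab-gallery-settings = {
#     title = "Examples"
#     exhibits = [
#       {
#         git   = "https://example.com/org/example-notebooks.git"
#         title = "Example notebooks"
#       }
#     ]
#   }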

variable "jupyterlab-pioneer-enabled" {
  description = "Enable JupyterLab Pioneer for telemetry"
  type        = bool
}

variable "jupyterlab-pioneer-log-format" {
  description = "Logging format for JupyterLab Pioneer"
  type        = string
}

variable "jupyterlab-preferred-dir" {
  description = "Directory in which JupyterLab should open the file browser"
  type        = string
}

variable "cloud-provider" {
  description = "Name of cloud provider."
  type        = string
}

variable "initial-repositories" {
  description = "JSON-encoded list of maps from folder location to git repo url to clone"
  type        = string
  default     = "[]"
}
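
# Illustrative example (an assumption about the expected shape, inferred from the
# description and the "[]" default; the folder and url are hypothetical):
#
#   initial-repositories = jsonencode([
#     { "examples/demo" = "https://example.com/org/demo.git" }
#   ])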



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub-ssh/main.tf
---

resource "kubernetes_manifest" "jupyterhub-ssh-ingress" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "IngressRouteTCP"
    metadata = {
      name      = "jupyterhub-ssh-ingress"
      namespace = var.namespace
    }
    spec = {
      entryPoints = ["ssh"]
      routes = [
        {
          match = "HostSNI(`*`)"
          services = [
            {
              name = kubernetes_service.jupyterhub-ssh.metadata.0.name
              port = 8022
            }
          ]
        }
      ]
    }
  }
}


resource "kubernetes_manifest" "jupyterhub-sftp-ingress" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "IngressRouteTCP"
    metadata = {
      name      = "jupyterhub-sftp-ingress"
      namespace = var.namespace
    }
    spec = {
      entryPoints = ["sftp"]
      routes = [
        {
          match = "HostSNI(`*`)"
          services = [
            {
              name = kubernetes_service.jupyterhub-sftp.metadata.0.name
              port = 8023
            }
          ]
        }
      ]
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub-ssh/sftp.tf
---

resource "kubernetes_secret" "jupyterhub-sftp" {
  metadata {
    name      = "${var.name}-jupyterhub-sftp"
    namespace = var.namespace
  }

  data = {
    "hostKey" = tls_private_key.main.private_key_pem
    "hubUrl"  = var.jupyterhub_api_url
  }
}


resource "kubernetes_service" "jupyterhub-sftp" {
  metadata {
    name      = "${var.name}-jupyterhub-sftp"
    namespace = var.namespace
  }

  spec {
    selector = {
      "app.kubernetes.io/component" = "jupyterhub-sftp"
    }

    port {
      name        = "sftp"
      protocol    = "TCP"
      port        = 8023
      target_port = "sftp"
    }

    type = "ClusterIP"
  }
}


resource "kubernetes_deployment" "jupyterhub-sftp" {
  metadata {
    name      = "${var.name}-jupyterhub-sftp"
    namespace = var.namespace
  }

  spec {
    replicas = 1

    selector {
      match_labels = {
        "app.kubernetes.io/component" = "jupyterhub-sftp"
      }
    }

    template {
      metadata {
        labels = {
          "app.kubernetes.io/component" = "jupyterhub-sftp"
        }
      }

      spec {
        automount_service_account_token = true

        affinity {
          node_affinity {
            required_during_scheduling_ignored_during_execution {
              node_selector_term {
                match_expressions {
                  key      = var.node-group.key
                  operator = "In"
                  values   = [var.node-group.value]
                }
              }
            }
          }
        }

        volume {
          name = "home"
          persistent_volume_claim {
            claim_name = var.persistent_volume_claim.name
          }
        }

        volume {
          name = "config"
          config_map {
            name = kubernetes_config_map.jupyterhub-ssh.metadata.0.name
          }
        }

        volume {
          name = "secrets"
          secret {
            secret_name  = kubernetes_secret.jupyterhub-sftp.metadata.0.name
            default_mode = "0600"
          }
        }

        container {
          name              = "jupyterhub-sftp"
          image             = "${var.jupyterhub-sftp-image.name}:${var.jupyterhub-sftp-image.tag}"
          image_pull_policy = "Always"

          security_context {
            privileged = true
          }

          volume_mount {
            name       = "home"
            mount_path = "/mnt/home"
            sub_path   = "home"
          }

          volume_mount {
            name       = "config"
            mount_path = "/etc/jupyterhub-ssh/config"
            read_only  = true
          }

          volume_mount {
            name       = "secrets"
            mount_path = "/etc/jupyterhub-sftp/config"
            read_only  = true
          }

          port {
            name           = "sftp"
            container_port = 22
            protocol       = "TCP"
          }
        }
      }
    }
  }
  lifecycle {
    replace_triggered_by = [
      null_resource.pvc,
    ]
  }
}

# hack to force the deployment to update when the pvc changes
resource "null_resource" "pvc" {
  triggers = {
    pvc = var.persistent_volume_claim.id
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub-ssh/ssh.tf
---

resource "tls_private_key" "main" {
  algorithm = "RSA"
  rsa_bits  = 2048
}


resource "kubernetes_config_map" "jupyterhub-ssh" {
  metadata {
    name      = "${var.name}-jupyterhub-ssh"
    namespace = var.namespace
  }

  data = {
    "values.yaml" = <<-EOT
      hubUrl: ${var.jupyterhub_api_url}
      ssh:
        config:
          JupyterHubSSH:
            debug: true
            host_key_path: /etc/jupyterhub-ssh/secrets/jupyterhub-ssh.host-key
    EOT
  }
}


resource "kubernetes_secret" "jupyterhub-ssh" {
  metadata {
    name      = "${var.name}-jupyterhub-ssh"
    namespace = var.namespace
  }

  data = {
    "jupyterhub-ssh.host-key" = tls_private_key.main.private_key_pem
  }
}


resource "kubernetes_service" "jupyterhub-ssh" {
  metadata {
    name      = "${var.name}-jupyterhub-ssh"
    namespace = var.namespace
  }

  spec {
    selector = {
      "app.kubernetes.io/component" = "jupyterhub-ssh"
    }

    port {
      name        = "ssh"
      protocol    = "TCP"
      port        = 8022
      target_port = "ssh"
    }

    type = "ClusterIP"
  }
}


resource "kubernetes_deployment" "jupyterhub-ssh" {
  metadata {
    name      = "${var.name}-jupyterhub-ssh"
    namespace = var.namespace
  }

  spec {
    replicas = 1

    selector {
      match_labels = {
        "app.kubernetes.io/component" = "jupyterhub-ssh"
      }
    }

    template {
      metadata {
        labels = {
          "app.kubernetes.io/component" = "jupyterhub-ssh"
        }
      }

      spec {
        automount_service_account_token = true

        affinity {
          node_affinity {
            required_during_scheduling_ignored_during_execution {
              node_selector_term {
                match_expressions {
                  key      = var.node-group.key
                  operator = "In"
                  values   = [var.node-group.value]
                }
              }
            }
          }
        }

        volume {
          name = "secrets"
          secret {
            secret_name = kubernetes_secret.jupyterhub-ssh.metadata.0.name
          }
        }

        volume {
          name = "config"
          config_map {
            name = kubernetes_config_map.jupyterhub-ssh.metadata.0.name
          }
        }

        container {
          name              = "jupyterhub-ssh"
          image             = "${var.jupyterhub-ssh-image.name}:${var.jupyterhub-ssh-image.tag}"
          image_pull_policy = "Always"

          security_context {
            allow_privilege_escalation = false
            run_as_non_root            = true
            run_as_user                = 1000
          }

          volume_mount {
            name       = "secrets"
            mount_path = "/etc/jupyterhub-ssh/secrets"
            read_only  = true
          }

          volume_mount {
            name       = "config"
            mount_path = "/etc/jupyterhub-ssh/config"
            read_only  = true
          }

          port {
            name           = "ssh"
            container_port = 8022
            protocol       = "TCP"
          }
        }
      }
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub-ssh/variables.tf
---

variable "name" {
  description = "name prefix to assign to jupyterhub-ssh"
  type        = string
  default     = "nebari"
}

variable "namespace" {
  description = "namespace to deploy jupyterhub-ssh"
  type        = string
}

variable "node-group" {
  description = "Node group to associate jupyterhub-ssh deployment"
  type = object({
    key   = string
    value = string
  })
}

variable "jupyterhub_api_url" {
  description = "jupyterhub api url for jupyterhub-ssh"
  type        = string
}

variable "jupyterhub-ssh-image" {
  description = "image to use for jupyterhub-ssh"
  type = object({
    name = string
    tag  = string
  })
  default = {
    name = "quay.io/jupyterhub-ssh/ssh"
    tag  = "0.0.1-0.dev.git.149.he5107a4"
  }
}

variable "jupyterhub-sftp-image" {
  description = "image to use for jupyterhub-sftp"
  type = object({
    name = string
    tag  = string
  })
  default = {
    name = "quay.io/jupyterhub-ssh/sftp"
    tag  = "0.0.1-0.dev.git.142.h402a3d6"
  }
}

variable "persistent_volume_claim" {
  description = "name of persistent volume claim to mount"
  type = object({
    name = string
    id   = string
  })
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/keycloak-client/main.tf
---

resource "random_password" "client_secret" {
  length  = 32
  special = false
}


resource "keycloak_openid_client" "main" {
  realm_id      = var.realm_id
  client_id     = var.client_id
  client_secret = random_password.client_secret.result

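  # NOTE: the display name is hardcoded, so every client created through this
  # module shows up as "grafana" in Keycloak; client_id is what tells them apart.
  name    = "grafana"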
  enabled = true

  access_type           = "CONFIDENTIAL"
  standard_flow_enabled = true

  valid_redirect_uris      = var.callback-url-paths
  service_accounts_enabled = var.service-accounts-enabled
}


resource "keycloak_openid_user_client_role_protocol_mapper" "main" {
  realm_id   = var.realm_id
  client_id  = keycloak_openid_client.main.id
  name       = "user-client-role-mapper"
  claim_name = "roles"

  claim_value_type    = "String"
  multivalued         = true
  add_to_id_token     = true
  add_to_access_token = true
  add_to_userinfo     = true
}


resource "keycloak_openid_group_membership_protocol_mapper" "main" {
  realm_id   = var.realm_id
  client_id  = keycloak_openid_client.main.id
  name       = "group-membership-mapper"
  claim_name = "groups"

  full_path           = true
  add_to_id_token     = true
  add_to_access_token = true
  add_to_userinfo     = true
}

resource "keycloak_openid_user_attribute_protocol_mapper" "jupyterlab_profiles" {
  count = var.jupyterlab_profiles_mapper ? 1 : 0

  realm_id   = var.realm_id
  client_id  = keycloak_openid_client.main.id
  name       = "jupyterlab_profiles_mapper"
  claim_name = "jupyterlab_profiles"

  add_to_id_token     = true
  add_to_access_token = true
  add_to_userinfo     = true

  user_attribute       = "jupyterlab_profiles"
  multivalued          = true
  aggregate_attributes = true
}

# NOTE: despite the "master" label, this data source looks up the "nebari" realm.
data "keycloak_realm" "master" {
  realm = "nebari"
}

data "keycloak_openid_client" "realm_management" {
  realm_id  = var.realm_id
  client_id = "realm-management"
}

data "keycloak_role" "main-service" {
  for_each = toset(var.service-account-roles)

  realm_id  = data.keycloak_realm.master.id
  client_id = data.keycloak_openid_client.realm_management.id
  name      = each.key
}

resource "keycloak_openid_client_service_account_role" "main" {
  for_each = toset(var.service-account-roles)

  realm_id                = var.realm_id
  service_account_user_id = keycloak_openid_client.main.service_account_user_id
  client_id               = data.keycloak_openid_client.realm_management.id
  role                    = data.keycloak_role.main-service[each.key].name
}


resource "keycloak_role" "main" {
  for_each = toset(flatten(values(var.role_mapping)))

  realm_id    = var.realm_id
  client_id   = keycloak_openid_client.main.id
  name        = each.key
  description = each.key
}

data "keycloak_group" "main" {
  for_each = var.role_mapping

  realm_id = var.realm_id
  name     = each.key
}


resource "keycloak_group_roles" "group_roles" {
  for_each = var.role_mapping

  realm_id = var.realm_id
  group_id = data.keycloak_group.main[each.key].id
  role_ids = [for role in each.value : keycloak_role.main[role].id]

  exhaustive = false
}

resource "keycloak_role" "default_client_roles" {
  for_each    = { for role in var.client_roles : role.name => role }
  realm_id    = var.realm_id
  client_id   = keycloak_openid_client.main.id
  name        = each.value.name
  description = each.value.description
  attributes  = each.value.attributes
}

locals {
  group_role_mapping = flatten([
    for role_object in var.client_roles : [
      for group_name in role_object.groups : {
        group     = group_name
        role_name = role_object.name
      }
    ]
  ])

  client_roles_groups = toset([
    for index, value in local.group_role_mapping : value.group
  ])
}
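
# Illustrative sketch of what the locals above evaluate to. The role and group
# names are hypothetical and only show the shape of the data.
#
# Given:
#   client_roles = [
#     { name = "role-a", groups = ["group-1", "group-2"], ... },
#     { name = "role-b", groups = ["group-1"], ... },
#   ]
#
# the locals become:
#   group_role_mapping = [
#     { group = "group-1", role_name = "role-a" },
#     { group = "group-2", role_name = "role-a" },
#     { group = "group-1", role_name = "role-b" },
#   ]
#   client_roles_groups = toset(["group-1", "group-2"])
#
# Each group is therefore looked up once, while every (group, role) pair gets
# its own keycloak_group_roles resource.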

data "keycloak_group" "client_role_groups" {
  for_each = local.client_roles_groups
  realm_id = var.realm_id
  name     = each.value
}

resource "keycloak_group_roles" "assign_roles" {
  for_each   = { for idx, value in local.group_role_mapping : idx => value }
  realm_id   = var.realm_id
  group_id   = data.keycloak_group.client_role_groups[each.value.group].id
  role_ids   = [keycloak_role.default_client_roles[each.value.role_name].id]
  exhaustive = false
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/keycloak-client/outputs.tf
---

output "config" {
  description = "configuration credentials for connecting to openid client"
  value = {
    client_id               = keycloak_openid_client.main.client_id
    client_secret           = keycloak_openid_client.main.client_secret
    service_account_user_id = keycloak_openid_client.main.service_account_user_id

    authentication_url = "https://${var.external-url}/auth/realms/${var.realm_id}/protocol/openid-connect/auth"
    token_url          = "https://${var.external-url}/auth/realms/${var.realm_id}/protocol/openid-connect/token"
    userinfo_url       = "https://${var.external-url}/auth/realms/${var.realm_id}/protocol/openid-connect/userinfo"
    realm_api_url      = "https://${var.external-url}/auth/admin/realms/${var.realm_id}"
    callback_urls      = var.callback-url-paths
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/keycloak-client/variables.tf
---

variable "realm_id" {
  description = "Keycloak realm_id"
  type        = string
}


variable "client_id" {
  description = "OpenID Client ID"
  type        = string
}


variable "external-url" {
  description = "External url for keycloak auth endpoint"
  type        = string
}


variable "service-accounts-enabled" {
  description = "Whether the client should have a service account created"
  type        = bool
  default     = false
}

variable "service-account-roles" {
  description = "Roles to be granted to the service account. Requires setting service-accounts-enabled to true."
  type        = list(string)
  default     = []
}


variable "role_mapping" {
  description = "Group to role mapping to establish for client"
  type        = map(list(string))
  default     = {}
}


variable "callback-url-paths" {
  description = "URLs to use for openid callback"
  type        = list(string)
}

variable "jupyterlab_profiles_mapper" {
  description = "Create a mapper for jupyterlab_profiles group/user attributes"
  type        = bool
  default     = false
}

variable "client_roles" {
  description = "Create roles for the client and assign them to groups"
  default     = []
  type = list(object({
    name        = string
    description = string
    # Default to an empty list so that iterating over `groups` never hits null.
    groups      = optional(list(string), [])
    attributes  = map(any)
  }))
}
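
# Illustrative sketch (not part of this module): one way a caller might wire
# up the variables above. The module label, source path, hostnames, role name,
# and group name are all hypothetical placeholders.
#
# module "example-client" {
#   source = "./modules/kubernetes/services/keycloak-client"
#
#   realm_id           = "nebari"
#   client_id          = "example"
#   external-url       = "nebari.example.com"
#   callback-url-paths = ["https://nebari.example.com/oauth_callback"]
#
#   client_roles = [
#     {
#       name        = "example-viewer"
#       description = "Read-only access (hypothetical role)"
#       groups      = ["analyst"]
#       attributes  = {}
#     }
#   ]
# }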



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/keycloak-client/versions.tf
---

terraform {
  required_providers {
    keycloak = {
      source  = "mrparkers/keycloak"
      version = "3.7.0"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/minio/ingress.tf
---

resource "kubernetes_manifest" "minio-api" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "IngressRoute"
    metadata = {
      name      = "minio-api"
      namespace = var.namespace
    }
    spec = {
      entryPoints = ["minio"]
      routes = [
        {
          kind  = "Rule"
          match = "Host(`${var.external-url}`)"
          services = [
            {
              name      = helm_release.minio.name
              port      = 9000
              namespace = var.namespace
            }
          ]
        }
      ]
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/minio/main.tf
---

resource "random_password" "root_password" {
  length  = 32
  special = false
}


resource "helm_release" "minio" {
  name      = "${var.name}-minio"
  namespace = var.namespace

  repository = "https://raw.githubusercontent.com/bitnami/charts/defb094c658024e4aa8245622dab202874880cbc/bitnami"
  chart      = "minio"
  # last release that was Apache-2.0
  version = "6.7.4"

  set {
    name  = "accessKey.password"
    value = "admin"
  }

  set {
    name  = "secretKey.password"
    value = random_password.root_password.result
  }

  set {
    name  = "defaultBuckets"
    value = join(" ", var.buckets)
  }

  set {
    name  = "persistence.size"
    value = var.storage
  }

  values = concat([
    file("${path.module}/values.yaml"),
    jsonencode({
      nodeSelector = {
        "${var.node-group.key}" = var.node-group.value
      }
    })
  ], var.overrides)
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/minio/outputs.tf
---

output "root_username" {
  description = "Username for root user"
  value       = "admin"
}

output "root_password" {
  description = "Password for root user"
  value       = random_password.root_password.result
}

output "service" {
  description = "Service name"
  value       = helm_release.minio.name
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/minio/values.yaml
---

# https://github.com/bitnami/charts/blob/master/bitnami/minio/values.yaml



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/minio/variables.tf
---

variable "name" {
  description = "Name prefix for the minio deployment"
  type        = string
  default     = "nebari"
}


variable "namespace" {
  description = "Namespace to deploy minio"
  type        = string
}


variable "storage" {
  description = "Storage size for minio objects"
  type        = string
  default     = "10Gi"
}

variable "buckets" {
  description = "Default available buckets"
  type        = list(string)
  default     = []
}


variable "overrides" {
  description = "Minio helm chart list of overrides"
  type        = list(string)
  default     = []
}


variable "external-url" {
  description = "External hostname used to route requests to the minio API"
  type        = string
}

variable "node-group" {
  description = "Node label key/value pair used to schedule minio onto the general node group"
  type = object({
    key   = string
    value = string
  })
}
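
# Illustrative sketch (not part of this module): a possible module call using
# the variables above. The source path, namespace, hostname, bucket name, and
# node label are hypothetical placeholders.
#
# module "minio" {
#   source = "./modules/kubernetes/services/minio"
#
#   namespace    = "dev"
#   external-url = "nebari.example.com"
#   storage      = "10Gi"
#   buckets      = ["example-bucket"]
#
#   node-group = {
#     key   = "example.com/nodegroup"
#     value = "general"
#   }
# }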



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/dashboards/Main/cluster_information.json
---

{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "target": {
          "limit": 100,
          "matchAny": false,
          "tags": [],
          "type": "dashboard"
        },
        "type": "dashboard"
      }
    ]
  },
  "editable": true,
  "fiscalYearStartMonth": 0,
  "graphTooltip": 0,
  "iteration": 1681232789057,
  "links": [],
  "liveNow": false,
  "panels": [
    {
      "collapse": false,
      "collapsed": false,
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 0
      },
      "id": 2,
      "panels": [],
      "showTitle": true,
      "title": "Cluster Stats",
      "titleSize": "h6",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "decimals": 0,
      "description": "Count of running users, grouped by namespace\n",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 24,
        "x": 0,
        "y": 1
      },
      "hiddenSeries": false,
      "id": 3,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "expr": "# Sum up all running user pods by namespace\nsum(\n  # Grab a list of all running pods.\n  # The group aggregator always returns \"1\" for the number of times each\n  # unique label appears in the time series. This is desirable for this\n  # use case because we're merely identifying running pods by name,\n  # not how many times they might be running.\n  group(\n    kube_pod_status_phase{phase=\"Running\"}\n  ) by (pod)\n  * on (pod) group_right() group(\n    kube_pod_labels{label_app=\"jupyterhub\", label_component=\"singleuser-server\"}\n  ) by (namespace, pod)\n) by (namespace)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{namespace}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Running Users",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "description": "% of total memory in the cluster currently requested by non-placeholder pods.\n\nIf autoscaling is efficient, this should be a fairly constant, high number (>70%).\n",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 0,
        "y": 11
      },
      "hiddenSeries": false,
      "id": 4,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(\n  # Get individual container memory requests\n  kube_pod_container_resource_requests{resource=\"memory\"}\n  # Add node pool name as label\n  * on(node) group_left(label_cloud_google_com_gke_nodepool)\n  # group aggregator ensures that node names are unique per\n  # pool.\n  group(\n    kube_node_labels\n  ) by (node, label_cloud_google_com_gke_nodepool)\n  # Ignore containers from pods that aren't currently running or scheduled\n  # FIXME: This isn't the best metric here, evaluate what is.\n  and on (pod) kube_pod_status_scheduled{condition='true'}\n  # Ignore user and node placeholder pods\n  and on (pod) kube_pod_labels{label_component!~'user-placeholder|node-placeholder'}\n) by (label_cloud_google_com_gke_nodepool)\n/\nsum(\n  # Total allocatable memory on a node\n  kube_node_status_allocatable{resource=\"memory\"}\n  # Add nodepool name as label\n  * on(node) group_left(label_cloud_google_com_gke_nodepool)\n  # group aggregator ensures that node names are unique per\n  # pool.\n  group(\n    kube_node_labels\n  ) by (node, label_cloud_google_com_gke_nodepool)\n) by (label_cloud_google_com_gke_nodepool)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{label_cloud_google_com_gke_nodepool}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Memory commitment %",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "percentunit",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "description": "% of total CPU in the cluster currently requested by non-placeholder pods.\n\nJupyterHub users are mostly capped by memory, so this is not especially useful.\n",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 12,
        "y": 11
      },
      "hiddenSeries": false,
      "id": 5,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(\n  # Get individual container CPU requests\n  kube_pod_container_resource_requests{resource=\"cpu\"}\n  # Add node pool name as label\n  * on(node) group_left(label_cloud_google_com_gke_nodepool)\n  # group aggregator ensures that node names are unique per\n  # pool.\n  group(\n    kube_node_labels\n  ) by (node, label_cloud_google_com_gke_nodepool)\n  # Ignore containers from pods that aren't currently running or scheduled\n  # FIXME: This isn't the best metric here, evaluate what is.\n  and on (pod) kube_pod_status_scheduled{condition='true'}\n  # Ignore user and node placeholder pods\n  and on (pod) kube_pod_labels{label_component!~'user-placeholder|node-placeholder'}\n) by (label_cloud_google_com_gke_nodepool)\n/\nsum(\n  # Total allocatable CPU on a node\n  kube_node_status_allocatable{resource=\"cpu\"}\n  # Add nodepool name as label\n  * on(node) group_left(label_cloud_google_com_gke_nodepool)\n  # group aggregator ensures that node names are unique per\n  # pool.\n  group(\n    kube_node_labels\n  ) by (node, label_cloud_google_com_gke_nodepool)\n) by (label_cloud_google_com_gke_nodepool)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{label_cloud_google_com_gke_nodepool}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "CPU commitment %",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "percentunit",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "decimals": 0,
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 0,
        "y": 21
      },
      "hiddenSeries": false,
      "id": 6,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "# sum up all nodes by nodepool\nsum(\n  # kube_pod_labels comes from\n  # https://github.com/kubernetes/kube-state-metrics, and there is a particular\n  # label (kubernetes_node) that lists the node on which the kube-state-metrics pod\n  # is running! So that's totally irrelevant to these queries, but when a nodepool\n  # is rotated it causes two metrics to exist with the same node value (which\n  # we care about) but different kubernetes_node values (because kube-state-metrics\n  # was running in a different node, even though we don't care about that). This\n  # group really just drops all labels except the two we care about to\n  # avoid messing things up.\n  group(\n    kube_node_labels\n  ) by (node, label_cloud_google_com_gke_nodepool)\n) by (label_cloud_google_com_gke_nodepool)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{label_cloud_google_com_gke_nodepool}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Node Count",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "decimals": 0,
      "description": "Pods in states other than 'Running'.\n\nIn a functional cluster, pods should not be in non-Running states for long.\n",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 12,
        "y": 21
      },
      "hiddenSeries": false,
      "id": 7,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "hideZero": true,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(kube_pod_status_phase{phase!=\"Running\"}) by (phase)",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{phase}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Non Running Pods",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapse": false,
      "collapsed": false,
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 31
      },
      "id": 8,
      "panels": [],
      "showTitle": true,
      "title": "Node Stats",
      "titleSize": "h6",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "description": "% of available CPUs currently in use\n",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 0,
        "y": 32
      },
      "id": 9,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(rate(node_cpu_seconds_total{mode!=\"idle\"}[5m])) by (node)\n/\nsum(kube_node_status_capacity{resource=\"cpu\"}) by (node)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{ node }}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "title": "Node CPU Utilization %",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "percentunit",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        }
      ]
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "description": "% of available Memory currently in use\n",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 12,
        "y": 32
      },
      "id": 10,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "1 - (\n  sum (\n    # Memory that can be allocated to processes when they need\n    node_memory_MemFree_bytes + # Unused bytes\n    node_memory_Cached_bytes + # Shared memory + temporary disk cache\n    node_memory_Buffers_bytes # Very temporary buffer memory cache for disk i/o\n  ) by (node)\n  /\n  sum(node_memory_MemTotal_bytes) by (node)\n)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{node}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "title": "Node Memory Utilization %",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "percentunit",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        }
      ]
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "description": "% of each node's allocatable CPU requested by (guaranteed to) the pods on it\n",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 0,
        "y": 42
      },
      "id": 11,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(\n  # Get individual container CPU requests\n  kube_pod_container_resource_requests{resource=\"cpu\"}\n  # Ignore containers from pods that aren't currently running or scheduled\n  # FIXME: This isn't the best metric here, evaluate what is.\n  and on (pod) kube_pod_status_scheduled{condition='true'}\n  # Ignore user and node placeholder pods\n  and on (pod) kube_pod_labels{label_component!~'user-placeholder|node-placeholder'}\n) by (node)\n/\nsum(\n  # Total allocatable CPU on a node\n  kube_node_status_allocatable{resource=\"cpu\"}\n) by (node)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{node}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "title": "Node CPU Commit %",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "percentunit",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        }
      ]
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "description": "% of each node guaranteed to pods on it\n",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 12,
        "y": 42
      },
      "id": 12,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(\n  # Get individual container memory limits\n  kube_pod_container_resource_requests{resource=\"memory\"}\n  # Ignore containers from pods that aren't currently running or scheduled\n  # FIXME: This isn't the best metric here, evaluate what is.\n  and on (pod) kube_pod_status_scheduled{condition='true'}\n  # Ignore user and node placeholder pods\n  and on (pod) kube_pod_labels{label_component!~'user-placeholder|node-placeholder'}\n) by (node)\n/\nsum(\n  # Get individual container memory requests\n  kube_node_status_allocatable{resource=\"memory\"}\n) by (node)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{node}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "title": "Node Memory Commit %",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "percentunit",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        }
      ]
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "decimals": 0,
      "description": "Number of Out of Memory (OOM) kills in a given node.\n\nWhen users use up more memory than they are allowed, the notebook kernel they\nwere running usually gets killed and restarted. This graph shows the number of times\nthat happens on any given node, and helps validate that a notebook kernel restart was\ninfact caused by an OOM\n",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 0,
        "y": 52
      },
      "id": 13,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "hideZero": true,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "# We use [2m] here, as node_exporter usually scrapes things at 1min intervals\n# And oom kills are distinct events, so we want to see 'how many have just happened',\n# rather than average over time.\nincrease(node_vmstat_oom_kill[2m])\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{ node }}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "title": "Out of Memory kill count",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ]
    }
  ],
  "refresh": "5s",
  "schemaVersion": 34,
  "style": "dark",
  "tags": [
    "jupyterhub",
    "kubernetes"
  ],
  "templating": {
    "list": [
      {
        "current": {
          "selected": false,
          "text": "Prometheus",
          "value": "Prometheus"
        },
        "hide": 1,
        "includeAll": false,
        "multi": false,
        "name": "PROMETHEUS_DS",
        "options": [],
        "query": "prometheus",
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "type": "datasource"
      }
    ]
  },
  "time": {
    "from": "now-6h",
    "to": "now"
  },
  "timepicker": {
    "refresh_intervals": [
      "5s",
      "10s",
      "30s",
      "1m",
      "5m",
      "15m",
      "30m",
      "1h",
      "2h",
      "1d"
    ],
    "time_options": [
      "5m",
      "15m",
      "1h",
      "6h",
      "12h",
      "24h",
      "2d",
      "7d",
      "30d"
    ]
  },
  "timezone": "browser",
  "title": "Cluster Information",
  "uid": "-whBDuL4k",
  "version": 1,
  "weekStart": ""
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/dashboards/Main/conda_store.json
---

{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "target": {
          "limit": 100,
          "matchAny": false,
          "tags": [],
          "type": "dashboard"
        },
        "type": "dashboard"
      }
    ]
  },
  "description": "",
  "editable": true,
  "fiscalYearStartMonth": 0,
  "graphTooltip": 0,
  "links": [],
  "liveNow": false,
  "panels": [
    {
      "collapsed": false,
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 0
      },
      "id": 6,
      "panels": [],
      "title": "Environments",
      "type": "row"
    },
    {
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "auto",
            "spanNulls": true,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "min": 0,
          "noValue": "0",
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 7,
        "w": 24,
        "x": 0,
        "y": 1
      },
      "id": 18,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "exemplar": true,
          "expr": "conda_store_build_queued",
          "interval": "",
          "legendFormat": "conda_store_build_queued",
          "refId": "Queued"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "exemplar": true,
          "expr": "conda_store_build_building",
          "hide": false,
          "instant": false,
          "interval": "",
          "legendFormat": "conda_store_build_building",
          "refId": "Building"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "exemplar": true,
          "expr": "conda_store_build_completed",
          "hide": false,
          "instant": false,
          "interval": "",
          "legendFormat": "conda_store_build_completed",
          "refId": "Completed"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "exemplar": true,
          "expr": "conda_store_build_failed",
          "hide": false,
          "interval": "",
          "legendFormat": "conda_store_build_failed",
          "refId": "Failed"
        }
      ],
      "title": "Builds",
      "type": "timeseries"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [],
          "noValue": "0",
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 3,
        "w": 3,
        "x": 0,
        "y": 8
      },
      "id": 8,
      "options": {
        "colorMode": "value",
        "graphMode": "none",
        "justifyMode": "auto",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "exemplar": true,
          "expr": "conda_store_environments",
          "interval": "",
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Environments",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [],
          "noValue": "0",
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 3,
        "w": 3,
        "x": 3,
        "y": 8
      },
      "id": 10,
      "options": {
        "colorMode": "value",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "exemplar": true,
          "expr": "conda_store_build_queued",
          "interval": "",
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Queued",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [],
          "noValue": "0",
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 3,
        "w": 3,
        "x": 6,
        "y": 8
      },
      "id": 12,
      "options": {
        "colorMode": "value",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "exemplar": true,
          "expr": "conda_store_build_building",
          "interval": "",
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Building",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [],
          "noValue": "0",
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 3,
        "w": 3,
        "x": 9,
        "y": 8
      },
      "id": 14,
      "options": {
        "colorMode": "value",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "exemplar": true,
          "expr": "conda_store_build_completed",
          "interval": "",
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Completed",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [],
          "noValue": "0",
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 3,
        "w": 3,
        "x": 12,
        "y": 8
      },
      "id": 16,
      "options": {
        "colorMode": "value",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "exemplar": true,
          "expr": "conda_store_build_failed",
          "interval": "",
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Failed",
      "type": "stat"
    },
    {
      "collapsed": false,
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 11
      },
      "id": 4,
      "panels": [],
      "title": "Storage",
      "type": "row"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "continuous-GrYlRd"
          },
          "mappings": [],
          "max": 1,
          "min": 0,
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 0.8
              }
            ]
          },
          "unit": "percentunit"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 6,
        "w": 8,
        "x": 0,
        "y": 12
      },
      "id": 2,
      "options": {
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "showThresholdLabels": false,
        "showThresholdMarkers": true
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "exemplar": true,
          "expr": "conda_store_disk_usage / conda_store_disk_total",
          "interval": "",
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Disk Usage",
      "type": "gauge"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              }
            ]
          },
          "unit": "decgbytes"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 6,
        "w": 7,
        "x": 8,
        "y": 12
      },
      "id": 20,
      "options": {
        "colorMode": "value",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "exemplar": true,
          "expr": "conda_store_disk_total / (2.0^30)",
          "instant": false,
          "interval": "",
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Total Storage",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "decgbytes"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 6,
        "w": 7,
        "x": 15,
        "y": 12
      },
      "id": 22,
      "options": {
        "colorMode": "value",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "exemplar": true,
          "expr": "conda_store_disk_usage / (2^30)",
          "interval": "",
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Disk Used",
      "type": "stat"
    }
  ],
  "refresh": "5s",
  "schemaVersion": 34,
  "style": "dark",
  "tags": [],
  "templating": {
    "list": []
  },
  "time": {
    "from": "now-6h",
    "to": "now"
  },
  "timepicker": {},
  "timezone": "",
  "title": "Conda-Store",
  "uid": "7lHPaT1nz",
  "version": 1,
  "weekStart": ""
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/dashboards/Main/jupyterhub_dashboard.json
---

{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "target": {
          "limit": 100,
          "matchAny": false,
          "tags": [],
          "type": "dashboard"
        },
        "type": "dashboard"
      }
    ]
  },
  "editable": true,
  "fiscalYearStartMonth": 0,
  "graphTooltip": 0,
  "iteration": 1681169262003,
  "links": [],
  "liveNow": false,
  "panels": [
    {
      "collapse": false,
      "collapsed": false,
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 0
      },
      "id": 2,
      "panels": [],
      "showTitle": true,
      "title": "Hub usage stats",
      "titleSize": "h6",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "decimals": 0,
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 11,
        "w": 24,
        "x": 0,
        "y": 1
      },
      "hiddenSeries": false,
      "id": 3,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(\n  group(\n    kube_pod_status_phase{phase=\"Running\"}\n  ) by (label_component, pod, namespace)\n  * on (namespace, pod) group_right() \n  group(\n    kube_pod_labels{label_app=\"jupyterhub\", label_component=\"singleuser-server\", namespace=~\"$hub\"}\n  ) by (pod, namespace)\n) by (namespace)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{namespace}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Currently Active Users",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapse": false,
      "collapsed": false,
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 12
      },
      "id": 7,
      "panels": [],
      "showTitle": true,
      "title": "User Resource Utilization stats",
      "titleSize": "h6",
      "type": "row"
    },
    {
      "cards": {},
      "color": {
        "cardColor": "#b4ff00",
        "colorScale": "sqrt",
        "colorScheme": "interpolateViridis",
        "exponent": 0.5,
        "mode": "spectrum"
      },
      "dataFormat": "timeseries",
      "datasource": {
        "type": "prometheus",
        "uid": "$PROMETHEUS_DS"
      },
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 0,
        "y": 13
      },
      "heatmap": {},
      "hideZeroBuckets": false,
      "highlightCards": true,
      "id": 8,
      "legend": {
        "show": false
      },
      "reverseYBuckets": false,
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "PBFA97CFB590B2093"
          },
          "exemplar": true,
          "expr": "(\n  time()\n  - (\n    kube_pod_created{job='kube-state-metrics'}\n    * on (namespace, pod) group_left()\n  group(\n    kube_pod_labels{label_app=\"jupyterhub\", label_component=\"singleuser-server\", namespace=~\"$hub\"}\n  ) by (pod, namespace)\n  )\n)\n",
          "format": "time_series",
          "interval": "600s",
          "intervalFactor": 1,
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "User active age distribution",
      "tooltip": {
        "show": true,
        "showHistogram": false
      },
      "type": "heatmap",
      "xAxis": {
        "show": true
      },
      "xBucketSize": "600s",
      "yAxis": {
        "format": "s",
        "logBase": 1,
        "min": 0,
        "show": true
      },
      "yBucketBound": "auto"
    },
    {
      "cards": {},
      "color": {
        "cardColor": "#b4ff00",
        "colorScale": "sqrt",
        "colorScheme": "interpolateViridis",
        "exponent": 0.5,
        "mode": "spectrum"
      },
      "dataFormat": "timeseries",
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 12,
        "y": 13
      },
      "heatmap": {},
      "hideZeroBuckets": false,
      "highlightCards": true,
      "id": 9,
      "legend": {
        "show": false
      },
      "reverseYBuckets": false,
      "targets": [
        {
          "expr": "sum(\n  # exclude name=\"\" because the same container can be reported\n  # with both no name and `name=k8s_...`,\n  # in which case sum() by (pod) reports double the actual metric\n  irate(container_cpu_usage_seconds_total{name!=\"\"}[5m])\n  * on (namespace, pod) group_left(container) \n  group(\n    kube_pod_labels{label_app=\"jupyterhub\", label_component=\"singleuser-server\", namespace=~\"$hub\"}\n  ) by (pod, namespace)\n) by (pod)\n",
          "format": "time_series",
          "interval": "600s",
          "intervalFactor": 1,
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "User CPU usage distribution",
      "tooltip": {
        "show": true,
        "showHistogram": false
      },
      "type": "heatmap",
      "xAxis": {
        "show": true
      },
      "xBucketSize": "600s",
      "yAxis": {
        "format": "percentunit",
        "logBase": 1,
        "min": 0,
        "show": true
      },
      "yBucketBound": "auto"
    },
    {
      "cards": {},
      "color": {
        "cardColor": "#b4ff00",
        "colorScale": "sqrt",
        "colorScheme": "interpolateViridis",
        "exponent": 0.5,
        "mode": "spectrum"
      },
      "dataFormat": "timeseries",
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 0,
        "y": 23
      },
      "heatmap": {},
      "hideZeroBuckets": false,
      "highlightCards": true,
      "id": 10,
      "legend": {
        "show": false
      },
      "reverseYBuckets": false,
      "targets": [
        {
          "expr": "sum(\n  # exclude name=\"\" because the same container can be reported\n  # with both no name and `name=k8s_...`,\n  # in which case sum() by (pod) reports double the actual metric\n  container_memory_working_set_bytes{name!=\"\"}\n  * on (namespace, pod) group_left(container) \n  group(\n    kube_pod_labels{label_app=\"jupyterhub\", label_component=\"singleuser-server\", namespace=~\"$hub\"}\n  ) by (pod, namespace)\n) by (pod)\n",
          "format": "time_series",
          "interval": "600s",
          "intervalFactor": 1,
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "User memory usage distribution",
      "tooltip": {
        "show": true,
        "showHistogram": false
      },
      "type": "heatmap",
      "xAxis": {
        "show": true
      },
      "xBucketSize": "600s",
      "yAxis": {
        "format": "bytes",
        "logBase": 1,
        "min": 0,
        "show": true
      },
      "yBucketBound": "auto"
    },
    {
      "collapse": false,
      "collapsed": false,
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 33
      },
      "id": 11,
      "panels": [],
      "showTitle": true,
      "title": "Hub Diagnostics",
      "titleSize": "h6",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "$PROMETHEUS_DS"
      },
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 0,
        "y": 34
      },
      "hiddenSeries": false,
      "id": 12,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": false,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 2,
      "points": true,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "PBFA97CFB590B2093"
          },
          "exemplar": true,
          "expr": "histogram_quantile(0.99, sum(rate(jupyterhub_server_spawn_duration_seconds_bucket{}[5m])) by (le))",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "99th percentile",
          "refId": "A"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "PBFA97CFB590B2093"
          },
          "exemplar": true,
          "expr": "histogram_quantile(0.5, sum(rate(jupyterhub_server_spawn_duration_seconds_bucket{}[5m])) by (le))",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "50th percentile",
          "refId": "B"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Server Start Times",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "s",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": true,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "description": "Attempts by users to start servers that failed.\n",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 12,
        "y": 34
      },
      "hiddenSeries": false,
      "id": 13,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "hideZero": true,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": false,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(increase(jupyterhub_server_spawn_duration_seconds_count{status!=\"success\"}[2m])) by (status)",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{status}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Server Start Failures",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "$PROMETHEUS_DS"
      },
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 0,
        "y": 44
      },
      "hiddenSeries": false,
      "id": 14,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "PBFA97CFB590B2093"
          },
          "exemplar": true,
          "expr": "histogram_quantile(\n  0.99,\n  sum(\n    rate(\n      jupyterhub_request_duration_seconds_bucket{\n        # Ignore SpawnProgressAPIHandler, as it is a EventSource stream\n        # and keeps long lived connections open\n        handler!=\"jupyterhub.apihandlers.users.SpawnProgressAPIHandler\"\n      }[5m]\n    )\n  ) by (le))\n",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "99th percentile",
          "refId": "A"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "PBFA97CFB590B2093"
          },
          "expr": "histogram_quantile(\n  0.50,\n  sum(\n    rate(\n      jupyterhub_request_duration_seconds_bucket{\n        app=\"jupyterhub\",\n        namespace=~\"$hub\",\n        # Ignore SpawnProgressAPIHandler, as it is a EventSource stream\n        # and keeps long lived connections open\n        handler!=\"jupyterhub.apihandlers.users.SpawnProgressAPIHandler\"\n      }[5m]\n    )\n  ) by (le))\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "50th percentile",
          "refId": "B"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Hub response latency",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "s",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 12,
        "w": 12,
        "x": 12,
        "y": 44
      },
      "hiddenSeries": false,
      "id": 15,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "hideZero": true,
        "max": true,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(\n  # exclude name=\"\" because the same container can be reported\n# with both no name and `name=k8s_...`,\n# in which case sum() reports double the actual metric\nirate(container_cpu_usage_seconds_total{name!=\"\"}[5m])\n\n  * on (namespace, pod) group_left(container, label_component) \n  group(\n    kube_pod_labels{label_app=\"jupyterhub\", label_component!=\"singleuser-server\", namespace=~\"$hub\"}\n  ) by (pod, namespace)\n) by (label_component)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{ label_component }}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "All JupyterHub Components CPU",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 1,
          "format": "percentunit",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 12,
        "w": 12,
        "x": 0,
        "y": 54
      },
      "hiddenSeries": false,
      "id": 16,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "hideZero": true,
        "max": true,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(\n  # exclude name=\"\" because the same container can be reported\n# with both no name and `name=k8s_...`,\n# in which case sum() reports double the actual metric\ncontainer_memory_working_set_bytes{name!=\"\"}\n\n  * on (namespace, pod) group_left(container, label_component) \n  group(\n    kube_pod_labels{label_app=\"jupyterhub\", label_component!=\"singleuser-server\", namespace=~\"$hub\"}\n  ) by (pod, namespace)\n) by (label_component)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{ label_component }}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "All JupyterHub Components Memory (Working Set)",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "bytes",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "decimals": 0,
      "description": "% of disk space left in the disk storing the JupyterHub sqlite database. If goes to 0, the hub will fail.\n",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 12,
        "y": 56
      },
      "hiddenSeries": false,
      "id": 17,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "# Free bytes available on the hub db PVC\nsum(kubelet_volume_stats_available_bytes{persistentvolumeclaim=\"hub-db-dir\", namespace=~\"$hub\"}) by (namespace) /\n# Total number of bytes available on the hub db PVC\nsum(kubelet_volume_stats_capacity_bytes{persistentvolumeclaim=\"hub-db-dir\", namespace=~\"$hub\"}) by (namespace)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{ $hub }}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Hub DB Disk Space Availability %",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 0,
          "format": "percentunit",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        },
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "max": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "description": "Pods in a non-running state in the hub's namespace.\n\nPods stuck in non-running states often indicate an error condition\n",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 0,
        "y": 66
      },
      "hiddenSeries": false,
      "id": 18,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(\n  kube_pod_status_phase{phase!=\"Running\", namespace=~\"$hub\"}\n) by (phase)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{phase}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Non Running Pods",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$PROMETHEUS_DS"
      },
      "decimals": 0,
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 12,
        "x": 12,
        "y": 66
      },
      "hiddenSeries": false,
      "id": 19,
      "legend": {
        "alignAsTable": false,
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "sum(\n    # kube_pod_info.node identifies the pod node,\n    # while kube_pod_labels.node is the metrics exporter's node\n    kube_pod_info{node!=\"\"}\n    * on (namespace, pod) group_left() \n  group(\n    kube_pod_labels{label_app=\"jupyterhub\", label_component=\"singleuser-server\", namespace=~\"$hub\"}\n  ) by (pod, namespace)\n) by (node)\n",
          "format": "time_series",
          "intervalFactor": 2,
          "legendFormat": "{{ node }}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Users per node",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        },
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": 0,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapse": false,
      "collapsed": false,
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 76
      },
      "id": 21,
      "panels": [],
      "showTitle": true,
      "title": "Anomalous user pods",
      "titleSize": "h6",
      "type": "row"
    },
    {
      "columns": [],
      "datasource": {
        "type": "prometheus",
        "uid": "$PROMETHEUS_DS"
      },
      "description": "User pods that have been running for a long time (>8h).\n\nThis often indicates problems with the idle culler\n",
      "fontSize": "100%",
      "gridPos": {
        "h": 12,
        "w": 12,
        "x": 0,
        "y": 77
      },
      "id": 22,
      "links": [],
      "showHeader": true,
      "sort": {
        "col": 2,
        "desc": true
      },
      "styles": [
        {
          "alias": "Age",
          "align": "auto",
          "pattern": "Value",
          "type": "number",
          "unit": "s"
        },
        {
          "alias": "Time",
          "align": "auto",
          "pattern": "Time",
          "type": "hidden"
        }
      ],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "PBFA97CFB590B2093"
          },
          "exemplar": false,
          "expr": "(\n  time() - (kube_pod_created * on (namespace, pod) group_left\n  group(\n    kube_pod_labels{label_app=\"jupyterhub\", label_component=\"singleuser-server\", namespace=~\"$hub\"}\n  ) by (pod, namespace))\n)  > (60 * 60 * 8) # 8 hours is our threshold\n",
          "format": "time_series",
          "instant": true,
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "{{namespace}}/{{pod}}",
          "refId": "A"
        }
      ],
      "title": "Very old user pods",
      "transform": "timeseries_to_rows",
      "type": "table-old"
    },
    {
      "columns": [],
      "datasource": {
        "type": "prometheus",
        "uid": "$PROMETHEUS_DS"
      },
      "description": "User pods using a lot of CPU\n\nThis could indicate a runaway process consuming resources\nunnecessarily.\n",
      "fontSize": "100%",
      "gridPos": {
        "h": 12,
        "w": 12,
        "x": 12,
        "y": 77
      },
      "id": 23,
      "links": [],
      "showHeader": true,
      "sort": {
        "col": 2,
        "desc": true
      },
      "styles": [
        {
          "alias": "CPU usage",
          "align": "auto",
          "pattern": "Value",
          "type": "number",
          "unit": "percentunit"
        },
        {
          "alias": "Time",
          "align": "auto",
          "pattern": "Time",
          "type": "hidden"
        }
      ],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "PBFA97CFB590B2093"
          },
          "exemplar": false,
          "expr": "max( # Ideally we just want 'current' value, so max will do\n  irate(container_cpu_usage_seconds_total[5m])\n  * on (namespace, pod) group_left() \n  group(\n    kube_pod_labels{label_app=\"jupyterhub\", label_component=\"singleuser-server\", namespace=~\"$hub\"}\n  ) by (pod, namespace)\n) by (namespace, pod) > 0.5\n",
          "format": "time_series",
          "instant": true,
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "{{namespace}}/{{pod}}",
          "refId": "A"
        }
      ],
      "title": "User Pods with high CPU usage (>0.5)",
      "transform": "timeseries_to_rows",
      "type": "table-old"
    },
    {
      "columns": [],
      "datasource": {
        "type": "prometheus",
        "uid": "$PROMETHEUS_DS"
      },
      "description": "User pods getting close to their memory limit\n\nOnce they hit their memory limit, user kernels will start dying.\n",
      "fontSize": "100%",
      "gridPos": {
        "h": 12,
        "w": 12,
        "x": 0,
        "y": 89
      },
      "id": 24,
      "links": [],
      "showHeader": true,
      "sort": {
        "col": 2,
        "desc": true
      },
      "styles": [
        {
          "alias": "% of mem limit consumed",
          "align": "auto",
          "pattern": "Value",
          "type": "number",
          "unit": "percentunit"
        },
        {
          "alias": "Time",
          "align": "auto",
          "pattern": "Time",
          "type": "hidden"
        }
      ],
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "PBFA97CFB590B2093"
          },
          "exemplar": false,
          "expr": "max( # Ideally we just want 'current', but max will do. This metric is a gauge, so sum is inappropriate\n  container_memory_working_set_bytes\n  * on (namespace, pod) group_left() \n  group(\n    kube_pod_labels{label_app=\"jupyterhub\", label_component=\"singleuser-server\", namespace=~\"$hub\"}\n  ) by (pod, namespace)\n) by (namespace, pod)\n/\nsum(\n  kube_pod_container_resource_limits{resource=\"memory\"}\n  * on (namespace, pod) group_left() \n  group(\n    kube_pod_labels{label_app=\"jupyterhub\", label_component=\"singleuser-server\", namespace=~\"$hub\"}\n  ) by (pod, namespace)\n) by (namespace, pod)\n> 0.8\n",
          "format": "time_series",
          "instant": true,
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "{{namespace}}/{{pod}}",
          "refId": "A"
        }
      ],
      "title": "User pods with high memory usage (>80% of limit)",
      "transform": "timeseries_to_rows",
      "type": "table-old"
    }
  ],
  "refresh": "5s",
  "schemaVersion": 34,
  "style": "dark",
  "tags": [
    "jupyterhub"
  ],
  "templating": {
    "list": [
      {
        "current": {
          "selected": false,
          "text": "Prometheus",
          "value": "Prometheus"
        },
        "hide": 1,
        "includeAll": false,
        "multi": false,
        "name": "PROMETHEUS_DS",
        "options": [],
        "query": "prometheus",
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "type": "datasource"
      },
      {
        "current": {
          "selected": true,
          "text": [
            "dev"
          ],
          "value": [
            "dev"
          ]
        },
        "datasource": {
          "uid": "$PROMETHEUS_DS"
        },
        "definition": "label_values({service=\"hub\"},namespace)",
        "hide": 0,
        "includeAll": false,
        "multi": true,
        "name": "hub",
        "options": [],
        "query": {
          "query": "label_values({service=\"hub\"},namespace)",
          "refId": "Prometheus-hub-Variable-Query"
        },
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "sort": 0,
        "tagValuesQuery": "",
        "tagsQuery": "",
        "type": "query",
        "useTags": false
      }
    ]
  },
  "time": {
    "from": "now-6h",
    "to": "now"
  },
  "timepicker": {
    "refresh_intervals": [
      "5s",
      "10s",
      "30s",
      "1m",
      "5m",
      "15m",
      "30m",
      "1h",
      "2h",
      "1d"
    ],
    "time_options": [
      "5m",
      "15m",
      "1h",
      "6h",
      "12h",
      "24h",
      "2d",
      "7d",
      "30d"
    ]
  },
  "timezone": "browser",
  "title": "JupyterHub Dashboard",
  "uid": "hub-dashboard",
  "version": 1,
  "weekStart": ""
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/dashboards/Main/keycloak.json
---

{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "target": {
          "limit": 100,
          "matchAny": false,
          "tags": [],
          "type": "dashboard"
        },
        "type": "dashboard"
      },
      {
        "datasource": "-- Grafana --",
        "enable": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "iconSize": 0,
        "lineColor": "",
        "name": "Annotations & Alerts",
        "query": "",
        "showLine": false,
        "tagsField": "",
        "target": {
          "limit": 100,
          "matchAny": false,
          "tags": [],
          "type": "dashboard"
        },
        "textField": "",
        "type": "dashboard"
      }
    ]
  },
  "description": "An updated version of dashboard Keycloak metrics exported with Keycloak Metrics SPI\n\nhttps://github.com/aerogear/keycloak-metrics-spi",
  "editable": true,
  "fiscalYearStartMonth": 0,
  "gnetId": 14607,
  "graphTooltip": 1,
  "iteration": 1642196323079,
  "links": [],
  "liveNow": false,
  "panels": [
    {
      "CustomPanel": {
        "datasource": "$Datasource",
        "description": "Memory currently being used by Keycloak.",
        "fieldConfig": {
          "defaults": {
            "color": {
              "mode": "thresholds"
            },
            "custom": {},
            "mappings": [],
            "max": 100,
            "min": 0,
            "thresholds": {
              "mode": "absolute",
              "steps": [
                {
                  "color": "green"
                },
                {
                  "color": "#EAB839",
                  "value": 80
                },
                {
                  "color": "red",
                  "value": 90
                }
              ]
            },
            "unit": "percent"
          },
          "overrides": []
        },
        "gridPos": {
          "h": 7,
          "w": 6,
          "x": 0,
          "y": 0
        },
        "hideTimeOverride": false,
        "id": 5,
        "links": [],
        "options": {
          "orientation": "auto",
          "reduceOptions": {
            "calcs": [
              "mean"
            ],
            "fields": "",
            "values": false
          },
          "showThresholdLabels": false,
          "showThresholdMarkers": true
        },
        "pluginVersion": "7.2.0",
        "targets": [
          {
            "expr": "sum(jvm_memory_bytes_used{instance=\"$instance\", area=\"heap\"})*100/sum(jvm_memory_bytes_max{instance=\"$instance\", area=\"heap\"})\n",
            "format": "time_series",
            "hide": false,
            "instant": false,
            "intervalFactor": 1,
            "legendFormat": "",
            "refId": "B"
          }
        ],
        "title": "Current Memory HEAP",
        "type": "gauge"
      },
      "datasource": {
        "uid": "$Datasource"
      },
      "editable": false,
      "error": false,
      "fieldConfig": {
        "defaults": {
          "mappings": [],
          "max": 100,
          "min": 0,
          "thresholds": {
            "mode": "percentage",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "#EAB839",
                "value": 80
              },
              {
                "color": "red",
                "value": 90
              }
            ]
          },
          "unit": "percent"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 7,
        "w": 6,
        "x": 0,
        "y": 0
      },
      "hideTimeOverride": false,
      "id": 5,
      "isNew": false,
      "options": {
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "mean"
          ],
          "fields": "",
          "values": false
        },
        "showThresholdLabels": false,
        "showThresholdMarkers": true
      },
      "pluginVersion": "8.3.3",
      "span": 0,
      "targets": [
        {
          "expr": "sum(jvm_memory_bytes_used{pod=\"$instance\", area=\"heap\"})*100/sum(jvm_memory_bytes_max{pod=\"$instance\", area=\"heap\"})",
          "interval": "",
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Current Memory HEAP",
      "type": "gauge"
    },
    {
      "CustomPanel": {
        "datasource": "$Datasource",
        "description": "Memory currently being used by Keycloak.",
        "fieldConfig": {
          "defaults": {
            "color": {
              "mode": "thresholds"
            },
            "custom": {},
            "mappings": [],
            "max": 100,
            "min": 0,
            "thresholds": {
              "mode": "absolute",
              "steps": [
                {
                  "color": "green"
                },
                {
                  "color": "#EAB839",
                  "value": 80
                },
                {
                  "color": "red",
                  "value": 90
                }
              ]
            },
            "unit": "percent"
          },
          "overrides": []
        },
        "gridPos": {
          "h": 7,
          "w": 6,
          "x": 6,
          "y": 0
        },
        "hideTimeOverride": false,
        "id": 23,
        "links": [],
        "options": {
          "orientation": "auto",
          "reduceOptions": {
            "calcs": [
              "mean"
            ],
            "fields": "",
            "values": false
          },
          "showThresholdLabels": false,
          "showThresholdMarkers": true
        },
        "pluginVersion": "7.2.0",
        "targets": [
          {
            "expr": "sum(jvm_memory_bytes_used{instance=\"$instance\", area=\"nonheap\"})*100/sum(jvm_memory_bytes_max{instance=\"$instance\", area=\"nonheap\"})",
            "format": "time_series",
            "hide": false,
            "instant": false,
            "intervalFactor": 1,
            "legendFormat": "",
            "refId": "B"
          }
        ],
        "title": "Current Memory nonHEAP",
        "type": "gauge"
      },
      "datasource": {
        "uid": "$Datasource"
      },
      "editable": false,
      "error": false,
      "fieldConfig": {
        "defaults": {
          "mappings": [],
          "max": 100,
          "min": 0,
          "thresholds": {
            "mode": "percentage",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "#EAB839",
                "value": 80
              },
              {
                "color": "red",
                "value": 90
              }
            ]
          },
          "unit": "percent"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 7,
        "w": 6,
        "x": 6,
        "y": 0
      },
      "hideTimeOverride": false,
      "id": 23,
      "isNew": false,
      "options": {
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "mean"
          ],
          "fields": "",
          "values": false
        },
        "showThresholdLabels": false,
        "showThresholdMarkers": true
      },
      "pluginVersion": "8.3.3",
      "span": 0,
      "targets": [
        {
          "expr": "sum(jvm_memory_bytes_used{pod=\"$instance\", area=\"nonheap\"})*100/sum(jvm_memory_bytes_max{pod=\"$instance\", area=\"nonheap\"})",
          "interval": "",
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Current Memory nonHEAP",
      "type": "gauge"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 20,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "min": 0,
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "bytes"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 12,
        "y": 0
      },
      "hideTimeOverride": false,
      "id": 12,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "right"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "expr": "sum(jvm_memory_bytes_max{pod=\"$instance\"})",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "Max",
          "refId": "A"
        },
        {
          "expr": "sum(jvm_memory_bytes_committed{pod=\"$instance\"})",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "Committed",
          "refId": "C"
        },
        {
          "expr": "sum(jvm_memory_bytes_used{pod=\"$instance\"})",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "Used",
          "refId": "B"
        }
      ],
      "title": "Memory Usage",
      "type": "timeseries"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            }
          },
          "decimals": 0,
          "mappings": [],
          "unit": "short"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 7,
        "w": 6,
        "x": 0,
        "y": 7
      },
      "hideTimeOverride": true,
      "id": 16,
      "links": [],
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "table",
          "placement": "right",
          "values": [
            "percent"
          ]
        },
        "pieType": "pie",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "$Datasource"
          },
          "exemplar": true,
          "expr": "sum by (realm)(increase(keycloak_logins[24h]))",
          "interval": "",
          "legendFormat": "{{realm}}",
          "refId": "A"
        }
      ],
      "title": "Logins Per REALM for past 24h",
      "type": "piechart"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            }
          },
          "decimals": 0,
          "mappings": [],
          "unit": "short"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 7,
        "w": 6,
        "x": 6,
        "y": 7
      },
      "id": 44,
      "links": [],
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "table",
          "placement": "right",
          "values": [
            "percent"
          ]
        },
        "pieType": "pie",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "7.2.0",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "$Datasource"
          },
          "exemplar": true,
          "expr": "sum by (realm)(increase(keycloak_registrations[24h]))",
          "interval": "",
          "legendFormat": "{{realm}}",
          "refId": "A"
        }
      ],
      "title": "Registrations Per REALM for past 24h",
      "type": "piechart"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            }
          },
          "decimals": 0,
          "mappings": [],
          "unit": "short"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 7,
        "w": 6,
        "x": 12,
        "y": 7
      },
      "hideTimeOverride": true,
      "id": 20,
      "links": [],
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "table",
          "placement": "right",
          "values": [
            "percent"
          ]
        },
        "pieType": "pie",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "$Datasource"
          },
          "exemplar": true,
          "expr": "sum by (client_id)(increase(keycloak_logins[24h]))",
          "interval": "",
          "legendFormat": "{{client_id}}",
          "refId": "A"
        }
      ],
      "title": "Logins Per CLIENT for past 24h",
      "type": "piechart"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            }
          },
          "decimals": 0,
          "mappings": [],
          "unit": "short"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 7,
        "w": 6,
        "x": 18,
        "y": 7
      },
      "hideTimeOverride": true,
      "id": 17,
      "links": [],
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "table",
          "placement": "right",
          "values": [
            "percent"
          ]
        },
        "pieType": "pie",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "targets": [
        {
          "expr": "sum by (client_id)(increase(keycloak_registrations[24h]))",
          "interval": "",
          "legendFormat": "{{client_id}}",
          "refId": "A"
        }
      ],
      "title": "Registrations Per CLIENT for past 24h",
      "type": "piechart"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 10,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": true,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "short"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 6,
        "y": 14
      },
      "id": 46,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "expr": "sum by (code)(increase(keycloak_response_errors[30m]))",
          "interval": "",
          "legendFormat": "{{code}}",
          "refId": "A"
        }
      ],
      "title": "4xx and 5xx Responses",
      "type": "timeseries"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 10,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "min": 0,
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "none"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
        "y": 22
      },
      "hideTimeOverride": false,
      "id": 1,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "table",
          "placement": "right"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "$Datasource"
          },
          "exemplar": true,
          "expr": "sum by (realm)(increase(keycloak_logins[30m]))",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "{{realm}}",
          "refId": "A"
        }
      ],
      "title": "Logins per REALM",
      "type": "timeseries"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 10,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "min": 0,
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "none"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 12,
        "y": 22
      },
      "hideTimeOverride": false,
      "id": 7,
      "options": {
        "legend": {
          "calcs": [
            "lastNotNull"
          ],
          "displayMode": "table",
          "placement": "right"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "expr": "sum by (error) (increase(keycloak_failed_login_attempts{provider=\"keycloak\",realm=\"$realm\"}[30m]))",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "$realm {{error}}",
          "refId": "A"
        },
        {
          "expr": "sum by (realm) (increase(keycloak_failed_login_attempts{provider=\"keycloak\",realm=\"$realm\"}[30m]))",
          "interval": "",
          "legendFormat": "Sum by {{realm}}",
          "refId": "B"
        }
      ],
      "title": "Login Errors on realm $realm",
      "type": "timeseries"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 10,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "min": 0,
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "none"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
        "y": 30
      },
      "hideTimeOverride": false,
      "id": 18,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "table",
          "placement": "right"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "expr": "sum by (client_id)(increase(keycloak_logins{realm=\"$realm\",provider=\"keycloak\"}[30m]))",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "{{client_id}}",
          "refId": "A"
        }
      ],
      "title": "Logins per CLIENT on realm $realm",
      "type": "timeseries"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 10,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": true,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "short"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 12,
        "y": 30
      },
      "hideTimeOverride": false,
      "id": 21,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "table",
          "placement": "right"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "expr": "sum by (realm) (increase(keycloak_registrations_errors{provider=\"keycloak\",realm=\"$realm\"} [30m]))",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "Sum by {{realm}}",
          "refId": "A"
        },
        {
          "expr": "sum by (error) (increase(keycloak_registrations_errors{provider=\"keycloak\",realm=\"$realm\"} [30m]))",
          "interval": "",
          "legendFormat": "{{error}}",
          "refId": "B"
        }
      ],
      "title": "Registration Errors on realm $realm",
      "type": "timeseries"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 10,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": true,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "min": 0,
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "short"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
        "y": 38
      },
      "hideTimeOverride": false,
      "id": 33,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "table",
          "placement": "right"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "expr": "sum by (realm)(increase(keycloak_registrations[30m]))",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "{{realm}}",
          "refId": "A"
        }
      ],
      "title": "Registrations per REALM",
      "type": "timeseries"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 10,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "min": 0,
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "none"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 12,
        "y": 38
      },
      "hideTimeOverride": false,
      "id": 19,
      "options": {
        "legend": {
          "calcs": [
            "lastNotNull"
          ],
          "displayMode": "table",
          "placement": "right"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "expr": "sum by (error) (increase(keycloak_failed_login_attempts{provider=\"keycloak\",realm=\"$realm\",client_id=\"$ClientId\"}[30m]))",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "{{error}}",
          "refId": "A"
        }
      ],
      "title": "Login Errors for $ClientId",
      "type": "timeseries"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 10,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "min": 0,
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "none"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
        "y": 46
      },
      "hideTimeOverride": false,
      "id": 22,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "table",
          "placement": "right"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "expr": "sum by (client_id)(increase(keycloak_registrations{realm=\"$realm\",provider=\"keycloak\"}[30m]))",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "{{client_id}}",
          "refId": "A"
        },
        {
          "expr": "sum by (realm)(increase(keycloak_registrations{provider=\"keycloak\",realm=\"$realm\"} [30m]))",
          "interval": "",
          "legendFormat": "Sum by {{realm}}",
          "refId": "B"
        }
      ],
      "title": "Registrations per CLIENT on Realm $realm",
      "type": "timeseries"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 10,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": false,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "min": 0,
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "none"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 12,
        "y": 46
      },
      "hideTimeOverride": false,
      "id": 34,
      "options": {
        "legend": {
          "calcs": [
            "lastNotNull"
          ],
          "displayMode": "table",
          "placement": "right"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "expr": "sum by (error) (increase(keycloak_registrations_errors{provider=\"keycloak\",realm=\"$realm\",client_id=\"$ClientId\"}[30m]))",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "{{error}}",
          "refId": "A"
        }
      ],
      "title": "Registration Errors for $ClientId",
      "type": "timeseries"
    },
    {
      "cards": {},
      "color": {
        "cardColor": "#73BF69",
        "colorScale": "sqrt",
        "colorScheme": "interpolateGreens",
        "exponent": 0.4,
        "mode": "opacity"
      },
      "dataFormat": "tsbuckets",
      "datasource": {
        "uid": "$Datasource"
      },
      "description": "",
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
        "y": 54
      },
      "heatmap": {},
      "hideTimeOverride": false,
      "hideZeroBuckets": true,
      "highlightCards": true,
      "id": 35,
      "legend": {
        "show": true
      },
      "pluginVersion": "7.2.0",
      "reverseYBuckets": false,
      "targets": [
        {
          "expr": "sum(increase(keycloak_request_duration_bucket{method=\"GET\"}[30m])) by (le)",
          "format": "heatmap",
          "interval": "",
          "intervalFactor": 4,
          "legendFormat": "{{ le }}",
          "refId": "A"
        }
      ],
      "title": "Request duration method = \"GET\" Heatmap",
      "tooltip": {
        "show": true,
        "showHistogram": false
      },
      "type": "heatmap",
      "xAxis": {
        "show": true
      },
      "yAxis": {
        "format": "ms",
        "logBase": 1,
        "show": true
      },
      "yBucketBound": "auto"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "description": "",
      "fieldConfig": {
        "defaults": {
          "mappings": [],
          "max": 100,
          "min": 0,
          "thresholds": {
            "mode": "percentage",
            "steps": [
              {
                "color": "red",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              },
              {
                "color": "#EAB839",
                "value": 90
              },
              {
                "color": "green",
                "value": 98
              }
            ]
          },
          "unit": "percent"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 12,
        "y": 54
      },
      "hideTimeOverride": false,
      "id": 39,
      "options": {
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "mean"
          ],
          "fields": "",
          "values": false
        },
        "showThresholdLabels": false,
        "showThresholdMarkers": true
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "expr": "sum(rate(keycloak_request_duration_bucket{method=\"GET\", le=\"100.0\"}[30m])) / sum(rate(keycloak_request_duration_count{method=\"GET\"}[30m])) * 100",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Percentage of \"GET\" requests served in 100ms or below",
      "type": "gauge"
    },
    {
      "cards": {},
      "color": {
        "cardColor": "#73BF69",
        "colorScale": "sqrt",
        "colorScheme": "interpolateGreens",
        "exponent": 0.4,
        "mode": "opacity"
      },
      "dataFormat": "tsbuckets",
      "datasource": {
        "uid": "$Datasource"
      },
      "description": "",
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
        "y": 62
      },
      "heatmap": {},
      "hideTimeOverride": false,
      "hideZeroBuckets": true,
      "highlightCards": true,
      "id": 36,
      "legend": {
        "show": true
      },
      "pluginVersion": "7.2.0",
      "reverseYBuckets": false,
      "targets": [
        {
          "expr": "sum(increase(keycloak_request_duration_bucket{method=\"POST\"}[30m])) by (le)",
          "format": "heatmap",
          "interval": "",
          "intervalFactor": 4,
          "legendFormat": "{{ le }}",
          "refId": "A"
        }
      ],
      "title": "Request duration method = \"POST\" Heatmap",
      "tooltip": {
        "show": true,
        "showHistogram": false
      },
      "type": "heatmap",
      "xAxis": {
        "show": true
      },
      "yAxis": {
        "format": "ms",
        "logBase": 1,
        "show": true
      },
      "yBucketBound": "auto"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "description": "",
      "fieldConfig": {
        "defaults": {
          "mappings": [],
          "max": 100,
          "min": 0,
          "thresholds": {
            "mode": "percentage",
            "steps": [
              {
                "color": "red",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              },
              {
                "color": "#EAB839",
                "value": 90
              },
              {
                "color": "green",
                "value": 98
              }
            ]
          },
          "unit": "percent"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 12,
        "y": 62
      },
      "hideTimeOverride": false,
      "id": 40,
      "options": {
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "mean"
          ],
          "fields": "",
          "values": false
        },
        "showThresholdLabels": false,
        "showThresholdMarkers": true
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "expr": "sum(rate(keycloak_request_duration_bucket{method=\"POST\", le=\"100.0\"}[30m])) / sum(rate(keycloak_request_duration_count{method=\"POST\"}[30m])) * 100",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Percentage of \"POST\" requests served in 100ms or below",
      "type": "gauge"
    },
    {
      "cards": {},
      "color": {
        "cardColor": "#73BF69",
        "colorScale": "sqrt",
        "colorScheme": "interpolateGreens",
        "exponent": 0.4,
        "mode": "opacity"
      },
      "dataFormat": "tsbuckets",
      "datasource": {
        "uid": "$Datasource"
      },
      "description": "",
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
        "y": 70
      },
      "heatmap": {},
      "hideTimeOverride": false,
      "hideZeroBuckets": true,
      "highlightCards": true,
      "id": 37,
      "legend": {
        "show": true
      },
      "pluginVersion": "7.2.0",
      "reverseYBuckets": false,
      "targets": [
        {
          "expr": "sum(increase(keycloak_request_duration_bucket{method=\"HEAD\"}[30m])) by (le)",
          "format": "heatmap",
          "interval": "",
          "intervalFactor": 4,
          "legendFormat": "{{ le }}",
          "refId": "A"
        }
      ],
      "title": "Request duration method = \"HEAD\" Heatmap",
      "tooltip": {
        "show": true,
        "showHistogram": false
      },
      "type": "heatmap",
      "xAxis": {
        "show": true
      },
      "yAxis": {
        "format": "ms",
        "logBase": 1,
        "show": true
      },
      "yBucketBound": "auto"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "description": "",
      "fieldConfig": {
        "defaults": {
          "mappings": [],
          "max": 100,
          "min": 0,
          "thresholds": {
            "mode": "percentage",
            "steps": [
              {
                "color": "red",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              },
              {
                "color": "#EAB839",
                "value": 90
              },
              {
                "color": "green",
                "value": 98
              }
            ]
          },
          "unit": "percent"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 12,
        "y": 70
      },
      "hideTimeOverride": false,
      "id": 41,
      "options": {
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "mean"
          ],
          "fields": "",
          "values": false
        },
        "showThresholdLabels": false,
        "showThresholdMarkers": true
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "expr": "sum(rate(keycloak_request_duration_bucket{method=\"HEAD\", le=\"100.0\"}[30m])) / sum(rate(keycloak_request_duration_count{method=\"HEAD\"}[30m])) * 100",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Percentage of \"HEAD\" requests served in 100ms or below",
      "type": "gauge"
    },
    {
      "cards": {},
      "color": {
        "cardColor": "#73BF69",
        "colorScale": "sqrt",
        "colorScheme": "interpolateGreens",
        "exponent": 0.4,
        "mode": "opacity"
      },
      "dataFormat": "tsbuckets",
      "datasource": {
        "uid": "$Datasource"
      },
      "description": "",
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
        "y": 78
      },
      "heatmap": {},
      "hideTimeOverride": false,
      "hideZeroBuckets": true,
      "highlightCards": true,
      "id": 38,
      "legend": {
        "show": true
      },
      "pluginVersion": "7.2.0",
      "reverseYBuckets": false,
      "targets": [
        {
          "expr": "sum(increase(keycloak_request_duration_bucket{method=\"PUT\"}[30m])) by (le)",
          "format": "heatmap",
          "interval": "",
          "intervalFactor": 4,
          "legendFormat": "{{ le }}",
          "refId": "A"
        }
      ],
      "title": "Request duration method = \"PUT\" Heatmap",
      "tooltip": {
        "show": true,
        "showHistogram": false
      },
      "type": "heatmap",
      "xAxis": {
        "show": true
      },
      "yAxis": {
        "format": "ms",
        "logBase": 1,
        "show": true
      },
      "yBucketBound": "auto"
    },
    {
      "datasource": {
        "uid": "$Datasource"
      },
      "description": "",
      "fieldConfig": {
        "defaults": {
          "mappings": [],
          "max": 100,
          "min": 0,
          "thresholds": {
            "mode": "percentage",
            "steps": [
              {
                "color": "red",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              },
              {
                "color": "#EAB839",
                "value": 90
              },
              {
                "color": "green",
                "value": 98
              }
            ]
          },
          "unit": "percent"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 12,
        "y": 78
      },
      "hideTimeOverride": false,
      "id": 42,
      "options": {
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "mean"
          ],
          "fields": "",
          "values": false
        },
        "showThresholdLabels": false,
        "showThresholdMarkers": true
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "expr": "sum(rate(keycloak_request_duration_bucket{method=\"PUT\", le=\"100.0\"}[30m])) / sum(rate(keycloak_request_duration_count{method=\"PUT\"}[30m])) * 100",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Percentage of \"PUT\" requests served in 100ms or below",
      "type": "gauge"
    }
  ],
  "refresh": false,
  "schemaVersion": 34,
  "style": "dark",
  "tags": [],
  "templating": {
    "list": [
      {
        "current": {
          "selected": false,
          "text": "Prometheus",
          "value": "Prometheus"
        },
        "hide": 0,
        "includeAll": false,
        "multi": false,
        "name": "Datasource",
        "options": [],
        "query": "prometheus",
        "queryValue": "",
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "type": "datasource"
      },
      {
        "allFormat": "",
        "allValue": "",
        "current": {
          "selected": false,
          "text": "None",
          "value": ""
        },
        "datasource": {
          "uid": "$Datasource"
        },
        "definition": "label_values(keycloak_logins,pod)",
        "hide": 0,
        "includeAll": false,
        "label": "Instance",
        "multi": false,
        "multiFormat": "",
        "name": "instance",
        "options": [],
        "query": {
          "query": "label_values(keycloak_logins,pod)",
          "refId": "Prometheus-instance-Variable-Query"
        },
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "sort": 0,
        "tagValuesQuery": "",
        "tagsQuery": "",
        "type": "query",
        "useTags": false
      },
      {
        "allFormat": "",
        "allValue": "",
        "current": {
          "selected": true,
          "text": "master",
          "value": "master"
        },
        "datasource": {
          "uid": "$Datasource"
        },
        "definition": "",
        "hide": 0,
        "includeAll": false,
        "label": "Realm",
        "multi": false,
        "multiFormat": "",
        "name": "realm",
        "options": [],
        "query": {
          "query": "label_values(keycloak_logins{provider=\"keycloak\"},realm)",
          "refId": "Prometheus-realm-Variable-Query"
        },
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "sort": 0,
        "tagValuesQuery": "",
        "tagsQuery": "",
        "type": "query",
        "useTags": false
      },
      {
        "allFormat": "",
        "allValue": "",
        "current": {
          "selected": false,
          "text": "admin-cli",
          "value": "admin-cli"
        },
        "datasource": {
          "uid": "$Datasource"
        },
        "definition": "",
        "hide": 0,
        "includeAll": false,
        "label": "ClientId",
        "multi": false,
        "multiFormat": "",
        "name": "ClientId",
        "options": [],
        "query": {
          "query": "label_values(keycloak_logins{provider=\"keycloak\",realm=\"$realm\"},client_id)",
          "refId": "Prometheus-ClientId-Variable-Query"
        },
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "sort": 0,
        "tagValuesQuery": "",
        "tagsQuery": "",
        "type": "query",
        "useTags": false
      }
    ]
  },
  "time": {
    "from": "now-30d",
    "to": "now"
  },
  "timepicker": {
    "refresh_intervals": [
      "5s",
      "10s",
      "30s",
      "1m",
      "5m",
      "15m",
      "30m",
      "1h",
      "2h",
      "1d"
    ],
    "time_options": [
      "5m",
      "15m",
      "1h",
      "6h",
      "12h",
      "24h",
      "2d",
      "7d",
      "30d"
    ]
  },
  "timezone": "",
  "title": "Keycloak Metrics Dashboard",
  "uid": "keycloak-dashboard",
  "version": 2,
  "weekStart": ""
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/dashboards/Main/traefik.json
---

{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "target": {
          "limit": 100,
          "matchAny": false,
          "tags": [],
          "type": "dashboard"
        },
        "type": "dashboard"
      }
    ]
  },
  "description": "Traefik Metrics Overview",
  "editable": true,
  "fiscalYearStartMonth": 0,
  "gnetId": 13165,
  "graphTooltip": 1,
  "iteration": 1681240333933,
  "links": [],
  "liveNow": false,
  "panels": [
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 0
      },
      "id": 21,
      "panels": [],
      "title": "General",
      "type": "row"
    },
    {
      "fieldConfig": {
        "defaults": {
          "color": {
            "fixedColor": "rgb(31, 120, 193)",
            "mode": "fixed"
          },
          "mappings": [
            {
              "options": {
                "match": "null",
                "result": {
                  "text": "N/A"
                }
              },
              "type": "special"
            }
          ],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "none"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 7,
        "w": 3,
        "x": 0,
        "y": 1
      },
      "id": 13,
      "links": [],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "none",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "text": {},
        "textMode": "auto"
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "prometheus"
          },
          "exemplar": true,
          "expr": "count(kube_pod_status_ready{namespace=\"$namespace\",condition=\"true\",pod=~\"nebari-traefik-ingress-.*\", job=\"kube-state-metrics\"})",
          "format": "time_series",
          "hide": false,
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Running instances",
      "type": "stat"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 7,
        "w": 21,
        "x": 3,
        "y": 1
      },
      "hiddenSeries": false,
      "id": 29,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": false,
        "max": true,
        "min": false,
        "rightSide": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "exemplar": true,
          "expr": "histogram_quantile(0.$percentiles, sum(rate(traefik_entrypoint_request_duration_seconds_bucket{code=~\"2..\"}[5m])) by (instance, le))",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "{{ instance }}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Per-instance latency, ${percentiles}th percentile over 5 min",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "s",
          "logBase": 1,
          "min": "0",
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "show": false
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 8
      },
      "id": 17,
      "panels": [],
      "title": "Entrypoints",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$Prometheus"
      },
      "fill": 7,
      "fillGradient": 0,
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 0,
        "y": 9
      },
      "hiddenSeries": false,
      "id": 19,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "hideZero": true,
        "max": true,
        "min": false,
        "rightSide": true,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "exemplar": true,
          "expr": "sum(traefik_entrypoint_open_connections) by (method)",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "{{ method }}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Open Connections",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 0,
          "format": "short",
          "logBase": 1,
          "min": "0",
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": "0",
          "show": false
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$Prometheus"
      },
      "decimals": 2,
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 12,
        "y": 9
      },
      "hiddenSeries": false,
      "id": 22,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "hideZero": true,
        "max": false,
        "min": false,
        "rightSide": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "exemplar": true,
          "expr": "(sum(rate(traefik_entrypoint_request_duration_seconds_bucket{le=\"0.1\",code=\"200\"}[5m])) by (job) + sum(rate(traefik_entrypoint_request_duration_seconds_bucket{le=\"0.3\",code=\"200\"}[5m])) by (job)) / 2 / sum(rate(traefik_entrypoint_request_duration_seconds_count{code=\"200\"}[5m])) by (job)",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "Code 200",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Apdex score (over 5 min)",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 2,
          "format": "short",
          "label": "",
          "logBase": 1,
          "min": "0",
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": "0",
          "show": false
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": true,
      "dashLength": 10,
      "dashes": false,
      "decimals": 2,
      "description": "",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 24,
        "x": 0,
        "y": 16
      },
      "hiddenSeries": false,
      "id": 3,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "hideZero": true,
        "max": true,
        "min": false,
        "rightSide": true,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": false,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "exemplar": true,
          "expr": "sum(rate(traefik_entrypoint_requests_total[1m])) by (entrypoint)",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "{{ entrypoint }}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Requests/s per entrypoint (1 min rate)",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 2,
          "format": "short",
          "label": "",
          "logBase": 1,
          "min": "0",
          "show": true
        },
        {
          "decimals": 2,
          "format": "short",
          "label": "",
          "logBase": 1,
          "min": "0",
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 26
      },
      "id": 24,
      "panels": [],
      "title": "Services",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$Prometheus"
      },
      "decimals": 0,
      "fill": 7,
      "fillGradient": 0,
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 0,
        "y": 27
      },
      "hiddenSeries": false,
      "id": 25,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "hideZero": true,
        "max": true,
        "min": false,
        "rightSide": true,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "exemplar": true,
          "expr": "sum(traefik_service_open_connections) by (method)",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "{{ method }}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Open Connections",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 0,
          "format": "short",
          "label": "",
          "logBase": 1,
          "min": "0",
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": "0",
          "show": false
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "decimals": 2,
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 7,
        "w": 12,
        "x": 12,
        "y": 27
      },
      "hiddenSeries": false,
      "id": 26,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "max": true,
        "min": true,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "exemplar": true,
          "expr": "(sum(rate(traefik_service_request_duration_seconds_bucket{le=\"0.1\",code=\"200\"}[5m])) by (job) + sum(rate(traefik_service_request_duration_seconds_bucket{le=\"0.3\",code=\"200\"}[5m])) by (job)) / 2 / sum(rate(traefik_service_request_duration_seconds_count{code=\"200\"}[5m])) by (job)",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "200",
          "refId": "A"
        },
        {
          "exemplar": true,
          "expr": "traefik_service_request_duration_seconds_bucket{le=\"0.1\",code=\"200\"}",
          "hide": false,
          "interval": "",
          "legendFormat": "",
          "refId": "B"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Apdex score (over 5 min)",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 2,
          "format": "short",
          "logBase": 1,
          "min": "0",
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "min": "0",
          "show": false
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": true,
      "dashLength": 10,
      "dashes": false,
      "decimals": 2,
      "description": "",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 10,
        "w": 24,
        "x": 0,
        "y": 34
      },
      "hiddenSeries": false,
      "id": 4,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "hideZero": true,
        "max": true,
        "min": false,
        "rightSide": true,
        "show": true,
        "sort": "avg",
        "sortDesc": true,
        "total": false,
        "values": true
      },
      "lines": false,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "exemplar": true,
          "expr": "sum(rate(traefik_service_requests_total[1m])) by (service)",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "{{ service }}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Requests/s per service (1 min rate)",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 2,
          "format": "short",
          "label": "",
          "logBase": 1,
          "min": "0",
          "show": true
        },
        {
          "decimals": 2,
          "format": "short",
          "label": "",
          "logBase": 1,
          "min": "0",
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 44
      },
      "id": 15,
      "panels": [],
      "title": "HTTP Codes stats",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": true,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$Prometheus"
      },
      "decimals": 0,
      "description": "",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 0,
        "y": 45
      },
      "hiddenSeries": false,
      "id": 5,
      "legend": {
        "alignAsTable": true,
        "avg": false,
        "current": false,
        "hideEmpty": false,
        "hideZero": true,
        "max": false,
        "min": false,
        "rightSide": true,
        "show": true,
        "total": true,
        "values": true
      },
      "lines": false,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": true,
      "steppedLine": false,
      "targets": [
        {
          "exemplar": true,
          "expr": "count(rate(traefik_service_requests_total{code=~\"[2-5]..\"}[5m])) by (method, code)",
          "format": "time_series",
          "interval": "1",
          "intervalFactor": 2,
          "legendFormat": "{{method}} : {{code}}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Status method/codes over 5min",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "decimals": 2,
          "format": "short",
          "logBase": 1,
          "min": "0",
          "show": true
        },
        {
          "decimals": 2,
          "format": "short",
          "logBase": 1,
          "min": "0",
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 54
      },
      "id": 35,
      "panels": [],
      "title": "Pods resources",
      "type": "row"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 0,
        "y": 55
      },
      "hiddenSeries": false,
      "id": 31,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "max": true,
        "min": false,
        "rightSide": false,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "exemplar": true,
          "expr": "sum(container_memory_usage_bytes{namespace=\"$namespace\",pod=~\".*traefik.*\"})",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "Memory used",
          "refId": "A"
        },
        {
          "exemplar": true,
          "expr": "sum(kube_pod_container_resource_requests_memory_bytes{namespace=\"$namespace\",pod=~\".*traefik.*\"})",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "Requested memory",
          "refId": "B"
        },
        {
          "exemplar": true,
          "expr": "sum(kube_pod_container_resource_limits_memory_bytes{namespace=\"$namespace\",pod=~\".*traefik.*\"})",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "Limit memory usage",
          "refId": "C"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Traefik memory usage",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "bytes",
          "logBase": 1,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "show": false
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "uid": "$Prometheus"
      },
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 9,
        "w": 12,
        "x": 12,
        "y": 55
      },
      "hiddenSeries": false,
      "id": 33,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "max": true,
        "min": false,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.3.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "exemplar": true,
          "expr": "sum(rate(container_cpu_usage_seconds_total{namespace=\"$namespace\",pod=~\".*traefik.*\"}[2m]))",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "Cpu used",
          "refId": "A"
        },
        {
          "exemplar": true,
          "expr": "sum(kube_pod_container_resource_requests_cpu_cores{namespace=\"$namespace\",pod=~\".*traefik.*\"})",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "Requested cpu",
          "refId": "B"
        },
        {
          "exemplar": true,
          "expr": "sum(kube_pod_container_resource_limits_cpu_cores{namespace=\"$namespace\",pod=~\".*traefik.*\"})",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "Limit cpu usage",
          "refId": "C"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Traefik CPU usage",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "short",
          "logBase": 1,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    }
  ],
  "refresh": "10s",
  "schemaVersion": 34,
  "style": "dark",
  "tags": [],
  "templating": {
    "list": [
      {
        "current": {
          "selected": false,
          "text": "Prometheus",
          "value": "Prometheus"
        },
        "hide": 0,
        "includeAll": false,
        "label": "Datasource",
        "multi": false,
        "name": "Prometheus",
        "options": [],
        "query": "prometheus",
        "queryValue": "",
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "type": "datasource"
      },
      {
        "current": {
          "selected": true,
          "text": "dev",
          "value": "dev"
        },
        "datasource": {
          "type": "prometheus",
          "uid": "prometheus"
        },
        "definition": "label_values(kube_pod_container_info{pod=~\".*traefik.*\"}, namespace)",
        "hide": 0,
        "includeAll": false,
        "label": "Namespace",
        "multi": false,
        "name": "namespace",
        "options": [],
        "query": {
          "query": "label_values(kube_pod_container_info{pod=~\".*traefik.*\"}, namespace)",
          "refId": "StandardVariableQuery"
        },
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "sort": 1,
        "tagValuesQuery": "",
        "tagsQuery": "",
        "type": "query",
        "useTags": false
      },
      {
        "current": {
          "selected": true,
          "text": "95",
          "value": "95"
        },
        "hide": 0,
        "includeAll": false,
        "label": "Percentiles",
        "multi": false,
        "name": "percentiles",
        "options": [
          {
            "selected": true,
            "text": "95",
            "value": "95"
          },
          {
            "selected": false,
            "text": "99",
            "value": "99"
          }
        ],
        "query": "95,99",
        "queryValue": "",
        "skipUrlSync": false,
        "type": "custom"
      }
    ]
  },
  "time": {
    "from": "now-6h",
    "to": "now"
  },
  "timepicker": {
    "refresh_intervals": [
      "10s",
      "30s",
      "1m",
      "5m",
      "15m",
      "30m",
      "1h"
    ],
    "time_options": [
      "5m",
      "15m",
      "1h",
      "6h",
      "12h",
      "24h",
      "2d",
      "7d",
      "30d"
    ]
  },
  "timezone": "",
  "title": "Traefik",
  "uid": "2p6nlgS7z",
  "version": 1,
  "weekStart": ""
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/dashboards/Main/usage_report.json
---

{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "target": {
          "limit": 100,
          "matchAny": false,
          "tags": [],
          "type": "dashboard"
        },
        "type": "dashboard"
      }
    ]
  },
  "editable": true,
  "fiscalYearStartMonth": 0,
  "graphTooltip": 0,
  "iteration": 1681169345054,
  "links": [],
  "liveNow": false,
  "panels": [
    {
      "datasource": {
        "type": "prometheus",
        "uid": "$PROMETHEUS_DS"
      },
      "fieldConfig": {
        "defaults": {
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              }
            ]
          },
          "unit": "bytes"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 10,
        "w": 24,
        "x": 0,
        "y": 0
      },
      "id": 2,
      "options": {
        "displayMode": "gradient",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "showUnfilled": true
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "PBFA97CFB590B2093"
          },
          "exemplar": true,
          "expr": "kube_pod_labels{\n  label_app=\"jupyterhub\",\n  label_component=\"singleuser-server\",\n  namespace=~\"$hub\",\n  job=\"kube-state-metrics\"\n}\n* on (namespace, pod) group_left()\nsum(\n  container_memory_working_set_bytes{\n    namespace=~\"$hub\",\n    container=\"notebook\",\n    name!=\"\",\n  }\n) by (namespace, pod)\n",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "{{label_hub_jupyter_org_username}} ({{namespace}})",
          "refId": "A"
        }
      ],
      "title": "User pod memory usage",
      "type": "bargauge"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "$PROMETHEUS_DS"
      },
      "fieldConfig": {
        "defaults": {
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              }
            ]
          },
          "unit": "bytes"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 10,
        "w": 24,
        "x": 0,
        "y": 10
      },
      "id": 3,
      "options": {
        "displayMode": "gradient",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "showUnfilled": true
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "PBFA97CFB590B2093"
          },
          "exemplar": true,
          "expr": "sum(\n  kube_pod_labels{\n    namespace=~\"$hub\",\n    label_app_kubernetes_io_component=\"dask-worker\",\n  }\n  * on (namespace, pod) group_left()\n  sum(\n    container_memory_working_set_bytes{\n      namespace=~\"$hub\",\n      container=\"dask-worker\",\n      name!=\"\",\n    }\n  ) by (namespace, pod)\n) by (label_gateway_dask_org_cluster)\n",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "{{label_gateway_dask_org_cluster}}",
          "refId": "A"
        }
      ],
      "title": "Dask-gateway worker pod memory usage",
      "type": "bargauge"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "$PROMETHEUS_DS"
      },
      "fieldConfig": {
        "defaults": {
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              }
            ]
          },
          "unit": "bytes"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 10,
        "w": 24,
        "x": 0,
        "y": 20
      },
      "id": 4,
      "options": {
        "displayMode": "gradient",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "showUnfilled": true
      },
      "pluginVersion": "8.3.3",
      "targets": [
        {
          "datasource": {
            "type": "prometheus",
            "uid": "PBFA97CFB590B2093"
          },
          "exemplar": true,
          "expr": "sum(\n  kube_pod_labels{\n    namespace=~\"$hub\",\n    label_app_kubernetes_io_component=\"dask-scheduler\",\n  }\n  * on (namespace, pod) group_left()\n  sum(\n    container_memory_working_set_bytes{\n      namespace=~\"$hub\",\n      container=\"dask-scheduler\",\n      name!=\"\",\n    }\n  ) by (namespace, pod)\n) by (label_gateway_dask_org_cluster)\n",
          "format": "time_series",
          "interval": "",
          "intervalFactor": 2,
          "legendFormat": "{{label_gateway_dask_org_cluster}}",
          "refId": "A"
        }
      ],
      "title": "Dask-gateway scheduler pod memory usage",
      "type": "bargauge"
    }
  ],
  "refresh": "5s",
  "schemaVersion": 34,
  "style": "dark",
  "tags": [
    "jupyterhub",
    "dask"
  ],
  "templating": {
    "list": [
      {
        "current": {
          "selected": false,
          "text": "Prometheus",
          "value": "Prometheus"
        },
        "hide": 1,
        "includeAll": false,
        "multi": false,
        "name": "PROMETHEUS_DS",
        "options": [],
        "query": "prometheus",
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "type": "datasource"
      },
      {
        "current": {
          "selected": false,
          "text": "dev",
          "value": "dev"
        },
        "datasource": {
          "uid": "$PROMETHEUS_DS"
        },
        "definition": "label_values({service=\"hub\"},namespace)",
        "hide": 0,
        "includeAll": false,
        "multi": false,
        "name": "hub",
        "options": [],
        "query": {
          "query": "label_values({service=\"hub\"},namespace)",
          "refId": "Prometheus-hub-Variable-Query"
        },
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "sort": 0,
        "tagValuesQuery": "",
        "tagsQuery": "",
        "type": "query",
        "useTags": false
      }
    ]
  },
  "time": {
    "from": "now-6h",
    "to": "now"
  },
  "timepicker": {
    "refresh_intervals": [
      "5s",
      "10s",
      "30s",
      "1m",
      "5m",
      "15m",
      "30m",
      "1h",
      "2h",
      "1d"
    ],
    "time_options": [
      "5m",
      "15m",
      "1h",
      "6h",
      "12h",
      "24h",
      "2d",
      "7d",
      "30d"
    ]
  },
  "timezone": "browser",
  "title": "Usage Report",
  "uid": "usage-report",
  "version": 1,
  "weekStart": ""
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/loki/main.tf
---

resource "random_password" "minio_root_password" {
  length  = 32
  special = false
}

locals {
  minio-url = "http://${var.minio-release-name}:${var.minio-port}"
  node-selector = {
    "${var.node-group.key}" = "${var.node-group.value}"
  }
}

resource "helm_release" "loki-minio" {
  count      = var.minio-enabled ? 1 : 0
  name       = var.minio-release-name
  namespace  = var.namespace
  repository = "https://raw.githubusercontent.com/bitnami/charts/defb094c658024e4aa8245622dab202874880cbc/bitnami"
  chart      = "minio"
  # last release that was Apache-2.0
  version = var.minio-helm-chart-version

  set {
    name  = "accessKey.password"
    value = "admin"
  }

  set {
    name  = "secretKey.password"
    value = random_password.minio_root_password.result
  }

  set {
    name  = "defaultBuckets"
    value = join(" ", var.buckets)
  }

  set {
    name  = "persistence.size"
    value = var.minio-storage
  }

  values = concat([
    file("${path.module}/values_minio.yaml"),
    jsonencode({
      nodeSelector : local.node-selector
    })
  ], var.grafana-loki-minio-overrides)
}


resource "helm_release" "grafana-loki" {
  name       = "nebari-loki"
  namespace  = var.namespace
  repository = "https://grafana.github.io/helm-charts"
  chart      = "loki"
  version    = var.loki-helm-chart-version

  values = concat([
    file("${path.module}/values_loki.yaml"),
    jsonencode({
      loki : {
        storage : {
          s3 : {
            endpoint : local.minio-url,
            accessKeyId : "admin"
            secretAccessKey : random_password.minio_root_password.result,
            s3ForcePathStyle : true
          }
        }
      }
      storageConfig : {
        # We configure MinIO by using the AWS config because MinIO implements the S3 API
        aws : {
          s3 : local.minio-url
          s3ForcePathStyle : true
        }
      }
      write : { nodeSelector : local.node-selector }
      read : { nodeSelector : local.node-selector }
      backend : { nodeSelector : local.node-selector }
      gateway : { nodeSelector : local.node-selector }
    })
  ], var.grafana-loki-overrides)

  depends_on = [helm_release.loki-minio]
}
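
# A minimal sketch (not part of this module) of how a caller might pass
# `grafana-loki-overrides`. The module label, source path, and node-group
# values below are hypothetical. The Helm provider merges the `values` entries
# in order, so an override listed here wins over values_loki.yaml and the
# jsonencode block in helm_release.grafana-loki.
#
# module "loki" {
#   source     = "./modules/kubernetes/services/monitoring/loki" # hypothetical path
#   namespace  = "dev"
#   node-group = { key = "example.com/node-group", value = "general" } # hypothetical labels
#
#   grafana-loki-overrides = [
#     yamlencode({ write = { replicas = 2 } })
#   ]
# }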

resource "helm_release" "grafana-promtail" {
  # Promtail ships the contents of logs to the Loki instance
  name       = "nebari-promtail"
  namespace  = var.namespace
  repository = "https://grafana.github.io/helm-charts"
  chart      = "promtail"
  version    = var.promtail-helm-chart-version

  values = concat([
    file("${path.module}/values_promtail.yaml"),
    jsonencode({
    })
  ], var.grafana-promtail-overrides)

  depends_on = [helm_release.grafana-loki]
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/loki/values_loki.yaml
---

# https://github.com/grafana/loki/blob/4cae003ecedd474e4c15feab4ea2ef435afff83f/production/helm/loki/values.yaml

loki:
  storage:
    type: s3
  commonConfig:
    replication_factor: 1
  # Not required, as Loki runs inside the cluster and is not exposed to the public network
  auth_enabled: false

  # The Compactor deduplicates index entries and also applies granular retention.
  compactor:
    # Directory where marked chunks and temporary tables are saved.
    working_directory: /var/loki/compactor/data/retention
    # minio s3
    shared_store: s3
    # How often compaction happens
    compaction_interval: 1h
    # Delete old logs once the retention delete delay has passed.
    # Ideally we would use storage-based retention, but Loki does not
    # currently implement it, so we use time-based retention instead.
    retention_enabled: true
    # Delay after which the Compactor deletes marked chunks.
    retention_delete_delay: 1h
    # Maximum number of goroutine workers instantiated to delete chunks.
    retention_delete_worker_count: 150

  limits_config:
    # Loki's minimum retention period is 24h.
    # The default below is reasonable in most cases; to retain logs for longer,
    # override this value from nebari-config.yaml.
    retention_period: 60d
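    #
    # Illustrative sketch only: assuming overrides for this chart are read from
    # `monitoring.overrides.loki` in nebari-config.yaml (an assumed key path,
    # not confirmed by this file), a longer retention might look like:
    #
    #   monitoring:
    #     overrides:
    #       loki:
    #         loki:
    #           limits_config:
    #             retention_period: 90d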

  schema_config:
    configs:
      # list of period_configs
      # The date of the first day that index buckets should be created.
      - from: "2024-03-01"
        index:
          period: 24h
          prefix: loki_index_
        object_store: s3
        schema: v11
        store: boltdb-shipper
  storage_config:
    boltdb_shipper:
      # Directory where ingesters would write index files which would then be
      # uploaded by shipper to configured storage
      active_index_directory: /var/loki/compactor/data/index
      # Cache location for restoring index files from storage for queries
      cache_location: /var/loki/compactor/data/boltdb-cache
      # Shared store for keeping index files
      shared_store: s3

# Configuration for the write pod(s)
write:
  # -- Number of replicas for the write
  # A single replica keeps the cost of running Nebari down; if more are
  # needed, this can be overridden from nebari-config.yaml
  replicas: 1

read:
  # -- Number of replicas for the read
  replicas: 1

backend:
  # -- Number of replicas for the backend
  replicas: 1

minio:
  # MinIO is deployed separately, from the Bitnami chart
  enabled: false

monitoring:
  selfMonitoring:
    grafanaAgent:
      installOperator: false



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/loki/values_minio.yaml
---

# https://github.com/bitnami/charts/blob/440ec159c26e4ff0748b9e9866b345d98220c40a/bitnami/minio/values.yaml



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/loki/values_promtail.yaml
---

# https://github.com/grafana/helm-charts/blob/3831194ba2abd2a0ca7a14ca00e578f8e9d2abc6/charts/promtail/values.yaml



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/loki/variables.tf
---

variable "namespace" {
  description = "deploy monitoring services on this namespace"
  type        = string
  default     = "dev"
}

variable "loki-helm-chart-version" {
  description = "version to deploy for the loki helm chart"
  type        = string
  default     = "5.43.3"
}

variable "promtail-helm-chart-version" {
  description = "version to deploy for the promtail helm chart"
  type        = string
  default     = "6.15.5"
}

variable "minio-helm-chart-version" {
  description = "version to deploy for the minio helm chart"
  type        = string
  default     = "6.7.4"
}

variable "grafana-loki-overrides" {
  description = "Grafana Loki helm chart overrides"
  type        = list(string)
  default     = []
}

variable "grafana-promtail-overrides" {
  description = "Grafana Promtail helm chart overrides"
  type        = list(string)
  default     = []
}

variable "grafana-loki-minio-overrides" {
  description = "Grafana Loki minio helm chart overrides"
  type        = list(string)
  default     = []
}

variable "minio-release-name" {
  description = "Grafana Loki minio release name"
  type        = string
  default     = "nebari-loki-minio"
}

variable "minio-port" {
  description = "Grafana Loki minio port"
  type        = number
  default     = 9000
}

variable "buckets" {
  description = "Minio buckets"
  type        = list(string)
  default = [
    "chunks",
    "ruler",
    "admin",
    "loki"
  ]
}

variable "minio-storage" {
  description = "Minio storage"
  type        = string
  default     = "50Gi"
}

variable "minio-enabled" {
  description = "Deploy minio along with loki or not"
  type        = bool
  default     = true
}

variable "node-group" {
  description = "Node key value pair for bound resources"
  type = object({
    key   = string
    value = string
  })
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/main.tf
---

resource "random_password" "grafana_admin_password" {
  length  = 32
  special = false
}

resource "kubernetes_secret" "grafana_oauth_secret" {
  metadata {
    name      = "grafana-oauth-secret"
    namespace = var.namespace
  }

  data = {
    "grafana-oauth-client-id"     = module.grafana-client-id.config.client_id
    "grafana-oauth-client-secret" = module.grafana-client-id.config.client_secret
  }
}

resource "helm_release" "prometheus-grafana" {
  name       = "nebari"
  namespace  = var.namespace
  repository = "https://prometheus-community.github.io/helm-charts"
  chart      = "kube-prometheus-stack"
  version    = "58.4.0"

  values = concat([
    file("${path.module}/values.yaml"),
    # https://github.com/prometheus-community/helm-charts/blob/kube-prometheus-stack-58.4.0/charts/kube-prometheus-stack/values.yaml
    jsonencode({
      alertmanager = {
        alertmanagerSpec = {
          nodeSelector : {
            "${var.node-group.key}" = var.node-group.value
          }
        }
      }

      prometheusOperator = {
        nodeSelector = {
          "${var.node-group.key}" = var.node-group.value
        }

        admissionWebhooks = {
          patch = {
            nodeSelector = {
              "${var.node-group.key}" = var.node-group.value
            }
          }
        }
      }

      kube-state-metrics = {
        # kube-state-metrics does not collect pod labels by default.
        # This tells kube-state-metrics to collect app and component labels which are used by the jupyterhub grafana dashboards.
        metricLabelsAllowlist = ["pods=[app,component,hub.jupyter.org/username,app.kubernetes.io/component,gateway.dask.org/cluster]", "nodes=[*]"] # ["pods=[*]"] would collect all pod labels, but is not recommended.
        nodeSelector = {
          "${var.node-group.key}" = var.node-group.value
        }
      }

      prometheus = {
        prometheusSpec = {
          nodeSelector = {
            "${var.node-group.key}" = var.node-group.value
          }
          additionalScrapeConfigs = [
            {
              job_name        = "kuberhealthy"
              scrape_interval = "1m"
              honor_labels    = true
              metrics_path    = "/metrics"
              static_configs = [
                {
                  targets = [
                    "kuberhealthy.${var.namespace}.svc.cluster.local"
                  ]
                }
              ]
            },
            {
              job_name     = "Keycloak Target"
              metrics_path = "/auth/realms/master/metrics"
              static_configs = [
                { targets = [
                  "keycloak-http.${var.namespace}.svc",
                  ]
                }
              ]
            },
            {
              job_name     = "Conda Store Target"
              metrics_path = "/conda-store/metrics"
              static_configs = [
                { targets = [
                  "nebari-conda-store-server.${var.namespace}.svc:5000",
                  ]
                }
              ]
            },
            {
              job_name     = "Jupyterhub"
              metrics_path = "/hub/metrics"
              static_configs = [
                { targets = [
                  "hub.${var.namespace}.svc:8081",
                  ]
                }
              ]
              authorization = {
                type        = "Bearer"
                credentials = var.jupyterhub_api_token
              }
            },
            {
              "job_name"     = "Kubernetes Services"
              "honor_labels" = true
              "kubernetes_sd_configs" = [{
                "role" = "endpoints"
              }]

              "relabel_configs" = [
                {
                  "action" = "keep"

                  "regex" = true

                  "source_labels" = ["__meta_kubernetes_service_annotation_prometheus_io_scrape"]
                },
                {
                  "action" = "drop"

                  "regex" = true

                  "source_labels" = ["__meta_kubernetes_service_annotation_prometheus_io_scrape_slow"]
                },
                {
                  "action" = "replace"

                  "regex" = "(https?)"

                  "source_labels" = ["__meta_kubernetes_service_annotation_prometheus_io_scheme"]

                  "target_label" = "__scheme__"
                },
                {
                  "action" = "replace"

                  "regex" = "(.+)"

                  "source_labels" = ["__meta_kubernetes_service_annotation_prometheus_io_path"]

                  "target_label" = "__metrics_path__"
                },
                {
                  "action" = "replace"

                  "regex" = "(.+?)(?::\\d+)?;(\\d+)"

                  "replacement" = "$1:$2"

                  "source_labels" = ["__address__", "__meta_kubernetes_service_annotation_prometheus_io_port"]

                  "target_label" = "__address__"
                },
                {
                  "action" = "labelmap"

                  "regex" = "__meta_kubernetes_service_annotation_prometheus_io_param_(.+)"

                  "replacement" = "__param_$1"
                },
                {
                  "action" = "labelmap"

                  "regex" = "__meta_kubernetes_service_label_(.+)"
                },
                {
                  "action" = "replace"

                  "source_labels" = ["__meta_kubernetes_namespace"]

                  "target_label" = "namespace"
                },
                {
                  "action" = "replace"

                  "source_labels" = ["__meta_kubernetes_service_name"]

                  "target_label" = "service"
                },
                {
                  "action" = "replace"

                  "source_labels" = ["__meta_kubernetes_pod_node_name"]

                  "target_label" = "node"
                }
              ]
            }
          ]
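          # Illustrative sketch, not taken from this repository: given the
          # relabel_configs of the "Kubernetes Services" job above, a Service
          # annotated like this would be kept and scraped on port 8080 at
          # /metrics. The Service name and port are hypothetical, and only the
          # metadata fragment is shown.
          #
          #   apiVersion: v1
          #   kind: Service
          #   metadata:
          #     name: my-service
          #     annotations:
          #       prometheus.io/scrape: "true"
          #       prometheus.io/port: "8080"
          #       prometheus.io/path: "/metrics"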
        }
      }

      # https://github.com/grafana/helm-charts/blob/main/charts/grafana/values.yaml
      grafana = {
        nodeSelector = {
          "${var.node-group.key}" = var.node-group.value
        }

        additionalDataSources = [
          {
            name = "Loki"
            type = "loki"
            url  = "http://loki-gateway.${var.namespace}"
          }
        ]

        # Avoid using the default password, as that's a security risk
        adminPassword : random_password.grafana_admin_password.result

        sidecar = {
          dashboards = {
            annotations = {
              "dashboard/subdirectory" = "Supplementary"
            }
            provider = {
              foldersFromFilesStructure : true
            }
            # If specified, the sidecar will look for an annotation with this name to create a folder and put the dashboard there.
            # You can use this parameter together with `provider.foldersFromFilesStructure` to annotate configmaps and create a folder structure.
            folderAnnotation : "dashboard/subdirectory"
          }
        }

        envFromSecret = kubernetes_secret.grafana_oauth_secret.metadata[0].name

        "grafana.ini" : {
          server = {
            protocol            = "http"
            domain              = var.external-url
            root_url            = "https://%(domain)s/monitoring"
            serve_from_sub_path = "true"
          }

          auth = {
            oauth_auto_login = "true"
          }

          "auth.generic_oauth" = {
            enabled                  = "true"
            name                     = "Login Keycloak"
            allow_sign_up            = "true"
            client_id                = "$__env{grafana-oauth-client-id}"
            client_secret            = "$__env{grafana-oauth-client-secret}"
            scopes                   = "profile"
            auth_url                 = module.grafana-client-id.config.authentication_url
            token_url                = module.grafana-client-id.config.token_url
            api_url                  = module.grafana-client-id.config.userinfo_url
            tls_skip_verify_insecure = "true"
            login_attribute_path     = "preferred_username"
            role_attribute_path      = "contains(roles[*], 'grafana_admin') && 'Admin' || contains(roles[*], 'grafana_developer') && 'Editor' || contains(roles[*], 'grafana_viewer') && 'Viewer' || 'Viewer'"
          }
        }
      }
    })
  ], var.overrides)
}


module "grafana-client-id" {
  source = "../keycloak-client"

  realm_id     = var.realm_id
  client_id    = "grafana"
  external-url = var.external-url
  role_mapping = {
    "admin"     = ["grafana_admin"]
    "developer" = ["grafana_developer"]
    "analyst"   = ["grafana_viewer"]
  }
  callback-url-paths = [
    "https://${var.external-url}/monitoring/login/generic_oauth"
  ]
}


resource "kubernetes_config_map" "dashboard" {
  for_each = var.dashboards
  metadata {
    name      = "nebari-grafana-dashboards-${lower(each.value)}"
    namespace = var.namespace
    labels = {
      # grafana_dashboard label needed for grafana to pick it up
      # automatically
      grafana_dashboard = "1"
    }
    annotations = {
      "dashboard/subdirectory" = "${each.value}"
    }
  }

  data = {
    for dashboard_file in fileset("${path.module}/dashboards/${each.value}", "*.json") :
    dashboard_file => file("${path.module}/dashboards/${each.value}/${dashboard_file}")
  }
}


resource "kubernetes_manifest" "grafana-ingress-route" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "IngressRoute"
    metadata = {
      name      = "grafana-ingress-route"
      namespace = var.namespace
    }
    spec = {
      entryPoints = ["websecure"]
      routes = [
        {
          kind  = "Rule"
          match = "Host(`${var.external-url}`) && PathPrefix(`/monitoring`)"
          services = [
            {
              name      = "nebari-grafana"
              port      = 80
              namespace = var.namespace
            }
          ]
        }
      ]
    }
  }
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/values.yaml
---

# https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/values.yaml



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/variables.tf
---

variable "namespace" {
  description = "Namespace in which to deploy monitoring services"
  type        = string
  default     = "dev"
}


variable "external-url" {
  description = "External URL at which the JupyterHub cluster is accessible"
  type        = string
}


variable "realm_id" {
  description = "Keycloak realm for creating oauth client"
  type        = string
}


variable "dashboards" {
  description = "Enabled grafana dashboards"
  type        = set(string)
  default = [
    "Main",
  ]
}

variable "jupyterhub_api_token" {
  type      = string
  default   = ""
  sensitive = true
}

variable "node-group" {
  description = "Node key value pair for bound resources"
  type = object({
    key   = string
    value = string
  })
}


variable "overrides" {
  description = "kube-prometheus-stack helm chart overrides"
  type        = list(string)
  default     = []
}
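
# Example (a sketch, not part of the module): each entry in `overrides` is a
# YAML or JSON document appended after the module's own values (see
# `concat(..., var.overrides)` in main.tf), so later entries take precedence.
# The `replicas` setting below is only an illustrative choice.
#
#   overrides = [
#     yamlencode({
#       grafana = {
#         replicas = 2
#       }
#     })
#   ]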



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/monitoring/versions.tf
---

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "2.1.2"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.35.1"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/postgresql/main.tf
---

resource "random_password" "root_password" {
  length  = 32
  special = false
}


resource "helm_release" "postgresql" {
  name      = "${var.name}-postgresql"
  namespace = var.namespace

  repository = "https://raw.githubusercontent.com/bitnami/charts/defb094c658024e4aa8245622dab202874880cbc/bitnami"
  chart      = "postgresql"
  version    = "10.13.12"

  set {
    name  = "postgresqlUsername"
    value = "postgres"
  }

  set {
    name  = "postgresqlPassword"
    value = random_password.root_password.result
  }

  set {
    name  = "postgresqlDatabase"
    value = var.database
  }

  values = concat([
    file("${path.module}/values.yaml"),
    jsonencode({
      primary = {
        nodeSelector = {
          "${var.node-group.key}" = var.node-group.value
        }
      }
    })
  ], var.overrides)
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/postgresql/outputs.tf
---

output "root_username" {
  description = "Username for root user"
  value       = "postgres"
}

output "root_password" {
  description = "Password for root user"
  value       = random_password.root_password.result
}

output "database" {
  description = "Database name"
  value       = var.database
}

output "service" {
  description = "Service name"
  value       = helm_release.postgresql.name
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/postgresql/values.yaml
---

# https://github.com/bitnami/charts/blob/master/bitnami/postgresql/values.yaml



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/postgresql/variables.tf
---

variable "name" {
  description = "Name prefix for the PostgreSQL helm release"
  type        = string
  default     = "nebari"
}


variable "namespace" {
  description = "Namespace in which to deploy PostgreSQL"
  type        = string
}


variable "database" {
  description = "Postgres database"
  type        = string
}


variable "overrides" {
  description = "Postgresql helm chart list of overrides"
  type        = list(string)
  default     = []
}

variable "node-group" {
  description = "Node key value pair for bound general resources"
  type = object({
    key   = string
    value = string
  })
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/redis/main.tf
---

resource "random_password" "root_password" {
  length  = 32
  special = false
}


resource "helm_release" "redis" {
  name      = "${var.name}-redis"
  namespace = var.namespace

  repository = "https://charts.bitnami.com/bitnami"
  chart      = "redis"
  version    = "17.0.6"

  set {
    name  = "auth.password"
    value = random_password.root_password.result
  }

  values = concat([
    file("${path.module}/values.yaml"),
    jsonencode({
      architecture = "standalone"
      master = {
        nodeSelector = {
          "${var.node-group.key}" = var.node-group.value
        }
      }
    })
  ], var.overrides)
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/redis/outputs.tf
---

output "root_password" {
  description = "Password for redis"
  value       = random_password.root_password.result
}

output "service" {
  description = "Service name"
  value       = "${helm_release.redis.name}-master"
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/redis/values.yaml
---

# https://github.com/bitnami/charts/blob/master/bitnami/redis/values.yaml



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/redis/variables.tf
---

variable "name" {
  description = "Name prefix for the Redis helm release"
  type        = string
  default     = "nebari"
}


variable "namespace" {
  description = "Namespace in which to deploy Redis"
  type        = string
}


variable "overrides" {
  description = "Redis helm chart list of overrides"
  type        = list(string)
  default     = []
}


variable "node-group" {
  description = "Node key value pair for bound general resources"
  type = object({
    key   = string
    value = string
  })
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/rook-ceph/cluster-values.yaml.tftpl
---

# https://github.com/rook/rook/blob/v1.14.7/deploy/charts/rook-ceph-cluster/values.yaml
monitoring:
  enabled: false  # TODO: Enable monitoring when nebari-config.yaml has it enabled
toolbox:
  enabled: false # for debugging purposes
cephBlockPools: []
cephObjectStores: []
cephClusterSpec:
  cephConfig:
    global:
      osd_pool_default_size: "1"
      mon_warn_on_pool_no_redundancy: "false"
      bdev_flock_retry: "20"
      bluefs_buffered_io: "false"
      mon_data_avail_warn: "10"
  placement:
    additionalProperties:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: ${node_group.key}
                  operator: In
                  values:
                    - ${node_group.value}
  # values from https://raw.githubusercontent.com/rook/rook/release-1.14/deploy/examples/cluster-on-pvc.yaml
  dataDirHostPath: /var/lib/rook
  mon:
    # Set the number of mons to be started. Generally recommended to be 3.
    # For highest availability, an odd number of mons should be specified.
    count: 1
    allowMultiplePerNode: true
    # A volume claim template can be specified in which case new monitors (and
    # monitors created during fail over) will construct a PVC based on the
    # template for the monitor's primary storage. Changes to the template do not
    # affect existing monitors. Log data is stored on the HostPath under
    # dataDirHostPath. If no storage requirement is specified, a default storage
    # size appropriate for monitor data will be used.
    volumeClaimTemplate:
      spec:
        %{ if storageClassName != null }storageClassName: ${storageClassName}%{ endif }
        resources:
          requests:
            storage: 10Gi
  cephVersion:
    image: quay.io/ceph/ceph:v18.2.2
    allowUnsupported: false
  mgr:
    count: 1
    allowMultiplePerNode: true
    modules:
      - name: rook
        enabled: true
  dashboard:
    enabled: true
    ssl: false
  crashCollector:
    disable: true # false
  logCollector:
    enabled: true
    periodicity: daily # one of: hourly, daily, weekly, monthly
    maxLogSize: 500M # SUFFIX may be 'M' or 'G'. Must be at least 1M.
  storage:
    storageClassDeviceSets:
      - name: set1
        # The number of OSDs to create from this device set
        count: 1
        portable: true
        tuneDeviceClass: true
        tuneFastDeviceClass: true
        # whether to encrypt the deviceSet or not
        encrypted: false
        # Since the OSDs could end up on any node, an effort needs to be made to spread the OSDs
        # across nodes as much as possible. Unfortunately the pod anti-affinity breaks down
        # as soon as you have more than one OSD per node. The topology spread constraints will
        # give us an even spread on K8s 1.18 or newer.
        placement:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: ${node_group.key}
                      operator: In
                      values:
                        - ${node_group.value}
        resources:
        volumeClaimTemplates:
          - metadata:
              name: data
              # if you are looking at giving your OSD a different CRUSH device class than the one detected by Ceph
              # annotations:
              #   crushDeviceClass: hybrid
            spec:
              resources:
                requests:
                  storage: ${storage_capacity_Gi}Gi  # TODO: Look into auto resizing these as needed
              # IMPORTANT: Change the storage class depending on your environment
              %{ if storageClassName != null }storageClassName: ${storageClassName}%{ endif }
              volumeMode: Block
              accessModes:
                - ReadWriteOnce
    # when onlyApplyOSDPlacement is false, will merge both placement.All() and storageClassDeviceSets.Placement.
    onlyApplyOSDPlacement: false
  resources:
  priorityClassNames:
    # If there are multiple nodes available in a failure domain (e.g. zones), the
    # mons and osds can be portable and set the system-cluster-critical priority class.
    mon: system-node-critical
    osd: system-node-critical
    mgr: system-cluster-critical
  disruptionManagement:
    managePodBudgets: true
    osdMaintenanceTimeout: 30
    pgHealthCheckTimeout: 0

cephFileSystems:
  - name: ceph-filesystem
    # see https://github.com/rook/rook/blob/master/Documentation/CRDs/Shared-Filesystem/ceph-filesystem-crd.md#filesystem-settings for available configuration
    spec:
      metadataPool:
        replicated:
          size: 1
      dataPools:
        - failureDomain: host
          replicated:
            size: 1
          # Optional and highly recommended, 'data0' by default, see https://github.com/rook/rook/blob/master/Documentation/CRDs/Shared-Filesystem/ceph-filesystem-crd.md#pools
          name: data0
      metadataServer:
        activeCount: 1
        activeStandby: true
        resources:
          limits:
            memory: "4Gi"
          requests:
            cpu: "1000m"
            memory: "4Gi"
        priorityClassName: system-cluster-critical
    storageClass:
      enabled: true
      isDefault: false
      name: ceph-filesystem
      # (Optional) specify a data pool to use, must be the name of one of the data pools above, 'data0' by default
      pool: data0
      reclaimPolicy: Delete
      allowVolumeExpansion: true
      volumeBindingMode: "Immediate"
      annotations: { }
      labels: { }
      mountOptions: []
      # see https://github.com/rook/rook/blob/master/Documentation/Storage-Configuration/Shared-Filesystem-CephFS/filesystem-storage.md#provision-storage for available configuration
      parameters:
        # The secrets contain Ceph admin credentials.
        csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
        csi.storage.k8s.io/provisioner-secret-namespace: "{{ .Release.Namespace }}"
        csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
        csi.storage.k8s.io/controller-expand-secret-namespace: "{{ .Release.Namespace }}"
        csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
        csi.storage.k8s.io/node-stage-secret-namespace: "{{ .Release.Namespace }}"
        # Specify the filesystem type of the volume. If not specified, csi-provisioner
        # will set default as `ext4`. Note that `xfs` is not recommended due to potential deadlock
        # in hyperconverged settings where the volume is mounted on the same node as the osds.
        csi.storage.k8s.io/fstype: ext4



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/rook-ceph/main.tf
---

resource "helm_release" "rook-ceph-cluster" {
  name          = "rook-ceph-cluster"
  namespace     = var.namespace
  repository    = "https://charts.rook.io/release"
  chart         = "rook-ceph-cluster"
  version       = "v1.14.7"
  wait          = true
  wait_for_jobs = true

  values = concat([
    templatefile("${path.module}/cluster-values.yaml.tftpl",
      {
        "storageClassName"    = var.storage_class_name,
        "node_group"          = var.node_group,
        "storage_capacity_Gi" = var.ceph_storage_capacity,
    }),
    jsonencode({
      operatorNamespace = var.operator_namespace,
    })
  ], var.overrides)
}

locals {
  storage-class           = data.kubernetes_storage_class.rook-ceph-fs-delete-sc
  storage-class-base-name = "ceph-filesystem"
}

data "kubernetes_storage_class" "rook-ceph-fs-delete-sc" {
  metadata {
    name = local.storage-class-base-name # TODO: Make sure we get this right
  }
  depends_on = [helm_release.rook-ceph-cluster]
}

resource "kubernetes_storage_class" "ceph-retain-sc" {
  metadata {
    name = "${local.storage-class-base-name}-retain" # "ceph-filesystem-retain"  # TODO: Make sure we get this right
  }
  storage_provisioner    = local.storage-class.storage_provisioner # "rook-ceph.cephfs.csi.ceph.com"
  reclaim_policy         = "Retain"
  volume_binding_mode    = local.storage-class.volume_binding_mode
  allow_volume_expansion = local.storage-class.allow_volume_expansion
  parameters             = local.storage-class.parameters

  depends_on = [data.kubernetes_storage_class.rook-ceph-fs-delete-sc]
}
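
# Sketch of a claim that would bind to the "Retain" class above. The resource
# name, claim name, access mode, and size are hypothetical; nothing in this
# module creates such a claim.
#
#   resource "kubernetes_persistent_volume_claim" "example" {
#     metadata {
#       name      = "example-retained-claim"
#       namespace = var.namespace
#     }
#     spec {
#       access_modes       = ["ReadWriteMany"]
#       storage_class_name = kubernetes_storage_class.ceph-retain-sc.metadata[0].name
#       resources {
#         requests = {
#           storage = "10Gi"
#         }
#       }
#     }
#   }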

# This is necessary on GKE to completely create a ceph cluster
resource "kubernetes_resource_quota" "rook_critical_pods" {
  metadata {
    name      = "rook-critical-pods"
    namespace = var.namespace
    labels = {
      "addonmanager.kubernetes.io/mode" = "Reconcile"
    }
  }

  spec {
    hard = {
      "pods" = "1G"
    }

    scope_selector {
      match_expression {
        operator   = "In"
        scope_name = "PriorityClass"
        values     = ["system-node-critical", "system-cluster-critical"]
      }
    }
  }
  # depends_on = [helm_release.rook-ceph]
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/rook-ceph/operator-values.yaml
---

# https://github.com/rook/rook/blob/v1.14.7/deploy/charts/rook-ceph/values.yaml



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/rook-ceph/variables.tf
---

variable "namespace" {
  description = "Namespace in which to deploy the rook-ceph cluster"
  type        = string
}

variable "operator_namespace" {
  description = "namespace where the rook-ceph operator is deployed"
  type        = string
}


variable "overrides" {
  description = "Rook Ceph helm chart overrides"
  type        = list(string)
  default     = []
}

variable "storage_class_name" {
  description = "Name of the storage class used for the Ceph monitor and OSD volume claims (omitted from the rendered values when null)"
  type        = string
  default     = null
}

variable "node_group" {
  description = "Node key value pair for bound resources"
  type = object({
    key   = string
    value = string
  })
}

variable "ceph_storage_capacity" {
  description = "Ceph storage capacity in Gi"
  type        = number
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/rook-ceph/versions.tf
---

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "2.1.2"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.35.1"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/argo-workflows.tf
---

# ======================= VARIABLES ======================
variable "argo-workflows-overrides" {
  description = "Argo Workflows helm chart overrides"
  type        = list(string)
}

variable "nebari-workflow-controller" {
  description = "Nebari Workflow Controller enabled"
  type        = bool
}


variable "keycloak-read-only-user-credentials" {
  description = "Keycloak password for nebari-bot"
  type        = map(string)
}

variable "workflow-controller-image-tag" {
  description = "Image tag for nebari-workflow-controller"
  type        = string
}


# ====================== RESOURCES =======================
module "argo-workflows" {
  count = var.argo-workflows-enabled ? 1 : 0

  source       = "./modules/kubernetes/services/argo-workflows"
  namespace    = var.environment
  external-url = var.endpoint
  realm_id     = var.realm_id

  node-group                          = var.node_groups.general
  overrides                           = var.argo-workflows-overrides
  keycloak-read-only-user-credentials = var.keycloak-read-only-user-credentials
  workflow-controller-image-tag       = var.workflow-controller-image-tag
  nebari-workflow-controller          = var.nebari-workflow-controller
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/conda-store.tf
---

# ======================= VARIABLES ======================
variable "conda-store-environments" {
  description = "Conda-Store managed environments"
}

variable "conda-store-filesystem-storage" {
  description = "Conda-Store storage in GB for filesystem environments that are built"
  type        = string
}

variable "conda-store-object-storage" {
  description = "Conda-Store storage in GB for object storage. Conda-Store uses minio for object storage to be cloud agnostic. If empty, defaults to the value of var.conda-store-filesystem-storage"
  type        = string
}

variable "conda-store-extra-settings" {
  description = "Conda-Store extra traitlet settings to apply in `c.Class.key = value` form"
  type        = map(any)
}

variable "conda-store-extra-config" {
  description = "Additional traitlets configuration code to be run"
  type        = string
}

variable "conda-store-image" {
  description = "Conda-Store image"
  type        = string
}

variable "conda-store-image-tag" {
  description = "Version of conda-store to use"
  type        = string
}

variable "conda-store-service-token-scopes" {
  description = "Map of services tokens and scopes for conda-store"
  type        = map(any)
}

# ====================== RESOURCES =======================
module "kubernetes-conda-store-server" {
  source = "./modules/kubernetes/services/conda-store"

  name      = "nebari"
  namespace = var.environment

  external-url = var.endpoint
  realm_id     = var.realm_id

  nfs_capacity           = var.conda-store-filesystem-storage
  minio_capacity         = coalesce(var.conda-store-object-storage, var.conda-store-filesystem-storage)
  node-group             = var.node_groups.general
  conda-store-image      = var.conda-store-image
  conda-store-image-tag  = var.conda-store-image-tag
  default-namespace-name = var.conda-store-default-namespace
  environments = {
    for filename, environment in var.conda-store-environments :
    filename => yamlencode(environment)
  }
  services       = var.conda-store-service-token-scopes
  extra-settings = var.conda-store-extra-settings
  extra-config   = var.conda-store-extra-config
  conda-store-fs = var.shared_fs_type

  depends_on = [
    module.rook-ceph
  ]
}

moved {
  from = module.conda-store-nfs-mount
  to   = module.kubernetes-conda-store-server.module.conda-store-nfs-mount[0]
}


locals {
  conda-store-fs = var.shared_fs_type
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/dask_gateway.tf
---

# ===================== VARIABLES ====================
variable "dask-worker-image" {
  description = "Dask worker image"
  type = object({
    name = string
    tag  = string
  })
}

variable "dask-gateway-profiles" {
  description = "Dask Gateway profiles to expose to user"
}

# =================== RESOURCES =====================
module "dask-gateway" {
  source = "./modules/kubernetes/services/dask-gateway"

  namespace            = var.environment
  jupyterhub_api_token = module.jupyterhub.services.dask-gateway.api_token
  jupyterhub_api_url   = "${module.jupyterhub.internal_jupyterhub_url}/hub/api"

  external-url = var.endpoint

  cluster-image = var.dask-worker-image

  general-node-group = var.node_groups.general
  worker-node-group  = var.node_groups.worker

  # needs to match name in module.jupyterhub.extra-mounts
  dask-etc-configmap-name = "dask-etc"

  # environments
  conda-store-pvc               = module.kubernetes-conda-store-server.pvc
  conda-store-mount             = "/home/conda"
  default-conda-store-namespace = var.conda-store-default-namespace
  conda-store-api-token         = module.kubernetes-conda-store-server.service-tokens.dask-gateway
  conda-store-service-name      = module.kubernetes-conda-store-server.service_name

  # profiles
  profiles = var.dask-gateway-profiles

  cloud-provider = var.cloud-provider

  forwardauth_middleware_name = var.forwardauth_middleware_name

  depends_on = [
    module.kubernetes-nfs-server,
    module.rook-ceph
  ]
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/forward-auth.tf
---

module "forwardauth" {
  source = "./modules/kubernetes/forwardauth"

  namespace    = var.environment
  external-url = var.endpoint
  realm_id     = var.realm_id

  node-group                  = var.node_groups.general
  forwardauth_middleware_name = var.forwardauth_middleware_name
  cert_secret_name            = var.cert_secret_name
}

variable "forwardauth_middleware_name" {
  description = "Name of the traefik forward auth middleware"
  type        = string
}

variable "cert_secret_name" {
  description = "Name of the secret containing the certificate"
  type        = string
}

output "forward-auth-middleware" {
  description = "middleware name for use with forward auth"
  value       = module.forwardauth.forward-auth-middleware
}

output "forward-auth-service" {
  description = "service name for use with forward auth"
  value       = module.forwardauth.forward-auth-service
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/jupyterhub_ssh.tf
---

module "kubernetes-jupyterhub-ssh" {
  source = "./modules/kubernetes/services/jupyterhub-ssh"

  namespace          = var.environment
  jupyterhub_api_url = module.jupyterhub.internal_jupyterhub_url

  node-group              = var.node_groups.general
  persistent_volume_claim = local.jupyterhub-pvc

  depends_on = [
    module.kubernetes-nfs-server,
    module.rook-ceph
  ]
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/jupyterhub.tf
---

variable "jupyterhub-theme" {
  description = "JupyterHub theme"
  type        = map(any)
}

variable "jupyterhub-image" {
  description = "Jupyterhub user image"
  type = object({
    name = string
    tag  = string
  })
}

variable "jupyterhub-overrides" {
  description = "Jupyterhub helm chart overrides"
  type        = list(string)
  default     = []
}

variable "jupyterhub-shared-storage" {
  description = "JupyterHub shared storage size [GB]"
  type        = number
}

variable "jupyterhub-shared-endpoint" {
  description = "JupyterHub shared storage nfs endpoint"
  type        = string
}

variable "jupyterlab-image" {
  description = "Jupyterlab user image"
  type = object({
    name = string
    tag  = string
  })
}

variable "jupyterlab-profiles" {
  description = "JupyterHub profiles to expose to user"
}

variable "jupyterlab-preferred-dir" {
  description = "Directory in which JupyterLab should open the file browser"
  type        = string
}

variable "initial-repositories" {
  description = "Map of folder location and git repo url to clone"
  type        = string
}

variable "jupyterlab-default-settings" {
  description = "Default settings for JupyterLab to be placed in overrides.json"
  type        = map(any)
}

variable "jupyterlab-gallery-settings" {
  description = "Server-side settings for jupyterlab-gallery extension"
  type = object({
    title                         = optional(string)
    destination                   = optional(string)
    hide_gallery_without_exhibits = optional(bool)
    exhibits = list(object({
      git         = string
      title       = string
      homepage    = optional(string)
      description = optional(string)
      icon        = optional(string)
      account     = optional(string)
      token       = optional(string)
      branch      = optional(string)
      depth       = optional(number)
    }))
  })
}

variable "jupyterhub-hub-extraEnv" {
  description = "Extracted overrides to merge with jupyterhub.hub.extraEnv"
  type        = string
  default     = "[]"
}

variable "idle-culler-settings" {
  description = "Idle culler timeout settings (in minutes)"
  type        = any
}

variable "shared_fs_type" {
  type        = string
  description = "Use NFS or Ceph"

  validation {
    condition     = contains(["cephfs", "nfs"], var.shared_fs_type)
    error_message = "Allowed values for shared_fs_type are \"cephfs\" or \"nfs\"."
  }

}

locals {
  jupyterhub-fs       = var.shared_fs_type
  jupyterhub-pvc-name = "jupyterhub-${var.environment}-share"
  jupyterhub-pvc      = local.jupyterhub-fs == "nfs" ? module.jupyterhub-nfs-mount[0].persistent_volume_claim.pvc : module.jupyterhub-cephfs-mount[0].persistent_volume_claim.pvc
  enable-nfs-server   = var.jupyterhub-shared-endpoint == null && (local.jupyterhub-fs == "nfs" || local.conda-store-fs == "nfs")
}
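
# Summary of how the locals above resolve (derived from the expressions in this
# file; shown for illustration only):
#
#   shared_fs_type  jupyterhub-shared-endpoint  module.kubernetes-nfs-server  jupyterhub-pvc taken from
#   "nfs"           null                        created (count = 1)           module.jupyterhub-nfs-mount[0]
#   "nfs"           set                         not created (count = 0)       module.jupyterhub-nfs-mount[0]
#   "cephfs"        null or set                 not created (count = 0)       module.jupyterhub-cephfs-mount[0]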



module "kubernetes-nfs-server" {
  count = local.enable-nfs-server ? 1 : 0

  source = "./modules/kubernetes/nfs-server"

  name         = "nfs-server"
  namespace    = var.environment
  nfs_capacity = var.jupyterhub-shared-storage
  node-group   = var.node_groups.general
}

moved {
  from = module.jupyterhub-nfs-mount
  to   = module.jupyterhub-nfs-mount[0]
}

module "jupyterhub-nfs-mount" {
  count  = local.jupyterhub-fs == "nfs" ? 1 : 0
  source = "./modules/kubernetes/nfs-mount"

  name         = "jupyterhub"
  namespace    = var.environment
  nfs_capacity = var.jupyterhub-shared-storage
  nfs_endpoint = var.jupyterhub-shared-endpoint == null ? module.kubernetes-nfs-server[0].endpoint_ip : var.jupyterhub-shared-endpoint
  nfs-pvc-name = local.jupyterhub-pvc-name

  depends_on = [
    module.kubernetes-nfs-server,
    module.rook-ceph
  ]
}

module "jupyterhub-cephfs-mount" {
  count  = local.jupyterhub-fs == "cephfs" ? 1 : 0
  source = "./modules/kubernetes/cephfs-mount"

  name          = "jupyterhub"
  namespace     = var.environment
  fs_capacity   = var.jupyterhub-shared-storage
  ceph-pvc-name = local.jupyterhub-pvc-name

  depends_on = [
    module.kubernetes-nfs-server,
    module.rook-ceph
  ]
}



module "jupyterhub" {
  source = "./modules/kubernetes/services/jupyterhub"

  name      = var.name
  namespace = var.environment

  cloud-provider = var.cloud-provider

  external-url = var.endpoint
  realm_id     = var.realm_id

  overrides = var.jupyterhub-overrides

  home-pvc = local.jupyterhub-pvc

  shared-pvc = local.jupyterhub-pvc

  conda-store-pvc                                    = module.kubernetes-conda-store-server.pvc.name
  conda-store-mount                                  = "/home/conda"
  conda-store-environments                           = var.conda-store-environments
  default-conda-store-namespace                      = var.conda-store-default-namespace
  argo-workflows-enabled                             = var.argo-workflows-enabled
  conda-store-argo-workflows-jupyter-scheduler-token = module.kubernetes-conda-store-server.service-tokens.argo-workflows-jupyter-scheduler
  conda-store-service-name                           = module.kubernetes-conda-store-server.service_name
  conda-store-jhub-apps-token                        = module.kubernetes-conda-store-server.service-tokens.jhub-apps
  jhub-apps-enabled                                  = var.jhub-apps-enabled
  jhub-apps-overrides                                = var.jhub-apps-overrides

  extra-mounts = {
    "/etc/dask" = {
      name      = "dask-etc"
      namespace = var.environment
      kind      = "configmap"
    },
  }

  services = concat([
    "dask-gateway"
    ],
    (var.monitoring-enabled ? ["monitoring"] : []),
  )

  general-node-group = var.node_groups.general
  user-node-group    = var.node_groups.user

  jupyterhub-image = var.jupyterhub-image
  jupyterlab-image = var.jupyterlab-image

  theme    = var.jupyterhub-theme
  profiles = var.jupyterlab-profiles

  jupyterhub-logout-redirect-url = var.jupyterhub-logout-redirect-url
  jupyterhub-hub-extraEnv        = var.jupyterhub-hub-extraEnv

  idle-culler-settings = var.idle-culler-settings
  initial-repositories = var.initial-repositories

  jupyterlab-default-settings = var.jupyterlab-default-settings

  jupyterlab-gallery-settings = var.jupyterlab-gallery-settings

  jupyterlab-pioneer-enabled    = var.jupyterlab-pioneer-enabled
  jupyterlab-pioneer-log-format = var.jupyterlab-pioneer-log-format

  jupyterlab-preferred-dir = var.jupyterlab-preferred-dir

  depends_on = [
    module.kubernetes-nfs-server,
    module.rook-ceph,
  ]
}
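
The two mount modules above are mutually exclusive: `count` instantiates the NFS mount when `local.jupyterhub-fs` is "nfs" and the CephFS mount when it is "cephfs", and `nfs_endpoint` falls back to the in-cluster NFS server when `jupyterhub-shared-endpoint` is null. The following is a minimal Python sketch of that selection, for illustration only. The function `select_mount`, its return shape, and the example IP addresses are hypothetical and not part of Nebari.

```python
from typing import Dict, Optional


def select_mount(
    fs_type: str,
    shared_endpoint: Optional[str],
    in_cluster_nfs_ip: Optional[str],
) -> Dict[str, Optional[str]]:
    """Mirror the count/conditional logic of the two mount modules."""
    if fs_type == "nfs":
        # nfs_endpoint: prefer the externally provided endpoint, otherwise
        # use the endpoint_ip of the in-cluster NFS server.
        endpoint = (
            shared_endpoint if shared_endpoint is not None else in_cluster_nfs_ip
        )
        return {"module": "jupyterhub-nfs-mount", "nfs_endpoint": endpoint}
    if fs_type == "cephfs":
        # The CephFS mount takes no endpoint, only a capacity and a PVC name.
        return {"module": "jupyterhub-cephfs-mount", "nfs_endpoint": None}
    # Any other value leaves both counts at 0, so neither module is created.
    return {"module": None, "nfs_endpoint": None}


# Hypothetical inputs: an external endpoint, an in-cluster server, and CephFS.
external = select_mount("nfs", "10.0.0.5", None)
in_cluster = select_mount("nfs", None, "10.96.0.20")
ceph = select_mount("cephfs", None, None)
```

Note that the Python stage maps `efs` to `nfs` before these variables reach Terraform, so an EFS deployment would follow the first branch with an externally provided endpoint.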



---
File: nebari/src/_nebari/stages/kubernetes_services/template/locals.tf
---

locals {
  additional_tags = {
    Project     = var.name
    Owner       = "terraform"
    Environment = var.environment
  }

  cluster_name = "${var.name}-${var.environment}"
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/monitoring.tf
---

variable "monitoring-enabled" {
  description = "Prometheus and Grafana monitoring enabled"
  type        = bool
}

module "monitoring" {
  count = var.monitoring-enabled ? 1 : 0

  source               = "./modules/kubernetes/services/monitoring"
  namespace            = var.environment
  external-url         = var.endpoint
  realm_id             = var.realm_id
  jupyterhub_api_token = module.jupyterhub.services.monitoring.api_token

  node-group = var.node_groups.general
}

module "grafana-loki" {
  count                        = var.monitoring-enabled ? 1 : 0
  source                       = "./modules/kubernetes/services/monitoring/loki"
  namespace                    = var.environment
  grafana-loki-overrides       = var.grafana-loki-overrides
  grafana-promtail-overrides   = var.grafana-promtail-overrides
  grafana-loki-minio-overrides = var.grafana-loki-minio-overrides
  node-group                   = var.node_groups.general
  minio-enabled                = var.minio-enabled
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/outputs.tf
---

output "service_urls" {
  description = "service urls for configured services"
  value = {
    argo-workflows = {
      url        = var.argo-workflows-enabled ? "https://${var.endpoint}/argo/" : null
      health_url = var.argo-workflows-enabled ? "https://${var.endpoint}/argo/" : null
    }
    conda_store = {
      url        = "https://${var.endpoint}/conda-store/"
      health_url = "https://${var.endpoint}/conda-store/api/v1/"
    }
    dask_gateway = {
      url        = "https://${var.endpoint}/gateway/"
      health_url = "https://${var.endpoint}/gateway/api/version"
    }
    jupyterhub = {
      url        = "https://${var.endpoint}/"
      health_url = "https://${var.endpoint}/hub/api/"
    }
    keycloak = {
      url        = "https://${var.endpoint}/auth/"
      health_url = "https://${var.endpoint}/auth/realms/master"
    }
    monitoring = {
      url        = var.monitoring-enabled ? "https://${var.endpoint}/monitoring/" : null
      health_url = var.monitoring-enabled ? "https://${var.endpoint}/monitoring/api/health" : null
    }
  }
}
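
The `service_urls` output sets both URLs to null for services that are disabled (`argo-workflows` and `monitoring`), so a consumer has to skip those entries before probing anything. The following is a minimal Python sketch of that filtering, assuming the output has already been decoded from JSON into a dict. The function `collect_health_urls` and the domain `nebari.example.com` are hypothetical and used only for illustration.

```python
from typing import Dict, Optional


def collect_health_urls(
    service_urls: Dict[str, Dict[str, Optional[str]]],
) -> Dict[str, str]:
    """Return {service: health_url} for services that expose a health URL.

    Disabled services carry null URLs in the Terraform output, which
    arrive as None after JSON decoding and are skipped here.
    """
    return {
        name: urls["health_url"]
        for name, urls in service_urls.items()
        if urls.get("health_url") is not None
    }


# Hypothetical decoded output for an endpoint with monitoring disabled.
example = {
    "jupyterhub": {
        "url": "https://nebari.example.com/",
        "health_url": "https://nebari.example.com/hub/api/",
    },
    "monitoring": {"url": None, "health_url": None},
}

health_urls = collect_health_urls(example)
```

Filtering on `health_url` rather than `url` keeps the sketch correct even if a service were to expose a landing page without a dedicated health endpoint.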



---
File: nebari/src/_nebari/stages/kubernetes_services/template/providers.tf
---

provider "keycloak" {
  tls_insecure_skip_verify = true
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/rook-ceph.tf
---

# ======================= VARIABLES ======================
variable "rook_ceph_storage_class_name" {
  description = "Name of the storage class to create"
  type        = string
}

locals {
  enable-ceph-cluster = local.jupyterhub-fs == "cephfs" || local.conda-store-fs == "cephfs"
}
# ====================== RESOURCES =======================
module "rook-ceph" {
  count              = local.enable-ceph-cluster ? 1 : 0
  source             = "./modules/kubernetes/services/rook-ceph"
  namespace          = var.environment
  operator_namespace = var.environment

  storage_class_name    = var.rook_ceph_storage_class_name
  node_group            = var.node_groups.general
  ceph_storage_capacity = var.jupyterhub-shared-storage + var.conda-store-filesystem-storage

  depends_on = [helm_release.rook-ceph]
}

resource "helm_release" "rook-ceph" {
  count = local.enable-ceph-cluster ? 1 : 0

  name       = "rook-ceph"
  namespace  = var.environment
  repository = "https://charts.rook.io/release"
  chart      = "rook-ceph"
  version    = "v1.14.7"

  values = concat([
    file("./modules/kubernetes/services/rook-ceph/operator-values.yaml"),
    jsonencode({
      nodeSelector = {
        "${var.node_groups.general.key}" = var.node_groups.general.value
      },
      monitoring = {
        enabled = false # TODO: Enable monitoring when nebari-config.yaml has it enabled
      },
      csi = {
        enableRbdDriver = false, # RBD driver is only needed to provision block storage; disabling it saves some CPU and memory
      },
    })
    ],
    # var.overrides  # TODO: Add overrides for Rook-Ceph Operator
  )
}
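
The Ceph cluster and its operator release are created only when at least one of the two shared filesystems uses CephFS, and the requested capacity is the sum of both storage sizes. The following is a minimal Python sketch of that decision, for illustration only. The function `plan_ceph_cluster` is hypothetical, and the sizes are assumed to be plain numbers in the same unit, as the Terraform expression adds them directly.

```python
from typing import Optional


def plan_ceph_cluster(
    jupyterhub_fs: str,
    conda_store_fs: str,
    jupyterhub_shared_storage: float,
    conda_store_filesystem_storage: float,
) -> Optional[float]:
    """Return the Ceph capacity to request, or None when no cluster is created."""
    # Mirrors local.enable-ceph-cluster: true if either filesystem is cephfs.
    enable_ceph_cluster = "cephfs" in (jupyterhub_fs, conda_store_fs)
    if not enable_ceph_cluster:
        # count = 0: neither the module nor the helm release is created.
        return None
    # Mirrors ceph_storage_capacity: both sizes are always added together.
    return jupyterhub_shared_storage + conda_store_filesystem_storage


no_cluster = plan_ceph_cluster("nfs", "nfs", 200.0, 200.0)
mixed = plan_ceph_cluster("cephfs", "nfs", 200.0, 200.0)
```

As the sketch makes visible, the capacity includes both sizes even when only one of the filesystems is actually backed by CephFS.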



---
File: nebari/src/_nebari/stages/kubernetes_services/template/variables.tf
---

# Variables that are shared between multiple kubernetes services

variable "name" {
  description = "Prefix name to assign to kubernetes resources"
  type        = string
}

variable "environment" {
  description = "Kubernetes namespace to create resources within"
  type        = string
}

variable "endpoint" {
  description = "Endpoint for services"
  type        = string
}

variable "realm_id" {
  description = "Keycloak realm id for creating clients"
  type        = string
}

variable "node_groups" {
  description = "Node group selectors for kubernetes resources"
  type = map(object({
    key   = string
    value = string
  }))
}

variable "jupyterhub-logout-redirect-url" {
  description = "Next redirect destination following a Keycloak logout"
  type        = string
  default     = ""
}

variable "conda-store-default-namespace" {
  description = "Default conda-store namespace name"
  type        = string
}

variable "argo-workflows-enabled" {
  description = "Enable Argo Workflows"
  type        = bool
}

variable "jupyterlab-pioneer-enabled" {
  description = "Enable JupyterLab Pioneer for telemetry"
  type        = bool
}

variable "jupyterlab-pioneer-log-format" {
  description = "Logging format for JupyterLab Pioneer"
  type        = string
}

variable "jhub-apps-enabled" {
  description = "Enable JupyterHub Apps"
  type        = bool
}

variable "jhub-apps-overrides" {
  description = "jhub-apps configuration overrides"
  type        = string
  default     = "{}"
}

variable "cloud-provider" {
  description = "Name of cloud provider."
  type        = string
}

variable "grafana-loki-overrides" {
  description = "Helm chart overrides for loki"
  type        = list(string)
  default     = []
}

variable "grafana-promtail-overrides" {
  description = "Helm chart overrides for promtail"
  type        = list(string)
  default     = []
}

variable "grafana-loki-minio-overrides" {
  description = "Grafana Loki minio helm chart overrides"
  type        = list(string)
  default     = []
}

variable "minio-enabled" {
  description = "Whether to deploy MinIO alongside Loki"
  type        = bool
  default     = true
}



---
File: nebari/src/_nebari/stages/kubernetes_services/template/versions.tf
---

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "2.1.2"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.35.1"
    }
    keycloak = {
      source  = "mrparkers/keycloak"
      version = "3.7.0"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/kubernetes_services/__init__.py
---

import enum
import json
import sys
import time
from typing import Any, Dict, List, Optional, Type, Union
from urllib.parse import urlencode

from pydantic import ConfigDict, Field, field_validator, model_validator
from typing_extensions import Self

from _nebari import constants
from _nebari.stages.base import NebariTerraformStage
from _nebari.stages.tf_objects import (
    NebariHelmProvider,
    NebariKubernetesProvider,
    NebariTerraformState,
)
from _nebari.utils import (
    byte_unit_conversion,
    set_docker_image_tag,
    set_nebari_dask_version,
)
from _nebari.version import __version__
from nebari import schema
from nebari.hookspecs import NebariStage, hookimpl

# check and retry settings
NUM_ATTEMPTS = 10
TIMEOUT = 10  # seconds


_forwardauth_middleware_name = "traefik-forward-auth"


@schema.yaml_object(schema.yaml)
class AccessEnum(str, enum.Enum):
    all = "all"
    yaml = "yaml"
    keycloak = "keycloak"

    @classmethod
    def to_yaml(cls, representer, node):
        return representer.represent_str(node.value)


@schema.yaml_object(schema.yaml)
class SharedFsEnum(str, enum.Enum):
    nfs = "nfs"
    cephfs = "cephfs"
    efs = "efs"

    @classmethod
    def to_yaml(cls, representer, node):
        return representer.represent_str(node.value)


class DefaultImages(schema.Base):
    jupyterhub: str = f"quay.io/nebari/nebari-jupyterhub:{set_docker_image_tag()}"
    jupyterlab: str = f"quay.io/nebari/nebari-jupyterlab:{set_docker_image_tag()}"
    dask_worker: str = f"quay.io/nebari/nebari-dask-worker:{set_docker_image_tag()}"


class Storage(schema.Base):
    type: SharedFsEnum = Field(
        default=None,
        json_schema_extra={"immutable": True},
    )
    conda_store: str = "200Gi"
    shared_filesystem: str = "200Gi"


class JupyterHubTheme(schema.Base):
    hub_title: str = "Nebari"
    hub_subtitle: str = "Your open source data science platform"
    welcome: str = (
        """Welcome! Learn about Nebari's features and configurations in <a href="https://www.nebari.dev/docs">the documentation</a>. If you have any questions or feedback, reach the team on <a href="https://www.nebari.dev/docs/community#getting-support">Nebari's support forums</a>."""
    )
    logo: str = (
        "https://raw.githubusercontent.com/nebari-dev/nebari-design/main/logo-mark/horizontal/Nebari-Logo-Horizontal-Lockup-White-text.svg"
    )
    favicon: str = (
        "https://raw.githubusercontent.com/nebari-dev/nebari-design/main/symbol/favicon.ico"
    )
    primary_color: str = "#4f4173"
    primary_color_dark: str = "#4f4173"
    secondary_color: str = "#957da6"
    secondary_color_dark: str = "#957da6"
    accent_color: str = "#32C574"
    accent_color_dark: str = "#32C574"
    text_color: str = "#111111"
    h1_color: str = "#652e8e"
    h2_color: str = "#652e8e"
    version: str = f"v{__version__}"
    navbar_color: str = "#1c1d26"
    navbar_text_color: str = "#f1f1f6"
    navbar_hover_color: str = "#db96f3"
    display_version: str = "True"  # theme limitation: every value must be a str


class Theme(schema.Base):
    jupyterhub: JupyterHubTheme = JupyterHubTheme()


class KubeSpawner(schema.Base):
    cpu_limit: float
    cpu_guarantee: float
    mem_limit: str
    mem_guarantee: str
    model_config = ConfigDict(extra="allow")


class JupyterLabProfile(schema.Base):
    access: AccessEnum = AccessEnum.all
    display_name: str
    description: str
    default: bool = False
    users: Optional[List[str]] = None
    groups: Optional[List[str]] = None
    kubespawner_override: Optional[KubeSpawner] = None

    @model_validator(mode="after")
    def only_yaml_can_have_groups_and_users(self):
        if self.access != AccessEnum.yaml:
            if self.users is not None or self.groups is not None:
                raise ValueError(
                    "Profile must not contain groups or users fields unless access = yaml"
                )
        return self


class DaskWorkerProfile(schema.Base):
    worker_cores_limit: float
    worker_cores: float
    worker_memory_limit: str
    worker_memory: str
    worker_threads: int = 1
    model_config = ConfigDict(extra="allow")


class Profiles(schema.Base):
    jupyterlab: List[JupyterLabProfile] = [
        JupyterLabProfile(
            display_name="Small Instance",
            description="Stable environment with 2 cpu / 8 GB ram",
            default=True,
            kubespawner_override=KubeSpawner(
                cpu_limit=2,
                cpu_guarantee=1.5,
                mem_limit="8G",
                mem_guarantee="5G",
            ),
        ),
        JupyterLabProfile(
            display_name="Medium Instance",
            description="Stable environment with 4 cpu / 16 GB ram",
            kubespawner_override=KubeSpawner(
                cpu_limit=4,
                cpu_guarantee=3,
                mem_limit="16G",
                mem_guarantee="10G",
            ),
        ),
    ]
    dask_worker: Dict[str, DaskWorkerProfile] = {
        "Small Worker": DaskWorkerProfile(
            worker_cores_limit=2,
            worker_cores=1.5,
            worker_memory_limit="8G",
            worker_memory="5G",
            worker_threads=2,
        ),
        "Medium Worker": DaskWorkerProfile(
            worker_cores_limit=4,
            worker_cores=3,
            worker_memory_limit="16G",
            worker_memory="10G",
            worker_threads=4,
        ),
    }

    @field_validator("jupyterlab")
    @classmethod
    def check_default(cls, value):
        """Check that at most one profile is marked as default."""
        # Validated items are JupyterLabProfile models, so read the attribute
        # directly instead of treating each profile as a dict.
        default = [profile.default for profile in value]
        if default.count(True) > 1:
            raise TypeError(
                "Multiple default JupyterLab profiles may cause unexpected problems."
            )
        return value


class CondaEnvironment(schema.Base):
    name: str
    channels: Optional[List[str]] = None
    dependencies: List[Union[str, Dict[str, List[str]]]]


class CondaStore(schema.Base):
    extra_settings: Dict[str, Any] = {}
    extra_config: str = ""
    image: str = "quansight/conda-store-server"
    image_tag: str = constants.DEFAULT_CONDA_STORE_IMAGE_TAG
    default_namespace: str = "nebari-git"
    object_storage: str = "200Gi"


class NebariWorkflowController(schema.Base):
    enabled: bool = True
    image_tag: str = constants.DEFAULT_NEBARI_WORKFLOW_CONTROLLER_IMAGE_TAG


class ArgoWorkflows(schema.Base):
    enabled: bool = True
    overrides: Dict = {}
    nebari_workflow_controller: NebariWorkflowController = NebariWorkflowController()


class JHubApps(schema.Base):
    enabled: bool = False
    overrides: Dict = {}


class MonitoringOverrides(schema.Base):
    loki: Dict = {}
    promtail: Dict = {}
    minio: Dict = {}


class Healthchecks(schema.Base):
    enabled: bool = False
    kuberhealthy_helm_version: str = constants.KUBERHEALTHY_HELM_VERSION


class Monitoring(schema.Base):
    enabled: bool = True
    overrides: MonitoringOverrides = MonitoringOverrides()
    minio_enabled: bool = True
    healthchecks: Healthchecks = Healthchecks()


class JupyterLabPioneer(schema.Base):
    enabled: bool = False
    log_format: Optional[str] = None


class Telemetry(schema.Base):
    jupyterlab_pioneer: JupyterLabPioneer = JupyterLabPioneer()


class JupyterHub(schema.Base):
    overrides: Dict = {}


class IdleCuller(schema.Base):
    terminal_cull_inactive_timeout: int = 15
    terminal_cull_interval: int = 5
    kernel_cull_idle_timeout: int = 15
    kernel_cull_interval: int = 5
    kernel_cull_connected: bool = True
    kernel_cull_busy: bool = False
    server_shutdown_no_activity_timeout: int = 15


class JupyterLabGalleryExhibit(schema.Base):
    git: str
    title: str
    homepage: Optional[str] = None
    description: Optional[str] = None
    icon: Optional[str] = None
    account: Optional[str] = None
    token: Optional[str] = None
    branch: Optional[str] = None
    depth: Optional[int] = None


class JupyterLabGallerySettings(schema.Base):
    title: str = "Examples"
    destination: str = "examples"
    exhibits: List[JupyterLabGalleryExhibit] = []
    hide_gallery_without_exhibits: bool = True


class JupyterLab(schema.Base):
    default_settings: Dict[str, Any] = {}
    gallery_settings: JupyterLabGallerySettings = JupyterLabGallerySettings()
    idle_culler: IdleCuller = IdleCuller()
    initial_repositories: List[Dict[str, str]] = []
    preferred_dir: Optional[str] = None


class RookCeph(schema.Base):
    storage_class_name: None | str = None


class InputSchema(schema.Base):
    default_images: DefaultImages = DefaultImages()
    storage: Storage = Storage()
    theme: Theme = Theme()
    profiles: Profiles = Profiles()
    environments: Dict[str, CondaEnvironment] = {
        "environment-dask.yaml": CondaEnvironment(
            name="dask",
            channels=["conda-forge"],
            dependencies=[
                "python==3.11.6",
                "ipykernel==6.26.0",
                "ipywidgets==8.1.1",
                f"nebari-dask=={set_nebari_dask_version()}",
                "python-graphviz==0.20.1",
                "pyarrow==14.0.1",
                "s3fs==2023.10.0",
                "gcsfs==2023.10.0",
                "numpy=1.26.0",
                "numba=0.58.1",
                "pandas=2.1.3",
                "xarray==2023.10.1",
            ],
        ),
        "environment-dashboard.yaml": CondaEnvironment(
            name="dashboard",
            channels=["conda-forge"],
            dependencies=[
                "python==3.11.6",
                "cufflinks-py==0.17.3",
                "dash==2.14.1",
                "geopandas==0.14.1",
                "geopy==2.4.0",
                "geoviews==1.11.0",
                "gunicorn==21.2.0",
                "holoviews==1.18.1",
                "ipykernel==6.26.0",
                "ipywidgets==8.1.1",
                "jupyter==1.0.0",
                "jupyter_bokeh==3.0.7",
                "matplotlib==3.8.1",
                f"nebari-dask=={set_nebari_dask_version()}",
                "nodejs=20.8.1",
                "numpy==1.26.0",
                "openpyxl==3.1.2",
                "pandas==2.1.3",
                "panel==1.3.1",
                "param==2.0.1",
                "plotly==5.18.0",
                "python-graphviz==0.20.1",
                "rich==13.6.0",
                "streamlit==1.28.1",
                "sympy==1.12",
                "voila==0.5.5",
                "xarray==2023.10.1",
                "pip==23.3.1",
                {
                    "pip": [
                        "streamlit-image-comparison==0.0.4",
                        "noaa-coops==0.1.9",
                        "dash_core_components==2.0.0",
                        "dash_html_components==2.0.0",
                    ],
                },
            ],
        ),
    }
    conda_store: CondaStore = CondaStore()
    argo_workflows: ArgoWorkflows = ArgoWorkflows()
    monitoring: Monitoring = Monitoring()
    telemetry: Telemetry = Telemetry()
    jupyterhub: JupyterHub = JupyterHub()
    jupyterlab: JupyterLab = JupyterLab()
    jhub_apps: JHubApps = JHubApps()
    ceph: RookCeph = RookCeph()

    def _set_storage_type_default_value(self):
        if self.storage.type is None:
            if self.provider == schema.ProviderEnum.aws:
                self.storage.type = SharedFsEnum.efs
            else:
                self.storage.type = SharedFsEnum.nfs

    @model_validator(mode="after")
    def custom_validation(self) -> Self:
        self._set_storage_type_default_value()

        if (
            self.storage.type == SharedFsEnum.cephfs
            and self.provider == schema.ProviderEnum.local
        ):
            raise ValueError(
                f'storage.type: "{self.storage.type.value}" is not supported for provider: "{self.provider.value}"'
            )

        if (
            self.storage.type == SharedFsEnum.efs
            and self.provider != schema.ProviderEnum.aws
        ):
            raise ValueError(
                f'storage.type: "{self.storage.type.value}" is only supported for provider: "{schema.ProviderEnum.aws.value}"'
            )
        return self


class OutputSchema(schema.Base):
    pass


# variables shared by multiple services
class KubernetesServicesInputVars(schema.Base):
    name: str
    environment: str
    endpoint: str
    realm_id: str
    node_groups: Dict[str, Dict[str, str]]
    jupyterhub_logout_redirect_url: str = Field(alias="jupyterhub-logout-redirect-url")
    forwardauth_middleware_name: str = _forwardauth_middleware_name
    cert_secret_name: Optional[str] = None


def _split_docker_image_name(image_name):
    # split on the last colon so registries with a port (host:5000/image:tag) still parse
    name, tag = image_name.rsplit(":", 1)
    return {"name": name, "tag": tag}


class ImageNameTag(schema.Base):
    name: str
    tag: str


class RookCephInputVars(schema.Base):
    rook_ceph_storage_class_name: None | str = None


class CondaStoreInputVars(schema.Base):
    conda_store_environments: Dict[str, CondaEnvironment] = Field(
        alias="conda-store-environments"
    )
    conda_store_default_namespace: str = Field(alias="conda-store-default-namespace")
    conda_store_filesystem_storage: float = Field(
        alias="conda-store-filesystem-storage"
    )
    conda_store_object_storage: str = Field(alias="conda-store-object-storage")
    conda_store_extra_settings: Dict[str, Any] = Field(
        alias="conda-store-extra-settings"
    )
    conda_store_extra_config: str = Field(alias="conda-store-extra-config")
    conda_store_image: str = Field(alias="conda-store-image")
    conda_store_image_tag: str = Field(alias="conda-store-image-tag")
    conda_store_service_token_scopes: Dict[str, Dict[str, Any]] = Field(
        alias="conda-store-service-token-scopes"
    )

    @field_validator("conda_store_filesystem_storage", mode="before")
    @classmethod
    def handle_units(cls, value: Optional[str]) -> float:
        return byte_unit_conversion(value, "GiB")


class JupyterhubInputVars(schema.Base):
    jupyterhub_theme: Dict[str, Any] = Field(alias="jupyterhub-theme")
    jupyterlab_image: ImageNameTag = Field(alias="jupyterlab-image")
    jupyterlab_default_settings: Dict[str, Any] = Field(
        alias="jupyterlab-default-settings"
    )
    jupyterlab_gallery_settings: JupyterLabGallerySettings = Field(
        alias="jupyterlab-gallery-settings"
    )
    initial_repositories: str = Field(alias="initial-repositories")
    jupyterhub_overrides: List[str] = Field(alias="jupyterhub-overrides")
    jupyterhub_shared_storage: float = Field(alias="jupyterhub-shared-storage")
    jupyterhub_shared_endpoint: Optional[str] = Field(
        alias="jupyterhub-shared-endpoint", default=None
    )
    jupyterhub_profiles: List[JupyterLabProfile] = Field(alias="jupyterlab-profiles")
    jupyterhub_image: ImageNameTag = Field(alias="jupyterhub-image")
    jupyterhub_hub_extraEnv: str = Field(alias="jupyterhub-hub-extraEnv")
    idle_culler_settings: Dict[str, Any] = Field(alias="idle-culler-settings")
    argo_workflows_enabled: bool = Field(alias="argo-workflows-enabled")
    jhub_apps_enabled: bool = Field(alias="jhub-apps-enabled")
    jhub_apps_overrides: str = Field(alias="jhub-apps-overrides")
    cloud_provider: str = Field(alias="cloud-provider")
    jupyterlab_preferred_dir: Optional[str] = Field(alias="jupyterlab-preferred-dir")
    shared_fs_type: SharedFsEnum

    @field_validator("jupyterhub_shared_storage", mode="before")
    @classmethod
    def handle_units(cls, value: Optional[str]) -> float:
        return byte_unit_conversion(value, "GiB")


class DaskGatewayInputVars(schema.Base):
    dask_worker_image: ImageNameTag = Field(alias="dask-worker-image")
    dask_gateway_profiles: Dict[str, Any] = Field(alias="dask-gateway-profiles")
    cloud_provider: str = Field(alias="cloud-provider")
    forwardauth_middleware_name: str = _forwardauth_middleware_name


class MonitoringInputVars(schema.Base):
    monitoring_enabled: bool = Field(alias="monitoring-enabled")
    minio_enabled: bool = Field(alias="minio-enabled")
    grafana_loki_overrides: List[str] = Field(alias="grafana-loki-overrides")
    grafana_promtail_overrides: List[str] = Field(alias="grafana-promtail-overrides")
    grafana_loki_minio_overrides: List[str] = Field(
        alias="grafana-loki-minio-overrides"
    )


class TelemetryInputVars(schema.Base):
    jupyterlab_pioneer_enabled: bool = Field(alias="jupyterlab-pioneer-enabled")
    jupyterlab_pioneer_log_format: Optional[str] = Field(
        alias="jupyterlab-pioneer-log-format"
    )


class ArgoWorkflowsInputVars(schema.Base):
    argo_workflows_enabled: bool = Field(alias="argo-workflows-enabled")
    argo_workflows_overrides: List[str] = Field(alias="argo-workflows-overrides")
    nebari_workflow_controller: bool = Field(alias="nebari-workflow-controller")
    workflow_controller_image_tag: str = Field(alias="workflow-controller-image-tag")
    keycloak_read_only_user_credentials: Dict[str, Any] = Field(
        alias="keycloak-read-only-user-credentials"
    )


class KubernetesServicesStage(NebariTerraformStage):
    name = "07-kubernetes-services"
    priority = 70

    input_schema = InputSchema
    output_schema = OutputSchema

    def tf_objects(self) -> List[Dict]:
        return [
            NebariTerraformState(self.name, self.config),
            NebariKubernetesProvider(self.config),
            NebariHelmProvider(self.config),
        ]

    def input_vars(self, stage_outputs: Dict[str, Dict[str, Any]]):
        domain = stage_outputs["stages/04-kubernetes-ingress"]["domain"]
        final_logout_uri = f"https://{domain}/hub/login"

        realm_id = stage_outputs["stages/06-kubernetes-keycloak-configuration"][
            "realm_id"
        ]["value"]
        cloud_provider = self.config.provider.value
        jupyterhub_shared_endpoint = (
            stage_outputs["stages/02-infrastructure"]
            .get("nfs_endpoint", {})
            .get("value")
        )
        keycloak_read_only_user_credentials = stage_outputs[
            "stages/06-kubernetes-keycloak-configuration"
        ]["keycloak-read-only-user-credentials"]["value"]

        conda_store_token_scopes = {
            "dask-gateway": {
                "primary_namespace": "",
                "role_bindings": {
                    "*/*": ["viewer"],
                },
            },
            "argo-workflows-jupyter-scheduler": {
                "primary_namespace": "",
                "role_bindings": {
                    "*/*": ["viewer"],
                },
            },
            "jhub-apps": {
                "primary_namespace": "",
                "role_bindings": {
                    "*/*": ["viewer"],
                },
            },
            "conda-store-service-account": {
                "primary_namespace": "",
                "role_bindings": {
                    "*/*": ["admin"],
                },
            },
        }

        # Compound any logout URLs from extensions so they are logged out in succession
        # when Keycloak and JupyterHub are logged out. Each extension's logout URL wraps
        # the previous one as a URL-encoded `redirect_uri` query parameter.
        for ext in self.config.tf_extensions:
            if ext.logout != "":
                final_logout_uri = "{}?{}".format(
                    f"https://{domain}/{ext.urlslug}{ext.logout}",
                    urlencode({"redirect_uri": final_logout_uri}),
                )

        jupyterhub_theme = self.config.theme.jupyterhub
        if self.config.theme.jupyterhub.display_version and (
            not self.config.theme.jupyterhub.version
        ):
            jupyterhub_theme.version = f"v{self.config.nebari_version}"

        kubernetes_services_vars = KubernetesServicesInputVars(
            name=self.config.project_name,
            environment=self.config.namespace,
            endpoint=domain,
            realm_id=realm_id,
            node_groups=stage_outputs["stages/02-infrastructure"]["node_selectors"],
            jupyterhub_logout_redirect_url=final_logout_uri,
            cert_secret_name=(
                self.config.certificate.secret_name
                if self.config.certificate.type == "existing"
                else None
            ),
        )

        rook_ceph_vars = RookCephInputVars()

        conda_store_vars = CondaStoreInputVars(
            conda_store_environments={
                k: v.model_dump() for k, v in self.config.environments.items()
            },
            conda_store_default_namespace=self.config.conda_store.default_namespace,
            conda_store_filesystem_storage=self.config.storage.conda_store,
            conda_store_object_storage=self.config.conda_store.object_storage,
            conda_store_service_token_scopes=conda_store_token_scopes,
            conda_store_extra_settings=self.config.conda_store.extra_settings,
            conda_store_extra_config=self.config.conda_store.extra_config,
            conda_store_image=self.config.conda_store.image,
            conda_store_image_tag=self.config.conda_store.image_tag,
        )

        jupyterhub_vars = JupyterhubInputVars(
            jupyterhub_theme=jupyterhub_theme.model_dump(),
            jupyterlab_image=_split_docker_image_name(
                self.config.default_images.jupyterlab
            ),
            jupyterhub_shared_storage=self.config.storage.shared_filesystem,
            jupyterhub_shared_endpoint=jupyterhub_shared_endpoint,
            cloud_provider=cloud_provider,
            jupyterhub_profiles=self.config.profiles.model_dump()["jupyterlab"],
            jupyterhub_image=_split_docker_image_name(
                self.config.default_images.jupyterhub
            ),
            jupyterhub_overrides=[json.dumps(self.config.jupyterhub.overrides)],
            jupyterhub_hub_extraEnv=json.dumps(
                self.config.jupyterhub.overrides.get("hub", {}).get("extraEnv", [])
            ),
            idle_culler_settings=self.config.jupyterlab.idle_culler.model_dump(),
            argo_workflows_enabled=self.config.argo_workflows.enabled,
            jhub_apps_enabled=self.config.jhub_apps.enabled,
            jhub_apps_overrides=json.dumps(self.config.jhub_apps.overrides),
            initial_repositories=str(self.config.jupyterlab.initial_repositories),
            jupyterlab_default_settings=self.config.jupyterlab.default_settings,
            jupyterlab_gallery_settings=self.config.jupyterlab.gallery_settings,
            jupyterlab_preferred_dir=self.config.jupyterlab.preferred_dir,
            shared_fs_type=(
                # efs is equivalent to nfs in these modules
                SharedFsEnum.nfs
                if self.config.storage.type == SharedFsEnum.efs
                else self.config.storage.type
            ),
        )

        dask_gateway_vars = DaskGatewayInputVars(
            dask_worker_image=_split_docker_image_name(
                self.config.default_images.dask_worker
            ),
            dask_gateway_profiles=self.config.profiles.model_dump()["dask_worker"],
            cloud_provider=cloud_provider,
        )

        monitoring_vars = MonitoringInputVars(
            monitoring_enabled=self.config.monitoring.enabled,
            minio_enabled=self.config.monitoring.minio_enabled,
            grafana_loki_overrides=[json.dumps(self.config.monitoring.overrides.loki)],
            grafana_promtail_overrides=[
                json.dumps(self.config.monitoring.overrides.promtail)
            ],
            grafana_loki_minio_overrides=[
                json.dumps(self.config.monitoring.overrides.minio)
            ],
        )

        telemetry_vars = TelemetryInputVars(
            jupyterlab_pioneer_enabled=self.config.telemetry.jupyterlab_pioneer.enabled,
            jupyterlab_pioneer_log_format=self.config.telemetry.jupyterlab_pioneer.log_format,
        )

        argo_workflows_vars = ArgoWorkflowsInputVars(
            argo_workflows_enabled=self.config.argo_workflows.enabled,
            argo_workflows_overrides=[json.dumps(self.config.argo_workflows.overrides)],
            nebari_workflow_controller=self.config.argo_workflows.nebari_workflow_controller.enabled,
            workflow_controller_image_tag=self.config.argo_workflows.nebari_workflow_controller.image_tag,
            keycloak_read_only_user_credentials=keycloak_read_only_user_credentials,
        )

        return {
            **kubernetes_services_vars.model_dump(by_alias=True),
            **rook_ceph_vars.model_dump(by_alias=True),
            **conda_store_vars.model_dump(by_alias=True),
            **jupyterhub_vars.model_dump(by_alias=True),
            **dask_gateway_vars.model_dump(by_alias=True),
            **monitoring_vars.model_dump(by_alias=True),
            **argo_workflows_vars.model_dump(by_alias=True),
            **telemetry_vars.model_dump(by_alias=True),
        }

    def check(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        directory = "stages/07-kubernetes-services"
        import requests

        # suppress insecure warnings
        import urllib3

        urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

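        def _attempt_connect_url(
            url, verify=False, num_attempts=NUM_ATTEMPTS, timeout=TIMEOUT
        ):
            # Retry the health check; a refused connection or timeout counts as a
            # failed attempt instead of aborting the whole check with a traceback.
            for i in range(num_attempts):
                try:
                    response = requests.get(url, verify=verify, timeout=timeout)
                except requests.exceptions.RequestException as e:
                    print(f"Attempt {i+1} health check failed for url={url}: {e}")
                else:
                    if response.status_code < 400:
                        print(f"Attempt {i+1} health check succeeded for url={url}")
                        return True
                    print(f"Attempt {i+1} health check failed for url={url}")
                time.sleep(timeout)
            return False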

        services = stage_outputs[directory]["service_urls"]["value"]
        for service_name, service in services.items():
            service_url = service["health_url"]
            if service_url and not _attempt_connect_url(service_url):
                print(
                    f"ERROR: Service {service_name} DOWN when checking url={service_url}"
                )
                sys.exit(1)


@hookimpl
def nebari_stage() -> List[Type[NebariStage]]:
    return [KubernetesServicesStage]



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/template/modules/helm-extensions/main.tf
---

resource "helm_release" "custom-helm-deployment" {
  name       = var.name
  namespace  = var.namespace
  repository = var.repository
  chart      = var.chart
  version    = var.chart_version

  values = [jsonencode(var.overrides)]
}



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/template/modules/helm-extensions/variables.tf
---

variable "name" {
  description = "helm deployment name"
  type        = string
  default     = "dev"
}

variable "namespace" {
  description = "deploy helm chart on this namespace"
  type        = string
  default     = "dev"
}

variable "repository" {
  description = "helm chart repository"
  type        = string
}

variable "chart" {
  description = "helm chart name in helm chart repository"
  type        = string
}

variable "chart_version" {
  description = "Helm chart version"
  type        = string
}

variable "overrides" {
  description = "Overrides for the helm chart values"
  type        = any
  default     = {}
}



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/template/modules/nebariextension/ingress.tf
---

resource "kubernetes_manifest" "nebariextension-ingressroute" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "IngressRoute"
    metadata = {
      name      = "${var.name}-ingressroute"
      namespace = var.namespace
    }
    spec = {
      entryPoints = ["websecure"]
      routes = [
        {
          kind  = "Rule"
          match = "Host(`${var.external-url}`) && PathPrefix(`/${var.urlslug}/`)"

          # forwardauth middleware may be included via local.middlewares
          middlewares = concat(
            local.middlewares,
            [{
              name      = kubernetes_manifest.nebariextension-middleware.manifest.metadata.name
              namespace = var.namespace
            }]
          )

          services = [
            {
              name = kubernetes_service.nebari-extension-service.metadata[0].name
              port = 80
            }
          ]
        }
      ]
    }
  }
}

# StripPrefixRegex middleware that removes the urlslug prefix before the request reaches the service

resource "kubernetes_manifest" "nebariextension-middleware" {
  manifest = {
    apiVersion = "traefik.containo.us/v1alpha1"
    kind       = "Middleware"
    metadata = {
      name      = "nebariext-middleware-${var.name}"
      namespace = var.namespace
    }
    spec = {
      stripPrefixRegex = {
        regex = [
          "/${var.urlslug}"
        ]
      }
    }
  }
}



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/template/modules/nebariextension/keycloak-config.tf
---


resource "keycloak_openid_client" "keycloak_ext_client" {
  count         = var.oauth2client ? 1 : 0
  realm_id      = var.nebari-realm-id
  client_id     = "${var.name}-client"
  client_secret = random_password.nebari-ext-client[count.index].result

  name    = "${var.name} Client"
  enabled = true

  access_type           = "CONFIDENTIAL"
  standard_flow_enabled = true

  valid_redirect_uris = [
    "https://${var.external-url}/${var.urlslug}/oauth_callback"
  ]
}

resource "random_password" "nebari-ext-client" {
  count   = var.oauth2client ? 1 : 0
  length  = 32
  special = false
}

resource "keycloak_openid_group_membership_protocol_mapper" "group_membership_mapper" {
  count = var.oauth2client ? 1 : 0

  realm_id  = var.nebari-realm-id
  client_id = keycloak_openid_client.keycloak_ext_client[count.index].id
  name      = "group-membership-mapper"

  claim_name = "groups"

  add_to_id_token     = false
  add_to_access_token = false
  add_to_userinfo     = true

  full_path = false
}



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/template/modules/nebariextension/locals.tf
---

locals {
  middlewares = (var.private) ? ([{
    name      = var.forwardauth_middleware_name
    namespace = var.namespace
  }]) : ([])

  oauth2client_envs = (var.oauth2client) ? ([{
    name  = "OAUTH2_AUTHORIZE_URL"
    value = "https://${var.external-url}/auth/realms/${var.nebari-realm-id}/protocol/openid-connect/auth"
    },
    {
      name  = "OAUTH2_ACCESS_TOKEN_URL"
      value = "https://${var.external-url}/auth/realms/${var.nebari-realm-id}/protocol/openid-connect/token"
    },
    {
      name  = "OAUTH2_USER_DATA_URL"
      value = "https://${var.external-url}/auth/realms/${var.nebari-realm-id}/protocol/openid-connect/userinfo"
    },
    {
      name  = "OAUTH2_REDIRECT_BASE"
      value = "https://${var.external-url}/${var.urlslug}/"
    },
    {
      name  = "COOKIE_OAUTH2STATE_NAME"
      value = "${var.name}-o2state"
    },
    {
      name  = "OAUTH2_CLIENT_ID"
      value = "${var.name}-client"
    },
    {
      name  = "OAUTH2_CLIENT_SECRET"
      value = random_password.nebari-ext-client[0].result
  }]) : ([])

  keycloakadmin_envs = (var.keycloakadmin) ? ([{
    name  = "KEYCLOAK_SERVER_URL"
    value = "http://keycloak-headless.${var.namespace}:8080/auth/"
    },
    {
      name  = "KEYCLOAK_REALM"
      value = var.nebari-realm-id
    },
    {
      name  = "KEYCLOAK_ADMIN_USERNAME"
      value = "nebari-bot"
    },
    {
      name  = "KEYCLOAK_ADMIN_PASSWORD"
      value = var.keycloak_nebari_bot_password
  }]) : ([])

  jwt_envs = (var.jwt) ? ([{
    name  = "COOKIE_AUTHORIZATION_NAME"
    value = "${var.name}-jwt"
    },
    {
      name  = "JWT_SECRET_KEY"
      value = random_password.nebari-jwt-secret[0].result
  }]) : ([])
}



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/template/modules/nebariextension/main.tf
---

terraform {
  required_providers {
    keycloak = {
      source  = "mrparkers/keycloak"
      version = "3.7.0"
    }
  }
}

resource "kubernetes_service" "nebari-extension-service" {
  metadata {
    name      = "${var.name}-service"
    namespace = var.namespace
  }
  spec {
    selector = {
      app = kubernetes_deployment.nebari-extension-deployment.spec[0].template[0].metadata[0].labels.app
    }
    port {
      port        = 80
      target_port = 80
    }

    type = "ClusterIP"
  }
}

resource "kubernetes_deployment" "nebari-extension-deployment" {
  metadata {
    name      = "${var.name}-deployment"
    namespace = var.namespace
  }

  spec {
    replicas = 1

    selector {
      match_labels = {
        app = "${var.name}-pod"
      }
    }

    template {
      metadata {
        labels = {
          app = "${var.name}-pod"
        }
      }

      spec {

        container {
          name  = "${var.name}-container"
          image = var.image

          env {
            name  = "PORT"
            value = "80"
          }

          dynamic "env" {
            for_each = concat(local.oauth2client_envs, local.keycloakadmin_envs, local.jwt_envs, var.envs)
            content {
              name  = env.value["name"]
              value = env.value["value"]
            }
          }

          port {
            container_port = 80
          }

          dynamic "volume_mount" {
            for_each = var.nebariconfigyaml ? [true] : []
            content {
              name       = "nebariyamlsecret"
              mount_path = "/etc/nebariyamlsecret/"
              read_only  = true
            }
          }

        }

        dynamic "volume" {
          for_each = var.nebariconfigyaml ? [true] : []
          content {
            name = "nebariyamlsecret"
            secret {
              secret_name = "nebari-config-yaml"
            }
          }
        }

      }
    }
  }
}

resource "random_password" "nebari-jwt-secret" {
  count   = var.jwt ? 1 : 0
  length  = 32
  special = false
}



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/template/modules/nebariextension/variables.tf
---

variable "namespace" {
  description = "Namespace to deploy into"
  type        = string
}

variable "name" {
  description = "Name of extension"
  type        = string
}

variable "external-url" {
  description = "Domain name of the Nebari deployment (without scheme)"
  type        = string
}

variable "image" {
  description = "Docker image for extension"
  type        = string
}

variable "urlslug" {
  description = "Slug for URL"
  type        = string
}

variable "private" {
  description = "Protect behind login page"
  type        = bool
  default     = true
}

variable "oauth2client" {
  description = "Create a Keycloak client and include env vars"
  type        = bool
  default     = false
}

variable "keycloakadmin" {
  description = "Include env vars for a keycloak admin user to make Keycloak Admin API calls"
  type        = bool
  default     = false
}

variable "jwt" {
  description = "Create secret and cookie name for JWT, set as env vars"
  type        = bool
  default     = false
}

variable "nebariconfigyaml" {
  description = "Mount nebari-config.yaml from the nebari-config-yaml secret"
  type        = bool
  default     = false
}

variable "envs" {
  description = "List of env var objects"
  type        = list(map(any))
  default     = []
}

variable "nebari-realm-id" {
  description = "Keycloak nebari realm id"
  type        = string
  default     = ""
}

variable "keycloak_nebari_bot_password" {
  description = "Keycloak password for nebari-bot"
  type        = string
  default     = ""
}

variable "forwardauth_middleware_name" {
  description = "Name of the traefik forward auth middleware"
  type        = string
}



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/template/helm-extension.tf
---

module "helm-extension" {
  for_each = { for extension in var.helm_extensions : extension.name => extension }

  source        = "./modules/helm-extensions"
  name          = each.value.name
  namespace     = var.environment
  repository    = each.value.repository
  chart         = each.value.chart
  chart_version = each.value.version
  overrides     = lookup(each.value, "overrides", {})
}



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/template/nebari-config.tf
---

resource "kubernetes_secret" "nebari_yaml_secret" {
  metadata {
    name      = "nebari-config-yaml"
    namespace = var.environment
  }

  data = {
    "nebari-config.yaml" = yamlencode(var.nebari_config_yaml)
  }
}



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/template/providers.tf
---

provider "keycloak" {
  tls_insecure_skip_verify = true
}



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/template/tf-extensions.tf
---

module "extension" {
  for_each = { for extension in var.tf_extensions : extension.name => extension }

  source = "./modules/nebariextension"

  name             = "nebari-ext-${each.key}"
  namespace        = var.environment
  image            = each.value.image
  urlslug          = each.value.urlslug
  private          = lookup(each.value, "private", false)
  oauth2client     = lookup(each.value, "oauth2client", false)
  keycloakadmin    = lookup(each.value, "keycloakadmin", false)
  jwt              = lookup(each.value, "jwt", false)
  nebariconfigyaml = lookup(each.value, "nebariconfigyaml", false)
  external-url     = var.endpoint
  nebari-realm-id  = var.realm_id

  keycloak_nebari_bot_password = lookup(each.value, "keycloakadmin", false) ? var.keycloak_nebari_bot_password : ""
  forwardauth_middleware_name  = var.forwardauth_middleware_name

  envs = lookup(each.value, "envs", [])
}



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/template/variables.tf
---

variable "environment" {
  description = "Kubernetes namespace to create resources within"
  type        = string
}

variable "endpoint" {
  description = "Endpoint for services"
  type        = string
}

variable "realm_id" {
  description = "Keycloak realm id for creating clients"
  type        = string
}

variable "tf_extensions" {
  description = "Nebari Terraform Extensions"
  default     = []
}

variable "nebari_config_yaml" {
  description = "Nebari Configuration"
  type        = any
}

variable "helm_extensions" {
  description = "Helm Extensions"
  default     = []
}

variable "keycloak_nebari_bot_password" {
  description = "Keycloak password for nebari-bot"
}

variable "forwardauth_middleware_name" {
  description = "Name of the traefik forward auth middleware"
  type        = string
}



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/template/versions.tf
---

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "2.1.2"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.35.1"
    }
    keycloak = {
      source  = "mrparkers/keycloak"
      version = "3.7.0"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/nebari_tf_extensions/__init__.py
---

from typing import Any, Dict, List, Optional, Type

from _nebari.stages.base import NebariTerraformStage
from _nebari.stages.tf_objects import (
    NebariHelmProvider,
    NebariKubernetesProvider,
    NebariTerraformState,
)
from nebari import schema
from nebari.hookspecs import NebariStage, hookimpl


class NebariExtensionEnv(schema.Base):
    name: str
    value: str


class NebariExtension(schema.Base):
    name: str
    image: str
    urlslug: str
    private: bool = False
    oauth2client: bool = False
    keycloakadmin: bool = False
    jwt: bool = False
    nebariconfigyaml: bool = False
    logout: Optional[str] = None
    envs: Optional[List[NebariExtensionEnv]] = None


class HelmExtension(schema.Base):
    name: str
    repository: str
    chart: str
    version: str
    overrides: Dict = {}


class InputSchema(schema.Base):
    helm_extensions: List[HelmExtension] = []
    tf_extensions: List[NebariExtension] = []


class OutputSchema(schema.Base):
    pass


class NebariTFExtensionsStage(NebariTerraformStage):
    name = "08-nebari-tf-extensions"
    priority = 80

    input_schema = InputSchema
    output_schema = OutputSchema

    def tf_objects(self) -> List[Dict]:
        return [
            NebariTerraformState(self.name, self.config),
            NebariKubernetesProvider(self.config),
            NebariHelmProvider(self.config),
        ]

    def input_vars(self, stage_outputs: Dict[str, Dict[str, Any]]):
        return {
            "environment": self.config.namespace,
            "endpoint": self.config.domain,
            "realm_id": stage_outputs["stages/06-kubernetes-keycloak-configuration"][
                "realm_id"
            ]["value"],
            "tf_extensions": [_.model_dump() for _ in self.config.tf_extensions],
            "nebari_config_yaml": self.config.model_dump(),
            "keycloak_nebari_bot_password": stage_outputs[
                "stages/05-kubernetes-keycloak"
            ]["keycloak_nebari_bot_password"]["value"],
            "helm_extensions": [_.model_dump() for _ in self.config.helm_extensions],
            "forwardauth_middleware_name": stage_outputs[
                "stages/07-kubernetes-services"
            ]["forward-auth-middleware"]["value"]["name"],
        }


@hookimpl
def nebari_stage() -> List[Type[NebariStage]]:
    return [NebariTFExtensionsStage]



---
File: nebari/src/_nebari/stages/terraform_state/template/aws/modules/terraform-state/main.tf
---

resource "aws_kms_key" "tf-state-key" {
  enable_key_rotation = true
}

resource "aws_s3_bucket" "terraform-state" {
  bucket = "${var.name}-terraform-state"

  force_destroy = true

  versioning {
    enabled = true
  }

  tags = merge({ Name = "S3 remote terraform state store" }, var.tags)

  lifecycle {
    ignore_changes = [
      server_side_encryption_configuration,
    ]
  }
}

resource "aws_s3_bucket_public_access_block" "terraform-state" {
  bucket                  = aws_s3_bucket.terraform-state.id
  ignore_public_acls      = true
  block_public_acls       = true
  block_public_policy     = true
  restrict_public_buckets = true
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform-state" {
  bucket = aws_s3_bucket.terraform-state.id

  rule {
    apply_server_side_encryption_by_default {
      kms_master_key_id = aws_kms_key.tf-state-key.arn
      sse_algorithm     = "aws:kms"
    }
  }
  # AWS may return HTTP 409 if PutBucketEncryption is called immediately after S3
  # bucket creation. Adding this dependency avoids concurrent requests.
  depends_on = [aws_s3_bucket_public_access_block.terraform-state]
}

resource "aws_dynamodb_table" "terraform-state-lock" {
  name = "${var.name}-terraform-state-lock"

  read_capacity  = 1
  write_capacity = 1
  hash_key       = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }

  tags = merge({ Name = "DynamoDB table for locking terraform state store" }, var.tags)
}



---
File: nebari/src/_nebari/stages/terraform_state/template/aws/modules/terraform-state/output.tf
---

output "credentials" {
  description = "Resources from terraform-state"
  value = {
    bucket_arn = aws_s3_bucket.terraform-state.arn
    dynamo_arn = aws_dynamodb_table.terraform-state-lock.arn
  }
}



---
File: nebari/src/_nebari/stages/terraform_state/template/aws/modules/terraform-state/variables.tf
---

variable "name" {
  description = "Name prefix for the terraform state resources"
  type        = string
}

variable "tags" {
  description = "Additional tags to apply to resource"
  type        = map(string)
  default     = {}
}



---
File: nebari/src/_nebari/stages/terraform_state/template/aws/main.tf
---

variable "name" {
  description = "Prefix name to assign to Nebari resources"
  type        = string
}

variable "namespace" {
  description = "Namespace to create Kubernetes resources"
  type        = string
}

module "terraform-state" {
  source = "./modules/terraform-state"

  name = "${var.name}-${var.namespace}"

  tags = {
    Project     = var.name
    Owner       = "terraform-state"
    Environment = var.namespace
  }
}

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.12.0"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/terraform_state/template/azure/modules/terraform-state/main.tf
---

resource "azurerm_resource_group" "terraform-state-resource-group" {
  name     = var.resource_group_name
  location = var.location
  tags     = var.tags
}

resource "azurerm_storage_account" "terraform-state-storage-account" {
  # name can only consist of lowercase letters and numbers, and must be between 3 and 24 characters long
  name                     = replace("${var.name}${var.storage_account_postfix}", "-", "") # must be unique across the entire Azure service
  resource_group_name      = azurerm_resource_group.terraform-state-resource-group.name
  location                 = azurerm_resource_group.terraform-state-resource-group.location
  account_tier             = "Standard"
  account_replication_type = "GRS"
  tags                     = var.tags
  min_tls_version          = "TLS1_2"

  identity {
    type = "SystemAssigned"
  }
}

resource "azurerm_storage_container" "storage_container" {
  name                  = "${var.name}-state"
  storage_account_name  = azurerm_storage_account.terraform-state-storage-account.name
  container_access_type = "private"
}



---
File: nebari/src/_nebari/stages/terraform_state/template/azure/modules/terraform-state/variables.tf
---

variable "resource_group_name" {
  description = "Name of the resource group that holds the terraform state"
  type        = string
}

variable "name" {
  description = "Name prefix for the terraform state resources"
  type        = string
}

variable "location" {
  description = "Location for terraform state"
  type        = string
}

variable "storage_account_postfix" {
  description = "random characters appended to storage account name to facilitate global uniqueness"
  type        = string
}

variable "tags" {
  description = "Tags to apply to all resources"
  type        = map(string)
  default     = {}
}



---
File: nebari/src/_nebari/stages/terraform_state/template/azure/main.tf
---

variable "name" {
  description = "Prefix name to assign to Nebari resources"
  type        = string
}

variable "namespace" {
  description = "Namespace to create Kubernetes resources"
  type        = string
}

variable "region" {
  description = "Region for Azure deployment"
  type        = string
}

variable "storage_account_postfix" {
  description = "Postfix appended to the storage account name to ensure it is unique"
  type        = string
}

variable "state_resource_group_name" {
  description = "Name for terraform state resource group"
  type        = string
}

variable "tags" {
  description = "Tags to apply to all resources"
  type        = map(string)
  default     = {}
}

provider "azurerm" {
  features {}
}

module "terraform-state" {
  source = "./modules/terraform-state"

  name                    = "${var.name}-${var.namespace}"
  resource_group_name     = var.state_resource_group_name
  location                = var.region
  storage_account_postfix = var.storage_account_postfix
  tags                    = var.tags
}

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "=4.7.0"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/terraform_state/template/existing/main.tf
---




---
File: nebari/src/_nebari/stages/terraform_state/template/gcp/modules/gcs/main.tf
---

resource "google_storage_bucket" "static-site" {
  name     = var.name
  location = var.location

  force_destroy = var.force_destroy

  versioning {
    enabled = var.versioning
  }
}



---
File: nebari/src/_nebari/stages/terraform_state/template/gcp/modules/gcs/variables.tf
---

variable "name" {
  description = "Prefix name for GCS bucket"
  type        = string
}

variable "location" {
  description = "Location for gcs bucket"
  type        = string
}

variable "force_destroy" {
  description = "force_destroy all bucket contents when bucket is deleted"
  type        = bool
  default     = false
}

variable "versioning" {
  description = "Enable versioning on bucket"
  type        = bool
  default     = true
}

variable "public" {
  description = "Expose the Google Cloud Storage bucket publicly (currently ignored)"
  type        = bool
  default     = false
}



---
File: nebari/src/_nebari/stages/terraform_state/template/gcp/modules/terraform-state/main.tf
---

module "gcs" {
  source = "../gcs"

  name          = "${var.name}-terraform-state"
  location      = var.location
  public        = false
  force_destroy = true

}



---
File: nebari/src/_nebari/stages/terraform_state/template/gcp/modules/terraform-state/variables.tf
---

variable "name" {
  description = "Prefix name for terraform state"
  type        = string
}

variable "location" {
  description = "Location for terraform state"
  type        = string
}



---
File: nebari/src/_nebari/stages/terraform_state/template/gcp/main.tf
---

variable "name" {
  description = "Prefix name to assign to Nebari resources"
  type        = string
}

variable "namespace" {
  description = "Namespace to create Kubernetes resources"
  type        = string
}

variable "region" {
  description = "Region for GCP deployment"
  type        = string
}

module "terraform-state" {
  source = "./modules/terraform-state"

  name     = "${var.name}-${var.namespace}"
  location = var.region
}

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "6.14.1"
    }
  }
  required_version = ">= 1.0"
}



---
File: nebari/src/_nebari/stages/terraform_state/template/local/main.tf
---




---
File: nebari/src/_nebari/stages/terraform_state/__init__.py
---

import contextlib
import enum
import inspect
import os
import pathlib
import re
from typing import Any, Dict, List, Optional, Tuple, Type

from pydantic import BaseModel, field_validator

from _nebari import utils
from _nebari.provider import opentofu
from _nebari.provider.cloud import azure_cloud
from _nebari.stages.base import NebariTerraformStage
from _nebari.stages.tf_objects import NebariConfig
from _nebari.utils import (
    AZURE_TF_STATE_RESOURCE_GROUP_SUFFIX,
    construct_azure_resource_group_name,
    modified_environ,
)
from nebari import schema
from nebari.hookspecs import NebariStage, hookimpl


class GCPInputVars(schema.Base):
    name: str
    namespace: str
    region: str


class AzureInputVars(schema.Base):
    name: str
    namespace: str
    region: str
    storage_account_postfix: str
    state_resource_group_name: str
    tags: Dict[str, str]

    @field_validator("state_resource_group_name")
    @classmethod
    def _validate_resource_group_name(cls, value: str) -> str:
        if value is None:
            return value
        length = len(value) + len(AZURE_TF_STATE_RESOURCE_GROUP_SUFFIX)
        if length < 1 or length > 90:
            raise ValueError(
                f"Azure Resource Group name must be between 1 and 90 characters long, when combined with the suffix `{AZURE_TF_STATE_RESOURCE_GROUP_SUFFIX}`."
            )
        if not re.match(r"^[\w\-\.\(\)]+$", value):
            raise ValueError(
                "Azure Resource Group name can only contain alphanumerics, underscores, parentheses, hyphens, and periods."
            )
        if value[-1] == ".":
            raise ValueError("Azure Resource Group name can't end with a period.")

        return value

    @field_validator("tags")
    @classmethod
    def _validate_tags(cls, value: Dict[str, str]) -> Dict[str, str]:
        return azure_cloud.validate_tags(value)


class AWSInputVars(schema.Base):
    name: str
    namespace: str


@schema.yaml_object(schema.yaml)
class TerraformStateEnum(str, enum.Enum):
    remote = "remote"
    local = "local"
    existing = "existing"

    @classmethod
    def to_yaml(cls, representer, node):
        return representer.represent_str(node.value)


class TerraformState(schema.Base):
    type: TerraformStateEnum = TerraformStateEnum.remote
    backend: Optional[str] = None
    config: Dict[str, str] = {}


class InputSchema(schema.Base):
    terraform_state: TerraformState = TerraformState()


class OutputSchema(schema.Base):
    pass


class TerraformStateStage(NebariTerraformStage):
    name = "01-terraform-state"
    priority = 10

    input_schema = InputSchema
    output_schema = OutputSchema

    @property
    def template_directory(self):
        return (
            pathlib.Path(inspect.getfile(self.__class__)).parent
            / "template"
            / self.config.provider.value
        )

    @property
    def stage_prefix(self):
        return pathlib.Path("stages") / self.name / self.config.provider.value

    def state_imports(self) -> List[Tuple[str, str]]:
        if self.config.provider == schema.ProviderEnum.gcp:
            return [
                (
                    "module.terraform-state.module.gcs.google_storage_bucket.static-site",
                    f"{self.config.project_name}-{self.config.namespace}-terraform-state",
                )
            ]
        elif self.config.provider == schema.ProviderEnum.azure:
            subscription_id = os.environ["ARM_SUBSCRIPTION_ID"]
            resource_name_prefix = f"{self.config.project_name}-{self.config.namespace}"
            state_resource_group_name = construct_azure_resource_group_name(
                project_name=self.config.project_name,
                namespace=self.config.namespace,
                base_resource_group_name=self.config.azure.resource_group_name,
                suffix=AZURE_TF_STATE_RESOURCE_GROUP_SUFFIX,
            )
            state_resource_name_prefix_safe = resource_name_prefix.replace("-", "")
            resource_group_url = f"/subscriptions/{subscription_id}/resourceGroups/{state_resource_group_name}"

            return [
                (
                    "module.terraform-state.azurerm_resource_group.terraform-state-resource-group",
                    resource_group_url,
                ),
                (
                    "module.terraform-state.azurerm_storage_account.terraform-state-storage-account",
                    f"{resource_group_url}/providers/Microsoft.Storage/storageAccounts/{state_resource_name_prefix_safe}{self.config.azure.storage_account_postfix}",
                ),
                (
                    "module.terraform-state.azurerm_storage_container.storage_container",
                    f"https://{state_resource_name_prefix_safe}{self.config.azure.storage_account_postfix}.blob.core.windows.net/{resource_name_prefix}-state",
                ),
            ]
        elif self.config.provider == schema.ProviderEnum.aws:
            return [
                (
                    "module.terraform-state.aws_s3_bucket.terraform-state",
                    f"{self.config.project_name}-{self.config.namespace}-terraform-state",
                ),
                (
                    "module.terraform-state.aws_dynamodb_table.terraform-state-lock",
                    f"{self.config.project_name}-{self.config.namespace}-terraform-state-lock",
                ),
            ]
        else:
            return []

    def tf_objects(self) -> List[Dict]:
        resources = [NebariConfig(self.config)]
        if self.config.provider == schema.ProviderEnum.gcp:
            return resources + [
                opentofu.Provider(
                    "google",
                    project=self.config.google_cloud_platform.project,
                    region=self.config.google_cloud_platform.region,
                ),
            ]
        elif self.config.provider == schema.ProviderEnum.aws:
            return resources + [
                opentofu.Provider("aws", region=self.config.amazon_web_services.region),
            ]
        else:
            return resources

    def input_vars(self, stage_outputs: Dict[str, Dict[str, Any]]):
        if self.config.provider == schema.ProviderEnum.gcp:
            return GCPInputVars(
                name=self.config.project_name,
                namespace=self.config.namespace,
                region=self.config.google_cloud_platform.region,
            ).model_dump()
        elif self.config.provider == schema.ProviderEnum.aws:
            return AWSInputVars(
                name=self.config.project_name,
                namespace=self.config.namespace,
            ).model_dump()
        elif self.config.provider == schema.ProviderEnum.azure:
            return AzureInputVars(
                name=self.config.project_name,
                namespace=self.config.namespace,
                region=self.config.azure.region,
                storage_account_postfix=self.config.azure.storage_account_postfix,
                state_resource_group_name=construct_azure_resource_group_name(
                    project_name=self.config.project_name,
                    namespace=self.config.namespace,
                    base_resource_group_name=self.config.azure.resource_group_name,
                    suffix=AZURE_TF_STATE_RESOURCE_GROUP_SUFFIX,
                ),
                tags=self.config.azure.tags,
            ).model_dump()
        elif (
            self.config.provider == schema.ProviderEnum.local
            or self.config.provider == schema.ProviderEnum.existing
        ):
            return {}
        else:
            raise ValueError(f"Unknown provider: {self.config.provider}")

    @contextlib.contextmanager
    def deploy(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        self.check_immutable_fields()

        # No need to run tofu init here as it's being called when running the
        # terraform show command, inside check_immutable_fields
        with super().deploy(stage_outputs, disable_prompt, tofu_init=False):
            env_mapping = {}

            with modified_environ(**env_mapping):
                yield

    def check_immutable_fields(self):
        nebari_config_state = self.get_nebari_config_state()
        if not nebari_config_state:
            return

        # compute diff of remote/prior and current nebari config
        nebari_config_diff = utils.JsonDiff(
            nebari_config_state, self.config.model_dump()
        )
        # check if any changed fields are immutable
        for keys, old, new in nebari_config_diff.modified():
            bottom_level_schema = self.config
            if len(keys) > 1:
                for key in keys[:-1]:
                    try:
                        bottom_level_schema = getattr(bottom_level_schema, key)
                    except AttributeError as e:
                        if isinstance(bottom_level_schema, dict):
                            # handle case where value is a dict
                            bottom_level_schema = bottom_level_schema[key]
                        else:
                            raise e

            # Return a default (mutable) extra field schema if bottom level is not a Pydantic model (such as a free-form 'overrides' block)
            if isinstance(bottom_level_schema, BaseModel):
                extra_field_schema = schema.ExtraFieldSchema(
                    **bottom_level_schema.model_fields[keys[-1]].json_schema_extra or {}
                )
            else:
                extra_field_schema = schema.ExtraFieldSchema()

            if extra_field_schema.immutable:
                key_path = ".".join(keys)
                raise ValueError(
                    f'Attempting to change immutable field "{key_path}" ("{old}"->"{new}") in Nebari config file.  Immutable fields cannot be changed after initial deployment.'
                )

    def get_nebari_config_state(self) -> dict:
        directory = str(self.output_directory / self.stage_prefix)
        tf_state = opentofu.show(directory)
        nebari_config_state = None

        # get nebari config from state
        for resource in (
            tf_state.get("values", {}).get("root_module", {}).get("resources", [])
        ):
            if resource["address"] == "terraform_data.nebari_config":
                nebari_config_state = resource["values"]["input"]
                break
        return nebari_config_state

    @contextlib.contextmanager
    def destroy(
        self, stage_outputs: Dict[str, Dict[str, Any]], status: Dict[str, bool]
    ):
        with super().destroy(stage_outputs, status):
            env_mapping = {}

            with modified_environ(**env_mapping):
                yield


@hookimpl
def nebari_stage() -> List[Type[NebariStage]]:
    return [TerraformStateStage]



---
File: nebari/src/_nebari/stages/__init__.py
---




---
File: nebari/src/_nebari/stages/base.py
---

import contextlib
import inspect
import os
import pathlib
import shutil
import sys
import tempfile
from typing import Any, Dict, List, Tuple

from jinja2 import Environment, FileSystemLoader
from kubernetes import client, config
from kubernetes.client.rest import ApiException

from _nebari.provider import helm, kubernetes, kustomize, opentofu
from _nebari.stages.tf_objects import NebariTerraformState
from nebari.hookspecs import NebariStage

KUSTOMIZATION_TEMPLATE = "kustomization.yaml.tmpl"


class NebariKustomizeStage(NebariStage):
    @property
    def template_directory(self):
        return pathlib.Path(inspect.getfile(self.__class__)).parent / "template"

    @property
    def stage_prefix(self):
        return pathlib.Path("stages") / self.name

    @property
    def kustomize_vars(self):
        return {}

    failed_to_create = False
    error_message = ""

    def _get_k8s_client(self, stage_outputs: Dict[str, Dict[str, Any]]):
        try:
            config.load_kube_config(
                config_file=stage_outputs["stages/02-infrastructure"][
                    "kubeconfig_filename"
                ]["value"]
            )
            api_instance = client.ApiClient()
        except ApiException:
            print(
                f"ERROR: After stage={self.name} "
                "unable to connect to kubernetes cluster"
            )
            sys.exit(1)
        return api_instance

    def input_vars(self, stage_outputs: Dict[str, Dict[str, Any]]):
        return {}

    def set_outputs(
        self, stage_outputs: Dict[str, Dict[str, Any]], outputs: Dict[str, Any]
    ):
        stage_key = "stages/" + self.name
        if stage_key not in stage_outputs:
            stage_outputs[stage_key] = {**outputs}
        else:
            stage_outputs[stage_key].update(outputs)

    def check(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):

        if self.failed_to_create:
            print(
                f"ERROR: After stage={self.name} "
                f"failed to create kubernetes resources "
                f"with error: {self.error_message}"
            )
            sys.exit(1)

    def render(self) -> Dict[pathlib.Path, str]:
        env = Environment(loader=FileSystemLoader(self.template_directory))

        contents = {}
        if not (self.template_directory / KUSTOMIZATION_TEMPLATE).exists():
            raise FileNotFoundError(
                f"ERROR: After stage={self.name} "
                f"{KUSTOMIZATION_TEMPLATE} template file not found in template directory"
            )
        kustomize_template = env.get_template(KUSTOMIZATION_TEMPLATE)
        rendered_kustomization = kustomize_template.render(**self.kustomize_vars)
        with open(self.template_directory / "kustomization.yaml", "w") as f:
            f.write(rendered_kustomization)

        with tempfile.TemporaryDirectory() as temp_dir:
            kustomize.run_kustomize_subprocess(
                [
                    "build",
                    "-o",
                    f"{temp_dir}",
                    "--enable-helm",
                    "--helm-command",
                    f"{helm.download_helm_binary()}",
                    f"{self.template_directory}",
                ]
            )

            # collect CRDs shipped with the downloaded helm charts into the rendered contents
            crds = self.template_directory.glob("charts/*/*/crds/*.yaml")
            for crd in crds:
                with crd.open("rb") as f:
                    contents[
                        pathlib.Path(
                            self.stage_prefix,
                            "crds",
                            crd.name,
                        )
                    ] = f.read()

            for root, _, filenames in os.walk(temp_dir):
                for filename in filenames:
                    root_filename = pathlib.Path(root) / filename
                    with root_filename.open("rb") as f:
                        contents[
                            pathlib.Path(
                                self.stage_prefix,
                                "manifests",
                                pathlib.Path.relative_to(
                                    pathlib.Path(root_filename), temp_dir
                                ),
                            )
                        ] = f.read()
            # cleanup generated kustomization.yaml
            pathlib.Path(self.template_directory, "kustomization.yaml").unlink()

            # clean up downloaded helm charts
            charts_dir = pathlib.Path(self.template_directory, "charts")
            if charts_dir.exists():
                shutil.rmtree(charts_dir)

            return contents

    # Deploy by applying the rendered CRDs, then the rendered manifests, to the
    # Kubernetes cluster in sorted (alphabetical) order using the Kubernetes
    # Python client.
    @contextlib.contextmanager
    def deploy(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):

        print(f"Deploying kubernetes resources for {self.name}")
        # get the kubernetes client
        kubernetes_client = self._get_k8s_client(stage_outputs)

        # get the path to the manifests folder
        directory = pathlib.Path(self.output_directory, self.stage_prefix)

        # get the list of all the files in the crds folder
        crds = directory.glob("crds/*.yaml")

        # get the list of all the files in the manifests folder
        manifests = directory.glob("manifests/*.yaml")

        # apply each crd to the kubernetes cluster in alphabetical order
        for crd in sorted(crds):
            print(f"CRD: {crd}")
            try:
                kubernetes.create_from_yaml(kubernetes_client, crd, apply=True)
            except ApiException as e:
                self.failed_to_create = True
                self.error_message = str(e)
            else:
                print(f"Applied CRD: {crd}")

        # apply each manifest to the kubernetes cluster in alphabetical order
        for manifest in sorted(manifests):
            print(f"manifest: {manifest}")
            try:
                kubernetes.create_from_yaml(
                    kubernetes_client,
                    manifest,
                    namespace=self.config.namespace,
                    apply=True,
                )
            except ApiException as e:
                self.failed_to_create = True
                self.error_message = str(e)
            else:
                print(f"Applied manifest: {manifest}")
        yield

    @contextlib.contextmanager
    def destroy(
        self,
        stage_outputs: Dict[str, Dict[str, Any]],
        status: Dict[str, bool],
        ignore_errors: bool = True,
    ):
        # destroy each manifest in the reverse order
        print(f"Destroying kubernetes resources for {self.name}")

        # get the kubernetes client
        kubernetes_client = self._get_k8s_client(stage_outputs)

        # get the path to the manifests folder
        directory = pathlib.Path(self.output_directory, self.stage_prefix)

        # get the list of all the files in the crds folder
        crds = directory.glob("crds/*.yaml")

        # get the list of all the files in the manifests folder
        manifests = directory.glob("manifests/*.yaml")

        # destroy each manifest in the reverse order

        for manifest in sorted(manifests, reverse=True):

            print(f"Destroying manifest: {manifest}")
            try:
                kubernetes.delete_from_yaml(kubernetes_client, manifest)
            except ApiException as e:
                self.error_message = str(e)
                if not ignore_errors:
                    raise e

        # destroy each crd in the reverse order

        for crd in sorted(crds, reverse=True):

            print(f"Destroying CRD: {crd}")
            try:
                kubernetes.delete_from_yaml(kubernetes_client, crd)
            except ApiException as e:
                self.error_message = str(e)
                if not ignore_errors:
                    raise e
        yield


class NebariTerraformStage(NebariStage):
    @property
    def template_directory(self):
        return pathlib.Path(inspect.getfile(self.__class__)).parent / "template"

    @property
    def stage_prefix(self):
        return pathlib.Path("stages") / self.name

    def state_imports(self) -> List[Tuple[str, str]]:
        return []

    def tf_objects(self) -> List[Dict]:
        return [NebariTerraformState(self.name, self.config)]

    def render(self) -> Dict[pathlib.Path, str]:
        contents = {
            (self.stage_prefix / "_nebari.tf.json"): opentofu.tf_render_objects(
                self.tf_objects()
            )
        }
        for root, dirs, filenames in os.walk(self.template_directory):
            for filename in filenames:
                root_filename = pathlib.Path(root) / filename
                with root_filename.open("rb") as f:
                    contents[
                        pathlib.Path(
                            self.stage_prefix,
                            pathlib.Path.relative_to(
                                pathlib.Path(root_filename), self.template_directory
                            ),
                        )
                    ] = f.read()
        return contents

    def input_vars(self, stage_outputs: Dict[str, Dict[str, Any]]):
        return {}

    def set_outputs(
        self, stage_outputs: Dict[str, Dict[str, Any]], outputs: Dict[str, Any]
    ):
        stage_key = "stages/" + self.name
        if stage_key not in stage_outputs:
            stage_outputs[stage_key] = {**outputs}
        else:
            stage_outputs[stage_key].update(outputs)

    @contextlib.contextmanager
    def deploy(
        self,
        stage_outputs: Dict[str, Dict[str, Any]],
        disable_prompt: bool = False,
        tofu_init: bool = True,
    ):
        deploy_config = dict(
            directory=str(self.output_directory / self.stage_prefix),
            input_vars=self.input_vars(stage_outputs),
            tofu_init=tofu_init,
        )
        state_imports = self.state_imports()
        if state_imports:
            deploy_config["tofu_import"] = True
            deploy_config["state_imports"] = state_imports

        self.set_outputs(stage_outputs, opentofu.deploy(**deploy_config))
        self.post_deploy(stage_outputs, disable_prompt)
        yield

    def post_deploy(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        pass

    def check(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        pass

    @contextlib.contextmanager
    def destroy(
        self,
        stage_outputs: Dict[str, Dict[str, Any]],
        status: Dict[str, bool],
        ignore_errors: bool = True,
    ):
        self.set_outputs(
            stage_outputs,
            opentofu.deploy(
                directory=str(self.output_directory / self.stage_prefix),
                input_vars=self.input_vars(stage_outputs),
                tofu_init=True,
                tofu_import=True,
                tofu_apply=False,
                tofu_destroy=False,
            ),
        )
        yield
        try:
            opentofu.deploy(
                directory=str(self.output_directory / self.stage_prefix),
                input_vars=self.input_vars(stage_outputs),
                tofu_init=True,
                tofu_import=True,
                tofu_apply=False,
                tofu_destroy=True,
            )
            status["stages/" + self.name] = True
        except opentofu.OpenTofuException as e:
            if not ignore_errors:
                raise e
            status["stages/" + self.name] = False



---
File: nebari/src/_nebari/stages/tf_objects.py
---

from _nebari.provider.opentofu import Data, Provider, Resource, TerraformBackend
from _nebari.utils import (
    AZURE_TF_STATE_RESOURCE_GROUP_SUFFIX,
    construct_azure_resource_group_name,
    deep_merge,
)
from nebari import schema


def NebariKubernetesProvider(nebari_config: schema.Main):
    if nebari_config.provider == "aws":
        cluster_name = f"{nebari_config.escaped_project_name}-{nebari_config.namespace}"
        # The AWS provider needs to be added, as we are using aws related resources #1254
        return deep_merge(
            Data("aws_eks_cluster", "default", name=cluster_name),
            Data("aws_eks_cluster_auth", "default", name=cluster_name),
            Provider("aws", region=nebari_config.amazon_web_services.region),
            Provider(
                "kubernetes",
                host="${data.aws_eks_cluster.default.endpoint}",
                cluster_ca_certificate="${base64decode(data.aws_eks_cluster.default.certificate_authority[0].data)}",
                token="${data.aws_eks_cluster_auth.default.token}",
            ),
        )
    return Provider(
        "kubernetes",
    )


def NebariHelmProvider(nebari_config: schema.Main):
    if nebari_config.provider == "aws":
        cluster_name = f"{nebari_config.escaped_project_name}-{nebari_config.namespace}"

        return deep_merge(
            Data("aws_eks_cluster", "default", name=cluster_name),
            Data("aws_eks_cluster_auth", "default", name=cluster_name),
            Provider(
                "helm",
                kubernetes=dict(
                    host="${data.aws_eks_cluster.default.endpoint}",
                    cluster_ca_certificate="${base64decode(data.aws_eks_cluster.default.certificate_authority[0].data)}",
                    token="${data.aws_eks_cluster_auth.default.token}",
                ),
            ),
        )
    return Provider("helm")


def NebariTerraformState(directory: str, nebari_config: schema.Main):
    if nebari_config.terraform_state.type == "local":
        return {}
    elif nebari_config.terraform_state.type == "existing":
        return TerraformBackend(
            nebari_config.terraform_state.backend,
            **nebari_config.terraform_state.config,
        )
    elif nebari_config.provider == "aws":
        return TerraformBackend(
            "s3",
            bucket=f"{nebari_config.escaped_project_name}-{nebari_config.namespace}-terraform-state",
            key=f"terraform/{nebari_config.escaped_project_name}-{nebari_config.namespace}/{directory}.tfstate",
            region=nebari_config.amazon_web_services.region,
            encrypt=True,
            dynamodb_table=f"{nebari_config.escaped_project_name}-{nebari_config.namespace}-terraform-state-lock",
        )
    elif nebari_config.provider == "gcp":
        return TerraformBackend(
            "gcs",
            bucket=f"{nebari_config.escaped_project_name}-{nebari_config.namespace}-terraform-state",
            prefix=f"terraform/{nebari_config.escaped_project_name}/{directory}",
        )
    elif nebari_config.provider == "azure":
        return TerraformBackend(
            "azurerm",
            resource_group_name=construct_azure_resource_group_name(
                project_name=nebari_config.project_name,
                namespace=nebari_config.namespace,
                base_resource_group_name=nebari_config.azure.resource_group_name,
                suffix=AZURE_TF_STATE_RESOURCE_GROUP_SUFFIX,
            ),
            # storage account must be globally unique
            storage_account_name=f"{nebari_config.escaped_project_name}{nebari_config.namespace}{nebari_config.azure.storage_account_postfix}",
            container_name=f"{nebari_config.escaped_project_name}-{nebari_config.namespace}-state",
            key=f"terraform/{nebari_config.escaped_project_name}-{nebari_config.namespace}/{directory}",
        )
    elif nebari_config.provider == "existing":
        optional_kwargs = {}
        if getattr(nebari_config.existing, "kube_context", None):
            optional_kwargs["config_context"] = nebari_config.existing.kube_context
        return TerraformBackend(
            "kubernetes",
            secret_suffix=f"{nebari_config.escaped_project_name}-{nebari_config.namespace}-{directory}",
            load_config_file=True,
            **optional_kwargs,
        )
    elif nebari_config.provider == "local":
        optional_kwargs = {}
        if getattr(nebari_config.local, "kube_context", None):
            optional_kwargs["config_context"] = nebari_config.local.kube_context
        return TerraformBackend(
            "kubernetes",
            secret_suffix=f"{nebari_config.escaped_project_name}-{nebari_config.namespace}-{directory}",
            load_config_file=True,
            **optional_kwargs,
        )
    else:
        raise NotImplementedError("state not implemented")


def NebariConfig(nebari_config: schema.Main):
    return Resource("terraform_data", "nebari_config", input=nebari_config.model_dump())



---
File: nebari/src/_nebari/subcommands/__init__.py
---




---
File: nebari/src/_nebari/subcommands/deploy.py
---

import pathlib
from typing import Optional

import rich
import typer

from _nebari.config import read_configuration
from _nebari.deploy import deploy_configuration
from _nebari.render import render_template
from nebari.hookspecs import hookimpl

TERRAFORM_STATE_STAGE_NAME = "01-terraform-state"


@hookimpl
def nebari_subcommand(cli: typer.Typer):
    @cli.command()
    def deploy(
        ctx: typer.Context,
        config_filename: pathlib.Path = typer.Option(
            ...,
            "--config",
            "-c",
            help="nebari configuration yaml file path",
        ),
        output_directory: pathlib.Path = typer.Option(
            "./",
            "-o",
            "--output",
            help="output directory",
        ),
        dns_provider: Optional[str] = typer.Option(
            None,
            "--dns-provider",
            help="dns provider to use for registering domain name mapping ⚠️ moved to `dns.provider` in nebari-config.yaml",
        ),
        dns_auto_provision: bool = typer.Option(
            False,
            "--dns-auto-provision",
            help="Attempt to automatically provision DNS, currently only available for `cloudflare` ⚠️ moved to `dns.auto_provision` in nebari-config.yaml",
        ),
        disable_prompt: bool = typer.Option(
            False,
            "--disable-prompt",
            help="Disable human intervention",
        ),
        disable_render: bool = typer.Option(
            False,
            "--disable-render",
            help="Disable auto-rendering in deploy stage",
        ),
        disable_checks: bool = typer.Option(
            False,
            "--disable-checks",
            help="Disable the checks performed after each stage",
        ),
        skip_remote_state_provision: bool = typer.Option(
            False,
            "--skip-remote-state-provision",
            help="Skip terraform state deployment which is often required in CI once the terraform remote state bootstrapping phase is complete",
        ),
    ):
        """
        Deploy the Nebari cluster from your [purple]nebari-config.yaml[/purple] file.
        """
        from nebari.plugins import nebari_plugin_manager

        if dns_provider or dns_auto_provision:
            msg = "The [green]`--dns-provider`[/green] and [green]`--dns-auto-provision`[/green] flags have been removed in favor of configuring DNS via nebari-config.yaml"
            rich.print(msg)
            raise typer.Abort()

        stages = nebari_plugin_manager.ordered_stages
        config_schema = nebari_plugin_manager.config_schema

        config = read_configuration(config_filename, config_schema=config_schema)

        if not disable_render:
            render_template(output_directory, config, stages)

        if skip_remote_state_provision:
            # build a new list rather than removing items while iterating
            stages = [
                stage for stage in stages if stage.name != TERRAFORM_STATE_STAGE_NAME
            ]
            rich.print("Skipping remote state provision")

        # Digital Ocean support deprecation warning -- Nebari 2024.7.1
        if config.provider == "do" and not disable_prompt:
            msg = "Digital Ocean support is currently being deprecated and will be removed in a future release. Would you like to continue?"
            typer.confirm(msg, abort=True)

        deploy_configuration(
            config,
            stages,
            disable_prompt=disable_prompt,
            disable_checks=disable_checks,
        )



---
File: nebari/src/_nebari/subcommands/destroy.py
---

import pathlib

import typer

from _nebari.config import read_configuration
from _nebari.destroy import destroy_configuration
from _nebari.render import render_template
from nebari.hookspecs import hookimpl


@hookimpl
def nebari_subcommand(cli: typer.Typer):
    @cli.command()
    def destroy(
        ctx: typer.Context,
        config_filename: pathlib.Path = typer.Option(
            ..., "-c", "--config", help="nebari configuration file path"
        ),
        output_directory: pathlib.Path = typer.Option(
            "./",
            "-o",
            "--output",
            help="output directory",
        ),
        disable_render: bool = typer.Option(
            False,
            "--disable-render",
            help="Disable auto-rendering before destroy",
        ),
        disable_prompt: bool = typer.Option(
            False,
            "--disable-prompt",
            help="Destroy entire Nebari cluster without confirmation request. Suggested for CI use.",
        ),
    ):
        """
        Destroy the Nebari cluster from your [purple]nebari-config.yaml[/purple] file.
        """
        from nebari.plugins import nebari_plugin_manager

        stages = nebari_plugin_manager.ordered_stages
        config_schema = nebari_plugin_manager.config_schema

        def _run_destroy(
            config_filename=config_filename, disable_render=disable_render
        ):
            config = read_configuration(config_filename, config_schema=config_schema)

            if not disable_render:
                render_template(output_directory, config, stages)

            destroy_configuration(config, stages)

        if disable_prompt:
            _run_destroy()
        elif typer.confirm("Are you sure you want to destroy your Nebari cluster?"):
            _run_destroy()
        else:
            raise typer.Abort()



---
File: nebari/src/_nebari/subcommands/dev.py
---

import json
import pathlib

import typer

from _nebari.config import read_configuration
from _nebari.keycloak import keycloak_rest_api_call
from nebari.hookspecs import hookimpl


@hookimpl
def nebari_subcommand(cli: typer.Typer):
    app_dev = typer.Typer(
        add_completion=False,
        no_args_is_help=True,
        rich_markup_mode="rich",
        context_settings={"help_option_names": ["-h", "--help"]},
    )

    cli.add_typer(
        app_dev,
        name="dev",
        help="Development tools and advanced features.",
        rich_help_panel="Additional Commands",
    )

    @app_dev.command(name="keycloak-api")
    def keycloak_api(
        config_filename: pathlib.Path = typer.Option(
            ...,
            "-c",
            "--config",
            help="nebari configuration file path",
        ),
        request: str = typer.Option(
            ...,
            "-r",
            "--request",
            help="Send a REST API request, valid requests follow patterns found here: [green]keycloak.org/docs-api/15.0/rest-api[/green]",
        ),
    ):
        """
        Interact with the Keycloak REST API directly.

        This is an advanced tool which can have potentially destructive consequences.
        Please use this at your own risk.

        """
        from nebari.plugins import nebari_plugin_manager

        config_schema = nebari_plugin_manager.config_schema

        config = read_configuration(config_filename, config_schema=config_schema)
        r = keycloak_rest_api_call(config, request=request)
        print(json.dumps(r, indent=4))



---
File: nebari/src/_nebari/subcommands/info.py
---

import collections

import rich
import typer
from rich.table import Table

from _nebari.version import __version__
from nebari.hookspecs import hookimpl


@hookimpl
def nebari_subcommand(cli: typer.Typer):
    EXTERNAL_PLUGIN_STYLE = "cyan"

    @cli.command()
    def info(ctx: typer.Context):
        """
        Display information about installed Nebari plugins and their configurations.
        """
        from nebari.plugins import nebari_plugin_manager

        rich.print(f"Nebari version: {__version__}")

        external_plugins = nebari_plugin_manager.get_external_plugins()

        hooks = collections.defaultdict(list)
        for plugin in nebari_plugin_manager.plugin_manager.get_plugins():
            for hook in nebari_plugin_manager.plugin_manager.get_hookcallers(plugin):
                hooks[hook.name].append(plugin.__name__)

        table = Table(title="Hooks")
        table.add_column("hook", justify="left", no_wrap=True)
        table.add_column("module", justify="left", no_wrap=True)

        for hook_name, modules in hooks.items():
            for module in modules:
                style = EXTERNAL_PLUGIN_STYLE if module in external_plugins else None
                table.add_row(hook_name, module, style=style)

        rich.print(table)

        table = Table(title="Runtime Stage Ordering")
        table.add_column("name")
        table.add_column("priority")
        table.add_column("module")
        for stage in nebari_plugin_manager.ordered_stages:
            style = (
                EXTERNAL_PLUGIN_STYLE if stage.__module__ in external_plugins else None
            )
            table.add_row(
                stage.name,
                str(stage.priority),
                f"{stage.__module__}.{stage.__name__}",
                style=style,
            )

        rich.print(table)



---
File: nebari/src/_nebari/subcommands/init.py
---

import enum
import os
import pathlib
import re
from typing import Optional

import questionary
import rich
import typer
from pydantic import BaseModel

from _nebari.config import write_configuration
from _nebari.constants import (
    AWS_DEFAULT_REGION,
    AZURE_DEFAULT_REGION,
    GCP_DEFAULT_REGION,
)
from _nebari.initialize import render_config
from _nebari.provider.cloud import amazon_web_services, azure_cloud, google_cloud
from _nebari.stages.bootstrap import CiEnum
from _nebari.stages.kubernetes_keycloak import AuthenticationEnum
from _nebari.stages.terraform_state import TerraformStateEnum
from _nebari.utils import get_latest_kubernetes_version
from nebari import schema
from nebari.hookspecs import hookimpl
from nebari.schema import ProviderEnum

MISSING_CREDS_TEMPLATE = "Unable to locate your {provider} credentials, refer to this guide on how to generate them:\n\n[green]\t{link_to_docs}[/green]\n\n"
LINKS_TO_DOCS_TEMPLATE = (
    "For more details, refer to the Nebari docs:\n\n\t[green]{link_to_docs}[/green]\n\n"
)
LINKS_TO_EXTERNAL_DOCS_TEMPLATE = "For more details, refer to the {provider} docs:\n\n\t[green]{link_to_docs}[/green]\n\n"

# links to external docs
CREATE_AWS_CREDS = (
    "https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html"
)
CREATE_GCP_CREDS = (
    "https://cloud.google.com/iam/docs/creating-managing-service-accounts"
)
CREATE_AZURE_CREDS = "https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/guides/service_principal_client_secret#creating-a-service-principal-in-the-azure-portal"
CREATE_AUTH0_CREDS = "https://auth0.com/docs/get-started/auth0-overview/create-applications/machine-to-machine-apps"
CREATE_GITHUB_OAUTH_CREDS = "https://docs.github.com/en/developers/apps/building-oauth-apps/creating-an-oauth-app"
AWS_REGIONS = "https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-regions"
GCP_REGIONS = "https://cloud.google.com/compute/docs/regions-zones"
AZURE_REGIONS = "https://azure.microsoft.com/en-us/explore/global-infrastructure/geographies/#overview"


# links to Nebari docs
DOCS_HOME = "https://nebari.dev/docs/"
CHOOSE_CLOUD_PROVIDER = "https://nebari.dev/docs/get-started/deploy"

GUIDED_INIT_MSG = (
    "[bold green]START HERE[/bold green] - this will guide you step-by-step "
    "to generate your [purple]nebari-config.yaml[/purple]. "
    "It is an [i]alternative[/i] to passing the options listed below."
)

DEFAULT_REGION_MSG = "Defaulting to region: `{region}`."

DEFAULT_KUBERNETES_VERSION_MSG = (
    "Defaulting to highest supported Kubernetes version: `{kubernetes_version}`."
)

LATEST = "latest"

CLOUD_PROVIDER_FULL_NAME = {
    "Local": ProviderEnum.local.name,
    "Existing": ProviderEnum.existing.name,
    "Amazon Web Services": ProviderEnum.aws.name,
    "Google Cloud Platform": ProviderEnum.gcp.name,
    "Microsoft Azure": ProviderEnum.azure.name,
}


class GitRepoEnum(str, enum.Enum):
    github = "github.com"
    gitlab = "gitlab.com"


class InitInputs(schema.Base):
    cloud_provider: ProviderEnum = ProviderEnum.local
    project_name: schema.project_name_pydantic = ""
    domain_name: Optional[str] = None
    namespace: Optional[schema.namespace_pydantic] = "dev"
    auth_provider: AuthenticationEnum = AuthenticationEnum.password
    auth_auto_provision: bool = False
    repository: Optional[schema.github_url_pydantic] = None
    repository_auto_provision: bool = False
    ci_provider: CiEnum = CiEnum.none
    terraform_state: TerraformStateEnum = TerraformStateEnum.remote
    kubernetes_version: Optional[str] = None
    region: Optional[str] = None
    ssl_cert_email: Optional[schema.email_pydantic] = None
    disable_prompt: bool = False
    config_set: Optional[str] = None
    output: pathlib.Path = pathlib.Path("nebari-config.yaml")
    explicit: int = 0


def enum_to_list(enum_cls):
    return [e.value for e in enum_cls]


def get_region_docs(cloud_provider: str):
    if cloud_provider == ProviderEnum.aws.value.lower():
        return AWS_REGIONS
    elif cloud_provider == ProviderEnum.gcp.value.lower():
        return GCP_REGIONS
    elif cloud_provider == ProviderEnum.azure.value.lower():
        return AZURE_REGIONS


def handle_init(inputs: InitInputs, config_schema: BaseModel):
    """
    Take the inputs from the `nebari init` command, render the config and write it to a local yaml file.
    """

    # this will force the `set_kubernetes_version` to grab the latest version
    if inputs.kubernetes_version == "latest":
        inputs.kubernetes_version = None

    config = render_config(
        cloud_provider=inputs.cloud_provider,
        project_name=inputs.project_name,
        nebari_domain=inputs.domain_name,
        namespace=inputs.namespace,
        auth_provider=inputs.auth_provider,
        auth_auto_provision=inputs.auth_auto_provision,
        ci_provider=inputs.ci_provider,
        repository=inputs.repository,
        repository_auto_provision=inputs.repository_auto_provision,
        kubernetes_version=inputs.kubernetes_version,
        region=inputs.region,
        terraform_state=inputs.terraform_state,
        ssl_cert_email=inputs.ssl_cert_email,
        disable_prompt=inputs.disable_prompt,
        config_set=inputs.config_set,
    )

    try:
        write_configuration(
            inputs.output,
            config if not inputs.explicit else config_schema(**config),
            mode="x",
        )
    except FileExistsError:
        raise ValueError(
            "A nebari-config.yaml file already exists. Please move or delete it and try again."
        )


def check_repository_creds(ctx: typer.Context, git_provider: str):
    """Validate the necessary Git provider (GitHub) credentials are set."""

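    # `and` binds tighter than `or`, so the two credential checks are grouped
    # explicitly; the prompts only apply when the provider is GitHub.
    if git_provider == GitRepoEnum.github.value.lower() and (
        not os.environ.get("GITHUB_USERNAME") or not os.environ.get("GITHUB_TOKEN")
    ):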
        os.environ["GITHUB_USERNAME"] = typer.prompt(
            "Paste your GITHUB_USERNAME",
            hide_input=True,
        )
        os.environ["GITHUB_TOKEN"] = typer.prompt(
            "Paste your GITHUB_TOKEN",
            hide_input=True,
        )


def typer_validate_regex(regex: str, error_message: Optional[str] = None):
    def callback(value):
        if value is None:
            return value

        if re.fullmatch(regex, value):
            return value
        message = error_message or f"Does not match {regex}"
        raise typer.BadParameter(message)

    return callback


def questionary_validate_regex(regex: str, error_message: Optional[str] = None):
    def callback(value):
        if re.fullmatch(regex, value):
            return True

        message = error_message or f"Invalid input. Does not match {regex}"
        return message

    return callback


def check_auth_provider_creds(ctx: typer.Context, auth_provider: str):
    """Validate that the necessary auth provider credentials have been set as environment variables."""
    if ctx.params.get("disable_prompt"):
        return auth_provider.lower()

    auth_provider = auth_provider.lower()

    # Auth0
    if auth_provider == AuthenticationEnum.auth0.value.lower() and (
        not os.environ.get("AUTH0_CLIENT_ID")
        or not os.environ.get("AUTH0_CLIENT_SECRET")
        or not os.environ.get("AUTH0_DOMAIN")
    ):
        rich.print(
            MISSING_CREDS_TEMPLATE.format(
                provider="Auth0", link_to_docs=CREATE_AUTH0_CREDS
            )
        )

        if not os.environ.get("AUTH0_CLIENT_ID"):
            os.environ["AUTH0_CLIENT_ID"] = typer.prompt(
                "Paste your AUTH0_CLIENT_ID",
                hide_input=True,
            )

        if not os.environ.get("AUTH0_CLIENT_SECRET"):
            os.environ["AUTH0_CLIENT_SECRET"] = typer.prompt(
                "Paste your AUTH0_CLIENT_SECRET",
                hide_input=True,
            )

        if not os.environ.get("AUTH0_DOMAIN"):
            os.environ["AUTH0_DOMAIN"] = typer.prompt(
                "Paste your AUTH0_DOMAIN",
                hide_input=True,
            )

    # GitHub
    elif auth_provider == AuthenticationEnum.github.value.lower() and (
        not os.environ.get("GITHUB_CLIENT_ID")
        or not os.environ.get("GITHUB_CLIENT_SECRET")
    ):
        rich.print(
            MISSING_CREDS_TEMPLATE.format(
                provider="GitHub OAuth App", link_to_docs=CREATE_GITHUB_OAUTH_CREDS
            )
        )

        if not os.environ.get("GITHUB_CLIENT_ID"):
            os.environ["GITHUB_CLIENT_ID"] = typer.prompt(
                "Paste your GITHUB_CLIENT_ID",
                hide_input=True,
            )

        if not os.environ.get("GITHUB_CLIENT_SECRET"):
            os.environ["GITHUB_CLIENT_SECRET"] = typer.prompt(
                "Paste your GITHUB_CLIENT_SECRET",
                hide_input=True,
            )

    return auth_provider


def check_cloud_provider_creds(cloud_provider: ProviderEnum, disable_prompt: bool):
    """Validate that the necessary cloud credentials have been set as environment variables."""

    if disable_prompt:
        return cloud_provider.lower()

    # AWS
    if cloud_provider == ProviderEnum.aws.value.lower() and (
        not os.environ.get("AWS_ACCESS_KEY_ID")
        or not os.environ.get("AWS_SECRET_ACCESS_KEY")
    ):
        rich.print(
            MISSING_CREDS_TEMPLATE.format(
                provider="Amazon Web Services", link_to_docs=CREATE_AWS_CREDS
            )
        )

        os.environ["AWS_ACCESS_KEY_ID"] = typer.prompt(
            "Paste your AWS_ACCESS_KEY_ID",
            hide_input=True,
        )
        os.environ["AWS_SECRET_ACCESS_KEY"] = typer.prompt(
            "Paste your AWS_SECRET_ACCESS_KEY",
            hide_input=True,
        )

    # GCP
    elif cloud_provider == ProviderEnum.gcp.value.lower() and (
        not os.environ.get("GOOGLE_CREDENTIALS") or not os.environ.get("PROJECT_ID")
    ):
        rich.print(
            MISSING_CREDS_TEMPLATE.format(
                provider="Google Cloud Platform", link_to_docs=CREATE_GCP_CREDS
            )
        )

        os.environ["GOOGLE_CREDENTIALS"] = typer.prompt(
            "Paste your GOOGLE_CREDENTIALS",
            hide_input=True,
        )
        os.environ["PROJECT_ID"] = typer.prompt(
            "Paste your PROJECT_ID",
            hide_input=True,
        )

    # AZURE
    elif cloud_provider == ProviderEnum.azure.value.lower() and (
        not os.environ.get("ARM_CLIENT_ID")
        or not os.environ.get("ARM_CLIENT_SECRET")
        or not os.environ.get("ARM_SUBSCRIPTION_ID")
        or not os.environ.get("ARM_TENANT_ID")
    ):
        rich.print(
            MISSING_CREDS_TEMPLATE.format(
                provider="Azure", link_to_docs=CREATE_AZURE_CREDS
            )
        )
        os.environ["ARM_CLIENT_ID"] = typer.prompt(
            "Paste your ARM_CLIENT_ID",
            hide_input=True,
        )
        os.environ["ARM_SUBSCRIPTION_ID"] = typer.prompt(
            "Paste your ARM_SUBSCRIPTION_ID",
            hide_input=True,
        )
        os.environ["ARM_TENANT_ID"] = typer.prompt(
            "Paste your ARM_TENANT_ID",
            hide_input=True,
        )
        os.environ["ARM_CLIENT_SECRET"] = typer.prompt(
            "Paste your ARM_CLIENT_SECRET",
            hide_input=True,
        )

    return cloud_provider


def check_cloud_provider_kubernetes_version(
    kubernetes_version: str, cloud_provider: str, region: str
):
    if cloud_provider == ProviderEnum.aws.value.lower():
        versions = amazon_web_services.kubernetes_versions(region)

        if not kubernetes_version or kubernetes_version == LATEST:
            kubernetes_version = get_latest_kubernetes_version(versions)
            rich.print(
                DEFAULT_KUBERNETES_VERSION_MSG.format(
                    kubernetes_version=kubernetes_version
                )
            )
        if kubernetes_version not in versions:
            raise ValueError(
                f"Invalid Kubernetes version `{kubernetes_version}`. Please refer to the AWS docs for a list of valid versions: {versions}"
            )
    elif cloud_provider == ProviderEnum.azure.value.lower():
        versions = azure_cloud.kubernetes_versions(region)

        if not kubernetes_version or kubernetes_version == LATEST:
            kubernetes_version = get_latest_kubernetes_version(versions)
            rich.print(
                DEFAULT_KUBERNETES_VERSION_MSG.format(
                    kubernetes_version=kubernetes_version
                )
            )
        if kubernetes_version not in versions:
            raise ValueError(
                f"Invalid Kubernetes version `{kubernetes_version}`. Please refer to the Azure docs for a list of valid versions: {versions}"
            )
    elif cloud_provider == ProviderEnum.gcp.value.lower():
        versions = google_cloud.kubernetes_versions(region)

        if not kubernetes_version or kubernetes_version == LATEST:
            kubernetes_version = google_cloud.get_patch_version(
                get_latest_kubernetes_version(versions)
            )
            rich.print(
                DEFAULT_KUBERNETES_VERSION_MSG.format(
                    kubernetes_version=kubernetes_version
                )
            )
        if not any(v.startswith(kubernetes_version) for v in versions):
            raise ValueError(
                f"Invalid Kubernetes version `{kubernetes_version}`. Please refer to the GCP docs for a list of valid versions: {versions}"
            )

    return kubernetes_version


def check_cloud_provider_region(region: str, cloud_provider: str) -> str:
    if cloud_provider == ProviderEnum.aws.value.lower():
        if not region:
            region = os.environ.get("AWS_DEFAULT_REGION")
            if not region:
                region = AWS_DEFAULT_REGION
                rich.print(f"Defaulting to `{region}` region.")
            else:
                rich.print(
                    f"Falling back to the region found in the AWS_DEFAULT_REGION environment variable: `{region}`"
                )
        region = amazon_web_services.validate_region(region)
    elif cloud_provider == ProviderEnum.azure.value.lower():
        # TODO: Add a check for valid region for Azure
        if not region:
            region = AZURE_DEFAULT_REGION
            rich.print(DEFAULT_REGION_MSG.format(region=region))
    elif cloud_provider == ProviderEnum.gcp.value.lower():
        if not region:
            region = GCP_DEFAULT_REGION
            rich.print(DEFAULT_REGION_MSG.format(region=region))
        if region not in google_cloud.regions():
            raise ValueError(
                f"Invalid region `{region}`. Please refer to the GCP docs for a list of valid regions: {GCP_REGIONS}"
            )

    return region


@hookimpl
def nebari_subcommand(cli: typer.Typer):
    @cli.command()
    def init(
        cloud_provider: ProviderEnum = typer.Argument(
            ProviderEnum.local,
            help=f"options: {enum_to_list(ProviderEnum)}",
        ),
        # Although this is unused below, the functionality is contained in the callback. Thus,
        # this attribute cannot be removed.
        guided_init: bool = typer.Option(
            False,
            help=GUIDED_INIT_MSG,
            callback=guided_init_wizard,
            is_eager=True,
        ),
        project_name: str = typer.Option(
            ...,
            "--project-name",
            "--project",
            "-p",
            callback=typer_validate_regex(
                schema.project_name_regex,
                "Project name must (1) consist of only letters, numbers, hyphens, and underscores, (2) begin and end with a letter, and (3) contain between 3 and 16 characters.",
            ),
        ),
        domain_name: Optional[str] = typer.Option(
            None,
            "--domain-name",
            "--domain",
            "-d",
        ),
        namespace: str = typer.Option(
            "dev",
            callback=typer_validate_regex(
                schema.namespace_regex,
                "Namespace must begin and end with a letter and consist of letters, dashes, or underscores.",
            ),
        ),
        region: str = typer.Option(
            None,
            help="The region you want to deploy your Nebari cluster to (if deploying to the cloud)",
        ),
        auth_provider: AuthenticationEnum = typer.Option(
            AuthenticationEnum.password,
            help=f"options: {enum_to_list(AuthenticationEnum)}",
            callback=check_auth_provider_creds,
        ),
        auth_auto_provision: bool = typer.Option(
            False,
        ),
        repository: str = typer.Option(
            None,
            help="GitHub repository URL to be initialized with --repository-auto-provision",
            callback=typer_validate_regex(
                schema.github_url_regex,
                "Must be a fully qualified GitHub repository URL.",
            ),
        ),
        repository_auto_provision: bool = typer.Option(
            False,
            help="Initialize the GitHub repository provided by --repository (GitHub credentials required)",
        ),
        ci_provider: CiEnum = typer.Option(
            CiEnum.none,
            help=f"options: {enum_to_list(CiEnum)}",
        ),
        terraform_state: TerraformStateEnum = typer.Option(
            TerraformStateEnum.remote,
            help=f"options: {enum_to_list(TerraformStateEnum)}",
        ),
        kubernetes_version: str = typer.Option(
            LATEST,
            help="The Kubernetes version you want to deploy your Nebari cluster to; leave blank for the latest version",
        ),
        ssl_cert_email: str = typer.Option(
            None,
            callback=typer_validate_regex(
                schema.email_regex,
                f"Email must be valid and match the regex {schema.email_regex}",
            ),
        ),
        disable_prompt: bool = typer.Option(
            False,
            is_eager=True,
        ),
        config_set: str = typer.Option(
            None,
            "--config-set",
            "-s",
            help="Apply a pre-defined set of nebari configuration options.",
        ),
        output: str = typer.Option(
            pathlib.Path("nebari-config.yaml"),
            "--output",
            "-o",
            help="Output file path for the rendered config file.",
        ),
        explicit: int = typer.Option(
            0,
            "--explicit",
            "-e",
            count=True,
            help="Write explicit nebari config file (advanced users only).",
        ),
    ):
        """
        Create and initialize your [purple]nebari-config.yaml[/purple] file.

        This command will create and initialize your [purple]nebari-config.yaml[/purple] :sparkles:

        This file contains all your Nebari cluster configuration details and
        is used as input to later commands such as [green]nebari render[/green], [green]nebari deploy[/green], etc.

        If you're new to Nebari, we recommend you use the Guided Init wizard.
        To get started simply run:

                [green]nebari init --guided-init[/green]

        """
        inputs = InitInputs()

        # validate inputs after they've been set so we can control the order they are validated
        # validations for --guided-init should be handled as callbacks within the `guided_init_wizard`
        inputs.cloud_provider = check_cloud_provider_creds(
            cloud_provider, disable_prompt
        )

        # DigitalOcean is no longer supported
        if inputs.cloud_provider == "do":
            rich.print(
                ":warning: DigitalOcean is no longer supported. You'll need to deploy to an existing k8s cluster if you plan to use Nebari on DigitalOcean :warning:\n"
            )

        inputs.region = check_cloud_provider_region(region, inputs.cloud_provider)
        inputs.kubernetes_version = check_cloud_provider_kubernetes_version(
            kubernetes_version, inputs.cloud_provider, inputs.region
        )

        inputs.project_name = project_name
        inputs.domain_name = domain_name
        inputs.namespace = namespace
        inputs.auth_provider = auth_provider
        inputs.auth_auto_provision = auth_auto_provision
        inputs.repository = repository
        inputs.repository_auto_provision = repository_auto_provision
        inputs.ci_provider = ci_provider
        inputs.terraform_state = terraform_state
        inputs.ssl_cert_email = ssl_cert_email
        inputs.disable_prompt = disable_prompt
        inputs.config_set = config_set
        inputs.output = output
        inputs.explicit = explicit

        from nebari.plugins import nebari_plugin_manager

        handle_init(inputs, config_schema=nebari_plugin_manager.config_schema)

        nebari_plugin_manager.read_config(output)


def guided_init_wizard(ctx: typer.Context, guided_init: bool):
    """
    Guided Init Wizard is a user-friendly questionnaire used to help generate the `nebari-config.yaml`.
    """
    qmark = "  "
    disable_checks = os.environ.get("NEBARI_DISABLE_INIT_CHECKS", False)

    if not guided_init:
        return guided_init

    if pathlib.Path("nebari-config.yaml").exists():
        raise ValueError(
            "A nebari-config.yaml file already exists. Please move or delete it and try again."
        )

    try:
        rich.print(
            (
                "\n\t[bold]Welcome to the Guided Init wizard![/bold]\n\n"
                "You will be asked a few questions to generate your [purple]nebari-config.yaml[/purple]. "
                f"{LINKS_TO_DOCS_TEMPLATE.format(link_to_docs=DOCS_HOME)}"
            )
        )

        if disable_checks:
            rich.print(
                "⚠️  Attempting to use the Guided Init wizard without any validation checks. There is no guarantee values provided will work!  ⚠️\n\n"
            )

        # pull in default values for each of the below
        inputs = InitInputs()

        # CLOUD PROVIDER
        rich.print(
            (
                "\n 🪴  Nebari runs on a Kubernetes cluster. Where do you want this Kubernetes cluster deployed? "
                f"{LINKS_TO_DOCS_TEMPLATE.format(link_to_docs=CHOOSE_CLOUD_PROVIDER)}"
                "\n\t❗️ [purple]local[/purple] requires Docker and Kubernetes running on your local machine. "
                "[italic]Currently only available on Linux OS.[/italic]"
                "\n\t❗️ [purple]existing[/purple] refers to an existing Kubernetes cluster that Nebari can be deployed on.\n"
                "\n\t❗️ [red]Digital Ocean[/red] is currently being deprecated and support will be removed in the future.\n"
            )
        )
        # try:
        cloud_provider: str = questionary.select(
            "Where would you like to deploy your Nebari cluster?",
            choices=CLOUD_PROVIDER_FULL_NAME.keys(),
            qmark=qmark,
        ).unsafe_ask()

        inputs.cloud_provider = CLOUD_PROVIDER_FULL_NAME.get(cloud_provider)

        if not disable_checks:
            check_cloud_provider_creds(
                cloud_provider=inputs.cloud_provider,
                disable_prompt=ctx.params.get("disable_prompt"),
            )

        # specific context needed when `check_project_name` is called
        ctx.params["cloud_provider"] = inputs.cloud_provider

        # cloud region
        if (
            inputs.cloud_provider != ProviderEnum.local.value.lower()
            and inputs.cloud_provider != ProviderEnum.existing.value.lower()
        ):
            aws_region = os.environ.get("AWS_DEFAULT_REGION")
            if inputs.cloud_provider == ProviderEnum.aws.value.lower() and aws_region:
                region = aws_region
            else:
                region_docs = get_region_docs(inputs.cloud_provider)
                rich.print(
                    (
                        "\n 🪴  Nebari clusters that run in the cloud require specifying which region to deploy to, "
                        "please review the cloud provider docs for the names and format these regions take. "
                        f"{LINKS_TO_EXTERNAL_DOCS_TEMPLATE.format(provider=inputs.cloud_provider.value, link_to_docs=region_docs)}"
                    )
                )

                region = questionary.text(
                    "In which region would you like to deploy your Nebari cluster?",
                    qmark=qmark,
                ).unsafe_ask()

            if not disable_checks:
                region = check_cloud_provider_region(
                    region, cloud_provider=inputs.cloud_provider
                )

            inputs.region = region
            ctx.params["region"] = region

        name_guidelines = """
        The project name must adhere to the following requirements:
        - Letters from A to Z (upper and lower case), numbers, hyphens, and underscores
        - Length from 3 to 16 characters
        - Begin and end with a letter
        """

        # PROJECT NAME
        rich.print(
            (
                f"\n 🪴  Next, give your Nebari instance a project name. This name is what your Kubernetes cluster will be referred to as.\n{name_guidelines}\n"
            )
        )
        inputs.project_name = questionary.text(
            "What project name would you like to use?",
            qmark=qmark,
            validate=questionary_validate_regex(schema.project_name_regex),
        ).unsafe_ask()

        # DOMAIN NAME
        rich.print(
            (
                "\n\n 🪴  Great! Now you can provide a valid domain name (i.e. the URL) to access your Nebari instance. "
                "This should be a domain that you own. Default if unspecified is the IP of the load balancer.\n\n"
            )
        )
        inputs.domain_name = (
            questionary.text(
                "What domain name would you like to use?",
                qmark=qmark,
            ).unsafe_ask()
            or None
        )

        # AUTH PROVIDER
        rich.print(
            (
                # TODO once docs are updated, add links for more details
                "\n\n 🪴  Nebari comes with [green]Keycloak[/green], an open-source identity and access management tool. This is how users and permissions "
                "are managed on the platform. To connect Keycloak with an identity provider, you can select one now.\n\n"
                "\n\t❗️ [purple]password[/purple] is the default option and is not connected to any external identity provider.\n"
            )
        )
        inputs.auth_provider = questionary.select(
            "What authentication provider would you like?",
            choices=enum_to_list(AuthenticationEnum),
            qmark=qmark,
        ).unsafe_ask()

        if not disable_checks:
            check_auth_provider_creds(ctx, auth_provider=inputs.auth_provider)

        if inputs.auth_provider.lower() == AuthenticationEnum.auth0.value.lower():
            inputs.auth_auto_provision = questionary.confirm(
                "Would you like us to auto provision the Auth0 Machine-to-Machine app?",
                default=False,
                qmark=qmark,
                auto_enter=False,
            ).unsafe_ask()

        elif inputs.auth_provider.lower() == AuthenticationEnum.github.value.lower():
            rich.print(
                (
                    ":warning: If you haven't done so already, please ensure the following:\n"
                    f"The `Homepage URL` is set to: [green]https://{inputs.domain_name}[/green]\n"
                    f"The `Authorization callback URL` is set to: [green]https://{inputs.domain_name}/auth/realms/nebari/broker/github/endpoint[/green]\n\n"
                )
            )

        # GITOPS - REPOSITORY, CICD
        rich.print(
            (
                "\n\n 🪴  This next section is [italic]optional[/italic] but recommended. If you wish to adopt a GitOps approach to managing this platform, "
                "we will walk you through a set of questions to get that set up. With this setup, Nebari will use GitHub Actions workflows (or GitLab equivalent) "
                "to automatically handle the future deployments of your infrastructure.\n\n"
            )
        )
        if questionary.confirm(
            "Would you like to adopt a GitOps approach to managing Nebari?",
            default=False,
            qmark=qmark,
            auto_enter=False,
        ).unsafe_ask():
            repo_url = "https://{git_provider}/{org_name}/{repo_name}"

            git_provider = questionary.select(
                "Which git provider would you like to use?",
                choices=enum_to_list(GitRepoEnum),
                qmark=qmark,
            ).unsafe_ask()

            if git_provider == GitRepoEnum.github.value.lower():
                inputs.ci_provider = CiEnum.github_actions.value.lower()

                inputs.repository_auto_provision = questionary.confirm(
                    f"Would you like nebari to create a remote repository on {git_provider}?",
                    default=False,
                    qmark=qmark,
                    auto_enter=False,
                ).unsafe_ask()

                if inputs.repository_auto_provision:
                    org_name = questionary.text(
                        f"Which user or organization will this repository live under? ({repo_url.format(git_provider=git_provider, org_name='<org-name>', repo_name='')})",
                        qmark=qmark,
                    ).unsafe_ask()

                    repo_name = questionary.text(
                        f"And what will the name of this repository be? ({repo_url.format(git_provider=git_provider, org_name=org_name, repo_name='<repo-name>')})",
                        qmark=qmark,
                    ).unsafe_ask()

                    inputs.repository = repo_url.format(
                        git_provider=git_provider,
                        org_name=org_name,
                        repo_name=repo_name,
                    )

                    if not disable_checks:
                        check_repository_creds(ctx, git_provider)

            elif git_provider == GitRepoEnum.gitlab.value.lower():
                inputs.ci_provider = CiEnum.gitlab_ci.value.lower()

        # SSL CERTIFICATE
        if inputs.domain_name:
            rich.print(
                (
                    "\n\n 🪴  This next section is [italic]optional[/italic] but recommended. If you want your Nebari domain to use a Let's Encrypt SSL certificate, "
                    "all we need is an email address from you.\n\n"
                )
            )
            ssl_cert = questionary.confirm(
                "Would you like to add a Let's Encrypt SSL certificate to your domain?",
                default=False,
                qmark=qmark,
                auto_enter=False,
            ).unsafe_ask()

            if ssl_cert:
                inputs.ssl_cert_email = questionary.text(
                    "Which email address should Let's Encrypt associate the certificate with?",
                    qmark=qmark,
                ).unsafe_ask()

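                if not disable_checks:
                    # typer_validate_regex returns a callback; invoke it on the
                    # entered address so the validation actually runs.
                    typer_validate_regex(
                        schema.email_regex,
                        f"Email must be valid and match the regex {schema.email_regex}",
                    )(inputs.ssl_cert_email)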

        # ADVANCED FEATURES
        rich.print(
            (
                # TODO once docs are updated, add links for more info on these changes
                "\n\n 🪴  This next section is [italic]optional[/italic] and includes advanced configuration changes to the "
                "Terraform state, Kubernetes Namespace and Kubernetes version."
                "\n ⚠️  caution is advised!\n\n"
            )
        )
        if questionary.confirm(
            "Would you like to make advanced configuration changes?",
            default=False,
            qmark=qmark,
            auto_enter=False,
        ).unsafe_ask():
            # TERRAFORM STATE
            inputs.terraform_state = questionary.select(
                "Where should the Terraform State be provisioned?",
                choices=enum_to_list(TerraformStateEnum),
                qmark=qmark,
            ).unsafe_ask()

            # NAMESPACE
            inputs.namespace = questionary.text(
                "What would you like the main Kubernetes namespace to be called?",
                default=inputs.namespace,
                qmark=qmark,
            ).unsafe_ask()

            # KUBERNETES VERSION
            kubernetes_version = questionary.text(
                "Which Kubernetes version would you like to use? (If none is provided, the latest version will be installed.)",
                qmark=qmark,
            ).unsafe_ask()
            if not disable_checks:
                check_cloud_provider_kubernetes_version(
                    kubernetes_version=kubernetes_version,
                    cloud_provider=inputs.cloud_provider,
                    region=inputs.region,
                )
            inputs.kubernetes_version = kubernetes_version

            # EXPLICIT CONFIG
            inputs.explicit = questionary.confirm(
                "Would you like the nebari config to show all available options? (recommended for advanced users only)",
                default=False,
                qmark=qmark,
                auto_enter=False,
            ).unsafe_ask()

        from nebari.plugins import nebari_plugin_manager

        config_schema = nebari_plugin_manager.config_schema

        handle_init(inputs, config_schema=config_schema)

        rich.print(
            (
                "\n\n\t:sparkles: [bold]Congratulations[/bold], you have generated the all-important [purple]nebari-config.yaml[/purple] file :sparkles:\n\n"
                "You can always make changes to your [purple]nebari-config.yaml[/purple] file by editing the file directly.\n"
                "If you do make changes to it you can ensure it's still a valid configuration by running:\n\n"
                "\t[green]nebari validate --config path/to/nebari-config.yaml[/green]\n\n"
            )
        )

        base_cmd = f"nebari init {inputs.cloud_provider.value}"

        def if_used(key, model=inputs, ignore_list=("cloud_provider",)):
            """Return the CLI flag (and value) for ``key``, or None if it is unset."""
            if key in ignore_list:
                return None
            # only the flag name is dash-separated; the value is kept verbatim
            flag = "--" + key.replace("_", "-")
            value = getattr(model, key)
            if isinstance(value, enum.Enum):
                return f"{flag} {value.value}"
            elif isinstance(value, bool):
                return flag if value else None
            elif isinstance(value, (int, str)):
                return f"{flag} {value}" if value else None
            return None

        cmds = " ".join(
            flag
            for flag in (if_used(key) for key in inputs.model_dump().keys())
            if flag is not None
        )

        rich.print(
            (
                "For reference, if the previous Guided Init answers were converted into a direct [green]nebari init[/green] command, it would be:\n\n"
                f"\t[green]{base_cmd} {cmds}[/green]\n\n"
            )
        )

        rich.print(
            (
                "You can now deploy your Nebari instance with:\n\n"
                "\t[green]nebari deploy -c nebari-config.yaml[/green]\n\n"
                "For more information, run [green]nebari deploy --help[/green] or check out the documentation: "
                "[green]https://www.nebari.dev/docs/how-tos/[/green]"
            )
        )

    except KeyboardInterrupt:
        rich.print("\nUser quit the Guided Init.\n\n ")
        raise typer.Exit()

    raise typer.Exit()



---
File: nebari/src/_nebari/subcommands/keycloak.py
---

import json
import pathlib
from typing import Tuple

import typer

from _nebari.config import read_configuration
from _nebari.keycloak import do_keycloak, export_keycloak_users
from nebari.hookspecs import hookimpl


@hookimpl
def nebari_subcommand(cli: typer.Typer):
    app_keycloak = typer.Typer(
        add_completion=False,
        no_args_is_help=True,
        rich_markup_mode="rich",
        context_settings={"help_option_names": ["-h", "--help"]},
    )

    cli.add_typer(
        app_keycloak,
        name="keycloak",
        help="Interact with the Nebari Keycloak identity and access management tool.",
        rich_help_panel="Additional Commands",
    )

    @app_keycloak.command(name="adduser")
    def add_user(
        add_users: Tuple[str, str] = typer.Option(
            ..., "--user", help="Provide both: <username> <password>"
        ),
        config_filename: pathlib.Path = typer.Option(
            ...,
            "-c",
            "--config",
            help="nebari configuration file path",
        ),
    ):
        """Add a user to Keycloak. User will be automatically added to the [italic]analyst[/italic] group."""
        from nebari.plugins import nebari_plugin_manager

        args = ["adduser", add_users[0], add_users[1]]
        config_schema = nebari_plugin_manager.config_schema
        config = read_configuration(config_filename, config_schema)
        do_keycloak(config, *args)

    @app_keycloak.command(name="listusers")
    def list_users(
        config_filename: pathlib.Path = typer.Option(
            ...,
            "-c",
            "--config",
            help="nebari configuration file path",
        )
    ):
        """List the users in Keycloak."""
        from nebari.plugins import nebari_plugin_manager

        args = ["listusers"]
        config_schema = nebari_plugin_manager.config_schema
        config = read_configuration(config_filename, config_schema)
        do_keycloak(config, *args)

    @app_keycloak.command(name="export-users")
    def export_users(
        config_filename: pathlib.Path = typer.Option(
            ...,
            "-c",
            "--config",
            help="nebari configuration file path",
        ),
        realm: str = typer.Option(
            "nebari",
            "--realm",
            help="realm from which users are to be exported",
        ),
    ):
        """Export the users in Keycloak."""
        from nebari.plugins import nebari_plugin_manager

        config_schema = nebari_plugin_manager.config_schema
        config = read_configuration(config_filename, config_schema=config_schema)
        r = export_keycloak_users(config, realm=realm)
        print(json.dumps(r, indent=4))



---
File: nebari/src/_nebari/subcommands/plugin.py
---

from importlib.metadata import version

import rich
import typer
from rich.table import Table

from nebari.hookspecs import hookimpl


@hookimpl
def nebari_subcommand(cli: typer.Typer):
    plugin_cmd = typer.Typer(
        add_completion=False,
        no_args_is_help=True,
        rich_markup_mode="rich",
        context_settings={"help_option_names": ["-h", "--help"]},
    )

    cli.add_typer(
        plugin_cmd,
        name="plugin",
        help="Interact with nebari plugins",
        rich_help_panel="Additional Commands",
    )

    @plugin_cmd.command()
    def list(ctx: typer.Context):
        """
        List installed plugins
        """
        from nebari.plugins import nebari_plugin_manager

        external_plugins = nebari_plugin_manager.get_external_plugins()

        table = Table(title="Plugins")
        table.add_column("name", justify="left", no_wrap=True)
        table.add_column("version", justify="left", no_wrap=True)

        for plugin in external_plugins:
            table.add_row(plugin, version(plugin))

        rich.print(table)



---
File: nebari/src/_nebari/subcommands/render.py
---

import pathlib

import typer

from _nebari.config import read_configuration
from _nebari.render import render_template
from nebari.hookspecs import hookimpl


@hookimpl
def nebari_subcommand(cli: typer.Typer):
    @cli.command(rich_help_panel="Additional Commands")
    def render(
        ctx: typer.Context,
        output_directory: pathlib.Path = typer.Option(
            "./",
            "-o",
            "--output",
            help="output directory",
        ),
        config_filename: pathlib.Path = typer.Option(
            ...,
            "-c",
            "--config",
            help="nebari configuration yaml file path",
        ),
        dry_run: bool = typer.Option(
            False,
            "--dry-run",
            help="simulate rendering files without actually writing or updating any files",
        ),
    ):
        """
        Dynamically render the Terraform scripts and other files from your [purple]nebari-config.yaml[/purple] file.
        """
        from nebari.plugins import nebari_plugin_manager

        stages = nebari_plugin_manager.ordered_stages
        config_schema = nebari_plugin_manager.config_schema

        config = read_configuration(config_filename, config_schema=config_schema)
        render_template(output_directory, config, stages, dry_run=dry_run)



---
File: nebari/src/_nebari/subcommands/support.py
---

import pathlib
from zipfile import ZipFile

import kubernetes.client
import kubernetes.client.exceptions
import kubernetes.config
import typer

from _nebari.config import read_configuration
from nebari.hookspecs import hookimpl


@hookimpl
def nebari_subcommand(cli: typer.Typer):
    @cli.command(rich_help_panel="Additional Commands")
    def support(
        config_filename: pathlib.Path = typer.Option(
            ...,
            "-c",
            "--config",
            help="nebari configuration file path",
        ),
        output: str = typer.Option(
            "./nebari-support-logs.zip",
            "-o",
            "--output",
            help="output filename",
        ),
    ):
        """
        Support tool to write all Kubernetes logs locally and compress them into a zip file.

        The Nebari team recommends k9s to manage and inspect the state of the cluster.
        However, this command is occasionally helpful for debugging purposes should the logs need to be shared.
        """
        from nebari.plugins import nebari_plugin_manager

        config_schema = nebari_plugin_manager.config_schema
        namespace = read_configuration(config_filename, config_schema).namespace

        kubernetes.config.kube_config.load_kube_config()

        v1 = kubernetes.client.CoreV1Api()

        pods = v1.list_namespaced_pod(namespace=namespace)

        for pod in pods.items:
            pathlib.Path(f"./log/{namespace}").mkdir(parents=True, exist_ok=True)
            path = pathlib.Path(f"./log/{namespace}/{pod.metadata.name}.txt")
            with path.open(mode="wt") as file:
                try:
                    file.write(
                        "%s\t%s\t%s\n"
                        % (
                            pod.status.pod_ip,
                            namespace,
                            pod.metadata.name,
                        )
                    )

                    # some pods are running multiple containers
                    containers = [
                        _.name if len(pod.spec.containers) > 1 else None
                        for _ in pod.spec.containers
                    ]

                    for container in containers:
                        if container is not None:
                            file.write(f"Container: {container}\n")
                        file.write(
                            v1.read_namespaced_pod_log(
                                name=pod.metadata.name,
                                namespace=namespace,
                                container=container,
                            )
                            + "\n"
                        )

                except kubernetes.client.exceptions.ApiException as e:
                    file.write("%s not available" % pod.metadata.name)
                    raise e

        with ZipFile(output, "w") as zip:
            for file in list(pathlib.Path(f"./log/{namespace}").glob("*.txt")):
                print(file)
                zip.write(file)



---
File: nebari/src/_nebari/subcommands/upgrade.py
---

import pathlib

import typer

from _nebari.upgrade import do_upgrade
from nebari.hookspecs import hookimpl


@hookimpl
def nebari_subcommand(cli: typer.Typer):
    @cli.command(rich_help_panel="Additional Commands")
    def upgrade(
        config_filename: pathlib.Path = typer.Option(
            ...,
            "-c",
            "--config",
            help="nebari configuration file path",
        ),
        attempt_fixes: bool = typer.Option(
            False,
            "--attempt-fixes",
            help="Attempt to fix the config for any incompatibilities between your old and new Nebari versions.",
        ),
    ):
        """
        Upgrade your [purple]nebari-config.yaml[/purple].

        Upgrade your [purple]nebari-config.yaml[/purple] after a Nebari upgrade. If necessary, prompts users to perform manual upgrade steps required for the deploy process.

        See the project [green]RELEASE.md[/green] for details.
        """
        if not config_filename.is_file():
            raise ValueError(
                f"passed in configuration filename={config_filename} must exist"
            )

        do_upgrade(config_filename, attempt_fixes=attempt_fixes)



---
File: nebari/src/_nebari/subcommands/validate.py
---

import pathlib

import pydantic
import typer
from rich import print

from nebari.hookspecs import hookimpl


@hookimpl
def nebari_subcommand(cli: typer.Typer):
    @cli.command(rich_help_panel="Additional Commands")
    def validate(
        config_filename: pathlib.Path = typer.Option(
            ...,
            "--config",
            "-c",
            help="nebari configuration yaml file path, please pass in as -c/--config flag",
        ),
        enable_commenting: bool = typer.Option(
            False, "--enable-commenting", help="Toggle PR commenting on GitHub Actions"
        ),
    ):
        """
        Validate that the values in the [purple]nebari-config.yaml[/purple] file are acceptable.
        """
        if enable_commenting:
            # for PR's only
            # comment_on_pr(config)
            pass
        else:
            from nebari.plugins import nebari_plugin_manager

            try:
                nebari_plugin_manager.read_config(config_filename)
                print(
                    "[bold purple]Successfully validated configuration.[/bold purple]"
                )
            except pydantic.ValidationError as e:
                print(
                    f"[bold red]ERROR validating configuration {config_filename.absolute()}[/bold red]"
                )
                print(str(e))
                raise typer.Abort()



---
File: nebari/src/_nebari/__init__.py
---




---
File: nebari/src/_nebari/cli.py
---

import sys
import typing

import typer
from typer.core import TyperGroup

from _nebari.version import __version__
from nebari.plugins import nebari_plugin_manager


class OrderCommands(TyperGroup):
    def list_commands(self, ctx: typer.Context):
        """Return the list of commands in reverse registration order."""
        return list(self.commands)[::-1]


def version_callback(value: bool):
    if value:
        typer.echo(__version__)
        raise typer.Exit()


def exclude_stages(ctx: typer.Context, stages: typing.List[str]):
    nebari_plugin_manager.excluded_stages = stages
    return stages


def exclude_default_stages(ctx: typer.Context, exclude_default_stages: bool):
    nebari_plugin_manager.exclude_default_stages = exclude_default_stages
    return exclude_default_stages


def import_plugin(plugins: typing.List[str]):
    try:
        nebari_plugin_manager.load_plugins(plugins)
    except ModuleNotFoundError as e:
        typer.echo(
            f"ERROR: Python module {e.name} not found. Make sure that the module is in your python path {sys.path}"
        )
        # exit with a non-zero code so callers can detect the failure
        raise typer.Exit(code=1)
    return plugins


def create_cli():
    app = typer.Typer(
        cls=OrderCommands,
        help="Nebari CLI 🪴",
        add_completion=False,
        no_args_is_help=True,
        rich_markup_mode="rich",
        pretty_exceptions_show_locals=False,
        context_settings={"help_option_names": ["-h", "--help"]},
    )

    @app.callback()
    def common(
        ctx: typer.Context,
        version: bool = typer.Option(
            None,
            "-V",
            "--version",
            help="Nebari version number",
            callback=version_callback,
        ),
        plugins: typing.List[str] = typer.Option(
            [],
            "--import-plugin",
            help="Import nebari plugin",
            callback=import_plugin,
        ),
        excluded_stages: typing.List[str] = typer.Option(
            [],
            "--exclude-stage",
            help="Exclude nebari stage(s) by name or regex",
            callback=exclude_stages,
        ),
        exclude_default_stages: bool = typer.Option(
            False,
            "--exclude-default-stages",
            help="Exclude default nebari included stages",
            callback=exclude_default_stages,
        ),
    ):
        pass

    nebari_plugin_manager.plugin_manager.hook.nebari_subcommand(cli=app)

    return app



---
File: nebari/src/_nebari/config_set.py
---

import logging
import pathlib
from typing import Optional

from packaging.specifiers import SpecifierSet
from pydantic import BaseModel, ConfigDict, field_validator

from _nebari._version import __version__
from _nebari.utils import yaml

logger = logging.getLogger(__name__)


class ConfigSetMetadata(BaseModel):
    model_config: ConfigDict = ConfigDict(extra="allow", arbitrary_types_allowed=True)
    name: str  # for use with guided init
    description: Optional[str] = None
    nebari_version: str | SpecifierSet

    @field_validator("nebari_version")
    @classmethod
    def validate_version_requirement(cls, version_req):
        if isinstance(version_req, str):
            version_req = SpecifierSet(version_req, prereleases=True)

        return version_req

    def check_version(self, version):
        if not self.nebari_version.contains(version, prereleases=True):
            raise ValueError(
                f'Nebari version "{version}" is not compatible with '
                f'version requirement {self.nebari_version} for "{self.name}" config set.'
            )


class ConfigSet(BaseModel):
    metadata: ConfigSetMetadata
    config: dict


def read_config_set(config_set_filepath: str):
    """Read a config set from a config file."""

    filename = pathlib.Path(config_set_filepath)

    with filename.open() as f:
        config_set_yaml = yaml.load(f)

    config_set = ConfigSet(**config_set_yaml)

    # validation
    config_set.metadata.check_version(__version__)

    return config_set



---
File: nebari/src/_nebari/config.py
---

import os
import pathlib
import re
import sys
from typing import Any, Dict, List, Union

import pydantic

from _nebari.utils import yaml


def set_nested_attribute(data: Any, attrs: List[str], value: Any):
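    """Walk the path given by ``attrs`` through the nested config object
    and set the final attribute to ``value``.

    Each path element is resolved as a list index (if it is all digits and
    the current object is a list), then as a mapping key, then as an attribute.
    For example ``attrs=["ci_cd", "branch"]`` sets ``data.ci_cd.branch``.
    """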

    def _get_attr(d: Any, attr: str):
        if isinstance(d, list) and re.fullmatch(r"\d+", attr):
            return d[int(attr)]
        elif hasattr(d, "__getitem__"):
            return d[attr]
        else:
            return getattr(d, attr)

    def _set_attr(d: Any, attr: str, value: Any):
        if isinstance(d, list) and re.fullmatch(r"\d+", attr):
            d[int(attr)] = value
        elif hasattr(d, "__getitem__"):
            d[attr] = value
        else:
            setattr(d, attr, value)

    data_pos = data
    for attr in attrs[:-1]:
        data_pos = _get_attr(data_pos, attr)
    _set_attr(data_pos, attrs[-1], value)


def set_config_from_environment_variables(
    config: pydantic.BaseModel, keyword: str = "NEBARI_SECRET", separator: str = "__"
):
    """Setting nebari configuration values from environment variables

    For example `NEBARI_SECRET__ci_cd__branch=master` would set `ci_cd.branch = "master"`
    """
    nebari_secrets = [_ for _ in os.environ if _.startswith(keyword + separator)]
    for secret in nebari_secrets:
        attrs = secret[len(keyword + separator) :].split(separator)
        try:
            set_nested_attribute(config, attrs, os.environ[secret])
        except pydantic.ValidationError as e:
            print(
                f"ERROR: the provided environment variable {secret} causes the following pydantic validation error:\n\n",
                e,
            )
            sys.exit(1)
        except Exception as e:
            print(
                f"ERROR: the provided environment variable {secret} causes the following error:\n\n",
                e,
            )
            sys.exit(1)
    return config


def dump_nested_model(model_dict: Dict[str, Union[pydantic.BaseModel, str]]):
    result = {}
    for key, value in model_dict.items():
        result[key] = (
            value.model_dump() if isinstance(value, pydantic.BaseModel) else value
        )
    return result


def read_configuration(
    config_filename: pathlib.Path,
    config_schema: pydantic.BaseModel,
    read_environment: bool = True,
):
    """Read the nebari configuration from disk and apply validation"""
    filename = pathlib.Path(config_filename)

    if not filename.is_file():
        raise ValueError(
            f"passed in configuration filename={config_filename} does not exist"
        )

    with filename.open() as f:
        config_dict = yaml.load(f)
        config = config_schema(**config_dict)

    if read_environment:
        config = set_config_from_environment_variables(config)

    return config


def write_configuration(
    config_filename: pathlib.Path,
    config: Union[pydantic.BaseModel, Dict],
    mode: str = "w",
):
    """Write the nebari configuration file to disk"""
    with config_filename.open(mode) as f:
        if isinstance(config, pydantic.BaseModel):
            config_dict = config.model_dump()
            yaml.dump(config_dict, f)
        else:
            config = dump_nested_model(config)
            yaml.dump(config, f)


def backup_configuration(filename: pathlib.Path, extrasuffix: str = ""):
    if not filename.exists():
        return

    # Backup old file
    backup_filename = pathlib.Path(f"{filename}{extrasuffix}.backup")

    if backup_filename.exists():
        i = 1
        while True:
            next_backup_filename = pathlib.Path(f"{backup_filename}~{i}")
            if not next_backup_filename.exists():
                backup_filename = next_backup_filename
                break
            i = i + 1

    filename.rename(backup_filename)



---
File: nebari/src/_nebari/constants.py
---

CURRENT_RELEASE = "2025.2.1"

HELM_VERSION = "v3.15.3"
KUSTOMIZE_VERSION = "5.4.3"
OPENTOFU_VERSION = "1.8.3"

KUBERHEALTHY_HELM_VERSION = "100"

# 04-kubernetes-ingress
DEFAULT_TRAEFIK_IMAGE_TAG = "2.9.1"

HIGHEST_SUPPORTED_K8S_VERSION = ("1", "31")  # specify Major and Minor version
DEFAULT_GKE_RELEASE_CHANNEL = "UNSPECIFIED"

DEFAULT_NEBARI_DASK_VERSION = CURRENT_RELEASE
DEFAULT_NEBARI_IMAGE_TAG = CURRENT_RELEASE
DEFAULT_NEBARI_WORKFLOW_CONTROLLER_IMAGE_TAG = CURRENT_RELEASE

DEFAULT_CONDA_STORE_IMAGE_TAG = "2025.2.1"

LATEST_SUPPORTED_PYTHON_VERSION = "3.10"


# DOCS
AZURE_ENV_DOCS = "https://www.nebari.dev/docs/how-tos/nebari-azure"
AWS_ENV_DOCS = "https://www.nebari.dev/docs/how-tos/nebari-aws"
GCP_ENV_DOCS = "https://www.nebari.dev/docs/how-tos/nebari-gcp"

# DEFAULT CLOUD REGIONS
AWS_DEFAULT_REGION = "us-east-1"
AZURE_DEFAULT_REGION = "Central US"
GCP_DEFAULT_REGION = "us-central1"



---
File: nebari/src/_nebari/deploy.py
---

import contextlib
import logging
import pathlib
import textwrap
from typing import Any, Dict, List

from _nebari.utils import timer
from nebari import hookspecs, schema

logger = logging.getLogger(__name__)


def deploy_configuration(
    config: schema.Main,
    stages: List[hookspecs.NebariStage],
    disable_prompt: bool = False,
    disable_checks: bool = False,
) -> Dict[str, Any]:
    if config.prevent_deploy:
        raise ValueError(
            textwrap.dedent(
                """
        Deployment prevented due to the prevent_deploy setting in your nebari-config.yaml file.
        You could remove that field to deploy your Nebari, but please do NOT do so without fully understanding why that value was set in the first place.

        It may have been set during an upgrade of your nebari-config.yaml file because we do not believe it is safe to redeploy the new
        version of Nebari without having a full backup of your system ready to restore. It may be known that an in-situ upgrade is impossible
        and that redeployment will tear down your existing infrastructure before creating an entirely new Nebari without your old data.

        PLEASE get in touch with Nebari development team at https://github.com/nebari-dev/nebari for assistance in proceeding.
        Your data may be at risk without our guidance.
        """
            )
        )

    if config.domain is None:
        logger.info(
            "All nebari endpoints will be under kubernetes load balancer address which cannot be known before deployment"
        )
    else:
        logger.info(f"All nebari endpoints will be under https://{config.domain}")

    if disable_checks:
        logger.warning(
            "The validation checks at the end of each stage have been disabled"
        )

    with timer(logger, "deploying Nebari"):
        stage_outputs = {}
        with contextlib.ExitStack() as stack:
            for stage in stages:
                s: hookspecs.NebariStage = stage(
                    output_directory=pathlib.Path.cwd(), config=config
                )
                stack.enter_context(s.deploy(stage_outputs, disable_prompt))

                if not disable_checks:
                    s.check(stage_outputs, disable_prompt)
        print("Nebari deployed successfully")

        print("Services:")
        for service_name, service in stage_outputs["stages/07-kubernetes-services"][
            "service_urls"
        ]["value"].items():
            print(f" - {service_name} -> {service['url']}")

        print(
            f"Kubernetes kubeconfig located at file://{stage_outputs['stages/02-infrastructure']['kubeconfig_filename']['value']}"
        )
        username = "root"
        password = config.security.keycloak.initial_root_password
        if password:
            print(f"Keycloak master realm username={username} password={password}")

        print(
            "Additional administration docs can be found at https://www.nebari.dev/docs/how-tos/configuring-keycloak"
        )

    return stage_outputs



---
File: nebari/src/_nebari/deprecate.py
---

DEPRECATED_FILE_PATHS = [
    # v0.4 removed in PR #1003 move to stages
    "infrastructure",
    "terraform-state",
    # v0.4 removed in PR #1068 deprecate some github actions
    ".github/workflows/image-pr.yaml",
    ".github/workflows/image.yaml",
    ".github/workflows/jupyterhub-pr.yaml",
    ".github/workflows/jupyterhub.yaml",
    # v2024.7.3 renamed misspelled file path
    "stages/07-kubernetes-services/modules/kubernetes/services/dask-gateway/controler.tf",  # codespell:ignore
]



---
File: nebari/src/_nebari/destroy.py
---

import contextlib
import logging
import pathlib
from typing import List

from _nebari.utils import timer
from nebari import hookspecs, schema

logger = logging.getLogger(__name__)


def destroy_configuration(config: schema.Main, stages: List[hookspecs.NebariStage]):
    logger.info(
        """Removing all infrastructure, your local files will still remain,
    you can use 'nebari deploy' to re-install infrastructure using same config file\n"""
    )

    stage_outputs = {}
    status = {}

    with timer(logger, "destroying Nebari"):
        with contextlib.ExitStack() as stack:
            for stage in stages:
                try:
                    s: hookspecs.NebariStage = stage(
                        output_directory=pathlib.Path.cwd(), config=config
                    )
                    stack.enter_context(s.destroy(stage_outputs, status))
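                except Exception as e:
                    # refer to the stage class: `s` is unbound (or stale from
                    # the previous iteration) if the constructor itself raised
                    status[stage.name] = False
                    print(
                        f"ERROR: stage={stage.name} failed due to {e}. Due to stages depending on each other we can only destroy stages that occur before this stage"
                    )
                    break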

    for stage_name, success in status.items():
        if not success:
            logger.error(f"Stage={stage_name} failed to fully destroy")

    if not all(status.values()):
        logger.error(
            "ERROR: not all nebari stages were destroyed properly. For cloud deployments of Nebari typically only stages 01 and 02 need to succeed to properly destroy everything"
        )
    else:
        print("Nebari properly destroyed all resources without error")



---
File: nebari/src/_nebari/initialize.py
---

import logging
import os
import re
import tempfile
from pathlib import Path
from typing import Any, Dict

import pydantic
import requests

from _nebari import constants, utils
from _nebari.config_set import read_config_set
from _nebari.provider import git
from _nebari.provider.cicd import github
from _nebari.provider.cloud import amazon_web_services, azure_cloud, google_cloud
from _nebari.provider.oauth.auth0 import create_client
from _nebari.stages.bootstrap import CiEnum
from _nebari.stages.infrastructure import (
    DEFAULT_AWS_NODE_GROUPS,
    DEFAULT_AZURE_NODE_GROUPS,
    DEFAULT_GCP_NODE_GROUPS,
    node_groups_to_dict,
)
from _nebari.stages.kubernetes_ingress import CertificateEnum
from _nebari.stages.kubernetes_keycloak import AuthenticationEnum
from _nebari.stages.terraform_state import TerraformStateEnum
from _nebari.utils import get_latest_kubernetes_version, random_secure_string
from _nebari.version import __version__
from nebari.schema import ProviderEnum, github_url_regex

logger = logging.getLogger(__name__)

WELCOME_HEADER_TEXT = "Your open source data science platform, hosted"


def render_config(
    project_name: str,
    nebari_domain: str = None,
    cloud_provider: ProviderEnum = ProviderEnum.local,
    ci_provider: CiEnum = CiEnum.none,
    repository: str = None,
    auth_provider: AuthenticationEnum = AuthenticationEnum.password,
    namespace: str = "dev",
    repository_auto_provision: bool = False,
    auth_auto_provision: bool = False,
    terraform_state: TerraformStateEnum = TerraformStateEnum.remote,
    kubernetes_version: str = None,
    region: str = None,
    disable_prompt: bool = False,
    ssl_cert_email: str = None,
    config_set: str = None,
) -> Dict[str, Any]:
    config = {
        "provider": cloud_provider,
        "namespace": namespace,
        "nebari_version": __version__,
    }

    if project_name is None and not disable_prompt:
        project_name = input("Provide project name: ")
    config["project_name"] = project_name

    if nebari_domain is not None:
        config["domain"] = nebari_domain

    config["ci_cd"] = {"type": ci_provider}
    config["terraform_state"] = {"type": terraform_state}

    # Save default password to file
    default_password_filename = Path(tempfile.gettempdir()) / "NEBARI_DEFAULT_PASSWORD"
    config["security"] = {
        "keycloak": {"initial_root_password": random_secure_string(length=32)}
    }
    with default_password_filename.open("w") as f:
        f.write(config["security"]["keycloak"]["initial_root_password"])
    default_password_filename.chmod(0o700)

    config["theme"] = {"jupyterhub": {"hub_title": f"Nebari - { project_name }"}}
    config["theme"]["jupyterhub"][
        "welcome"
    ] = """Welcome! Learn about Nebari's features and configurations in <a href="https://www.nebari.dev/docs/welcome">the documentation</a>. If you have any questions or feedback, reach the team on <a href="https://www.nebari.dev/docs/community#getting-support">Nebari's support forums</a>."""

    config["security"]["authentication"] = {"type": auth_provider}

    if auth_provider == AuthenticationEnum.github:
        config["security"]["authentication"]["config"] = {
            "client_id": os.environ.get(
                "GITHUB_CLIENT_ID",
                "<enter client id or remove to use GITHUB_CLIENT_ID environment variable (preferred)>",
            ),
            "client_secret": os.environ.get(
                "GITHUB_CLIENT_SECRET",
                "<enter client secret or remove to use GITHUB_CLIENT_SECRET environment variable (preferred)>",
            ),
        }
    elif auth_provider == AuthenticationEnum.auth0:
        if auth_auto_provision:
            # `config` is a plain dict at this point, so attribute access would raise AttributeError
            auth0_config = create_client(nebari_domain, project_name)
            config["security"]["authentication"]["config"] = auth0_config
        else:
            config["security"]["authentication"]["config"] = {
                "client_id": os.environ.get(
                    "AUTH0_CLIENT_ID",
                    "<enter client id or remove to use AUTH0_CLIENT_ID environment variable (preferred)>",
                ),
                "client_secret": os.environ.get(
                    "AUTH0_CLIENT_SECRET",
                    "<enter client secret or remove to use AUTH0_CLIENT_SECRET environment variable (preferred)>",
                ),
                "auth0_subdomain": os.environ.get(
                    "AUTH0_DOMAIN",
                    "<enter subdomain (without .auth0.com) or remove to use AUTH0_DOMAIN environment variable>",
                ),
            }

    if cloud_provider == ProviderEnum.gcp:
        gcp_region = region or constants.GCP_DEFAULT_REGION
        gcp_kubernetes_version = kubernetes_version or get_latest_kubernetes_version(
            google_cloud.kubernetes_versions(gcp_region)
        )
        config["google_cloud_platform"] = {
            "kubernetes_version": gcp_kubernetes_version,
            "region": gcp_region,
            "node_groups": node_groups_to_dict(DEFAULT_GCP_NODE_GROUPS),
        }

        config["theme"]["jupyterhub"][
            "hub_subtitle"
        ] = f"{WELCOME_HEADER_TEXT} on Google Cloud Platform"
        if "PROJECT_ID" in os.environ:
            config["google_cloud_platform"]["project"] = os.environ["PROJECT_ID"]
        elif not disable_prompt:
            config["google_cloud_platform"]["project"] = input(
                "Enter Google Cloud Platform Project ID: "
            )

    elif cloud_provider == ProviderEnum.azure:
        azure_region = region or constants.AZURE_DEFAULT_REGION
        azure_kubernetes_version = kubernetes_version or get_latest_kubernetes_version(
            azure_cloud.kubernetes_versions(azure_region)
        )
        config["azure"] = {
            "kubernetes_version": azure_kubernetes_version,
            "region": azure_region,
            "storage_account_postfix": random_secure_string(length=4),
            "node_groups": node_groups_to_dict(DEFAULT_AZURE_NODE_GROUPS),
        }

        config["theme"]["jupyterhub"][
            "hub_subtitle"
        ] = f"{WELCOME_HEADER_TEXT} on Azure"

    elif cloud_provider == ProviderEnum.aws:
        aws_region = (
            region
            or os.environ.get("AWS_DEFAULT_REGION")
            or constants.AWS_DEFAULT_REGION
        )
        aws_kubernetes_version = kubernetes_version or get_latest_kubernetes_version(
            amazon_web_services.kubernetes_versions(aws_region)
        )
        config["amazon_web_services"] = {
            "kubernetes_version": aws_kubernetes_version,
            "region": aws_region,
            "node_groups": node_groups_to_dict(DEFAULT_AWS_NODE_GROUPS),
        }
        config["theme"]["jupyterhub"][
            "hub_subtitle"
        ] = f"{WELCOME_HEADER_TEXT} on Amazon Web Services"

    elif cloud_provider == ProviderEnum.existing:
        config["theme"]["jupyterhub"]["hub_subtitle"] = WELCOME_HEADER_TEXT

    elif cloud_provider == ProviderEnum.local:
        config["theme"]["jupyterhub"]["hub_subtitle"] = WELCOME_HEADER_TEXT

    if ssl_cert_email:
        config["certificate"] = {"type": CertificateEnum.letsencrypt.value}
        config["certificate"]["acme_email"] = ssl_cert_email

    if config_set:
        config_set = read_config_set(config_set)
        config = utils.deep_merge(config, config_set.config)

    # validate configuration and convert to model
    from nebari.plugins import nebari_plugin_manager

    try:
        config_model = nebari_plugin_manager.config_schema.model_validate(config)
    except pydantic.ValidationError as e:
        raise e

    if repository_auto_provision:
        match = re.search(github_url_regex, repository)
        if match:
            git_repository = github_auto_provision(
                config_model, match.group(2), match.group(3)
            )
            git_repository_initialize(git_repository)
        else:
            raise ValueError(
                f"Repository to be auto-provisioned is not the full URL of a GitHub repo: {repository}"
            )

    return config


def github_auto_provision(config: pydantic.BaseModel, owner: str, repo: str):
    already_exists = True
    try:
        github.get_repository(owner, repo)
    except requests.exceptions.HTTPError:
        # repo not found
        already_exists = False

    if not already_exists:
        try:
            github.create_repository(
                owner,
                repo,
                description=f"Nebari {config.project_name}-{config.provider.value}",
                homepage=f"https://{config.domain}",
            )
        except requests.exceptions.HTTPError as he:
            raise ValueError(
                f"Unable to create GitHub repo https://github.com/{owner}/{repo} - error message from GitHub is: {he}"
            ) from he
    else:
        logger.warning(f"GitHub repo https://github.com/{owner}/{repo} already exists")

    try:
        # Secrets
        if config.provider == ProviderEnum.aws:
            for name in {
                "AWS_ACCESS_KEY_ID",
                "AWS_SECRET_ACCESS_KEY",
            }:
                github.update_secret(owner, repo, name, os.environ[name])
        elif config.provider == ProviderEnum.gcp:
            github.update_secret(owner, repo, "PROJECT_ID", os.environ["PROJECT_ID"])
            with open(os.environ["GOOGLE_CREDENTIALS"]) as f:
                github.update_secret(owner, repo, "GOOGLE_CREDENTIALS", f.read())
        elif config.provider == ProviderEnum.azure:
            for name in {
                "ARM_CLIENT_ID",
                "ARM_CLIENT_SECRET",
                "ARM_SUBSCRIPTION_ID",
                "ARM_TENANT_ID",
            }:
                github.update_secret(owner, repo, name, os.environ[name])
        github.update_secret(
            owner, repo, "REPOSITORY_ACCESS_TOKEN", os.environ["GITHUB_TOKEN"]
        )
    except requests.exceptions.HTTPError as he:
        raise ValueError(
            f"Unable to set Secrets on GitHub repo https://github.com/{owner}/{repo} - error message from GitHub is: {he}"
        ) from he

    return f"git@github.com:{owner}/{repo}.git"


def git_repository_initialize(git_repository):
    if not git.is_git_repo(Path.cwd()):
        git.initialize_git(Path.cwd())
    git.add_git_remote(git_repository, path=Path.cwd(), remote_name="origin")



---
File: nebari/src/_nebari/keycloak.py
---

import json
import logging
import os
from urllib.parse import urljoin

import keycloak
import requests
import rich

from _nebari.stages.kubernetes_ingress import CertificateEnum
from nebari import schema

logger = logging.getLogger(__name__)


def do_keycloak(config: schema.Main, *args):
    # suppress insecure warnings
    import urllib3

    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

    keycloak_admin = get_keycloak_admin_from_config(config)

    if args[0] == "adduser":
        if len(args) < 2:
            raise ValueError(
                "keycloak command 'adduser' requires `username [password]`"
            )

        username = args[1]
        password = args[2] if len(args) >= 3 else None
        create_user(keycloak_admin, username, password, domain=config.domain)
    elif args[0] == "listusers":
        list_users(keycloak_admin)
    else:
        raise ValueError(f"unknown keycloak command {args[0]}")


def create_user(
    keycloak_admin: keycloak.KeycloakAdmin,
    username: str,
    password: str = None,
    groups=None,
    email=None,
    domain=None,
    enabled=True,
):
    payload = {
        "username": username,
        "groups": groups or ["/developer"],
        "email": email or f"{username}@{domain or 'example.com'}",
        "enabled": enabled,
    }
    if password:
        payload["credentials"] = [
            {"type": "password", "value": password, "temporary": False}
        ]
    else:
        rich.print(
            f"Creating user=[green]{username}[/green] without password (none supplied)"
        )
    user = keycloak_admin.create_user(payload)
    rich.print(f"Created user=[green]{username}[/green]")
    return user


def list_users(keycloak_admin: keycloak.KeycloakAdmin):
    num_users = keycloak_admin.users_count()
    print(f"{num_users} Keycloak Users")

    user_format = "{username:32} | {email:32} | {groups}"
    print(user_format.format(username="username", email="email", groups="groups"))
    print("-" * 120)

    for user in keycloak_admin.get_users():
        user_groups = [_["name"] for _ in keycloak_admin.get_user_groups(user["id"])]
        print(
            user_format.format(
                username=user["username"],
                # users created outside of Nebari may not have an email set
                email=user.get("email", ""),
                groups=user_groups,
            )
        )


def get_keycloak_admin(server_url, username, password, verify=False):
    try:
        keycloak_admin = keycloak.KeycloakAdmin(
            server_url=server_url,
            username=username,
            password=password,
            realm_name=os.environ.get("KEYCLOAK_REALM", "nebari"),
            user_realm_name="master",
            auto_refresh_token=("get", "put", "post", "delete"),
            verify=verify,
        )
    except (
        keycloak.exceptions.KeycloakConnectionError,
        keycloak.exceptions.KeycloakAuthenticationError,
    ) as e:
        raise ValueError(f"Failed to connect to Keycloak server: {e}") from e

    return keycloak_admin


def get_keycloak_admin_from_config(config: schema.Main):
    keycloak_server_url = os.environ.get(
        "KEYCLOAK_SERVER_URL", f"https://{config.domain}/auth/"
    )

    keycloak_username = os.environ.get("KEYCLOAK_ADMIN_USERNAME", "root")
    keycloak_password = os.environ.get(
        "KEYCLOAK_ADMIN_PASSWORD", config.security.keycloak.initial_root_password
    )

    should_verify_tls = config.certificate.type != CertificateEnum.selfsigned

    return get_keycloak_admin(
        server_url=keycloak_server_url,
        username=keycloak_username,
        password=keycloak_password,
        verify=should_verify_tls,
    )


def keycloak_rest_api_call(config: schema.Main = None, request: str = None):
    """Communicate directly with the Keycloak REST API by passing it a request"""
    keycloak_server_url = os.environ.get(
        "KEYCLOAK_SERVER_URL", f"https://{config.domain}/auth/"
    )

    keycloak_admin_username = os.environ.get("KEYCLOAK_ADMIN_USERNAME", "root")
    keycloak_admin_password = os.environ.get(
        "KEYCLOAK_ADMIN_PASSWORD",
        config.security.keycloak.initial_root_password,
    )

    try:
        # Get `token` to interact with Keycloak Admin
        url = urljoin(
            keycloak_server_url, "realms/master/protocol/openid-connect/token"
        )
        headers = {
            "Content-Type": "application/x-www-form-urlencoded",
        }
        data = {
            "username": keycloak_admin_username,
            "password": keycloak_admin_password,
            "grant_type": "password",
            "client_id": "admin-cli",
        }

        response = requests.post(
            url=url,
            headers=headers,
            data=data,
            verify=False,
        )

        if response.status_code == 200:
            token = json.loads(response.content.decode())["access_token"]
        else:
            raise ValueError(
                f"Unable to retrieve Keycloak API token. Status code: {response.status_code}"
            )

        # Send request to Keycloak REST API
        method, endpoint = request.split()
        url = urljoin(
            urljoin(keycloak_server_url, "admin/realms/"), endpoint.lstrip("/")
        )
        headers = {
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        }

        response = requests.request(
            method=method, url=url, headers=headers, verify=False
        )

        if response.status_code == 200:
            content = json.loads(response.content.decode())
            return content
        else:
            raise ValueError(
                f"Unable to communicate with Keycloak API. Status code: {response.status_code}"
            )

    except requests.exceptions.RequestException as e:
        raise e


def export_keycloak_users(config: schema.Main, realm: str):
    request = f"GET /{realm}/users"

    users = keycloak_rest_api_call(config, request=request)

    return {
        "realm": realm,
        "users": users,
    }



---
File: nebari/src/_nebari/render.py
---

import hashlib
import pathlib
import shutil
import sys
from typing import Dict, List

from rich import print
from rich.table import Table

from _nebari.deprecate import DEPRECATED_FILE_PATHS
from nebari import hookspecs, schema


def render_template(
    output_directory: pathlib.Path,
    config: schema.Main,
    stages: List[hookspecs.NebariStage],
    dry_run=False,
):
    output_directory = pathlib.Path(output_directory).resolve()
    if output_directory == pathlib.Path.home():
        print("ERROR: Deploying Nebari in home directory is not advised!")
        sys.exit(1)

    # mkdir all the way down to repo dir so we can copy .gitignore
    # into it in remove_existing_renders
    output_directory.mkdir(exist_ok=True, parents=True)

    contents = {}
    for stage in stages:
        contents.update(
            stage(output_directory=output_directory, config=config).render()
        )

    new, untracked, updated, deleted = inspect_files(
        output_base_dir=output_directory,
        ignore_filenames=[
            "terraform.tfstate",
            ".terraform.lock.hcl",
            "terraform.tfstate.backup",
        ],
        ignore_directories=[
            ".terraform",
            "__pycache__",
        ],
        deleted_paths=DEPRECATED_FILE_PATHS,
        contents=contents,
    )

    if new:
        table = Table("The following files will be created:", style="deep_sky_blue1")
        for filename in sorted(set(map(str, new))):
            table.add_row(str(filename), style="green")
        print(table)
    if updated:
        table = Table("The following files will be updated:", style="deep_sky_blue1")
        for filename in sorted(set(map(str, updated))):
            table.add_row(str(filename), style="green")
        print(table)
    if deleted:
        table = Table("The following files will be deleted:", style="deep_sky_blue1")
        for filename in sorted(set(map(str, deleted))):
            table.add_row(str(filename), style="green")
        print(table)
    if untracked:
        table = Table(
            "The following files are untracked (only exist in output directory):",
            style="deep_sky_blue1",
        )
        for filename in sorted(set(map(str, untracked))):
            table.add_row(str(filename), style="green")
        print(table)

    if dry_run:
        print("dry-run enabled: no files will be created, updated, or deleted")
    else:
        for filename in new | updated:
            output_filename = output_directory / filename
            output_filename.parent.mkdir(parents=True, exist_ok=True)

            if isinstance(contents[filename], str):
                with open(output_filename, "w") as f:
                    f.write(contents[filename])
            else:
                with open(output_filename, "wb") as f:
                    f.write(contents[filename])

        for path in deleted:
            abs_path = (output_directory / path).resolve()

            if not abs_path.is_relative_to(output_directory):
                raise Exception(
                    f"[ERROR] SHOULD NOT HAPPEN filename was about to be deleted but path={abs_path} is outside of output_directory"
                )

            if abs_path.is_file():
                abs_path.unlink()
            elif abs_path.is_dir():
                shutil.rmtree(abs_path)


def inspect_files(
    output_base_dir: pathlib.Path,
    ignore_filenames: List[str] = None,
    ignore_directories: List[str] = None,
    deleted_paths: List[pathlib.Path] = None,
    contents: Dict[str, str] = None,
):
    """Return new, untracked, updated, and deleted files by comparing checksums of the rendered contents against the output directory.

    Args:
        output_base_dir (pathlib.Path): Base path of the output directory
        ignore_filenames (list[str]): Filenames to ignore while comparing for changes
        ignore_directories (list[str]): Directories to ignore while comparing for changes
        deleted_paths (list[Path]): Paths that if exist in output directory should be deleted
        contents (dict): filename to content mapping for dynamically generated files
    """
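    ignore_filenames = ignore_filenames or []
    ignore_directories = ignore_directories or []
    # deleted_paths defaults to None and is iterated below, so normalize it too
    deleted_paths = deleted_paths or []
    contents = contents or {}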

    source_files = {}
    output_files = {}

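    def list_files(
        directory: pathlib.Path,
        ignore_filenames: List[str],
        ignore_directories: List[str],
    ):
        for path in directory.rglob("*"):
            if not path.is_file():
                continue
            # skip ignored filenames and anything beneath an ignored directory
            if path.name in ignore_filenames:
                continue
            if any(
                part in ignore_directories
                for part in path.relative_to(directory).parent.parts
            ):
                continue
            yield path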

    for filename in contents:
        if isinstance(contents[filename], str):
            source_files[filename] = hashlib.sha256(
                contents[filename].encode("utf8")
            ).hexdigest()
        else:
            source_files[filename] = hashlib.sha256(contents[filename]).hexdigest()

        output_filename = pathlib.Path(output_base_dir) / filename
        if output_filename.is_file():
            # hash the file inside the output directory, not a path relative to the cwd
            output_files[filename] = hash_file(output_filename)

    deleted_files = set()
    for path in deleted_paths:
        absolute_path = output_base_dir / path
        if absolute_path.exists():
            deleted_files.add(path)

    for filename in list_files(output_base_dir, ignore_filenames, ignore_directories):
        relative_path = pathlib.Path.relative_to(
            pathlib.Path(filename), output_base_dir
        )
        if filename.is_file():
            output_files[relative_path] = hash_file(filename)

    new_files = source_files.keys() - output_files.keys()
    untracked_files = output_files.keys() - source_files.keys()

    updated_files = set()
    for prevalent_file in source_files.keys() & output_files.keys():
        if source_files[prevalent_file] != output_files[prevalent_file]:
            updated_files.add(prevalent_file)

    return new_files, untracked_files, updated_files, deleted_files


def hash_file(file_path: str):
    """Get the hex digest of the given file.

    Args:
        file_path (str): path to file
    """
    with open(file_path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()



---
File: nebari/src/_nebari/upgrade.py
---

"""
This file contains the upgrade logic for Nebari.
Each release of Nebari requires an upgrade step class (which is a child class of UpgradeStep) to be created.
When a user runs `nebari upgrade -c nebari-config.yaml`, the do_upgrade function runs through all required upgrade steps to bring the config file up to date with the current version of Nebari.
"""

import json
import logging
import os
import re
import secrets
import string
import textwrap
from abc import ABC
from pathlib import Path
from typing import Any, ClassVar, Dict

import kubernetes.client
import kubernetes.config
import requests
import rich
from packaging.version import Version
from pydantic import ValidationError
from rich.prompt import Confirm, Prompt
from typing_extensions import override

from _nebari.config import backup_configuration
from _nebari.keycloak import get_keycloak_admin
from _nebari.stages.infrastructure import (
    provider_enum_default_node_groups_map,
    provider_enum_name_map,
)
from _nebari.utils import (
    get_k8s_version_prefix,
    get_provider_config_block_name,
    load_yaml,
    yaml,
)
from _nebari.version import __version__, rounded_ver_parse
from nebari.schema import ProviderEnum, is_version_accepted

logger = logging.getLogger(__name__)

NEBARI_WORKFLOW_CONTROLLER_DOCS = (
    "https://www.nebari.dev/docs/how-tos/using-argo/#jupyterflow-override-beta"
)
ARGO_JUPYTER_SCHEDULER_REPO = "https://github.com/nebari-dev/argo-jupyter-scheduler"

UPGRADE_KUBERNETES_MESSAGE = "Please see the [green][link=https://www.nebari.dev/docs/how-tos/kubernetes-version-upgrade]Kubernetes upgrade docs[/link][/green] for more information."
DESTRUCTIVE_UPGRADE_WARNING = "-> This version upgrade will result in your cluster being completely torn down and redeployed.  Please ensure you have backed up any data you wish to keep before proceeding!!!"
TERRAFORM_REMOVE_TERRAFORM_STAGE_FILES_CONFIRMATION = (
    "Nebari needs to generate an updated set of Terraform scripts for your deployment and delete the old scripts.\n"
    "Do you want Nebari to remove your [green]stages[/green] directory automatically for you? It will be recreated the next time Nebari is run.\n"
    "[red]Warning:[/red] This will remove everything in the [green]stages[/green] directory.\n"
    "If you do not have Nebari do it automatically here, you will need to remove the [green]stages[/green] directory manually with a command "
    "like [green]rm -rf stages[/green]."
)
DESTROY_STAGE_FILES_WITH_TF_STATE_NOT_REMOTE = (
    "⚠️ CAUTION ⚠️\n"
    "Nebari would like to remove your old Terraform/OpenTofu [green]stages[/green] files. Your [blue]terraform_state[/blue] configuration is not set to [blue]remote[/blue], so destroying your [green]stages[/green] files could potentially be very destructive.\n"
    "If you don't have active Terraform/OpenTofu deployment state files contained within your [green]stages[/green] directory, you may proceed by entering [red]y[/red] at the prompt.\n"
    "If you have an active Terraform/OpenTofu deployment with active state files in your [green]stages[/green] folder, you will need to either bring Nebari down temporarily to redeploy or pursue some other means to upgrade. Enter [red]n[/red] at the prompt.\n\n"
    "Do you want to proceed by deleting your [green]stages[/green] directory and everything in it? ([red]POTENTIALLY VERY DESTRUCTIVE[/red])"
)


def do_upgrade(config_filename, attempt_fixes=False):
    """
    Perform an upgrade of the Nebari configuration file.

    This function loads the YAML configuration file, checks for deprecated keys,
    validates the current version, and if necessary, upgrades the configuration
    to the latest version of Nebari.

    Args:
        config_filename (pathlib.Path): The path to the configuration file.
        attempt_fixes (bool): Whether to attempt automatic fixes for validation errors.

    Returns:
        None
    """
    config = load_yaml(config_filename)
    if config.get("qhub_version"):
        rich.print(
            f"Your config file [purple]{config_filename}[/purple] uses the deprecated qhub_version key.  Please change qhub_version to nebari_version and re-run the upgrade command."
        )
        return

    try:
        from nebari.plugins import nebari_plugin_manager

        nebari_plugin_manager.read_config(config_filename)
        rich.print(
            f"Your config file [purple]{config_filename}[/purple] appears to be already up-to-date for Nebari version [green]{__version__}[/green]"
        )
        return
    except (ValidationError, ValueError) as e:
        if is_version_accepted(config.get("nebari_version", "")):
            # There is an unrelated validation problem
            rich.print(
                f"Your config file [purple]{config_filename}[/purple] appears to be already up-to-date for Nebari version [green]{__version__}[/green] but there is another validation error.\n"
            )
            raise e

    start_version = config.get("nebari_version", "")

    UpgradeStep.upgrade(
        config, start_version, __version__, config_filename, attempt_fixes
    )

    # Backup old file
    backup_configuration(config_filename, f".{start_version or 'old'}")

    with config_filename.open("wt") as f:
        yaml.dump(config, f)

    rich.print(
        f"Saving new config file [purple]{config_filename}[/purple] ready for Nebari version [green]{__version__}[/green]"
    )

    ci_cd = config.get("ci_cd", {}).get("type", "")
    if ci_cd in ("github-actions", "gitlab-ci"):
        rich.print(
            f"\nSince you are using ci_cd [green]{ci_cd}[/green] you also need to re-render the workflows and re-commit the files to your Git repo:\n"
            f"   nebari render -c [purple]{config_filename}[/purple]\n"
        )


class UpgradeStep(ABC):
    """
    Abstract base class representing an upgrade step.

    Attributes:
        _steps (ClassVar[Dict[str, Any]]): Class variable holding registered upgrade steps.
        version (ClassVar[str]): The version of the upgrade step.
    """

    _steps: ClassVar[Dict[str, Any]] = {}
    version: ClassVar[str] = ""

    def __init_subclass__(cls):
        """
        Initializes a subclass of UpgradeStep.

        This method validates the version string and registers the subclass
        in the _steps dictionary.
        """
        try:
            parsed_version = Version(cls.version)
        except ValueError as exc:
            raise ValueError(f"Invalid version string {cls.version}") from exc

        cls.parsed_version = parsed_version
        assert (
            rounded_ver_parse(cls.version) == parsed_version
        ), f"Invalid version {cls.version}: must be a full release version, not a dev/prerelease/postrelease version"
        assert (
            cls.version not in cls._steps
        ), f"Duplicate UpgradeStep version {cls.version}"
        cls._steps[cls.version] = cls

    @classmethod
    def clear_steps_registry(cls):
        """Clears the steps registry. Useful for testing."""
        cls._steps.clear()

    @classmethod
    def has_step(cls, version):
        """
        Checks if there is an upgrade step for a given version.

        Args:
            version (str): The version to check.

        Returns:
            bool: True if the step exists, False otherwise.
        """
        return version in cls._steps

    @classmethod
    def upgrade(
        cls, config, start_version, finish_version, config_filename, attempt_fixes=False
    ):
        """
        Runs through all required upgrade steps (i.e. relevant subclasses of UpgradeStep).
        Calls UpgradeStep.upgrade_step for each.

        Args:
            config (dict): The current configuration dictionary.
            start_version (str): The starting version of the configuration.
            finish_version (str): The target version for the configuration.
            config_filename (str): The path to the configuration file.
            attempt_fixes (bool): Whether to attempt automatic fixes for validation errors.

        Returns:
            dict: The updated configuration dictionary.
        """
        starting_ver = rounded_ver_parse(start_version or "0.0.0")
        finish_ver = rounded_ver_parse(finish_version)

        if finish_ver < starting_ver:
            raise ValueError(
                f"Your nebari-config.yaml already belongs to a later version ({start_version}) than the installed version of Nebari ({finish_version}).\n"
                "You should upgrade the installed nebari package (e.g. pip install --upgrade nebari) to work with your deployment."
            )

        step_versions = sorted(
            [
                v
                for v in cls._steps.keys()
                if rounded_ver_parse(v) > starting_ver
                and rounded_ver_parse(v) <= finish_ver
            ],
            key=rounded_ver_parse,
        )

        current_start_version = start_version
        for stepcls in [cls._steps[str(v)] for v in step_versions]:
            step = stepcls()
            config = step.upgrade_step(
                config,
                current_start_version,
                config_filename,
                attempt_fixes=attempt_fixes,
            )
            current_start_version = step.get_version()
            print("\n")

        return config

    @classmethod
    def _rm_rf_stages(cls, config_filename, dry_run: bool = False, verbose=False):
        """
        Remove stage files during an upgrade step.

        Usually used when you need files in your `stages` directory to be
        removed in order to avoid resource conflicts.

        Args:
            config_filename (str): The path to the configuration file.
            dry_run (bool): If True, nothing is deleted.
            verbose (bool): If True, print each path as it is removed
                (or would be removed, in a dry run).
        Returns:
            None
        """
        config_dir = Path(config_filename).resolve().parent

        if config_dir.is_dir():
            stage_dir = config_dir / "stages"

            stage_filenames = [d for d in stage_dir.rglob("*") if d.is_file()]

            for stage_filename in stage_filenames:
                if dry_run:
                    # a dry run must never touch the filesystem, verbose or not
                    if verbose:
                        rich.print(f"Dry run: Would remove {stage_filename}")
                else:
                    stage_filename.unlink(missing_ok=True)
                    if verbose:
                        rich.print(f"Removed {stage_filename}")

            # reverse-sorted so that nested directories are removed before their parents
            stage_filedirs = sorted(
                (d for d in stage_dir.rglob("*") if d.is_dir()),
                reverse=True,
            )

            for stage_filedir in stage_filedirs:
                if dry_run:
                    if verbose:
                        rich.print(f"Dry run: Would remove {stage_filedir}")
                else:
                    stage_filedir.rmdir()
                    if verbose:
                        rich.print(f"Removed {stage_filedir}")

            if dry_run:
                if verbose:
                    rich.print(f"Dry run: Would remove {stage_dir}")
            elif stage_dir.is_dir():
                stage_dir.rmdir()
                if verbose:
                    rich.print(f"Removed {stage_dir}")

    def get_version(self):
        """
        Returns:
            str: The version of the upgrade step.
        """
        return self.version

    def requires_nebari_version_field(self):
        """
        Checks if the nebari_version field is required for this upgrade step.

        Returns:
            bool: True if the nebari_version field is required, False otherwise.
        """
        return rounded_ver_parse(self.version) > rounded_ver_parse("0.3.13")

    def upgrade_step(self, config, start_version, config_filename, *args, **kwargs):
        """
        Perform the upgrade from start_version to self.version.

        Generally, this will be in-place in config, but must also return config dict.

        config_filename may be useful to understand the file path for nebari-config.yaml, for example
        to output another file in the same location.

        The standard body here will take care of setting nebari_version and also updating the image tags.

        It should normally be left as-is for all upgrades. Use _version_specific_upgrade below
        for any actions that are only required for the particular upgrade you are creating.

        Args:
            config (dict): The current configuration dictionary.
            start_version (str): The starting version of the configuration.
            config_filename (str): The path to the configuration file.

        Returns:
            dict: The updated configuration dictionary.
        """
        finish_version = self.get_version()
        __rounded_finish_version__ = str(rounded_ver_parse(finish_version))
        rich.print(
            f"\n---> Starting upgrade from [green]{start_version or 'old version'}[/green] to [green]{finish_version}[/green]\n"
        )

        # Set the new version
        if start_version == "":
            assert "nebari_version" not in config
        assert self.version != start_version

        if self.requires_nebari_version_field():
            rich.print(f"Setting nebari_version to [green]{self.version}[/green]")
            config["nebari_version"] = self.version

        def contains_image_and_tag(s: str) -> bool:
            """
            Check if the string matches the Nebari image pattern.

            Args:
                s (str): The string to check.

            Returns:
                bool: True if the string matches the pattern, False otherwise.
            """
            pattern = r"^quay\.io\/nebari\/nebari-(jupyterhub|jupyterlab|dask-worker)(-gpu)?:\d{4}\.\d+\.\d+$"
            return bool(re.match(pattern, s))

        def replace_image_tag_legacy(
            image: str, start_version: str, new_version: str
        ) -> str:
            """
            Replace legacy image tags with the new version.

            Args:
                image (str): The current image string.
                start_version (str): The starting version of the image.
                new_version (str): The new version to replace with.

            Returns:
                str: The updated image string with the new version, or None if no match.
            """
            start_version_regex = start_version.replace(".", "\\.")
            if not start_version:
                start_version_regex = "0\\.[0-3]\\.[0-9]{1,2}"

            docker_image_regex = re.compile(
                f"^([A-Za-z0-9_-]+/[A-Za-z0-9_-]+):v{start_version_regex}$"
            )

            m = docker_image_regex.match(image)
            if m:
                return ":".join([m.groups()[0], f"v{new_version}"])
            return None

        def replace_image_tag(
            s: str, new_version: str, config_path: str, attempt_fixes: bool
        ) -> str:
            """
            Replace the image tag with the new version.

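            Args:
                s (str): The current image string.
                new_version (str): The new version to replace with.
                config_path (str): The dot-separated path of the image within the configuration, shown in the prompt.
                attempt_fixes (bool): If True, replace the tag without asking for confirmation.

            Returns:
                str: The updated image string with the new version, or the original string if no changes.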
            """
            legacy_replacement = replace_image_tag_legacy(s, start_version, new_version)
            if legacy_replacement:
                return legacy_replacement

            if not contains_image_and_tag(s):
                return s
            image_name, current_tag = s.split(":")
            if current_tag == new_version:
                return s
            loc = f"{config_path}: {image_name}"
            response = attempt_fixes or Confirm.ask(
                f"\nDo you want to replace current tag [green]{current_tag}[/green] with [green]{new_version}[/green] for:\n[purple]{loc}[/purple]?",
                default=True,
            )
            if response:
                # Rebuild from the parts so only the tag can change, never the image name.
                return f"{image_name}:{new_version}"
            return s

        def set_nested_item(config: dict, config_path: str, value: str):
            """
            Set a nested item in the configuration dictionary.

            Args:
                config (dict): The configuration dictionary.
                config_path (str): The dot-separated path to the item to set; numeric
                    segments are converted to int (for example, list indices).
                value (str): The value to set.

            Returns:
                None
            """
            config_path = config_path.split(".")
            for k in config_path[:-1]:
                try:
                    k = int(k)
                except ValueError:
                    pass
                config = config[k]
            try:
                config_path[-1] = int(config_path[-1])
            except ValueError:
                pass
            config[config_path[-1]] = value

        def update_image_tag(
            config: dict,
            config_path: str,
            current_image: str,
            new_version: str,
            attempt_fixes: bool,
        ) -> dict:
            """
            Update the image tag in the configuration.

            Args:
                config (dict): The configuration dictionary.
                config_path (str): The dot-separated path to the item to update.
                current_image (str): The current image string.
                new_version (str): The new version to replace with.
                attempt_fixes (bool): If True, replace the tag without asking for confirmation.

            Returns:
                dict: The updated configuration dictionary.
            """
            new_image = replace_image_tag(
                current_image,
                new_version,
                config_path,
                attempt_fixes,
            )
            if new_image != current_image:
                set_nested_item(config, config_path, new_image)

            return config

        # update default_images
        for k, v in config.get("default_images", {}).items():
            config_path = f"default_images.{k}"
            config = update_image_tag(
                config,
                config_path,
                v,
                __rounded_finish_version__,
                kwargs.get("attempt_fixes", False),
            )

        # update profiles.jupyterlab images
        for i, v in enumerate(config.get("profiles", {}).get("jupyterlab", [])):
            current_image = v.get("kubespawner_override", {}).get("image", None)
            if current_image:
                config = update_image_tag(
                    config,
                    f"profiles.jupyterlab.{i}.kubespawner_override.image",
                    current_image,
                    __rounded_finish_version__,
                    kwargs.get("attempt_fixes", False),
                )

        # update profiles.dask_worker images
        for k, v in config.get("profiles", {}).get("dask_worker", {}).items():
            current_image = v.get("image", None)
            if current_image:
                config = update_image_tag(
                    config,
                    f"profiles.dask_worker.{k}.image",
                    current_image,
                    __rounded_finish_version__,
                    kwargs.get("attempt_fixes", False),
                )

        # Run any version-specific tasks
        return self._version_specific_upgrade(
            config,
            start_version,
            config_filename,
            *args,
            **kwargs,
        )

    def _version_specific_upgrade(
        self, config, start_version, config_filename, *args, **kwargs
    ):
        """
        Perform version-specific upgrade tasks.

        Override this method in subclasses if you need to do anything specific to your version.

        Args:
            config (dict): The current configuration dictionary.
            start_version (str): The starting version of the configuration.
            config_filename (Path): The path to the configuration file.

        Returns:
            dict: The updated configuration dictionary.
        """
        return config


class Upgrade_0_3_12(UpgradeStep):
    version = "0.3.12"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename, *args, **kwargs
    ):
        """
        This version of Nebari requires a conda_store image for the first time.
        """
        if config.get("default_images", {}).get("conda_store", None) is None:
            newimage = "quansight/conda-store-server:v0.3.3"
            rich.print(
                f"Adding default_images: conda_store image as [green]{newimage}[/green]"
            )
            if "default_images" not in config:
                config["default_images"] = {}
            config["default_images"]["conda_store"] = newimage
        return config


class Upgrade_0_4_0(UpgradeStep):
    version = "0.4.0"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        """
        This version of Nebari introduces Keycloak for authentication, removes deprecated fields,
        and generates a default password for the Keycloak root user.
        """
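        # Use setdefault so the changes made to `security` below (for example, the generated
        # Keycloak root password) are stored in config even if it had no security block.
        security = config.setdefault("security", {})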
        users = security.get("users", {})
        groups = security.get("groups", {})

        # Custom Authenticators are no longer allowed
        if (
            config.get("security", {}).get("authentication", {}).get("type", "")
            == "custom"
        ):
            customauth_warning = (
                f"Custom Authenticators are no longer supported in {self.version} because Keycloak "
                "manages all authentication.\nYou need to find a way to support your authentication "
                "requirements within Keycloak."
            )
            if not kwargs.get("attempt_fixes", False):
                raise ValueError(
                    f"{customauth_warning}\n\nRun `nebari upgrade --attempt-fixes` to switch to basic Keycloak authentication instead."
                )
            else:
                rich.print(f"\nWARNING: {customauth_warning}")
                rich.print(
                    "\nSwitching to basic Keycloak authentication instead since you specified --attempt-fixes."
                )
                config["security"]["authentication"] = {"type": "password"}

        # Create a group/user import file for Keycloak

        realm_import_filename = config_filename.parent / "nebari-users-import.json"

        realm = {"id": "nebari", "realm": "nebari"}
        realm["users"] = [
            {
                "username": k,
                "enabled": True,
                "groups": sorted(
                    list(
                        (
                            {v.get("primary_group", "")}
                            | set(v.get("secondary_groups", []))
                        )
                        - {""}
                    )
                ),
            }
            for k, v in users.items()
        ]
        realm["groups"] = [
            {"name": k, "path": f"/{k}"}
            for k, v in groups.items()
            if k not in {"users", "admin"}
        ]

        backup_configuration(realm_import_filename)

        with realm_import_filename.open("wt") as f:
            json.dump(realm, f, indent=2)

        rich.print(
            f"\nSaving user/group import file [purple]{realm_import_filename}[/purple].\n\n"
            "ACTION REQUIRED: You must import this file into the Keycloak admin webpage after you redeploy Nebari.\n"
            "Visit the URL path /auth/ and log in as 'root'. Under Manage, click Import and select this file.\n\n"
            "Non-admin users will default to analyst group membership after the upgrade (no dask access), "
            "so you may wish to promote some users into the developer group.\n"
        )

        if "users" in security:
            del security["users"]
        if "groups" in security:
            if "users" in security["groups"]:
                # Ensure the users default group is added to Keycloak
                security["shared_users_group"] = True
            del security["groups"]

        if "terraform_modules" in config:
            del config["terraform_modules"]
            rich.print(
                "Removing terraform_modules field from config as it is no longer used.\n"
            )

        if "default_images" not in config:
            config["default_images"] = {}

        # Remove conda_store image from default_images
        if "conda_store" in config["default_images"]:
            del config["default_images"]["conda_store"]

        # Remove dask_gateway image from default_images
        if "dask_gateway" in config["default_images"]:
            del config["default_images"]["dask_gateway"]

        # Create root password
        default_password = "".join(
            secrets.choice(string.ascii_letters + string.digits) for i in range(16)
        )
        security.setdefault("keycloak", {})["initial_root_password"] = default_password

        rich.print(
            f"Generated default random password=[green]{default_password}[/green] for Keycloak root user (Please change at /auth/ URL path).\n"
        )

        # project was never needed in Azure - it remained as PLACEHOLDER in earlier nebari inits!
        if "azure" in config:
            if "project" in config["azure"]:
                del config["azure"]["project"]

        # "oauth_callback_url" and "scope" not required in nebari-config.yaml
        # for Auth0 and Github authentication
        auth_config = security.get("authentication", {}).get("config", None)
        if auth_config:
            if "oauth_callback_url" in auth_config:
                del auth_config["oauth_callback_url"]
            if "scope" in auth_config:
                del auth_config["scope"]

        # It is not safe to redeploy immediately: a new cluster will be created for the new version,
        # so data must first be backed up and be ready to restore.
        # Setting the following flag prevents deployment and displays guidance to the user,
        # which they can override once they are satisfied that they understand the situation.
        config["prevent_deploy"] = True

        return config


class Upgrade_0_4_1(UpgradeStep):
    version = "0.4.1"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        """
        Upgrade jupyterlab profiles.
        """
        rich.print("\nUpgrading jupyterlab profiles in order to specify access type:\n")

        profiles_jupyterlab = config.get("profiles", {}).get("jupyterlab", [])
        for profile in profiles_jupyterlab:
            name = profile.get("display_name", "")

            if "groups" in profile or "users" in profile:
                profile["access"] = "yaml"
            else:
                profile["access"] = "all"

            rich.print(
                f"Setting access type of JupyterLab profile [green]{name}[/green] to [green]{profile['access']}[/green]"
            )
        return config


class Upgrade_2023_4_2(UpgradeStep):
    version = "2023.4.2"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        """
        Prompt users to delete the Argo Workflows CRDs and service accounts,
        or delete them automatically when attempt_fixes is set.
        """
        argo_crds = [
            "clusterworkflowtemplates.argoproj.io",
            "cronworkflows.argoproj.io",
            "workfloweventbindings.argoproj.io",
            "workflows.argoproj.io",
            "workflowtasksets.argoproj.io",
            "workflowtemplates.argoproj.io",
        ]

        argo_sa = ["argo-admin", "argo-dev", "argo-view"]

        namespace = config.get("namespace", "default")

        if kwargs.get("attempt_fixes", False):
            try:
                kubernetes.config.load_kube_config()
            except kubernetes.config.config_exception.ConfigException:
                rich.print(
                    "[red bold]No default kube configuration file was found. Make sure to [link=https://www.nebari.dev/docs/how-tos/debug-nebari#generating-the-kubeconfig]have one pointing to your Nebari cluster[/link] before upgrading.[/red bold]"
                )
                exit()

            for crd in argo_crds:
                api_instance = kubernetes.client.ApiextensionsV1Api()
                try:
                    api_instance.delete_custom_resource_definition(
                        name=crd,
                    )
                except kubernetes.client.exceptions.ApiException as e:
                    if e.status == 404:
                        rich.print(f"CRD [yellow]{crd}[/yellow] not found. Ignoring.")
                    else:
                        raise e
                else:
                    rich.print(f"Successfully removed CRD [green]{crd}[/green]")

            for sa in argo_sa:
                api_instance = kubernetes.client.CoreV1Api()
                try:
                    api_instance.delete_namespaced_service_account(
                        sa,
                        namespace,
                    )
                except kubernetes.client.exceptions.ApiException as e:
                    if e.status == 404:
                        rich.print(
                            f"Service account [yellow]{sa}[/yellow] not found. Ignoring."
                        )
                    else:
                        raise e
                else:
                    rich.print(
                        f"Successfully removed service account [green]{sa}[/green]"
                    )
        else:
            kubectl_delete_argo_crds_cmd = " ".join(["kubectl delete crds", *argo_crds])
            kubectl_delete_argo_sa_cmd = " ".join(
                ["kubectl delete sa", f"-n {namespace}", *argo_sa]
            )
            rich.print(
                f"\n\n[bold cyan]Note:[/] Upgrading requires a one-time manual deletion of the Argo Workflows Custom Resource Definitions (CRDs) and service accounts. \n\n[red bold]"
                f"Warning:  [link=https://{config['domain']}/argo/workflows]Workflows[/link] and [link=https://{config['domain']}/argo/workflows]CronWorkflows[/link] created before deleting the CRDs will be erased when the CRDs are deleted and will not be restored.[/red bold] \n\n"
                f"The updated CRDs will be installed during the next [cyan bold]nebari deploy[/cyan bold] step. Argo Workflows will not function after deleting the CRDs until the updated CRDs and service accounts are installed in the next nebari deploy. "
                f"You must delete the Argo Workflows CRDs and service accounts before upgrading to {self.version} (or later) or the deploy step will fail.  "
                f"Please delete them before proceeding by generating a kubeconfig (see [link=https://www.nebari.dev/docs/how-tos/debug-nebari/#generating-the-kubeconfig]docs[/link]), installing kubectl (see [link=https://www.nebari.dev/docs/how-tos/debug-nebari#installing-kubectl]docs[/link]), and running the following two commands:\n\n\t[cyan bold]{kubectl_delete_argo_crds_cmd} [/cyan bold]\n\n\t[cyan bold]{kubectl_delete_argo_sa_cmd} [/cyan bold]"
            )

            continue_ = Confirm.ask(
                "Have you deleted the Argo Workflows CRDs and service accounts?",
                default=False,
            )
            if not continue_:
                rich.print(
                    f"You must delete the Argo Workflows CRDs and service accounts before upgrading to [green]{self.version}[/green] (or later)."
                )
                exit()

        return config


class Upgrade_2023_7_1(UpgradeStep):
    version = "2023.7.1"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        """
        Warn AWS users about the destructive upgrade and set the prevent_deploy flag.
        """
        provider = config["provider"]
        if provider == ProviderEnum.aws.value:
            rich.print("\n ⚠️  DANGER ⚠️")
            rich.print(
                DESTRUCTIVE_UPGRADE_WARNING,
                "The 'prevent_deploy' flag has been set in your config file and must be manually removed to deploy.",
            )
            config["prevent_deploy"] = True

        return config


class Upgrade_2023_7_2(UpgradeStep):
    version = "2023.7.2"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        argo = config.get("argo_workflows", {})
        if argo.get("enabled"):
            response = kwargs.get("attempt_fixes", False) or Confirm.ask(
                f"\nDo you want to enable the [green][link={NEBARI_WORKFLOW_CONTROLLER_DOCS}]Nebari Workflow Controller[/link][/green], required for [green][link={ARGO_JUPYTER_SCHEDULER_REPO}]Argo-Jupyter-Scheduler[/link][/green]?",
                default=True,
            )
            if response:
                argo["nebari_workflow_controller"] = {"enabled": True}

        rich.print("\n ⚠️ Deprecation Warnings ⚠️")
        rich.print(
            f"-> [green]{self.version}[/green] is the last Nebari version that supports CDS Dashboards"
        )

        return config


class Upgrade_2023_10_1(UpgradeStep):
    """
    Upgrade step for Nebari version 2023.10.1

    Note:
        Upgrading to 2023.10.1 is considered high-risk because it includes a major refactor
        to introduce the extension mechanism system. This version introduces significant
        changes, including the support for third-party plugins, upgrades JupyterHub to version 3.1,
        and deprecates certain components such as CDS Dashboards, ClearML, Prefect, and kbatch.
    """

    version = "2023.10.1"
    # JupyterHub Helm chart 2.0.0 (app version 3.0.0) requires K8S Version >=1.23. (reference: https://z2jh.jupyter.org/en/stable/)
    # This release has been tested against 1.26
    min_k8s_version = 1.26

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        # Upgrading to 2023.10.1 is considered high-risk because it includes a major refactor
        # to introduce the extension mechanism system.
        rich.print("\n ⚠️  Warning ⚠️")
        rich.print(
            f"-> Nebari version [green]{self.version}[/green] includes a major refactor to introduce an extension mechanism that supports the development of third-party plugins."
        )
        rich.print(
            "-> Data should be backed up before performing this upgrade ([green][link=https://www.nebari.dev/docs/how-tos/manual-backup]see docs[/link][/green]). The 'prevent_deploy' flag has been set in your config file and must be manually removed to deploy."
        )

        # Setting the following flag will prevent deployment and display guidance to the user
        # which they can override if they are happy they understand the situation.
        config["prevent_deploy"] = True

        # Nebari version 2023.10.1 upgrades JupyterHub to 3.1.  CDS Dashboards are only compatible with
        # JupyterHub versions 1.X and so will be removed during upgrade.
        rich.print("\n ⚠️  Deprecation Warning ⚠️")
        rich.print(
            f"-> CDS dashboards are no longer supported in Nebari version [green]{self.version}[/green] and will be uninstalled."
        )
        if config.get("cdsdashboards"):
            rich.print("-> Removing cdsdashboards from config file.")
            del config["cdsdashboards"]

        # Deprecation Warning - ClearML, Prefect, kbatch
        rich.print("\n ⚠️  Deprecation Warning ⚠️")
        rich.print(
            "-> We will be removing and ending support for ClearML, Prefect and kbatch in the next release. kbatch has been functionally replaced by Argo-Jupyter-Scheduler. We have seen little interest in ClearML and Prefect in recent years, so removing them makes sense at this point. However, if you wish to continue using them with Nebari, we encourage you to [green][link=https://www.nebari.dev/docs/how-tos/nebari-extension-system/#developing-an-extension]write your own Nebari extension[/link][/green]."
        )

        # Kubernetes version check
        # JupyterHub Helm chart 2.0.0 (app version 3.0.0) requires K8S Version >=1.23. (reference: https://z2jh.jupyter.org/en/stable/)

        provider = config["provider"]
        provider_config_block = get_provider_config_block_name(provider)

        # Get current Kubernetes version if available in config.
        current_version = config.get(provider_config_block, {}).get(
            "kubernetes_version", None
        )

        # Convert to decimal prefix
        if provider in ["aws", "azure", "gcp", "do"]:
            current_version = get_k8s_version_prefix(current_version)

        # Try to convert known Kubernetes versions to float.
        if current_version is not None:
            try:
                current_version = float(current_version)
            except ValueError:
                current_version = None

        # Handle checks for when Kubernetes version should be detectable
        if provider in ["aws", "azure", "gcp", "do"]:
            # Kubernetes version not found in provider block
            if current_version is None:
                rich.print("\n ⚠️  Warning ⚠️")
                rich.print(
                    f"-> Unable to detect Kubernetes version for provider {provider}.  Nebari version [green]{self.version}[/green] requires Kubernetes version {str(self.min_k8s_version)}.  Please confirm your Kubernetes version is configured before upgrading."
                )

            # Kubernetes version less than required minimum
            if (
                isinstance(current_version, float)
                and current_version < self.min_k8s_version
            ):
                rich.print("\n ⚠️  Warning ⚠️")
                rich.print(
                    f"-> Nebari version [green]{self.version}[/green] requires Kubernetes version {str(self.min_k8s_version)}.  Your configured Kubernetes version is [red]{current_version}[/red]. {UPGRADE_KUBERNETES_MESSAGE}"
                )
                version_diff = round(self.min_k8s_version - current_version, 2)
                if version_diff > 0.01:
                    rich.print(
                        "-> The Kubernetes version is multiple minor versions behind the minimum required version. You will need to perform the upgrade one minor version at a time.  For example, if your current version is 1.24, you will need to upgrade to 1.25, and then 1.26."
                    )
                rich.print(
                    f"-> Update the value of [green]{provider_config_block}.kubernetes_version[/green] in your config file to a newer version of Kubernetes and redeploy."
                )

        else:
            rich.print("\n ⚠️  Warning ⚠️")
            rich.print(
                f"-> Unable to detect Kubernetes version for provider {provider}.  Nebari version [green]{self.version}[/green] requires Kubernetes version {str(self.min_k8s_version)} or greater."
            )
            rich.print(
                "-> Please ensure your Kubernetes version is up-to-date before proceeding."
            )

        if provider == "aws":
            rich.print("\n ⚠️  DANGER ⚠️")
            rich.print(DESTRUCTIVE_UPGRADE_WARNING)

        if kwargs.get("attempt_fixes", False) or Confirm.ask(
            TERRAFORM_REMOVE_TERRAFORM_STAGE_FILES_CONFIRMATION,
            default=False,
        ):
            if (
                (_terraform_state_config := config.get("terraform_state"))
                and (_terraform_state_config.get("type") != "remote")
                and not Confirm.ask(
                    DESTROY_STAGE_FILES_WITH_TF_STATE_NOT_REMOTE,
                    default=False,
                )
            ):
                exit()

            self._rm_rf_stages(
                config_filename,
                dry_run=kwargs.get("dry_run", False),
                verbose=True,
            )

        return config


class Upgrade_2023_11_1(UpgradeStep):
    """
    Upgrade step for Nebari version 2023.11.1

    Note:
        - ClearML, Prefect, and kbatch are no longer supported in this version.
    """

    version = "2023.11.1"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        rich.print("\n ⚠️  Deprecation Warning ⚠️")
        rich.print(
            f"-> ClearML, Prefect and kbatch are no longer supported in Nebari version [green]{self.version}[/green] and will be uninstalled."
        )

        if kwargs.get("attempt_fixes", False) or Confirm.ask(
            TERRAFORM_REMOVE_TERRAFORM_STAGE_FILES_CONFIRMATION,
            default=False,
        ):
            if (
                (_terraform_state_config := config.get("terraform_state"))
                and (_terraform_state_config.get("type") != "remote")
                and not Confirm.ask(
                    DESTROY_STAGE_FILES_WITH_TF_STATE_NOT_REMOTE,
                    default=False,
                )
            ):
                exit()

            self._rm_rf_stages(
                config_filename,
                dry_run=kwargs.get("dry_run", False),
                verbose=True,
            )

        return config


class Upgrade_2023_12_1(UpgradeStep):
    """
    Upgrade step for Nebari version 2023.12.1

    Note:
        - This is the last version that supports the jupyterlab-videochat extension.
    """

    version = "2023.12.1"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        rich.print("\n ⚠️  Deprecation Warning ⚠️")
        rich.print(
            f"-> [green]{self.version}[/green] is the last Nebari version that supports the jupyterlab-videochat extension."
        )
        rich.print()

        if kwargs.get("attempt_fixes", False) or Confirm.ask(
            TERRAFORM_REMOVE_TERRAFORM_STAGE_FILES_CONFIRMATION,
            default=False,
        ):
            if (
                (_terraform_state_config := config.get("terraform_state"))
                and (_terraform_state_config.get("type") != "remote")
                and not Confirm.ask(
                    DESTROY_STAGE_FILES_WITH_TF_STATE_NOT_REMOTE,
                    default=False,
                )
            ):
                exit()

            self._rm_rf_stages(
                config_filename,
                dry_run=kwargs.get("dry_run", False),
                verbose=True,
            )

        return config


class Upgrade_2024_1_1(UpgradeStep):
    """
    Upgrade step for Nebari version 2024.1.1

    Note:
        - jupyterlab-videochat, retrolab, jupyter-tensorboard, jupyterlab-conda-store, and jupyter-nvdashboard are no longer supported.
    """

    version = "2024.1.1"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        rich.print("\n ⚠️  Deprecation Warning ⚠️")
        rich.print(
            "-> jupyterlab-videochat, retrolab, jupyter-tensorboard, jupyterlab-conda-store and jupyter-nvdashboard",
            f"are no longer supported in Nebari version [green]{self.version}[/green] and will be uninstalled.",
        )
        rich.print()

        if kwargs.get("attempt_fixes", False) or Confirm.ask(
            TERRAFORM_REMOVE_TERRAFORM_STAGE_FILES_CONFIRMATION,
            default=False,
        ):
            if (
                (_terraform_state_config := config.get("terraform_state"))
                and (_terraform_state_config.get("type") != "remote")
                and not Confirm.ask(
                    DESTROY_STAGE_FILES_WITH_TF_STATE_NOT_REMOTE,
                    default=False,
                )
            ):
                exit()

            self._rm_rf_stages(
                config_filename,
                dry_run=kwargs.get("dry_run", False),
                verbose=True,
            )

        return config


class Upgrade_2024_3_1(UpgradeStep):
    version = "2024.3.1"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        rich.print("Ready to upgrade to Nebari version [green]2024.3.1[/green].")

        return config


class Upgrade_2024_3_2(UpgradeStep):
    version = "2024.3.2"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        rich.print("Ready to upgrade to Nebari version [green]2024.3.2[/green].")

        return config


class Upgrade_2024_3_3(UpgradeStep):
    version = "2024.3.3"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        rich.print("Ready to upgrade to Nebari version [green]2024.3.3[/green].")

        return config


class Upgrade_2024_4_1(UpgradeStep):
    """
    Upgrade step for Nebari version 2024.4.1

    Note:
        - Adds default configuration for node groups if not already defined.
    """

    version = "2024.4.1"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        # Default configuration for the node groups was added in this version. Users who
        # upgrade without any node groups defined in their config file are therefore asked
        # whether they want to include the default node groups.
        if provider := config.get("provider", ""):
            provider_full_name = provider_enum_name_map[provider]
            if provider_full_name in config and "node_groups" not in config.get(
                provider_full_name, {}
            ):
                try:
                    default_node_groups = provider_enum_default_node_groups_map[
                        provider
                    ]
                    continue_ = kwargs.get("attempt_fixes", False) or Confirm.ask(
                        f"Would you like to include the default configuration for the node groups in [purple]{config_filename}[/purple]?",
                        default=False,
                    )
                    if continue_:
                        config[provider_full_name]["node_groups"] = default_node_groups
                except KeyError:
                    pass

        return config


class Upgrade_2024_5_1(UpgradeStep):
    version = "2024.5.1"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        rich.print("Ready to upgrade to Nebari version [green]2024.5.1[/green].")

        return config


class Upgrade_2024_6_1(UpgradeStep):
    """
    Upgrade step for version 2024.6.1

    This upgrade includes:
    - Manual updates for kube-prometheus-stack CRDs if monitoring is enabled.
    - Prompts to upgrade GCP node groups to more cost-efficient instances.
    """

    version = "2024.6.1"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        # Prompt users to manually update kube-prometheus-stack CRDs if monitoring is enabled
        if config.get("monitoring", {}).get("enabled", True):
            crd_urls = [
                "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.73.0/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagerconfigs.yaml",
                "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.73.0/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagers.yaml",
                "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.73.0/example/prometheus-operator-crd/monitoring.coreos.com_podmonitors.yaml",
                "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.73.0/example/prometheus-operator-crd/monitoring.coreos.com_probes.yaml",
                "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.73.0/example/prometheus-operator-crd/monitoring.coreos.com_prometheusagents.yaml",
                "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.73.0/example/prometheus-operator-crd/monitoring.coreos.com_prometheuses.yaml",
                "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.73.0/example/prometheus-operator-crd/monitoring.coreos.com_prometheusrules.yaml",
                "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.73.0/example/prometheus-operator-crd/monitoring.coreos.com_scrapeconfigs.yaml",
                "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.73.0/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml",
                "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.73.0/example/prometheus-operator-crd/monitoring.coreos.com_thanosrulers.yaml",
            ]
            daemonset_name = "prometheus-node-exporter"
            namespace = config.get("namespace", "default")

            # We're upgrading from version 30.1.0 to 58.4.0. This is a major upgrade and requires manual intervention.
            # See https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/README.md#upgrading-chart
            # for more information on why the following commands are necessary.
            commands = "[cyan bold]"
            for url in crd_urls:
                commands += f"kubectl apply --server-side --force-conflicts -f {url}\n"
            commands += f"kubectl delete daemonset -l app={daemonset_name} --namespace {namespace}\n"
            commands += "[/cyan bold]"

            rich.print(
                "\n ⚠️  Warning ⚠️"
                "\n-> [red bold]Nebari version 2024.6.1 comes with a new version of Grafana. Any custom dashboards that you created will be deleted after upgrading Nebari. Make sure to [link=https://grafana.com/docs/grafana/latest/dashboards/share-dashboards-panels/#export-a-dashboard-as-json]export them as JSON[/link] so you can [link=https://grafana.com/docs/grafana/latest/dashboards/build-dashboards/import-dashboards/#import-a-dashboard]import them[/link] again afterwards.[/red bold]"
                f"\n-> [red bold]Before upgrading, kube-prometheus-stack CRDs need to be updated and the {daemonset_name} daemonset needs to be deleted.[/red bold]"
            )
            run_commands = kwargs.get("attempt_fixes", False) or Confirm.ask(
                "\nDo you want Nebari to update the kube-prometheus-stack CRDs and delete the prometheus-node-exporter for you? If not, you'll have to do it manually.",
                default=False,
            )

            # By default, rich wraps lines by splitting them into multiple lines. This is
            # far from ideal, as users copy-pasting the commands will get errors when running them.
            # To avoid this, we use a rich console with a larger width to print the entire commands
            # and let the terminal wrap them if needed.
            console = rich.console.Console(width=220)
            if run_commands:
                try:
                    kubernetes.config.load_kube_config()
                except kubernetes.config.config_exception.ConfigException:
                    rich.print(
                        "[red bold]No default kube configuration file was found. Make sure to [link=https://www.nebari.dev/docs/how-tos/debug-nebari#generating-the-kubeconfig]have one pointing to your Nebari cluster[/link] before upgrading.[/red bold]"
                    )
                    exit()
                current_kube_context = kubernetes.config.list_kube_config_contexts()[1]
                cluster_name = current_kube_context["context"]["cluster"]
                rich.print(
                    f"The following commands will be run for the [cyan bold]{cluster_name}[/cyan bold] cluster"
                )
                _ = kwargs.get("attempt_fixes", False) or Prompt.ask(
                    "Hit enter to show the commands"
                )
                console.print(commands)

                _ = kwargs.get("attempt_fixes", False) or Prompt.ask(
                    "Hit enter to continue"
                )
                # We need to add a special constructor to the yaml loader to handle a specific
                # tag as otherwise the kubernetes API will fail when updating the CRD.
                yaml.constructor.add_constructor(
                    "tag:yaml.org,2002:value", lambda loader, node: node.value
                )
                for url in crd_urls:
                    response = requests.get(url)
                    response.raise_for_status()
                    crd = yaml.load(response.text)
                    crd_name = crd["metadata"]["name"]
                    api_instance = kubernetes.client.ApiextensionsV1Api()
                    try:
                        api_response = api_instance.read_custom_resource_definition(
                            name=crd_name
                        )
                    except kubernetes.client.exceptions.ApiException:
                        api_response = api_instance.create_custom_resource_definition(
                            body=crd
                        )
                    else:
                        api_response = api_instance.patch_custom_resource_definition(
                            name=crd["metadata"]["name"], body=crd
                        )

                api_instance = kubernetes.client.AppsV1Api()
                api_response = api_instance.list_namespaced_daemon_set(
                    namespace=namespace, label_selector=f"app={daemonset_name}"
                )
                if api_response.items:
                    api_instance.delete_namespaced_daemon_set(
                        name=api_response.items[0].metadata.name,
                        namespace=namespace,
                    )

                rich.print(
                    f"The kube-prometheus-stack CRDs have been updated and the {daemonset_name} daemonset has been deleted."
                )
            else:
                rich.print(
                    "[red bold]Before upgrading, you need to manually delete the prometheus-node-exporter daemonset and update the kube-prometheus-stack CRDs. To do that, please run the following commands.[/red bold]"
                )
                _ = Prompt.ask("Hit enter to show the commands")
                console.print(commands)

                _ = Prompt.ask("Hit enter to continue")
                continue_ = Confirm.ask(
                    f"Have you backed up your custom dashboards (if necessary), deleted the {daemonset_name} daemonset and updated the kube-prometheus-stack CRDs?",
                    default=False,
                )
                if not continue_:
                    rich.print(
                        f"[red bold]You must back up your custom dashboards (if necessary), delete the {daemonset_name} daemonset and update the kube-prometheus-stack CRDs before upgrading to [green]{self.version}[/green] (or later).[/red bold]"
                    )
                    exit()

        # Prompt users to upgrade to the new default node groups for GCP
        if (provider := config.get("provider", "")) == ProviderEnum.gcp.value:
            provider_full_name = provider_enum_name_map[provider]
            if not config.get(provider_full_name, {}).get("node_groups", {}):
                try:
                    text = textwrap.dedent(
                        f"""
                        The default node groups for GCP have been changed to cost-efficient e2 family nodes, reducing the running cost of Nebari on GCP by ~50%.
                        This change will affect your current deployment, and will result in ~15 minutes of downtime during the upgrade step as the node groups are switched out, but shouldn't result in data loss.

                        [red bold]Note: If upgrading to the new node types, the upgrade process will take longer than usual. For this upgrade only, you'll likely see a timeout \
                        error and need to restart the deployment process afterwards in order to upgrade successfully.[/red bold]

                        As always, make sure to back up your data before upgrading. See https://www.nebari.dev/docs/how-tos/manual-backup for more information.

                        Would you like to upgrade to the cost-effective node groups in [purple]{config_filename}[/purple]?
                        If not, select "N" and the old default node groups will be added to the nebari config file.
                    """
                    )
                    continue_ = kwargs.get("attempt_fixes", False) or Confirm.ask(
                        text,
                        default=True,
                    )
                    if not continue_:
                        config[provider_full_name]["node_groups"] = {
                            "general": {
                                "instance": "n1-standard-8",
                                "min_nodes": 1,
                                "max_nodes": 1,
                            },
                            "user": {
                                "instance": "n1-standard-4",
                                "min_nodes": 0,
                                "max_nodes": 5,
                            },
                            "worker": {
                                "instance": "n1-standard-4",
                                "min_nodes": 0,
                                "max_nodes": 5,
                            },
                        }
                except KeyError:
                    pass
            else:
                text = textwrap.dedent(
                    """
                    The default node groups for GCP have been changed to cost-efficient e2 family nodes, reducing the running cost of Nebari on GCP by ~50%.
                    Consider upgrading your node group instance types to the new default configuration.

                    Upgrading your general node will result in ~15 minutes of downtime during the upgrade step as the node groups are switched out, but shouldn't result in data loss.

                    As always, make sure to back up your data before upgrading. See https://www.nebari.dev/docs/how-tos/manual-backup for more information.

                    The new default node groups instances are:
                """
                )
                text += json.dumps(
                    {
                        "general": {"instance": "e2-highmem-4"},
                        "user": {"instance": "e2-standard-4"},
                        "worker": {"instance": "e2-standard-4"},
                    },
                    indent=4,
                )
                rich.print(text)
                if not kwargs.get("attempt_fixes", False):
                    _ = Prompt.ask("\n\nHit enter to continue")
        return config


class Upgrade_2024_7_1(UpgradeStep):
    """
    Upgrade step for Nebari version 2024.7.1

    Note:
        - Digital Ocean deprecation warning.
    """

    version = "2024.7.1"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        if config.get("provider", "") == "do":
            rich.print("\n ⚠️  Deprecation Warning ⚠️")
            rich.print(
                "-> Digital Ocean support is currently being deprecated and will be removed in a future release.",
            )
            rich.print("")
        return config


class Upgrade_2024_9_1(UpgradeStep):
    """
    Upgrade step for Nebari version 2024.9.1

    """

    version = "2024.9.1"

    # Nebari version 2024.9.1 has been marked as broken, and will be skipped:
    # https://github.com/nebari-dev/nebari/issues/2798
    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        return config


class Upgrade_2024_11_1(UpgradeStep):
    """
    Upgrade step for Nebari version 2024.11.1
    """

    version = "2024.11.1"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        if config.get("provider", "") == ProviderEnum.azure.value:
            rich.print("\n ⚠️ Upgrade Warning ⚠️")
            rich.print(
                textwrap.dedent(
                    """
                -> Please ensure no users are currently logged in prior to deploying this update.  The node groups will be destroyed and recreated during the deployment process causing a downtime of approximately 15 minutes.

                Due to an upstream issue, Azure Nebari deployments may raise an error when deploying for the first time after this upgrade. Waiting for a few minutes and then re-running `nebari deploy` should resolve the issue.  More info can be found at [green][link=https://github.com/nebari-dev/nebari/issues/2640]issue #2640[/link][/green]."""
                ),
            )
            rich.print("")
        elif config.get("provider", "") == "do":
            rich.print("\n ⚠️  Deprecation Warning ⚠️")
            rich.print(
                "-> Digital Ocean support is currently being deprecated and will be removed in a future release.",
            )
            rich.print("")

        rich.print("\n ⚠️ Upgrade Warning ⚠️")

        text = textwrap.dedent(
            """
            Please ensure no users are currently logged in prior to deploying this
            update.

            This release introduces changes to how group directories are mounted in
            JupyterLab pods.

            Previously, every Keycloak group in the Nebari realm automatically created a
            shared directory at ~/shared/<group-name>, accessible to all group members
            in their JupyterLab pods.

            Moving forward, only groups assigned the JupyterHub client role
            [magenta]allow-group-directory-creation[/magenta] or its affiliated scope
            [magenta]write:shared-mount[/magenta] will have their directories mounted.

            By default, the admin, analyst, and developer groups will have this
            role assigned during the upgrade. For other groups, you'll now need to
            assign this role manually in the Keycloak UI to have their directories
            mounted.

            For more details check our [green][link=https://www.nebari.dev/docs/references/release/]release notes[/link][/green].
            """
        )
        rich.print(text)
        keycloak_admin = None

        # Prompt the user for role assignment (if yes, transforms the response into bool)
        # This needs to be monkeypatched and will be addressed in a future PR. Until then, this causes test failures.
        assign_roles = kwargs.get("attempt_fixes", False) or Confirm.ask(
            "[bold]Would you like Nebari to assign the corresponding role/scopes to all of your current groups automatically?[/bold]",
            default=False,
        )

        if assign_roles:
            # In case this is done with a local deployment
            import urllib3

            urllib3.disable_warnings()

            keycloak_username = os.environ.get("KEYCLOAK_ADMIN_USERNAME", "root")
            keycloak_password = os.environ.get(
                "KEYCLOAK_ADMIN_PASSWORD",
                config["security"]["keycloak"]["initial_root_password"],
            )

            try:
                # Quick test to connect to Keycloak
                keycloak_admin = get_keycloak_admin(
                    server_url=f"https://{config['domain']}/auth/",
                    username=keycloak_username,
                    password=keycloak_password,
                )
            except ValueError as e:
                if "invalid_grant" in str(e):
                    rich.print(
                        textwrap.dedent(
                            """
                            [red bold]Failed to connect to the Keycloak server.[/red bold]\n
                            [yellow]Please set the [bold]KEYCLOAK_ADMIN_USERNAME[/bold] and [bold]KEYCLOAK_ADMIN_PASSWORD[/bold]
                            environment variables with the Keycloak root credentials and try again.[/yellow]
                            """
                        )
                    )
                    exit()
                else:
                    # Handle other exceptions
                    rich.print(
                        f"[red bold]An unexpected error occurred: {repr(e)}[/red bold]"
                    )
                    exit()

            # Get client ID as role is bound to the JupyterHub client
            client_id = keycloak_admin.get_client_id("jupyterhub")
            role_name = "legacy-group-directory-creation-role"

            # Create role with shared scopes
            keycloak_admin.create_client_role(
                client_role_id=client_id,
                skip_exists=True,
                payload={
                    "name": role_name,
                    "attributes": {
                        "scopes": ["write:shared-mount"],
                        "component": ["shared-directory"],
                    },
                    "description": (
                        "Role to allow group directory creation, created as part of the "
                        "Nebari 2024.11.1 upgrade workflow."
                    ),
                },
            )

            role_id = keycloak_admin.get_client_role_id(
                client_id=client_id, role_name=role_name
            )

            role_representation = keycloak_admin.get_role_by_id(role_id=role_id)

            # Fetch all groups and groups with the role
            all_groups = keycloak_admin.get_groups()
            groups_with_role = keycloak_admin.get_client_role_groups(
                client_id=client_id, role_name=role_name
            )
            groups_with_role_ids = {group["id"] for group in groups_with_role}

            # Identify groups without the role
            groups_without_role = [
                group for group in all_groups if group["id"] not in groups_with_role_ids
            ]

            if groups_without_role:
                group_names = ", ".join(group["name"] for group in groups_without_role)
                rich.print(
                    f"\n[bold]Updating the following groups with the required permissions:[/bold] {group_names}\n"
                )
                for group in groups_without_role:
                    keycloak_admin.assign_group_client_roles(
                        group_id=group["id"],
                        client_id=client_id,
                        roles=[role_representation],
                    )
                rich.print(
                    "\n[green]Group permissions have been updated successfully.[/green]"
                )
            else:
                rich.print(
                    "\n[green]All groups already have the required permissions.[/green]"
                )
        return config


class Upgrade_2024_12_1(UpgradeStep):
    """
    Upgrade step for Nebari version 2024.12.1
    """

    version = "2024.12.1"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        if config.get("provider", "") == "do":
            rich.print(
                "\n[red bold]Error: DigitalOcean is no longer supported as a provider[/red bold].",
            )
            rich.print(
                "You can still deploy Nebari to a Kubernetes cluster on DigitalOcean by using 'existing' as the provider in the config file."
            )
            exit()

        rich.print("Ready to upgrade to Nebari version [green]2024.12.1[/green].")

        return config


class Upgrade_2025_2_1(UpgradeStep):
    version = "2025.2.1"

    @override
    def _version_specific_upgrade(
        self, config, start_version, config_filename: Path, *args, **kwargs
    ):
        rich.print("\n ⚠️ Upgrade Warning ⚠️")

        text = textwrap.dedent(
            """
            In this release, we have updated our maximum supported Kubernetes version from 1.29 to 1.31.
            Please note that Nebari will NOT automatically upgrade your running Kubernetes version as part of
            the redeployment process.

            After completing this upgrade step, we strongly recommend updating the Kubernetes version
            specified in your nebari-config YAML file and redeploying to apply the changes. Remember that
            Kubernetes minor versions must be upgraded incrementally (1.29 → 1.30 → 1.31).

            For more information on upgrading Kubernetes for your specific cloud provider, please visit:
            https://www.nebari.dev/docs/how-tos/kubernetes-version-upgrade
            """
        )

        rich.print(text)
        rich.print("Ready to upgrade to Nebari version [green]2025.2.1[/green].")

        return config


__rounded_version__ = str(rounded_ver_parse(__version__))

# Manually-added upgrade steps must go above this line
if not UpgradeStep.has_step(__rounded_version__):
    # Always have a way to upgrade to the latest full version number, even if no customizations
    # Don't let dev/prerelease versions cloud things
    class UpgradeLatest(UpgradeStep):
        """
        Upgrade step for the latest available version.

        This class ensures there is always an upgrade path to the latest version,
        even if no specific upgrade steps are defined for the current version.
        """

        version = __rounded_version__



---
File: nebari/src/_nebari/utils.py
---

import contextlib
import enum
import functools
import json
import os
import re
import secrets
import signal
import string
import subprocess
import sys
import threading
import time
import warnings
from pathlib import Path
from typing import Any, Dict, List, Set

from ruamel.yaml import YAML

from _nebari import constants

# environment variable overrides
NEBARI_GH_BRANCH = os.getenv("NEBARI_GH_BRANCH", None)

AZURE_TF_STATE_RESOURCE_GROUP_SUFFIX = "-state"
AZURE_NODE_RESOURCE_GROUP_SUFFIX = "-node-resource-group"

# Create a ruamel object with our favored config, for universal use
yaml = YAML()
yaml.preserve_quotes = True
yaml.default_flow_style = False


@contextlib.contextmanager
def timer(logger, prefix):
    start_time = time.time()
    yield
    logger.info(f"{prefix} took {time.time() - start_time:.3f} [s]")


@contextlib.contextmanager
def change_directory(directory):
    current_directory = Path.cwd()
    os.chdir(directory)
    try:
        yield
    finally:
        # Restore the original working directory even if the body raises.
        os.chdir(current_directory)


def run_subprocess_cmd(processargs, prefix=b"", capture_output=False, **kwargs):
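    """Run a subprocess command, streaming its output to stdout in real time.

    Each streamed line is preceded by ``[prefix]: `` when ``prefix`` is given.

    If ``capture_output`` is True, stderr is streamed and stdout is collected and
    returned as bytes; otherwise stdout and stderr are streamed together.

    Extra keyword arguments are passed to ``subprocess.Popen``, except for:

    - ``timeout``: seconds after which SIGTERM is sent to the process group.
    - ``strip_errors``: remove the red ANSI escape code from streamed lines.

    Returns a tuple ``(exit_code, captured_stdout_or_None)``.
    """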
    if prefix:
        line_prefix = f"[{prefix}]: ".encode("utf-8")
    else:
        line_prefix = b""

    if capture_output:
        stderr_stream = subprocess.PIPE
    else:
        stderr_stream = subprocess.STDOUT

    timeout = kwargs.pop("timeout", 0)  # in seconds

    strip_errors = kwargs.pop("strip_errors", False)

    process = subprocess.Popen(
        processargs,
        **kwargs,
        stdout=subprocess.PIPE,
        stderr=stderr_stream,
        preexec_fn=os.setsid,
    )
    # Set timeout thread
    timeout_timer = None
    if timeout > 0:

        def kill_process():
            try:
                os.killpg(process.pid, signal.SIGTERM)
            except ProcessLookupError:
                pass  # Already finished

        timeout_timer = threading.Timer(timeout, kill_process)
        timeout_timer.start()

    print_stream = process.stderr if capture_output else process.stdout
    for line in iter(lambda: print_stream.readline(), b""):
        full_line = line_prefix + line
        if strip_errors:
            full_line = full_line.decode("utf-8")
            full_line = re.sub(
                r"\x1b\[31m", "", full_line
            )  # Remove red ANSI escape code
            full_line = full_line.encode("utf-8")

        sys.stdout.buffer.write(full_line)
        sys.stdout.flush()
    print_stream.close()

    output = []
    if capture_output:
        for line in iter(lambda: process.stdout.readline(), b""):
            output.append(line)
        process.stdout.close()

    if timeout_timer is not None:
        timeout_timer.cancel()

    exit_code = process.wait(
        timeout=10
    )  # Should already have finished because we have drained stdout

    if capture_output:
        return exit_code, b"".join(output)
    else:
        return exit_code, None


def load_yaml(config_filename: Path):
    """
    Return yaml dict containing config loaded from config_filename.
    """
    with open(config_filename) as f:
        config = yaml.load(f.read())

    return config


@contextlib.contextmanager
def modified_environ(*remove: str, **update: str):
    """
    https://stackoverflow.com/questions/2059482/python-temporarily-modify-the-current-processs-environment/51754362
    Temporarily updates the ``os.environ`` dictionary in-place.

    The ``os.environ`` dictionary is updated in-place so that the modification
    is sure to work in all situations.

    :param remove: Environment variables to remove.
    :param update: Dictionary of environment variables and values to add/update.
    """
    env = os.environ
    update = update or {}
    remove = remove or []

    # List of environment variables being updated or removed.
    stomped = (set(update.keys()) | set(remove)) & set(env.keys())
    # Environment variables and values to restore on exit.
    update_after = {k: env[k] for k in stomped}
    # Environment variables and values to remove on exit.
    remove_after = frozenset(k for k in update if k not in env)

    try:
        env.update(update)
        for k in remove:
            env.pop(k, None)
        yield
    finally:
        env.update(update_after)
        # Tolerate variables that the wrapped block already removed itself.
        for k in remove_after:
            env.pop(k, None)


def deep_merge(*args):
    """Deep merge multiple dictionaries.  Preserves order in dicts and lists.

    When both values are dicts they are merged recursively, when both are lists
    they are concatenated, and otherwise the left-most value wins.

    >>> value_1 = {
    ...     'a': [1, 2],
    ...     'b': {'c': 1, 'z': [5, 6]},
    ...     'e': {'f': {'g': {}}},
    ...     'm': 1,
    ... }

    >>> value_2 = {
    ...     'a': [3, 4],
    ...     'b': {'d': 2, 'z': [7]},
    ...     'e': {'f': {'h': 1}},
    ...     'm': [1],
    ... }

    >>> print(deep_merge(value_1, value_2))
    {'a': [1, 2, 3, 4], 'b': {'c': 1, 'z': [5, 6, 7], 'd': 2}, 'e': {'f': {'g': {}, 'h': 1}}, 'm': 1}
    """
    if len(args) == 0:
        return {}
    elif len(args) == 1:
        return args[0]
    elif len(args) > 2:
        return functools.reduce(deep_merge, args, {})
    else:  # length 2
        d1, d2 = args

    if isinstance(d1, dict) and isinstance(d2, dict):
        d3 = {}
        for key in tuple(d1.keys()) + tuple(d2.keys()):
            if key in d1 and key in d2:
                d3[key] = deep_merge(d1[key], d2[key])
            elif key in d1:
                d3[key] = d1[key]
            elif key in d2:
                d3[key] = d2[key]
        return d3
    elif isinstance(d1, list) and isinstance(d2, list):
        return [*d1, *d2]
    else:  # if they don't match use left one
        return d1


# https://github.com/minrk/escapism/blob/master/escapism.py
def escape_string(
    to_escape,
    safe=set(string.ascii_letters + string.digits),
    escape_char="_",
    allow_collisions=False,
):
    """Escape a string so that it only contains characters in a safe set.

    Characters outside the safe set are escaped as `escape_char` followed by
    the uppercase hex value of each of their UTF-8 bytes (for example, "-"
    becomes "_2D" with the default escape character).

    If `allow_collisions` is True, occurrences of `escape_char`
    in the input will not be escaped.

    In this case, `unescape` cannot be used to reverse the transform
    because occurrences of the escape char in the resulting string are ambiguous.
    Only use this mode when:

    1. collisions cannot occur or do not matter, and
    2. unescape will never be called.

    .. versionadded: 1.0
        allow_collisions argument.
        Prior to 1.0, behavior was the same as allow_collisions=False (default).

    """
    if sys.version_info >= (3,):

        def _ord(byte):
            return byte

        def _bchr(n):
            return bytes([n])

    else:
        _ord = ord
        _bchr = chr

    def _escape_char(c, escape_char):
        """Escape a single character"""
        buf = []
        for byte in c.encode("utf8"):
            buf.append(escape_char)
            buf.append("%X" % _ord(byte))
        return "".join(buf)

    if isinstance(to_escape, bytes):
        # always work on text
        to_escape = to_escape.decode("utf8")

    # Always copy: `safe` may be the shared default set or a caller-owned set,
    # and it is mutated below.
    safe = set(safe)

    if allow_collisions:
        safe.add(escape_char)
    elif escape_char in safe:
        warnings.warn(
            "Escape character %r cannot be a safe character."
            " Set allow_collisions=True if you want to allow ambiguous escaped strings."
            % escape_char,
            RuntimeWarning,
            stacklevel=2,
        )
        safe.remove(escape_char)

    chars = []
    for c in to_escape:
        if c in safe:
            chars.append(c)
        else:
            chars.append(_escape_char(c, escape_char))

    return "".join(chars)
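

# The docstring above refers to an `unescape` counterpart that is not defined in this
# module. A minimal sketch of one follows, assuming the default (non-collision) mode
# and two-hex-digit escapes. `_unescape_string_sketch` is a hypothetical name, not
# part of escapism's or Nebari's API. Bytes below 0x10 are emitted by
# `escape_string` as a single hex digit and are NOT handled here.
import re


def _unescape_string_sketch(escaped: str, escape_char: str = "_") -> str:
    """Turn `escape_char` + two hex digits back into the original UTF-8 bytes."""
    pattern = re.compile(re.escape(escape_char).encode("utf8") + b"([0-9A-Fa-f]{2})")

    def _to_byte(match):
        # Each escape encodes exactly one byte of the UTF-8 representation.
        return bytes([int(match.group(1), 16)])

    return pattern.sub(_to_byte, escaped.encode("utf8")).decode("utf8")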


def random_secure_string(
    length: int = 16, chars: str = string.ascii_lowercase + string.digits
):
    return "".join(secrets.choice(chars) for i in range(length))


def set_docker_image_tag() -> str:
    """Set docker image tag for `jupyterlab`, `jupyterhub`, and `dask-worker`."""
    return os.environ.get("NEBARI_IMAGE_TAG", constants.DEFAULT_NEBARI_IMAGE_TAG)


def set_nebari_dask_version() -> str:
    """Set version of `nebari-dask` meta package."""
    return os.environ.get("NEBARI_DASK_VERSION", constants.DEFAULT_NEBARI_DASK_VERSION)


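def get_latest_kubernetes_version(versions: List[str]) -> str:
    """Return the highest version, comparing numeric parts as integers."""
    # A plain string sort would rank "1.9" above "1.10".
    return max(versions, key=lambda v: [int(n) for n in re.findall(r"\d+", v)])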


def construct_azure_resource_group_name(
    project_name: str = "",
    namespace: str = "",
    base_resource_group_name: str = "",
    suffix: str = "",
) -> str:
    """
    Construct a resource group name for Azure.

    If the base_resource_group_name is provided, it will be used as the base,
    otherwise default to the project_name-namespace.
    """
    if base_resource_group_name:
        return f"{base_resource_group_name}{suffix}"
    return f"{project_name}-{namespace}{suffix}"


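def get_k8s_version_prefix(k8s_version: str) -> float:
    """Return the major.minor part of a k8s version as a float, or None if unparsable.

    Because the result is a float, a minor version such as "1.10" collapses to 1.1.
    """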

    k8s_version = str(k8s_version)
    # Split the input string by the first decimal point
    parts = k8s_version.split(".", 1)

    if len(parts) == 2:
        # Extract the part before the second decimal point
        before_second_decimal = parts[0] + "." + parts[1].split(".")[0]
        try:
            # Convert the extracted part to a float
            result = float(before_second_decimal)
            return result
        except ValueError:
            # Handle the case where the conversion to float fails
            return None
    else:
        # Handle the case where there is no decimal point at all
        return None


def get_provider_config_block_name(provider):
    """Map a provider short name to the name of its config block.

    Providers without a dedicated mapping are returned unchanged.
    """
    PROVIDER_CONFIG_NAMES = {
        "aws": "amazon_web_services",
        "azure": "azure",
        "gcp": "google_cloud_platform",
    }

    return PROVIDER_CONFIG_NAMES.get(provider, provider)


def check_environment_variables(variables: Set[str], reference: str) -> None:
    """Check that environment variables are set."""
    required_variables = {
        variable: os.environ.get(variable, None) for variable in variables
    }
    missing_variables = {
        variable for variable, value in required_variables.items() if value is None
    }
    if missing_variables:
        raise ValueError(
            f"""Missing the following required environment variables: {missing_variables}\n
            Please see the documentation for more information: {reference}"""
        )


def byte_unit_conversion(byte_size_str: str, output_unit: str = "B") -> float:
    """Converts string representation of byte size to another unit and returns float output

    e.g. byte_unit_conversion("1 KB", "B") -> 1000.0
    e.g. byte_unit_conversion("1 KiB", "B") -> 1024.0
    """
    byte_size_str = byte_size_str.lower()
    output_unit = output_unit.lower()

    units_multiplier = {
        "b": 1,
        "k": 1000,
        "m": 1000**2,
        "g": 1000**3,
        "t": 1000**4,
        "kb": 1000,
        "mb": 1000**2,
        "gb": 1000**3,
        "tb": 1000**4,
        "ki": 1024,
        "mi": 1024**2,
        "gi": 1024**3,
        "ti": 1024**4,
        "kib": 1024,
        "mib": 1024**2,
        "gib": 1024**3,
        "tib": 1024**4,
    }

    if output_unit not in units_multiplier:
        raise ValueError(
            f'Invalid output unit "{output_unit}".  Valid units are {list(units_multiplier.keys())}'
        )

    str_pattern = r"^\s*(\d+(?:\.\d*)?)\s*([a-zA-Z]*)\s*$"
    pattern = re.compile(str_pattern, re.IGNORECASE)
    match = pattern.search(byte_size_str)

    if not match:
        raise ValueError("Invalid byte size string")
    value = float(match.group(1))
    input_unit = match.group(2)
    if not input_unit:
        input_unit = "b"

    if input_unit not in units_multiplier:
        raise ValueError(
            f'Invalid input unit "{input_unit}".  Valid units are {list(units_multiplier.keys())}'
        )

    return value * units_multiplier[input_unit] / units_multiplier[output_unit]


class JsonDiffEnum(str, enum.Enum):
    ADDED = "+"
    REMOVED = "-"
    MODIFIED = "!"


class JsonDiff:
    def __init__(self, obj1: Dict[str, Any], obj2: Dict[str, Any]):
        self.diff = self.json_diff(obj1, obj2)

    @staticmethod
    def json_diff(obj1: Dict[str, Any], obj2: Dict[str, Any]) -> Dict[str, Any]:
        """Calculates the diff between two json-like objects

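        # Example usage
        obj1 = {"a": 1, "b": {"c": 2, "d": 3}}
        obj2 = {"a": 1, "b": {"c": 2, "e": 4}, "f": 5}

        result = JsonDiff.json_diff(obj1, obj2)
        # result == {
        #     "b": {"d": {JsonDiffEnum.REMOVED: 3}, "e": {JsonDiffEnum.ADDED: 4}},
        #     "f": {JsonDiffEnum.ADDED: 5},
        # }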
        """
        diff = {}
        for key in set(obj1.keys()) | set(obj2.keys()):
            if key not in obj1:
                diff[key] = {JsonDiffEnum.ADDED: obj2[key]}
            elif key not in obj2:
                diff[key] = {JsonDiffEnum.REMOVED: obj1[key]}
            elif obj1[key] != obj2[key]:
                if isinstance(obj1[key], dict) and isinstance(obj2[key], dict):
                    nested_diff = JsonDiff.json_diff(obj1[key], obj2[key])
                    if nested_diff:
                        diff[key] = nested_diff
                else:
                    diff[key] = {JsonDiffEnum.MODIFIED: (obj1[key], obj2[key])}
        return diff

    @staticmethod
    def walk_dict(d, path, sentinel):
        for key, value in d.items():
            if key is not sentinel:
                if not isinstance(value, dict):
                    continue
                yield from JsonDiff.walk_dict(value, path + [key], sentinel)
            else:
                yield path, value

    def modified(self):
        """Generator that yields the path, old value, and new value of changed items"""
        for path, (old, new) in self.walk_dict(self.diff, [], JsonDiffEnum.MODIFIED):
            yield path, old, new

    def __repr__(self):
        return f"{self.__class__.__name__}(diff={json.dumps(self.diff)})"



---
File: nebari/src/_nebari/version.py
---

"""a backport for the nebari version references."""

from importlib.metadata import distribution

from packaging.version import Version

__version__ = distribution("nebari").version


def rounded_ver_parse(version: str) -> Version:
    """
    Strip any pre-, post-, or dev-release suffix from a version string,
    keeping only its base version (for example, "2024.1.1rc1" becomes "2024.1.1").

    Parameters
    ----------
    version : str
        A version string.

    Returns
    -------
    packaging.version.Version
        A version object.
    """
    base_version = Version(version).base_version
    return Version(base_version)



---
File: nebari/src/nebari/__init__.py
---




---
File: nebari/src/nebari/__main__.py
---

import typer

from _nebari.cli import create_cli


def main():
    cli = create_cli()
    cli()


# get the click object from the typer app so that we can autodoc the cli
# NOTE: this must happen _after_ all the subcommands have been added.
# Adapted from https://typer.tiangolo.com/tutorial/using-click/
typer_click_app = typer.main.get_command(create_cli())

if __name__ == "__main__":
    typer_click_app()



---
File: nebari/src/nebari/hookspecs.py
---

import contextlib
import pathlib
from typing import Any, Dict, List

import pydantic
import typer
from pluggy import HookimplMarker, HookspecMarker

from nebari import schema

hookspec = HookspecMarker("nebari")
hookimpl = HookimplMarker("nebari")


class NebariStage:
    name: str = None
    priority: int = None

    input_schema: pydantic.BaseModel = None
    output_schema: pydantic.BaseModel = None

    def __init__(self, output_directory: pathlib.Path, config: schema.Main):
        self.output_directory = output_directory
        self.config = config

    def render(self) -> Dict[str, str]:
        return {}

    @contextlib.contextmanager
    def deploy(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ):
        yield

    def check(
        self, stage_outputs: Dict[str, Dict[str, Any]], disable_prompt: bool = False
    ) -> bool:
        pass

    @contextlib.contextmanager
    def destroy(
        self, stage_outputs: Dict[str, Dict[str, Any]], status: Dict[str, bool]
    ):
        yield


@hookspec
def nebari_stage() -> List[NebariStage]:
    """Registers stages in nebari"""


@hookspec
def nebari_subcommand(cli: typer.Typer):
    """Register Typer subcommand in nebari"""



---
File: nebari/src/nebari/plugins.py
---

import itertools
import os
import re
import sys
import typing
from importlib import import_module
from importlib.util import module_from_spec, spec_from_file_location
from pathlib import Path

import pluggy

from nebari import hookspecs, schema

DEFAULT_SUBCOMMAND_PLUGINS = [
    # subcommands
    "_nebari.subcommands.info",
    "_nebari.subcommands.init",
    "_nebari.subcommands.dev",
    "_nebari.subcommands.deploy",
    "_nebari.subcommands.destroy",
    "_nebari.subcommands.keycloak",
    "_nebari.subcommands.plugin",
    "_nebari.subcommands.render",
    "_nebari.subcommands.support",
    "_nebari.subcommands.upgrade",
    "_nebari.subcommands.validate",
]

DEFAULT_STAGES_PLUGINS = [
    # stages
    "_nebari.stages.bootstrap",
    "_nebari.stages.terraform_state",
    "_nebari.stages.infrastructure",
    "_nebari.stages.kubernetes_initialize",
    "_nebari.stages.kubernetes_ingress",
    "_nebari.stages.kubernetes_keycloak",
    "_nebari.stages.kubernetes_keycloak_configuration",
    "_nebari.stages.kubernetes_services",
    "_nebari.stages.nebari_tf_extensions",
    "_nebari.stages.kubernetes_kuberhealthy",
    "_nebari.stages.kubernetes_kuberhealthy_healthchecks",
]


class NebariPluginManager:
    plugin_manager = pluggy.PluginManager("nebari")

    exclude_default_stages: bool = False
    exclude_stages: typing.List[str] = []

    def __init__(self) -> None:
        self.plugin_manager.add_hookspecs(hookspecs)

        if not hasattr(sys, "_called_from_test"):
            # Only load plugins if not running tests
            self.plugin_manager.load_setuptools_entrypoints("nebari")

        self.load_plugins(DEFAULT_SUBCOMMAND_PLUGINS)

    def load_plugins(self, plugins: typing.List[str]):
        def _import_module_from_filename(plugin: str):
            module_name = f"_nebari.stages._files.{plugin.replace(os.sep, '.')}"
            spec = spec_from_file_location(module_name, plugin)
            if spec is None:
                raise ImportError(f"Cannot find {plugin!r} plugin.")
            if spec.loader is None:
                raise ImportError(f"Cannot load {plugin!r} plugin.")
            module = module_from_spec(spec)
            sys.modules[module_name] = module
            spec.loader.exec_module(module)
            return module

        for plugin in plugins:
            if plugin.endswith(".py"):
                mod = _import_module_from_filename(plugin)
            else:
                mod = import_module(plugin)

            try:
                self.plugin_manager.register(mod, plugin)
            except ValueError:
                # Plugin already registered
                pass

    def get_available_stages(self):
        if not self.exclude_default_stages:
            self.load_plugins(DEFAULT_STAGES_PLUGINS)

        stages = itertools.chain.from_iterable(self.plugin_manager.hook.nebari_stage())

        # order stages by priority
        sorted_stages = sorted(stages, key=lambda s: s.priority)

        # filter out duplicate stages with same name (keep highest priority)
        visited_stage_names = set()
        filtered_stages = []
        for stage in reversed(sorted_stages):
            if stage.name in visited_stage_names:
                continue
            filtered_stages.insert(0, stage)
            visited_stage_names.add(stage.name)

        # filter out stages which match excluded stages
        included_stages = []
        for stage in filtered_stages:
            for exclude_stage in self.exclude_stages:
                if re.fullmatch(exclude_stage, stage.name) is not None:
                    break
            else:
                included_stages.append(stage)

        return included_stages

    def read_config(self, config_path: typing.Union[str, Path], **kwargs):
        if isinstance(config_path, str):
            config_path = Path(config_path)

        if not config_path.exists():
            raise FileNotFoundError(f"Config file {config_path} not found")

        from _nebari.config import read_configuration

        return read_configuration(config_path, self.config_schema, **kwargs)

    def get_external_plugins(self):
        external_plugins = []
        all_plugins = DEFAULT_SUBCOMMAND_PLUGINS + DEFAULT_STAGES_PLUGINS
        for plugin in self.plugin_manager.get_plugins():
            if plugin.__name__ not in all_plugins:
                external_plugins.append(plugin.__name__)
        return external_plugins

    @property
    def ordered_stages(self):
        return self.get_available_stages()

    @property
    def config_schema(self):
        classes = [schema.Main] + [
            _.input_schema for _ in self.ordered_stages if _.input_schema is not None
        ]
        return type("ConfigSchema", tuple(classes[::-1]), {})


nebari_plugin_manager = NebariPluginManager()
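

# A self-contained sketch of the selection rules applied by `get_available_stages`:
# sort by priority, keep only the highest-priority stage for each name, then drop
# names that fully match an exclusion regex. `_StageSketch` and
# `_select_stages_sketch` are hypothetical names used for illustration only; real
# stages are `NebariStage` subclasses collected through the `nebari_stage` hook.
import dataclasses
import re
import typing


@dataclasses.dataclass(frozen=True)
class _StageSketch:
    name: str
    priority: int


def _select_stages_sketch(
    stages: typing.Iterable[_StageSketch],
    exclude_patterns: typing.Sequence[str] = (),
) -> typing.List[_StageSketch]:
    # Walk from highest to lowest priority so the first stage seen per name wins.
    selected: typing.Dict[str, _StageSketch] = {}
    for stage in reversed(sorted(stages, key=lambda s: s.priority)):
        selected.setdefault(stage.name, stage)

    # Restore ascending priority order and apply the exclusion patterns.
    return [
        stage
        for stage in reversed(list(selected.values()))
        if not any(re.fullmatch(pattern, stage.name) for pattern in exclude_patterns)
    ]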



---
File: nebari/src/nebari/schema.py
---

import enum
from typing import Annotated

import pydantic
from pydantic import ConfigDict, Field, StringConstraints, field_validator
from ruamel.yaml import yaml_object

from _nebari.utils import escape_string, yaml
from _nebari.version import __version__, rounded_ver_parse

# Regex for suitable project names
project_name_regex = r"^[A-Za-z][A-Za-z0-9\-_]{1,14}[A-Za-z0-9]$"
project_name_pydantic = Annotated[str, StringConstraints(pattern=project_name_regex)]

# Regex for suitable namespaces
namespace_regex = r"^[A-Za-z][A-Za-z\-_]*[A-Za-z]$"
namespace_pydantic = Annotated[str, StringConstraints(pattern=namespace_regex)]

email_regex = "^[^ @]+@[^ @]+\\.[^ @]+$"
email_pydantic = Annotated[str, StringConstraints(pattern=email_regex)]

github_url_regex = r"^(https://)?github\.com/([^/]+)/([^/]+)/?$"
github_url_pydantic = Annotated[str, StringConstraints(pattern=github_url_regex)]
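

# A quick, self-contained sketch of what `project_name_regex` accepts, checked with
# the standard `re` module rather than through pydantic. The pattern is copied from
# above; `_is_valid_project_name_sketch` is a hypothetical helper, not Nebari API.
import re

_PROJECT_NAME_SKETCH = re.compile(r"^[A-Za-z][A-Za-z0-9\-_]{1,14}[A-Za-z0-9]$")


def _is_valid_project_name_sketch(name: str) -> bool:
    # 3 to 16 characters: a leading letter, then letters, digits, "-" or "_",
    # and a final letter or digit.
    return _PROJECT_NAME_SKETCH.fullmatch(name) is not None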


class Base(pydantic.BaseModel):
    model_config = ConfigDict(
        extra="forbid",
        validate_assignment=True,
        populate_by_name=True,
    )


@yaml_object(yaml)
class ProviderEnum(str, enum.Enum):
    local = "local"
    existing = "existing"
    aws = "aws"
    gcp = "gcp"
    azure = "azure"

    @classmethod
    def to_yaml(cls, representer, node):
        return representer.represent_str(node.value)


class ExtraFieldSchema(Base):
    model_config = ConfigDict(
        extra="allow",
        validate_assignment=True,
        populate_by_name=True,
    )
    immutable: bool = (
        False  # Whether field supports being changed after initial deployment
    )


class Main(Base):
    project_name: project_name_pydantic = Field(json_schema_extra={"immutable": True})
    namespace: namespace_pydantic = "dev"
    provider: ProviderEnum = Field(
        default=ProviderEnum.local,
        json_schema_extra={"immutable": True},
    )
    # In nebari_version only use major.minor.patch version - drop any pre/post/dev suffixes
    nebari_version: Annotated[str, Field(validate_default=True)] = __version__

    prevent_deploy: bool = (
        False  # Optional, but will be given default value if not present
    )

    # If the nebari_version in the schema is old
    # we must tell the user to first run nebari upgrade
    @field_validator("nebari_version")
    @classmethod
    def check_default(cls, value):
        assert cls.is_version_accepted(
            value
        ), f"nebari_version={value} is not an accepted version, it must be equivalent to {__version__}.\nInstall a different version of nebari or run nebari upgrade to ensure your config file is compatible."
        return value

    @classmethod
    def is_version_accepted(cls, v):
        return v != "" and rounded_ver_parse(v) == rounded_ver_parse(__version__)

    @property
    def escaped_project_name(self):
        """Escaped project name known to be compatible with all clouds"""
        project_name = self.project_name

        if self.provider == ProviderEnum.azure and "-" in project_name:
            project_name = escape_string(project_name, escape_char="a")

        if self.provider == ProviderEnum.aws and project_name.startswith("aws"):
            project_name = "a" + project_name

        return project_name


def is_version_accepted(v):
    """
    Given a version string, return boolean indicating whether
    nebari_version in the nebari-config.yaml would be acceptable
    for deployment with the current Nebari package.
    """
    return Main.is_version_accepted(v)



---
File: nebari/tests/common/__init__.py
---




---
File: nebari/tests/common/conda_store_utils.py
---

import re

import requests

from tests.tests_deployment import constants


def get_conda_store_session():
    """Log into conda-store using the test account and get session"""
    session = requests.Session()
    r = session.get(
        f"https://{constants.NEBARI_HOSTNAME}/conda-store/login/?next=", verify=False
    )
    auth_url = re.search('action="([^"]+)"', r.content.decode("utf8")).group(1)
    response = session.post(
        auth_url.replace("&amp;", "&"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        data={
            "username": constants.KEYCLOAK_USERNAME,
            "password": constants.KEYCLOAK_PASSWORD,
            "credentialId": "",
        },
        verify=False,
    )
    assert response.status_code == 200
    return session


def get_conda_store_user_permissions():
    """Log into conda-store with the test account, then use the session's auth token
    to query the conda-store API for that user's permissions.
    """
    session = get_conda_store_session()
    token = session.cookies.get("conda-store-auth")
    response = requests.get(
        f"https://{constants.NEBARI_HOSTNAME}/conda-store/api/v1/permission/",
        headers={"Authorization": f"Bearer {token}"},
        verify=False,
    )
    assert response.status_code == 200
    return response.json()



---
File: nebari/tests/common/config_mod_utils.py
---

import dataclasses
import typing

from _nebari.stages.infrastructure import AWSNodeGroup, GCPNodeGroup
from _nebari.stages.kubernetes_services import (
    AccessEnum,
    CondaEnvironment,
    JupyterLabProfile,
    KubeSpawner,
)

PREEMPTIBLE_NODE_GROUP_NAME = "preemptible-node-group"


@dataclasses.dataclass
class GPUConfig:
    cloud: str
    gpu_name: str
    node_selector: str
    node_selector_val: str
    extra_config: dict
    min_nodes: typing.Optional[int] = 0
    max_nodes: typing.Optional[int] = 2
    node_group_name: typing.Optional[str] = "gpu-node"
    docker_image: typing.Optional[str] = "quay.io/nebari/nebari-jupyterlab-gpu:2023.7.1"

    def node(self):
        return {
            "instance": self.gpu_name,
            "min_nodes": self.min_nodes,
            "max_nodes": self.max_nodes,
            **self.extra_config,
        }


AWS_GPU_CONF = GPUConfig(
    cloud="amazon_web_services",
    gpu_name="g4dn.xlarge",
    node_selector="beta.kubernetes.io/instance-type",
    node_selector_val="g4dn.xlarge",
    extra_config={
        "single_subnet": False,
        "gpu": True,
    },
)


GCP_GPU_CONF = GPUConfig(
    cloud="google_cloud_platform",
    gpu_name="n1-standard-16",
    node_selector="cloud.google.com/gke-nodepool",
    node_selector_val="g4dn.xlarge",
    extra_config={"guest_accelerators": [{"name": "nvidia-tesla-t4", "count": 1}]},
)


GPU_CONFIG = {
    "aws": AWS_GPU_CONF,
    "gcp": GCP_GPU_CONF,
}


def _create_gpu_environment():
    return CondaEnvironment(
        name="gpu",
        channels=["pytorch", "nvidia", "conda-forge"],
        dependencies=[
            "python=3.10.8",
            "ipykernel=6.21.0",
            "ipywidgets==7.7.1",
            "torchvision",
            "torchaudio",
            "cudatoolkit",
            "pytorch-cuda=11.7",
            "pytorch::pytorch",
        ],
    )


def add_gpu_config(config, cloud="aws"):
    gpu_config = GPU_CONFIG.get(cloud)
    if not gpu_config:
        raise ValueError(f"GPU not supported/tested on {cloud}")

    if cloud == "aws":
        gpu_node_group = AWSNodeGroup(
            instance=gpu_config.gpu_name,
            min_nodes=gpu_config.min_nodes,
            max_nodes=gpu_config.max_nodes,
            single_subnet=gpu_config.extra_config["single_subnet"],
            gpu=gpu_config.extra_config["gpu"],
        )
        kubespawner_overrides = KubeSpawner(
            image=gpu_config.docker_image,
            cpu_limit=4,
            cpu_guarantee=3,
            mem_limit="16G",
            mem_guarantee="10G",
            extra_resource_limits={"nvidia.com/gpu": 1},
            node_selector={
                gpu_config.node_selector: gpu_config.node_selector_val,
            },
        )
    else:
        gpu_node_group = None
        kubespawner_overrides = None

    jupyterlab_profile = JupyterLabProfile(
        display_name="GPU Instance",
        description="4 CPU / 16GB RAM / 1 NVIDIA T4 GPU (16 GB GPU RAM)",
        access=AccessEnum.all,
        groups=None,
        kubespawner_override=kubespawner_overrides,
    )

    cloud_section = getattr(config, gpu_config.cloud, None)
    cloud_section.node_groups[gpu_config.node_group_name] = gpu_node_group
    config.profiles.jupyterlab.append(jupyterlab_profile)
    config.environments["environment-gpu.yaml"] = _create_gpu_environment()

    return config


def add_preemptible_node_group(config, cloud="aws"):
    node_group = None
    if cloud == "aws":
        cloud_name = "amazon_web_services"
        # TODO: how to make preemptible?
        node_group = AWSNodeGroup(
            instance="m5.xlarge",
            min_nodes=1,
            max_nodes=5,
            single_subnet=False,
        )
    elif cloud == "gcp":
        cloud_name = "google_cloud_platform"
        node_group = GCPNodeGroup(
            instance="n1-standard-8",
            min_nodes=1,
            max_nodes=5,
            preemptible=True,
        )
    else:
        raise ValueError("Invalid cloud for preemptible config")

    cloud_section = getattr(config, cloud_name, None)
    if node_group:
        cloud_section.node_groups[PREEMPTIBLE_NODE_GROUP_NAME] = node_group

    return config



---
File: nebari/tests/common/handlers.py
---

import logging
import re
import time

from playwright.sync_api import expect

logger = logging.getLogger()


class JupyterLab:
    def __init__(self, navigator):
        logger.debug(">>> Starting notebook manager...")
        self.nav = navigator
        self.page = self.nav.page

    def reset_workspace(self):
        """Reset the JupyterLab workspace."""
        logger.debug(">>> Resetting JupyterLab workspace")

        # Check for and handle kernel popup
        logger.debug(">>> Checking for kernel popup")
        if self._check_for_kernel_popup():
            self._handle_kernel_popup()

        # Shutdown all kernels
        logger.debug(">>> Shutting down all kernels")
        self._shutdown_all_kernels()

        # Navigate back to root folder and close all tabs
        logger.debug(">>> Navigating to root folder and closing all tabs")
        self._navigate_to_root_folder()
        logger.debug(">>> Closing all tabs")
        self._close_all_tabs()

        # Ensure theme and launcher screen
        logger.debug(">>> Ensuring theme and launcher screen")
        self._assert_theme_and_launcher()

    def set_environment(self, kernel):
        """Set environment for a Jupyter notebook."""
        if not self._check_for_kernel_popup():
            self._trigger_kernel_change_popup()

        self._handle_kernel_popup(kernel)
        self._wait_for_kernel_label(kernel)

    def write_file(self, filepath, content):
        """Write a file to the Nebari instance filesystem."""
        logger.debug(f">>> Writing file to {filepath}")
        self._open_terminal()
        self._execute_terminal_commands(
            [f"cat <<EOF >{filepath}", content, "EOF", f"ls {filepath}"]
        )
        time.sleep(2)

    def _check_for_kernel_popup(self):
        """Check if the kernel popup is open."""
        logger.debug(">>> Checking for kernel popup")
        self.page.wait_for_load_state()
        time.sleep(3)
        visible = self.page.get_by_text("Select KernelStart a new").is_visible()
        logger.debug(f">>> Kernel popup visible: {visible}")
        return visible

    def _handle_kernel_popup(self, kernel=None):
        """Handle kernel popup by selecting the appropriate kernel or dismissing the popup."""
        if kernel:
            self._select_kernel(kernel)
        else:
            self._dismiss_kernel_popup()

    def _dismiss_kernel_popup(self):
        """Dismiss the kernel selection popup."""
        logger.debug(">>> Dismissing kernel popup")
        no_kernel_button = self.page.get_by_role("dialog").get_by_role(
            "button", name="No Kernel"
        )
        if no_kernel_button.is_visible():
            no_kernel_button.click()
        else:
            try:
                self.page.get_by_role("button", name="Cancel").click()
            except Exception as exc:
                raise ValueError("Unable to escape kernel selection dialog.") from exc

    def _shutdown_all_kernels(self):
        """Shutdown all running kernels."""
        logger.debug(">>> Shutting down all kernels")

        # Open the "Kernel" menu
        self.page.get_by_role("menuitem", name="Kernel").click()

        # Locate the "Shut Down All Kernels…" menu item
        shut_down_all = self.page.get_by_role("menuitem", name="Shut Down All Kernels…")

        # If it's not visible or is disabled, there's nothing to shut down
        if not shut_down_all.is_visible() or shut_down_all.is_disabled():
            logger.debug(">>> No kernels to shut down")
            return

        # Otherwise, click to shut down all kernels and confirm
        shut_down_all.click()
        self.page.get_by_role("button", name="Shut Down All").click()

    def _navigate_to_root_folder(self):
        """Navigate back to the root folder in JupyterLab."""
        # Make sure the home directory is selected in the sidebar
        if not self.page.get_by_role(
            "region", name="File Browser Section"
        ).is_visible():
            file_browser_tab = self.page.get_by_role("tab", name="File Browser")
            file_browser_tab.click()

        logger.debug(">>> Navigating to root folder")
        self.page.get_by_title(f"/home/{self.nav.username}", exact=True).locator(
            "path"
        ).click()

    def _close_all_tabs(self):
        """Close all open tabs in JupyterLab."""
        logger.debug(">>> Closing all tabs")
        self.page.get_by_text("File", exact=True).click()
        self.page.get_by_role("menuitem", name="Close All Tabs", exact=True).click()

        if self.page.get_by_text("Save your work", exact=True).is_visible():
            self.page.get_by_role(
                "button", name="Discard changes to file", exact=True
            ).click()

    def _assert_theme_and_launcher(self):
        """Ensure that the theme is set to JupyterLab Dark and Launcher screen is visible."""
        expect(
            self.page.get_by_text(
                "Set Preferred Dark Theme: JupyterLab Dark", exact=True
            )
        ).to_be_hidden()
        self.page.get_by_title("VS Code [↗]").wait_for(state="visible")

    def _open_terminal(self):
        """Open a new terminal in JupyterLab."""
        self.page.get_by_text("File", exact=True).click()
        self.page.get_by_text("New", exact=True).click()
        self.page.get_by_role("menuitem", name="Terminal").get_by_text(
            "Terminal"
        ).click()

    def _execute_terminal_commands(self, commands):
        """Execute a series of commands in the terminal."""
        for command in commands:
            self.page.get_by_role("textbox", name="Terminal input").fill(command)
            self.page.get_by_role("textbox", name="Terminal input").press("Enter")
            time.sleep(0.5)


class Notebook(JupyterLab):
    def __init__(self, navigator):
        logger.debug(">>> Starting notebook manager...")
        self.nav = navigator
        self.page = self.nav.page

    def _open_notebook(self, notebook_name):
        """Open a notebook in JupyterLab."""
        self.page.get_by_text("File", exact=True).click()
        self.page.locator("#jp-mainmenu-file").get_by_text("Open from Path…").click()

        expect(self.page.get_by_text("Open PathPathCancelOpen")).to_be_visible()

        # Fill notebook name into the textbox and click Open
        self.page.get_by_placeholder("/path/relative/to/jlab/root").fill(notebook_name)
        self.page.get_by_role("button", name="Open").click()
        if self.page.get_by_text("Could not find path:").is_visible():
            self.page.get_by_role("button", name="Dismiss").click()
            raise ValueError(f"Notebook {notebook_name} not found")

        # Make sure that this notebook is the one currently selected
        expect(self.page.get_by_role("tab", name=notebook_name)).to_be_visible()

    def _run_all_cells(self):
        """Run all cells in a Jupyter notebook."""
        self.page.get_by_role("menuitem", name="Run").click()
        run_all_cells = self.page.locator("#jp-mainmenu-run").get_by_text(
            "Run All Cells", exact=True
        )
        if run_all_cells.is_visible():
            run_all_cells.click()
        else:
            self.page.get_by_text("Restart the kernel and run").click()
            # Check if restart popup is visible
            restart_popup = self.page.get_by_text("Restart Kernel?")
            if restart_popup.is_visible():
                restart_popup.click()
                self.page.get_by_role("button", name="Confirm Kernel Restart").click()

    def _wait_for_commands_completion(
        self, timeout: float, completion_wait_time: float
    ):
        """
        Wait for commands to finish running

        Parameters
        ----------
        timeout: float
            Maximum time in seconds to wait for all running cells to finish.
        completion_wait_time: float
            Time in seconds to wait between checks for still-running cells.
        """
        elapsed_time = 0.0
        still_visible = True
        start_time = time.time()
        while elapsed_time < timeout:
            running = self.nav.page.get_by_text("[*]").all()
            still_visible = any(r.is_visible() for r in running)
            if not still_visible:
                break
            elapsed_time = time.time() - start_time
            time.sleep(completion_wait_time)
        if still_visible:
            raise ValueError(
                "Timed out waiting for commands to finish; "
                f"they did not complete within {timeout} sec"
            )

    def _get_outputs(self):
        output_elements = self.nav.page.query_selector_all(".jp-OutputArea-output")
        text_content = [element.text_content().strip() for element in output_elements]
        return text_content

    def run_notebook(self, notebook_name, kernel):
        """Run a notebook in JupyterLab."""
        # Open the notebook
        logger.debug(f">>> Opening notebook: {notebook_name}")
        self._open_notebook(notebook_name)

        # Set environment
        logger.debug(f">>> Setting environment for kernel: {kernel}")
        self.set_environment(kernel=kernel)

        # Run all cells
        logger.debug(">>> Running all cells")
        self._run_all_cells()

        # Wait for commands to finish running
        logger.debug(">>> Waiting for commands to finish running")
        self._wait_for_commands_completion(timeout=300, completion_wait_time=5)

        # Get the outputs
        logger.debug(">>> Gathering outputs")
        outputs = self._get_outputs()

        return outputs

    def _trigger_kernel_change_popup(self):
        """Trigger the kernel change popup. (expects a notebook to be open)"""
        self.page.get_by_role("menuitem", name="Kernel").click()
        kernel_menu = self.page.get_by_role("menuitem", name="Change Kernel…")
        if kernel_menu.is_visible():
            kernel_menu.click()
            self.page.get_by_text("Select KernelStart a new").wait_for(state="visible")
            logger.debug(">>> Kernel popup is visible")

    def _select_kernel(self, kernel):
        """Select a kernel from the popup."""
        logger.debug(f">>> Selecting kernel: {kernel}")

        self.page.get_by_role("dialog").get_by_label("", exact=True).fill(kernel)

        # List of potential selectors
        selectors = [
            self.page.get_by_role("cell", name=re.compile(kernel, re.IGNORECASE)).nth(
                1
            ),
            self.page.get_by_role("cell", name=re.compile(kernel, re.IGNORECASE)).first,
            self.page.get_by_text(kernel, exact=True).nth(1),
        ]

        # Try each selector until one is visible and clickable
        # this is done due to the different ways the kernel can be displayed
        # as part of the new extension
        for selector in selectors:
            if selector.is_visible():
                selector.click()
                logger.debug(f">>> Kernel {kernel} selected")
                return

        # If none of the selectors match, dismiss the popup and raise an error
        self._dismiss_kernel_popup()
        raise ValueError(f"Kernel {kernel} not found in the list of kernels")

    def _wait_for_kernel_label(self, kernel):
        """Wait for the kernel label to be visible."""
        kernel_label_loc = self.page.get_by_role("button", name=kernel)
        if not kernel_label_loc.is_visible():
            kernel_label_loc.wait_for(state="attached")
        logger.debug(f">>> Kernel label {kernel} is now visible")


class CondaStore(JupyterLab):
    def __init__(self, navigator):
        self.page = navigator.page
        self.nav = navigator

    def _open_conda_store_service(self):
        self.page.get_by_text("Services", exact=True).click()
        self.page.get_by_text("Environment Management").click()
        expect(self.page.get_by_role("tab", name="conda-store")).to_be_visible()
        time.sleep(2)

    def _open_new_environment_tab(self):
        self.page.get_by_label("Create a new environment in").click()
        expect(
            self.page.get_by_role("button", name="Create", exact=True)
        ).to_be_visible()

    def _assert_user_namespace(self):
        user_namespace_dropdown = self.page.get_by_role(
            "button", name=f"{self.nav.username} Create a new"
        )

        # Assert that the user namespace shows in the UI; expect() raises on
        # failure and returns None, so it cannot be used as a boolean condition
        expect(user_namespace_dropdown).to_be_visible()

        # Check that the namespace corresponds to the logged-in user
        if self.nav.username not in user_namespace_dropdown.text_content():
            raise ValueError(f"User namespace {self.nav.username} not found")

    def _get_shown_namespaces(self):
        _envs = self.page.locator("#environmentsScroll").get_by_role("button")
        _env_contents = [env.text_content() for env in _envs.all()]
        # Remove the "New" entry from each namespace "button" text
        return [
            namespace.replace(" New", "")
            for namespace in _env_contents
            if namespace != " New"
        ]

    def _assert_logged_in(self):
        login_button = self.page.get_by_role("button", name="Log in")
        if login_button.is_visible():
            login_button.click()
            # wait for page to reload
            self.page.wait_for_load_state()
            time.sleep(2)
            # A reload is required as conda-store "created" a new page once logged in
            self.page.reload()
            self.page.wait_for_load_state()
            self._open_conda_store_service()
        else:
            # In this case logout should already be visible
            expect(self.page.get_by_role("button", name="Logout")).to_be_visible()
        self._assert_user_namespace()

    def conda_store_ui(self):
        logger.debug(">>> Opening Conda Store UI")
        self._open_conda_store_service()

        logger.debug(">>> Assert user is logged in")
        self._assert_logged_in()

        logger.debug(">>> Opening new environment tab")
        self._open_new_environment_tab()



---
File: nebari/tests/common/kube_api.py
---

import socket
import typing

from kubernetes import config
from kubernetes.client.api import core_v1_api
from kubernetes.client.models import V1Pod
from kubernetes.stream import portforward


def kubernetes_port_forward(
    pod_labels: typing.Dict[str, str], port: int, namespace: str = "dev"
) -> V1Pod:
    """Given pod labels and port, finds the pod name and port forwards to
    the given port.
    :param pod_labels: dict of labels, by which to search the pod
    :param port: port number to forward
    :param namespace: kubernetes namespace name
    :return: kubernetes pod object
    """
    config.load_kube_config()
    core_v1 = core_v1_api.CoreV1Api()
    label_selector = ",".join([f"{k}={v}" for k, v in pod_labels.items()])
    pods = core_v1.list_namespaced_pod(
        namespace=namespace, label_selector=label_selector
    )
    assert pods.items
    pod = pods.items[0]
    pod_name = pod.metadata.name

    def kubernetes_create_connection(address, *args, **kwargs):
        pf = portforward(
            core_v1.connect_get_namespaced_pod_portforward,
            pod_name,
            namespace,
            ports=str(port),
        )
        return pf.socket(port)

    socket.create_connection = kubernetes_create_connection
    return pod



---
File: nebari/tests/common/navigator.py
---

import logging
import re
import urllib.parse
from abc import ABC
from pathlib import Path

from playwright.sync_api import expect, sync_playwright
from yarl import URL

logger = logging.getLogger()


class NavigatorMixin(ABC):
    """
    A mixin class providing common setup and teardown functionalities for Playwright navigators.
    """

    def __init__(
        self,
        headless=False,
        slow_mo=0,
        browser="chromium",
        video_dir=None,
        video_name_prefix=None,
    ):
        self.headless = headless
        self.slow_mo = slow_mo
        self.browser_name = browser
        self.video_dir = video_dir
        self.video_name_prefix = video_name_prefix
        self.initialized = False
        self.setup()

    def setup(self):
        """Setup Playwright browser and context."""
        logger.debug(">>> Setting up browser for Playwright")
        self.playwright = sync_playwright().start()
        try:
            self.browser = getattr(self.playwright, self.browser_name).launch(
                headless=self.headless, slow_mo=self.slow_mo
            )
        except AttributeError:
            raise RuntimeError(
                f"{self.browser_name} browser is not recognized."
            ) from None

        self.context = self.browser.new_context(
            ignore_https_errors=True,
            record_video_dir=self.video_dir,
        )
        self.page = self.context.new_page()
        self.initialized = True

    def _rename_test_video_path(self, video_path: Path):
        """Rename the test video file to the test unique identifier."""
        video_file_name = (
            f"{self.video_name_prefix}.mp4" if self.video_name_prefix else None
        )
        if video_file_name and video_path:
            Path.rename(video_path, Path(self.video_dir) / video_file_name)

    def teardown(self) -> None:
        """Teardown Playwright browser and context."""
        if self.initialized:
            # Rename the video file to the test unique identifier
            # (page.video is None when video recording is disabled)
            if self.page.video:
                current_video_path = Path(self.page.video.path())
                self._rename_test_video_path(current_video_path)

            self.context.close()
            self.browser.close()
            self.playwright.stop()
            logger.debug(">>> Teardown complete.")
            self.initialized = False

    def __enter__(self):
        """Enter the runtime context related to this object."""
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        """Exit the runtime context related to this object."""
        self.teardown()


class LoginNavigator(NavigatorMixin):
    """
    A navigator class to handle login operations for Nebari.
    """

    def __init__(self, nebari_url, username, password, auth="password", **kwargs):
        super().__init__(**kwargs)
        self._nebari_url = URL(nebari_url)
        self.username = username
        self.password = password
        self.auth = auth
        logger.debug(
            f"LoginNavigator initialized with {self.auth} auth method. :: {self.nebari_url}"
        )

    @property
    def nebari_url(self):
        return self._nebari_url.human_repr()

    def login(self):
        """Login to Nebari deployment using the provided authentication method."""
        login_methods = {
            "google": self._login_google,
            "password": self._login_password,
        }
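        # Look up the method first so that a KeyError raised inside the login
        # flow itself is not misreported as an invalid auth type
        try:
            login_method = login_methods[self.auth]
        except KeyError:
            raise ValueError(f"Auth type {self.auth} is invalid.") from None
        login_method()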

    def logout(self):
        """Logout from Nebari deployment."""
        self.page.get_by_role("button", name="Logout").click()
        self.page.wait_for_load_state()

    def _login_google(self):
        logger.debug(">>> Sign in via Google and start the server")
        self.page.goto(url=self.nebari_url)
        expect(self.page).to_have_url(re.compile(f"{re.escape(self.nebari_url)}.*"))

        self.page.get_by_role("button", name="Sign in with Keycloak").click()
        self.page.get_by_role("link", name="Google").click()
        self.page.get_by_role("textbox", name="Email or phone").fill(self.username)
        self.page.get_by_role("button", name="Next").click()
        self.page.get_by_role("textbox", name="Enter your password").fill(self.password)
        self.page.get_by_role("button", name="Next").click()
        self.page.wait_for_load_state("networkidle")

    def _login_password(self):
        logger.debug(">>> Sign in via Username/Password")
        self.page.goto(url=self.nebari_url)
        expect(self.page).to_have_url(re.compile(f"{re.escape(self.nebari_url)}.*"))

        self.page.get_by_role("button", name="Sign in with Keycloak").click()
        self.page.get_by_label("Username").fill(self.username)
        self.page.get_by_label("Password").fill(self.password)
        self.page.get_by_role("button", name="Sign In").click()
        self.page.wait_for_load_state()

        # Redirect to hub control panel
        self.page.goto(urllib.parse.urljoin(self.nebari_url, "hub/home"))
        expect(self.page.get_by_role("button", name="Logout")).to_be_visible()


class ServerManager(LoginNavigator):
    """
    Manages server operations such as starting and stopping a Nebari server.
    """

    def __init__(
        self, instance_name="small-instance", wait_for_server_spinup=300_000, **kwargs
    ):
        super().__init__(**kwargs)
        self.instance_name = instance_name
        self.wait_for_server_spinup = wait_for_server_spinup

    def start_server(self):
        """Start a Nebari server, handling different UI states."""
        self.login()

        logout_button = self.page.get_by_text("Logout", exact=True)
        logout_button.wait_for(state="attached", timeout=90000)

        start_locator = self.page.get_by_role("button", name="My Server", exact=True)
        if start_locator.is_visible():
            start_locator.click()
        else:
            start_locator = self.page.get_by_role("button", name="Start My Server")
            if start_locator.is_visible():
                start_locator.click()

        server_options = self.page.get_by_role("heading", name="Server Options")
        if server_options.is_visible():
            self.page.locator(f"#profile-item-{self.instance_name}").click()
            self.page.get_by_role("button", name="Start").click()

        self.page.wait_for_url(re.compile(f".*user/{self.username}/.*"), timeout=180000)
        file_locator = self.page.get_by_text("File", exact=True)
        file_locator.wait_for(state="attached", timeout=self.wait_for_server_spinup)

        logger.debug(">>> Profile Spawn complete.")

    def stop_server(self):
        """Stops the Nebari server via the Hub Control Panel."""
        self.page.get_by_text("File", exact=True).click()
        with self.context.expect_page() as page_info:
            self.page.get_by_role("menuitem", name="Home", exact=True).click()

        home_page = page_info.value
        home_page.wait_for_load_state()
        stop_button = home_page.get_by_role("button", name="Stop My Server")
        stop_button.wait_for(state="visible")
        stop_button.click()
        stop_button.wait_for(state="hidden")


# Factory method for creating different navigators if needed
def navigator_factory(navigator_type, **kwargs):
    navigators = {
        "login": LoginNavigator,
        "server": ServerManager,
    }
    return navigators[navigator_type](**kwargs)



---
File: nebari/tests/common/playwright_fixtures.py
---

import logging
import os
from pathlib import Path

import dotenv
import pytest

from tests.common.navigator import navigator_factory

logger = logging.getLogger()


def load_env_vars():
    """Load environment variables using dotenv and return necessary parameters."""
    dotenv.load_dotenv()
    return {
        "nebari_url": os.getenv("NEBARI_FULL_URL"),
        "username": os.getenv("KEYCLOAK_USERNAME"),
        "password": os.getenv("KEYCLOAK_PASSWORD"),
    }


def build_params(request, pytestconfig, extra_params=None):
    """Construct and return parameters for navigator instances."""
    env_vars = load_env_vars()

    # Retrieve values from request or environment
    nebari_url = request.param.get("nebari_url") or env_vars.get("nebari_url")
    username = request.param.get("keycloak_username") or env_vars.get("username")
    password = request.param.get("keycloak_password") or env_vars.get("password")

    # Validate that required fields are present
    if not nebari_url:
        raise ValueError(
            "Error: 'nebari_url' is required but was not provided in "
            "'request.param' or environment variables."
        )
    if not username:
        raise ValueError(
            "Error: 'username' is required but was not provided in "
            "'request.param' or environment variables."
        )
    if not password:
        raise ValueError(
            "Error: 'password' is required but was not provided in "
            "'request.param' or environment variables."
        )

    # Build the params dictionary once all required fields are validated
    params = {
        "nebari_url": nebari_url,
        "username": username,
        "password": password,
        "auth": "password",
        "video_dir": "videos/",
        # "--headed" is True when a visible browser is requested, so invert it
        "headless": not pytestconfig.getoption("--headed"),
        "slow_mo": pytestconfig.getoption("--slowmo"),
    }

    if extra_params:
        params.update(extra_params)

    return params


def create_navigator(navigator_type, params):
    """Create and return a navigator instance."""
    return navigator_factory(navigator_type, **params)


def pytest_sessionstart(session):
    """Called before the start of the session. Clean up the videos directory."""
    _videos_path = Path("./videos")
    if _videos_path.exists():
        for filepath in _videos_path.iterdir():
            filepath.unlink()


# scope="function" will make sure that the fixture is created and destroyed for each test function.
@pytest.fixture(scope="function")
def navigator_session(request, pytestconfig):
    session_type = request.param.get("session_type")
    extra_params = request.param.get("extra_params", {})

    # Get the test function name for video naming
    test_name = request.node.originalname
    video_name_prefix = f"video_{test_name}"
    extra_params["video_name_prefix"] = video_name_prefix

    params = build_params(request, pytestconfig, extra_params)

    with create_navigator(session_type, params) as nav:
        # Setup the navigator instance (e.g., login or start server)
        try:
            if session_type == "login":
                nav.login()
            elif session_type == "server":
                nav.start_server()
            yield nav
        except Exception as e:
            logger.debug(e)
            raise


def parameterized_fixture(session_type, **extra_params):
    """Utility function to create parameterized pytest fixtures."""
    return pytest.mark.parametrize(
        "navigator_session",
        [{"session_type": session_type, "extra_params": extra_params}],
        indirect=True,
    )


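def server_parameterized(instance_name=None, **kwargs):
    # Only forward instance_name when given, so that the navigator's own
    # default is not overridden with None
    if instance_name is not None:
        kwargs["instance_name"] = instance_name
    return parameterized_fixture("server", **kwargs)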


def login_parameterized(**kwargs):
    return parameterized_fixture("login", **kwargs)


@pytest.fixture(scope="function")
def navigator(navigator_session):
    """High-level navigator instance. Can be overridden based on the available
    parameterized decorator."""
    yield navigator_session


@pytest.fixture(scope="session")
def test_data_root():
    return Path(__file__).parent / "notebooks"



---
File: nebari/tests/tests_deployment/__init__.py
---




---
File: nebari/tests/tests_deployment/conftest.py
---

import pytest

from tests.tests_deployment.keycloak_utils import delete_client_keycloak_test_roles
from tests.tests_deployment.utils import (
    get_jupyterhub_token,
    get_refresh_jupyterhub_token,
)


@pytest.fixture()
def cleanup_keycloak_roles():
    # setup
    yield
    # teardown
    delete_client_keycloak_test_roles(client_name="jupyterhub")
    delete_client_keycloak_test_roles(client_name="conda_store")


@pytest.fixture(scope="session")
def jupyterhub_access_token():
    return get_jupyterhub_token(note="base-jupyterhub-token")


@pytest.fixture(scope="function")
def refresh_token_response(request, jupyterhub_access_token):
    note = request.param  # Get the parameter passed to the fixture
    yield get_refresh_jupyterhub_token(jupyterhub_access_token, note)


def parameterized_fixture(new_note):
    """Utility function to create parameterized pytest fixtures."""
    return pytest.mark.parametrize(
        "refresh_token_response",
        [new_note],
        indirect=True,
    )


def token_parameterized(note):
    return parameterized_fixture(note)


@pytest.fixture(scope="function")
def access_token_response(refresh_token_response):
    yield refresh_token_response



---
File: nebari/tests/tests_deployment/constants.py
---

import os

NEBARI_HOSTNAME = os.environ.get("NEBARI_HOSTNAME", "github-actions.nebari.dev")
NEBARI_CONFIG_PATH = os.environ.get("NEBARI_CONFIG_PATH", "nebari-config.yaml")
GATEWAY_ENDPOINT = "gateway"

KEYCLOAK_USERNAME = os.environ.get("KEYCLOAK_USERNAME", "nebari")
KEYCLOAK_PASSWORD = os.environ.get("KEYCLOAK_PASSWORD", "nebari")

PARAMIKO_SSH_ALLOW_AGENT = False
PARAMIKO_SSH_LOOK_FOR_KEYS = False



---
File: nebari/tests/tests_deployment/keycloak_utils.py
---

import pathlib

from _nebari.config import read_configuration
from _nebari.keycloak import get_keycloak_admin_from_config
from nebari.plugins import nebari_plugin_manager
from tests.tests_deployment import constants


def get_keycloak_client_details_by_name(client_name, keycloak_admin=None):
    if not keycloak_admin:
        keycloak_admin = get_keycloak_admin()
    clients = keycloak_admin.get_clients()
    for client in clients:
        if client["clientId"] == client_name:
            return client


def get_keycloak_user_details_by_name(username, keycloak_admin=None):
    if not keycloak_admin:
        keycloak_admin = get_keycloak_admin()
    users = keycloak_admin.get_users()
    for user in users:
        if user["username"] == username:
            return user


def get_keycloak_role_details_by_name(roles, role_name):
    for role in roles:
        if role["name"] == role_name:
            return role


def get_keycloak_admin():
    config_schema = nebari_plugin_manager.config_schema
    config_filepath = constants.NEBARI_CONFIG_PATH
    assert pathlib.Path(config_filepath).exists()
    config = read_configuration(config_filepath, config_schema)
    return get_keycloak_admin_from_config(config)


def create_keycloak_client_role(
    client_id: str, role_name: str, scopes: str, component: str
):
    keycloak_admin = get_keycloak_admin()
    keycloak_admin.create_client_role(
        client_id,
        payload={
            "name": role_name,
            "description": f"{role_name} description",
            "attributes": {"scopes": [scopes], "component": [component]},
        },
    )
    client_roles = keycloak_admin.get_client_roles(client_id=client_id)
    return get_keycloak_role_details_by_name(client_roles, role_name)


def assign_keycloak_client_role_to_user(username: str, client_name: str, role: dict):
    """Given a keycloak role and client name, assign that to the user"""
    keycloak_admin = get_keycloak_admin()
    user_details = get_keycloak_user_details_by_name(
        username=username, keycloak_admin=keycloak_admin
    )
    client_details = get_keycloak_client_details_by_name(
        client_name=client_name, keycloak_admin=keycloak_admin
    )
    keycloak_admin.assign_client_role(
        user_id=user_details["id"], client_id=client_details["id"], roles=[role]
    )


def create_keycloak_role(client_name: str, role_name: str, scopes: str, component: str):
    """Create a Keycloak role for the given client with scopes and
    component set in attributes
    """
    keycloak_admin = get_keycloak_admin()
    client_details = get_keycloak_client_details_by_name(
        client_name=client_name, keycloak_admin=keycloak_admin
    )
    return create_keycloak_client_role(
        client_details["id"], role_name=role_name, scopes=scopes, component=component
    )


def get_keycloak_client_role(client_name, role_name):
    keycloak_admin = get_keycloak_admin()
    client_details = get_keycloak_client_details_by_name(
        client_name=client_name, keycloak_admin=keycloak_admin
    )
    return keycloak_admin.get_client_role(
        client_id=client_details["id"], role_name=role_name
    )


def get_keycloak_client_roles(client_name):
    keycloak_admin = get_keycloak_admin()
    client_details = get_keycloak_client_details_by_name(
        client_name=client_name, keycloak_admin=keycloak_admin
    )
    return keycloak_admin.get_client_roles(client_id=client_details["id"])


def get_keycloak_role_groups(client_id, role_name):
    keycloak_admin = get_keycloak_admin()
    return keycloak_admin.get_client_role_groups(
        client_id=client_id, role_name=role_name
    )


def delete_client_keycloak_test_roles(client_name):
    keycloak_admin = get_keycloak_admin()
    client_details = get_keycloak_client_details_by_name(
        client_name=client_name, keycloak_admin=keycloak_admin
    )
    client_roles = keycloak_admin.get_client_roles(client_id=client_details["id"])
    for role in client_roles:
        if not role["name"].startswith("test"):
            continue
        keycloak_admin.delete_client_role(
            client_role_id=client_details["id"],
            role_name=role["name"],
        )



---
File: nebari/tests/tests_deployment/test_conda_store_roles_loaded.py
---

import pytest

from tests.common.conda_store_utils import get_conda_store_user_permissions
from tests.tests_deployment import constants
from tests.tests_deployment.keycloak_utils import (
    assign_keycloak_client_role_to_user,
    create_keycloak_role,
)


@pytest.mark.parametrize(
    "scopes,changed_scopes",
    (
        [
            "admin!namespace=analyst,developer!namespace=nebari-git",
            {"nebari-git/*": ["developer"], "analyst/*": ["admin"]},
        ],
        [
            "admin!namespace=analyst,developer!namespace=invalid-namespace",
            {"analyst/*": ["admin"]},
        ],
        [
            # duplicate namespace role, choose highest permissions
            "admin!namespace=analyst,developer!namespace=analyst",
            {"analyst/*": ["admin"]},
        ],
        ["invalid-role!namespace=analyst", {}],
    ),
)
@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
@pytest.mark.filterwarnings(
    "ignore:.*auto_refresh_token is deprecated:DeprecationWarning"
)
@pytest.mark.filterwarnings("ignore::ResourceWarning")
def test_conda_store_roles_loaded_from_keycloak(
    scopes: str, changed_scopes: dict, cleanup_keycloak_roles
):

    # Verify permissions/roles are different from what we're about to set
    # So that this test is actually testing the change
    permissions = get_conda_store_user_permissions()
    entity_roles = permissions["data"]["entity_roles"]
    for namespace, role in changed_scopes.items():
        assert entity_roles[namespace] != role

    role = create_keycloak_role(
        client_name="conda_store",
        # Note: we're clearing this role after every test case, and we're clearing
        # it by name, so it must start with test- to be deleted afterwards
        role_name="test-custom-role",
        scopes=scopes,
        component="conda-store",
    )
    assert role
    # assign created role to the user
    assign_keycloak_client_role_to_user(
        constants.KEYCLOAK_USERNAME, client_name="conda_store", role=role
    )
    permissions = get_conda_store_user_permissions()
    updated_entity_roles = permissions["data"]["entity_roles"]

    # Verify permissions/roles are set to expectation
    assert updated_entity_roles == {
        **entity_roles,
        **changed_scopes,
    }



---
File: nebari/tests/tests_deployment/test_dask_gateway.py
---

import os

import dask_gateway
import pytest

from tests.tests_deployment import constants
from tests.tests_deployment.utils import get_jupyterhub_token


@pytest.fixture
def dask_gateway_object():
    """Connects to Dask Gateway cluster from outside the cluster."""
    os.environ["JUPYTERHUB_API_TOKEN"] = get_jupyterhub_token(
        "dask-gateway-pytest-token"
    )

    # Create custom class from Gateway that disables the tls/ssl verification
    # to do that we will override the self._request_kwargs dictionary within the
    # __init__, targeting aiohttp.ClientSession.request method

    class DaskGateway(dask_gateway.Gateway):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self._request_kwargs.update({"ssl": False})

    return DaskGateway(
        address=f"https://{constants.NEBARI_HOSTNAME}/{constants.GATEWAY_ENDPOINT}",
        auth="jupyterhub",
        proxy_address=f"tcp://{constants.NEBARI_HOSTNAME}:8786",
    )


@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
@pytest.mark.filterwarnings("ignore::ResourceWarning")
def test_dask_gateway(dask_gateway_object):
    """This test checks if we're able to connect to dask gateway."""
    assert dask_gateway_object.list_clusters() == []


@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
@pytest.mark.filterwarnings("ignore::ResourceWarning")
def test_dask_gateway_cluster_options(dask_gateway_object):
    """Tests Dask Gateway's cluster options."""
    cluster_options = dask_gateway_object.cluster_options()
    # # dask conda environment is not built in time to be available
    # assert cluster_options.conda_environment == "dask"
    assert cluster_options.profile in {"Small Worker", "Medium Worker"}
    assert cluster_options.environment_vars == {}



---
File: nebari/tests/tests_deployment/test_grafana_api.py
---

import base64

import pytest
import requests

from tests.tests_deployment import constants


@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
def test_grafana_api_not_accessible_with_default_credentials():
    """Making sure that Grafana's API is not accessible on default user/pass"""
    user_pass_b64_encoded = base64.b64encode(b"admin:prom-operator").decode()
    response = requests.get(
        f"https://{constants.NEBARI_HOSTNAME}/monitoring/api/datasources",
        headers={"Authorization": f"Basic {user_pass_b64_encoded}"},
        verify=False,
    )
    assert response.status_code == 401



---
File: nebari/tests/tests_deployment/test_jupyterhub_api.py
---

import pytest
import requests

from tests.tests_deployment import constants
from tests.tests_deployment.conftest import token_parameterized
from tests.tests_deployment.keycloak_utils import (
    assign_keycloak_client_role_to_user,
    create_keycloak_role,
    get_keycloak_client_details_by_name,
    get_keycloak_client_role,
    get_keycloak_client_roles,
    get_keycloak_role_groups,
)
from tests.tests_deployment.utils import get_refresh_jupyterhub_token


@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
def test_jupyterhub_loads_roles_from_keycloak(jupyterhub_access_token):
    response = requests.get(
        url=f"https://{constants.NEBARI_HOSTNAME}/hub/api/users/{constants.KEYCLOAK_USERNAME}",
        headers={"Authorization": f"Bearer {jupyterhub_access_token}"},
        verify=False,
    )
    user = response.json()
    assert set(user["roles"]) == {
        "user",
        "manage-account",
        "jupyterhub_developer",
        "argo-developer",
        "dask_gateway_developer",
        "grafana_viewer",
        "conda_store_developer",
        "argo-viewer",
        "grafana_developer",
        "manage-account-links",
        "view-profile",
        # default roles
        "allow-read-access-to-services-role",
        "allow-group-directory-creation-role",
    }


@token_parameterized(note="get-default-scopes")
@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
def test_default_user_role_scopes(access_token_response):
    token_scopes = set(access_token_response.json()["scopes"])
    assert "read:services" in token_scopes


@pytest.mark.filterwarnings(
    "ignore:.*auto_refresh_token is deprecated:DeprecationWarning"
)
@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
def test_check_default_roles_added_in_keycloak():
    client_roles = get_keycloak_client_roles(client_name="jupyterhub")
    role_names = [role["name"] for role in client_roles]
    assert "allow-app-sharing-role" in role_names
    assert "allow-read-access-to-services-role" in role_names
    assert "allow-group-directory-creation-role" in role_names


@pytest.mark.filterwarnings(
    "ignore:.*auto_refresh_token is deprecated:DeprecationWarning"
)
@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
def test_check_directory_creation_scope_attributes():
    client_role = get_keycloak_client_role(
        client_name="jupyterhub", role_name="allow-group-directory-creation-role"
    )
    assert client_role["attributes"]["component"][0] == "shared-directory"
    assert client_role["attributes"]["scopes"][0] == "write:shared-mount"


@pytest.mark.filterwarnings(
    "ignore:.*auto_refresh_token is deprecated:DeprecationWarning"
)
@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
def test_groups_with_mount_permissions():
    client_role = get_keycloak_client_role(
        client_name="jupyterhub", role_name="allow-group-directory-creation-role"
    )
    client_details = get_keycloak_client_details_by_name(client_name="jupyterhub")
    role_groups = get_keycloak_role_groups(
        client_id=client_details["id"], role_name=client_role["name"]
    )
    assert set([group["path"] for group in role_groups]) == set(
        [
            "/developer",
            "/admin",
            "/analyst",
        ]
    )


@token_parameterized(note="before-role-creation-and-assignment")
@pytest.mark.parametrize(
    "component,scopes,expected_scopes_difference",
    (
        [
            "jupyterhub",
            "read:users:shares,read:groups:shares,users:shares",
            {"read:groups:shares", "users:shares", "read:users:shares"},
        ],
        ["invalid-component", "read:users:shares,read:groups:shares,users:shares", {}],
        ["invalid-component", "admin:invalid-scope", {}],
    ),
)
@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
@pytest.mark.filterwarnings(
    "ignore:.*auto_refresh_token is deprecated:DeprecationWarning"
)
def test_keycloak_roles_attributes_parsed_as_jhub_scopes(
    component,
    scopes,
    expected_scopes_difference,
    cleanup_keycloak_roles,
    access_token_response,
):
    # check token scopes before role creation and assignment
    token_scopes_before = set(access_token_response.json()["scopes"])
    # create keycloak role with jupyterhub scopes in attributes
    role = create_keycloak_role(
        client_name="jupyterhub",
        # Note: we're clearing this role after every test case, and we're clearing
        # it by name, so it must start with test- to be deleted afterward
        role_name="test-custom-role",
        scopes=scopes,
        component=component,
    )
    assert role
    # assign created role to the user
    assign_keycloak_client_role_to_user(
        constants.KEYCLOAK_USERNAME, client_name="jupyterhub", role=role
    )
    token_response_after = get_refresh_jupyterhub_token(
        old_token=access_token_response.json()["token"],
        note="after-role-creation-and-assignment",
    )
    token_scopes_after = set(token_response_after.json()["scopes"])
    # verify new scopes added/removed
    scopes_difference = token_scopes_after - token_scopes_before
    # Compare the user's token scopes before and after role assignment;
    # set() normalizes the `{}` placeholders used for the "no new scopes" cases
    assert scopes_difference == set(expected_scopes_difference)


@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
def test_jupyterhub_loads_groups_from_keycloak(jupyterhub_access_token):
    response = requests.get(
        f"https://{constants.NEBARI_HOSTNAME}/hub/api/users/{constants.KEYCLOAK_USERNAME}",
        headers={"Authorization": f"Bearer {jupyterhub_access_token}"},
        verify=False,
    )
    user = response.json()
    assert set(user["groups"]) == {"/analyst", "/developer", "/users"}



---
File: nebari/tests/tests_deployment/test_jupyterhub_ssh.py
---

import re
import string
import time
import uuid

import paramiko
import pytest

from _nebari.utils import escape_string
from tests.tests_deployment import constants
from tests.tests_deployment.utils import monkeypatch_ssl_context

monkeypatch_ssl_context()

TIMEOUT_SECS = 300


@pytest.fixture(scope="session")
def paramiko_object(jupyterhub_access_token):
    """Builds an SSH client and the connection parameters for the JupyterHub
    SSH endpoint, reached from outside the cluster.

    The connection itself is opened later by `invoke_shell`; the client is
    closed once the test session finishes.
    """
    params = {
        "hostname": constants.NEBARI_HOSTNAME,
        "port": 8022,
        "username": constants.KEYCLOAK_USERNAME,
        "password": jupyterhub_access_token,
        "allow_agent": constants.PARAMIKO_SSH_ALLOW_AGENT,
        "look_for_keys": constants.PARAMIKO_SSH_LOOK_FOR_KEYS,
    }

    ssh_client = paramiko.SSHClient()
    ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())

    yield ssh_client, params

    ssh_client.close()


def invoke_shell(
    client: paramiko.SSHClient, params: dict[str, object]
) -> paramiko.Channel:
    client.connect(**params)
    return client.invoke_shell()


def extract_output(delimiter: str, output: str) -> str:
    # Extract the command output between the start and end delimiters
    match = re.search(rf"{delimiter}start\n(.*)\n{delimiter}end", output, re.DOTALL)
    if match:
        print(match.group(1).strip())
        return match.group(1).strip()
    else:
        return output.strip()


def run_command_list(
    commands: list[str], channel: paramiko.Channel, wait_time: int = 0
) -> dict[str, str]:
    command_delimiters = {}
    for command in commands:
        delimiter = uuid.uuid4().hex
        command_delimiters[command] = delimiter
        b = channel.send(f"echo {delimiter}start; {command}; echo {delimiter}end\n")
        if b == 0:
            print(f"Command '{command}' failed to send")
    # Wait for the output to be ready before reading
    time.sleep(wait_time)
    while not channel.recv_ready():
        time.sleep(1)
        print("Waiting for output")
    output = ""
    while channel.recv_ready():
        output += channel.recv(65535).decode("utf-8")
    outputs = {}
    for command, delimiter in command_delimiters.items():
        command_output = extract_output(delimiter, output)
        outputs[command] = command_output
    return outputs


@pytest.mark.timeout(TIMEOUT_SECS)
@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
@pytest.mark.filterwarnings("ignore::ResourceWarning")
def test_print_jupyterhub_ssh(paramiko_object):
    client, params = paramiko_object
    channel = invoke_shell(client, params)
    # Commands to run and just print the output
    commands_print = [
        "id",
        "env",
        "conda info",
        "df -h",
        "ls -la",
        "umask",
    ]
    outputs = run_command_list(commands_print, channel)
    for command, output in outputs.items():
        print(f"COMMAND: {command}")
        print(f"OUTPUT: {output}")
    channel.close()


@pytest.mark.timeout(TIMEOUT_SECS)
@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
@pytest.mark.filterwarnings("ignore::ResourceWarning")
def test_exact_jupyterhub_ssh(paramiko_object):
    client, params = paramiko_object
    channel = invoke_shell(client, params)
    # Commands to run and exactly match output
    commands_exact = {
        "id -u": "1000",
        "id -g": "100",
        "whoami": constants.KEYCLOAK_USERNAME,
        "pwd": f"/home/{constants.KEYCLOAK_USERNAME}",
        "echo $HOME": f"/home/{constants.KEYCLOAK_USERNAME}",
        "conda activate default && echo $CONDA_PREFIX": "/opt/conda/envs/default",
        "hostname": f"jupyter-{escape_string(constants.KEYCLOAK_USERNAME, safe=set(string.ascii_lowercase + string.digits), escape_char='-').lower()}",
    }
    outputs = run_command_list(list(commands_exact.keys()), channel)
    # Compare each command's actual output against its expected value
    for command, expected_output in commands_exact.items():
        assert (
            outputs[command] == expected_output
        ), f"Command '{command}' output '{outputs[command]}' does not match expected '{expected_output}'"

    channel.close()


@pytest.mark.timeout(TIMEOUT_SECS)
@pytest.mark.filterwarnings("ignore::urllib3.exceptions.InsecureRequestWarning")
@pytest.mark.filterwarnings("ignore::ResourceWarning")
def test_contains_jupyterhub_ssh(paramiko_object):
    client, params = paramiko_object
    channel = invoke_shell(client, params)

    # Commands to run and check if the output contains specific strings
    commands_contain = {
        "ls -la": ".bashrc",
        "cat ~/.bashrc": "Managed by Nebari",
        "cat ~/.profile": "Managed by Nebari",
        "cat ~/.bash_logout": "Managed by Nebari",
        # Ensure we don't copy over extra files from /etc/skel in init container
        "ls -la ~/..202*": "No such file or directory",
        "ls -la ~/..data": "No such file or directory",
    }

    outputs = run_command_list(commands_contain.keys(), channel, 30)
    for command, expected_output in commands_contain.items():
        assert (
            expected_output in outputs[command]
        ), f"Command '{command}' output does not contain expected substring '{expected_output}'. Instead got '{outputs[command]}'"

    channel.close()



---
File: nebari/tests/tests_deployment/test_loki_deployment.py
---

import json
import urllib.parse
import urllib.request as urllib_request

import pytest
from kubernetes.client import V1Pod

from tests.common.kube_api import kubernetes_port_forward

LOKI_BACKEND_PORT = 3100
LOKI_BACKEND_POD_LABELS = {
    "app.kubernetes.io/instance": "nebari-loki",
    "app.kubernetes.io/component": "backend",
}

MINIO_PORT = 9000
MINIO_POD_LABELS = {
    "app.kubernetes.io/instance": "nebari-loki-minio",
    "app.kubernetes.io/name": "minio",
}

LOKI_GATEWAY_PORT = 8080
LOKI_GATEWAY_POD_LABELS = {
    "app.kubernetes.io/instance": "nebari-loki",
    "app.kubernetes.io/component": "gateway",
}


@pytest.fixture(scope="module")
def port_forward_fixture(request):
    """Pytest fixture to port forward the pod selected by the requested labels
    (Loki backend, Loki gateway, or MinIO) to make it accessible on localhost
    so that we can run some tests on it.
    """
    return kubernetes_port_forward(
        pod_labels=request.param["labels"], port=request.param["port"]
    )


def port_forward(labels, port):
    params = {"labels": labels, "port": port}
    return pytest.mark.parametrize("port_forward_fixture", [params], indirect=True)


@pytest.mark.parametrize(
    "endpoint_path",
    (
        "metrics",
        "services",
        "config",
        "ready",
        "log_level",
    ),
)
@port_forward(labels=LOKI_BACKEND_POD_LABELS, port=LOKI_BACKEND_PORT)
def test_loki_endpoint(endpoint_path: str, port_forward_fixture: V1Pod):
    """This will hit some endpoints in the loki API and verify that we
    get a 200 status code, to make sure Loki is working properly.
    :param endpoint_path: a loki api endpoint path
    :param port_forward_fixture: pytest fixture to port forward.
    :return:
    """
    pod_name = port_forward_fixture.metadata.name
    url = f"http://{pod_name}.pod.dev.kubernetes:{LOKI_BACKEND_PORT}/{endpoint_path}"
    response = urllib_request.urlopen(url)
    response.read().decode("utf-8")
    assert response.code == 200
    response.close()


@port_forward(labels=MINIO_POD_LABELS, port=MINIO_PORT)
def test_minio_accessible(port_forward_fixture: V1Pod):
    """This will hit the liveness endpoint of the MinIO API and verify that we
    get a 200 status code, to make sure minio is up and running.
    :param port_forward_fixture: pytest fixture to port forward.
    :return:
    """
    pod_name = port_forward_fixture.metadata.name
    url = f"http://{pod_name}.pod.dev.kubernetes:{MINIO_PORT}/minio/health/live"
    response = urllib_request.urlopen(url)
    response.read().decode("utf-8")
    assert response.code == 200
    response.close()


@port_forward(labels=LOKI_GATEWAY_POD_LABELS, port=LOKI_GATEWAY_PORT)
def test_loki_gateway(port_forward_fixture: V1Pod):
    """This will hit an endpoint of loki gateway API and verify that we
    get a 200 status code, to make sure the Loki gateway is up and running.
    :param port_forward_fixture: pytest fixture to port forward.
    :return:
    """
    pod_name = port_forward_fixture.metadata.name
    url = f"http://{pod_name}.pod.dev.kubernetes:{LOKI_GATEWAY_PORT}/loki/api/v1/labels"
    response = urllib_request.urlopen(url)
    response_content = response.read().decode("utf-8")
    response_json = json.loads(response_content)
    assert response.code == 200
    assert response_json["status"] == "success"
    response.close()


@port_forward(labels=LOKI_GATEWAY_POD_LABELS, port=LOKI_GATEWAY_PORT)
def test_loki_gateway_fetch_logs(port_forward_fixture: V1Pod):
    """This will hit an endpoint of loki gateway API to fetch some logs
    and verify logs received.
    :param port_forward_fixture: pytest fixture to port forward.
    :return: None
    """
    pod_name = port_forward_fixture.metadata.name
    query_params = {
        "limit": "5",
        # Fetch logs for jupyterhub app
        "query": '{app="jupyterhub"}',
    }

    encoded_params = urllib.parse.urlencode(query_params)
    # No leading slash here: the URL below already adds the separator
    path = f"loki/api/v1/query_range?{encoded_params}"
    url = f"http://{pod_name}.pod.dev.kubernetes:{LOKI_GATEWAY_PORT}/{path}"
    response = urllib_request.urlopen(url)
    response_content = response.read().decode("utf-8")
    response_json = json.loads(response_content)
    assert response.code == 200
    assert response_json["status"] == "success"
    # Make sure log lines received
    assert len(response_json["data"]["result"][0]["values"]) > 0
    response.close()



---
File: nebari/tests/tests_deployment/utils.py
---

import re
import ssl

import requests
import requests.cookies

from tests.tests_deployment import constants


def get_jupyterhub_session():
    session = requests.Session()
    session.cookies.clear()

    try:
        response = session.get(
            f"https://{constants.NEBARI_HOSTNAME}/hub/oauth_login", verify=False
        )
        response.raise_for_status()

        # Extract the authentication URL from the response
        auth_url_match = re.search('action="([^"]+)"', response.content.decode("utf8"))

        if not auth_url_match:
            raise ValueError("Authentication URL not found in response.")

        auth_url = auth_url_match.group(1).replace("&amp;", "&")

        auth_data = {
            "username": constants.KEYCLOAK_USERNAME,
            "password": constants.KEYCLOAK_PASSWORD,
            "credentialId": "",
        }
        response = session.post(
            auth_url,
            headers={"Content-Type": "application/x-www-form-urlencoded"},
            data=auth_data,
            verify=False,
        )
        response.raise_for_status()

    except requests.RequestException as e:
        raise ValueError(f"An error occurred during authentication: {e}")

    return session


def create_jupyterhub_token(note):
    session = get_jupyterhub_session()

    try:
        # Retrieve the XSRF token from session cookies
        xsrf_token = session.cookies.get("_xsrf")
    except requests.cookies.CookieConflictError:
        xsrf_token = session.cookies.get("_xsrf", path="/hub/")

    if not xsrf_token:
        raise ValueError("XSRF token not found in session cookies.")

    headers = {
        "Referer": f"https://{constants.NEBARI_HOSTNAME}/hub/token",
        "X-XSRFToken": xsrf_token,
    }

    url = f"https://{constants.NEBARI_HOSTNAME}/hub/api/users/{constants.KEYCLOAK_USERNAME}/tokens"
    payload = {"note": note, "expires_in": None}

    try:
        response = session.post(url, headers=headers, json=payload, verify=False)
        if response.status_code == 403:
            # Retry with refreshed XSRF token if initial attempt is forbidden
            xsrf_token = response.cookies.get("_xsrf")
            headers["X-XSRFToken"] = xsrf_token
            response = session.post(url, headers=headers, json=payload, verify=False)
        response.raise_for_status()
    except requests.RequestException as e:
        raise ValueError(f"Failed to create JupyterHub token: {e}")

    return response


def get_refresh_jupyterhub_token(old_token, note):
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {old_token}",
    }

    data = {"note": note, "expires_in": None}

    try:
        response = requests.post(
            f"https://{constants.NEBARI_HOSTNAME}/hub/api/users/{constants.KEYCLOAK_USERNAME}/tokens",
            headers=headers,
            json=data,
            verify=False,
        )
        response.raise_for_status()  # Ensure the request was successful

    except requests.exceptions.RequestException as e:
        raise ValueError(f"An error occurred while creating the token: {e}")

    return response


def get_jupyterhub_token(note="jupyterhub-tests-deployment"):
    response = create_jupyterhub_token(note=note)
    try:
        token = response.json()["token"]
    except (KeyError, ValueError) as e:
        print(f"An error occurred while retrieving the token: {e}")
        raise

    return token


def monkeypatch_ssl_context():
    """
    This is a workaround monkeypatch to disable ssl checking to avoid SSL
    failures.
    TODO: A better way to do this would be adding the Traefik's default certificate's
    CA public key to the trusted certificate authorities.
    """

    def create_default_context(context):
        def _inner(*args, **kwargs):
            context.check_hostname = False
            context.verify_mode = ssl.CERT_NONE
            return context

        return _inner

    sslcontext = ssl.create_default_context()
    ssl.create_default_context = create_default_context(sslcontext)



---
File: nebari/tests/tests_e2e/playwright/.env.tpl
---

KEYCLOAK_USERNAME="USERNAME_OR_GOOGLE_EMAIL"
KEYCLOAK_PASSWORD="PASSWORD"
NEBARI_FULL_URL="https://localhost/"



---
File: nebari/tests/tests_e2e/playwright/README.md
---


# Nebari Integration Testing with Playwright

## How Does It Work?

Playwright manages interactions with websites, and we use it to interact with a deployed Nebari instance and test various integrations.

We use Playwright's synchronous API for our test suite. The first task is to launch the web browser of your choice: `chromium`, `webkit`, or `firefox`. Playwright uses browser contexts for test isolation, which can be created by default or manually for scenarios like admin vs. user testing. Each test starts with a blank page, and we navigate to a given URL during the test. This setup is managed by the `setup` method in the `Navigator` class.

## Directory Structure

The project directory structure is as follows:

```
tests
├── common
│   ├── __init__.py
│   ├── navigator.py
│   ├── handlers.py
│   ├── playwright_fixtures.py
├── ...
├── tests_e2e
│   └── playwright
│       ├── README.md
│       └── test_playwright.py
```

- `test_data/`: Contains test files, such as sample notebooks.
- `test_playwright.py`: The main test script that uses Playwright for integration testing.
- `navigator.py`: Contains the `NavigatorMixin` class, which manages browser
  interactions and context; the `LoginNavigator` class, which manages user
  authentication; and the `ServerManager` class, which manages spawning the user instance.
- `handlers.py`: Contains classes for handling the different levels of access to
  services a user might encounter, such as Notebook, conda-store, and others.


## Setup

1. **Use the provided Makefile to install dependencies**

   Navigate to the Playwright tests directory and run the `setup` target:

   ```bash
   cd tests_e2e/playwright
   make setup
   ```

   This command will:

   - Install the pinned dependencies from `requirements.txt`.
   - Install Playwright and its required browser dependencies.
   - Create a new `.env` file from `.env.tpl`.

2. **Fill in the `.env` file**

   Open the newly created `.env` file and fill in the following values:

   - `KEYCLOAK_USERNAME`: Nebari username for username/password login (or Google email for Google sign-in).
   - `KEYCLOAK_PASSWORD`: Password associated with the above username.
   - `NEBARI_FULL_URL`: Full URL (including `https://`) to the Nebari instance (e.g., `https://nebari.quansight.dev/`).

   If you need to create a user for testing, you can do so with:

   ```bash
   nebari keycloak adduser --user <username> <password> --config <NEBARI_CONFIG_PATH>
   ```

*Note:* If you see the warning:
```
BEWARE: your OS is not officially supported by Playwright; downloading fallback build
```
it is not critical. Playwright should still work despite the warning.
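
Before running the suite, it can help to confirm that the `.env` file no longer contains the placeholder values from `.env.tpl`. The following is a minimal sketch using only the standard library; the helper names `parse_env_text` and `find_env_problems` are hypothetical and are not part of the Nebari test suite, and the parser only handles simple `KEY="value"` lines like the ones in the template.

```python
from urllib.parse import urlparse

REQUIRED_KEYS = ("KEYCLOAK_USERNAME", "KEYCLOAK_PASSWORD", "NEBARI_FULL_URL")
# Placeholder values shipped in .env.tpl
PLACEHOLDERS = {"USERNAME_OR_GOOGLE_EMAIL", "PASSWORD"}


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY="value" lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_env_problems(values: dict[str, str]) -> list[str]:
    """Return a human-readable list of problems; an empty list means OK."""
    problems = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value or value in PLACEHOLDERS:
            problems.append(f"{key} is missing or still a placeholder")
    url = values.get("NEBARI_FULL_URL", "")
    if url and urlparse(url).scheme != "https":
        problems.append("NEBARI_FULL_URL should include https://")
    return problems
```

For example, `find_env_problems(parse_env_text(open(".env").read()))` returns an empty list once all three values are filled in.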

## Running the Playwright Tests

You can run the Playwright tests with `pytest`.
```bash
pytest tests_e2e/playwright/test_playwright.py --numprocesses auto
```

> **Important**: Because of how pytest manages async code, Playwright's sync calls can conflict with pytest's default concurrency settings. Using `--numprocesses auto` (provided by `pytest-xdist`) helps mitigate potential thread-blocking issues.


Videos of the test playback will be available in `$PWD/videos/`. To watch the browser
while the test runs, pass the `--headed` option to `pytest`. You
can also add the `--slowmo=$MILLI_SECONDS` option to introduce a delay before each
action by Playwright, thereby slowing down the process.

Alternatively, you can run Playwright methods outside of pytest. Below is an example of
how to run a test in which you interact with the Notebook handler:

```python
import os
from pathlib import Path

import dotenv

from tests.common.navigator import ServerManager
from tests.common.handlers import Notebook

# Load KEYCLOAK_USERNAME and KEYCLOAK_PASSWORD from the .env file
dotenv.load_dotenv()

# Instantiate the ServerManager navigator
nav = ServerManager(
    nebari_url="https://nebari.quansight.dev/",
    username=os.environ["KEYCLOAK_USERNAME"],
    password=os.environ["KEYCLOAK_PASSWORD"],
    auth="password",
    instance_name="small-instance",
    headless=False,
    slow_mo=100,
)


notebook_manager = Notebook(navigator=nav)

# Reset the JupyterLab workspace to ensure we're starting with only the Launcher screen open and in the root directory.
notebook_manager.reset_workspace()

notebook_name = "test_notebook_output.ipynb"
notebook_path = Path("tests_e2e/playwright/test_data") / notebook_name

assert notebook_path.exists()

# Write the sample notebook on the Nebari instance
with open(notebook_path, "r") as notebook:
    notebook_manager.write_file(filepath=notebook_name, content=notebook.read())

# Run a sample notebook (and collect the outputs)
outputs = notebook_manager.run_notebook(
    notebook_name=notebook_name, kernel="default"
)

# Close out Playwright and its associated browser handles
nav.teardown()
```

## Writing Playwright Tests

Most testing is done through `locators`, which connect Python objects to HTML elements on the page. Playwright offers several mechanisms for getting a locator for an item on the page, such as `get_by_role`, `get_by_text`, `get_by_label`, and `get_by_placeholder`.

```python
button = self.page.get_by_role("button", name="Sign in with Keycloak")
```

Once you have a handle on a locator, you can interact with it in various ways, depending on the type of object. For example, clicking a button:

```python
button.click()
```

Sometimes you'll need to wait for elements to load on the screen. You can wait for the page to finish loading:

```python
self.page.wait_for_load_state("networkidle")
```

Or wait for something specific to happen with the locator itself:

```python
button.wait_for(timeout=3000, state="attached")
```

Note that waiting for the page to finish loading may be misleading inside of JupyterLab since elements may need to load _inside_ the page or cause several bursts of network traffic.

Playwright has a built-in auto-wait feature that waits for a timeout period for actionable items. See [Playwright Actionability](https://playwright.dev/docs/actionability).
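
Auto-waiting only covers Playwright's own actions and locators. For conditions that are not tied to a single locator (for example, a value computed from several elements), a small polling helper follows the same idea: retry a check until it passes or a timeout expires. The sketch below is an illustration of that pattern using only the standard library; `wait_until` is a hypothetical helper and is neither Playwright's implementation nor part of the Nebari test suite.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool], timeout: float = 5.0, interval: float = 0.1
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False if the timeout expired.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The caller decides what to do on a `False` result, such as failing the test with a descriptive assertion message.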

## Parameterized Decorators

### Usage

Parameterized decorators in your test setup allow you to run tests with different configurations or contexts. They are particularly useful for testing different scenarios, such as varying user roles or application states.

To ease control over the initial setup of logging in and spawning the user instance, we
already provide two base decorators that can be used in your tests:
- `server_parameterized`: Logs in and spins up a new server instance, based
  on the provided instance type. This allows `nav.page` to run within the JupyterLab environment.
- `login_parameterized`: Logs in to Nebari and sets your test workspace to the main
  hub, allowing your tests to check things like the launcher screen or the navbar components.

For example, using parameterized decorators to test different user roles might look like this:

```python
@pytest.mark.parametrize("is_admin", [False])
@login_parameterized()
def test_role_button(navigator, is_admin):
    admin_button_visible = navigator.page.get_by_role(
        "button", name="Admin Button"
    ).is_visible()
    assert admin_button_visible == is_admin
    # Perform tests specific to the user role...
```
In the example above, we used the `login_parameterized` decorator, which logs in as a user
(based on `KEYCLOAK_USERNAME` and `KEYCLOAK_PASSWORD`) and lets you navigate the logged-in workspace.
We then check for the presence of the "Admin Button" on the page (which does not exist).

If your test suite needs a more complex sequence of actions or special
parsing of the contents of each page, you can create
your own handler to execute the auxiliary actions while the test is running. See
`handlers.py` for some examples of how that is done.
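
One way to build decorators of this kind is pytest's indirect parametrization, where the decorator passes a dictionary of settings to a fixture through `request.param`. The sketch below assumes that approach and assumes the fixture is named `navigator`, as in the example above. The `*_sketch` function names are hypothetical, and the real `login_parameterized` and `server_parameterized` may be implemented differently.

```python
import pytest


def parameterized_fixture_sketch(fixture_name: str, **params):
    """Return a mark that hands `params` to the named fixture via request.param."""
    return pytest.mark.parametrize(fixture_name, [params], indirect=True)


def login_parameterized_sketch(**kwargs):
    # Hypothetical stand-in: forwards any login options to the fixture
    return parameterized_fixture_sketch("navigator", **kwargs)


def server_parameterized_sketch(instance_name=None, **kwargs):
    # Hypothetical stand-in: also tells the fixture which instance to spawn
    return parameterized_fixture_sketch(
        "navigator", instance_name=instance_name, **kwargs
    )
```

With `indirect=True`, the fixture receives the dictionary as `request.param` and can use it to decide whether to only log in or to also spawn a server.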


## Debugging Playwright tests

Playwright supports a debug mode called
[Inspector](https://playwright.dev/python/docs/debug#playwright-inspector) that can be
used to inspect the browser and the page while the test is running. To enable this
debugging option, set the `PWDEBUG=1` variable in
your test execution command.

For example, to run a single test with the debug mode enabled, you can use the following:
```bash
PWDEBUG=1 pytest -s test_playwright.py::test_notebook --numprocesses 1
```



---
File: nebari/tests/tests_e2e/playwright/test_playwright.py
---

import pytest
from playwright.sync_api import expect

from tests.common.handlers import CondaStore, Notebook
from tests.common.playwright_fixtures import login_parameterized, server_parameterized


@login_parameterized()
def test_login_logout(navigator):
    expect(navigator.page.get_by_text(navigator.username)).to_be_visible()

    navigator.logout()
    expect(navigator.page.get_by_text("Sign in with Keycloak")).to_be_visible()


@pytest.mark.parametrize(
    "services",
    [
        (
            [
                "Home",
                "Token",
                "User Management",
                "Argo Workflows",
                "Environment Management",
                "Monitoring",
            ]
        ),
    ],
)
@login_parameterized()
def test_navbar_services(navigator, services):
    home_url = navigator._nebari_url / "hub/home"
    navigator.page.goto(home_url.human_repr())
    navigator.page.wait_for_load_state("networkidle")
    navbar_items = navigator.page.locator("#thenavbar").get_by_role("link")
    navbar_items_names = [item.text_content() for item in navbar_items.all()]
    assert len(navbar_items_names) == len(services)
    assert navbar_items_names == services


@pytest.mark.parametrize(
    "expected_outputs",
    [
        (["success: 6"]),
    ],
)
@server_parameterized(instance_name="small-instance")
def test_notebook(navigator, test_data_root, expected_outputs):
    notebook_manager = Notebook(navigator=navigator)

    notebook_manager.reset_workspace()

    notebook_name = "test_notebook_output.ipynb"
    notebook_path = test_data_root / notebook_name

    assert notebook_path.exists()

    with open(notebook_path, "r") as notebook:
        notebook_manager.write_file(filepath=notebook_name, content=notebook.read())

    outputs = notebook_manager.run_notebook(
        notebook_name=notebook_name, kernel="default"
    )

    assert outputs == expected_outputs

    # Clean up
    notebook_manager.reset_workspace()


@pytest.mark.parametrize(
    "namespaces",
    [
        (["analyst", "developer", "global", "nebari-git", "users"]),
    ],
)
@server_parameterized(instance_name="small-instance")
def test_conda_store_ui(navigator, namespaces):
    conda_store = CondaStore(navigator=navigator)

    conda_store.reset_workspace()

    conda_store.conda_store_ui()

    shown_namespaces = conda_store._get_shown_namespaces()
    shown_namespaces.sort()

    namespaces.append(navigator.username)
    namespaces.sort()

    assert shown_namespaces == namespaces
    # Clean up
    conda_store.reset_workspace()



---
File: nebari/tests/tests_integration/__init__.py
---




---
File: nebari/tests/tests_integration/conftest.py
---

pytest_plugins = [
    "tests.tests_integration.deployment_fixtures",
    "tests.common.playwright_fixtures",
]


# argparse under-the-hood
def pytest_addoption(parser):
    parser.addoption(
        "--cloud", action="store", help="Cloud to deploy on: aws/gcp/azure"
    )



---
File: nebari/tests/tests_integration/deployment_fixtures.py
---

import logging
import os
import pprint
import random
import shutil
import string
import uuid
import warnings
from pathlib import Path

import pytest
from urllib3.exceptions import InsecureRequestWarning

from _nebari.config import read_configuration, write_configuration
from _nebari.deploy import deploy_configuration
from _nebari.destroy import destroy_configuration
from _nebari.provider.cloud.amazon_web_services import aws_cleanup
from _nebari.provider.cloud.azure_cloud import azure_cleanup
from _nebari.provider.cloud.google_cloud import gcp_cleanup
from _nebari.render import render_template
from nebari import schema
from tests.common.config_mod_utils import add_gpu_config, add_preemptible_node_group
from tests.tests_unit.utils import render_config_partial

DEPLOYMENT_DIR = "_test_deploy"
CONFIG_FILENAME = "nebari-config.yaml"
DOMAIN = "ci-{cloud}.nebari.dev"
DEFAULT_IMAGE_TAG = "main"

logger = logging.getLogger(__name__)


def ignore_warnings():
    # Ignore this for now, as test is failing due to a
    # DeprecationWarning and InsecureRequestWarning
    warnings.filterwarnings("ignore", category=DeprecationWarning)
    warnings.filterwarnings("ignore", category=InsecureRequestWarning)


@pytest.fixture(autouse=True)
def disable_warnings():
    ignore_warnings()


def _random_letters(length=5):
    letters = string.ascii_letters
    return "".join(random.choice(letters) for _ in range(length)).lower()


def _get_or_create_deployment_directory(cloud):
    """This will create a directory to initialise and deploy
    Nebari from.
    """
    deployment_dirs = list(Path(Path(DEPLOYMENT_DIR) / cloud).glob(f"pytest{cloud}*"))
    if deployment_dirs:
        deployment_dir = deployment_dirs[0]
    else:
        project_name = f"pytest{cloud}{_random_letters()}"
        deployment_dir = Path(Path(Path(DEPLOYMENT_DIR) / cloud) / project_name)
        deployment_dir.mkdir(parents=True)
    return deployment_dir


def _delete_deployment_directory(deployment_dir: Path):
    """Delete the deployment directory if it exists."""
    config = list(deployment_dir.glob(CONFIG_FILENAME))
    if len(config) == 1:
        logger.info(f"Deleting deployment directory: {deployment_dir}")
        shutil.rmtree(deployment_dir)


def _set_nebari_creds_in_environment(config):
    os.environ["NEBARI_FULL_URL"] = f"https://{config.domain}/"
    os.environ["KEYCLOAK_USERNAME"] = "pytest"
    os.environ["KEYCLOAK_PASSWORD"] = os.environ.get(
        "PYTEST_KEYCLOAK_PASSWORD", uuid.uuid4().hex
    )


def _create_nebari_user(config):
    import keycloak

    from _nebari.keycloak import create_user, get_keycloak_admin_from_config

    keycloak_admin = get_keycloak_admin_from_config(config)
    try:
        user = create_user(keycloak_admin, "pytest", "pytest-password")
        return user
    except keycloak.KeycloakPostError as e:
        if e.response_code == 409:
            logger.info(f"User already exists: {e.response_body}")


def _cleanup_nebari(config: schema.Main):
    """Forcefully clean up any lingering resources."""

    cloud_provider = config.provider

    if cloud_provider == schema.ProviderEnum.aws.lower():
        logger.info("Forcefully clean up AWS resources")
        aws_cleanup(config)
    elif cloud_provider == schema.ProviderEnum.gcp.lower():
        logger.info("Forcefully clean up GCP resources")
        gcp_cleanup(config)
    elif cloud_provider == schema.ProviderEnum.azure.lower():
        logger.info("Forcefully clean up Azure resources")
        azure_cleanup(config)


@pytest.fixture(scope="session")
def deploy(request):
    """Deploy Nebari on the given cloud."""
    ignore_warnings()
    cloud = request.config.getoption("--cloud")

    # initialize
    deployment_dir = _get_or_create_deployment_directory(cloud)
    config = render_config_partial(
        project_name=deployment_dir.name,
        namespace="dev",
        nebari_domain=DOMAIN.format(cloud=cloud),
        cloud_provider=cloud,
        ci_provider="github-actions",
        auth_provider="password",
    )

    deployment_dir_abs = deployment_dir.absolute()
    os.chdir(deployment_dir)
    logger.info(f"Temporary directory: {deployment_dir}")
    config_path = Path(CONFIG_FILENAME)

    write_configuration(config_path, config)

    from nebari.plugins import nebari_plugin_manager

    stages = nebari_plugin_manager.ordered_stages
    config_schema = nebari_plugin_manager.config_schema

    config = read_configuration(config_path, config_schema)

    # Modify config
    config.certificate.type = "lets-encrypt"
    config.certificate.acme_email = "internal-devops@quansight.com"
    config.certificate.acme_server = "https://acme-v02.api.letsencrypt.org/directory"
    config.dns.provider = "cloudflare"
    config.dns.auto_provision = True
    config.default_images.jupyterhub = (
        f"quay.io/nebari/nebari-jupyterhub:{DEFAULT_IMAGE_TAG}"
    )
    config.default_images.jupyterlab = (
        f"quay.io/nebari/nebari-jupyterlab:{DEFAULT_IMAGE_TAG}"
    )
    config.default_images.dask_worker = (
        f"quay.io/nebari/nebari-dask-worker:{DEFAULT_IMAGE_TAG}"
    )

    if cloud in ["aws"]:
        config = add_gpu_config(config, cloud=cloud)
        config = add_preemptible_node_group(config, cloud=cloud)

    print("*" * 100)
    pprint.pprint(config.model_dump())
    print("*" * 100)

    # render
    render_template(deployment_dir_abs, config, stages)

    failed = False

    # deploy
    try:
        logger.info("*" * 100)
        logger.info(f"Deploying Nebari on {cloud}")
        logger.info("*" * 100)
        stage_outputs = deploy_configuration(
            config=config,
            stages=stages,
            disable_prompt=True,
            disable_checks=False,
        )
        _create_nebari_user(config)
        _set_nebari_creds_in_environment(config)
        yield stage_outputs
    except Exception as e:
        failed = True
        logger.exception(e)
        logger.error(f"Deploy Failed, Exception: {e}")

    # destroy
    try:
        logger.info("*" * 100)
        logger.info("Tearing down")
        logger.info("*" * 100)
        destroy_configuration(config, stages)
    except Exception as e:
        logger.exception(e)
        logger.error("Destroy failed!")
        raise
    finally:
        logger.info("*" * 100)
        logger.info("Cleaning up any lingering resources")
        logger.info("*" * 100)
        try:
            _cleanup_nebari(config)
        except Exception as e:
            logger.exception(e)
            logger.error(
                "Cleanup failed, please check if there are any lingering resources!"
            )
        _delete_deployment_directory(deployment_dir_abs)

    if failed:
        raise AssertionError("Deployment failed")



---
File: nebari/tests/tests_integration/README.md
---

# Integration Testing via Pytest

These tests deploy Nebari to a cloud provider and run against the
live deployment.

## Amazon Web Services

Set the following environment variables before running the tests:

```bash
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_DEFAULT_REGION
CLOUDFLARE_TOKEN
```

Assuming you're in the `tests_integration` directory, run:

```bash
pytest -vvv -s --cloud aws
```

This will deploy Nebari on Amazon Web Services, run tests on the deployment,
and then tear down the cluster.
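
The sequence above (deploy, run tests, tear down) follows the usual setup/teardown pattern. The following is a minimal sketch of that pattern using only the standard library. `fake_deploy`, `fake_destroy`, and the returned dictionary are hypothetical stand-ins for illustration, not Nebari functions.

```python
from contextlib import contextmanager

events = []  # records the order in which each phase runs


def fake_deploy(cloud):
    # Hypothetical stand-in for a real deployment step.
    events.append(f"deploy:{cloud}")
    return {"cloud": cloud}


def fake_destroy(cloud):
    # Hypothetical stand-in for a real teardown step.
    events.append(f"destroy:{cloud}")


@contextmanager
def deployment(cloud):
    """Deploy, hand the outputs to the caller, and always tear down afterwards."""
    outputs = fake_deploy(cloud)
    try:
        yield outputs
    finally:
        fake_destroy(cloud)


with deployment("aws") as outputs:
    # The "tests" run here, while the deployment is up.
    events.append(f"test:{outputs['cloud']}")
```

Placing the teardown in a `finally` clause means it runs even when a test raises, which matters when the resources cost money.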


## Azure

Set the following environment variables before running the tests:

```bash
ARM_SUBSCRIPTION_ID
ARM_TENANT_ID
ARM_CLIENT_ID
ARM_CLIENT_SECRET
CLOUDFLARE_TOKEN
```

Assuming you're in the `tests_integration` directory, run:

```bash
pytest -vvv -s --cloud azure
```

This will deploy Nebari on Azure, run tests on the deployment,
and then tear down the cluster.
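
Because a run creates real cloud resources, it can help to confirm the credentials are present before invoking `pytest`. The helper below is a minimal sketch, not part of the test suite: the variable names come from the lists in this README, while `REQUIRED_ENV_VARS` and `missing_env_vars` are hypothetical names introduced for illustration.

```python
import os

# Variable names are taken from the lists in this README.
REQUIRED_ENV_VARS = {
    "aws": [
        "AWS_ACCESS_KEY_ID",
        "AWS_SECRET_ACCESS_KEY",
        "AWS_DEFAULT_REGION",
        "CLOUDFLARE_TOKEN",
    ],
    "azure": [
        "ARM_SUBSCRIPTION_ID",
        "ARM_TENANT_ID",
        "ARM_CLIENT_ID",
        "ARM_CLIENT_SECRET",
        "CLOUDFLARE_TOKEN",
    ],
}


def missing_env_vars(cloud, environ=None):
    """Return the required variables that are unset or empty for ``cloud``."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS[cloud] if not environ.get(name)]


# Example with an explicit mapping instead of the real environment.
missing = missing_env_vars("aws", environ={"AWS_ACCESS_KEY_ID": "example"})
print(missing)
```

Calling `missing_env_vars("azure")` with no second argument checks the real process environment.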



---
File: nebari/tests/tests_integration/test_all_clouds.py
---

import requests


def test_service_status(deploy):
    service_urls = deploy["stages/07-kubernetes-services"]["service_urls"]["value"]
    assert (
        requests.get(service_urls["jupyterhub"]["health_url"], verify=False).status_code
        == 200
    )
    assert (
        requests.get(service_urls["keycloak"]["health_url"], verify=False).status_code
        == 200
    )
    assert (
        requests.get(
            service_urls["dask_gateway"]["health_url"], verify=False
        ).status_code
        == 200
    )
    assert (
        requests.get(
            service_urls["conda_store"]["health_url"], verify=False
        ).status_code
        == 200
    )
    assert (
        requests.get(service_urls["monitoring"]["health_url"], verify=False).status_code
        == 200
    )


def test_verify_keycloak_users(deploy):
    """Tests if keycloak is working and it has expected users"""
    keycloak_credentials = deploy["stages/05-kubernetes-keycloak"][
        "keycloak_credentials"
    ]["value"]
    from keycloak import KeycloakAdmin

    keycloak_admin = KeycloakAdmin(
        server_url=f"{keycloak_credentials['url']}/auth/",
        username=keycloak_credentials["username"],
        password=keycloak_credentials["password"],
        realm_name=keycloak_credentials["realm"],
        client_id=keycloak_credentials["client_id"],
        verify=False,
    )
    assert {u["username"] for u in keycloak_admin.get_users()} == {
        "nebari-bot",
        "read-only-user",
        "root",
    }



---
File: nebari/tests/tests_integration/test_gpu.py
---

# 2023-09-14: This test is currently timing out on CI, so we're disabling it for now.

# import re

# import pytest

# from tests.common.playwright_fixtures import navigator_parameterized
# from tests.common.run_notebook import Notebook


# @pytest.mark.gpu
# @navigator_parameterized(instance_name="gpu-instance")
# def test_gpu(deploy, navigator, test_data_root):
#     test_app = Notebook(navigator=navigator)
#     conda_env = "gpu"
#     test_app.create_notebook(
#         conda_env=f"conda-env-nebari-git-nebari-git-{conda_env}-py"
#     )
#     test_app.assert_code_output(
#         code="!nvidia-smi",
#         expected_output=re.compile(".*\n.*\n.*NVIDIA-SMI.*CUDA Version"),
#     )

#     test_app.assert_code_output(
#         code="import torch;torch.cuda.is_available()", expected_output="True"
#     )



---
File: nebari/tests/tests_integration/test_preemptible.py
---

import pytest
from kubernetes import client, config

from tests.common.config_mod_utils import PREEMPTIBLE_NODE_GROUP_NAME


@pytest.mark.preemptible
def test_preemptible(request, deploy):
    config.load_kube_config(
        config_file=deploy["stages/02-infrastructure"]["kubeconfig_filename"]["value"]
    )
    if request.node.get_closest_marker("aws"):
        name_label = "eks.amazonaws.com/nodegroup"
        preemptible_key = "eks.amazonaws.com/capacityType"
        expected_value = "SPOT"
        pytest.xfail("Preemptible instances are not supported on AWS atm")

    elif request.node.get_closest_marker("gcp"):
        name_label = "cloud.google.com/gke-nodepool"
        preemptible_key = "cloud.google.com/gke-preemptible"
        expected_value = "true"
    else:
        pytest.skip("Unsupported cloud for preemptible")

    api_instance = client.CoreV1Api()
    nodes = api_instance.list_node()
    node_labels_map = {}
    for node in nodes.items:
        node_name = node.metadata.labels[name_label]
        node_labels_map[node_name] = node.metadata.labels
    preemptible_node_group_labels = node_labels_map[PREEMPTIBLE_NODE_GROUP_NAME]
    assert preemptible_node_group_labels.get(preemptible_key) == expected_value



---
File: nebari/tests/tests_unit/cli_validate/aws.error.kubernetes-version.yaml
---

project_name: test
amazon_web_services:
  region: us-east-1
  kubernetes_version: '1.0'



---
File: nebari/tests/tests_unit/cli_validate/aws.happy.yaml
---

provider: aws
namespace: dev
nebari_version: 2023.7.2.dev23+g53d17964.d20230824
project_name: test
domain: test.example.com
ci_cd:
  type: none
terraform_state:
  type: local
security:
  keycloak:
    initial_root_password: i8mawmz1ek6s9n9ms6ifgwborgryve8q
  authentication:
    type: password
theme:
  jupyterhub:
    hub_title: Nebari - test
    welcome: Welcome! Learn about Nebari's features and configurations in <a href="https://www.nebari.dev/docs">the
      documentation</a>. If you have any questions or feedback, reach the team on
      <a href="https://www.nebari.dev/docs/community#getting-support">Nebari's support
      forums</a>.
    hub_subtitle: Your open source data science platform, hosted on Amazon Web Services
certificate:
  type: lets-encrypt
  acme_email: test@example.com
amazon_web_services:
  region: us-east-1
  kubernetes_version: '1.20'



---
File: nebari/tests/tests_unit/cli_validate/azure.happy.yaml
---

provider: azure
namespace: dev
nebari_version: 2023.7.2.dev23+g53d17964.d20230824
project_name: test
domain: test.example.com
ci_cd:
  type: none
terraform_state:
  type: local
security:
  keycloak:
    initial_root_password: m1s25vc4k43dxbk5jaxubxcq39n4vmjq
  authentication:
    type: password
theme:
  jupyterhub:
    hub_title: Nebari - test
    welcome: Welcome! Learn about Nebari's features and configurations in <a href="https://www.nebari.dev/docs">the
      documentation</a>. If you have any questions or feedback, reach the team on
      <a href="https://www.nebari.dev/docs/community#getting-support">Nebari's support
      forums</a>.
    hub_subtitle: Your open source data science platform, hosted on Azure
certificate:
  type: lets-encrypt
  acme_email: test@example.com
azure:
  kubernetes_version: '1.20'
  storage_account_postfix: abcd
  region: Central US



---
File: nebari/tests/tests_unit/cli_validate/gcp.happy.yaml
---

provider: gcp
namespace: dev
nebari_version: 2023.7.2.dev23+g53d17964.d20230824
project_name: test
domain: test.example.com
ci_cd:
  type: none
terraform_state:
  type: local
security:
  keycloak:
    initial_root_password: m1s25vc4k43dxbk5jaxubxcq39n4vmjq
  authentication:
    type: password
theme:
  jupyterhub:
    hub_title: Nebari - test
    welcome: Welcome! Learn about Nebari's features and configurations in <a href="https://www.nebari.dev/docs">the
      documentation</a>. If you have any questions or feedback, reach the team on
      <a href="https://www.nebari.dev/docs/community#getting-support">Nebari's support
      forums</a>.
    hub_subtitle: Your open source data science platform, hosted on Google Cloud Platform
certificate:
  type: lets-encrypt
  acme_email: test@example.com
google_cloud_platform:
  project: pytest-project
  region: us-central1
  kubernetes_version: '1.20'



---
File: nebari/tests/tests_unit/cli_validate/local.error.authentication-type-custom.yaml
---

project_name: test
security:
  authentication:
    type: custom



---
File: nebari/tests/tests_unit/cli_validate/local.error.extra-inputs.yaml
---

project_name: test
this_is_an_error: true



---
File: nebari/tests/tests_unit/cli_validate/local.error.project_name.ends_with_special.yaml
---

project_name: invalidproject-



---
File: nebari/tests/tests_unit/cli_validate/local.error.project_name.starts_with_number.yaml
---

project_name: 123invalidproject



---
File: nebari/tests/tests_unit/cli_validate/local.error.project_name.too_long.yaml
---

project_name: thisprojectnameissolongitshouldbeinvalid



---
File: nebari/tests/tests_unit/cli_validate/local.happy.auth0.yaml
---

provider: local
project_name: foobar
security:
  authentication:
    type: Auth0
    config:
      client_id: test_client
      client_secret: test_secret
      auth0_subdomain: test_subdomain



---
File: nebari/tests/tests_unit/cli_validate/local.happy.github.yaml
---

provider: local
project_name: foobar
security:
  authentication:
    type: GitHub
    config:
      client_id: test_client
      client_secret: test_secret



---
File: nebari/tests/tests_unit/cli_validate/local.happy.project_name.with_numbers.yaml
---

project_name: my-test-1-2-3



---
File: nebari/tests/tests_unit/cli_validate/local.happy.yaml
---

provider: local
namespace: dev
nebari_version: 2023.7.2.dev23+g53d17964.d20230824
project_name: test
domain: test.example.com
ci_cd:
  type: none
terraform_state:
  type: local
security:
  keycloak:
    initial_root_password: muwti3n4d7m81c1svcgaahwhfi869yhg
  authentication:
    type: password
theme:
  jupyterhub:
    hub_title: Nebari - test
    welcome: Welcome! Learn about Nebari's features and configurations in <a href="https://www.nebari.dev/docs">the
      documentation</a>. If you have any questions or feedback, reach the team on
      <a href="https://www.nebari.dev/docs/community#getting-support">Nebari's support
      forums</a>.
    hub_subtitle: Your open source data science platform, hosted
certificate:
  type: lets-encrypt
  acme_email: test@example.com
jupyterhub:
  overrides:
    singleuser:
      extraEnv:
        TEST_ENV: "my_env"



---
File: nebari/tests/tests_unit/cli_validate/min.happy.jupyterlab.default_settings.yaml
---

project_name: test
jupyterlab:
  default_settings:
    "@jupyterlab/apputils-extension:themes":
      theme: JupyterLab Dark



---
File: nebari/tests/tests_unit/cli_validate/min.happy.jupyterlab.gallery_settings.yaml
---

project_name: test
jupyterlab:
  gallery_settings:
    title: Example repositories
    destination: examples
    exhibits:
      - title: Nebari
        git: https://github.com/nebari-dev/nebari.git
        homepage: https://github.com/nebari-dev/nebari
        description: 🪴 Nebari - your open source data science platform



---
File: nebari/tests/tests_unit/cli_validate/min.happy.monitoring.overrides.yaml
---

project_name: test
monitoring:
  enabled: true
  overrides:
    loki:
      loki: foobar
    promtail:
      promtail: foobar
    minio:
      minio: foobar



---
File: nebari/tests/tests_unit/cli_validate/min.happy.yaml
---

project_name: test



---
File: nebari/tests/tests_unit/qhub-config-yaml-files-for-upgrade/qhub-config-aws-310-customauth.yaml
---

project_name: aws-pytest
provider: aws
domain: aws.nebari.dev
certificate:
  type: self-signed
security:
  authentication:
    type: custom
    authentication_class: 'firstuseauthenticator.FirstUseAuthenticator'
    config:
      min_password_length: 5
  users:
    example-user:
      uid: 1000
      primary_group: admin
      secondary_groups:
      - users
      password: $2b$12$YrEkTAEFfo4fKO7lYPpReegKagd1irrW5YmRugJcaPCjkVaPzrVLq
  groups:
    users:
      gid: 100
    admin:
      gid: 101
default_images:
  jupyterhub: quansight/nebari-jupyterhub:v0.3.10
  jupyterlab: quansight/nebari-jupyterlab:v0.3.10
  dask_worker: quansight/nebari-dask-worker:v0.3.10
  dask_gateway: quansight/nebari-dask-gateway:v0.3.10
storage:
  conda_store: 60Gi
  shared_filesystem: 100Gi
theme:
  jupyterhub:
    hub_title: Nebari - do-pytest
    hub_subtitle: Autoscaling Compute Environment on AWS
    welcome: Welcome to do.nebari.dev. It is maintained by <a href="http://quansight.com">Quansight
      staff</a>. The hub's configuration is stored in a github repository based on
      <a href="https://github.com/Quansight/nebari/">https://github.com/Quansight/nebari/</a>.
      To provide feedback and report any technical problems, please use the <a href="https://github.com/Quansight/nebari/issues">github
      issue tracker</a>.
    logo: /hub/custom/images/jupyter_nebari_logo.svg
    primary_color: '#4f4173'
    secondary_color: '#957da6'
    accent_color: '#32C574'
    text_color: '#111111'
    h1_color: '#652e8e'
    h2_color: '#652e8e'
terraform_state:
  type: remote
namespace: dev
amazon_web_services:
  kubernetes_version: '1.20'
  region: us-east-1
  node_groups:
    general:
      instance: m5.2xlarge
      min_nodes: 1
      max_nodes: 1
      gpu: false
      single_subnet: false
      permissions_boundary:
    user:
      instance: m5.xlarge
      min_nodes: 0
      max_nodes: 5
      gpu: false
      single_subnet: false
      permissions_boundary:
    worker:
      instance: m5.xlarge
      min_nodes: 0
      max_nodes: 5
      gpu: false
      single_subnet: false
      permissions_boundary:
profiles:
  jupyterlab:
  - display_name: Small Instance
    description: Stable environment with 1 cpu / 4 GB ram
    default: true
    kubespawner_override:
      cpu_limit: 1
      cpu_guarantee: 0.75
      mem_limit: 4G
      mem_guarantee: 2.5G
      image: quansight/nebari-jupyterlab:v0.3.10
  - display_name: Medium Instance
    description: Stable environment with 2 cpu / 8 GB ram
    kubespawner_override:
      cpu_limit: 2
      cpu_guarantee: 1.5
      mem_limit: 8G
      mem_guarantee: 5G
      image: quansight/nebari-jupyterlab:v0.3.10
  dask_worker:
    Small Worker:
      worker_cores_limit: 1
      worker_cores: 0.75
      worker_memory_limit: 4G
      worker_memory: 2.5G
      worker_threads: 1
      image: quansight/nebari-dask-worker:v0.3.10
    Medium Worker:
      worker_cores_limit: 2
      worker_cores: 1.5
      worker_memory_limit: 8G
      worker_memory: 5G
      worker_threads: 2
      image: quansight/nebari-dask-worker:v0.3.10
environments:
  environment-dask.yaml:
    name: dask
    channels:
    - conda-forge
    dependencies:
    - python
    - ipykernel
    - ipywidgets
    - python-graphviz
    - dask ==2.30.0
    - distributed ==2.30.1
    - dask-gateway ==0.9.0
    - numpy
    - numba
    - pandas
  environment-dashboard.yaml:
    name: dashboard
    channels:
    - conda-forge
    dependencies:
    - python
    - ipykernel
    - ipywidgets >=7.6
    - param
    - python-graphviz
    - matplotlib >=3.3.4
    - panel >=0.10.3
    - voila >=0.2.7
    - streamlit >=0.76
    - dash >=1.19



---
File: nebari/tests/tests_unit/qhub-config-yaml-files-for-upgrade/qhub-config-aws-310.yaml
---

project_name: aws-pytest
provider: aws
domain: aws.nebari.dev
certificate:
  type: self-signed
security:
  authentication:
    type: password
  users:
    example-user:
      uid: 1000
      primary_group: admin
      secondary_groups:
      - users
      password: $2b$12$YrEkTAEFfo4fKO7lYPpReegKagd1irrW5YmRugJcaPCjkVaPzrVLq
  groups:
    users:
      gid: 100
    admin:
      gid: 101
default_images:
  jupyterhub: quansight/nebari-jupyterhub:v0.3.10
  jupyterlab: quansight/nebari-jupyterlab:v0.3.10
  dask_worker: quansight/nebari-dask-worker:v0.3.10
  dask_gateway: quansight/nebari-dask-gateway:v0.3.10
storage:
  conda_store: 60Gi
  shared_filesystem: 100Gi
theme:
  jupyterhub:
    hub_title: Nebari - do-pytest
    hub_subtitle: Autoscaling Compute Environment on AWS
    welcome: Welcome to do.nebari.dev. It is maintained by <a href="http://quansight.com">Quansight
      staff</a>. The hub's configuration is stored in a github repository based on
      <a href="https://github.com/Quansight/nebari/">https://github.com/Quansight/nebari/</a>.
      To provide feedback and report any technical problems, please use the <a href="https://github.com/Quansight/nebari/issues">github
      issue tracker</a>.
    logo: /hub/custom/images/jupyter_nebari_logo.svg
    primary_color: '#4f4173'
    secondary_color: '#957da6'
    accent_color: '#32C574'
    text_color: '#111111'
    h1_color: '#652e8e'
    h2_color: '#652e8e'
terraform_state:
  type: remote
namespace: dev
amazon_web_services:
  kubernetes_version: '1.20'
  region: us-east-1
  node_groups:
    general:
      instance: m5.2xlarge
      min_nodes: 1
      max_nodes: 1
      gpu: false
      single_subnet: false
      permissions_boundary:
    user:
      instance: m5.xlarge
      min_nodes: 0
      max_nodes: 5
      gpu: false
      single_subnet: false
      permissions_boundary:
    worker:
      instance: m5.xlarge
      min_nodes: 0
      max_nodes: 5
      gpu: false
      single_subnet: false
      permissions_boundary:
profiles:
  jupyterlab:
  - display_name: Small Instance
    description: Stable environment with 1 cpu / 4 GB ram
    default: true
    kubespawner_override:
      cpu_limit: 1
      cpu_guarantee: 0.75
      mem_limit: 4G
      mem_guarantee: 2.5G
      image: quansight/nebari-jupyterlab:v0.3.10
  - display_name: Medium Instance
    description: Stable environment with 2 cpu / 8 GB ram
    kubespawner_override:
      cpu_limit: 2
      cpu_guarantee: 1.5
      mem_limit: 8G
      mem_guarantee: 5G
      image: quansight/nebari-jupyterlab:v0.3.10
  dask_worker:
    Small Worker:
      worker_cores_limit: 1
      worker_cores: 0.75
      worker_memory_limit: 4G
      worker_memory: 2.5G
      worker_threads: 1
      image: quansight/nebari-dask-worker:v0.3.10
    Medium Worker:
      worker_cores_limit: 2
      worker_cores: 1.5
      worker_memory_limit: 8G
      worker_memory: 5G
      worker_threads: 2
      image: quansight/nebari-dask-worker:v0.3.10
environments:
  environment-dask.yaml:
    name: dask
    channels:
    - conda-forge
    dependencies:
    - python
    - ipykernel
    - ipywidgets
    - python-graphviz
    - dask ==2.30.0
    - distributed ==2.30.1
    - dask-gateway ==0.9.0
    - numpy
    - numba
    - pandas
  environment-dashboard.yaml:
    name: dashboard
    channels:
    - conda-forge
    dependencies:
    - python
    - ipykernel
    - ipywidgets >=7.6
    - param
    - python-graphviz
    - matplotlib >=3.3.4
    - panel >=0.10.3
    - voila >=0.2.7
    - streamlit >=0.76
    - dash >=1.19



---
File: nebari/tests/tests_unit/qhub-config-yaml-files-for-upgrade/qhub-users-import.json
---

{
  "id": "nebari",
  "realm": "nebari",
  "users": [
    {
      "username": "example-user",
      "enabled": true,
      "groups": [
        "admin",
        "users"
      ]
    }
  ],
  "groups": []
}



---
File: nebari/tests/tests_unit/__init__.py
---




---
File: nebari/tests/tests_unit/conftest.py
---

from pathlib import Path
from unittest.mock import Mock

import pytest

from _nebari.config import write_configuration
from _nebari.constants import (
    AWS_DEFAULT_REGION,
    AZURE_DEFAULT_REGION,
    GCP_DEFAULT_REGION,
)
from _nebari.initialize import render_config
from _nebari.render import render_template
from _nebari.stages.bootstrap import CiEnum
from _nebari.stages.kubernetes_keycloak import AuthenticationEnum
from _nebari.stages.terraform_state import TerraformStateEnum
from nebari import schema
from nebari.plugins import nebari_plugin_manager


@pytest.fixture
def config_path():
    return Path(__file__).parent / "cli_validate"


@pytest.fixture
def config_gcp(config_path):
    return config_path / "gcp.happy.yaml"


@pytest.fixture(autouse=True)
def mock_all_cloud_methods(monkeypatch):
    def _mock_return_value(return_value):
        m = Mock()
        m.return_value = return_value
        return m

    MOCK_VALUES = {
        # AWS
        "_nebari.provider.cloud.amazon_web_services.kubernetes_versions": [
            "1.18",
            "1.19",
            "1.20",
        ],
        "_nebari.provider.cloud.amazon_web_services.check_credentials": None,
        "_nebari.provider.cloud.amazon_web_services.regions": [
            "us-east-1",
            "us-west-2",
        ],
        "_nebari.provider.cloud.amazon_web_services.zones": [
            "us-west-2a",
            "us-west-2b",
        ],
        "_nebari.provider.cloud.amazon_web_services.instances": {
            "m5.xlarge": "m5.xlarge",
            "m5.2xlarge": "m5.2xlarge",
        },
        "_nebari.provider.cloud.amazon_web_services.kms_key_arns": {
            "xxxxxxxx-east-zzzz": {
                "Arn": "arn:aws:kms:us-east-1:100000:key/xxxxxxxx-east-zzzz",
                "KeyUsage": "ENCRYPT_DECRYPT",
                "KeySpec": "SYMMETRIC_DEFAULT",
            },
            "xxxxxxxx-west-zzzz": {
                "Arn": "arn:aws:kms:us-west-2:100000:key/xxxxxxxx-west-zzzz",
                "KeyUsage": "ENCRYPT_DECRYPT",
                "KeySpec": "SYMMETRIC_DEFAULT",
            },
        },
        # Azure
        "_nebari.provider.cloud.azure_cloud.kubernetes_versions": [
            "1.18",
            "1.19",
            "1.20",
        ],
        "_nebari.provider.cloud.azure_cloud.check_credentials": None,
        # Google Cloud
        "_nebari.provider.cloud.google_cloud.kubernetes_versions": [
            "1.18",
            "1.19",
            "1.20",
        ],
        "_nebari.provider.cloud.google_cloud.check_credentials": None,
        "_nebari.provider.cloud.google_cloud.regions": [
            "us-central1",
            "us-east1",
        ],
        "_nebari.provider.cloud.google_cloud.instances": [
            "e2-standard-4",
            "e2-standard-8",
            "e2-highmem-4",
        ],
    }

    for attribute_path, return_value in MOCK_VALUES.items():
        monkeypatch.setattr(attribute_path, _mock_return_value(return_value))

    monkeypatch.setenv("PROJECT_ID", "pytest-project")


@pytest.fixture(
    params=[
        # project, namespace, domain, cloud_provider, region, ci_provider, auth_provider
        (
            "pytestaws",
            "dev",
            "aws.nebari.dev",
            schema.ProviderEnum.aws,
            AWS_DEFAULT_REGION,
            CiEnum.github_actions,
            AuthenticationEnum.password,
        ),
        (
            "pytestgcp",
            "dev",
            "gcp.nebari.dev",
            schema.ProviderEnum.gcp,
            GCP_DEFAULT_REGION,
            CiEnum.gitlab_ci,
            AuthenticationEnum.password,
        ),
        (
            "pytestazure",
            "dev",
            "azure.nebari.dev",
            schema.ProviderEnum.azure,
            AZURE_DEFAULT_REGION,
            CiEnum.github_actions,
            AuthenticationEnum.password,
        ),
    ]
)
def nebari_config_options(request) -> dict:
    """This fixture creates a set of Nebari configuration options for tests."""
    DEFAULT_GH_REPO = "github.com/test/test"
    DEFAULT_TERRAFORM_STATE = TerraformStateEnum.remote

    (
        project,
        namespace,
        domain,
        cloud_provider,
        region,
        ci_provider,
        auth_provider,
    ) = request.param

    if ci_provider == CiEnum.github_actions:
        repo = DEFAULT_GH_REPO
    else:
        repo = None

    return dict(
        project_name=project,
        namespace=namespace,
        nebari_domain=domain,
        cloud_provider=cloud_provider,
        region=region,
        ci_provider=ci_provider,
        auth_provider=auth_provider,
        repository=repo,
        repository_auto_provision=False,
        auth_auto_provision=False,
        terraform_state=DEFAULT_TERRAFORM_STATE,
        disable_prompt=True,
    )


@pytest.fixture
def nebari_config(nebari_config_options):
    return nebari_plugin_manager.config_schema.model_validate(
        render_config(**nebari_config_options)
    )


@pytest.fixture
def nebari_stages():
    return nebari_plugin_manager.ordered_stages


@pytest.fixture
def nebari_render(nebari_config, nebari_stages, tmp_path):
    NEBARI_CONFIG_FN = "nebari-config.yaml"

    config_filename = tmp_path / NEBARI_CONFIG_FN
    write_configuration(config_filename, nebari_config)
    render_template(tmp_path, nebari_config, nebari_stages)
    return tmp_path, config_filename


@pytest.fixture
def new_upgrade_cls():
    from _nebari.upgrade import UpgradeStep

    assert UpgradeStep._steps
    steps_cache = UpgradeStep._steps.copy()
    UpgradeStep.clear_steps_registry()
    assert not UpgradeStep._steps
    yield UpgradeStep
    UpgradeStep._steps = steps_cache


@pytest.fixture
def config_schema():
    return nebari_plugin_manager.config_schema



---
File: nebari/tests/tests_unit/test_cli_deploy.py
---

from typer.testing import CliRunner

from _nebari.cli import create_cli

runner = CliRunner()


def test_dns_option(config_gcp):
    app = create_cli()
    result = runner.invoke(
        app,
        [
            "deploy",
            "-c",
            str(config_gcp),
            "--dns-provider",
            "cloudflare",
            "--dns-auto-provision",
        ],
    )
    assert (
        "The `--dns-provider` and `--dns-auto-provision` flags have been removed"
        in result.output
    )
    assert "Aborted" in result.output



---
File: nebari/tests/tests_unit/test_cli_dev.py
---

import json
import tempfile
from pathlib import Path
from typing import Any, List
from unittest.mock import Mock, patch

import pytest
import requests.exceptions
import yaml
from typer.testing import CliRunner

from _nebari.cli import create_cli

TEST_KEYCLOAKAPI_REQUEST = "GET /"  # get list of realms

TEST_DOMAIN = "nebari.example.com"
MOCK_KEYCLOAK_ENV = {
    "KEYCLOAK_SERVER_URL": f"https://{TEST_DOMAIN}/auth/",
    "KEYCLOAK_ADMIN_USERNAME": "root",
    "KEYCLOAK_ADMIN_PASSWORD": "super-secret-123!",
}

TEST_ACCESS_TOKEN = "abc123"

TEST_REALMS = [
    {"id": "test-realm", "realm": "test-realm"},
    {"id": "master", "realm": "master"},
]

runner = CliRunner()


@pytest.mark.parametrize(
    "args, exit_code, content",
    [
        # --help
        ([], 0, ["Usage:"]),
        (["--help"], 0, ["Usage:"]),
        (["-h"], 0, ["Usage:"]),
        (["keycloak-api", "--help"], 0, ["Usage:"]),
        (["keycloak-api", "-h"], 0, ["Usage:"]),
        # error, missing args
        (["keycloak-api"], 2, ["Missing option"]),
        (["keycloak-api", "--config"], 2, ["requires an argument"]),
        (["keycloak-api", "-c"], 2, ["requires an argument"]),
        (["keycloak-api", "--request"], 2, ["requires an argument"]),
        (["keycloak-api", "-r"], 2, ["requires an argument"]),
    ],
)
def test_cli_dev_stdout(args: List[str], exit_code: int, content: List[str]):
    app = create_cli()
    result = runner.invoke(app, ["dev"] + args)
    assert result.exit_code == exit_code
    for c in content:
        assert c in result.stdout


def mock_api_post(admin_password: str, url: str, headers: Any, data: Any, verify: bool):
    response = Mock()
    if (
        url
        == f"{MOCK_KEYCLOAK_ENV['KEYCLOAK_SERVER_URL']}realms/master/protocol/openid-connect/token"
        and data["password"] == admin_password
    ):
        response.status_code = 200
        response.content = bytes(
            json.dumps({"access_token": TEST_ACCESS_TOKEN}), "UTF-8"
        )
    else:
        response.status_code = 403
    return response


def mock_api_request(
    access_token: str, method: str, url: str, headers: Any, verify: bool
):
    response = Mock()
    if (
        method == "GET"
        and url == f"{MOCK_KEYCLOAK_ENV['KEYCLOAK_SERVER_URL']}admin/realms/"
        and headers["Authorization"] == f"Bearer {access_token}"
    ):
        response.status_code = 200
        response.content = bytes(json.dumps(TEST_REALMS), "UTF-8")
    else:
        response.status_code = 403
    return response


@patch(
    "_nebari.keycloak.requests.post",
    side_effect=lambda url, headers, data, verify: mock_api_post(
        MOCK_KEYCLOAK_ENV["KEYCLOAK_ADMIN_PASSWORD"], url, headers, data, verify
    ),
)
@patch(
    "_nebari.keycloak.requests.request",
    side_effect=lambda method, url, headers, verify: mock_api_request(
        TEST_ACCESS_TOKEN, method, url, headers, verify
    ),
)
def test_cli_dev_keycloakapi_happy_path_from_env(
    _mock_requests_request, _mock_requests_post
):
    result = run_cli_dev(use_env=True)

    assert 0 == result.exit_code
    assert not result.exception

    r = json.loads(result.stdout)
    assert 2 == len(r)
    assert "test-realm" == r[0]["realm"]


@patch(
    "_nebari.keycloak.requests.post",
    side_effect=lambda url, headers, data, verify: mock_api_post(
        MOCK_KEYCLOAK_ENV["KEYCLOAK_ADMIN_PASSWORD"], url, headers, data, verify
    ),
)
@patch(
    "_nebari.keycloak.requests.request",
    side_effect=lambda method, url, headers, verify: mock_api_request(
        TEST_ACCESS_TOKEN, method, url, headers, verify
    ),
)
def test_cli_dev_keycloakapi_happy_path_from_config(
    _mock_requests_request, _mock_requests_post
):
    result = run_cli_dev(use_env=False)

    assert 0 == result.exit_code
    assert not result.exception

    r = json.loads(result.stdout)
    assert 2 == len(r)
    assert "test-realm" == r[0]["realm"]


@patch(
    "_nebari.keycloak.requests.post",
    side_effect=lambda url, headers, data, verify: mock_api_post(
        MOCK_KEYCLOAK_ENV["KEYCLOAK_ADMIN_PASSWORD"], url, headers, data, verify
    ),
)
def test_cli_dev_keycloakapi_error_bad_request(_mock_requests_post):
    result = run_cli_dev(request="malformed")

    assert 1 == result.exit_code
    assert result.exception
    assert "not enough values to unpack" in str(result.exception)


@patch(
    "_nebari.keycloak.requests.post",
    side_effect=lambda url, headers, data, verify: mock_api_post(
        "invalid_admin_password", url, headers, data, verify
    ),
)
def test_cli_dev_keycloakapi_error_authentication(_mock_requests_post):
    result = run_cli_dev()

    assert 1 == result.exit_code
    assert result.exception
    assert "Unable to retrieve Keycloak API token" in str(result.exception)
    assert "Status code: 403" in str(result.exception)


@patch(
    "_nebari.keycloak.requests.post",
    side_effect=lambda url, headers, data, verify: mock_api_post(
        MOCK_KEYCLOAK_ENV["KEYCLOAK_ADMIN_PASSWORD"], url, headers, data, verify
    ),
)
@patch(
    "_nebari.keycloak.requests.request",
    side_effect=lambda method, url, headers, verify: mock_api_request(
        "invalid_access_token", method, url, headers, verify
    ),
)
def test_cli_dev_keycloakapi_error_authorization(
    _mock_requests_request, _mock_requests_post
):
    result = run_cli_dev()

    assert 1 == result.exit_code
    assert result.exception
    assert "Unable to communicate with Keycloak API" in str(result.exception)
    assert "Status code: 403" in str(result.exception)


@patch(
    "_nebari.keycloak.requests.post", side_effect=requests.exceptions.RequestException()
)
def test_cli_dev_keycloakapi_request_exception(_mock_requests_post):
    result = run_cli_dev()

    assert 1 == result.exit_code
    assert result.exception


@patch("_nebari.keycloak.requests.post", side_effect=Exception())
def test_cli_dev_keycloakapi_unhandled_error(_mock_requests_post):
    result = run_cli_dev()

    assert 1 == result.exit_code
    assert result.exception


def run_cli_dev(
    request: str = TEST_KEYCLOAKAPI_REQUEST,
    use_env: bool = True,
    extra_args: List[str] = [],
):
    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        extra_config = (
            {
                "domain": TEST_DOMAIN,
                "security": {
                    "keycloak": {
                        "initial_root_password": MOCK_KEYCLOAK_ENV[
                            "KEYCLOAK_ADMIN_PASSWORD"
                        ]
                    }
                },
            }
            if not use_env
            else {}
        )
        config = {**{"project_name": "dev"}, **extra_config}
        with open(tmp_file.resolve(), "w") as f:
            yaml.dump(config, f)

        assert tmp_file.exists() is True

        app = create_cli()

        args = [
            "dev",
            "keycloak-api",
            "--config",
            tmp_file.resolve(),
            "--request",
            request,
        ] + extra_args

        env = MOCK_KEYCLOAK_ENV if use_env else {}
        result = runner.invoke(app, args=args, env=env)

        return result



---
File: nebari/tests/tests_unit/test_cli_init_repository.py
---

import logging
import tempfile
from pathlib import Path
from unittest.mock import Mock, patch

import pytest
import requests.auth
import requests.exceptions
from typer.testing import CliRunner

from _nebari.cli import create_cli
from _nebari.provider.cicd.github import GITHUB_BASE_URL

runner = CliRunner()

TEST_GITHUB_USERNAME = "test-nebari-github-user"
TEST_GITHUB_TOKEN = "nebari-super-secret"

TEST_REPOSITORY_NAME = "nebari-test"

DEFAULT_ARGS = [
    "init",
    "local",
    "--project-name",
    "test",
    "--repository-auto-provision",
    "--repository",
    f"https://github.com/{TEST_GITHUB_USERNAME}/{TEST_REPOSITORY_NAME}",
]


@patch(
    "_nebari.provider.cicd.github.requests.get",
    side_effect=lambda url, json, auth: mock_api_request(
        "GET",
        url,
        json,
        auth,
    ),
)
@patch(
    "_nebari.provider.cicd.github.requests.post",
    side_effect=lambda url, json, auth: mock_api_request(
        "POST",
        url,
        json,
        auth,
    ),
)
@patch(
    "_nebari.provider.cicd.github.requests.put",
    side_effect=lambda url, json, auth: mock_api_request(
        "PUT",
        url,
        json,
        auth,
    ),
)
@patch(
    "_nebari.initialize.git",
    return_value=Mock(
        is_git_repo=Mock(return_value=False),
        initialize_git=Mock(return_value=True),
        add_git_remote=Mock(return_value=True),
    ),
)
def test_cli_init_repository_auto_provision(
    # stacked @patch decorators inject mocks bottom-up
    _mock_git,
    _mock_requests_put,
    _mock_requests_post,
    _mock_requests_get,
    monkeypatch: pytest.MonkeyPatch,
):
    monkeypatch.setenv("GITHUB_USERNAME", TEST_GITHUB_USERNAME)
    monkeypatch.setenv("GITHUB_TOKEN", TEST_GITHUB_TOKEN)

    app = create_cli()

    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        result = runner.invoke(app, DEFAULT_ARGS + ["--output", tmp_file.resolve()])

        assert 0 == result.exit_code
        assert not result.exception
        assert tmp_file.exists() is True


@patch(
    "_nebari.provider.cicd.github.requests.get",
    side_effect=lambda url, json, auth: mock_api_request(
        "GET", url, json, auth, repo_exists=True
    ),
)
@patch(
    "_nebari.provider.cicd.github.requests.post",
    side_effect=lambda url, json, auth: mock_api_request(
        "POST",
        url,
        json,
        auth,
    ),
)
@patch(
    "_nebari.provider.cicd.github.requests.put",
    side_effect=lambda url, json, auth: mock_api_request(
        "PUT",
        url,
        json,
        auth,
    ),
)
@patch(
    "_nebari.initialize.git",
    return_value=Mock(
        is_git_repo=Mock(return_value=False),
        initialize_git=Mock(return_value=True),
        add_git_remote=Mock(return_value=True),
    ),
)
def test_cli_init_repository_repo_exists(
    # stacked @patch decorators inject mocks bottom-up
    _mock_git,
    _mock_requests_put,
    _mock_requests_post,
    _mock_requests_get,
    monkeypatch: pytest.MonkeyPatch,
    capsys,
    caplog,
):
    monkeypatch.setenv("GITHUB_USERNAME", TEST_GITHUB_USERNAME)
    monkeypatch.setenv("GITHUB_TOKEN", TEST_GITHUB_TOKEN)

    with capsys.disabled():
        caplog.set_level(logging.WARNING)

        app = create_cli()

        with tempfile.TemporaryDirectory() as tmp:
            tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
            assert tmp_file.exists() is False

            result = runner.invoke(app, DEFAULT_ARGS + ["--output", tmp_file.resolve()])

            assert 0 == result.exit_code
            assert not result.exception
            assert tmp_file.exists() is True
            assert "already exists" in caplog.text


def test_cli_init_error_repository_missing_env(monkeypatch: pytest.MonkeyPatch):
    # raising=False: it is fine if a variable is already unset
    for env_var in ["GITHUB_USERNAME", "GITHUB_TOKEN"]:
        monkeypatch.delenv(env_var, raising=False)

    app = create_cli()

    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        result = runner.invoke(app, DEFAULT_ARGS + ["--output", tmp_file.resolve()])

        assert 1 == result.exit_code
        assert result.exception
        assert "Environment variable(s) required for GitHub automation" in str(
            result.exception
        )
        assert tmp_file.exists() is False


@pytest.mark.parametrize(
    "url",
    [
        "https://github.com",
        "http://github.com/user/repo",
        "https://github.com/user/github.com/user/repo",
        "https://notgithub.com/user/repository",
    ],
)
def test_cli_init_error_invalid_repo(url):
    app = create_cli()

    args = ["init", "local", "--project-name", "test", "--repository", url]

    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        result = runner.invoke(app, args + ["--output", tmp_file.resolve()])

        assert 2 == result.exit_code
        assert result.exception
        assert "repository URL" in str(result.stdout)
        assert tmp_file.exists() is False


def mock_api_request(
    method: str,
    url: str,
    json: str,
    auth: requests.auth.HTTPBasicAuth,
    repo_exists: bool = False,
):
    response = Mock()
    response.json = Mock(return_value={})
    response.raise_for_status = Mock(return_value=True)
    if (
        url.startswith(GITHUB_BASE_URL)
        and auth.username == TEST_GITHUB_USERNAME
        and auth.password == TEST_GITHUB_TOKEN
    ):
        response.status_code = 200
        if (
            not repo_exists
            and method == "GET"
            and url.endswith(f"repos/{TEST_GITHUB_USERNAME}/{TEST_REPOSITORY_NAME}")
        ):
            response.status_code = 404
            response.raise_for_status.side_effect = requests.exceptions.HTTPError
        elif method == "GET" and url.endswith(
            f"repos/{TEST_GITHUB_USERNAME}/{TEST_REPOSITORY_NAME}/actions/secrets/public-key"
        ):
            response.json = Mock(
                return_value={
                    "key": "hBT5WZEj8ZoOv6TYJsfWq7MxTEQopZO5/IT3ZCVQPzs=",
                    "key_id": "012345678912345678",
                }
            )
    else:
        response.status_code = 403
        response.raise_for_status.side_effect = requests.exceptions.HTTPError
    return response



---
File: nebari/tests/tests_unit/test_cli_init.py
---

import tempfile
from collections.abc import MutableMapping
from pathlib import Path
from typing import List

import pytest
import yaml
from typer import Typer
from typer.testing import CliRunner

from _nebari.cli import create_cli
from _nebari.constants import AZURE_DEFAULT_REGION

runner = CliRunner()

MOCK_KUBERNETES_VERSIONS = {
    "aws": ["1.20"],
    "azure": ["1.20"],
    "gcp": ["1.20"],
}
MOCK_CLOUD_REGIONS = {
    "aws": ["us-east-1"],
    "azure": [AZURE_DEFAULT_REGION],
    "gcp": ["us-central1"],
}


@pytest.mark.parametrize(
    "args, exit_code, content",
    [
        # --help
        (["--help"], 0, ["Usage:", "nebari init"]),
        (["-h"], 0, ["Usage:", "nebari init"]),
        # error, missing args
        ([], 2, ["Missing option"]),
        (["--no-guided-init"], 2, ["Missing option"]),
        (["--project-name"], 2, ["requires an argument"]),
        (["--project"], 2, ["requires an argument"]),
        (["-p"], 2, ["requires an argument"]),
        (["--domain-name"], 2, ["requires an argument"]),
        (["--domain"], 2, ["requires an argument"]),
        (["--namespace"], 2, ["requires an argument"]),
        (["--auth-provider"], 2, ["requires an argument"]),
        (["--repository"], 2, ["requires an argument"]),
        (["--ci-provider"], 2, ["requires an argument"]),
        (["--terraform-state"], 2, ["requires an argument"]),
        (["--kubernetes-version"], 2, ["requires an argument"]),
        (["--region"], 2, ["requires an argument"]),
        (["--ssl-cert-email"], 2, ["requires an argument"]),
        (["--output"], 2, ["requires an argument"]),
        (["-o"], 2, ["requires an argument"]),
        (["--explicit"], 2, ["Missing option"]),
        (["-e"], 2, ["Missing option"]),
    ],
)
def test_cli_init_stdout(args: List[str], exit_code: int, content: List[str]):
    app = create_cli()
    result = runner.invoke(app, ["init"] + args)
    assert result.exit_code == exit_code
    for c in content:
        assert c in result.stdout


def generate_test_data_test_cli_init_happy_path():
    """
    Generate inputs to test_cli_init_happy_path representing all valid combinations of options
    available to nebari init
    """

    test_data = []
    for provider in ["local", "aws", "azure", "gcp", "existing"]:
        for region in get_cloud_regions(provider):
            for project_name in ["testproject"]:
                for domain_name in [f"{project_name}.example.com"]:
                    for namespace in ["test-ns"]:
                        for auth_provider in ["password", "Auth0", "GitHub"]:
                            for ci_provider in [
                                "none",
                                "github-actions",
                                "gitlab-ci",
                            ]:
                                for terraform_state in [
                                    "local",
                                    "remote",
                                    "existing",
                                ]:
                                    for email in ["noreply@example.com"]:
                                        for (
                                            kubernetes_version
                                        ) in get_kubernetes_versions(provider) + [
                                            "latest"
                                        ]:
                                            for explicit in [True, False]:
                                                test_data.append(
                                                    (
                                                        provider,
                                                        region,
                                                        project_name,
                                                        domain_name,
                                                        namespace,
                                                        auth_provider,
                                                        ci_provider,
                                                        terraform_state,
                                                        email,
                                                        kubernetes_version,
                                                        explicit,
                                                    )
                                                )

    keys = [
        "provider",
        "region",
        "project_name",
        "domain_name",
        "namespace",
        "auth_provider",
        "ci_provider",
        "terraform_state",
        "email",
        "kubernetes_version",
        "explicit",
    ]
    return {"keys": keys, "test_data": test_data}


def test_cli_init_happy_path(
    provider: str,
    region: str,
    project_name: str,
    domain_name: str,
    namespace: str,
    auth_provider: str,
    ci_provider: str,
    terraform_state: str,
    email: str,
    kubernetes_version: str,
    explicit: bool,
):
    app = create_cli()
    args = [
        "init",
        provider,
        "--no-guided-init",  # default
        "--no-auth-auto-provision",  # default
        "--no-repository-auto-provision",  # default
        "--disable-prompt",
        "--project-name",
        project_name,
        "--domain-name",
        domain_name,
        "--namespace",
        namespace,
        "--auth-provider",
        auth_provider,
        "--ci-provider",
        ci_provider,
        "--terraform-state",
        terraform_state,
        "--ssl-cert-email",
        email,
        "--kubernetes-version",
        kubernetes_version,
        "--region",
        region,
    ]
    if explicit:
        args += ["--explicit"]

    expected_yaml = f"""
    provider: {provider}
    namespace: {namespace}
    project_name: {project_name}
    domain: {domain_name}
    ci_cd:
        type: {ci_provider}
    terraform_state:
        type: {terraform_state}
    security:
        authentication:
            type: {auth_provider}
    certificate:
        type: lets-encrypt
        acme_email: {email}
    """

    provider_section = get_provider_section_header(provider)
    if provider_section != "" and kubernetes_version != "latest":
        expected_yaml += f"""
    {provider_section}:
        kubernetes_version: '{kubernetes_version}'
        region: '{region}'
    """

    assert_nebari_init_args(app, args, expected_yaml)


def assert_nebari_init_args(
    app: Typer, args: List[str], expected_yaml: str, input: str = None
):
    """
    Run nebari init with happy path assertions and verify the generated yaml contains
    all values in expected_yaml.
    """
    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        result = runner.invoke(
            app, args + ["--output", tmp_file.resolve()], input=input
        )

        assert not result.exception
        assert 0 == result.exit_code
        assert tmp_file.exists() is True

        with open(tmp_file.resolve(), "r") as config_yaml:
            config = flatten_dict(yaml.safe_load(config_yaml))
            expected = flatten_dict(yaml.safe_load(expected_yaml))
            assert expected.items() <= config.items()


def pytest_generate_tests(metafunc):
    """
    Dynamically generate test data parameters for test functions by looking for
    and executing an associated generate_test_data_{function_name} if one exists.
    """

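    # Look the generator up by name instead of eval()-ing a string; test
    # functions without a generate_test_data_ helper are left untouched.
    generator = globals().get(f"generate_test_data_{metafunc.function.__name__}")
    if generator is not None:
        td = generator()
        metafunc.parametrize(",".join(td["keys"]), td["test_data"])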


# https://stackoverflow.com/a/62186053
def flatten_dict(dictionary, parent_key=False, separator="."):
    """
    Turn a nested dictionary into a flattened dictionary
    :param dictionary: The dictionary to flatten
    :param parent_key: The string to prepend to dictionary's keys
    :param separator: The string used to separate flattened keys
    :return: A flattened dictionary
    """

    items = []
    for key, value in dictionary.items():
        new_key = str(parent_key) + separator + key if parent_key else key
        if isinstance(value, MutableMapping):
            items.extend(flatten_dict(value, new_key, separator).items())
        elif isinstance(value, list):
            for k, v in enumerate(value):
                items.extend(flatten_dict({str(k): v}, new_key, separator).items())
        else:
            items.append((new_key, value))
    return dict(items)


def get_provider_section_header(provider: str):
    if provider == "aws":
        return "amazon_web_services"
    if provider == "gcp":
        return "google_cloud_platform"
    if provider == "azure":
        return "azure"
    return ""


def get_cloud_regions(provider: str) -> List[str]:
    # Providers without mock regions ("local", "existing") get an empty list,
    # so generate_test_data_test_cli_init_happy_path yields no cases for them.
    return MOCK_CLOUD_REGIONS.get(provider, [])


def get_kubernetes_versions(provider: str) -> List[str]:
    # Always return a list so callers can safely concatenate to the result.
    return MOCK_KUBERNETES_VERSIONS.get(provider, [])



---
File: nebari/tests/tests_unit/test_cli_keycloak.py
---

import json
import tempfile
from pathlib import Path
from typing import Any, List
from unittest.mock import Mock, patch

import keycloak.exceptions
import pytest
import requests.exceptions
import yaml
from typer.testing import CliRunner

from _nebari.cli import create_cli

TEST_KEYCLOAK_USERS = [
    {"id": "1", "username": "test-dev", "groups": ["analyst", "developer"]},
    {"id": "2", "username": "test-admin", "groups": ["admin"]},
    {"id": "3", "username": "test-nogroup", "groups": []},
]

TEST_DOMAIN = "nebari.example.com"
MOCK_KEYCLOAK_ENV = {
    "KEYCLOAK_SERVER_URL": f"https://{TEST_DOMAIN}/auth/",
    "KEYCLOAK_ADMIN_USERNAME": "root",
    "KEYCLOAK_ADMIN_PASSWORD": "super-secret-123!",
}

TEST_ACCESS_TOKEN = "abc123"

runner = CliRunner()


@pytest.mark.parametrize(
    "args, exit_code, content",
    [
        # --help
        ([], 0, ["Usage:"]),
        (["--help"], 0, ["Usage:"]),
        (["-h"], 0, ["Usage:"]),
        (["adduser", "--help"], 0, ["Usage:"]),
        (["adduser", "-h"], 0, ["Usage:"]),
        (["export-users", "--help"], 0, ["Usage:"]),
        (["export-users", "-h"], 0, ["Usage:"]),
        (["listusers", "--help"], 0, ["Usage:"]),
        (["listusers", "-h"], 0, ["Usage:"]),
        # error, missing args
        (["adduser"], 2, ["Missing option"]),
        (["adduser", "--config"], 2, ["requires an argument"]),
        (["adduser", "-c"], 2, ["requires an argument"]),
        (["adduser", "--user"], 2, ["requires 2 arguments"]),
        (["export-users"], 2, ["Missing option"]),
        (["export-users", "--config"], 2, ["requires an argument"]),
        (["export-users", "-c"], 2, ["requires an argument"]),
        (["export-users", "--realm"], 2, ["requires an argument"]),
        (["listusers"], 2, ["Missing option"]),
        (["listusers", "--config"], 2, ["requires an argument"]),
        (["listusers", "-c"], 2, ["requires an argument"]),
    ],
)
def test_cli_keycloak_stdout(args: List[str], exit_code: int, content: List[str]):
    app = create_cli()
    result = runner.invoke(app, ["keycloak"] + args)
    assert result.exit_code == exit_code
    for c in content:
        assert c in result.stdout


@patch("keycloak.KeycloakAdmin")
def test_cli_keycloak_adduser_happy_path_from_env(_mock_keycloak_admin):
    result = run_cli_keycloak_adduser(use_env=True)

    assert 0 == result.exit_code
    assert not result.exception
    assert f"Created user={TEST_KEYCLOAK_USERS[0]['username']}" in result.stdout


@patch("keycloak.KeycloakAdmin")
def test_cli_keycloak_adduser_happy_path_from_config(_mock_keycloak_admin):
    result = run_cli_keycloak_adduser(use_env=False)

    assert 0 == result.exit_code
    assert not result.exception
    assert f"Created user={TEST_KEYCLOAK_USERS[0]['username']}" in result.stdout


@patch(
    "keycloak.KeycloakAdmin.__init__",
    side_effect=keycloak.exceptions.KeycloakConnectionError(
        error_message="connection test"
    ),
)
def test_cli_keycloak_adduser_keycloak_connection_exception(_mock_keycloak_admin):
    result = run_cli_keycloak_adduser()

    assert 1 == result.exit_code
    assert result.exception
    assert "Failed to connect to Keycloak server: connection test" in str(
        result.exception
    )


@patch(
    "keycloak.KeycloakAdmin.__init__",
    side_effect=keycloak.exceptions.KeycloakAuthenticationError(
        error_message="auth test"
    ),
)
def test_cli_keycloak_adduser_keycloak_auth_exception(_mock_keycloak_admin):
    result = run_cli_keycloak_adduser()

    assert 1 == result.exit_code
    assert result.exception
    assert "Failed to connect to Keycloak server: auth test" in str(result.exception)


@patch(
    "keycloak.KeycloakAdmin",
    return_value=Mock(
        create_user=Mock(
            side_effect=keycloak.exceptions.KeycloakConnectionError(
                error_message="unhandled"
            )
        ),
    ),
)
def test_cli_keycloak_adduser_keycloak_unhandled_error(_mock_keycloak_admin):
    result = run_cli_keycloak_adduser()

    assert 1 == result.exit_code
    assert result.exception
    assert "unhandled" == str(result.exception)


@patch(
    "keycloak.KeycloakAdmin",
    return_value=Mock(
        users_count=Mock(side_effect=lambda: len(TEST_KEYCLOAK_USERS)),
        get_users=Mock(
            side_effect=lambda: [
                {
                    "id": u["id"],
                    "username": u["username"],
                    "email": f"{u['username']}@example.com",
                }
                for u in TEST_KEYCLOAK_USERS
            ]
        ),
        get_user_groups=Mock(
            side_effect=lambda user_id: [
                {"name": g}
                for u in TEST_KEYCLOAK_USERS
                if u["id"] == user_id
                for g in u["groups"]
            ]
        ),
    ),
)
def test_cli_keycloak_listusers_happy_path_from_env(_mock_keycloak_admin):
    result = run_cli_keycloak_listusers(use_env=True)

    assert 0 == result.exit_code
    assert not result.exception

    # output should start with the number of users found then
    # display a table with their info
    assert result.stdout.startswith(f"{len(TEST_KEYCLOAK_USERS)} Keycloak Users")
    # user count + headers + separator + 3 user rows == 6
    assert 6 == len(result.stdout.strip().split("\n"))
    for u in TEST_KEYCLOAK_USERS:
        assert u["username"] in result.stdout


@patch(
    "keycloak.KeycloakAdmin",
    return_value=Mock(
        users_count=Mock(side_effect=lambda: len(TEST_KEYCLOAK_USERS)),
        get_users=Mock(
            side_effect=lambda: [
                {
                    "id": u["id"],
                    "username": u["username"],
                    "email": f"{u['username']}@example.com",
                }
                for u in TEST_KEYCLOAK_USERS
            ]
        ),
        get_user_groups=Mock(
            side_effect=lambda user_id: [
                {"name": g}
                for u in TEST_KEYCLOAK_USERS
                if u["id"] == user_id
                for g in u["groups"]
            ]
        ),
    ),
)
def test_cli_keycloak_listusers_happy_path_from_config(_mock_keycloak_admin):
    result = run_cli_keycloak_listusers(use_env=False)

    assert 0 == result.exit_code
    assert not result.exception

    # output should start with the number of users found then
    # display a table with their info
    assert result.stdout.startswith(f"{len(TEST_KEYCLOAK_USERS)} Keycloak Users")
    # user count + headers + separator + 3 user rows == 6
    assert 6 == len(result.stdout.strip().split("\n"))
    for u in TEST_KEYCLOAK_USERS:
        assert u["username"] in result.stdout


@patch(
    "keycloak.KeycloakAdmin.__init__",
    side_effect=keycloak.exceptions.KeycloakConnectionError(
        error_message="connection test"
    ),
)
def test_cli_keycloak_listusers_keycloak_connection_exception(_mock_keycloak_admin):
    result = run_cli_keycloak_listusers()

    assert 1 == result.exit_code
    assert result.exception
    assert "Failed to connect to Keycloak server: connection test" in str(
        result.exception
    )


@patch(
    "keycloak.KeycloakAdmin.__init__",
    side_effect=keycloak.exceptions.KeycloakAuthenticationError(
        error_message="auth test"
    ),
)
def test_cli_keycloak_listusers_keycloak_auth_exception(_mock_keycloak_admin):
    result = run_cli_keycloak_listusers()

    assert 1 == result.exit_code
    assert result.exception
    assert "Failed to connect to Keycloak server: auth test" in str(result.exception)


@patch(
    "keycloak.KeycloakAdmin",
    return_value=Mock(
        users_count=Mock(
            side_effect=keycloak.exceptions.KeycloakConnectionError(
                error_message="unhandled"
            )
        ),
    ),
)
def test_cli_keycloak_listusers_keycloak_unhandled_error(_mock_keycloak_admin):
    result = run_cli_keycloak_listusers()

    assert 1 == result.exit_code
    assert result.exception
    assert "unhandled" == str(result.exception)


def mock_api_post(admin_password: str, url: str, headers: Any, data: Any, verify: bool):
    response = Mock()
    if (
        url
        == f"{MOCK_KEYCLOAK_ENV['KEYCLOAK_SERVER_URL']}realms/master/protocol/openid-connect/token"
        and data["password"] == admin_password
    ):
        response.status_code = 200
        response.content = bytes(
            json.dumps({"access_token": TEST_ACCESS_TOKEN}), "UTF-8"
        )
    else:
        response.status_code = 403
    return response


def mock_api_request(
    access_token: str, method: str, url: str, headers: Any, verify: bool
):
    response = Mock()
    if (
        method == "GET"
        and url
        == f"{MOCK_KEYCLOAK_ENV['KEYCLOAK_SERVER_URL']}admin/realms/test-realm/users"
        and headers["Authorization"] == f"Bearer {access_token}"
    ):
        response.status_code = 200
        response.content = bytes(json.dumps(TEST_KEYCLOAK_USERS), "UTF-8")
    else:
        response.status_code = 403
    return response


@patch(
    "_nebari.keycloak.requests.post",
    side_effect=lambda url, headers, data, verify: mock_api_post(
        MOCK_KEYCLOAK_ENV["KEYCLOAK_ADMIN_PASSWORD"], url, headers, data, verify
    ),
)
@patch(
    "_nebari.keycloak.requests.request",
    side_effect=lambda method, url, headers, verify: mock_api_request(
        TEST_ACCESS_TOKEN, method, url, headers, verify
    ),
)
def test_cli_keycloak_exportusers_happy_path_from_env(
    _mock_requests_request, _mock_requests_post
):
    result = run_cli_keycloak_exportusers()

    assert 0 == result.exit_code
    assert not result.exception

    r = json.loads(result.stdout)
    assert "test-realm" == r["realm"]
    assert 3 == len(r["users"])
    assert "test-dev" == r["users"][0]["username"]


@patch(
    "_nebari.keycloak.requests.post",
    side_effect=lambda url, headers, data, verify: mock_api_post(
        MOCK_KEYCLOAK_ENV["KEYCLOAK_ADMIN_PASSWORD"], url, headers, data, verify
    ),
)
@patch(
    "_nebari.keycloak.requests.request",
    side_effect=lambda method, url, headers, verify: mock_api_request(
        TEST_ACCESS_TOKEN, method, url, headers, verify
    ),
)
def test_cli_keycloak_exportusers_happy_path_from_config(
    _mock_requests_request, _mock_requests_post
):
    result = run_cli_keycloak_exportusers(use_env=False)

    assert 0 == result.exit_code
    assert not result.exception

    r = json.loads(result.stdout)
    assert "test-realm" == r["realm"]
    assert 3 == len(r["users"])
    assert "test-dev" == r["users"][0]["username"]


@patch(
    "_nebari.keycloak.requests.post",
    side_effect=lambda url, headers, data, verify: mock_api_post(
        "invalid_admin_password", url, headers, data, verify
    ),
)
def test_cli_keycloak_exportusers_error_authentication(_mock_requests_post):
    result = run_cli_keycloak_exportusers()

    assert 1 == result.exit_code
    assert result.exception
    assert "Unable to retrieve Keycloak API token" in str(result.exception)
    assert "Status code: 403" in str(result.exception)


@patch(
    "_nebari.keycloak.requests.post",
    side_effect=lambda url, headers, data, verify: mock_api_post(
        MOCK_KEYCLOAK_ENV["KEYCLOAK_ADMIN_PASSWORD"], url, headers, data, verify
    ),
)
@patch(
    "_nebari.keycloak.requests.request",
    side_effect=lambda method, url, headers, verify: mock_api_request(
        "invalid_access_token", method, url, headers, verify
    ),
)
def test_cli_keycloak_exportusers_error_authorization(
    _mock_requests_post, _mock_requests_request
):
    result = run_cli_keycloak_exportusers()

    assert 1 == result.exit_code
    assert result.exception
    assert "Unable to communicate with Keycloak API" in str(result.exception)
    assert "Status code: 403" in str(result.exception)


@patch(
    "_nebari.keycloak.requests.post", side_effect=requests.exceptions.RequestException()
)
def test_cli_keycloak_exportusers_request_exception(_mock_requests_post):
    result = run_cli_keycloak_exportusers()

    assert 1 == result.exit_code
    assert result.exception


@patch("_nebari.keycloak.requests.post", side_effect=Exception())
def test_cli_keycloak_exportusers_unhandled_error(_mock_requests_post):
    result = run_cli_keycloak_exportusers()

    assert 1 == result.exit_code
    assert result.exception


def run_cli_keycloak(command: str, use_env: bool, extra_args: List[str] = []):
    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        extra_config = (
            {
                "domain": TEST_DOMAIN,
                "security": {
                    "keycloak": {
                        "initial_root_password": MOCK_KEYCLOAK_ENV[
                            "KEYCLOAK_ADMIN_PASSWORD"
                        ]
                    }
                },
            }
            if not use_env
            else {}
        )
        config = {**{"project_name": "keycloak"}, **extra_config}
        with open(tmp_file.resolve(), "w") as f:
            yaml.dump(config, f)

        assert tmp_file.exists() is True

        app = create_cli()

        args = [
            "keycloak",
            command,
            "--config",
            tmp_file.resolve(),
        ] + extra_args

        env = MOCK_KEYCLOAK_ENV if use_env else {}
        result = runner.invoke(app, args=args, env=env)

        return result


def run_cli_keycloak_adduser(use_env: bool = True):
    username = TEST_KEYCLOAK_USERS[0]["username"]
    password = "test-password-123!"

    return run_cli_keycloak(
        "adduser",
        use_env=use_env,
        extra_args=[
            "--user",
            username,
            password,
        ],
    )


def run_cli_keycloak_listusers(use_env: bool = True):
    return run_cli_keycloak(
        "listusers",
        use_env=use_env,
    )


def run_cli_keycloak_exportusers(use_env: bool = True):
    return run_cli_keycloak(
        "export-users",
        use_env=use_env,
        extra_args=[
            "--realm",
            "test-realm",
        ],
    )



---
File: nebari/tests/tests_unit/test_cli_plugin.py
---

from typing import List
from unittest.mock import Mock, patch

import pytest
from typer.testing import CliRunner

from _nebari.cli import create_cli

runner = CliRunner()


@pytest.mark.parametrize(
    "args, exit_code, content",
    [
        # --help
        ([], 0, ["Usage:"]),
        (["--help"], 0, ["Usage:"]),
        (["-h"], 0, ["Usage:"]),
        (["list", "--help"], 0, ["Usage:"]),
        (["list", "-h"], 0, ["Usage:"]),
        (["list"], 0, ["Plugins"]),
    ],
)
def test_cli_plugin_stdout(args: List[str], exit_code: int, content: List[str]):
    app = create_cli()
    result = runner.invoke(app, ["plugin"] + args)
    assert result.exit_code == exit_code
    for c in content:
        assert c in result.stdout


def mock_get_plugins():
    mytestexternalplugin = Mock()
    mytestexternalplugin.__name__ = "mytestexternalplugin"

    otherplugin = Mock()
    otherplugin.__name__ = "otherplugin"

    return [mytestexternalplugin, otherplugin]


def mock_version(pkg):
    pkg_version_map = {
        "mytestexternalplugin": "0.4.4",
        "otherplugin": "1.1.1",
    }
    return pkg_version_map.get(pkg)


@patch(
    "nebari.plugins.NebariPluginManager.plugin_manager.get_plugins", mock_get_plugins
)
@patch("_nebari.subcommands.plugin.version", mock_version)
def test_cli_plugin_list_external_plugins():
    app = create_cli()
    result = runner.invoke(app, ["plugin", "list"])
    assert result.exit_code == 0
    expected_output = [
        "Plugins",
        "mytestexternalplugin │ 0.4.4",
        "otherplugin          │ 1.1.1",
    ]
    for c in expected_output:
        assert c in result.stdout



---
File: nebari/tests/tests_unit/test_cli_support.py
---

import tempfile
from pathlib import Path
from typing import List
from unittest.mock import Mock, patch
from zipfile import ZipFile

import kubernetes.client
import kubernetes.client.exceptions
import pytest
import yaml
from typer.testing import CliRunner

from _nebari.cli import create_cli

runner = CliRunner()


class MockPod:
    name: str
    containers: List[str]
    ip_address: str

    def __init__(self, name: str, ip_address: str, containers: List[str]):
        self.name = name
        self.ip_address = ip_address
        self.containers = containers


def mock_list_namespaced_pod(pods: List[MockPod], namespace: str):
    return kubernetes.client.V1PodList(
        items=[
            kubernetes.client.V1Pod(
                metadata=kubernetes.client.V1ObjectMeta(
                    name=p.name, namespace=namespace
                ),
                spec=kubernetes.client.V1PodSpec(
                    containers=[
                        kubernetes.client.V1Container(name=c) for c in p.containers
                    ]
                ),
                status=kubernetes.client.V1PodStatus(pod_ip=p.ip_address),
            )
            for p in pods
        ]
    )


def mock_read_namespaced_pod_log(name: str, namespace: str, container: str):
    return f"Test log entry: {name} -- {namespace} -- {container}"


@pytest.mark.parametrize(
    "args, exit_code, content",
    [
        # --help
        (["--help"], 0, ["Usage:"]),
        (["-h"], 0, ["Usage:"]),
        # error, missing args
        ([], 2, ["Missing option"]),
        (["--config"], 2, ["requires an argument"]),
        (["-c"], 2, ["requires an argument"]),
        (["--output"], 2, ["requires an argument"]),
        (["-o"], 2, ["requires an argument"]),
    ],
)
def test_cli_support_stdout(args: List[str], exit_code: int, content: List[str]):
    app = create_cli()
    result = runner.invoke(app, ["support"] + args)
    assert result.exit_code == exit_code
    for c in content:
        assert c in result.stdout


@patch("kubernetes.config.kube_config.load_kube_config", return_value=Mock())
@patch(
    "kubernetes.client.CoreV1Api",
    return_value=Mock(
        list_namespaced_pod=Mock(
            side_effect=lambda namespace: mock_list_namespaced_pod(
                [
                    MockPod(
                        name="pod-1",
                        ip_address="10.0.0.1",
                        containers=["container-1-1", "container-1-2"],
                    ),
                    MockPod(
                        name="pod-2",
                        ip_address="10.0.0.2",
                        containers=["container-2-1"],
                    ),
                ],
                namespace,
            )
        ),
        read_namespaced_pod_log=Mock(side_effect=mock_read_namespaced_pod_log),
    ),
)
def test_cli_support_happy_path(
    _mock_k8s_corev1api, _mock_config, monkeypatch: pytest.MonkeyPatch
):
    with tempfile.TemporaryDirectory() as tmp:
        # NOTE: The support command leaves the ./log folder behind after running,
        # relative to wherever the tests were run from.
        # Changing context to the tmp dir so this will be cleaned up properly.
        monkeypatch.chdir(Path(tmp).resolve())

        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        with open(tmp_file.resolve(), "w") as f:
            yaml.dump({"project_name": "support", "namespace": "test-ns"}, f)

        assert tmp_file.exists() is True

        app = create_cli()

        log_zip_file = Path(tmp).resolve() / "test-support.zip"
        assert log_zip_file.exists() is False

        result = runner.invoke(
            app,
            [
                "support",
                "--config",
                tmp_file.resolve(),
                "--output",
                log_zip_file.resolve(),
            ],
        )

        assert log_zip_file.exists() is True

        assert 0 == result.exit_code
        assert not result.exception
        assert "log/test-ns" in result.stdout

        # open the zip and check a sample file for the expected formatting
        with ZipFile(log_zip_file.resolve(), "r") as log_zip:
            # expect 1 log file per pod
            assert 2 == len(log_zip.namelist())
            with log_zip.open("log/test-ns/pod-1.txt") as log_file:
                content = str(log_file.read(), "UTF-8")
                # expect formatted header + logs for each container
                expected = """
10.0.0.1\ttest-ns\tpod-1
Container: container-1-1
Test log entry: pod-1 -- test-ns -- container-1-1
Container: container-1-2
Test log entry: pod-1 -- test-ns -- container-1-2
"""
                assert expected.strip() == content.strip()


@patch("kubernetes.config.kube_config.load_kube_config", return_value=Mock())
@patch(
    "kubernetes.client.CoreV1Api",
    return_value=Mock(
        list_namespaced_pod=Mock(
            side_effect=kubernetes.client.exceptions.ApiException(reason="unit testing")
        )
    ),
)
def test_cli_support_error_apiexception(
    _mock_k8s_corev1api, _mock_config, monkeypatch: pytest.MonkeyPatch
):
    with tempfile.TemporaryDirectory() as tmp:
        monkeypatch.chdir(Path(tmp).resolve())

        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        with open(tmp_file.resolve(), "w") as f:
            yaml.dump({"project_name": "support", "namespace": "test-ns"}, f)

        assert tmp_file.exists() is True

        app = create_cli()

        log_zip_file = Path(tmp).resolve() / "test-support.zip"

        result = runner.invoke(
            app,
            [
                "support",
                "--config",
                tmp_file.resolve(),
                "--output",
                log_zip_file.resolve(),
            ],
        )

        assert log_zip_file.exists() is False

        assert 1 == result.exit_code
        assert result.exception
        assert "Reason: unit testing" in str(result.exception)


def test_cli_support_error_missing_config():
    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        app = create_cli()

        result = runner.invoke(app, ["support", "--config", tmp_file.resolve()])

        assert 1 == result.exit_code
        assert result.exception
        assert "nebari-config.yaml does not exist" in str(result.exception)



---
File: nebari/tests/tests_unit/test_cli_upgrade.py
---

import re
import tempfile
from pathlib import Path
from typing import Any, Dict, List

import pytest
import yaml
from rich.prompt import Confirm, Prompt
from typer.testing import CliRunner

import _nebari.upgrade
from _nebari.cli import create_cli
from _nebari.constants import AZURE_DEFAULT_REGION
from _nebari.upgrade import UPGRADE_KUBERNETES_MESSAGE
from _nebari.utils import get_provider_config_block_name
from _nebari.version import rounded_ver_parse

MOCK_KUBERNETES_VERSIONS = {
    "aws": ["1.20"],
    "azure": ["1.20"],
    "gcp": ["1.20"],
}
MOCK_CLOUD_REGIONS = {
    "aws": ["us-east-1"],
    "azure": [AZURE_DEFAULT_REGION],
    "gcp": ["us-central1"],
}


# The upgrade command can't target a previous version that lacks a
# corresponding UpgradeStep subclass. Without these dummy classes, the
# rendered nebari-config.yaml will have the wrong version.
class Test_Cli_Upgrade_2022_10_1(_nebari.upgrade.UpgradeStep):
    version = "2022.10.1"


class Test_Cli_Upgrade_2022_11_1(_nebari.upgrade.UpgradeStep):
    version = "2022.11.1"


class Test_Cli_Upgrade_2023_1_1(_nebari.upgrade.UpgradeStep):
    version = "2023.1.1"


class Test_Cli_Upgrade_2023_4_1(_nebari.upgrade.UpgradeStep):
    version = "2023.4.1"


class Test_Cli_Upgrade_2023_5_1(_nebari.upgrade.UpgradeStep):
    version = "2023.5.1"


# end dummy upgrade classes

runner = CliRunner()


@pytest.mark.parametrize(
    "args, exit_code, content",
    [
        # --help
        (["--help"], 0, ["Usage:"]),
        (["-h"], 0, ["Usage:"]),
        # error, missing args
        ([], 2, ["Missing option"]),
        (["--config"], 2, ["requires an argument"]),
        (["-c"], 2, ["requires an argument"]),
        # only used for old qhub version upgrades
        (
            ["--attempt-fixes"],
            2,
            ["Missing option"],
        ),
    ],
)
def test_cli_upgrade_stdout(args: List[str], exit_code: int, content: List[str]):
    app = create_cli()
    result = runner.invoke(app, ["upgrade"] + args)
    assert result.exit_code == exit_code
    for c in content:
        assert c in result.stdout


def test_cli_upgrade_2022_10_1_to_2022_11_1(monkeypatch: pytest.MonkeyPatch):
    assert_nebari_upgrade_success(monkeypatch, "2022.10.1", "2022.11.1")


def test_cli_upgrade_2022_11_1_to_2023_1_1(monkeypatch: pytest.MonkeyPatch):
    assert_nebari_upgrade_success(monkeypatch, "2022.11.1", "2023.1.1")


def test_cli_upgrade_2023_1_1_to_2023_4_1(monkeypatch: pytest.MonkeyPatch):
    assert_nebari_upgrade_success(monkeypatch, "2023.1.1", "2023.4.1")


def test_cli_upgrade_2023_4_1_to_2023_5_1(monkeypatch: pytest.MonkeyPatch):
    assert_nebari_upgrade_success(
        monkeypatch,
        "2023.4.1",
        "2023.5.1",
        # Have you deleted the Argo Workflows CRDs and service accounts
        inputs=["y"],
    )


@pytest.mark.parametrize(
    "provider",
    ["aws", "azure", "gcp"],
)
def test_cli_upgrade_2023_5_1_to_2023_7_1(
    monkeypatch: pytest.MonkeyPatch, provider: str
):
    config = assert_nebari_upgrade_success(
        monkeypatch, "2023.5.1", "2023.7.1", provider=provider
    )
    prevent_deploy = config.get("prevent_deploy")
    if provider == "aws":
        assert prevent_deploy
    else:
        assert not prevent_deploy


@pytest.mark.parametrize(
    "workflows_enabled, workflow_controller_enabled",
    [(True, True), (True, False), (False, None), (None, None)],
)
def test_cli_upgrade_2023_7_1_to_2023_7_2(
    monkeypatch: pytest.MonkeyPatch,
    workflows_enabled: bool,
    workflow_controller_enabled: bool,
):
    addl_config = {}
    inputs = []

    if workflows_enabled is not None:
        addl_config = {"argo_workflows": {"enabled": workflows_enabled}}
        if workflows_enabled is True:
            inputs.append("y" if workflow_controller_enabled else "n")

    upgraded = assert_nebari_upgrade_success(
        monkeypatch,
        "2023.7.1",
        "2023.7.2",
        addl_config=addl_config,
        # Do you want to enable the Nebari Workflow Controller?
        inputs=inputs,
    )

    if workflows_enabled is True:
        if workflow_controller_enabled:
            assert (
                True
                is upgraded["argo_workflows"]["nebari_workflow_controller"]["enabled"]
            )
        else:
            # not sure this makes sense, the code defaults this to True if missing
            assert "nebari_workflow_controller" not in upgraded["argo_workflows"]
    elif workflows_enabled is False:
        assert False is upgraded["argo_workflows"]["enabled"]
    else:
        # argo_workflows config is missing entirely.
        # This doesn't seem right either: the code defaults to enabled when the
        # block is missing, so why are the prompts skipped?
        assert "argo_workflows" not in upgraded


def test_cli_upgrade_image_tags(monkeypatch: pytest.MonkeyPatch):
    start_version = "2023.5.1"
    end_version = "2023.7.1"

    upgraded = assert_nebari_upgrade_success(
        monkeypatch,
        start_version,
        end_version,
        # number of "y" inputs directly corresponds to how many matching images are found in the yaml
        inputs=["y", "y", "y", "y", "y", "y", "y"],
        addl_config=yaml.safe_load(
            f"""
default_images:
  jupyterhub: quay.io/nebari/nebari-jupyterhub:{start_version}
  jupyterlab: quay.io/nebari/nebari-jupyterlab:{start_version}
  dask_worker: quay.io/nebari/nebari-dask-worker:{start_version}
profiles:
  jupyterlab:
  - display_name: base
    kubespawner_override:
      image: quay.io/nebari/nebari-jupyterlab:{start_version}
  - display_name: gpu
    kubespawner_override:
      image: quay.io/nebari/nebari-jupyterlab-gpu:{start_version}
  - display_name: any-other-version
    kubespawner_override:
      image: quay.io/nebari/nebari-jupyterlab:1955.11.5
  - display_name: leave-me-alone
    kubespawner_override:
      image: quay.io/nebari/leave-me-alone:{start_version}
  dask_worker:
    test:
      image: quay.io/nebari/nebari-dask-worker:{start_version}
"""
        ),
    )

    for _, v in upgraded["default_images"].items():
        assert v.endswith(end_version)

    for profile in upgraded["profiles"]["jupyterlab"]:
        if profile["display_name"] != "leave-me-alone":
            # assume all other images should have been upgraded to the end_version
            assert profile["kubespawner_override"]["image"].endswith(end_version)
        else:
            # this one was selected not to match the regex for nebari images, should have been left alone
            assert profile["kubespawner_override"]["image"].endswith(start_version)

    for _, profile in upgraded["profiles"]["dask_worker"].items():
        assert profile["image"].endswith(end_version)


def test_cli_upgrade_fail_on_missing_file():
    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        app = create_cli()

        result = runner.invoke(app, ["upgrade", "--config", tmp_file.resolve()])

        assert 1 == result.exit_code
        assert result.exception
        assert (
            f"passed in configuration filename={tmp_file.resolve()} must exist"
            in str(result.exception)
        )


def test_cli_upgrade_fail_on_downgrade():
    start_version = "9999.9.9"  # way in the future
    end_version = _nebari.upgrade.__version__

    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        nebari_config = yaml.safe_load(
            f"""
project_name: test
provider: local
domain: test.example.com
namespace: dev
nebari_version: {start_version}
        """
        )

        with open(tmp_file.resolve(), "w") as f:
            yaml.dump(nebari_config, f)

        assert tmp_file.exists() is True
        app = create_cli()

        result = runner.invoke(app, ["upgrade", "--config", tmp_file.resolve()])

        assert 1 == result.exit_code
        assert result.exception
        assert (
            f"already belongs to a later version ({start_version}) than the installed version of Nebari ({end_version})"
            in str(result.exception)
        )

        # make sure the file is unaltered
        with open(tmp_file.resolve(), "r") as c:
            assert yaml.safe_load(c) == nebari_config


def test_cli_upgrade_does_nothing_on_same_version():
    # this test only seems to work against the actual current version, any
    # mocked earlier versions trigger an actual update
    start_version = _nebari.upgrade.__version__

    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        nebari_config = yaml.safe_load(
            f"""
project_name: test
provider: local
domain: test.example.com
namespace: dev
nebari_version: {start_version}
        """
        )

        with open(tmp_file.resolve(), "w") as f:
            yaml.dump(nebari_config, f)

        assert tmp_file.exists() is True
        app = create_cli()

        result = runner.invoke(app, ["upgrade", "--config", tmp_file.resolve()])

        # feels like this should return a non-zero exit code if the upgrade is not happening
        assert 0 == result.exit_code
        assert not result.exception
        assert "up-to-date" in result.stdout

        # make sure the file is unaltered
        with open(tmp_file.resolve(), "r") as c:
            assert yaml.safe_load(c) == nebari_config


def test_cli_upgrade_0_3_12_to_0_4_0(monkeypatch: pytest.MonkeyPatch):
    start_version = "0.3.12"
    end_version = "0.4.0"

    def callback(tmp_file: Path, _result: Any):
        users_import_file = tmp_file.parent / "nebari-users-import.json"
        assert users_import_file.exists()

        return True  # continue with default assertions

    # custom authenticators removed in 0.4.0, should be replaced by password
    upgraded = assert_nebari_upgrade_success(
        monkeypatch,
        start_version,
        end_version,
        addl_args=["--attempt-fixes"],
        addl_config=yaml.safe_load(
            """
security:
  authentication:
    type: custom
    config:
      oauth_callback_url: ""
      scope: ""
  users: {}
  groups:
    users: {}
terraform_modules: []
default_images:
  conda_store: ""
  dask_gateway: ""
"""
        ),
        callback=callback,
    )

    assert "password" == upgraded["security"]["authentication"]["type"]
    assert "" != upgraded["security"]["keycloak"]["initial_root_password"]
    assert "users" not in upgraded["security"]
    assert "groups" not in upgraded["security"]
    assert "config" not in upgraded["security"]["authentication"]
    assert True is upgraded["security"]["shared_users_group"]
    assert "terraform_modules" not in upgraded
    assert {} == upgraded["default_images"]
    assert True is upgraded["prevent_deploy"]


def test_cli_upgrade_to_0_4_0_fails_for_custom_auth_without_attempt_fixes():
    start_version = "0.3.12"

    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        nebari_config = yaml.safe_load(
            f"""
project_name: test
provider: local
domain: test.example.com
namespace: dev
nebari_version: {start_version}
security:
  authentication:
    type: custom
        """
        )

        with open(tmp_file.resolve(), "w") as f:
            yaml.dump(nebari_config, f)

        assert tmp_file.exists() is True
        app = create_cli()

        result = runner.invoke(app, ["upgrade", "--config", tmp_file.resolve()])

        assert 1 == result.exit_code
        assert result.exception
        assert "Custom Authenticators are no longer supported" in str(result.exception)

        # make sure the file is unaltered
        with open(tmp_file.resolve(), "r") as c:
            assert yaml.safe_load(c) == nebari_config


@pytest.mark.skipif(
    rounded_ver_parse(_nebari.upgrade.__version__) < rounded_ver_parse("2023.10.1"),
    reason="This test is only valid for versions >= 2023.10.1",
)
def test_cli_upgrade_to_2023_10_1_cdsdashboard_removed(monkeypatch: pytest.MonkeyPatch):
    start_version = "2023.7.2"
    end_version = "2023.10.1"

    addl_config = yaml.safe_load(
        """
cdsdashboards:
  enabled: true
  cds_hide_user_named_servers: true
  cds_hide_user_dashboard_servers: false
        """
    )

    upgraded = assert_nebari_upgrade_success(
        monkeypatch,
        start_version,
        end_version,
        addl_args=["--attempt-fixes"],
        addl_config=addl_config,
    )

    assert not upgraded.get("cdsdashboards")
    assert upgraded.get("prevent_deploy")


@pytest.mark.skipif(
    rounded_ver_parse(_nebari.upgrade.__version__) < rounded_ver_parse("2023.10.1"),
    reason="This test is only valid for versions >= 2023.10.1",
)
@pytest.mark.parametrize(
    ("provider", "k8s_status"),
    [
        ("aws", "compatible"),
        ("aws", "incompatible"),
        ("aws", "invalid"),
        ("azure", "compatible"),
        ("azure", "incompatible"),
        ("azure", "invalid"),
        ("gcp", "compatible"),
        ("gcp", "incompatible"),
        ("gcp", "invalid"),
    ],
)
def test_cli_upgrade_to_2023_10_1_kubernetes_validations(
    monkeypatch: pytest.MonkeyPatch, provider: str, k8s_status: str
):
    start_version = "2023.7.2"
    end_version = "2023.10.1"
    monkeypatch.setattr(_nebari.upgrade, "__version__", end_version)

    kubernetes_configs = {
        "aws": {"incompatible": "1.19", "compatible": "1.26", "invalid": "badname"},
        "azure": {"incompatible": "1.23", "compatible": "1.26", "invalid": "badname"},
        "gcp": {"incompatible": "1.23", "compatible": "1.26", "invalid": "badname"},
    }

    def mock_input_ask(prompt, *args, **kwargs):
        from _nebari.upgrade import TERRAFORM_REMOVE_TERRAFORM_STAGE_FILES_CONFIRMATION

        # For more about structural pattern matching, see:
        # https://peps.python.org/pep-0636/
        match prompt:
            case str(s) if s == TERRAFORM_REMOVE_TERRAFORM_STAGE_FILES_CONFIRMATION:
                return kwargs.get("attempt_fixes", False)
            case _:
                return kwargs.get("default", False)

    monkeypatch.setattr(Confirm, "ask", mock_input_ask)
    monkeypatch.setattr(
        Prompt,
        "ask",
        lambda x, *args, **kwargs: "",
    )

    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        nebari_config = yaml.safe_load(
            f"""
project_name: test
provider: {provider}
domain: test.example.com
namespace: dev
nebari_version: {start_version}
cdsdashboards:
  enabled: true
  cds_hide_user_named_servers: true
  cds_hide_user_dashboard_servers: false
{get_provider_config_block_name(provider)}:
    region: {MOCK_CLOUD_REGIONS[provider][0]}
    kubernetes_version: {kubernetes_configs[provider][k8s_status]}
        """
        )
        # write the config and close the file before invoking the CLI
        with open(tmp_file.resolve(), "w") as f:
            yaml.dump(nebari_config, f)

        assert tmp_file.exists() is True
        app = create_cli()

        result = runner.invoke(app, ["upgrade", "--config", tmp_file.resolve()])

        if k8s_status == "incompatible":
            UPGRADE_KUBERNETES_MESSAGE_WO_BRACKETS = re.sub(
                r"\[.*?\]", "", UPGRADE_KUBERNETES_MESSAGE
            )
            assert UPGRADE_KUBERNETES_MESSAGE_WO_BRACKETS in result.stdout.replace(
                "\n", ""
            )

        if k8s_status == "compatible":
            assert 0 == result.exit_code
            assert not result.exception
            assert "Saving new config file" in result.stdout

            # load the modified nebari-config.yaml and check the new version has changed
            with open(tmp_file.resolve(), "r") as f:
                upgraded = yaml.safe_load(f)
                assert end_version == upgraded["nebari_version"]

        if k8s_status == "invalid":
            assert (
                f"Unable to detect Kubernetes version for provider {provider}"
                in result.stdout
            )


def assert_nebari_upgrade_success(
    monkeypatch: pytest.MonkeyPatch,
    start_version: str,
    end_version: str,
    provider: str = "local",
    addl_args: List[str] = [],
    addl_config: Dict[str, Any] = {},
    inputs: List[str] = [],
    callback: Any = None,
) -> Dict[str, Any]:
    monkeypatch.setattr(_nebari.upgrade, "__version__", end_version)

    # create a tmp dir and clean up when done
    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        # merge basic config with any test case specific values provided
        nebari_config = {
            **yaml.safe_load(
                f"""
project_name: test
provider: {provider}
domain: test.example.com
namespace: dev
nebari_version: {start_version}
        """
            ),
            **addl_config,
        }

        # write the test nebari-config.yaml file to tmp location
        with open(tmp_file.resolve(), "w") as f:
            yaml.dump(nebari_config, f)

        assert tmp_file.exists() is True
        app = create_cli()

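        if inputs:
            # build a new list instead of appending so the caller's list (and
            # the shared default argument) is never mutated; the empty entry
            # adds the trailing newline for the last input
            inputs = [*inputs, ""]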

        # run nebari upgrade -c tmp/nebari-config.yaml
        result = runner.invoke(
            app,
            ["upgrade", "--config", tmp_file.resolve()] + addl_args,
            input="\n".join(inputs),
        )

        enable_default_assertions = True
        # returned as-is if the callback disables the default assertions
        upgraded: Dict[str, Any] = {}

        if callback is not None:
            enable_default_assertions = callback(tmp_file, result)

        if enable_default_assertions:
            assert 0 == result.exit_code
            assert not result.exception
            assert "Saving new config file" in result.stdout

            # load the modified nebari-config.yaml and check the new version has changed
            with open(tmp_file.resolve(), "r") as f:
                upgraded = yaml.safe_load(f)
                assert end_version == upgraded["nebari_version"]

            # check backup matches original
            backup_file = (
                Path(tmp).resolve() / f"nebari-config.yaml.{start_version}.backup"
            )
            assert backup_file.exists() is True
            with open(backup_file.resolve(), "r") as b:
                backup = yaml.safe_load(b)
                assert backup == nebari_config

        # pass the parsed nebari-config.yaml with upgrade mods back to caller for
        # additional assertions
        return upgraded



---
File: nebari/tests/tests_unit/test_cli_validate.py
---

import re
import shutil
import tempfile
from pathlib import Path
from typing import Any, Dict, List

import pytest
import yaml
from typer.testing import CliRunner

from _nebari._version import __version__
from _nebari.cli import create_cli

TEST_DATA_DIR = Path(__file__).resolve().parent / "cli_validate"

runner = CliRunner()


def _update_yaml_file(file_path: Path, key: str, value: Any):
    """Utility function to update a yaml file with a new key/value pair."""
    with open(file_path, "r") as f:
        yaml_data = yaml.safe_load(f)

    yaml_data[key] = value

    with open(file_path, "w") as f:
        yaml.safe_dump(yaml_data, f)


@pytest.mark.parametrize(
    "args, exit_code, content",
    [
        # --help
        (["--help"], 0, ["Usage:"]),
        (["-h"], 0, ["Usage:"]),
        # error, missing args
        ([], 2, ["Missing option"]),
        (["--config"], 2, ["requires an argument"]),
        (["-c"], 2, ["requires an argument"]),
        (
            ["--enable-commenting"],
            2,
            ["Missing option"],
        ),  # https://github.com/nebari-dev/nebari/issues/1937
    ],
)
def test_cli_validate_stdout(args: List[str], exit_code: int, content: List[str]):
    app = create_cli()
    result = runner.invoke(app, ["validate"] + args)
    assert result.exit_code == exit_code
    for c in content:
        assert c in result.stdout


def generate_test_data_test_cli_validate_local_happy_path():
    """
    Search the cli_validate folder for happy path test cases
    and add them to the parameterized list of inputs for
    test_cli_validate_local_happy_path
    """

    test_data = []
    for f in TEST_DATA_DIR.iterdir():
        if f.is_file() and re.match(
            r"^\w*\.happy.*\.yaml$", f.name
        ):  # sample.happy.optional-description.yaml
            test_data.append(f.name)
    keys = [
        "config_yaml",
    ]
    return {"keys": keys, "test_data": test_data}


def test_cli_validate_local_happy_path(config_yaml: str):
    test_file = TEST_DATA_DIR / config_yaml
    assert test_file.exists() is True

    with tempfile.TemporaryDirectory() as tmpdirname:
        temp_test_file = shutil.copy(test_file, tmpdirname)

        # update the copied test file with the current version if necessary
        _update_yaml_file(temp_test_file, "nebari_version", __version__)

        app = create_cli()
        result = runner.invoke(app, ["validate", "--config", temp_test_file])
        assert not result.exception
        assert 0 == result.exit_code
        assert "Successfully validated configuration" in result.stdout


def test_cli_validate_from_env():
    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        nebari_config = yaml.safe_load(
            """
provider: aws
project_name: test
amazon_web_services:
  region: us-east-1
  kubernetes_version: '1.19'
        """
        )

        with open(tmp_file.resolve(), "w") as f:
            yaml.dump(nebari_config, f)

        assert tmp_file.exists() is True
        app = create_cli()

        valid_result = runner.invoke(
            app,
            ["validate", "--config", tmp_file.resolve()],
            env={"NEBARI_SECRET__amazon_web_services__kubernetes_version": "1.20"},
        )

        assert 0 == valid_result.exit_code
        assert not valid_result.exception
        assert "Successfully validated configuration" in valid_result.stdout

        invalid_result = runner.invoke(
            app,
            ["validate", "--config", tmp_file.resolve()],
            env={"NEBARI_SECRET__amazon_web_services__kubernetes_version": "1.0"},
        )

        assert 1 == invalid_result.exit_code
        assert invalid_result.exception
        assert "Invalid `kubernetes-version`" in invalid_result.stdout


@pytest.mark.parametrize(
    "key, value, provider, expected_message, addl_config",
    [
        ("NEBARI_SECRET__project_name", "123invalid", "local", "validation error", {}),
        (
            "NEBARI_SECRET__this_is_an_error",
            "true",
            "local",
            "Object has no attribute",
            {},
        ),
        (
            "NEBARI_SECRET__amazon_web_services__kubernetes_version",
            "1.0",
            "aws",
            "validation error",
            {
                "amazon_web_services": {
                    "region": "us-east-1",
                    "kubernetes_version": "1.19",
                }
            },
        ),
    ],
)
def test_cli_validate_error_from_env(
    key: str,
    value: str,
    provider: str,
    expected_message: str,
    addl_config: Dict[str, Any],
):
    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        nebari_config = {
            **yaml.safe_load(
                f"""
provider: {provider}
project_name: test
        """
            ),
            **addl_config,
        }

        with open(tmp_file.resolve(), "w") as f:
            yaml.dump(nebari_config, f)

        assert tmp_file.exists() is True
        app = create_cli()

        # confirm the file is otherwise valid without environment variable overrides
        pre = runner.invoke(app, ["validate", "--config", tmp_file.resolve()])
        assert 0 == pre.exit_code
        assert not pre.exception

        # run validate again with environment variables that are expected to trigger
        # validation errors
        result = runner.invoke(
            app, ["validate", "--config", tmp_file.resolve()], env={key: value}
        )

        assert 1 == result.exit_code
        assert result.exception
        assert expected_message in result.stdout


@pytest.mark.parametrize(
    "provider, addl_config",
    [
        (
            "aws",
            {
                "amazon_web_services": {
                    "kubernetes_version": "1.20",
                    "region": "us-east-1",
                }
            },
        ),
        ("azure", {"azure": {"kubernetes_version": "1.20", "region": "Central US"}}),
        (
            "gcp",
            {
                "google_cloud_platform": {
                    "kubernetes_version": "1.20",
                    "region": "us-east1",
                    "project": "test",
                }
            },
        ),
        pytest.param(
            "local",
            {"security": {"authentication": {"type": "Auth0"}}},
            id="auth-provider-auth0",
        ),
        pytest.param(
            "local",
            {"security": {"authentication": {"type": "GitHub"}}},
            id="auth-provider-github",
        ),
    ],
)
def test_cli_validate_error_missing_cloud_env(
    monkeypatch: pytest.MonkeyPatch, provider: str, addl_config: Dict[str, Any]
):
    # cloud methods are all globally mocked, need to reset so the env variables will be checked
    monkeypatch.undo()
    for e in [
        "AWS_ACCESS_KEY_ID",
        "AWS_SECRET_ACCESS_KEY",
        "GOOGLE_CREDENTIALS",
        "PROJECT_ID",
        "ARM_SUBSCRIPTION_ID",
        "ARM_TENANT_ID",
        "ARM_CLIENT_ID",
        "ARM_CLIENT_SECRET",
        "SPACES_ACCESS_KEY_ID",
        "SPACES_SECRET_ACCESS_KEY",
        "AUTH0_CLIENT_ID",
        "AUTH0_CLIENT_SECRET",
        "AUTH0_DOMAIN",
        "GITHUB_CLIENT_ID",
        "GITHUB_CLIENT_SECRET",
    ]:
        # raising=False makes this a no-op when the variable is not set
        monkeypatch.delenv(e, raising=False)

    with tempfile.TemporaryDirectory() as tmp:
        tmp_file = Path(tmp).resolve() / "nebari-config.yaml"
        assert tmp_file.exists() is False

        nebari_config = {
            **yaml.safe_load(
                f"""
provider: {provider}
project_name: test
        """
            ),
            **addl_config,
        }

        with open(tmp_file.resolve(), "w") as f:
            yaml.dump(nebari_config, f)

        assert tmp_file.exists() is True
        app = create_cli()

        result = runner.invoke(app, ["validate", "--config", tmp_file.resolve()])

        assert 1 == result.exit_code
        assert result.exception
        assert "Missing the following required environment variable" in result.stdout


def generate_test_data_test_cli_validate_error():
    """
    Search the cli_validate folder for unhappy path test cases
    and add them to the parameterized list of inputs for
    test_cli_validate_error. Optionally parse an expected
    error message from the file name to assert is present
    in the validate output
    """

    test_data = []
    for f in TEST_DATA_DIR.iterdir():
        if f.is_file():
            m = re.match(r"^\w*\.error\.([\w-]*)\.yaml$", f.name) or re.match(
                r"^\w*\.error\.([\w-]*)\.[\w-]*\.yaml$", f.name
            )  # sample.error.assert-message.optional-description.yaml
            if m:
                test_data.append((f.name, m.groups()[0]))
            elif re.match(r"^\w*\.error\.yaml$", f.name):  # sample.error.yaml
                test_data.append((f.name, None))
    keys = [
        "config_yaml",
        "expected_message",
    ]
    return {"keys": keys, "test_data": test_data}


def test_cli_validate_error(config_yaml: str, expected_message: str):
    test_file = TEST_DATA_DIR / config_yaml
    assert test_file.exists() is True

    app = create_cli()
    result = runner.invoke(app, ["validate", "--config", test_file])

    assert result.exception
    assert 1 == result.exit_code
    assert "ERROR validating configuration" in result.stdout
    if expected_message:
        # since this will usually come from a parsed filename, assume spacing/hyphenation/case is optional
        expected = expected_message.lower()
        stdout = result.stdout.lower()
        assert (expected in stdout) or (
            expected.replace("-", " ").replace("_", " ") in stdout
        )


def pytest_generate_tests(metafunc):
    """
    Dynamically generate test data parameters for test functions by looking for
    and executing an associated generate_test_data_{function_name} if one exists.
    """

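    # look the generator up by name instead of using eval, so that errors raised
    # inside an existing generator are not silently swallowed
    generator = globals().get(f"generate_test_data_{metafunc.function.__name__}")
    if generator is None:
        # expected when a generate_test_data_ function doesn't exist
        return

    td = generator()
    metafunc.parametrize(",".join(td["keys"]), td["test_data"])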



---
File: nebari/tests/tests_unit/test_cli.py
---

import subprocess

import pytest

from _nebari.subcommands.init import InitInputs
from nebari.plugins import nebari_plugin_manager

PROJECT_NAME = "clitest"
DOMAIN_NAME = "clitest.dev"


@pytest.mark.parametrize(
    "namespace, auth_provider, ci_provider, ssl_cert_email",
    (
        [None, None, None, None],
        ["prod", "password", "github-actions", "it@acme.org"],
    ),
)
def test_nebari_init(tmp_path, namespace, auth_provider, ci_provider, ssl_cert_email):
    """Test `nebari init` CLI command."""
    command = [
        "nebari",
        "init",
        "local",
        f"--project={PROJECT_NAME}",
        f"--domain={DOMAIN_NAME}",
        "--disable-prompt",
    ]

    default_values = InitInputs()

    if namespace:
        command.append(f"--namespace={namespace}")
    else:
        namespace = default_values.namespace
    if auth_provider:
        command.append(f"--auth-provider={auth_provider}")
    else:
        auth_provider = default_values.auth_provider
    if ci_provider:
        command.append(f"--ci-provider={ci_provider}")
    else:
        ci_provider = default_values.ci_provider
    if ssl_cert_email:
        command.append(f"--ssl-cert-email={ssl_cert_email}")
    else:
        ssl_cert_email = default_values.ssl_cert_email

    subprocess.run(command, cwd=tmp_path, check=True)

    config = nebari_plugin_manager.read_config(tmp_path / "nebari-config.yaml")

    assert config.namespace == namespace
    assert config.security.authentication.type.lower() == auth_provider
    assert config.ci_cd.type == ci_provider
    assert config.certificate.acme_email == ssl_cert_email


@pytest.mark.parametrize(
    "command",
    (
        ["nebari", "--version"],
        ["nebari", "info"],
    ),
)
def test_nebari_commands_no_args(command):
    # check=True raises CalledProcessError if the command exits with a non-zero status
    subprocess.run(command, check=True, capture_output=True, text=True)



---
File: nebari/tests/tests_unit/test_commons.py
---

from _nebari.provider.cloud.commons import filter_by_highest_supported_k8s_version


def test_filter_by_highest_supported_k8s_version():
    version_to_filter = "99.99"

    k8s_versions = [
        "1.21.7",
        "1.21.9",
        "1.22.4",
        "1.22.6",
        "1.23.3",
        "1.23.5",
        "1.24.0",
        version_to_filter,
    ]
    actual = filter_by_highest_supported_k8s_version(k8s_versions)
    expected = sorted(list(set(k8s_versions) - {version_to_filter}))
    assert actual == expected



---
File: nebari/tests/tests_unit/test_config_set.py
---

from unittest.mock import patch

import pytest
from packaging.requirements import SpecifierSet

from _nebari.config_set import ConfigSetMetadata, read_config_set

test_version = "2024.12.2"


@pytest.mark.parametrize(
    "version_input,test_version,should_pass",
    [
        # Standard version tests
        (">=2024.12.0,<2025.0.0", "2024.12.2", True),
        (SpecifierSet(">=2024.12.0,<2025.0.0"), "2024.12.2", True),
        # Pre-release version requirement tests
        (">=2024.12.0rc1,<2025.0.0", "2024.12.0rc1", True),
        (SpecifierSet(">=2024.12.0rc1"), "2024.12.0rc2", True),
        # Pre-release test version against standard requirement
        (">=2024.12.0,<2025.0.0", "2024.12.1rc1", True),
        (SpecifierSet(">=2024.12.0,<2025.0.0"), "2024.12.1rc1", True),
        # Failing cases
        (">=2025.0.0", "2024.12.2rc1", False),
        (SpecifierSet(">=2025.0.0rc1"), "2024.12.2", False),
    ],
)
def test_version_requirement(version_input, test_version, should_pass):
    metadata = ConfigSetMetadata(name="test-config", nebari_version=version_input)

    if should_pass:
        metadata.check_version(test_version)
    else:
        with pytest.raises(ValueError) as exc_info:
            metadata.check_version(test_version)
        assert "Nebari version" in str(exc_info.value)


def test_read_config_set_valid(tmp_path):
    config_set_yaml = """
    metadata:
      name: test-config
      nebari_version: ">=2024.12.0"
    config:
      key: value
    """
    config_set_filepath = tmp_path / "config_set.yaml"
    config_set_filepath.write_text(config_set_yaml)
    with patch("_nebari.config_set.__version__", "2024.12.2"):
        config_set = read_config_set(str(config_set_filepath))
    assert config_set.metadata.name == "test-config"
    assert config_set.config["key"] == "value"


def test_read_config_set_invalid_version(tmp_path):
    config_set_yaml = """
    metadata:
      name: test-config
      nebari_version: ">=2025.0.0"
    config:
      key: value
    """
    config_set_filepath = tmp_path / "config_set.yaml"
    config_set_filepath.write_text(config_set_yaml)

    with patch("_nebari.config_set.__version__", "2024.12.2"):
        with pytest.raises(ValueError) as exc_info:
            read_config_set(str(config_set_filepath))
        assert "Nebari version" in str(exc_info.value)


if __name__ == "__main__":
    pytest.main()



---
File: nebari/tests/tests_unit/test_config.py
---

import pathlib

import pytest

from _nebari.config import (
    backup_configuration,
    read_configuration,
    set_config_from_environment_variables,
    set_nested_attribute,
    write_configuration,
)


def test_set_nested_attribute():
    data = {"a": {"b": {"c": 1}}}
    set_nested_attribute(data, ["a", "b", "c"], 2)
    assert data["a"]["b"]["c"] == 2

    data = {"a": [1, 2, 3]}
    set_nested_attribute(data, ["a", "1"], 4)
    assert data["a"][1] == 4

    data = {"a": {"1": "value"}}
    set_nested_attribute(data, ["a", "1"], "new_value")
    assert data["a"]["1"] == "new_value"

    class Dummy:
        pass

    obj = Dummy()
    obj.a = Dummy()
    obj.a.b = 1
    set_nested_attribute(obj, ["a", "b"], 2)
    assert obj.a.b == 2

    data = {"a": [{"b": 1}, {"b": 2}]}
    set_nested_attribute(data, ["a", "1", "b"], 3)
    assert data["a"][1]["b"] == 3

    with pytest.raises(Exception):
        set_nested_attribute(data, ["a", "2", "b"], 3)


def test_set_config_from_environment_variables(nebari_config, monkeypatch):
    # monkeypatch restores the environment even if an assertion below fails
    secret_key = "NEBARI_SECRET__namespace"
    secret_value = "test"
    monkeypatch.setenv(secret_key, secret_value)

    secret_key_nested = "NEBARI_SECRET__theme__jupyterhub__welcome"
    secret_value_nested = "Hi from test_set_config_from_environment_variables"
    monkeypatch.setenv(secret_key_nested, secret_value_nested)

    updated_config = set_config_from_environment_variables(
        nebari_config, "NEBARI_SECRET"
    )

    assert updated_config.namespace == secret_value
    assert updated_config.theme.jupyterhub.welcome == secret_value_nested


def test_set_config_from_environment_invalid_secret(nebari_config, monkeypatch):
    # monkeypatch removes the invalid variable again even if the test fails,
    # so it cannot leak into later tests
    invalid_secret_key = "NEBARI_SECRET__nonexistent__attribute"
    monkeypatch.setenv(invalid_secret_key, "some_value")

    with pytest.raises(SystemExit) as excinfo:
        set_config_from_environment_variables(nebari_config, "NEBARI_SECRET")

    assert excinfo.value.code == 1


def test_write_and_read_configuration(nebari_config, tmp_path):
    config_file = tmp_path / "nebari-config.yaml"

    write_configuration(config_file, nebari_config)
    nebari_config_new = read_configuration(config_file, nebari_config.__class__)

    # TODO: determine a way to compare the two objects directly
    assert nebari_config.namespace == nebari_config_new.namespace
    assert (
        nebari_config.theme.jupyterhub.welcome
        == nebari_config_new.theme.jupyterhub.welcome
    )


def test_read_configuration_non_existent_file(nebari_config):
    non_existent_file = pathlib.Path("/path/to/nonexistent/file.yaml")

    with pytest.raises(ValueError, match="does not exist"):
        read_configuration(non_existent_file, nebari_config.__class__)


def test_write_configuration_with_dict(nebari_config, tmp_path):
    config_file = tmp_path / "nebari-config-dict.yaml"
    config_dict = nebari_config.model_dump()

    write_configuration(config_file, config_dict)
    read_config = read_configuration(config_file, nebari_config.__class__)

    # TODO: determine a way to compare the two objects directly
    assert nebari_config.namespace == read_config.namespace
    assert (
        nebari_config.theme.jupyterhub.welcome == read_config.theme.jupyterhub.welcome
    )


def test_backup_non_existent_file(tmp_path):
    non_existent_file = tmp_path / "non_existent_config.yaml"
    backup_configuration(non_existent_file)
    assert not (tmp_path / "non_existent_config.yaml.backup").exists()


def test_backup_existing_file_no_previous_backup(nebari_config, tmp_path):
    config_file = tmp_path / "nebari-config.yaml"
    extrasuffix = "-abc"

    write_configuration(config_file, nebari_config)

    backup_configuration(config_file, extrasuffix)

    assert not config_file.exists()
    assert (tmp_path / f"nebari-config.yaml{extrasuffix}.backup").exists()


def test_backup_existing_file_with_previous_backup(nebari_config, tmp_path):
    fn = "nebari-config.yaml"
    backup_fn = f"{fn}.backup"
    config_file = tmp_path / fn
    backup_file = tmp_path / backup_fn

    write_configuration(config_file, nebari_config)
    write_configuration(backup_file, nebari_config)

    backup_configuration(config_file)

    assert not config_file.exists()
    assert (tmp_path / f"{backup_fn}~1").exists()


def test_backup_multiple_existing_backups(nebari_config, tmp_path):
    fn = "nebari-config.yaml"
    backup_fn = f"{fn}.backup"
    config_file = tmp_path / fn
    backup_file = tmp_path / backup_fn

    # create and write to `nebari-config.yaml` and `nebari-config.yaml.backup`
    write_configuration(config_file, nebari_config)
    write_configuration(backup_file, nebari_config)

    for i in range(1, 5):
        backup_file_i = tmp_path / f"{backup_fn}~{i}"
        # create and write to `nebari-config.yaml.backup~i`
        write_configuration(backup_file_i, nebari_config)

    backup_configuration(config_file)

    assert (tmp_path / f"{backup_fn}~5").exists()



---
File: nebari/tests/tests_unit/test_init.py
---

import pytest

from _nebari.constants import AWS_DEFAULT_REGION
from _nebari.initialize import render_config
from _nebari.stages.bootstrap import CiEnum
from _nebari.stages.kubernetes_keycloak import AuthenticationEnum
from nebari.schema import ProviderEnum


@pytest.mark.parametrize(
    "k8s_version, cloud_provider, expected",
    [
        (None, ProviderEnum.aws, "1.20"),
        ("1.19", ProviderEnum.aws, "1.19"),
    ],
)
def test_render_config(mock_all_cloud_methods, k8s_version, cloud_provider, expected):
    if type(expected) is type and issubclass(expected, Exception):
        # render_config is expected to raise, so there is no config to inspect
        with pytest.raises(expected):
            render_config(
                project_name="test",
                namespace="dev",
                nebari_domain="test.dev",
                cloud_provider=cloud_provider,
                region=AWS_DEFAULT_REGION,
                ci_provider=CiEnum.none,
                auth_provider=AuthenticationEnum.password,
                kubernetes_version=k8s_version,
            )
    else:
        config = render_config(
            project_name="test",
            namespace="dev",
            nebari_domain="test.dev",
            cloud_provider=cloud_provider,
            region=AWS_DEFAULT_REGION,
            ci_provider=CiEnum.none,
            auth_provider=AuthenticationEnum.password,
            kubernetes_version=k8s_version,
        )

        assert (
            config.get("amazon_web_services", {}).get("kubernetes_version") == expected
        )
        assert config["project_name"] == "test"



---
File: nebari/tests/tests_unit/test_links.py
---

import pytest
import requests

from _nebari.constants import AWS_ENV_DOCS, AZURE_ENV_DOCS, GCP_ENV_DOCS

LINKS_TO_TEST = [
    AWS_ENV_DOCS,
    GCP_ENV_DOCS,
    AZURE_ENV_DOCS,
]


@pytest.mark.parametrize("url,status_code", [(url, 200) for url in LINKS_TO_TEST])
def test_links(url, status_code):
    # a timeout keeps the test from hanging indefinitely on an unresponsive host
    response = requests.get(url, timeout=30)
    assert response.status_code == status_code



---
File: nebari/tests/tests_unit/test_render.py
---

import os

from _nebari.stages.bootstrap import CiEnum
from nebari.plugins import nebari_plugin_manager


def test_render_config(nebari_render):
    output_directory, config_filename = nebari_render
    config = nebari_plugin_manager.read_config(config_filename)
    assert {"nebari-config.yaml", "stages", ".gitignore"} <= set(
        os.listdir(output_directory)
    )
    assert {
        "07-kubernetes-services",
        "02-infrastructure",
        "01-terraform-state",
        "05-kubernetes-keycloak",
        "08-nebari-tf-extensions",
        "06-kubernetes-keycloak-configuration",
        "04-kubernetes-ingress",
        "03-kubernetes-initialize",
    }.issubset(os.listdir(output_directory / "stages"))

    assert (
        output_directory / "stages" / f"01-terraform-state/{config.provider.value}"
    ).is_dir()
    assert (
        output_directory / "stages" / f"02-infrastructure/{config.provider.value}"
    ).is_dir()

    if config.ci_cd.type == CiEnum.github_actions:
        assert (output_directory / ".github/workflows/").is_dir()
    elif config.ci_cd.type == CiEnum.gitlab_ci:
        assert (output_directory / ".gitlab-ci.yml").is_file()



---
File: nebari/tests/tests_unit/test_schema.py
---

from contextlib import nullcontext

import pytest
from pydantic import ValidationError

from nebari import schema
from nebari.plugins import nebari_plugin_manager


def test_minimal_schema():
    config = nebari_plugin_manager.config_schema(project_name="test")
    assert config.project_name == "test"
    assert config.storage.conda_store == "200Gi"


def test_minimal_schema_from_file(tmp_path):
    filename = tmp_path / "nebari-config.yaml"
    with filename.open("w") as f:
        f.write("project_name: test\n")

    config = nebari_plugin_manager.read_config(filename)
    assert config.project_name == "test"
    assert config.storage.conda_store == "200Gi"


def test_minimal_schema_from_file_with_env(tmp_path, monkeypatch):
    filename = tmp_path / "nebari-config.yaml"
    with filename.open("w") as f:
        f.write("project_name: test\n")

    monkeypatch.setenv("NEBARI_SECRET__project_name", "env")
    monkeypatch.setenv("NEBARI_SECRET__storage__conda_store", "1000Gi")

    config = nebari_plugin_manager.read_config(filename)
    assert config.project_name == "env"
    assert config.storage.conda_store == "1000Gi"


def test_minimal_schema_from_file_without_env(tmp_path, monkeypatch):
    filename = tmp_path / "nebari-config.yaml"
    with filename.open("w") as f:
        f.write("project_name: test\n")

    monkeypatch.setenv("NEBARI_SECRET__project_name", "env")
    monkeypatch.setenv("NEBARI_SECRET__storage__conda_store", "1000Gi")

    config = nebari_plugin_manager.read_config(filename, read_environment=False)
    assert config.project_name == "test"
    assert config.storage.conda_store == "200Gi"


def test_render_schema(nebari_config):
    assert isinstance(nebari_config, schema.Main)
    assert nebari_config.project_name == f"pytest{nebari_config.provider.value}"
    assert nebari_config.namespace == "dev"


@pytest.mark.parametrize(
    "provider, exception",
    [
        (
            "fake",
            pytest.raises(
                ValueError,
                match="'fake' is not a valid enumeration member; permitted: local, existing, aws, gcp, azure",
            ),
        ),
        ("aws", nullcontext()),
        ("gcp", nullcontext()),
        ("azure", nullcontext()),
        ("existing", nullcontext()),
        ("local", nullcontext()),
    ],
)
def test_provider_validation(config_schema, provider, exception):
    config_dict = {
        "project_name": "test",
        "provider": f"{provider}",
    }
    with exception:
        config = config_schema(**config_dict)
        assert config.provider == provider


@pytest.mark.parametrize(
    "provider, full_name, default_fields",
    [
        ("local", "local", {}),
        ("existing", "existing", {}),
        (
            "aws",
            "amazon_web_services",
            {"region": "us-east-1", "kubernetes_version": "1.18"},
        ),
        (
            "gcp",
            "google_cloud_platform",
            {
                "region": "us-east1",
                "project": "test-project",
                "kubernetes_version": "1.18",
            },
        ),
        (
            "azure",
            "azure",
            {
                "region": "eastus",
                "kubernetes_version": "1.18",
                "storage_account_postfix": "test",
            },
        ),
    ],
)
def test_no_provider(config_schema, provider, full_name, default_fields):
    config_dict = {
        "project_name": "test",
        f"{full_name}": default_fields,
    }
    config = config_schema(**config_dict)
    assert config.provider == provider
    assert full_name in config.model_dump()


def test_multiple_providers(config_schema):
    config_dict = {
        "project_name": "test",
        "local": {},
        "existing": {},
    }
    msg = r"Multiple providers set: \['local', 'existing'\]"
    with pytest.raises(ValidationError, match=msg):
        config_schema(**config_dict)


def test_aws_permissions_boundary(config_schema):
    permissions_boundary = "arn:aws:iam::123456789012:policy/MyBoundaryPolicy"
    config_dict = {
        "project_name": "test",
        "provider": "aws",
        "amazon_web_services": {
            "region": "us-east-1",
            "kubernetes_version": "1.19",
            "permissions_boundary": f"{permissions_boundary}",
        },
    }
    config = config_schema(**config_dict)
    assert config.provider == "aws"
    assert config.amazon_web_services.permissions_boundary == permissions_boundary


@pytest.mark.parametrize("provider", ["local", "existing"])
def test_set_provider(config_schema, provider):
    config_dict = {
        "project_name": "test",
        "provider": provider,
        f"{provider}": {"kube_context": "some_context"},
    }
    config = config_schema(**config_dict)
    assert config.provider == provider
    result_config_dict = config.model_dump()
    assert provider in result_config_dict
    assert result_config_dict[provider]["kube_context"] == "some_context"


def test_provider_config_mismatch_warning(config_schema):
    config_dict = {
        "project_name": "test",
        "provider": "local",
        "existing": {"kube_context": "some_context"},  # <-- Doesn't match the provider
    }
    with pytest.warns(UserWarning, match="configuration defined for other providers"):
        config_schema(**config_dict)



---
File: nebari/tests/tests_unit/test_stages.py
---

import pathlib
from unittest.mock import patch

import pytest

from _nebari.stages.terraform_state import TerraformStateStage
from _nebari.utils import yaml
from _nebari.version import __version__
from nebari import schema
from nebari.plugins import nebari_plugin_manager

HERE = pathlib.Path(__file__).parent


@pytest.fixture
def mock_config():
    with open(HERE / "./cli_validate/local.happy.yaml", "r") as f:
        mock_config_file = yaml.load(f)
        mock_config_file["nebari_version"] = __version__

    config = nebari_plugin_manager.config_schema.model_validate(mock_config_file)
    return config


@pytest.fixture
def terraform_state_stage(mock_config, tmp_path):
    return TerraformStateStage(tmp_path, mock_config)


@patch.object(TerraformStateStage, "get_nebari_config_state")
def test_check_immutable_fields_no_changes(mock_get_state, terraform_state_stage):
    mock_get_state.return_value = terraform_state_stage.config.model_dump()

    # This should not raise an exception
    terraform_state_stage.check_immutable_fields()


@patch.object(TerraformStateStage, "get_nebari_config_state")
def test_check_immutable_fields_mutable_change(
    mock_get_state, terraform_state_stage, mock_config
):
    old_config = mock_config.model_copy(deep=True)
    old_config.namespace = "old-namespace"
    mock_get_state.return_value = old_config.model_dump()

    # This should not raise an exception (namespace is mutable)
    terraform_state_stage.check_immutable_fields()


@patch.object(TerraformStateStage, "get_nebari_config_state")
@patch.object(schema.Main, "model_fields")
def test_check_immutable_fields_immutable_change(
    mock_model_fields, mock_get_state, terraform_state_stage, mock_config
):
    old_config = mock_config.model_copy(deep=True)
    old_config.local = None
    old_config.provider = schema.ProviderEnum.gcp
    mock_get_state.return_value = old_config.model_dump()

    # Mock the provider field to be immutable
    mock_model_fields.__getitem__.return_value.json_schema_extra = {"immutable": True}

    with pytest.raises(ValueError) as exc_info:
        terraform_state_stage.check_immutable_fields()

    assert 'Attempting to change immutable field "provider"' in str(exc_info.value)


@patch.object(TerraformStateStage, "get_nebari_config_state")
def test_check_immutable_fields_no_prior_state(mock_get_state, terraform_state_stage):
    mock_get_state.return_value = None

    # This should not raise an exception
    terraform_state_stage.check_immutable_fields()


@patch.object(TerraformStateStage, "get_nebari_config_state")
def test_check_dict_value_change(mock_get_state, terraform_state_stage, mock_config):
    old_config = mock_config.model_copy(deep=True)
    terraform_state_stage.config.local.node_selectors["worker"].value += "new_value"
    mock_get_state.return_value = old_config.model_dump()

    # should not throw an exception
    terraform_state_stage.check_immutable_fields()


@patch.object(TerraformStateStage, "get_nebari_config_state")
def test_check_list_change(mock_get_state, terraform_state_stage, mock_config):
    old_config = mock_config.model_copy(deep=True)
    old_config.environments["environment-dask.yaml"].channels.append("defaults")
    mock_get_state.return_value = old_config.model_dump()

    # should not throw an exception
    terraform_state_stage.check_immutable_fields()


@patch.object(TerraformStateStage, "get_nebari_config_state")
def test_check_immutable_fields_old_nebari_version(
    mock_get_state, terraform_state_stage, mock_config
):
    old_config = mock_config.model_copy(deep=True).model_dump()
    old_config["nebari_version"] = "2024.7.1"  # Simulate an old version
    mock_get_state.return_value = old_config

    # This should not raise an exception
    terraform_state_stage.check_immutable_fields()


@patch.object(TerraformStateStage, "get_nebari_config_state")
def test_check_immutable_fields_change_dict_any(
    mock_get_state, terraform_state_stage, mock_config
):
    old_config = mock_config.model_copy(deep=True).model_dump()
    # Change the value of a config deep in 'overrides' block
    old_config["jupyterhub"]["overrides"]["singleuser"]["extraEnv"][
        "TEST_ENV"
    ] = "new_value"
    mock_get_state.return_value = old_config

    # This should not raise an exception
    terraform_state_stage.check_immutable_fields()



---
File: nebari/tests/tests_unit/test_upgrade.py
---

from contextlib import nullcontext
from pathlib import Path

import pytest
from rich.prompt import Confirm, Prompt

from _nebari.upgrade import do_upgrade
from _nebari.version import __version__, rounded_ver_parse
from nebari.plugins import nebari_plugin_manager


@pytest.fixture
def qhub_users_import_json():
    return (
        (
            Path(__file__).parent
            / "./qhub-config-yaml-files-for-upgrade/qhub-users-import.json"
        )
        .read_text()
        .rstrip()
    )


class MockKeycloakAdmin:
    @staticmethod
    def get_client_id(*args, **kwargs):
        return "test-client"

    @staticmethod
    def create_client_role(*args, **kwargs):
        return "test-client-role"

    @staticmethod
    def get_client_role_id(*args, **kwargs):
        return "test-client-role-id"

    @staticmethod
    def get_role_by_id(*args, **kwargs):
        return bytearray("test-role-id", "utf-8")

    @staticmethod
    def get_groups(*args, **kwargs):
        return []

    @staticmethod
    def get_client_role_groups(*args, **kwargs):
        return []

    @staticmethod
    def assign_group_client_roles(*args, **kwargs):
        pass


@pytest.mark.parametrize(
    "old_qhub_config_path_str,attempt_fixes,expect_upgrade_error",
    [
        (
            "./qhub-config-yaml-files-for-upgrade/qhub-config-aws-310.yaml",
            False,
            False,
        ),
        (
            "./qhub-config-yaml-files-for-upgrade/qhub-config-aws-310-customauth.yaml",
            False,
            True,
        ),
        (
            "./qhub-config-yaml-files-for-upgrade/qhub-config-aws-310-customauth.yaml",
            True,
            False,
        ),
    ],
)
def test_upgrade_4_0(
    old_qhub_config_path_str,
    attempt_fixes,
    expect_upgrade_error,
    tmp_path,
    qhub_users_import_json,
    monkeypatch,
):
    def mock_input(prompt, **kwargs):
        from _nebari.upgrade import TERRAFORM_REMOVE_TERRAFORM_STAGE_FILES_CONFIRMATION

        # Mock different upgrade steps prompt answers
        if prompt == "Have you deleted the Argo Workflows CRDs and service accounts?":
            return True
        elif (
            prompt
            == "\nDo you want Nebari to update the kube-prometheus-stack CRDs and delete the prometheus-node-exporter for you? If not, you'll have to do it manually."
        ):
            return False
        elif (
            prompt
            == "Have you backed up your custom dashboards (if necessary), deleted the prometheus-node-exporter daemonset and updated the kube-prometheus-stack CRDs?"
        ):
            return True
        elif (
            prompt
            == "[bold]Would you like Nebari to assign the corresponding role/scopes to all of your current groups automatically?[/bold]"
        ):
            return False
        elif prompt == TERRAFORM_REMOVE_TERRAFORM_STAGE_FILES_CONFIRMATION:
            return attempt_fixes
        # All other prompts will be answered with "y"
        else:
            return True

    monkeypatch.setattr(Confirm, "ask", mock_input)
    monkeypatch.setattr(Prompt, "ask", lambda x: "")

    from kubernetes import config as _kube_config
    from kubernetes.client import ApiextensionsV1Api as _ApiextensionsV1Api
    from kubernetes.client import AppsV1Api as _AppsV1Api
    from kubernetes.client import CoreV1Api as _CoreV1Api
    from kubernetes.client import V1Status as _V1Status

    def monkey_patch_delete_crd(*args, **kwargs):
        return _V1Status(code=200)

    def monkey_patch_delete_namespaced_sa(*args, **kwargs):
        return _V1Status(code=200)

    def monkey_patch_list_namespaced_daemon_set(*args, **kwargs):
        class MonkeypatchApiResponse:
            items = False

        return MonkeypatchApiResponse

    monkeypatch.setattr(
        _kube_config,
        "load_kube_config",
        lambda *args, **kwargs: None,
    )
    monkeypatch.setattr(
        _kube_config,
        "list_kube_config_contexts",
        lambda *args, **kwargs: [None, {"context": {"cluster": "test"}}],
    )
    monkeypatch.setattr(
        _ApiextensionsV1Api,
        "delete_custom_resource_definition",
        monkey_patch_delete_crd,
    )
    monkeypatch.setattr(
        _CoreV1Api,
        "delete_namespaced_service_account",
        monkey_patch_delete_namespaced_sa,
    )
    monkeypatch.setattr(
        _ApiextensionsV1Api,
        "read_custom_resource_definition",
        lambda *args, **kwargs: True,
    )
    monkeypatch.setattr(
        _ApiextensionsV1Api,
        "patch_custom_resource_definition",
        lambda *args, **kwargs: True,
    )
    monkeypatch.setattr(
        _AppsV1Api,
        "list_namespaced_daemon_set",
        monkey_patch_list_namespaced_daemon_set,
    )

    from _nebari import upgrade as _upgrade

    def monkey_patch_get_keycloak_admin(*args, **kwargs):
        return MockKeycloakAdmin()

    monkeypatch.setattr(
        _upgrade,
        "get_keycloak_admin",
        monkey_patch_get_keycloak_admin,
    )

    old_qhub_config_path = Path(__file__).parent / old_qhub_config_path_str

    tmp_qhub_config = Path(tmp_path, old_qhub_config_path.name)
    tmp_qhub_config.write_text(old_qhub_config_path.read_text())  # Copy contents to tmp

    orig_contents = tmp_qhub_config.read_text()  # Read in initial contents

    assert not Path(tmp_path, "qhub-users-import.json").exists()

    # Do the upgrade
    if not expect_upgrade_error:
        do_upgrade(
            tmp_qhub_config, attempt_fixes
        )  # Would raise an error if invalid by current Nebari version's standards
    else:
        with pytest.raises(ValueError):
            do_upgrade(tmp_qhub_config, attempt_fixes)
        return

    # Check the resulting YAML
    config = nebari_plugin_manager.read_config(tmp_qhub_config)

    assert len(config.security.keycloak.initial_root_password) == 16
    assert not hasattr(config.security, "users")
    assert not hasattr(config.security, "groups")

    __rounded_version__ = rounded_ver_parse(__version__)

    # Check image versions have been bumped up
    assert (
        config.default_images.jupyterhub
        == f"quansight/nebari-jupyterhub:v{__rounded_version__}"
    )
    assert (
        config.profiles.jupyterlab[0].kubespawner_override.image
        == f"quansight/nebari-jupyterlab:v{__rounded_version__}"
    )
    assert config.security.authentication.type != "custom"

    # Keycloak import users json
    assert (
        Path(tmp_path, "nebari-users-import.json").read_text().rstrip()
        == qhub_users_import_json
    )

    # Check backup
    tmp_qhub_config_backup = Path(tmp_path, f"{old_qhub_config_path.name}.old.backup")

    assert orig_contents == tmp_qhub_config_backup.read_text()


@pytest.mark.parametrize(
    "version_str, exception",
    [
        ("1.0.0", nullcontext()),
        ("1.cool.0", pytest.raises(ValueError, match=r"Invalid version string .*")),
        ("0,1.0", pytest.raises(ValueError, match=r"Invalid version string .*")),
        ("", pytest.raises(ValueError, match=r"Invalid version string .*")),
        (
            "1.0.0-rc1",
            pytest.raises(
                AssertionError,
                match=r"Invalid version .*: must be a full release version, not a dev/prerelease/postrelease version",
            ),
        ),
        (
            "1.0.0dev1",
            pytest.raises(
                AssertionError,
                match=r"Invalid version .*: must be a full release version, not a dev/prerelease/postrelease version",
            ),
        ),
    ],
)
def test_version_string(new_upgrade_cls, version_str, exception):
    with exception:

        class DummyUpgrade(new_upgrade_cls):
            version = version_str


def test_duplicated_version(new_upgrade_cls):
    duplicated_version = "1.2.3"
    with pytest.raises(
        AssertionError, match=rf"Duplicate UpgradeStep version {duplicated_version}"
    ):

        class DummyUpgrade(new_upgrade_cls):
            version = duplicated_version

        class DummyUpgrade2(new_upgrade_cls):
            version = duplicated_version

        class DummyUpgrade3(new_upgrade_cls):
            version = "1.2.4"



---
File: nebari/tests/tests_unit/test_utils.py
---

import pytest

from _nebari.utils import JsonDiff, JsonDiffEnum, byte_unit_conversion, deep_merge


@pytest.mark.parametrize(
    "value, from_unit, to_unit, expected",
    [
        (1, "", "B", 1),
        (1, "B", "B", 1),
        (1, "KB", "B", 1000),
        (1, "K", "B", 1000),
        (1, "k", "b", 1000),
        (1, "MB", "B", 1000**2),
        (1, "GB", "B", 1000**3),
        (1, "TB", "B", 1000**4),
        (1, "KiB", "B", 1024),
        (1, "MiB", "B", 1024**2),
        (1, "GiB", "B", 1024**3),
        (1, "TiB", "B", 1024**4),
        (1000, "B", "KB", 1),
        (1000, "KB", "K", 1000),
        (1000, "K", "KB", 1000),
        (1000, "MB", "KB", 1000**2),
        (1000, "GB", "KB", 1000**3),
        (1000, "TB", "KB", 1000**4),
        (1000, "KiB", "KB", 1024),
        (1000, "Ki", "KB", 1024),
        (1000, "Ki", "K", 1024),
        (1000, "MiB", "KB", 1024**2),
        (1000, "GiB", "KB", 1024**3),
        (1000, "TiB", "KB", 1024**4),
        (1000**2, "B", "MB", 1),
        (1000**2, "KB", "MB", 1000),
        (1000**2, "MB", "MB", 1000**2),
        (1000**2, "GB", "MB", 1000**3),
        (1000**2, "TB", "MB", 1000**4),
        (1000**2, "MiB", "MB", 1024**2),
        (1000**3, "B", "GB", 1),
        (1000**3, "KB", "GB", 1000),
    ],
)
def test_byte_unit_conversion(value, from_unit, to_unit, expected):
    assert byte_unit_conversion(f"{value} {from_unit}", to_unit) == expected


def test_JsonDiff_diff():
    obj1 = {"a": 1, "b": {"c": 2, "d": 3}}
    obj2 = {"a": 1, "b": {"c": 3, "e": 4}, "f": 5}
    diff = JsonDiff(obj1, obj2)
    assert diff.diff == {
        "b": {
            "e": {JsonDiffEnum.ADDED: 4},
            "c": {JsonDiffEnum.MODIFIED: (2, 3)},
            "d": {JsonDiffEnum.REMOVED: 3},
        },
        "f": {JsonDiffEnum.ADDED: 5},
    }


def test_JsonDiff_modified():
    obj1 = {"a": 1, "b": {"!": 2, "-": 3}, "+": 4}
    obj2 = {"a": 1, "b": {"!": 3, "+": 4}, "+": 5}
    diff = JsonDiff(obj1, obj2)
    modifieds = diff.modified()
    assert sorted(modifieds) == sorted([(["b", "!"], 2, 3), (["+"], 4, 5)])


def test_deep_merge_order_preservation_dict():
    value_1 = {
        "a": [1, 2],
        "b": {"c": 1, "z": [5, 6]},
        "e": {"f": {"g": {}}},
        "m": 1,
    }

    value_2 = {
        "a": [3, 4],
        "b": {"d": 2, "z": [7]},
        "e": {"f": {"h": 1}},
        "m": [1],
    }

    expected_result = {
        "a": [1, 2, 3, 4],
        "b": {"c": 1, "z": [5, 6, 7], "d": 2},
        "e": {"f": {"g": {}, "h": 1}},
        "m": 1,
    }

    result = deep_merge(value_1, value_2)
    assert result == expected_result
    assert list(result.keys()) == list(expected_result.keys())
    assert list(result["b"].keys()) == list(expected_result["b"].keys())
    assert list(result["e"]["f"].keys()) == list(expected_result["e"]["f"].keys())


def test_deep_merge_order_preservation_list():
    value_1 = {
        "a": [1, 2],
        "b": {"c": 1, "z": [5, 6]},
    }

    value_2 = {
        "a": [3, 4],
        "b": {"d": 2, "z": [7]},
    }

    expected_result = {
        "a": [1, 2, 3, 4],
        "b": {"c": 1, "z": [5, 6, 7], "d": 2},
    }

    result = deep_merge(value_1, value_2)
    assert result == expected_result
    assert result["a"] == expected_result["a"]
    assert result["b"]["z"] == expected_result["b"]["z"]


def test_deep_merge_single_dict():
    value_1 = {
        "a": [1, 2],
        "b": {"c": 1, "z": [5, 6]},
    }

    expected_result = value_1

    result = deep_merge(value_1)
    assert result == expected_result
    assert list(result.keys()) == list(expected_result.keys())
    assert list(result["b"].keys()) == list(expected_result["b"].keys())


def test_deep_merge_empty():
    expected_result = {}

    result = deep_merge()
    assert result == expected_result



---
File: nebari/tests/tests_unit/utils.py
---

from functools import partial

from _nebari.initialize import render_config

DEFAULT_TERRAFORM_STATE = "remote"

DEFAULT_GH_REPO = "github.com/test/test"
render_config_partial = partial(
    render_config,
    repository=DEFAULT_GH_REPO,
    repository_auto_provision=False,
    auth_auto_provision=False,
    terraform_state=DEFAULT_TERRAFORM_STATE,
    disable_prompt=True,
)
INIT_INPUTS = [
    # project, namespace, domain, cloud_provider, ci_provider, auth_provider
    ("pytestaws", "dev", "aws.nebari.dev", "aws", "github-actions", "github"),
    ("pytestgcp", "dev", "gcp.nebari.dev", "gcp", "github-actions", "github"),
    ("pytestazure", "dev", "azure.nebari.dev", "azure", "github-actions", "github"),
]

NEBARI_CONFIG_FN = "nebari-config.yaml"
PRESERVED_DIR = "preserved_dir"



---
File: nebari/tests/__init__.py
---




---
File: nebari/tests/conftest.py
---

pytest_plugins = ["tests.common.playwright_fixtures"]



---
File: nebari/tests/utils.py
---

from functools import partial

from _nebari.initialize import render_config

DEFAULT_TERRAFORM_STATE = "remote"

DEFAULT_GH_REPO = "github.com/test/test"
render_config_partial = partial(
    render_config,
    repository=DEFAULT_GH_REPO,
    repository_auto_provision=False,
    auth_auto_provision=False,
    terraform_state=DEFAULT_TERRAFORM_STATE,
    disable_prompt=True,
)
INIT_INPUTS = [
    # project, namespace, domain, cloud_provider, ci_provider, auth_provider
    ("pytestaws", "dev", "aws.nebari.dev", "aws", "github-actions", "github"),
    ("pytestgcp", "dev", "gcp.nebari.dev", "gcp", "github-actions", "github"),
    ("pytestazure", "dev", "azure.nebari.dev", "azure", "github-actions", "github"),
]

NEBARI_CONFIG_FN = "nebari-config.yaml"
PRESERVED_DIR = "preserved_dir"



---
File: nebari/.cirun.yml
---

# Self-Hosted Github Action Runners on AWS via Cirun.io
# Reference: https://docs.cirun.io/reference/yaml
runners:
  - name: run-k8s-tests
    # Cloud Provider: AWS
    cloud: aws
    # Instance Type has 8 vcpu, 32 GiB memory, Up to 5 Gbps Network Performance
    instance_type: t3a.2xlarge
    # Custom AMI with docker/cypress/hub pre-installed
    machine_image: ami-0a388df278199ff52
    # Region: Oregon
    region: us-west-2
    # Use Spot Instances for cost savings
    preemptible:
      - true
      - false
    labels:
      - cirun-runner



---
File: nebari/.pre-commit-config.yaml
---

# pre-commit is a tool to perform a predefined set of tasks manually and/or
# automatically before git commits are made.
#
# Config reference: https://pre-commit.com/#pre-commit-configyaml---top-level
#
# Common tasks
#
# - Register git hooks: pre-commit install --install-hooks
# - Run on all files:   pre-commit run --all-files
#
# These pre-commit hooks are run as CI.
#
# NOTE: if it can be avoided, add configs/args in pyproject.toml or below instead of creating a new `.config.file`.
# https://pre-commit.ci/#configuration
ci:
  autoupdate_schedule: monthly
  autofix_commit_msg: |
    [pre-commit.ci] Apply automatic pre-commit fixes
  # this does not work on pre-commit ci
  skip: [terraform_fmt]

repos:
  # general
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: end-of-file-fixer
        exclude: "^docs-sphinx/cli.html"
      - id: trailing-whitespace
        exclude: "^docs-sphinx/cli.html"
      - id: check-json
      - id: check-yaml
        args: [--allow-multiple-documents]
      - id: check-toml
      # Lint: Checks that non-binary executables have a proper shebang.
      - id: check-executables-have-shebangs
        exclude: "^src/_nebari/template/"

  - repo: https://github.com/crate-ci/typos
    rev: typos-dict-v0.12.4
    hooks:
      - id: typos

  - repo: https://github.com/codespell-project/codespell
    rev: v2.4.1
    hooks:
      - id: codespell
        args:
          [
            "--write",
          ]
        language: python
        additional_dependencies:
        - tomli

  # python
  - repo: https://github.com/psf/black
    rev: 25.1.0
    hooks:
      - id: black
        args: ["--line-length=88", "--exclude=/src/_nebari/template/"]

  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.9.4
    hooks:
      - id: ruff
        args: ["--fix"]

  - repo: https://github.com/pycqa/isort
    rev: 6.0.0
    hooks:
      - id: isort
        name: isort
        additional_dependencies: [toml]
        files: \.py$
        args: ["--profile", "black"]

  # terraform
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.97.2
    hooks:
      - id: terraform_fmt
        args:
          - --args=-write=true



---
File: nebari/CODE_OF_CONDUCT.md
---

# Nebari Code of Conduct

Nebari is a community-oriented and community-led project.
We value the participation of every member of our community and want to ensure that every contributor has an enjoyable and fulfilling experience.
We are committed to creating a friendly and respectful place for learning, teaching, and contributing.
Accordingly, everyone who participates in the Nebari project is expected to show respect and courtesy to other community members at all times.

You can find the [Nebari Code of Conduct on the `nebari-dev/governance` repository](https://github.com/nebari-dev/governance/blob/main/CODE_OF_CONDUCT.md) and it applies to all spaces managed by Nebari including, but not limited to, in-person and online focus groups and workshops, and communications online via GitHub.



---
File: nebari/CONTRIBUTING.md
---

# Contributing to Nebari

Welcome 👋🏼!

Thanks for being interested in contributing to Nebari. We’re glad you want to join this community!
Open source doesn’t always have the best reputation for being friendly and welcoming, but the Nebari team truly believes
that everyone belongs in open source, and we are dedicated to making you feel welcome.

All contributions are welcome, including issues, contributing code, new docs as well as updates and tweaks, blog posts,
helping out people, organizing community events, working on accessibility and design items, and more.
Continue reading to learn what the community can do for you and what you can do for the community.
By contributing to open source projects you can connect with people, learn new skills, become a subject-matter expert,
and apply all learnings to your own projects.

> **Note**
> Our detailed contribution guidelines can be found [on Nebari's main documentation site][nebari-community].
> Make sure to check the guidelines before you start contributing.

## Reporting issues

When reporting issues, please include as much detail as possible about your operating system, Nebari version, and dependency versions.
Whenever possible, also include a brief, self-contained code example that demonstrates the problem.
This [blog post by Matthew Rocklin](https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports) is a good primer on how to craft minimal bug reports.
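
The details requested above can be gathered with a few lines of Python. The following is a minimal sketch, not an official Nebari tool: the helper name `collect_report_details` is hypothetical, and it assumes Nebari was installed as the `nebari` distribution (for example with `pip install nebari`).

```python
import platform
import sys
from importlib import metadata


def collect_report_details(package="nebari"):
    """Return a dict of environment details that are useful in a bug report."""
    try:
        package_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the current environment.
        package_version = "not installed"
    return {
        "os": platform.platform(),
        "python": sys.version.split()[0],
        package: package_version,
    }


if __name__ == "__main__":
    for name, value in collect_report_details().items():
        print(f"{name}: {value}")
```

Running the script prints one `name: value` line per detail, which can be pasted into the issue alongside the minimal example.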

- Use the [Nebari issue tracker][nebari-issues] for issues, bug reports, and feature requests for Nebari.
- Use the [Nebari documentation issue tracker][nebari-docs-issues] for documentation-related improvements.
- Read more about [best practices for issue creation](https://www.nebari.dev/docs/community/file-issues) in our community docs.

<!-- Links -->

[nebari-docs-issues]: https://github.com/nebari-dev/nebari-docs/issues
[nebari-issues]: https://github.com/nebari-dev/nebari/issues
[nebari-community]: https://www.nebari.dev/community/introduction



---
File: nebari/pyproject.toml
---

### Build ###
[build-system]
requires = ["hatchling", "hatch-vcs"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = [
    "src/_nebari",
    "src/nebari",
]

[tool.hatch.version]
source = "vcs"

[tool.hatch.build.hooks.vcs]
version-file = "src/_nebari/_version.py"
local_scheme = "node-and-timestamp"


### Project ###
[project]
name = "nebari"
dynamic = ["version"]
description = "A Jupyter and Dask-powered open source data science platform."
readme = "README.md"
requires-python = ">=3.10"
license = "BSD-3-Clause"
authors = [
    { name = "Nebari development team", email = "internal-it@quansight.com" },
]
keywords = [
    "aws",
    "gcp",
    "do",
    "azure",
    "nebari",
    "dask",
    "jupyter",
]
classifiers = [
    "Development Status :: 5 - Production/Stable",
    "Intended Audience :: Developers",
    "Topic :: Software Development :: Build Tools",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
    "Programming Language :: Python :: 3.13",
    "Intended Audience :: Education",
    "Intended Audience :: Information Technology",
    "Intended Audience :: Science/Research",
    "Intended Audience :: System Administrators",
    "Framework :: Jupyter :: JupyterLab",
]

dependencies = [
    "auth0-python==4.7.1",
    "azure-identity==1.12.0",
    "azure-mgmt-containerservice==26.0.0",
    "azure-mgmt-resource==23.0.1",
    "bcrypt==4.0.1",
    "boto3==1.34.63",
    "cloudflare==2.11.7",
    "google-auth==2.31.0",
    "google-cloud-compute==1.19.1",
    "google-cloud-container==2.49.0",
    "google-cloud-iam==2.15.1",
    "google-cloud-storage==2.18.0",
    "grpc-google-iam-v1==0.13.1",
    "jinja2",
    "kubernetes==27.2.0",
    "pluggy==1.3.0",
    "prompt-toolkit==3.0.36",
    "pydantic==2.9.2",
    "pynacl==1.5.0",
    "python-keycloak>=3.9.0,<4.0.0",
    "questionary==2.0.0",
    "requests-toolbelt==1.0.0",
    "rich==13.5.1",
    "ruamel.yaml==0.18.6",
    "typer==0.9.0",
    "packaging==23.2",
    "typing-extensions>=4.11.0",
]

[project.optional-dependencies]
dev = [
    "black==22.3.0",
    "coverage[toml]",
    "dask-gateway",
    "escapism",
    "importlib-metadata<5.0",
    "mypy==1.6.1",
    "paramiko",
    "pre-commit",
    "pytest-cov",
    "pytest-playwright",
    "pytest-timeout",
    "pytest",
    "python-dotenv",
    "python-hcl2",
    "setuptools==63.4.3",
    "tqdm",
]
docs = [
    "sphinx",
    "sphinx_click",
]

[project.urls]
Documentation = "https://www.nebari.dev/docs/welcome"
Source = "https://github.com/nebari-dev/nebari"

[project.scripts]
nebari = "nebari.__main__:main"

[tool.mypy]
warn_return_any = true
warn_unused_configs = true
files = [
    "src/_nebari",
    "src/nebari",
]
exclude = [
    "src/_nebari/stages/kubernetes_services/template" # skip traitlets configuration files
]

[[tool.mypy.overrides]]
module = [
    "auth0.authentication",
    "auth0.management",
    "CloudFlare",
    "kubernetes",
    "kubernetes.client",
    "kubernetes.config",
    "kubernetes.client.rest",
    "kubernetes.client.exceptions",
    "keycloak",
    "keycloak.exceptions",
    "boto3",
    "botocore.exceptions",
]
ignore_missing_imports = true

[tool.ruff]
extend-exclude = [
    "src/_nebari/template",
    "home",
    "__pycache__"
]

[tool.ruff.lint]
select = [
    "E",  # E: pycodestyle rules
    "F",  # F: pyflakes rules
    "PTH",  # PTH: flake8-use-pathlib rules
]
ignore = [
    "E501", # Line too long
    "F821", # Undefined name
    "PTH123", # open() should be replaced by Path.open()
]

[tool.coverage.run]
branch = true

[tool.coverage.report]
# Regexes for lines to exclude from consideration
exclude_also = [
    # Don't complain about missing debug-only code:
    "def __repr__",
    "if self\\.debug",

    # Don't complain if tests don't hit defensive assertion code:
    "raise AssertionError",
    "raise NotImplementedError",

    # Don't complain if non-runnable code isn't run:
    "if 0:",
    "if __name__ == .__main__.:",

    # Don't complain about abstract methods, they aren't run:
    "@(abc\\.)?abstractmethod",
    ]
ignore_errors = false

[tool.typos]
files.extend-exclude = ["_build", "*/build/*", "*/node_modules/*", "nebari.egg-info", "*.git", "*.js", "*.json", "*.yaml", "*.yml", "pre-commit-config.yaml"]
default.extend-ignore-re = ["(?Rm)^.*(#|//)\\s*typos: ignore$"]
default.extend-ignore-words-re = ["ask", "ASK"]
default.check-filename = true

[tool.codespell]
# Ref: https://github.com/codespell-project/codespell#using-a-config-file
skip = '_build,*/build/*,*/node_modules/*,nebari.egg-info,*.git,package-lock.json,*.lock'
check-hidden = true
ignore-regex = '^\s*"image/\S+": ".*'
ignore-words-list = 'ask'



---
File: nebari/pytest.ini
---

[pytest]
addopts =
    # show tests that (f)ailed, (E)rror, or (X)passed in the summary  # typos: ignore
    -rfEX
    # Make tracebacks shorter
    --tb=native
    # turn warnings into errors
    -Werror
markers =
    gpu: test gpu working properly
    preemptible: test preemptible instances
testpaths =
    tests
xfail_strict = True

log_format =  %(asctime)s %(levelname)9s %(lineno)4s %(module)s: %(message)s
log_date_format = %Y-%m-%d %H:%M:%S
log_cli = True
log_cli_level = INFO



---
File: nebari/README.md
---

<p align="center">
<picture>
  <source media="(prefers-color-scheme: light)" srcset="https://raw.githubusercontent.com/nebari-dev/nebari-design/main/logo-mark/horizontal/Nebari-Logo-Horizontal-Lockup.svg">
  <source media="(prefers-color-scheme: dark)" srcset="https://raw.githubusercontent.com/nebari-dev/nebari-design/main/logo-mark/horizontal/Nebari-Logo-Horizontal-Lockup-White-text.svg">
  <img alt="Nebari logo mark - text will be black in light color mode and white in dark color mode." src="https://raw.githubusercontent.com/nebari-dev/nebari-design/main/logo-mark/horizontal/Nebari-Logo-Horizontal-Lockup-White-text.svg" width="50%"/>
</picture>
</p>

<h1 align="center"> Your open source data science platform. Built for scale, designed for collaboration. </h1>

---

| Information | Links |
| :---------- | :-----|
|   Project   | [![License](https://img.shields.io/badge/License-BSD%203--Clause-gray.svg?colorA=2D2A56&colorB=5936D9&style=flat.svg)](https://opensource.org/licenses/BSD-3-Clause) [![Nebari documentation](https://img.shields.io/badge/%F0%9F%93%96%20Read-the%20docs-gray.svg?colorA=2D2A56&colorB=5936D9&style=flat.svg)](https://www.nebari.dev/docs/welcome) [![PyPI](https://img.shields.io/pypi/v/nebari)](https://badge.fury.io/py/nebari) [![conda version](https://img.shields.io/conda/vn/conda-forge/nebari)](https://anaconda.org/conda-forge/nebari)  |
|  Community  | [![GH discussions](https://img.shields.io/badge/%F0%9F%92%AC%20-Participate%20in%20discussions-gray.svg?colorA=2D2A56&colorB=5936D9&style=flat.svg)](https://github.com/nebari-dev/nebari/discussions) [![Open an issue](https://img.shields.io/badge/%F0%9F%93%9D%20Open-an%20issue-gray.svg?colorA=2D2A56&colorB=5936D9&style=flat.svg)](https://github.com/nebari-dev/nebari/issues/new/choose) [![Community guidelines](https://img.shields.io/badge/🤝%20Community-guidelines-gray.svg?colorA=2D2A56&colorB=5936D9&style=flat.svg)](https://www.nebari.dev/docs/community/) |
|     CI      | [![Kubernetes Tests](https://github.com/nebari-dev/nebari/actions/workflows/test_local_integration.yaml/badge.svg)](https://github.com/nebari-dev/nebari/actions/workflows/test_local_integration.yaml) [![Tests](https://github.com/nebari-dev/nebari/actions/workflows/test.yaml/badge.svg)](https://github.com/nebari-dev/nebari/actions/workflows/test.yaml) [![Test Nebari Provider](https://github.com/nebari-dev/nebari/actions/workflows/test-provider.yaml/badge.svg)](https://github.com/nebari-dev/nebari/actions/workflows/test-provider.yaml)|
| Cloud Providers | [![AWS Deployment Status](https://github.com/nebari-dev/nebari/actions/workflows/test_aws_integration.yaml/badge.svg)](https://github.com/nebari-dev/nebari/actions/workflows/test_aws_integration.yaml) [![Azure Deployment Status](https://github.com/nebari-dev/nebari/actions/workflows/test_azure_integration.yaml/badge.svg)](https://github.com/nebari-dev/nebari/actions/workflows/test_azure_integration.yaml) [![GCP Deployment Status](https://github.com/nebari-dev/nebari/actions/workflows/test_gcp_integration.yaml/badge.svg)](https://github.com/nebari-dev/nebari/actions/workflows/test_gcp_integration.yaml)|

## Table of contents

- [Table of contents](#table-of-contents)
- [Nebari](#nebari)
  - [Cloud Providers ☁️](#cloud-providers-️)
- [Installation 💻](#installation-)
  - [Pre-requisites](#pre-requisites)
  - [Install Nebari](#install-nebari)
- [Usage 🚀](#usage-)
- [Nebari HPC](#nebari-hpc)
- [Contributing to Nebari 👩🏻‍💻](#contributing-to-nebari-)
  - [Installing the Development version of Nebari ⚙️](#installing-the-development-version-of-nebari-️)
  - [Questions? 🤔](#questions-)
- [Code of Conduct 📖](#code-of-conduct-)
- [Ongoing Support](#ongoing-support)
- [License](#license)

> **⚠️ Warning ⚠️**
> The `2023.10.1` release includes the initial implementation of a [Pluggy-based](https://pluggy.readthedocs.io/en/stable/) extension mechanism. For more details, refer to the [plugins documentation](https://www.nebari.dev/docs/community/plugins).
> This version also fully deprecates CDS Dashboards, as it is no longer compatible with newer versions of JupyterHub.
> For more details on all of the changes included in this release, please refer to our [release notes](./RELEASE.md).
> After you've installed version `2023.10.1`, you can update your `nebari-config.yaml` by running `nebari upgrade -c nebari-config.yaml`. Please
> follow the upgrade instructions output by this command.
> And please make sure to [back up your data before attempting an upgrade](https://www.nebari.dev/docs/how-tos/manual-backup).

Automated data science platform. From [JupyterHub](https://jupyter.org/hub "Multi-user version of the Notebook") to Cloud environments with
[Dask Gateway](https://docs.dask.org/ "Parallel computing in Python").

Nebari is an open source data platform that enables users to build and maintain cost-effective and scalable compute platforms
on [HPC](#nebari-hpc) or [Kubernetes](#nebari) with minimal DevOps overhead.

**This repository details the [Nebari](https://nebari.dev/ "Official Nebari docs") (Kubernetes) version.**

Not sure what to choose? Check out our documentation on [choosing a deployment platform](https://www.nebari.dev/docs/get-started/deploy).

## Nebari

The Kubernetes version of Nebari uses [Terraform](https://www.terraform.io/), [Helm](https://helm.sh/), and
[GitHub Actions](https://docs.github.com/en/free-pro-team@latest/actions).

- Terraform handles the build, change, and versioning of the infrastructure.
- Helm helps to define, install, and manage [Kubernetes](https://kubernetes.io/ "Automated container deployment, scaling, and management") resources.
- GitHub Actions is used to automatically create commits when the configuration file (`nebari-config.yaml`) is rendered,
  as well as to kick off the deployment action.

Nebari aims to abstract all these complexities for its users.
Hence, it is not necessary to know any of the technologies mentioned above to have your project successfully deployed.

> TLDR: If you know GitHub and feel comfortable generating and using API keys, you should have all it takes to deploy and maintain your system without the need for a dedicated
> DevOps team. No need to learn Kubernetes, Terraform, or Helm.

### Cloud Providers ☁️

Nebari offers out-of-the-box support for the major public cloud providers:
Amazon [AWS](https://aws.amazon.com/), [GCP](https://cloud.google.com/ "Google Cloud Platform"), and Microsoft [Azure](https://azure.microsoft.com/en-us/).
![High-level illustration of Nebari architecture](https://raw.githubusercontent.com/nebari-dev/nebari-docs/main/docs/static/img/welcome/nebari_overview_sequence.png)

## Installation 💻

### Pre-requisites

- Operating System: Currently, Nebari supports development on macOS and Linux operating systems. Windows is NOT supported.
  However, we would welcome contributions that add and improve support for Windows.
- You need Python >= 3.10 on your local machine or virtual environment to work on Nebari.
- Adopting virtual environments ([`conda`](https://docs.conda.io/en/latest/), [`pipenv`](https://github.com/pypa/pipenv) or
  [`venv`](https://docs.python.org/3/library/venv.html)) is also encouraged.
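
As a quick sanity check of the pre-requisites above, the following minimal sketch tests the interpreter version and operating system before you install anything. The helper names (`python_is_supported`, `os_is_supported`) are hypothetical and not part of Nebari; only the Python >= 3.10 and macOS/Linux requirements come from this README.

```python
import platform
import sys

# Minimum interpreter version stated in the pre-requisites above.
MIN_PYTHON = (3, 10)
# platform.system() returns "Darwin" on macOS and "Linux" on Linux.
SUPPORTED_SYSTEMS = {"Linux", "Darwin"}


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def os_is_supported(system=None) -> bool:
    """Return True on macOS or Linux (hypothetical helper, not a Nebari API)."""
    name = platform.system() if system is None else system
    return name in SUPPORTED_SYSTEMS


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python {current} supported: {python_is_supported()}")
    print(f"{platform.system()} supported: {os_is_supported()}")
```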

### Install Nebari

To install Nebari, use one of the following methods from your command line:

- Install using `conda`:

  ```bash
  conda install -c conda-forge nebari

  # if you prefer using mamba
  mamba install -c conda-forge nebari
  ```

- Install using `pip`:

  ```bash
  pip install nebari
  ```

Once finished, you can confirm the installation and list the available CLI commands and options (including `--version`) by typing:

```bash
nebari --help
```

If successful, the CLI output will be similar to the following:

```bash
usage: nebari [-h] [-v] {deploy,destroy,render,init,validate} ...

Nebari command line

positional arguments:
  {deploy,destroy,render,init,validate}
                        Nebari

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         Nebari version
```

## Usage 🚀

Nebari requires setting multiple environment variables to automate the deployments fully.
For details on obtaining those variables, check the [Nebari Get started documentation][docs-get-started].

Once all the necessary credentials are gathered and set as [UNIX environment variables](https://linuxize.com/post/how-to-set-and-list-environment-variables-in-linux/), Nebari can be
deployed in minutes.
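
Before deploying, it can help to confirm that the credentials are actually visible to your shell session. The sketch below is one way to do that; it is not part of the Nebari CLI. The variable names in `REQUIRED_VARS` are an assumed example for an AWS deployment, and the authoritative list for each provider is in the Get started documentation linked above.

```python
import os

# Assumed example list: the real variable names depend on your cloud
# provider and are documented in the Nebari "Get started" guide.
REQUIRED_VARS = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"]


def missing_variables(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables(REQUIRED_VARS)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All listed environment variables are set.")
```

Empty values are reported as missing on purpose, since an exported but blank credential usually fails in the same way as an unset one.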

For detailed step-by-step instructions on how to deploy Nebari, check the [Nebari documentation][docs-deploy].

## Nebari HPC

An HPC version of Nebari is currently not available. There is one under development for Nebari's precursor QHub.
Curious? Check out the [QHub HPC](https://github.com/Quansight/qhub-hpc) repository.

## Contributing to Nebari 👩🏻‍💻

Thinking about contributing? Check out our [Contribution Guidelines](CONTRIBUTING.md) to get started.

### Installing the Development version of Nebari ⚙️

To install the latest developer version (unstable) use:

```bash
pip install git+https://github.com/nebari-dev/nebari.git
```

### Questions? 🤔

Have a look at our [Frequently Asked Questions (FAQ)][nebari-faqs] to see if your query has been answered.

Getting help:

- [GitHub Discussions][gh-discussions] is our user forum. It can be used to raise discussions about a subject,
    such as: "What is the recommended way to do _X_ with Nebari?"
- [Issues][nebari-issues] for queries, bug reporting, feature requests, documentation, etc.

> We work around the clock to make Nebari better, but sometimes your query might take a while to get a reply. We
> apologize in advance and ask you to please be patient :pray:.

## Code of Conduct 📖

To guarantee a welcoming and friendly community, we require all community members to follow our [Code of Conduct](https://github.com/Quansight/.github/blob/master/CODE_OF_CONDUCT.md).

## Ongoing Support

If you're using Nebari and would like professional support, please get in touch with the Nebari development team.

## License

[Nebari is BSD3 licensed](LICENSE).

<!-- links -->
[nebari-issues]: https://github.com/nebari-dev/nebari/issues
[nebari-faqs]: https://www.nebari.dev/docs/faq
[gh-discussions]: https://github.com/nebari-dev/nebari/discussions
[docs-get-started]: https://www.nebari.dev/docs/get-started
[docs-deploy]: https://www.nebari.dev/docs/get-started/deploy



---
File: nebari/RELEASE.md
---

# Release notes

_Contains description of Nebari releases._

<!-- Note:
The RELEASE.md file at the root of the Nebari codebase is the source of truth for all release notes.
If you want to update the release notes, open a PR against nebari-dev/nebari.
This file is copied to nebari-dev/nebari-docs using a GitHub Action. -->

---

## Release 2025.2.1 - February 7, 2025

> NOTE: In this release, we have updated our maximum supported Kubernetes version from
> 1.29 to 1.31. We strongly recommend updating the Kubernetes version
> specified in your nebari-config YAML file and redeploying to apply the changes.
>
> Remember that Kubernetes minor versions must be upgraded incrementally (1.29 → 1.30 →
> 1.31).
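
The incremental rule in the note above can be sketched as a small helper that lists each intermediate minor version to set and redeploy, in order. This is an illustration only: `upgrade_path` and `parse_minor` are hypothetical names, not Nebari functions, and the sketch assumes both versions share the same major number.

```python
def parse_minor(version: str) -> tuple[int, int]:
    """Split 'X.Y' or 'X.Y.Z' into (major, minor)."""
    parts = version.split(".")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected X.Y or X.Y.Z, got {version!r}")
    return int(parts[0]), int(parts[1])


def upgrade_path(current: str, target: str) -> list[str]:
    """List every minor version to deploy, in order, to reach `target`."""
    cur_major, cur_minor = parse_minor(current)
    tgt_major, tgt_minor = parse_minor(target)
    if cur_major != tgt_major:
        raise ValueError("this sketch only handles one major version")
    if tgt_minor < cur_minor:
        raise ValueError("target must not be older than current")
    # One step per minor version: never skip from 1.29 straight to 1.31.
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]


if __name__ == "__main__":
    print(upgrade_path("1.29", "1.31"))  # ['1.30', '1.31']
```

Each entry in the returned list corresponds to one edit of the Kubernetes version in the config file followed by one redeploy.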

### What's Changed
- fix bug to allow --import-plugin to work by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2864
- Add azure kubernetes policy add-on by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2888
- Yaml config sets by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/287-
- Add ability to list user installed plugins from the CLI by @soapy1 in https://github.com/nebari-dev/nebari/pull/2891
- [ENH] - Include "--attempt-fixes" flag from Nebari upgrade CLI in upgrade steps logic by @smokestacklightnin in https://github.com/nebari-dev/nebari/pull/2839
- add authorized ip range variable for azure by @dcmcand in https://github.com/nebari-dev/nebari/pull/2880
- Upgrade conda-store to 2024.11.2 by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2815
- Handle default value for azure addon policy by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2905
- Update conda-store-ui tests for updated page by @soapy1 in https://github.com/nebari-dev/nebari/pull/2911
- Remove unintended character at the end of the TF_LOG variable by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2912
- Update k8s max version by @dcmcand in https://github.com/nebari-dev/nebari/pull/290-
- [ENH] - Use GitHub secrets instead of Vault by @smokestacklightnin in https://github.com/nebari-dev/nebari/pull/2889
- adds info command text display & change the order of command display by @kernel-loophole in https://github.com/nebari-dev/nebari/pull/2916
- `2025.1.1` Upgrade step and version bump by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2924
- Retrieve all conda-store environments by @soapy1 in https://github.com/nebari-dev/nebari/pull/291-
- [BUG] - Make sure to get envs when the number of envs is less than page limit by @soapy1 in https://github.com/nebari-dev/nebari/pull/2939
- Fix Playwright CI errors & update local instructions by @viniciusdc in https://github.com/nebari-dev/nebari/pull/294-
- Update conda-store-server image + use public auth_schema module for AuthenticationToken by @soapy1 in https://github.com/nebari-dev/nebari/pull/2931-

### New Contributors

- @soapy1 made their first contribution in https://github.com/nebari-dev/nebari/pull/2891
- @smokestacklightnin made their first contribution in https://github.com/nebari-dev/nebari/pull/2839
- @kernel-loophole made their first contribution in https://github.com/nebari-dev/nebari/pull/2916

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2024.12.1...2025.2.1

## Release 2024.12.1 - December 13, 2024

> NOTE: Support for DigitalOcean has been removed in this release. If you plan to deploy Nebari on DigitalOcean, you first need to independently create a Kubernetes cluster and then use the `existing` deployment option.

### What's Changed
- Precommit typos by @blakerosenthal in https://github.com/nebari-dev/nebari/pull/2731
- fix typo in KubernetesCredentials by @blakerosenthal in https://github.com/nebari-dev/nebari/pull/2729
- handle branch rename from develop to main in github actions by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2748
- remove do integration test by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2765
- Remove old develop branch references after default branch renaming by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2769
- fix CICD issue with pre-commit action by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2775
- fix CHECK_URL in kuberhealthy checks to respect namespaces by @dcmcand in https://github.com/nebari-dev/nebari/pull/2779
- remove duplicate GCPPrivateClusterConfig class by @dcmcand in https://github.com/nebari-dev/nebari/pull/2786
- Fix hub variable for jupyterhub_dashboard by @kenafoster in https://github.com/nebari-dev/nebari/pull/2721
- Fix Pytest Tests failing on PRs updating src by @joneszc in https://github.com/nebari-dev/nebari/pull/2790
- Add ability to add overrides to jhub-apps config by @aktech in https://github.com/nebari-dev/nebari/pull/2754
- Remove leftover develop reference by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2792
- fix bug where check_immutable_fields throws error with old version of Nebari by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2796
- Fix immutable field validation error when a sub-schema is not Pydantic by @kenafoster in https://github.com/nebari-dev/nebari/pull/2797
- Address issue with AWS instance type schema by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2787
- add broken note by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2802
- Fix release notes formatting to restore docs syncing functionality by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2809
- Refactor role creation for upgrade command path by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2795
- add test workflow for upgrade by @pmeier in https://github.com/nebari-dev/nebari/pull/2780
- Add config option to enable the encryption of AWS EKS secrets by @joneszc in https://github.com/nebari-dev/nebari/pull/2788
- remove digital ocean tests by @dcmcand in https://github.com/nebari-dev/nebari/pull/2813
- Python3 13 upgrade dependencies by @dcmcand in https://github.com/nebari-dev/nebari/pull/2823
- Test support for Python 3.13 in CI by @aktech in https://github.com/nebari-dev/nebari/pull/2774
- remove unmaintained nix files by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2831
- allow passing X.XX or X.XX.XX as k8s versions by @dcmcand in https://github.com/nebari-dev/nebari/pull/2840
- Remove explicit branch inputs from cloud integration test workflows in GHA by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2837
- Allow overriding of keycloak root credentials for `2024.11.1` upgrade path by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2843
- Added security group rule descriptions by @jcbolling in https://github.com/nebari-dev/nebari/pull/2850
- Set `launch_template.ami_id` attrs to private by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2842
- attempt to address paramiko connection errors by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2811
- specify terraform registry for providers not in opentofu registry by @dcmcand in https://github.com/nebari-dev/nebari/pull/2852
- Disable AWS `launch_template` from nebari-config schema by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2856
- Remove Digital Ocean references by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2838
- Use tofu binary instead of terraform one by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2773
- Add 2024.11.1 release notes and bump version by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2859
- Disable `jupyterlab-jhub-apps` extension when jhub-apps is disabled by @krassowski in https://github.com/nebari-dev/nebari/pull/2804
- Validate instance types for GCP by @blakerosenthal in https://github.com/nebari-dev/nebari/pull/2730
- update gcp instance validation by @dcmcand in https://github.com/nebari-dev/nebari/pull/2875

### New Contributors
- @jcbolling made their first contribution in https://github.com/nebari-dev/nebari/pull/2850

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2024.11.1...2024.12.1

## Release 2024.11.1 - November 21, 2024 (Hotfix Release)

> NOTE: This hotfix addresses several major bugs identified in the 2024.9.1 release. For a detailed overview, please refer to the related discussion at #2798. Users should upgrade directly from 2024.7.1 to 2024.11.1.

### What's Changed

- fix `CHECK_URL` in kuberhealthy checks to respect namespaces by @dcmcand in https://github.com/nebari-dev/nebari/pull/2779
- fix bug where `check_immutable_fields` throws error with old version of Nebari by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2796
- Fix immutable field validation error when a sub-schema is not Pydantic by @kenafoster in https://github.com/nebari-dev/nebari/pull/2797
- Address issue with AWS instance type schema by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2787
- Add broken note by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2802
- Refactor role creation for upgrade command path by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2795
- Allow overriding of keycloak root credentials for 2024.11.1 upgrade path #2843
- Disable AWS `launch_template` from nebari-config schema #2856

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2024.9.1...2024.11.1

## Release 2024.9.1 - September 27, 2024 (Broken Release)

> WARNING: This release was later found to have unresolved issues described further in [issue 2798](https://github.com/nebari-dev/nebari/issues/2798). We have marked this release as broken on conda-forge and yanked it on PyPI. One of the bugs prevents any upgrade from 2024.9.1 to 2024.11.1. Users should skip this release entirely and upgrade directly from 2024.7.1 to 2024.11.1.
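
The guidance above (skip 2024.9.1 and go from 2024.7.1 straight to 2024.11.1) can be expressed as a small pre-upgrade check. This is a hedged sketch, not something the Nebari CLI provides: `BROKEN_RELEASES`, `version_key`, and `check_upgrade` are hypothetical names, and the set of broken releases is simply copied from these notes.

```python
# Releases flagged as broken in these release notes.
BROKEN_RELEASES = {"2024.9.1"}


def version_key(version: str) -> tuple[int, ...]:
    """Turn 'YYYY.M.N' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def check_upgrade(current: str, target: str) -> str:
    """Return a short verdict for upgrading from `current` to `target`."""
    if target in BROKEN_RELEASES:
        return f"Do not upgrade to {target}: it is marked as broken."
    if current in BROKEN_RELEASES:
        return f"{current} is marked as broken; see the release notes before upgrading."
    if version_key(target) <= version_key(current):
        return "Nothing to upgrade."
    return f"OK to upgrade from {current} to {target}."


if __name__ == "__main__":
    print(check_upgrade("2024.7.1", "2024.11.1"))
    print(check_upgrade("2024.7.1", "2024.9.1"))
```

Versions are compared as integer tuples rather than strings because a plain string comparison would sort `2024.11.1` before `2024.7.1`.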

> WARNING: This release changes how group directories are mounted in JupyterLab pods: only groups with specific permissions will have their directories mounted. If you rely on custom group mounts, we strongly recommend running `nebari upgrade` before updating. This will prompt you to confirm how Nebari should handle your groups—either keep them mounted or allow unmounting. **No data will be lost**, and you can reverse this anytime.

### What's Changed

- Fix: KeyValueDict error when deploying to existing infrastructure by @oftheaxe in https://github.com/nebari-dev/nebari/pull/2560
- Remove unused AWS terraform modules by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2623
- Upgrade Hashicorp Vault action by @aktech in https://github.com/nebari-dev/nebari/pull/2616
- Pass `oauth_no_confirm=True` to jhub-apps by @krassowski in https://github.com/nebari-dev/nebari/pull/2631
- Use Rook Ceph for Jupyterhub and Conda Store drives by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2541
- Fix typo in guided init by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2635
- Action var tests off by @BrianCashProf in https://github.com/nebari-dev/nebari/pull/2632
- add a "moved" block to account for refactored terraform code without deleting/recreating NFS disks by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2639
- Use Helm Chart for JupyterHub 5.1.0 by @krassowski in https://github.com/nebari-dev/nebari/pull/2661
- Add a how to test section to PR template by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2659
- Support disallowed nebari config changes by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2660
- Fix converted init command in guided init by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2666
- Add initial uptime metrics by @dcmcand in https://github.com/nebari-dev/nebari/pull/2609
- Refactor and extend Playwright tests by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2644
- Remove Cypress remaining tests/files by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2672
- refactor jupyterhub user token retrieval within pytest by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2645
- add moved block to account for terraform changes on AWS only by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2673
- Refactor shared group mounting using RBAC by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2593
- Dashboard fix usage report by @kenafoster in https://github.com/nebari-dev/nebari/pull/2671
- only capture stdout not stdout+stderr when capture_output=True by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2704
- revert breaking change to azure deployment test by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2706
- Refactor GitOps approach prompt flow in guided init by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2269
- template the kustomization.yaml file by @dcmcand in https://github.com/nebari-dev/nebari/pull/2667
- Fix auto-provisioned GitHub repo description after guided init by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2708
- Add amazon_web_services configuration option to specify EKS cluster api server endpoint access setting by @joneszc in https://github.com/nebari-dev/nebari/pull/2618
- Use Google Auth and Cloud Python APIs instead of `gcloud` CLI by @swastik959 in https://github.com/nebari-dev/nebari/pull/2083
- fix broken links in README.md, SECURITY.md, and CONTRIBUTING.md by @blakerosenthal in https://github.com/nebari-dev/nebari/pull/2720
- add test for changing dicts and lists by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2724
- 2024.9.1 upgrade notes by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2726
- Add Support for AWS Launch Template Configuration by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2668
- Run terraform init before running terraform show by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2734
- Release Process Checklist Updates by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2727
- Test implicit aiohttp's TCP to HTTP connector change by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2741
- remove comments by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2743
- Deploy Rook Ceph Helm only when Ceph FS Needed by @kenafoster in https://github.com/nebari-dev/nebari/pull/2742
- fix group mounting paths by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2738
- Add compatibility prompt and notes for shared group mounting by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2739

### New Contributors

- @oftheaxe made their first contribution in https://github.com/nebari-dev/nebari/pull/2560
- @joneszc made their first contribution in https://github.com/nebari-dev/nebari/pull/2618
- @swastik959 made their first contribution in https://github.com/nebari-dev/nebari/pull/2083
- @blakerosenthal made their first contribution in https://github.com/nebari-dev/nebari/pull/2720

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2024.7.1...2024.9.1

## Release 2024.7.1 - August 8, 2024

> NOTE: Support for Digital Ocean deployments using CLI commands and related Terraform modules is being deprecated. Although Digital Ocean will no longer be directly supported in future releases, you can still deploy to Digital Ocean infrastructure using the current `existing` deployment option.

### What's Changed

- Enable authentication by default in jupyter-server by @krassowski in https://github.com/nebari-dev/nebari/pull/2288
- remove dns sleep by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2550
- Conda-store permissions v2 + load roles from keycloak by @aktech in https://github.com/nebari-dev/nebari/pull/2531
- Restrict public access and add bucket encryption using cmk by @dcmcand in https://github.com/nebari-dev/nebari/pull/2525
- Add overwrite to AWS coredns addon by @dcmcand in https://github.com/nebari-dev/nebari/pull/2538
- Add a default roles at initialisation by @aktech in https://github.com/nebari-dev/nebari/pull/2546
- Hide gallery section if no exhibits are configured by @krassowski in https://github.com/nebari-dev/nebari/pull/2549
- Add note about ~/.bash_profile by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2575
- Expose jupyterlab-gallery branch and depth options by @krassowski in https://github.com/nebari-dev/nebari/pull/2556
- #2566 Upgrade Jupyterhub ssh image by @arjxn-py in https://github.com/nebari-dev/nebari/pull/2576
- Stop copying unnecessary files into user home directory by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2578
- Include deprecation notes for init/deploy subcommands by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2582
- Only download jar if file doesn't exist by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2588
- Remove unnecessary experimental flag by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2606
- Add typos spell checker to pre-commit by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2568
- Enh 2451 skip conditionals by @BrianCashProf in https://github.com/nebari-dev/nebari/pull/2569
- Improve codespell support: adjust and concentrate config to pyproject.toml and fix more typos by @yarikoptic in https://github.com/nebari-dev/nebari/pull/2583
- Move codespell config to pyproject.toml only by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2611
- Add `depends_on` for bucket encryption by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2615

### New Contributors

- @BrianCashProf made their first contribution in https://github.com/nebari-dev/nebari/pull/2569
- @yarikoptic made their first contribution in https://github.com/nebari-dev/nebari/pull/2583

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2024.6.1...2024.7.1

## Release 2024.6.1 - June 26, 2024

> NOTE: This release includes an upgrade to the `kube-prometheus-stack` Helm chart, resulting in a newer version of Grafana. When upgrading your Nebari cluster, you will be prompted to have Nebari update some CRDs and delete a DaemonSet on your behalf. If you prefer, you can also run the commands yourself, which will be shown to you. If you have any custom dashboards, you'll also need to back them up by [exporting them as JSON](https://grafana.com/docs/grafana/latest/dashboards/share-dashboards-panels/#export-a-dashboard-as-json), so you can [import them](https://grafana.com/docs/grafana/latest/dashboards/build-dashboards/import-dashboards/#import-a-dashboard) after upgrading.

### What's Changed

- Fetch JupyterHub roles from Keycloak by @krassowski in https://github.com/nebari-dev/nebari/pull/2447
- Update selector for Start server button to use button tag by @krassowski in https://github.com/nebari-dev/nebari/pull/2464
- Reduce GCP Fixed Costs by 50% by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2453
- Restore JupyterHub updates from PR-2427 by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2465
- Workload identity by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2460
- Fix test using a non-specific selector by @krassowski in https://github.com/nebari-dev/nebari/pull/2475
- add verify=false since we use self signed cert in tests by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2481
- fix forward auth when using custom cert by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2479
- Upgrade to JupyterHub 5.0.0b2 by @krassowski in https://github.com/nebari-dev/nebari/pull/2468
- upgrade instructions for PR 2453 by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2466
- Use Helm Chart for JupyterHub 5.0.0 final by @krassowski in https://github.com/nebari-dev/nebari/pull/2484
- Parse and insert keycloak roles scopes into JupyterHub by @aktech in https://github.com/nebari-dev/nebari/pull/2471
- Add CITATION file by @pavithraes in https://github.com/nebari-dev/nebari/pull/2455
- CI: add azure integration by @fangchenli in https://github.com/nebari-dev/nebari/pull/2061
- Create trivy.yml by @dcmcand in https://github.com/nebari-dev/nebari/pull/2458
- don't run azure deployment on PRs, only on schedule and manual trigger by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2498
- add cloud provider deployment status badges to README.md by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2407
- Upgrade kube-prometheus-stack helm chart by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2472
- upgrade note by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2502
- Remove VSCode from jhub_apps default services by @jbouder in https://github.com/nebari-dev/nebari/pull/2503
- Explicit config by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2294
- fix general node scaling bug for azure by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2517
- Skip running cleanup on pull requests by @aktech in https://github.com/nebari-dev/nebari/pull/2488
- 1792 Add docstrings to `upgrade.py` by @arjxn-py in https://github.com/nebari-dev/nebari/pull/2512
- set's min TLS version for azure storage account to TLS 1.2 by @dcmcand in https://github.com/nebari-dev/nebari/pull/2522
- Fix conda-store and Traefik Grafana Dashboards by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2540
- Implement support for jupyterlab-gallery config by @krassowski in https://github.com/nebari-dev/nebari/pull/2501
- Add option to run CRDs updates and DaemonSet deletion on user's behalf. by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2544

### New Contributors

- @arjxn-py made their first contribution in https://github.com/nebari-dev/nebari/pull/2512

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2024.5.1...2024.6.1

## Release 2024.5.1 - May 13, 2024

### What's Changed

- make userscheduler run on general node group by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2415
- Upgrade to Pydantic V2 by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2348
- Pydantic2 PR fix by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2421
- remove redundant pydantic class, fix bug by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2426
- Update `python-keycloak` version pins constraints by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2435
- add HERA_TOKEN env var to user pods by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2438
- fix docs link by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2443
- Update allowed admin groups by @aktech in https://github.com/nebari-dev/nebari/pull/2429

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2024.4.1...2024.5.1

## Release 2024.4.1 - April 20, 2024

### What's Changed

- update azurerm version by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2370
- Get JupyterHub `groups` from Keycloak, support `oauthenticator` 16.3+ by @krassowski in https://github.com/nebari-dev/nebari/pull/2361
- add full names for cloud providers in guided init by @exitflynn in https://github.com/nebari-dev/nebari/pull/2375
- Add middleware to prefix JupyterHub navbar items with /hub. by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2360
- CLN: split #1928, refactor render test by @fangchenli in https://github.com/nebari-dev/nebari/pull/2246
- add trailing slash for jupyterhub proxy paths by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2387
- remove references to deprecated cdsdashboards by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2390
- add default node groups to config by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2398
- Update concurrency settings for Integration tests by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2393
- Make CI/CD Cloud Provider Test Conditional by @tylergraff in https://github.com/nebari-dev/nebari/pull/2369

### New Contributors

- @exitflynn made their first contribution in https://github.com/nebari-dev/nebari/pull/2375

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2024.3.3...2024.4.1

## Release 2024.3.3 - March 27, 2024

### What's Changed

- get default variable value when following a terraform variable by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2322
- Upgrade Actions versions by @isumitjha in https://github.com/nebari-dev/nebari/pull/2291
- Cleanup spawner logs by @krassowski in https://github.com/nebari-dev/nebari/pull/2328
- Fix loki gateway url when deployed on non-dev namespace by @aktech in https://github.com/nebari-dev/nebari/pull/2327
- Dmcandrew update ruamel.yaml by @dcmcand in https://github.com/nebari-dev/nebari/pull/2315
- upgrade auth0-python version to ultimately resolve CVE-2024-26130 by @tylergraff in https://github.com/nebari-dev/nebari/pull/2314
- remove deprecated code paths by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2349
- Create SECURITY.md by @dcmcand in https://github.com/nebari-dev/nebari/pull/2354
- Set node affinity for more pods to ensure they run on general node pool by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2353
- Deduplicate conda-store in JupyterLab main menu by @krassowski in https://github.com/nebari-dev/nebari/pull/2347
- Pass current namespace to argo via environment variable by @krassowski in https://github.com/nebari-dev/nebari/pull/2317
- PVC for Traefik Ingress (prevent LetsEncrypt throttling) by @kenafoster in https://github.com/nebari-dev/nebari/pull/2352

### New Contributors

- @isumitjha made their first contribution in https://github.com/nebari-dev/nebari/pull/2291
- @tylergraff made their first contribution in https://github.com/nebari-dev/nebari/pull/2314

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2024.3.2...2024.3.3

## Release 2024.3.2 - March 14, 2024

### What's Changed

- update max k8s versions and remove depreciated api usage in local deploy by @dcmcand in https://github.com/nebari-dev/nebari/pull/2276
- update keycloak image repo by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2312
- Generate random password for Grafana by @aktech in https://github.com/nebari-dev/nebari/pull/2289
- update conda store to 2024.3.1 by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/2316
- Switch PyPI release workflow to use trusted publishing by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2323

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2024.3.1...2024.3.2

## Release 2024.3.1 - March 11, 2024

### What's Changed

- Modify Playwright test to account for changes in JupyterLab UI. by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2232
- Add favicon to jupyterhub theme. by @jbouder in https://github.com/nebari-dev/nebari/pull/2222
- Set min nodes to 0 for worker and user. by @pt247 in https://github.com/nebari-dev/nebari/pull/2168
- Remove `jhub-client` from pyproject.toml by @pavithraes in https://github.com/nebari-dev/nebari/pull/2242
- Include permission validation step to programmatically cloned repos by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2258
- Expose jupyter's preferred dir as a config option by @krassowski in https://github.com/nebari-dev/nebari/pull/2251
- Allow to configure default settings for JupyterLab (`overrides.json`) by @krassowski in https://github.com/nebari-dev/nebari/pull/2249
- Feature/jlab menu customization by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2259
- Add cloud provider to the dask config.json file by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2266
- Fix syntax error in jupyter-server-config Python file by @krassowski in https://github.com/nebari-dev/nebari/pull/2286
- Add "Open VS Code" entry in services by @krassowski in https://github.com/nebari-dev/nebari/pull/2267
- Add Grafana Loki integration by @aktech in https://github.com/nebari-dev/nebari/pull/2156

### New Contributors

- @jbouder made their first contribution in https://github.com/nebari-dev/nebari/pull/2222
- @krassowski made their first contribution in https://github.com/nebari-dev/nebari/pull/2251

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2024.1.1...2024.3.1

## Release 2024.1.1 - January 17, 2024

### Feature changes and enhancements

- Upgrade conda-store to latest version 2024.1.1
- Add Jhub-Apps
- Add Jupyterlab-pioneer
- Minor improvements and bug fixes

### Breaking Changes

> WARNING: jupyterlab-videochat, retrolab, jupyter-tensorboard, jupyterlab-conda-store and jupyter-nvdashboard are no longer supported as of this Nebari version and will be uninstalled.

### What's Changed

- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/nebari-dev/nebari/pull/2176
- Fix logic for dns lookup. by @pt247 in https://github.com/nebari-dev/nebari/pull/2166
- Integrate JupyterHub App Launcher into Nebari by @aktech in https://github.com/nebari-dev/nebari/pull/2185
- Pass in permissions boundary to k8s module by @aktech in https://github.com/nebari-dev/nebari/pull/2153
- Add jupyterlab-pioneer by @aktech in https://github.com/nebari-dev/nebari/pull/2127
- JHub Apps: Filter conda envs by user by @aktech in https://github.com/nebari-dev/nebari/pull/2187
- update upgrade command by @dcmcand in https://github.com/nebari-dev/nebari/pull/2198
- Remove JupyterLab from services list by @aktech in https://github.com/nebari-dev/nebari/pull/2189
- Adding fields to ignore within keycloak_realm by @costrouc in https://github.com/nebari-dev/nebari/pull/2200
- Add Nebari menu item configuration. by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2196
- Disable "Newer update available" popup as default setting by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2192
- Block usage of pip inside jupyterlab by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2191
- Return all environments instead of just those under the user's namespace for jhub-apps by @marcelovilla in https://github.com/nebari-dev/nebari/pull/2206
- Adding a temporary writable directory for conda-store server /home/conda by @costrouc in https://github.com/nebari-dev/nebari/pull/2209
- Add demo repositories mechanism to populate user's space by @viniciusdc in https://github.com/nebari-dev/nebari/pull/2207
- update nebari_workflow_controller and conda_store tags to test rc by @dcmcand in https://github.com/nebari-dev/nebari/pull/2210
- 2023.12.1 release notes by @dcmcand in https://github.com/nebari-dev/nebari/pull/2211
- Make it so that jhub-apps default theme doesn't override by @costrouc in https://github.com/nebari-dev/nebari/pull/2213
- Adding additional theme variables to jupyterhub theme config by @costrouc in https://github.com/nebari-dev/nebari/pull/2215
- updates Current Release to 2024.1.1 by @dcmcand in https://github.com/nebari-dev/nebari/pull/2227

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2023.12.1...2024.1.1

## Release 2023.12.1 - December 15, 2023

### Feature changes and enhancements

- Upgrade conda-store to latest version 2023.10.1
- Minor improvements and bug fixes

### Breaking Changes

> WARNING: Prefect, ClearML and kbatch were removed in this release and upgrading to this version will result in all of them being uninstalled.

### What's Changed

- BUG: fix incorrect config override #2086 by @fangchenli in https://github.com/nebari-dev/nebari/pull/2087
- ENH: add AWS IAM permissions_boundary option #2078 by @fangchenli in https://github.com/nebari-dev/nebari/pull/2082
- CI: cleanup local integration workflow by @fangchenli in https://github.com/nebari-dev/nebari/pull/2079
- ENH: check missing GCP services by @fangchenli in https://github.com/nebari-dev/nebari/pull/2036
- ENH: use packaging for version parsing, add unit tests by @fangchenli in https://github.com/nebari-dev/nebari/pull/2048
- ENH: specify required field when retrieving available gcp regions by @fangchenli in https://github.com/nebari-dev/nebari/pull/2033
- Upgrade conda-store to 2023.10.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/2092
- Add upgrade command for 2023.11.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/2103
- CLN: cleanup typing and typing import in init by @fangchenli in https://github.com/nebari-dev/nebari/pull/2107
- Remove kbatch, prefect and clearml by @iameskild in https://github.com/nebari-dev/nebari/pull/2101
- Fix integration tests, helm-validate script by @iameskild in https://github.com/nebari-dev/nebari/pull/2102
- Re-enable AWS tags support by @iameskild in https://github.com/nebari-dev/nebari/pull/2096
- Update upgrade instructions for 2023.11.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/2112
- Update nebari-git env pins by @iameskild in https://github.com/nebari-dev/nebari/pull/2113
- Update release notes for 2023.11.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/2114

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2023.11.1...2023.12.1

## Release 2023.11.1 - November 15, 2023

### Feature changes and enhancements

- Upgrade conda-store to latest version 2023.10.1
- Minor improvements and bug fixes

### Breaking Changes

> WARNING: Prefect, ClearML and kbatch were removed in this release and upgrading to this version will result in all of them being uninstalled.

### What's Changed

- BUG: fix incorrect config override #2086 by @fangchenli in https://github.com/nebari-dev/nebari/pull/2087
- ENH: add AWS IAM permissions_boundary option #2078 by @fangchenli in https://github.com/nebari-dev/nebari/pull/2082
- CI: cleanup local integration workflow by @fangchenli in https://github.com/nebari-dev/nebari/pull/2079
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/nebari-dev/nebari/pull/2099
- ENH: check missing GCP services by @fangchenli in https://github.com/nebari-dev/nebari/pull/2036
- ENH: use packaging for version parsing, add unit tests by @fangchenli in https://github.com/nebari-dev/nebari/pull/2048
- ENH: specify required field when retrieving available gcp regions by @fangchenli in https://github.com/nebari-dev/nebari/pull/2033
- Upgrade conda-store to 2023.10.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/2092
- Add upgrade command for 2023.11.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/2103
- CLN: cleanup typing and typing import in init by @fangchenli in https://github.com/nebari-dev/nebari/pull/2107
- Remove kbatch, prefect and clearml by @iameskild in https://github.com/nebari-dev/nebari/pull/2101
- Fix integration tests, helm-validate script by @iameskild in https://github.com/nebari-dev/nebari/pull/2102
- Re-enable AWS tags support by @iameskild in https://github.com/nebari-dev/nebari/pull/2096
- Update upgrade instructions for 2023.11.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/2112
- Update nebari-git env pins by @iameskild in https://github.com/nebari-dev/nebari/pull/2113
- Update release notes for 2023.11.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/2114

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2023.10.1...2023.11.1

## Release 2023.10.1 - October 20, 2023

This release includes a major refactor that introduces a Pluggy-based extension mechanism, which allows developers to build new stages. This is the initial implementation
of the extension mechanism, and we expect the interface to be refined over time. If you're interested in developing your own stage plugin, please refer to [our documentation](https://www.nebari.dev/docs/how-tos/nebari-extension-system#developing-an-extension). When you're ready to upgrade, please download this version from either PyPI or Conda-Forge, run the `nebari upgrade -c nebari-config.yaml`
command, and follow the instructions.
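For readers unfamiliar with Pluggy, the following is a minimal sketch of how a hook-based extension mechanism of this kind can work. The project name `demo`, the hook name `demo_stage`, and the plugin classes are hypothetical and chosen only for illustration; they are not Nebari's actual hook names or stage classes, which are described in the documentation linked above.

```python
import pluggy

# "demo" and `demo_stage` are hypothetical names used only in this sketch.
hookspec = pluggy.HookspecMarker("demo")
hookimpl = pluggy.HookimplMarker("demo")


class StageSpec:
    """Hook specification: the contract that every plugin implements."""

    @hookspec
    def demo_stage(self):
        """Return a list of stage names contributed by the plugin."""


class CorePlugin:
    """Stands in for the stages shipped with the core package."""

    @hookimpl
    def demo_stage(self):
        return ["infrastructure", "kubernetes"]


class ExtraPlugin:
    """Stands in for a third-party extension that adds one stage."""

    @hookimpl
    def demo_stage(self):
        return ["my-custom-stage"]


def collect_stages(plugins):
    """Register the plugins and gather the stages they contribute."""
    pm = pluggy.PluginManager("demo")
    pm.add_hookspecs(StageSpec)
    for plugin in plugins:
        pm.register(plugin)
    # Calling the hook returns one result per registered implementation.
    results = pm.hook.demo_stage()
    # Flatten and sort so the output does not depend on call order.
    return sorted(name for result in results for name in result)


if __name__ == "__main__":
    print(collect_stages([CorePlugin(), ExtraPlugin()]))
```

The design point is that the core package only defines the hook specification; any installed package that provides a matching implementation can contribute stages without changes to the core code.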

> WARNING: CDS Dashboards was removed in this release and upgrading to this version will result in CDS Dashboards being uninstalled. A replacement dashboarding solution is currently in the works
> and will be integrated soon.

> WARNING: Given the scope of changes in this release, we highly recommend backing up your system before upgrading. Please refer to our [Manual Backup](https://www.nebari.dev/docs/how-tos/manual-backup) documentation for more details.

### Feature changes and enhancements

- Extension Mechanism Implementation in [PR 1833](https://github.com/nebari-dev/nebari/pull/1833)
  - This also includes much stricter schema validation.
- JupyterHub upgraded to 3.1 in [PR 1856](https://github.com/nebari-dev/nebari/pull/1856)

### Breaking Changes

- While we have tried our best to avoid breaking changes when introducing the extension mechanism, the scope of the changes is too large for us to confidently say there won't be breaking changes.

> WARNING: CDS Dashboards was removed in this release and upgrading to this version will result in CDS Dashboards being uninstalled. A replacement dashboarding solution is currently in the works and will be integrated soon.

> WARNING: We will be removing and ending support for ClearML, Prefect and kbatch in the next release. kbatch has been functionally replaced by Argo-Jupyter-Scheduler. We have seen little interest in ClearML and Prefect in recent years, so removing them makes sense at this point. However, if you wish to continue using them with Nebari, we encourage you to [write your own Nebari extension](https://www.nebari.dev/docs/how-tos/nebari-extension-system#developing-an-extension).

### What's Changed

- Spinup spot instance for CI with cirun by @aktech in https://github.com/nebari-dev/nebari/pull/1882
- Fix argo-viewer service account reference by @iameskild in https://github.com/nebari-dev/nebari/pull/1881
- Framework for Nebari deployment via pytest for extensive testing by @aktech in https://github.com/nebari-dev/nebari/pull/1867
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/nebari-dev/nebari/pull/1878
- Test GCP/AWS Deployment with Pytest by @aktech in https://github.com/nebari-dev/nebari/pull/1871
- Bump DigitalOcean provider to latest by @aktech in https://github.com/nebari-dev/nebari/pull/1891
- Ensure path is Path object by @iameskild in https://github.com/nebari-dev/nebari/pull/1888
- enabling viewing hidden files in jupyterlab file explorer by @kalpanachinnappan in https://github.com/nebari-dev/nebari/pull/1893
- Extension Mechanism Implementation by @costrouc in https://github.com/nebari-dev/nebari/pull/1833
- Fix import path in deployment tests & misc by @aktech in https://github.com/nebari-dev/nebari/pull/1908
- pytest:ensure failure on warnings by @costrouc in https://github.com/nebari-dev/nebari/pull/1907
- workaround for mixed string/posixpath error by @sblair-metrostar in https://github.com/nebari-dev/nebari/pull/1915
- ENH: Remove aws cli, use boto3 by @fangchenli in https://github.com/nebari-dev/nebari/pull/1920
- paginator for boto3 ec2 instance types by @sblair-metrostar in https://github.com/nebari-dev/nebari/pull/1923
- Update README.md -- fix typo. by @teoliphant in https://github.com/nebari-dev/nebari/pull/1925
- Add more unit tests, add cleanup step for Digital Ocean integration test by @iameskild in https://github.com/nebari-dev/nebari/pull/1910
- Add cleanup step for AWS integration test, ensure disable_prompt is passed through by @iameskild in https://github.com/nebari-dev/nebari/pull/1921
- K8s 1.25 + More Improvements by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/1856
- adding lifecycle ignore to eks node group by @sblair-metrostar in https://github.com/nebari-dev/nebari/pull/1905
- nebari init unit tests by @sblair-metrostar in https://github.com/nebari-dev/nebari/pull/1931
- Bug fix - JH singleuser environment getting overwritten by @kenafoster in https://github.com/nebari-dev/nebari/pull/1933
- Allow users to specify the Azure RG to deploy into by @iameskild in https://github.com/nebari-dev/nebari/pull/1927
- nebari validate unit tests by @sblair-metrostar in https://github.com/nebari-dev/nebari/pull/1938
- adding openid connect provider to enable irsa feature by @sblair-metrostar in https://github.com/nebari-dev/nebari/pull/1903
- nebari upgrade CLI tests by @sblair-metrostar in https://github.com/nebari-dev/nebari/pull/1963
- CI: Add test coverage by @fangchenli in https://github.com/nebari-dev/nebari/pull/1959
- nebari cli environment variable handling, support, keycloak, dev tests by @sblair-metrostar in https://github.com/nebari-dev/nebari/pull/1968
- CI: remove empty notebook to fix pre-commit json check by @fangchenli in https://github.com/nebari-dev/nebari/pull/1976
- TYP: fix typing error in plugins by @fangchenli in https://github.com/nebari-dev/nebari/pull/1973
- TYP: fix return class type in hookimpl by @fangchenli in https://github.com/nebari-dev/nebari/pull/1975
- Allow users to specify Azure tags by @iameskild in https://github.com/nebari-dev/nebari/pull/1967
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/nebari-dev/nebari/pull/1979
- Do not try and add argo envs when disabled by @iameskild in https://github.com/nebari-dev/nebari/pull/1926
- Handle region with care, updates to test suite by @iameskild in https://github.com/nebari-dev/nebari/pull/1930
- remove custom auth from config schema by @sblair-metrostar in https://github.com/nebari-dev/nebari/pull/1994
- CLI: handle removed dns options in deploy command by @fangchenli in https://github.com/nebari-dev/nebari/pull/1992
- Add API docs by @kcpevey in https://github.com/nebari-dev/nebari/pull/1634
- Upgrade images for jupyterhub-ssh, kbatch by @iameskild in https://github.com/nebari-dev/nebari/pull/1997
- Add permissions to generate_cli_docs workflow by @iameskild in https://github.com/nebari-dev/nebari/pull/2005
- standardize regex and messaging for names by @kenafoster in https://github.com/nebari-dev/nebari/pull/2003
- ENH: specify required fields when retrieving available gcp projects by @fangchenli in https://github.com/nebari-dev/nebari/pull/2008
- Modify JupyterHub networkPolicy to match existing policy by @iameskild in https://github.com/nebari-dev/nebari/pull/1991
- Update package dependencies by @iameskild in https://github.com/nebari-dev/nebari/pull/1986
- CI: Add AWS integration test workflow, clean up by @iameskild in https://github.com/nebari-dev/nebari/pull/1977
- BUG: fix unboundlocalerror in integration test by @fangchenli in https://github.com/nebari-dev/nebari/pull/1999
- Auth0/Github auth-provider config validation fix by @sblair-metrostar in https://github.com/nebari-dev/nebari/pull/2009
- terraform upgrade to 1.5.7 by @sblair-metrostar in https://github.com/nebari-dev/nebari/pull/1998
- cli init repo auto provision fix by @sblair-metrostar in https://github.com/nebari-dev/nebari/pull/2012
- Add gcp_cleanup, minor changes by @iameskild in https://github.com/nebari-dev/nebari/pull/2010
- Fix #2024 by @dcmcand in https://github.com/nebari-dev/nebari/pull/2025
- Upgrade conda-store to 2023.9.2 by @iameskild in https://github.com/nebari-dev/nebari/pull/2028
- Add upgrade steps, instructions for 2023.9.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/2029
- CI: add gcp integration test by @fangchenli in https://github.com/nebari-dev/nebari/pull/2049
- CLN: remove flake8 from dependencies by @fangchenli in https://github.com/nebari-dev/nebari/pull/2044
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/nebari-dev/nebari/pull/2047
- fix typo in guided init for Digital Ocean by @dcmcand in https://github.com/nebari-dev/nebari/pull/2059
- CI: add do integration by @fangchenli in https://github.com/nebari-dev/nebari/pull/2060
- TYP: make all subfolders under kubernetes_services/template non-module by @fangchenli in https://github.com/nebari-dev/nebari/pull/2043
- TYP: fix most typing errors in provider by @fangchenli in https://github.com/nebari-dev/nebari/pull/2038
- Fix link to documentation on Nebari Deployment home page by @aktech in https://github.com/nebari-dev/nebari/pull/2063
- TST: enable timeout config in playwright notebook test by @fangchenli in https://github.com/nebari-dev/nebari/pull/1996
- DEPS: sync supported python version by @fangchenli in https://github.com/nebari-dev/nebari/pull/2065
- Test support for Python 3.12 by @aktech in https://github.com/nebari-dev/nebari/pull/2046
- BUG: fix validation error related to `provider` #2054 by @fangchenli in https://github.com/nebari-dev/nebari/pull/2056
- CI: improve unit test workflow in CI, revert #2046 by @fangchenli in https://github.com/nebari-dev/nebari/pull/2071
- TST: enable exact_match config in playwright notebook test by @fangchenli in https://github.com/nebari-dev/nebari/pull/2027
- CI: move conda build test to separate job by @fangchenli in https://github.com/nebari-dev/nebari/pull/2073
- Revert conda-store to v0.4.14, #2028 by @iameskild in https://github.com/nebari-dev/nebari/pull/2074
- ENH/CI: add mypy config, and CI workflow by @fangchenli in https://github.com/nebari-dev/nebari/pull/2066
- Update upgrade for 2023.10.1 by @kenafoster in https://github.com/nebari-dev/nebari/pull/2080
- Update RELEASE notes, minor fixes by @iameskild in https://github.com/nebari-dev/nebari/pull/2039

### New Contributors

- @kalpanachinnappan made their first contribution in https://github.com/nebari-dev/nebari/pull/1893
- @fangchenli made their first contribution in https://github.com/nebari-dev/nebari/pull/1920
- @teoliphant made their first contribution in https://github.com/nebari-dev/nebari/pull/1925
- @kenafoster made their first contribution in https://github.com/nebari-dev/nebari/pull/1933
- @dcmcand made their first contribution in https://github.com/nebari-dev/nebari/pull/2025

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2023.7.2...2023.10.1

## Release 2023.7.2 - August 3, 2023

This is a hot-fix release that resolves an issue whereby users in the `analyst` group are unable to launch their JupyterLab server because the name of the viewer-specific `ARGO_TOKEN` was mislabeled; see [PR 1881](https://github.com/nebari-dev/nebari/pull/1881) for more details.

### What's Changed

- Fix argo-viewer service account reference by @iameskild in https://github.com/nebari-dev/nebari/pull/1881
- Add release notes for 2023.7.2, update release notes for 2023.7.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/1886

## Release 2023.7.1 - July 21, 2023

> WARNING: CDS Dashboards will be deprecated soon. Nebari `2023.7.1` will be the last release with support for CDS Dashboards integration. A new dashboard sharing mechanism will be added in the near future, but some releases in the interim will not have dashboard sharing capabilities.

> WARNING: For those running on AWS, upgrading from previous versions to `2023.7.1` requires a [backup](https://www.nebari.dev/docs/how-tos/manual-backup). Due to changes made to the VPC (See [issue 1884](https://github.com/nebari-dev/nebari/issues/1884) for details), Terraform thinks it needs to destroy and reprovision a new VPC which causes the entire cluster to be destroyed and rebuilt.

### Feature changes and enhancements

- Addition of Nebari-Workflow-Controller in [PR 1741](https://github.com/nebari-dev/nebari/pull/1741)
- Addition of Argo-Jupyter-Scheduler in [PR 1832](https://github.com/nebari-dev/nebari/pull/1832)
- Make most of the API private

### Breaking Changes

- As mentioned in the above WARNING, clusters running on AWS should perform a [manual backup](https://www.nebari.dev/docs/how-tos/manual-backup) before running the upgrade to the latest version as changes to the AWS VPC will cause the cluster to be destroyed and redeployed.

### What's Changed

- use conda forge explicitly in conda build test by @pmeier in https://github.com/nebari-dev/nebari/pull/1771
- document that the upgrade command is for all nebari upgrades by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/1794
- don't fail CI matrices fast by @pmeier in https://github.com/nebari-dev/nebari/pull/1804
- unvendor keycloak_metrics_spi by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/1810
- Dedent fail-fast by @iameskild in https://github.com/nebari-dev/nebari/pull/1815
- support deploying on existing vpc on aws by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/1807
- purge most dangling qhub references by @pmeier in https://github.com/nebari-dev/nebari/pull/1802
- Add Argo Workflow Admission controller by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/1741
- purge infracost CLI command / CI jobs by @pmeier in https://github.com/nebari-dev/nebari/pull/1820
- remove unused function parameters and CLI flags by @pmeier in https://github.com/nebari-dev/nebari/pull/1725
- purge docs and nox by @pmeier in https://github.com/nebari-dev/nebari/pull/1801
- Add Helm chart lint tool by @viniciusdc in https://github.com/nebari-dev/nebari/pull/1679
- don't set /etc/hosts in CI by @pmeier in https://github.com/nebari-dev/nebari/pull/1729
- remove execute permissions on templates by @pmeier in https://github.com/nebari-dev/nebari/pull/1798
- fix deprecated file deletion by @pmeier in https://github.com/nebari-dev/nebari/pull/1799
- make nebari API private by @pmeier in https://github.com/nebari-dev/nebari/pull/1778
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/nebari-dev/nebari/pull/1831
- Simplify CI by @iameskild in https://github.com/nebari-dev/nebari/pull/1819
- Fix edge-case where k8s_version is equal to HIGHEST_SUPPORTED_K8S_VER… by @iameskild in https://github.com/nebari-dev/nebari/pull/1842
- add more configuration to enable private clusters on AWS by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/1841
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/nebari-dev/nebari/pull/1851
- AWS gov cloud support by @sblair-metrostar in https://github.com/nebari-dev/nebari/pull/1857
- Pathlib everywhere by @pmeier in https://github.com/nebari-dev/nebari/pull/1773
- Initial playwright setup by @kcpevey in https://github.com/nebari-dev/nebari/pull/1665
- Changes required for Jupyter-Scheduler integration by @iameskild in https://github.com/nebari-dev/nebari/pull/1832
- Update upgrade command in preparation for release by @iameskild in https://github.com/nebari-dev/nebari/pull/1868
- Add release notes by @iameskild in https://github.com/nebari-dev/nebari/pull/1869

### New Contributors

- @sblair-metrostar made their first contribution in https://github.com/nebari-dev/nebari/pull/1857

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2023.5.1...2023.7.1

## Release 2023.5.1 - May 5, 2023

### Feature changes and enhancements

- Upgrade Argo-Workflows to version 3.4.4

### Breaking Changes

- The Argo-Workflows version upgrade will result in a breaking change if the existing Kubernetes CRDs are not deleted (see the NOTE below for more details).
- There is a minor breaking change for the Nebari CLI version shorthand: previously it was `nebari -v`; to align with Python convention, it is now `nebari -V`.

> NOTE: After installing the Nebari version `2023.5.1`, please run `nebari upgrade -c nebari-config.yaml` to upgrade
> the `nebari-config.yaml`. This command will also prompt you to delete a few Kubernetes resources (specifically
> the Argo-Workflows CRDs and service accounts) before you can upgrade.
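The `-v` to `-V` change listed under Breaking Changes mirrors the behavior of `python -V`. As a rough illustration of that convention, here is a minimal sketch using the standard library's `argparse`. The program name `demo-cli`, the `VERSION` constant, and the `-v/--verbose` flag are assumptions made for this example; this is not Nebari's actual CLI code.

```python
import argparse

# Hypothetical version string, used only for this sketch.
VERSION = "2023.5.1"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose version shorthand is upper-case -V."""
    parser = argparse.ArgumentParser(prog="demo-cli")
    # Upper-case -V is the short form of --version, as with `python -V`.
    parser.add_argument(
        "-V",
        "--version",
        action="version",
        version=f"%(prog)s {VERSION}",
    )
    # In this sketch, lower-case -v is used as a verbosity counter instead.
    parser.add_argument("-v", "--verbose", action="count", default=0)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"verbosity level: {args.verbose}")
```

With this layout, `demo-cli -V` prints the version and exits, while `demo-cli -v` no longer does so, which is why scripts that relied on the lower-case form need updating after such a change.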

### What's Changed

- Use --quiet flag for conda install in CI by @pmeier in https://github.com/nebari-dev/nebari/pull/1699
- improve CLI tests by @pmeier in https://github.com/nebari-dev/nebari/pull/1710
- Fix Existing dashboards by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/1723
- Fix dashboards by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/1727
- Typo in the conda-store - conda_store key by @costrouc in https://github.com/nebari-dev/nebari/pull/1740
- use -V (upper case) for --version short form by @pmeier in https://github.com/nebari-dev/nebari/pull/1720
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/nebari-dev/nebari/pull/1692
- improve pytest configuration by @pmeier in https://github.com/nebari-dev/nebari/pull/1700
- fix upgrade command to look for nebari_version instead of qhub_version by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/1693
- remove lazy import by @pmeier in https://github.com/nebari-dev/nebari/pull/1721
- fix nebari invocation through python by @pmeier in https://github.com/nebari-dev/nebari/pull/1711
- Update Argo Workflows to latest version by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/1639
- Update secret token in release-notes-sync action by @pavithraes in https://github.com/nebari-dev/nebari/pull/1753
- Typo fix in release-notes-sync action by @pavithraes in https://github.com/nebari-dev/nebari/pull/1756
- 🔄 Synced file(s) with nebari-dev/.github by @nebari-sensei in https://github.com/nebari-dev/nebari/pull/1758
- Update path in release-notes-sync action by @pavithraes in https://github.com/nebari-dev/nebari/pull/1757
- Updating heading format in release notes by @pavithraes in https://github.com/nebari-dev/nebari/pull/1761
- Update vault url by @costrouc in https://github.com/nebari-dev/nebari/pull/1752
- Fix? contributor test trigger by @pmeier in https://github.com/nebari-dev/nebari/pull/1734
- Consistent user Experience with y/N. by @AM-O7 in https://github.com/nebari-dev/nebari/pull/1747
- Fix contributor trigger by @pmeier in https://github.com/nebari-dev/nebari/pull/1765
- add more debug output to contributor test trigger by @pmeier in https://github.com/nebari-dev/nebari/pull/1766
- fix copy-paste error by @pmeier in https://github.com/nebari-dev/nebari/pull/1767
- add instructions insufficient permissions of contributor trigger by @pmeier in https://github.com/nebari-dev/nebari/pull/1772
- fix invalid escape sequence by @pmeier in https://github.com/nebari-dev/nebari/pull/1770
- Update AMI in `.cirun.yml` for nebari-dev-ci AWS account by @aktech in https://github.com/nebari-dev/nebari/pull/1776
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/nebari-dev/nebari/pull/1768
- turn warnings into errors with pytest by @pmeier in https://github.com/nebari-dev/nebari/pull/1774
- purge setup.cfg by @pmeier in https://github.com/nebari-dev/nebari/pull/1781
- improve pre-commit run on GHA by @pmeier in https://github.com/nebari-dev/nebari/pull/1782
- Upgrade to k8s 1.24 by @iameskild in https://github.com/nebari-dev/nebari/pull/1760
- Overloaded dask gateway fix by @Adam-D-Lewis in https://github.com/nebari-dev/nebari/pull/1777
- Add option to specify GKE release channel by @iameskild in https://github.com/nebari-dev/nebari/pull/1648
- Update upgrade command, add RELEASE notes by @iameskild in https://github.com/nebari-dev/nebari/pull/1789

### New Contributors

- @pmeier made their first contribution in https://github.com/nebari-dev/nebari/pull/1699
- @AM-O7 made their first contribution in https://github.com/nebari-dev/nebari/pull/1747

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2023.4.1...2023.5.1

## Release 2023.4.1 - April 12, 2023

> NOTE: Nebari requires Kubernetes version 1.23, and Digital Ocean now requires new clusters to run Kubernetes version 1.24. This means that if you are currently running on Digital Ocean, you should be fine, but deploying a new cluster on Digital Ocean is not possible until we upgrade the Kubernetes version (see [issue 1622](https://github.com/nebari-dev/nebari/issues/1622) for more details).

### Feature changes and enhancements

- Upgrades and improvements to conda-store including a new user-interface and greater administrator capabilities.
- Idle-culler settings can now be configured directly from the `nebari-config.yaml`.

### What's Changed

- PR: Raise timeout for jupyter session by @ppwadhwa in https://github.com/nebari-dev/nebari/pull/1646
- PR lower dashboard launch timeout by @ppwadhwa in https://github.com/nebari-dev/nebari/pull/1647
- PR: Update dashboard environment by @ppwadhwa in https://github.com/nebari-dev/nebari/pull/1655
- Fix doc link in README.md by @tkoyama010 in https://github.com/nebari-dev/nebari/pull/1660
- PR: Update dask environment by @ppwadhwa in https://github.com/nebari-dev/nebari/pull/1654
- Feature remove jupyterlab news by @costrouc in https://github.com/nebari-dev/nebari/pull/1641
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/nebari-dev/nebari/pull/1644
- Feat GitHub actions before_script and after_script steps by @costrouc in https://github.com/nebari-dev/nebari/pull/1672
- Remove examples folder by @ppwadhwa in https://github.com/nebari-dev/nebari/pull/1664
- Fix GH action typos by @kcpevey in https://github.com/nebari-dev/nebari/pull/1677
- Github Actions CI needs id-token write permissions by @costrouc in https://github.com/nebari-dev/nebari/pull/1682
- Update AWS force destroy script, include lingering volumes by @iameskild in https://github.com/nebari-dev/nebari/pull/1681
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/nebari-dev/nebari/pull/1673
- Make idle culler settings configurable from the `nebari-config.yaml` by @iameskild in https://github.com/nebari-dev/nebari/pull/1689
- Update pyproject dependencies and add test to ensure it builds on conda-forge by @iameskild in https://github.com/nebari-dev/nebari/pull/1662
- Retrieve secrets from Vault, fix test-provider CI by @iameskild in https://github.com/nebari-dev/nebari/pull/1676
- Pull PyPI secrets from Vault by @iameskild in https://github.com/nebari-dev/nebari/pull/1696
- Adding newest conda-store 0.4.14 along with superadmin credentials by @costrouc in https://github.com/nebari-dev/nebari/pull/1701
- Update release notes for 2023.4.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/1722

### New Contributors

- @ppwadhwa made their first contribution in https://github.com/nebari-dev/nebari/pull/1646
- @tkoyama010 made their first contribution in https://github.com/nebari-dev/nebari/pull/1660

**Full Changelog**: https://github.com/nebari-dev/nebari/compare/2023.1.1...2023.4.1

## Release 2023.1.1 - January 30, 2023

### What's Changed

- 🔄 Synced file(s) with nebari-dev/.github by @nebari-sensei in https://github.com/nebari-dev/nebari/pull/1588
- Make conda-store file system read-only by default by @alimanfoo in https://github.com/nebari-dev/nebari/pull/1595
- ENH - Switch to ruff and pre-commit.ci by @trallard in https://github.com/nebari-dev/nebari/pull/1602
- Migrate to hatch by @iameskild in https://github.com/nebari-dev/nebari/pull/1545
- Add check_repository_cred function to CLI by @iameskild in https://github.com/nebari-dev/nebari/pull/1605
- Adding jupyterlab-conda-store extension support to Nebari by @costrouc in https://github.com/nebari-dev/nebari/pull/1564
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/nebari-dev/nebari/pull/1613
- Ensure Argo-Workflow controller containerRuntimeExecutor is set to emissary by @iameskild in https://github.com/nebari-dev/nebari/pull/1614
- Pass `secret_name` to TF scripts when certificate type = existing by @iameskild in https://github.com/nebari-dev/nebari/pull/1621
- Pin Nebari dependencies, set k8s version for GKE by @iameskild in https://github.com/nebari-dev/nebari/pull/1624
- Create aws-force-destroy bash script by @iameskild in https://github.com/nebari-dev/nebari/pull/1611
- Add option for AWS node-groups to run in a single subnet/AZ by @iameskild in https://github.com/nebari-dev/nebari/pull/1428
- Add export-users to keycloak CLI command, add dev CLI command by @iameskild in https://github.com/nebari-dev/nebari/pull/1610
- Unpin packages in default dashboard env by @iameskild in https://github.com/nebari-dev/nebari/pull/1628
- Add release notes for 2023.1.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/1629
- Set GKE release_channel to unspecified to prevent auto k8s updates by @iameskild in https://github.com/nebari-dev/nebari/pull/1630
- Update default nebari-dask, nebari image tags by @iameskild in https://github.com/nebari-dev/nebari/pull/1636

### New Contributors

- @pre-commit-ci made their first contribution in https://github.com/nebari-dev/nebari/pull/1613

## Release 2022.11.1 - December 1, 2022

### What's Changed

- cherry-pick Update README logo (#1514) by @aktech in https://github.com/nebari-dev/nebari/pull/1517
- Release/2022.10.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/1527
- Add Note about QHub->Nebari rename in old docs by @pavithraes in https://github.com/nebari-dev/nebari/pull/1543
- 🔄 Synced file(s) with nebari-dev/.github by @nebari-sensei in https://github.com/nebari-dev/nebari/pull/1550
- 🔄 Synced file(s) with nebari-dev/.github by @nebari-sensei in https://github.com/nebari-dev/nebari/pull/1551
- 🔄 Synced file(s) with nebari-dev/.github by @nebari-sensei in https://github.com/nebari-dev/nebari/pull/1555
- 🔄 Synced file(s) with nebari-dev/.github by @nebari-sensei in https://github.com/nebari-dev/nebari/pull/1560
- Small CLI fixes by @iameskild in https://github.com/nebari-dev/nebari/pull/1529
- 🔄 Synced file(s) with nebari-dev/.github by @nebari-sensei in https://github.com/nebari-dev/nebari/pull/1561
- Render github actions configurations as yaml by @aktech in https://github.com/nebari-dev/nebari/pull/1528
- Update "QHub" to "Nebari" in example notebooks by @pavithraes in https://github.com/nebari-dev/nebari/pull/1556
- Update links to Nebari docs in guided init by @pavithraes in https://github.com/nebari-dev/nebari/pull/1557
- CI: Spinup unique cirun runners for each job by @aktech in https://github.com/nebari-dev/nebari/pull/1563
- Issue-1417: Improve Dask workers placement on AWS | fixing a minor typo by @limacarvalho in https://github.com/nebari-dev/nebari/pull/1487
- Update `setup-node` version by @iameskild in https://github.com/nebari-dev/nebari/pull/1570
- Facilitate CI run for contributor PR by @aktech in https://github.com/nebari-dev/nebari/pull/1568
- Action to sync release notes with nebari-docs by @pavithraes in https://github.com/nebari-dev/nebari/pull/1554
- Restore how the dask worker node group is selected by default by @iameskild in https://github.com/nebari-dev/nebari/pull/1577
- Fix skip check for workflows by @aktech in https://github.com/nebari-dev/nebari/pull/1578
- 📝 Update readme by @trallard in https://github.com/nebari-dev/nebari/pull/1579
- MAINT - Miscellaneous maintenance tasks by @trallard in https://github.com/nebari-dev/nebari/pull/1580
- Wait for Test PyPI to upload test release by @iameskild in https://github.com/nebari-dev/nebari/pull/1583
- Add release notes for 2022.11.1 by @iameskild in https://github.com/nebari-dev/nebari/pull/1584

### New Contributors

- @nebari-sensei made their first contribution in https://github.com/nebari-dev/nebari/pull/1550
- @limacarvalho made their first contribution in https://github.com/nebari-dev/nebari/pull/1487

## Release 2022.10.1 - October 28, 2022

### **WARNING**

> The project has recently been renamed from QHub to Nebari. If your deployment is still managed by `qhub`, performing an in-place upgrade will **IRREVOCABLY BREAK** your deployment. This will cause you to lose any data stored on the platform, including but not limited to NFS (filesystem) data, conda-store environments, and Keycloak users and groups. Please [back up](https://www.nebari.dev/docs/how-tos/manual-backup) your data before attempting an upgrade.

### Feature changes and enhancements

We are happy to announce the first official release of Nebari (formerly QHub)! This release lays the groundwork for many exciting new features and improvements to come.

This release introduces several important changes which include:

- a major project name change from QHub to Nebari - [PR 1508](https://github.com/nebari-dev/nebari/pull/1508)
- a switch from the SemVer to CalVer versioning format - [PR 1501](https://github.com/nebari-dev/nebari/pull/1501)
- a new, Typer-based CLI for improved user experience - [PR 1443](https://github.com/Quansight/qhub/pull/1443) + [PR 1519](https://github.com/nebari-dev/nebari/pull/1519)

Although breaking changes are never fun, the Nebari development team believes these changes are important for the immediate and future success of the project. If you experience any issues or have any questions about these changes, feel free to open an [issue on our GitHub repo](https://github.com/nebari-dev/nebari/issues).
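
To illustrate the versioning switch, here is a minimal sketch of how the old SemVer-style tags and the new CalVer tags can be parsed and ordered together. It assumes the CalVer layout is `YYYY.M.N` (year, month, release counter), as the release names in this changelog suggest. The helpers `parse_version` and `is_calver` are hypothetical and not part of Nebari.

```python
import re


def parse_version(tag: str) -> tuple[int, ...]:
    """Turn a tag such as 'v0.4.5' or '2022.10.1' into a tuple of ints."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"unsupported version tag: {tag!r}")
    return tuple(int(part) for part in match.groups())


def is_calver(tag: str) -> bool:
    """Heuristic (assumption): a CalVer tag starts with a four-digit year."""
    return parse_version(tag)[0] >= 2000


# Mixed list of old (SemVer-style) and new (CalVer) release tags.
tags = ["v0.4.5", "2022.11.1", "2022.10.1", "v0.4.4"]
ordered = sorted(tags, key=parse_version)
print(ordered)
```

Because the year is the leading component, the CalVer releases listed here sort after the `v0.x` releases when compared as integer tuples.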

### What's Changed

- Switch to CalVer by @iameskild in https://github.com/nebari-dev/nebari/pull/1501
- Update theme welcome messages to use Nebari by @pavithraes in https://github.com/nebari-dev/nebari/pull/1503
- Name change QHub --> Nebari by @iameskild in https://github.com/nebari-dev/nebari/pull/1508
- qhub/initialize: lazy load attributes that require remote information by @FFY00 in https://github.com/nebari-dev/nebari/pull/1509
- Update README logo reference by @viniciusdc in https://github.com/nebari-dev/nebari/pull/1514
- Add fix, enhancements and pytests for CLI by @iameskild in https://github.com/nebari-dev/nebari/pull/1498
- Remove old CLI + cleanup by @iameskild in https://github.com/nebari-dev/nebari/pull/1519
- Update `skip_remote_state_provision` default value by @viniciusdc in https://github.com/nebari-dev/nebari/pull/1521
- Add release notes for 2022.10.1 in https://github.com/nebari-dev/nebari/pull/1523

### New Contributors

- @pavithraes made their first contribution in https://github.com/nebari-dev/nebari/pull/1503
- @FFY00 made their first contribution in https://github.com/nebari-dev/nebari/pull/1509

**Note: The following releases (v0.4.5 and lower) were made under the name `Quansight/qhub`.**

## Release v0.4.5 - October 14, 2022

Enhancements for this release include:

- Fix reported bug with Azure deployments due to outdated azurerm provider
- All dashboards related conda-store environments are now visible as options for spawning dashboards
- New Nebari entrypoint
- New Typer-based CLI for QHub (available using new entrypoint)
- Renamed built-in conda-store namespaces and added customization support
- Updated Traefik version to support the latest Kubernetes API

### What's Changed

- Update azurerm version by @tjcrone in https://github.com/Quansight/qhub/pull/1471
- Make CDSDashboards.conda_envs dynamically update from function by @costrouc in https://github.com/Quansight/qhub/pull/1358
- Fix get_latest_repo_tag fn by @iameskild in https://github.com/Quansight/qhub/pull/1485
- Nebari Typer CLI by @asmijafar20 in https://github.com/Quansight/qhub/pull/1443
- Pass AWS `region`, `kubernetes_version` to terraform scripts by @iameskild in https://github.com/Quansight/qhub/pull/1493
- Enable ebs-csi driver on AWS, add region + kubernetes_version vars by @iameskild in https://github.com/Quansight/qhub/pull/1494
- Update traefik version + CRD by @iameskild in https://github.com/Quansight/qhub/pull/1489
- [ENH] Switch default and filesystem name envs by @viniciusdc in https://github.com/Quansight/qhub/pull/1357

### New Contributors

- @tjcrone made their first contribution in https://github.com/Quansight/qhub/pull/1471

### Migration note

If you are upgrading from a version of Nebari prior to `0.4.5`, you will need to manually update your conda-store namespaces
to be compatible with the new Nebari version. This is a one-time migration step that will need to be performed after upgrading to continue using the service. Refer to [How to migrate base conda-store namespaces](https://deploy-preview-178--nebari-docs.netlify.app/troubleshooting#conda-store-compatibility-migration-steps-when-upgrading-to-045) for further instructions.

## Release v0.4.4 - September 22, 2022

### Feature changes and enhancements

Enhancements for this release include:

- Bump `conda-store` version to `v0.4.11` and enable overrides
- Fully decouple the JupyterLab, JupyterHub and Dask-Worker images from the main codebase
  - See https://github.com/nebari-dev/nebari-docker-images for images
- Add support for Python 3.10
- Add support for Terraform binary download for M1 Mac
- Add option to supply additional arguments to ingress from qhub-config.yaml
- Add support for Kubernetes Kind (local)

### What's Changed

- Add support for terraform binary download for M1 by @aktech in https://github.com/Quansight/qhub/pull/1370
- Improvements in the QHub Cost estimate tool by @HarshCasper in https://github.com/Quansight/qhub/pull/1365
- Add Python-3.10 by @HarshCasper in https://github.com/Quansight/qhub/pull/1352
- Add backwards compatibility item to test checklist by @viniciusdc in https://github.com/Quansight/qhub/pull/1381
- add code server version to fix build by @HarshCasper in https://github.com/Quansight/qhub/pull/1383
- Update Cirun.io config to use labels by @aktech in https://github.com/Quansight/qhub/pull/1379
- Decouple docker images by @iameskild in https://github.com/Quansight/qhub/pull/1371
- Set LATEST_SUPPORTED_PYTHON_VERSION as str by @iameskild in https://github.com/Quansight/qhub/pull/1387
- Integrate kind into local deployment to no longer require minikube for development by @costrouc in https://github.com/Quansight/qhub/pull/1171
- Upgrade conda-store to 0.4.7 allow for customization by @costrouc in https://github.com/Quansight/qhub/pull/1385
- [ENH] Bump conda-store to v0.4.9 by @viniciusdc in https://github.com/Quansight/qhub/pull/1392
- [ENH] Add `pyarrow` and `s3fs` by @viniciusdc in https://github.com/Quansight/qhub/pull/1393
- Fixing bug in authentication method in Conda-Store authentication by @costrouc in https://github.com/Quansight/qhub/pull/1396
- CI: Merge test and release to PyPi workflows into one by @HarshCasper in https://github.com/Quansight/qhub/pull/1386
- Update packages in the dashboard env by @iameskild in https://github.com/Quansight/qhub/pull/1402
- BUG: Setting behind proxy setting in conda-store to be aware of http vs. https by @costrouc in https://github.com/Quansight/qhub/pull/1404
- Minor update to release workflow by @iameskild in https://github.com/Quansight/qhub/pull/1406
- Clean up release workflow by @iameskild in https://github.com/Quansight/qhub/pull/1407
- Add release notes for v0.4.4 by @iameskild in https://github.com/Quansight/qhub/pull/1408
- Update Ingress overrides behaviour by @viniciusdc in https://github.com/Quansight/qhub/pull/1420
- Preserve conda-store image permissions by @iameskild in https://github.com/Quansight/qhub/pull/1419
- Add project name to jhub helm chart release name by @iameskild in https://github.com/Quansight/qhub/pull/1422
- Fix for helm extension overrides data type issue by @konkapv in https://github.com/Quansight/qhub/pull/1424
- Add option to disable tls certificate by @iameskild in https://github.com/Quansight/qhub/pull/1421
- Fixing provider=existing for local/existing by @costrouc in https://github.com/Quansight/qhub/pull/1425
- Update release, testing checklist by @iameskild in https://github.com/Quansight/qhub/pull/1397
- Add `--disable-checks` flag to deploy by @iameskild in https://github.com/Quansight/qhub/pull/1429
- Adding option to supply additional arguments to ingress via `ingress.terraform_overrides.additional-arguments` by @costrouc in https://github.com/Quansight/qhub/pull/1431
- Add properties to middleware crd headers by @iameskild in https://github.com/Quansight/qhub/pull/1434
- Restart conda-store worker when new conda env is added to config.yaml by @iameskild in https://github.com/Quansight/qhub/pull/1437
- Pin dask ipywidgets version to `7.7.1` by @viniciusdc in https://github.com/Quansight/qhub/pull/1442
- Set qhub-dask version to 0.4.4 by @iameskild in https://github.com/Quansight/qhub/pull/1470

### New Contributors

- @konkapv made their first contribution in https://github.com/Quansight/qhub/pull/1424

## Release v0.4.3 - July 7, 2022

### Feature changes and enhancements

Enhancements for this release include:

- Integrating Argo Workflow
- Integrating kbatch
- Adding `cost-estimate` CLI subcommand (Infracost)
- Add `panel-serve` as a CDS dashboard option
- Add option to use RetroLab instead of default JupyterLab

### What's Changed

- Update the login/Keycloak docs page by @gabalafou in https://github.com/Quansight/qhub/pull/1289
- Add configuration option so myst parser generates anchors for heading… by @costrouc in https://github.com/Quansight/qhub/pull/1299
- Image scanning by @HarshCasper in https://github.com/Quansight/qhub/pull/1291
- Fix display version behavior by @viniciusdc in https://github.com/Quansight/qhub/pull/1275
- [Docs] Add docs about custom Identity providers for Authentication by @viniciusdc in https://github.com/Quansight/qhub/pull/1273
- Add prefect token var to CI when needed by @viniciusdc in https://github.com/Quansight/qhub/pull/1279
- ci: prevent image scans on main image builds by @HarshCasper in https://github.com/Quansight/qhub/pull/1300
- Integrate `kbatch` by @iameskild in https://github.com/Quansight/qhub/pull/1258
- add `retrolab` to the base jupyter image by @tonyfast in https://github.com/Quansight/qhub/pull/1222
- Update pre-commit, remove vale by @iameskild in https://github.com/Quansight/qhub/pull/1282
- Argo Workflows by @Adam-D-Lewis in https://github.com/Quansight/qhub/pull/1252
- Update minio, postgresql chart repo location by @iameskild in https://github.com/Quansight/qhub/pull/1308
- Fix broken AWS, set minimum desired size to 1, enable 0 scaling by @tylerpotts in https://github.com/Quansight/qhub/pull/1304
- v0.4.2 release notes by @iameskild in https://github.com/Quansight/qhub/pull/1323
- install dask lab ext from main by @iameskild in https://github.com/Quansight/qhub/pull/1321
- Overrides default value for dask-labextension by @viniciusdc in https://github.com/Quansight/qhub/pull/1327
- CI: Add Infracost to GHA CI for infra cost tracking by @HarshCasper in https://github.com/Quansight/qhub/pull/1316
- Add check for highest supported k8s version by @aktech in https://github.com/Quansight/qhub/pull/1336
- Increase the default instance sizes by @peytondmurray in https://github.com/Quansight/qhub/pull/1338
- Add panel-serve as a CDS dashboard option by @iameskild in https://github.com/Quansight/qhub/pull/1070
- Generate QHub Costs via `infracost` by @HarshCasper in https://github.com/Quansight/qhub/pull/1340
- Add release-checklist issue template by @iameskild in https://github.com/Quansight/qhub/pull/1314
- Fix missing import: `rich` : broken qhub init with cloud by @aktech in https://github.com/Quansight/qhub/pull/1353
- Bump qhub-dask version to 0.4.3 by @peytondmurray in https://github.com/Quansight/qhub/pull/1341
- Remove the need for AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to be set with Digital Ocean deployment by @costrouc in https://github.com/Quansight/qhub/pull/1344
- Revert "Remove the need for AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to be set with Digital Ocean deployment" by @viniciusdc in https://github.com/Quansight/qhub/pull/1355
- Upgrade kbatch version by @iameskild in https://github.com/Quansight/qhub/pull/1335
- Drop support for python 3.7 in dask environment by @peytondmurray in https://github.com/Quansight/qhub/pull/1354
- Add useful terminal utils to jlab image by @dharhas in https://github.com/Quansight/qhub/pull/1361
- Tweak bashrc by @dharhas in https://github.com/Quansight/qhub/pull/1363
- Fix bug where vscode extensions are not installing by @viniciusdc in https://github.com/Quansight/qhub/pull/1360

### New Contributors

- @gabalafou made their first contribution in https://github.com/Quansight/qhub/pull/1289
- @peytondmurray made their first contribution in https://github.com/Quansight/qhub/pull/1338
- @dharhas made their first contribution in https://github.com/Quansight/qhub/pull/1361

**Full Changelog**: https://github.com/Quansight/qhub/compare/v0.4.1...v0.4.3

## Release v0.4.2 - June 8, 2022

### Incident postmortem

#### Bitnami update breaks post v0.4.0 releases

On June 2, 2022, GitHub user @peytondmurray reported [issue 1306](https://github.com/Quansight/qhub/issues/1306), stating that he was unable to deploy QHub using either the latest release `v0.4.1` or installing `qhub` from `main`. As verified by @peytondmurray and others, during your first `qhub deploy`, the deployment halts and complains about two invalid Helm charts missing from the bitnami `index.yaml`.

[Bitnami's decision to change how long they keep old Helm charts in their index](https://github.com/bitnami/charts/issues/10539) has essentially broken all post `v0.4.0` versions of QHub.

This is a severe bug that will affect any new user who tries to install and deploy QHub with any version less than `v0.4.2` and greater than or equal to `v0.4.0`.
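
The affected range can be expressed as a simple check. This is a sketch only: `parse_tag` and `is_affected` are hypothetical helpers, not QHub functions, and the code assumes plain `vMAJOR.MINOR.PATCH` tags (a suffix such as `.post1` is ignored).

```python
def parse_tag(tag: str) -> tuple[int, int, int]:
    """Parse 'v0.4.1' (or '0.4.1') into (0, 4, 1); extra suffixes are dropped."""
    parts = tag.lstrip("v").split(".")[:3]
    major, minor, patch = (int(p) for p in parts)
    return major, minor, patch


def is_affected(tag: str) -> bool:
    """True for versions >= v0.4.0 and < v0.4.2, the range named above."""
    return (0, 4, 0) <= parse_tag(tag) < (0, 4, 2)


for tag in ["v0.3.14", "v0.4.0", "v0.4.0.post1", "v0.4.1", "v0.4.2"]:
    print(tag, is_affected(tag))
```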

Given the impact and severity of this bug, the team has decided to quickly cut a hotfix.

#### AWS deployment failing due to old auto-scaler helm chart

On May 27, 2022, GitHub user @tylerpotts reported [issue 1302](https://github.com/Quansight/qhub/issues/1302), stating that he was unable to deploy QHub using the latest release `v0.4.1` (or installing `qhub` from `main`). As described in the original issue, the deployment failed complaining about the deprecated `v1beta` Kubernetes API. This led to the discovery that we were using an outdated `cluster_autoscaler` helm chart.

The solution is to update from `v1beta` to `v1` Kubernetes API for the appropriate resources and update the reference to the `cluster_autoscaler` helm chart.

Given the impact and severity of this bug, the team has decided to quickly cut a hotfix.

### Bug fixes

This release is a hotfix for the issues summarized in the following:

- [issue 1319](https://github.com/Quansight/qhub/issues/1319)
- [issue 1306](https://github.com/Quansight/qhub/issues/1306)
- [issue 1302](https://github.com/Quansight/qhub/issues/1302)

### What's Changed

- Update minio, postgresql chart repo location by @iameskild in [PR 1308](https://github.com/Quansight/qhub/pull/1308)
- Fix broken AWS, set minimum desired size to 1, enable 0 scaling by @tylerpotts in [PR 1304](https://github.com/Quansight/qhub/pull/1304)

## Release v0.4.1 - May 10, 2022

### Feature changes and enhancements

Enhancements for this release include:

- Add support for pinning the IP address of the load balancer via terraform overrides
- Upgrade Conda-Store to `v0.3.15`
- Add ability to limit JupyterHub profiles based on users/groups

### Bug fixes

This release addresses several bugs with a slight emphasis on stabilizing the core services while also improving the end user experience.

### What's Changed

- [BUG] Adding back feature of limiting profiles for users and groups by @costrouc in [PR 1169](https://github.com/Quansight/qhub/pull/1169)
- DOCS: Add release notes for v0.4.0 release by @HarshCasper in [PR 1170](https://github.com/Quansight/qhub/pull/1170)
- Move ipython config within jupyterlab to docker image with more robust jupyterlab ssh tests by @costrouc in [PR 1143](https://github.com/Quansight/qhub/pull/1143)
- Removing custom dask_gateway from qhub and idle_timeout for dask clusters to 30 min by @costrouc in [PR 1151](https://github.com/Quansight/qhub/pull/1151)
- Overrides.json now managed by qhub configmaps instead of inside docker image by @costrouc in [PR 1173](https://github.com/Quansight/qhub/pull/1173)
- Adding examples to QHub jupyterlab by @costrouc in [PR 1176](https://github.com/Quansight/qhub/pull/1176)
- Bump conda-store version to 0.3.12 by @costrouc in [PR 1179](https://github.com/Quansight/qhub/pull/1179)
- Fixing concurrency not being specified in configuration by @costrouc in [PR 1180](https://github.com/Quansight/qhub/pull/1180)
- Adding ipykernel as default to environment along with ensure conda-store restarted on config change by @costrouc in [PR 1181](https://github.com/Quansight/qhub/pull/1181)
- keycloak dev docs by @danlester in [PR 1184](https://github.com/Quansight/qhub/pull/1184)
- Keycloakdev2 by @danlester in [PR 1185](https://github.com/Quansight/qhub/pull/1185)
- Setting minio storage to by default be same as filesystem size for Conda-Store environments by @costrouc in [PR 1188](https://github.com/Quansight/qhub/pull/1188)
- Bump Conda-Store version in Qhub to 0.3.13 by @costrouc in [PR 1189](https://github.com/Quansight/qhub/pull/1189)
- Upgrade mrparkers to 3.7.0 by @danlester in [PR 1183](https://github.com/Quansight/qhub/pull/1183)
- Mdformat tables by @danlester in [PR 1186](https://github.com/Quansight/qhub/pull/1186)
- [ImgBot] Optimize images by @imgbot in [PR 1187](https://github.com/Quansight/qhub/pull/1187)
- Bump conda-store version to 0.3.14 by @costrouc in [PR 1192](https://github.com/Quansight/qhub/pull/1192)
- Allow terraform init to upgrade providers within version specification by @costrouc in [PR 1194](https://github.com/Quansight/qhub/pull/1194)
- Adding missing `__init__` files by @costrouc in [PR 1196](https://github.com/Quansight/qhub/pull/1196)
- Release 0.3.15 for Conda-Store by @costrouc in [PR 1205](https://github.com/Quansight/qhub/pull/1205)
- Profilegroups by @danlester in [PR 1203](https://github.com/Quansight/qhub/pull/1203)
- Render `.gitignore`, black py files by @iameskild in [PR 1206](https://github.com/Quansight/qhub/pull/1206)
- Update qhub-dask pinned version by @iameskild in [PR 1224](https://github.com/Quansight/qhub/pull/1224)
- Fix env doc links and add corresponding tests by @aktech in [PR 1216](https://github.com/Quansight/qhub/pull/1216)
- Update conda-store-environment variable `type` by @iameskild in [PR 1213](https://github.com/Quansight/qhub/pull/1213)
- Update release notes - justification for changes in `v0.4.0` by @iameskild in [PR 1178](https://github.com/Quansight/qhub/pull/1178)
- Support for pinning the IP address of the load balancer via terraform overrides by @aktech in [PR 1235](https://github.com/Quansight/qhub/pull/1235)
- Bump moment from 2.29.1 to 2.29.2 in /tests_e2e by @dependabot in [PR 1241](https://github.com/Quansight/qhub/pull/1241)
- Update cdsdashboards to 0.6.1, Voila to 0.3.5 by @danlester in [PR 1240](https://github.com/Quansight/qhub/pull/1240)
- Bump minimist from 1.2.5 to 1.2.6 in /tests_e2e by @dependabot in [PR 1208](https://github.com/Quansight/qhub/pull/1208)
- output check fix by @Adam-D-Lewis in [PR 1244](https://github.com/Quansight/qhub/pull/1244)
- Update panel version to fix jinja2 recent issue by @viniciusdc in [PR 1248](https://github.com/Quansight/qhub/pull/1248)
- Add support for terraform overrides in cloud and VPC deployment for Azure by @aktech in [PR 1253](https://github.com/Quansight/qhub/pull/1253)
- Add test-release workflow by @iameskild in [PR 1245](https://github.com/Quansight/qhub/pull/1245)
- Bump async from 3.2.0 to 3.2.3 in /tests_e2e by @dependabot in [PR 1260](https://github.com/Quansight/qhub/pull/1260)
- [WIP] Add support for VPC deployment for GCP via terraform overrides by @aktech in [PR 1259](https://github.com/Quansight/qhub/pull/1259)
- Update login instructions for training by @iameskild in [PR 1261](https://github.com/Quansight/qhub/pull/1261)
- Add docs for general node upgrade by @iameskild in [PR 1246](https://github.com/Quansight/qhub/pull/1246)
- [ImgBot] Optimize images by @imgbot in [PR 1264](https://github.com/Quansight/qhub/pull/1264)
- Fix project name and domain at None by @pierrotsmnrd in [PR 856](https://github.com/Quansight/qhub/pull/856)
- Adding name convention validator for QHub project name by @viniciusdc in [PR 761](https://github.com/Quansight/qhub/pull/761)
- Minor doc updates by @iameskild in [PR 1268](https://github.com/Quansight/qhub/pull/1268)
- Enable display of Qhub version by @viniciusdc in [PR 1256](https://github.com/Quansight/qhub/pull/1256)
- Fix missing region from AWS provider by @viniciusdc in [PR 1271](https://github.com/Quansight/qhub/pull/1271)
- Re-enable GPU profiles for GCP/AWS by @viniciusdc in [PR 1219](https://github.com/Quansight/qhub/pull/1219)
- Release notes for `v0.4.1` by @iameskild in [PR 1272](https://github.com/Quansight/qhub/pull/1272)

### New Contributors

- @dependabot made their first contribution in [PR 1241](https://github.com/Quansight/qhub/pull/1241)

[**Full Changelog**](https://github.com/Quansight/qhub/compare/v0.4.0...v0.4.1)

## Release v0.4.0.post1 - April 7, 2022

This post-release addresses a few minor bugs and updates the release notes.
There are no breaking changes or API changes.

- Render `.gitignore`, black py files - [PR 1206](https://github.com/Quansight/qhub/pull/1206)
- Update qhub-dask pinned version - [PR 1224](https://github.com/Quansight/qhub/pull/1224)
- Update conda-store-environment variable `type` - [PR 1213](https://github.com/Quansight/qhub/pull/1213)
- Update release notes - justification for changes in `v0.4.0` - [PR 1178](https://github.com/Quansight/qhub/pull/1178)
- Merge spawner and profile env vars to ensure dashboard sharing vars are provided to dashboard servers - [PR 1237](https://github.com/Quansight/qhub/pull/1237)

## Release v0.4.0 - March 17, 2022

**WARNING**

> If you're looking for a stable version of QHub, please consider `v0.3.14`. The `v0.4.0` release has many breaking changes and rough edges that will be resolved in upcoming point releases.

We are happy to announce the release of `v0.4.0`! This release lays the groundwork for many exciting new features and improvements in the future, stay tuned.

Version `v0.4.0` introduced many design changes along with a handful of user-facing changes that require some justification. Unfortunately, as a result of these changes, QHub
instances that are upgraded from a previous version to `v0.4.0` will irrevocably break.

Until we have a fully functioning backup mechanism, anyone looking to upgrade is highly encouraged to back up their data; see the
[upgrade docs](https://docs.qhub.dev/en/latest/source/admin_guide/breaking-upgrade.html) and more specifically, the
[backup docs](https://docs.qhub.dev/en/latest/source/admin_guide/backup.html).

These design changes were considered important enough that the development team felt they were warranted. Below we try to highlight a few of the largest changes
and provide justification for them.

- Replace Terraform's resource targeting with staged Terraform deployments.
  - _Justification_: using Terraform resource targeting was never an ideal way of handing off outputs from one stage to the next, and Terraform explicitly warns its users that it's only
    intended to be used "for exceptional situations such as recovering from errors or mistakes".
- Fully remove `cookiecutter` as a templating mechanism.
  - _Justification_: Although `cookiecutter` has its benefits, we were becoming overly reliant on it as a means of rendering various scripts needed for the deployment. Reading through
    Terraform scripts with scattered `cookiecutter` statements was increasingly troublesome and a bit intimidating. Our IDEs are also much happier about this change.
- Removing users and groups from the `qhub-config.yaml` and replacing user management with Keycloak.
  - _Justification_: Up until now, any change to a QHub deployment needed to be made in the `qhub-config.yaml`, which had the benefit of centralizing any configuration. However, it
    also potentially limited the kinds of user management tasks while also causing the `qhub-config.yaml` to balloon in size. Another benefit of removing users and groups from the
    `qhub-config.yaml` that deserves highlighting is that user management no longer requires a full redeployment.
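
The staged approach in the first bullet can be pictured as follows. This is a conceptual sketch, not QHub's implementation: the stage names, the `run_stages` helper and the output keys are all hypothetical, and each Terraform run is replaced by a stub so the example runs on its own. The point is that each stage is applied in full and its outputs become inputs to the next stage, rather than targeting individual resources inside one large deployment.

```python
from typing import Callable, Dict, List, Tuple

Outputs = Dict[str, str]
Stage = Tuple[str, Callable[[Outputs], Outputs]]


def infrastructure(inputs: Outputs) -> Outputs:
    # Stub standing in for a full apply of the first stage (hypothetical output).
    return {"cluster_endpoint": "https://cluster.example"}


def kubernetes_services(inputs: Outputs) -> Outputs:
    # This stage consumes an output of the previous stage as an input.
    endpoint = inputs["cluster_endpoint"]
    return {"ingress_url": f"{endpoint}/ingress"}


def run_stages(stages: List[Stage]) -> Outputs:
    """Apply stages in order, passing the accumulated outputs forward."""
    collected: Outputs = {}
    for name, apply in stages:
        collected.update(apply(dict(collected)))
        print(f"stage {name!r} done")
    return collected


outputs = run_stages([
    ("infrastructure", infrastructure),
    ("kubernetes-services", kubernetes_services),
])
print(outputs)
```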

Although breaking changes are never fun, we hope the reasons outlined above are encouraging signs that we are working on building a better, more stable, more flexible product. If you
experience any issues or have any questions about these changes, feel free to open an [issue on our GitHub repo](https://github.com/Quansight/qhub/issues).

### Breaking changes

Explicit user-facing changes:

- Upgrading to `v0.4.0` will require a filesystem backup given the scope and size of the current change set.
  - Running `qhub upgrade` will produce an updated `qhub-config.yaml` and a JSON file of users that can then be imported into Keycloak.
- With the addition of Keycloak, QHub will no longer support `security.authentication.type = custom`.
  - No more users and groups in the `qhub-config.yaml`.

### Feature changes and enhancements

- Authentication is now managed by Keycloak.
- QHub Helm extension mechanism added.
- Allow JupyterHub overrides in the `qhub-config.yaml`.
- `qhub support` CLI option to save Kubernetes logs.
- Updates `conda-store` UI.

### What's Changed

<details>

- Enabling Vale CI with GitHub Actions by @HarshCasper in https://github.com/Quansight/qhub/pull/871
- Qhub upgrade by @danlester in https://github.com/Quansight/qhub/pull/870
- Documentation cleanup by @HarshCasper in https://github.com/Quansight/qhub/pull/873
- \[Docs\] Add Traefik wildcard docs by @viniciusdc in https://github.com/Quansight/qhub/pull/876
- replace deprecated "minikube cache add" with "minikube image load" by @Adam-D-Lewis in https://github.com/Quansight/qhub/pull/880
- Azure Python needs different env var names to Terraform by @danlester in https://github.com/Quansight/qhub/pull/882
- Add notes about broken upgrades by @tylerpotts in https://github.com/Quansight/qhub/pull/877
- Keycloak integration first pass by @danlester in https://github.com/Quansight/qhub/pull/848
- K8s tests - keycloak adduser by @danlester in https://github.com/Quansight/qhub/pull/890
- Documentation cleanup by @HarshCasper in https://github.com/Quansight/qhub/pull/889
- Improvements to templates and readme by @trallard in https://github.com/Quansight/qhub/pull/893
- Keycloak docs by @danlester in https://github.com/Quansight/qhub/pull/901
- DOCS: Add a PR Template by @HarshCasper in https://github.com/Quansight/qhub/pull/900
- Delete existing `.gitlab-ci.yml` when rendering by @iameskild in https://github.com/Quansight/qhub/pull/887
- Qhub Extension (Ready for Review) by @Adam-D-Lewis in https://github.com/Quansight/qhub/pull/886
- Updates to Readme by @trallard in https://github.com/Quansight/qhub/pull/897
- Mirror docker images to ghcr and quay container registry by @aktech in https://github.com/Quansight/qhub/pull/912
- Fix CI: skip failure on cleanup by @aktech in https://github.com/Quansight/qhub/pull/910
- Create and solve envs using mamba by @iameskild in https://github.com/Quansight/qhub/pull/915
- Pin terraform providers by @Adam-D-Lewis in https://github.com/Quansight/qhub/pull/914
- qhub-config.yaml as a secret by @danlester in https://github.com/Quansight/qhub/pull/905
- Setup/Add integration/deployment tests via pytest by @aktech in https://github.com/Quansight/qhub/pull/922
- Disable/Remove the stale bot by @viniciusdc in https://github.com/Quansight/qhub/pull/923
- Integrates Hadolint for Dockerfile linting by @HarshCasper in https://github.com/Quansight/qhub/pull/917
- Reduce minimum nodes in user and dask node pools to 0 for Azure / GCP by @tarundmsharma in https://github.com/Quansight/qhub/pull/723
- Allow jupyterhub.overrides in qhub-config.yaml by @danlester in https://github.com/Quansight/qhub/pull/930
- qhub destroy using targets by @danlester in https://github.com/Quansight/qhub/pull/948
- Take AWS region from AWS_DEFAULT_REGION into qhub-config.yaml on init… by @danlester in https://github.com/Quansight/qhub/pull/950
- cookicutter template out of raw by @danlester in https://github.com/Quansight/qhub/pull/951
- kubernetes-initialization depends_on kubernetes by @danlester in https://github.com/Quansight/qhub/pull/952
- Add timeout to terraform import command by @tylerpotts in https://github.com/Quansight/qhub/pull/949
- Timeout in process (for import) by @danlester in https://github.com/Quansight/qhub/pull/955
- Remove user/groups from YAML by @danlester in https://github.com/Quansight/qhub/pull/956
- qhub upgrade custom auth plus tests by @danlester in https://github.com/Quansight/qhub/pull/946
- Add minimal support `centos` images by @iameskild in https://github.com/Quansight/qhub/pull/943
- Keycloak Export by @danlester in https://github.com/Quansight/qhub/pull/947
- qhub cli tool to save kubernetes logs - `qhub support` by @tarundmsharma in https://github.com/Quansight/qhub/pull/818
- Add docs for deploying QHub to existing EKS cluster by @iameskild in https://github.com/Quansight/qhub/pull/944
- Add jupyterhub-idle-culler to jupyterhub image by @danlester in https://github.com/Quansight/qhub/pull/959
- Robust external container registry by @danlester in https://github.com/Quansight/qhub/pull/945
- use qhub-jupyterhub-theme 0.3.3 to simplify JupyterHub config by @danlester in https://github.com/Quansight/qhub/pull/966
- Get kubernetes version for all cloud providers + pytest refactor by @iameskild in https://github.com/Quansight/qhub/pull/927
- Merge hub.extraEnv env vars by @danlester in https://github.com/Quansight/qhub/pull/968
- DOCS: Removing errors from documentation by @HarshCasper in https://github.com/Quansight/qhub/pull/941
- keycloak.realm_display_name by @danlester in https://github.com/Quansight/qhub/pull/973
- minor updates to keycloak docs by @tylerpotts in https://github.com/Quansight/qhub/pull/977
- CI changes for QHub by @HarshCasper in https://github.com/Quansight/qhub/pull/989
- Update `upgrade` docs and general doc improvements by @iameskild in https://github.com/Quansight/qhub/pull/990
- Remove `scope`, `oauth_callback_url` during upgrade step by @iameskild in https://github.com/Quansight/qhub/pull/997
- Adding Conda-Store to QHub by @costrouc in https://github.com/Quansight/qhub/pull/967
- Fix Jupyterlab docker build by @viniciusdc in https://github.com/Quansight/qhub/pull/1001
- DOCS: Fix broken link in setup doc by @HarshCasper in https://github.com/Quansight/qhub/pull/1006
- Fix Kubernetes local test deployment by @viniciusdc in https://github.com/Quansight/qhub/pull/1002
- Initial commit for auth and stages workflow by @costrouc in https://github.com/Quansight/qhub/pull/1003
- Fix formatting issues with black #1003 by @viniciusdc in https://github.com/Quansight/qhub/pull/1020
- use pyproject.toml and setup.cfg for packaging by @tonyfast in https://github.com/Quansight/qhub/pull/986
- Increase timeout/attempts for keycloak check by @viniciusdc in https://github.com/Quansight/qhub/pull/1023
- Fix issue with traefik issuing certificates with let's encrypt acme by @costrouc in https://github.com/Quansight/qhub/pull/1017
- Fixing cds dashboard conda environments being shown by @costrouc in https://github.com/Quansight/qhub/pull/1025
- Fix input variable support for multiple types by @viniciusdc in https://github.com/Quansight/qhub/pull/1029
- Fix Black/Flake8 problems by @danlester in https://github.com/Quansight/qhub/pull/1039
- Add remote state condition for 01-terraform-state provisioning by @viniciusdc in https://github.com/Quansight/qhub/pull/1042
- Round versions for upgrade and schema by @danlester in https://github.com/Quansight/qhub/pull/1038
- Code Server is now installed via conda, and the Jupyterlab Extension is https://github.com/betatim/vscode-binder/ by @costrouc in https://github.com/Quansight/qhub/pull/1044
- Removing cookiecutter from setup.cfg requirements by @costrouc in https://github.com/Quansight/qhub/pull/1026
- Destroy terraform-state stage when condition match by @viniciusdc in https://github.com/Quansight/qhub/pull/1051
- Fix up adding support for security.keycloak.realm_display_name key by @costrouc in https://github.com/Quansight/qhub/pull/1054
- Move external_container_reg to earlier stage by @danlester in https://github.com/Quansight/qhub/pull/1053
- Adding ability to specify overrides back into keycloak configuration by @costrouc in https://github.com/Quansight/qhub/pull/1055
- Deprecating terraform_modules option since no longer used by @costrouc in https://github.com/Quansight/qhub/pull/1057
- Adding security.shared_users_group option for default users group by @costrouc in https://github.com/Quansight/qhub/pull/1056
- Fix up adding back jupyterhub overrides option by @costrouc in https://github.com/Quansight/qhub/pull/1058
- prevent_deploy flag for safeguarding upgrades by @danlester in https://github.com/Quansight/qhub/pull/1047
- CI: Add layer caching for Docker images by @HarshCasper in https://github.com/Quansight/qhub/pull/1061
- Additions to TCP/DNS stage check, fix 1027 by @iameskild in https://github.com/Quansight/qhub/pull/1063
- FIX: Remove concurrency groups by @HarshCasper in https://github.com/Quansight/qhub/pull/1064
- Stage 08 extensions and realms/logout by @danlester in https://github.com/Quansight/qhub/pull/1069
- Auto create/destroy azure resource group by @viniciusdc in https://github.com/Quansight/qhub/pull/1071
- Add CICD schema and render workflows by @iameskild in https://github.com/Quansight/qhub/pull/1068
- Ensure that the shared folder symlink only exists if user has shared folders by @costrouc in https://github.com/Quansight/qhub/pull/1074
- Adds the ability on render to deleted targeted files or directories by @costrouc in https://github.com/Quansight/qhub/pull/1073
- DOCS: QHub 101 by @HarshCasper in https://github.com/Quansight/qhub/pull/1011
- remove jovyan user by @tylerpotts in https://github.com/Quansight/qhub/pull/1089
- More finely scoped github actions and kubernetes_test build docker images by @costrouc in https://github.com/Quansight/qhub/pull/1088
- Adding clearml overrides by @costrouc in https://github.com/Quansight/qhub/pull/1059
- Reorganizing render, deploy, destroy to unify stages input_vars, tf_objects, checks, and state_imports by @costrouc in https://github.com/Quansight/qhub/pull/1091
- Updates/fixes for rendering CICD workflows by @iameskild in https://github.com/Quansight/qhub/pull/1086
- fix bug in state_01_terraform_state function call by @tylerpotts in https://github.com/Quansight/qhub/pull/1094
- Use paths instead of paths-ignore so that test only run on changes to given paths by @costrouc in https://github.com/Quansight/qhub/pull/1097
- \[ENH\] - Update issue templates by @trallard in https://github.com/Quansight/qhub/pull/1083
- Generate independent objects for terraform-state resources by @viniciusdc in https://github.com/Quansight/qhub/pull/1102
- Complete implementation of destroy which goes through each stage by @costrouc in https://github.com/Quansight/qhub/pull/1103
- Change AWS Kubernetes provider authentication to use data.eks_cluster instead of exec by @costrouc in https://github.com/Quansight/qhub/pull/1107
- Relax qhub destroy to attempt to continue destroying resources by @costrouc in https://github.com/Quansight/qhub/pull/1109
- Breaking upgrade docs (0.4) by @danlester in https://github.com/Quansight/qhub/pull/1087
- Simplify default images by @tylerpotts in https://github.com/Quansight/qhub/pull/1114
- Change group structure by @danlester in https://github.com/Quansight/qhub/pull/1112
- Adding status field to each destroy stage to print status by @costrouc in https://github.com/Quansight/qhub/pull/1116
- Incorrect mapping of values to gcp node group instance types by @costrouc in https://github.com/Quansight/qhub/pull/1117
- FIX: Remove Conda Store from default images by @HarshCasper in https://github.com/Quansight/qhub/pull/1119
- Minor fix to `setup.cfg` by @iameskild in https://github.com/Quansight/qhub/pull/1122
- \[DOC\]- Update contribution guidelines by @trallard in https://github.com/Quansight/qhub/pull/1080
- Adding tests to visit additional endpoints by @costrouc in https://github.com/Quansight/qhub/pull/1118
- Adding tests for juypterhub-ssh, jhub-client, and vs code by @costrouc in https://github.com/Quansight/qhub/pull/1123
- Update Keycloak docs by @iameskild in https://github.com/Quansight/qhub/pull/1093
- Upgrade conda-store v0.3.10 and simplify specification of image by @HarshCasper in https://github.com/Quansight/qhub/pull/1130
- \[ImgBot\] Optimize images by @imgbot in https://github.com/Quansight/qhub/pull/1140
- Adjust Idle culler settings and add internal culling by @viniciusdc in https://github.com/Quansight/qhub/pull/1133
- \[BUG\] Removing jovyan home directory and issue with nss configuration by @costrouc in https://github.com/Quansight/qhub/pull/1142
- \[DOC\] Add `troubleshooting` docs by @iameskild in https://github.com/Quansight/qhub/pull/1139
- Update user login guides by @viniciusdc in https://github.com/Quansight/qhub/pull/1144
- \[ImgBot\] Optimize images by @imgbot in https://github.com/Quansight/qhub/pull/1146
- Workaround for kubernetes-client version issue by @iameskild in https://github.com/Quansight/qhub/pull/1148
- Make the commit of the terraform rendering optional (replaces PR 995) by @iameskild in https://github.com/Quansight/qhub/pull/1149
- Fix typos in user guide docs by @ericdatakelly in https://github.com/Quansight/qhub/pull/1154
- Minor docs clean up for v0.4.0 release by @iameskild in https://github.com/Quansight/qhub/pull/1155
- Read-the-docs and documentation updates by @tonyfast in https://github.com/Quansight/qhub/pull/1153
- Add markdown formatter for doc wrapping by @viniciusdc in https://github.com/Quansight/qhub/pull/1152
- remove deprecated param `count` from `.cirun.yml` by @aktech in https://github.com/Quansight/qhub/pull/1164
- Use qhub-bot for keycloak deployment/check by @iameskild in https://github.com/Quansight/qhub/pull/1167
- Only list active conda-envs for dask-gateway by @iameskild in https://github.com/Quansight/qhub/pull/1162

</details>

### New Contributors

- @imgbot made their first contribution in https://github.com/Quansight/qhub/pull/1140
- @ericdatakelly made their first contribution in https://github.com/Quansight/qhub/pull/1154

**Full Changelog**: https://github.com/Quansight/qhub/compare/v0.3.13...v0.4.0

## Release 0.3.13 - October 13, 2021

### Breaking changes

- No known breaking changes

### Feature changes and enhancements

- Allow users to specify external Container Registry ([#741](https://github.com/Quansight/qhub/pull/741))
- Integrate Prometheus and Grafana into QHub ([#733](https://github.com/Quansight/qhub/pull/733))
- Add Traefik Dashboard ([#797](https://github.com/Quansight/qhub/pull/797))
- Make ForwardAuth optional for ClearML ([#830](https://github.com/Quansight/qhub/pull/830))
- Include override configuration for Prefect Agent ([#813](https://github.com/Quansight/qhub/pull/813))
- Improve authentication type checking ([#834](https://github.com/Quansight/qhub/pull/834))
- Switch to pydata Sphinx theme ([#805](https://github.com/Quansight/qhub/pull/805))

### Bug fixes

- Add force-destroy command (only for AWS at the moment) ([#694](https://github.com/Quansight/qhub/pull/694))
- Include namespace in conda-store PVC ([#716](https://github.com/Quansight/qhub/pull/716))
- Secure ClearML behind ForwardAuth ([#721](https://github.com/Quansight/qhub/pull/721))
- Fix connectivity issues with AWS EKS via Terraform ([#734](https://github.com/Quansight/qhub/pull/734))
- Fix conda-store pod eviction and volume conflicts ([#740](https://github.com/Quansight/qhub/pull/740))
- Update `remove_existing_renders` to only delete QHub related files/directories ([#800](https://github.com/Quansight/qhub/pull/800))
- Reduce number of AWS subnets down to 4 to increase the number of available nodes by a factor of 4 ([#839](https://github.com/Quansight/qhub/pull/839))

## Release 0.3.11 - May 7, 2021

### Breaking changes

### Feature changes and enhancements

- better validation messages on github auto provisioning

### Bug fixes

- removing default values from pydantic schema which caused invalid yaml files to unexpectedly pass validation
- make kubespawner_override.environment overridable (prior changes were overwritten)

## Release 0.3.10 - May 6, 2021

### Breaking changes

- reverting `qhub_user` default name to `jovyan`

### Feature changes and enhancements

### Bug fixes

## Release 0.3.9 - May 5, 2021

### Breaking changes

### Feature changes and enhancements

### Bug fixes

- terraform formatting in cookiecutter for enabling GPUs on GCP

## Release 0.3.8 - May 5, 2021

### Breaking changes

### Feature changes and enhancements

- creating releases for QHub simplified
- added an image for overriding the dask-gateway being used

### Bug fixes

- dask-gateway is now properly exposed by default
- typo in cookiecutter for enabling GPUs on GCP

## Release 0.3.7 - April 30, 2021

### Breaking changes

### Feature changes and enhancements

- setting `/bin/bash` as the default terminal

### Bug fixes

- `jhsingle-native-proxy` added to the base jupyterlab image

## Release 0.3.6 - April 29, 2021

### Breaking changes

- simplified base jupyterlab image to no longer include dashboard packages (panel, etc.)

### Feature changes and enhancements

- added emacs and vim as default editors in image
- added jupyterlab-git and jupyterlab-sidecar since they now support 3.0
- improvements with `qhub destroy` cleanly deleting resources
- allow user to select conda environments for dashboards
- added command line argument `--skip-terraform-state-provision` to allow for skipping terraform state provisioning in `qhub deploy` step
- no longer render `qhub init` `qhub-config.yaml` file in alphabetical order
- allow user to select instance sizes for dashboards

### Bug fixes

- fixed gitlab-ci before_script and after_script
- fixed jovyan -> qhub_user home directory path issue with dashboards

## Release 0.3.5 - April 28, 2021

### Breaking changes

### Feature changes and enhancements

- added a `--skip-remote-state-provision` flag to allow `qhub deploy` within CI to skip the remote state creation
- added saner defaults for instance sizes and jupyterlab/dask profiles
- `qhub init` no longer renders `qhub-config.yaml` in alphabetical order
- set `spawn_default_options` to `False` to force the dashboard owner to pick a profile
- added `before_script` and `after_script` keys to `ci_cd` to allow customization of the CI process

### Bug fixes

## Release 0.3.4 - April 27, 2021

### Breaking changes

### Feature changes and enhancements

### Bug fixes

- remaining issues with ci_cd branch not being fully changed

## Release 0.3.3 - April 27, 2021

### Breaking changes

### Feature changes and enhancements

### Bug fixes

- Moved to ruamel as yaml parser to throw errors on duplicate keys
- fixed a url link error in cds dashboards
- Azure fixes to enable multiple deployments under one account
- Terraform formatting issue in acme_server deployment
- Terraform errors are now caught by qhub, which returns an error code

## Release 0.3.2 - April 20, 2021

### Bug fixes

- prevent gitlab-ci from freezing on gitlab deployment
- not all branches were configured via the `branch` option in `ci_cd`

## Release 0.3.1 - April 20, 2021

### Feature changes and enhancements

- added gitlab support for CI
- `ci_cd` field is now optional
- AWS provider now respects the region set
- More robust error messages in the CLI around project name and namespace
- `git init` default branch is now `main`
- branch for CI/CD is now configurable

### Bug fixes

- typo in `authenticator_class` for custom authentication

## Release 0.3.0 - April 14, 2021

### Feature changes and enhancements

- Support for self-signed certificate/secret keys via kubernetes secrets
- [jupyterhub-ssh](https://github.com/yuvipanda/jupyterhub-ssh) (`ssh` and `sftp` integration) accessible on ports `8022` and `8023` respectively
- VSCode ([code-server](https://github.com/cdr/code-server)) now provided in default image and integrated with jupyterlab
- [Dask Gateway](https://gateway.dask.org/) now accessible outside of cluster
- Moving fully towards traefik as a load balancer with tight integration with dask-gateway
- Adding ability to specify node selector label for general, user, and worker
- Ability to specify `kube_context` for local deployments; otherwise the default context is used
- Strict schema validation for `qhub-config.yaml`
- Terraform binary is auto-installed and version managed by qhub
- Deploy stage will auto render by default, removing the need for the render command for end users
- Support for namespaces with qhub deployments on kubernetes clusters
- Full JupyterHub theming, now including colors.
- JupyterHub docker image now independent from zero-to-jupyterhub.
- JupyterLab 3 now default user Docker image.
- Implemented the option to locally deploy QHub allowing for local testing.
- Removed the requirement for DNS; authorization is now password-based (no more OAuth requirements).
- Added option for password-based authentication
- CI now tests local deployment on each commit/PR.
- QHub Terraform modules are now pinned to a specific git branch via `terraform_modules.repository` and `terraform_modules.ref`.
- Adds support for Azure cloud provider.

### Bug fixes

### Breaking changes

- Terraform version is now pinned to a specific version
- `domain` attribute in `qhub-config.yaml` is now the url for the cluster

### Migration guide

0. Version `<version>` is in format `X.Y.Z`
1. Create release branch `release-<version>` based off `main`
2. Ensure full functionality of QHub. At a minimum, this involves verifying:

- \[ \] GCP, AWS, DO, and local deployment
- \[ \] "Let's Encrypt" successfully provisioned
- \[ \] Dask Gateway functions properly on each
- \[ \] JupyterLab functions properly on each

3. Increment the version number in `qhub/VERSION` in format `X.Y.Z`
4. Ensure that the version number in `qhub/VERSION` is used in pinning QHub in the github actions `qhub/template/{{ cookiecutter.repo_directory }}/.github/workflows/qhub-ops.yaml`
   in format `X.Y.Z`
5. Create a git tag `v<version>` pointing to the release branch once it is fully tested and the version numbers are incremented
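
The naming conventions in the steps above can be sketched in a few lines of Python. This is only an illustration: the helper name `release_names` is hypothetical and not part of QHub. Only the `X.Y.Z` format, the `release-<version>` branch name, and the `v<version>` tag name are taken from the steps.

```python
import re

# Hypothetical helper for illustration; not part of the QHub code base.
# It assumes "X.Y.Z" means three dot-separated groups of ASCII digits.
VERSION_PATTERN = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def release_names(version: str) -> dict:
    """Return the release branch and git tag names for a version string."""
    if VERSION_PATTERN.fullmatch(version) is None:
        raise ValueError(f"version {version!r} is not in X.Y.Z format")
    return {"branch": f"release-{version}", "tag": f"v{version}"}
```

For example, `release_names("0.3.0")` would give the branch `release-0.3.0` and the tag `v0.3.0`, while a value such as `"0.3"` is rejected before any branch or tag is created.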

---

## Release 0.2.3 - February 5, 2021

### Feature changes and enhancements

- Added conda prerequisites for GUI packages.
- Added `qhub destroy` functionality that tears down the QHub deployment.
- Changed the default repository branch from `master` to `main`.
- Added error message when Terraform parsing fails.
- Added templates for GitHub issues.

### Bug fixes

- `qhub deploy -c qhub-config.yaml` no longer prompts unsupported argument for `load_config_file`.
- Minor changes to the Step-by-Step walkthrough in the docs.
- Revamp of README.md to make it concise and highlight Nebari Slurm.

### Breaking changes

- Removed the registry for DigitalOcean.

## Thank you for your contributions!

> [Brian Larsen](https://github.com/brl0), [Rajat Goyal](https://github.com/RajatGoyal), [Prasun Anand](https://github.com/prasunanand), and
> [Rich Signell](https://github.com/rsignell-usgs) and [Josef Kellndorfer](https://github.com/jkellndorfer) for the insightful discussions.



---
File: nebari/SECURITY.md
---

# Security Policy

## Supported Versions

We support only the latest version, and we use [CalVer](https://calver.org/) for versioning.

You should feel comfortable upgrading if you use our documented public APIs and pay attention to any `DeprecationWarning`. Whenever compatibility needs to be broken, the change is announced in the [Changelog](https://www.nebari.dev/docs/references/RELEASE) and raises a `DeprecationWarning` before the break takes effect.
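
Python hides `DeprecationWarning` by default in many contexts, so it can help to surface these warnings explicitly before upgrading. The sketch below uses only the standard-library `warnings` module. The names `call_with_deprecations_as_errors` and `old_api` are hypothetical and exist only for illustration; they are not part of Nebari.

```python
import warnings


def call_with_deprecations_as_errors(func, *args, **kwargs):
    """Call func, turning any DeprecationWarning it emits into an exception."""
    with warnings.catch_warnings():
        # Inside this block, DeprecationWarning is raised instead of ignored.
        warnings.simplefilter("error", DeprecationWarning)
        return func(*args, **kwargs)


def old_api():
    # Hypothetical deprecated function, used only to demonstrate the filter.
    warnings.warn("old_api is deprecated", DeprecationWarning, stacklevel=2)
    return "ok"
```

Wrapping a call such as `call_with_deprecations_as_errors(old_api)` in a test makes the deprecation visible as a failure. Running a whole program with `python -W error::DeprecationWarning` has a similar effect without code changes.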

## Reporting a Vulnerability

If you think you have found a vulnerability, please report it at [nebari/security](https://github.com/nebari-dev/nebari/security/advisories/new). Please do not report security vulnerabilities on our public issue tracker. Exposing vulnerabilities publicly without giving maintainers a chance to release a fix puts users at risk.
