Kubeflow Pipelines allows you to build and run portable, scalable machine learning workflows using Kubernetes-managed containers.
Learn more about Kubeflow Pipelines from their project documentation page.
The K Flow platform offers two main ways of interacting with Kubeflow Pipelines. The simpler option is Elyra, a JupyterLab extension that lets you build reusable workflows without writing code. Advanced users can also work directly with the Kubeflow Pipelines SDK.
The main advantage of Elyra is the ability to use a pre-built catalog of components. K Flow offers a wide variety of pipeline components, and we are constantly developing new ones. Most pre-built examples that we provide use Elyra, so it’s important to become familiar with it.
Although Elyra is very powerful, it has some key limitations that might prompt users with more advanced use cases to work directly with the Kubeflow Pipelines Python SDK. There are two major versions of Kubeflow Pipelines, and as of June 2023 Elyra only supports the v1 SDK.
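If you do go the SDK route, the sketch below shows the general shape of a v1 SDK submission. It is a minimal, hypothetical example (the say_hello component, pipeline name, and base image are placeholders, not K Flow components), and it assumes the kfp 1.x package is installed in your notebook image and that the in-cluster ml-pipeline endpoint used in the runtime configurations below is reachable from your notebook.

# Minimal sketch: define a one-step pipeline and submit it with the KFP v1 SDK.
# Assumes kfp 1.x is installed; in multi-user deployments you may also need to
# pass your profile namespace when creating runs.
import kfp
from kfp import dsl
from kfp.components import create_component_from_func

def say_hello(name: str) -> str:
    print(f"Hello, {name}!")
    return name

# Wrap a plain Python function as a lightweight component.
hello_op = create_component_from_func(say_hello, base_image="python:3.9")

@dsl.pipeline(name="hello-pipeline", description="Smoke-test pipeline for the KFP v1 SDK")
def hello_pipeline(name: str = "K Flow"):
    hello_op(name)

# Same in-cluster API endpoint as in the Elyra runtime configurations below.
client = kfp.Client(host="http://ml-pipeline.kubeflow.svc.cluster.local:8888")
client.create_run_from_pipeline_func(hello_pipeline, arguments={"name": "K Flow"})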
These instructions allow you to submit Kubeflow Pipelines jobs from a Kubeflow Notebook server. They only need to be run once for each new notebook server you create.
For the AWS environment: in your Kubeflow notebook, open a terminal and run the following command. Be sure to update S3_BUCKET_NAME and KUBEFLOW_HOST as appropriate.
S3_BUCKET_NAME=kflow-eks-dev-kfp
KUBEFLOW_HOST=dev.aws.kflow.ai
elyra-metadata create runtimes \
--json="{ \"display_name\": \"Kubeflow\", \"metadata\": { \"tags\": [], \"display_name\": \"Kubeflow\", \"engine\": \"Argo\", \"auth_type\": \"KUBERNETES_SERVICE_ACCOUNT_TOKEN\", \"api_endpoint\": \"http://ml-pipeline.kubeflow.svc.cluster.local:8888\", \"public_api_endpoint\": \"https://${KUBEFLOW_HOST}/pipeline\", \"cos_auth_type\": \"AWS_IAM_ROLES_FOR_SERVICE_ACCOUNTS\", \"cos_endpoint\": \"https://s3.${AWS_REGION}.amazonaws.com\", \"cos_bucket\": \"${S3_BUCKET_NAME}\", \"runtime_type\": \"KUBEFLOW_PIPELINES\" }, \"schema_name\": \"kfp\" }" \
--schema_name="kfp"
Then, set up your component catalog:
elyra-metadata create component-catalogs \
--name="kflow" \
--display_name="K Flow" \
--paths="['/home/jovyan/shared/lib/pipeline-components']" \
--schema_name="local-directory-catalog" \
--runtime_type="KUBEFLOW_PIPELINES"
For the GCP environment: in your Kubeflow notebook, open a terminal and run the following command. Be sure to update KUBEFLOW_HOST as appropriate.
KUBEFLOW_HOST=dev.gcp.kflow.ai
elyra-metadata create runtimes \
--json="{ \"display_name\": \"Kubeflow\", \"metadata\": { \"tags\": [], \"display_name\": \"Kubeflow\", \"engine\": \"Argo\", \"auth_type\": \"KUBERNETES_SERVICE_ACCOUNT_TOKEN\", \"api_endpoint\": \"http://ml-pipeline.kubeflow.svc.cluster.local:8888\", \"public_api_endpoint\": \"https://${KUBEFLOW_HOST}/pipeline\", \"cos_auth_type\": \"USER_CREDENTIALS\", \"cos_endpoint\": \"http://minio-service.kubeflow.svc.cluster.local:9000\", \"cos_bucket\": \"elyra\", \"cos_username\": \"minio\", \"cos_password\": \"minio123\", \"runtime_type\": \"KUBEFLOW_PIPELINES\" }, \"schema_name\": \"kfp\" }" \
--schema_name="kfp"
Then, set up your component catalog:
elyra-metadata create component-catalogs \
--name="kflow" \
--display_name="K Flow" \
--paths="['/home/jovyan/shared/lib/pipeline-components']" \
--schema_name="local-directory-catalog" \
--runtime_type="KUBEFLOW_PIPELINES"
TODO