Kubernetes Basics for Data Scientists

Frequently Asked Questions: Kubernetes Basics for Data Scientists

What is Kubernetes?

Kubernetes is an open-source platform designed to automate the deployment, scaling, and management of containerized applications.
How can data scientists benefit from Kubernetes?

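Kubernetes lets data scientists deploy, scale, and manage machine learning workloads on shared infrastructure. Because workloads are packaged as container images and described in declarative configuration files, experiments are easier to reproduce, the configuration can be kept under version control, and compute resources are used more efficiently. Together this shortens the path from developing a machine learning model to running it in production.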
What are the key Kubernetes components?
- Node: A worker machine in a Kubernetes cluster, which can be a physical or virtual machine.
- Pod: The smallest deployable unit in Kubernetes. A Pod runs one or more containers (such as Docker containers) that share networking and storage; most Pods have a single container. Pods are ephemeral by nature. If a Pod (or the node it runs on) fails, the controller that manages it, such as a Deployment or a Job, creates a replacement Pod; a Pod that was created on its own is not recreated. Pods also supply what their containers need to run, including configuration data and storage volumes. A volume can be backed by persistent storage, which outlives the Pod and can be mounted by other Pods that reference the same storage claim, subject to the access mode of that storage.
- Namespace: A Kubernetes abstraction that groups and organizes cluster resources into separate, non-overlapping virtual spaces, allowing for resource isolation, access control, and management of multiple projects, teams, or environments within the same cluster. Namespaces make it easier to manage and maintain complex applications and support multi-tenant deployments.
- Job: A Kubernetes object that represents a finite task (a task that is intended to run once). Kubernetes manages a job by creating one or more Pods and ensuring their successful completion.
- Service: A Kubernetes object that abstracts network access to a set of Pods, providing a stable IP address and DNS name.
- Deployment: A higher-level abstraction for managing Pods. It keeps the desired number of replicas running and rolls out updates to the application.
- ConfigMap: A Kubernetes object used to store non-confidential data in key-value pairs, decoupling configuration details from container images.
- Secret: A Kubernetes object used to store sensitive data, like passwords or API keys, separate from the Pod specification.
- Events: In the context of Kubernetes, events are records of important occurrences within the cluster, such as the creation or deletion of objects, scaling operations, or errors. Events provide insights into the state and behavior of the cluster, helping users understand, diagnose, and troubleshoot issues within their Kubernetes environment.
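To make the Job and Pod definitions above more concrete, here is a minimal sketch of a shell function that prints a Job manifest. The function name print_job_manifest, the Job name, and the container image are placeholders chosen for this illustration, not values from any particular cluster.

```shell
# Print a minimal Job manifest to stdout.
# Both arguments are placeholders chosen for this sketch:
#   $1 = Job name, $2 = container image
print_job_manifest() {
  job_name="$1"
  image="$2"
  cat <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: ${job_name}
spec:
  backoffLimit: 2            # retry a failed Pod at most twice
  template:                  # Pod template: the Job creates Pods from this
    spec:
      restartPolicy: Never   # Jobs require Never or OnFailure
      containers:
        - name: main
          image: ${image}
          command: ["python", "-c", "print('training step done')"]
EOF
}
```

To try it on a cluster you have access to, pipe the output into kubectl, for example: print_job_manifest demo-train python:3.11 | kubectl apply -f - and then list the Pod the Job created with: kubectl get pods -l job-name=demo-train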
How do I use the kubectl command to get logs, events, and more?
- To get the logs of the Pod that runs your Notebook, open a Terminal in the Notebook and run: kubectl logs $HOSTNAME (inside a Pod, $HOSTNAME is set to the Pod's name by default)
- To list pods: kubectl get pods -o wide
- To get logs from a specific Pod: kubectl logs <pod_name>
- To get logs from a specific container within a Pod: kubectl logs <pod_name> -c <container_name>
- To get a list of nodes: kubectl get nodes -o wide
- To get events (sorted by timestamp): kubectl get events --sort-by='.metadata.creationTimestamp'
- To get a list of Katib experiments: kubectl get experiment
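If you run these commands often, they can be wrapped in small shell functions. The sketch below is one possible way to do that; the function names pod_logs and recent_events are hypothetical helpers introduced here, and they only call the kubectl commands listed above.

```shell
# Print logs for a Pod. Pass a container name as the optional second
# argument when the Pod runs more than one container.
pod_logs() {
  pod="$1"
  container="${2:-}"
  if [ -n "$container" ]; then
    kubectl logs "$pod" -c "$container"
  else
    kubectl logs "$pod"
  fi
}

# Show the last N lines (default 20) of the event list, sorted by
# creation time so the newest events come last. On a long list the
# column header line is cut off by tail.
recent_events() {
  count="${1:-20}"
  kubectl get events --sort-by='.metadata.creationTimestamp' | tail -n "$count"
}
```

For example, pod_logs <pod_name> <container_name> prints the logs of one container, and recent_events 5 shows the five most recent events in the current namespace.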
How do I use ‘kubectl describe’?
- To describe a specific Pod: kubectl describe pod <pod_name>
- To describe all Pods in a namespace: kubectl describe pods -n <namespace_name>
- To describe a Katib Experiment: kubectl describe experiment <experiment_name>
- To describe a specific Job: kubectl describe job <job_name>
- To describe a specific Service: kubectl describe service <service_name>
- To describe a specific Deployment: kubectl describe deployment <deployment_name>
- To describe a specific ConfigMap: kubectl describe configmap <configmap_name>
- To describe a specific Secret: kubectl describe secret <secret_name>
To list every resource type defined in a Kubernetes cluster, run: kubectl api-resources. For each resource type listed, you can use commands such as kubectl get <resource-name> and kubectl describe <resource-name>.
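The two steps above (list the resource types, then query each one) can be combined in a loop. The following is a rough sketch rather than a definitive tool; the function name get_everything is hypothetical, and it assumes your account is allowed to list the resource types in the chosen namespace.

```shell
# List every namespaced resource type that supports "list", then run
# "kubectl get" for each type in the given namespace.
#   $1 = namespace (defaults to "default")
get_everything() {
  namespace="${1:-default}"
  kubectl api-resources --verbs=list --namespaced=true -o name |
    while read -r resource; do
      # --ignore-not-found keeps the output quiet for empty resource types;
      # --show-kind prefixes each object name with its type.
      kubectl get "$resource" -n "$namespace" --ignore-not-found --show-kind
    done
}
```

Calling get_everything <namespace_name> prints the objects of every listable type in that namespace, which can help when you are not sure which kind of object to describe next.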