How to create Jobs in Kubernetes

This article discusses how to automate tasks in Kubernetes and OpenShift using Jobs and Cron Jobs. We will show some examples of how to create and manage them, and then discuss best practices for using Jobs in Kubernetes.

In a Kubernetes environment, you can use Jobs to automate tasks that need to run once or at specific intervals, such as backups, data processing, and clean-up activities. They provide a declarative way to schedule and manage these tasks, allowing you to specify when and how often they should be executed.

A Job creates one or more Pods, and these pods run until they complete their tasks. The Job itself is considered “complete” when all its pods have successfully terminated.

Here are the key features of a Kubernetes Job:

  1. Pod Completion: A Job ensures that a specified number of pods successfully complete their tasks before considering the entire Job complete.
  2. Parallelism: You can specify the number of parallel pods running at any given time. If a pod fails, the Job creates a new pod to replace it.
  3. Restart Policy: A Job’s Pod template must set restartPolicy to either “Never” or “OnFailure” (the usual Pod default of “Always” is not allowed for Jobs). Failed Pods are restarted or replaced, but successful Pods are never re-run.
  4. Pod Backoff Limit: You can specify how many times the Job retries failed Pods before marking the whole Job as failed (the backoffLimit field, which defaults to 6).
  5. Successful Completion: The Job is considered successful when all its pods complete successfully and no more pods are created.
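As a sketch, the following Job definition shows the spec fields that control points 1, 2, and 4 above (the field names — completions, parallelism, and backoffLimit — are part of the batch/v1 Job API; the name and container are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: parallel-job    # illustrative name
spec:
  completions: 5        # the Job is complete after 5 Pods finish successfully
  parallelism: 2        # run at most 2 Pods at the same time
  backoffLimit: 4       # mark the Job as failed after 4 failed retries
  template:
    spec:
      containers:
      - name: worker
        image: alpine
        command: ["echo", "working"]
      restartPolicy: Never
```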

Creating a simple Job

Firstly, we will see how to create a simple Job from a file definition. Create the following simple-job.yaml file:

apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
spec:
  template:
    spec:
      containers:
      - name: my-container
        image: alpine
        command: ["/bin/sh", "-c"]
        args:
        - |
          #!/bin/sh
          echo "Performing some basic computations..."
          let "a = 5 + 3"
          let "b = 2 * 4"
          let "c = a + b"
          echo "Result: $c"
      restartPolicy: Never

Then create it using the kubectl or oc client tool. For example:

kubectl create -f simple-job.yaml

A Pod will start, and its status will change to “Completed” once the task finishes:

my-job-pbtx7           0/1     Completed   0          35m

You can check the execution logs as follows:

$ kubectl logs my-job-pbtx7
Performing some basic computations...
Result: 16
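Since the container script is plain shell, you can dry-run the computation locally before packaging it into a Job. The sketch below uses POSIX $(( )) arithmetic expansion, which is portable to any /bin/sh (the `let` builtin used in the manifest also works in busybox ash, alpine’s /bin/sh):

```shell
# Same computation as the Job's container script, runnable locally.
echo "Performing some basic computations..."
a=$((5 + 3))
b=$((2 * 4))
c=$((a + b))
echo "Result: $c"   # prints "Result: 16"
```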

Another (quicker) option is to create the Job directly from the command line using the “create job” command:

$ kubectl create job another-job --image=alpine -- echo "Hello World"

Creating a Cron Job

A Cron Job differs from a standard Job in that it runs repeatedly, on a schedule defined by a cron expression. As with a regular Job, you can create one either from a definition file or from the command line. For example, here is how to create a Cron Job that runs an echo command every minute:

$ kubectl create cronjob hello --image=busybox:1.28   --schedule="*/1 * * * *" -- echo "Hello World"
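The --schedule argument uses standard five-field cron syntax:

```
# ┌───────────── minute (0-59; */1 = every minute)
# │ ┌─────────── hour (0-23)
# │ │ ┌───────── day of the month (1-31)
# │ │ │ ┌─────── month (1-12)
# │ │ │ │ ┌───── day of the week (0-6; Sunday = 0)
# │ │ │ │ │
  */1 * * * *
```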

The above Cron Job creates a new Pod for every scheduled execution; each Pod runs to completion and is kept around so you can inspect its logs:

$ kubectl get pods
NAME                   READY   STATUS              RESTARTS   AGE
hello-28211829-cxdms   0/1     Completed           0          60s
hello-28211830-q47zd   0/1     ContainerCreating   0          0s
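Completed Pods are not retained forever: the CronJob spec lets you tune how much history is kept and what happens when runs overlap. As a fragment of the spec (the stated defaults are those documented for the batch/v1 CronJob API):

```yaml
spec:
  schedule: "*/1 * * * *"
  successfulJobsHistoryLimit: 3   # keep the last 3 successful Jobs (the default)
  failedJobsHistoryLimit: 1       # keep the last failed Job (the default)
  concurrencyPolicy: Forbid       # skip a run if the previous one is still active
```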

Turning a Job into a CronJob

There is no command line option to convert an existing Job into a CronJob. However, it does not require too much effort. Firstly, get your Job in YAML format:

kubectl get job my-job -o yaml > my-job.yaml

This command retrieves the YAML definition of the Job object named my-job and saves it to a file named my-job.yaml. You can specify a different name for the Job object and the output file as needed. Note that the exported YAML includes system-generated fields (such as status, metadata.uid, and the controller-uid labels and selector) which you should remove before reusing the pod template.

Then, edit the YAML definition to create a new CronJob object. For example:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-cronjob
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        # Paste the contents of the 'spec.template' field from the Job definition here
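Assembled with the pod template from the simple-job.yaml example above, the resulting CronJob would look like this sketch:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-cronjob
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: my-container
            image: alpine
            command: ["/bin/sh", "-c"]
            args:
            - |
              echo "Performing some basic computations..."
              let "a = 5 + 3"
              let "b = 2 * 4"
              let "c = a + b"
              echo "Result: $c"
          restartPolicy: Never
```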

Managing your Jobs

To list the Jobs available in your namespace you can use the following command:

kubectl get jobs

On the other hand, to delete a Job and its Pods you can execute the “delete job” command. For example:

kubectl delete job <job-name>

Note that deleting a single Job only removes that run: to stop a Cron Job from scheduling further executions, delete the CronJob object itself:

kubectl delete cronjob <cronjob-name>

Finally, it’s worth mentioning that you can also manage the creation and disposal of Jobs from the OpenShift Web Console. Enter the Administrator view of your namespace, then choose CronJobs or Jobs from the left panel.

Use Cases for Jobs

You should use Jobs in Kubernetes and OpenShift primarily for running short-lived tasks to completion. They are suitable for scenarios where you need to perform a task and ensure it’s executed once successfully. For example:

  1. Batch Processing: Running data processing tasks, calculations, or data transformations in a batch mode.
  2. Data Import/Export: Importing or exporting data to/from databases, cloud storage, or other data sources.
  3. Backup and Restore: Running backup tasks for databases, files, or other critical data.
  4. Periodic Tasks: Running tasks at specified intervals, similar to cron jobs, but for containerized applications.
  5. One-Time Operations: Running tasks such as database migrations, initialization, or setup tasks.
  6. Cleaning and Maintenance: Running cleanup tasks, such as deleting temporary files or expired data.

Unfitting Use Cases for Jobs

While Jobs are useful for many scenarios, they might not be the best fit for all use cases. Here are some scenarios where other Kubernetes primitives might be more appropriate:

  1. Long-Running Services: If you need to run a service that should be available continuously, a Deployment or StatefulSet is a better choice. Jobs are better for finite tasks, not for maintaining ongoing services.
  2. Scaling Services: Jobs don’t automatically handle scaling. For services that need to scale up or down based on demand, Deployments, ReplicationControllers, or ReplicaSets are more appropriate.
  3. Continuous Workflows: For complex workflows involving multiple steps, dependencies, and state management, a dedicated Kubernetes-native workflow engine such as Argo Workflows might be more suitable.
  4. Persistent Data Processing: If your task involves continuous processing of data streams or requires stateful processing, dedicated stream-processing tools such as Kafka Streams or Apache Flink might be better options.
  5. Inter-Pod Communication: Jobs are isolated tasks. If your task involves communication and coordination between multiple pods or microservices, you might need to design a more complex application architecture.

Conclusion

Jobs and Cron Jobs are powerful tools for automating one-off and recurring tasks in Kubernetes and OpenShift environments. By understanding how to create, schedule, and manage them, you can streamline operations, reduce manual intervention, and ensure the timely execution of critical processes in your containerized applications. Integrating them into your workflow can contribute to a more efficient and reliable cloud-native infrastructure.

Found the article helpful? If so, please follow us on Socials!