by Miklos Csendes

Published on 12/19/2017

Last updated on 03/21/2024

Introduction to spotguides

Note: The Spotguides feature mentioned in this post is outdated and not available anymore. In case you are interested in a similar feature, contact us for details.

Last week we released the first version of Pipeline, a PaaS with end-to-end support for cloud native apps: from GitHub commit hooks that deploy to the cloud in minutes, to a fully customizable CI/CD workflow. At the core of the Pipeline PaaS are its spotguides. Each spotguide is a collection of workflow/pipeline steps defined in a .pipeline.yml file, plus a few Drone plugins. In this post we'd like to demystify spotguides and describe, step by step, how they work; the next post will be a tutorial on how to write a custom spotguide and its associated plugin.

From a distance each spotguide is just a customizable CI/CD pipeline defined in a yaml file, a plugin written in Golang and a Docker container that can be deployed/executed.

Note: The Pipeline CI/CD module mentioned in this post is outdated and not available anymore. You can integrate Pipeline to your CI/CD solution using the Pipeline API. Contact us for details.

Building blocks

Pipeline

Pipeline is an API and execution engine that provisions Kubernetes clusters and container engines in the cloud and deploys applications. Like Kubernetes itself, it is independent of the application/spotguide it deploys. Any application that can be packaged as a Docker container and has a manifest file or Helm chart can be deployed to a supported cloud provider or on-prem Kubernetes cluster and managed by Pipeline. This is true of applications like Apache Spark, Kafka and Zeppelin, but Pipeline is not tied to big data workloads (it's a generic microservice platform): it also supports Java applications (with cgroups) and distributed, resilient databases (exposing MySQL and Postgres wire protocols as a service). At the risk of over-simplifying things, a call to the Pipeline API is just one execution step in a CI/CD workflow, and that step is actually governed by the Pipeline CI/CD plugin.

Default plugins

There are a few well defined out-of-the-box plugins that are already part of the Drone CI/CD component. Any complete list of those plugins would be quite large, but, just to highlight a few, some that we frequently use are:

Docker - a plugin to build and publish Docker images to a container registry
git - a plugin that clones git repositories
s3 cache/sync - a plugin that caches build artifacts to S3 compatible storage backends like Minio or Rook, and syncs files with a bucket
azure/google storage - a plugin for publishing files to Azure and Google blob storage
dockerhub - a plugin to trigger a remote Docker Hub build
slack - a plugin for Slack notifications

Most of these plugins require a credential or a keypair to access and manage remote resources. The CI/CD system supports a convenient way to pass secrets (like passwords and ssh keys) without placing them alongside a workflow definition and storing them in GitHub. You can do this by using the API or the CLI, by passing them into the plugin at runtime as ENV variables, or, if you're running Kubernetes (like we do), through secrets or config maps.
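As a rough illustration of the runtime injection described above, the sketch below shows how a plugin might read its settings and secrets from environment variables rather than from the workflow file. It assumes that both settings and secrets arrive as `PLUGIN_`-prefixed variables (Drone exposes plugin settings this way; treating secrets the same way is an assumption). The setting names `endpoint` and `token` and the helpers `load_settings` and `redact` are hypothetical, and Python is used for brevity even though the plugins in this post are written in Golang.

```python
import os


def load_settings(environ=None, required=("endpoint", "token")):
    """Collect PLUGIN_* variables into a dict and fail fast on missing ones.

    The setting names are hypothetical; a real plugin defines its own.
    """
    environ = os.environ if environ is None else environ
    prefix = "PLUGIN_"
    settings = {
        key[len(prefix):].lower(): value
        for key, value in environ.items()
        if key.startswith(prefix)
    }
    missing = [name for name in required if not settings.get(name)]
    if missing:
        raise ValueError("missing plugin settings: " + ", ".join(missing))
    return settings


def redact(settings, secret_keys=("token",)):
    """Return a copy that is safe to log, with secret values masked."""
    return {k: ("***" if k in secret_keys else v) for k, v in settings.items()}


if __name__ == "__main__":
    # Example values only; in a real run the CI/CD engine injects these.
    demo_env = {"PLUGIN_ENDPOINT": "http://localhost:9090", "PLUGIN_TOKEN": "s3cr3t"}
    print(redact(load_settings(demo_env)))
```

Because the secret only ever exists in the plugin's environment, the workflow definition committed to GitHub can stay free of credentials, and masking it before logging keeps it out of build output as well.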

Custom plugins

Spotguides are application specific. The pipeline/workflow steps described in the .pipeline.yml file reflect the typical lifecycle of the application, and are usually unique. Needless to say, the CI/CD workflow/pipeline is fully customizable and supports parallel or conditional execution.

Custom plugins sit at the core of any spotguide. We've written custom plugins for our default supported apps; these plugins are extremely simple to build (they usually take 1-2 days) and have well defined interfaces. Variable injection, execution as a container, security, etc. are all out of the realm of concern for a plugin's writer: these are default services you already get from the CI/CD engine.

By way of an example, take a look at our Apache Spark spotguide. This is how you get from a GitHub commit hook to a running Spark application on Kubernetes in minutes. The overall flow looks like this:

[Figure: overall flow of the Spark spotguide]

This flow translates to the following plugin flow:

[Figure: plugin flow of the Spark spotguide]

The building blocks for the Spark spotguide are as follows:

Component	Source code
Spark RSS Helm charts	https://github.com/banzaicloud/banzai-charts/tree/master/stable/spark-rss
Spark Shuffle Helm charts	https://github.com/banzaicloud/banzai-charts/tree/master/stable/spark-shuffle
Spark Helm charts	https://github.com/banzaicloud/banzai-charts/tree/master/stable/spark
K8S Proxy plugin	https://github.com/banzaicloud/drone-plugin-k8s-proxy
Spark K8S submit plugin	https://github.com/banzaicloud/drone-plugin-spark-submit-k8s
Pipeline client plugin	https://github.com/banzaicloud/drone-plugin-pipeline-client

This combination of plugins written in Golang, the .pipeline.yml file and Kubernetes deployment definitions (Helm charts in our case) composes a spotguide. As you can see, spotguides are application specific. However, the platform that deploys and governs them, Pipeline, is agnostic. This is an easy and powerful way to integrate any distributed application that can be containerized so that it runs on our microservice PaaS. Pipeline creates and defines the runtime (Kubernetes) and deploys the application (described by Helm charts) through a REST API.

Helm charts

We use Helm charts to deploy and orchestrate our applications. In order to write a spotguide, you'll need a Helm chart (or a lower-level k8s deployment unit like a manifest) and, possibly, some orchestration logic. Take, for instance, one of the examples we deploy and use: a distributed database. Kubernetes does not differentiate between resources and priorities when deploying applications. Helm charts do have dependencies, but there is no ordering among them. Because Helm 3.0 has so far not been released, we provide default init containers for a predefined number of protocols to allow ordering and higher level readiness probes. A basic example of such ordering is database startup: if you're deploying a simple web app with Pipeline that requires a database, the two are deployed in parallel, and the web app will fail until the database starts, is initialized and is ready to serve requests. These request failures show up in the logs and traces, and can trigger the default Prometheus alerts we deploy for the application. This is not ideal, but k8s does not currently have an out-of-the-box solution (at least not until Helm 3.0 is released). Thus, we provide protocol specific init containers that are able to enforce startup order, initialize applications and send readiness probes.
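To make the ordering problem concrete, here is a minimal sketch of the kind of check an init container could run before the web app starts. It is not the init container shipped with Pipeline: it only waits for a TCP port to accept connections, which is weaker than the protocol specific probes described above, and the function name `wait_for_tcp` and its parameters are hypothetical.

```python
import socket
import time


def wait_for_tcp(host, port, timeout=60.0, interval=1.0):
    """Block until host:port accepts a TCP connection; return True on success.

    Returns False if the deadline passes first, so the caller can exit
    non-zero and let Kubernetes restart the init container.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

An init container built around this would call something like `wait_for_tcp("database", 3306)` (service name and port are placeholders) and exit with a non-zero status on `False`. A protocol-aware version would go one step further and, for example, complete a handshake or run a trivial query, since an open port does not guarantee that the database has finished initializing.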

.pipeline.yml

The final piece of this equation is the yaml file. The .pipeline.yml connects these components (except the upcoming UI and CLI) into a single unit: it describes the workflow steps and defines the underlying plugins and their associated Helm charts. The yaml is pretty simple to read, maintain and execute. One added benefit is that, since all the steps above are containerized (plugins, for example), they can be used with other commercial CI/CD systems like CircleCI or Travis.
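The idea of a workflow file that names steps, binds each one to a plugin and optionally makes it conditional can be sketched in a few lines. Everything below is illustrative: the step layout, the `when` key, the step and plugin names, and the `run_workflow` and `record` helpers are assumptions made for this example, not the actual .pipeline.yml schema, and plugins are modelled as plain functions rather than containers.

```python
def run_workflow(steps, context, plugins):
    """Run steps in order, skipping any whose `when` condition does not match.

    `steps` is a list of dicts with a `name`, a `plugin`, and optional
    `settings` and `when` keys. This layout is illustrative only.
    """
    executed = []
    for step in steps:
        condition = step.get("when", {})
        if any(context.get(key) != value for key, value in condition.items()):
            continue  # conditional execution: the step does not apply here
        plugin = plugins[step["plugin"]]
        plugin(step.get("settings", {}), context)
        executed.append(step["name"])
    return executed


def record(log, label):
    """Build a stand-in plugin that only records what it was asked to do."""
    def plugin(settings, context):
        log.append((label, dict(settings)))
    return plugin


# Hypothetical steps loosely following the Spark flow described in this post.
SPARK_STEPS = [
    {"name": "create_cluster", "plugin": "pipeline_client",
     "settings": {"cluster_name": "demo"}},
    {"name": "install_spark", "plugin": "pipeline_client",
     "settings": {"chart": "spark"}, "when": {"branch": "master"}},
    {"name": "submit_job", "plugin": "spark_submit",
     "settings": {"main_class": "example.Main"}, "when": {"branch": "master"}},
]
```

Calling `run_workflow(SPARK_STEPS, {"branch": "master"}, plugins)` with stand-in plugins built by `record` runs all three steps, while a context with a different branch runs only the unconditional `create_cluster` step. Keeping the steps as data and the plugins as interchangeable units is what lets the same definition be driven by a different CI/CD system.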

About Banzai Cloud Pipeline

Banzai Cloud’s Pipeline provides a platform for enterprises to develop, deploy, and scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment for developers and operations teams alike. Strong security measures (multiple authentication backends, fine-grained authorization, dynamic secret management, automated secure communications between components using TLS, vulnerability scans, static code analysis, CI/CD, and so on) are default features of the Pipeline platform.
