Tag Archives: Kubernetes

Learn Kubernetes with Google: Join us live on October 6!

 

Graphic describing the Multi-cluster Services API functionalities

Kubernetes hasn’t stopped growing since it was released by Google as an open source project back in June 2014: from July 7, 2020 to a year later in 2021, there were 2,284 new contributors to the project1. And that’s not all: in 2020 alone, the Kubernetes project had 35 stable graduations2. These are 35 new features that are ready for production use in a Kubernetes environment. Looking at the CNCF Survey 2020, use of Kubernetes has increased to 83%, up from 78% in 2019. With these many new people joining the community, and the project gaining so much complexity: how can we make sure that Kubernetes remains accessible to everyone, including newcomers?

This is the question that inspired the creation of Learn Kubernetes with Google, a content program where we develop resources that explain how to make Kubernetes work best for you. At the Google Open Source Programs Office, we believe that increasing access for everyone starts by democratizing knowledge. This is why we started with a series of short videos that focus on specific Kubernetes topics, like the Gateway API, Migrating from Dockershim to Containerd, the Horizontal Pod Autoscaler, and many more topics!

Join us live

On October 6, 2021, we are launching a series of live events where you can interact live with Kubernetes experts from across the industry and ask questions—register now and join for free! “Think beyond the cluster: Multi-cluster support on Kubernetes” is a live panel that brings together the following experts:
  • Laura Lorenz - Software Engineer (Google) / Member of SIG Multicluster in the Kubernetes project
  • Tim Hockin - Software Engineer (Google) / Co-Chair of SIG Network in the Kubernetes project
  • Jeremy Olmsted-Thompson - Sr Staff software Engineer (Google) / Co-Chair of the SIG Multicluster in the Kubernetes project
  • Ricardo Rocha - Computing Engineer (CERN) / TOC Member at the CNCF
  • Paul Morie - Software Engineer (Apple) / Co-Chair of the SIG Multicluster in the Kubernetes project
Why is Multi-cluster support in Kubernetes important? Kubernetes has brought a unified method of managing applications and their infrastructure. Engineering your application to be a global service requires that you start thinking beyond a single cluster; yet, there are many challenges when deploying multiple clusters at a global scale. Multi-cluster has many advantages, it lets you minimize the latency and optimize it for the people consuming your application.

In this panel, we will review the history behind multi-cluster, why you should use it, how companies are deploying multi-cluster, and what are some efforts in upstream Kubernetes that are enabling it today. Check out the “Resources” tab on the event page to learn more about the Kubernetes MCS API and Join us on Oct 6!

By María Cruz, Program Manager – Google Open Source Programs Office

1 According to devstats

Kubernetes Community Annual Report 2020

El Carro extends the flexibility and choices for Oracle databases on Kubernetes

When we released El Carro, our goal was to provide the best experience possible to run Oracle databases on Kubernetes with the help of our operator. Today, we want to take a closer look at how that works. The diagram below shows the high-level architecture of a database that is managed by El Carro. At the core is the actual database instance with its background processes which run in a single container that contains the Oracle installation. So how does this container image get created and what goes into it? The image itself is essentially a snapshot of a filesystem that contains an operating system, packages and other software, and custom scripts. Specifically for El Carro, an image is made up of a base OS, required packages, and an Oracle database installation. The image must be stored on a container registry that is accessible by the Kubernetes cluster, and El Carro will expect oracle binaries to be installed in certain paths—or create symbolic links to those locations.

Architecture Diagram showing the operator controlling the db container.

Initially, El Carro worked with 12c for Enterprise Edition and 18c for Express Edition. And while 12c is still popular with many users, the extended support ended this summer. So the first news is that we added support for 19c, Oracle’s long term release. The choice should be easy for any new database deployments, but the options don’t end there.

We know that DBAs have different preferences in how and where software gets installed and we believe that making different options available will ultimately empower users. With the exception of Express Edition, redistributing is not a right granted by Oracle licenses, preventing the community from providing a public container registry with usable images. Rather than that, each user will have to build their own image based on binaries they download from Oracle themselves, using their own license agreement with Oracle. All of the other containers used by the El Carro operator use open source software and are made available on our public registry, so that you do not have to build and host them yourself.

Option 1 - Use El Carro to build your own image with GCP

If you are using GCP, then we have an easy way for you to create custom images, you just upload Oracle binaries and patches to your own GCS bucket and start a Cloud Build job that will create the container image for you and upload it to your own, private container registry. A single build script and serverless cloud services take care of the whole process, so that you don’t have to worry about building locally and moving more images across the internet. In addition to creating seeded images (see below), this method also allows you to build containers with the Oracle Patches such as Release Update Revisions (RURs).

Diagram of container image build pipeline where a cloud build job reads installation files from GCS and writes finished images to GCR.

Option 2 - Use El Carro to build you own image locally

You can also use the same Dockerfile and build process from Option 1—but without Google Cloud. Download Oracle installers and patches locally or to a VM used for the builds—then start a script that invokes Docker and builds the image on that machine. Lastly, tag and push the container image to a container registry of your choice. You will have to do a few more steps yourself if you don’t use Cloud Build, but you get the same image and customization options as with Option 1.

Option 3 - Use Oracle build scripts to build your own image

Oracle also maintains an open source repository of scripts to build container images with their database. Maybe you are already using those images either with docker or Kubernetes, or you prefer to use Oracle’s own build method over ours. We recently added functionality to El Carro to make sure that the resulting images work just as well as the ones that El Carro can build for you.

Option 4 - Use Oracle’s Container Registry directly

There is a way to avoid building your own images: The Oracle Container Registry contains pre-built images that can be used with our Kubernetes operator directly and without modification. But since Oracle’s registry can only be accessed by customers, it is protected with a password. After accepting Oracle’s license conditions, one can either copy images to their own registry, or configure OCR as a private repository in Kubernetes.

The Power of Seeding

Aside from the installation, it is the creation of a database that takes the longest time in the initial provisioning process and it is often a frustrating wait time before you can log in and use your database for the first time after creation. To reduce this wait time, the first two options allow you to build a pre-seeded database image that already contains a snapshot of a created and configured database. That way this initialization step is moved to the container build process and minimizes the startup time of new database instances.

Aside from the wait time, relying on a seeded image (i.e. including an empty database in the image can provide consistency in config options if the same image is to be used in multiple deployments).

Option 1 - El Carro on GCP

Option 2 - El Carro local build

Option 3 - Oracle local build

Option 4 - Oracle Container Registry

Versions

12c, 18c, 19c

12c, 18c, 19c

12c, 18c, 19c

19c

Editions

XE, EE

XE, EE

EE

EE

Patches Updates

yes

yes

no

no

Seeded Images

yes

yes

no

no

Automatic build pipeline

yes

no

no

n/a

Conclusion

We believe in an open cloud approach and empowering users with choice and flexibility. In the context of running Oracle databases on Kubernetes that means that you get to choose your database container images. El Carro provides build scripts that allow you to not only customize containers but also to increase security and robustness with the ability to bake patches and updates into the container image. Seeding container images with a database further reduces the deployment time by avoiding this step on first startup - which is especially useful in environments that create many databases - such as automatic test pipelines.

But other users may feel more comfortable in receiving support when they use Oracle’s pre-built images from their registry.

The choice is yours. Just know that El Carro is here to help you modernize your Oracle database workloads with Kubernetes. And if you have any other feature requests or choices that matter to you—let us know by filing an issue on Github.

By Bjoern Rost, Product Manager and Ash Gbadamassi, Software Engineer – Cloud Databases

Verifiable Supply Chain Metadata for Tekton


If you've been paying attention to the news at all lately, you've probably noticed that software supply chain attacks are rapidly becoming a big problem. Whether you're trying to prevent these attacks, responding to an ongoing one or recovering from one, you understand that knowing what is happening in your CI/CD pipeline is critical.

Fortunately, the Kubernetes-native Tekton project – an open-source framework for creating CI/CD systems – was designed with security in mind from Day One, and the new Tekton Chains project is here to help take it to the next level. Tekton Chains securely captures metadata for CI/CD pipeline executions. We made two really important design decisions early on in Tekton that make supply chain security easy: declarative pipeline definitions and explicit state transitions. This next section will explain what these mean in practice and how they make it easy to build a secure delivery pipeline.


Definitions or “boxes and arrows”
Just like everything in your high school physics class, a CI/CD pipeline can be modeled as a series of boxes. Each box has some inputs, some outputs, and some steps that happen in the middle. Even if you have one big complicated bash script that fetches dependencies, builds programs, runs tests, downloads the internet and deploys to production, you can draw boxes and arrows to represent this flow. The boxes might be really big, but you can do it.

Since the initial whiteboard sketches, the Pipeline and Task CRDs in Tekton were designed to allow users to define each step of their pipeline at a granular level. These types include support for mandatory declared inputs, outputs, and build environments. This means you can track exactly what sources went into a build, what tools were used during the build itself and what artifacts came out at the end. By breaking up a large monolithic pipeline into a series of smaller, reusable steps, you can increase visibility into the overall system. This makes it easier to understand your exposure to supply chain attacks, detect issues when they do happen and recover from them after.


Explicit transitions
After a pipeline is defined, there are a few approaches to orchestrating it: level-triggered and edge-triggered. Like most of the Kubernetes ecosystem, Tekton is designed to operate in a level-triggered fashion. This means steps are executed explicitly by a central orchestrator which runs one task, waits for completion, then decides what to do next. In edge-based systems, a pipeline definition would be translated into a set of events and listeners. Each step fires off events when it completes, and these events are then picked up by listeners which run the next set of steps.

Event-based or edge-triggered systems are easy to reason about, but can be tricky to manage at scale. They also make it much harder to track an artifact as it flows through the entire system. Each step in the pipeline only knows about the one immediately before it; no step is responsible for tracking the entire execution. This can become problematic when you try to understand the security posture of your delivery pipeline.

Tekton was designed with the opposite approach in mind - level-triggered. Instead of a Rube-Goldberg machine tied together with duct tape and clothespins, Tekton is more like an explicit assembly-line. Level-triggered systems like Tekton move from state-to-state in a calculated manner by a central orchestrator. They require more explicit-design up front, but they are easier to observe and reason about after. Supply chains that use systems like Tekton are more secure.


Secure delivery pipeline through chains and provenance
So how do these two design decisions combine to make supply chain security easier? Enter Tekton Chains.

By observing the execution of a Task or a Pipeline and paying careful attention to the inputs, outputs, and steps along the way, we can make it easier to track down what happened and why later on. This "observer" can be run in a separate trust domain and cryptographically sign all of this captured metadata as it's stored, leaving a tamper-proof activity ledger. This technique is called "verifiable builds." This securely generated metadata can be used in a number of ways, from audit logging to recovering from security breaches to pre-deployment policy enforcement.

You can install Chains into any Tekton-enabled cluster and configure it to generate this cryptographically-signed supply chain metadata for your builds. Chains supports pluggable signature systems like PGP, x509 and Cloud KMS's. Payloads can be generated in a few different industry-standard formats like the RedHat Simple-Signing and the In-Toto Provenance specifications. The full documentation is available here, but you can get started quickly with something like this:


For this tutorial, you’ll need access to a GKE Kubernetes cluster and a GCR registry with push credentials. The cluster should already have Tekton Pipelines installed.


Install Tekton Chains into your cluster:

$ kubectl apply --filename https://storage.googleapis.com/tekton-releases/chains/latest/release.yaml



Next, you’ll set up registry authentication for the Tekton Chains controller, so that it can push OCI image signatures to your registry. To set up authentication, you’ll create a Service Account and download credentials:

$ export PROJECT_ID=<GCP Project ID>

$ gcloud iam service-accounts create tekton-chains

$ gcloud iam service-accounts keys create credentials.json --iam-account=tekton-chains@${PROJECT_ID}.iam.gserviceaccount.com



Now, create a Kubernetes Secret from your credentials file so the Chains controller can access it:

$ kubectl create secret docker-registry registry-credentials \

  --docker-server=gcr.io \

  --docker-username=_json_key \

  [email protected] \

  --docker-password="$(cat credentials.json)" \

  -n tekton-chains

$ kubectl patch serviceaccount tekton-chains-controller \

  -p "{\"imagePullSecrets\": [{\"name\": \"registry-credentials\"}]}" -n tekton-chains



We can use cosign to generate a keypair as a Kubernetes secret, which the Chains controller will use for signing. Cosign will ask for a password, which will be stored in the secret:

$ cosign generate-key-pair -k8s tekton-chains/signing-secrets


Next, you’ll need to set up authentication to your GCR registry for the kaniko task as another Kubernetes Secret.

$ export CREDENTIALS_SECRET=kaniko-credentials

$ kubectl create secret generic $CREDENTIALS_SECRET --from-file credentials.json



Now, we’ll create a kaniko-chains task which will build and push a container image to your registry. Tekton Chains will recognize that an image has been built, and sign it automatically.

$ kubectl apply -f https://raw.githubusercontent.com/tektoncd/chains/main/examples/kaniko/gcp/kaniko.yaml

$ cat <<EOF | kubectl apply -f -

apiVersion: tekton.dev/v1beta1

kind: TaskRun

metadata:

  name: kaniko-run

spec:

  taskRef:

    name: kaniko-gcp

  params:

  - name: IMAGE

    value: gcr.io/${PROJECT_ID}/kaniko-chains

  workspaces:

  - name: source

    emptyDir: {}

  - name: credentials

    secret:

      secretName: ${CREDENTIALS_SECRET} 

EOF



Wait for the TaskRun to complete, and give the Tekton Chains controller a few seconds to sign the image and store the signature. You should be able to verify the signature with cosign and your public key:

$ cosign verify -key cosign.pub gcr.io/${PROJECT_ID}/kaniko-chains


Congratulations! You’ve successfully signed and verified an OCI image with Tekton Chains and cosign.


What's Next
Within Chains, we'll be improving integration with other supply-chain security projects. This includes support for Binary Transparency and Verifiable Builds through integrations with the Sigstore and In-Toto projects. We'll also be improving and providing a set of well-designed, highly secure Tasks and Pipeline definitions in the TektonCD Catalog.

In Tekton Pipelines, we plan on finishing up TEP-0025 (Hermekton) to enable the support for hermetic build execution. If you want to play around with it now, hermekton can be run as an alpha feature in experimental mode. When hermekton is enabled, a build runs in a locked-down environment without network connectivity. Hermetic builds guarantee all inputs have been explicitly declared ahead-of-time, providing for a more auditable supply-chain. Hermetic builds and Chains align well, because the hermeticity build property is contained in the full build provenance captured by Chains. Chains can generate and attest to metadata specifying exactly which sections of a build had network access.

This means policy can be defined around exactly which build tools are allowed to access the network and which ones are not. This metadata can be used in policies at build time (banning compilers with security vulnerabilities) or stored and used by policy engines at deploy time (only code-reviewed and verifiably built containers are allowed to run).

We believe supply-chain security must be built-in and by default. No task orchestrator can promise perfect supply-chain security, but TektonCD was designed with unique features in mind that make it easier to do the right thing. We're always looking for feedback on the design, goals and requirements. You can reach out on GitHub or the #chains Slack channel.

Modernizing Oracle operations with Kubernetes and El Carro

Google Cloud is releasing El Carro, an open source tool to help you transform and modernize your Oracle database operations. El Carro implements the Kubernetes operator pattern to deliver automation for provisioning and ongoing operations like backups, patching, and high availability for databases running in hybrid and multi-cloud environments. And it does so using the same declarative syntax that DevOps teams are using to manage applications. With El Carro, users can choose to modernize and transform their database operations in place and benefit from a consistent management experience and hybrid and multi-cloud portability. Released under the Apache License 2.0, you are free to use El Carro in any Kubernetes environment—you are in control.

Containers and Kubernetes deliver portability on standardized infrastructure, and today Oracle supports databases running in containers; they’ve also released container build files and images and helm charts to simplify provisioning. What is missing for the next level of integration is support for lifecycle operations and an extension of the Kubernetes API to the primitives needed for database management.

In addition, fully managed or autonomous services for Oracle may not make available all the required features, such as Active Data Guard, Multitenant, and In-Memory, parameters/flags, versions, and patch levels. DBAs also find themselves locked out of many roles, including sysadmin and root. These restrictions make many cloud architects fall back to lift and shift Oracle databases onto infrastructure as a service offerings and miss out on opportunities to modernize and transform database operations. And with transactional databases growing in number and criticality, organizations are struggling to deliver innovation and modernization. Engineers are already busy keeping up with sprawl and mundane operational tasks while adhering to strict change management processes.

How do we solve this database operations gap?

El Carro solves this. It is built with scalability in mind, using the same container orchestration infrastructure, Kubernetes, that powers many businesses and is a top choice for modern architectures. Its open API allows you to manage your database configurations as declarative code, enabling CI/CD or Gitops workflows for auditability and control mechanisms. El Carro automates many database lifecycle operations, like backups, replication, and patching. And, when it distributes databases on the nodes of a cluster, it is aware of the priority and resource requirements of each database to optimize tight packing while respecting quality of service. Lastly, it helps DBAs by delivering automation without restrictions and leaving DBAs in full control over their systems. You can choose to let the operator drive for you, but you can also take over the steering wheel yourself at any time.

Because Kubernetes is now the standard for portable infrastructure automation and orchestration, engineers appreciate how Kubernetes abstracts complex problems into manageable infrastructure as code. Kubernetes can scale from small projects to large projects that support the infrastructure that powers Google products and services for billions of users around the world. Moreover, Google pioneers the next generation of infrastructure as code that we refer to as Configuration as Data to declaratively establish a contract between developer intent and the runtime operation. According to the Cloud Native Survey 2020, two-thirds of respondents were either already running stateful workloads in production or were considering doing so within the next 12 months. We expect that datastores are going to drive the next wave of enterprise Kubernetes adoption.

A number of open source operators for databases, such as PostgreSQL, MySQL, and many others, have been released, are actively maintained by the community, and are popular among developers and architects looking for a hands-off approach to manage databases with their applications. El Carro extends the list of database operators to include Oracle.

What are we building with El Carro for Oracle databases?

The operator pattern emerged in late 2016 as an extension of the Kubernetes API and control loop aimed at automating more complicated and application-specific tasks that are beyond the native Kubernetes objects.

El Carro implements a custom resource definition (CRD), which is tailored to database management. Users set and change attributes of the custom resource using the Kubernetes API the same way they do for built-in objects such as pods, deployments, or services. The El Carro controller observes changes to the CRs and compares the declared state with the current reality in the cluster, then makes the necessary changes. Those changes could either affect the Kubernetes resources used by the database such as persistent volumes or the pod itself, or may result in issuing calls via SQL or command line tools to the database to create and modify users or other database objects.

Here’s a look at how this works:
El Carro Architecture
El Carro Architecture

The diagram above shows how the major components of a database managed by the El Carro Operator interact with each other. The controller monitors the CRD for any changes made by admins. It creates and manages the cluster resources that make up the actual database deployment: persistent volumes for filesystems and data, a pod to run containers with the actual database, and a daemon that allows the controller to securely run SQL commands on the database. And lastly, a service makes sqlnet connections available to applications and end users that can either run in the same Kubernetes cluster or outside of it.

At release time, the El Carro Operator can provision Oracle databases of 12c Enterprise Edition and 18c Express Edition. It manages instance parameters, pluggable databases, and users. You can take and restore backups either using rman or storage snapshots, and we are working to add additional features.

How to get involved with El Carro?

In the development process, we collaborated with users and partners in the Oracle community to help us validate the approach. "Pythian has helped Oracle users to automate and optimize the operations of their mission-critical systems for over 20 years,” says Simon Pane, principal consultant at Pythian. “We are excited about the possibilities that El Carro brings to users on their cloud modernization journeys. We are proud to work with the community on a vision for the future of database management.".

Sean Scott covers Docker for databases on his blog oraclesean.com, and says: "There are many benefits to running Oracle databases in containers. Adding Kubernetes orchestration introduces new opportunities to bring the DevOps and Oracle communities together."

You can try out El Carro today. Follow the quick start guide and try out provisioning of instances, databases, users. Import data via Data Pump, manage instance parameters, choose between different methods for backups, and try out a restore. Have a look at how we integrate with external logging and monitoring solutions. Reach out via our Google group and leave feedback for what features you would like to see next, or even create your own patch and pull request on GitHub.

By Bjoern Rost - Product Manager and Boris Dali - Team Lead, Engineering

Using MicroK8s with Anthos Config Management in the world of IoT

When dealing with large scale Kubernetes deployments, managing configuration and policy is often very complicated. We discussed why Kubernetes’ declarative approach to configuration as data has become the most popular choice for most users a few weeks ago. Today, we will discuss bringing this approach to your MicroK8 deployments using Anthos Config Management.
Image of Anthos Config Management + Cloud Source Repositories + MicroK8s
Anthos Config Management helps you easily create declarative security and operational policies and implement them at scale for your Kubernetes deployments across hybrid and multi-cloud environments. At a high level, you represent the desired state of your deployment as code committed to a central Git repository. Anthos Config Management will ensure the desired state is achieved and also maintained across all your registered clusters.

You can use Anthos Config Management for both your Kubernetes Engine (GKE) clusters as well as on Anthos attached clusters. Anthos attached clusters is a deployment option that extends Anthos’ reach into Kubernetes clusters running in other clouds as well as edge devices and the world of IoT, the Internet of Things. In this blog you will learn by experimenting with attached clusters with MicroK8s, a conformant Kubernetes platform popular in IoT and edge environments.

Consider an organization with a large number of distributed manufacturing facilities or laboratories that use MicroK8s to provide services to IoT devices. In such a deployment, Anthos can help you manage remote clusters directly from the Anthos Console rather than investing engineering resources to build out a multitude of custom tools.

Consider the diagram below.

Diagram of Anthos Config Management with MicroK8s on the Factory Floor with IoT
This diagram shows a set of “N” factory locations each with a MicroK8s cluster supporting IoT devices such as lights, sensors, or even machines. You register each of the MicroK8s clusters in an Anthos environ: a logical collection of Kubernetes clusters. When you want to deploy the application code to the MicroK8s clusters, you commit the code to the repository and Anthos Config Management takes care of the deployment across all locations. In this blog we will show you how you can quickly try this out using a MicroK8s test deployment.

We will use the following Google Cloud services:
  • Compute Engine provides an Ubuntu instance for a single-node MicroK8s cluster. Ubuntu will use cloud-init to install MicroK8s and generate shell scripts and other files to save time.
  • Cloud Source Repositories will provide the Git-based repository to which we will commit our workload.
  • Anthos Config Management will perform the deployment from the repository to the MicroK8s cluster.

Let’s start with a picture

Here’s a diagram of how these components fit together.

Diagram of how Anthos Config Management works together with MicroK8s
  • A workstation instance is created from which Terraform is used to deploy four components: (1) an IAM service account, (2) a Google Compute Engine Instance with MicroK8s using permissions provided by the service account, (3) a Kubernetes configuration repo provided by Cloud Source Repositories, and (4) a public/private key pair.
  • The GCE instance will use the service account key to register the MicroK8s cluster with an Anthos environ.
  • The public key from the public/ private key pair will be registered to the repository while the private key will be registered with the MicroK8s cluster.
  • Anthos Config Management will be configured to point to the repository and branch to poll for updates.
  • When a Kubernetes YAML document is pushed to the appropriate branch of the repository, Anthos Config Management will use the private key to connect to the repository, detect that a commit has been made against the branch, fetch the files and apply the document to the MicroK8s cluster.
Anthos Config Management enables you to deploy code from a Git repository to Kubernetes clusters that have been registered with Anthos. Google Cloud officially supports GKE, AKS, and EKS clusters, but you can use other conformant clusters such as MicroK8s in accordance with your needs. The repository below shows you how to register a single MicroK8s cluster to receive deployments. You can also scale this to larger numbers of clusters all of which can receive updates from commitments to the repository. If your organization has large numbers of IoT devices supported by Kubernetes clusters you can update all of them from the Anthos console to provide for consistent deployments across the organization regardless of the locations of the clusters, including the IoT edge. If you would like to learn more, you can build this project yourself. Please check out this Git repository and learn firsthand about how Anthos can help you manage Kubernetes deployments in the world of IoT.

By Jeff Levine, Customer Engineer – Google Cloud

Kubernetes: Efficient Multi-Zone Networking with Topology Aware Routing

Topology Aware Routing of Services, a feature that was first introduced as alpha in the Kubernetes 1.17 release, aims to solve an often overlooked issue with Kubernetes Services; that they are not region aware.

Kubernetes services provide a uniform, durable, and easy to use method of accessing a variety of different backend applications. These backend applications are most commonly an exposed app running within your pods. Kubernetes does this by reserving a static virtual IP and DNS name, unique to it throughout the cluster and turning them into simple load balancers.

While this model is great for small clusters or applications, if you have thousands of nodes, your cluster spans multiple regions, or your application is latency sensitive then the service model can start to break down a bit. By default, each endpoint in a service has an equal opportunity to be selected as the destination. If you’re accessing a service with a backend hosted in the same zone, there’s a high probability that you’d be directed to a pod in a completely separate zone—likely in a completely separate region—and is what Topology Awareness intends to solve.

The Topology Aware Routing of Services feature added the concept of topologyKeys as an additional field in service objects. It allows you to define a set of node labels that could be used to route traffic closer to where it originated from.

Example Service with topologyKeys


apiVersion: v1
kind: Service
metadata:
  Name: my-app-web
spec:
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  topologyKeys:

    - "topology.kubernetes.io/zone"

    - "topology.kubernetes.io/region"

In this example, the service makes use of some commonly used labels for its topology preferences. It signals that when kube-proxy is routing traffic for that service, it should only route to pods within the same zone or region the traffic is originating from.

This is great! Traffic should remain “close” to where it originated and remove unnecessary latency.

While topologyKeys is available as alpha in 1.17, it hasn’t yet graduated to the next stage because the first pass at building topology-aware routing surfaced many challenges and scalability issues.

Each node in the cluster now has to manage a potentially complex ruleset for each service that would require more frequent updating. In clusters with thousands of pods or thousands of nodes, this solution quickly becomes untenable.

Another pain point with this implementation depends on how your application was distributed across a zone or region, as it's quite possible that a singular pod would be receiving ALL traffic for that zone or region. The preference list doesn’t take into account the performance of the pod on the receiving end and could potentially cause an outage.

These problems have led the Kubernetes Network Special Interest Group (SIG) to do a full re-evaluation of how to approach the Topology Awareness implementation.

What’s Planned for Topology Aware Routing?
The new design is intended to automatically handle the routing of services so that they will be load-balanced across a minimum number of the closest possible endpoints. It does this by applying an algorithm using two of the topology keys to signal affinity for service routing: topology.kubernetes.io/region and topology.kubernetes.io/zone without having to specify them via topologyKeys at the service level.

This algorithm works by establishing a dynamic threshold for a service where it calculates an expected number of endpoints per zone. It then builds a list of available endpoints for that service, prioritizing the ones that are in the same zone. If there are not enough endpoints to meet that expected number, it adds them from other zones until it reaches its expected number of endpoints. This list of expected endpoints, or a subset of endpoints are then passed to the nodes within that zone.

These nodes no longer have to maintain the complex set of rules like they had in the first iteration, and now just manage the small subset of endpoints for each service. This is less flexible than its predecessor, but it drastically reduces the performance overhead when compared to the previous method, while also covering the majority of use-cases. A big win for everyone.

These features are slated to graduate to alpha in the 1.21 release in the first part of 2021. If Topology Aware Routing would be of value to you, please consider taking the time to test it when it becomes available. Early feedback is highly appreciated and helps shape the direction of the feature.

Until then, if you’d like to learn more about Service Topology, Endpoint Slice, and the various algorithms that have been evaluated for service routing, check out Rob Scott’s presentation: Improving Network Efficiency with Topology Aware Routing, on November 19th, at KubeCon + CloudNativeCon North America.



By Bob Killen, Program Manager – Google Open Source Programs Office

Kubernetes Ingress Goes GA

The Kubernetes Ingress API, first introduced in late 2015 as an experimental beta feature, has finally graduated as a stable API and is included in the recent 1.19 release of Kubernetes.

The goal of the Ingress API is to provide a simple uniform means of describing the routing of HTTP or HTTPS traffic from outside a cluster to backend services within a cluster; independent of the Ingress Controller being used. An Ingress controller is a 3rd party application, such as Nginx or an external service like the Google Cloud Load Balancer (GCLB), that performs the actual routing of the HTTP(S) traffic. This uniform API, supported by the Ingress Controllers made it easy to create simple HTTP(S) load balancers, however most use-cases required something more complex.

By early 2019, the Ingress API had remained in beta for close to four years. Beta APIs are not intended to be relied upon for business-critical production use, yet many users were using the Ingress API in some level of production capacity. After much discussion, the Kubernetes Networking Special Interest Group (SIG) proposed a path forward to bring the Ingress API to GA primarily by introducing two changes in Kubernetes 1.18. These were: a new field, pathType, to the Ingress API; and a new Ingress resource type, IngressClass. Combined, they provide a means of guaranteeing a base level of compatibility between different path prefix matching implementations, along with opening the door to further extension by the Ingress Controller developers in a uniform and consistent pattern.

What does this mean for you? You can be assured that the path prefixes you use will be evaluated the same way across Ingress Controllers implementations, and the Ingress configuration sprawl across Annotations, ConfigMaps and CustomResourceDefinitions (CRDs) will be consolidated into a single IngressClass resource type.

pathType

The pathType field specifies one of three ways that an Ingress Object’s path should be interpreted:
  • ImplementationSpecific: Path prefix matching is delegated to the Ingress Controller (IngressClass).
  • Exact: Matches the URL path exactly (case sensitive)
  • Prefix: Matches based on a URL path prefix split by /. Matching is case sensitive and done on a path element by element basis.

NOTE: ImplementationSpecific was configured as the default pathType in the 1.18 release. In 1.19 the defaulting behavior was removed and it MUST be specified. Paths that do not include an explicit pathType will fail validation.

Pre Kubernetes 1.18

Kubernetes 1.19+

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: minimal-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - http:
    paths:
    - path: /testpath
      backend:
        serviceName: test
        servicePort: 80

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: minimal-ingress
spec:
  ingressClassName: external-lb
  rules:
  - http:
      paths:
      - path: /testpath
        pathType: Prefix
        backend:
          service:
            name: test
            port:
              number: 80



These changes not only make room for backwards-compatible configurations with the ImplementationSpecific pathType, but also enables more portable workloads between Ingress Controllers with Exact or Prefix pathType.

IngressClass

The new IngressClass resource takes the place of various different Annotations, ConfigMaps, Environment Variables or Command Line Parameters that you would regularly pass to an Ingress Controller directly. Instead, it has a generic parameters field that can be used to reference controller specific configuration.


Example IngressClass Resource

apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: external-lb
spec:
  controller: example.com/ingress-controller
  parameters:
    apiGroup: k8s.example.com
    kind: IngressParameters
    name: external-lb

Source: https://kubernetes.io/docs/concepts/services-networking/ingress/#ingress-class

In this example, the parameters resource would include configuration options implemented by the example.com/ingress-controller ingress controller. These items would not need to be passed as Annotations or a ConfigMap as they would in versions prior to Kubernetes 1.18.

How do you use IngressClass with an Ingress Object? You may have caught it in the earlier example, but the Ingress resource’s spec has been updated to include an ingressClassName field. This field is similar to the previous kubernetes.io/ingress.class annotation but refers to the name of the corresponding IngressClass resource.

Other Changes

Several other small changes went into effect with the graduation of Ingress to GA in 1.19. A few fields have been remapped/renamed and support for resource backends has been added.

Remapped Ingress Fields

Resource Backend
A Resource Backend is essentially a pointer or ObjectRef (apiGroup, kind, name) to another resource in the same namespace. Why would you want to do this? Well, it opens the door to all sorts of future possibilities such as routing to static object storage hosted in GCS or S3, or another internal form of storage.

NOTE: Resource Backend and Service Backends are mutually exclusive. Only one field can be specified at a time.

Deprecation Notice

With the graduation of Ingress in the 1.19 release, it officially puts the older iterations of the API (extensions/v1beta1 and networking.k8s.io/v1beta1) on a clock. Following the Kubernetes Deprecation Policy, the older APIs are slated to be removed in Kubernetes 1.22.

Should you migrate right now (September 2020)? Not yet. The majority of Ingress Controllers have not added support for the new GA Ingress API. Ingress-GCE, the Ingress Controller for Google Kubernetes Engine (GKE) should be updated to support the Ingress GA API in Q4 2020. Keep your eyes on the GKE rapid release channel to stay up to date on it, and Kubernetes 1.19’s availability.

What’s Next for Ingress?

The Ingress API has had a rough road getting to GA. It is an essential resource for many, and the changes that have been introduced help manage that complexity while keeping it relatively light-weight. However, even with the added flexibility that has been introduced it doesn’t cover a variety of complex use-cases.

SIG Network has been working on a new API referred to as “Service APIs” that takes into account the lessons learned from the previous efforts of working on Ingress. These Service APIs are not intended to replace Ingress, but instead compliment it by providing several new resources that could enable more complex workflows.


By Bob Killen, Program Manager, Google Open Source Programs Office

Kpt: Packaging up your Kubernetes configuration with git and YAML since 2014

Kubernetes configuration manifests have become an industry standard for deploying both custom and off-the-shelf applications (as well as for infrastructure). Manifests are combined into bundles to create higher-level deployable systems as well as reusable blueprints (such as a product offering, off the shelf software, or customizable starting point for a new application).

However, most teams lack the expertise or desire to create bespoke bundles of configuration from scratch and instead: 1) either fork them from another bundle, or 2) use some packaging solution which generates manifests from code.

Teams quickly discover they need to customize, validate, audit and re-publish their forked/ generated bundles for their environment. Most packaging solutions to date are tightly coupled to some format written as code (e.g. templates, DSLs, etc). This introduces a number of challenges when trying to extend, build on top of, or integrate them with other systems. For example, how does one update a forked template from upstream, or how does one apply custom validation?

Packaging is the foundation of building reusable components, but it also incurs a productivity tax on the users of those components.

Today we’d like to introduce kpt, an OSS tool for Kubernetes packaging, which uses a standard format to bundle, publish, customize, update, and apply configuration manifests.

Kpt is built around an “as data” architecture bundling Kubernetes resource configuration, a format for both humans and machines. The ability for tools to read and write the package contents using standardized data structures enables powerful new capabilities:
  • Any existing directory in a Git repo with configuration files can be used as a kpt package.
  • Packages can be arbitrarily customized and later pull in updates from upstream by merging them.
  • Tools and automation can perform high-level operations by transforming and validating package data on behalf of users or systems.
  • Organizations can develop their own tools and automation which operate against the package data.
  • Existing tools and automation that work with resource configuration “just work” with kpt.
  • Existing solutions that generate configuration (e.g. from templates or DSLs) can emit kpt packages which enable the above capabilities for them.

Example workflow with kpt

Now that we’ve established the benefits of using kpt for managing your packages of Kubernetes config, lets walk through how an enterprise might leverage kpt to package, share and use their best practices for Kubernetes across the organization.


First, a team within the organization may build and contribute to a repository of best practices (pictured in blue) for managing a certain type of application, for example a microservice (called “app”). As the best practices are developed within an organization, downstream teams will want to consume and modify configuration blueprints based on them. These blueprints provide a blessed starting point which adheres to organization policies and conventions.

The downstream team will get their own copy of a package by downloading it to their local filesystem (pictured in red) using kpt pkg get. This clones the git subdirectory, recording upstream metadata so that it can be updated later.

They may decide to update the number of replicas to fit their scaling requirements or may need to alter part of the image field to be the image name for their app. They can directly modify the configuration using a text editor (as would be done before). Alternatively, the package may define setters, allowing fields to be set programmatically using kpt cfg set. Setters streamline workflows by providing user and automation friendly commands to perform common operations.

Once the modifications have been made to the local filesystem, the team will commit and push their package to an app repository owned by them. From there, a CI/CD pipeline will kick off and the deployment process will begin. As a final customization before the package is deployed to the cluster, the CI/CD pipeline will inject the digest of the image it just built into the image field (using kpt cfg set). When the image digest has been set, the CI/CD pipeline can send the manifests to the cluster using kpt live apply. Kpt live operates like kubectl apply, providing additional functionality to prune resources deleted from the configuration and block on rollout completion (reporting status of the rollout back to the user).

Now that we’ve walked through how you might use kpt in your organization, we’d love it if you’d try it out, read the docs, or contribute.

One more thing

There’s still a lot to the story we didn’t cover here. Expect to hear more from us about:
  • Using kpt with GitOps
  • Building custom logic with functions
  • Writing effective blueprints with kpt and kustomize
By Phillip Wittrock, Software Engineer and Vic Iglesias, Cloud Solutions Architect

Securing open source: How Google supports the new Kubernetes bug bounty

At Google, we care deeply about the security of open-source projects, as they’re such a critical part of our infrastructure—and indeed everyone’s. Today, the Cloud-Native Computing Foundation (CNCF) announced a new bug bounty program for Kubernetes that we helped create and get up and running. Here’s a brief overview of the program, other ways we help secure open-source projects and information on how you can get involved.

Launching the Kubernetes bug bounty program

Kubernetes is a CNCF project. As part of its graduation criteria, the CNCF recently funded the project’s first security audit, to review its core areas and identify potential issues. The audit identified and addressed several previously unknown security issues. Thankfully, Kubernetes already had a Product Security Committee, including engineers from the Google Kubernetes Engine (GKE) security team, who respond to and patch any newly discovered bugs. But the job of securing an open-source project is never done. To increase awareness of Kubernetes’ security model, attract new security researchers, and reward ongoing efforts in the community, the Kubernetes Product Security Committee began discussions in 2018 about launching an official bug bounty program.

Find Kubernetes bugs, get paid

What kind of bugs does the bounty program recognize? Most of the content you’d think of as ‘core’ Kubernetes, included at https://github.com/kubernetes, is in scope. We’re interested in common kinds of security issues like remote code execution, privilege escalation, and bugs in authentication or authorization. Because Kubernetes is a community project, we’re also interested in the Kubernetes supply chain, including build and release processes that might allow a malicious individual to gain unauthorized access to commits, or otherwise affect build artifacts. This is a bit different from your standard bug bounty as there isn’t a ‘live’ environment for you to test—Kubernetes can be configured in many different ways, and we’re looking for bugs that affect any of those (except when existing configuration options could mitigate the bug). Thanks to the CNCF’s ongoing support and funding of this new program, depending on the bug, you can be rewarded with a bounty anywhere from $100 to $10,000.

The bug bounty program has been in a private release for several months, with invited researchers submitting bugs and to help us test the triage process. And today, the new Kubernetes bug bounty program is live! We’re excited to see what kind of bugs you discover, and are ready to respond to new reports. You can learn more about the program and how to get involved here.

Dedicated to Kubernetes security

Google has been involved in this new Kubernetes bug bounty from the get-go: proposing the program, completing vendor evaluations, defining the initial scope, testing the process, and onboarding HackerOne to implement the bug bounty solution. Though this is a big effort, it’s part of our ongoing commitment to securing Kubernetes. Google continues to be involved in every part of Kubernetes security, including responding to vulnerabilities as part of the Kubernetes Product Security Committee, chairing the sig-auth Kubernetes special interest group, and leading the aforementioned Kubernetes security audit. We realize that security is a critical part of any user’s decision to use an open-source tool, so we dedicate resources to help ensure we’re providing the best possible security for Kubernetes and GKE.

Although the Kubernetes bug bounty program is new, it isn’t a novel strategy for Google. We have enjoyed a close relationship with the security research community for many years and, in 2010, Google established our own Vulnerability Rewards Program (VRP). The VRP provides rewards for vulnerabilities reported in GKE and virtually all other Google Cloud services. (If you find a bug in GKE that isn’t specific to Kubernetes core, you should still report it to the Google VRP!) Nor is Kubernetes the only open-source project with a bug bounty program. In fact, we recently expanded our Patch Rewards program to provide financial rewards both upfront and after-the-fact for security improvements to open-source projects.

Help keep the world’s infrastructure safe. Report a bug to the Kubernetes bug bounty, or a GKE bug to the Google VRP.

By Maya Kaczorowski, Product Manager, Container Security; and Aaron Small, Product Manager, GKE On-Prem security

Building Skills, Building Community

Year after year, we hear from conference attendees that it's not just the content they came for, it's the connections. Meeting new people, getting new perspectives, making new friends (and sometimes hiring them!) is a big part of KubeCon Life. We want to make sure that the Kubecon community is welcoming to people from diverse backgrounds but just being welcoming is not enough: we have to actually do the work to help people get through the door.

The easiest way to help people get through the door is through diversity scholarships. One of the biggest blockers to full participation in our community is just having the resources to get to the room where it happens, and a diversity scholarship—not just a ticket, but travel assistance too—helps increase participation.

1: Going Swagless

This Kubecon we want you to take away the really important things from the conference: new knowledge and new connections... not just another pen or plastic doodad. (Although to be fair, we will also have plenty of stickers... stickers aren't swag, they're an essential part of Kubecon!)

Google prides itself on being a data-driven company, so when we need to decide where we can spend our dollars to make the most impact and do the most good for the Kubecon community, we turn to the data. We know there is an issue from the CNCF KubeCon report in Seattle 2018 reporting in 11% women (and that’s not even a complete diversity metric). Now looking at the things conference attendees have told us they value about Kubecon, we put together this handy chart to help us guide our decision-making:
Travel + Conf Ticket ScholarshipBranded Pen
Face to face learning
Career development
OSS community building
Writing tools

We also need to consider externalities when we make our decisions—and going #swagless and dedicating those resources to improving the conversation and community at Kubecon has some positive externalities: less plastic (and lighter luggage going home) is better for the planet, too!

If our work to support diversity and inclusion at Kubecon has inspired you and you want to know what your org can do to participate, there is plenty of room in the #swagless tent for everyone—redirect your swag budget to D&I efforts. Shoutouts to conference organizers like SpringOne that went totally swagless this year!

2: Diversity Lunch + Hack

Our commitment to a welcoming environment and a diverse community doesn't stop at getting people in the door: we also need to work on inclusion. Our diversity lunch and hack is a place where people can:
  • Build their skills through pair programming
  • Get installation help
  • Do deep-dives on k8s topics
  • Connect with others in the community
Our diversity lunch isn't just talking about diversity: it's about working towards diversity through skill-building and creating stronger community bonds. Register here!

We welcomed 220 friends and allies in Barcelona and expect to continue the sold-out streak in San Diego (get your ticket now)!

3: Redirecting Even More

But wait, there's more! We're not just going #swagless, we're also redirecting all the hands-on workshop registration fees ($50) from Anthos Day, Anthos&GKE Lab, OSS: Agones, Knative, and Kubeflow to the diversity scholarship fund. You can build a stronger, more diverse community while you build your skills—a total twofer. (And our workshops are also walking the walk of inclusion by being accessible themselves: if you need support to attend a workshop, whether financial or physical, send us a note.

4: Hiring

Also, one of the best things any company can do to drive D&I is to hire people who will help your company become more diverse, whether as a consultant to help you build your program, or as a team member who will help you bring a wider perspective to your product! Come meet a Googler at any of the activities we are doing during the week to discuss jobs at Google Cloud: g.co/Kubecon.

By: Paris Pittman, Google Open Source