Tag Archives: CEL

Common Expressions For Portable Policy and Beyond


I am thrilled to introduce Common Expression Language, a simple expression language that's great for writing small snippets of logic which can be run anywhere with the same semantics in nanoseconds to microseconds evaluation speed. The launch of cel.dev marks a major milestone in the growth and stability of the language.

It powers several well known products such as Kubernetes where it's used to protect against costly production misconfigurations:

    object.spec.replicas <= 5

Cloud IAM uses CEL to enable fine-grained authorization:

    request.path.startsWith("/finance")

Versatility

CEL is both open source and openly governed, making it well adopted both inside and outside Google. CEL is used by a number of large tech companies, either internally or as part of their public product offering. As a reflection of this open governance, cel.dev has been launched to share more about the language and the community around it.

So, what makes CEL a good choice for these applications? Why is CEL unique or different from Amazon's Cedar Policy or Open Policy Agent's Rego? These are great questions, and common ones (no pun intended):

  • Highly optimized evaluation O(ns) - O(μs)
  • Portable with stacks supported in C++, Java, and Go
  • Thousands of conformance tests ensure consistent behavior across stacks
  • Supports extension and subsetting

Subsetting is crucial for preserving predictable compute / memory impacts, and it only exists in CEL. As any latency-critical service maintainer will tell you, it's vital to have a clear understanding of compute and memory implications of any new feature. Imagine you've chosen an expression language, validated its functionality meets your security, serving, and scaling goals, but after launching an update to the library introduces new functionality which can't be disabled and leaves your product vulnerable to attack. Your alternatives are to fork the library and accept the maintenance costs, introduce custom validation logic which is likely to be insufficient to prevent abuse, or to redesign your service. CEL supported subsetting allows you to ensure that what was true at the initial product launch will remain true until you decide when to expose more of its functionality to customers.

Cedar Policy language was developed by Amazon. It is open source, implemented in Rust, and offers formal verification. Formal verification allows Cedar to validate policy correctness at authoring time. CEL is not just a policy language, but a broader expression language. It is used within larger policy projects in a way that allows users to augment their existing systems rather than adopt an entirely new one.

Formal verification often has challenges scaling to large amounts of data and is based on the belief that a formal proof of behavior is sufficient evidence of compliance. However, regulations and policies in natural language are often ambiguous. This means that logical translations of these policies are inherently ambiguous as well. CEL's approach to this challenge of compliance and reasoning about potentially ambiguous behaviors is to support evaluation over partial data at a very high speed and with minimal resources.

CEL is fast enough to be used in networking policies and expressive enough to be used in detailed application policies. Having a greater coverage of use cases improves your ability to reason about behavior holistically. But, what if you don't have all the data yet to know exactly how the policy will behave? CEL supports partial evaluation using four-valued logic which makes it possible to determine cases which are definitely allowed, denied, or where policy behavior is conditional on additional inputs. This allows for what-if analysis against historical data as well as against speculative data from new use cases or proposed policy changes.

Open Policy Agent's Rego is also open source, implemented in Golang and based on Datalog, which makes it possible to offer similar proofs as Cedar. However, the language is much more powerful than Cedar, and more powerful than CEL. This expressiveness means that OPA Rego is fantastic for non-latency critical, single-tenant solutions, but difficult to safely embed in existing offerings.


Four-valued Logic

CEL uses commutative logical operators that can render a true, false, error, or unknown status. This is a scalable alternative to formal verification and the expressiveness of Datalog. Four-valued logic allows CEL to evaluate over a partial set of inputs to deliver either a definitive result or communicate that more information is necessary to complete the evaluation.

What is four-valued logic?

True, false, and error outcomes are considered definitive: no additional information will change the outcome of the expression. Consider the following case:

    1/0 != 0 && false

In traditional programming languages, this expression would be an error; however, in CEL the outcome is false.

Now consider the following case where an input variable, called unknown_var is marked as unknown:

    unknown_var && true

The outcome of this expression is UNKNOWN{unknown_var} indicating that once the variable is known, the evaluation can be safely completed. An unknown indicates what was missing, and alerts the user to fix the outcome with more information. This technique both borrows from and extends SQL three-valued predicate logic which uses TRUE, FALSE, and NULL with commutative logical operators. From a CEL perspective, the error state is akin to SQL NULL that arises when there is an absence of information.


CEL compatibility with SQL

CEL leverages SQL semantics to ensure that it can be seamlessly translated to SQL. SQL optimizers perform significantly better over large data sets, making it possible to evaluate over data at rest. Imagine trying to scale a single expression evaluation over tens of millions of records. Even if the expression evaluates within a single microsecond, the evaluation would still take tens of seconds. The more complex the expression, the greater the latency. SQL excels at this use case, so translation from CEL to SQL was an important factor in the design in order to unlock the possibility of performant policy checks both online with CEL and offline with SQL.


Thank you CEL Community!

We’re proud to announce cel.dev as a major milestone in the maturity and adoption of the language, and we look forward to working with you to make CEL the best building block for writing latency-critical, portable logic. Feel free to contact us at [email protected]

By Tristan Swadell – Senior Staff Software Engineer

Kubernetes CRD Validation Using CEL

Motivation

CRDs was used to support two major categories of built-in validation:

  • CRD structural schemas: Provide type checking of custom resources against schemas.
  • OpenAPIv3 validation rules: Provide regex ('pattern' property), range limits ('minimum' and 'maximum' properties) on individual fields and size limits on maps and lists ('minItems', 'maxItems').

For use cases that cannot be covered by build-in validation support:

  • Admission Webhooks: have validating admission webhook for further validation
  • Custom validators: write custom checks in several languages such as Rego

While admission webhooks do support CRDs validation, they significantly complicate the development and operability of CRDs.

To provide an self-contained, in-process validation, an inline expression language - Common Expression Language (CEL), is introduced into CRDs such that a much larger portion of validation use cases can be solved without the use of webhooks.

It is sufficiently lightweight and safe to be run directly in the kube-apiserver, has a straight-forward and unsurprising grammar, and supports pre-parsing and typechecking of expressions, allowing syntax and type errors to be caught at CRD registration time.


CRD validation rule

CRD validation rules are promoted to GA in Kubernetes 1.29 to validate custom resources based on validation rules.

Validation rules use the Common Expression Language (CEL) to validate custom resource values. Validation rules are included in CustomResourceDefinition schemas using the x-kubernetes-validations extension.

The Rule is scoped to the location of the x-kubernetes-validations extension in the schema. And self variable in the CEL expression is bound to the scoped value.

All validation rules are scoped to the current object: no cross-object or stateful validation rules are supported.

For example:

...
  openAPIV3Schema:
    type: object
    properties:
      spec:
        type: object
        x-kubernetes-validations:
          - rule: "self.minReplicas <= self.replicas"
            message: "replicas should be greater than or equal to minReplicas."
          - rule: "self.replicas <= self.maxReplicas"
            message: "replicas should be smaller than or equal to maxReplicas."
        properties:
          ...
          minReplicas:
            type: integer
          replicas:
            type: integer
          maxReplicas:
            type: integer
        required:
          - minReplicas
          - replicas
          - maxReplicas

will reject a request to create this custom resource:

apiVersion: "stable.example.com/v1"


kind: CronTab


metadata:


 name: my-new-cron-object


spec:


 minReplicas: 0


 replicas: 20


 maxReplicas: 10

with the response:

The CronTab "my-new-cron-object" is invalid:
* spec: Invalid value: map[string]interface {}{"maxReplicas":10, "minReplicas":0, "replicas":20}: replicas should be smaller than or equal to maxReplicas.

x-kubernetes-validations could have multiple rules. The rule under x-kubernetes-validations represents the expression which will be evaluated by CEL. The message represents the message displayed when validation fails.

Note: You can quickly test CEL expressions in CEL Playground.

Validation rules are compiled when CRDs are created/updated. The request of CRDs create/update will fail if compilation of validation rules fail. Compilation process includes type checking as well.

Validation rules support a wide range of use cases. To get a sense of some of the capabilities, let's look at a few examples:

Validation Rule

Purpose

self.minReplicas <= self.replicas

Validate an integer field is less than or equal to another integer field

'Available' in self.stateCounts

Validate an entry with the 'Available' key exists in a map

self.set1.all(e, !(e in self.set2))

Validate that the elements of two sets are disjoint

self == oldSelf

Validate that a required field is immutable once it is set

self.created + self.ttl < self.expired

Validate that 'expired' date is after a 'create' date plus a 'ttl' duration

Validation rules are expressive and flexible. See the Validation Rules documentation to learn more about what validation rules are capable of.


CRD transition rules

Transition Rules make it possible to compare the new state against the old state of a resource in validation rules. You use transition rules to make sure that the cluster's API server does not accept invalid state transitions. A transition rule is a validation rule that references 'oldSelf'. The API server only evaluates transition rules when both an old value and new value exist.

Transition rule examples:

Transition Rule

Purpose

self == oldSelf

For a required field, make that field immutable once it is set. For an optional field, only allow transitioning from unset to set, or from set to unset.

(on parent of field) has(self.field) == has(oldSelf.field)

on field: self == oldSelf

Make a field immutable: validate that a field, even if optional, never changes after the resource is created (for a required field, the previous rule is simpler).

self.all(x, x in oldSelf)

Only allow adding items to a field that represents a set (prevent removals).

self >= oldSelf

Validate that a number is monotonically increasing.

Using the Functions Libraries

Validation rules have access to a couple different function libraries:

Examples of function libraries in use:

Validation Rule

Purpose

!(self.getDayOfWeek() in [0, 6]

Validate that a date is not a Sunday or Saturday.

isUrl(self) && url(self).getHostname() in [a.example.com', 'b.example.com']

Validate that a URL has an allowed hostname.

self.map(x, x.weight).sum() == 1

Validate that the weights of a list of objects sum to 1.

int(self.find('^[0-9]*')) < 100

Validate that a string starts with a number less than 100

self.isSorted()

Validate that a list is sorted

Resource use and limits

To prevent CEL evaluation from consuming excessive compute resources, validation rules impose some limits. These limits are based on CEL cost units, a platform and machine independent measure of execution cost. As a result, the limits are the same regardless of where they are enforced.


Estimated cost limit

CEL is, by design, non-Turing-complete in such a way that the halting problem isn’t a concern. CEL takes advantage of this design choice to include an "estimated cost" subsystem that can statically compute the worst case run time cost of any CEL expression. Validation rules are integrated with the estimated cost system and disallow CEL expressions from being included in CRDs if they have a sufficiently poor (high) estimated cost. The estimated cost limit is set quite high and typically requires an O(n2) or worse operation, across something of unbounded size, to be exceeded. Fortunately the fix is usually quite simple: because the cost system is aware of size limits declared in the CRD's schema, CRD authors can add size limits to the CRD's schema (maxItems for arrays, maxProperties for maps, maxLength for strings) to reduce the estimated cost.

Good practice:

Set maxItems, maxProperties and maxLength on all array, map (object with additionalProperties) and string types in CRD schemas! This results in lower and more accurate estimated costs and generally makes a CRD safer to use.


Runtime cost limits for CRD validation rules

In addition to the estimated cost limit, CEL keeps track of actual cost while evaluating a CEL expression and will halt execution of the expression if a limit is exceeded.

With the estimated cost limit already in place, the runtime cost limit is rarely encountered. But it is possible. For example, it might be encountered for a large resource composed entirely of a single large list and a validation rule that is either evaluated on each element in the list, or traverses the entire list.

CRD authors can ensure the runtime cost limit will not be exceeded in much the same way the estimated cost limit is avoided: by setting maxItems, maxProperties and maxLength on array, map and string types.


Adoption and Related work

This feature has been turned on by default since Kubernetes 1.25 and finally graduated to GA in Kubernetes 1.29. It raised a lot of interest and is now widely used in the Kubernetes ecosystem. We are excited to share that the Gateway API was able to replace all the validating webhook previously used with this feature.

After CEL was introduced into Kubernetes, we are excited to expand the power to multiple areas including the Admission Chain and authorization config. We will have a separate blog to introduce further.

We look forward to working with the community on the adoption of CRD Validation Rules, and hope to see this feature promoted to general availability in upcoming Kubernetes releases.


Acknowledgements

Special thanks to Joe Betz, Kermit Alexander, Ben Luddy, Jordan Liggitt, David Eads, Daniel Smith, Dr. Stefan Schimanski, Leila Jalali and everyone who contributed to CRD Validation Rules!

By Cici Huang – Software Engineer

Kubernetes CRD Validation Using CEL

Motivation

CRDs was used to support two major categories of built-in validation:

  • CRD structural schemas: Provide type checking of custom resources against schemas.
  • OpenAPIv3 validation rules: Provide regex ('pattern' property), range limits ('minimum' and 'maximum' properties) on individual fields and size limits on maps and lists ('minItems', 'maxItems').

For use cases that cannot be covered by build-in validation support:

  • Admission Webhooks: have validating admission webhook for further validation
  • Custom validators: write custom checks in several languages such as Rego

While admission webhooks do support CRDs validation, they significantly complicate the development and operability of CRDs.

To provide an self-contained, in-process validation, an inline expression language - Common Expression Language (CEL), is introduced into CRDs such that a much larger portion of validation use cases can be solved without the use of webhooks.

It is sufficiently lightweight and safe to be run directly in the kube-apiserver, has a straight-forward and unsurprising grammar, and supports pre-parsing and typechecking of expressions, allowing syntax and type errors to be caught at CRD registration time.


CRD validation rule

CRD validation rules are promoted to GA in Kubernetes 1.29 to validate custom resources based on validation rules.

Validation rules use the Common Expression Language (CEL) to validate custom resource values. Validation rules are included in CustomResourceDefinition schemas using the x-kubernetes-validations extension.

The Rule is scoped to the location of the x-kubernetes-validations extension in the schema. And self variable in the CEL expression is bound to the scoped value.

All validation rules are scoped to the current object: no cross-object or stateful validation rules are supported.

For example:

...
  openAPIV3Schema:
    type: object
    properties:
      spec:
        type: object
        x-kubernetes-validations:
          - rule: "self.minReplicas <= self.replicas"
            message: "replicas should be greater than or equal to minReplicas."
          - rule: "self.replicas <= self.maxReplicas"
            message: "replicas should be smaller than or equal to maxReplicas."
        properties:
          ...
          minReplicas:
            type: integer
          replicas:
            type: integer
          maxReplicas:
            type: integer
        required:
          - minReplicas
          - replicas
          - maxReplicas

will reject a request to create this custom resource:

apiVersion: "stable.example.com/v1"


kind: CronTab


metadata:


 name: my-new-cron-object


spec:


 minReplicas: 0


 replicas: 20


 maxReplicas: 10

with the response:

The CronTab "my-new-cron-object" is invalid:
* spec: Invalid value: map[string]interface {}{"maxReplicas":10, "minReplicas":0, "replicas":20}: replicas should be smaller than or equal to maxReplicas.

x-kubernetes-validations could have multiple rules. The rule under x-kubernetes-validations represents the expression which will be evaluated by CEL. The message represents the message displayed when validation fails.

Note: You can quickly test CEL expressions in CEL Playground.

Validation rules are compiled when CRDs are created/updated. The request of CRDs create/update will fail if compilation of validation rules fail. Compilation process includes type checking as well.

Validation rules support a wide range of use cases. To get a sense of some of the capabilities, let's look at a few examples:

Validation Rule

Purpose

self.minReplicas <= self.replicas

Validate an integer field is less than or equal to another integer field

'Available' in self.stateCounts

Validate an entry with the 'Available' key exists in a map

self.set1.all(e, !(e in self.set2))

Validate that the elements of two sets are disjoint

self == oldSelf

Validate that a required field is immutable once it is set

self.created + self.ttl < self.expired

Validate that 'expired' date is after a 'create' date plus a 'ttl' duration

Validation rules are expressive and flexible. See the Validation Rules documentation to learn more about what validation rules are capable of.


CRD transition rules

Transition Rules make it possible to compare the new state against the old state of a resource in validation rules. You use transition rules to make sure that the cluster's API server does not accept invalid state transitions. A transition rule is a validation rule that references 'oldSelf'. The API server only evaluates transition rules when both an old value and new value exist.

Transition rule examples:

Transition Rule

Purpose

self == oldSelf

For a required field, make that field immutable once it is set. For an optional field, only allow transitioning from unset to set, or from set to unset.

(on parent of field) has(self.field) == has(oldSelf.field)

on field: self == oldSelf

Make a field immutable: validate that a field, even if optional, never changes after the resource is created (for a required field, the previous rule is simpler).

self.all(x, x in oldSelf)

Only allow adding items to a field that represents a set (prevent removals).

self >= oldSelf

Validate that a number is monotonically increasing.

Using the Functions Libraries

Validation rules have access to a couple different function libraries:

Examples of function libraries in use:

Validation Rule

Purpose

!(self.getDayOfWeek() in [0, 6]

Validate that a date is not a Sunday or Saturday.

isUrl(self) && url(self).getHostname() in [a.example.com', 'b.example.com']

Validate that a URL has an allowed hostname.

self.map(x, x.weight).sum() == 1

Validate that the weights of a list of objects sum to 1.

int(self.find('^[0-9]*')) < 100

Validate that a string starts with a number less than 100

self.isSorted()

Validate that a list is sorted

Resource use and limits

To prevent CEL evaluation from consuming excessive compute resources, validation rules impose some limits. These limits are based on CEL cost units, a platform and machine independent measure of execution cost. As a result, the limits are the same regardless of where they are enforced.


Estimated cost limit

CEL is, by design, non-Turing-complete in such a way that the halting problem isn’t a concern. CEL takes advantage of this design choice to include an "estimated cost" subsystem that can statically compute the worst case run time cost of any CEL expression. Validation rules are integrated with the estimated cost system and disallow CEL expressions from being included in CRDs if they have a sufficiently poor (high) estimated cost. The estimated cost limit is set quite high and typically requires an O(n2) or worse operation, across something of unbounded size, to be exceeded. Fortunately the fix is usually quite simple: because the cost system is aware of size limits declared in the CRD's schema, CRD authors can add size limits to the CRD's schema (maxItems for arrays, maxProperties for maps, maxLength for strings) to reduce the estimated cost.

Good practice:

Set maxItems, maxProperties and maxLength on all array, map (object with additionalProperties) and string types in CRD schemas! This results in lower and more accurate estimated costs and generally makes a CRD safer to use.


Runtime cost limits for CRD validation rules

In addition to the estimated cost limit, CEL keeps track of actual cost while evaluating a CEL expression and will halt execution of the expression if a limit is exceeded.

With the estimated cost limit already in place, the runtime cost limit is rarely encountered. But it is possible. For example, it might be encountered for a large resource composed entirely of a single large list and a validation rule that is either evaluated on each element in the list, or traverses the entire list.

CRD authors can ensure the runtime cost limit will not be exceeded in much the same way the estimated cost limit is avoided: by setting maxItems, maxProperties and maxLength on array, map and string types.


Adoption and Related work

This feature has been turned on by default since Kubernetes 1.25 and finally graduated to GA in Kubernetes 1.29. It raised a lot of interest and is now widely used in the Kubernetes ecosystem. We are excited to share that the Gateway API was able to replace all the validating webhook previously used with this feature.

After CEL was introduced into Kubernetes, we are excited to expand the power to multiple areas including the Admission Chain and authorization config. We will have a separate blog to introduce further.

We look forward to working with the community on the adoption of CRD Validation Rules, and hope to see this feature promoted to general availability in upcoming Kubernetes releases.


Acknowledgements

Special thanks to Joe Betz, Kermit Alexander, Ben Luddy, Jordan Liggitt, David Eads, Daniel Smith, Dr. Stefan Schimanski, Leila Jalali and everyone who contributed to CRD Validation Rules!

By Cici Huang – Software Engineer