Author Archives: Sarah O'Rourke

Securing open-source: how Google supports the new Kubernetes bug bounty



At Google, we care deeply about the security of open-source projects, as they’re such a critical part of our infrastructure—and indeed everyone’s. Today, the Cloud-Native Computing Foundation (CNCF) announced a new bug bounty program for Kubernetes that we helped create and get up and running. Here’s a brief overview of the program, other ways we help secure open-source projects and information on how you can get involved.

Launching the Kubernetes bug bounty program

Kubernetes is a CNCF project. As part of its graduation criteria, the CNCF recently funded the project’s first security audit, to review its core areas and identify potential issues. The audit identified and addressed several previously unknown security issues. Thankfully, Kubernetes already had a Product Security Committee, including engineers from the Google Kubernetes Engine (GKE) security team, who respond to and patch any newly discovered bugs. But the job of securing an open-source project is never done. To increase awareness of Kubernetes’ security model, attract new security researchers, and reward ongoing efforts in the community, the Kubernetes Product Security Committee began discussions in 2018 about launching an official bug bounty program.

Find Kubernetes bugs, get paid

What kind of bugs does the bounty program recognize? Most of the content you’d think of as ‘core’ Kubernetes, included at https://github.com/kubernetes, is in scope. We’re interested in common kinds of security issues like remote code execution, privilege escalation, and bugs in authentication or authorization. Because Kubernetes is a community project, we’re also interested in the Kubernetes supply chain, including build and release processes that might allow a malicious individual to gain unauthorized access to commits, or otherwise affect build artifacts. This is a bit different from your standard bug bounty as there isn’t a ‘live’ environment for you to test—Kubernetes can be configured in many different ways, and we’re looking for bugs that affect any of those (except when existing configuration options could mitigate the bug). Thanks to the CNCF’s ongoing support and funding of this new program, depending on the bug, you can be rewarded with a bounty anywhere from $100 to $10,000.

The bug bounty program has been in a private release for several months, with invited researchers submitting bugs and to help us test the triage process. And today, the new Kubernetes bug bounty program is live! We’re excited to see what kind of bugs you discover, and are ready to respond to new reports. You can learn more about the program and how to get involved here.

Dedicated to Kubernetes security

Google has been involved in this new Kubernetes bug bounty from the get-go: proposing the program, completing vendor evaluations, defining the initial scope, testing the process, and onboarding HackerOne to implement the bug bounty solution. Though this is a big effort, it’s part of our ongoing commitment to securing Kubernetes. Google continues to be involved in every part of Kubernetes security, including responding to vulnerabilities as part of the Kubernetes Product Security Committee, chairing the sig-auth Kubernetes special interest group, and leading the aforementioned Kubernetes security audit. We realize that security is a critical part of any user’s decision to use an open-source tool, so we dedicate resources to help ensure we’re providing the best possible security for Kubernetes and GKE.

Although the Kubernetes bug bounty program is new, it isn’t a novel strategy for Google. We have enjoyed a close relationship with the security research community for many years and, in 2010, Google established our own Vulnerability Rewards Program (VRP). The VRP provides rewards for vulnerabilities reported in GKE and virtually all other Google Cloud services. (If you find a bug in GKE that isn’t specific to Kubernetes core, you should still report it to the Google VRP!) Nor is Kubernetes the only open-source project with a bug bounty program. In fact, we recently expanded our Patch Rewards program to provide financial rewards both upfront and after-the-fact for security improvements to open-source projects.

Help keep the world’s infrastructure safe. Report a bug to the Kubernetes bug bounty, or a GKE bug to the Google VRP.

Announcing updates to our Patch Rewards program in 2020


At Google, we strive to make the internet safer and that includes recognizing and rewarding security improvements that are vital to the health of the entire web. In 2020, we are building on this commitment by launching a new iteration of our Patch Rewards program for third-party open source projects.

Over the last six years, we have rewarded open source projects for security improvements after they have been implemented. While this has led to overall improved security, we want to take this one step further.

Introducing upfront financial help
Starting on January 1, 2020, we’re not only going to reward proactive security improvements after the work is completed, but we will also complement the program with upfront financial support to provide an additional resource for open source developers to prioritize security work. For example, if you are a small open source project and you want to improve security, but don’t have the necessary resources, this new reward can help you acquire additional development capacity.

We will start off with two support levels :
  • Small ($5,000): Meant to motivate and reward a project for fixing a small number of security issues. Examples: improvements to privilege separation or sandboxing, cleanup of integer artimetrics, or more generally fixing vulnerabilities identified in open source software by bug bounty programs such as EU-FOSSA 2 (see ‘Qualifying submissions’ here for more examples).
  • Large ($30,000): Meant to incentivize a larger project to invest heavily in security, e.g. providing support to find additional developers, or implement a significant new security feature (e.g. new compiler mitigations).
Nomination process

Anyone can nominate an open source project for support by filling out http://goo.gle/patchz-nomination. Our Patch Reward Panel will review submissions on a monthly basis and select a number of projects that meet the program criteria. The panel will let submitors know if a project has been chosen and will start working with the project maintainers directly.

Projects in scope

Any open source project can be nominated for support. When selecting projects, the panel will put an emphasis on projects that either are vital to the health of the Internet or are end-user projects with a large user base.

What do we expect in return?

We expect to see security improvements to open source software. Ideally, the project can provide us
with a short blurb or pointers to some of the completed work that was possible because of our support. We don’t want to add bureaucracy, but would like to measure the success of the program.
What about the existing Patch Rewards program?
This is an addition to the existing program, the current Patch Rewards program will continue as it stands today.

Protecting programmatic access to user data with Binary Authorization for Borg


At Google, the safety of user data is our paramount concern and we strive to protect it comprehensively. That includes protection from insider risk, which is the possible risk that employees could use their organizational knowledge or access to perform malicious acts. Insider risk also covers the scenario where an attacker has compromised the credentials of someone at Google to facilitate their attack. There are times when it’s necessary for our services and personnel to access user data as part of fulfilling our contractual obligations to you: as part of their role, such as user support; and programmatically, as part of a service. Today, we’re releasing a whitepaper, “Binary Authorization for Borg: how Google verifies code provenance and implements code identity,” that explains one of the mechanisms we use to protect user data from insider risks on Google's cluster management system Borg.

Binary Authorization for Borg is a deploy-time enforcement check

Binary Authorization for Borg, or BAB, is an internal deploy-time enforcement check that reduces insider risk by ensuring that production software and configuration deployed at Google is properly reviewed and authorized, especially when that code has the ability to access user data. BAB ensures that code and configuration deployments meet certain standards prior to being deployed. BAB includes both a deploy-time enforcement service to prevent unauthorized jobs from starting, and an audit trail of the code and configuration used in BAB-enabled jobs.

BAB ensures that Google's official software supply chain process is followed. First, a code change is reviewed and approved before being checked into Google's central source code repository. Next, the code is verifiably built and packaged using Google's central build system. This is done by creating the build in a secure sandbox and recording the package's origin in metadata for verification purposes. Finally, the job is deployed to Borg, with a job-specific identity. BAB rejects any package that lacks proper metadata, that did not follow the proper supply chain process, or that otherwise does not match the identity’s predefined policy.

Binary Authorization for Borg allows for several kinds of security checks

BAB can be used for many kinds of deploy-time security checks. Some examples include:
  • Is the binary built from checked in code?
  • Is the binary built verifiably?
  • Is the binary built from tested code?
  • Is the binary built from code intended to be used in the deployment?
After deployment, a job is continuously verified for its lifetime, to check that jobs that were started (and any that may still be running) conform to updates to their policies.

Binary Authorization for Borg provides other security benefits
Though the primary purpose of BAB is to limit the ability of a potentially malicious insider to run an unauthorized job that could access user data, BAB has other security benefits. BAB provides robust code identity for jobs in Google’s infrastructure, tying a job’s identity to specific code, and ensuring that only the specified code can be used to exercise the job identity’s privileges. This allows for a transition from a job identity—trusting an identity and any of its privileged human users transitively—to a code identity—trusting a specific piece of reviewed code to have specific semantics and which cannot be modified without an approval process.

BAB also dictates a common language for data protection, so that multiple teams can understand and meet the same requirements. Certain processes, such as those for financial reporting, need to meet certain change management requirements for compliance purposes. Using BAB, these checks can be automated, saving time and increasing the scope of coverage.

Binary Authorization for Borg is part of the BeyondProd model
BAB is one of several technologies used at Google to mitigate insider risk, and one piece of how we secure containers and microservices in production. By using containerized systems and verifying their BAB requirements prior to deployment, our systems are easier to debug, more reliable, and have a clearer change management process. More details on how Google has adopted a cloud-native security model are available in another whitepaper we are releasing today, “BeyondProd: A new approach to cloud-native security.”

In summary, implementing BAB, a deploy-time enforcement check, as part of Google’s containerized infrastructure and continuous integration and deployment (CI/CD) process has enabled us to verify that the code and configuration we deploy meet certain standards for security. Adopting BAB has allowed Google to reduce insider risk, prevent possible attacks, and also support the uniformity of our production systems. For more information about BAB, read our whitepaper, “Binary Authorization for Borg: how Google verifies code provenance and implements code identity.”

Additional contributors to this whitepaper include Kevin Chen, Software Engineer; Tim Dierks, Engineering Director; Maya Kaczorowski, Product Manager; Gary O’Connor, Technical Writing; Umesh Shankar, Principal Engineer; Adam Stubblefield, Distinguished Engineer; and Wilfried Teiken, Software Engineer; with special recognition to the entire Binary Authorization for Borg team for their ideation, engineering, and leadership

Detecting unsafe path access patterns with PathAuditor



#!/bin/sh
cat /home/user/foo


What can go wrong if this command runs as root? Does it change anything if foo is a symbolic link to /etc/shadow? How is the output going to be used?

Depending on the answers to the questions above, accessing files this way could be a vulnerability. The vulnerability exists in syscalls that operate on file paths, such as open, rename, chmod, or exec. For a vulnerability to be present, part of the path has to be user controlled and the program that executes the syscall has to be run at a higher privilege level. In a potential exploit, the attacker can substitute the path for a symlink and create, remove, or execute a file. In many cases, it's possible for an attacker to create the symlink before the syscall is executed.

At Google, we have been working on a solution to find these potentially problematic issues at scale: PathAuditor. In this blog post we'll outline the problem and explain how you can avoid it in your code with PathAuditor.

Let’s take a look at a real world example. The tmpreaper utility contained the following code to check if a directory is a mount point:
if ((dst = malloc(strlen(ent->d_name) + 3)) == NULL)
       message (LOG_FATAL, "malloc failed.\n");
strcpy(dst, ent->d_name);
strcat(dst, "/X");
rename(ent->d_name, dst);
if (errno == EXDEV) {
[...]


This code will call rename("/tmp/user/controlled", "/tmp/user/controlled/X"). Under the hood, the kernel will resolve the path twice, once for the first argument and once for the second, then perform some checks if the rename is valid and finally try to move the file from one directory to the other.

However, the problem is that the user can race the kernel code and replace the “/tmp/user/controlled” with a symlink just between the two path resolutions.

A successful attack would look roughly like this:
  • Make “/tmp/user/controlled” a file with controlled content.
  • The kernel resolves that path for the first argument to rename() and sees the file.
  • Replace “/tmp/user/controlled” with a symlink to /etc/cron.
  • The kernel resolves the path again for the second argument and ends up in /etc/cron.
  • If both the tmp and cron directories are on the filesystem, the kernel will move the attacker controlled file to /etc/cron, leading to code execution as root.
Can we find such bugs via automated analysis? Well, yes and no. As shown in the tmpreaper example, exploiting these bugs can require some creativity and it depends on the context if they’re vulnerabilities in the first place. Automated analysis can uncover instances of this access pattern and will gather as much information as it can to help with further investigation. However, it will also naturally produce false positives.

We can’t tell if a call to open(/user/controlled, O_RDONLY) is a vulnerability without looking at the context. It depends on whether the contents are returned to the user or are used in some security sensitive way. A call to chmod(/user/controlled, mode) depending on the mode can be either a DoS or a privilege escalation. Accessing files in sticky directories (like /tmp) can become vulnerabilities if the attacker found an additional bug to delete arbitrary files.

How Pathauditor works

To find issues like this at scale we wrote PathAuditor, a tool that monitors file accesses and logs potential vulnerabilities. PathAuditor is a shared library that can be loaded into processes using LD_PRELOAD. It then hooks all filesystem related libc functions and checks if the access is safe. For that, we traverse the path and check if any component could be replaced by an unprivileged user, for example if a directory is user-writable. If we detect such a pattern, we log it to syslog for manual analysis.

Here's how you can use it to find vulnerabilities in your code:
  • LD_PRELOAD the library to your binary and then analyse its findings in syslog. You can also add the library to /etc/ld.so.preload, which will preload it in all binaries running on the system.
  • It will then gather the PID and the command line of the calling process, arguments of the vulnerable function, and a stack trace -- this provides a starting point for further investigation. At this point, you can use the stack trace to find the code path that triggered the violation and manually analyse what would happen if you would point the path to an arbitrary file or directory.
  • For example, if the code is opening a file and returning the content to the user then you could use it to read arbitrary files. If you control the path of chmod or chown, you might be able to change the permissions of chosen files and so on.
PathAuditor has proved successful at Google and we're excited to share it with the community. The project is still in the early stages and we are actively working on it. We look forward to hearing about any vulnerabilities you discover with the tool, and hope to see pull requests with further improvements.

Try out the PathAuditor tool here.

Marta Rożek was a Google Summer intern in 2019 and contributed to this blog and the PathAuditor tool

Using a built-in FIDO authenticator on latest-generation Chromebooks



We previously announced that starting with Chrome 76, most latest-generation Chromebooks gained the option to enable a built-in FIDO authenticator backed by hardware-based Titan security. For supported services (e.g. G Suite, Google Cloud Platform), enterprise administrators can now allow end users to use the power button on these devices to protect against certain classes of account takeover attempts. This feature is disabled by default, however, administrators can enable it by changing DeviceSecondFactorAuthentication policy in the Google Admin console.

Before we dive deeper into this capability, let’s first cover the main use cases FIDO technology solves, and then explore how this new enhancement can satisfy an advanced requirement that can help enterprise organizations.

Main use cases
FIDO technology aims to solve three separate use cases for relying parties (or otherwise referred to as Internet services) by helping to:
  1. Prevent phishing during initial login to a service on a new device;
  2. Reverify a user’s identity to a service on a device on which they’ve already logged in to; 
  3. Confirm that the device a user is connecting from is still the original device where they logged in from previously. This is typically needed in the enterprise. 
Security-savvy professionals may interpret the third use case as a special instance of use case #2. However, there are some differences, which we break down a bit further below:
  • In case #2, the problem that FIDO technology tries to solve is re-verifying a user’s identity by unlocking a private key stored on the device.
  • In case #3, FIDO technology helps to determine whether a previously created key is still available on the original device without any proof of who the user is.
How use case #1 works: Roaming security keys 

Because the whole premise of this use case is one in which the user logs in on a brand new device they’ve never authenticated before, this requires the user to have a FIDO security key (removeable, cross-platform, or a roaming authenticator). By this definition, a built-in FIDO authenticator on Chrome OS devices would not be able to satisfy this requirement, because it would not be able to help verify the user’s identity without being set up previously. Upon initial log-in, the user’s identity is verified together with the presence of a security key (such as Google’s Titan Security Key) previously tied to their account.

Titan Security Keys 


Once the user is successfully logged in, trust is conferred from the security key to the device on which the user is logging on, usually by placing a cookie or other token on the device in order for the relying party to “remember” that the user already performed a second factor authenticator on this device. Once this step is completed, it is no longer necessary to require a physical second factor on this device because the presence of the cookie signals to the relying party that this device is to be trusted.

Optionally, some services might require the user to still periodically verify that it’s the correct user in front of the already recognized device (for example, particularly sensitive and regulated services such as financial services companies). In almost all cases, it shouldn’t be necessary for the user to also-in addition to providing their knowledge factor (such as a password) - re-present their second factor when re-authenticating as they’ve already done that during initial bootstrapping.

Note that on Chrome OS devices, your data is encrypted when you’re not logged on, which further protects your data against malicious access.

How use case #2 works: Re-authentication 

Frequently referred to as “re-authentication,” use case #2 allows a relying party to reverify that the same user is still interacting with the service from a previously verified device. This mainly happens when a user performs an action that’s particularly sensitive, such as changing their password or when interacting with regulated services, such as financial services companies. In this case, a built-in biometric authenticator (e.g. a fingerprint sensor or PIN on Android devices) can be registered, which offers users a more convenient way to re-verify their identity to the service in question. In fact, we have recently enabled this use case on Android devices for some Google services.

Additionally, there are security benefits to this particular solution, as the relying party doesn’t only have to trust a previously issued cookie, but can now both verify that the right user is present (by means of a biometric) and that a particular private key is available on this particular device. Sometimes this promise is made based on key material stored in hardware (e.g. Titan security in Pixel Slate), which can be a strong indicator that the relying party is interacting with the right user on the right device.

How use case #3 works: Built-in device authenticator

The challenge of verifying that a device a user has previously logged in on is still the device from which they’re interacting with the relying party, is what the built-in FIDO authenticator on most latest-generation Chromebooks is able to help solve.

Earlier we noted that upon initial log-in, relying parties regularly place cookies or tokens on a user’s device, so they can remember that a user has previously authenticated. Under some circumstances, such as when there’s malware present on a device, it might be possible for these tokens to be exfiltrated. Asking for the “touch of a built-in authenticator” at regular intervals helps the relying party know that the user is still interacting from a legitimate device which has previously been issued a token. It also helps verify that the token has not been exfiltrated to a different device since FIDO authenticators offer increased protection against the exfiltration of the private key. This is because it’s usually housed in the hardware itself. For example, in the case of most latest-generation Chromebooks (e.g. Pixel Slate), it’s protected by hardware-based Titan security.

Pixel Slate devices are built with hardware-based Titan security 

In the case of our implementation on Chrome OS, the FIDO keys are also scoped to the specific logged in user, meaning that every user on the device essentially gets their own FIDO authenticator that can’t be accessed across user boundaries. We expect this use case to be particularly useful in enterprise environments, which is why the feature is not enabled by default. Administrators can enable it in the Google Admin console.

We still highly recommend users to have a primary FIDO security key, such as Titan Security Key or an Android phone. This should be used in conjunction with a “FIDO re-authentication” policy, which is supported by G Suite.

Enabling the built-in FIDO authenticator in the Google Admin console

Even though it’s technically possible to register the built-in FIDO authenticator on a Chrome OS device as a “security key” with services, it’s best to avoid this instance as users can run an increased risk of account lockout if they ever need to sign in to the service from a different machine.

Supported Chromebooks
Starting with Chrome 76, most latest-generation Chromebooks gained the option to enable a built-in FIDO authenticator backed by hardware-based Titan security. To see if your Chromebook can be enabled with this capability, you can navigate to chrome://system and check the “tpm-version” entry. If “vendor” equals “43524f53”, then your Chromebook is backed by Titan security.

Navigating to chrome://system on your Chromebook

Summary

In summary, we believe that this new enhancement can provide value to enterprise organizations that want to confirm that the device a user is connecting from is still the original device from which a user logged in from in the past. Most users, however, should be using roaming FIDO security keys, such as Titan Security Key, their Android phone, or security keys from other vendors, in order to avoid account lockouts.

OpenTitan – open sourcing transparent, trustworthy, and secure silicon


Security begins with secure infrastructure. To have higher confidence in the security and integrity of the infrastructure, we need to anchor our trust at the foundation - in a special-purpose chip.

Today, along with our partners, we are excited to announce OpenTitan - the first open source silicon root of trust (RoT) project. OpenTitan will deliver a high-quality RoT design and integration guidelines for use in data center servers, storage, peripherals, and more. Open sourcing the silicon design makes it more transparent, trustworthy, and ultimately, secure.

The OpenTitan logo



Anchoring trust in silicon

Silicon RoT can help ensure that the hardware infrastructure and the software that runs on it remain in their intended, trustworthy state by verifying that the critical system components boot securely using authorized and verifiable code. Silicon RoT can provide many security benefits by helping to:
  • Ensure that a server or a device boots with the correct firmware and hasn't been infected by a low-level malware.
  • Provide a cryptographically unique machine identity, so an operator can verify that a server or a device is legitimate.
  • Protect secrets like encryption keys in a tamper-resistant way even for people with physical access (e.g., while a server or a device is being shipped).
  • Provide authoritative, tamper-evident audit records and other runtime security services.
The silicon RoT technology can be used in server motherboards, network cards, client devices (e.g., laptops, phones), consumer routers, IoT devices, and more. For example, Google has relied on a custom-made RoT chip, Titan, to help ensure that machines in Google’s data centers boot from a known trustworthy state with verified code; it is our system root of trust. Recognizing the importance of anchoring the trust in silicon, together with our partners we want to spread the benefits of reliable silicon RoT chips to our customers and the rest of the industry. We believe that the best way to accomplish that is through open source silicon.

Raising the transparency and security bar

Similar to open source software, open source silicon can:
  1. Enhance trust and security through design and implementation transparency. Issues can be discovered early, and the need for blind trust is reduced.
  2. Enable and encourage innovation through contributions to the open source design.
  3. Provide implementation choice and preserve a set of common interfaces and software compatibility guarantees through a common, open reference design.
The OpenTitan project is managed by the lowRISC CIC, an independent not-for-profit company with a full-stack engineering team based in Cambridge, UK, and is supported by a coalition of like-minded partners, including ETH Zurich, G+D Mobile Security, Google, Nuvoton Technology, and Western Digital.


The founding partners of the OpenTitan project


OpenTitan is an active engineering project staffed by a team of engineers representing a coalition of partners who bring ideas and expertise from many perspectives. We are transparently building the logical design of a silicon RoT, including an open source microprocessor (the lowRISC Ibex, a RISC-V-based design), cryptographic coprocessors, a hardware random number generator, a sophisticated key hierarchy, memory hierarchies for volatile and non-volatile storage, defensive mechanisms, IO peripherals, secure boot, and more. With OpenTitan, a coalition of partners have come together to deliver a more open, transparent, and high-quality RoT.



A comparison of the major design components of a traditional RoT and an OpenTitan RoT



The OpenTitan project is rooted in three key principles:
  • Transparency - anyone can inspect, evaluate, and contribute to OpenTitan’s design and documentation to help build more transparent, trustworthy silicon RoT for all.
  • High quality - we are building a high-quality logically-secure silicon design, including reference firmware, verification collateral, and technical documentation.
  • Flexibility - adopters can reduce costs and reach more customers by using a vendor- and platform-agnostic silicon RoT design that can be integrated into data center servers, storage, peripheral and other devices.

Participating in the OpenTitan project

OpenTitan will be helpful for chip manufacturers, platform providers, and security-conscious enterprise organizations that want to enhance their infrastructure with silicon-based security. Visit our GitHub repository today.

If you are interested in actively collaborating on OpenTitan to help make secure open source silicon a reality, we encourage you to contact the OpenTitan team. If you would like your product to be considered for a pilot OpenTitan RoT integration, the team would be excited to hear from you.

OpenTitan – open sourcing transparent, trustworthy, and secure silicon


Security begins with secure infrastructure. To have higher confidence in the security and integrity of the infrastructure, we need to anchor our trust at the foundation - in a special-purpose chip.

Today, along with our partners, we are excited to announce OpenTitan - the first open source silicon root of trust (RoT) project. OpenTitan will deliver a high-quality RoT design and integration guidelines for use in data center servers, storage, peripherals, and more. Open sourcing the silicon design makes it more transparent, trustworthy, and ultimately, secure.

The OpenTitan logo



Anchoring trust in silicon

Silicon RoT can help ensure that the hardware infrastructure and the software that runs on it remain in their intended, trustworthy state by verifying that the critical system components boot securely using authorized and verifiable code. Silicon RoT can provide many security benefits by helping to:
  • Ensure that a server or a device boots with the correct firmware and hasn't been infected by a low-level malware.
  • Provide a cryptographically unique machine identity, so an operator can verify that a server or a device is legitimate.
  • Protect secrets like encryption keys in a tamper-resistant way even for people with physical access (e.g., while a server or a device is being shipped).
  • Provide authoritative, tamper-evident audit records and other runtime security services.
The silicon RoT technology can be used in server motherboards, network cards, client devices (e.g., laptops, phones), consumer routers, IoT devices, and more. For example, Google has relied on a custom-made RoT chip, Titan, to help ensure that machines in Google’s data centers boot from a known trustworthy state with verified code; it is our system root of trust. Recognizing the importance of anchoring the trust in silicon, together with our partners we want to spread the benefits of reliable silicon RoT chips to our customers and the rest of the industry. We believe that the best way to accomplish that is through open source silicon.

Raising the transparency and security bar

Similar to open source software, open source silicon can:
  1. Enhance trust and security through design and implementation transparency. Issues can be discovered early, and the need for blind trust is reduced.
  2. Enable and encourage innovation through contributions to the open source design.
  3. Provide implementation choice and preserve a set of common interfaces and software compatibility guarantees through a common, open reference design.
The OpenTitan project is managed by the lowRISC CIC, an independent not-for-profit company with a full-stack engineering team based in Cambridge, UK, and is supported by a coalition of like-minded partners, including ETH Zurich, G+D Mobile Security, Google, Nuvoton Technology, and Western Digital.


The founding partners of the OpenTitan project


OpenTitan is an active engineering project staffed by a team of engineers representing a coalition of partners who bring ideas and expertise from many perspectives. We are transparently building the logical design of a silicon RoT, including an open source microprocessor (the lowRISC Ibex, a RISC-V-based design), cryptographic coprocessors, a hardware random number generator, a sophisticated key hierarchy, memory hierarchies for volatile and non-volatile storage, defensive mechanisms, IO peripherals, secure boot, and more. With OpenTitan, a coalition of partners have come together to deliver a more open, transparent, and high-quality RoT.



A comparison of the major design components of a traditional RoT and an OpenTitan RoT



The OpenTitan project is rooted in three key principles:
  • Transparency - anyone can inspect, evaluate, and contribute to OpenTitan’s design and documentation to help build more transparent, trustworthy silicon RoT for all.
  • High quality - we are building a high-quality logically-secure silicon design, including reference firmware, verification collateral, and technical documentation.
  • Flexibility - adopters can reduce costs and reach more customers by using a vendor- and platform-agnostic silicon RoT design that can be integrated into data center servers, storage, peripheral and other devices.

Participating in the OpenTitan project

OpenTitan will be helpful for chip manufacturers, platform providers, and security-conscious enterprise organizations that want to enhance their infrastructure with silicon-based security. Visit our GitHub repository today.

If you are interested in actively collaborating on OpenTitan to help make secure open source silicon a reality, we encourage you to contact the OpenTitan team. If you would like your product to be considered for a pilot OpenTitan RoT integration, the team would be excited to hear from you.

How Google adopted BeyondCorp: Part 4 (services)



Intro
This is the final post in a series of four, in which we set out to revisit various BeyondCorp topics and share lessons that were learnt along the internal implementation path at Google.

The first post in this series focused on providing necessary context for how Google adopted BeyondCorp, Google’s implementation of the zero trust security model. The second post focused on managing devices - how we decide whether or not a device should be trusted and why that distinction is necessary. The third post focused on tiered access - how to define access tiers and rules and how to simplify troubleshooting when things go wrong.

This post introduces the concept of gated services, how to identify and, subsequently, migrate them and the associated lessons we learned along the way.

High level architecture for BeyondCorp

Identifying and gating services

How do you identify and categorize all the services that should be gated?

Google began as a web-based company, and as it matured in the modern era, most internal business applications were developed with a web-first approach. These applications were hosted on similar internal architecture as our external services, with the exception that they could only be accessed on corporate office networks. Thus, identifying services to be gated by BeyondCorp was made easier for us due to the fact that most internal services were already properly inventoried and hosted via standard, central solutions. Migration, in many cases, was as simple as a DNS change. Solid IT asset inventory systems and maintenance are critical to migrating to a zero trust security model.

Enforcement of zero trust access policies began with services which we determined would not be meaningfully impacted by the change in access requirements. For most services, requirements could be gathered via typical access log analysis or consulting with service owners. Services which could not be readily gated by default ACL requirements required service owners to develop strict access groups and/or eliminate risky workflows before they could be migrated.

How do you know which trust tier is needed for every service?
As discussed in our previous blog post, Google makes internal services available based on device trust tiers. Today, those services are accessible by the highest trust tier by default.

When the intent of the change is to restrict access to a service to a specific group or team, service owners are free to propose access changes to add or remove restrictions to their service. Access changes which are deemed to be sufficiently low risk can be automatically approved. In all other cases, such as where the owning team wants to expose a service to a risky device tier, they must work with security engineers to follow the principle of least privilege and devise solutions.

What do you do with services that are incompatible with BeyondCorp ideals?

It may not always be possible to gate an application by the preferred zero trust solution. Services that cannot be easily gated typically fall into these categories:
  • Type 1: "Non-proxyable protocols", e.g. non-HTTP/HTTPS traffic.
  • Type 2: Low latency requirements or localized high throughput traffic.
  • Type 3: Administrative and emergency access networks.
The typical first step in finding a solution for these cases is finding a way to remove the need for that service altogether. In many cases, this was made possible by deprecating or replacing systems which could not be made compatible with the BeyondCorp implementation.

When that was not an option, we found that no single solution would work for all critical requirements:
  • Solutions for the "Type 1" traffic have generally involved maintaining a specialized client tunneling which strongly enforces authentication and authorization decisions on the client and the server end of the connection. This is usually client/server type traffic which is similar to HTTP traffic in that connectivity is typically multi-point to point.
  • Solutions to the "Type 2" problems generally rely on moving BeyondCorp-compatible compute resources locally or developing a solution tightly integrated with network access equipment to selectively forward "local" traffic without permanently opening network holes.
  • As for “Type 3,” it would be ideal to completely eliminate all privileged internal networks. However, the reality is that some privileged networking will likely always be required to maintain the network itself and also to provide emergency access during outages.
It should be noted that server-to-server traffic in secure production data center environments does not necessarily rely on BeyondCorp, although many systems are integrated regardless, due to the Service-Oriented Design benefits that BeyondCorp inherently provides. 

How do you prioritize gating?

Prioritization starts by identifying all the services that are currently accessible via internal IP-access alone and migrating the most critical services to BeyondCorp, while working to slowly ratchet down permissions via exception management processes. Criticality of the service may also depend on the number and type of users, sensitivity of data handled, security and privacy risks enabled by the service.

Migration logistics

Most services required integration testing with the BeyondCorp proxy. Service teams were encouraged to stand up "test" services which were used to test functionality behind the BeyondCorp proxy. Most services that performed their own access control enforcement were reconfigured to instead rely on BeyondCorp for all user/group authentication and authorization. Service teams have been encouraged to develop their own "fine-grained" discretionary access controls in the services by leveraging session data provided by the BeyondCorp proxy.

Lessons learnt

Allow coarse gating and exceptions

Inventory: It's easy to overlook the importance of keeping a good inventory of services, devices, owners and security exceptions. The journey to a BeyondCorp world should start by solving organizational challenges when managing and maintaining data quality in inventory systems. In short, knowing how a service works, who should access it, and what makes that acceptable are the central tenets of managing BeyondCorp. Fine-grained access control is severely complicated when this insight is missing.

Legacy protocols: Most large enterprises will inevitably need to support workflows and protocols which cannot be migrated to a BeyondCorp world (in any reasonable amount of time). Exception management and service inventory become crucial at this stage while stakeholders develop solutions.
Run highly reliable systems
The BeyondCorp initiative would not be sustainable at Google’s scale without the involvement of various Site Reliability Engineering (SRE) teams across the inventory systems, BeyondCorp infrastructure and client side solutions. The ability to successfully achieve wide-spread adoption of changes this large can be hampered by perceived (or in some cases, actual) reliability issues. Understanding the user workflows that might be impacted, working with key stakeholders and ensuring the transition is smooth and trouble-free for all users helps protect against backlash and avoids users finding undesirable workarounds. By applying our reliability engineering practices, those teams helped to ensure that the components of our implementation all have availability and latency targets, operational robustness, etc. These are compatible with our business needs and intended user experiences.

Put employees in control as much as possible

Employees cover a broad range of job functions with varying requirements of technology and tools. In addition to communicating changes to our employees early, we provide them with self-service solutions for handling exceptions or addressing issues affecting their devices. By putting our employees in control, we help to ensure that security mechanisms do not get in their way, helping with the acceptance and scaling processes.

Summary

Throughout this series of blog posts, we set out to revisit and demystify BeyondCorp, Google’s internal implementation of a zero trust security model. The four posts had different focus areas - setting context, devices, tiered access and, finally, services (this post).

If you want to learn more, you can check out the BeyondCorp research papers. In addition, getting started with BeyondCorp is now easier using zero trust solutions from Google Cloud (context-aware access) and other enterprise providers. Lastly, stay tuned for an upcoming BeyondCorp webinar on Cloud OnAir in a few months where you will be able to learn more and ask us questions. We hope that these blog posts, research papers, and webinars will help you on your journey to enable zero trust access.

Thank you to the editors of the BeyondCorp blog post series, Puneet Goel (Product Manager), Lior Tishbi (Program Manager), and Justin McWilliams (Engineering Manager).

How Google adopted BeyondCorp: Part 4 (services)



Intro
This is the final post in a series of four, in which we set out to revisit various BeyondCorp topics and share lessons that were learnt along the internal implementation path at Google.

The first post in this series focused on providing necessary context for how Google adopted BeyondCorp, Google’s implementation of the zero trust security model. The second post focused on managing devices - how we decide whether or not a device should be trusted and why that distinction is necessary. The third post focused on tiered access - how to define access tiers and rules and how to simplify troubleshooting when things go wrong.

This post introduces the concept of gated services, how to identify and, subsequently, migrate them and the associated lessons we learned along the way.

High level architecture for BeyondCorp

Identifying and gating services

How do you identify and categorize all the services that should be gated?

Google began as a web-based company, and as it matured in the modern era, most internal business applications were developed with a web-first approach. These applications were hosted on similar internal architecture as our external services, with the exception that they could only be accessed on corporate office networks. Thus, identifying services to be gated by BeyondCorp was made easier for us due to the fact that most internal services were already properly inventoried and hosted via standard, central solutions. Migration, in many cases, was as simple as a DNS change. Solid IT asset inventory systems and maintenance are critical to migrating to a zero trust security model.

Enforcement of zero trust access policies began with services which we determined would not be meaningfully impacted by the change in access requirements. For most services, requirements could be gathered via typical access log analysis or consulting with service owners. Services which could not be readily gated by default ACL requirements required service owners to develop strict access groups and/or eliminate risky workflows before they could be migrated.

How do you know which trust tier is needed for every service?
As discussed in our previous blog post, Google makes internal services available based on device trust tiers. Today, those services are accessible by the highest trust tier by default.

When the intent of the change is to restrict access to a service to a specific group or team, service owners are free to propose access changes to add or remove restrictions to their service. Access changes which are deemed to be sufficiently low risk can be automatically approved. In all other cases, such as where the owning team wants to expose a service to a risky device tier, they must work with security engineers to follow the principle of least privilege and devise solutions.

What do you do with services that are incompatible with BeyondCorp ideals?

It may not always be possible to gate an application by the preferred zero trust solution. Services that cannot be easily gated typically fall into these categories:
  • Type 1: "Non-proxyable protocols", e.g. non-HTTP/HTTPS traffic.
  • Type 2: Low latency requirements or localized high throughput traffic.
  • Type 3: Administrative and emergency access networks.
The typical first step in finding a solution for these cases is finding a way to remove the need for that service altogether. In many cases, this was made possible by deprecating or replacing systems which could not be made compatible with the BeyondCorp implementation.

When that was not an option, we found that no single solution would work for all critical requirements:
  • Solutions for the "Type 1" traffic have generally involved maintaining a specialized client tunneling which strongly enforces authentication and authorization decisions on the client and the server end of the connection. This is usually client/server type traffic which is similar to HTTP traffic in that connectivity is typically multi-point to point.
  • Solutions to the "Type 2" problems generally rely on moving BeyondCorp-compatible compute resources locally or developing a solution tightly integrated with network access equipment to selectively forward "local" traffic without permanently opening network holes.
  • As for “Type 3,” it would be ideal to completely eliminate all privileged internal networks. However, the reality is that some privileged networking will likely always be required to maintain the network itself and also to provide emergency access during outages.
It should be noted that server-to-server traffic in secure production data center environments does not necessarily rely on BeyondCorp, although many systems are integrated regardless, due to the Service-Oriented Design benefits that BeyondCorp inherently provides. 

How do you prioritize gating?

Prioritization starts by identifying all the services that are currently accessible via internal IP-access alone and migrating the most critical services to BeyondCorp, while working to slowly ratchet down permissions via exception management processes. Criticality of the service may also depend on the number and type of users, sensitivity of data handled, security and privacy risks enabled by the service.

Migration logistics

Most services required integration testing with the BeyondCorp proxy. Service teams were encouraged to stand up "test" services which were used to test functionality behind the BeyondCorp proxy. Most services that performed their own access control enforcement were reconfigured to instead rely on BeyondCorp for all user/group authentication and authorization. Service teams have been encouraged to develop their own "fine-grained" discretionary access controls in the services by leveraging session data provided by the BeyondCorp proxy.

Lessons learnt

Allow coarse gating and exceptions

Inventory: It's easy to overlook the importance of keeping a good inventory of services, devices, owners and security exceptions. The journey to a BeyondCorp world should start by solving organizational challenges when managing and maintaining data quality in inventory systems. In short, knowing how a service works, who should access it, and what makes that acceptable are the central tenets of managing BeyondCorp. Fine-grained access control is severely complicated when this insight is missing.

Legacy protocols: Most large enterprises will inevitably need to support workflows and protocols which cannot be migrated to a BeyondCorp world (in any reasonable amount of time). Exception management and service inventory become crucial at this stage while stakeholders develop solutions.
Run highly reliable systems
The BeyondCorp initiative would not be sustainable at Google’s scale without the involvement of various Site Reliability Engineering (SRE) teams across the inventory systems, BeyondCorp infrastructure and client side solutions. The ability to successfully achieve wide-spread adoption of changes this large can be hampered by perceived (or in some cases, actual) reliability issues. Understanding the user workflows that might be impacted, working with key stakeholders and ensuring the transition is smooth and trouble-free for all users helps protect against backlash and avoids users finding undesirable workarounds. By applying our reliability engineering practices, those teams helped to ensure that the components of our implementation all have availability and latency targets, operational robustness, etc. These are compatible with our business needs and intended user experiences.

Put employees in control as much as possible

Employees cover a broad range of job functions with varying requirements of technology and tools. In addition to communicating changes to our employees early, we provide them with self-service solutions for handling exceptions or addressing issues affecting their devices. By putting our employees in control, we help to ensure that security mechanisms do not get in their way, helping with the acceptance and scaling processes.

Summary

Throughout this series of blog posts, we set out to revisit and demystify BeyondCorp, Google’s internal implementation of a zero trust security model. The four posts had different focus areas - setting context, devices, tiered access and, finally, services (this post).

If you want to learn more, you can check out the BeyondCorp research papers. In addition, getting started with BeyondCorp is now easier using zero trust solutions from Google Cloud (context-aware access) and other enterprise providers. Lastly, stay tuned for an upcoming BeyondCorp webinar on Cloud OnAir in a few months where you will be able to learn more and ask us questions. We hope that these blog posts, research papers, and webinars will help you on your journey to enable zero trust access.

Thank you to the editors of the BeyondCorp blog post series, Puneet Goel (Product Manager), Lior Tishbi (Program Manager), and Justin McWilliams (Engineering Manager).

USB-C Titan Security Keys – available tomorrow in the US




Securing access to online accounts is critical for safeguarding private, financial, and other sensitive data online. Phishing - where an attacker tries to trick you into giving them your username and password - is one of the most common causes of data breaches. To protect user accounts, we’ve long made it a priority to offer users many convenient forms of 2-Step Verification (2SV), also known as two-factor authentication (2FA), in addition to Google’s automatic protections. These measures help to ensure that users are not relying solely on passwords for account security.

For users at higher risk (e.g., IT administrators, executives, politicians, activists) who need more effective protection against targeted attacks, security keys provide the strongest form of 2FA. To make this phishing-resistant security accessible to more people and businesses, we recently built this capability into Android phones, expanded the availability of Titan Security Keys to more regions (Canada, France, Japan, the UK), and extended Google’s Advanced Protection Program to the enterprise.

Starting tomorrow, you will have an additional option: Google’s new USB-C Titan Security Key, compatible with your Android, Chrome OS, macOS, and Windows devices.



USB-C Titan Security Key


We partnered with Yubico to manufacture the USB-C Titan Security Key. We have had a long-standing working and customer relationship with Yubico that began in 2012 with the collaborative effort to create the FIDO Universal 2nd Factor (U2F) standard, the first open standard to enable phishing-resistant authentication. This is the same security technology that we use at Google to protect access to internal applications and systems.

USB-C Titan Security Keys are built with a hardware secure element chip that includes firmware engineered by Google to verify the key’s integrity. This is the same secure element chip and firmware that we use in our existing USB-A/NFC and Bluetooth/NFC/USB Titan Security Key models manufactured in partnership with Feitian Technologies.

USB-C Titan Security Keys will be available tomorrow individually for $40 on the Google Store in the United States. USB-A/NFC and Bluetooth/NFC/USB Titan Security Keys will also become available individually in addition to the existing bundle. Bulk orders are available for enterprise organizations in select countries.


We highly recommend all users at a higher risk of targeted attacks to get Titan Security Keys and enroll into the Advanced Protection Program (APP), which provides Google’s industry-leading security protections to defend against evolving methods that attackers use to gain access to your accounts and data. You can also use Titan Security Keys for any site where FIDO security keys are supported for 2FA, including your personal or work Google Account, 1Password, Coinbase, Dropbox, Facebook, GitHub, Salesforce, Stripe, Twitter, and more.