Tag Archives: cloud

Extending support for App Engine bundled services (Module 17)

Posted by Wesley Chun (@wescpy), Developer Advocate, Google Cloud

Background

App Engine initially launched in 2008, providing a suite of bundled services that made it convenient for applications to access a database (Datastore), a caching service (Memcache), independent task execution (Task Queue), Google Sign-In authentication (Users), large "blob" storage (Blobstore), and other companion services. However, apps leveraging those services can only run on App Engine.

To increase app portability and help Google move towards its goal of having the most open cloud on the market, App Engine launched its 2nd-generation service in 2018, initially without those legacy services. The newer platform allows developers to upgrade apps to the latest language runtimes, such as moving from Python 2 to 3 or Java 8 to 11 (and today, Java 17). One of the major drawbacks of the 1st-generation runtimes is that they're customized, proprietary, and restrictive in what you can and cannot use.

Instead, the 2nd-generation platform uses open source runtimes, meaning the ability to follow standard development practices, use common/known idioms, and face fewer restrictions on 3rd-party libraries, obviating the need to copy or "vendor" them with your code. Unfortunately, using these newer runtimes meant migrating away from App Engine's bundled services: while you could upgrade language releases, there was no access to the bundled services, which broke apps or required complete rewrites, a showstopper for many users.

Due to their popularity and the desire to ease the upgrade process for customers, the App Engine team restored access to most (but not all) of those services in Fall 2021. Today's Serverless Migration Station video demonstrates how to continue usage of bundled services available to Python 3 developers.

Showing App Engine users how to use bundled services on Python 3


Performing the upgrade

Modernizing the typical Python 2 App Engine app looks something like this:
  1. Migrate away from the webapp2 framework (not available in Python 3)
  2. Port from Python 2 to 3, preserving use of bundled services
  3. Optionally migrate to Cloud standalone products or similar 3rd-party services

The first step is to move to a standard Python web framework like Flask, Django, Pyramid, etc. Below is some pseudocode from Migration Module 1 demonstrating how to migrate from webapp2 to Flask:

Step 1: Port Python 2 sample app from webapp2 to Flask

The key changes are bolded in the above code snippets. Notice how the App Engine NDB code [the Visit class definition plus the store_visit() and fetch_visits() functions] is unaffected by this web framework migration. The full webapp2 code sample can be found in the Module 0 repo folder, while the completed migration to Flask sample is located in the Module 1 repo folder.
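To give a rough sense of that change without the screenshots, here is a minimal, hypothetical before/after of the framework swap only; the real Module 1 sample also renders a template and calls the unchanged NDB helpers (Visit, store_visit(), fetch_visits()).

# --- Before: webapp2 (Python 2) ---
import webapp2

class MainHandler(webapp2.RequestHandler):
    def get(self):
        self.response.out.write('hello world')    # handler body is unchanged by the migration

app = webapp2.WSGIApplication([('/', MainHandler)], debug=True)

# --- After: Flask ---
from flask import Flask

app = Flask(__name__)

@app.route('/')
def root():
    return 'hello world'                          # same body, now a routed view function

The webapp2 handler class and route table are replaced by a plain view function and a route decorator; everything else stays put.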

After your app has ported frameworks, you're free to upgrade to Python 3 while preserving access to the bundled services if your app uses any. Below is pseudocode demonstrating how to upgrade the same sample app to Python 3 as well as the code changes needed to continue to use App Engine NDB:

Step 2: Port sample app to Python 3, preserving use of NDB bundled service

The original app was designed to work under both Python 2 and 3 interpreters, so no language changes were required in this case. We added an import of the new App Engine SDK followed by the key update wrapping the WSGI object so the app can access the bundled services. As before, the key updates are bolded. Some updates to configuration are also required, and those are outlined in the documentation and the (Module 17) codelab.
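If it helps to visualize, the wrapping step looks roughly like this (a sketch based on the appengine-python-standard SDK; confirm the exact setup against the documentation and the Module 17 codelab):

from flask import Flask
from google.appengine.api import wrap_wsgi_app   # provided by the appengine-python-standard package
from google.appengine.ext import ndb             # bundled NDB, same import as in the Python 2 app

app = Flask(__name__)
app.wsgi_app = wrap_wsgi_app(app.wsgi_app)        # give request handlers access to the bundled services

Per the documentation, the configuration updates include adding appengine-python-standard to requirements.txt and setting app_engine_apis: true in app.yaml.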

The NDB code is also left untouched in this migration. Not all of the bundled services feature such a hands-free migration, and we hope to cover some of the more complex ones ahead in Module 22. Java, PHP, and Go users have it even better, requiring fewer or no code changes at all. The Python 2 Flask sample is located in the Module 1 repo folder, and the resulting Python 3 app can be found in the Module 1b repo folder.

The immediate benefit of step two is the ability to upgrade to a more current version of the language runtime. This leaves the third step, migrating off the bundled services, as optional, especially if you plan on staying on App Engine for the long term.


Additional options

If you decide to migrate off the bundled services, you can do so on your own timeline. It is worth considering should you ever want to move to modern serverless platforms such as Cloud Functions or Cloud Run, or to lower-level platforms that give you more control, like GKE, our managed Kubernetes service, or Compute Engine VMs.

Step three is also where the rest of the Serverless Migration Station content may be useful:

*code samples and codelabs available; videos forthcoming

As for moving to modern serverless platforms, if you want to break apart a large App Engine app into multiple microservices, consider Cloud Functions. If your organization has adopted containerization as part of its software development workflow, consider Cloud Run. It's suitable if you're familiar with containers and Docker, but even if you or your team don't have that experience, Cloud Buildpacks can do the heavy lifting for you. Here are the relevant migration modules to explore:


    Wrap-up

Early App Engine users appreciate the convenience of the platform's bundled services, and after listening to user feedback, we added them back to 2nd-generation runtimes as another way to help developers modernize their apps. Whether upgrading to newer language runtimes to stay on App Engine and continue to use its bundled services, migrating to Cloud standalone products, or shifting to other serverless platforms, the Google Cloud team aims to provide the tools to help streamline your modernization efforts.

    All Serverless Migration Station content (codelabs, videos, source code [when available]) can be accessed at its open source repo. While our content initially focuses on Python users, the Cloud team is working on covering other language runtimes, so stay tuned. Today's video features a special guest to provide a teaser of what to expect for Java. For additional video content, check out the broader Serverless Expeditions series.

    Migrating from App Engine Memcache to Cloud Memorystore (Module 13)

    Posted by Wesley Chun (@wescpy), Developer Advocate, Google Cloud

    Introduction and background

    The previous Module 12 episode of the Serverless Migration Station video series demonstrated how to add App Engine Memcache usage to an existing app that has transitioned from the webapp2 framework to Flask. Today's Module 13 episode continues its modernization by demonstrating how to migrate that app from Memcache to Cloud Memorystore. Moving from legacy APIs to standalone Cloud services makes apps more portable and provides an easier transition from Python 2 to 3. It also makes it possible to shift to other Cloud compute platforms should that be desired or advantageous. Developers benefit from upgrading to modern language releases and gain added flexibility in application-hosting options.

While App Engine Memcache provides a basic, low-overhead, serverless caching service, Cloud Memorystore "takes it to the next level" as a standalone product. Rather than a proprietary caching engine, Cloud Memorystore gives users the option to select from a pair of open source engines, Memcached or Redis, each of which provides additional features unavailable from App Engine Memcache. Cloud Memorystore is typically more cost efficient at scale, offers high availability, provides automatic backups, etc. On top of this, one Memorystore instance can be used across many applications, and the service incorporates improvements to memory handling, configuration tuning, etc., gained from experience managing a huge fleet of Redis and Memcached instances.

    While Memcached is more similar to Memcache in usage/features, Redis has a much richer set of data structures that enable powerful application functionality if utilized. Redis has also been recognized as the most loved database by developers in StackOverflow's annual developers survey, and it's a great skill to pick up. For these reasons, we chose Redis as the caching engine for our sample app. However, if your apps' usage of App Engine Memcache is deeper or more complex, a migration to Cloud Memorystore for Memcached may be a better option as a closer analog to Memcache.

    Migrating to Cloud Memorystore for Redis featured video

    Performing the migration

The sample application registers individual web page "visits," storing visitor information such as IP address and user agent. In the original app, the most recent visits are cached in Memcache for an hour and used for display if the same user continuously refreshes their browser during this period; caching is one way to counter this abuse. A new visitor or cache expiration results in new visits being registered as well as the cache being updated with the most recent visits. Such functionality must be preserved when migrating to Cloud Memorystore for Redis.

Below is pseudocode representing the core part of the app that saves new visits and queries for the most recent ones. In the "before" version, you can see how the most recent visits are cached in Memcache; after completing the migration, the underlying caching infrastructure has been swapped out in favor of Memorystore (via language-specific Redis client libraries). In this migration we chose Redis version 5.0, and we recommend the latest versions (5.0 and 6.x at the time of this writing), as the newest releases feature additional performance benefits, fixes to improve availability, and so on. In the code snippets below, notice how the calls between both caching systems are nearly identical. The bolded lines represent the migration-affected code managing the cached data.

    Switching from App Engine Memcache to Cloud Memorystore for Redis
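Since the actual snippets are shown in the video and codelab, here is only a rough sketch of that swap, assuming a hypothetical cache key, a fetch_visits() helper that queries the datastore, and REDIS_HOST/REDIS_PORT values pointing at the VPC-connected Memorystore instance:

HOUR = 3600

# Before: App Engine Memcache (Python 2)
from google.appengine.api import memcache

def get_recent_visits_memcache():
    visits = memcache.get('most_recent_visits')
    if visits is None:                              # cache miss or expired
        visits = fetch_visits(10)
        memcache.set('most_recent_visits', visits, HOUR)
    return visits

# After: Cloud Memorystore for Redis (redis-py client)
import pickle
import redis

cache = redis.Redis(host=REDIS_HOST, port=REDIS_PORT)

def get_recent_visits_redis():
    raw = cache.get('most_recent_visits')
    if raw is None:                                 # cache miss or expired
        visits = fetch_visits(10)
        cache.set('most_recent_visits', pickle.dumps(visits), ex=HOUR)  # Redis stores bytes, so pickle
        return visits
    return pickle.loads(raw)

The shape of the calls is the same; the main differences are the client object and the explicit serialization that Memcache used to handle for you.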

    Wrap-up

The migration covered here begins with the Module 12 sample app ("START"). Migrating the caching system to Cloud Memorystore, along with other requisite updates, results in the Module 13 sample app ("FINISH"), with an optional port to Python 3. To help prepare for your own migrations, practice this one by following the codelab by hand while following along in the video.

While the code migration demonstrated seems straightforward, the most critical change is that Cloud Memorystore requires dedicated server instances. For this reason, a Serverless VPC connector is also needed to connect your App Engine app to those Memorystore instances, requiring yet more dedicated servers. Furthermore, neither Cloud Memorystore nor Serverless VPC Access is a free service, and neither has an "Always Free" tier quota. Before moving forward with this migration, check the pricing documentation for Cloud Memorystore for Redis and Serverless VPC Access to weigh the cost considerations before making a commitment.

    One key development that may affect your decision: In Fall 2021, the App Engine team extended support of many of the legacy bundled services like Memcache to next-generation runtimes, meaning you are no longer required to migrate to Cloud Memorystore when porting your app to Python 3. You can continue using Memcache even when upgrading to 3.x so long as you retrofit your code to access bundled services from next-generation runtimes.

    A move to Cloud Memorystore and today's migration techniques will be here if and when you decide this is the direction you want to take for your App Engine apps. All Serverless Migration Station content (codelabs, videos, source code [when available]) can be accessed at its open source repo. While our content initially focuses on Python users, we plan to cover other language runtimes, so stay tuned. For additional video content, check out our broader Serverless Expeditions series.

    How can App Engine users take advantage of Cloud Functions?

    Posted by Wesley Chun (@wescpy), Developer Advocate, Google Cloud

    Introduction

    Recently, we discussed containerizing App Engine apps for Cloud Run, with or without Docker. But what about Cloud Functions… can App Engine users take advantage of that platform somehow? Back in the day, App Engine was always the right decision, because it was the only option. With Cloud Functions and Cloud Run joining in the serverless product suite, that's no longer the case.

Back when App Engine was the only choice, developers selected it to host small, single-function apps… because it was the only option. Other developers created huge monolithic apps for App Engine as well… because it was also the only option. Fast forward to today, where code follows more service-oriented or event-driven architectures. Small apps can be moved to Cloud Functions to simplify the code and deployments, while large apps can be split into smaller components, each running on Cloud Functions.

    Refactoring App Engine apps for Cloud Functions

A small, single-function app can be seen as a microservice or an API endpoint "that does something," often serving some utility called as a result of an event in a larger multi-tiered application, say to update a database row or send a customer email message. App Engine apps require some kind of web framework and routing mechanism, while Cloud Functions equivalents are freed from much of those requirements. Refactoring these types of App Engine apps for Cloud Functions will likely require less overhead, help ease maintenance, and allow common components to be shared across applications.

Large, monolithic applications are often made up of multiple pieces of functionality bundled together in one big package, such as requisitioning a new piece of equipment, opening a customer order, authenticating users, processing payments, performing administrative tasks, and so on. By breaking this monolith up into multiple microservices implemented as individual functions, each component can be reused in other apps, maintenance is eased because bugs surface closer to their root origins, and developers won't step on each other's toes.

    Migration to Cloud Functions

    In this latest episode of Serverless Migration Station, a Serverless Expeditions mini-series focused on modernizing serverless apps, we take a closer look at this product crossover, covering how to migrate App Engine code to Cloud Functions. There are several steps you need to take to prepare your code for Cloud Functions:

    • Divest from legacy App Engine "bundled services," e.g., Datastore, Taskqueue, Memcache, Blobstore, etc.
    • Cloud Functions supports modern runtimes; upgrade to Python 3, Java 11, or PHP 7
    • If your app is a monolith, break it up into multiple independent functions. (You can also keep a monolith together and containerize it for Cloud Run as an alternative.)
    • Make appropriate application updates to support Cloud Functions

      The first three bullets are outside the scope of this video and its codelab, so we'll focus on the last one. The changes needed for your app include the following:

      1. Remove unneeded and/or unsupported configuration
      2. Remove use of the web framework and supporting routing code
3. For each of your functions, assign an appropriate name and accept the request object it will receive when it is called.

Regarding the last point, note that you can have multiple "endpoints" coming into a single function that processes the request path, calling other functions to handle those routes. If you have many functions in your app, maintaining separate functions for every endpoint becomes unwieldy; if the app is large enough, it may be better suited for Cloud Run. The sample app in this video and its corresponding code sample has only one function, so a single endpoint for that function works perfectly fine here.

This migration series focuses on our earliest users, starting with Python 2. Regarding the first point, the app.yaml file is deleted. Next, almost all Flask resources are removed except for the template renderer (the app still needs to output the same HTML as the original App Engine app). All app routes are removed, and there's no instantiation of the Flask app object. Finally, for the last step, the main function is renamed more appropriately to visitme() and given a request object parameter.
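The resulting function ends up with roughly the following shape (a simplified sketch; the real Module 11 code also renders the same HTML template and talks to the datastore):

def visitme(request):
    """HTTP Cloud Function; 'request' is the flask.Request object passed in by Cloud Functions."""
    visitor = '{}: {}'.format(request.headers.get('X-Forwarded-For', 'unknown'),
                              request.headers.get('User-Agent', 'unknown'))
    # ... store the visit and fetch the most recent ones, exactly as the App Engine app did ...
    return 'most recent visits rendered here for %s' % visitor

Deployment is then a matter of pointing gcloud at this function, for example gcloud functions deploy visitme --runtime python39 --trigger-http (flags shown for illustration).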

      This "migration module" starts with the (Python 3 version of the) Module 2 sample app, applies the steps above, and arrives at the migrated Module 11 app. Implementing those required changes is illustrated by this code "diff:"

      Migration of sample app to Cloud Functions

      Next steps

      If you're interested in trying this migration on your own, feel free to try the corresponding codelab which leads you step-by-step through this exercise and use the video for additional guidance.

      All migration modules, their videos (when published), codelab tutorials, START and FINISH code, etc., can be found in the migration repo. We hope to also one day cover other legacy runtimes like Java 8 as well as content for the next-generation Cloud Functions service, so stay tuned. If you're curious whether it's possible to write apps that can run on App Engine, Cloud Functions, or Cloud Run with no code changes at all, the answer is yes. Hope this content is useful for your consideration when modernizing your own serverless applications!

    Google Cloud projects: Tips and best practices

    By Peter Jacobsen, Google Technical Writer

    Least privilege

    Always apply the principle of least privilege when you provide access to Google Cloud resources. The best practice is to grant only the most limited predefined roles or custom roles that meet your needs.

    For more information, see Least privilege.

    Google Cloud billing alerts

    Set up Google Cloud billing alerts for your projects at specified intervals for early warning of usage patterns, and to help reduce costs.

    For more information, see Create, edit, or delete budgets and budget alerts.

    API quotas

    API quotas protect the Google infrastructure from excessive API requests. Traffic is blocked when the level of requests reaches the daily API quota level or a per-user rate limit.

To avoid disruptions due to an API quota level that's too low, set the quota for your app or API appropriately. Note that the lead time for a quota increase is one month.

    For more information, see API Quotas.

    Checklist for production-ready enterprise workloads

    Use this checklist to set up scalable, production-ready enterprise workloads. Note that the checklist assumes that you're an administrator with control over your company's Google Cloud resources.

    For more information, see Google Cloud setup checklist.

    Google Workspace domain ownership of projects

    Google Workspace domain ownership of your group's project lets you tie it into a Google Workspace account, rather than have it tied to a personal account.

    For more information, see Best practices for planning accounts and organizations.

    Identity-Aware Proxy (IAP)

    IAP lets you hide your website until you’re ready for people to see it. IAP establishes a central authorization layer for apps accessed by HTTPS, so you can adopt an app-level access-control model rather than use network-level firewalls. When IAP protects an app or resource, only users who have the correct Identity and Access Management (IAM) role can access it through the proxy.

    For more information, see Identity-Aware Proxy overview.

    Cloud Build

Cloud Build can import source code from a variety of repositories or cloud storage spaces, execute a build to your specifications, and produce artifacts, such as Docker containers or Java archives. You can configure builds to fetch dependencies and run unit tests, static analyses, and integration tests.

    For more information, see Cloud Build.

    Useful Google Cloud tools and services

    Google Cloud has many tools and services that can help you create and keep your projects in sync, such as:

    • Cloud Build: executes your builds on Google Cloud infrastructure.
    • Google Cloud Deploy: deploys releases continuously to Google Kubernetes Engine.
    • Container Registry: provides a single place for your team to manage Docker images and control access.
    • Artifact Registry: provides a single place for your organization to manage container images and language packages, such as Maven and npm.
    • Cloud Source Repositories: provides a single place for your team to store, manage, and track code.
    • Cloud Deployment Manager: automates the creation and management of Google Cloud resources.

    Google Groups for management across projects

Google Groups can help you manage teams across projects, which includes setting up group access through IAM. Groups such as project teams, departments, or classmates can communicate and collaborate with Google Groups. If you want to invite a group to an event or share documents with a group, you can send a single email to everyone in the group.

    For more information about how to set up a group, see Google Groups.

    Watch for Google suggestions

    Google provides many useful tips and suggestions for best practices within the context of your work. For example, if you go to a project that you haven't used in a while, you may get a warning like this one:

    If you click the link, you see a page that tells you how to apply role recommendations to help you enforce the principle of least privilege to ensure that principals have only the permissions that they actually need. Google offers many suggestions for best practices such as this one, so watch for them as you work.

    Here's an example of a useful in-console recommendation that you might see from your billing page:

    If you click Learn more, you arrive at a Cloud billing checklist, which is part of a longer billing-specific checklist that you might find useful.

    Here's another example found on the API & Services page:

    If you click Edit settings, you arrive on a page where you can change the settings.

    Prediction Framework, a time saver for Data Science prediction projects

    Posted by Álvaro Lamas, Héctor Parra, Jaime Martínez, Julia Hernández, Miguel Fernandes, Pablo Gil

Acquiring high-value customers using predicted Lifetime Value, taking specific actions on users with a high propensity to churn, generating and activating audiences based on machine-learning-processed signals… all of these marketing scenarios require analyzing first-party data, performing predictions on that data, and activating the results in the different marketing platforms like Google Ads as frequently as possible to keep the data fresh.

Feeding marketing platforms like Google Ads on a regular and frequent basis requires a robust, report-oriented, and cost-effective ETL & prediction pipeline. These pipelines are very similar regardless of the use case, and it's very easy to fall into reinventing the wheel every time, or to manually copy & paste structural code, increasing the risk of introducing errors.

    Wouldn't it be great to have a common reusable structure and just add the specific code for each of the stages?

    Here is where Prediction Framework plays a key role in helping you implement and accelerate your first-party data prediction projects by providing the backbone elements of the predictive process.

Prediction Framework is a fully customizable pipeline that simplifies the implementation of prediction projects. You only need to have the input data source, the logic to extract and process the data, and a Vertex AutoML model ready to use along with the right feature list, and the framework takes charge of creating and deploying the required artifacts. With a simple configuration, all the common artifacts of the different stages of this type of project are created and deployed for you: data extraction, data preparation (aka feature engineering), filtering, prediction, and post-processing, in addition to some other operational functionality including backfilling, throttling (for API limits), synchronization, storage, and reporting.

The Prediction Framework was built to be hosted on Google Cloud Platform. It makes use of Cloud Functions to do all the data processing (extraction, preparation, filtering, and post-prediction processing); Firestore, Pub/Sub, and Cloud Scheduler for the throttling system and to coordinate the different phases of the predictive process; Vertex AutoML to host your machine learning model; and BigQuery as the final storage of your predictions.

    Prediction Framework Architecture

To get started with the Prediction Framework, a configuration file needs to be prepared with some environment variables about the Google Cloud project to be used, the data sources, the ML model to make the predictions, and the scheduler for the throttling system. In addition, custom queries for the data extraction, preparation, filtering, and post-processing need to be added when customizing the deploy files. Then, the deployment is done automatically using a deployment script provided by the tool.

    Once deployed, all the stages will be executed one after the other, storing the intermediate and final data in the BigQuery tables:

• Extract: this step will, on a timely basis, query the transactions from the data source corresponding to the run date (scheduler or backfill run date) and store them in a new table in the local project's BigQuery.
• Prepare: immediately after the extract of the transactions for one specific date is available, the data will be picked up from the local BigQuery and processed according to the specs of the model. Once the data is processed, it will be stored in a new table in the local project's BigQuery.
• Filter: this step will query the data stored by the prepare process, filter the required data, and store it in the local project's BigQuery (e.g., only taking into consideration new customers' transactions; what counts as a new customer is up to the instantiation of the framework for the specific use case and is covered later).
• Predict: once the new customers are stored, this step will read them from BigQuery and call the prediction using the Vertex API. A formula based on the result of the prediction could be applied to tune the value or to apply thresholds. Once the data is ready, it will be stored in BigQuery within the target project.
• Post_process: a formula could be applied to the AutoML batch results to tune the value or to apply thresholds. Once the data is ready, it will be stored in BigQuery within the target project.

One of the powerful features of the Prediction Framework is that it allows backfilling directly from the BigQuery user interface, so if you need to reprocess a whole period of time, it can be done in literally four clicks.

    In summary: Prediction Framework simplifies the implementation of first-party data prediction projects, saving time and minimizing errors of manual deployments of recurrent architectures.

For additional information and to start experimenting, you can visit the Prediction Framework repository on GitHub.

    Knative applies to become a CNCF incubating project


    In 2018, the Knative project was founded and released by Google, and was subsequently developed in close partnership with IBM, Red Hat, VMware, and SAP. The project provides a serverless experience layer on Kubernetes, providing the building blocks you need to build and deploy modern, container-based serverless applications. Over the last three years, Knative has become the most widely-installed serverless layer on Kubernetes. More recently, Knative 1.0 was released, reaching an important milestone that was made possible thanks to the contributions and collaboration of over 600 developers in the community.

    Google has worked closely with key maintainers and partners on the evolution of Knative, including conformance definition and stability ahead of the 1.0 milestone. To enable the next phase of community-driven innovation in Knative, today we have submitted Knative to the Cloud Native Computing Foundation (CNCF) for consideration as an incubating project, which begins the process to donate the Knative trademark, IP, and code.

As a leader in serverless computing, we're committed to the future of Knative and to offering Knative 1.0-conformant Cloud Run and Cloud Run for Anthos products. Finding a home in the CNCF secures Knative's long-term future and encourages continuing and open innovation. This donation recognizes the community's adoption of and investment in Knative, and will encourage further multi-vendor innovation, broader education, and training.

    At Google, we believe that using open source comes with a responsibility to contribute, sustain, and improve the projects that help drive innovation and make better software. We are excited to see how developers will continue to build and innovate in serverless using Knative.

    By Alexandra Bush and Edd Wilder-James, Google Open Source

    Four areas of open source contributions from Cloud Databases

    Open Cloud enables you to develop software faster, innovate more easily, and scale more efficiently—while also reducing technology risk. Google has a long history of leadership in open source, and today, I want to look back at our activities around open source projects, for databases, over the past year.

    Give developers the best tools to be efficient

    Developers choose to build applications with managed database services on Google Cloud to benefit from velocity, scalability, security, and performance. To enable you to be most efficient and deliver your best possible work, we deliver tools and frameworks that work with your preferred development environments, no matter if you develop in the cloud or on premises. To make local testing, building and continuous integration easier for our cloud-native databases, we released emulators for Cloud Spanner, Firestore, and Cloud Bigtable so that you can test your code wherever you develop it - without the need to create or re-create cloud infrastructure with every test run.
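For example, testing against the Firestore emulator can be as simple as pointing the client library at it with an environment variable (a sketch; the host/port and project ID are placeholders matching whatever you pass to gcloud emulators firestore start, and older client versions may also need anonymous credentials):

import os
from google.cloud import firestore             # pip install google-cloud-firestore

# Point the client at a locally running emulator instead of real cloud infrastructure.
os.environ['FIRESTORE_EMULATOR_HOST'] = 'localhost:8080'

db = firestore.Client(project='demo-project')  # any project ID works against the emulator
db.collection('visits').add({'visitor': 'test', 'path': '/'})
print([doc.to_dict() for doc in db.collection('visits').stream()])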

Another area where we are helping developers is with instrumentation of Cloud SQL for easier debugging and performance tuning. With Cloud SQL Insights it is easier than ever to pinpoint underperforming SQL statements. That said, without additional instrumentation, it can be cumbersome to identify the source code or microservice that issued that SQL, let alone tie a SQL statement to a client session and its context. So we released Sqlcommenter as an open source library that automatically adds this instrumentation as SQL comments in queries generated by popular ORMs like Hibernate, Django, Sqlalchemy, and others (repo blog). We didn't stop there, but merged Sqlcommenter with OpenTelemetry (blog) to add SQL insights from instrumented queries back to OpenTelemetry traces.

    Lastly, we want to broaden access to our differentiated offerings, like Spanner. The recently announced Spanner PostgreSQL interface allows organizations to access Spanner’s industry-leading consistency and availability at scale using tools and skills from the popular PostgreSQL ecosystem. This new way of working with Spanner provides familiarity for developers and portability for administrators. (blog) Learn more in the documentation or sign up for the preview today.

    Provide connectivity that is simple and secure

    Connecting to APIs and databases from an application running in the cloud should be simple and secure. That’s why we recommend using IAM and Application Default Credentials when authenticating to other services. The Cloud SQL Proxy (repo) has been doing this and also setting up firewalls for you for a while. It works by running a local client either inside your VM or a GKE cluster. This year, we added libraries for Java (repo) and Python (repo) that can provide similar functionality without the overhead of running an extra client such as the proxy.
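As a sketch of what the Python library looks like in practice (the instance connection name and credentials below are placeholders, and the example assumes the cloud-sql-python-connector and pg8000 packages):

import sqlalchemy
from google.cloud.sql.connector import Connector    # pip install "cloud-sql-python-connector[pg8000]"

connector = Connector()

def getconn():
    # Authenticates with Application Default Credentials and opens an encrypted
    # connection to the instance, with no separate proxy process to run.
    return connector.connect(
        'my-project:my-region:my-instance',          # placeholder instance connection name
        'pg8000',
        user='app-user',
        password='app-password',
        db='app-db',
    )

pool = sqlalchemy.create_engine('postgresql+pg8000://', creator=getconn)

with pool.connect() as conn:
    print(conn.execute(sqlalchemy.text('SELECT 1')).scalar())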

    Cloud Spanner also offers an open source adapter for its new PostgreSQL interface (repo). This local proxy allows tools, starting with psql, to connect to a Spanner database using the PostgreSQL wire protocol.


    Manage cloud infrastructure with the tools of your choice

    When it comes to provisioning, monitoring, and managing your cloud database services, flexibility and choice are important. We provide you with our cloud console, gcloud cli, and APIs as well as our own Deployment Manager. That said, you may prefer different ways to manage cloud infrastructure - whether through interactive tools or scripts or embedded into CI/CD pipelines that support GitOps or other controls, checks, and balances. Terraform is one of those open tools that is very popular - and we ensure that our cloud databases can be managed from it as documented in this blog about creating Spanner instances with Terraform.

If you manage the majority of your resources with Kubernetes, either directly or through package managers like Helm, then our Kubernetes Config Connector (KCC) might be for you. In a nutshell, KCC exposes Google Cloud services such as Cloud SQL, Spanner, and others as Custom Resources in Kubernetes. This allows you to create and reconcile cloud resources that live outside of Kubernetes just like K8s-native objects.

Once you are managing cloud infrastructure with CI/CD, the next step is to extend that same mechanism to manage objects within your databases, such as tables, indexes, and views. To that end, we have released a Liquibase extension for Cloud Spanner.

    Help you to move data with confidence

    Cloud journeys often involve moving data either in a lift and shift process or sometimes replatforming to a different database. Whatever your journey, we want to simplify the process and give you the confidence that your migration is successful.

    For enterprise users with Oracle databases, we have several open source projects. First, we have the Optimus Prime database assessment tool (repo) that queries your database and collects information about schemas and historic performance to be analyzed for migration complexity and consolidation potential. Our own professional services teams have been using this toolset to plan migrations to Bare Metal Solution for Oracle.

    Some Oracle users are looking for opportunities to transform their workloads to fit with their bigger strategy of modernizing applications with Kubernetes. For this group we developed and open sourced the El Carro Kubernetes operator for Oracle. This not only automates database lifecycle tasks for systems running on Kubernetes, but also exposes declarative APIs for these operations.

    If your application supports replatforming from Oracle to PostgreSQL, then we have a toolset for schema conversion along with dataflow pipelines that will read the output of a change data capture job and load it into a PostgreSQL database. What a great use-case for Datastream - our new serverless change data capture service.

    Another case of heterogeneous database migration is to move MySQL or PostgreSQL databases to Cloud Spanner. HarbourBridge helps with the evaluation and data migration, and our latest contribution was adding support for DynamoDB as a source database. Part of every heterogeneous migration should be to validate that the source and target data are matching - we have released the Data Validation Toolkit for that use-case. DVT can connect to a number of source and target databases and compare the data on each side - giving you the confidence that your migration did not miss or change any records.

    Conclusion

    Whether you are migrating existing databases or you are building your next application in the cloud - we want to make your journey as comfortable and seamless as possible. Open source projects play a big role in meeting you where you are and providing you with the connectivity options, language support, and tools you want for management and migrations.

    By Bjoern Rost, Product Manager, Google Cloud Databases

    Migrating App Engine push queues to Cloud Tasks

    Posted by Wesley Chun (@wescpy), Developer Advocate, Google Cloud


    Introduction

    The previous Module 7 episode of Serverless Migration Station gave developers an idea of how App Engine push tasks work and how to implement their use in an existing App Engine ndb Flask app. In this Module 8 episode, we migrate this app from the App Engine Datastore (ndb) and Task Queue (taskqueue) APIs to Cloud NDB and Cloud Tasks. This makes your app more portable and provides a smoother transition from Python 2 to 3. The same principle applies to upgrading other legacy App Engine apps from Java 8 to 11, PHP 5 to 7, and up to Go 1.12 or newer.

    Over the years, many of the original App Engine services such as Datastore, Memcache, and Blobstore, have matured to become their own standalone products, for example, Cloud Datastore, Cloud Memorystore, and Cloud Storage, respectively. The same is true for App Engine Task Queues, whose functionality has been split out to Cloud Tasks (push queues) and Cloud Pub/Sub (pull queues), now accessible to developers and applications outside of App Engine.

    Migrating App Engine push queues to Cloud Tasks video

    Migrating to Cloud NDB and Cloud Tasks

    The key updates being made to the application:

    1. Add support for Google Cloud client libraries in the app's configuration
    2. Switch from App Engine APIs to their standalone Cloud equivalents
    3. Make required library adjustments, e.g., add use of Cloud NDB context manager
    4. Complete additional setup for Cloud Tasks
    5. Make minor updates to the task handler itself

    The bulk of the updates are in #3 and #4 above, and those are reflected in the following "diff"s for the main application file:

Primary differences switching to Cloud NDB & Cloud Tasks
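As a rough sketch of what updates #3 and #4 amount to in code (queue, handler, and model names here are illustrative; see the Module 8 repo for the real version):

import json
from google.cloud import ndb, tasks_v2          # google-cloud-ndb, google-cloud-tasks

ndb_client = ndb.Client()
tasks_client = tasks_v2.CloudTasksClient()
parent = tasks_client.queue_path('my-project', 'us-central1', 'default')

class Visit(ndb.Model):                          # same data model, now via Cloud NDB
    visitor = ndb.StringProperty()
    timestamp = ndb.DateTimeProperty(auto_now_add=True)

def store_visit(remote_addr, user_agent):
    with ndb_client.context():                   # Cloud NDB calls require an explicit context
        Visit(visitor='{}: {}'.format(remote_addr, user_agent)).put()

def enqueue_cleanup(oldest_timestamp):
    task = {
        'app_engine_http_request': {             # push the task back to an App Engine handler
            'relative_uri': '/trim',
            'body': json.dumps({'oldest': oldest_timestamp}).encode(),
            'headers': {'Content-Type': 'application/json'},
        }
    }
    tasks_client.create_task(parent=parent, task=task)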

    With these changes implemented, the web app works identically to that of the Module 7 sample, but both the database and task queue functionality have been completely swapped to using the standalone/unbundled Cloud NDB and Cloud Tasks libraries… congratulations!

    Next steps

    To do this exercise yourself, check out our corresponding codelab which leads you step-by-step through the process. You can use this in addition to the video, which can provide guidance. You can also review the push tasks migration guide for more information. Arriving at a fully-functioning Module 8 app featuring Cloud Tasks sets the stage for a larger migration ahead in Module 9. We've accomplished the most important step here, that is, getting off of the original App Engine legacy bundled services/APIs. The Module 9 migration from Python 2 to 3 and Cloud NDB to Cloud Firestore, plus the upgrade to the latest version of the Cloud Tasks client library are all fairly optional, but they represent a good opportunity to perform a medium-sized migration.

    All migration modules, their videos (when available), codelab tutorials, and source code, can be found in the migration repo. While the content focuses initially on Python users, we will cover other legacy runtimes soon so stay tuned.

    How to use App Engine push queues in Flask apps

    Posted by Wesley Chun (@wescpy), Developer Advocate, Google Cloud


    Introduction

    Since its original launch in 2008, many of the core Google App Engine services such as Datastore, Memcache, and Blobstore, have matured to become their own standalone products: for example, Cloud Datastore, Cloud Memorystore, and Cloud Storage, respectively. The same is true for App Engine Task Queues with Cloud Tasks. Today's Module 7 episode of Serverless Migration Station reviews how App Engine push tasks work, by adding this feature to an existing App Engine ndb Flask app.

    App Engine push queues in Flask apps video

    That app is where we left off at the end of Module 1, migrating its web framework from App Engine webapp2 to Flask. The app registers web page visits, creating a Datastore Entity for each. After a new record is created, the ten most recent visits are displayed to the end-user. If the app only shows the latest visits, there is no reason to keep older visits, so the Module 7 exercise adds a push task that deletes all visits older than the oldest one shown. Tasks execute asynchronously outside the normal application flow.

    Key updates

    The following are the changes being made to the application:

    1. Add use of App Engine Task Queues (taskqueue) API
    2. Determine oldest visit displayed, logging and saving that timestamp
    3. Create task to delete old visits
    4. Update web page template to display timestamp threshold
    5. Log how many and which visits (by Entity ID) are deleted

Except for #4, which occurs in the HTML template file, these updates are reflected in the "diff"s for the main application file:

Adding App Engine push tasks: application source code differences
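For reference, the heart of updates #1 and #3 is little more than a call to taskqueue.add() (a sketch; the /trim URL and parameter name are illustrative, with the matching handler registered in the same Flask app):

from google.appengine.api import taskqueue

def enqueue_cleanup(oldest_timestamp):
    # Enqueue a push task; App Engine later POSTs these params to the /trim route,
    # whose handler deletes all visits older than the oldest one currently displayed.
    taskqueue.add(url='/trim', params={'oldest': oldest_timestamp})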

    With these changes implemented, the web app now shows the end-user which visits will be deleted by the new push task:

Sample application output: the last ten site visits, with the older visits that will be deleted highlighted

    Next steps

    To do this exercise yourself, check out our corresponding codelab which leads you step-by-step through the process. You can use this in addition to the video, which can provide guidance. You can also review the push queue documentation for more information. Arriving at a fully-functioning Module 7 app featuring App Engine push tasks sets the stage for migrating it to Cloud Tasks (and Cloud NDB) ahead in Module 8.

    All migration modules, their videos (when available), codelab tutorials, and source code, can be found in the migration repo. While the content focuses initially on Python users, we will cover other legacy runtimes soon so stay tuned.