Tag Archives: DataStore

Migrating from App Engine Memcache to Cloud Memorystore (Module 13)

Posted by Wesley Chun (@wescpy), Developer Advocate, Google Cloud

Introduction and background

The previous Module 12 episode of the Serverless Migration Station video series demonstrated how to add App Engine Memcache usage to an existing app that has transitioned from the webapp2 framework to Flask. Today's Module 13 episode continues its modernization by demonstrating how to migrate that app from Memcache to Cloud Memorystore. Moving from legacy APIs to standalone Cloud services makes apps more portable and provides an easier transition from Python 2 to 3. It also makes it possible to shift to other Cloud compute platforms should that be desired or advantageous. Developers benefit from upgrading to modern language releases and gain added flexibility in application-hosting options.

While App Engine Memcache provides a basic, low-overhead, serverless caching service, Cloud Memorystore "takes it to the next level" as a standalone product. Rather than a proprietary caching engine, Cloud Memorystore gives users the option to select from a pair of open source engines, Memcached or Redis, each of which provides additional features unavailable from App Engine Memcache. Cloud Memorystore is typically more cost efficient at-scale, offers high availability, provides automatic backups, etc. On top of this, one Memorystore instance can be used across many applications as well as incorporates improvements to memory handling, configuration tuning, etc., gained from experience managing a huge fleet of Redis and Memcached instances.

While Memcached is more similar to Memcache in usage/features, Redis has a much richer set of data structures that enable powerful application functionality if utilized. Redis has also been recognized as the most loved database by developers in StackOverflow's annual developers survey, and it's a great skill to pick up. For these reasons, we chose Redis as the caching engine for our sample app. However, if your apps' usage of App Engine Memcache is deeper or more complex, a migration to Cloud Memorystore for Memcached may be a better option as a closer analog to Memcache.

Migrating to Cloud Memorystore for Redis featured video

Performing the migration

The sample application registers individual web page "visits," storing visitor information such as IP address and user agent. In the original app, the most recent visits are cached into Memcache for an hour and used for display if the same user continuously refreshes their browser during this period; caching is a one way to counter this abuse. New visitors or cache expiration results new visits as well as updating the cache with the most recent visits. Such functionality must be preserved when migrating to Cloud Memorystore for Redis.

Below is pseudocode representing the core part of the app that saves new visits and queries for the most recent visits. Before, you can see how the most recent visits are cached into Memcache. After completing the migration, the underlying caching infrastructure has been swapped out in favor of Memorystore (via language-specific Redis client libraries). In this migration, we chose Redis version 5.0, and we recommend the latest versions, 5.0 and 6.x at the time of this writing, as the newest releases feature additional performance benefits, fixes to improve availability, and so on. In the code snippets below, notice how the calls between both caching systems are nearly identical. The bolded lines represent the migration-affected code managing the cached data.

Switching from App Engine Memcache to Cloud Memorystore for Redis

Wrap-up

The migration covered begins with the Module 12 sample app ("START"). Migrating the caching system to Cloud Memorystore and other requisite updates results in the Module 13 sample app ("FINISH") along with an optional port to Python 3. To practice this migration on your own to help prepare for your own migrations, follow the codelab to do it by-hand while following along in the video.

While the code migration demonstrated seems straightforward, the most critical change is that Cloud Memorystore requires dedicated server instances. For this reason, a Serverless VPC connector is also needed to connect your App Engine app to those Memorystore instances, requiring more dedicated servers. Furthermore, neither Cloud Memorystore nor Cloud VPC are free services, and neither has an "Always free" tier quota. Before moving forward this migration, check the pricing documentation for Cloud Memorystore for Redis and Serverless VPC access to determine cost considerations before making a commitment.

One key development that may affect your decision: In Fall 2021, the App Engine team extended support of many of the legacy bundled services like Memcache to next-generation runtimes, meaning you are no longer required to migrate to Cloud Memorystore when porting your app to Python 3. You can continue using Memcache even when upgrading to 3.x so long as you retrofit your code to access bundled services from next-generation runtimes.

A move to Cloud Memorystore and today's migration techniques will be here if and when you decide this is the direction you want to take for your App Engine apps. All Serverless Migration Station content (codelabs, videos, source code [when available]) can be accessed at its open source repo. While our content initially focuses on Python users, we plan to cover other language runtimes, so stay tuned. For additional video content, check out our broader Serverless Expeditions series.

Jetpack DataStore – wrap up

Posted by Simona Stojanovic, Android Developer Relations Engineer

MADSkills Jetpack DataStore 

Now that our MAD Skills series on Jetpack DataStore is complete, let’s do a quick wrap up of all the things we’ve covered in each episode:


Episode 1 — Introduction to Jetpack DataStore

We started with the basics of Jetpack DataStore — how it works and the changes and improvements it brings compared to SharedPreferences. We also discussed how to decide between its two implementations, Preferences and Proto DataStore, as well as how to choose between DataStore and Room.

Check out the blog post and the video:


Episode 2 — All about Preferences DataStore

Go deeper into Preferences DataStore: how to create it, read, and write data and how to handle exceptions, all of which should, hopefully, provide you with enough information to decide if it’s the right choice for your app.

Here’s the blog post and the video:


Episode 3 — All about Proto DataStore

Learn about Proto DataStore: how to create it, read, and write data and how to handle exceptions, to better understand the scenarios that make Proto a great choice.

If you prefer reading, here’s the blog post, otherwise, here’s the video:


Episode 4 — DataStore-serialization, sync work, and dependency injection

Episode 4 introduces several different concepts related to DataStore to understand how it works under the hood, so that you have everything at your disposal to use it in a production environment. We focus on: Kotlin Data class serialization, synchronous work, and dependency injection with Hilt.

Take a look at our blogs and video:

DataStore and dependency injection

DataStore and Kotlin serialization

DataStore and synchronous work


Episode 5 — DataStore-handling data migration and testing

Finally, in the fifth episode of our Jetpack DataStore series, we cover two additional concepts around DataStore: DataStore-to-DataStore migrations and testing. Hopefully, this will provide you all the information you need to add DataStore to your app successfully.

Check out the blogs and the video:

DataStore and data migration

DataStore and testing

Cloud NDB to Cloud Datastore migration

Posted by Wesley Chun (@wescpy), Developer Advocate, Google Cloud

An optional migration

Serverless Migration Station is a mini-series from Serverless Expeditions focused on helping users on one of Google Cloud's serverless compute platforms modernize their applications. The video today demonstrates how to migrate a sample app from Cloud NDB (or App Engine ndb) to Cloud Datastore. While Cloud NDB suffices as a current solution for today's App Engine developers, this optional migration is for those who want to consolidate their app code to using a single client library to talk to Datastore.

Cloud Datastore started as Google App Engine's original database but matured to becoming its own standalone product in 2013. At that time, native client libraries were created for the new product so non-App Engine apps as well as App Engine second generation apps could access the service. Long-time developers have been using the original App Engine service APIs to access Datastore; for Python, this would be App Engine ndb. While the legacy ndb service is still available, its limitations and lack of availability in Python 3 are why we recommend users switch to standalone libraries like Cloud NDB in the preceding video in this series.

While Cloud NDB lets users break free from proprietary App Engine services and upgrade their applications to Python 3, it also gives non-App Engine apps access to Datastore. However, Cloud NDB's primary role is a transition tool for Python 2 App Engine developers. Non-App Engine developers and new Python 3 App Engine developers are directed to the Cloud Datastore native client library, not Cloud NDB.

As a result, those with a collection of Python 2 or Python 3 App Engine apps as well as non-App Engine apps may be using completely different libraries (ndb, Cloud NDB, Cloud Datastore) to connect to the same Datastore product. Following the best practices of code reuse, developers should consider consolidating to a single client library to access Datastore. Shared libraries provide stability and robustness with code that's constantly tested, debugged, and battle-proven. Module 2 showed users how to migrate from App Engine ndb to Cloud NDB, and today's Module 3 content focuses on migrating from Cloud NDB to Cloud Datastore. Users can also go straight from ndb directly to Cloud Datastore, skipping Cloud NDB entirely.

Migration sample and next steps

Cloud NDB follows an object model identical to App Engine ndb and is deliberately meant to be familiar to long-time Python App Engine developers while use of the Cloud Datastore client library is more like accessing a JSON document store. Their querying styles are also similar. You can compare and contrast them in the "diffs" screenshot below and in the video.

The diffs between the Cloud NDB and Cloud Datastore versions of the sample app

The "diffs" between the Cloud NDB and Cloud Datastore versions of the sample app

All that said, this migration is optional and only useful if you wish to consolidate to using a single client library. If your Python App Engine apps are stable with ndb or Cloud NDB, and you don't have any code using Cloud Datastore, there's no real reason to move unless Cloud Datastore has a compelling feature inaccessible from your current client library. If you are considering this migration and want to try it on a sample app before considering for yours, see the corresponding codelab and use the video for guidance.

It begins with the Module 2 code completed in the previous codelab/video; use your solution or ours as the "START". Both Python 2 (Module 2a folder) and Python 3 (Module 2b folder) versions are available. The goal is to arrive at the "FINISH" with an identical, working app but using a completely different Datastore client library. Our Python 2 FINISH can be found in the Module 3a folder while Python 3's FINISH is in the Module 3b folder. If something goes wrong during your migration, you can always rollback to START, or compare your solution with our FINISH. We will continue our Datastore discussion ahead in Module 6 as Cloud Firestore represents the next generation of the Datastore service.

All of these learning modules, corresponding videos (when published), codelab tutorials, START and FINISH code, etc., can be found in the migration repo. We hope to also one day cover other legacy runtimes like Java 8 and others, so stay tuned. Up next in Module 4, we'll take a different turn and showcase a product crossover, showing App Engine developers how to containerize their apps and migrate them to Cloud Run, our scalable container-hosting service in the cloud. If you can't wait for either Modules 4 or 6, try out their respective codelabs or access the code samples in the table at the repo above. Migrations aren't always easy, and we hope content like this helps you modernize your apps.

Migrating from App Engine ndb to Cloud NDB

Posted by Wesley Chun (@wescpy), Developer Advocate, Google Cloud

Migrating to standalone services

Today we're introducing the first video showing long-time App Engine developers how to migrate from the App Engine ndb client library that connects to Datastore. While the legacy App Engine ndb service is still available for Datastore access, new features and continuing innovation are going into Cloud Datastore, so we recommend Python 2 users switch to standalone product client libraries like Cloud NDB.

This video and its corresponding codelab show developers how to migrate the sample app introduced in a previous video and gives them hands-on experience performing the migration on a simple app before tackling their own applications. In the immediately preceding "migration module" video, we transitioned that app from App Engine's original webapp2 framework to Flask, a popular framework in the Python community. Today's Module 2 content picks up where that Module 1 leaves off, migrating Datastore access from App Engine ndb to Cloud NDB.

Migrating to Cloud NDB opens the doors to other modernizations, such as moving to other standalone services that succeed the original App Engine legacy services, (finally) porting to Python 3, breaking up large apps into microservices for Cloud Functions, or containerizing App Engine apps for Cloud Run.

Moving to Cloud NDB

App Engine's Datastore matured to becoming its own standalone product in 2013, Cloud Datastore. Cloud NDB is the replacement client library designed for App Engine ndb users to preserve much of their existing code and user experience. Cloud NDB is available in both Python 2 and 3, meaning it can help expedite a Python 3 upgrade to the second generation App Engine platform. Furthermore, Cloud NDB gives non-App Engine apps access to Cloud Datastore.

As you can see from the screenshot below, one key difference between both libraries is that Cloud NDB provides a context manager, meaning you would use the Python with statement in a similar way as opening files but for Datastore access. However, aside from moving code inside with blocks, no other changes are required of the original App Engine ndb app code that accesses Datastore. Of course your "YMMV" (your mileage may vary) depending on the complexity of your code, but the goal of the team is to provide as seamless of a transition as possible as well as to preserve "ndb"-style access.

The difference between the App Engine ndb and Cloud NDB versions of the sample app

The "diffs" between the App Engine ndb and Cloud NDB versions of the sample app

Next steps

To try this migration yourself, hit up the corresponding codelab and use the video for guidance. This Module 2 migration sample "STARTs" with the Module 1 code completed in the previous codelab (and video). Users can use their solution or grab ours in the Module 1 repo folder. The goal is to arrive at the end with an identical, working app that operates just like the Module 1 app but uses a completely different Datastore client library. You can find this "FINISH" code sample in the Module 2a folder. If something goes wrong during your migration, you can always rollback to START, or compare your solution with our FINISH. Bonus content migrating to Python 3 App Engine can also be found in the video and codelab, resulting in a second FINISH, the Module 2b folder.

All of these learning modules, corresponding videos (when published), codelab tutorials, START and FINISH code, etc., can be found in the migration repo. We hope to also one day cover other legacy runtimes like Java 8 and others, so stay tuned! Developers should also check out the official Cloud NDB migration guide which provides more migration details, including key differences between both client libraries.

Ahead in Module 3, we will continue the Cloud NDB discussion and present our first optional migration, helping users move from Cloud NDB to the native Cloud Datastore client library. If you can't wait, try out its codelab found in the table at the repo above. Migrations aren't always easy; we hope this content helps you modernize your apps and shows we're focused on helping existing users as much as new ones.

Prefer Storing Data with Jetpack DataStore

Posted by Florina Muntenescu, Android Developer Advocate, Rohit Sathyanarayana, Software Engineer

Welcome Jetpack DataStore, now in alpha - a new and improved data storage solution aimed at replacing SharedPreferences. Built on Kotlin coroutines and Flow, DataStore provides two different implementations: Proto DataStore, that lets you store typed objects (backed by protocol buffers) and Preferences DataStore, that stores key-value pairs. Data is stored asynchronously, consistently, and transactionally, overcoming most of the drawbacks of SharedPreferences.

SharedPreferences vs DataStore

SharedPreferences

* SharedPreferences has a synchronous API that can appear safe to call on the UI thread, but which actually does disk I/O operations. Furthermore, apply() blocks the UI thread on fsync(). Pending fsync() calls are triggered every time any service starts or stops, and every time an activity starts or stops anywhere in your application. The UI thread is blocked on pending fsync() calls scheduled by apply(), often becoming a source of ANRs.

** SharedPreferences throws parsing errors as runtime exceptions.

In both implementations, DataStore saves the preferences in a file and performs all data operations on Dispatchers.IO unless specified otherwise.

While both Preferences DataStore and Proto DataStore allow saving data, they do this in different ways:

  • Preference DataStore, like SharedPreferences, has no way to define a schema or to ensure that keys are accessed with the correct type.
  • Proto DataStore lets you define a schema using Protocol buffers. Using Protobufs allows persisting strongly typed data. They are faster, smaller, simpler, and less ambiguous than XML and other similar data formats. While Proto DataStore requires you to learn a new serialization mechanism, we believe that the strongly typed schema advantage brought by Proto DataStore is worth it.

Room vs DataStore

If you have a need for partial updates, referential integrity, or support for large/complex datasets, you should consider using Room instead of DataStore. DataStore is ideal for small , simple datasets and does not support partial updates or referential integrity.

Using DataStore

Start by adding the DataStore dependency. If you’re using Proto DataStore, make sure you also add the proto dependency:

// Preferences DataStore
implementation "androidx.datastore:datastore-preferences:1.0.0-alpha01"


// Proto DataStore
implementation  "androidx.datastore:datastore-core:1.0.0-alpha01"

When working with Proto DataStore, you define your schema in a proto file in the app/src/main/proto/ directory. See the protobuf language guide for more information on defining a proto schema.

syntax = "proto3";

option java_package = "<your package name here>";
option java_multiple_files = true;

message Settings {
  int my_counter = 1;
}

Create the DataStore

Create the DataStore with the Context.createDataStore() extension functions.

// with Preferences DataStore
val dataStore: DataStore<Preferences> = context.createDataStore(
    name = "settings"
)

If you’re using Proto DataStore, you’ll also have to implement the Serializer interface to tell DataStore how to read and write your data type.

object SettingsSerializer : Serializer<Settings> {
    override fun readFrom(input: InputStream): Settings {
        try {
            return Settings.parseFrom(input)
        } catch (exception: InvalidProtocolBufferException) {
            throw CorruptionException("Cannot read proto.", exception)
        }
    }

    override fun writeTo(t: Settings, output: OutputStream) = t.writeTo(output)
}


// with Proto DataStore
val settingsDataStore: DataStore<Settings> = context.createDataStore(
    fileName = "settings.pb",
    serializer = SettingsSerializer
)

Read data from DataStore

DataStore exposes the stored data in a Flow, either in a Preferences object or as the object defined in your proto schema. DataStore ensures that data is retrieved on Dispatchers.IO so your UI thread isn’t blocked.

With Preferences DataStore:

val MY_COUNTER = preferencesKey<Int>("my_counter")
val myCounterFlow: Flow<Int> = dataStore.data
     .map { currentPreferences ->
        // Unlike Proto DataStore, there's no type safety here.
        currentPreferences[MY_COUNTER] ?: 0   
   }

With Proto DataStore:

val myCounterFlow: Flow<Int> = settingsDataStore.data
    .map { settings ->
        // The myCounter property is generated for you from your proto schema!
        settings.myCounter 
    }

Write data to DataStore

To write data, DataStore offers a suspending DataStore.updateData() function that gives you the current state of the stored data as a parameter—either as a Preferences object, or an instance of the object defined in the proto schema. The updateData() function updates the data transactionally in an atomic read-write-modify operation. The coroutine completes once the data is persisted on disk.

Preferences DataStore also provides a DataStore.edit() function to make it easier to update data. Instead of receiving a Preferences object, you receive a MutablePreferences object which you edit. As with updateData(), the changes are applied to disk after the transform block completes, and the coroutine completes once data is persisted to disk.

With Preferences DataStore:

suspend fun incrementCounter() {
    dataStore.edit { settings ->
        // We can safely increment our counter without losing data due to races!
        val currentCounterValue = settings[MY_COUNTER] ?: 0
        settings[MY_COUNTER] = currentCounterValue + 1
    }
}

With Proto DataStore:

suspend fun incrementCounter() {
    settingsDataStore.updateData { currentSettings ->
        // We can safely increment our counter without losing data due to races!
        currentSettings.toBuilder()
            .setMyCounter(currentSettings.myCounter + 1)
            .build()
    }
}

Migrate from SharedPreferences to DataStore

To migrate from SharedPreferences to DataStore, you need to pass in a SharedPreferencesMigration object to the DataStore builder. DataStore can automatically migrate from SharedPreferences to DataStore for you. Migrations are run before any data access can occur in DataStore. This means that your migration must have succeeded before DataStore.data returns any values and before DataStore.updateData() can update the data.

If you’re migrating to Preferences DataStore, you can use the default SharedPreferencesMigration implementation and just pass in the name used to construct your SharedPreferences.

With Preferences DataStore:

val dataStore: DataStore<Preferences> = context.createDataStore(
    name = "settings",
    migrations = listOf(SharedPreferencesMigration(context, "settings_preferences"))
)

When migrating to Proto DataStore, you’ll have to implement a mapping function that defines how to migrate from the key-value pairs used by SharedPreferences to the DataStore schema you defined.

With Proto DataStore:

val settingsDataStore: DataStore<Settings> = context.createDataStore(
    produceFile = { File(context.filesDir, "settings.preferences_pb") },
    serializer = SettingsSerializer,
    migrations = listOf(
        SharedPreferencesMigration(
            context,
            "settings_preferences"            
        ) { sharedPrefs: SharedPreferencesView, currentData: UserPreferences ->
            // Map your sharedPrefs to your type here
          }
    )
)

Wrap-up

SharedPreferences comes with several drawbacks: a synchronous API that can appear safe to call on the UI thread, no mechanism for signaling errors, lack of transactional API, and more. DataStore is a replacement for SharedPreferences that addresses most of these shortcomings. DataStore includes a fully asynchronous API using Kotlin coroutines and Flow, handles data migration, guarantees data consistency, and handles data corruption.

As DataStore is still in alpha, we need your help to make it better! To get started, find out more about DataStore in our documentation and try it out by taking our codelabs: Preferences DataStore codelab and Proto DataStore codelab. Then, let us know how we can improve the library by creating issues on the Issue Tracker.