Category Archives: Google Cloud Platform Blog

Product updates, customer stories, and tips and tricks on Google Cloud Platform

How Qubit and GCP helped Ubisoft create personalized customer experiences



Editor’s note: Today’s blog post comes from Alex Olivier, product manager at Qubit. He’ll be taking us through the solution Qubit provided for Ubisoft, one of the world’s largest gaming companies, to help them personalize customer experiences through data analysis.

Our platform helps brands across a range of sectors — from retail and gaming to travel and hospitality — deliver a personalized digital experience for users. To do so, we analyze thousands of data points throughout a customer’s journey, taking the processing burden away from our clients. This insight prompts our platform to make a decision — for example, including a customer in a VIP segment, or identifying a customer’s interest in a certain product — and adapt the visitor’s experience accordingly.

As one of the world's largest gaming companies, Ubisoft faced a problem that challenges many enterprises: a data store so big it was difficult and time-consuming to analyze. “Data took between fifteen and thirty minutes to process,” explained Maxime Bosvieux, EMEA Ecommerce Director at Ubisoft. “This doesn’t sound like much, but the modern customer darts from website to website, and if you’re unable to provide them with the experience they’re looking for, when they’re looking for it, they’ll choose the competitor who can.” That’s when they turned to Qubit and Google Cloud Platform.

A cloud native approach.


From early on, we made the decision to be an open ecosystem so as to provide our clients and partners with flexibility across technologies. When designing our system, we saw that the rise of cloud computing could transform not only how platform companies like ours process data, but also how they interface with customers. By providing Cloud-native APIs across the stack, our clients could seamlessly use open source tools and utilities with Qubit’s systems that run on GCP. Many of these tools interface with gsutil via the command-line, call BigQuery, or even upload to Cloud Storage buckets via CyberDuck.
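For instance, a typical client-side workflow with these tools might look like the following sketch (the bucket, project, dataset and table names here are hypothetical placeholders):

# List and download exported files from a Cloud Storage bucket with gsutil (bucket name is illustrative)
gsutil ls gs://qubit-client-exports/events/
gsutil cp gs://qubit-client-exports/events/2017-11-01.json .

# Run an ad hoc query against a BigQuery table in the client's project (dataset and table are illustrative)
bq query --use_legacy_sql=false 'SELECT COUNT(*) AS events FROM `client-project.qubit_data.events`'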

We provision and provide our clients access to their own GCP project. The project contains all data processed and stored from their websites, apps and back-end data sources. Clients can then access both batch and streaming data, be it a user's predicted preferred category, a real-time calculation of lifetime value, or which customer segment the user belongs to. A client can access this data within seconds, regardless of their site’s traffic volume at that moment.


Bringing it all together for Ubisoft.


One of the first things Ubisoft realized is that they needed access to all of their data, regardless of the source. Qubit Live Tap gave Ubisoft access to the full take of their data via BigQuery (and through BI tools like Google Analytics and Looker). Our system manages all data processing and schema management, and reports out actionable next steps. This helps speed up the process of understanding the customer in order to provide better personalization. Using BigQuery’s scaling abilities, Live Tap generates machine learning and AI driven insights for clients like Ubisoft. This same system also lets them access their data in other BI and analytics tools such as Google Data Studio.

We grant access to clients like Ubisoft through a series of views in their project that point back to their master data store. The BigQuery IAM model (permissions provisioning for shared datasets) allows views to be authorized across multiple projects, removing the need to do batch copies between instances, which might cause some data to become stale. As Qubit streams data into the master tables, the views have direct access to it: analysts who perform queries in their own BigQuery project get access to the latest, real-time data.
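As a rough sketch, creating such a view with the bq command-line tool might look like this (the project, dataset and table names are illustrative, and the view still needs to be authorized on the source dataset through the dataset's sharing settings):

# Create a view in the client's project that points back at the master table (names are illustrative)
bq mk --use_legacy_sql=false \
  --view 'SELECT * FROM `qubit-master.events.visitor_events`' \
  client-project:analytics.visitor_events_view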

Additionally, because the project provided is a complete GCP environment, clients like Ubisoft can also provision additional resources. We have clients who create their own Dataproc clusters, or import data provided by Qubit in BigQuery or via a Pub/Sub topic to perform additional analysis and machine learning in a single environment. This process avoids the data wrangling problems commonly encountered in closed systems.

By combining Google Cloud Dataflow, Bigtable and BigQuery, we’re able to process vast amounts of data quickly and at petabyte scale. During a typical month, Qubit’s platform will provide personalized experiences for more than 100 million users, surface 28 billion individual visitor experiences from ML-derived conclusions on customer data and use AI to simulate more than 2.3 billion customer journeys.

All of this made a lot of sense to Ubisoft. “We’re a company famous for innovating quickly and pushing the limits of what can be done,” Maxime Bosvieux told us. “That requires stable and robust technology that leverages the latest in artificial intelligence to build our segmentation and personalization strategies.”

Helping more companies move to the cloud with effective and efficient migrations.


We’re thrilled that the infrastructure we built with GCP has helped clients like Ubisoft scale data processing far beyond previous capabilities. Our integration into the GCP ecosystem is making this scalability even more attractive to organizations switching to the cloud. While porting data to a new provider can be daunting, we’re helping our clients make a more manageable leap to GCP.

Monitor and manage your costs with Cloud Platform billing export to BigQuery



The flexibility and scalability of the cloud mean that your usage can fluctuate dramatically from day to day with demand. And while you always pay only for what you use, customers often ask us to help them better understand their bill.

A prerequisite for understanding your bill is better access to detailed usage and billing data. So today, we are excited to announce the general availability of billing export to BigQuery, our data warehouse service, enabling a more granular and timely view into your GCP costs than ever before.

Billing export to BigQuery is a new and improved version of our existing billing export to CSV/JSON files, and like the name implies, exports your cloud usage data directly into a BigQuery dataset. Once the data is there, you can write simple SQL queries in BigQuery, visualize your data in Data Studio, or programmatically export the data into other tools to analyze your spend.

New billing data is exported automatically into the dataset as it becomes available, usually multiple times per day. BigQuery billing export also contains a few new features to help you organize your data:
  • User labels to categorize and track costs 
  • Additional product metadata to organize by GCP services: 
    • Service description 
    • Service category 
    • SKU ID to uniquely identify each resource type 
  • Export time to help organize cost by invoice 

Getting started with billing export to BigQuery 


It’s easy to export billing data into BigQuery and start to analyze it. The first step is to enable the export, which begins to build your billing dataset, following these setup instructions. Note that you need Billing Admin permissions in GCP to enable export, so check that you have the appropriate permissions or work with your Billing Admin.

Once you have billing export set up, the data will automatically start being populated within a few hours. Your BigQuery dataset will continue to automatically update as new data is available.


NOTE: Your BigQuery dataset only reflects costs incurred from the date you set up billing export; we will not backfill billing data at this time. While our existing CSV and JSON export features continue to remain available in their current format, we strongly encourage you to enable billing export to BigQuery as early as possible to build out your billing dataset, and to take advantage of the more granular cost analysis it allows.

Querying the billing export data


Now that you've populated your dataset, you can start the fun part: data analysis. You can export the full dataset, complete with new elements such as user labels, or write queries against the data to answer specific questions. Here are a couple of simple examples of how you might use BigQuery queries on exported billing data.

Query every row without grouping


The most granular view of your billing costs comes from querying every row without grouping. This example assumes that all fields other than labels and resource type (project, product, and so on) are the same.

SELECT
     resource_type,
     TO_JSON_STRING(labels) as labels,
     cost as cost
FROM `project.dataset.table`;

Group by label map as a JSON string 

This is a quick and easy way to break down cost by each label combination.

SELECT
     TO_JSON_STRING(labels) as labels,
     sum(cost) as cost
FROM `project.dataset.table`
GROUP BY labels;
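The same queries can also be run from the bq command-line tool, which is handy for scripting; a minimal sketch, substituting your own project, dataset and table names:

bq query --use_legacy_sql=false \
  'SELECT TO_JSON_STRING(labels) AS labels, SUM(cost) AS cost
   FROM `project.dataset.table`
   GROUP BY labels'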

You can see more query examples or write your own.

Visualize Spend Over Time with Data Studio


Many business intelligence tools natively integrate with BigQuery as the backend datastore. With Data Studio, you can easily visualize your BigQuery billing data, and with a few clicks set up a dashboard and get up-to-date billing reports throughout the day, using labels to slice and dice your GCP bill.


You can find detailed instructions about how to copy and set up a Data Studio template here: Visualize spend over time with Data Studio

Here at Google Cloud, we’re all about making your cloud costs as transparent and predictable as possible. To learn more about billing export to BigQuery, check out the documentation, and let us know how else we can help you understand your bill by sending us feedback.

Intel Performance Libraries and Python Distribution enhance performance and scaling of Intel® Xeon® Scalable (‘Skylake’) processors on GCP



Google was pleased to be the first cloud vendor to offer the latest-generation Intel® Xeon® Scalable (‘Skylake’) processors in February 2017. With their higher core counts, improved on-chip interconnect with the new Intel® Mesh Architecture, enhanced memory subsystems and Intel® Advanced Vector Extensions-512 (AVX-512) functional units, these processors are a great fit for demanding HPC applications that need high floating-point operation rates (FLOPS) and the operand bandwidth to feed the processing pipelines.
New Intel® Mesh Architecture for Xeon Scalable Processors

Skylake raises the performance bar significantly, but a processor is only as powerful as the software that runs on it. So today we're announcing that the Intel Performance Libraries are now freely available for Google Cloud Platform (GCP) Compute Engine. These libraries, which include the Intel® Math Kernel Library, Intel® Data Analytics Acceleration Library, Intel® Performance Primitives, Intel® Threading Building Blocks, and Intel® MPI Library, integrate key communication and computation kernels that have been tuned and optimized for this latest Intel processor family, in terms of both sequential pipeline flow and parallel execution. These components are useful across all the Intel Xeon processor families in GCP, but they're of particular interest for applications that can use them to fully exploit the scale of 96 vCPU instances on Skylake-based servers.

Scaling out to Skylake can result in dramatic performance improvements. This parallel SGEMM matrix multiplication benchmark result, run by Intel engineers on GCP, shows the advantage obtained by going from a 64 vCPU GCP instance on an Intel® Xeon processor E5 (“Broadwell”) system to an instance with 96 vCPUs on Intel Xeon Scalable (“Skylake”) processors, using the Intel® MKL on GCP. Using half or fewer of the available vCPUs reduces hyper-thread sharing of AVX-512 functional units and leads to higher efficiency.

In addition to pre-compiled performance libraries, GCP users now have free access to the Intel® Distribution for Python, a distribution of both python2 and python3, which uses the Intel instruction features and pipelines for maximum effect.

The following chart shows example performance improvements delivered by the optimized scikit-learn K-means functions in the Intel® Distribution for Python over the stock open source Python distribution.
“We’re delighted that Google Cloud Platform users will experience the best of Intel® Xeon® Scalable processors using the Intel® Distribution for Python and the Intel performance libraries Intel® MKL, Intel® DAAL, Intel® TBB, Intel® IPP and Intel® MPI. These software tools are carefully tuned to deliver the workload-optimized performance benefits of the advanced processors that Google has deployed, including 96 vCPUs and workload-optimized vector capabilities provided by Intel® AVX-512.”
Sanjiv Shah, VP and GM, Software Development tools for technical, enterprise, and cloud computing at Intel
For more information about Intel and GCP, or to access the installation instructions for the Intel Performance Library and Python packages, visit the Intel and Google Cloud Platform page.

With Multi-Region support in Cloud Spanner, have your cake and eat it too



Today, we’re thrilled to announce the general availability of Cloud Spanner Multi-Region configurations. With this release, we’ve extended Cloud Spanner’s transactions and synchronous replication across regions and continents. That means no matter where your users may be, apps backed by Cloud Spanner can read and write up-to-date (strongly consistent) data globally and do so with minimal latency for end users. In other words, your app now has an accurate, consistent view of the data it needs to support users whether they’re around the corner or around the globe. Additionally, when running a Multi-Region instance, your database is able to survive a regional failure.

This release also delivers an industry-leading 99.999% availability SLA with no planned downtime. That’s 10x less downtime (< 5min / year) than database services with four nines of availability.

Cloud Spanner is the first and only enterprise-grade, globally distributed and strongly consistent database service built specifically for the cloud that combines the benefits and familiarity of relational database semantics with non-relational scale and performance. It now supports a wider range of application workloads, from a single node in a single region to massive instances that span regions and continents. At any scale, Cloud Spanner behaves the same, delivering a single database experience.


Since we announced the general availability of Cloud Spanner in May, customers, from startups to enterprises, have rethought what a database can do, and have been migrating their mission-critical production workloads to it. For example, Mixpanel, a business analytics service, moved their sharded MySQL database to Cloud Spanner to handle user-id lookups when processing events from their customers' end users' web browsers and mobile devices.

No more trade-offs


For years, developers and IT organizations were forced to make painful compromises between the horizontal scalability of non-relational databases and the transactions, structured schema and complex SQL queries offered by traditional relational databases. With the increase in volume, variety and velocity of data, companies had to layer additional technologies and scale-related workarounds to keep up. These compromises introduced immense complexity and only addressed the symptoms of the problem, not the actual problem.

This summer, we announced an alliance with marketing automation provider Marketo, Inc., which is migrating to GCP and Cloud Spanner. Companies around the world rely on Marketo to orchestrate, automate, and adapt their marketing campaigns via the Marketo Engagement Platform. To meet the demands of its customers today and tomorrow, Marketo needed to be able to process trillions of activities annually, creating an extreme-scale big data challenge. When it came time to scale its platform, Marketo did what many companies do: it migrated to a non-relational database stack. But if your data is inherently transactional, moving to a system without transactions makes it very hard to keep data ordered and readers consistent.

"It was essential for us to have order sequence in our app logic, and with Cloud Spanner, it’s built in. When we started looking at GCP, we quickly identified Cloud Spanner as the solution, as it provided relational semantics and incredible scalability within a managed service. We hadn’t found a Cloud Spanner-like product in other clouds. We ran a successful POC and plan to move several massive services to Cloud Spanner. We look forward to Multi-Region configurations, as they give us the ability to expand globally and reduce latencies for customers on the other side of the world" 
— Manoj Goyal, Marketo Chief Product Officer

Mission-critical high availability


For global businesses, reliability is expected, but maintaining that reliability while also rapidly scaling can be a challenge. Evernote, a cross-platform app for individuals and teams to create, assemble, nurture and share ideas in any form, migrated to GCP last year. In the coming months, it will mark the next phase of its move to the cloud by migrating to a single Cloud Spanner instance to manage more than 8 billion pieces of its customers’ notes, replacing over 750 MySQL instances in the process. Cloud Spanner Multi-Region support gives Evernote the confidence it needs to make this bold move.
"At our size, problems such as scalability and reliability don't have a simple answer, Cloud Spanner is a transformational technology choice for us. It will give us a regionally distributed database storage layer for our customers’ data that can scale as we continue to grow. Our whole technology team is excited to bring this into production in the coming months."
Ben McCormack, Evernote Vice President of Operations

Strong consistency with scalability and high performance


Cloud Spanner delivers scalability and global strong consistency so apps can rely on an accurate and ordered view of their data around the world with low latency. Redknee, for example, provides enterprise software to mobile operators to help them charge their subscribers for their data, voice and texts. Its customers' network traffic currently runs through traditional database systems that are expensive to operate and come with processing capacity limitations.
“We want to move from our current on-prem per-customer deployment model to the cloud to improve performance and reliability, which is extremely important to us and our customers. With Cloud Spanner, we can process ten times more transactions per second (using a current benchmark of 55k transactions per second), allowing us to better serve customers, with a dramatically reduced total cost of ownership." 
— Danielle Royston, CEO, Redknee

Revolutionize the database admin and management experience


Standing up a globally consistent, scalable relational database instance is usually prohibitively complex. With Cloud Spanner, you can create an instance in just a few clicks and then scale it simply using the Google Cloud Console or programmatically. This simplicity revolutionizes database administration, freeing up time for activities that drive the business forward, and enabling new and unique end-user experiences.
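As a minimal sketch, creating and later scaling an instance from the command line might look like this (the instance name is a placeholder, and you should check the currently available regional and multi-region configuration names):

# Create a three-node instance in a multi-region configuration (names are illustrative)
gcloud spanner instances create my-instance \
  --config=nam-eur-asia1 \
  --description="My Spanner instance" \
  --nodes=3

# Scale the same instance later with a single command
gcloud spanner instances update my-instance --nodes=10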

A different way of thinking about databases


We believe Cloud Spanner is unique among databases and cloud database services, offering a global relational database, not just a feature to eventually copy or replicate data around the world. At Google, Spanner powers apps that process billions of transactions per day across many Google services. In fact, it has become the default database internally for apps of all sizes. We’re excited to see what your company can do with Cloud Spanner as your database foundation.

Want to learn more? Check out the many whitepapers discussing the technology behind Cloud Spanner. Then, when you’re ready to get started, follow our Quickstart guide to Cloud Spanner, or Kelsey Hightower’s post How to get started with Cloud Spanner in 5 minutes.

Introducing Certified Kubernetes (and Google Kubernetes Engine!)



When Google launched Kubernetes three years ago, we knew based on our 10 years of experience with Borg how useful it would be to developers. But even we couldn’t have predicted just how successful it would become. Kubernetes is one of the world’s highest velocity open source projects, supported by a diverse community of contributors. It was designed at its heart to run anywhere, and dozens of vendors have created their own Kubernetes offerings.

It's critical to Kubernetes users that their applications run reliably across different Kubernetes environments, and that they can access the new features in a timely manner. To ensure a consistent developer experience across different Kubernetes offerings, we’ve been working with the Cloud Native Computing Foundation (CNCF) and the Kubernetes community to create the Certified Kubernetes Conformance Program. The Certified Kubernetes program officially launched today, and our Kubernetes service is among the first to be certified.

Choosing a Certified Kubernetes platform like ours and those from our partners brings both benefits and peace of mind, especially for organizations with hybrid deployments. With the greater compatibility of Certified Kubernetes, you get:
  • Smooth migrations between on-premises and cloud environments, and a greater ability to split a single workload across multiple environments 
  • Consistent upgrades
  • Access to community software and support resources
The CNCF hosts a complete list of Certified Kubernetes platforms and distributions. If you use a Kubernetes offering that's not on the list, encourage them to become certified as soon as possible!

Putting the K in GKE


One of the benefits of participating in the Certified Kubernetes Conformance Program is being able to use the name “Kubernetes” in your product. With that, we’re taking this opportunity to rename Container Engine to Kubernetes Engine. From the beginning, Container Engine’s acronym has been GKE in a nod to Kubernetes. Now, as a Certified Kubernetes offering, we can officially put the K in GKE.

While the Kubernetes Engine name is new, everything else about the service is unchanged—it’s still the same great managed environment for deploying containerized applications that you trust to run your production environments. To learn more about Kubernetes Engine, visit the product page, or the documentation for a wealth of quickstarts, tutorials and how-tos. And as always, if you’re just getting started with containers and Google Cloud Platform, be sure to sign up for a free trial.
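If you want to kick the tires, a minimal sketch of creating a cluster and deploying a container looks something like this (the cluster name, deployment name and zone are placeholders):

# Create a small Kubernetes Engine cluster and fetch credentials for kubectl
gcloud container clusters create demo-cluster --zone us-central1-a --num-nodes 3
gcloud container clusters get-credentials demo-cluster --zone us-central1-a

# Deploy a containerized application and expose it behind a load balancer
kubectl create deployment hello-web --image=nginx
kubectl expose deployment hello-web --type=LoadBalancer --port 80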

Announcing integration of Altair HPC applications with Google Cloud



Engineering today requires access to unprecedented computing resources to simulate, test and design the products that make modern life possible. Here at Google Cloud, one of our goals is to democratize and simplify access to advanced computing resources and promote the sciences and engineering.

With that, we’re excited to announce a new technology partnership between Google Cloud Platform (GCP), Intel and Altair, a leading software provider for engineering and science applications, including high performance computing (HPC) applications for computer-aided engineering, simulation, product design, Internet of Things and others.

Starting today, you can launch virtual HPC appliances running Altair and other HPC applications on GCP using Altair’s PBScloud.io. PBScloud.io provides a central command center, a simple user experience, easy deployment, real-time monitoring and resource management for HPC use cases. It also includes features for job submission, job monitoring and result visualization, and it works across and orchestrates multiple public clouds and traditional on-premises deployments.
Altair applications available on GCP via PBScloud.io
Before cloud computing, engineers and scientists were constrained by the limitations of on-premises computing resources and clusters. Long queue times, suboptimal hardware utilization and frustrated users were commonplace. With Google Cloud, you can test your ideas quickly, pay for exactly what you need and only while you need it. Now with Altair’s PBScloud.io, you also have easy, turn-key access to state-of-the-art science and engineering applications on GCP’s advanced, scalable hardware and infrastructure.

Compare, for example, the performance of Altair RADIOSS on Intel’s latest generation Xeon processor codenamed Skylake on Compute Engine vs. its performance on previous generation CPUs. Note that RADIOSS demonstrated product scalability by taking advantage of all 96 vCPUs on GCP.


We’re excited to bring this collaboration to you and even more excited to see what you'll build with Altair’s software on our platform. If you’re at the SC17 conference, be sure to drop by the Google Cloud, Altair and Intel booths for talks, demos and to talk about HPC on Google Cloud.

Check out PBScloud.io and sign up for a GCP trial at no cost today.

Introducing Open in Cloud Shell, a new way to create frictionless tutorials



If you’ve ever created any sort of interactive content that involves code—a tutorial, a codelab, a demo—you know that setup and installation can be the most time-consuming and painful parts of the experience for your users. This is also where you stand the greatest chance of losing your audience—in today’s fast-paced world, onerous prerequisites can trigger a high bounce rate.

In this article, we’ll describe a new feature of the Google Cloud Platform (GCP), Open in Cloud Shell, which makes setting up tutorials and GitHub repos simpler, faster and more engaging for your users — in other words, more frictionless.
For a real-world example, check out any of the Google Node.js samples, like the Video Intelligence API sample shown above.

This capability relies on Google Cloud Shell, an on-demand interactive shell prompt running in the cloud. Cloud Shell requires no setup or maintenance on your part, since we automatically allocate and maintain the underlying VM for you.

A year ago, we added the Cloud Shell Editor, which adds a cloud-based IDE into Cloud Shell, making it easy to edit projects alongside an interactive command line, all hosted in the cloud.

And now, with Open in Cloud Shell, your users can automatically open a cloud shell, complete with a run-time specification of a Git repo to auto-clone, by clicking a link or pressing a button. For example, adding this link to your tutorial:
Open in Cloud Shell
renders a button that:
  • opens a cloud shell session 
  • automatically clones the your-first-pwapp repo 
  • loads the project into the cloud editor and 
  • opens the README.md file in an editor tab
And by clicking this button, users are ready to go in seconds, rather than minutes — leaving them with more time for meaningful learning.
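The link itself is just a Cloud Console URL with query parameters; based on the GitHub example later in this post, the general pattern looks like this (substitute your own repository URL for the git_repo parameter):

https://console.cloud.google.com/cloudshell/open?git_repo=<YOUR_REPO_URL>&page=editor&open_in_editor=README.md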

For all the details about this feature, see the public documentation here.

A few examples


Whether you’re writing blog articles, codelabs, webpages, GitHub README files, or any other form of interactive content, this feature can significantly reduce startup friction for your users. Let’s take a look at a few examples.

Webpage

Add a link like this to your page:
Open in Cloud Shell
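As a sketch, the anchor markup behind such a link might look like this (reusing the repository URL from the GitHub example below; substitute your own git_repo parameter):

<a href="https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/marcacohen/gcslock&page=editor&open_in_editor=README.md">Open in Cloud Shell</a>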
Resulting in the following:


Github repo


You can also add a similar button or link to your repo’s README.md (or other markdown file) using syntax like this:

## Open this repo in Google Cloud Shell

[![Open in Cloud Shell](http://gstatic.com/cloudssh/images/open-btn.png)](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/marcacohen/gcslock&page=editor&open_in_editor=README.md)
Resulting in the following:


Google Codelab


Codelabs are authored in Google Docs with specific formatting rules. By adding the following section (or a variation thereof) to your codelab source doc, you get a one-click setup to a cloud hosted shell, complete with integrated code editor and web server, pre-cloned repo and other features:

         Getting set up


          Open a development environment in Cloud Shell

          Click the following button to open a cloud shell with all required source code and
          access to a built-in code editor:

          Open in Cloud Shell

Resulting in the following:

In the above example, one button press replaces instructions to manually install the git tools, clone a source repo, install and run a local web server, and have access to (and know how to use) a local text editor or IDE. All of those functions and more can now be hosted in the cloud!

We think Open in Cloud Shell is a really handy feature for anyone creating or consuming instructional resources based on GCP. You can read more about this feature in the Open in Cloud Shell documentation. And let us know what you think via the Feedback link on the bottom of the feature documentation page!

Learn, connect and share with Community Tutorials



Google Cloud Platform (GCP) gives you options—so many that our technical writers can't possibly explain everything you can do, no matter how fast they type. Did you know you can deploy a Grails app to App Engine? Or use Matplotlib to visualize your Stackdriver Monitoring metrics for Cloud Bigtable? You'd probably never find tutorials like these in the official documentation, but you can find them in the Community Tutorials.

GCP's Community Tutorials are written by contributors, including Googlers, to share specialized knowledge about how to get stuff done on GCP. Anyone can contribute a tutorial to our GitHub repo. GCP staff reviews submissions and usually provides some editorial feedback and technical review. When your submission is approved, we publish the tutorial right to the GCP website.

We want you to use these tutorials as a resource, but we also want you to get excited about contributing. With Community Tutorials, you can:

  • Write and submit a new tutorial: just follow the tutorial (see what we did there?). 
  • Submit pull requests to improve existing tutorials: there's an Edit on GitHub link at the top of each page. 
  • File a bug: there's a Report an issue link at the top of each page. 
  • Give us ideas for new tutorials: request a tutorial on GitHub.


Here's what's in it for you:

  • Spread awareness of your open source project by showing how it works on GCP. 
  • Build your street cred. Every tutorial you write that we publish contains a link to your GitHub profile. 
  • Share what you know. If you figured out how to do something hard, you can help others by saving them some time. 
  • Connect with Googlers. You can give feedback on tutorials they've written and get their feedback on yours. 
  • Sharpen up your writing skills. You'll get feedback from our technical writing staff. 
  • Earn the undying love and admiration of your peers in the GCP community. No explanation needed.

We launched Community Tutorials in beta in April, and already their numbers are starting to snowball: As of this writing, there are 104 Community Tutorials to choose from, with dozens more in the pipeline. We can't wait to see how you contribute!

5 steps to better GCP network performance



We’re admittedly a little biased, but we’re pretty proud of our networking technology. Jupiter, the Andromeda network virtualization stack and TCP-BBR all run in datacenters around the world and across the intercontinental cables that connect them.

As a Google Cloud customer, your applications already have access to this fast, global network, giving your VM-to-VM communication top-tier performance. Furthermore, because Google peers its egress traffic directly with a number of companies (including Cloudflare), you can get content to your customers faster, with lower egress costs.

That said, it’s easy for small configuration changes, location choices or architectural decisions to inadvertently limit the networking performance of your system. Here are the top five things you can do to get the most out of Google Cloud.

1. Know your tools

Testing your networking performance is the first step to improving your environment. Here are the tools I use on a daily basis:
  • Iperf is a commonly used network testing tool that can create TCP/UDP data streams and measure the throughput of the network that carries them. 
  • Netperf is another good network testing tool, which is also used by the PerfKitBenchmark suite to test performance and benchmark the various cloud providers against one another. 
  • traceroute is a computer network diagnostic tool to measure and display packets’ routes across a network. It records the route’s history as the round-trip times of the packets received from each successive host in the route; the sum of the mean times in each hop is a measure of the total time spent to establish the connection.
These tools are battle-hardened, really well documented, and should be the cornerstone of your performance efforts.
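As a quick sketch, measuring throughput and inspecting the path between two of your VMs might look like this (the IP address and hostname are placeholders):

# On the receiving VM, start an iperf server
iperf -s

# On the sending VM, run a 30-second throughput test against the receiver's internal IP
iperf -c 10.128.0.2 -t 30

# Trace the network path and per-hop round-trip times to an external host
traceroute example.com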

2. Put instances in the right zones


One important thing to remember about network latency is that it’s a function of physics.

The speed of light traveling in a vacuum is 300,000 km/s, meaning that it takes about 10ms to travel a distance of ~3000km — about the distance from New York to Santa Fe. But because the internet is built on fiber-optic cable, which slows things down by a factor of ~1.52, data can only travel about 2,000 km one way — or make a round trip of roughly 1,000 km — in that same 10ms.
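To put that in concrete terms: a user roughly 2,000 km from your instance faces a physical floor of about 4,000 km ÷ (300,000/1.52 km/s) ≈ 20 ms per round trip, before any processing time is added.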

So, the farther away two machines are, the higher their latency will be. Thankfully, Google has datacenter locations all around the world, making it easy to put your compute close to your users.


It’s worthwhile to take a regular look at where your instances are deployed, and see if there’s an opportunity to open up operations in a new region. Doing so will help reduce latency to the end user, and also help create a system of redundancy to help safeguard against various types of networking calamity.

3. Choose the right core-count for your networking needs


According to the Compute Engine documentation:

Outbound or egress traffic from a virtual machine is subject to maximum network egress throughput caps. These caps are dependent on the number of vCPUs that a virtual machine instance has. Each core is subject to a 2 Gbits/second (Gbps) cap for peak performance. Each additional core increases the network cap, up to a theoretical maximum of 16 Gbps for each virtual machine.
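For example, under these caps an instance with 4 vCPUs would top out at roughly 8 Gbps of egress, while any instance with 8 or more vCPUs is bounded by the 16 Gbps ceiling.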

In other words, the more virtual CPUs in a guest, the more networking throughput you get. You can see this yourself by setting up a bunch of instance types and logging their iperf performance:
You can clearly see that as the core count goes up, so do the average and maximum throughput. Even in this simple test, the hard 16 Gbps limit is visible on the larger machines.
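If you want to reproduce the test yourself, a rough sketch is to create a pair of instances at each size and run the same iperf measurement between them (the zone and machine types are just examples):

# Create sender/receiver pairs at increasing vCPU counts (zone and machine types are examples)
for mtype in n1-standard-2 n1-standard-4 n1-standard-8 n1-standard-16; do
  gcloud compute instances create iperf-a-$mtype iperf-b-$mtype \
    --zone=us-central1-a --machine-type=$mtype
done
# Then run iperf between each pair, as shown in the tools section above, and record the results.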

As such, it’s critically important to choose the right type of instance for your networking needs. Picking something too large can cause you to over-provision (and overpay!), while picking too few cores places a hard limit on your maximum throughput.

4. Use internal over external IPs


Any time you transfer data or communicate between VMs, you can achieve maximum performance by always using the internal IP to communicate. In many cases, the difference in speed can be drastic. Below, you can see that for an n1 machine, the bandwidth measured with iperf to the external IP was only 884 Mbits/sec:

user@instance-2:~$ iperf -c 104.155.145.79
------------------------------------------------------------
Client connecting to 104.155.145.79, TCP port 5001
TCP window size: 45.0 KByte (default)
------------------------------------------------------------
[  3] local 10.128.0.3 port 53504 connected with 104.155.145.79 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.03 GBytes   884 Mbits/sec

However, the internal IP between the two machines boasted 1.95 Gbits/sec:

user@instance-2:~$ iperf -c 10.128.0.2
------------------------------------------------------------
Client connecting to 10.128.0.2, TCP port 5001
TCP window size: 45.0 KByte (default)
------------------------------------------------------------
[  3] local 10.128.0.3 port 38978 connected with 10.128.0.2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  2.27 GBytes  1.95 Gbits/sec

5. Rightsize your TCP window


If you have ever wondered why a connection transmits at a fraction of the available bandwidth — even when both the client and the server are capable of higher rates — then it might be due to a window size mismatch.

The Transmission Control Protocol (aka TCP) works by sending windows of data over the internet, relying on a straightforward system of handshakes and acknowledgements to ensure arrival and integrity of the data, and in some cases, to resend it. On the plus side, this results in a very stable internet. On the downside, it results in lots of extra traffic. And when the sender or receiver stops and waits for ACKs for previous windows/packets, this creates gaps in the data flow, limiting the maximum throughput of the connection.

Imagine, for example, a saturated peer that is advertising a small receive window, bad network weather and high packet loss resetting the congestion window, or explicit traffic shaping limiting the throughput of your connection. To address this problem, window sizes should be just big enough such that either side can continue sending data until it receives an ACK for an earlier packet. Keeping windows small limits your connection throughput, regardless of the available or advertised bandwidth between instances.

For the best performance possible in your application, you should really fine-tune window sizes depending on your client connections, estimated egress and bandwidth constraints. The good news is that the TCP window sizes on standard GCP VMs are tuned for high-performance throughput. So be sure to test the defaults before you make any changes (sometimes, no changes are needed!).
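If you do decide to experiment, a minimal sketch on a Linux VM looks like this (the buffer values and window size are illustrative, not recommendations):

# Inspect the current TCP receive/send buffer autotuning limits (min, default, max in bytes)
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem

# Temporarily raise the maximum socket buffer sizes to experiment (values are illustrative)
sudo sysctl -w net.core.rmem_max=16777216
sudo sysctl -w net.core.wmem_max=16777216

# Re-run iperf with an explicit window size and compare against the defaults
iperf -c 10.128.0.2 -w 4M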


Every millisecond counts

Getting peak performance across a cloud-native architecture is rarely achieved by fixing just one problem. It’s usually a combination of issues, the “death by a thousand cuts” as it were, that chips away at your performance, piece by piece. By following these five steps, you’ll be able to isolate, identify and address some of the most common culprits of poor network performance, to help you take advantage of all the networking performance that’s available to you.

If you’d like to know more about ways to optimize your Google Cloud applications, check out the rest of the Google Cloud Performance Atlas blog posts and videos. Because, when it comes to performance, every millisecond counts.

DNSSEC now available in Cloud DNS



Today, we're excited to announce that Google is adding DNSSEC support (beta) to our fully managed Google Cloud DNS service. Now you and your users can take advantage of the protection provided by DNSSEC without having to maintain it once it's set up.

Why is DNSSEC an important add-on to DNS?

Domain Name System Security Extensions (DNSSEC) adds security to the Domain Name System (DNS) protocol by enabling DNS responses to be validated. Having a trustworthy Domain Name System (DNS) that translates a domain name like www.example.com into its associated IP address is an increasingly important building block of today’s web-based applications. Attackers can hijack this process of domain/IP lookup and redirect users to a malicious site through DNS hijacking and man-in-the-middle attacks. DNSSEC helps mitigate the risk of such attacks by cryptographically signing DNS records. As a result, it prevents attackers from issuing fake DNS responses that may misdirect browsers to nefarious websites.

Google Cloud DNS and DNSSEC

Cloud DNS is a fast, reliable and cost-effective Domain Name System that powers millions of domains on the internet. DNSSEC in Cloud DNS enables domain owners to take easy steps to protect their domains against DNS hijacking and man-in-the-middle attacks. Advanced users may choose to use different signing algorithms and denial-of-existence types. We support several sizes of RSA and ECDSA keys, as well as both NSEC and NSEC3. Enabling support for DNSSEC brings no additional charges or changes to the terms of service. 
To start using DNSSEC, simply turn the feature to "on" within your DNS zone.
DNSSEC will be automatically enabled for that zone.
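From the command line, a minimal sketch looks like this (the zone name is a placeholder; while the feature is in beta, the command may need to be run under the beta release track):

# Turn on DNSSEC signing for an existing managed zone (zone name is a placeholder)
gcloud beta dns managed-zones update my-zone --dnssec-state on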
To learn more about getting started with DNSSEC for Cloud DNS, please refer to the documentation page.