Tag Archives: Storage & Databases

Cloud SQL for PostgreSQL: Managed PostgreSQL for your mobile and geospatial applications in Google Cloud



At Google Cloud Next ‘17, we announced support for PostgreSQL as part of Google Cloud SQL, our managed database service. With its extensibility, strong standards compliance and support from a vibrant open-source community, Postgres is the database of choice for many developers, especially for powering geospatial and mobile applications. Cloud SQL already supports MySQL, and now, PostgreSQL users can also let Google take care of mundane database administration tasks like applying patches and managing backups and storage capacity, and focus on developing great applications.

Feature highlights

Storage and data protection
  • Flexible backups: Schedule automatic daily backups or run them on-demand.
  • Automatic storage increase: Enable automatic storage increase and Cloud SQL will add storage capacity whenever you approach your limit.

Connections
  • Open standards: We embrace the PostgreSQL wire protocol (the standard connection protocol for PostgreSQL databases) and SSL, so you can access your database from nearly any application, running anywhere.
  • Security features: Our Cloud SQL Proxy creates a local socket and uses OAuth to help establish a secure connection with your application or PostgreSQL tool. It automatically creates the SSL certificate and makes secure connections easier for both dynamic and static IP addresses.
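
For example, with the Cloud SQL Proxy running locally, an application can connect over the standard wire protocol using any ordinary PostgreSQL driver. Here's a minimal sketch in Python with psycopg2 (the host, database, user and password values are placeholders):

import psycopg2

# The Cloud SQL Proxy listens on localhost and forwards traffic to the instance.
conn = psycopg2.connect(host='127.0.0.1', port=5432,
                        dbname='mydb', user='postgres', password='my-password')
cur = conn.cursor()
cur.execute('SELECT version();')
print(cur.fetchone()[0])
conn.close()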

Extensibility
  • Geospatial support: Easily enable the popular PostGIS extension for geospatial objects in Postgres (see the example after this list).
  • Custom instance sizes: Create your Postgres instances with the optimal amount of CPU and memory for your workloads.
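
To illustrate the PostGIS item above: once the extension is enabled on a database, geospatial queries run over the same standard connection. A minimal sketch, again with psycopg2 and placeholder connection details:

import psycopg2

conn = psycopg2.connect(host='127.0.0.1', dbname='mydb',
                        user='postgres', password='my-password')
cur = conn.cursor()
cur.execute('CREATE EXTENSION IF NOT EXISTS postgis;')  # one-time, per database
# Distance in meters between two lon/lat points, computed on the geography type.
cur.execute("""
    SELECT ST_Distance(
        ST_MakePoint(-122.084, 37.422)::geography,
        ST_MakePoint(-74.006, 40.713)::geography);
""")
print(cur.fetchone()[0])
conn.commit()
conn.close()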


Create Cloud SQL for PostgreSQL instances customized to your needs.


More features coming soon

We’re continuing to improve Cloud SQL for PostgreSQL during beta. Watch for the following:

  • Automatic failover for high availability
  • Read replicas
  • Additional extensions
  • Precise restores with point-in-time recovery
  • Compliance certification as part of Google’s Cloud Platform BAA

Case study: Descartes Labs delves into Earth’s resources with Cloud SQL for PostgreSQL

Using deep learning to make sense of vast amounts of image data from Google Earth Engine, NASA and other satellites, Descartes Labs delivers invaluable insights about natural resources and human population. They provide timely and accurate forecasts on such things as the growth and health of crops, urban development, the spread of forest fires and the availability of safe drinking water across the globe.

Cloud SQL for PostgreSQL integrates seamlessly with the open-source components that make up Descartes Labs’ environment. Google Earth Engine combines a multi-petabyte catalog of satellite imagery and geospatial datasets with planetary-scale analysis capabilities and makes it available for scientists, researchers and developers to detect changes, map trends and quantify differences on the Earth's surface. With ready-to-use data sets and an API, Earth Engine data is core to Descartes Labs’ product. Combining this with NASA data and the popular OpenStreetMap data, Descartes Labs takes full advantage of the open source community.

Descartes Labs’ first application tracks corn crops based on a 13-year historical backtest. It predicts the U.S. corn yield faster and more accurately than the U.S. Department of Agriculture.

Descartes adopted Cloud SQL for PostgreSQL early on because it allowed them to focus on developing applications rather than on mundane database management tasks. “Cloud SQL gives us more time to work on products that provide value to our customers,” said Tim Kelton, Descartes Labs Co-founder and Cloud Architect. “Our individual teams, who are building micro services, can quickly provision a database on Cloud SQL. They don't need to bother compiling Geos, Proj4, GDAL, and Lib2xml to leverage PostGIS. And when PostGIS isn’t needed, our teams use PostgreSQL without extensions or MySQL, also supported by Cloud SQL.”

According to Descartes Labs, Google Cloud Platform (GCP) is like having a virtual supercomputer on demand, without all the usual space, power, cooling and networking issues. Cloud SQL for PostgreSQL is a key piece of the architecture that backs the company’s satellite image analysis applications.
In developing their newest application, GeoVisual Search, the team benefited greatly from automatic storage increases in Cloud SQL for PostgreSQL. “Ever tried to estimate how a compressed 54GB XML file will expand in PostGIS?” Tim Kelton asked. “It’s not easy. We enabled Cloud SQL’s automatic storage increase, which allows the disk to start at 10GB and, in our case, automatically expanded to 387GB. With this feature, we don’t waste money or time by under- or over-allocating disk capacity as we would on a VM.”
Because the team was able to focus on data models rather than on database management, development of the GeoVisual Search application proceeded smoothly. Descartes’ customers can now find the geospatial equivalent of a needle in a haystack: specific objects of interest in map images.

The screenshot below shows a search through two billion map tiles to find wind turbines.
Tim’s parting advice for startups evaluating cloud solutions: “Make sure the solution you choose gives you the freedom to experiment, lets your team focus on product development rather than IT management and aligns with your company’s budget.”

See what GCP can do for you


Sign up for a $300 credit to try Cloud SQL and the rest of GCP. Start with inexpensive micro instances for testing and development. When you’re ready, you can easily scale them up to serve performance-intensive applications. As a bonus, everyone gets the maximum sustained use discount during beta, regardless of usage.

Our partner ecosystem can help you get started with Cloud SQL for PostgreSQL. To streamline data transfer, reach out to Alooma, Informatica, Segment, Stitch, Talend and Xplenty. For help with visualizing analytics data, try ChartIO, iCharts, Looker, Metabase and Zoomdata.
"PostgreSQL is one of Segment’s most popular database targets for our Warehouses product. Analysts and administrators appreciate its rich set of OLAP features and the portability they’re ensured by it being open source. In an increasingly “serverless” world, Google’s Cloud SQL for PostgreSQL offering allows our customers to eschew costly management and operations of their PostgreSQL instance in favor of effortless setup, and the NoOps cost and scaling model that GCP is known for across their product line."   Chris Sperandio, Product Lead, Segment
"At Xplenty, we see steady growth of prospects and customers seeking to establish their data and analytics infrastructure on Google Cloud Platform. Data integration is always a key challenge, and we're excited to support both Google Cloud Spanner and Cloud SQL for PostgreSQL both as data sources as well as targets, to continue helping companies integrate and prepare their data for analytics. With the robustness of Cloud Spanner and the popularity of PostgreSQL, Google continues to innovate and prove it is a world leader in cloud computing."   Saggi Neumann, CTO, Xplenty

No matter how far we take Cloud SQL, we still feel like we’re just getting started. We hope you’ll come along for the ride.


Google Cloud Platform: your Next home in the cloud



San Francisco: Today at Google Cloud Next ‘17, we’re thrilled to announce new Google Cloud Platform (GCP) products, technologies and services that will help you imagine, build and run the next generation of cloud applications on our platform.

Bring your code to App Engine, we’ll handle the rest

In 2008, we launched Google App Engine, a pioneering serverless runtime environment that lets developers build web apps, APIs and mobile backends at Google-scale and speed. For nearly 10 years, some of the most innovative companies have built applications on App Engine that serve their users all over the world. Today, we’re excited to announce the general availability of a major expansion of App Engine, centered on openness and developer choice, that keeps App Engine’s original promise to developers: bring your code, we’ll handle the rest.

App Engine now supports Node.js, Ruby, Java 8, Python 2.7 or 3.5, Go 1.8, plus PHP 7.1 and .NET Core, both in beta, all backed by App Engine’s 99.95% SLA. Our managed runtimes make it easy to start with your favorite languages and use the open source libraries and packages of your choice. Need something different than what’s out of the box? Break the glass and go beyond our managed runtimes by supplying your own Docker container, which makes it simple to run any language, library or framework on App Engine.
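
To give a feel for how little code a managed runtime needs, here is a minimal sketch of a Python web app of the kind App Engine can serve (the module and handler names are illustrative; a real deployment also needs an app.yaml describing the runtime):

# main.py
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello():
    return 'Hello from App Engine!'

if __name__ == '__main__':
    # Local testing only; on App Engine a WSGI server serves the `app` object.
    app.run(host='127.0.0.1', port=8080)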

The future of cloud is open: take your app to-go by having App Engine generate a Docker container containing your app and deploy it to any container-based environment, on or off GCP. App Engine gives developers an open platform while still providing a fully managed environment where developers focus only on code and on their users.


Cloud Functions public beta at your service

Up one level from fully managed applications, we’re launching Google Cloud Functions into public beta. Cloud Functions is a completely serverless environment to build and connect cloud services without having to manage infrastructure. It’s the smallest unit of compute offered by GCP and is able to spin up a single function and spin it back down instantly. Because of this, billing occurs only while the function is executing, metered to the nearest one hundred milliseconds.

Cloud Functions is a great way to build lightweight backends, and to extend the functionality of existing services. For example, Cloud Functions can respond to file changes in Google Cloud Storage or incoming Google Cloud Pub/Sub messages, perform lightweight data processing/ETL jobs or provide a layer of logic to respond to webhooks emitted by any event on the internet. Developers can securely invoke Cloud Functions directly over HTTP right out of the box without the need for any add-on services.

Cloud Functions is also a great option for mobile developers using Firebase, allowing them to build backends integrated with the Firebase platform. Cloud Functions for Firebase handles events emitted from the Firebase Realtime Database, Firebase Authentication and Firebase Analytics.

Growing the Google BigQuery universe: introducing BigQuery Data Transfer Service

Since our earliest days, our customers have turned to Google to promote their advertising messages around the world, at a scale that was previously unimaginable. Today, those same customers want to use BigQuery, our powerful data analytics service, to better understand how users interact with those campaigns. To that end, we’ve developed deeper integration between Google and GCP with the public beta of the BigQuery Data Transfer Service, which automates data movement from select Google applications directly into BigQuery. With the BigQuery Data Transfer Service, marketing and business analysts can easily export data from AdWords, DoubleClick and YouTube directly into BigQuery, making it available for immediate analysis and visualization using the extensive set of tools in the BigQuery ecosystem.
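
Once a transfer has landed data in BigQuery, analysts can query it like any other table. A minimal sketch with the BigQuery Python client (the project, dataset, table and column names below are hypothetical):

from google.cloud import bigquery

client = bigquery.Client(project='my-project')
sql = """
    SELECT campaign_id, SUM(clicks) AS total_clicks
    FROM `my-project.marketing_transfers.ad_performance`
    GROUP BY campaign_id
    ORDER BY total_clicks DESC
    LIMIT 10
"""
for row in client.query(sql).result():  # result() waits for the query job to finish
    print(row.campaign_id, row.total_clicks)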

Slashing data preparation time with Google Cloud Dataprep

Our goal is to make it easy to import data into BigQuery, while keeping it secure. Google Cloud Dataprep is a new serverless, browser-based service that can dramatically cut the time it takes to prepare data for analysis, which represents about 80% of the work that data scientists do. It intelligently connects to your data source, identifies data types, identifies anomalies and suggests data transformations. Data scientists can then visualize their data schemas until they're happy with the proposed data transformation. Dataprep then creates a data pipeline in Google Cloud Dataflow, cleans the data and exports it to BigQuery or other destinations. In other words, you can now prepare structured and unstructured data for analysis with clicks, not code. For more information on Dataprep, apply to be part of the private beta. Also, you’ll find more news about our latest database and data analytics capabilities here and here.

Hello, (more) world

Not only are we working hard on bringing you new products and capabilities, but we want your users to access them quickly and securely, wherever they may be. That’s why we’re announcing three new Google Cloud Platform regions: California, Montreal and the Netherlands. These will bring the total number of Google Cloud regions up from six today to more than 17 locations in the future. These new regions will deliver lower latency for customers in adjacent geographic areas, increased scalability and more disaster recovery options. Like other Google Cloud regions, the new regions will feature a minimum of three zones, benefit from Google’s global, private fibre network and offer a complement of GCP services.

Supercharging our infrastructure . . .

Customers run demanding workloads on GCP, and we're constantly striving to improve the performance of our VMs. For instance, we were honored to be the first public cloud provider to run Intel Skylake, a custom Xeon chip that delivers significant enhancements for compute-heavy workloads and a larger range of VM memory and CPU options.

We’re also doubling the number of vCPUs you can run in an instance from 32 to 64 and now offering up to 416GB of memory, which customers have asked us for as they move large enterprise applications to Google Cloud. Meanwhile, we recently began offering GPUs, which provide substantial performance improvements to parallel workloads like training machine learning models.

To continually unlock new energy sources, Schlumberger collects large quantities of data to build detailed subsurface earth models based on acoustic measurements. GCP compute infrastructure has the unique characteristics that match Schlumberger's needs to turn this data into insights. High performance scientific computing is integral to its business, so GCP's flexibility is critical.

Schlumberger can mix and match GPUs and CPUs and dynamically create different shapes and types of virtual machines, choosing memory and storage options on demand.

"We are now leveraging the strengths offered by cloud computation stacks to bring our data processing to the next level. Ashok Belani, Executive Vice President Technology, Schlumberger

. . . without supercharging our prices

We aim to keep costs low. Today we announced Committed Use Discounts that provide up to 57% off the list price on Google Compute Engine, in exchange for a one or three year purchase commitment. Committed Use Discounts are based on the total amount of CPU and RAM you purchase, and give you the flexibility to use different instance and machine types; they apply automatically, even if you change instance types (or size). There are no upfront costs with Committed Use Discounts, and they are billed monthly. What’s more, we automatically apply Sustained Use Discounts to any additional usage above a commitment.

We're also dropping prices for Compute Engine. The specific cuts vary by region. Customers in the United States will see a 5% price drop; customers in Europe will see a 4.9% drop and customers using our Tokyo region an 8% drop.

Then there’s our improved Free Tier. First, we’ve extended the free trial from 60 days to 12 months, allowing you to use your $300 credit across all GCP services and APIs, at your own pace and on your own schedule. Second, we’re introducing new Always Free products: non-expiring usage limits that you can use to test and develop applications at no cost. New additions include Compute Engine, Cloud Pub/Sub, Google Cloud Storage and Cloud Functions, bringing the number of Always Free products up to 15, and broadening the horizons for developers getting started on GCP. Visit the Google Cloud Platform Free Tier page today for further details, terms, eligibility and to sign up.

We'll be diving into all of these product announcements in much more detail in the coming days, so stay tuned!

Google Cloud Platform bolsters support for relational databases



San Francisco: Today, we announced new offerings in GCP’s database-services portfolio to give customers even more freedom to focus on building great apps for more use cases, rather than on management details.

In the early days of cloud computing, developers were constrained by the relatively limited choice of database services for production use cases, whether they were replacing on-premises apps or building new ones.

Those constraints have now virtually disappeared. With the announcement of Google Cloud Spanner last month, Google Cloud can meet the most stringent customer requirements for consistency, availability, and scalability in transactional database applications.

Cloud Spanner joins Google Cloud Datastore, Google Cloud Bigtable and Google Cloud SQL to deliver a complete set of databases on which developers can build great applications across a spectrum of use cases without being part-time DBAs. Furthermore, many third parties have joined the Cloud Spanner ecosystem: Xplenty now supports data transfer to Cloud Spanner; iCharts, Looker, MicroStrategy and Zoomdata provide visual data analytics; and more partners are on their way.

Today, at Google Cloud NEXT ‘17, we're pleased to continue this story with the following announcements.

Cloud SQL for PostgreSQL (Beta)


With the beta availability of Cloud SQL for PostgreSQL in the coming week, it will be easier to securely connect to a database from just about any application, anywhere.

Cloud SQL for PostgreSQL implements the same design principles currently reflected in Cloud SQL for MySQL: namely, the ability to securely store and connect to your relational data via open standards. It also includes all the familiar advantages of a Google Cloud service, in particular the ability to focus on application development rather than on tedious infrastructure-management operations.

Here’s how Descartes Labs, which uses machine learning to analyze and predict changes in US food supply based on satellite imagery, is already getting value from Cloud SQL for PostgreSQL:
“Cloud SQL gives us more time to work on products that provide value to our customers. Our individual teams, who are building micro services, can quickly provision a database on Cloud SQL. They don't need to bother compiling Geos, Proj4, GDAL and Lib2xml to leverage PostGIS. And when PostGIS isn’t needed, our teams use PostgreSQL without extensions or MySQL, also supported by Cloud SQL.”  Tim Kelton, Co-founder and Cloud Architect, Descartes Labs

Getting started with Cloud SQL is easier than ever thanks to a growing list of partners. Partners already supporting Cloud SQL for PostgreSQL include Alooma, Informatica, Segment and Xplenty for data integration, and ChartIO, iCharts, Looker, Metabase and Zoomdata for visual analytics.

Thanks to your feedback, Cloud SQL for PostgreSQL will continue to improve during the beta period; we look forward to hearing about your experiences!

Improved support for MySQL and SQL Server Enterprise 


We have news about other relational-database offerings, as well:
  • Cloud SQL for MySQL improvements: Increased performance for demanding workloads via 32-core instances with up to 208GB of RAM, and central management of resources via Identity and Access Management (IAM) controls
  • Enhanced Microsoft SQL Server support: We announced availability for SQL Server Enterprise images in beta earlier this year; today, we're announcing that SQL Server Enterprise images on Google Compute Engine, and support for Windows Server Failover Clustering (WSFC) and SQL Server AlwaysOn Availability Groups, are now both in GA.

Improved SSD Persistent Disk performance

SSD persistent disks now have increased throughput and IOPS performance, which are particularly beneficial for database and analytics workloads. Instances with 32 vCPUs provide up to 40k read IOPS and 30k write IOPS, as well as 800 MB/s of read throughput and 400 MB/s of write throughput. Instances with 16-31 vCPUs provide up to 25k read or write IOPS, 480 MB/s of read throughput and 240 MB/s of write throughput. Refer to these docs for complete details about Persistent Disk performance limits.


Federated query on Cloud Bigtable


Finally, we're extending BigQuery's reach to query data inside Google Cloud Bigtable, the NoSQL database service designed for massive analytic or operational workloads that require low latency and high throughput (particularly common in Financial Services and IoT use cases). BigQuery users can already query data in Google Cloud Storage, Google Drive and Google Sheets; the ability to query data in Cloud Bigtable is the next step toward a seamless cloud platform in which data of all kinds can be analyzed conveniently via BigQuery, without the need to copy it across systems.

Next steps


With these announcements, developers now have more choices for moving workloads to the cloud than ever before, and greater freedom to focus on building the best possible apps. We urge you to sign up for a $300 credit to try Cloud SQL and the rest of GCP. Start with inexpensive micro instances for testing and development; when you’re ready, you can easily scale them to serve performance-intensive applications.

Introducing Cloud Spanner: a global database service for mission-critical applications





Today, we’re excited to announce the public beta for Cloud Spanner, a globally distributed relational database service that lets customers have their cake and eat it too: ACID transactions and SQL semantics, without giving up horizontal scaling and high availability.

When building cloud applications, database administrators and developers have been forced to choose between traditional databases that guarantee transactional consistency, or NoSQL databases that offer simple, horizontal scaling and data distribution. Cloud Spanner breaks that dichotomy, offering both of these critical capabilities in a single, fully managed service.
“Cloud Spanner presents tremendous value for our customers who are retailers, manufacturers and wholesale distributors around the world. With its ease of provisioning and scalability, it will accelerate our ability to bring cloud-based omni-channel supply chain solutions to our users around the world,”  John Sarvari, Group Vice President of Technology, JDA
JDA, a retail and supply chain software leader, has used Google Cloud Platform (GCP) as the basis of its new application development and delivery since 2015 and was an early user of Cloud Spanner. The company saw its potential to handle the explosion of data coming from new information sources such as IoT, while providing the consistency and high availability needed when using this data.

Cloud Spanner rounds out our portfolio of database services on GCP, alongside Cloud SQL, Cloud Datastore and Cloud Bigtable.

As a managed service, Cloud Spanner provides key benefits to DBAs:
  • Focus on your application logic instead of spending valuable time managing hardware and software
  • Scale out your RDBMS solutions without complex sharding or clustering
  • Gain horizontal scaling without migration from relational to NoSQL databases
  • Maintain high availability and protect against disaster without needing to engineer a complex replication and failover infrastructure
  • Gain integrated security with data-layer encryption, identity and access management and audit logging

With Cloud Spanner, your database scales up and down as needed, and you'll only pay for what you use. It features a simple pricing model that charges for compute node-hours, actual storage consumption (no pre-provisioning) and external network access.

Cloud Spanner keeps application development simple by supporting standard tools and languages in a familiar relational database environment. It’s ideal for operational workloads supported by traditional relational databases, including inventory management, financial transactions and control systems, that are outgrowing those systems. It supports distributed transactions, schemas and DDL statements, SQL queries and JDBC drivers and offers client libraries for the most popular languages, including Java, Go, Python and Node.js.
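
To give a feel for the developer experience, here is a minimal sketch using the Cloud Spanner Python client library; the instance, database and table names are hypothetical:

from google.cloud import spanner

client = spanner.Client()
instance = client.instance('test-instance')
database = instance.database('inventory-db')

# Insert a row with a batched mutation.
with database.batch() as batch:
    batch.insert(
        table='products',
        columns=('product_id', 'name', 'quantity'),
        values=[(1, 'widget', 100)],
    )

# Read it back with a strongly consistent SQL query.
with database.snapshot() as snapshot:
    for row in snapshot.execute_sql(
            'SELECT product_id, name, quantity FROM products'):
        print(row)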

More Cloud Spanner customers share feedback

Quizlet, an online learning tool that supports more than 20 million students and teachers each month, uses MySQL as its primary database; database performance and stability are critical to the business. But with users growing at roughly 50% a year, Quizlet has been forced to scale its database many times to handle this load. By splitting tables into their own databases (vertical sharding) and moving query load to replicas, it’s been able to increase query capacity, but this technique is quickly reaching its limits, as the tables themselves are outgrowing what a single MySQL shard can support. In its search for a more scalable architecture, Quizlet discovered Cloud Spanner, which will allow it to easily scale its relational database and simplify its application:
“Based on our experience and performance testing, Cloud Spanner is the most compelling option we’ve seen to power a high-scale relational query workload. It has the performance and scalability of a NoSQL database, but can execute SQL so it’s a viable alternative to sharded MySQL. It’s an impressive technology and could dramatically simplify how we manage our databases.” Peter Bakkum, Platform Lead, Quizlet

The history of Spanner

For decades, developers have relied on traditional databases with a relational data model and SQL semantics to build applications that meet business needs. Meanwhile, NoSQL solutions emerged that were great for scale and fast, efficient data processing, but they didn’t meet the need for strong consistency. Faced with the same two sub-optimal choices that customers grapple with today, a team of systems researchers and engineers at Google set out in 2007 to develop a globally distributed database that could bridge this gap. In 2012, we published the Spanner research paper that described many of these innovations. The result was a database that offers the best of both worlds.


Remarkably, Cloud Spanner achieves this combination of features without violating the CAP Theorem. To understand how, read this post by the author of the CAP Theorem and Google Vice President of Infrastructure, Eric Brewer.

Over the years, we’ve battle-tested Spanner internally with hundreds of different applications and petabytes of data across data centers around the world. At Google, Spanner supports tens of millions of queries per second and runs some of our most critical services, including AdWords and Google Play.

If you have a MySQL or PostgreSQL system that's bursting at the seams, or are struggling with hand-rolled transactions on top of an eventually-consistent database, Cloud Spanner could be the solution you're looking for. Visit the Cloud Spanner page to learn more and get started building applications on our next-generation database service.

Inside Cloud Spanner and the CAP Theorem



Building systems that manage globally distributed data, provide data consistency and are also highly available is really hard. The beauty of the cloud is that someone else can build that for you.

The CAP theorem says that a database can only have two of the three following desirable properties:

  • C: consistency, which implies a single value for shared data
  • A: 100% availability, for both reads and updates
  • P: tolerance to network partitions

This leads to three kinds of systems: CA, CP and AP, based on what letter you leave out. Designers are not entitled to two of the three, and many systems have zero or one of the properties.

For distributed systems over a “wide area,” it's generally viewed that partitions are inevitable, although not necessarily common. If you believe that partitions are inevitable, any distributed system must be prepared to forfeit either consistency (AP) or availability (CP), which is not a choice anyone wants to make. In fact, the original point of the CAP theorem was to get designers to take this tradeoff seriously. But there are two important caveats: First, you only need to forfeit consistency or availability during an actual partition, and even then there are many mitigations. Second, the actual theorem is about 100% availability; a more interesting discussion is about the tradeoffs involved to achieve realistic high availability.

Spanner joins Google Cloud

Today, Google is releasing Cloud Spanner for use by Google Cloud Platform (GCP) customers. Spanner is Google’s highly available, global SQL database. It manages replicated data at great scale, both in terms of size of data and volume of transactions. It assigns globally consistent real-time timestamps to every datum written to it, and clients can do globally consistent reads across the entire database without locking.

In terms of CAP, Spanner claims to be both consistent and highly available despite operating over a wide area, which many find surprising or even unlikely. The claim thus merits some discussion. Does this mean that Spanner is a CA system as defined by CAP? The short answer is “no” technically, but “yes” in effect and its users can and do assume CA.

The purist answer is “no” because partitions can happen and in fact have happened at Google, and during some partitions, Spanner chooses C and forfeits A. It is technically a CP system.

However, no system provides 100% availability, so the pragmatic question is whether or not Spanner delivers availability that is so high that most users don't worry about its outages. For example, given there are many sources of outages for an application, if Spanner is an insignificant contributor to its downtime, then users are correct to not worry about it.

In practice, we find that Spanner does meet this bar, with more than five 9s of availability (less than one failure in 10^5). Given this, the target for multi-region Cloud Spanner will be right at five 9s, as it has some additional new pieces that will be higher risk for a while.

Inside Spanner 


The next question is, how is Spanner able to achieve this?

There are several factors, but the most important one is that Spanner runs on Google’s private network. Unlike most wide-area networks, and especially the public internet, Google controls the entire network and thus can ensure redundancy of hardware and paths, and can also control upgrades and operations in general. Fibers will still be cut, and equipment will fail, but the overall system remains quite robust.

It also took years of operational improvements to get to this point. For much of the last decade, Google has improved its redundancy, its fault containment and, above all, its processes for evolution. We found that the network contributed less than 10% of Spanner’s already rare outages.

Building systems that can manage data that spans the globe, provide data consistency and are also highly available is possible; it’s just really hard. The beauty of the cloud is that someone else can build that for you, and you can focus on innovation core to your service or application.

Next steps


For a significantly deeper dive into the details, see the white paper also released today. It covers Spanner, consistency and availability in depth (including new data). It also looks at the role played by Google’s TrueTime system, which provides a globally synchronized clock. We intend to release TrueTime for direct use by Cloud customers in the future.

Furthermore, look for the addition of new Cloud Spanner-related sessions at Google Cloud Next ‘17 in San Francisco next month. Register soon, because seats are limited.

Google Cloud Platform for data center professionals: what you need to know



At Google Cloud, we love seeing customers migrate to our platform. Companies move to us for a variety of reasons, from low costs to our machine learning offerings. Some of our customers, like Spotify and Evernote, have described the various reasons that motivated them to migrate to Google Cloud.

However, we recognize that a migration of any size can be a challenging project, so today we're happy to announce the first part of a new resource to help our customers as they migrate. Google Cloud Platform for Data Center Professionals is a guide for customers who are looking to move to Google Cloud Platform (GCP) and are coming from non-cloud environments. We cover the basics of running IT: Compute, Networking, Storage and Management. We've tried to write this from the point of view of someone with minimal cloud experience, so we hope you find this guide a useful starting point.

This is the first part of an ongoing series. We'll add more content over time, to help describe the differences in various aspects of running your company's IT infrastructure.

We hope you find this useful in learning about GCP. Please tell us what you think and what else you would like to see added, and be sure to sign up for a free trial to follow along!

Top 12 Google Cloud Platform posts of 2016


From product news to behind-the-scenes stories to tips and tricks, we covered a lot of ground on the Google Cloud Platform (GCP) blog this year. Here are the most popular posts from 2016.

  1. Google supercharges machine learning tasks with TPU custom chip - A look inside our custom ASIC built specifically for machine learning. This chip fast-forwards technology seven years into the future. 
    Tensor Processing Unit board
  2. Bringing Pokémon Go to life - Niantic’s augmented reality game uses more than a dozen Google Cloud services to delight and physically exert millions of Pokémon chasers across the globe.


  3. New undersea cable expands capacity for Google APAC customers and users - Together with Facebook, Pacific Light Data Communication and TE SubCom, we’re building the first direct submarine cable system between Los Angeles and Hong Kong.
  4. Introducing Cloud Natural Language API, Speech API open beta and our West Coast Region expansion - Now anyone can use machine learning models to process unstructured data or to convert speech to text. We also announced the opening of our Oregon Cloud Region (us-west1).


  5. Google to acquire Apigee - Apigee, an API management provider, helps developers integrate with outside apps and services. (Our acquisition of cloud-based software buyer and seller, Orbitera, also made big news this year.)


  6. Top 5 GCP NEXT breakout sessions on YouTube (so far) - From Site Reliability Engineering (SRE) and container management to building smart apps and analyzing 25 billion stock market events in an hour, Google presenters kept the NEXT reel rolling. (Don’t forget to sign up for Google Cloud Next 2017, which is just around the corner!)


  7. Advancing enterprise database workloads on Google Cloud Platform - Announcing that our fully managed database services Cloud SQL, Cloud Bigtable and Cloud Datastore are all generally available, plus Microsoft SQL Server images for Google Compute Engine.


  8. Google Cloud machine learning family grows with new API, editions and pricing - The new Cloud Jobs API makes it easier to fill open positions, and GPUs spike compute power for certain jobs. Also included: custom TPUs in Cloud Vision API, Cloud Translation API premium and general availability of Cloud Natural Language API.


  9. Google Cloud Platform sets a course for new horizons - In one day, we announced eight new Google Cloud regions, BigQuery support for Standard SQL and Customer Reliability Engineering (CRE), a support model in which Google engineers work directly with customer operations teams.


  10. Finding Pete’s Dragon with Cloud Vision API - Learn how Disney used machine learning to create a “digital experience” that lets kids search for Pete’s friend Elliot on their mobile and desktop screens.
  11. Top 10 GCP sessions from Google I/O 2016 - How do you develop a Node.js backend for an iOS and Android based game? What about a real-time game with Firebase? How do you build a smart RasPI bot with Cloud Vision API? You'll find the answers to these and many other burning questions in these sessions.


  12. Spotify chooses Google Cloud Platform to power its data infrastructure - As Spotify’s user base grew to more than 75 million, it moved its backend from a homegrown infrastructure to a scalable and reliable public cloud.

Thank you for staying up to speed on GCP happenings on our blog. We look forward to much more activity in 2017, and invite you to join in on the action if you haven't already. Happy holidays!

Bigtable paper earns the SIGOPS 2016 Hall of Fame Award



We’re honored and humbled to bring you the news that the original Bigtable paper (“Bigtable: A Distributed Storage System for Structured Data”) has received the SIGOPS Hall of Fame Award. The award was announced at the annual USENIX OSDI conference in Savannah, Georgia, on November 2.

Curated by the ACM Special Interest Group on Operating Systems (SIGOPS), the annual award recognizes the most influential papers published over the previous decade. The Bigtable paper, published in 2006 by Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes and Robert E. Gruber, joins a long list of pioneering research that had significant impact on academia, industry and the world. Past recipients include the Google File System paper (Ghemawat et al, 2003), the MapReduce paper (Dean and Ghemawat, 2004), and other historic papers that provided many of the foundational principles for distributed and cloud computing as we know them today.

In that tradition of technical thought leadership, Bigtable helped to kick-start one of the most transformational movements in modern distributed computing: NoSQL. Specifically, the published description of Bigtable and its use cases at Google in that paper either directly or indirectly led to the creation of open source implementations in the form of HBase, Cassandra and Accumulo, which have since become widely-adopted Apache projects. In that sense, one could argue that the Bigtable paper did for the NoSQL database industry what E.F. Codd’s 1970 relational data model paper did for the RDBMS industry.

From thought leadership to customer success on the cloud

There are multiple technical reasons that explain Bigtable’s influence, and reading the paper itself is the best way to understand them. Suffice to say here that Bigtable was among the first distributed stores ever described that could support storage of structured data on a petabyte scale with linear scalability, low latency and high-throughput performance. These characteristics were driven by what were highly demanding, future-looking requirements even at that time, and to this day, Bigtable powers many of our most popular user-facing products serving up to billions of users, including Search, Analytics, Maps and Gmail.

In recent years, perhaps the most significant development in the Bigtable ecosystem is that customers can now directly benefit from these same advantages via Google Cloud Bigtable, a managed service provided by Google Cloud Platform. Spotify, FIS, Energyworx, Qubit and many other customers are happily running their production workloads on Google Cloud Bigtable today, and we’re confident that we can help meet your needs, as well.

Explore Google Cloud Bigtable for yourself.

Treat Google Cloud Storage like a file system with our new PowerShell provider



Google Cloud Storage is pretty amazing. It offers near-infinite capacity, up to 99.95% availability and fees as low as $0.007 per GB per month. But storing data in the cloud has always had one drawback: you need to use specialized tools like gsutil to browse or access it. You can’t just treat Cloud Storage like a really, really, really big hard disk. That is, until now.

Navigating Cloud Storage with Cloud Tools for PowerShell

The latest release of Cloud Tools for PowerShell (included with the Cloud SDK for Windows) includes a PowerShell provider for Cloud Storage. PowerShell providers are a slick feature of Windows PowerShell that allows you to treat a data source as if it were a file system, to do things like browse the system registry or interact with a SQL Server instance. With a PowerShell provider for Cloud Storage, you can now use commands like cd, dir, copy, del, or even cat to navigate and manipulate your data in Cloud Storage.

To use the provider for Cloud Storage, first load the GoogleCloud PowerShell module by using any of its cmdlets, PowerShell’s lightweight commands. Then just cd into the gs:\ drive. You can now explore your data like you would any local disk. To see what buckets you have available in Cloud Storage, just type dir. The provider will use whatever credentials you have configured for the Cloud SDK (see gcloud init).

PS C:\> Import-Module GoogleCloud
WARNING: The names of some imported commands from the module 'GoogleCloud' include unapproved verbs that might make
them less discoverable. To find the commands with unapproved verbs, run the Import-Module command again with the
Verbose parameter. For a list of approved verbs, type Get-Verb.
PS C:\> cd gs:\
PS gs:\> dir | Select Name

Name
----
blog-posts
chrsmith-demos.appspot.com
chrsmith-pictures
database-snapshots-prod
staging.chrsmith-demos.appspot.com

...


To navigate your buckets and search for a specific object, just keep using cd and dir (which are aliases for the Set-Location and Get-ChildItem cmdlets, respectively). Note that just like the regular file system provider, you can use tab-completion for file and folder names.


Populating Google Cloud Storage

The following code snippet shows how to create a new bucket using mkdir and use the Set-Content cmdlet to create a new object. Notice that once you cd into a bucket, content cmdlets like Get-Content take object names relative to the current folder in Google Cloud Storage, e.g. relative to gs:\gootoso-test-bucket\folder.

PS gs:\> mkdir gootoso-test-bucket | Out-Null
PS gs:\> Set-Content gs:\gootoso-test-bucket\folder\file.txt `
   -Value "Hello, GCS!"
PS gs:\> Test-Path gs:\gootoso-test-bucket\folder\file.txt
True
PS gs:\> cd .\gootoso-test-bucket\folder
PS gs:\gootoso-test-bucket\folder> cat file.txt

Hello, GCS!


Of course you could do the same thing with the existing PowerShell cmdlets for Cloud Storage such as Get-GcsBucket, New-GcsObject, Copy-GcsObject and so on. But being able to use common commands like cd in the PowerShell provider provides a much more natural and productive experience.

Mixing Cmdlets and the PowerShell Provider

Since the PowerShell provider returns the same objects as other Cloud Storage cmdlets, you can intermix commands. For example:

PS gs:\gootoso-test-bucket\folder> $objs = dir
PS gs:\gootoso-test-bucket\folder> $objs[0].GetType().FullName
Google.Apis.Storage.v1.Data.Object
PS gs:\gootoso-test-bucket\folder> $objs | Read-GcsObject
Hello, GCS!

PS gs:\gootoso-test-bucket\folder> Write-GcsObject -Object $objs[0] -Contents "update"
PS gs:\> Remove-GcsBucket -Name gootoso-test-bucket


All of the objects returned are strongly typed, defined in the C# client library for the Cloud Storage API. That means you can use PowerShell’s particularly powerful pipelining features to access properties on the returned objects, for things like sorting and filtering.

This snippet shows how to find the largest object under the images folder of the blog-posts bucket.

PS gs:\> cd gs:\blog-posts\images
PS gs:\blog-posts\images> $objects = dir -Recurse
PS gs:\blog-posts\images> $objects |
   Sort-Object Size -Descending |
   Select-Object -First 1 -Property Name,TimeCreated,Size


In short, the PowerShell provider for Cloud Storage simplifies a lot of tasks, so give it a whirl and see for yourself. For more information on the provider as well as other PowerShell cmdlets, check out the PowerShell documentation.

Google Cloud Tools for PowerShell, including the new provider for Cloud Storage, is in beta. If you have any feedback on the cmdlet design, documentation, or have any other issues, please report it on GitHub. The code is open-source too, so pull requests are also welcome.

Announcing new storage classes for Google Cloud Storage: simplifying the storage and management of hot and cold data



Businesses seek the best price and performance to suit the storage needs of workloads ranging from multimedia serving, to data analytics and machine learning, to data backup/archiving, all of which drive demand for a variety of storage options. At Google, we aim to build a powerful cloud platform that can meet the needs of the most demanding customer workloads. Google Cloud Storage is a key part of that platform and offers developers and IT organizations durable and highly available object storage, with consistent APIs for ease of application integration, all at a low cost.

Today, we’re excited to announce a major refresh of Google Cloud Storage. We're introducing new storage classes, data lifecycle management tools, improved availability and lower prices, all to make it easy for our customers to store their data with the right type of storage. Whether a business needs to store and stream multimedia to their users, store data for machine learning and analytics or restore a critical archive without waiting for hours or days, Cloud Storage now offers a broad range of storage options to meet those needs.

We’re also excited to announce the continued expansion of our Google Cloud Platform (GCP) partner ecosystem, with partners already using the new Cloud Storage capabilities for use cases including content delivery, hybrid storage, archival, backup and disaster recovery.

New storage classes for Google Cloud Storage

We're announcing the general availability of four storage classes for Cloud Storage. These offer customers a consistent API and data access performance for all of their hot and cold data, with simple-to-understand and highly competitive pricing.
Cloud Storage Coldline: a low-latency storage class for long-term archiving
Coldline is a new Cloud Storage class designed for long-term archival and disaster recovery. Coldline is perfect for the archival needs of big data or multimedia content, allowing businesses to archive years of data. Coldline provides fast, instant (millisecond) access to data and changes the way that companies think about storing and accessing their cold data.

At GCP, we believe that archival data should be as accessible as any other data. Coldline’s API and low latency data access are consistent with other storage classes. This means existing systems can now store and access Coldline data without any updates to the application, and can serve that data directly to end users in milliseconds. Priced at just $0.007 per gigabyte per month plus a simple and predictable access fee of $0.05 per GB retrieved, Coldline is the most economical storage class for data that's accessed less than once per year.
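
To put those numbers in rough context: storing about 10 TB (roughly 10,000 GB) in Coldline comes to approximately $70 per month, and a one-time restore of 1 TB adds roughly $50 in retrieval fees; network egress, where applicable, is billed separately.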

Coldline also works well with Nearline to provide tiered storage for data as it cools. Our recent work on Nearline latency and throughput ensures comparable performance across all storage classes.

To help you migrate your data to Coldline, and other Cloud Storage classes, we offer an easy-to-use Google Cloud Storage Transfer Service and have extended our Switch and Save program to include Coldline. Depending on the amount of data you're bringing to Coldline, you can receive several months of free storage, for up to 100PB of data. To learn more about Switch and Save, please contact our sales team.

Google Cloud Storage Multi-Regional and Regional
GCP customers use Cloud Storage for a variety of demanding use cases. Some use cases require highly available storage close to the Google Compute Engine instances. Others need higher levels of availability and geo-redundancy. We’re updating our storage classes to address those needs:

Google Cloud Storage Multi-Regional is a highly available and geo-redundant storage class. It’s the best storage class for business continuity, or for serving multimedia content to geographically distributed users.

In the case of a regional outage, Cloud Storage transparently routes requests to another available region, ensuring that applications continue to function without disruption. Multi-Regional storage is priced at $0.026 per GB per month, including storage of all replicas, replication over the Google network and connection rerouting. It’s currently available in three locations: US, EU and Asia. All existing Standard storage buckets in a multi-regional location have been converted to Multi-Regional storage class.

Vimeo, a media hosting, sharing and streaming service, leverages Google Cloud Storage Multi-Regional to ensure high availability and low-latency access to data. Cloud Storage Nearline is used to minimize overall storage costs. To deliver the best possible experience, Vimeo leverages integration between Google Cloud Storage and Fastly, a real-time CDN service. With Fastly, Vimeo can deliver content from Google Cloud Storage to users instantly, with sub-150-millisecond response times.
"We use Google Cloud Platform, including Google Cloud Storage and Compute Engine along with Fastly, for storing and delivering all popular and infrequently accessed content and to handle our peak transcode loads."
- Naren Venkataraman, Senior Director of Engineering, Vimeo
"Fastly customers need low-latency, high-throughput storage and fast, flexible, secure content delivery at the edge. The combined power of Google Cloud Storage and Fastly’s Cloud Accelerator allows customers like Vimeo to fully optimize content storage and delivery, controlling costs and improving global performance."
- Lee Chen, Head of Strategic Partnerships, Fastly

Google Cloud Storage Regional is a highly available storage class redundant within a single region. It’s ideal for pairing storage and compute resources within a region, delivering low end-to-end latency and high throughput for workloads such as data transcoding or big data analytics running on Google Compute Engine, Google Cloud Dataproc, Google Cloud Machine Learning or BigQuery.

Regional storage class is priced at $0.02 per GB per month. All existing Standard storage buckets in a regional location have been converted to the Regional storage class. This is equivalent to a 23% price reduction, and the pricing change for converted buckets takes effect November 1st.

Effective November 1st we're also introducing new lower API operations pricing for both Multi-Regional and Regional storage classes. Class A operations will cost $0.005 per 1,000 operations (50% price reduction), and Class B will cost $0.004 per 10,000 operations (60% price reduction).

With the addition of Coldline and the refresh of our storage classes with Multi-Regional and Regional, GCP customers will continue to enjoy the same API and consistent data access performance for all of their hot and cold data. With Coldline, no application changes are needed to leverage archived data and there’s no compromise on access time for that data, while Multi-Regional makes it simple to ensure that your data is highly available and geo-redundant. Plus, we're delivering all of this with simple-to-understand and highly competitive pricing.

New data management lifecycle capabilities

Many of our customers use multiple storage classes for their different workloads. Having a single API and consistent data access performance ensures applications can seamlessly leverage multiple storage classes. It should be easy for customers to also manage the appropriate storage tier for their data.

We're introducing the beta of new data lifecycle management capabilities to make it easier to manage data placement. Any Google Cloud Storage bucket can now hold data in different storage classes, and the lifecycle policy feature can automatically transition objects in-place to the appropriate colder storage class based on the age of the objects.
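
For example, a single age-based rule can move objects to Coldline in place. Here's a minimal sketch using the Cloud Storage Python client (the bucket name is illustrative, and the same policy can be expressed as a JSON lifecycle configuration for gsutil):

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket('my-archive-bucket')

# Transition objects older than 365 days to the Coldline storage class, in place.
bucket.lifecycle_rules = [{
    'action': {'type': 'SetStorageClass', 'storageClass': 'COLDLINE'},
    'condition': {'age': 365},
}]
bucket.patch()  # persist the updated lifecycle configuration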

Expanding the Cloud Storage partner ecosystem

Many customers already use multiple Cloud Storage classes and will benefit from these storage updates, both directly through us and through our partners, a number of whom have already integrated the new Coldline storage class. Starting today, these partners are available to help you use our new storage classes in your own environment:

  • Fastly: Fastly is a content delivery network that lets businesses control how they serve content, provides real-time performance analytics and caches frequently changing content at the edge. Fastly enables customers to configure Google Cloud Storage as the origin, and Fastly’s Origin Shield designates a single point-of-presence (POP) to handle cache-misses across our entire network.
  • Veritas: Building on existing support for GCP, Veritas is committed to supporting Cloud Storage Coldline. The unique combination of Veritas Information Map, Veritas NetBackup and the GCP ensures customers can gain greater controls on data visibility as they move to the Google Cloud at global enterprise scale. Veritas' collaboration with Google further demonstrates the shared commitment to helping organizations around the world manage information.
  • Cloudian: The Cloudian HyperStore smart data storage platform seamlessly integrates with Google Cloud Storage (including Coldline) to provide anywhere from terabytes to hundreds of petabytes of on-premises storage. Policy-based data migration lets you move data to Coldline based on rules such as data type, age and frequency of access.
  • Cloudberry Lab: CloudBerry Backup is a cloud backup solution that leverages Coldline. In addition to offering real-time and/or scheduled regular backups, encryption, local disk image or bare metal restore, CloudBerry employs block level backup for maximum efficiency and provides alerting features to track each backup and restore plan remotely.
  • Komprise: Komprise data management software enables businesses to seamlessly manage the lifecycle of data and cut costs by over 70% by leveraging all the tiers of Cloud Storage transparently with existing on-premises storage. In under 15 minutes, customers can get a free assessment of how much data can move to Cloud Storage and the projected ROI with a free Komprise trial.
  • StorReduce: StorReduce’s inline deduplication software enables you to move terabytes to petabytes of data into Coldline (or other Cloud Storage tiers) and then use cloud services such as search on that data.
  • Cohesity: The Cohesity hyper-converged secondary storage system for enterprise data consolidates fragmented, inefficient islands of secondary storage into a virtually limitless storage platform. Coldline can be used for any data protection workload via Cohesity’s policy-based administration capabilities.
  • Sureline: Sureline application mobility software delivers migration and recovery of virtual, cloud, physical or containerized applications and servers. It allows enterprises to use Coldline as the disaster recovery target for occasionally accessed DR images with SUREedge DR.

With an expanding partner ecosystem, more customers than ever before are now able to take advantage of the benefits of GCP.

To learn more about Cloud Storage and the new storage classes, visit our web page here, or do a deeper dive into our technical documentation. You can also sign up for a free trial, or contact our sales team.