Cloud Conversion, Part 2: Cloud Functions
This series of articles gives you the latest from our Google Cloud Platform (GCP) “lab” where we tinker, experiment and do feasibility studies on our wide GCP technology offering for upcoming customer cases.
So, with the summer gone, it’s time to take another look at some sweet GCP tech. Summer is a great time of year to hack away on new stuff: while the evenings and weekends tend to be filled with various kinds of fun not necessarily (or even remotely) related to programming, many projects slow down during the summer holidays, which leaves room to focus on experimenting and learning new tricks. This time we will be taking a look at Google Cloud Functions.
INTRODUCING CLOUD FUNCTIONS
Cloud Functions are a somewhat new addition to the GCP stack; they’re still in Beta and therefore not officially promoted by Google for use in anything critical as changes could break things. However, Google’s products tend to be really solid already in Beta. I have no personal anxiety about using Google Betas in production, as long as the customer is aware of what it means and what the possible backup plans are.
Cloud Functions are small bits of code that react to events: you can trigger a Cloud Function via an upload/change/delete of a storage object on Google Cloud Storage (‘GCS’), a Cloud Pub/Sub message or a direct HTTP call. Your function may then perform its task and optionally signal another part of your platform by a number of means. Cloud Functions are often referred to as serverless computing, meaning no-ops administration and easy deployment of your service. Naturally, there is a server (in this case, many); it’s just run by someone else! The implementation language and environment at the time of writing is Node.js v6; hopefully at least Python and Go will be added once Cloud Functions are out of Beta.
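To make the three trigger types a bit more concrete, here is a minimal sketch of what the handlers can look like on the Node.js runtime. It assumes the Beta-era conventions: background functions receive `(event, callback)` with the payload in `event.data`, storage events carry `resourceState` and `metageneration`, Pub/Sub payloads are base64 encoded, and HTTP functions get Express-style `req`/`res` objects. The function and helper names are made up for illustration.

```javascript
'use strict';

// Hypothetical helper: turn a GCS change event payload into a readable line.
// Assumes the payload has bucket, name, resourceState and metageneration.
function describeStorageEvent(file) {
  const path = `gs://${file.bucket}/${file.name}`;
  if (file.resourceState === 'not_exists') {
    return `deleted ${path}`;
  }
  // The first metadata generation is assumed to mean a freshly created object.
  if (String(file.metageneration) === '1') {
    return `created ${path}`;
  }
  return `updated ${path}`;
}

// Hypothetical helper: decode the base64 body of a Pub/Sub message.
function decodePubSub(message) {
  return Buffer.from(message.data, 'base64').toString('utf8');
}

// Background function for a GCS bucket trigger.
function onStorageChange(event, callback) {
  console.log(describeStorageEvent(event.data));
  callback(); // signal that the work is done
}

// Background function for a Pub/Sub topic trigger.
function onMessage(event, callback) {
  console.log(`received: ${decodePubSub(event.data)}`);
  callback();
}

// HTTP function; req and res behave like Express request/response objects.
function onHttpRequest(req, res) {
  const name = (req.query && req.query.name) || 'world';
  res.status(200).send(`Hello, ${name}!`);
}

exports.onStorageChange = onStorageChange;
exports.onMessage = onMessage;
exports.onHttpRequest = onHttpRequest;
```

Each exported name would be deployed as its own function, with the trigger type chosen at deploy time rather than in the code.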
SAMPLE USE CASE 1: IMAGE MANIPULATION
Preparing for a customer case, I ported my old JPEG Preview Thumbnail [1] technology to a Cloud Function. The old implementation was a Google AppEngine / Python based microservice that processed images as they were sent to it over HTTP. This worked just fine, but every caller depended on the service’s API, and I thought an autonomous and reactive model would be nicer.
The function uses a GCS bucket as its input trigger. When an image file is uploaded to the bucket, the function downloads it, downscales it to thumbnail size and then extracts the JPEG part of the data. This data is then combined with the custom header and stored in a configured output GCS bucket under the same object name as the original image. During processing, the original image’s dominant color is also calculated and stored as metadata for the output file. The service using the thumbnails can then listen for new entries in the output bucket and thus know when the thumbnail calculation of an image is complete.
This makes the function completely autonomous, with no dependencies on the rest of the system; all the signaling and data transfer is done via GCS buckets.
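As a rough illustration of the dominant color step and the "same name, other bucket" convention, here is a small sketch. It is not the code from the repository: it assumes the image has already been decoded into raw RGB bytes (3 bytes per pixel), picks the most populated coarse color bucket, and uses hypothetical names (`dominantColor`, `thumbnailTarget`, the `dominantColor` metadata key).

```javascript
'use strict';

// Hypothetical helper: estimate the dominant color of raw RGB pixel data
// (3 bytes per pixel) by counting pixels in coarse color buckets and
// averaging the colors that fall into the fullest bucket.
function dominantColor(rgb, bitsPerChannel) {
  const bits = bitsPerChannel || 3; // 3 bits per channel = 512 buckets
  const shift = 8 - bits;
  const buckets = new Map();
  for (let i = 0; i + 2 < rgb.length; i += 3) {
    const key = ((rgb[i] >> shift) << (2 * bits)) |
                ((rgb[i + 1] >> shift) << bits) |
                (rgb[i + 2] >> shift);
    let bucket = buckets.get(key);
    if (!bucket) {
      bucket = { count: 0, r: 0, g: 0, b: 0 };
      buckets.set(key, bucket);
    }
    bucket.count += 1;
    bucket.r += rgb[i];
    bucket.g += rgb[i + 1];
    bucket.b += rgb[i + 2];
  }
  let best = null;
  buckets.forEach((bucket) => {
    if (!best || bucket.count > best.count) best = bucket;
  });
  if (!best) return null; // no pixels at all
  const hex = (sum) => {
    const value = Math.round(sum / best.count).toString(16);
    return value.length < 2 ? '0' + value : value;
  };
  return '#' + hex(best.r) + hex(best.g) + hex(best.b);
}

// Hypothetical helper: describe where the thumbnail goes. The object keeps
// the original name; only the bucket differs, and the dominant color rides
// along as custom metadata.
function thumbnailTarget(file, outputBucket, rgb) {
  return {
    bucket: outputBucket,
    name: file.name,
    metadata: { dominantColor: dominantColor(rgb) }
  };
}
```

Because the output object keeps the input name, a consumer watching the output bucket can match a finished thumbnail to its original without any extra bookkeeping.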
This function has now also been released to the public with the MIT license; find the repo here: https://github.com/qvik/gcf-thumbnails
SAMPLE USE CASE 2: METADATA AUGMENTATION VIA DATA ANALYSIS
We were tasked with building an image analysis module for our customer Choicely [2]. The module’s purpose was to extract labels (“tags”) from images being uploaded by users through their platform and update their Datastore entities accordingly. The information would be used for providing a suggested set of image tags to the user at a later time and for categorizing images to improve searching and cataloging. In addition, the images had to be scanned for nudity, which would be flagged and possibly removed in compliance with their service guidelines.
Vision API was the obvious choice for the image analysis. The question was where and how to run the business logic. Should we add this functionality to their existing AppEngine module? Should we add a new microservice alongside the old main module? Having very recently played around with Cloud Functions, we suggested giving them a try and got the green light from the customer. All of a day and a half later, the first version was up and running, happily tagging the content images. Needless to say, they were happy with the results.
As in the first example, the function uses a GCS bucket trigger to notice a fresh image upload. It then pushes the image to Vision API for analysis and fetches the corresponding metadata entity from Datastore, storing the results in that entity. Thus, the metadata in Datastore gets augmented/enriched in an asynchronous and autonomous fashion: the next time the main application serves the metadata out in an API call response, the augmented data is passed to the client along with the pre-existing data. Smooth, eh?
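The merging step can be sketched as a pure function, leaving out the actual Vision API and Datastore calls. The sketch assumes a response shaped like the Vision API’s annotation result (`labelAnnotations` with `description` and `score`, `safeSearchAnnotation.adult` as a likelihood string); the entity field names `suggestedTags` and `flaggedForNudity`, as well as the thresholds, are invented for illustration and are not the customer’s actual schema.

```javascript
'use strict';

// Likelihood values of the safe search annotation, in ascending order.
const LIKELIHOODS = [
  'UNKNOWN', 'VERY_UNLIKELY', 'UNLIKELY', 'POSSIBLE', 'LIKELY', 'VERY_LIKELY'
];

// Hypothetical helper: return a copy of the metadata entity enriched with
// the analysis results. Existing fields are kept untouched.
function augmentEntity(entity, annotation, options) {
  const opts = Object.assign({ minScore: 0.7, flagFrom: 'LIKELY' }, options);

  // Keep only the labels the analysis is reasonably confident about.
  const tags = (annotation.labelAnnotations || [])
    .filter((label) => label.score >= opts.minScore)
    .map((label) => label.description.toLowerCase());

  // Flag the image when the adult likelihood reaches the configured level.
  const adult = (annotation.safeSearchAnnotation || {}).adult || 'UNKNOWN';
  const flagged =
    LIKELIHOODS.indexOf(adult) >= LIKELIHOODS.indexOf(opts.flagFrom);

  return Object.assign({}, entity, {
    suggestedTags: tags,
    flaggedForNudity: flagged
  });
}
```

In the deployed function, the returned object would be saved back to Datastore under the same key, so the main application needs no changes beyond reading the new fields.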
IN CLOSING
Cloud Functions are a really nice way to implement small, isolated processing steps in your backend data flow. With no dependencies to manage, no scalability worries and no separate APIs to implement, development becomes extremely modular and can save a great deal of time.
The above-mentioned use cases might be among the most common uses for Cloud Functions, but there’s a myriad of possibilities to benefit from deploying them for your platform. We’re already using several in production and, once they are officially out of Beta, we’ll really be putting the pedal to the metal with Cloud Functions!
REFERENCES
[2] Choicely is an audience engagement platform with mobile apps, a website and a widget that can be embedded in customer websites. They provide extensive services for polling audience favourites and rating contestants. They also provide APIs to live data for TV shows and live events. https://choicely.com/