March 20, 2017

Keeping an eye on Cloud Functions with Stackdriver Monitoring

Google Cloud Functions are awesome, but it can be hard to tell whether they're running successfully in your production app, especially when they're triggered by Realtime Database, Cloud Pub/Sub or other event triggers, which wouldn't cause any errors on your clients.

To fix this, we can use Google Cloud's Stackdriver Logging and Stackdriver Monitoring to send alerts (via SMS, email, PagerDuty, etc.) when your functions are having issues.

Note: Your Firebase project will need to be on the pay-as-you-go "Blaze" plan so that we can use Google Cloud features.

Let's dive in and set up a simple alert.

Open up your Cloud Console.

In the left navigation panel, find Stackdriver > Monitoring.

This will launch you into the separate Stackdriver console. This console is separate because it's not tied to Google Cloud - it's also usable with AWS and other clouds.

You can quickly click through setting up your Stackdriver account.

Make sure that you're configuring Stackdriver to use the correct Google Cloud / Firebase project - my project is called "Upsheet".

You will be asked about adding other Google Cloud projects or AWS projects and installing the Stackdriver agent. Just keep hitting continue - we don't need any of that.

Eventually, you'll see this screen.

We're ready to dive in and create an alert. Hit the Launch Monitoring button and you'll see the Stackdriver dashboard.

We don't need to do anything else in Stackdriver quite yet - head back to the Google Cloud console.

In the Cloud console, go to the left navigation and find the Logging section.

If you haven't seen Stackdriver Logging in Google Cloud before, this next section could be a little overwhelming. The important thing to understand is that all your logs (from App Engine, Compute Engine, Cloud Functions for Firebase, etc.) come here.

We can ignore most of this data, though. Let's filter it down to the single Cloud Function we want to create an alert for.

Navigate to Cloud Function > FUNCTION_NAME > All region. My function is called reddit_hourly_dispatcher. This is a function which runs every hour and reads some data from Reddit. Once you've selected a function, you'll see all the logs for that specific function.

I want to create an alert which will trigger if this function doesn't run at least once every hour.

Although I could make an advanced alert which looks for failures or other specific types of logs, in this case I just want to know that the function execution starts.

Looking through the logs, we see that an event with the text "Function execution started" gets logged each time the function is invoked. I'll use this type of log as the basis for my alert.

If I click on the text of the log, it'll expand to show the details of the log entry.

We're only interested in the textPayload section. Clicking on the value of textPayload will give us the option to show all entries which have an exact match on that field.

We can see that the filter field at the top of the page will now have a very specific filter which matches every Function execution started event for my reddit_hourly_dispatcher function.
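For reference, a filter of this kind can be sketched as a small helper. This is a minimal sketch, assuming the usual Stackdriver Logging advanced-filter field names (resource.type, resource.labels.function_name, textPayload); build_invocation_filter is a hypothetical helper name, and the filter the console generates for your project may include extra clauses.

```python
def build_invocation_filter(function_name: str) -> str:
    """Build an advanced log filter matching 'Function execution started' entries.

    Assumption: the field names below follow the usual Stackdriver Logging
    filter syntax. Compare the result with the filter the console shows you.
    """
    clauses = [
        'resource.type="cloud_function"',
        f'resource.labels.function_name="{function_name}"',
        'textPayload="Function execution started"',
    ]
    # Join the comparisons with an explicit AND so all of them must match.
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(build_invocation_filter("reddit_hourly_dispatcher"))
```

Building the string from a function name makes it easy to reuse the same pattern for other functions you want to watch.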

We can now see that this function is invoked every hour at around 26 minutes and 3 seconds after the hour. The exception is the 13:37 entry, where I manually triggered the function as a test.
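If you'd rather check the hourly cadence than eyeball it, the idea can be sketched in a few lines of Python. This is only an illustration: find_gaps is a hypothetical helper, the five-minute tolerance is my assumption, and the timestamps are made up for the example rather than taken from my real logs.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_gap=timedelta(hours=1, minutes=5)):
    """Return (previous, current) pairs of consecutive invocations that are
    further apart than max_gap."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


if __name__ == "__main__":
    # Hypothetical invocation times for illustration, not real log data.
    runs = [
        datetime(2017, 3, 20, 11, 26, 3),
        datetime(2017, 3, 20, 12, 26, 3),
        datetime(2017, 3, 20, 13, 26, 3),
        datetime(2017, 3, 20, 13, 37, 0),  # manual test run
        datetime(2017, 3, 20, 16, 26, 3),  # the 14:26 and 15:26 runs are missing
    ]
    for previous, current in find_gaps(runs):
        print(f"No invocation between {previous} and {current}")
```

An extra manual run never creates a gap, so it doesn't trip the check. Only missing runs do.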

Now that this filter has been created, we can save this as a metric which we can use to create an alerting policy in Stackdriver Monitoring.

Hit the Create Metric button in the top left.

You'll be asked to give this metric a name and a description. This metric is a log of invocations of the reddit_hourly_dispatcher function, so I named it reddit_hourly_dispatched-was-invoked. Then I hit the second Create Metric button.

Once you've created this metric, select the Log-based Metrics tab on the left side.

You can see our new metric now shows up under User-defined Metrics.

We want to create an alert from this metric, so select the three-dots menu and click Create alert from metric. This will send us back to Stackdriver Monitoring.

In Stackdriver Monitoring, you'll see that it has automatically put together the beginning of an alerting policy for you.

Make sure RESOURCE is set to cloud_function, otherwise you may not see any data. If you still don't see any data, you may need to wait for your function to be invoked for it to be captured by your new metric.

At this point you can configure your policy as you'd like. In my case, I'd like to be alerted if my function isn't invoked at least once an hour. To be safe, I gave it a slightly larger window (2 hours) and created a policy which says "if invocations is below 0.01 (i.e. is zero) for 2 hours then alert".
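To make the "below 0.01 for 2 hours" rule concrete, here is a simplified model of the condition in Python. It is a sketch under my own assumptions, not how Stackdriver evaluates policies internally: below_threshold_for is a hypothetical helper, and the sample values are invented for the example.

```python
from datetime import datetime, timedelta


def below_threshold_for(samples, threshold, duration):
    """Return True if every sample in the trailing `duration` window is below
    `threshold` and the samples cover that whole window.

    `samples` is a list of (timestamp, value) pairs sorted by timestamp.
    """
    if not samples:
        return False
    window_start = samples[-1][0] - duration
    if samples[0][0] > window_start:
        return False  # not enough history to cover the window yet
    recent = [value for ts, value in samples if ts >= window_start]
    return all(value < threshold for value in recent)


if __name__ == "__main__":
    # Hypothetical hourly invocation counts: one run at 10:00, then silence.
    samples = [
        (datetime(2017, 3, 20, 10, 0), 1),
        (datetime(2017, 3, 20, 11, 0), 0),
        (datetime(2017, 3, 20, 12, 0), 0),
        (datetime(2017, 3, 20, 13, 0), 0),
    ]
    if below_threshold_for(samples, threshold=0.01, duration=timedelta(hours=2)):
        print("ALERT: no invocations for 2 hours")
```

A single invocation anywhere inside the window resets the condition, which is why the 2 hour window tolerates one late or skipped run without paging me.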

You can see the chart adds the threshold I defined (the blue line) so we can see it in comparison to the previous invocations of the function (the pink line).

Hit Save Condition.

Now we'll be able to finally wrap this up by attaching a notification to this alert.

As you can see, there are tons of advanced options for alerting. I like getting an SMS, so I chose that.

Under Documentation you can add in some documentation to be sent along with the outage alert. I wrote some notes to myself about what this metric actually means.

Finally you can give this policy a friendly name, save it and boom!

You've got an alert tied to your Google Cloud Functions for Firebase. If that function stops being called, then I'll get a text and I'll be able to investigate.

Obviously this is only the tip of the Stackdriver iceberg. You could create dozens of different alerting policies based on different Cloud Function metrics. You can also use these metrics to create dashboards with nifty charts so you can see how your app is running in a flash.

Stackdriver is an extremely powerful part of Google Cloud and I highly recommend diving into the Stackdriver docs and using it to monitor all parts of your apps!

Happy hacking!

Follow me on Twitter for more Cloud Functions Pro-tips!