Building a serverless blog - a newbie way

Intro

While preparing for the AWS Solutions Architect Associate exam, I wanted to test my knowledge in practice, and especially to try out a new concept that has recently been getting more and more traction: serverless.

My day-to-day work is rather backend-heavy: we deploy and configure hundreds of machines, both in the cloud and in physical data centers. So I decided to create a personal blog and try some frontend work combined with the newest AWS goodies.

The topic is going to be covered in a series of posts. The first one focuses on my newbie way of doing it, in other words on getting something working as quickly as possible. In later posts I’m going to show how to harden and secure the setup, and what can be done better. I’m assuming that the reader has basic knowledge of AWS.

What is serverless

Serverless architecture is not about having no servers at all, but rather about eliminating the ops burden. You no longer have to care about servers, only about writing the code. Scaling and load balancing are hidden from you, and you are charged by request count (compute units) instead of by the capacity of the servers you create. With servers, you pay for that capacity no matter if you are using it or not. In the serverless world, if you don’t use compute units, you just don’t pay for them.

This is not serverless

In the AWS world, the service responsible for all of this is called Lambda. Lambda is a very powerful implementation of the aforementioned concept: it can be triggered by an HTTP request or by an AWS event. Other cloud providers have their own implementations of this idea, for example Microsoft offers Azure Functions and Google has Google Cloud Functions.

One question that might come to your mind is: how does it differ from the platform as a service approach?

In the PaaS model we still have to reserve capacity for our service - for example at Heroku you pay for every dyno and DB machine. In the FaaS model there is no limit on compute units (at least in theory). However, there are similarities: in both cases you have limited or even no control over the box your app is running on, which can hurt if you need to debug your app in a production environment.

Why do you want to be serverless?

First of all, we want to get rid of thinking about scaling and configuring our infrastructure. All these tasks can be delegated - in our case to AWS. Thanks to that we can focus purely on the functionality and application code. A detailed discussion of this topic can be found here:

The Next Layer of Abstraction in Cloud Computing is Serverless

The thesis of the above-mentioned blog post is: Lambda is doing to compute what S3 did to storage.

Goal & Architecture

My first serverless application is the website you are currently looking at! Yes, yet another “write your own blog in 5 minutes” tutorial. However, I find a blog to be a very simple playground, where you don’t have to focus on functional requirements as much as on technical details.

Architecture of this website

Most of a blog’s content is usually static: once written, posts are published and do not change much, and the most dynamic part - comments - can be handled by Disqus’ embeddable plugin. It seems that a simple static HTML page can do the job. With the advent of frameworks like Jekyll, it is very easy to generate such a blog from a bunch of markdown files containing your posts.

So there is actually no place for Lambda, is there? That is why I decided to add a contact form page, where readers can contact me without me having to expose my email address to the public. Going further, I included a captcha to eliminate bot-generated requests and, at the same time, to test how to deal with external services from a Lambda function. To make my life easier I used SNS to send me email notifications.

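The contact form flow can be summarized in the following steps:

1. The reader fills in the form, solves the captcha and submits; the browser sends a POST request to an API Gateway endpoint.
2. API Gateway maps the form body to JSON and invokes the Lambda function.
3. The Lambda function validates the captcha response using Google’s captcha service.
4. If the captcha is valid, the function publishes the message to an SNS topic, which sends it to me by email.
5. The browser is redirected back to the main page.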

Let’s get our hands dirty!

Static Content

I won’t describe how to configure and use Jekyll, but rather assume that we have a directory with a generated static website. To host and serve it we are going to use an S3 bucket. S3 is just object storage, but it is capable of serving static content via HTTP. The feature is called static website hosting and it is available via the bucket properties - see the image attached below. To make it work we need to create a bucket, upload our HTML files to it and make sure that all the files are public - use the make public action on all files, otherwise they won’t be accessible via HTTP. This blog is also stored in a git repo, from which it can be easily recreated, so the bucket is not the only copy I have.

Taking this into account, we can save some $ by using the reduced redundancy storage class - you can do it by selecting all files -> more -> change storage class. By doing that we decrease durability from 99.999999999% to 99.99%, which is still good enough for a site that can be regenerated at any time.
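The console clicks described above can also be scripted. Here is a minimal sketch in Python 3 with boto3, assuming the generated site sits in a local directory and that index.html is the index document. The function names upload_site and enable_website_hosting are made up for this example; the boto3 calls themselves (upload_file, put_bucket_website) are real.

```python
import mimetypes
import os


def upload_site(s3, bucket, site_dir):
    """Upload every file under site_dir to the bucket as a public object."""
    uploaded = []
    for root, _dirs, files in os.walk(site_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            # S3 keys always use forward slashes, whatever the local OS uses
            key = os.path.relpath(path, site_dir).replace(os.sep, "/")
            content_type = mimetypes.guess_type(path)[0] or "binary/octet-stream"
            s3.upload_file(
                path, bucket, key,
                ExtraArgs={
                    "ACL": "public-read",                  # the "make public" action
                    "StorageClass": "REDUCED_REDUNDANCY",  # cheaper, less durable
                    "ContentType": content_type,
                },
            )
            uploaded.append(key)
    return uploaded


def enable_website_hosting(s3, bucket):
    """Turn on static website hosting (index.html is an assumed file name)."""
    s3.put_bucket_website(
        Bucket=bucket,
        WebsiteConfiguration={"IndexDocument": {"Suffix": "index.html"}},
    )


# Usage, with AWS credentials configured locally:
#   import boto3
#   client = boto3.client("s3")
#   enable_website_hosting(client, "pawelpikula.pl")
#   upload_site(client, "pawelpikula.pl", "_site")
```

Passing the client in as an argument keeps the functions easy to try out with a stub before touching a real bucket. Depending on the bucket settings, public ACLs may be blocked, in which case a bucket policy would be needed instead.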

bucket props

There is one important thing which is not obvious - if you want to attach your domain, like pawelpikula.pl in my case, the bucket HAS to have the same name as the domain. Otherwise you won’t be able to set an alias to your apex domain in Route 53! You can see my domain configuration below. As you can see, we just point to the s3-website.eu-central-1... service there - the name of our bucket is not present, it is going to be inferred from our domain.
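For readers who prefer code to the console, the alias record can be expressed as a Route 53 change batch. This is only a sketch: the helper name alias_change_batch is invented, and both the website endpoint name and its hosted zone ID are left as parameters because they depend on the region and have to be looked up in the AWS documentation.

```python
def alias_change_batch(domain, s3_website_endpoint, endpoint_zone_id):
    """Build a change batch that aliases an apex domain to S3 website hosting."""
    return {
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": domain,
                "Type": "A",
                "AliasTarget": {
                    # hosted zone of the regional S3 website endpoint,
                    # NOT the hosted zone of your own domain
                    "HostedZoneId": endpoint_zone_id,
                    # only the regional endpoint: the bucket is inferred
                    # from the domain name, so it does not appear here
                    "DNSName": s3_website_endpoint,
                    "EvaluateTargetHealth": False,
                },
            },
        }]
    }


# Usage, with AWS credentials configured locally:
#   import boto3
#   batch = alias_change_batch("pawelpikula.pl", "<regional endpoint>", "<its zone id>")
#   boto3.client("route53").change_resource_record_sets(
#       HostedZoneId="<hosted zone id of your domain>", ChangeBatch=batch)
```

Note that the bucket name appears nowhere in the record, which is exactly why the bucket has to be named after the domain.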

domain configuration

Contact form and API Gateway

Now the fun part begins! To make my life a bit harder I decided to use the classic way of handling a web form instead of an asynchronous HTTP request with JSON.

<form style="width: 100%;" action="https://nwzca4tmbd.execute-api.eu-central-1.amazonaws.com/prod/message" method="post">
  <input type="email" name="email" placeholder="Your email"/><br>
  <textarea rows="14" name="body" placeholder="Your message"></textarea><br/>
  <div style="clear: both; display: inline-block; width: 100%;">
  <div class="g-recaptcha" data-sitekey="6Le8RhoUAAAAAOWLjmPSmf-5KF7XOMs1yOgD-Uqk" style="float: left;" data-callback="enableBtn"></div>
   <div style="float: left;">
    <input id="send" type="submit" value="Send" disabled="true" />
   </div>
  </div>
</form>

As you can see - nothing extraordinary: we have two fields (the sender’s email and the message) and a Google captcha element. We are using the POST method to call https://nwzca4tmbd.execute-api.eu-central-1.amazonaws.com/prod/message, which points to one of the AWS services - it is an API Gateway endpoint.

API Gateway is a very neat service: it allows you to create HTTP interfaces with authentication, encryption, multiple versions etc. API Gateway is able to generate ready-to-use client libraries for accessing the created API - JS, Android, iOS. It is also worth mentioning that API GW supports access/secret keys, and once you create your API you are able to sell access to it on AWS Marketplace.

API Gateway

On the other side there are plenty of integrations - the API can call another API, for example a private API inside our VPC, it can put something in an S3 bucket or just call a Lambda function. We are going to leverage the last option. API Gateway is responsible for decoding the HTTP input and transforming it into a “lambda readable” format and, of course, when the lambda returns a value it is API GW’s job to transform it into a proper HTTP response.

In our case we decode the POST body and, if the lambda successfully validates the params, we want to redirect to the main page, but there are a couple of gotchas here.

First of all, decoding plain old POST form params is not as simple as it would be for an application/json request. We need special code to map the POST body to JSON (accepted by lambda). You need to click on the Integration Request box and add the following script in body mapping templates for the application/x-www-form-urlencoded content type:

{
    "data": {
        #foreach( $token in $input.path('$').split('&') )
            #set( $keyVal = $token.split('=') )
            #set( $keyValSize = $keyVal.size() )
            #if( $keyValSize >= 1 )
                #set( $key = $util.urlDecode($keyVal[0]) )
                #if( $keyValSize >= 2 )
                    #set( $val = $util.urlDecode($keyVal[1]) )
                #else
                    #set( $val = '' )
                #end
                "$key": "$val"#if($foreach.hasNext),#end
            #end
        #end
    }
}
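
To make it clearer what this template produces, here is roughly the same transformation written in Python 3 with the standard library. It is only an illustration of the resulting shape (the function name form_body_to_event is made up), not something that runs inside API Gateway.

```python
from urllib.parse import parse_qsl


def form_body_to_event(raw_body):
    """Turn 'a=1&b=2' into the {"data": {...}} shape the Lambda handler reads."""
    # keep_blank_values mirrors the template's fallback to an empty string
    pairs = parse_qsl(raw_body, keep_blank_values=True)
    return {"data": dict(pairs)}


event = form_body_to_event("email=me%40example.com&body=Hello+there")
print(event)  # {'data': {'email': 'me@example.com', 'body': 'Hello there'}}
```

The handler can then read event["data"]["email"] and event["data"]["body"] as plain strings.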

The next issue I had with the lambda integration is that there is no explicit way to set the HTTP status code from lambda. We just map and transform every response to a certain HTTP code - there is no way to manipulate it based on the returned value. However, if lambda throws an exception, we can match on the exception type and return a different status. I used this to generate the redirect to the main page. We will see it in the next section.
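The idea can be modelled in a few lines of plain Python. This is a rough sketch of the concept, not API Gateway’s actual implementation. It assumes that the integration response is selected by matching a regular expression against the error message of the raised exception and that a normal return falls through to the default response. The patterns, the 400 mapping and the function name select_status are invented for illustration.

```python
import re

# Hypothetical integration responses: regex on the Lambda error message -> status
RESPONSES = [
    (r"https?://.*", 302),  # a Redirect whose message is the target URL
    (r".*invalid.*", 400),
]
DEFAULT_STATUS = 200        # used when Lambda returns normally


def select_status(error_message=None):
    """Pick an HTTP status and headers from an optional Lambda error message."""
    if error_message is None:
        return DEFAULT_STATUS, {}
    for pattern, status in RESPONSES:
        # the whole message has to match, hence the leading/trailing ".*"
        if re.fullmatch(pattern, error_message):
            headers = {"Location": error_message} if status == 302 else {}
            return status, headers
    return DEFAULT_STATUS, {}
```

With this model, an exception carrying a URL as its message ends up as a 302 with a Location header, while a plain return value keeps the default 200.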

Lambda and SNS

There are a couple of runtimes available: Java, C#, Node.js and Python. I decided to pick Python. The best-documented choice is Node.js, as most of the examples you can find are written using node. Python is also good, because warming up takes less time than in the case of C# or Java; however, in our case and at our scale it doesn’t really matter. The code I’m using is pasted below.

import recaptcha2
import json
import boto3

# entry point - executed when API is called
def lambda_handler(event, context):
    # extract fields
    data = event["data"]
    captcha = data["g-recaptcha-response"]
    mail = data["email"]
    message = data["body"]

    if validate_captcha(captcha):
        send_notification(mail, message)
        raise Redirect("https://pawelpikula.pl")
    else:
        return "ok"

# use the lib to validate captcha, ofc I replaced my secret with xxxxx...
def validate_captcha(captcha_hash):
    secret = "xxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    result = recaptcha2.verify(secret, captcha_hash)
    return result['success']

# construct a message and push it through SNS
def send_notification(email, body):
    message = email + " has sent: " + body
    client = boto3.client('sns')
    arn = "arn:aws:sns:eu-central-1:470898574782:BlogContactNotification"
    response = client.publish(
        TargetArn=arn,
        Message=json.dumps({'default': json.dumps(message)}),
        MessageStructure='json'
    )
    print(response)

# We need this class to be able to generate redirect to main page
class Redirect(Exception):
    pass

The lambda_handler is the entry point to this application. It just validates the captcha and constructs an SNS message that is published to the BlogContactNotification SNS topic, from which I get emails.

To access AWS services I’m using the boto3 library, which is available by default in the Python lambda runtime. You need to make sure that your lambda role has access to the services used - in my case SNS. However, I’m also using the recaptcha2 lib, which is not present there. There is no way to specify dependencies for a lambda in, let’s say, a separate file. To have extra dependencies you need to bundle them in your code package. To do that in Python I used the following command, which installs the packages into the specified directory.

pip install recaptcha2 -t lambda/
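
Once the dependencies are installed next to the handler, the directory has to be zipped into the code package. A minimal sketch using only the Python standard library, assuming the handler file also lives in lambda/ (the function name build_package and the archive name are made up):

```python
import shutil


def build_package(source_dir="lambda", archive_name="lambda_package"):
    """Zip the contents of source_dir; returns the path of the created .zip."""
    # root_dir makes the archive contain the files themselves, not the folder,
    # so the handler module ends up at the top level of the zip
    return shutil.make_archive(archive_name, "zip", root_dir=source_dir)
```

Calling build_package() creates lambda_package.zip, which can then be uploaded as the function code.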

Improvements and security concerns

There are several things that can be done better and in a more secure way.

First of all, the captcha keys are stored in plain text in the code - see the “secret” variable in the validate_captcha function. We can decouple it a bit: it is possible to pass constants to lambda via ENV variables. Moreover, these variables can be encrypted at rest and in transit using AWS KMS. A similar approach can be found in various public CI/CD pipelines, where credentials to external services are passed in a similar fashion.
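A minimal sketch of the environment variable approach, assuming the secret is configured on the function under the name RECAPTCHA_SECRET (a name invented for this example). KMS decryption is left out to keep the sketch short.

```python
import os


def get_captcha_secret(environ=os.environ):
    """Read the captcha secret from the environment instead of the source code."""
    secret = environ.get("RECAPTCHA_SECRET")  # hypothetical variable name
    if not secret:
        # fail loudly: a missing secret should never silently skip validation
        raise RuntimeError("RECAPTCHA_SECRET is not set")
    return secret
```

In validate_captcha the hardcoded string could then be replaced with secret = get_captcha_secret().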

The current contact form has some flaws. Especially from the end user’s perspective, there is no clear indication that the message was sent. We only have a redirect, which tells the user nothing. A better approach would be to use an AJAX request instead of the “redirect POST”. It would simplify our API GW config, as accepting JSON is straightforward, and API Gateway is actually able to generate JavaScript code, so calling the API using the generated lib is trivial.

I used SNS for sending emails, but there is a dedicated service for sending mail - SES, Simple Email Service. I used SNS because SES is not available in eu-central-1, the region in which I started my journey. It would be a good exercise to move the existing setup to a different region. We can leverage CloudFormation to be able to spin up the whole setup in any region within a matter of minutes, without any manual work. Speaking of manual work, I’ll show how to automate the whole process.

Stay tuned!

If you want to get notified when new posts are published, please follow me on Twitter.
