From the DC-Area API Meetup: How To Build A Scalable API on AWS in 10 Minutes

As a part of ProgrammableWeb's ongoing series of on-demand re-broadcasts of presentations given at the monthly Washington, DC-Area API meetup (anyone can attend), this article offers a video recording, an audio-only podcast, and a full transcript of the Dec 5, 2019 talk given by calltime.ai's Adam Becker. Becker shows attendees how he and his partner collaborated to stand up an API endpoint in almost no time using three Amazon Web Services (AWS) offerings: S3, Lambda, and API Gateway. In the video, he makes some important points about taking a DevOps-based approach to provisioning services; namely, that all infrastructure should be provisioned with code written specifically for that purpose, as opposed to manually going into the AWS console and fiddling with the various forms and parameters. Becker even shares the code so that anyone else can provision an API endpoint in exactly the same way he does.

The DC-Area API Meetup almost always takes place on the first Tuesday of every month. The attendees consist of API enthusiasts and practitioners from all around the federal government as well as businesses and organizations that are local to the DC Metro area. There is no charge to attend and attendees get free pizza and beer, compliments of the sponsors. The meetup is always looking for great speakers and sustaining sponsors. If you're interested in either opportunity, please contact David Berlind at David.Berlind@programmableweb.com. If you're interested in attending, just visit the meetup page and RSVP to one of the upcoming meetups. It's that simple.

How To Build A Scalable API on AWS in 10 Minutes

Editor's Note: This and other original video content (interviews, demos, etc.) from ProgrammableWeb can also be found on ProgrammableWeb's YouTube Channel.

Audio-Only Version

Editor's note: ProgrammableWeb has started a podcast called ProgrammableWeb's Developers Rock Podcast. To subscribe to the podcast with an iPhone, go to ProgrammableWeb's iTunes channel. To subscribe via Google Play Music, go to our Google Play Music channel. Or point your podcatcher to our SoundCloud RSS feed or tune into our station on SoundCloud.


Transcript: How To Build A Scalable API on AWS in 10 Minutes

Adam Becker: Hi, everyone. I want to show you in 10 minutes how you can put together a scalable API on AWS without having to touch the AWS console even one time. Now, we might touch it a couple of times just to make sure that everything's going well, but you don't have to. You should follow along, or at least take a picture of this screen, so that you have access to this Medium post that we put up. That way you can copy and paste the code blocks and use them yourself in your own programs.

A couple of weeks ago, my team and I decided that it's about time to put together a button to allow our users to flag whenever they see data that looks wrong, or that looks inaccurate. Now, the data that we're dealing with is extremely important for our users because our users are Democratic candidates spanning the ballot from school board all the way up to the presidency and they're using the data that we're sharing with them to make decisions about the way that they strategize for their fundraising efforts. Whenever the data that they see is wrong or potentially wrong or suspicious, it's important that they flag it for us and file some report so that it can then go back into our system and improve the way we're building things.

So, this was the sketch that I put together. It's a very rough draft: something unusual with the data, people are supposed to click it, and then they flag whatever it is that looks wrong and it needs to go into our system. So, I went up to my co-founder on Slack and I said, "Hey, Dustin, here is a take on the data accuracy button." This is the visual of it, and this is the architecture. I said, "When they submit a response, it'd be great to have it write their response, including the campaign ID, to S3. And, if we share a bucket, or if you give me permissions to a bucket, say, that you create, I'll just hook it up to the rest of our flow myself."

This is what the flow looks like. We've got a user submitting some report. This is the app backend. That app backend, in my mind, is then writing that message or that payload into S3. And then I have another instance, this is on the data engineering side, that is pulling in that request. We're seeing the report and we're sending it to an engineer. Now the engineer that is working on this, this person, her name is Emily and Emily's right here. Emily, you can say hello. She's right there.

Okay, so this is the infrastructure that I had in mind and it isn't a good infrastructure. And very quickly Dustin let me know that. He said, "Can you just create an endpoint?" He says "can," but in my mind, it's "Can't you just create an endpoint that I can post the data to? That's much cleaner than sharing an S3 bucket, infrastructure-wise." So, what he had in mind is something that is much cleaner, which is basically this. This is the app backend. It's making some requests, like POST requests, to this API gateway that is then triggering an AWS Lambda function that is writing it to S3 and can then send it to Emily.

So, I put that ... I sort of thought about that, I was like "Yeah, this is very clearly the right way to go." I said, "Good idea, here's an endpoint." And it took me about 10 minutes to put this together and the reason it took me 10 minutes is not because I'm an excellent programmer, but because we're using very good tools to do it. So, I want to share those tools with you and I want to actually replicate the exact steps that I had taken to put together this entire workflow, where you're just exposing an endpoint and anybody can post data to it and that data then goes and is written directly into an S3 bucket.

So, before I do that, I want to see just by quick show of hands, how many people here are using AWS? Of those people who are using AWS, keep your hands up, how many are using CloudFormation? And how many are using Terraform, which is a layer then on top of that? And how many people are using an even more abstracted layer like Pulumi to do it, where you can then script all of your resource deployment using your favorite language like Go, JavaScript, Python? Okay, so it's that higher level of abstraction that I'll be showing you how you can utilize.

Just as background, you can go back to that Medium post and see a little more about that. Infrastructure-as-code is sort of industry standard at this point. The idea is that you shouldn't be going around AWS and clicking around while you're provisioning different resources and then gluing them together and then having to remember what it is that you did in the beginning and then replicating those exact same steps for every different stack: for your dev, for your staging, for your production. You shouldn't be doing all of that, which is all the stuff that I had been doing for a very long time.

Instead, you should use some tool that is exposing access to all of these resources that you see on the cloud, but using Python or JavaScript or another language, and then construct an entire infrastructure on the cloud and tear it down just as quickly. And you could do that programmatically and dynamically, and there are many, many benefits to doing it this way. So, the overarching framework looks sort of like this. We have something like a model.js and I'm just defining this variable, ec2. This is a new EC2, and you're just calling it some name and feeding certain parameters to it. So this is what we'll be doing. We'll be creating an EC2 instance. Well, not in this case, but this is the idea of what you should be doing.

In our case, this is the exercise. We have an API gateway, we have a Lambda function, and then we have an S3 bucket. This is what we'll be doing. We'll be building a certain resource, S3, and then building Lambda and then building an API. So let's do that. You should click this link if you haven't yet, but so few of you have laptops, so that's not a concern. I'll be clicking this link and this is what it is. And insofar as we have the time to do it, I actually want us to just run through this entire setup and we'll do that together. So the first thing you've got to do is download Pulumi. Pulumi is that higher-level abstraction. I had already done that and you can see it with pulumi version. Is this large enough?

Audience member: Bigger? Yeah, that's where it's at.

Adam: All right. Okay, I have Pulumi. Next thing is Node. Do I have Node? I have Node. Okay, cool. Otherwise, just brew install it or download Node, and I put the link to it. The second thing is to configure your AWS. If you don't already have an AWS account, you should create one and then remember to turn it off, because otherwise you're going to be charged for all that stuff. The easiest way to tell Pulumi where to find your AWS credentials is to just use the AWS CLI and configure your own profile. So the way to do that is you install the AWS CLI and then you run aws configure and it has all of your stuff there. So now Pulumi knows where to look for your AWS credentials. Cool. Now let's create a Pulumi project. So we run pulumi new aws-javascript. So we're going to be using the Node.js version, and let's create a directory first.

We'll call it api-hangout. Let's go there. Good. All right, so now let's come up with a project name. Let's say 'api-hangout' — that works, description is fine. The stack is 'dev'. Stack is just a collection of different resources that you've put together that have a different configuration from one another. So for example, 'dev' would be one stack and 'staging' would be another stack. Let's just start with 'dev' for now and we'll deploy it to us-west-2. Now it's just going to download a bunch of stuff. But one of the things we can do is probably open this project and see if some of this stuff is already there. What did we call it? api-hangout.

Audience member: Yeah.

Adam: All right, let's see what we got. So I'm guessing and hoping that what it's doing while it's thinking is just downloading more and more of these node modules and we still have access to play with some of these things, but can you see this or is this too small?

Audience member: Blow it up.

Adam: Let's see how to do that. View. Maybe enter presentation mode? Nope. What happened?

Audience member: That's better.

Adam: Yeah. Okay. So now I need to see where everything is. Ah, here it is. You can see it better. Okay, cool. So we got a couple of files for free. The first one is just the YAML file for the Pulumi project. It just defines the name, which is all we have to worry about. The next thing is the Pulumi.dev.yaml. This is basically where you write out all of your configuration. So this can be like your .env file, where you keep all of your environment variables. We don't have to worry about that either. The main thing we're going to be dealing with is this index.js. This is where things start; this is what gets exposed to the Pulumi project. As soon as you do pulumi up, it just reads this index.js, and this is sort of what gets exposed to the rest of the program.

So one of the first things that it does is it just loads all of these different Pulumi libraries, and you can already see that it started out by giving us a free bucket. This is const bucket = new aws.s3.Bucket, and this is the name that Pulumi is reading, and so every resource gets its own name. This is not going to be the name of the bucket, but this is the name of the bucket resource in the Pulumi world. We can get a better sense for that I think here. Yeah, we have new and then the resource and then the name of the resource, and then one of those parameters is going to be maybe the bucket name. Okay. Let's see if we're ready. Okay, we're ready. So now let's do pulumi up and it's going to show us what exactly will be deployed. Okay. Do you want to create these two things? Which is the stack and the S3 bucket. Let's look at the details. Some of this might be familiar if you've been working with CloudFormation. Yeah, let's do it. So do you want to perform this update? Yes. And now what we should do is go to S3 and see that it's actually worked.

All right. This is the bucket name. Let's see that we actually have it. Yes, it's right here. All right, nice. So we've just managed to create an S3 bucket without having to touch the console. Let's march through this a little bit more quickly now. The next thing we want to do is change the name of this bucket from my-bucket-[8327740] to something that's a little bit more explicit. So let's just destroy the stack. So we do pulumi destroy, and now it's just going to remove that bucket and we're going to rename it. The way we're going to rename it is by making reference to the actual stack, so that the name of the bucket that we're going to see on S3 has some reference to the stack. So it's 'dev' or 'prod' or 'staging'. So, let's do that. Instead of this, we'll do this, and api-hangout. Let's say api-hangout-bucket. All right, let's run this and we should see it having been created.
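
Editor's note: The renaming step can be sketched roughly as follows. This is a minimal illustration rather than the code shown in the talk. `bucketNameFor` is a hypothetical helper, and the commented lines assume the `@pulumi/pulumi` and `@pulumi/aws` packages, where `pulumi.getStack()` returns the current stack name and the `bucket` argument sets the physical bucket name. Without that argument, Pulumi appends a random suffix to the resource name, which is where a name like my-bucket-[8327740] comes from.

```javascript
// Hypothetical helper: derive the physical bucket name from a base name and
// the stack, so that each stack (dev, staging, prod) gets its own bucket.
function bucketNameFor(baseName, stack) {
  return `${baseName}-${stack}`;
}

// In a Pulumi program (assumes @pulumi/pulumi and @pulumi/aws are installed):
//   const pulumi = require("@pulumi/pulumi");
//   const aws = require("@pulumi/aws");
//   const bucket = new aws.s3.Bucket("api-hangout-bucket", {
//     bucket: bucketNameFor("api-hangout-bucket", pulumi.getStack()),
//   });

console.log(bucketNameFor("api-hangout-bucket", "dev")); // api-hangout-bucket-dev
```

The first constructor argument is the resource's name inside Pulumi; only the `bucket` property controls what shows up in S3.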

Let's see, this is going to be the name, api-hangout-bucket... Live demo is a risky business.

Speaker 3: Yes!

Adam: We're taking those risks. Let's see it. Let's refresh. Okay, nice. Here it is: api-hangout-bucket-dev. Cool. Next thing we want to do is create the Lambda function, so this is the code for the Lambda function. It's just new aws.lambda.CallbackFunction. Feed in the name, and then we feed in the parameters. If you don't know what the parameters are, it's very easy to just click on the documentation. It gives you all the different parameters that you can feed in. So let's do that Lambda. Okay.

Now, notice that there are a couple of interesting fields here. One is role and the other one is callback. So let's call this, let's say api-hangout-payloads-api-meetup-lambda. The name of this should be maybe api-hangout, and we need a role and we need a Lambda function. So, a role just allows Lambda to interact with different resources. We can create one with the exact same template, right? It's just new something, something.Role. And then we name the role and then we feed in the parameters. Let's do that. I just created one: new aws.iam.Role, the name of the role, and then the assume-role policy, and you see there's basically nothing happening in that policy. It's a very raw policy. Okay, let's create role payloads. We'll call this role-api-hangout. This is going to be the role that Lambda takes. And now Lambda also needs a function. What is the function that Lambda will be invoking?
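
Editor's note: The "very raw" policy attached to the role is a trust policy that lets the Lambda service assume it. The following is a minimal sketch of what such a document typically looks like, not necessarily the exact one from the talk. `lambdaAssumeRolePolicy` is a hypothetical helper name, and the commented lines assume the `@pulumi/aws` package, whose `aws.iam.Role` takes the document through its `assumeRolePolicy` argument.

```javascript
// Trust policy that lets the AWS Lambda service assume the role.
// It grants no permissions on other resources by itself.
function lambdaAssumeRolePolicy() {
  return JSON.stringify({
    Version: "2012-10-17",
    Statement: [
      {
        Action: "sts:AssumeRole",
        Principal: { Service: "lambda.amazonaws.com" },
        Effect: "Allow",
      },
    ],
  });
}

// In a Pulumi program (assumes @pulumi/aws is installed):
//   const role = new aws.iam.Role("role-api-hangout", {
//     assumeRolePolicy: lambdaAssumeRolePolicy(),
//   });

console.log(lambdaAssumeRolePolicy());
```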

We can start with a very basic function, right? This function basically does nothing, right? So, it's just an async function. There's an event parameter, an argument that is fed in, and it just returns a status code of 200 with a body of success. Let's add a bit more flesh to it. Let's decode the actual body of the event. So you can see here that whenever we're going to be invoking this Lambda, the event.body is going to have the payload itself. And so we have to turn that into a buffer and then convert it to a string. So now we have access to the payload and we want to write that payload to S3, right? This is what the goal is: to take the payload into Lambda and to feed that into S3. This is the way you feed things into S3 using promises and await, and I'll also show you how we do ES6 so it doesn't yell at us, and we're almost done. Okay, cool. So now we put the payload into S3 and we have to specify what the putParams are, right? So we have to specify what the key is and which bucket we're feeding it into. The key is just the name of the file. We could just pick the timestamp of when it was loaded. The body will be the payload, and then the bucket.

Notice this is interesting. Now it's expecting an environment variable that has the name of the bucket. We haven't fed that in anywhere yet. So, this is one of the next things that we have to do when we create this function. We have to make sure that we're also giving it the name of the bucket. So let's do that here: the callback, and the environment field with its variables. And notice, this is cool. The variable that we're feeding in is S3 bucket and that is just the bucket object that we created earlier, bucket.id. So now, we're building all these interdependencies into this function. So, that's kind of cool. What else do we need to do? Ah, we don't have access. We are calling S3 but that is not defined yet. So let's just create an entry here and we're just requiring the AWS SDK.
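
Editor's note: The handler logic described here can be sketched as below. This is an approximation under stated assumptions, not the code from the talk. The helper names (`decodePayload`, `buildPutParams`, `makeHandler`) and the `.json` key suffix are hypothetical, the body is assumed to be base64-encoded only when `event.isBase64Encoded` is set, and `s3` is assumed to be any client exposing `putObject(params).promise()`, as `new AWS.S3()` from version 2 of the `aws-sdk` package does. A stub client stands in for S3 so the sketch runs locally.

```javascript
// Turn the request body into a buffer, then into a string.
function decodePayload(event) {
  const encoding = event.isBase64Encoded ? "base64" : "utf8";
  return Buffer.from(event.body, encoding).toString("utf8");
}

// Build the putObject parameters: the key is a timestamp, the body is the payload.
function buildPutParams(payload, bucket, now = new Date()) {
  return { Bucket: bucket, Key: `${now.toISOString()}.json`, Body: payload };
}

// Create the Lambda callback. In the deployed function the bucket name would
// come from an environment variable, for example process.env.S3_BUCKET, set via
//   environment: { variables: { S3_BUCKET: bucket.id } }
function makeHandler(s3, bucket) {
  return async (event) => {
    const payload = decodePayload(event);
    await s3.putObject(buildPutParams(payload, bucket)).promise();
    return { statusCode: 200, body: "success" };
  };
}

// Local dry run with a stub in place of the real S3 client.
const written = [];
const stubS3 = {
  putObject: (params) => ({
    promise: async () => {
      written.push(params);
    },
  }),
};
const handler = makeHandler(stubS3, "api-hangout-bucket-dev");
handler({ body: '{"meetupname":"API meetup"}', isBase64Encoded: false }).then(
  (res) => console.log(res.statusCode, written[0].Key, written[0].Body)
);
```

Passing the client and bucket name in as arguments keeps the logic testable without AWS credentials; the deployed version would construct the client inside the callback instead.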

That should be it. So let's just run this and see what happens.

Ah! Unexpected token, index 69. I knew something was up here. All right, let's try now. Okay, so do we want to create this role and this function? Yes. Okay. It's creating. And we should test it. Let's go to Lambda and the name of this function is going to be api-hangout-lambda. Let's see if it was created successfully. Okay, nice. It's created. Let's refresh the console.

All right, it's right here. Here we go. Boom. You can see it. And this is our code. So at least it deployed to Lambda nicely. The only problem now is that this Lambda is fairly isolated. We can't even trigger it with an HTTP request. And so you might remember this original image here. There's nothing that is interacting with this Lambda, so we need to put it behind an API gateway. So let's create the API gateway. And as you can probably guess already, it probably looks like new [awsx.]apigateway.API with the name of the API gateway and a bunch of parameters. This is it. Let API gateway = this whole thing. And we're very close to finishing. Let's call this [payloads.]api-hangout-api-gateway and now we need to pick a path. So let's just call it, yeah, posting to S3. The method is a POST. This is the route, and the event handler will trigger this Lambda, the one we just created. Nice. Let's pulumi up and as soon as this is done, we're going to notice something interesting about this Lambda... Load.
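
Editor's note: A rough sketch of the route declaration follows. It assumes the `awsx.apigateway.API` component from the `@pulumi/awsx` package as it existed around the time of the talk; where that class lives may differ between awsx releases. The path `/post-to-s3` and the helper name `defineApi` are hypothetical. The library is passed in as a parameter so that the sketch can be exercised with a stub instead of a real deployment.

```javascript
// Declare an API gateway with a single POST route that triggers the Lambda.
function defineApi(awsx, lambda) {
  return new awsx.apigateway.API("api-hangout-api-gateway", {
    routes: [{ path: "/post-to-s3", method: "POST", eventHandler: lambda }],
  });
}

// Stub standing in for @pulumi/awsx; it only records what was declared.
class FakeApi {
  constructor(name, args) {
    this.name = name;
    this.args = args;
  }
}
const fakeAwsx = { apigateway: { API: FakeApi } };

const api = defineApi(fakeAwsx, "lambda-placeholder");
console.log(api.name, api.args.routes[0].method, api.args.routes[0].path);
```

In a real program the second argument would be the `aws.lambda.CallbackFunction` created earlier rather than a placeholder string.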

Ah! You can see that now it can be triggered by an API gateway. So this is very nice. Now I actually have a URL that I can copy into, say, Postman and then just post whatever payload to it. Now notice that while it can be triggered by the API gateway, it can't actually interact with any other resource. The reason it can't do it is because we haven't given that role the permission to interact with S3. So let's just do that and then we'll be done. For that we have to create a policy first. And you can see this is a fairly restrictive policy. The only thing that we're allowing in this policy is to put an object into S3. And what it takes in is one of its arguments is the bucket. So this is the bucket we had created a bit earlier. So let's just do that.
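
Editor's note: A restrictive policy of the kind described here, allowing only object writes into one bucket, might look like the sketch below. The helper name `putObjectPolicy` is hypothetical and the exact statement used in the talk is not shown in the transcript. The commented lines assume the `@pulumi/aws` package, where `bucket.arn` is an output value that has to be unwrapped with `apply` before it can be placed into a JSON string.

```javascript
// Policy document that allows putting objects into a single bucket and nothing else.
function putObjectPolicy(bucketArn) {
  return JSON.stringify({
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Action: ["s3:PutObject"],
        Resource: [`${bucketArn}/*`],
      },
    ],
  });
}

// In a Pulumi program (assumes @pulumi/aws is installed):
//   const policy = new aws.iam.Policy("api-hangout-post-to-S3-policy", {
//     policy: bucket.arn.apply(putObjectPolicy),
//   });

console.log(putObjectPolicy("arn:aws:s3:::api-hangout-bucket-dev"));
```

Building the document with `JSON.stringify` rather than a hand-written string avoids syntax slips in the JSON.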

Okay, so we have the policy, we have a Lambda role. We haven't attached one to the other. We have to do that as well. So you can guess it's new something, something RolePolicyAttachment. We feed in some arguments. The arguments are the policy ARN and the role. Okay. Let's call this api-hangout-post-to-S3-policy. And let's call this api-hangout-post-to-S3-policy-attachment. So this is the attachment. We're attaching the policy to the role. Let's do pulumi up. Ah, there's a problem. Looking for the beginning of object key, string. Policy contains an invalid JSON. What is going on here? Can somebody figure this out? Let's do this again. I have a feeling that it has something to do with...

Does anybody know? Can anybody see this? Nope. Okay, let's try now. Ah! Policy contains an invalid JSON. What do you do in this situation? What's that? Yeah. So this one is the string that gets applied as soon as you do bucket.arn. Does this look like valid JSON? This is the right meetup to figure that out. Is this valid JSON?

Audience member: Can you leave it up there so we can look at it?

Adam: Ah, okay. This one tells us that it isn't a valid JSON. What if we remove this? Okay. It didn't like this. You can't add comments to JSON. Good. Let's see if this was it. Yes!

Audience member: Yeah!
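
Editor's note: The failure in the demo comes down to the JSON format having no comment syntax. The short sketch below reproduces the effect with Node's built-in `JSON.parse`; the policy fragment and the `isValidJson` helper are illustrative only.

```javascript
// Returns true when the text parses as JSON, false otherwise.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

const withComment = `{
  // only allow writing objects
  "Version": "2012-10-17"
}`;

const withoutComment = `{
  "Version": "2012-10-17"
}`;

console.log(isValidJson(withComment)); // false
console.log(isValidJson(withoutComment)); // true
```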

Adam: Okay. All right, cool. So let's go back to the Lambda and see whether it has the right role that has the right policy to interact with Amazon S3 and it looks like it does. Okay, cool. So unless we have some application error, this at least is infrastructure wise able to take in some payload and feed that into S3. Moment of truth. Let's figure that out. The name of our bucket was api-hangout-bucket-dev. Let's go in.

We got nothing here so far. And what is the URL? It's this, and it's a POST request. And what's the endpoint? Do you remember? We had created an endpoint when we created the API gateway posting to S3...posting to S3. All right, body, let's do 'meetupname : API meetup'. Okay. And let's see if this works.

It's loading. The first time you run Lambda, it might take a second because it's still a cold container. It has to wait until it warms up. But now that it's warm, you could trigger this thousands of times concurrently. This is one of the benefits of this. This just scales horizontally as much as you wish. Let's see if it actually got logged into here. All right. It's here. No...? There you go. API meetup. That was five minutes. How long was that roughly? Okay. Do people know what AWS Lambda is? I realize I probably should have asked that question before we started.

Speaker: I've heard of it. (Laughter)

Adam: Okay. Yeah. Should I do it? Should I, I'm going to start over. So one of the benefits of running your code on the cloud is that you don't have to keep running a server for ... just to get some, let's say we have a function that gets triggered once a month for example. There is no reason to just deploy a server and pay for it to just continue to run all of the time. So instead you could just take a block of code and then just put it somewhere in AWS and ask AWS to just provide you a URL, like we did just now with the API gateway and then AWS basically gives you a URL that you can use to just trigger that function whenever you want to call it. And so you're not actually paying for any resources while it's just sitting there idly.

The other thing is that you could just keep calling a bunch of these Lambdas and run them all concurrently. You could see here, if you go to Lambda and scroll a little bit down. This is our S3 bucket, by the way. You see we fed that in as the environment variable. Use unreserved account concurrency. I can run this 6,000 times concurrently. It's going to be very difficult to scale out an EC2 instance that does that concurrently, for example. And you could even get that expanded if you just send them an email. I think by default it's like 2,000 and then I asked them to increase it and they're like, fine, so it is now 6,000. That's what AWS Lambda does.

Be sure to read the next DC-Area API Meetup article: What is an API and Why does the API Contract Matter So Much?