All posts filed under “Security & Privacy

Security and Privacy topics

comment 0

Lambda functions in a VPC

In my honest (and truly humble) opinion, VPCs don’t make much sense in a serverless architecture — it’s not that they don’t add value, it’s that the value the add isn’t worth the complexity you incur.

After all, you can’t log into a lambda function, there are no inward connections allowed. And it isn’t a persistent environment, some functions may timeout after just 2-3 seconds. Sure, network level security is still worthy pursuit, but for serverless, tightly managing IAM roles and looking after your software supply chain for vulnerabilities would be better value for your money.

But if you’ve got a fleet of EC2s already deployed in a VPC, and your Lambda function needs access them. Then you have no choice but to deploy that function in a VPC as well. Or, if your org requires full network logging of all your workloads, then you’ll also need VPC (and their flow logs) to comply with such requests.

Don’t get me wrong, there is value in having your functions in a VPC, just probably not as much as you think.

Put that aside though, let’s dive into the wonderful world of Lambda functions and VPCs

Working Example

First, imagine we deploy a simple VPC with 4 subnets.

  1. A Public Subnet with a Nat Gateway inside it.
  2. A Private Subnet which routes all traffic through that NAT Gateway
  3. A Private Subnet without internet (only local routing)
  4. A Private Subnet without internet but with a SSM VPCe inside it

Let’s label these subnets (1), (2) ,(3) and (4) for simplicity.

Now we write some Lambda functions, and deploy each of them to each subnet. The functions have an attached security group that allows all outgoing connections, and similarly each subnet has very liberal NACLs that allow incoming and outgoing connections.

Then we create a gateway S3 VPC-endpoint (VPCe), and route subnet (4) to it.

Finally, we enable private DNS on the entire VPC. And then outside the subnet we create a bucket and an System Manager Parameter Store Parameter (AWS really need better terms for these things).

The final network looks like this:

comment 0

Amazon KMS: Intro

Amazon KMS is one of the most integrated AWS services, but probably also the least understood. Most developers know about it, and what it can do, but never really fully realize the potential of the service. So here’s a rundown…

comment 0

Interactive Shell on a Lambda Function

One of a great things about Lambda functions is that you can’t SSH into it.

This sounds like a drawback, but actually it’s a great security benefit — you can’t hack what you can’t access. Although it’s rare to see SSH used as an entry path for attackers these days, it’s not uncommon to see organizations lose SSH keys every once in a while. So cutting down SSH access does limit the attack surface of the lambda — plus the fact, that the lambda doesn’t exist on a 24/7 server helps reduce that even further.

Your support engineers might still want to log onto a **server**, but in todays serverless paradigm, this is unnecessary. After all, logs no longer exists in /var/logs they’re on cloudwatch, and there is no need to change passwords or purge files because the lambdas recycle themselves after a while anyway. Leave those lambda functions alone will ya!

As a developer, you might want to see what is **in** the lambda function itself — like what binaries are available (and their versions), or what libraries and environment variables are set. For this, it’s far more effective to just log onto a lambci docker container — Amazon work very closely with lambci to ensure their container matches what’s available in a Lambda environment. Just run any of the following

  • docker run -ti lambci/lambda:build-python3.7 bash
  • docker run -ti lambci/lambda:build-python3.6 bash

Lambci provide a corresponding docker container for all AWS runtimes, they even provide a build image for each runtime, that comes prepackaged with tools like bash, gcc-c++, git and zip. This is the best way to explore a lambda function in interactive mode, and build lambda layers on.

But sometimes you’ll find yourself wanting to explore the actual lambda function you ran, like checking if the binary in the lambda layer was packaged correctly, or just seeing if a file was correctly downloaded into /tmp— local deploy has it’s limits, and that’s what this post is for.

comment 0

Securing Lambda Functions

First a definition.

A lambda function is a service provided by aws that runs code for you without the introducing the complexity of provisioning servers of managing Operating Systems. It belongs in a category of architectures called serverless architectures.

There’s a whole slew of folks trying to define with is serverless, but my favorite definition is this.

Serverless means No Server Ops

Joe Emison

They’re the final frontier of compute, where the idea is that developers just write code, while allowing AWS (or Google/MSFT) to take care of everything else. This includes H/W management, OS Patching, even application level maintenance like Webserver upgrades are not your problem anymore with serverless.

Nothing runs on fairy-dust though, serverless still has servers — but in this world those servers, their operating systems, and the underlying runtime (e.g. Python, Node, JVM) are fully managed services that you pay per use.

As a developer you write some code into a function. Upload that function to AWS — and now you can invoke this function over and over again without worrying about servers, operating systems or run-time.

But how does AWS achieve this?

Before we can understand how to secure a serverless function, we need to at least have a fair understanding of how Serverless functions (like AWS Lambda) work.

So how does a lambda function work?

comment 0

Thoughts on SingHealth Data Breach

On the 20th of July, Singaporean authorities announced a data breach affecting SingHealth, the country largest healthcare group. The breach impacted 1.5 million people who had used SingHealth services over the last 3 years.

Oh boy, another data breach with 1.5 million records … **yawn**.

But Singapore has less than 6 million people, so it’s a BIG deal to this island I currently call home. Here’s what happened.

The lowdown

According to the official Ministry announcement administrators discovered ‘unusual’ activity on one of their databases on 4-Jul, investigations confirmed the data breach a week later, and public announcement was made 10 days after confirmation.

4-Jul : IHiS’ database administrators detected unusual activity on one of SingHealth’s IT databases
10-Jul : Investigations confirmed the data breach, and all relevant authorities were informed
12-Jul : A Police Report is made
20-Jul : A public announcement is made

The official report states that “data was exfiltrated from 27 June 2018 to 4 July 2018…no further illegal exfiltration has been detected”.

The point of entry was ascertained to be “that the cyber attackers accessed the SingHealth IT system through an initial breach on a particular front-end workstation. They subsequently managed to obtain privileged account credentials to gain privileged access to the database”

And finally that “SingHealth will be progressively contacting all patients…to notify them if their data had been illegally exfiltrated. All the patients, whether or not their data were compromised, will receive an SMS notification over the next five days”

comment 0

Security Headers for Gov-TLS-Audit

Gov-TLS-Audit got a brand new domain today. No longer is it sharing a crummy domain with sayakenahack (which is still blocked in Malaysia!), it now has a place to call it’s own.

The domain cost me a whooping $18.00/yr on AWS, and involved a couple hours of registration and migration.

So I felt that while migrating domains, I might as well implement proper security headers as well. Security Headers are HTTP Headers that instruct the browser to deny or allow certain things, the idea being the more information the site tells the browser about itself, the less susceptible it is to attack.

I was shocked to find out that Gov-TLS-Audit had no security headers at all! I assumed AWS (specifically CloudFront) would take care of ‘some’ http headers for me — I was mistaken. Cloudfront takes care of the TLS implementation, but does not implement any security header for you, not even strict-transport-security which is TLS related.

So unsurprisingly, a newly created cloudfront distribution, using the reference AWS implementation, fails miserably when it comes to security headers.

I guess the reason is that HTTP headers are very site-dependant. Had Cloudfront done it automatically, it might have broken a majority of sites And implementing headers is one thing, but fixing the underlying problem is another — totally bigger problem.

But what security headers to implement?

comment 0

The GREAT .my outage of 2018

.my DNSKEY Failure
Boy, that’s a lot of RED!

Last week, MyNic suffered a massive outage taking out any website that had a .my domain, including local banks like and even government websites hosted on

Here’s a great report on what happened from IANIX. I’m no DNSSEC expert, but here’s my laymen reading of what happened:

  1. .my uses DNSSEC
  2. Up to 11-Jun,.my used a DNSKEY with key tag:25992
  3. For some reason, this key went missing on the 15-Jun, and was replaced with DNSKEY key tag:63366. Which is still a valid SEP for .my
  4. Unfortunately, the DS record on root, was still pointing to key tag:25992
  5. So DNSSEC starting failing
  6. 15 hours later, instead of correcting the error, someone tried to switch off DNSSEC removing all the signatures (RRSIG)
  7. But this didn’t work, as the parent zone still had a DS entry that pointed to key tag:25992 and hence was still expecting DNSSEC to be turned on.
  8. 5 hours after that, they added back the missing DNSKEY key tag:25992 (oh we found it!), but added invalid Signatures for all entries — still failing.
  9. Only 4 hours after that did they fix it, with the proper DS entry on root for DNSKEY key tag:63366and valid signatures.
  10. That’s a 24 hour outage on all .my domains.

So basically, something broke, they sat on it for 15 hours, then tried a fix, didn’t work. Tried something else 5 hours after that, didn’t work again! And finally after presumably a lot of praying to the Gods of the Internet and a couple animal sacrifices, managed to fix it after a 24-hour downtime.

I defend my fellow IT practitioners a lot on this blog, but this is a difficult one. Clearly this was the work of someone who didn’t know what they were doing, and refused to ask for help, instead tried one failed fix after another which made things worse. As my good friend Mark Twain would say — it’s like a Mouse trying to fix a pumpkin.

I don’t fully understand DNSSEC (it’s complicated), but I’m not in charge of a TLD. It’s unacceptable that someone could screw up this badly — and for that screw up to impact so many people, and all we got was a lousy press release.

The point is, it shouldn’t take 24 hours to resolve a DNSSEC issue, especially when it’s such a critical piece of infrastructure. I’ve gone through reports of similar DNSSEC failures, and in most cases recovery takes 1-5 hours. The TLD had a similar issue, that was resolved in an hour, very rarely do we see a 24 hour outage, so what gives?

I look forward to an official report from MyNIC to our spanking new communications ministry, and for that to be shared to the public.

comments 8

The Malaysian Ministry of Education Data Breach

Ok, I’ve been pretty involved in the latest data breach, so here’s my side of the story.

At around 11pm last Friday, I got a query from Zurairi at The Malay Mail, asking for a second opinion on a strange email the newsdesk received from an ‘anonymous source’. The email was  regular vulnerability disclosure, but one that was full of details, attached with an enormous amount of data.

This wasn’t a two-liner tweet, this was a detailed email with outlined sub-sections. It covered why they were sending the email, what the vulnerable system was, how to exploit the vulnerability and finally (and most importantly!) a link to a Google Drive folder containing Gigabytes of data.

The email pointed to a Ministry of Education site called SAPSNKRA, used for parents to check on their children’s exam results. Quick Google searches reveal the site had security issues in the past including one blog site advising parents to proceed past the invalid certificate warning in firefox. But let’s get back to the breach.

My first reaction was to test the vulnerability, and sure enough, the site was vulnerable to SQL Injection, in exactly the manner specified by the email. So far email looked legitimate.

Next, I verified the data in the Google Drive folder, by downloading the gigabytes of text files, and checking the IC Numbers of children I knew.

I further cross-checked a few parents IC numbers against the electoral roll. Most children have some indicator of their fathers name embedded in their own, either through a surname or the full name of the father after the bin, binti, a/l or a/p. By keying in the fathers IC number, and cross-referencing the fathers name against what was in the breach, it was easy to see that the data was the real deal.

So I called back Zurairi and confirmed to him that the data was real, and that the site should be taken offline. I also contacted a buddy of mine over at MKN, to see if he could help, and Zurairi had independently raised a ticket with MyCert (a ticket??!!) and tried to contact the Education Minister via his aide.

Obviously neither Zurairi nor myself, or any of the other journalist I kept in touch with, could report on the story. The site was still vulnerable, and we didn’t want someone else breaching it.

The next morning, I emailed the anonymous source and asked them to take down the Google Drive, explaining that the breach was confirmed, and people were working to take down the site. Hence there was no reason to continue exposing all of that personal information on the internet.

They agreed, and wiped the drive clean, and shortly after I got confirmation that the SAPSNKRA website had been taken down. So with the site down, and the Google Drive wiped cleaned, it seemed the worst was behind us.

Danger averted…at least for now.

But, since Data breaches last forever, and this was a breach, we should talk about what data was in the system. Zurairi did a good job here, but here’s my more detail take on the issue.