All posts filed under “Misc”

Just a collection of stuff (mostly from my old blog)


Contact Tracing Apps: In this context they’re OK.

I thought I’d write down my thoughts on contact tracing apps, especially since BFM recently suggested that 53% of Malaysians wouldn’t download a contact tracing app due to privacy concerns. It’s important for us to address this, as I firmly believe that contact tracing is an important weapon in our arsenal against COVID-19, and having more than half of Malaysians dismiss it outright is concerning.

But first, let’s understand what Privacy is.

Privacy is Contextual

Privacy isn’t secrecy. Secrecy is not telling anyone, but privacy is about having control over who you tell and in what context.

For example, if you met someone for the first time at a friend’s birthday party, it would be completely rude and unacceptable to ask questions like:

  • What’s your weight?
  • What’s your last drawn salary?
  • What’s your age?

In that context you’re unlikely to find someone who will answer these questions truthfully.

But…

Age and weight are perfectly acceptable questions for a doctor to ask you at a medical appointment, and your last drawn salary is something any company looking to hire you will ask. We’ve come to accept these questions as OK under these contexts.

You might still not want to answer them, which might mean you don’t get the job, or the best healthcare, but you certainly can’t be concerned by them. Far more people will answer these same questions truthfully if you change the context from a random stranger at a party to a doctor’s appointment.

So privacy is contextual. To justify concerns, we have to evaluate both the context and the question before coming to a conclusion.

So let’s look at both, starting with the context:


Sharding SQS

Potassium40 was a project I started to see how fast Lambda could really go. The project attempts to download the robots.txt files from 1 million websites as fast as it can. I chose robots file because — well it’s supposed…


Why?!

The system, which was introduced on the first day of the 2020 school session yesterday, takes only two seconds to scan a pupil’s face before his personal information, such as full name, pupil number and class, is stored into the…


Multi-Accounts for AWS with Google '+' emails.

Last week, I launched a new pipeline for Klayers to build Python3.8 Lambda layers in addition to Python3.7. For this, I needed a separate pipeline because not only is it a new runtime, but under the hood this Lambda uses a new operating system (Amazon Linux 2 vs. Amazon Linux 1).

So I took the opportunity to make things right from an account hierarchy perspective. Klayers for Python3.7 lived in its own separate account from all my other hobby projects on AWS, but I kept all stages in it (default, dev and production). (Default is an odd name, but it ties to the Terraform nomenclature.) This afforded some flexibility, but the account felt bloated from the weight of the different deployments, even though they existed in different regions.

It made no sense to have default and dev on the same account as production — especially since accounts were free. Having entirely separate accounts for prod & non-prod incurred no cost, and came with the benefit of additional free-tiers and tidier accounts with fewer resources in them — but the benefits don’t stop there.


Android TV boxes

Android TV boxes are computers that stream content from the internet onto your TV. The difference between them and your smartphone is that they have an HDMI connector to your TV, and they usually come pre-loaded with software to illegally stream content.

While the boxes themselves are general-purpose computers running Android (the most popular OS today), the real focus of any regulation should be on the software on the device and the internet-based streaming services that support them.

Which seems to be the case…

Today, TheStar reports that the MCMC will begin blocking these unauthorized streaming services, rendering the boxes that connect to them useless.

But if the MCMC uses its usual method of DNS filtering to implement the block, it’ll be trivial for most folks to circumvent it; the boxes run Android after all. The government will very quickly find itself in a cat-and-mouse situation in trying to block them.


2018 in Review

I started the year building out govScan.info, a site that audits .gov.my websites for TLS implementation. Overall I curated a list of ~5000 Malaysian government domains through various OSINT and enumeration techniques and now use that list to scan them…


Introducing potassium-40

Over the past few weeks, I’ve been toying with Lambda functions and thinking about using them for more than just APIs. I think people miss the most interesting aspect of serverless functions: their massively parallel capability, which can do a lot more than just run APIs or respond to events.

There are two ways AWS lets you run Lambdas: either by triggering them from some event (e.g. a new file in an S3 bucket) or by invoking them directly from code. Invoking is a game-changer, because you can write code that offloads processing to a Lambda function directly. Lambda is a giant machine, with huge potential.
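To make the "invoking directly from code" idea concrete, here is a minimal sketch of fanning work out to Lambda in batches. It assumes boto3 is installed and AWS credentials are configured; the function name `robots-worker`, the `{"domains": [...]}` payload shape and the batch size are all hypothetical, not taken from potassium-40 itself.

```python
import json
from typing import Callable, Iterable, List


def chunk(items: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def make_lambda_invoker(function_name: str) -> Callable[[bytes], None]:
    """Return a callable that fires one asynchronous Lambda invocation."""
    import boto3  # imported lazily so the rest of the sketch runs without AWS

    client = boto3.client("lambda")

    def invoke(payload: bytes) -> None:
        # InvocationType="Event" queues the call and returns immediately,
        # so the caller does not wait for the function to finish.
        client.invoke(
            FunctionName=function_name,
            InvocationType="Event",
            Payload=payload,
        )

    return invoke


def fan_out(domains: List[str], batch_size: int,
            invoke: Callable[[bytes], None]) -> int:
    """Send one invocation per batch of domains; return the number sent."""
    sent = 0
    for batch in chunk(domains, batch_size):
        # Hypothetical payload shape: a JSON object with a "domains" list.
        invoke(json.dumps({"domains": batch}).encode("utf-8"))
        sent += 1
    return sent
```

With real credentials this could be called as `fan_out(domains, 1000, make_lambda_invoker("robots-worker"))`. Passing the invoker in as a parameter keeps the batching logic easy to try out locally with a stand-in function.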

What could you do with a 1000-core, 3TB machine, connected to an unlimited amount of bandwidth and a large number of IP addresses?

Here’s my answer. It’s called potassium-40, and I’ll explain the name later.

So what is potassium-40?

Potassium-40 is an application-level scanner that’s built for speed. It uses parallel Lambda functions to do HTTP scans on a specific domain.

Currently it does just one thing, which is to grab the robots.txt from all domains in the Cisco Umbrella 1 Million, and store the data in a text file for download. (I only grab legitimate robots.txt files, and won’t store 404 HTML pages etc.)
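One way to picture the "legitimate robots.txt only" rule is a small filter over each response. The checks below are an assumed heuristic for illustration, not the rules potassium-40 actually applies, and `looks_like_robots_txt` is a hypothetical name. Note that it deliberately rejects empty bodies, even though an empty robots.txt is technically valid.

```python
def looks_like_robots_txt(status: int, content_type: str, body: str) -> bool:
    """Heuristic: True if an HTTP response plausibly carries a real robots.txt."""
    if status != 200:
        return False  # 404s and other error responses are not stored
    if "html" in content_type.lower():
        return False  # an HTML page, not a plain-text robots file
    head = body.lstrip()[:200].lower()
    if head.startswith("<!doctype") or head.startswith("<html"):
        return False  # HTML served with a misleading content type
    # Require at least one line that starts with a common robots directive.
    directives = ("user-agent:", "disallow:", "allow:", "sitemap:")
    for line in body.splitlines():
        if line.strip().lower().startswith(directives):
            return True
    return False
```

A worker would call this on each response and write only the bodies that pass to the output file.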

This isn’t a port scanner like nmap or masscan. It’s not just scanning the status of a port; it’s actually creating a TCP connection to the domain and performing all the required handshakes in order to get the robots.txt file.

Scanning for the existence of a port requires just one SYN packet to be sent from your machine, and even a typical banner grab would take 3-5 round trips. An HTTP connection is far more expensive in terms of resources and requires state to be stored. It’s even more expensive when TLS and redirects are involved!

Which is where Lambdas come in. They’re effectively parallel computers that can execute code for you, plus AWS gives you a large amount of free resources per month! So you can not only run 1000 parallel processes, but do so for free!

A scan of 1,000,000 websites will typically take less than 5 minutes.
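As a rough back-of-the-envelope check on what that figure implies, the sketch below works out the request rate needed. It assumes the full 5 minutes is used, that the 1000 parallel processes mentioned above are all busy, and that each one fetches its sites one after another; `required_rate` is a hypothetical helper, not part of the project.

```python
def required_rate(total_sites: int, seconds: float, workers: int) -> dict:
    """Return the throughput needed to finish `total_sites` in `seconds`."""
    overall = total_sites / seconds           # requests per second, all workers
    per_worker = overall / workers            # requests per second, one worker
    per_site = seconds * workers / total_sites  # time budget per site, one worker
    return {
        "overall_per_second": overall,
        "per_worker_per_second": per_worker,
        "seconds_per_site": per_site,
    }


# 1,000,000 sites in 300 seconds across 1000 workers
rates = required_rate(1_000_000, 300, 1000)
print(rates)
```

Under those assumptions the scan needs roughly 3,333 requests per second overall, which leaves each worker about 0.3 seconds per site if it fetches sequentially. That is tight for a full TCP, TLS and HTTP exchange, which suggests each worker also needs some concurrency of its own.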

But how do we scan 1 million URLs in under 5 minutes? Well, here’s how.