Sayakenahack: Epilogue

I keep this blog to help me think, and over the past week, the only thing I’ve been thinking about was sayakenahack.

I’ve declined a dozen interviews, partly because I was afraid to talk about it, and partly because my thoughts weren’t in the right place. I needed time to re-group, re-think, and ponder.

This blog post is the outcome of that ‘reflective’ period.

The PR folks tell me to strike while the iron is hot, but you know — biar lambat asal selamat.

Why did I start sayakenahack?

I’m one part geek and one part engineer. I see a problem and my mind races to build a solution. Building sayakenahack, while difficult, and sometimes frustrating, was super-duper fun. I don’t regret it for a moment, regardless of the sleepless nights it has caused me.

But that’s not the only reason.

I also built it to give Malaysians a chance to check whether they’ve been breached. I believe this is your right, and no one should withhold it from you. I also know that most Malaysians have no chance of ever checking the breach data themselves because they lack the necessary skills.

I know this, because 400,000 users have visited my post on “How to change your Unifi Password”.

400,000!!!

If they need my help to change a Wi-Fi password, they’ve got no chance of finding the hacker forums, downloading the data, fixing the corrupted zip, and then searching for their details in a file that is 10 million rows long. And no, Excel won’t fit 10mln rows.

So for at least 400,000 Malaysians, most of whom would have had their data leaked, there would have been zero chance of them ever finding out. ZERO!

The ‘normal’ world is highly tech-illiterate (I’ve even talked about it on BFM). Sayakenahack was my attempt to make this accessible to common folks. To deny them this right of checking their data is just wrong.

But why tell them at all if there’s nothing they can do about it? You can’t put the genie back in the lamp.

What’s the point though?

A recurring criticism of the site is that people can’t do anything about the breach, hence the site is pointless and dangerous.

Which is crazy to me, because your right to know shouldn’t be contingent on your ability to do something about it. Legal rights don’t work that way.

In any case, regardless of people’s ability to respond, the site has achieved a few things. One of which is to give us clarity on the data.

Within the telco breach, once I loaded data from Maxis, Digi, Celcom, UMobile, TuneTalk, and RedTone, the total number of rows in the database was only 37mln (not 46mln as most claim). That’s because people (including lowyat) were blindly counting rows in Excel. But when you put that data into a data model and start inserting those millions of rows into a database, things get interesting.

Some files were missing myKad numbers (hence not loaded in the system) and others had duplicates. I estimate that with the smaller telcos that number might have increased to 38mln, but not any higher.
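To make the gap between "rows in a spreadsheet" and "rows in a database" concrete, here is a minimal Python sketch of that kind of count. It is not the actual sayakenahack loader. The column names (`mykad`, `msisdn`, `name`), the choice of (myKad, phone number) as the uniqueness key, and the sample rows are all made up for illustration.

```python
import csv
import io

def count_loadable_rows(csv_text, id_field="mykad", key_fields=("mykad", "msisdn")):
    """Compare the raw row count with the rows that would survive loading.

    Rows with an empty id_field are skipped, and rows whose key_fields
    repeat an earlier row are counted as duplicates.
    (Field names are hypothetical, not taken from the real files.)
    """
    seen = set()
    stats = {"raw": 0, "missing_id": 0, "duplicate": 0, "loaded": 0}
    for row in csv.DictReader(io.StringIO(csv_text)):
        stats["raw"] += 1
        if not (row.get(id_field) or "").strip():
            stats["missing_id"] += 1
            continue
        key = tuple((row.get(f) or "").strip() for f in key_fields)
        if key in seen:
            stats["duplicate"] += 1
            continue
        seen.add(key)
        stats["loaded"] += 1
    return stats

# Entirely fictional sample data.
sample = """mykad,msisdn,name
800101-01-1234,0123456789,Ali
800101-01-1234,0123456789,Ali
,0198765432,Siti
900202-02-5678,0171112222,
"""

print(count_loadable_rows(sample))
```

In this toy file, a naive row count gives 4, but only 2 rows would be loaded: one is an exact repeat and one has no myKad number to key on.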

I think the Malaysian public deserve more than a cursory Excel analysis — don’t you? And it gets better.

Some telcos gave a lot of information, others gave a bare minimum. The amount of your personal data exposed in the breach is highly dependent on which telco you were subscribed to in 2014. And even then, some anomalies existed, such as missing names and addresses (this is typical of any IT system).

Hence, not everyone was equally affected, and the site was built to communicate ‘effectively’ how and where you were breached, with as much detail as I could provide without infringing on people’s privacy.

My code checks each row of each file, and reports which fields are present, to give people very specific information about what data of theirs is in the breach.
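As a rough illustration of that idea (and not the site's real code), the Python sketch below reports which fields are present for a given myKad number, per source, without echoing the values themselves. The source names (`telco_a`, `telco_b`), field names, and records are hypothetical.

```python
def present_fields(record, ignore=("mykad",)):
    """Return the names of fields that hold a non-empty value."""
    return sorted(
        name
        for name, value in record.items()
        if name not in ignore and value is not None and str(value).strip()
    )

def breach_report(records_by_source, mykad):
    """Map each source to the field names exposed for this myKad number.

    Only field names are returned, never the underlying values.
    """
    report = {}
    for source, records in records_by_source.items():
        for record in records:
            if record.get("mykad") == mykad:
                report.setdefault(source, set()).update(present_fields(record))
    return {source: sorted(fields) for source, fields in report.items()}

# Entirely fictional sample data.
records = {
    "telco_a": [
        {"mykad": "800101-01-1234", "msisdn": "0123456789", "name": "Ali", "address": ""},
    ],
    "telco_b": [
        {"mykad": "800101-01-1234", "msisdn": "0191234567", "name": "", "address": None},
        {"mykad": "900202-02-5678", "msisdn": "0171112222", "name": "Siti", "address": "KL"},
    ],
}

print(breach_report(records, "800101-01-1234"))
```

Here the same person shows up with a phone number and name in one source but only a phone number in the other, which is the kind of per-telco difference described above.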

A blanket statement like “All Malaysians impacted” is highly simplistic, and ineffective. Putting your masked number on a screen after you’ve entered your IC really brings home the fact that your data has been leaked.

If you’re uncomfortable with seeing your own details on a screen, how much more uncomfortable would it be to know that any Tom, Dick and Harry online can view those same details about you at the click of a button?

Blocking my site, and trying to put the genie back in the bottle: that’s pointless. Giving people the ability to discern exactly what elements of their personal data are exposed is not.

So hopefully you get why I did it, and now let’s move on to the question of legality.

Is it legal?

It was always going to be a grey area, but I strive to make it more white than black.

The Bar Council seems to think sayakenahack is alright, and while I’m no lawyer, I obviously agree.

If you make it illegal to hold stolen data, then only criminals will have it. Legitimate researchers and journalists would tread cautiously around it, ensuring that none of it gets reported, and even fewer people are aware.

The only thing worse than having your data exposed, is not knowing about it being exposed.

If you make services like sayakenahack illegal, people will start going to services like Leakedsource, if only to check on themselves. You will legitimize the business practice of selling stolen data, and put more folks at risk. Is that what you want?

Making something illegal has never deterred criminals from doing it. It’s kinda what they do.

Blindly following the PDPA is to hold ourselves hostage to ancient legislation (7 years is 2-3 generations in tech) that was probably enacted by technically illiterate members of parliament. I’m not saying we don’t follow the law, I’m just saying we don’t blindly follow it. There are multiple exemptions to the law, and some of them can be applied here.

But even if this site was on the white side of grey, and even if my intentions were noble, and even if we assume it’s perfectly legal — how can you trust me?

How can we trust you?

My name, reputation, and possibly even my freedom depend on keeping this thing legitimate. Like everything else, if you don’t trust me, don’t enter your IC number.

I cannot prove that I don’t log data, any more than I can convince you that there is no spaghetti monster orbiting the 3rd moon of Jupiter. You can’t prove a negative.

But at least trust in logic and maths.

200,000 users visited sayakenahack. Without the block, I estimate that number would have been 500,000.

But 500k out of 37mln is still less than 1.5%. ONE POINT FIVE PERCENT!!!
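For anyone who wants to check the arithmetic, here is a quick Python sketch using only the figures quoted above (200,000 actual visitors, an estimated 500,000 without the block, and 37mln rows). Treating one visitor as one row is a simplifying assumption.

```python
visitors_actual = 200_000      # visits reported above
visitors_estimated = 500_000   # estimate without the block
rows_in_database = 37_000_000  # rows loaded into the database

def share_percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole

print(round(share_percent(visitors_estimated, rows_in_database), 2))  # 1.35
print(round(share_percent(visitors_actual, rows_in_database), 2))     # 0.54
```

Both figures sit comfortably under the 1.5% mark.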

Accusing me of phishing through the site, is like accusing Bill Gates of pickpocketing at the MRT. Why would a billionaire with so much money, waste his time and risk himself, to steal a fraction of his fortune?

Why would I bother with 500 thousand, when I have 37 MILLION?

There’s no logical reason for me to log.

And yet they used it as an excuse to block me.

What do I think of the block?

Well, I don’t agree with censorship in general, regardless of whether it affects me or not. So the block isn’t a good thing. I wasn’t happy with the #potongsteam block, and I don’t even play that many games.

That being said, AWS costs dollars, so blocking the site actually saves me money :).

Did the MCMC or PDP contact you?

No.

Did you contact the MCMC or PDP?

Yes.

Both of them.

No answer so far.

What about siapakenahack and Lowyat?

Between the two of them, CF Fong and Vijandren Ramadass have accused me of manipulating data, being unethical, and glory-hogging.

Just to let you know: I’m none of those things.

When I tweeted about the number of hits the site was getting, I thought it was cool that my architecture could support that kind of traffic. I never meant that data breaches were cool, which is what Vijandren suggests, and cites as the moment he ‘gave up’ on me.

And I’m more than happy for someone to critique my design, comment on the architecture, or point out flaws. But calling me unethical?! That’s harsh.

Even so, I’m letting bygones be bygones. Time for everyone (including me) to move on.

What’s next?

I go back to Malaysia soon. If you don’t see me after that, you’ll know where I am.

Take care Malaysia.

Conclusion

The vast (vast!) majority of folks have been supportive of sayakenahack, and for that I am (and will forever be) immensely grateful. Immensely!!

You can’t please everyone, and contrary to popular belief, I even appreciate the critics. At least they’re keeping you on your toes. I especially liked the legal disclaimer of siapakenahack — couldn’t stop laughing when I saw it.

So thanks everyone for your support. If the rumors I hear are right, this story is far from over. Expect earth shattering headlines next week — and hopefully I won’t be in them :).

Keith out (mic drop!!)

[FYI, that two year old interview from BFM has aged remarkably well, I highly recommend it (link here)]

9 comments

  • Dear Keith,

    I wrote a long post on this issue, but decided not to publish it to drag this along any further. So will just leave this comment here to clear things up.

    1. There are two definitions of the word manipulate. If you look it up in the dictionary (or via google), the first definition is – to handle or control (a tool, mechanism, information, etc.) in a skillful manner. So you might want to re-read my statement with the above in mind.

    2. You did a fantastic job in setting up the site, server-less and all, but your haste to launch it was your ultimate downfall. You covered all the bases as far as the architecture and functionality of the site, because you are well versed in it, but you failed to cover all the other issues that come with putting up a site like that for public consumption.

    3. The data you obtained is about 3-4 generations older than the copies we had (we checked). You should know well enough how easily data degrades when the file sizes are above 1GB, especially in CSV/TSV formats. Add in compression + decompression, and download sites that aren’t exactly the best in the world and you end up with corrupted files. Now tell me this, can you wholeheartedly authenticate that all the data that you have put up is 100% correct?

    4. We had the same issue, and we had files which were in much better shape than what you had. We contacted the telcos not just for authorization, but because only the telcos are able to cross check and confirm the level of degradation of the data. When you set up something as sensitive as this, you make sure you are 100% sure you are disseminating verified information.

    5. We did our own checks as well – something you probably didn’t. We cross-checked around 200 or so records from various telcos with immediate family and friends, and surprisingly, the accuracy was only around 70%. Sometimes the numbers and IC were correct, and the addresses were wrong, and sometimes the ICs appeared on other numbers which didn’t belong to them. The fact that the IC numbers appeared on postpaid numbers made us investigate further – calling up the numbers only to confirm that the individuals using the numbers have had them for many many years.

    6. When you first shared with me the site, i never imagined you were going to launch it almost immediately. The site was live and viral before i could share with you any of our findings. And then there was the whole issue of people finding numbers they didn’t register appearing beside their name and so on. So yes, our main cause of concern with the site was that the data wasn’t validated, and it was going to cause a lot of confusion – for which neither of us will be able to give them any concrete answer.

    7. Say for example, do you want to be held responsible for a domestic dispute if a wife keys in her husband’s MyKad number and it shows he has a few additional numbers under his name that she is not aware of? If the numbers are indeed real, then it’s their problem, but if they are not, then you become liable for the dispute – because you do not have the means to verify the authenticity of the data that you disclosed. Did you put any kind of disclaimer with regards to this – i don’t think so.

    8. The final issue is the site doesn’t solve the actual problem. All it does is make people panic even more, without any avenue on what to do. Are you able to do anything more than say ‘stay safe malaysia’ after you tell a user that all his data is exposed? No you can’t. The authorities need to draw up a proper guideline, before releasing the information – because people need to know what they can do, and what is being done to ensure their safety by the people who actually hold the keys to the door.

    With great power comes great responsibility. While i do commend your heart and effort in putting up the site, in your rush to get this out there, you did not spend enough time to make sure that all the bases were covered.

    • Se7en you talk too much. Go back to /k/ and mollycoddle your sycophants and dumb fuck staffs. You’re a fucking sell out and you’re not even a coder. You’re a suit, a double headed snake head lawyer that cannot be trusted. All techies/geeks/nerds should beware of loyar buruk like se7en.

      • Wonder where you get your info from. Just to clear your confusion, i have a Com Science degree from a Malaysian Public University and am CCNA certified. You can check with a lot of the old timers who were around in the early days of the internet in Malaysia on what i used to do back then.

  • Not being able to do anything about it is one thing, but not knowing and thinking all is rainbow and sparkles in things as important as these is even worse. Thanks for raising the level of awareness and the need for more secure architecture and policy in the country.

  • Whether the site is live or not does not change the impact of the leak incident, but the site did create awareness of the issue. Knowing about the incident but choosing not to speak up is equal to silently agreeing with the data leak.

    • Well first problem is G+ (I don’t even think it exist anymore!)

      Anyway, best way to contact me is via email. keith[at]keithrozario[dot]com