Agents productivity vs. Quality

This year, I managed to complete the entire problem set of Advent of Code. Last year, I could only complete 43 out of 50 problems. So there was ... progress!

If you're unfamiliar, Advent of Code is a series of daily coding challenges released during the season of Advent (the period just before Christmas). I encourage you to try these challenges for yourself. None of them are easy (at least to me), but all of them are solvable with enough elbow grease and time.

[]

The METR Study

The METR study is wild!

Its methodology is unlike any other. While previous studies went 'wide', the METR study went 'deep', focusing on 16 developers (instead of hundreds or thousands), but deeply analyzing the effect of AI on those 16 developers in a way 'wide' studies could not.

METR first identified 16 developers who were maintaining high-quality open source repositories. These repositories had an average of 22k stars on GitHub, and ~1 million lines of code. 22k stars is probably 6-7 standard deviations above average on GitHub. To put it in football terms, these developers were maintaining the equivalent of Real Madrid, while your average enterprise app codebase is Toa Payoh FC or Scunthorpe United.

[]

Software 3.0

If you have 3 apples and you take away 2, how many apples do you have?

1 of course!

But then again, you took away 2, so you should have 2. The question is ambiguous: the answer depends on how you interpret the words 'have' and 'take away'.

Compare that to something like this:

[]

Saying No.

I was "inspired" by this post.

In it, Philip Su, an E9 (Distinguished Engineer) from Facebook, explains how he progressed in his career, which includes an impressive stint at Microsoft that saw him promoted every year for 8 years straight.

If you have that kind of accelerant so early in your career -- you're bound to be "Successful", or at least air-quotes successful.

Distinguished Engineers at a FAANG company (Facebook, Amazon, Netflix, Google, etc.) are the highest technical individual contributors in companies that value their technical folks (and pay them LOTS of money). There are fewer Distinguished Engineers at FAANGs than there are NBA players. That level of elite requires immense dedication.

I'm not as "successful" as Philip. I started my career as a Business Analyst at Shell, was promoted (sort-of) twice in my 9 years there, then bounced around a bit, and joined Amazon as an L6. I left 4 years later at the very same level, promoted a grand total of ZERO times. Now I'm a partner engineer at Google -- a peon in the giant machinery that is Google Cloud. So I've worked at FAANG companies, but on more ground-level stuff, rather than in the stratosphere that Philip operates in.

But, I'm happy where I am, and pretty happy with my journey so far.

[]

Good, Fast, or Cheap?

In software delivery there's a famous saying: Good, Fast, or Cheap -- pick 2.

There are always trade-offs.

Fast and cheap can be measured easily. How much money and how much time are very simple questions to answer. You might think it's expensive, or slow -- but the objective hours and dollars are easy measurements.

It's not the same for 'Good' though. How do you measure 'Good'?

[]

What Challenge 13 taught me about LLMs.

While doing programming challenges in Advent of Code, I came across an interesting behavior of LLMs in coding assistants and decided to write about it to clear my thoughts.

First some background.

Advent of Code is a series of daily coding challenges released during the season of Advent (the period just before Christmas). Each challenge has 2 parts, and you must solve Part 1 before Part 2 is revealed. Part 2 is harder than Part 1, and usually requires rewrites to solve. Sometimes the rewrites are quite extensive, and other times they are small incremental steps.

If you haven't done these challenges before, I encourage you to try. None of them are easy (at least to me), but all of them are solvable with enough elbow grease and time.

That said, the challenges are still contrived. Firstly, the questions are much better written than what you'd see in a Jira ticket or requirements document. They include a detailed description of what must be done, and sample inputs and outputs you can test. Secondly, the challenges extend beyond what most coders do on a daily basis. One challenge required writing a small program to 'defrag' a disk, another required building a tiny assembler that ran its own program, and multiple questions involved navigating a 2D maze with obstacles along the way. All fun things you will probably not do as a programmer in the real world.

I took on the challenges both to improve my coding skills and to learn how I could use coding assistants in these close-to-real-world scenarios. The hope was that I would gain some insight into how I could use these tools more effectively should I need to do something more than solve contrived programming challenges before Christmas.

OK. Background complete.

Let's move on to the challenge that changed the way I would look at LLMs forever.

[]

Overcoming Setbacks = Progress

We've all seen the "tiny gains" post: how, if you get one percent better each day for one year, you'll end up thirty-seven times better by the time you're done.

Well......

First of all, 1% isn't 'tiny'. I know a few bankers who'd sacrifice their firstborn for a daily increment of 1% on their portfolio. After all, how many bankers do you know who have a 37x return on anything over a year?

Secondly, getting 1% better every day is not possible. Just getting better every day is not possible.

If you train in cycling, improving your speed by 7% every week is a ridiculously impossible goal. In cycling we measure power output, so if you improve 1% every day, no matter where you start from -- you'll be out-sprinting Mark Cavendish within a year.
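If you want to check the arithmetic yourself, here is a minimal sketch in Python. It assumes the gains compound daily, which is what the "tiny gains" claim implies. The function name compound is my own, purely for illustration.

```python
def compound(rate: float, periods: int) -> float:
    """Return the growth multiple after compounding `rate` over `periods` steps."""
    return (1 + rate) ** periods


daily_gain = 0.01  # the "1% better each day" from the tiny gains post

# Multiple after a full year of daily 1% gains
yearly_multiple = compound(daily_gain, 365)

# Equivalent percentage gain over a single week (7 days)
weekly_gain_pct = (compound(daily_gain, 7) - 1) * 100

print(f"After 365 days: {yearly_multiple:.2f}x")
print(f"Per week: {weekly_gain_pct:.2f}%")
```

This works out to roughly 37.8x over the year and a little over 7% per week, which is where both numbers above come from.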

So forget 1% every day.

1% sounds small -- but doing it every day is a fairy tale. That said.... the idea of making small, consistent gains instead of large but inconsistent improvements is a good one.

[]

The Tyranny of Best Practice

All architects know what's best practice, but only good architects know when to use it.

I've been in plenty of conversations where someone goes "we should do X because it's best practice" -- and acts as if the discussion has ended.

Best practice is what works for most people, most of the time. It isn't something that works for everyone all of the time -- otherwise we would mandate it across the board and architects would be out of their jobs.

[]

Remembering Sayakenahack

It's been 6 years now since the big sayakenahack debacle. I won't go into details on what happened, but ... I thought it'd be nice to take a stroll down memory lane with some pictures :)

[]