Can we stop with the (inappropriate) gatekeeping?

It’s another week, so it’s time for everyone’s favourite game: Gatekeeping.

This week’s example: Chloe (a Senior Developer Advocate for Microsoft who does some cool stuff with code, while putting up with being a woman in tech on Twitter) posted this:

Now there are a whole variety of reasons why this is a good thing. There’s evidence that diverse teams, while sometimes worse than less diverse teams at repetitive/samey tasks, do better when thrown new problems.

Also, having people who aren’t white comp-sci males on a team leads to picking up on things, like an awareness of how your product might be misused. Abusers have used Venmo to send money to their victims, because “why would you want to stop someone sending you money?”

Of course, a man was here to quibble (sorry, advise):

Now, machine-learning is an interesting discipline to pop up and claim that inexperienced people aren’t going to do a good job… we’ll go into that in a second.

Yes, it’s probably true that someone starting out will not be able to generate an entirely new model. But will they be able to follow tutorials and train one of the existing models? Likely yes.

Will they be able to replicate the many mistakes that ‘pro-fess-ion-al’ machine learning engineers have made? Absolutely.

Machine learning has been used to codify our biases. Facial recognition performs worse on non-white faces… “flight risk assessment algorithms” which are commercially sensitive so can’t be audited, seem to report that certain communities are more of a risk.

Meanwhile, there was that time a “cancer detection” model turned out to have trained itself to detect the different colours of the slide frames used for control versus malignant samples.

I’m just saying, that maybe Machine Learning isn’t yet the rigorous pillar of integrity and correctness that needs protection to preserve its pureness.

“React is for n00bs”

This is another good one.

When new devs start out and use React, a variety of callouts appear:

  • “It’s too complicated, they need to learn the basics”
  • “React is too heavy, they need to learn to optimise”
  • “the amount of javascript we use on the web is too high and a security risk”
  • “if you don’t learn the basics of DOM manipulation how can you possibly do it well”
  • Server-side rendering of client-side apps is just a return to the old way
  • We shouldn’t be building apps on the web

Most of these are true to a greater or lesser extent, but you know what else is true?

This is what the web looks like now…

It is not where any of us would probably start, but it’s where we are.

Having architected a business system that uses React as the UI, I know that system would have been painfully unusable if every interaction was a page load or form reload… modal popups and API calls made it a better experience for users.

“They’re building unoptimised systems and that’s not good”

That is also true. However, how do you learn to build an optimised system?

You ship something that gets to the point it needs to be optimised. Many systems never do… Good enough is, well, good enough.

These things are analogous to scaling problems: if you get them, they’re nice to have.

We do want some gatekeeping

I don’t want a newbie coder to write the control software for a nuclear reactor… Thankfully, this is unlikely.

But more realistically, the area where we need to find ways to help new programmers is the basics of security.

I don’t want a newbie writing a user registration system; there are plenty of managed Identity Providers (IdPs) out there, like Auth0, Cognito, AzureAD, Login with Google, Login with Apple, etc…

So yes, I wouldn’t want a newbie writing an IdP of any complexity; I can see them storing passwords in cleartext in a MySQL database.

But we don’t talk about these things, or how we can give new programmers an intro to the “easy” 80% of security things: basic security on APIs, not storing secrets in your app, not using sequential/predictable IDs around the place.
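To make the predictable-ID point concrete, here’s a minimal Python sketch (the invoice naming is purely illustrative): sequential IDs let anyone who sees one of their own guess everyone else’s, while the standard library’s `secrets` module gives unguessable ones.

```python
import secrets

# Anti-pattern: sequential IDs. If /invoices/1041 is mine, I can
# probably fetch /invoices/1040 and /invoices/1042 too.
_counter = 1040

def sequential_invoice_id() -> int:
    global _counter
    _counter += 1
    return _counter

# Better: an unguessable identifier (~128 bits of randomness).
def unguessable_invoice_id() -> str:
    return secrets.token_urlsafe(16)
```

Of course, unguessable IDs are no substitute for actual permission checks; they just stop casual enumeration.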

Instead, the talk is much more “go and learn enough before we deem you WORTHY of writing for the web”.

Some people learn by doing a CompSci degree. I have one of those.

While it taught me a bunch of formal things, so much of what I’ve learned is by working with good people, making mistakes, and learning more.

I learned React in part because I was working with a bunch of coders who were learning it… As an old-school HTML, JS, jQuery & CSS person, I was initially confused and scared of it. Then create-react-app appeared and I finally got it.

If we don’t turn down this obsession with gatekeeping entry, we don’t let new people learn.

We end up with the same faces, and products will be worse for everyone. Us older-school people will get stale, stagnate, and just write the same stuff until we retire.

We can nudge better than with streaks…

The all-or-nothing brittleness of streaks needs to be fixed somehow.

So yesterday I had my phone swiped by someone on a bicycle. This was thankfully one of the few times I’ve experienced crime in my life, it was non-violent, and the phone has been remote-wiped.

This is annoying, leaves me a little shaken, and I’m also annoyed that the thing will likely be dismantled for parts. It’s locked, and while I know that protection is imperfect, it’s work to remove. But I have insurance, so it’s mostly ok.

But, frustratingly because my phone was gone, I just lost my 205 day Activity streak from my watch.

This was slightly annoying because one bit of Apple did know I’d been moving: the “with friends” feature. But I suspect that is just sharing 6 numbers every few hours (the target and amount done for each of the 3 categories), not the “health” database, which is a far more granular time-series store, so the streak is broken.

Adding insult to theft, the watch even congratulated me for hitting my goal and extending my streak, unaware the achievement would disappear into the electronic void.

And this is the problem with these things: my mind (at least) goes to a place of “getting back to 205 days will be hard, it’s winter”. The streak becomes a “why bother” rather than an incentive.

The same thing applies to Duolingo, Headspace, etc. When it’s a daily thing and you fall off the wagon, it feels difficult to get back on. I know it might not be as effective, but what if I want to do 3 days a week of each – my 10-minute self-improvement slot?

David Smith’s MyTrends++ has the concept of rest days: if you do 7 days on, you’re allowed 1 day off. I think that works nicely.
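As a sketch of how that might be computed (my own toy logic, not how MyTrends++ actually works): treat each run of 7 active days as banking one rest day, which can absorb a single missed day without breaking the streak.

```python
def streak_with_rest_days(days: list[bool], earn_every: int = 7) -> int:
    """Current streak length, where every `earn_every` consecutive
    active days bank one rest day that can absorb a missed day."""
    streak = 0
    banked = 0
    active_since_earn = 0
    for active in days:
        if active:
            streak += 1
            active_since_earn += 1
            if active_since_earn == earn_every:
                banked += 1
                active_since_earn = 0
        elif banked > 0:
            banked -= 1   # spend a rest day; the streak survives
            streak += 1
        else:
            streak = 0    # no rest days left: streak broken
            active_since_earn = 0
    return streak
```

Whether a spent rest day should count towards the streak total is a design choice; here it does, on the basis that the day was “earned”.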

However, these things also don’t account for getting ill for a few days and being unable to exercise. I hate to cite the UK’s privatised rail system, but operators are (or at least were) allowed to declare void days, on which evaluations didn’t count – charitably, you could say these were fair when absolutely awful weather hit.

If you’re ill and not feeling able to exercise for a few days, that’s (probably) not your fault. But I think, paradoxically, the bigger the streak the bigger the “I’ll never get there again” feelings that arise.

Amazon’s new Halo wearable judges you over a week (if I remember correctly from a podcast) rather than daily, which I think is more sensible and understanding of your life.

I don’t know how to fix this as I’m not a behavioural psychologist, but I wonder if you should be earning the equivalent of “long service leave” – maybe you earn one cheat-day a month, up to a maximum of 5?

I don’t know the solution to this, but I think the current obsession with daily gamification isn’t really that great, and I know we can do better.

A feature request for LinkedIn

I like LinkedIn, but I would love if I could make recruitment messages more relevant.

I’m about to whine about recruitment, which I understand isn’t great when many good people are looking for work.

If you can do anything to help people in your network, recommendations, connecting people up – now is the time to lower your reputation-risk considerations (what if they aren’t a match, aren’t good) and do it anyway.

Although I dislike the Storification of LinkedIn, and find “Heart Warming Stories of Dubious Origin, About That One Time Someone Showed Basic Human Empathy” posts a little grating, I like LinkedIn.

I primarily work in the Media & Entertainment industry, where people very often move around. One time I was working with a team re-engineering a high-profile transcode stack, and we needed to check the compatibility of specific H264 encoding parameters with one consumer who had very Fussy Set-Top Boxes.

Searching on LinkedIn found that someone I’d previously worked with was now there, and that was one of those useful back-channels that actually get the work done, alongside the formal ones where invariably detail is lost in all the mediation layers.

I’ve previously found work through LinkedIn also, people in my network were looking and we had chats…

In both of these cases it was a route to contact people who I likely wouldn’t have managed otherwise.

The Bit Where I Bitch About Recruiters

While I know #NotAllRecruiters, many are somewhat annoying.

I’m quite specific in my profile intro of the kind of roles I’m open to, and still I get requests to be a: Permanent, SAP, Project Manager, in Bracknell.

That’s one technology I’ve never worked with (merely around) and 3 job qualities that I will avoid.

Tiresome for everyone, a waste of my time to read and theirs to send.

The over-engineered solution

As mentioned, I’ve a number of relatively simple conditions about jobs I’ll consider.

One time I got a message about a job that was “Only for Oxbridge graduates, but Imperial is also OK” – I know this was meant to be flattering and give the impression of an intellectual workplace (while also being a bit of a neg: “Imperial was almost good enough”). However, it just screamed of a horrendously toxic culture with Platinum Grade Gatekeeping.

So if you’re specific about what you’re looking for, why shouldn’t you get to state that as a set of questions? And when a recruiter who isn’t in your network wants to contact you, how about they’re given a page like this… (please excuse the 💩 mock)

A list of questions a recruiter might face: is the position permanent or contract, using appropriate technologies, what the salary is

Actually Maybe This Is Application for ML…

As I was writing this (helpfully after doing the 💩 mockup), I thought of a much better solution: If you can choose from a smaller range of criteria – and ones that could be detected by an ML classifier – LinkedIn could just run the classifiers you care about on an “out of network” message.

The score of the message could then drive a traffic light system: the message is accepted, outright denied, or, if borderline, the sender needs to click a “Yes, it’s appropriate and your classifier is wrong, scout’s honour, promise” button.
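A sketch of that traffic-light logic in Python (the thresholds are invented for illustration), assuming the classifier outputs a spam-likelihood score between 0 and 1:

```python
def triage_message(spam_score: float,
                   accept_below: float = 0.3,
                   deny_above: float = 0.8) -> str:
    """Map a spam-likelihood score onto the three traffic-light outcomes."""
    if spam_score < accept_below:
        return "accept"   # green: delivered straight away
    if spam_score > deny_above:
        return "deny"     # red: rejected outright
    return "confirm"      # amber: sender must click the "scout's honour" button
```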

Would it work?

Unless there was a penalty for clicking “This Isn’t Spam” I doubt it would.

I also suspect it would hurt LinkedIn’s revenue too much if, having paid for Gold Premium Ultra, people weren’t able to send messages.

To the good recruiters, who like great project managers are rare but invaluable – I’m sorry.

To the rest of you, I’m just not ready to do SAP in Bracknell.

Prescriptive Software Practices: Code Re-use Edition

Individual software practices don’t exist in a vacuum, and need to be viewed collectively.

Today I saw this tweet, which I initially violently agreed with, before realising the answer is really more “it depends”.

Now I fully agree that demanding people write the abstraction layer before they’ve even written the first component to use the underlying tool is a folly that leads to bad libraries. You don’t yet know how best to use the underlying API, how you want to use it, or which of the methods you want to wrap or enhance.

The requirement to wrap every ‘method’ is the main reason I dislike intermediate libraries. One time I asked “are we using this new AWS feature that’s perfect for our use case?” The answer: “No, we can’t, because Terraform doesn’t support it yet.”

Any time you put something in-between you and the underlying service you’re introducing a potential roadblock. I’ll explain later how I think you can minimise this.

The main reason I think code-reuse/libraries are hard to get right is a conflict at the core of them:

  • A trivial library can be simple to use, but if the functionality is simple, what is it really adding?
  • A feature-filled library is usually (but not always) harder to make use of, and if most people only use a fraction of it, what makes it worth the overhead?

Things don’t exist in isolation…

Warning, inbound analogy: very often “we” like to look to other countries and cite how wonderfully they do a thing. An example from the UK is that we’re often told “people shouldn’t mind renting flats, because in Germany people rent for longer and tend to buy later.”

Which sounds great, but when you point out that Germany has a bunch of related things – longer leases, more freedom to decorate/change properties, and that they consistently build houses to maintain far more modest house-price rises – people tend to go quiet.

Returning to software, everything is similarly related and supported by other practices. If you don’t fully understand a problem, you can’t cleanly decompose it into a sensible collection of services, and only once you’ve done that will sensible opportunities for code re-use/libraries emerge. (At this point you’re welcome to argue that if you’ve decomposed your system properly, you shouldn’t need to reimplement functions.)

XP/Agile/Clean Code/BDD/TDD/… can become quasi-religious in how much you must adhere to all of their tenets. I suspect very few people are fully compliant with any one tribe, and to be effective as teams you need to view these things as recommendations or possibilities, not commandments that thou shalt obey.

How to do code re-use right…

This is just my experience, but a few questions to ask or points that I’ve found have worked for the people I’ve worked with in the past:

  • Avoid needing them in the first place: if your transaction volume is low enough, just have a dedicated service that does the particular thing… A single point of truth is the easiest way, but that isn’t always possible due to latency or cost concerns
  • Consider Security/Auth/Data-protection first: these are the things you need decent libraries/patterns for, because if the easiest thing is also the right thing, people will make fewer critical mistakes. It can also make patching easier if you expose a consistent interface while updating an underlying library with breaking changes
  • Judge the demand: while many times people go “wow, I didn’t realise I needed x until it appeared”, unless it’s really obvious that lots of people have the exact same problem, do you really need to write a library?
  • Understand it before you abstract it: Don’t write them first. My ideal preference is that when you have a few teams working in the domain, let them create distinct implementations. Afterwards, regroup and use that learning as the basis for a library. This is more work, but the result will be much better
  • Keep the library fresh: Is it one person’s job? Is it a collective whole-team effort? A library needs to be a living thing as the systems it interacts with will change. Developers will rightly shy away from using a clunky piece of abandoned code
  • Layer up in blocks: a client has a back-end system with specific authentication requirements and has been building out client libraries. There are 3 distinct libraries: connection management, request building and result parsing. You don’t have to use all of these, and can just use the connection library if you want more direct access
  • Make your library defensive but permissive: TypeScript has got me back into typing, but previous experience makes me nervous. In micro-services environments a library update can require many unrelated deployments, when only two components are functionally involved. Errors because enums aren’t valid can be useful, but can you expose the error when that property is accessed, rather than when the message is parsed?
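That last point, deferring validation from parse time to access time, can be sketched in Python (the names here are illustrative, not from any real library): parse the message leniently, and only validate the enum when the field is actually read, so consumers that never touch a new value keep working without a redeploy.

```python
from enum import Enum

class Status(Enum):
    ACTIVE = "active"
    CLOSED = "closed"

class Account:
    """Lenient parse: unknown enum values only blow up on access."""
    def __init__(self, payload: dict):
        # No validation here, so a message carrying a brand-new status
        # value still parses in services that don't care about it.
        self._raw_status = payload["status"]

    @property
    def status(self) -> Status:
        # Validation deferred to the point of use.
        return Status(self._raw_status)

ok = Account({"status": "active"})
surprise = Account({"status": "suspended"})  # parses fine
# surprise.status would raise ValueError, but only if actually accessed
```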

In summary…

Teams need to find their own path, and find where on the line between “Don’t Repeat Yourself” and “Just Copy-Paste/Do Your Own Thing” they lie. It is highly unlikely to be at either extreme.

“It Depends” isn’t a particularly enlightening answer, but like so many things about building decent products, it is what it is.

On THAT Excel Issue

What can we actually learn from the government’s Excel related issues?

There have been many comments posted in the last week about “excelgate”, or whatever we want to call a life-threatening data exchange problem. This post is not about absolving the government of blame for this, or the countless failings they’ve made across the Test & Trace programme. Between the app that everyone who understood iOS Bluetooth told them wouldn’t work, and giving the bulk of Contact Tracing to private companies rather than local health teams… I’m really not excusing them.

But I do think there are more nuanced lessons to be learned beyond “LOL WOT THESE N00B5 USING M$ EXCEL. Y U NO PYTHON?”, which is an exaggeration, but not by much, of some of what I’ve seen online.

I’m writing this based on the following assumptions/guesses: data had to get from one system to another, .xls rather than .xlsx was used, and this hit a row limit. (This really should have been an automated feed, but that’s not what I want to explore here; I want to explore how organisations can prevent people from doing ‘good’ things.)

So, we’re using an inappropriate data transfer format, with a hard limit (65,536 rows for .xls) on how many rows it can contain… This sets up a few different scenarios:

  • Nobody foresaw this problem
  • The problem was known, but the decision was taken not to fix it
  • It was known, people wanted to fix it, but couldn’t

If we explore these, I think there’s some learning we can take away for the organisations we work for or with, about how some of our anti-patterns might put us into similar scenarios.

Nobody Foresaw This

This would be the most damning of the outcomes: a risk that nobody had realised they were living with, and crucially, one the software doing the export didn’t warn about.
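As a toy illustration of the kind of guard that was missing (the function and its behaviour are my own sketch, not how the real pipeline worked): the export step should fail loudly rather than silently truncate.

```python
XLS_MAX_ROWS = 65_536  # hard row limit of the legacy .xls (BIFF8) format

def export_rows(rows: list, limit: int = XLS_MAX_ROWS) -> list:
    """Refuse to export when data would be silently truncated."""
    if len(rows) > limit:
        raise ValueError(
            f"{len(rows)} rows exceeds the {limit}-row limit; "
            "export would silently drop data"
        )
    return rows
```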

Tips to avoid it:

  • To borrow from the WHO: Testers. Testers. Testers. Hire decent testers, the ones who infuriate you with “What if this series of 3 highly improbable events happens?”
  • As we’ll come onto in a second, listen to them when they say these things.

It was known about, but decisions were taken not to fix.

These aren’t fun. As someone who once predicted a particularly nasty auto-scaling bug and tried to warn people, only for it not to be accepted as needing a fix until it actually occurred, I can say it always leaves you feeling “if only you’d argued the case better”.

But it’s legacy…

Matt Hancock, the UK Health & Social Care Secretary, described the system as (paraphrased) “Legacy and being replaced”.

We’ve all been here: a system that is old and being replaced is considered frozen because “it’s going away”. However, I know of systems that were due for replacement in the next 6 months where, 3+ years later, development hasn’t started. This was used as a reason not to make relatively trivial UX changes that could have been a great improvement for the operators.

Tips to avoid:

  • Until you unplug the server, turn off the instance or stop new data flowing into it, no system is “legacy”

“It’s very unlikely… we can live with it”

Nobody, apart from epidemiologists and software billionaires, predicted a future epidemic on this scale – so maybe the problem was known and the decision was taken to live with it. Going back to the first recommendation and hiring a tester: sometimes so many scenarios are found that it’s easy to tune out, because, like Cassandra, the tester is always talking about problems.

Tips to avoid:

  • It’s ok not to fix everything, but if you’re living with a risk, make sure it’s known, and doesn’t fail silently.
  • Keep it in your risk log, and actually re-read that once a quarter to assess whether any risks are now more of a problem.
  • Try to be a little less agile, at least in methodological purity, and go beyond “what we’re building next” and look a few steps ahead.

We wanted to fix it…

This is when we get into some of the most depressing collection of scenarios:

“You can’t just make a change, this needs a PROJECT”

Changes need to be properly developed, tested and deployed, but sometimes this doesn’t need a full project structure created. When all improvements are painful to implement, people just accept and build workarounds, some of which you may not be aware are in place.

Tips to avoid:

  • Have a lightweight process for “short-order” requests that are small.
  • Find ways to bundle these into bigger releases alongside the “im-por-tant” work.

“It’s too expensive”

If you have a bad contract with your supplier, it could just cost too much to viably fix.

Tips to avoid:

  • Only buy software/services where the API is included, and is nice to develop against (I’m looking at you SOAP)
  • Have clear boundaries in your systems/components, own the integrations yourself, so you can swap components or combine as required

“The person who develops it is too busy/gone away”

You could imagine that even if this system was modifiable, right now the people with IT skills are elsewhere, working on the plethora of other systems that have had to be spun up to cope with the current situation.

Worse, though, is when software has gone stale: while you may have developers who could work on the problem, nobody really understands how to build or deploy it anymore, so it’s effectively stuck.

I’ve worked with clients who had problems with code going stale and instituted a very strict “if you modify a component you must fully adapt it to be in line with our current standards” rule to fix this. However, this just introduced a disincentive to make minor improvements, because the developers knew that alongside 5 lines of functional code changes, they would have to make 500 lines of dependency-related changes.

Tips to avoid this:

  • Avoid one product/system/component being solely one person’s ‘thing’.
  • Find ways to allow people to deploy minor changes as a BAU process, gradually updating components into modern ways of working without dogmatically requiring every component to be fully updated.

In conclusion

We’ve all used Excel files or CSVs in email, or a Google Sheet, as an interim solution. The problem is that these interims become permanent, and eventually they stop working. I’m lucky in that mine were about keeping TV or VOD on-air, not about life-or-death statistical reporting processes.

But still, let’s tone down the sneering “BUT WHY WASN’T IT AUTOMATED” talk. Yes, it clearly should have been, but none of us know the decisions that were made, or the software hooks the operators/developers had access to.

Always monitor your systems, spot where things can be better and make the incremental improvements because they add up over time. Never invest all your hope in the new system/rewrite because they’re always years away, and usually come with their own new ‘quirks’.

The One Boring Reason Why People Use the AWS Service

One of my clients recently started using a relatively new AWS CI/CD Service, and I just stumbled on a defensive/marketing type post from one of the traditional providers. And it made me realise how much vendors can miss the reason people choose to go with the AWS/GCP/Azure service, even if it’s inferior.

Aside: I’m not going to link to the article because they don’t deserve the clicks.

Back to their post, it went through a familiar structure:

  1. “But it doesn’t have all the features, our lovely features”
  2. “You can’t self-host, you’re LOCKED-IN!”
  3. “Why not buy into our broader platform?”

I’ll go through these in turn, before getting to the actual reasons.

“It doesn’t have the features…”

It doesn’t. It’s version 1 of an AWS product… they always launch very lean and gain new things.

And yes, it only supports 3 integrations while the Vendor supports around 30. It turns out, though, that those 3 are the most important ones. Others will be added, I’m sure, but only where people will use them.

“You can’t self-host, you’re LOCKED-IN”

Good. I literally don’t want to.

I know that some Ops teams feel happier when they can touch a container or an instance, but this is a product that can be replaced quite easily, including by this Vendor should the need arise.

They do have a SaaS offering you can pay for, but it’s relatively expensive for small-teams. (And we’ll come onto legal things later)

“Why not buy into our broader platform?”

Lock-in to your cloud provider is bad, but if you use all of their products you can get a great unified experience… which sounds a little like, erm, lock-in.

The simple reason people choose the service on their Cloud… procurement

Companies generally make buying stuff difficult. Every new vendor is a new round of legal review, potentially procurement exercises. It’s a painful affair.

This Vendor does sell their SaaS platform on the AWS Marketplace, but it’s another End User License Agreement (EULA) that needs to be accepted. And that means it has to be evaluated by a legal team: like with most other EULAs, the lawyers will probably go “Yeah, it’s got a bunch of stuff in it that nobody could ever enforce, so proceed at a tiny risk”.

When you already have a cloud-provider, and the legal/finance agreements are in place, it’s just easier to use the provided service.

The ‘default’ product may well be inferior, have fewer features, and even be more expensive: but if I can click “use this” without involving legal, it’s the one I’ll likely choose.

My workload is too special for Serverless

A few years back it was “My workload would cost more in the cloud”, which, while I’m sure it was true for some workloads, applied to a small and falling number. It fell even more when you actually costed in all the admin you were doing for your “cheap” servers.

Now it’s “my workload is cheaper on servers than serverless”. Again, this will be true for some workloads, but the percentage is falling every month as features increase.

Time for the Horror Story…

With every new technology, we need the horror story to dismiss it.

“bUt wHAT aBOUT tHe COld-StArT PeNalTy, thaT meANS tHiS IS uNusABlE fOr ME”

Serverless Function Refusenik

Yes, cold-starts are clunky, and if you’re on Amazon (at the time of writing), you cannot feasibly start a Lambda in a VPC because the startup penalty is too painful. This is apparently on their roadmap for this year.

Microsoft are launching a pricing model that allows you to pay for some pre-warmed functions, which could give you the best combination of easy scaling, if the pricing is acceptable.

Anyway, for a lot of these things, the API Gateway memory cache, or CDNs in front of your APIs, should be offloading a lot of traffic and ensuring that common items are rapidly available.

Stop swimming upstream

All the effort in IT infrastructure is heading towards serverless functions, container orchestration, and containers without actively-managed container hosts. The choice of hosted database or database-like storage services we are offered can make it confusing to decide. The answer is almost never “I’ll run something myself”.

Shunning these modern hosting options because you genuinely feel your service is so special is, in nearly all cases, choosing to take the hard path for little reason. And someone else will use them, have the advantage of spending far more time on functional code and far less on overheads, and could offer a cheaper/better product than you.

Yes, I know that when you are at the scale of one of the top ten internet giants it can make sense – Dropbox moved their storage to their own appliances – but you’re not really Dropbox, are you?

AWS Launches MediaConnect and almost gives us multicast

It’s Re:invent time, and Amazon have launched a new service to make video routing to the cloud reliable and easier to set up.

A few weeks back I was at the brilliant DPP Leaders Summit, it was under the Chatham House Rule.1 There were some great speakers, and I particularly loved the exec who, to paraphrase, “If it doesn’t work without months of professional-services, THEN IT ISN’T AN ACTUAL PRODUCT.”2

Anyway one of the speakers was facing rebuilding their entire stack due to ownership changes, and wanted to do so in the cloud. They said “We need multicast and Precision Time Protocol”. Which I can understand, for playout or production applications, the need for those two is pretty clear.

It’s now Re:invent season, which is the point in the year when AWS tend to release a lot of their good stuff. And yesterday they unveiled a new media ingest service AWS Elemental MediaConnect.

It’s a managed service to get your video signals to/from/between your Amazon clouds.

This has historically been a pain: back when I was working on the Video Factory project we initially mooted a box in the cloud that we would send the signal to, which would then fan out to both archiving and live streaming. This was hard to do, so we side-stepped the issue and just rapidly uploaded the stream to S3 in consistently sized chunks instead. Later something was put in place to do the streaming, using something that I don’t think has been spoken about much in public, so I shan’t detail it here.

Anyway, this new service allows you to send content to/from an endpoint using standard RTP (with/without Forward Error Correction) or the more reliable but commercial Zixi protocol. The video has an Amazon ARN identifier, which then means that external accounts can have permissions to subscribe to the stream, the documentation says a ‘flow’ can have up to 20 outputs.

How are we going to use this?

  1. Contribution to streaming output: fire the video somewhere and you don’t have to know if/where it’s being used
  2. Contribution for programming: using a few Amazon regions, broadcasters could very easily build a global contribution network to backhaul outside broadcasts
  3. Contribution from a playout appliance: if your cloud playout outputs to a MediaConnect flow, you can then output that flow to your broader distribution chain, allowing re-routing of things downstream.

It isn’t multicast within a VPC, it’s not PTP, I suspect the latency involved may be too great to allow it to be used to route between different stages in a virtual playout chain3.

MediaConnect does however simplify integrating cloud processing workflows by providing fixed points at the edges in and out of the cloud.

I’ll be interested to see how people use it.

  1. That it is a singular rule is one of those bits of pedantry I cannot let go of
  2. This is probably a topic for another time, but the fact that so many enterprise vendors expect you to pay for their ‘product’ then explain that ‘oh, no, you can’t just use it out of the box even in a basic manner’ is a bit of a joke
  3. I could be very wrong here, I don’t have one of those hanging around to test

Data collection at the job fair

Last weekend I went to a tech recruitment event, and I was a little shocked at how badly some employers did data collection.

When enquiring about potential employers, people have a vague expectation of privacy. This is lost when:

  1. Data collection is adding your details to a sign-up sheet, with the ability to see the details of everyone who did so before you
  2. Data collection is adding yourself as a contact on an iPad. This has all the problems of option 1, but with the added ability to send yourself any contacts you like while you’re entering your data

Finally, don’t collect what you don’t need. Do you need to capture gender? And if you do, consider that for some people the options might not be as simple as “Male/Female”.

Recipe for success

What does a team need to deliver a successful software project? I’m starting to think about what I’ll want in my next engagement.

There’s plenty left to do, but as I approach the end of my current main assignment as a Technical Architect, I’m starting to think what my future engagements should have.

This is my starter for ten (well, five):

  1. Anything but waterfall
  2. Genuine Public Cloud, with a hint of lock-in
  3. Internal users matter just as much
  4. Partnership with your Product Owner
  5. Embedded QA, seen as a benefit, not a drag

Anything but waterfall

Scrum? Kanban? Scrumban? I don’t really care exactly what it is; what matters is that it works for the project, and that everyone understands and supports it.

I hate designing things entirely upfront; it just seems so conceited to think you can genuinely design an entire system without trying to build any of it. While I know this doesn’t apply when you’re building a rocket¹ or CERN, you’re not doing that, are you?

Yes, you absolutely need a sense of roughly where you’re heading, and ideally an end goal that you’re heading towards – but you also need the pragmatism to know if you try to build that from the start, you’re going to burn lots of rubber on the road, while making very little progress.

Show your dev teams that you can and do go back to make things better. Build the sense of trust that when you say “Just build the slightly-hacky ‘tactical’ thing, we will fix it later” that you do go back and fix it.

You’ll free everyone up from the performance anxiety of “Must get it right first time, because I can’t go back and fix it”.

Genuine Public Cloud, with a hint of lock-in

I would like to think that cloud is a given, but I still face people who say things like “It’s just someone else’s computer” (yes, but in general they have better capacity planning than you) or “I could do x for cheaper” (I’m sure you could, but you’re usually not factoring in the hidden costs).

The main system we built does have an on-premise element, but it’s controlled by the cloud, and deployed in a similar way.

We host the core of the system in the cloud, and that gives us an agility in scale and deployment we don’t have on-premise. Could we build that ourselves, given time? I’m sure we could, but then we’d lose the benefits of the AWS value-add services…

“we use Amazon, but we only use EC2 and we don’t use any of their special services, so we’re not locked-in”

Speaking of which, when I hear that particular line, I want to congratulate the person on ensuring they’ve deployed their software in a way that will either cost them more, or be less reliable, or both.

At some level, to get the best value out of a cloud provider, you do need to be using their value-add services, running bits of your application serverless, or other bits as more scalable stateless systems.

Yes, if you write a Lambda, you can’t instantly port that to Google Cloud Functions, but given they both run Node, provided you put the thing that does the work in a scoped module, migrating should mostly mean writing the Google invocation code.
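As a sketch of that scoped-module idea (all names here are hypothetical): keep the work in a plain function, and make each provider’s wrapper thin enough to be rewritten in minutes.

```javascript
// work.js — the actual business logic, with no cloud-provider imports.
function summarise(records) {
  // Count records per status — a stand-in for whatever the real work is.
  return records.reduce((acc, r) => {
    acc[r.status] = (acc[r.status] || 0) + 1;
    return acc;
  }, {});
}

// lambda.js — the AWS-specific wrapper is just parsing plus invocation.
const lambdaHandler = async (event) => {
  const records = JSON.parse(event.body || "[]");
  return { statusCode: 200, body: JSON.stringify(summarise(records)) };
};

// gcf.js — porting to Google Cloud Functions means writing only this bit.
const gcfHandler = (req, res) => {
  res.json(summarise(req.body || []));
};
```

The design choice is that `summarise` knows nothing about events, requests or responses, so it carries across providers (and into your test suite) unchanged.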

I’m not saying use every service, but to start with the position that you’re just going to use Infrastructure as a Service, is too dogmatic.

Internal users matter just as much

Yes it’s an internal system. Yes it’s not public facing.

Yes it should still be as performant and usable as your public properties.

Facebook probably does more than your system. Facebook is generally fast to use, and yet nobody gets training in how to use it. If your system requires lots of training, are you doing things as well as you could?

Consumer technology and services are good. Very good. Your users expect your system to match that, and when you give people tools that work well, they’re freed from hating the system they are using, and allowed to actually focus on the tasks they’re doing.

Focussing on my current engagement, a partnership with our core users meant they took on some extra manual work while we ran the extended migration. They only agreed to that once we had earned their trust, and they realised that “could you do this for 3 months” was just that (granted, it was more like 4 months).

Partnership with your Product Owner

Product Management is still a relatively new discipline, so there is no one-true-way, and I hope there doesn’t become one, because not all products are the same.

Regardless, partnership with your Product Owner is crucial, and if they’re technical you want to work hand-in-hand with them on key design decisions. If they’re less so, you need their trust and for them to delegate responsibility.

Embedded QA, seen as a benefit, not a drag

The embedded tester in the team is a key resource. They should ask questions, spot the things we didn’t, and they’re invariably the first call for “do we know what happens in situation x?”.

For all the frustration that Test Driven Development can cause when doing genuine micro-services, the testing framework it provides means that we never ship the same bug twice. When we’ve suspected bugs, modifying an existing test has helped us check our hypotheses quickly.

Easy regression testing makes you far more able to build and iterate quickly.

In conclusion

You can’t make a project be a success, but there are things you can do that increase the chances…


  1. And talking of rockets, look at what SpaceX have done, which looks pretty much like rapid evolution of a rocket platform, adding more capabilities…