Anatomy of Ticketing ‘Fail’

(Or what happens when a company that isn’t eventbrite tries to be eventbrite)

A friend wanted to book some tickets for an event. I had some time today, so I said I’d book them.

For reasons of politeness, I’m not going to name the company. The event was massively over-subscribed, so there were always going to be people who were annoyed (kinda like the Olympics). I’m just annoyed because I saw things done in a generally ad-hoc way, and specific technical bugs hit me.

Tickets were delivered in tranches. This is a sure sign there will be massive peaks in demand…

The hour arrives, and, in your all too typical scenario: www.example.com rapidly stopped responding.

A few minutes later everything 403’d as they killed all access on the server to get the load down. Not great, but it’s a sign somebody is looking at the problem.

example.com then starts redirecting to http://xxxxxxxxxxxxx.cloudfront.net/url1 with all the individual ticket pages iFramed through to an Amazon EC2 instance (http://ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com/blah)

The IDs for the events were sequential, some had already been released, and you started to think that people had been gaming the system and ordering tickets prior to their availability windows. This was later denied by the company (which I accept), but given the way the scaling was going, at that time it was all too easy to think they were using security by obscurity to prevent access to the events.
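To illustrate the alternative to sequential IDs, here is a minimal sketch, not what the company actually ran. It assumes two things: the public identifier for an event page is a random token rather than a counter, and the availability window is enforced on the server rather than by hiding the URL. The function names and the example times are hypothetical.

```python
import secrets
from datetime import datetime, timezone


def new_event_token() -> str:
    """Return a random, hard-to-guess public identifier for an event page."""
    # 16 random bytes, URL-safe base64 encoded; unlike 1, 2, 3... the next
    # value cannot be guessed from the previous one.
    return secrets.token_urlsafe(16)


def can_order(now: datetime, opens_at: datetime, closes_at: datetime) -> bool:
    """Server-side check: orders are only accepted inside the release window."""
    return opens_at <= now < closes_at


# Arbitrary example times, purely for illustration.
opens = datetime(2011, 11, 1, 12, 0, tzinfo=timezone.utc)
closes = datetime(2011, 11, 1, 13, 0, tzinfo=timezone.utc)
early = datetime(2011, 11, 1, 11, 59, tzinfo=timezone.utc)
on_time = datetime(2011, 11, 1, 12, 30, tzinfo=timezone.utc)
```

The window check matters more than the token. Even if someone finds the page early, `can_order(early, opens, closes)` is false, so nothing can be bought before the release time.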

Later in the day, when tickets appeared, it was announced via a tweet. The tweet though didn’t link to the site, but to a mailing list post, which again didn’t reference the actual site.

The site had now changed: example.com was redirecting to http://xxxxxxxxxxxxx.cloudfront.net/url2, again passing through to an EC2 instance. Later many people complained on Facebook that they were looking at the old page and pressing reload.

Anyway, I tapped. I was at the gym. I was on my iPhone. But I know my credit card number, I know my paypal password, I can even use that tiny keyboard. I’ve topped my starbucks card up while in the queue. I can do this.

There was even a mobile site.

Only the mobile site was erroring because it was asking for a non-existent field/table. I had no way to change my user-agent (and wouldn’t have trusted Opera with my credentials), and in the 10 minutes it took me to get back to my laptop all the tickets had been sold.

No tickets for me+mate. Grumbly me having seen things done badly.

As many will say, this is not life and death – but example.com is primarily not a ticketing company, and that showed today.

If you’re going to compete with the likes of eventbrite, you’re going to have to be as good as eventbrite.

The Constructive “What can we learn” Bit

1. Believe it could happen, no matter how unbelievable.

Ask yourself “if we get off-the-scale load how will we fix it”. Working out volumetrics and scaling is hard, so alongside your “reasonable” load calculations of “we can turn off these bits of our site”, have your plans of “how you’ll move to something big, cloudy and scalable if the unbelievable happens”.

Are there components that you should move upfront? You have something like 15 to 30 minutes of goodwill. What do you need to do upfront, so that in that downtime you can come up fully scaled.

If you’re looking at a scalable elastic thing, look at how much it costs to start in that state anyway.

2. Architect things to give you agility

If you can’t host all your website on a scalable platform: Subdomains, DNS expiry times and proxy-passes give you room to move, but only if set up ahead of time.

Had tickets.example.com been available, example.com wouldn’t have had to disappear as it has done until tomorrow. You don’t want your website down for that length of time.

DNS changes can take time, much less if you dial down the expiry times, but again you have to do this prior to the event. Amazon’s Route 53 is cheap, so move the domains ahead of time and set Times to Live appropriately.

While you’re waiting for that propagation, proxy-passing can be a useful technique to bounce the traffic to the new server, while the DNS propagates. Proxy passing also means that example.com/tickets could have been redirected, rather than an entire domain.
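The proxy-passing idea boils down to a routing decision made at the front door. Here is a rough sketch of that decision only, assuming a front-end proxy that picks a backend by path prefix. The backend hostnames and the `pick_backend` helper are made up for illustration, and in practice this would normally live in the proxy’s own configuration rather than in application code.

```python
# Hypothetical mapping of path prefixes to the servers that should handle them.
ROUTES = {
    "/tickets": "http://tickets-backend.internal.example",
}

# Everything else stays on the existing site (also a made-up hostname).
DEFAULT_BACKEND = "http://main-site.internal.example"


def pick_backend(path: str) -> str:
    """Return the backend that should serve this request path."""
    for prefix, backend in ROUTES.items():
        # Match the prefix itself or anything beneath it, but not
        # look-alike paths such as /ticketsold.
        if path == prefix or path.startswith(prefix + "/"):
            return backend
    return DEFAULT_BACKEND
```

With something like this in place, only `/tickets` traffic is bounced to the new server while the rest of the site carries on as before.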

Are you caching what you can at an HTTP level with varnish or a service level with memcache?
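For a feel of what service-level caching buys you, here is a small in-process sketch of the pattern. It is not memcache or varnish, just the same “compute once, serve many times until it expires” idea. The `TTLCache` class and its parameters are illustrative assumptions.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny time-based cache: recompute a value only after it has expired."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the behaviour can be tested
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: skip the expensive work
        value = compute()  # expired or missing: do the work once
        self._store[key] = (now + self._ttl, value)
        return value
```

Even a 30 second expiry on the event listing means the database sees one query per half minute for that page, rather than one per visitor.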

3. Be careful sending people onto new URLs that won’t update

Taking the ticketing system off their main website was a good move, but the static page should have remained there. The second the site redirected to CloudFront, visitors were looking at a page that would get stale.

Many people would have pressed reload, expecting the new page to appear, but it didn’t because, as you can see from above, the URL changed. They could have used the CloudFront invalidation API, but this wasn’t used.

4. Remember data protection issues

This company used the Virginia data centre (which I think is the AWS default). Without going into the whole world of pain that is data-protection and EU borders – Dublin would have been lower latency and less problematic compliance-wise.

5. Testing is good, as is automatic deployment

There were not many tickets and the load was huge; those were not avoidable. I can’t say the same about the erroring mobile site; that should not have occurred.

6. Rehearse

It’s not fun doing disaster recovery, but if you’re receiving catastrophic load then that is what you’re doing.

Write the script. Have someone else test it.

It’s not a valid plan until you have shown it works.

Posted in Technology

a new adage for social

Arthur C. Clarke famously said “Any sufficiently advanced technology is indistinguishable from magic”.

The recent Facebook frictionless sharing gives us a new one: “Any sufficiently complete and transparent sharing is just going to be creepy”.

We’re basically a fickle bunch. Some of us want to share more easily, but sharing everything also irritates us. Facebook in particular annoys me because I can’t send my habits for the useful bubbled-up aggregations, without the endless inanity of GARETH IS LISTENING TO BLAH. Given I listen to a lot of the same songs that’s really boring and spam. Ditto what articles I’m reading on The Guardian, individually quite dull but as part of the “things that you & your friends have been reading” aggregated things a bit more interesting.

Anyway, this is the kind of problem that services like Zeebox will always face: incomplete or creepy. As a standalone app I have to remember to use it (and I’m already using my iPad for twitter). If they did ever have direct integration with my TV (by this I mean my TV updating things, rather than the existing TV Remote functionality in the app), I’d be creeped out because again, viewing habits reveal some awful taste. Maybe I just need a “share this” button on my remote that can easily publish what I’m doing to Facebook or some other back-end. A bit less friction, but still some.

It’s a tough one to solve, but we can’t seem to be comprehensive and convenient without being creepy.

Posted in Broadcast, Rights and Legal, Technology

iMessage and Phonecalls

SMS can come in alongside phone calls. I’ve got used to this, I’ll sometimes call people while waiting for other people to turn up, I’ll hear a text message arrive, can read, and act accordingly.

iMessages are shiny. iMessages are data. This means they’ll only arrive if I’m on WiFi or on a 3G UMTS connection.

If I’m on a 2G GSM connection, or a CDMA network user in the States, and not connected to WiFi, I’ll not be receiving messages while I’m on the phone.

Yes, iMessage says things are “delivered”, but I don’t yet know what that means, and I’ve seen people complain on twitter that it’s not delivered to the handset. I guess people haven’t seemed that bothered with BBM, but that has stronger delivery feedback. The transparency of SMS/iMessage is the problem: I’ve had people ask “HOW DO I SEND THEM?” because the only distinction is the blue/green colouring. In some ways it’s so transparent you might not understand.

I’m not sure this is a massive issue, but it’s got the potential to make meeting people comically awkward, given my phone still spends a bit of time on 2G networks.

Posted in Technology

Security becoming life and death

(Given many of my posts are second rate Gruber posts on the mac, this one is a second rate Schneier)

I like Chip+PIN. I don’t think EMV is perfect: it has the complexity of a committee driven standard created by competing companies, and it has flaws and oversights. I’ll still wager it’s more secure than someone looking at a signature, and since skimming attacks get immediately moved abroad (when the cloned cards are created from the legacy mag-stripe) behavioural analysis makes spotting fraud a bit easier.

I do not feel the same way about Verified By Visa which I continue to curse every time I use it.

Anyway I very much disliked the UK Cards Association’s response to the excellent Cambridge Computer Laboratory when they published flaws and potential attacks: demanding they take the papers down. They played the near standard “oh it’s very hard to do right now, we don’t think anyone could really do that, please, they’re very clever and most people won’t be” line. The only problem is that with each new vulnerability, the Cambridge Team appear to be producing more plausible attacks. UK Cards were rightly told to go away.

It would have been nicer to hear:

“We thank the CCL for their work in exposing potential attacks in the EMV system. At the moment we think these are peripheral threats, but we will work with EMV partners to take the findings onboard, and resolve these as the standard evolves”

This of course blows the “Chip+PIN is totally secure” line out of the water – which matters because they’re trying to move the liability onto the consumer, and admitting the system is even partially compromised lessens that.

At the end of the day, this is just money. There’s always been fraud, there always will be. Not life and death.

I used to work in Broadcast. Many of those systems were insecure, relying on being in a partitioned network. DNS and Active Directory were frowned on, being seen as potential points of failure rather than useful configuration and security tools. The result was a known, but brittle system. Hardening of builds was an afterthought and the armadillo model of crunchy perimeter, soft centre, meant that much like the US Predator Drone control pods, once inside, passage was made easy.

Depressing, yes? Particularly because so many of these problems were solved before, and solved well. But it was just telly. Not life and death.

I mean, it’s not like you can remotely inject someone with a lethal dose of something.

Except it is: A few months back someone reverse engineered the protocol of their insulin pump, and was able to control it with the serial number. This was bad enough. Devices that inject things into humans shouldn’t be controllable without some form of authentication beyond a 6 digit number.
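As a rough idea of what authentication beyond a serial number could look like, here is a sketch using Python’s standard `hmac` module. It is emphatically not the pump’s real protocol. It assumes a secret key shared between controller and device at pairing time, plus an increasing counter so that a sniffed command cannot simply be replayed. The `Device` class, the command bytes and the key handling are all hypothetical.

```python
import hashlib
import hmac
import struct


def sign_command(key: bytes, counter: int, command: bytes) -> bytes:
    """Tag a command with HMAC-SHA256 over the counter plus the command."""
    message = struct.pack(">Q", counter) + command  # 8-byte big-endian counter
    return hmac.new(key, message, hashlib.sha256).digest()


class Device:
    """Hypothetical receiving end: only acts on fresh, correctly signed commands."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._last_counter = 0

    def accept(self, counter: int, command: bytes, tag: bytes) -> bool:
        if counter <= self._last_counter:
            return False  # replayed or stale command
        expected = sign_command(self._key, counter, command)
        if not hmac.compare_digest(expected, tag):
            return False  # wrong key or tampered command
        self._last_counter = counter
        return True
```

Knowing the serial number, or recording an earlier transmission, is no longer enough: without the key the tag will not verify, and a replayed message fails the counter check.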

At the time the familiar: “it’s too difficult, you still need the number, you’ve got to be nearby” response was provided.

Two months later, another security person has now managed to decode the magical number, and used a long distance aerial to be able to send commands to the pump.

I’m sure it’s still “too hard to be viable”: because the death of someone isn’t something that has major consequences that could have the kind of support that makes hard things viable…

Security is hard to do well, and we need to start embedding it in everything – it is now a matter of life and death. But it’s hard, and hard psychologically just as much as technically. You should really use an existing algorithm implementation because the chances are it’s better than yours: but that’s licensing and IPR, so just roll your own cipher believing your application is too trivial to be a target for hacking. Besides your proprietary wire-protocol is proprietary, it’s already secret. People aren’t going to bother to figure it out.

Security makes things harder: you can’t just wire-sniff your protocol anymore to debug stuff. Your test suites become more complicated because you can no longer play back the commands and expect the device to respond. That little embedded processor isn’t powerful enough to be doing crypto: it’s going to up the unit price, it’s going to increase power usage and latency.

Many programmers, still, belong to the “if I hit it and hit it until it works” school of coding. I don’t mean test-driven-development, I mean those coders who think if it compiles, it ships. These people don’t really adapt well to working in a permissions based sandbox; it’s harder to split your processes up so that only the things that need the privileges have them (we’ve all done ‘chmod 777 *’ to get an application up and running).

Until everyone realises that every device with smarts is a vector, from Batteries, to APIs, to websites, we’re increasingly at risk. I guess that massive solar flare could take things out for us.

Posted in Technology

The Problem Flickr Never Solved

I use instagram lots now to take little snippets. They flow nicely into my Facebook or Twitter streams. I rarely use Flickr apart from for more “curated” photos. My usage of flickr is really dying off. And I think I know why.

Flickr has never provided a way to see the most relevant content from my contacts. I can see the top five, or a single photo on the friends page. That doesn’t provide me with completeness, so I used a friends API call to give me an RSS feed of all photos.

Services like Facebook have long treated completeness as being an impossible goal, so they prioritise what they show you. Granted this leads to some of my friends bitching about that prioritisation but they are starting from the pragmatic position that “You will never see everything because there is too much”.

I don’t think flickr ever really solved that problem. I get that it’s much harder than atomic things, because in many cases a clump of eight photos is relevant, rather than any one of those, but I don’t have a meaningful way to dip into the stream and get the most interesting stuff, merely the most recent stuff.

It’s a problem that every online service needs to solve. If the user can’t see everything, how can we give them a chunk of relevant stuff?
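The “clump of eight photos” point suggests one small building block: treat uploads that arrive close together as a single unit before deciding what to show. Below is a minimal sketch of that grouping step only. The `max_gap` threshold and the use of plain timestamps are assumptions, and it says nothing about how Flickr actually works or how relevance would be scored afterwards.

```python
from typing import List


def group_into_bursts(upload_times: List[float], max_gap: float) -> List[List[float]]:
    """Group upload timestamps (in seconds) into bursts.

    A new burst starts whenever the gap since the previous upload
    exceeds max_gap.
    """
    bursts: List[List[float]] = []
    for t in sorted(upload_times):
        if bursts and t - bursts[-1][-1] <= max_gap:
            bursts[-1].append(t)  # close to the previous upload: same clump
        else:
            bursts.append([t])  # long silence: start a new clump
    return bursts
```

A stream could then surface one representative per burst from each contact, rather than eight near-identical thumbnails or just the most recent upload.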

(and uploading with instagram is “frictionless” to use the latest jargon, compared to the clunky flickr app)

Posted in Technology

In Praise of Policy

When I hear “policy” I have a deep sense of dread that only working at the BBC for many years can give you. I fear opening the 40 page manual describing in prescriptive detail what should, shouldn’t, must and can’t be done for every scenario. Well, invariably every scenario apart from the one that you’re facing at that moment: which could be one of two with contradictory advice.

I think we all share this healthy skepticism for excessive policy, but I realised when writing an article on Social Media and Business that policy was exactly what you needed. I started off talking about URLs, and tools like HootSuite, before realising the real thing to get right is the underlying policy: what are we trying to get done here?

It doesn’t need to be a massive tome, it doesn’t need to have flowcharts. It could just be a list of 5 bullet-points that cover what you’re doing.

Once you can succinctly sum up what you’re trying to do, doing it is invariably much easier.

Posted in Work

Siri and Boris Bikes

I know it’s not launched yet but it would be great if Apple incorporates the TFL Barclays Cycle Hire feed into Siri. Siri is perfect for when you can’t use the screen, like when you’re cycling. It would be great to be able to ask:

“What cycle docks are available near home?”

“5 docks available at Oval Way, and 4 at Kennington Post Office”

I know it’s a market specific request, but there’s an XML feed they could parse – here’s hoping the “beta” label gives them the wiggle room to add this kind of thing. There are a number of bike-share schemes around the world so it might not be that bespoke.
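To show how little parsing is involved, here is a sketch that turns a cycle-hire style XML document into the sort of spoken reply above. The element names (`station`, `name`, `nbEmptyDocks`) and the sample data are my assumptions for illustration, not a verified copy of the real feed, and working out which docking stations count as “near home” is left out entirely.

```python
import xml.etree.ElementTree as ET
from typing import List, Set, Tuple

# Made-up sample in the assumed shape of the feed.
SAMPLE_FEED = """<stations>
  <station><name>Oval Way</name><nbEmptyDocks>5</nbEmptyDocks></station>
  <station><name>Kennington Post Office</name><nbEmptyDocks>4</nbEmptyDocks></station>
  <station><name>Somewhere Full</name><nbEmptyDocks>0</nbEmptyDocks></station>
</stations>"""


def available_docks(feed_xml: str, wanted: Set[str]) -> List[Tuple[str, int]]:
    """Return (station name, free docks) for wanted stations with space."""
    root = ET.fromstring(feed_xml)
    found = []
    for station in root.iter("station"):
        name = station.findtext("name", default="").strip()
        free = int(station.findtext("nbEmptyDocks", default="0"))
        if name in wanted and free > 0:
            found.append((name, free))
    return found


def spoken_reply(docks: List[Tuple[str, int]]) -> str:
    """Format the result as a sentence a voice assistant could read out."""
    if not docks:
        return "No docks available nearby"
    first_name, first_free = docks[0]
    reply = f"{first_free} docks available at {first_name}"
    for name, free in docks[1:]:
        reply += f", and {free} at {name}"
    return reply
```

The hard part is the voice front-end and the notion of “home”, not the feed.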

It won’t happen; but I can dream of a voice-controlled Bicycle as a Service future.

aside: you could emulate this with a text message service

“Siri, send a message to bikes, ‘i’m coming home’” and siri could read the reply.

Posted in Technology

Computer Illiteracy is not a badge of honour

I was at a conference a few weeks ago, and someone quite senior came on and declared, proudly, that “I’m computer illiterate”.

The crowd guffawed.

I can’t imagine that happening if someone came on and said, “I’m functionally illiterate”.

Computers are important now. They’re all-pervasive, not looking like computers, and vital to getting things done.

Let’s stop letting people revel that they don’t understand them.

Posted in Technology

Is Facebook Open Graph going to enable “proper” social TV?

A few months ago I blogged about hashtags, and how they were imperfect but mostly worked… One of my meaningless predictions was that “Services like Facebook, Twitter and Google+ will provide ways to embed this metadata in posts”.

Today Facebook unveiled Open Graph at f8. I didn’t really pay that much attention until someone linked to a way to get the new profile quicker, which involved signing up to develop an app using the Open Graph actions.

At which point I realised I was staring at a simplified RDF: you have Objects, and Verbs.

[Figure: Defining Verbs and Objects]

Clicking through to define the objects, you can define custom fields: both visible and hidden. The channel people are watching a show on, the episode number, the internal identifier to link back to it on the website. Give Facebook the data to bubble up insights like “5 of your friends are watching the XFactor”, but driven from data and not term-extraction from statuses.
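To make the actor, verb and object idea concrete, here is a toy model of how structured actions could be rolled up into a “5 of your friends are watching” style insight. This is not Facebook’s API. The `Action` class, its field names and the `friends_doing` helper are invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class Action:
    """One structured event: who did what to which object."""
    actor: str
    verb: str
    object_title: str
    # Custom fields such as channel or episode number (hypothetical names).
    fields: Dict[str, str] = field(default_factory=dict)


def friends_doing(actions: Iterable[Action], friends: Set[str],
                  verb: str) -> Dict[str, int]:
    """Count distinct friends per object for a given verb."""
    who: Dict[str, Set[str]] = defaultdict(set)
    for action in actions:
        if action.verb == verb and action.actor in friends:
            who[action.object_title].add(action.actor)
    return {title: len(actors) for title, actors in who.items()}
```

Because the verb and object are explicit, the count comes straight from the data, with no guessing at what a free-text status meant.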

Facebook always had the social network. Now it’s defined ways to create these events that have never been worthy of statuses, but have always been ready for Facebook’s insight.

It will be very interesting to see what the likes of GetGlue, iPlayer, Zeebox make of this. We’ve just been given a sensible way to aggregate realtime viewing activity.

Who’s going to be the first to populate it easily?

Posted in Technology, Television

On the Beta BBC Homepage

Standard “I used to work for the BBC” disclaimer applies.

Over at beta.bbc.co.uk you can see the new BBC homepage. The BBC have written a few articles about it.

My first impression is that this is the first homepage that has done what the home-page needed to do: be a shop-front for the whole of the BBC. Previous home-pages have always been very silo-structured. News had their bit, ditto Radio, Sport, Weather, etc.

It felt like representation of the org-chart rather than conveying the breadth of the site.

The new one feels both busier, and simpler. Without the excessive and technically unreliable customisation it’s lost that horrible air of “is it a homepage or a BBC specific My-Yahoo?”.

I love the design, it’s a clean grid structure and I really hope that as the Ten Products are launched, they can all share this sharp-styling which is a great evolution of the GEL design guidelines.

The BBC has perhaps been through a few too many home-pages in previous years, but this one feels like it’s been given a really tight scope and done that – most people will still be browsing to bbc.co.uk/news or finding content via Google.

A gripe though, given they’re linking so much to iPlayer things, they really need to make the correct redirect to the TV/iPad “big screen” iPlayer version of a programme and not a link to the front-page. (Oh, and sort out an HTML5 player for News content)

Posted in Broadcast, Technology