Operation Bootstrap

Web Operations, Culture, Security & Startups.

Give Your Developers Prod Access - It's Trust


This isn’t a new idea and plenty of companies already do this. I had a discussion with a co-worker about this last week and wanted to get my thoughts down here so I can laugh at them later on when I get burned, which I haven’t yet, but have been assured I will.

There is an idea that a web development company has two organizations with different and apparently opposing roles: Development and Operations. This quickly turns into a discussion about who does what:

Roles

Developers:

  • Add capabilities to the existing system (write code, evaluate new components, integrate third-party products, etc.)

  • Optimize previously added capabilities to the existing system (improve performance, fix bugs, design new architectures)

  • Help turn business needs into business value (codify a business requirement into a deliverable product)

Operations:

  • Add capabilities to the existing system (implement monitoring, configuration management, evaluate new applications & products)

  • Optimize previously added capabilities to the existing system (improve performance, fix problems, design new architecture)

  • Help turn business needs into business value (reduce cost to deliver, improve availability, improve security)

These groups do the same things in different ways. You could just as well have the two groups be “API Developers” and “GUI Developers” – they have sufficiently different goals to create conflict & have different points of view.

What are both groups doing? They are building and operating a service. Period.

But, but, but…

“But developers with access to production could get access to customer data.”

“But developers think differently than Operations and they could cause outages.”

“But developers might go in and change something without telling anyone.”

“But developers might break something I have to fix and that would piss me off.”

All of the above have happened to me in one job or another – every one of them. In every single case you know who did it? An Operations team member.

If I had $50 for every time a developer did it to me in environments where they had production access, I’d maybe have $50. This is my experience – yours may be completely different.

You trust them to write your code – that’s the product that your company runs on. Do you put layer upon layer in place to make sure they aren’t inserting malicious code? Do you prevent them from walking out of your building with your entire codebase? I know that in larger organizations the answer may be yes – if that’s going through your head, re-read the title of this blog. I don’t care about hamstrung behemoth companies.

Do your ops folks have access to your code? You trust that they won’t go break something in there, but you don’t trust that developers won’t go break something in production?

The reality

Developers care about the products they build, just like Operations does. If they don’t care then you have bigger problems and giving them production access will only make those problems evident faster – which is good. Developers also write code with a certain understanding about how the production world works and when they don’t have production access, that understanding is often wrong.

Misunderstanding, lack of data, and the inability to predict how code will behave in a production environment are – in my opinion – more often fatal than any stupid or malicious act from a developer. Yes, you can try to build a production-like environment for their testing, but there is nothing like the real thing – there never will be. It will always be simulated, it will always fall short in areas, and it will never be viewed as a perfectly accurate representation of production.

I know there are lots of arguments out there on both sides of this but I know where I fall. I also know this runs counter to many of the regulatory “requirements” out there. I’m not ignoring that, but I am choosing to challenge us to come up with a better way instead of giving ourselves a false sense of security by blocking access.

The benefit

So why would I want to give my developers access to production? Have you ever been given the master key to an office? How about being given your parents’ car keys for the first time? You may not have thought about it at the time, but there is tremendous pride and appreciation that comes from being trusted. All the silly team building games folks play – they’re about building trust. Another word for trust is respect.

When you give your developers access to production you are saying a few things quietly but clearly:

  • I value you as a team member and I value your contribution – I want to maximize what you can do for us.

  • I expect you to learn about our production environment and leverage this access to write better code.

  • I trust you; please do not violate that trust.

  • I think you are competent and professional and believe you’ll do the right thing.

You can tell developers these things without giving them production access – but it’s much less convincing.

The safety net

So what happens that first time a developer drops a database table in production, thinking they were working on a development environment? You run a post-mortem, and the developers come to it (NOT just the one who caused the problem).

  • Keep it blameless: you are trying to understand what information & decisions led up to the event – NOT who is responsible for it.

  • Identify the timeline and what people understood about the situation that led to this decision.

  • Identify gaps in communication – why did that developer think they were working on a development box?

  • Identify gaps in your defense – you did have backups, right? You were able to recover the db table, right? You had a plan to communicate with customers during the outage, right?

  • Create a set of corrective actions that will protect against this in the future. For example – maybe we make sure production machines all have a bright red prompt so you know you are working on production.

  • DO NOT use this as a reason to remove developer access to production.
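The bright red prompt mentioned above is cheap to build. Here is a minimal sketch in Python, not a definitive implementation. It assumes each machine exposes its environment through a variable called `APP_ENV` (a hypothetical name, as are `build_prompt` and the label format), and it uses standard ANSI escape codes for the color.

```python
import os
import socket

# Standard ANSI escape sequences: bold red text, and reset to defaults.
RED = "\033[1;31m"
RESET = "\033[0m"

# Environment names treated as production (an assumption for this sketch).
PRODUCTION_NAMES = {"prod", "production"}


def build_prompt(env: str, host: str) -> str:
    """Return a shell-style prompt; production prompts are wrapped in bold red."""
    label = f"[{env}] {host}$ "
    if env.lower() in PRODUCTION_NAMES:
        return f"{RED}{label}{RESET}"
    return label


if __name__ == "__main__":
    # APP_ENV is a hypothetical variable name; default to development if unset.
    env = os.environ.get("APP_ENV", "development")
    print(build_prompt(env, socket.gethostname()))
```

On a box where `APP_ENV` is `production` the prompt prints in bold red; anywhere else it is plain. If you feed a string like this to bash’s `PS1`, the escape codes need to be wrapped in `\[` and `\]` so the shell computes the line width correctly.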

If the problem repeats itself, your post-mortem process continues. If the same offenders keep doing the same thing, you have to ask yourself if they should work for you. This applies to Operations team members just as much as it does Developers. If you can’t manage the responsibility of production access, then you don’t belong at a company that gives production access to the whole team.

Also, always keep in mind that your Operations teams can and will make the same mistakes you are worried about your Devs making. Except they’ll make them more often, because they carry an implicit belief that they have a “right” to work in that environment and because there is no environment like production in which they can test their changes. It’s no different – it’s just how you frame it.
