How to Make On-Call Work for Everyone

Looking for your next job? Building out your tech organization? How on-call is handled can make a huge impact on your success.
I never liked being on-call (slight understatement) or asking others to shoulder some of the load. Sometimes it feels like it’s a penalty for being more involved and knowledgeable about our code and infrastructure. And it definitely is a big distraction from core development and innovation.
But there really is no way to avoid it once you have a live product or website with paying customers. Somebody needs to be available just in case something goes wrong.
How on-call is done in your organization or by your prospective employer can make all the difference in your success (and sanity). Here are some approaches I’ve seen that can improve the on-call experience and overall productivity.
Being woken up in the middle of the night because the NOC or support team opened a high-severity ticket, only to find out that it was a relatively non-critical issue, absolutely sucks.
To solve this all-too-common scenario, one company I worked at came up with a simple solution. They replaced the “High Severity” designation with “Wake Up R&D.” By spelling out the consequence of opening a high-severity ticket, they forced the opener to think twice (maybe even thrice) about whether the issue was really worth waking someone up in the middle of the night.
Make sure that you or your prospective employer has a good method for separating the signal from the noise. 
For junior or new employees who might be unfamiliar with all the intricacies of what constitutes a critical issue, how it should be handled, etc., it’s essential to have a runbook or some other documentation that outlines what issues warrant waking R&D up in the middle of the night. 
While this type of documentation goes a long way in describing various scenarios, their severity, and how they should be handled, it still takes a few months for someone to get a sense of the systems they’re working with and to be able to classify incidents accurately.
Make sure that you or your prospective employer invests the time and training for newbies to ease them through this learning process.
Well, it might be the sixth DORA metric, since Google already added a fifth in 2021.
Either way, hat tip to Charity Majors, CTO at Honeycomb, who suggests in this excellent blog post that software engineering management should be evaluated not only by the four original DORA metrics but also by how often their “team is alerted outside of working hours.”
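If you want to track that number yourself, here is a minimal sketch in Python of how it could be counted. Everything specific in it is an assumption for illustration: the 09:00 to 18:00, Monday to Friday working hours, the function names, and the sample alert timestamps are all made up, and the timestamps are treated as the on-call engineer's local time. A real setup would pull alert times from whatever paging tool you use.

```python
from datetime import datetime, time

# Assumed working hours (hypothetical policy): 09:00-18:00, Monday to Friday.
WORK_START = time(9, 0)
WORK_END = time(18, 0)


def is_outside_working_hours(ts: datetime) -> bool:
    """Return True if an alert fired on a weekend or outside WORK_START..WORK_END.

    `ts` is assumed to be in the on-call engineer's local time.
    """
    if ts.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return not (WORK_START <= ts.time() < WORK_END)


def count_off_hours_alerts(alert_times: list[datetime]) -> int:
    """Count how many alerts fired outside working hours."""
    return sum(1 for ts in alert_times if is_outside_working_hours(ts))


# Made-up sample data, not taken from any real system.
alerts = [
    datetime(2024, 3, 4, 10, 30),  # Monday mid-morning: inside working hours
    datetime(2024, 3, 5, 2, 15),   # Tuesday 02:15: middle of the night
    datetime(2024, 3, 9, 14, 0),   # Saturday afternoon: weekend
]

off_hours = count_off_hours_alerts(alerts)
print(f"{off_hours} of {len(alerts)} alerts fired outside working hours")
```

Tracking this count per week or per rotation, rather than as a single total, makes it easier to see whether changes to severity definitions or runbooks are actually reducing the number of night-time pages.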