€49.99
incl. VAT
Free shipping*
Expected publication date: September 3, 2024
  • Paperback

Product description
System downtime is a common pain point for a great many organizations. Recently, a 12-hour store outage cost one Silicon Valley giant $25 million. And a 14-hour outage cost another corporation an estimated $90 million. News like this reminds companies that they need an incident management program before the unexpected happens. And there's no better time to get started than now. But where do you begin? What elements does your program require? Who should be in charge?

In this practical book, seasoned professionals Michael Kehoe and David Cintz, who build, maintain, and run incident management programs for Confluent and LinkedIn, show you how to create a program that's effective, efficient, scalable, and automated, regardless of the size of your organization. You'll also learn how to tailor the program to meet the specific needs of your company as it continues to grow.

This book will help you:
  • Understand the importance and benefits of an incident management program
  • Create an effective incident categorization system and after-incident review program
  • Create an effective automation strategy for incidents
  • Build a comprehensive on-call program and train engineers

About the authors: Michael Kehoe is an author, speaker, and senior staff security engineer at Confluent, leading security initiatives for multiple organizations. David Cintz is an expert in leading large-scale incident response programs and serves as the staff technical program manager for security incidents at Confluent.
About the author
Michael is an author, speaker, and Senior Staff Security Engineer at Confluent, leading multi-organization security initiatives. Previously, he was a Senior Staff Site Reliability Engineer (SRE) at LinkedIn, architecting LinkedIn's move to Microsoft Azure. Before graduating with a Bachelor of Electrical Engineering from the University of Queensland (Australia), Michael interned at NASA Ames Research Center, building small satellites known as PhoneSats. While working at LinkedIn, Michael led the company's work on Incident Response, Disaster Recovery, Visibility Engineering, and Reliability Principles. He has also been embedded with the profile, traffic, and Espresso (key-value store) teams. After leading LinkedIn's last physical data-center build, he is now the architect for how LinkedIn builds its infrastructure in Azure. Michael has spoken at numerous events all over the world and is a co-author of the books "Cloud Native Infrastructure with Azure" and "Reducing MTTD for High Severity Incidents" (2018).