Learning Options

  Online Video-Based Learning
  • Flexible Schedule
  • Expert Trainers with Industry Experience
  • High Pass Rates
  • 24/7 Personalised Support
  • Interactive Learning Materials

  Live Online Classes
  • Expert Trainers with Industry Experience
  • Live Assessment and Feedback
  • Interactive Learning Materials
  • Networking Opportunities
  • High Pass Rates

Overview

The Site Reliability Engineering (SRE) Practitioner Training Course is designed to help professionals implement advanced reliability practices in modern IT environments. With a focus on automation, monitoring, and incident response, the course provides the tools and frameworks necessary to manage highly scalable and reliable systems, ensuring optimal performance. Learners will gain the expertise to enhance system reliability while adapting to evolving technological demands.

This training prepares learners to align technical operations with organisational goals, ensuring reliability and efficiency across infrastructures. Key topics include service-level objectives (SLOs), error budgets, incident management, and capacity planning, all tailored to address modern IT challenges. The course also emphasises proactive measures to optimise system uptime and minimise disruptions.

Delivered over 3 days by MPES, this hands-on training combines real-world scenarios, interactive sessions, and comprehensive exam preparation, empowering learners to excel in their roles as SRE practitioners. Learners will benefit from expert-led guidance and actionable insights to apply SRE principles effectively in their organisations.
 

Course Objectives
 

  • Understand the core principles of Site Reliability Engineering (SRE) and their practical application
  • Learn to define and implement service-level objectives (SLOs) and error budgets
  • Master strategies for automating operational tasks and reducing toil
  • Enhance skills in incident management and post-incident analysis
  • Develop capacity planning frameworks to ensure scalability and reliability
  • Explore effective methods for monitoring, observability, and alerting
  • Prepare thoroughly for SRE practitioner certification and real-world challenges 

Average completion time: 3 months

100% online, with unlimited support

Start anytime and study at your own pace

Course Details

Entry Requirements

    • Educational Background: A degree in IT, Computer Science, or equivalent experience in IT operations or development. 

    • Professional Experience: Prior exposure to software development, system administration, or IT operations is recommended. 

    • Language Proficiency: Proficiency in English is required, as the course materials and certification exam are conducted in English. 

     

Learning Outcomes

    • Develop Advanced SRE Skills: Master the principles and practices of Site Reliability Engineering. 

    • Implement SLOs and Error Budgets: Learn to define, monitor, and manage service-level objectives effectively. 

    • Optimise Incident Management: Gain expertise in handling incidents and conducting post-incident reviews. 

    • Automate and Scale Operations: Build automation frameworks to minimise toil and improve efficiency. 

    • Strengthen Monitoring Capabilities: Implement robust monitoring and alerting systems for enhanced observability. 

     

Target Audience


    This course is ideal for professionals seeking to advance their expertise in SRE practices. It is particularly suited to: 

    • Site Reliability Engineers 

    • DevOps Engineers 

    • IT Operations Managers 

    • Infrastructure Engineers 

    • Software Engineers transitioning to SRE roles 

    • Professionals responsible for system reliability and scalability 

Course content


    Module 1: SRE Anti-Patterns 

    • Break the Ice with a Recap of DevOps Institute’s SRE Blueprint 

    • Discuss How SRE Works in a Distributed Ecosystem 

    • Discuss Some of the SRE Barriers 

    • A Few SRE Anti-Patterns (Discuss the Right Patterns Too) 

    • Discuss the Case Story of How Monzo Bank Learned from the Causes Leading to a SEV1 Issue 
       

    Module 2: SLO is a Proxy for Customer Happiness 

    • What Has Changed with SLO? 

    • Identifying System Boundaries for Setting SLIs is Critical 

    • How Do You Use Error Budgets Beyond the Velocity Vs. Stability Debate? 
       

    Module 3: Building Secure and Reliable Systems 

    • Building Secure and Reliable Systems 

    • Non-Abstract Large-Scale Design 

    • Designing for the Changing Architecture and Distributed Ecosystem 

    • Fault Tolerant Design 

    • Designing for Security 

    • Designing for Resiliency 
       

    Module 4: Full Stack Observability 

    • Modern Apps are Complex and Unpredictable 

    • Slow is the New Down 

    • Pillars of Observability 

    • Using OpenTelemetry 
       

    Module 5: Platform Engineering and AIOps 

    • Taking a Platform Centric View 

    • Using AIOps to Improve Resiliency 

    • How Can DataOps Help? 

    • Implementing AIOps 

    • Measuring AIOps 
       

    Module 6: SRE and Incident Response Management 

    • SRE Key Responsibilities Towards Incident Response 

    • DevOps and SRE and ITSM (New Vs. Old Ways) 

    • OODA and SRE Incident Response 

    • SRE and CLR (Closed Loop Remediation) 

    • Swarming – Food for Thought 

    • AI/ML for Better Incident Management
       

    Module 7: Chaos Engineering 

    • Navigating Complexity 

    • What Is Chaos Engineering? 

    • What Chaos Engineering Is Not 

    • Chaos Engineering Myths 

    • Conducting Chaos Engineering Experiments 

    • Chaos Engineering for Security 
       

    Module 8: SRE is the Purest Form of DevOps 

    • Key Principles of SRE 

    • SREs Help Increase Reliability Across the Spectrum 

    • Metrics for Success 

    • SRE Execution Models 

    • Culture and Behavioural Skills are Key 

    • Transformation After Implementing SRE Practices 

MPES Support That Helps You Succeed

At MPES, we offer comprehensive support to help you succeed in your studies. With expert guidance and valuable resources, we help you stay on track throughout your course.

  • MPES Learning offers dedicated support to help you succeed in your course.
  • Get expert guidance from tutors available online to assist with your studies.
  • Check your eligibility for exemptions with the relevant professional body before starting.
  • Our supportive team is here to offer study advice and support throughout your course.
  • Access a range of materials to help enhance your learning experience. These resources include practice exercises and additional reading to support your progress.

Need help with your course?

Our course advisors are here to help guide you and ensure that you choose the right course for you and your career journey.

Have Questions? We’ve Got You

If you have any questions, we’re here to help. Find the answers you need in the MPES detailed FAQ section.

Q. Do I need prior experience in SRE to take this course?

No prior SRE experience is required, but knowledge of IT operations, system administration, or software development is recommended. 

Q. What topics are covered in this course?

The course covers SRE principles, SLOs, error budgets, automation, incident management, monitoring, observability, and capacity planning. 

Q. How will this course help me in my career?

This course equips you with advanced SRE skills, enabling you to manage reliable systems, align operations with business goals, and advance in roles like Site Reliability Engineer or DevOps Engineer. 

Q. Is this course suitable for non-technical professionals?

The course is ideal for technical professionals. Those with some IT background or exposure to operations will benefit the most. 

Q. Will I receive a certification upon completing the course?

You will receive a course completion certificate. This course also prepares you for SRE practitioner certification exams. 

Related Courses

Explore additional courses designed to complement your learning journey and enhance your professional skills. Expand your knowledge with these expertly curated options tailored to your career goals.

  • DevOps Leader Certification Course
  • Certified Agile DevOps Professional (CADOP)
  • Certified SecOps Professional (CSOP)
  • Certified DevOps Security Professional (CDSOP)
  • SaltStack Training Course

Resources

Access a wide range of free resources to support your learning journey. From blogs to news and podcasts, these valuable guides are available at no cost to help you succeed.

Course Schedule

Site Reliability Engineering (SRE) Practitioner Training Course

Dates | Duration | Delivery Method | Price

  • Mon 30th Sep 2024 to Wed 2nd Oct 2024 | 3 Days | Virtual | £3995
  • Mon 21st Oct 2024 to Wed 23rd Oct 2024 | 3 Days | Virtual | £3995
  • Mon 25th Nov 2024 to Wed 27th Nov 2024 | 3 Days | Virtual | £3995
  • Mon 16th Dec 2024 to Wed 18th Dec 2024 | 3 Days | Virtual | £3995