The Webex Assurance team is looking for an incident commander to join their amazing team. The incident command team provides 24x7 monitoring and support for incidents, changes, and pipeline. Service availability is key to any cloud platform, and the incident command team is a critical force in ensuring minimal customer impact during times of change or incident.
What You’ll Do:
• Take command of incidents by setting up or taking over a multi-functional technical bridge call composed of senior engineers and SRE team members
• Create an effective plan to restore service as quickly as possible, minimizing time to restore
• Ensure the team can implement the plan and overcome any identified blockers
• Communicate across multiple audiences including extended stakeholders and executives
• Be responsible for the incident management process, prioritizing issues and ensuring the most critical ones are addressed promptly and successfully
• Identify any action items that require follow-up to improve service availability
• Participate in postmortem reviews
• Ensure highest operational readiness by leading training sessions, simulations, and drills
• Drive the technical root cause analysis process by assessing issues correctly and promptly, then assembling the correct technical teams to implement a remediation plan
• Participate in an on-call rotation for incidents
• Participate in or lead the creation of tools to improve incident resolution, incident documentation and communication, or automate tasks
Who You’ll Work With:
Our incident command team works with DevOps teams and Service Owners across platform and software engineering to manage change and incidents. We also partner closely with problem management and site reliability engineering to drive continuous learning and improvements. Through accurate data capture, tool creation, and team collaboration, we contribute to a strong, healthy platform with high availability.
Who You Are:
You are confident taking the lead as a technical problem solver and decision maker during critical incidents. As a cloud engineer, you excel at problem analysis, troubleshooting methodologies, and situation appraisal within the context of large-scale systems. While you needn’t be the subject matter expert on all services on the platform, your strong technical leadership will guide teams through critical issues and potentially stressful situations, enabling the team to execute a mitigation strategy that ensures swift service restoration.
Skill Requirements Include:
• Strong understanding of problem analysis and troubleshooting methodologies
• Consistent track record with customer incident, escalation, and crisis management resolution
• Ability to interact with senior executives, customers, and developers at the appropriate level
• Knowledge of UNIX operating system fundamentals
• Familiar with network programming concepts and protocols
• Excellent project management skills
• At ease with at least one scripting language
• Strong communication skills
• High-level understanding of project and program management
• Experience with Nomad, Docker (or similar orchestration tools), Consul, Terraform, Jenkins (or similar deployment tools)
• Familiarity with supporting AWS services such as EC2, Serverless, API Gateway, DNS
• Familiarity with applications like Node.js (Express + D3 + psql), Python (Flask + MongoDB), Harness
Preferred Skill Requirements Include:
• Experience supporting cloud providers such as AWS, Azure, GCP, etc.
• At ease with at least one application or systems language
• Experience engineering for or providing platform support for cloud infrastructure and hybrid solutions
• Experience with modern toolsets such as Jira, PagerDuty, Kibana, Grafana, etc.
We Are Cisco
#WeAreCisco, where each person is unique, but we bring our talents to work as a team and make a difference. Here’s how we do it.
We embrace digital, and help our customers implement change in their digital businesses. Some may think we’re “old” (30 years strong!) and only about hardware, but we’re also a software company. And a security company. A blockchain company. An AI/Machine Learning company. We even invented an intuitive network that adapts, predicts, learns and protects. No other company can do what we do – you can’t put us in a box!
But “Digital Transformation” is an empty buzz phrase without a culture that allows for innovation, creativity, and yes, even failure (if you learn from it).
Day to day, we focus on the give and take. We give our best, we give our egos a break, and we give of ourselves (because giving back is built into our DNA). We take accountability, we take bold steps, and we take difference to heart. Because without diversity of thought and a commitment to equality for all, there is no moving forward.
So, you have colorful hair? Don’t care. Tattoos? Show off your ink. Like polka dots? That’s cool.