Job Description
- Location: Dallas, TX
- Type: Direct Hire
- Job #248953
- Salary: $128,000
INCIDENT MANAGER
Job Summary
The Intersect Group is seeking an Incident Manager for our direct client, where you will be responsible for coordinating the real-time response to incidents and maintaining production change integrity across data, applications, automations, and infrastructure. This role ensures disruptions are handled quickly and effectively while upholding operational standards and business continuity. Outside of incidents, this person works closely with business and technical stakeholders to monitor high-impact areas, improve operational readiness, and ensure that all production changes follow governance and compliance requirements.
Responsibilities
Incident Management
- Coordinate the live response to incidents impacting data systems, applications, infrastructure, automations, and network operations.
- Evaluate incident severity, align appropriate resources (including security or networking stakeholders), and manage resolution efforts.
- Support triage of basic network alerts and connectivity issues in coordination with infrastructure teams.
- Ensure accurate documentation of incident activity, response steps, and decision points.
- Apply a strong understanding of operational SLAs, incident classification criteria, and escalation protocols.
Root Cause & Remediation
- Lead post-incident reviews and ensure root causes are identified with assigned corrective actions.
- Track remediation progress and collaborate with teams to prevent recurrence.
- Analyze incident patterns and recommend improvements to reduce impact and frequency.
- Partner with security and infrastructure teams when root causes involve access controls, network paths, or unauthorized activity.
Communication & Documentation
- Communicate clearly and promptly with internal stakeholders during active incidents, including escalations related to network or security.
- Maintain and update incident response procedures, SLAs, runbooks, and operational documentation in tools like Confluence.
- Ensure historical records of incidents are complete for audit, compliance, and trend analysis.
Change Management
- Review and validate production change requests to confirm all requirements and safeguards are in place.
- Maintain a current and complete change log across environments, including changes impacting network routing or firewall rules.
- Collaborate with DevOps, data, application, and infrastructure teams to reduce deployment risk and improve release consistency.
Continuous Improvement & Tooling
- Evolve and enforce standards for incidents and change processes across the technology landscape.
- Manage and enhance tooling that supports incident response and change control (e.g., Jira, Grafana, network monitors, or endpoint detection tools).
- Partner with teams to improve observability, alerting, and resilience across systems, with some awareness of network health and endpoint security triggers.
Security & Network Awareness
- Basic familiarity with incident types involving endpoint protection, identity access, or firewall policy violations.
- Comfortable coordinating with security analysts during potential data loss, suspicious login activity, or threat detection alerts.
- Awareness of common networking protocols and how they impact system availability or user experience during outages.
Qualifications
- 2-4 years of experience in incident management, production operations, or change governance.
- Strong background across data, application, infrastructure, or automation platforms.
- Familiarity with observability tools and incident management platforms.
- Ability to analyze logs, identify issues using monitoring data (familiarity with Grafana, Power BI, or similar tools), and write or interpret SQL.
- Experience applying ITIL, SRE, or DevOps principles in real-time operations.
- Excellent communication skills, especially in time-sensitive scenarios.
- Detail-oriented and organized with a high sense of ownership.
- Willingness to support after-hours incident response when needed.
- Ability and willingness to learn end-to-end business processes, which are essential to effectively supporting and executing responsibilities in this role.
- Preferred: Awareness of basic security incident workflows, common network protocols, and coordination with infrastructure or security teams during cross-domain events.