Manager, SRE & DevOps
As Manager, SRE & DevOps, you will lead all aspects of SRE & DevOps with technical & data expertise to deliver a world-class user experience on the services that we provide. You will lead a team to build and run large-scale, fault-tolerant systems and services. Cultural fit is a must, as you will need to be self-motivated, a critical problem solver, data-driven and results-oriented, with a focus on delivering outstanding user experience.
You will manage the SRE-centric efforts across independent functional teams comprising Architecture, Engineering, Security and Solution Architecture. You will lead a strong and experienced team to negotiate requirements with demanding internal and external clients, push us toward project milestones, and drive daily agile-like stand-ups to promote team communication and keep the team motivated.
At SITA, we achieve more, together. Are you ready to join us?
What you will do
• Define Critical Success Factors and Key Performance Indicators (KPIs) for processes and drive the associated reporting consistently across the organization to identify trends, anticipate problems and ensure a best-in-class level of support and service
• Be accountable for the definition, promotion, and governance of processes, and drive their implementation, adoption, and continuous improvement across the organization to enable business process improvement and innovation and to create a service culture
• Prioritize and maintain the backlog based on business needs to meet tight deadlines, ensure agile practices are followed when planning weekly sprints, and communicate on behalf of the team to report on progress, risks and achievements
• Identify inter-dependencies between the various partner groups to ensure all are aligned and risks are identified, mitigated and communicated
• Build a knowledge base with lessons learned from incidents and support issues to support
• Work with other teams to encourage DevOps practices (deployment, monitoring, observability, scalability)
• Build software and systems to manage infrastructure and applications through automation
• Deploy, support and monitor existing and new services, platforms, and application stacks
• Increase operational efficiency through automation and by reducing manual tasks
• Establish Service Level Agreements and Operational Level Agreements; and monitor, improve, and report performance on these and other key performance indicators
Who you are
• Proven leader who combines technical expertise with well-developed business acumen, strong analytic and problem-solving skills leading to effective decision making that enables process improvements
• Excellent problem-solving, critical thinking, and interpersonal skills; leads by example to empower and challenge the team to deliver their best
• Excellent communication skills for working across the organization, capable of building strong relationships with peers and leadership
• Hands-on experience managing large, transformational projects and leading organizational change management initiatives
• Ability to prioritize and execute tasks in a high-pressure environment and make sound decisions in emergency situations
• Ability to deliver quantitative metrics of the environment to help with planning and execution of service delivery
• 4+ years of people management and team leadership experience developing strong and motivated teams, with a B.Tech./B.E. degree in Electronics & Telecommunications or Computer Science
• 5+ years of demonstrated ability in site reliability and technical operations leadership
• Background in leading infrastructure / DevOps / SRE for highly-available, large-scale SaaS platforms and experience with modern SRE & DevOps practices
• Solid understanding of software development, debugging, optimization, and/or troubleshooting - hands-on experience with common programming languages preferred
• Experience building large and geographically dispersed infrastructure supporting business-critical cloud & on-premises services
• Experience leading on security concerns, especially in the context of hosted environments and operations identity management. Experience leading a team through HITRUST, ISO or other security certifications is a strong asset
• Experience operating and maintaining production systems in a Linux private and public cloud environment: Azure and/or AWS preferred
• Extensive experience leading teams responsible for customer-facing systems in a high-uptime 24/7 environment
• Expertise analyzing sophisticated application, database, network, and OS issues across a distributed large-scale business critical system
• A depth and breadth of experience with server-side Java development, relational databases, eventually consistent, high efficiency, cluster-based NoSQL solutions and distributed streaming platforms
• Experience with configuration management, code deployment and automating tasks such as setting up centralized log collection, monitoring, vulnerability patching and security audits
• Experience with monitoring and ticketing tools such as ServiceNow, Jira, New Relic and Nagios
• Experience leveraging programming/scripting platforms (Unix shell, PowerShell, Python) to increase operational efficiency & consistency through automation of repeatable tasks, and with infrastructure-as-code tools such as Ansible or Terraform
• Good understanding of APIs, TLS, HTTP & DNS
• Experience working with PostgreSQL, MS SQL, SQL, Lucene and MongoDB
• Experience with Kubernetes, Docker, Kafka, Bitbucket and Bamboo
• Experience with 24/7 site monitoring and ability to own uptime & performance SLAs
• Operational experience at scale - designing and operating highly available, scalable, and fault-tolerant systems using best-of-breed technologies like containers, APIs, Data Platform, etc.
• Strong knowledge of ITIL best practices
What we offer
SITA is a place of change and constant improvement, where we're always pushing ourselves to find better ways of doing things: smarter, quicker, easier, for us and our customers and for their customers too. Our values underpin everything we do at SITA.
And we offer all the good stuff you’d expect, like holidays, bonus, flexible benefits, medical policy, pension plan and access to world-class learning.
Welcome to SITA
SITA is the world’s leading specialist in air transport communications and information technology. We don’t just connect the global aviation industry. We apply decades of experience and expertise to address almost every core business, operational, baggage, and passenger process in air transport.
We design, build, and support technology solutions, all with one vision: to create easy air travel every step of the way. As an organization, we cover 95% of all international air travel destinations and work with over 2,800 air transport and government customers in every corner of the globe. Are you ready to explore the opportunities?
Keywords: DevOps Manager, SRE, system reliability engineering, service management