As a Senior Site Reliability Engineer, you'll be accountable for the FanDuel Sportsbook Platform, closely monitoring its availability, performance and stability while working with software development teams to improve critical components.
Tackling complex challenges while ensuring uptime and reliability across different environments (AWS Cloud, AWS Outposts, OpenStack) lets you apply a range of skills in coding, algorithms and complexity analysis.
What will you be doing?
Lead the SRE technical roadmap
Lead improvements to reduce the effort of managing 20+ DCs
Collaborate with principals from different verticals on strategies to improve resiliency, performance and reliability for more than 11,000 service instances
Lead AI-based monitoring implementations across different sportsbook domains
Engage in and improve the whole lifecycle of services, from design through deployment, operation, and refinement.
Take an active part in root cause investigation, identification, and resolution of production problems (where necessary)
Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
Scale systems sustainably through mechanisms like automation; evolve systems by pushing for changes that improve reliability and velocity.
Define and revise Service Level Indicators (SLIs).
Practice sustainable incident response and blameless postmortems
Support the SRM in developing Senior SREs' personal development plans
We are looking for someone who:
Has experience with software engineering or site reliability engineering;
Has good operating systems and networking knowledge;
Has experience working with public cloud providers;
Has experience working with microservices architectures;
Has experience working with message queuing services and databases;
Has experience with configuration management tools such as Chef and Ansible;
Has knowledge of monitoring solutions like Datadog and Splunk.
What You'll Get in Return...
Our passion has helped us take the betting industry by storm. So we think it's only fair that you enjoy excellent rewards.
The Fun Stuff
Competitive salary and bonus scheme
25 days annual leave
Snacks and drinks
Discounted gym memberships
In-office showers and locker rooms
Health and wellbeing classes (yoga, bootcamp, etc.)
In-office masseuse, nutritionist and ergonomist
The Boring Stuff
Own your own laptop scheme
Thousands of online learning courses
Employee Assistance Programme
There is more, but we won't go on...
Senior Site Reliability Engineer