Description:
You're the engineer who maintains uptime across 50+ SaaS products when nobody else knows where to start. We need DevOps professionals capable of entering unknown AWS environments, restoring order, and driving availability beyond 99.9% through genuine monitoring, automation, and root cause analysis. You'll break down complex projects into single-day increments, deliver production-ready Python or JavaScript, and leverage AI as your assistant.
Most organizations talk about "cloud infrastructure" while manually tending servers. We're systematizing reliability across a portfolio of acquired products whose original teams have departed and whose documentation is incomplete. That's where the challenge lies: you'll harness agents and current tooling to explore unfamiliar systems 5–10x faster, document your findings, and automate fixes so the same failure does not recur. Rather than judge you on certifications and vendor badges, we'll observe how you troubleshoot in real time, author a genuine 5-Whys that identifies one actionable root cause, and construct automations that endure in production.
This is not a tier-two "follow the runbook" position. Here, you author the runbooks, architect the deployment path (development, then staging, then a 10% rollout, then full release) with soak periods and rollback conditions, and create the monitoring that catches corner cases. You block risky changes before execution. You distinguish between infrastructure failures under your ownership and application bugs owned by Engineering, then route permanent remediation to the correct team.
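To make the deployment path above concrete, here is a minimal Python sketch of a staged rollout with soak periods and rollback conditions. It is an illustration under stated assumptions, not our actual pipeline: the stage names, soak durations, error-rate thresholds, and the `observe_error_rate` callback are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Stage:
    name: str
    traffic_percent: int   # share of production traffic served at this stage
    soak_minutes: int      # how long to watch the stage before promoting
    max_error_rate: float  # rollback condition: abort if exceeded


# Hypothetical values chosen only for illustration.
PIPELINE: List[Stage] = [
    Stage("development", 0, 0, 1.0),
    Stage("staging", 0, 30, 0.01),
    Stage("rollout-10-percent", 10, 60, 0.001),
    Stage("full-release", 100, 120, 0.001),
]


def run_rollout(
    stages: List[Stage],
    observe_error_rate: Callable[[Stage], float],
) -> Tuple[List[str], bool]:
    """Promote a change stage by stage.

    `observe_error_rate` is an assumed callback that, in a real system,
    would wait out the stage's soak period and query monitoring.
    Returns (names of stages that passed, whether a rollback was triggered).
    """
    passed: List[str] = []
    for stage in stages:
        error_rate = observe_error_rate(stage)
        if error_rate > stage.max_error_rate:
            # Rollback condition met: stop promoting and report it.
            return passed, True
        passed.append(stage.name)
    return passed, False


if __name__ == "__main__":
    # Simulated observer: the 10% rollout shows a 5% error rate.
    def fake_observer(stage: Stage) -> float:
        return 0.05 if stage.traffic_percent == 10 else 0.0

    print(run_rollout(PIPELINE, fake_observer))
```

Keeping the rollback condition as data on each stage, rather than burying it in the promotion logic, means the same table can be pasted into a runbook or reviewed in a change request.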
You'll operate at the engineering center of reliability, taking charge of infrastructure initiatives, incident response with RCAs, and change requests accompanied by copy-paste-ready runbooks. If you've already operated a substantial SaaS platform and want to apply that expertise across an entire fleet, join us. Bring deep AWS knowledge, production-quality coding ability, strict scope discipline, and daily, mission-critical use of AI tooling. If you're prepared to ensure continuous operation, please apply.
What You Will Be Doing
What You Won’t Be Doing
Senior DevOps Engineer Key Responsibilities
Basic Requirements