About: Selby Jennings has partnered with a global tech company with offices in San Jose, CA and Seattle, WA. The company has been actively building out its SRE teams across several organizations in the business. Its top priority today is the Recommendation Infrastructure team, which needs an SRE Leader to be a local presence for engineers in the US and to act as the liaison with teams in Asia.
Qualifications:
- Bilingual in English and Mandarin strongly preferred
- Bachelor's degree or above in Computer Science or a related field
- 5 years of experience in SRE, deploying large-scale systems with high reliability and scalability
- Familiar with Linux system operations and networking
- Experience programming in at least one of the following languages: Python, Perl, Go, or C/C++
- Experience in designing, analyzing and troubleshooting large-scale distributed systems
- Familiar with popular CI/CD procedures and environments
- Effective communication skills and a sense of ownership and drive
Responsibilities:
- Engage in and improve the whole lifecycle of Recommendation systems from system design consulting through to launch reviews, deployment, operation and refinement
- Deliver tools/software to improve the reliability and scalability of services, automate operations and improve R&D efficiency
- Build high availability into large-scale services deployed across global data centers
- Plan, manage and optimize cloud resource utilization, ensuring the SLAs of large-scale clusters are met
- Measure and monitor availability, latency and overall service health
- Practice sustainable incident response and postmortems