title: HPC Engineer
Position: Fulltime
Location: San Francisco (open to remote with weekly travel)
Compensation: Base salary of $225k
A proprietary trading firm based in San Francisco is looking for a Senior HPC Engineer to design and evangelize their physical datacenter infrastructure spanning upwards of 20,000 cores.
The firm are looking for an engineer experience with design, deploying, and scaling large HPC physical cluster deployments. This position requires excellent project management and will have broad responsibilities and freedom to analyze and solve problems with solution that will have immediate impact to their operations worldwide.
Responsibilities:
-Build& Deploy new machines in the datacenter
-Optimize distribute storage layer (including NFS configuration, local cashing, etc)
-Debug hardware failures & work with networking teams on hardware upgrades
-Maintain distributed workload manager & ensure project management & completion
Qualifications:
-5+ YOE with physical cluster deployments & large scale HPC deployment environment (greater than 10k cores)
-Expert level knowledge of Shell & Python scripting
-Expertise in Linux troubleshooting and NFS
-Experience with distributed workload managers (such as Slurm)
-Deep knowledge fo CPU hardware