IT Network Site Reliability Engineer

Full Time
Santa Clara, CA
Job description
The NVIDIA IT organization is looking for site reliability engineering talent to build, deploy and scale NVIDIA’s infrastructure. These services include software to manage hardware and network provisioning to deploy and run a multi-tenant infrastructure. As a site reliability engineer, you will work with other site reliability engineers, software engineers, product owners, and network engineers as a collaborative team to deliver and maintain end-to-end solutions to handle complex hybrid cloud infrastructure deployments.
You will write, and integrate with, services and software that align with the broad architectural vision for the NVIDIA IT Network Infrastructure, working with other teams to develop a robust, scalable, and sustainable system. You own your code, from development to commit to test to production. We expect you to be passionate about code quality, documentation, testing, deployment efficiency and simplicity, and bringing amazing products and capacity to market.
What you will be doing:
  • Work with NVIDIA internal customers
  • Design, build, and operate scalable software systems to run NVIDIA’s network infrastructure
  • Lead sustainable incident response, blameless postmortems, and production improvements that result in direct business opportunities for NVIDIA.
  • Provide guidance to other team members on managing end-to-end availability and performance of critically important services, on building automation to prevent problem recurrence, and on building automated responses for non-exceptional service conditions.
  • Build network and systems automation software for running a multi-tenant cloud infrastructure
  • Debug complex problems across the full stack and create solid solutions
  • Automate work across a variety of infrastructure needs such as testing, failover, policy modifications, and deployment.
  • Write, update, and use documentation, including runbooks/playbooks
What we need to see:
  • 4+ years of experience designing and building distributed software systems.
  • BS/MS degree in Computer Science or a related area (or equivalent experience)
  • Demonstrated ability to write code in a mainstream systems programming language such as C, C++, Go, Python, Java, or Rust.
  • Ability to use, design and implement maintainable APIs.
  • Knowledge of underlying Linux internals: kernel scheduling, memory management, and networking subsystems.
  • Understanding of networking protocols such as IP, IPv6, BGP, HTTP, ICMP, and tunneling protocols (VXLAN, Geneve, GRE).
  • Knowledge of secure communication protocols (mutual-TLS, IPsec, or similar).
  • Ability to reach cross-functional consensus without having all the details
Ways to stand out from the crowd:
  • Proficiency with a hyperscale cloud service provider (public-facing or not)
  • Experience with high-level compiled languages such as Go or Java
  • Proficiency with Kubernetes and/or distributed task scheduling
  • Background with host security services and security principles such as TPM, TXT, and Secure Boot
  • Understanding of SRE principles (observability, SLOs, SLIs, logging, etc.)
  • Knowledge of software interface design & documentation for less technical end-users
NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions from artificial intelligence to autonomous cars. NVIDIA is looking for phenomenal people like you to help us accelerate the next wave of artificial intelligence.
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and passionate people in the world working for us. If you're creative and passionate about developing cloud services, we want to hear from you!
The base salary range is $122,000 - $274,000. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
You will also be eligible for equity and benefits.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
