ViralMoment

Site Reliability Engineer

ViralMoment San Francisco, CA

Job Title: DevOps/Site Reliability Engineer

Location: Remote

About ViralMoment:

ViralMoment is an AI social listening platform that analyzes social videos to identify trending topics and provide insights to brands and agencies. Our mission is to help our clients stay ahead of the curve by leveraging cutting-edge AI technology.

About the Role:

We are seeking a Site Reliability Engineer to join our team at ViralMoment. In this critical role, you will be responsible for optimizing our cloud infrastructure for scale and ensuring the reliability, performance, and availability of our AI-driven platform. Reporting directly to the CTO, you will have the opportunity to influence architectural decisions and lead initiatives for a multi-cloud environment.

Key Responsibilities:

  • Optimize and manage our cloud infrastructure, focusing on scalability, performance, and reliability
  • Develop and enhance observability systems for monitoring and alerting
  • Ensure the stability and efficiency of large-scale systems through effective DevOps practices
  • Handle multi-cloud environments, primarily AWS, with potential implementations on GCP and Azure
  • Collaborate with engineering teams to integrate and optimize backend processes
  • Research and implement systematic solutions for large model applications
  • Maintain and improve system performance through proactive monitoring and troubleshooting

Qualifications:

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or related fields from an accredited institution
  • At least 5 years of experience in a similar role, focusing on cloud infrastructure and site reliability
  • Proficiency in cloud-native technologies and a strong understanding of the relevant technology stack
  • Expertise in AWS, with additional knowledge of GCP and Azure preferred
  • Strong programming skills in Python, with the ability to apply them proficiently in a professional setting
  • Familiarity with infrastructure as code, particularly Terraform
  • Experience with large-scale cluster management and cloud-native technologies for log collection, monitoring, and alerting

Preferred Qualifications:

  • Prior experience in constructing and maintaining stability systems for large-scale infrastructures
  • Experience with infrastructure as code, especially Terraform
  • Proven track record in operating and maintaining large-scale systems

What We Offer:

  • A pivotal role in a rapidly growing startup at the forefront of AI technology
  • Direct impact on the platform's performance and scalability that supports major global brands
  • Remote work flexibility with a supportive and dynamic team environment
  • Competitive salary and opportunities for advancement and leadership

How to Apply:

If you are passionate about optimizing cloud infrastructure and ensuring system reliability, we encourage you to apply. Please submit your resume highlighting your experience with cloud platforms, programming languages, and system reliability.

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: Internet Publishing
