Principal III, SRE

Careers Herbalife Icims Com · Torrance, CA, US · Hybrid · Active · iCIMS

Job facts

Field	Value
Company	Careers Herbalife Icims Com
Title	Principal III, SRE
Normalized title	-
Department / team	Engineering
Location	Torrance, CA, United States
Work model	Hybrid / Hybrid
Employment type	Full Time
Salary	-
Status	active
ATS provider	iCIMS
Posted / first seen	2026-05-29 / 2026-05-31
Changed / last seen	2026-06-03 / 2026-06-06

Related slices

Page	What it contains
Company jobs	Active postings from Careers Herbalife Icims Com.
Company breakdowns	Role, location, ATS, and work model facets for this company.
ATS provider jobs	Active postings observed through iCIMS.
Provider filtered search	The same provider as a filtered job collection.
City jobs	Active postings in Torrance.
Department jobs	Active postings in Engineering.
Work model jobs	Active Hybrid postings.
Lifecycle events	Open, update, close, and reopen events for this posting.
Original posting	Canonical source or apply URL captured from the ATS.

Linked records

Company	Careers Herbalife Icims Com
Source	a31b711c-51ba-4aaf-9f37-98f916f2ce31
ATS provider	iCIMS

Description

Overview

THE ROLE

The SRE Principal Engineer III will work a hybrid schedule, with a requirement to be onsite at our Torrance, CA facility at least two days per week or more if needed, while also having the flexibility to work remotely. This role is responsible for leading, designing, and implementing robust Site Reliability Engineering (SRE) practices to ensure high availability, scalability, and resilience of critical business systems and applications. The SRE Principal Engineer III will focus on improving system reliability through automation, monitoring, and performance tuning, working closely with development and operations teams to champion a culture of continuous improvement and operational excellence.

The SRE team consists of:
● SRE Engineers
● Deployment Automation
● Incident Response and Postmortem Analysis
● Observability and Monitoring

This role will drive the adoption of best practices in multi-cloud and hybrid-cloud platforms, managing services from major cloud providers like Microsoft Azure, Amazon AWS, Oracle OCI, Google GCP, and Alibaba Cloud. The SRE Principal Engineer III will focus on automation, incident management, performance monitoring, and optimizing infrastructure to support scalable, reliable systems. The position will also be responsible for fostering collaboration between development, operations, and security teams to streamline system operations across the organization.

HOW YOU WOULD CONTRIBUTE:
● Lead the implementation and optimization of SRE practices, ensuring system reliability, performance, and scalability.
● Architect and maintain automation for infrastructure provisioning, deployment, and incident response.
● Establish and implement SLOs (Service Level Objectives) and SLIs (Service Level Indicators) for key services.
● Collaborate with development teams to design and deliver reliable software systems, ensuring that production environments are optimized for uptime and performance.
● Create and maintain monitoring, alerting, and observability solutions to provide real-time insights into system health and performance.
● Respond to production incidents, perform root cause analysis, and implement corrective measures to prevent recurrence.
● Continuously improve system performance, capacity planning, and reliability through infrastructure tuning and automation.
● Facilitate post-incident reviews, fostering a blameless culture that focuses on learning from incidents.
● Collaborate with security teams to ensure infrastructure meets compliance, security standards, and best practices.
● Champion a collaborative environment across development, operations, and security teams to enhance operational efficiency and knowledge sharing.
● Drive the adoption of automation tools and frameworks to minimize manual intervention and optimize systems.

Qualifications

Skills Required:
● Proven expertise in SRE practices, with a focus on automation, incident management, observability, and infrastructure scalability.
● Extensive knowledge of cloud platforms (Azure, AWS, GCP, Alibaba) and hybrid-cloud environments, with a focus on reliability and performance optimization.
● Experience with automation tools and scripting languages, such as Python, Go, Terraform, or Ansible, for leading infrastructure and incident response.
● Strong understanding of containerization (Docker, Kubernetes) and orchestration systems.
● Solid grasp of monitoring and observability tools (Prometheus, Grafana, Dynatrace, Splunk) to ensure real-time system health monitoring.
● Expertise in capacity planning, performance tuning, and failure management techniques.
● Strong background in incident management, root cause analysis, and postmortem processes to improve system resilience.
● Deep understanding of security and compliance requirements, and the ability to ensure production environments meet industry standards.
● Experience with Agile and DevOps methodologies to ensure fast, reliable delivery of services.

Experience Required:
● 10+ years of experience in IT, with a focus on SRE, DevOps, or infrastructure engineering roles.
● Extensive hands-on experience with cloud infrastructure management and automation tools such as Terraform, CloudFormation, or equivalent.
● Proficiency in scripting and automation languages like Python, Bash, Go, or Ruby for infrastructure automation.
● Proven experience in managing large-scale systems, ensuring reliability, high availability, and scalability.
● Expertise in container orchestration technologies, including Kubernetes, OpenShift, and Docker Swarm.
● Deep knowledge of monitoring and observability platforms (Prometheus, Grafana, ELK, Dynatrace), including experience building and maintaining alerting and dashboard systems.
● Strong understanding of version control systems and CI/CD practices to optimize code deployment as it relates to infrastructure.
● Demonstrated ability to optimize performance in multi-cloud and hybrid-cloud environments, ensuring uptime and performance at scale.

Education Required:
● Bachelor’s degree in computer science, Information Technology, or related field, or equivalent experience.

Certificates / Training Preferred:
● Relevant cloud certifications such as AWS Certified Solutions Architect, Azure Solutions Architect Expert, or Google Cloud Professional Cloud Architect.
● SRE-related certifications like Certified Kubernetes Administrator (CKA) or Google Professional Cloud DevOps Engineer.

US Benefits Statement

Herbalife offers a variety of benefits to eligible employees in the U.S. (limited to the 50 States and the District of Columbia), which includes Group Health Programs, other Voluntary Benefit Programs, and Paid Time Off. Group Health Programs include Medical, Dental, Vision, Health Savings Account (HSA), Flexible Spending Accounts (FSA), Basic Life/AD&D; Short-Term and Long-Term Disability, and an Employee Assistance Program (EAP).

Other Voluntary Benefit Programs include a 401(k) plan, Wellness Incentive Program, Employee Stock Purchase Plan (ESPP), Supplemental Life/Critical Illness/Hospitalization/Accident Insurance, and Pet Insurance. Paid time off includes Company-observed U.S. Holidays, Floating Holidays, Vacation, Sick Time, a Volunteer Program, Paid Maternity and Paternity Leave, Bereavement Leave, Personal Leave and time off for voting.

Full job record

Job ID	17bd7362ac2a071bffdb70d6bbb75b5b0a78e887
Org ID	ac28d449-c5ef-4e7a-9755-698cd03a1ada
Source ID	a31b711c-51ba-4aaf-9f37-98f916f2ce31
Board ID	a31b711c-51ba-4aaf-9f37-98f916f2ce31
Provider	icims
Provider Job Key	19730
Title	Principal III, SRE
Normalized Title	—
Status	active
Active	yes
Location Text	Torrance, CA, US
Department	Engineering
Team	—
Employment Type	full_time
Workplace Type	hybrid
Remote Policy	hybrid
Country	United States
Region	CA
City	Torrance
Salary Min	—
Salary Max	—
Salary Currency	—
Salary Period	week
Source URL	https://careers-herbalife.icims.com/jobs/19730/principal-iii%2c-sre/job
Apply URL	https://careers-herbalife.icims.com/jobs/19730/principal-iii%2c-sre/job
First Seen At	2026-05-31 18:44:37Z
Last Seen At	2026-06-06 08:31:10Z
Last Checked At	2026-06-06 08:31:10Z
Last Changed At	2026-06-03 14:21:10Z
Inactive At	—
Source Posted At	2026-05-29 04:00:00Z
Source Updated At	2026-06-02 17:44:22Z
Raw Payload Uri	s3://job-postings-prod-raw-590183727216/raw/provider=icims/board=careers-herbalife.icims.com/date=2026-06-06/2026-06-06T08-31-09-047Z-d6cf8110a8c8c361ea4dfd72875d651cf362e709ba7575b596ec61b21c976045.json

Event Fields

{
  "content_hash": "011b5efede060979a20031a4e5c5f9745d7443e874ecc1d00749e1c44b34ee6f",
  "source_hash": "f7b63e7d03d4e30e201664ec6c108cf69ddd6d71f1b3b47b94dc63c6024c854d",
  "last_changed_at": "2026-06-03T14:21:10.731Z",
  "active_status": "active"
}

Parsed Structured

{
  "language": "en",
  "location": {
    "raw": "Torrance, CA, US",
    "city": "Torrance",
    "region": "CA",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.8
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-06T08:31:10.944Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "Torrance, CA, US",
      "city": "Torrance",
      "region": "CA",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.8
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": "hybrid",
  "salary_period": "week",
  "workplace_type": "hybrid",
  "salary_currency": null
}

Extensions

{}

Native Structured

{
  "json_ld": {
    "url": "https://careers-herbalife.icims.com/jobs/19730/principal-iii%2c-sre/job",
    "@type": "JobPosting",
    "title": "Principal III, SRE",
    "@context": "http://schema.org",
    "datePosted": "2026-05-29T04:00:00.000Z",
    "description": "<h2>Overview</h2>\n<p><strong>THE ROLE</strong></p>\n<p>The SRE Principal Engineer III will work a hybrid schedule, with a requirement to be onsite at our Torrance, CA facility at least two days per week or more if needed, while also having the flexibility to work remotely. This role is responsible for leading, designing, and implementing robust Site Reliability Engineering (SRE) practices to ensure high availability, scalability, and resilience of critical business systems and applications. The SRE Principal Engineer III will focus on improving system reliability through automation, monitoring, and performance tuning, working closely with development and operations teams to champion a culture of continuous improvement and operational excellence.The SRE team consists of:●    SRE Engineers●    Deployment Automation●    Incident Response and Postmortem Analysis●    Observability and MonitoringThis role will drive the adoption of best practices in multi-cloud and hybrid-cloud platforms, managing services from major cloud providers like Microsoft Azure, Amazon AWS, Oracle OCI, Google GCP, and Alibaba Cloud. The SRE Principal Engineer III will focus on automation, incident management, performance monitoring, and optimizing infrastructure to support scalable, reliable systems. The position will also be responsible for fostering collaboration between development, operations, and security teams to streamline system operations across the organization. 
</p>\n<p><strong>HOW YOU WOULD CONTRIBUTE:</strong></p>\n<p>●    Lead the implementation and optimization of SRE practices, ensuring system reliability, performance, and scalability.●    Architect and maintain automation for infrastructure provisioning, deployment, and incident response.●    Establish and implement SLOs (Service Level Objectives) and SLIs (Service Level Indicators) for key services.●    Collaborate with development teams to design and deliver reliable software systems, ensuring that production environments are optimized for uptime and performance.●    Create and maintain monitoring, alerting, and observability solutions to provide real-time insights into system health and performance.●    Respond to production incidents, perform root cause analysis, and implement corrective measures to prevent recurrence.●    Continuously improve system performance, capacity planning, and reliability through infrastructure tuning and automation.●    Facilitate post-incident reviews, fostering a blameless culture that focuses on learning from incidents.●    Collaborate with security teams to ensure infrastructure meets compliance, security standards, and best practices.●    Champion a collaborative environment across development, operations, and security teams to enhance operational efficiency and knowledge sharing.●    Drive the adoption of automation tools and frameworks to minimize manual intervention and optimize systems. 
</p>\n<h2>Qualifications</h2>\n<p>Skills Required:●    Proven expertise in SRE practices, with a focus on automation, incident management, observability, and infrastructure scalability.●    Extensive knowledge of cloud platforms (Azure, AWS, GCP, Alibaba) and hybrid-cloud environments, with a focus on reliability and performance optimization.●    Experience with automation tools and scripting languages, such as Python, Go, Terraform, or Ansible, for leading infrastructure and incident response.●    Strong understanding of containerization (Docker, Kubernetes) and orchestration systems.●    Solid grasp of monitoring and observability tools (Prometheus, Grafana, Dynatrace, Splunk) to ensure real-time system health monitoring.●    Expertise in capacity planning, performance tuning, and failure management techniques.●    Strong background in incident management, root cause analysis, and postmortem processes to improve system resilience.●    Deep understanding of security and compliance requirements, and the ability to ensure production environments meet industry standards.●    Experience with Agile and DevOps methodologies to ensure fast, reliable delivery of services.</p>\n<p> </p>\n<p>Experience Required:●    10+ years of experience in IT, with a focus on SRE, DevOps, or infrastructure engineering roles.●    Extensive hands-on experience with cloud infrastructure management and automation tools such as Terraform, CloudFormation, or equivalent.●    Proficiency in scripting and automation languages like Python, Bash, Go, or Ruby for infrastructure automation.●    Proven experience in managing large-scale systems, ensuring reliability, high availability, and scalability.●    Expertise in container orchestration technologies, including Kubernetes, OpenShift, and Docker Swarm.●    Deep knowledge of monitoring and observability platforms (Prometheus, Grafana, ELK, Dynatrace), including experience building and maintaining alerting and dashboard systems.●    Strong 
understanding of version control systems and CI/CD practices to optimize code deployment as it relates to infrastructure.●    Demonstrated ability to optimize performance in multi-cloud and hybrid-cloud environments, ensuring uptime and performance at scale. </p>\n<p>Education Required:●    Bachelor’s degree in computer science, Information Technology, or related field, or equivalent experience. </p>\n<p>Certificates / Training Preferred:●    Relevant cloud certifications such as AWS Certified Solutions Architect, Azure Solutions Architect Expert, or Google Cloud Professional Cloud Architect.●    SRE-related certifications like Certified Kubernetes Administrator (CKA) or Google Professional Cloud DevOps Engineer. </p>\n<h2>US Benefits Statement</h2>Herbalife offers a variety of benefits to eligible employees in the U.S. (limited to the 50 States and the District of Columbia), which includes Group Health Programs, other Voluntary Benefit Programs, and Paid Time Off. Group Health Programs include Medical, Dental, Vision, Health Savings Account (HSA), Flexible Spending Accounts (FSA), Basic Life/AD&D; Short-Term and Long-Term Disability, and an Employee Assistance Program (EAP). Other Voluntary Benefit Programs include a 401(k) plan, Wellness Incentive Program, Employee Stock Purchase Plan (ESPP), Supplemental Life/Critical Illness/Hospitalization/Accident Insurance, and Pet Insurance. Paid time off includes Company-observed U.S. Holidays, Floating Holidays, Vacation, Sick Time, a Volunteer Program, Paid Maternity and Paternity Leave, Bereavement Leave, Personal Leave and time off for voting.",
    "directApply": true,
    "jobLocation": [
      {
        "@type": "Place",
        "address": {
          "@type": "PostalAddress",
          "postalCode": "90502",
          "addressRegion": "CA",
          "streetAddress": "990 West 190th Street",
          "addressCountry": "US",
          "addressLocality": "Torrance",
          "postOfficeBoxNumber": "UNAVAILABLE"
        }
      }
    ],
    "validThrough": "2027-05-29T04:00:00.000Z",
    "employmentType": "FULL_TIME",
    "hiringOrganization": {
      "name": "Herbalife",
      "@type": "Organization",
      "sameAs": "https://www.herbalife.com/"
    },
    "occupationalCategory": "Engineering"
  },
  "detail_meta": {
    "url": "https://careers-herbalife.icims.com/jobs/19730/principal-iii%2c-sre/job?in_iframe=1",
    "http_status": 200,
    "content_type": "text/html;charset=UTF-8",
    "response_bytes": 54008,
    "compact_response_bytes": 8203,
    "original_response_bytes": 54008
  },
  "sitemap_job": {
    "id": "19730",
    "url": "https://careers-herbalife.icims.com/jobs/19730/principal-iii%2c-sre/job",
    "slug": "principal-iii%2c-sre",
    "lastmod": "2026-06-02T13:44:22-04:00"
  },
  "detail_errors": []
}

Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/17bd7362ac2a071bffdb70d6bbb75b5b0a78e887?include=description

GET https://api.bluedoor.sh/job-postings/v1/orgs/ac28d449-c5ef-4e7a-9755-698cd03a1ada

GET https://api.bluedoor.sh/job-postings/v1/sources/a31b711c-51ba-4aaf-9f37-98f916f2ce31

GET https://api.bluedoor.sh/job-postings/v1/jobs/17bd7362ac2a071bffdb70d6bbb75b5b0a78e887/events
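
The four requests above could be scripted roughly as follows. This is a minimal sketch, not the API's documented client: the helper names `build_urls` and `fetch_json` are hypothetical, and the bearer-token `Authorization` header is an assumption, since this page does not state how the API key is passed. Only the URL paths and IDs come from the record above.

```python
import json
import urllib.request

# Base path taken from the GET lines on this page.
BASE = "https://api.bluedoor.sh/job-postings/v1"


def build_urls(job_id: str, org_id: str, source_id: str) -> dict:
    """Return the four endpoint URLs for one posting (hypothetical helper)."""
    return {
        "job": f"{BASE}/jobs/{job_id}?include=description",
        "org": f"{BASE}/orgs/{org_id}",
        "source": f"{BASE}/sources/{source_id}",
        "events": f"{BASE}/jobs/{job_id}/events",
    }


def fetch_json(url: str, api_key: str) -> dict:
    """GET a URL and decode the JSON body.

    ASSUMPTION: the key is sent as a bearer token; check the API docs
    for the real authentication scheme before relying on this.
    """
    request = urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed header format
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# IDs copied from the "Full job record" section of this page.
urls = build_urls(
    job_id="17bd7362ac2a071bffdb70d6bbb75b5b0a78e887",
    org_id="ac28d449-c5ef-4e7a-9755-698cd03a1ada",
    source_id="a31b711c-51ba-4aaf-9f37-98f916f2ce31",
)
```

With a valid key, `fetch_json(urls["job"], api_key)` would then return the job record, provided the assumed header matches what the API expects.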