Guidewire Software

Site Reliability Engineer III

Posted 4 Days Ago
Remote
Hiring Remotely in Canada
Mid level
As a Site Reliability Engineer, you will automate processes, oversee AWS infrastructure, ensure platform reliability, and enhance observability tools while collaborating with developers.
The summary above was generated by AI

Summary

POSITION OVERVIEW
At Guidewire, we make software that offers Property and Casualty (P&C) Insurance companies the tools to take care of their customers when they need it the most, whether that’s a time of crisis, a natural disaster, an accident, or exposure to cyber risks. We build the core applications that insurance companies use to sell and underwrite policies, settle claims, and bill their customers. We also have a portfolio of innovative products serving the needs of P&C insurance companies in areas such as data management, digital online portals, and predictive analytics. We run these products on the Guidewire Cloud Platform, and we help hundreds of insurance providers all over the world to handle billions of dollars of business.
We are proud to be voted a Top Cloud Employer on Glassdoor by our own employees and positioned as a market leader by industry experts like Gartner. We have a fun work environment and a culture that lives by our core values of integrity, rationality, and collegiality.
We’re searching for people who are as passionate about working together to deliver quality products and support as we are. Join us and enjoy a career where you can make an impact. You’ll be inspired by those around you, and you’ll be trusted and empowered to go further.
As a Site Reliability Engineer, you will be part of a team that is passionately automating everything possible to make Guidewire systems run more efficiently. The Platform team is dedicated full-time to creating and running software that improves the reliability of systems in production, serving hundreds of customers and supporting millions of transactions each day. You will be ensuring the reliability of Guidewire’s flagship cloud platform and InsuranceSuite products and building tooling to help ensure efficient operations and optimal availability of all SaaS multi-tenant and customer-focused systems. Platform SREs collaborate closely with Guidewire’s core product developers to ensure that the Guidewire core cloud products address functional and non-functional requirements such as availability, performance, observability, and maintainability.
This role requires a high degree of collaboration, teamwork, ownership, and responsibility. If you like to be challenged and have a passion for solving problems at scale with systems like AWS, Kubernetes, and Aurora, then we would love to hear from you. The ideal candidate is someone who exemplifies the ethos of "If you have to do something more than once, automate it," and who can rapidly self-educate on new concepts and tools. Bonus points if you have prior experience doing production support of a SaaS platform and are comfortable working with bleeding-edge, highly containerized, cloud-native environments in AWS.

Job Description

ESSENTIAL DUTIES AND RESPONSIBILITIES
  • Take a purist SRE approach to shared multi-tenant infrastructure for resilient, microservice-based, containerized SaaS systems, in addition to customer-centric application environments

  • Oversee and automate the team’s growing presence in AWS

  • Contribute to core infrastructure systems development with features, bug fixes, reliability improvements, etc

  • Perform platform reliability engineering for a complex single sign-on (SAML/OAuth-based) central authentication platform

  • Creatively build and develop tooling to aid in driving 24x7x365 follow-the-sun operations of critical production systems

  • Automate deployment tasks for core product and infrastructure tools and maintain automation infrastructure

  • Create system documentation and training materials to empower and educate our fellow team members

  • Build and maintain observability tooling, metrics, and dashboarding for a global platform product infrastructure

  • Improve our incident management lifecycle to identify, mitigate, and learn from reliability risks and issues

  • Enhance platform observability and help create a self-healing approach to platform reliability

  • Collaborate with engineering teams, providing product feedback and, where necessary, contributing code to the product


REQUIRED SKILLS AND EXPERIENCE
Education and Work Experience
  • Bachelor’s Degree in Computer Science or related field

  • Software engineering and task automation skills with Bash, Python, and/or Go are a must.

  • Solid understanding of agile software development methodologies (Scrum, Kanban, etc.)

  • Deep background with Linux systems and engineering

  • Highly experienced with engineering and automating on Amazon Web Services (AWS)

  • Experience supporting web applications running on Java / Apache / Tomcat in a live production environment

  • Prior experience with IaC tools like Terraform/Terragrunt/Terraspace

  • Prior experience with DevOps/GitOps tools (Git, Bitbucket, Flux CD, TeamCity) for gate promotions

  • Production-At-Scale support background in a heavily microservice-based world

  • Hands-on engineering and ops expertise in containerization (Docker, Helm, Kubernetes/EKS, CNI and Ingress networking)

  • Strong understanding of single sign-on (SSO), SAML, and OAuth (bonus for hands-on experience with Okta)

  • Seasoned expertise in X.509 certificate technology and basic concepts of encryption

  • Experience working with Relational Databases such as Aurora Postgres and/or Oracle RDS

  • Advanced exposure to application development, web UI (design and development), JSON, application architecture

  • Strong hands-on experience with observability tools (logging/APM) like Datadog, CloudWatch, and PagerDuty

  • Familiarity with event store/stream-processing technologies like Kafka or AWS SQS

  • Understanding of Open Application Model systems such as KubeVela or Crossplane

Personal Qualities and Soft Skills
  • You greatly prefer writing code to clicking through a GUI

  • You enjoy teaching, being a mentor to others, and working across boundaries

  • Outstanding troubleshooting skills; ability to think critically and display an aptitude for problem solving

  • Strong analytical mind with a penchant for process development and enhancement

  • A highly positive can-do attitude with desire for being a team player

  • Great communication skills and ability to explain complex technical concepts to a varied audience

  • Strong follow-through and work ethic; you consistently keep and meet commitments

  • Ability to champion a culture of reliability within the product team, promoting practices like blameless postmortems, SLO tracking, and continuous learning from incidents.

Other Requirements
  • Ability to read, write, and speak English

  • We provide 24x7 support to our customers, so we expect you to take turns with your teammates being on-call for weekend production emergencies or to provide rotating weekend operational support

  • Travel – Expect occasional travel (less than 5%) to other Guidewire offices for training and team meetings

About Guidewire

Guidewire is the platform P&C insurers trust to engage, innovate, and grow efficiently. We combine digital, core, analytics, and AI to deliver our platform as a cloud service. More than 540 insurers in 40 countries, from new ventures to the largest and most complex in the world, run on Guidewire.

As a partner to our customers, we continually evolve to enable their success. We are proud of our unparalleled implementation track record with 1600+ successful projects, supported by the largest R&D team and partner ecosystem in the industry. Our Marketplace provides hundreds of applications that accelerate integration, localization, and innovation.

For more information, please visit www.guidewire.com and follow us on Twitter: @Guidewire_PandC.

Guidewire Software, Inc. is proud to be an equal opportunity and affirmative action employer. We are committed to an inclusive workplace, and believe that a diversity of perspectives, abilities, and cultures is a key to our success. Qualified applicants will receive consideration without regard to race, color, ancestry, religion, sex, national origin, citizenship, marital status, age, sexual orientation, gender identity, gender expression, veteran status, or disability. All offers are contingent upon passing a criminal history check and other background checks where applicable to the position.

Top Skills

Apache
Aurora
AWS
AWS SQS
Bash
Bitbucket
CloudWatch
Crossplane
Datadog
Docker
Flux CD
Git
Go
Helm
Java
Kafka
Kubernetes
KubeVela
Okta
PagerDuty
Python
TeamCity
Terraform
Tomcat
