Syndica

Site Reliability Engineer

Posted 11 Days Ago

Remote

Senior level

As a Site Reliability Engineer at Syndica, you will maintain blockchain infrastructure, ensure reliability and performance, and utilize monitoring tools. You’ll work with teams to enhance system security and automate processes.

About us:

Syndica is creating the Cloud of Web3. We supply the most critical applications in Web3 with enterprise-grade RPC infrastructure and developer tools tailored for the Solana ecosystem. Joining our team means you'll be held to a high standard, be technically challenged, and grow close to a group of individuals passionate about building new infrastructure technologies.

We are backed by strategic partners, investors, and advisors who are all-in on our mission: Chamath Palihapitiya of Social Capital, Steve Jang of Kindred Ventures, Joe McCann of Asymmetric, Jump Crypto, Coinbase Ventures, Solana Ventures, Circle Ventures, and many more.

About you:

Great collaborator with 5+ years of experience in a DevOps or SRE role
Proficiency in scripting languages (Python, Shell) and experience with at least one modern programming language (Go, Rust, TypeScript, etc.)
Experience deploying large-scale systems reliably
Experience using Kubernetes
Working knowledge of web and network protocols and standards (HTTP, TLS, DNS, etc.)
Working knowledge of information security issues
Experience writing automation tools & eagerness to "automate all the things"
Commitment to implementing reliability and security best practices
Capacity planning experience, including resource optimization and load testing
Systematic problem-solving approach, combined with a strong sense of ownership and drive

Standout experience:

Experience with Prometheus/Grafana for metrics aggregation/visualization and other monitoring and alerting tools
Experience with infrastructure-as-code tools such as Terraform, Ansible, Chef
Experience building and managing virtualized systems (KVM, OVM, containers/Docker) and the ability to read and understand source code
Knowledge of one or more load testing tools (K6, Locust, JMeter, etc.)
Experience with configuration of CI/CD pipelines

About the role:

As a Site Reliability Engineer, you will be accountable for maintaining and operating Syndica’s blockchain infrastructure platform with other infrastructure engineers.

A successful candidate must have demonstrable experience working with at least one major cloud platform (AWS, Azure, or GCP) via Kubernetes, as well as previous work in SaaS application development and operations.

You will be working closely with the Data and Infrastructure teams on building robust solutions to ensure the highest level of reliability, performance and security of our services. Your work will span the entire end-to-end lifecycle of our systems: initial design and deployment, ongoing monitoring and incident response, and comprehensive analysis of systems to iteratively improve reliability, performance, and security.

Key responsibilities:

Administer overall site availability, security, latency, and system health.
Provision, install, configure, operate, and maintain services, system software, and related infrastructure.
Develop comprehensive monitoring solutions that provide full visibility into the different system components, using tools like Kubernetes, Prometheus, Grafana, ELK, Datadog, New Relic, etc.
Enable the development team to release code quickly and reliably by ensuring full observability of systems and automated detection of performance and integration issues.
Formulate technical performance measures and implement them using queries, logs, code instrumentation and other analytics tools.
Design dashboards and visualizations that effectively convey technical measures.
Troubleshoot issues at multiple layers of deployment (hardware, operating environment, network, and application), conduct root cause analysis, and make recommendations based on your findings.
Work with development teams to ensure best practices for scalability, reliability, and security are designed and implemented from the start.
Forecast changes in demand and capacity to establish appropriate scalability plans and drive decisions on the right-sizing of servers, storage and other resources.
Design and perform high-throughput stress testing to determine system capacity limits and identify points of failure.
Troubleshoot critical customer issues related to Syndica’s RPC, APIs, and App Deployments.

What does success in this role look like?

In three months, you will have become our go-to for overall site availability, security, latency, and system health. You will have taken on independent code review responsibilities and be collaborating on the design of new features.
In six months, you will have earned the trust of the team. You will be delivering tasks through the entire SDLC, from design through development, with minimal guidance, and helping to effectively mentor new engineers joining the team.
In twelve months, you will have established a cadence of predictable, on-time delivery without cutting corners.

Top Skills: Ansible, AWS, Azure, Chef, Datadog, Docker, ELK, GCP, Grafana, JMeter, Kubernetes, Locust, New Relic, Prometheus, Python, Rust, Shell, Terraform, TypeScript
