Lob was founded in 2013 by technical co-founders with a vision to connect the world one mailbox at a time. Today, we're transforming the way businesses use direct mail and bringing the power of technology to a traditionally manual channel.
Our modern logistics and fulfillment engine helps businesses to build and scale high-quality, personalized direct mail programs without the operational burden. As we grow to meet the evolving needs of our customers and expand our product offerings, we’re building a team to shape the future of direct mail.
About The Role
We are looking for a Senior Platform Engineer to help scale and improve the reliability, observability, performance, and cost efficiency of our platform infrastructure.
This role is focused on observability engineering and infrastructure optimization across AWS environments. The ideal candidate has deep hands-on experience with Datadog, OpenTelemetry, and HashiCorp Nomad, and understands how to build highly visible, scalable, and operationally efficient systems while actively reducing unnecessary infrastructure spend.
You will work closely with engineering teams to improve telemetry, monitoring, performance testing, platform reliability, and cloud infrastructure efficiency across a fast-moving distributed environment, including leveraging modern AI-driven tooling and operational workflows where appropriate.
What You’ll Work On
Building and improving observability across distributed systems and services
Designing dashboards, alerting, metrics, tracing, and telemetry pipelines
Improving operational visibility using Datadog and OpenTelemetry
Helping evolve and mature the organization’s observability strategy and tooling
Supporting and improving HashiCorp Nomad orchestration environments
Identifying and implementing AWS cost-saving opportunities across compute, storage, and platform infrastructure
Improving infrastructure utilization and operational efficiency across Nomad workloads
Optimizing S3 storage utilization, lifecycle management, and storage costs
Designing and maintaining performance testing environments and tooling
Running load and performance tests to identify bottlenecks and scalability issues
Managing and tuning Elasticsearch/OpenSearch environments
Troubleshooting production performance issues across services, infrastructure, and databases
Partnering with engineering teams to improve platform reliability, scalability, and infrastructure efficiency
Responsibilities
Lead observability initiatives across infrastructure and applications
Design and maintain monitoring, telemetry, dashboards, tracing, and alerting systems
Build actionable visibility into platform health, reliability, and performance
Improve incident detection, troubleshooting, and operational response capabilities
Define observability standards and best practices across engineering teams
Drive infrastructure cost optimization initiatives across AWS services and platform environments
Analyze infrastructure utilization and recommend performance and cost efficiency improvements
Maintain and improve infrastructure-as-code standards and workflows
Design, build, and maintain scalable performance testing environments and tooling
Execute and analyze load/performance testing initiatives
Support and improve Nomad-based orchestration environments
Troubleshoot complex production and infrastructure issues across distributed systems
Collaborate closely with engineering teams to improve scalability, reliability, operational visibility, and infrastructure efficiency
Create and maintain operational documentation and platform best practices
Qualifications
7+ years of experience in platform engineering, infrastructure engineering, or site reliability engineering
Strong hands-on experience with HashiCorp Nomad
Deep expertise with Datadog
Strong experience implementing and operating observability platforms using OpenTelemetry and modern monitoring tooling
Experience with Grafana or similar visualization and observability platforms
Strong understanding of distributed tracing, metrics, logging, and monitoring best practices
Experience building dashboards, alerts, telemetry pipelines, and operational visibility tooling
Strong experience identifying and implementing AWS cost optimization strategies in production environments
Strong knowledge of S3 optimization, lifecycle management, and storage cost reduction
Experience building and running performance/load testing environments
Strong troubleshooting and performance analysis skills across distributed systems
Strong experience operating infrastructure in AWS environments
Strong experience with Terraform and infrastructure-as-code practices
Experience balancing platform reliability, observability, and infrastructure cost efficiency at scale
Experience working with distributed and event-driven architectures using technologies such as Redis, SQS, or Temporal
Experience managing and tuning Elasticsearch or OpenSearch clusters
Experience working in fast-paced engineering environments
Strong communication and collaboration skills
Nice to Have
Exposure to PostgreSQL RDS to Aurora migrations
Experience with Kubernetes
Experience with CI/CD systems and deployment automation
Experience with Go, Python, or TypeScript
Since great engineers come from a variety of backgrounds, it doesn’t particularly matter if you have a specific degree—we want to hear about your contributions in a real-world setting.
Compensation information
The compensation for this role consists of a base salary plus additional RSUs.
Annual Base Salary: $160,000 - $177,500
“Lob’s salary ranges are based on market data, relative to our size, industry and stage of growth. Salary is one part of total compensation, which also includes equity, perks and competitive benefits. Salary decisions are based on many factors including geographic location, qualifications for the role, skillset, proficiency and experience level. Lob reasonably expects to pay candidates who are offered roles within the provided salary ranges.”
We offer remote working opportunities in AZ, CA, CO, DC, FL, GA, IA, IL, MA, MD, MI, MN, MT, NE, NC, NH, NJ, NV, NY, OH, OR, PA, RI, TN, TX, UT, and WA, unless specified otherwise in the job description above.
If you are looking for a progressive, fun-spirited, and mentally stimulating environment, come join us at Lob!
Our Commitment to Diversity
Lob is an equal opportunity employer. We value diversity of backgrounds and perspectives, which helps us cultivate an environment of understanding and have a greater impact on our business and customers. We encourage under-represented groups to apply and do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, disability status, or criminal history, in accordance with local, state, and/or federal laws, including San Francisco’s Fair Chance Ordinance.
Recent awards
#88 on BuiltIn's Best Remote Midsize Companies to Work For in 2025
BuiltIn Best Remote Midsize Companies to Work For in 2024
BuiltIn Best Midsize Companies to Work For 2022