
DevOps Engineer

Addi · Bogotá · posted 4 months ago
Full-time · Software / IT
Python · AWS · Terraform

About Addi

We are a leading financial platform building the future of payments, shopping, and banking: a world where consumers and merchants transact effortlessly and grow together, and where we create abundance and pride for them. Today, we serve over 2 million customers and partner with more than 20,000 merchants, making Addi Colombia’s fastest-growing marketplace.

We provide banking solutions (deposits, payments, unsecured credit) and commerce services (e-commerce, marketing) using state-of-the-art technology, bridging the financial gap for millions and redefining how people experience financial freedom. As the country’s leading Buy Now, Pay Later provider, we have secured regulatory approval to operate as a bank, unlocking even greater opportunities for our customers. In the past year, we have also achieved profitability, reinforcing the strength of our business model and our ability to scale sustainably.

Our mission has earned the trust of world-class investors, including Andreessen Horowitz, Architect Capital, GIC, Goldman Sachs, Greycroft, Monashees, Notable Capital, Quona Capital, Union Square Ventures, Victory Park Capital, and more, who back our vision for the future. With their support, we are not just growing; we are transforming Latin America’s financial ecosystem and shaping how the next generation shops, pays, and banks in Colombia.

But what truly sets us apart is how we build. We are a conscious company, driven by deep experience in scaling technology, services and products, and we live by our values every day.

About the Role

This is where you come in. Below, you’ll find what this role is all about—the impact you’ll drive, the challenges you’ll tackle, and what it takes to thrive at Addi. If you’re ready to be part of something big, keep reading.

What’s the mission you’ll drive

To architect and lead the evolution of Addi’s engineering platform, championing a culture of independent deployability and proactive reliability that enables rapid product deployment and guarantees the availability, security, and scalability required to support our transformation into a leading financial platform in Latin America.

What you will do

Architect and provide the tooling that allows product squads to deploy without cross-team synchronization. Transition 80% of core services to a self-service deployment model where "deployment trains" are replaced by independent, asynchronous releases.

Redesign the global networking architecture to move away from legacy monolithic ingress points toward a decoupled, multi-layered security perimeter. Implement a "Security-by-Design" topology (e.g., Service Mesh or API Gateway isolation) that abstracts internal services from public entry points.

Deliver a standardized "Delivery-as-a-Product" capability that enables complex traffic shifting strategies for all teams. Provide a unified interface for teams to manage Canary and Blue-Green deployments, including automated health-check gates that trigger rollbacks without manual intervention.

Build the telemetry "Golden Signals" pipeline and provide the libraries and tooling (OpenTelemetry) that other teams need to instrument their own services. Target: 100% of new services are "Observability-Ready" at launch, with automated SLO dashboards and alerting enabled via the platform.

Integrate AI-assisted development workflows into the platform (e.g., AI-driven IaC generation or LLM-based troubleshooting assistants for dev squads). Target: a 30% reduction in platform-related support tickets, achieved by providing AI-assisted self-service documentation and diagnostic tools.
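
To make the "automated health-check gates" in the delivery item above concrete, here is a minimal sketch in Python of a canary gate that shifts traffic in steps and rolls back on the first unhealthy reading. It is illustrative only and does not describe Addi's actual tooling: the function name run_canary, the probe callback, the traffic weights, and the 1% error-rate threshold are all assumptions made for this example.

```python
from typing import Callable, Iterable, Tuple


def run_canary(
    weights: Iterable[int],
    probe: Callable[[int], float],
    max_error_rate: float = 0.01,  # assumed threshold: 1% errors
) -> Tuple[str, int]:
    """Shift traffic to the canary step by step, gating each step on health.

    weights: increasing percentages of traffic to send to the canary.
    probe:   hypothetical callback returning the canary's observed error
             rate (0.0 to 1.0) once it is serving `weight` percent of traffic.

    Returns ("promote", last_weight) if every gate passes, or
    ("rollback", 0) as soon as one gate fails.
    """
    last_healthy = 0
    for weight in weights:
        error_rate = probe(weight)
        if error_rate > max_error_rate:
            # Gate failed: send all traffic back to the stable version.
            return ("rollback", 0)
        last_healthy = weight
    return ("promote", last_healthy)


if __name__ == "__main__":
    steps = [5, 25, 50, 100]
    # A canary that stays healthy at every step is promoted.
    print(run_canary(steps, probe=lambda w: 0.002))
    # A canary that degrades at 25% traffic is rolled back automatically.
    print(run_canary(steps, probe=lambda w: 0.05 if w >= 25 else 0.002))
```

In a real platform the probe would query a metrics backend and the weights would be applied through a load balancer or service mesh; both are stubbed out here so the gating logic stays visible.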

What we’re looking for

Proven Expertise in Cloud & Infrastructure Fundamentals

3-5 years of full-time, relevant experience as a DevOps Software Engineer, Cloud Engineer, or Site Reliability Engineer (SRE).

Demonstrated mastery in designing and operating highly available, scalable, and self-healing systems on the cloud, aligning with the AWS Well-Architected Framework.

Expert proficiency in Linux system administration and scripting (Bash, Python) for large-scale automation, and experience administering relational/non-relational database systems.

Track Record of Success with IaC and CI/CD Optimization

Deep, professional experience defining, provisioning, and managing 100% of cloud infrastructure using modern IaC tools like Terraform or AWS CDK.

Proven ability to implement and optimize CI/CD pipelines (e.g., using GitHub Actions or Jenkins) resulting in a quantifiable reduction in Mean Time to Deployment (MTTD) and Change Failure Rate (CFR).

Proven ability to design and deliver internal developer platforms that enable product squads to deploy independently via Service Mesh, Feature Flags, and Contract Testing frameworks.

Expert proficiency in implementing complex deployment patterns (Canary, Blue/Green) at scale, automating traffic shifting and health-check gates to ensure zero-downtime releases.

Demonstrates Technical Ownership & SRE Mindset

Takes full responsibility for the reliability and performance of critical production systems, treating infrastructure as a product.

Not only uses observability tools (Prometheus, Grafana, OpenTelemetry) but builds the "Observability Pipeline" that enables other teams to instrument their own services and define their own SLOs/SLIs.

Proactively identifies and eliminates "toil" by automating repetitive tasks, then measures and reports on the resulting efficiency gains.

Actively participates in on-call rotations, writes clear runbooks, and leads post-mortem analyses to prevent recurrence of incidents.

Has Solid Expertise in IaC & a Security-First Approach

Expertly defines, provisions, and manages 100% of cloud resources (AWS) using Infrastructure as Code (e.g., Terraform, AWS CDK).

Deep expertise in redesigning cloud networking topologies (e.g., Transit Gateways, PrivateLink, Service Mesh) to decouple entry points and enforce a Zero-Trust security mindset.