Role overview

[1747] - Sr Machine Learning Engineer

Staffy

Published via Mainder

What you'll do

About the Role

We are looking for a Senior Machine Learning Engineer to lead the end-to-end design and implementation of large-scale, production-grade AI systems. This is a highly hands-on role focused on building, scaling, and operating ML systems that serve a high volume of users in real-world environments.

You will take ownership of ML system architecture — from data pipelines and model integration to inference, monitoring, and optimization — with a strong emphasis on AWS-based infrastructure and RAG architectures deployed at scale.

This is not a research role. We are looking for a senior systems builder who has successfully delivered AI/ML platforms in production and understands the challenges of reliability, performance, cost efficiency, and scalability in high-traffic environments.

You will collaborate closely with AI, data science, and product teams while operating with high autonomy and technical ownership.

This is a fully remote position, open to candidates based in LATAM, with a preference for Eastern or Central time zones to collaborate effectively with teams across the US, Europe, and India.


Responsibilities

  • Build and maintain production-grade data pipelines that power AI systems, integrating with existing infrastructure and developing new pipelines as needed

  • Design, implement, and deploy RAG architectures, embeddings pipelines, and LLM-powered features on AWS

  • Implement monitoring, observability, and evaluation frameworks for AI systems, including performance, cost, and output quality metrics

  • Rapidly prototype, test, and validate AI approaches, iterating from experimentation to production-ready solutions

  • Optimize LLM usage across the platform, including model selection, prompt engineering, and inference deployment

  • Collaborate closely with product and data science teams to translate business and customer needs into working AI systems

  • Ensure systems are scalable, maintainable, well-documented, and ready for long-term use


Requirements

Technical & Professional Experience

  • 10+ years of experience in machine learning engineering, data engineering, or a combination of both

  • 5+ years of experience building and deploying production AI applications using LLMs

  • 3+ years of hands-on experience with RAG architectures, vector databases, embeddings, and retrieval systems in production

  • Experience using AWS Bedrock for AI/ML workloads

  • Strong data engineering skills, including pipeline design, data transformation, and large-scale dataset handling

  • Solid software engineering fundamentals, including version control, testing, CI/CD, and code review practices

Ways of Working

  • Ability to work independently with minimal supervision while collaborating effectively with cross-functional teams

  • Strong ownership mindset and accountability for end-to-end delivery

  • Active use of AI-assisted development tools (e.g., Cursor, Claude Code, GitHub Copilot) in daily workflows

Nice to Have

  • Experience with LLM fine-tuning and a clear understanding of when to fine-tune versus rely on prompt engineering

  • Background in retail, e-commerce, or product data domains

  • Experience with Python, Node.js, and modern application development stacks

  • Experience optimizing AI models for cost-efficient inference at scale


Benefits

  • 100% remote work

  • International contractor agreement with competitive USD compensation

  • Stable, long-term client engagement

  • Flexible schedule and high autonomy

  • Exposure to complex, large-scale technical challenges

  • Supportive, inclusive, and collaborative engineering culture


About the Company

We are a technology agency with over 10 years of experience delivering high-impact software solutions for international clients, primarily across the US and Canada. We partner with leading companies to build scalable, secure, and data-driven platforms, embedding our engineers directly with client teams as trusted long-term partners. Our culture is remote-first and grounded in technical excellence, ownership, collaboration, and accountability.