Serpentium Solutions is a pan-European team of business analysts, product managers, data managers, data scientists, and engineers. We thrive as a remote-first company, embracing the flexibility to work across borders while cherishing the moments we come together in person—whether in London, Budapest, or other key hubs.
13 April 2026

MLOps (hourly rate: 30 USD/h)

Abroad, remote, up to $5000

We’re building production-grade NLP systems and need someone who can take a model from research to reliable, scalable deployment. You’ll own the full lifecycle — from containerisation to live inference endpoints.

What you’ll do

• Package, serve, and monitor small language models on AWS SageMaker Serverless endpoints with optimised cold-start behaviour

• Build slim multi-stage Docker images, push to ECR, and keep inference images under tight size budgets

• Own the build → test → push → deploy CI/CD pipeline for ML services

• Configure IAM roles and manage secrets via AWS Secrets Manager following least-privilege principles

• Version datasets, models, and experiments; instrument latency, throughput, and accuracy in production

• Work with NLP libraries (spaCy, Transformers, FAISS, PyTorch) to build and iterate on NLP pipelines


Requirements:

Cloud & infrastructure:

• Docker — multi-stage builds, image optimisation

• AWS: ECR, IAM roles, Secrets Manager, SageMaker Serverless endpoint configuration

• CI/CD pipelines: build / test / push / deploy for ML services (GitHub Actions or similar)

ML & NLP:

• PyTorch, Hugging Face Transformers, spaCy, FAISS

• Hands-on experience running and tuning small language models (≤7B params) — spinning them up, stress-testing, optimising for latency and throughput

• Familiarity with quantisation (GGUF, ONNX, bitsandbytes) or model distillation


Nice to have

• RAG pipeline experience
