About the role
We have a small AI team (3 engineers and a research-leaning former tariff lawyer). The work mixes LLMs (classification, regulatory Q&A, document parsing), classical ML (anomaly detection on entry data), and a lot of domain-shaped retrieval.
Nothing here is a chatbot for chatbot's sake. Every model has a P&L attached.
What you'll do
- Ship and operate models that classify HTS codes at sub-15-second latency, with confidence calibrated against real CBP rulings.
- Build the anomaly detection layer that flags suspicious entries before they're filed. We catch things humans miss.
- Work with the compliance team to design explainable outputs. The model has to tell the user why, in regulator-defensible language.
- Own evaluation. Every model ships with offline evals against historical entries plus online A/B tests on real traffic.
- Build the retrieval and grounding layer that powers our internal compliance assistant. Used by our team every day.
- Partner with engineering to make AI features feel native, not bolted on. Latency, reliability, and pricing matter.
What you bring
- 5+ years of professional software engineering with at least 2 years shipping ML or AI features in production.
- Comfort with both LLM-shaped problems (RAG, function calling, evals, prompt engineering as engineering, not art) and classical ML (gradient boosting, calibration, drift monitoring).
- Strong Python. Strong product instinct. You can argue with a PM about whether a feature actually helps a customer.
- A portfolio of shipped work, ideally including something where the model's wrongness cost real money and you fixed it.
- Willingness to live in someone else's domain. You will need to learn HTS classification well enough to argue with a customs broker.
Bonus
- Prior work in regulated domains: legal tech, fintech, healthcare AI, etc.
- Experience with tabular ML at scale (XGBoost, LightGBM, calibration, anomaly detection).
- Open source contributions or published research.
- You've operated an LLM in production at meaningful cost. You know what a token costs and what a cache hit means.
How we hire
(1) Recruiter intro, 30 minutes.
(2) Hiring manager (VP Engineering) technical conversation, 60 minutes.
(3) Take-home: a 4-hour scoped task on a real Mamora-shaped problem (calibration of an HTS classifier against a holdout set). We pay for take-homes.
(4) Pair-programming review, 90 minutes, with two engineers.
(5) Founder and Head of Product conversation, 45 minutes.
From first conversation to offer: typically 3 to 4 weeks.
Apply
Send us your resume.
Email jobs@mamora.io with your resume and a link to a piece of ML work you're proud of. A GitHub repo, a paper, or an internal writeup with the proprietary bits removed are all fine.