
Join our mission to create AI agents that can do real work using a computer mouse and keyboard. We're revolutionizing the medical claims industry with cutting-edge agent tooling and vision-language models.
MEN WANTED FOR UNCERTAIN JOURNEY
BELOW MARKET WAGES, LONG HOURS
BUILDING TRAINING PIPELINES AND
DATASETS. DEFEAT IS NOT AN OPTION.
Equity and impact from day one,
fame and money on success.
$40k/year + 10% equity + studio in Austin + Uber Eats budget · Austin, TX
This isn't a job. This is a mission. Mercenaries need not apply.
This is a startup founder's dream job - get in just when things have gotten moving, but before revenue goes from a pledge to a reality. We have a pilot program of 10 dental practices (worth $6,000/mo in revenue) and are wrapping up the MVP. Employee #4 carries a lot of weight: you will have a lot of responsibility and will drive the future value of the company with your own hands.
We're looking for someone who believes in creating digital labor that transforms industries. If you're chasing a paycheck, this isn't the role for you. But if you want to build something that matters and are willing to do whatever it takes to make it happen - this is your moment.
Imagine working in the garage of Apple Computer back in the late 70s. You are employee #4. Missionaries will be rewarded far more long term than a mercenary could dream of now.
Today the ML Training Specialist, tomorrow the VP of ML Research.
We're looking for an experienced ML Training Specialist to help us build and scale our digital labor agents. You'll work on training vision-language models, implementing fine-tuning pipelines, creating scalable inference systems, and building training flywheels that continuously improve our agent capabilities. This is a foundational role where you'll have significant impact on our technology and product direction.
What We Offer
• $40,000/year salary (starting, scales with investment/revenue)
• 10% equity in the next funding round (4-year vest, 1-year cliff)
• Studio apartment in Austin
• Monthly Uber Eats budget
The Reality
We're a seed-stage pre-revenue company running a pilot program with 10 dental practices. We're approximately 6 months from revenue. The $40k salary meets FLSA exempt requirements and will scale to market rate as we secure investment and generate revenue. ML engineers will be first to scale.
Why 10% Equity is Massive
• Investors at this stage typically get 1-4%
• Board members get ~1%
• Pilot partners get 0.5%
• Founding ML engineer gets 10%
📍 Austin, TX Required
Work from your studio apartment (we provide). CTO is down the hall. Immediate collaboration, sharp feedback cycles. This is the distributed garage - everyone has their workspace, close enough to solve problems together.
🇺🇸 US Citizens Only
Equity component requires US citizenship.
• Hands-on experience with distributed training for models 7B+ parameters using DeepSpeed, FSDP, or similar frameworks. We use Modal.com for our training infrastructure.
• Proven track record fine-tuning large models for specific tasks and domains
• Experience building and optimizing inference systems that scale efficiently
• Ability to implement continuous training loops that improve model performance over time
• Strong ability to create, curate, and test datasets for training and evaluating large models
• Work on technology that directly reduces administrative burden in healthcare and creates measurable value for dental practices
• Push the boundaries of what's possible with agentic AI systems that interact with real-world software
• Take ownership of critical ML infrastructure and strategy in a foundational role
• Join an early-stage startup with significant upside potential and room to grow into leadership
No resumes. Answer the questions below in an email. We use an automated system to surface the top 5 responses based on depth of experience and technical accuracy.
⚠️ DO NOT USE AN LLM FOR THIS
Answers must be from your mind alone. We check for LLM-generated text and will filter you out. If you need an LLM to answer these questions, you're not who we're looking for.
You are building a dataset based on some customer screens. The data includes both an action and a thought. You have an eval script that runs against the finetuned model, and it consistently fails. What experiments would you do to try and find the problem?
You are finetuning a model and the dataset is growing to the point where a single GPU can't fit the batches in its memory. How do you solve this?
Tell us about an ablation you performed that surprised you.
Tell us your thought process when building your hypothesis, data, and experiment, and how you determine if your experiment made positive gains.
Tell us of a time you were SURE you knew why an eval was failing, and you turned out to be wrong. What did you learn from the experience?
You need to build a dataset for desktop automation. You have two options: 500 meticulously hand-crafted examples, or 10,000 synthetically generated examples with some noise. Which do you choose and why? How do you validate your choice was correct?
Describe your experience generating synthetic training data. What methods have you used? What are the failure modes you watch for? How do you validate synthetic data quality before training?
You're building an eval for an agent that needs to navigate complex UIs. What metrics do you track? How do you know if a 5% improvement in your metric actually means the agent is better in production?
Tell us about your worst training run. What went wrong, how much did it cost you (time/money), and what safeguards did you put in place afterward?
You're fine-tuning a 7B model to click the correct button on a dental insurance website. How many examples do you need in your training set? Walk us through your reasoning.
How does HIPAA compliance impact the traditional ML training process when working with healthcare data? What constraints does it create, and how would you work around them?
Email your answers to:
Subject: ML Training Specialist
We value truth and curiosity most of all
Seek objective reality over comforting narratives. Ask why relentlessly and dig deeper until you understand how things really work.
Bias toward action and rapid iteration. Ship early, learn quickly, and improve continuously.
Obsess over customer success and feedback. Build products that create real, measurable value.
Push the boundaries of what's possible while maintaining high code quality and system reliability.