CrowdStrike

Director, Model Post-Training and Agentic Research (Remote)

Posted 2 Hours Ago
Remote or Hybrid
Hiring Remotely in USA
195K-290K Annually
Senior level
Lead the hands-on development of the full post-training stack for security-domain AI, including SFT, RLHF/RLAIF, reward modeling, and agent-RL harnesses. Build training environments and agent scaffolds, define evaluation and benchmarks, drive research direction, publish findings, and recruit and grow a high-density research and engineering team while actively contributing to experiments and architecture.
The summary above was generated by AI

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed — we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About the Role:

The security domain presents one of the richest and most consequential training signal environments in applied AI. It’s adversarial by nature, grounded in real operational outcomes, and evolving faster than any static benchmark can capture. We're building the post-training and reinforcement learning capability to turn the latest models and harnesses into security-specialized systems that reason, plan, and act across complex cyber workflows. The person leading this work will be in the research, not just directing it.

In this role, you'll own the full post-training stack for security-domain AI (e.g., supervised fine-tuning, reward modeling, RLHF and RLAIF pipelines, and agent-RL environments) and the agentic research that sits on top of it. That means designing, building, and evaluating the harnesses that security agents actually run on: the scaffolding, tool-use interfaces, planning loops, memory and context management, and multi-step execution frameworks that determine whether a trained model can operate reliably on complex security tasks. Post-training and agent architecture are not separable problems in this work. The reward signal you design has to reflect what the harness can measure, and the harness has to be built to surface what training needs to optimize. You'll set the technical direction on both, and you'll be in the work on both.

You'll lead a team of research scientists and engineers, but the team will look to your own work as the standard. The successful candidate shapes research priorities, keeps the team moving at high velocity across multiple training cycles per year, and elevates the quality of work by staying close enough to it to know what good actually looks like.

What You'll Do:

  • Own and personally drive the full post-training pipeline for security-domain AI — SFT, RLHF/RLAIF, agent-RL, and reward modeling. Set research priorities and architectural direction, and lead experimental work on the hardest problems yourself rather than delegating them away. Design reward modeling methodology grounded in verified security outcomes rather than proxy signals, drawing on both human expert feedback and automated adversarial evaluation. Define data curation standards across sourcing, filtering, quality scoring, and domain weighting that drive measurable capability improvement.

  • Build and maintain agent-RL training environments that simulate realistic cyber workflows (multi-step offensive and defensive tasks, tool use, and long-horizon planning), contributing directly to environment design and reward shaping. Lead the design and build of the agent harnesses that run on top of those trained models: scaffolding architecture, tool-calling interfaces, planning and reasoning loops, and memory and context management. Treat harness design with the same rigor as the training pipeline; these systems determine whether strong post-training translates into reliable, trustworthy behavior in the field.

  • Develop and own evaluation methodology for the full agentic stack: not model capability in isolation, but harness behavior, tool-use reliability, planning coherence, and end-to-end task completion across realistic security workflows. Define the benchmarks, red-line tests, and measurement practices that give the team and the organization genuine confidence that an agent works.

  • Partner closely with other teams to ensure post-training and agentic work integrates cleanly with the broader model development loop. Contribute original research through publications, external presentations, and open-source artifacts where appropriate, building CrowdStrike's credibility as a research-first organization in this space.

  • Recruit, develop, and retain a high-density team of research scientists and ML engineers. Set a technical bar through your own contributions, not just your standards.

What You'll Need:

  • MS or PhD in computer science, machine learning, or a related quantitative discipline.

  • 8+ years of experience in ML research or engineering, with meaningful depth in large language model post-training.

  • Hands-on expertise across the modern post-training stack, including SFT data pipelines, RLHF/RLAIF, PPO or similar RL algorithms applied to language models, and reward model design and training. This means you've done the work, not managed people who have.

  • Demonstrated experience designing or building agentic system harnesses for LLM-based agents, including tool-use frameworks, planning scaffolds, multi-step execution environments, and context or memory management. You've built these systems, not just used them.

  • Strong evaluation instincts: experience designing evaluation protocols that are resistant to overfitting, capable of measuring genuine capability improvement, and interpretable to both technical and non-technical stakeholders.

  • Track record of running high-velocity research programs with disciplined tracking and fast iteration.

  • Proven ability to lead and grow research teams while remaining a credible, active technical contributor.

Ways to Stand Out:

  • Demonstrated experience building or operating RL training environments for language model agents, including environment design, rollout infrastructure, and reward shaping.

  • Experience applying post-training or RL techniques in security, adversarial ML, or other high-stakes operational domains where ground truth is expensive and noisy.

  • Deep hands-on experience with agent harness architecture applied to long-horizon, multi-step task environments where reliability and failure modes matter as much as peak capability.

  • Background designing synthetic data pipelines or simulation environments for agent training in complex, tool-using workflows.

  • Familiarity with the offensive or defensive security practitioner's workflow — penetration testing, detection engineering, incident response, or threat intelligence — sufficient to reason about what good model behavior looks like in practice.

  • Published research in post-training, RLHF, RL for language agents, or related areas at top-tier venues (NeurIPS, ICML, ICLR, ACL, or equivalent).

  • Experience working on and adapting open-weight base models (Llama-class, Qwen-class, or similar) for domain-specialized continued training and fine-tuning.

Benefits of Working at CrowdStrike:

  • Market leader in compensation and equity awards

  • Comprehensive physical and mental wellness programs 

  • Competitive vacation and holidays to recharge

  • Paid parental and adoption leaves

  • Professional development opportunities for all employees regardless of level or role

  • Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections

  • Vibrant office culture with world class amenities

  • Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program.

CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements.

If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at [email protected] for further assistance.

Find out more about your rights as an applicant.

CrowdStrike participates in the E-Verify program.

Notice of E-Verify Participation

Right to Work

CrowdStrike, Inc. is committed to fair and equitable compensation practices. Placement within the pay range is dependent on a variety of factors including, but not limited to, relevant work experience, skills, certifications, job level, supervisory status, and location. The base salary range for this position for all U.S. candidates is $195,000 - $290,000 per year, with eligibility for bonuses, equity grants and a comprehensive benefits package that includes health insurance, 401k and paid time off.

For detailed information about the U.S. benefits package, please click here

Expected Close Date of Job Posting is: 08-11-2026
