Wells Fargo

Lead Software Engineer - Gen AI Inferencing Services and Agentic AI

Posted Yesterday
Hybrid
3 Locations
Senior level
About this role:
Wells Fargo is seeking a Lead Software Engineer - LLM Inferencing & Agentic AI within Digital Technology's AI Capability Engineering organization. In this role, you will design, build, and operate the GenAI Platform's GPU infrastructure and LLM/SLM serving systems, ensuring highly performant, reliable, and secure model inferencing at scale.
You will work across the full inferencing stack, from GPU cluster configuration and Run:AI / OpenShift AI orchestration to vLLM and NVIDIA Triton runtime optimization, including performance tuning, production hardening, and multi-model deployment. Focus areas include operating H100/H200 GPU clusters, advanced GPU scheduling, disaggregated prefill/decode serving, deep observability, and productionizing endpoints behind the enterprise API Gateway.
You will also design and deliver OpenAI-compatible APIs (Responses, Interactions), support MCP server integrations, and contribute to agentic AI development, including tools, agents, workflows, and evaluations. This role additionally involves building UI surfaces that improve developer and operator productivity, enabling teams to use, monitor, and troubleshoot AI services more effectively.
Strong experience with LLM/SLM behavior, inferencing optimizations, tuning techniques, and prompt engineering/evaluation is expected.
In this role, you will:
  • Lead complex technology initiatives including those that are companywide with broad impact
  • Act as a key participant in developing standards and companywide best practices for engineering complex and large scale technology solutions for technology engineering disciplines
  • Design, code, test, debug, and document for projects and programs
  • Review and analyze complex, large-scale technology solutions for tactical and strategic business objectives, enterprise technological environment, and technical challenges that require in-depth evaluation of multiple factors, including intangibles or unprecedented technical factors
  • Make decisions in developing standards and companywide best practices for engineering and technology solutions that require an understanding of industry best practices and new technologies, influencing and leading the technology team to meet deliverables and drive new initiatives
  • Collaborate and consult with key technical experts, senior technology team, and external industry groups to resolve complex technical issues and achieve goals
  • Lead projects or teams, or serve as a peer mentor
  • Engineer GPU clusters and node pools; configure NVLink/NVSwitch, NVIDIA GPU Operator, MIG profiles, container runtime, and kernel/driver baselines for high-throughput LLM/SLM workloads.
  • Design and implement OpenAI-compatible APIs (Responses, Interactions) behind the AI Gateway: define OpenAPI contracts, authN/Z (OAuth2/mTLS), rate limits/quotas, SLAs, versioning/deprecation, and SDK generation.
  • Build and support MCP servers and tool adapters; manage agent/tool identity and capability metadata; integrate with agent registries and execution flows.
  • Develop Agentic AI capabilities (tools/agents/workflows) including disaggregated prefill/decode patterns; contribute to runbooks, guardrails, and safe tool usage.
  • Build UI surfaces (developer/ops consoles) for endpoint onboarding, prompt testing, evaluations, observability dashboards, and incident response workflows.
  • Apply prompt engineering and evaluation best practices; create golden test suites, regression harnesses, and measurable SLO-aligned criteria for production promotion.
Required Qualifications:
  • 5+ years of Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
Desired Qualifications:
  • 5+ years of experience in Python for backend/services development, packaging, instrumentation, and automation
  • 5+ years of experience building modern web UI for developer/ops workflows, including dashboards, wizards, and prompt/eval tooling, with strong testing and accessibility practices
  • 1+ years of experience building MCP servers, tool adapters, and agent workflows, with an understanding of agent identity, permissions, and governance metadata
  • 2+ years of experience in GenAI engineering, including LLM/SLM operations, fine-tuning/evaluation, per-model performance recipes, and prompt engineering and evaluation harnesses
  • 1+ years of experience with LLM API exposure, including AI Gateway concerns (OAuth2/mTLS, rate limits/quotas, OpenAPI/SDKs, SLAs, versioning/deprecation) and OpenAI-compatible API design for Responses and Interactions
  • 1+ years of experience serving large and small language models (LLM/SLM), including vLLM, Triton, TensorRT-LLM/MII, KV cache strategies, FP8/INT4 quantization (AWQ/GPTQ), and certified disaggregated prefill/decode
  • 1+ years of experience with orchestration tools for GPU workload management, such as Run:AI (Collections/queues, quotas, preemption, fair share), OpenShift AI (RHOAI), and OCP/GKE administration
  • 1+ years of experience with the GPU inference layer, including NVIDIA technologies such as CUDA, cuDNN, NVLink/NVSwitch, MIG, and NIXL, as well as GPU profiling and H100/H200 performance tuning
Job Expectations:
  • Hybrid onsite at required locations
  • No visa sponsorship available
  • No relocation assistance for this position
Posting End Date:
27 Jan 2026
*Job posting may come down early due to volume of applicants.
We Value Equal Opportunity
Wells Fargo is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other legally protected characteristic.
Employees support our focus on building strong customer relationships balanced with a strong risk-mitigating and compliance-driven culture, which firmly establishes those disciplines as critical to the success of our customers and company. They are accountable for execution of all applicable risk programs (Credit, Market, Financial Crimes, Operational, Regulatory Compliance), which includes effectively following and adhering to applicable Wells Fargo policies and procedures, appropriately fulfilling risk and compliance obligations, timely and effective escalation and remediation of issues, and making sound risk decisions. There is emphasis on proactive monitoring, governance, risk identification and escalation, as well as making sound risk decisions commensurate with the business unit's risk appetite and all risk and compliance program requirements.
Candidates applying to job openings posted in Canada: Applications for employment are encouraged from all qualified candidates, including women, persons with disabilities, aboriginal peoples and visible minorities. Accommodation for applicants with disabilities is available upon request in connection with the recruitment process.
Applicants with Disabilities
To request a medical accommodation during the application or interview process, visit Disability Inclusion at Wells Fargo.
Drug and Alcohol Policy
Wells Fargo maintains a drug free workplace. Please see our Drug and Alcohol Policy to learn more.
Wells Fargo Recruitment and Hiring Requirements:
a. Third-party recordings are prohibited unless authorized by Wells Fargo.
b. Wells Fargo requires you to directly represent your own experiences during the recruiting and hiring process.

Top Skills

CUDA
cuDNN
GPU
mTLS
NVIDIA Triton
OAuth2
OpenAPI
OpenShift AI
Python
Run:AI
vLLM

Wells Fargo Charlotte, North Carolina, USA Office

355 W Martin Luther King, Jr BLVD, Charlotte, NC, United States, 28202
