NIM Solution Architect

1 Month ago • 3 Years + • Artificial Intelligence

Job Summary

Job Description

As a NIM Solution Architect at NVIDIA, you will drive the implementation and deployment of NVIDIA Inference Microservice (NIM) solutions. Responsibilities include using the NIM Factory Pipeline to package optimized models into containers, refining NIM tools for the community, designing agentic AI solutions using NIMs, delivering technical projects and demos, providing client support, collaborating with cross-functional teams, and championing NVIDIA software within the technical community. You'll also support the NVAIE team and contribute to its business in China. This role requires expertise in deploying and optimizing large language models, proficiency in inference frameworks (TensorRT, ONNX Runtime, PyTorch), strong Python/C++ programming, and familiarity with DevOps/MLOps practices.
Must have:
  • 3+ years experience
  • LLM deployment & optimization
  • Inference framework proficiency (TensorRT, etc.)
  • Python/C++ programming skills
  • DevOps/MLOps experience
  • Problem-solving & troubleshooting skills
Good to have:
  • Experience with field LLM projects
  • TensorRT expertise
  • AI workflow design experience
  • Cluster resource management tools
  • Agile methodologies
  • CUDA optimization experience
  • Large-scale HPC/enterprise system design

Job Details

NVIDIA is a leading company in AI computing. At NVIDIA, our employees are passionate about AI, HPC, visual computing, and gaming. Our Solution Architect team focuses on bringing NVIDIA's new technology into different industries. We help design the architecture of AI computing platforms and analyze AI and HPC applications to deliver value to our customers. This role will be instrumental in leveraging NVIDIA's cutting-edge technologies to optimize open-source and proprietary large models, create AI workflows, and support our customers in implementing advanced AI solutions.

What you’ll be doing:

  • Drive the implementation and deployment of NVIDIA Inference Microservice (NIM) solutions 

  • Use the NVIDIA NIM Factory Pipeline to package optimized models (including LLM, VLM, Retriever, CV, OCR, etc.) into containers that provide standardized API access for on-prem or cloud deployment

  • Refine NIM tools for the community and help community members build their own performant NIMs

  • Design and implement agentic AI tailored to customer business scenarios using NIMs

  • Deliver technical projects, demos and client support tasks as directed by the Solution Architecture Leadership 

  • Provide technical support and guidance to customers, facilitating the adoption and implementation of NVIDIA technologies and products 

  • Collaborate with cross-functional teams to enhance and expand our AI solutions portfolio

  • Be an internal champion for NVIDIA software and total solutions in the technical community

  • Be an industry thought leader on integrating NVIDIA technology, especially inference services, into LHA, business partners, and the wider community

  • Assist in supporting the NVAIE team and driving NVAIE business in China

What we need to see:

  • 3+ years of working experience, with a Bachelor's or Master's degree in Computer Science, Artificial Intelligence, or a related field

  • Proven experience in deploying and optimizing large language models 

  • Proficiency in at least one inference framework (e.g., TensorRT, ONNX Runtime, PyTorch) 

  • Strong programming skills in Python or C++ 

  • Familiarity with mainstream inference engines (e.g., vLLM, SGLang)

  • Experience with DevOps/MLOps such as Docker, Git, and CI/CD practices 

  • Excellent problem-solving skills and ability to troubleshoot complex technical issues 

  • Demonstrated ability to collaborate effectively across diverse, global teams, adapting communication styles while maintaining clear, constructive professional interactions 

Ways to stand out from the crowd:

  • Experience in architectural design for field LLM projects 

  • Expertise in model optimization techniques, particularly using TensorRT 

  • Knowledge of AI workflow design and implementation, experience with cluster resource management tools, and familiarity with agile development methodologies

  • CUDA optimization experience and extensive experience designing and deploying large-scale HPC and enterprise computing systems

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.
