
Software EngineeringMeetupFreeOnline
Building Multimodal Agents With NVIDIA Nemotron and RAPIDS - Part One
Tue 30 Jun · 16:00
< 50 attendees
About this event
This two-part workshop covers building agents that read documents, extract text via OCR, and run topic modeling with NVIDIA Nemotron and RAPIDS. Both sessions start with concepts and end with code you can experiment with. No multimodal pipeline experience required.
Build multimodal extraction pipelines with Nemotron 3 Nano Omni and Nemotron Parse through NVIDIA NIMs. Turn charts, tables, screenshots, scanned documents, and screen recordings into structured artifacts for AI agents.
In this workshop you will learn how to:
- Call Nemotron Parse and Nemotron 3 Nano Omni from Python through NIM endpoints
- Build a document pipeline with OCR text, bounding boxes, tables, visual descriptions, and page context
- Wrap Python logic as a LangGraph tool and connect model output to validated tool execution
Prerequisites:
Python 3 fundamentals, exploratory data analysis workflows, and basic microservice concepts.
Recommended Resources:
NVIDIA NIMs intro
NVIDIA RAPIDS intro
LangChain Academy's Intro to LangGraph
Source: meetup