6-10 October 2024
Concordia University Conference Centre
America/New_York timezone

GStreamer Nervous System for AI Brain : Introducing Python Analytics

7 Oct 2024, 15:20
20m
Room 2

Speaker

Aaron Boxer (Collabora Inc.)

Description

With the growing success of machine learning (ML) language and speech models over the past four years, ML systems are behaving increasingly like human brains. These brains must be fed with data, and GStreamer is the perfect framework to do it. But how do we remove obstacles to rapid adoption?

ML research and commercial development take place almost exclusively in the Python world, and the currently dominant ML toolkit is PyTorch. PyTorch has succeeded for a number of reasons, including its simplicity, strong community, rapid innovation, broad hardware support and ease of integration with the vast Python ecosystem. Over the past year, PyTorch has introduced a compile feature (torch.compile), a Just In Time (JIT) compiler that dynamically optimizes code for the current target hardware. Performance improvements are astonishing: in some cases compiled PyTorch is faster than TensorRT on Nvidia hardware. But compile is not limited to just one hardware platform.

Collabora was the first to upstream neural network support into GStreamer via ONNX analytics elements. ONNX Runtime is a cross-platform inference engine whose C++ API has been integrated to enable new object detection and segmentation elements. We have also introduced the analytics metadata framework, which flexibly stores metadata generated by AI models along with the relationships between different pieces of metadata.
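To make the idea of metadata plus relationships concrete, here is a minimal pure-Python sketch. It is not the GStreamer analytics API: the names Metadata, MetadataStore and Relation, and the two relation kinds, are hypothetical and chosen only for illustration. It shows a detection and a classification attached to the same buffer, linked by directed relations that can be queried later.

```python
from dataclasses import dataclass, field
from enum import Enum


class Relation(Enum):
    # Hypothetical relation kinds, for illustration only.
    CONTAINS = "contains"
    IS_PART_OF = "is-part-of"


@dataclass
class Metadata:
    """One piece of model output, e.g. a detection or a class label."""
    id: int
    kind: str
    payload: dict


@dataclass
class MetadataStore:
    """Metadata for one buffer, plus directed relations between entries."""
    entries: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add(self, kind, **payload):
        # Ids are assigned in insertion order, starting at 0.
        mtd = Metadata(id=len(self.entries), kind=kind, payload=payload)
        self.entries[mtd.id] = mtd
        return mtd

    def relate(self, source, relation, target):
        # Store a directed edge: source --relation--> target.
        self.relations.append((source.id, relation, target.id))

    def related(self, source, relation):
        # Return every entry reachable from source through the given relation.
        return [self.entries[t] for s, r, t in self.relations
                if s == source.id and r is relation]


store = MetadataStore()
car = store.add("object-detection", label="car",
                box=(40, 60, 200, 120), confidence=0.91)
plate = store.add("classification", label="licence-plate", confidence=0.84)
store.relate(car, Relation.CONTAINS, plate)
store.relate(plate, Relation.IS_PART_OF, car)

for mtd in store.related(car, Relation.CONTAINS):
    print(mtd.kind, mtd.payload["label"])
```

Keeping relations as separate directed edges, rather than nesting one result inside another, lets independent models add their output to the same buffer without knowing about each other.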

We now introduce a suite of GStreamer custom elements and classes written in Python that let users quickly support the latest AI models through native PyTorch integration. We provide a package with base classes supporting audio models, video models and Large Language Models (LLMs). The package works with the latest GStreamer version and interoperates with the new metadata framework. Performance enhancements such as batching and device memory buffer management are available out of the box.
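As a rough illustration of what batching means here, the following pure-Python sketch queues incoming frames and calls the model once per full batch. It is an assumption-laden toy, not code from the package: BatchingInference, fake_model and the batch size of 4 are all hypothetical, and a real element would pass tensors to a PyTorch model rather than integers to a stub.

```python
from typing import Callable, List, Sequence


class BatchingInference:
    """Collects frames and runs the model once per full batch.

    Hypothetical helper for illustration; not a class from the package.
    """

    def __init__(self, model: Callable[[Sequence], List], batch_size: int):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.model = model
        self.batch_size = batch_size
        self._pending: list = []

    def push(self, frame) -> List:
        """Queue one frame; return results only when a batch was processed."""
        self._pending.append(frame)
        if len(self._pending) < self.batch_size:
            return []
        return self.flush()

    def flush(self) -> List:
        """Run the model on whatever is queued, e.g. at end of stream."""
        if not self._pending:
            return []
        batch, self._pending = self._pending, []
        return self.model(batch)


calls = []


def fake_model(batch):
    # Stand-in for a PyTorch model: records the batch size, labels each frame.
    calls.append(len(batch))
    return [f"result-{frame}" for frame in batch]


runner = BatchingInference(fake_model, batch_size=4)
results = []
for frame in range(10):
    results.extend(runner.push(frame))
results.extend(runner.flush())  # drain the final partial batch

print(calls)
```

Ten frames with a batch size of 4 produce three model calls instead of ten; the explicit flush ensures the last partial batch is not lost at end of stream.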

In addition to the base classes, we also provide elements that perform object detection, tracking, speech-to-text, text-to-speech and LLM chatbot features. There is also a Kafka element that sends metadata to a Kafka server.

The list of elements continues to grow rapidly, supporting any of the many Hugging Face models with ease. Our goal is no less than to provide a complete upstream solution for GStreamer Analytics via PyTorch, making upstream GStreamer the number one multimedia framework for machine learning.

Duration of the talk

Speaker bio

Aaron is a mathematician and developer and enjoys the simple pleasure of squeezing every last ounce of performance out of both hardware and software. He works for Collabora and is based in Toronto, Canada.

Primary author

Aaron Boxer (Collabora Inc.)
