Speaker
Description
In this presentation, I will give an overview of a library I am developing for video presentation and recording applications. The library helps users create content for video presentations, and GStreamer plays an integral role in its real-time video and audio processing.
Key Features:
The app streams media to Amazon ECS during recording for real-time GStreamer-based processing, offering the following features:
- Conversion: Converts video from WebM to HLS.
- Thumbnail Generation: Automatically creates thumbnail images from the video stream.
- Waveform Generation: Produces waveform visualizations to enhance the user experience.
- Audio Transcription: Converts spoken audio into text for accessibility and indexing.
- Notably, all of this processing happens entirely in memory, without touching the local filesystem.
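As one illustration of the features above, waveform generation typically comes down to reducing decoded PCM samples to one peak value per display column. The following is a minimal sketch, not the library's actual code: it assumes mono float samples have already been pulled out of the pipeline into memory, and the function name `waveform_peaks` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of GStreamer or the library described here):
// reduce mono PCM samples to `buckets` peak values. Each bucket holds the
// largest absolute sample found in its slice of the input.
std::vector<float> waveform_peaks(const std::vector<float>& samples,
                                  std::size_t buckets) {
    assert(buckets > 0 && "at least one bucket is required");
    std::vector<float> peaks(buckets, 0.0f);
    const std::size_t n = samples.size();
    for (std::size_t b = 0; b < buckets; ++b) {
        // Integer arithmetic spreads the samples evenly across the buckets.
        const std::size_t begin = b * n / buckets;
        const std::size_t end = (b + 1) * n / buckets;
        for (std::size_t i = begin; i < end; ++i) {
            peaks[b] = std::max(peaks[b], std::fabs(samples[i]));
        }
    }
    return peaks;
}
```

Because the function only reads from and writes to vectors, it fits the in-memory constraint: nothing needs to be staged on disk between decoding and drawing.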
Advanced Editing Capabilities:
Users can edit and enhance multiple recordings by:
- Mixing recordings together.
- Trimming recordings and adding images or other visual elements.
- Combining and editing them to form a final presentation.
This is accomplished using GStreamer Editing Services (GES), allowing users to create structured presentations composed of logical chapters, enhancing both functionality and narrative flow.
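To make the chapter structure concrete, here is a small sketch of how trimmed clips could be laid out back to back on a timeline. It does not use GES itself; the `Clip` and `Chapter` types and the `layout` function are hypothetical, and the field names only mirror the start, in-point, and duration notions that GES timeline elements expose.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model for illustration; times are in nanoseconds,
// the unit GStreamer uses for timestamps.
struct Clip {
    std::string uri;
    std::uint64_t in_point;  // offset into the source where playback begins
    std::uint64_t duration;  // length kept after trimming
    std::uint64_t start;     // position on the timeline, set by layout()
};

struct Chapter {
    std::string title;
    std::vector<Clip> clips;
};

// Place every clip back to back, chapter by chapter, and return the
// total length of the resulting presentation.
std::uint64_t layout(std::vector<Chapter>& chapters) {
    std::uint64_t cursor = 0;
    for (Chapter& chapter : chapters) {
        for (Clip& clip : chapter.clips) {
            clip.start = cursor;
            cursor += clip.duration;
        }
    }
    return cursor;
}
```

In a real GES-based implementation, the computed values would be applied to clips on a timeline layer rather than stored in plain structs.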
Technical Implementation:
GStreamer functionality is encapsulated in a C++ library, wrapped in a thin C layer, and exposed to Go, giving users an easy-to-use interface.
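The layering described here commonly follows the opaque-handle pattern: C++ objects stay behind functions with C linkage, which Go can then call through cgo. The sketch below shows that pattern under stated assumptions. The `Recorder` class and the `rec_*` functions are hypothetical stand-ins, not the library's real API.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical C++ class standing in for a GStreamer-backed component.
class Recorder {
public:
    void push(std::uint64_t bytes) { total_ += bytes; }
    std::uint64_t total() const { return total_; }
private:
    std::uint64_t total_ = 0;
};

// Thin C layer: an opaque handle plus free functions with C linkage.
// No C++ types cross this boundary, so C and cgo callers can use it.
extern "C" {

typedef struct rec_handle rec_handle;  // never defined; opaque to callers

rec_handle* rec_create(void) {
    // nothrow keeps allocation failures from escaping as C++ exceptions.
    return reinterpret_cast<rec_handle*>(new (std::nothrow) Recorder());
}

int rec_push(rec_handle* h, std::uint64_t bytes) {
    if (h == nullptr) return -1;  // report errors as codes, not exceptions
    reinterpret_cast<Recorder*>(h)->push(bytes);
    return 0;
}

std::uint64_t rec_total(const rec_handle* h) {
    return h ? reinterpret_cast<const Recorder*>(h)->total() : 0;
}

void rec_destroy(rec_handle* h) {
    delete reinterpret_cast<Recorder*>(h);  // deleting nullptr is a no-op
}

}  // extern "C"
```

On the Go side, a wrapper type would typically hold the handle and call `rec_destroy` from a `Close` method, so that callers never see the C functions directly.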
Challenges and Solutions:
During development, I encountered and reported several GStreamer-related issues, many of which have since been resolved. Additionally, the C++ library includes simple yet highly useful wrappers that automate resource management, which further simplifies development.
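A wrapper of this kind can be as small as a `std::unique_ptr` with a custom deleter that calls the matching C release function. The sketch below is an assumption about the general shape, not the library's actual wrappers. `FakeObject` and `fake_unref` are stand-ins so the example compiles without GStreamer; real code would forward to the appropriate unref or free function for each GStreamer type.

```cpp
#include <cassert>
#include <memory>

// Stand-in for a C-style refcounted object such as a pipeline element.
struct FakeObject {
    int refcount = 1;
};

int g_released = 0;  // counts destroyed objects, for demonstration only

// Stand-in for a C release function: drops one reference and frees the
// object when the count reaches zero.
void fake_unref(FakeObject* obj) {
    if (--obj->refcount == 0) {
        delete obj;
        ++g_released;
    }
}

// Deleter that forwards to a C release function chosen at compile time.
// std::unique_ptr never invokes its deleter on a null pointer.
template <typename T, void (*Release)(T*)>
struct CDeleter {
    void operator()(T* p) const noexcept { Release(p); }
};

// Owning handle: releases its reference automatically at scope exit.
template <typename T, void (*Release)(T*)>
using Handle = std::unique_ptr<T, CDeleter<T, Release>>;

using ObjectHandle = Handle<FakeObject, fake_unref>;
```

With such a handle, early returns and exceptions no longer leak references, because the release call is tied to scope rather than to hand-written cleanup paths.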
Conclusion:
In this presentation, I will demonstrate GStreamer's powerful capabilities within a modern video presentation app, showcasing how automatic resource management can simplify development and enhance overall workflow efficiency.
Speaker bio:
I'm a software developer with 20 years of professional experience, having worked on projects in diverse domains such as networking, high availability systems, low-level microcontroller programming, fintech, automotive and computer vision. For the past 5 years, I have focused on audio and video processing, which I find particularly engaging. While I am most comfortable working in C++, I also enjoy integrating various technologies within a single project.