
Automating Data Cleaning with Generative AI
As modern datasets increasingly include diverse formats like audio, images, and video, traditional data cleaning methods fall short. This masterclass explores how Generative AI and machine learning can automate the cleaning of multimodal data, saving time, reducing manual effort, and improving overall data quality.
Designed for professionals working with complex datasets, this session will introduce AI-powered techniques and scalable Python-based pipelines to streamline data preparation across formats.
Learning Outcomes
- Apply AI-Driven Cleaning to Multimodal Data
  Learn how to clean and standardise tabular, audio, image, and video data using AI and machine learning tools.
- Build Scalable Python Pipelines
  Discover how to design and implement efficient, repeatable data cleaning workflows using Python and widely used libraries.
- Enhance Data Quality with Generative AI
  Explore the use of pre-trained models and generative techniques to intelligently fill gaps, remove noise, and detect anomalies in your datasets.
By the end of the session, you’ll be equipped to handle complex data cleaning challenges using cutting-edge AI tools, so you can move from raw data to insight faster and more reliably.
Target Audience
This masterclass is ideal for data scientists, engineers, technical managers, and enthusiasts who work with Python and spend significant time preparing and cleaning data. No prior experience with AI models is required, but familiarity with Python and data pipelines is recommended.