Organizations analyze large amounts of tabular data to uncover insights, improve products and services, and operate more efficiently. For enterprises that want to thrive in a rapidly changing environment, the ability to process big data quickly can create the competitive edge needed to succeed. Because speed is so critical, accelerating the data processing pipeline—and doing it in a way that maximizes hardware utilization—can profoundly impact the productivity and outcomes of data science efforts.
This Deep Learning Institute (DLI) workshop will show you how to create an end-to-end, hardware-accelerated machine learning pipeline for large datasets. You'll use NVIDIA RAPIDS™ and Dask to scale your data science workloads, then learn how to speed up data engineering by avoiding hidden slowdowns and reduce model development time by maximizing hardware utilization. Throughout the development process, you'll use diagnostic tools to identify delays and learn to mitigate common pitfalls. The workshop will also illustrate how the same process can be applied to other machine learning use cases.
Learning Objectives
By participating in this workshop, you’ll:
- Develop and deploy an accelerated end-to-end data processing pipeline for large datasets
- Scale data science workflows using distributed computing
- Perform DataFrame transformations that take advantage of hardware acceleration and avoid hidden slowdowns
- Enhance machine learning solutions through feature engineering and rapid experimentation
- Improve data processing pipeline performance by optimizing memory management and hardware utilization
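The "hidden slowdowns" objective above can be previewed with a small CPU-based sketch. Because cuDF (the RAPIDS DataFrame library) largely mirrors the pandas API, the same pattern applies on GPU; the dataset and column names here are hypothetical illustrations, not workshop material:

```python
import numpy as np
import pandas as pd  # with RAPIDS, cuDF offers a near-identical API

# Hypothetical trip-style dataset; columns are illustrative only.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "distance_km": rng.uniform(0.5, 30.0, 1_000),
    "fare": rng.uniform(3.0, 80.0, 1_000),
})

# Hidden slowdown: row-wise apply runs a Python-level loop per row,
# which defeats columnar acceleration.
slow = df.apply(lambda row: row["fare"] / row["distance_km"], axis=1)

# Accelerated path: a vectorized, columnar expression that the engine
# can execute in bulk (and keep on the device under cuDF).
fast = df["fare"] / df["distance_km"]

# Both produce the same values; only the execution path differs.
assert np.allclose(slow.to_numpy(), fast.to_numpy())
```

The vectorized form typically runs orders of magnitude faster on large DataFrames, and the same refactoring carries over when swapping pandas for cuDF or a Dask-distributed DataFrame.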
Download workshop datasheet (PDF 68 KB)