Ray Data

  • Introduction to Ray Data

  • Loading, Transforming, Materializing

  • Data Operations

  • Ray Data for Batch Inference

  • Batch inference

  • Common Ray Data Issues and Solutions

  • Diagnosing Ray Data

  • Brief overview of the workload

  • General Diagnostics

  • Ray Data Architecture

Ray Train

  • Introduction to Ray Train for Deep Learning

  • Single GPU PyTorch

  • Overview of the training loop in Ray Train

  • Migrating the model & dataset to Ray Train

  • Ray Train

  • Observability

  • Using the Ray dashboard

  • Profiling the training loop

  • Adding Ray Data

  • Tuning Configs for Cost & Performance

  • Debugging Ray Train common failures

  • Stable Diffusion Pre-training

Ray Serve

  • Introduction to Ray Serve

  • Overview of Ray Serve

  • Deployments

  • Key Ray Serve Features

  • Implement MNISTClassifier service

  • Advanced features of Ray Serve

  • Ray Serve Architecture

  • Architecture, Components

  • Fault tolerance

  • Transient Data Loss

  • Load shedding and backpressure

Ray Tune

  • Loading the data

  • Starting out with vanilla XGBoost

  • Hyperparameter tuning

  • Diving deeper into Ray Tune concepts

  • Monitoring a Ray Tune experiment

  • Recovering from a failed experiment

  • Fault-tolerant trials with Ray Tune

  • Specifying stopping criteria

  • Using complex search algorithms

  • Passing data to Ray Tune trials

Ray Core

  • Introduction to Ray Core

  • Remote Tasks

  • Remote Objects

  • Ray Actors

  • Ray Core Best Practices

  • Ray Task Lifecycle Overview

  • Ray Task Lifecycle: Deep Dive

  • Ray Object Lifecycle

RLlib

  • Intro to RLlib

  • RLlib architecture, key concepts

  • Single-agent training and evaluation

  • Multi-agent training and evaluation

  • Offline RL

  • Scaling experiments and performance profiling

Request for Private Class

Tailored private training to meet the needs of your business