Model Training Foundations (Ray Train)

Learn the foundations of distributed training of machine learning models with Ray Train.

Course curriculum

1. Introduction
2. Imports & Loading the Dataset
3. Defining the Model
4. Train Loop Per Worker
5. Defining the Training Loop Configuration
6. Configuring the Scaling Config
7. Defining the Model Wrapper
8. Building the Dataloader
9. Reporting Metrics and Checkpointing
10. Persistent Storage
11. Putting it all together with TorchTrainer
12. Inspecting the Training Results
13. Inference with Your Trained Model
14. Full Chapter Notebook

1. Introduction
2. Train Loop Using Ray Data
3. Building a Ray Data-Backed Dataloader
4. Preparing and Loading the Dataset for Ray Data
5. Transformations with Ray Data
6. Configuring TorchTrainer and Launching Training
7. Full Chapter Notebook

1. Introduction
2. Checkpoint Loading for Fault Tolerance
3. Saving Fault Tolerant Checkpoints
4. Launching Fault Tolerant Training
5. Manually Restoring from Checkpoints
6. Cleaning up Cluster Storage and Conclusion
7. Concluding the Intro Tutorials and Next Steps
8. Full Chapter Notebook

About this course

Free
29 lessons
1 hour of video content
