Introduction to Ray Data: Ray Data + Structured Data

© 2025, Anyscale. All Rights Reserved

💻 Launch Locally: You can run this notebook locally, but performance will be reduced.

🚀 Launch on Cloud: A Ray cluster is recommended for running this notebook; you can start one on Anyscale.

This notebook provides an overview of Ray Data and shows how to use it to load and transform data in a distributed manner.

Here is the roadmap for this notebook:
  • Part 0: What is Ray Data?
  • Part 1: How to use Ray Data?
  • Part 2: Loading Data
  • Part 3: Transforming Data
  • Part 4: Writing Data
  • Part 5: Data Operations: Shuffling, Grouping and Aggregation
  • Part 6: When to use Ray Data
  • Part 7: Ray Data in Production
  • Part 8: Upcoming Features in Ray Data

Imports

import ray
import pandas as pd