Introduction to Ray Data: Ray Data + Structured Data
© 2025, Anyscale. All Rights Reserved
💻 Launch Locally: You can run this notebook locally, but performance will be reduced.
🚀 Launch on Cloud: A Ray cluster is recommended to run this notebook; you can start one on Anyscale.
This notebook provides an overview of Ray Data and shows how to use it to load and transform data in a distributed manner.
Here is the roadmap for this notebook:
- Part 0: What is Ray Data?
- Part 1: How to use Ray Data?
- Part 2: Loading Data
- Part 3: Transforming Data
- Part 4: Writing Data
- Part 5: Data Operations: Shuffling, Grouping and Aggregation
- Part 6: When to use Ray Data
- Part 7: Ray Data in Production
- Part 8: Upcoming Features in Ray Data
Imports
import ray
import pandas as pd