Advanced LLM Features with Ray Serve LLM

© 2025, Anyscale. All Rights Reserved

💻 Launch Locally: You can run this notebook locally, but you’ll need access to GPUs.

🚀 Launch on Cloud: A Ray cluster with GPUs is recommended to run this notebook; you can easily start one on Anyscale.

This notebook explores advanced features of Ray Serve LLM that go beyond basic model deployment, with practical examples covering LoRA adapters, structured JSON output, and tool calling for production LLM serving.

Here is the roadmap for this notebook:
  • Overview: Advanced Features Preview
  • Example: Deploying LoRA Adapters
  • Example: Getting Structured JSON Output
  • Example: Setting up Tool Calling
  • How to Choose an LLM?
  • Conclusion: Next Steps