Advanced LLM Features with Ray Serve LLM

© 2025, Anyscale. All Rights Reserved

💻 Launch Locally: You can run this notebook locally, but you’ll need access to GPUs.

🚀 Launch on Cloud: A Ray cluster with GPUs is recommended to run this notebook; you can easily start one on Anyscale.

This notebook explores advanced features of Ray Serve LLM that go beyond basic model deployment, with practical examples covering LoRA adapters, structured JSON output, and tool calling for production LLM serving.

Here is the roadmap for this notebook:
  • Overview: Advanced Features Preview
  • Example: Deploying LoRA Adapters
  • Example: Getting Structured JSON Output
  • Example: Setting up Tool Calling
  • How to Choose an LLM?
  • Conclusion: Next Steps