
Qwen 3 Agentic RAG: Private LLM Deployment with CrewAI

* The tutorial details the deployment of an Agentic RAG system powered by Alibaba's Qwen 3, emphasizing a 100% private and local LLM setup.
* The stack includes `CrewAI` for agent orchestration, `Firecrawl` for web search, and Lightning AI's `LitServe` for deployment.
* The Agentic RAG flow involves a Retriever Agent that uses `Firecrawl` or a vector DB to gather context, followed by a Writer Agent that generates the response (a crew sketch follows this list).
* `LitServe` serves the Agentic RAG through four methods: orchestrating the agents (`setup`), preparing the input (`decode_request`), invoking the Crew (`predict`), and sending the response back (`encode_response`); see the server sketch after this list.
* The tutorial includes basic client code using the `requests` Python library to invoke the created API (a client example follows below).
* A key question raised in the comments concerns dynamic tool selection between `Firecrawl` and the vector DB: specifically, whether routing is based on query type or left to the agent's autonomous decision-making.
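
A minimal sketch of the two-agent crew described above. The original post does not show its exact code; here the local Qwen 3 model is assumed to be served through Ollama under the placeholder tag `ollama/qwen3` (via the `LLM` class available in recent CrewAI versions), and web search is assumed to go through `crewai_tools.FirecrawlSearchTool`, which requires a `FIRECRAWL_API_KEY`:

```python
# Sketch of the Retriever + Writer crew; model tag, URL, and tool choice are assumptions.
from crewai import Agent, Task, Crew, Process, LLM
from crewai_tools import FirecrawlSearchTool

# Local, private LLM endpoint -- placeholder model name and Ollama URL.
local_llm = LLM(model="ollama/qwen3", base_url="http://localhost:11434")

search_tool = FirecrawlSearchTool()

retriever = Agent(
    role="Retriever Agent",
    goal="Gather the most relevant context for the user query: {query}",
    backstory="You search the web (or a vector DB) and return supporting passages.",
    tools=[search_tool],
    llm=local_llm,
)

writer = Agent(
    role="Writer Agent",
    goal="Write a clear, well-grounded answer to: {query}",
    backstory="You synthesize the retrieved context into a final response.",
    llm=local_llm,
)

retrieve_task = Task(
    description="Collect context relevant to the query: {query}",
    expected_output="A bullet list of relevant passages with sources.",
    agent=retriever,
)

write_task = Task(
    description="Using the retrieved context, answer the query: {query}",
    expected_output="A concise, well-grounded answer.",
    agent=writer,
)

rag_crew = Crew(
    agents=[retriever, writer],
    tasks=[retrieve_task, write_task],
    process=Process.sequential,  # retrieval output feeds the writer
)

if __name__ == "__main__":
    print(rag_crew.kickoff(inputs={"query": "What is Agentic RAG?"}))
```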
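
The `LitServe` wrapper maps naturally onto the four lifecycle methods named above. The sketch below assumes the crew from the previous snippet is saved as `rag_crew.py` exposing `rag_crew` (a hypothetical module path, not from the original post):

```python
# Sketch of serving the Agentic RAG crew with LitServe.
import litserve as ls


class AgenticRAGAPI(ls.LitAPI):
    def setup(self, device):
        # Orchestrate the agents once, at server start-up.
        from rag_crew import rag_crew  # hypothetical module from the previous sketch
        self.crew = rag_crew

    def decode_request(self, request):
        # Prepare the input: pull the user query out of the JSON payload.
        return request["query"]

    def predict(self, query):
        # Invoke the Crew with the decoded query.
        return self.crew.kickoff(inputs={"query": query})

    def encode_response(self, output):
        # Send the final response back as JSON.
        return {"response": str(output)}


if __name__ == "__main__":
    server = ls.LitServer(AgenticRAGAPI(), accelerator="auto")
    server.run(port=8000)
```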
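
And a basic client along the lines the tutorial describes, using `requests` against LitServe's default `/predict` endpoint (host and port assume the server sketch above):

```python
# Invoke the deployed Agentic RAG API.
import requests

resp = requests.post(
    "http://localhost:8000/predict",
    json={"query": "What is Agentic RAG and when should I use it?"},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```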