RAGFlow

Build knowledge-base Q&A with TokenMix LLM and embedding models.

Prepare TokenMix values

RAGFlow usually needs two models:

Chat model ID: for answers.
Embedding model ID: for document parsing and retrieval.
API Key: your TokenMix key.
Base URL: https://api.tokenmix.ai/v1.

Option 1: Configure in the UI

Start RAGFlow and log in.
Open Model providers.
Choose OpenAI-API-Compatible.
Set Base URL to https://api.tokenmix.ai/v1.
Set API Key to your TokenMix key.
Set Model to a TokenMix chat model ID.
Add an embedding model with a TokenMix embedding model ID.

Option 2: Configure before startup

In service_conf.yaml.template, find user_default_llm:

user_default_llm:
  factory: "OpenAI-API-Compatible"
  api_key: "<your-tokenmix-key>"
  base_url: "https://api.tokenmix.ai/v1"

Restart RAGFlow after changing it.

Test a knowledge base

Create a dataset.
Select the TokenMix embedding model and a chunk template.
Upload one small PDF or Markdown file.
Wait for parsing to finish.
Create a Chat Assistant and ask:

Summarize the uploaded document in five bullet points.

RAGFlow's quickstart notes that once a dataset has parsed files, its embedding model should not be changed casually because all chunks must stay in the same vector space.

Troubleshooting

parsing stuck: check embedding model and API key.
chat works but document Q&A fails: embedding or indexing is not ready.
404/model not found: wrong chat or embedding model ID.
weak retrieval: choose a better chunk template or recreate the dataset with another embedding model.

Practical notes

RAGFlow needs more detail than a normal chat tool because it has chat models, embedding models, datasets, chunk templates, and parsing states. Beginners often think “chat works, so RAG works”, but RAG also requires a working embedding model.

Beginner flow

Configure only the chat model first and send a normal question.
Configure the embedding model next; do not put a chat model into the embedding field.
Create one test dataset and upload one small file.
Wait until parsing is complete before creating a Chat Assistant.
Ask a question that can only be answered from the uploaded document.

Model choice

Chat model: choose a TokenMix text model with good instruction following.
Embedding model: choose a TokenMix embedding model and follow its documented dimension.
Large documents: test with one small file first.
Multilingual documents: choose an embedding model suitable for the document language.

Changing embeddings

RAGFlow's quickstart warns that once files are parsed in a dataset, switching embedding models is not recommended. For beginners, create a new dataset and re-upload files when changing embeddings.