Add document ingest support for Ragflow #17483

Sameerlite · 2025-12-04T15:06:50Z

Title

Add document ingest support for Ragflow

Relevant issues

Fixes #17112

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have added testing in the tests/litellm/ directory. Adding at least 1 test is a hard requirement - see details
I have added a screenshot of my new test passing locally
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🆕 New Feature

Changes

Added support for providing a dynamic chat ID / agent ID
Added support for ingesting documents into RAGFlow via the rag/ingest API

vercel · 2025-12-04T15:06:55Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Preview	Comments	Updated (UTC)
litellm	Ready	Preview	Comment	Dec 4, 2025 3:06pm

litellm/rag/ingestion/ragflow_ingestion.py

+            "file": (filename, file_content, content_type or "application/octet-stream")
+        }
+
+        verbose_logger.debug(f"Uploading document to RAGFlow: {url}")


To fix the problem, we should avoid logging the full url variable if it may derive from secret or sensitive sources.

General approach: Remove or redact logging of sensitive connection endpoints such as api_base. Instead, log only non-sensitive information (e.g., action description, or redacted endpoint info).

Specific fix: In litellm/rag/ingestion/ragflow_ingestion.py, replace
verbose_logger.debug(f"Uploading document to RAGFlow: {url}")
with a log that excludes url, or, if needed, only log static or whitelisted info (e.g., dataset ID, filename).

Ideally, only log safe, non-secret identifying metadata for debugging (like dataset_id, filename), or replace the log with a generic message ("Uploading document to RAGFlow").

Only the log line on line 178 in the _upload_document method needs updating; ensure code functionality remains unchanged.
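As a rough illustration of the "safe metadata" option described above, the sketch below builds the debug message from non-secret identifiers only. The helper names `describe_upload` and `log_upload` and the logger name are hypothetical and not part of the PR; the real code uses `verbose_logger` inside `_upload_document`.

```python
import logging

# Hypothetical logger for this sketch; the PR itself uses litellm's verbose_logger.
logger = logging.getLogger("ragflow_ingest_sketch")


def describe_upload(dataset_id: str, filename: str) -> str:
    """Build a debug message from non-secret identifiers only (no URL, no api_base)."""
    return (
        "Uploading document to RAGFlow "
        f"(dataset_id={dataset_id!r}, filename={filename!r})"
    )


def log_upload(dataset_id: str, filename: str) -> None:
    """Emit the upload message at DEBUG level without touching the request URL."""
    logger.debug(describe_upload(dataset_id, filename))
```

Keeping the message construction in a small pure function makes it easy to check in a unit test that nothing derived from the endpoint ends up in the log text.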

litellm/rag/ingestion/ragflow_ingestion.py

+        if not request_body:
+            return  # Nothing to update
+
+        verbose_logger.debug(f"Updating document configuration: {url}")


The fix is to avoid logging secret-derived values. Specifically, do not log the full constructed URL if it may contain secrets or sensitive values.

In RAGFlowRAGIngestion._update_document_config, replace the debug log verbose_logger.debug(f"Updating document configuration: {url}") with a message that omits the sensitive/interpolated parts, or logs only safe, non-secret portions (e.g., just the document ID or a generic statement).

You may log a generic string such as "Updating document configuration" or other non-sensitive context.

No new methods or imports are required beyond possibly changing what is logged at the highlighted location.

Ensure this change is applied only to the relevant logger invocation where the taint source flows into the sink.
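One way to check that a log call no longer carries endpoint-derived values is to capture the emitted messages and search them for the sensitive strings. The following is a minimal sketch using only the standard `logging` module; `ListHandler`, `log_config_update`, `find_leaks`, and `capture_debug_logs` are hypothetical names for illustration and do not exist in LiteLLM.

```python
import logging


class ListHandler(logging.Handler):
    """Collects rendered log messages in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def log_config_update(logger: logging.Logger, document_id: str) -> None:
    """Hypothetical stand-in for the debug call in _update_document_config."""
    logger.debug("Updating document configuration (document_id=%s)", document_id)


def find_leaks(messages, sensitive_values):
    """Return every message that contains any of the sensitive substrings."""
    return [m for m in messages if any(s in m for s in sensitive_values)]


def capture_debug_logs(document_id: str):
    """Run the log call with a temporary in-memory handler and return its messages."""
    logger = logging.getLogger("ragflow_log_check_sketch")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep the sketch's output out of the root logger
    handler = ListHandler()
    logger.addHandler(handler)
    try:
        log_config_update(logger, document_id)
    finally:
        logger.removeHandler(handler)
    return handler.messages
```

A test could then assert that `find_leaks(capture_debug_logs("doc-1"), [api_base])` is empty for whatever `api_base` value the test configures.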

litellm/rag/ingestion/ragflow_ingestion.py

+            "document_ids": document_ids,
+        }
+
+        verbose_logger.debug(f"Triggering parsing for documents: {url}")


To address this, we want to prevent accidental logging of potentially sensitive information such as the full URL derived from secrets or private environment configuration. Instead of logging the full URL, we can log only non-sensitive, high-level information (e.g., that parsing is being triggered, optionally with the dataset or document IDs if not sensitive), or redact/sanitize values before logging.

Recommended fix:

In litellm/rag/ingestion/ragflow_ingestion.py, within _trigger_parsing, update the log statement to omit or redact the URL and state only that parsing was triggered, optionally with non-sensitive info (e.g., dataset_id, number of documents).

Do not log the value of variables that may contain secrets, such as api_base, url, or other secret-derived values.

Implementation:

Replace verbose_logger.debug(f"Triggering parsing for documents: {url}") with a statement like verbose_logger.debug(f"Triggering parsing for documents in dataset '{dataset_id}' (count: {len(document_ids)})").

Ensure that no secrets are logged in the message.

No new dependencies are required.
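If some endpoint context is still wanted in debug output, the "redact/sanitize" option could look roughly like the sketch below, which drops credentials, query string, and fragment from a URL using the standard library. `redact_url` is a hypothetical helper, not something the PR adds, and it still exposes the host and path, so it only fits deployments where those are not considered secret.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(url: str) -> str:
    """Return the URL without userinfo, query string, or fragment.

    Assumption for this sketch: scheme, host, port, and path are safe to log.
    If api_base itself is treated as a secret, log a generic message instead.
    """
    parts = urlsplit(url)
    # netloc may look like "user:password@host:port"; keep only "host:port".
    host_and_port = parts.netloc.rpartition("@")[2]
    return urlunsplit((parts.scheme, host_and_port, parts.path, "", ""))
```

For example, a URL carrying an embedded token and an `api_key` query parameter would be reduced to just its scheme, host, port, and path before being passed to the logger.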

krrishdholakia · 2025-12-06T22:03:50Z

@metalshanked is this what you wanted?

any help qa'ing this would be appreciated

metalshanked · 2025-12-06T23:00:48Z

Thanks @Sameerlite and @krrishdholakia. This is awesome with Chat, Agent w/ Dynamic ids and Dataset management.

Would it be possible to add the actual Search/Retrieval as well? I assume many users would need the basic Search/Retrieval capability that Ragflow exposes, like a basic vector store.
Example: I already have a separate non-Ragflow Chat or Agent app and want to query the Ragflow vector store (Ragflow uses Infinity or Elasticsearch behind the scenes) to retrieve the relevant chunks.

I mean this endpoint --> https://ragflow.io/docs/dev/http_api_reference#retrieve-chunks

Also, this should show up as a vector store in the litellm UI so that usual vector db permissions can be applied.

Thanks

Add ragflow support for rag injest

96ef7d0

github-advanced-security bot found potential problems Dec 4, 2025

@@ -175,7 +175,7 @@
                         "file": (filename, file_content, content_type or "application/octet-stream")
                     }
-                    verbose_logger.debug(f"Uploading document to RAGFlow: {url}")
+                    verbose_logger.debug("Uploading document to RAGFlow")
                     client = get_async_httpx_client(
                         llm_provider=httpxSpecialProvider.RAG,

@@ -254,7 +254,7 @@
                     if not request_body:
                         return  # Nothing to update
-                    verbose_logger.debug(f"Updating document configuration: {url}")
+                    verbose_logger.debug("Updating document configuration.")
                     client = get_async_httpx_client(
                         llm_provider=httpxSpecialProvider.RAG,
