@@ -12,7 +12,7 @@ As new variables are introduced, this page will be updated to reflect the growin
12
12
13
13
:::info
14
14
15
-
This page is up-to-date with Open WebUI release version [v0.6.35](https://github.com/open-webui/open-webui/releases/tag/v0.6.35), but is still a work in progress to later include more accurate descriptions, listing out options available for environment variables, defaults, and improving descriptions.
15
+
This page is up-to-date with Open WebUI release version [v0.6.39](https://github.com/open-webui/open-webui/releases/tag/v0.6.39), but is still a work in progress to later include more accurate descriptions, listing out options available for environment variables, defaults, and improving descriptions.
16
16
17
17
:::
18
18
@@ -2167,25 +2167,94 @@ Note: this configuration assumes that AWS credentials will be available to your
- Description: Specifies the URL for the Docling server. Requires Docling version 1.0.0 or later.
2170
+
- Description: Specifies the URL for the Docling server. Requires Docling version 2.0.0 or later for full compatibility with the new parameter-based configuration system.
2171
2171
- Persistence: This environment variable is a `PersistentConfig` variable.
2172
2172
2173
-
#### `DOCLING_OCR_ENGINE`
2173
+
:::warning
2174
+
2175
+
**Docling 2.0.0+ Required**
2176
+
2177
+
The Docling integration has been refactored to use server-side parameter passing. If you are using Docling:
2178
+
2179
+
1. Upgrade to Docling server version 2.0.0 or later
2180
+
2. Migrate all individual `DOCLING_*` configuration variables to the `DOCLING_PARAMS` JSON object
2181
+
3. Remove all deprecated `DOCLING_*` environment variables from your configuration
2182
+
4. Add `DOCLING_API_KEY` if your server requires authentication
2183
+
2184
+
The old individual environment variables (`DOCLING_OCR_ENGINE`, `DOCLING_OCR_LANG`, etc.) are no longer supported and will be ignored.
2185
+
2186
+
:::
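
As a rough illustration of step 2, the sketch below folds the two legacy variables named above into a single JSON string. The helper name `legacy_to_docling_params` is hypothetical, and the mapping to `ocr_engine` and `ocr_lang` is an assumption based on the parameter names documented for `DOCLING_PARAMS`; other legacy variables are not covered.

```python
import json

# Assumed mapping: only the two legacy variables named in this section.
LEGACY_TO_PARAM = {
    "DOCLING_OCR_ENGINE": "ocr_engine",
    "DOCLING_OCR_LANG": "ocr_lang",
}


def legacy_to_docling_params(env):
    """Build a DOCLING_PARAMS JSON string from legacy variables (hypothetical helper)."""
    params = {
        new_key: env[old_key]
        for old_key, new_key in LEGACY_TO_PARAM.items()
        if env.get(old_key)  # skip variables that are unset or empty
    }
    return json.dumps(params)


old_env = {"DOCLING_OCR_ENGINE": "tesseract", "DOCLING_OCR_LANG": "eng,fra,deu,spa"}
docling_params = legacy_to_docling_params(old_env)
print(f"DOCLING_PARAMS='{docling_params}'")
```

The printed line can be pasted into an environment file after the legacy variables have been removed.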
2187
+
2188
+
#### `DOCLING_API_KEY`
2174
2189
2175
2190
- Type: `str`
2176
-
- Default: `tesseract`
2177
-
- Description: Specifies the OCR engine used by Docling.
2178
-
Supported values include: `tesseract` (default), `easyocr`, `ocrmac`, `rapidocr`, and `tesserocr`.
2191
+
- Default: `None`
2192
+
- Description: Sets the API key for authenticating with the Docling server. Required when the Docling server has authentication enabled.
2179
2193
- Persistence: This environment variable is a `PersistentConfig` variable.
2180
2194
2181
-
#### `DOCLING_OCR_LANG`
2195
+
#### `DOCLING_PARAMS`
2196
+
2197
+
- Type: `str` (JSON)
2198
+
- Default: `{}`
2199
+
- Description: Specifies all Docling processing parameters in JSON format. This is the primary configuration method for Docling processing options. All previously individual Docling settings are now configured through this single JSON object.
2200
+
2201
+
**Supported Parameters:**
2202
+
- `do_ocr` (bool): Enable OCR processing
2203
+
- `force_ocr` (bool): Force OCR even when text layer exists
2204
+
- `ocr_engine` (str): OCR engine to use (`tesseract`, `easyocr`, `ocrmac`, `rapidocr`, `tesserocr`)
2205
+
- `ocr_lang` (str): OCR language codes (e.g., `eng,fra,deu,spa`)
- `picture_description_mode` (str): Mode for picture descriptions
2211
+
- `picture_description_local` (str): Local model for picture descriptions
2212
+
- `picture_description_api` (str): API endpoint for picture descriptions
2213
+
- `vlm_pipeline_model_api` (str): Vision-language model API configuration
2214
+
2215
+
- Example:
2216
+
```json
2217
+
{
2218
+
"do_ocr": true,
2219
+
"ocr_engine": "tesseract",
2220
+
"ocr_lang": "eng,fra,deu,spa",
2221
+
"force_ocr": false,
2222
+
"do_picture_description": true,
2223
+
"picture_description_mode": "api",
2224
+
"vlm_pipeline_model_api": "openai://gpt-4o"
2225
+
}
2226
+
```
2182
2227
2183
-
- Type: `str`
2184
-
- Default: `eng,fra,deu,spa` (when using the default `tesseract` engine)
2185
-
- Description: Specifies the OCR language(s) to be used with the configured `DOCLING_OCR_ENGINE`.
2186
-
The format and available language codes depend on the selected OCR engine.
2187
2228
- Persistence: This environment variable is a `PersistentConfig` variable.
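
Because the whole configuration is a single JSON string, a malformed value is easy to miss. The sketch below is one way to sanity-check a `DOCLING_PARAMS` value before deploying it. The helper `check_docling_params` is hypothetical and is not part of Open WebUI or Docling; the expected types are assumed from the `(bool)` and `(str)` annotations in the parameter list above, and only a subset of parameters is checked.

```python
import json

# Expected types, assumed from the "(bool)" / "(str)" annotations in the list above.
EXPECTED_TYPES = {
    "do_ocr": bool,
    "force_ocr": bool,
    "ocr_engine": str,
    "ocr_lang": str,
    "picture_description_mode": str,
}


def check_docling_params(raw):
    """Return a list of problems found in a DOCLING_PARAMS string (hypothetical helper)."""
    try:
        params = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(params, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        # Keys outside EXPECTED_TYPES are deliberately not flagged.
        if key in params and not isinstance(params[key], expected):
            problems.append(f"{key} should be {expected.__name__}")
    return problems


print(check_docling_params('{"do_ocr": true, "ocr_engine": "tesseract"}'))
print(check_docling_params('{"do_ocr": "yes"}'))
```

The first call reports no problems; the second flags `do_ocr` because a JSON string was supplied where a boolean is expected.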
2188
2229
2230
+
:::info
2231
+
2232
+
**Migration from Individual Docling Variables**
2233
+
2234
+
If you were previously using individual `DOCLING_*` environment variables (such as `DOCLING_OCR_ENGINE`, `DOCLING_OCR_LANG`, etc.), these are now deprecated. You must migrate to using `DOCLING_PARAMS` as a single JSON configuration object.
@@ -2383,9 +2452,36 @@ When configuring `RAG_FILE_MAX_SIZE` and `RAG_FILE_MAX_COUNT`, ensure that the v
2383
2452
2384
2453
- Type: `int`
2385
2454
- Default: `1`
2386
-
- Description: Controls how many text chunks are embedded in a single API request when using external embedding providers (Ollama, OpenAI, or Azure OpenAI). Higher values (20-100+; max 16000) process documents faster by sending more API requests, but may exceed API rate limits, while lower values (1-10) are more stable but slower. Default is 1 (safest option if you are API rate limit constrained, but slowest option). This setting only applies to external embedding engines, not the default SentenceTransformers engine.
2455
+
- Description: Controls how many text chunks are embedded in a single API request when using external embedding providers (Ollama, OpenAI, or Azure OpenAI). Higher values (20-100+; maximum 16000, not recommended) may process documents faster by sending fewer but larger API requests. Some external APIs do not support batching or sending more than 1 chunk per request; in such cases, you must leave this at `1`. Default is `1` (the safest option if the API does not support batching or more than 1 chunk per request). This setting only applies to external embedding engines, not the default SentenceTransformers engine.
2456
+
- Persistence: This environment variable is a `PersistentConfig` variable.
2457
+
2458
+
:::info
2459
+
2460
+
Check whether your API and embedding model support batched processing.
2461
+
Only increase this variable's value if they do; otherwise, you might run into unexpected issues.
2462
+
2463
+
:::
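
To make the trade-off concrete, the sketch below computes how many embedding requests a document would need at different batch sizes, assuming the chunks are simply split into consecutive groups of at most the batch size. The function name `embedding_request_count` and the figure of 1000 chunks are illustrative only.

```python
import math


def embedding_request_count(num_chunks, batch_size):
    """Number of embedding API requests needed for num_chunks at a given batch size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return math.ceil(num_chunks / batch_size)


# A document split into 1000 chunks, at three batch sizes.
for batch_size in (1, 10, 100):
    print(batch_size, embedding_request_count(1000, batch_size))
```

With a batch size of 1 the document needs 1000 requests; with 100 it needs 10, each carrying 100 chunks.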
2464
+
2465
+
#### `ENABLE_ASYNC_EMBEDDING`
2466
+
2467
+
- Type: `bool`
2468
+
- Default: `true`
2469
+
- Description: Runs embedding tasks asynchronously (parallelized) for maximum performance. Only works for Ollama, OpenAI, and Azure OpenAI; it does not affect SentenceTransformers setups.
2387
2470
- Persistence: This environment variable is a `PersistentConfig` variable.
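
The difference between sequential and parallel embedding can be pictured with the sketch below. It is a conceptual illustration only, not Open WebUI's implementation: `fake_embed` is a hypothetical stand-in that sleeps instead of calling an embedding API, and the "embedding" it returns is just each text's length.

```python
import asyncio


async def fake_embed(batch):
    """Stand-in for one embedding API request (hypothetical; no network call)."""
    await asyncio.sleep(0.01)
    return [float(len(text)) for text in batch]


async def embed_sequential(batches):
    # One request at a time: each call finishes before the next starts.
    return [await fake_embed(batch) for batch in batches]


async def embed_parallel(batches):
    # All requests are issued at once; results come back in input order.
    return list(await asyncio.gather(*(fake_embed(batch) for batch in batches)))


batches = [["alpha", "be"], ["gamma"], ["de", "epsilon"]]
sequential = asyncio.run(embed_sequential(batches))
parallel = asyncio.run(embed_parallel(batches))
print(sequential == parallel)
```

Both approaches return the same results; the parallel version simply has every request in flight at the same time, which is why the receiving server or API must be able to absorb the burst.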
2388
2471
2472
+
:::tip
2473
+
2474
+
You may need to increase the value of `THREAD_POOL_SIZE` if many other users are simultaneously using your Open WebUI instance while async embeddings are turned on to prevent
2475
+
2476
+
:::warning
2477
+
2478
+
Enabling this will potentially send thousands of requests per minute.
2479
+
If you are embedding locally, ensure that your setup can handle this number of requests; otherwise, turn this off to return to sequential embedding (slower, but it will always work).
2480
+
If you are embedding externally via API, ensure your rate limits are high enough to handle parallel embedding.
2481
+
(Usually, OpenAI can handle thousands of embedding requests per minute, even on the lowest API tier).