Add a reusable “Web Search Context” mode for all AI features (primary use case: Add Tags / auto-tagging)

### Your idea

Introduce a **general-purpose Web Search Context mode** that ThunderAI can apply to **any AI function** (summarize, rewrite, translate, classify, add tags, etc.) to optionally ground the model with lightweight web context before generating its final output.

This should be implemented as a **shared, reusable capability** (not a one-off for tagging), but with **Add Tags / auto-tagging (`prompt_add_tags`)** as the primary, first-class use case.

Key capabilities:

* A global toggle to enable “Web Search Context,” plus per-feature controls (at minimum: tagging).
* A user-editable **Business context / custom instructions** field that guides the web search step (for example, “I run a local bakery; prefer tags like Suppliers, Wholesale Orders, Delivery Apps…”).
* A **local RAG cache** of web search results so repeated contexts do not require repeated web calls.
* **No changes to output schemas** for existing prompts. For tagging specifically: do not embed sources, explanations, or metadata inside tag names; keep producing the same tag list output the tagging prompt already expects (for example `{"tags":[...]}`).

### Value

* Improves output quality for queries that benefit from up-to-date or niche information (vendors, tools, organizations, acronyms, regulations, product names).
* Makes Add Tags / auto-tagging substantially more accurate for ambiguous senders without requiring users to maintain huge rule lists.
* Reduces latency and cost over time via local caching (RAG-style reuse).
* Keeps the UX consistent because each AI feature keeps its existing output format; this only adds upstream context.

##### Proposed UX and settings

Global settings:

* `Enable Web Search Context` (default off)
* `Search provider` (configurable; allow OpenAI-compatible provider options where relevant)
* `Business context / custom instructions` (multi-line, optional; included in the web-search step)
* `RAG cache` settings:

  * enable/disable
  * TTL (for example, 7 days)
  * max entries / storage limit
  * “Clear cache” button

Per-feature controls:

* `Apply Web Search Context to:`

  * Add Tags / auto-tagging (primary)
  * Other AI features (optional checkboxes or per-command toggles)
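A minimal sketch of how these settings might be represented and checked. All key names, the `maxEntries` value, and the cache being enabled by default are assumptions for illustration, not existing ThunderAI preferences; only the "default off" toggle and the 7-day TTL example come from the proposal.

```javascript
// Hypothetical defaults for the settings listed above; every key name is an assumption.
const WEB_CONTEXT_DEFAULTS = Object.freeze({
  enabled: false,            // "Enable Web Search Context" (default off)
  searchProvider: "",        // left empty until the user picks a provider
  businessContext: "",       // optional multi-line text
  cache: Object.freeze({
    enabled: true,           // assumed default
    ttlDays: 7,              // example TTL from the proposal
    maxEntries: 500,         // assumed storage limit
  }),
  applyTo: Object.freeze({
    prompt_add_tags: true,   // primary use case; other features would be added here
  }),
});

// Web context applies to a feature only when both the global toggle
// and that feature's own toggle are on.
function isWebContextEnabledFor(settings, featureId) {
  return Boolean(settings.enabled && settings.applyTo && settings.applyTo[featureId]);
}
```

With these defaults, `isWebContextEnabledFor(WEB_CONTEXT_DEFAULTS, "prompt_add_tags")` is `false` until the global toggle is switched on, which matches the "no behavior change when disabled" goal.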

##### Suggested implementation plan (modular)

**Phase 1: Shared “Web Search Context” module + tagging integration**

1. Create a shared module (for example `js/mzta-web-context.js`) that exposes:

   * `getWebContext({ queryTerms, businessContext, scope, cacheKey }): { contextText, sourcesMeta }`
2. Query building defaults (privacy-first):

   * Tagging default query terms: sender domain + subject
   * Optional scopes (explicit opt-in): include sanitized snippet
   * Include the Business context as guidance for query construction and/or result summarization
3. Caching (RAG-lite first):

   * Cache raw/normalized snippets + metadata locally, keyed by sender domain and query signature
   * Reuse cached context when fresh; fall back to live web search otherwise
4. Inject web context into the existing prompt pipeline:

   * Append a bounded “Web context” block (or provide a placeholder like `{%web_context%}`)
   * Do **not** alter the existing output format requirements
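The four Phase 1 steps could be sketched roughly as follows. This is an illustration under assumptions, not a proposed final API: the second `deps` argument to `getWebContext`, the `WebContextCache` class, the shape of the search results (`title`, `snippet`, `url`), the helper names, and the 1500-character limit are all hypothetical.

```javascript
// Step 2: privacy-first query terms for tagging (sender domain + subject only).
// `message` is a hypothetical plain object: { from, subject }.
function buildTaggingQueryTerms(message) {
  const match = /@([^\s>]+)/.exec(message.from || "");
  const domain = match ? match[1].toLowerCase() : "";
  const subject = (message.subject || "").trim();
  return [domain, subject].filter(Boolean);
}

// Step 3: cache key from sender domain + a normalized query signature.
function makeCacheKey(senderDomain, queryTerms) {
  const signature = queryTerms.map((t) => t.toLowerCase().trim()).sort().join(" ");
  return `${senderDomain}|${signature}`;
}

// Step 3: in-memory TTL cache (a real add-on would persist this to local storage).
class WebContextCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.entries = new Map();
  }
  get(key, now = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (now - entry.storedAt > this.ttlMs) {
      this.entries.delete(key); // stale: drop it and report a miss
      return null;
    }
    return entry.value;
  }
  set(key, value, now = Date.now()) {
    this.entries.set(key, { value, storedAt: now });
  }
}

// Step 1: shared entry point. `deps.search` is an injected async function
// standing in for whichever search provider is configured.
async function getWebContext({ queryTerms, businessContext, scope, cacheKey }, deps) {
  const cached = deps.cache.get(cacheKey);
  if (cached) return cached; // fresh cache hit: no web call
  const results = await deps.search({ queryTerms, businessContext, scope });
  const value = {
    contextText: results.map((r) => `${r.title}: ${r.snippet}`).join("\n"),
    sourcesMeta: results.map((r) => ({ title: r.title, url: r.url })),
  };
  deps.cache.set(cacheKey, value);
  return value;
}

// Step 4: add a bounded "Web context" block without touching output rules.
function injectWebContext(promptText, contextText, maxChars = 1500) {
  if (!contextText) return promptText;
  const block = `Web context (background only; do not change the output format):\n${contextText.slice(0, maxChars)}`;
  return promptText.includes("{%web_context%}")
    ? promptText.replace("{%web_context%}", () => block)
    : `${promptText}\n\n${block}`;
}
```

Passing the search function and cache in as dependencies keeps the module provider-agnostic and easy to test. `sourcesMeta` stays outside the prompt output, so tag values are never affected.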

**Phase 2: Full local RAG cache (better reuse across features)**

* Store web results as small documents (title + snippet + source domain + timestamp).
* Retrieval:

  * baseline lexical matching (domain + keywords)
  * optional semantic retrieval if embeddings are available/configured
* For each AI function invocation, retrieve top-k relevant cached contexts and include them in the prompt context, bounded by strict limits.
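The baseline lexical retrieval could look something like the sketch below. The document fields follow the list above (title, snippet, source domain, timestamp), but the property names, the scoring weights, and the `retrieveTopK` helper itself are assumptions for illustration.

```javascript
// Split text into lowercase alphanumeric tokens.
function tokenize(text) {
  return String(text).toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Rank cached documents by keyword overlap plus a bonus for a matching domain.
// Each doc is assumed to look like { title, snippet, sourceDomain, timestamp }.
function retrieveTopK(docs, { domain, keywords }, k = 3) {
  const wanted = new Set(keywords.flatMap(tokenize));
  const scored = docs.map((doc) => {
    const tokens = new Set(tokenize(`${doc.title} ${doc.snippet}`));
    let score = 0;
    for (const word of wanted) {
      if (tokens.has(word)) score += 1; // one point per matching keyword
    }
    if (domain && doc.sourceDomain === domain) score += 2; // assumed domain bonus
    return { doc, score };
  });
  return scored
    .filter((s) => s.score > 0)                                             // drop irrelevant docs
    .sort((a, b) => b.score - a.score || b.doc.timestamp - a.doc.timestamp) // newer wins ties
    .slice(0, k)
    .map((s) => s.doc);
}
```

Semantic retrieval, if embeddings are configured, could replace the scoring step while keeping the same top-k interface.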

**Phase 3: Provider-native web grounding (optional)**

* If a configured provider supports native web grounding, allow selecting that mode.
* Still cache the resulting context locally to avoid repeating work.

##### Guardrails and privacy notes

* Default to sending minimal data for search (sender domain + subject for tagging).
* Clearly disclose in the UI that enabling Web Search Context may transmit:

  * derived query terms
  * business context text (or a derived form of it)
  * optionally a sanitized snippet if explicitly enabled
* Enforce strict size limits for:

  * business context
  * injected web context
  * cached documents
* Failure behavior (fail open):

  * if the web search fails or times out, run the AI action normally without web context (no blocking error)
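A rough sketch of the size limits and the non-blocking failure path. The limit values, the `LIMITS` object, and the `getWebContextOrEmpty` wrapper are hypothetical; `fetchContext` stands in for whatever async call actually performs the search.

```javascript
// Assumed limits, in characters; real values would be tuned per provider.
const LIMITS = Object.freeze({
  businessContextChars: 1000,
  webContextChars: 1500,
  cachedDocChars: 500,
});

// Truncate any value to at most maxChars characters (null/undefined become "").
function clampText(text, maxChars) {
  const s = String(text ?? "");
  return s.length <= maxChars ? s : s.slice(0, maxChars);
}

// Never let a web search failure block the main AI action:
// on any error, return an empty context instead of throwing.
async function getWebContextOrEmpty(fetchContext) {
  try {
    const ctx = await fetchContext();
    return {
      contextText: clampText(ctx.contextText, LIMITS.webContextChars),
      sourcesMeta: ctx.sourcesMeta || [],
    };
  } catch (err) {
    console.warn("Web Search Context unavailable; continuing without it.", err);
    return { contextText: "", sourcesMeta: [] };
  }
}
```

A caller would then build the prompt exactly as it does today whenever `contextText` comes back empty.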

##### Tagging-specific acceptance criteria (primary)

* When enabled, tagging is measurably more accurate on ambiguous vendor/tool emails than the same prompt run without web context.
* Tag output remains exactly the same shape as today (for example `{"tags":[...]}`); no extra text in tag values.
* Cache reuse reduces repeated searches for the same sender domain over time.
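The output-shape criterion could be checked with something like the sketch below. It assumes the tagging output really is a JSON object whose only key is `tags` (the issue gives `{"tags":[...]}` only as an example); `isValidTagOutput` is a hypothetical helper, and the URL check is just one simple heuristic for "no sources inside tag names".

```javascript
// Returns true only for output shaped like {"tags": ["...", "..."]}
// where every tag is a non-empty string that does not contain a URL.
function isValidTagOutput(rawText) {
  let parsed;
  try {
    parsed = JSON.parse(rawText);
  } catch {
    return false; // not JSON at all
  }
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) return false;
  const keys = Object.keys(parsed);
  if (keys.length !== 1 || keys[0] !== "tags" || !Array.isArray(parsed.tags)) return false;
  return parsed.tags.every(
    (tag) => typeof tag === "string" && tag.trim() !== "" && !/https?:\/\//i.test(tag)
  );
}
```

Running the same check with Web Search Context on and off would show whether the added context ever leaks into the tag list.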

##### General acceptance criteria

* No behavior change when the feature is disabled.
* Works as a shared capability that other AI functions can opt into without duplicating code.
* Web search failures never break the main AI action; they only remove the extra context.


### Additional information

_No response_
