
Add support for request caching #3637

@ekzhu

Description

What feature would you like to be added?

Implement a caching mechanism for LLM API calls to reduce unnecessary API calls, similar to the one in 0.2.

When enabled, this feature should allow us to retrieve cached responses for identical LLM requests instead of making new API calls. Ideally, it should include a configuration flag to enable/disable caching, as well as a way to manage the cache. A sketch of how this could look is included below.

We don't need to follow the same API as in the 0.2 version. We can have the cache managed by the model client instead.
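A rough sketch of what "cache managed by the model client" could mean, assuming an async model client exposing a `create` method. The `ModelClient` protocol, `CachedModelClient` wrapper, `enabled` flag, and `clear_cache` method are hypothetical names for illustration only, not a proposed or existing API:

```python
import hashlib
import json
from typing import Any, Protocol


class ModelClient(Protocol):
    """Minimal interface assumed for illustration; the real client API may differ."""

    async def create(self, messages: list[dict[str, Any]], **kwargs: Any) -> Any: ...


class CachedModelClient:
    """Wraps a model client and returns cached responses for identical requests."""

    def __init__(self, client: ModelClient, enabled: bool = True) -> None:
        self._client = client
        self._enabled = enabled  # configuration flag to enable/disable caching
        self._cache: dict[str, Any] = {}

    def _key(self, messages: list[dict[str, Any]], kwargs: dict[str, Any]) -> str:
        # Hash the full request payload so only truly identical requests hit the cache.
        payload = json.dumps(
            {"messages": messages, "kwargs": kwargs}, sort_keys=True, default=str
        )
        return hashlib.sha256(payload.encode()).hexdigest()

    async def create(self, messages: list[dict[str, Any]], **kwargs: Any) -> Any:
        if not self._enabled:
            return await self._client.create(messages, **kwargs)
        key = self._key(messages, kwargs)
        if key in self._cache:
            return self._cache[key]  # cache hit: no API call made
        response = await self._client.create(messages, **kwargs)
        self._cache[key] = response
        return response

    def clear_cache(self) -> None:
        """Drop all cached responses (basic cache management hook)."""
        self._cache.clear()
```

The in-memory dict here is just a placeholder; a persistent store (e.g. on disk) could back the same interface so cached responses survive across runs.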

Why is this needed?

Save cost on identical inference requests.
