-
Notifications
You must be signed in to change notification settings - Fork 13.9k
Server: Change Invalid Schema from Server Error (500) to User Error (400) #17572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Server: Change Invalid Schema from Server Error (500) to User Error (400) #17572
Conversation
ngxson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if it's better to create a dedicated class for grammar-related exceptions
|
|
||
| json body_parsed; | ||
| try { | ||
| body_parsed = oaicompat_chat_params_parse( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC oaicompat_chat_params_parse is used in multiple places. Changing it here won't fix it for other endpoints
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya, that's right.
What do you think about moving the invalid schema catch handler to ex_wrapper?
llama.cpp/tools/server/server.cpp
Line 3675 in c6f7a42
| static server_http_context::handler_t ex_wrapper(server_http_context::handler_t func) { |
That would be a more robust change, but a bit larger scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with placing it inside ex_wrapper, probably better to combine this with the error class specific to grammar as I mentioned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise, a better approach is to wrap whatever call to json_schema_to_grammar inside an exception handler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your comments!
Give me a day or so and I can flip to the custom exception + ex_wrapper.
Otherwise, a better approach is to wrap whatever call to
json_schema_to_grammarinside an exception handler
I didn't quite get this. Do you mean to wrap that call and re-throw as the new custom exception? Most of those calls are in common/chat.cpp via oaicompat_chat_params_parse in server.cpp. Can you explain more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm ok I thought that oaicompat_chat_params_parse can directly return error response instead of std::exception, but it can't.
for now, I think it's ok to use invalid_argument as 400 and generic exception as 500
- please update the
ex_wrapperto return the error using this convention - for everything else inside
oaicompat_chat_params_parsethat throws an exception, we should convert them toinvalid_argument
Sure, we can create a custom exception type for grammar errors. |
Problem
Currently when a client passes an invalid schema to
llama-server, the server returns an Internal Server Error (500). This can cause the client to repeatedly retry the request with backoff. Since an invalid schema will never be fixed by a retry, the user just sees a long processing time and then eventual failure.Proposed Fix
This PR changes the server error to a Bad Request (400). That prevents the client from retrying, and surfaces to the invalid schema error back to the client.
Issues
This is a partial fix for Issue 17574.
Tests
$ ctest --test-dir build$ ./tools/server/tests/tests.shSomething went wrong. Here's the specific error message we encountered: An error occurred while processing the request: 400 JSON schema conversion failed: Unrecognized schema: {"not":{}}instead of long pause then unknown failure.Alternative Solutions
An alternative solution could move this logic to
ex_wrapperintools/server/server.cpp. I went with the smaller behavior change for now, but can pursue that solution instead if desired.