models : fix LFM2 tensors #17548
Conversation
Oh, oops, cc @tdakhran
tdakhran left a comment:
I overlooked it, thank you for the fix @ggerganov!
{LLM_TENSOR_POS_EMBD,        {LLM_TENSOR_LAYER_INPUT, GGML_OP_GET_ROWS}},
{LLM_TENSOR_TOKEN_EMBD_NORM, {LLM_TENSOR_LAYER_INPUT, GGML_OP_GET_ROWS}},
{LLM_TENSOR_TOKEN_TYPES,     {LLM_TENSOR_LAYER_INPUT, GGML_OP_GET_ROWS}},
{LLM_TENSOR_TOKEN_EMBD_NORM, {LLM_TENSOR_LAYER_INPUT, GGML_OP_MUL}},
I've never fully grasped where this is used and how. I guess this won't have any side effects for other models?
A quick search for tok_norm_b indicates that the token embedding norm is only used by older archs like BLOOM or BERT. I guess that, at the time, they used different (less efficient) training techniques, so the norm was there to make training more stable.

It should not make much of a difference though, because MUL should be available on most backends by now (so it's likely to be supported by whatever backend holds the input layer).
I added a comment to clarify how this information is used: #17550
Thanks, that's quite helpful.
alt #17248
LLM_TENSOR_TOKEN_EMBD_NORM tensor info