ggerganov (Member) commented Nov 27, 2025

alt #17248

  • Force token embeddings to be at the start of the graph
  • Fix LFM2 output norm tensor
  • Fix LLM_TENSOR_TOKEN_EMBD_NORM tensor info

CISC (Collaborator) commented Nov 27, 2025

Oh, ooops, cc/ @tdakhran

tdakhran (Contributor) left a comment

I overlooked it, thank you for the fix @ggerganov!

  {LLM_TENSOR_POS_EMBD, {LLM_TENSOR_LAYER_INPUT, GGML_OP_GET_ROWS}},
- {LLM_TENSOR_TOKEN_EMBD_NORM, {LLM_TENSOR_LAYER_INPUT, GGML_OP_GET_ROWS}},
  {LLM_TENSOR_TOKEN_TYPES, {LLM_TENSOR_LAYER_INPUT, GGML_OP_GET_ROWS}},
+ {LLM_TENSOR_TOKEN_EMBD_NORM, {LLM_TENSOR_LAYER_INPUT, GGML_OP_MUL}},
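
For readers unfamiliar with the two ops named in these entries, here is a minimal standalone sketch of the difference, assuming the usual layer-norm formulation. An embedding table is consumed by a row gather, while a norm weight is consumed by an element-wise multiply. The helpers `get_row` and `embd_norm` are hypothetical names made up for illustration and are not ggml or llama.cpp functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical row gather: what a GET_ROWS-style op does conceptually.
// Picks row `id` out of a flattened [n_rows x n_embd] table.
static std::vector<float> get_row(const std::vector<float> & table, int n_embd, int id) {
    return std::vector<float>(table.begin() + id * n_embd,
                              table.begin() + (id + 1) * n_embd);
}

// Hypothetical layer norm over one embedding vector: normalize to zero mean
// and unit variance, then scale by `weight` (element-wise multiply, the MUL
// in the table entry) and shift by `bias` (element-wise add).
static std::vector<float> embd_norm(const std::vector<float> & x,
                                    const std::vector<float> & weight,
                                    const std::vector<float> & bias,
                                    float eps = 1e-5f) {
    assert(x.size() == weight.size() && x.size() == bias.size());

    float mean = 0.0f;
    for (float v : x) mean += v;
    mean /= static_cast<float>(x.size());

    float var = 0.0f;
    for (float v : x) var += (v - mean) * (v - mean);
    var /= static_cast<float>(x.size());

    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = (x[i] - mean) / std::sqrt(var + eps) * weight[i] + bias[i];
    }
    return y;
}
```

In this sketch the norm weight is never indexed by token id. It only ever appears as the right-hand side of a multiply, which is why pairing it with a row-gather op would describe the wrong usage.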
Collaborator

I've never fully grasped where and how this is used. I guess this won't have any side effects for other models?

Collaborator

A quick search for tok_norm_b indicates that the token embd norm is only used by older archs like bloom or bert. I guess that's because they used a different (less efficient) training technique at the time, so the norm is there to make training more stable.

It should not make much of a difference though, because MUL should be available on most backends now (so it's likely to be supported by whatever backend holds the input layer).
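
To make the point about backend support concrete, here is a rough sketch of how an op recorded next to a tensor could be used to pick a backend for that weight. This is one reading of the discussion, not the actual llama.cpp logic: `Op`, `Layer`, `TensorInfo`, `Backend`, and `pick_backend` are all hypothetical names invented for this example.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// All names below are hypothetical stand-ins, not llama.cpp or ggml identifiers.
enum class Op    { GetRows, Mul };
enum class Layer { Input, Repeating, Output };

struct TensorInfo {
    Layer layer; // which part of the model the weight belongs to
    Op    op;    // the op the weight is consumed by
};

struct Backend {
    std::string  name;
    std::set<Op> supported_ops;
};

// Return the first backend (in priority order) that supports the op the
// weight is used with, or nullptr when none of them does.
static const Backend * pick_backend(const TensorInfo & info,
                                    const std::vector<Backend> & backends) {
    assert(!backends.empty());
    for (const Backend & b : backends) {
        if (b.supported_ops.count(info.op) > 0) {
            return &b;
        }
    }
    return nullptr;
}
```

Under these assumptions, a weight recorded with an op that a preferred backend lacks would fall through to a later backend even if the op it is really used with is supported, so recording the right op matters mostly for backends with partial op coverage.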

Member Author

I added a comment to clarify how this information is used: #17550

Collaborator

Thanks, that's quite helpful.

github-actions bot added the "model" (Model specific) label Nov 27, 2025
ggerganov merged commit 6783b11 into master Nov 27, 2025
64 of 66 checks passed
5 participants