Open
Description
We are trying to train the GPT-CoNuT model. Following the instructions, we ran the training script src/trainer/gpt_conut_trainer.py.
However, the training fails here:
def __init__(self, train_loader, valid_loader, dictionary, gpt_file):
    gpt_loaded = torch.load(gpt_file)
    config = gpt_loaded['config']
    gpt_model = OpenAIGPTLMHeadModel(config).cuda()
    gpt_model.load_state_dict(gpt_loaded['model'])
The very first step of this code loads an existing GPT model from gpt_file. However, we are trying to train the model from scratch, so unless I am missing something, this does not seem correct. Could you please share the artefact needed to train the model from scratch?
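For reference, judging only from the keys read in the snippet above, gpt_file appears to be a checkpoint written with torch.save that holds a 'config' entry and a 'model' state dict. Below is a minimal sketch of how we would guess such a file could be produced with a randomly initialized model. It assumes the OpenAIGPTConfig and OpenAIGPTLMHeadModel classes from Hugging Face transformers. The function names, the hyperparameter values, and the file path are hypothetical, and this is not necessarily how the authors created their checkpoint.

```python
import torch
from transformers import OpenAIGPTConfig, OpenAIGPTLMHeadModel


def save_untrained_gpt(path, vocab_size=1000, n_positions=64,
                       n_embd=48, n_layer=2, n_head=4):
    """Write a checkpoint with the two keys the trainer reads.

    All hyperparameter defaults here are placeholders (assumptions),
    not the values used for GPT-CoNuT.
    """
    config = OpenAIGPTConfig(
        vocab_size=vocab_size,
        n_positions=n_positions,
        n_embd=n_embd,    # must be divisible by n_head
        n_layer=n_layer,
        n_head=n_head,
    )
    # Randomly initialized weights, i.e. no pretraining at all.
    model = OpenAIGPTLMHeadModel(config)
    torch.save({'config': config, 'model': model.state_dict()}, path)
    return path


def load_gpt(path):
    """Mirror the loading steps from the trainer, on CPU and without .cuda()."""
    # The checkpoint stores a pickled config object, so recent PyTorch
    # versions need weights_only=False to load it.
    loaded = torch.load(path, weights_only=False)
    model = OpenAIGPTLMHeadModel(loaded['config'])
    model.load_state_dict(loaded['model'])
    return model
```

If this guess about the format is right, the path returned by save_untrained_gpt could be passed as gpt_file. We do not know whether the trainer expects a GPT that was pretrained first, which is why we are asking.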
Looking forward to hearing your feedback. Thanks in advance for the help.