Training script for GPT-CoNuT model does not work

We are trying to train the GPT-CoNuT model. Following the [instructions](https://github.com/nashid/CURE), we ran the training script `src/trainer/gpt_conut_trainer.py`.

However, the training fails here:

https://github.com/lin-tan/CURE/blob/master/src/trainer/gpt_conut_trainer.py#L22

```python
    def __init__(self, train_loader, valid_loader, dictionary, gpt_file):
        gpt_loaded = torch.load(gpt_file)
        config = gpt_loaded['config']
        gpt_model = OpenAIGPTLMHeadModel(config).cuda()
        gpt_model.load_state_dict(gpt_loaded['model'])
```

The very first step of the constructor loads an existing GPT model from `gpt_file` and reads its `config` and `model` entries. However, we are trying to train the model from scratch, so we have no such file to pass in. Unless I am missing something, this does not seem correct. Could you please share the artefacts needed to train the model from scratch?
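For reference, here is a minimal sketch of the file layout the constructor appears to expect, inferred only from the snippet above: a `torch.save`d dict with a `config` object and a `model` state dict. The function names, the tiny hyperparameters, and the use of a randomly initialized model are our own assumptions for illustration, not something taken from the CURE repository, and we do not know whether the authors intend `gpt_file` to be produced this way.

```python
import os
import tempfile

import torch
from transformers import OpenAIGPTConfig, OpenAIGPTLMHeadModel


def save_random_gpt_checkpoint(path, vocab_size=100):
    """Write a randomly initialized GPT checkpoint with 'config' and 'model' keys.

    Hypothetical helper: the hyperparameters are deliberately tiny so the
    sketch runs quickly; they are not the values used by CURE.
    """
    config = OpenAIGPTConfig(
        vocab_size=vocab_size, n_positions=32, n_embd=16, n_layer=1, n_head=2
    )
    model = OpenAIGPTLMHeadModel(config)
    # Same two keys that the trainer's __init__ reads back.
    torch.save({'config': config, 'model': model.state_dict()}, path)
    return config


def load_gpt_checkpoint(path):
    """Mirror the loading steps from the trainer snippet, on CPU."""
    # weights_only=False because the file stores a config object, not only
    # tensors (this keyword is available in recent PyTorch releases).
    gpt_loaded = torch.load(path, map_location='cpu', weights_only=False)
    gpt_model = OpenAIGPTLMHeadModel(gpt_loaded['config'])
    gpt_model.load_state_dict(gpt_loaded['model'])
    return gpt_model
```

A file written this way would only get past the loading step; whether training GPT-CoNuT on top of an untrained GPT is meaningful is exactly the question we would like the authors to clarify.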

Looking forward to hearing your feedback. Thanks in advance for the help.
