BERT-based NER model giving inconsistent predictions when deserialized

Problem description

I am trying to train an NER model using the HuggingFace transformers library on Colab cloud GPUs, pickle it, and load the model on my own CPU to make predictions.

Code

The model is the following:

from transformers import BertForTokenClassification

model = BertForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=NUM_LABELS,
    output_attentions = False,
    output_hidden_states = False
)

I am using this snippet to save the model on Colab:

import torch

torch.save(model.state_dict(), FILENAME)

I then load it on my CPU using:

# Initiating an instance of the model type

model_reload = BertForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=len(tag2idx),
    output_attentions = False,
    output_hidden_states = False
)

# Loading the model
model_reload.load_state_dict(torch.load(FILENAME, map_location='cpu'))
model_reload.eval()
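
One thing that may be worth ruling out at this step: the reloaded model sizes its classification head from `len(tag2idx)`, and the predicted indices are only decoded correctly if `tag2idx` assigns each tag the same index on both machines. The following is a minimal sketch, assuming `tag2idx` is a plain dict from tag string to integer. The file name `tag2idx.json`, the helper names, and the toy mappings are all hypothetical and not taken from the notebooks. It saves the mapping next to the weights and reports any tag whose index changed after reloading.

```python
import json
import os
import tempfile


def save_label_map(tag2idx, path):
    """Write the tag -> index mapping to a JSON file next to the weights."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(tag2idx, f, indent=2, sort_keys=True)


def load_label_map(path):
    """Read the tag -> index mapping back from JSON."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def diff_label_maps(saved, current):
    """Return {tag: (saved_index, current_index)} for every tag whose index differs."""
    tags = sorted(set(saved) | set(current))
    return {
        tag: (saved.get(tag), current.get(tag))
        for tag in tags
        if saved.get(tag) != current.get(tag)
    }


# Toy mappings for illustration only; the real tag2idx is not shown in the question.
trained_map = {"O": 0, "B-per": 1, "I-per": 2}   # as built in the training session
rebuilt_map = {"I-per": 0, "B-per": 1, "O": 2}   # as rebuilt in a later session

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tag2idx.json")  # hypothetical file name
    save_label_map(trained_map, path)
    restored_map = load_label_map(path)

print(diff_label_maps(restored_map, trained_map))  # empty dict: mapping survived the round trip
print(diff_label_maps(restored_map, rebuilt_map))  # tags whose index moved between sessions
```

If the second dictionary is non-empty for the real mappings, the weights may be loading correctly while the indices are decoded with a different tag order.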

The code snippet used to tokenize the text and make the actual predictions is the same on both the Colab GPU notebook instance and my CPU notebook instance.

Expected behavior

The GPU-trained model behaves correctly and classifies the following tokens perfectly:

O       [CLS]
O       Good
O       morning
O       ,
O       my
O       name
O       is
B-per   John
I-per   Kennedy
O       and
O       I
O       am
O       working
O       at
B-org   Apple
O       in
O       the
O       headquarters
O       of
B-geo   Cupertino
O       [SEP]

Actual behavior

When loading the model and using it to make predictions on my CPU, the predictions are totally wrong:

I-eve   [CLS]
I-eve   Good
I-eve   morning
I-eve   ,
I-eve   my
I-eve   name
I-eve   is
I-geo   John
B-eve   Kennedy
I-eve   and
I-eve   I
I-eve   am
I-eve   working
I-eve   at
I-gpe   Apple
I-eve   in
I-eve   the
I-eve   headquarters
I-eve   of
B-org   Cupertino
I-eve   [SEP]
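
Comparing the two listings token by token, every token tagged `O` on the GPU comes out as `I-eve` on the CPU, which hints that the errors might follow a fixed pattern rather than being random. Below is a minimal sketch of that check. The two lists are transcribed from the outputs above, and `relabeling` is a hypothetical helper, not part of any library. It tests whether each expected tag is always replaced by the same wrong tag.

```python
def relabeling(expected, actual):
    """Return {expected_tag: actual_tag} if each expected tag always maps to the
    same actual tag, or None if the substitution is inconsistent."""
    if len(expected) != len(actual):
        raise ValueError("both sequences must have the same number of tokens")
    mapping = {}
    for exp_tag, act_tag in zip(expected, actual):
        # setdefault stores the first pairing seen and returns it on later hits
        if mapping.setdefault(exp_tag, act_tag) != act_tag:
            return None
    return mapping


# Tags in token order, transcribed from the GPU (expected) and CPU (actual) outputs.
expected_tags = (["O"] * 7 + ["B-per", "I-per"] + ["O"] * 5 + ["B-org"]
                 + ["O"] * 4 + ["B-geo", "O"])
actual_tags = (["I-eve"] * 7 + ["I-geo", "B-eve"] + ["I-eve"] * 5 + ["I-gpe"]
               + ["I-eve"] * 4 + ["B-org", "I-eve"])

tag_mapping = relabeling(expected_tags, actual_tags)
# True when no two expected tags collapse onto the same actual tag
is_one_to_one = tag_mapping is not None and len(set(tag_mapping.values())) == len(tag_mapping)

print(tag_mapping)
print(is_one_to_one)
```

For this one sentence the substitution is consistent and one-to-one, which would be compatible with the tag indices being permuted between the two sessions. A single sentence is not proof, so this is only a hint about where to look.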

Does anyone have an idea why it doesn't work? Did I miss something?