Classification: BERT-CNN

In this guide, we prepare a BERT-CNN ensemble which takes the embeddings generated by the BERT base model and feeds them into a CNN. The general logic from this guide can be used to replace the CNN with any other NN of your choice. While it is a fun task to explore, adding on what is technically an inferior model on-top of a Transformer is not really necessary. Like other guides, this walk through provides a complete treatment of the data preparation and training of the BERT-CNN ensemble in PyTorch.