HILANCO

Hungarian Intelligent Language Applications Consortium


HIL-RoBERTa

A cased RoBERTa model trained on the Hungarian Wikipedia corpus published as part of the Webcorpus 2.0 dataset.
Pretraining ran for 1.25 million steps (five epochs) with a batch size of 32, a learning rate of 1e-4, and a BPE-encoded vocabulary of 30,000 subwords.
On a configuration of 4 GTX 1080 Ti GPUs with a total of 44 GB of VRAM, training took 219 hours.
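The hyperparameters above map naturally onto a Hugging Face transformers masked-language-model setup. The snippet below is a minimal sketch of such a configuration, assuming the transformers library; the per-GPU batch split, output directory, and omitted tokenizer/corpus handling are illustrative assumptions, not the consortium's actual training script.

```python
# Sketch of a RoBERTa pretraining configuration mirroring the published
# hyperparameters: 30k BPE vocabulary, batch size 32, learning rate 1e-4,
# 1.25 million steps. Paths and the per-device batch split are assumptions.
from transformers import RobertaConfig, RobertaForMaskedLM, TrainingArguments

config = RobertaConfig(
    vocab_size=30_000,               # BPE-encoded vocabulary of 30,000 subwords
)
model = RobertaForMaskedLM(config)

training_args = TrainingArguments(
    output_dir="hil-roberta",        # placeholder output directory
    per_device_train_batch_size=8,   # assumed 4 GPUs x 8 = total batch size 32
    learning_rate=1e-4,
    max_steps=1_250_000,             # 1.25 million steps (about five epochs)
)
```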
For further details, see this page »

To DOWNLOAD the models, please fill out the registration form: » REGISTRATION FORM «
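Once the registered download is unpacked locally, the checkpoint can be used like any other RoBERTa model. The sketch below assumes the transformers library; "path/to/hil-roberta" is a placeholder for the local directory, not an official model identifier.

```python
# Sketch of loading the downloaded HIL-RoBERTa checkpoint and filling a mask.
# "path/to/hil-roberta" is a placeholder for the unpacked download directory.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("path/to/hil-roberta")
model = AutoModelForMaskedLM.from_pretrained("path/to/hil-roberta")

# Score a masked Hungarian sentence and print the top prediction.
text = f"Budapest Magyarország {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = outputs.logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```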

