WikiText

The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia.

License: cc-by-sa-4.0
Size category: 1M<n<10M
Tasks: Text Generation, Fill-Mask
Language: English
Uploaded by: @AIOZNetwork

Last updated: 5 months ago

