Datasets


Popular Datasets

super-dataset-in-the-world
Super Dataset In The World

The Super Dataset in the World is a general-purpose data repository intended for researchers, developers, and industry professionals working on machine learning, data analytics, and AI. According to its description, it is curated from diverse, high-quality sources across multiple domains and aims to be comprehensive, accurate, and scalable.

BLiMP

BLiMP (the Benchmark of Linguistic Minimal Pairs) is a challenge set for evaluating the linguistic knowledge of language models (LMs) on major grammatical phenomena in English. Its authors find that state-of-the-art models reliably identify morphological contrasts related to agreement, but struggle with some subtle semantic and syntactic phenomena.

NIH_Chest_X_ray
NIH Chest X-Ray

NIH Chest X-Ray is a large dataset of patient chest X-ray images collected by the United States National Institutes of Health (NIH).

moodeng-dataset-pro-1.42
Moodeng Dataset Pro-v1.42

moodeng-dataset-pro-1.42

PLOD_An_Abbreviation_Detection_Dataset
PLOD: An Abbreviation Detection Dataset

This repository holds the subset of the PLOD dataset used for coursework (CW) in the 2023-2024 NLP module at the University of Surrey.

XQuAD

XQuAD is a benchmark dataset for evaluating cross-lingual question answering performance.

MNIST

MNIST is a dataset of handwritten digit images (0 through 9) used to train and evaluate image classification models.

X-CSR

X-CSR is a collection of cross-lingual commonsense reasoning datasets. To create them, the authors automatically translated the CSQA and CODAH datasets, originally available only in English, into 15 other languages.