Datasets

face_anti_spoofing_data
Face Anti-Spoofing Challenge

Dataset for Face Anti-Spoofing Challenge

spaceship_titanic_data
Spaceship Titanic Challenge

Dataset for Spaceship Titanic Challenge

movie_reviews_data
Movie Reviews Challenge

Dataset for Movie Reviews Challenge

housing_price_data
Housing Prices Challenge

Dataset for the Housing Prices Challenge, including train and test data.

MediaContent-2025
Media Content Gold Set

Multimodal dataset containing 1.2M media assets with metadata tags, sourced from licensed content libraries. Suitable for recommendation system training.

GAHSE-DIVINE
DIVINE-GAHSE

DIVINE PROJECT

Radiology-Images-VQA
A dataset of clinically generated visual questions and answers about radiology images

Radiology images are an essential part of clinical decision making and population screening, e.g., for cancer. Automated systems could help clinicians cope with large amounts of images by answering questions about the image contents. An emerging area of artificial intelligence, Visual Question Answering (VQA) in the medical domain explores approaches to this form of clinical decision support. Success of such machine learning tools hinges on availability and design of collections composed of medical images augmented with question-answer pairs directed at the content of the image. We introduce VQA-RAD, the first manually constructed dataset where clinicians asked naturally occurring questions about radiology images and provided reference answers. Manual categorization of images and questions provides insight into clinically relevant tasks and the natural language to phrase them.

XQuAD
XQuAD

XQuAD is a benchmark dataset for evaluating cross-lingual question answering performance.

CommonGen
CommonGen

Building machines with commonsense to compose realistically plausible sentences is challenging. CommonGen is a constrained text generation task, associated with a benchmark dataset, to explicitly test machines for the ability of generative commonsense reasoning. Given a set of common concepts, the task is to generate a coherent sentence describing an everyday scenario using these concepts.

BLiMP
BLiMP

BLiMP (the Benchmark of Linguistic Minimal Pairs) is a challenge set for evaluating the linguistic knowledge of language models (LMs) on major grammatical phenomena in English. Its authors find that state-of-the-art models reliably identify morphological contrasts related to agreement but struggle with some subtle semantic and syntactic phenomena.

TAL-SCQ5K
TAL-SCQ5K

TAL-SCQ5K is a collection of high-quality mathematical competition datasets created by TAL Education Group.

X-CSR
X-CSR

To create the X-CSR datasets, the authors automatically translated the CSQA and CODAH datasets, originally available only in English, into 15 other languages.

DOCCI
DOCCI

The DOCCI dataset consists of comprehensive descriptions of 15k images taken specifically to evaluate text-to-image (T2I) and image-to-text (I2T) models. The descriptions cover many key details in the images.

AI2_Reasoning_Challenge
AI2 Reasoning Challenge

The ARC dataset consists of 7,787 science exam questions drawn from a variety of sources, including science questions provided under license by a research partner affiliated with AI2.

PLOD_An_Abbreviation_Detection_Dataset
PLOD: An Abbreviation Detection Dataset

This repository contains the subset of the PLOD dataset used for coursework (CW) in the 2023-2024 NLP module at the University of Surrey.

NIH_Chest_X_ray
NIH Chest X-Ray

NIH Chest X-Ray is a large dataset containing chest X-ray images of patients collected by the National Institutes of Health (NIH) of the United States.

MNIST
MNIST

MNIST is a dataset of handwritten digit images used to train and evaluate image classification models.
