
Housing Prices Challenge
Dataset for housing prices, including train and test data.
other
1M<n<10M
Tabular Regression
English
Housing Price Data
Summary
Introduction
This dataset is derived from a Gigasheet (2023) collection of detailed real estate listings, including property type, price, location, city, bedrooms, bathrooms, area (in marla), and purpose (rent or sale). Designed for housing market analysis, it supports trend assessment, comparative pricing, and data-driven investment decisions. It is also used for the Housing Prices Challenge, where participants build predictive models of property prices.
Dataset Structure
Files
The dataset consists of two files:
- train.csv:
  - Used for training machine learning models.
  - Contains 69,649 entries with property features (e.g., type, location, area) and the corresponding price, which is the prediction target.
- test.csv:
  - Used for generating predictions to submit to the challenge platform.
  - Contains 29,850 entries for evaluation, without the price column. Submissions are scored and displayed on the Public Leaderboard.
An example from train.csv:
house_index,property_type,price,location,city,baths,purpose,bedrooms,Area_in_Marla
0,Flat,10000000,G-10,Islamabad,2,For Sale,2,4.0
An example from test.csv:
house_index,property_type,location,city,baths,purpose,bedrooms,Area_in_Marla
110571,House,Bahria Town Karachi,Karachi,3,For Sale,3,8.0
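As a minimal sketch of how the two files differ, the snippet below parses the sample rows shown above with Python's standard csv module and checks which column is absent from the test file. The helper name read_rows is hypothetical, and treating baths and bedrooms as whole numbers is an assumption based only on these two sample rows. To read the real files, pass an open file handle for train.csv or test.csv instead of the in-memory samples.

```python
import csv
import io

# Sample rows copied from this card; replace with open("train.csv") etc. for real use.
TRAIN_SAMPLE = """house_index,property_type,price,location,city,baths,purpose,bedrooms,Area_in_Marla
0,Flat,10000000,G-10,Islamabad,2,For Sale,2,4.0
"""
TEST_SAMPLE = """house_index,property_type,location,city,baths,purpose,bedrooms,Area_in_Marla
110571,House,Bahria Town Karachi,Karachi,3,For Sale,3,8.0
"""


def read_rows(handle):
    """Parse CSV rows into dicts and convert the numeric columns.

    Assumption: baths and bedrooms hold whole numbers (true for the samples).
    """
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        row["house_index"] = int(row["house_index"])
        row["baths"] = int(float(row["baths"]))
        row["bedrooms"] = int(float(row["bedrooms"]))
        row["Area_in_Marla"] = float(row["Area_in_Marla"])
        if "price" in row:  # only train.csv carries the target column
            row["price"] = float(row["price"])
        rows.append(row)
    return rows


train_rows = read_rows(io.StringIO(TRAIN_SAMPLE))
test_rows = read_rows(io.StringIO(TEST_SAMPLE))

# Columns present in train.csv but not in test.csv
missing_in_test = set(train_rows[0]) - set(test_rows[0])
print(missing_in_test)
```

The same files can also be loaded with pandas.read_csv if pandas is available; the standard library is used here only to keep the sketch dependency-free.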
Data Fields
train.csv has the following fields; test.csv has the same fields except price:
- house_index: Unique identifier for each property.
- property_type: Type of the property (e.g., House, Flat).
- location: Specific location or neighborhood of the property.
- city: City in which the property is located.
- baths: Number of bathrooms.
- bedrooms: Number of bedrooms.
- Area_in_Marla: Total area of the property in marla.
- purpose: Indicates whether the property is for sale or rent.
- price: Target variable representing the property price.
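To illustrate how these fields might feed a first model, here is a rough baseline sketch: it estimates a median price per marla for each (city, purpose) group and multiplies it by the area of each test listing. This is not the challenge's reference method. The function names fit_price_per_marla and predict are hypothetical, and the rows below are toy values made up for illustration, not taken from train.csv. Grouping by purpose is a design choice made on the assumption that rental and sale prices sit on very different scales.

```python
from collections import defaultdict
from statistics import median

# Toy rows with made-up values, for illustration only (not real dataset rows).
train_rows = [
    {"city": "Islamabad", "purpose": "For Sale", "Area_in_Marla": 4.0, "price": 10_000_000},
    {"city": "Islamabad", "purpose": "For Sale", "Area_in_Marla": 8.0, "price": 24_000_000},
    {"city": "Karachi", "purpose": "For Sale", "Area_in_Marla": 5.0, "price": 10_000_000},
]
test_rows = [
    {"house_index": 1, "city": "Karachi", "purpose": "For Sale", "Area_in_Marla": 8.0},
    {"house_index": 2, "city": "Lahore", "purpose": "For Sale", "Area_in_Marla": 10.0},
]


def fit_price_per_marla(rows):
    """Return (median price per marla for each (city, purpose), overall median)."""
    per_group = defaultdict(list)
    overall = []
    for row in rows:
        if row["Area_in_Marla"] <= 0:
            continue  # skip rows where a per-marla rate is undefined
        rate = row["price"] / row["Area_in_Marla"]
        per_group[(row["city"], row["purpose"])].append(rate)
        overall.append(rate)
    group_rates = {key: median(rates) for key, rates in per_group.items()}
    return group_rates, median(overall)


def predict(rows, group_rates, fallback_rate):
    """Predict price as rate * area, using the overall rate for unseen groups."""
    predictions = {}
    for row in rows:
        rate = group_rates.get((row["city"], row["purpose"]), fallback_rate)
        predictions[row["house_index"]] = rate * row["Area_in_Marla"]
    return predictions


group_rates, fallback_rate = fit_price_per_marla(train_rows)
predictions = predict(test_rows, group_rates, fallback_rate)
print(predictions)
```

A baseline like this mainly gives a score to beat; property_type, location, baths, and bedrooms are ignored here and would be natural next features to add.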
Reference
For more information about the dataset, please visit the Gigasheet website.
License
This repository is licensed under the MIT License.