
Musical Instrument Classification
Musical instrument classification is the task of automatically recognizing and categorizing musical instruments. While the task is often performed on audio recordings or spectrograms, the model described here works from images, identifying the visual characteristics associated with each instrument to determine its class.
Summary
Introduction
Musical instrument classification is the task of categorizing musical instruments based on visual information extracted from images. This task involves developing models and algorithms that can automatically identify and classify different types of musical instruments solely based on their visual characteristics.
By analyzing the visual features, shapes, textures, and structural properties of musical instruments captured in images, the classification models aim to accurately recognize and differentiate between instruments such as piano, guitar, violin, drums, saxophone, and more. This task combines elements of computer vision, image processing, and machine learning to extract relevant visual features and train models capable of robust instrument classification.
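As a rough illustration of that pipeline, a classifier like this one could be fine-tuned and exported with fastai (listed in the requirements below). This is only a sketch: the dataset path, backbone, and export name are assumptions, not the authors' exact setup.

from fastai.vision.all import (
    ImageDataLoaders, Resize, error_rate, resnet18, vision_learner,
)

# Hypothetical dataset layout: one sub-folder of images per instrument label.
dls = ImageDataLoaders.from_folder(
    "instruments/",         # assumed dataset root, not from the original repo
    valid_pct=0.2,          # hold out 20% of the images for validation
    item_tfms=Resize(224),  # resize every image to 224x224
)

# Fine-tune an ImageNet-pretrained backbone on the instrument images.
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(3)

# Export the learner so it can later be restored with load_learner().
learn.export("export.pkl")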
The model predicts one of the following labels (spelled as in the source dataset): didgeridoo, tambourine, xylophone, acordian, alphorn, bagpipes, banjo, bongo drum, casaba, castanets, clarinet, clavichord, concertina, drums, dulcimer, flute, guiro, guitar, harmonica, harp, marakas, ocarina, piano, saxaphone, sitar, steel drum, trombone, trumpet, tuba, violin.
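For reference, the label tuple that the inference code below abbreviates can be written out from this list (the ordering shown is an assumption):

# Full label set, spelled exactly as in the list above (ordering assumed).
categories = (
    'didgeridoo', 'tambourine', 'xylophone', 'acordian', 'alphorn',
    'bagpipes', 'banjo', 'bongo drum', 'casaba', 'castanets',
    'clarinet', 'clavichord', 'concertina', 'drums', 'dulcimer',
    'flute', 'guiro', 'guitar', 'harmonica', 'harp',
    'marakas', 'ocarina', 'piano', 'saxaphone', 'sitar',
    'steel drum', 'trombone', 'trumpet', 'tuba', 'violin',
)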
Parameters
Inputs
- input - (image: .png|.jpg|.jpeg): An image of a musical instrument.
Output
- output - (text): The model's predicted probability for each label.
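For instance, the returned dictionary for a guitar image might look like the following (the probabilities are illustrative, and most entries are omitted):

output = {
    'acordian': 0.0012,
    'banjo': 0.0308,
    'guitar': 0.9265,
    # ... one entry per remaining label ...
}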
Examples
| input | output |
| --- | --- |
| ![]() | ![]() |
Usage for developers
The details below cover the requirements and the code used to run the model on our platform.
Requirements
torch
fastai
opencv-python
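These dependencies can be installed with, for example, `pip install torch fastai opencv-python`.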
Code based on AIOZ structure
from typing import Any, Literal, Union
from pathlib import Path
from fastai.vision.all import load_learner
import cv2, torch, os
...
def do_ai_task(
        context: Union[str, Path],
        answer: Union[str, Path],
        model_storage_directory: Union[str, Path],
        device: Literal["cpu", "cuda", "gpu"] = "cpu",
        *args, **kwargs) -> Any:
    # Locate the exported fastai learner inside the model storage directory.
    model_id = os.path.abspath(os.path.join(model_storage_directory, "..."))
    categories = ('didgeridoo', 'tambourine', ...)
    learn = load_learner(model_id)
    # Move the model to the GPU when one is available.
    if torch.cuda.is_available():
        learn.model.cuda()
    # OpenCV loads images as BGR; convert to RGB before prediction.
    image = cv2.cvtColor(cv2.imread(str(context)), cv2.COLOR_BGR2RGB)
    pred, idx, probs = learn.predict(image)
    # Pair each label with its predicted probability.
    output = dict(zip(categories, map(float, probs)))
    return output
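A minimal, hypothetical invocation might look like this (the paths are placeholders, not values from the original repository):

# Hypothetical call; the image and model paths are placeholders.
probabilities = do_ai_task(
    context="samples/guitar.jpg",       # input image (assumed path)
    answer="outputs/prediction.json",   # answer location (assumed path)
    model_storage_directory="models/",  # directory holding the exported learner
)
# Print the most likely instrument.
print(max(probabilities, key=probabilities.get))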
Reference
This repository is based on and inspired by Neeraj Handa's work. We sincerely appreciate their generosity in sharing the code.
License
We respect and comply with the terms of the author's license cited in the Reference section.