All Models

AIOZ AI
Decoupled-DMD-Distillation

Decoupled DMD Distillation

Decoupled DMD reveals that Distribution Matching Distillation's success in few-step diffusion models comes from two distinct mechanisms: CFG Augmentation as the core engine for multi-to-few-step conversion, and Distribution Matching as a regularizer for training stability. This new understanding enables principled improvements through decoupled noise schedules, achieving state-of-the-art performance in efficient image generation and powering the top-tier 8-step Z-Image model.

face_anti_spoofing_model

Face Anti-Spoofing Challenge

Model for Face Anti-Spoofing Challenge

34
118
Background_Removal

Background Removal

Background Removal is an image processing technique that separates the main object from the background of a photo. Removing the background highlights the product, subject, or character and gives the image a professional, aesthetically pleasing look.

313
174
SHARP-Monocular-View-Synthesis

SHARP Monocular View Synthesis

SHARP (Single-image High-Accuracy Real-time Parallax) is a photorealistic view synthesis model that generates high-quality 3D Gaussian representations from a single photograph in under one second. It supports real-time rendering of nearby views at over 100 FPS with metric camera movements, achieving state-of-the-art performance by reducing LPIPS by 25-34% and DISTS by 21-43% compared to prior models while being 1000× faster.

129
35
LTX-2-Audiovisual-Model

LTX-2 Audiovisual Model

LTX-2 is an open-source audiovisual foundation model that generates high-quality, temporally synchronized video and audio content from text prompts. It features an asymmetric dual-stream transformer architecture (14B video + 5B audio parameters) with bidirectional cross-attention, achieving state-of-the-art quality among open-source systems while being 18× faster than comparable models and supporting up to 20 seconds of continuous generation.

Miro-Thinker

Miro Thinker

MiroThinker is an open-source research agent that advances tool-augmented reasoning through model, context, and interactive scaling. It achieves state-of-the-art performance among open-source agents by enabling up to 600 tool calls per task within a 256K context window, demonstrating that interaction depth is a critical dimension for building next-generation research agents alongside model size and context length.

122
42
Image_Super_Resolution_SMFANet

Image Super-Resolution with SMFANet

Image Super-Resolution with SMFANet uses the SMFANet architecture to enhance the resolution and quality of images. SMFANet is a deep learning network designed for super-resolution: it generates high-quality, detailed images from low-resolution inputs.

280
163
Low_light_Image_Enhancement

Low-light Image Enhancement

Low-light Image Enhancement improves the quality and visibility of images captured in low-light conditions. It applies image processing techniques and algorithms to enhance details, reduce noise, and increase brightness in photos taken in dimly lit environments.
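As a rough illustration of the "increase brightness" step, here is a minimal sketch of gamma correction on grayscale pixel values. It is not the method used by the listed model, which is not described here; the function name `gamma_brighten` and the default `gamma` value are assumptions chosen for the example.

```python
def gamma_brighten(gray_pixels, gamma=0.5):
    """Brighten 8-bit grayscale values with gamma correction.

    Hypothetical helper for illustration only. A gamma below 1 lifts
    dark values more than bright ones; 0 and 255 stay fixed.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    # Normalize to [0, 1], apply the power curve, and scale back to [0, 255].
    return [round(255 * (p / 255) ** gamma) for p in gray_pixels]
```

Calling `gamma_brighten([0, 64, 255])` leaves the black and white endpoints unchanged and raises the dark mid value, which is the usual reason gamma curves are preferred over adding a constant offset.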

299
170
MediaPipe_Face_Detection

MediaPipe Face Detection

Face detection is a computer vision technique that identifies and locates human faces within an image or video. The goal is to detect the presence of faces and draw bounding boxes around them, without necessarily identifying specific facial features or landmarks.

269
162
MediaPipe_Face_Mesh_Ploting

MediaPipe Face Mesh Plotting

Face mesh detection, also known as facial landmark detection or face pose estimation, is the task of identifying and localizing specific keypoints or landmarks on a human face. It involves detecting the positions of facial features, such as eyes, eyebrows, nose, mouth, and jawline, in an image or video.

293
164
image_blend_multiple_method

Image Blending with Multiple Methods

Image Blending with Multiple Methods is a task that involves combining two or more images seamlessly to create a composite image using a variety of blending techniques. By leveraging multiple blending methods, such as alpha blending, gradient blending, or Laplacian pyramid blending, this task enables the merging of images while preserving the visual coherence and integrity of the final composition.
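Of the techniques named above, alpha blending is the simplest to show. The following is a minimal sketch, assuming both images are same-sized nested lists of `(r, g, b)` tuples; the function name `alpha_blend` and this image representation are assumptions for illustration, not the listed model's interface.

```python
def alpha_blend(foreground, background, alpha):
    """Blend two same-sized RGB images: alpha * fg + (1 - alpha) * bg.

    Hypothetical helper for illustration only. Images are lists of rows,
    each row a list of (r, g, b) tuples with 8-bit channel values.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if len(foreground) != len(background):
        raise ValueError("images must have the same height")
    blended = []
    for fg_row, bg_row in zip(foreground, background):
        if len(fg_row) != len(bg_row):
            raise ValueError("images must have the same width")
        # Mix each channel of each pixel pair with the same weight.
        blended.append([
            tuple(round(alpha * f + (1 - alpha) * b) for f, b in zip(fg_px, bg_px))
            for fg_px, bg_px in zip(fg_row, bg_row)
        ])
    return blended
```

With `alpha=1.0` the result is the foreground and with `alpha=0.0` it is the background; gradient and Laplacian pyramid blending replace this single global weight with spatially varying ones.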

300
164
Video_To_Canny_Edge

Video to Canny Edge

Video to Canny Edge converts a video into a Canny edge representation, in which the edges in the video are emphasized and separated. The Canny edge detector is a widely used image processing algorithm for detecting edges in images and videos.

298
165
XFeat

XFeat: Accelerated Features for Lightweight Image Matching

XFeat focuses on making image matching more efficient by using lightweight yet highly discriminative features. It employs optimized feature extraction techniques to identify keypoints and patterns in images, making it suitable for fast and accurate image comparison.

198
166
DehazeFormer

DehazeFormer

DehazeFormer is a deep learning model designed for single image haze removal. Leveraging a transformer-based architecture, it effectively restores image clarity and contrast by removing haze, making it suitable for applications in autonomous driving, remote sensing, and image enhancement.

192
158
Color_Harmonization

Color Harmonization

Color Harmonization is a computational model designed to adjust and enhance the color balance of an image based on harmony templates, as proposed in the paper by Daniel Cohen-Or et al. The model improves visual aesthetics by aligning image colors with established color harmony principles. It supports multiple harmony templates and can be integrated with user interfaces for visual quality assessment using metrics from interfacemetrics.aalto.fi.

184
159
Background_Replacement

Background Replacement

Background Replacement is a tool that lets users easily change the background of their images, opening up possibilities for creative transformations and visual enhancements.

291
163
Image_Super_Resolution_SeemoRe

Image Super-Resolution with SeemoRe

Image Super-Resolution with SeemoRe upscales low-resolution images with the SeemoRe model. SeemoRe identifies and draws on specialized "expert" components, each contributing specialized information, to improve the efficiency and accuracy of image upscaling.

279
169
Color_Extraction

Color Extraction

Color Extraction is a computer vision task that extracts and analyzes colors from images or videos. Its objective is to identify and isolate specific colors or color ranges present in the visual data.
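One simple way to identify the main color ranges in an image is to quantize each pixel into a coarse bucket and count the buckets. The sketch below assumes pixels arrive as a flat list of `(r, g, b)` tuples; the function name `dominant_colors` and the `bucket` size are assumptions for illustration, and the listed model may work differently.

```python
from collections import Counter


def dominant_colors(pixels, top_n=3, bucket=32):
    """Return the top_n most frequent quantized colors.

    Hypothetical helper for illustration only. Each 8-bit channel is
    rounded down to a multiple of `bucket`, so nearby shades fall into
    the same color range before counting.
    """
    if bucket <= 0:
        raise ValueError("bucket must be positive")
    counts = Counter(
        tuple((channel // bucket) * bucket for channel in px) for px in pixels
    )
    # most_common sorts by frequency, highest first.
    return [color for color, _ in counts.most_common(top_n)]
```

A smaller `bucket` distinguishes more shades at the cost of splitting one visual color across several entries; clustering methods such as k-means are a common alternative when fixed buckets are too coarse.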

349
171
Text_Generation_by_LiteLlama

Text Generation by LiteLlama

We present an open-source reproduction of Meta AI's LLaMa 2 with a significantly reduced model size: LiteLlama-460M-1T has 460M parameters and was trained on 1T tokens.

119
140
Archer_Image_Generator

Archer Image Generator

Archer Image Generator uses Archer Diffusion, a specialized image generation model of type Safetensors / Checkpoint created by the AI community user civitai. Derived from Stable Diffusion (SD 1.5), Archer Diffusion has been fine-tuned on a dataset of images generated by other AI models or contributed by users. This fine-tuning is intended to make the generated images relevant to the specific use cases the model was designed for, such as landscapes, nitrosocke, and archer.

118
151