Research Objective
To develop a general method for detecting local artifacts in videos by learning the differences between distorted and pristine video frames, producing a full-resolution map of artifact detection probabilities.
Research Findings
The VID-MAP model significantly outperforms previous state-of-the-art methods in detecting and localizing video artifacts. Future work includes applying the model to other video artifact detection problems and incorporating temporal information to enhance detection capabilities.
Research Limitations
The model does not currently exploit temporal information, which could enrich the dimensionality and diversity of the data available to VID-MAP. Additionally, training on images containing multiple artifact locations could improve generality.
1:Experimental Design and Method Selection:
The VID-MAP model uses a CNN architecture to learn to detect and localize video artifacts. It processes images through a local band-pass filtering operation followed by a local non-linear divisive normalization.
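A minimal sketch of this preprocessing stage, assuming a difference-of-Gaussians band-pass filter and a Gaussian-weighted local energy estimate for the divisive normalization; the filter scales (`sigma1`, `sigma2`, `norm_sigma`) and the stabilizing constant `eps` are illustrative choices, not parameters from the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def preprocess_frame(frame, sigma1=1.0, sigma2=2.0, norm_sigma=3.0, eps=1e-3):
    """Band-pass filter a grayscale frame, then apply a local non-linear
    divisive normalization. All scale parameters here are assumptions
    made for illustration, not values reported in the paper.
    """
    frame = frame.astype(np.float64)
    # Local band-pass via difference of Gaussians.
    bandpass = gaussian_filter(frame, sigma1) - gaussian_filter(frame, sigma2)
    # Divisive normalization: divide by a local energy estimate.
    local_energy = np.sqrt(gaussian_filter(bandpass ** 2, norm_sigma))
    return bandpass / (local_energy + eps)
```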
2:Sample Selection and Data Sources:
Pristine patches from the Netflix video collection were used, divided into non-overlapping training and testing sets. Positive samples of upscaling were produced by downscaling pristine video frames and then upscaling them back to their original resolution.
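A hedged illustration of this sample-generation step; the scale factor and interpolation method below are assumptions, and the paper presumably varies both to cover realistic upscaling artifacts:

```python
import cv2

def make_upscaling_sample(pristine, factor=2, interp=cv2.INTER_LINEAR):
    """Create a positive (distorted) sample by downscaling a pristine
    frame and scaling it back to the original resolution. The factor
    and interpolation method are illustrative, not from the paper.
    """
    h, w = pristine.shape[:2]
    small = cv2.resize(pristine, (w // factor, h // factor), interpolation=interp)
    return cv2.resize(small, (w, h), interpolation=interp)
```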
3:List of Experimental Equipment and Materials:
A Tesla K40 GPGPU was used for developing the convolutional network.
4:Experimental Procedures and Operational Workflow:
The model was trained using a batch size of 100, ensuring class balance at each iteration.
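A minimal sketch of such class-balanced batch construction, assuming sampling with replacement from precomputed arrays of positive and negative patches; the helper and its names are hypothetical, not from the paper:

```python
import numpy as np

def balanced_batch(pos_patches, neg_patches, batch_size=100, rng=None):
    """Draw a batch with equal numbers of positive (distorted) and
    negative (pristine) patches, enforcing class balance per iteration.
    Sampling with replacement is an assumption made for simplicity.
    """
    rng = rng or np.random.default_rng()
    half = batch_size // 2
    pos_idx = rng.integers(0, len(pos_patches), size=half)
    neg_idx = rng.integers(0, len(neg_patches), size=half)
    patches = np.concatenate([pos_patches[pos_idx], neg_patches[neg_idx]])
    labels = np.concatenate([np.ones(half), np.zeros(half)])
    order = rng.permutation(batch_size)
    return patches[order], labels[order]
```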
5:Data Analysis Methods:
Performance was evaluated using the F1 score, which is the harmonic mean of precision and recall.
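For reference, the F1 score can be computed from binary detection labels as follows; this is the standard definition, not code from the paper:

```python
import numpy as np

def f1_score(y_true, y_pred):
    """F1 score, the harmonic mean of precision and recall, for binary
    artifact-detection labels (1 = artifact present)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)    # true positives
    fp = np.sum(~y_true & y_pred)   # false positives
    fn = np.sum(y_true & ~y_pred)   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```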