研究目的
Improving the classification accuracy on confused categories in image datasets by building a Confusion Visual Tree (CVT) and proposing Visual Tree Convolutional Neural Networks (VT-CNN).
研究成果
The VT-CNN model, which combines a visual tree structure with original CNN models, demonstrates improved performance in image classification tasks, particularly for confused categories. The experiments confirm the benefits of the VT-CNN model over original CNN models, with significant improvements in accuracy on the CIFAR-10 and CIFAR-100 datasets.
研究不足
The study focuses on improving classification accuracy for confused categories but does not address the potential increase in computational complexity due to the additional tree structure. The performance of the VT-CNN model crucially depends on the accuracy of the CVT construction.