How to deal with low mAP and confidence when an SSD neural network is used on a small-object detection dataset

Time: 2023-02-18 16:32:03

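Preface

SSD has a clean network structure and handles multi-scale object detection reasonably well, but it does not perform well on small objects. There are many modified versions of SSD, such as FSSD and DSSD, that improve its performance on small-object detection, but here we only discuss how to use SSD itself to detect small objects better, especially objects whose features are very simple.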

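Inspiration from YOLO

YOLOv3 determines the sizes of its prior boxes by clustering, whereas the original SSD determines them with a formula, and even its smallest prior box has a size of 30 pixels. If our objects are very small, say only a dozen or so pixels, an SSD model trained with those prior boxes will most likely perform poorly. So we can cluster the prior boxes on our own dataset. The clustering code is given below: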

# coding:utf-8
from pathlib import Path
from xml.etree import ElementTree as ET

import numpy as np



def iou(box: np.ndarray, boxes: np.ndarray):
    """ Compute the IoU between one bounding box and a set of other bounding boxes

    Parameters
    ----------
    box: `~np.ndarray` of shape `(4, )`
        bounding box
    boxes: `~np.ndarray` of shape `(n, 4)`
        other bounding boxes

    Returns
    -------
    iou: `~np.ndarray` of shape `(n, )`
        intersection over union
    """
    # intersection
    xy_max = np.minimum(boxes[:, 2:], box[2:])
    xy_min = np.maximum(boxes[:, :2], box[:2])
    inter = np.clip(xy_max-xy_min, a_min=0, a_max=np.inf)
    inter = inter[:, 0]*inter[:, 1]

    # areas of the boxes, needed for the union
    area_boxes = (boxes[:, 2]-boxes[:, 0])*(boxes[:, 3]-boxes[:, 1])
    area_box = (box[2]-box[0])*(box[3]-box[1])

    # IoU = intersection / union
    iou = inter/(area_box+area_boxes-inter)  # type: np.ndarray
    return iou


class AnchorKmeans:
    """ Prior box clustering """

    def __init__(self, annotation_dir: str):
        self.annotation_dir = Path(annotation_dir)
        if not self.annotation_dir.exists():
            raise ValueError(f'Annotation directory `{annotation_dir}` does not exist')

        self.bbox = self.get_bbox()

    def get_bbox(self) -> np.ndarray:
        """ Collect all bounding boxes """
        bbox = []

        for path in self.annotation_dir.glob('*.xml'):
            root = ET.parse(path).getroot()

            # image width and height
            w = int(root.find('size/width').text)
            h = int(root.find('size/height').text)

            # iterate over all bounding boxes
            for obj in root.iter('object'):
                box = obj.find('bndbox')

                # normalized coordinates (float() also accepts non-integer values)
                xmin = float(box.find('xmin').text)/w
                ymin = float(box.find('ymin').text)/h
                xmax = float(box.find('xmax').text)/w
                ymax = float(box.find('ymax').text)/h

                # only width and height matter here, so every box is moved to the origin
                bbox.append([0, 0, xmax-xmin, ymax-ymin])

        return np.array(bbox)

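    def get_cluster(self, n_clusters=9, metric=np.median):
        """ Run the clustering

        Parameters
        ----------
        n_clusters: int
            number of clusters

        metric: callable
            how the center of each cluster is chosen

        Returns
        -------
        clusters: `~np.ndarray` of shape `(n_clusters, 2)`
            normalized width and height of each cluster center
        """
        rows = self.bbox.shape[0]

        if rows < n_clusters:
            raise ValueError("n_clusters must not exceed the number of bounding boxes")

        # -1 never matches a valid cluster index, so the first iteration always runs
        last_clusters = np.full(rows, -1)

        # randomly pick some boxes as the initial cluster centers
        np.random.seed(1)
        clusters = self.bbox[np.random.choice(rows, n_clusters, replace=False)]

        # start clustering
        while True:
            # distance = 1 - IoU
            distances = 1-self.iou(clusters)

            # assign each bounding box to its nearest cluster
            nearest_clusters = distances.argmin(axis=1)

            # stop once the assignments no longer change
            if np.array_equal(nearest_clusters, last_clusters):
                break

            # recompute the cluster centers; an empty cluster keeps its old center
            for i in range(n_clusters):
                members = self.bbox[nearest_clusters == i]
                if len(members) > 0:
                    clusters[i] = metric(members, axis=0)

            last_clusters = nearest_clusters

        return clusters[:, 2:]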

    def average_iou(self, clusters: np.ndarray):
        """ Compute the mean IoU between each bounding box and its best-matching cluster center

        Parameters
        ----------
        clusters: `~np.ndarray` of shape `(n_clusters, 2)`
            cluster centers
        """
        clusters = np.hstack((np.zeros((clusters.shape[0], 2)), clusters))
        return np.mean([np.max(iou(bbox, clusters)) for bbox in self.bbox])

    def iou(self, clusters: np.ndarray):
        """ Compute the IoU between every bounding box and every cluster center

        Parameters
        ----------
        clusters: `~np.ndarray` of shape `(n_clusters, 4)`
            cluster centers

        Returns
        -------
        iou: `~np.ndarray` of shape `(n_bbox, n_clusters)`
            intersection over union
        """
        bbox = self.bbox
        A = self.bbox.shape[0]
        B = clusters.shape[0]

        xy_max = np.minimum(bbox[:, np.newaxis, 2:].repeat(B, axis=1),
                            np.broadcast_to(clusters[:, 2:], (A, B, 2)))
        xy_min = np.maximum(bbox[:, np.newaxis, :2].repeat(B, axis=1),
                            np.broadcast_to(clusters[:, :2], (A, B, 2)))

        # intersection area
        inter = np.clip(xy_max-xy_min, a_min=0, a_max=np.inf)
        inter = inter[:, :, 0]*inter[:, :, 1]

        # area of each rectangle
        area_bbox = ((bbox[:, 2]-bbox[:, 0])*(bbox[:, 3] -
                     bbox[:, 1]))[:, np.newaxis].repeat(B, axis=1)
        area_clusters = ((clusters[:, 2] - clusters[:, 0])*(
            clusters[:, 3] - clusters[:, 1]))[np.newaxis, :].repeat(A, axis=0)

        return inter/(area_bbox+area_clusters-inter)


if __name__ == '__main__':
    # annotation directory
    root = 'data/Hotspot/Annotations'
    model = AnchorKmeans(root)
    clusters = model.get_cluster(9)

    # scale the normalized prior boxes back to pixel sizes
    print('Clustering result:\n', clusters*300)
    print('Average IoU:', model.average_iou(clusters))
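
The script above asks for 9 clusters. One way to check whether that number suits a particular dataset is to run the clustering for several values of k and look at how the average IoU changes. The following is only a minimal sketch under stated assumptions: `wh_iou`, `kmeans_wh` and `sweep_k` are hypothetical helper names that are not part of the code above, the boxes are reduced to (width, height) pairs, and the data is synthetic rather than read from annotation files.

```python
import numpy as np


def wh_iou(wh: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """IoU between boxes and centers that all share the same top-left corner.

    wh: (n, 2) widths and heights; centers: (k, 2). Returns an (n, k) array.
    """
    inter = np.minimum(wh[:, None, :], centers[None, :, :]).prod(axis=2)
    union = wh.prod(axis=1)[:, None] + centers.prod(axis=1)[None, :] - inter
    return inter / union


def kmeans_wh(wh: np.ndarray, k: int, seed: int = 1, max_iter: int = 100) -> np.ndarray:
    """Cluster (w, h) pairs with distance 1 - IoU, using the median as center."""
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), k, replace=False)].copy()
    last = np.full(len(wh), -1)
    for _ in range(max_iter):
        # the largest IoU corresponds to the smallest 1 - IoU distance
        nearest = wh_iou(wh, centers).argmax(axis=1)
        if np.array_equal(nearest, last):
            break
        for i in range(k):
            members = wh[nearest == i]
            if len(members):  # an empty cluster keeps its old center
                centers[i] = np.median(members, axis=0)
        last = nearest
    return centers


def sweep_k(wh: np.ndarray, ks=range(1, 10)) -> dict:
    """Return {k: mean best IoU} so the trade-off can be inspected."""
    return {k: float(wh_iou(wh, kmeans_wh(wh, k)).max(axis=1).mean()) for k in ks}


if __name__ == '__main__':
    # Synthetic normalized (w, h) pairs standing in for a real dataset.
    rng = np.random.default_rng(0)
    wh = rng.uniform(0.02, 0.2, size=(500, 2))
    for k, avg in sweep_k(wh).items():
        print(f'k={k}: average IoU={avg:.3f}')
```

If the average IoU barely improves beyond some k, a smaller number of prior boxes may be enough. The right trade-off depends on the dataset and on how many prior boxes the SSD implementation in use can accommodate.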

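Update the prior box sizes in your code according to the clustering result. Barring surprises, this should improve both mAP and confidence. That's all.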
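
To make that comparison concrete, the sketch below puts clustered sizes next to the default ones. It rests on assumptions: `ssd300_default_min_sizes` reproduces what is, as far as we know, the commonly used SSD300 configuration (a fixed 10% scale for the first feature map, then evenly spaced ratios from 20% to 90%), `clusters_to_pixels` is a hypothetical helper, and the cluster values are made up for illustration. How the resulting sizes are wired into the model depends on the configuration format of the SSD implementation in use.

```python
import numpy as np


def ssd300_default_min_sizes(image_size=300, min_ratio=20, max_ratio=90, n_maps=6):
    """Default min sizes in pixels (assumed to match the common SSD300 setup).

    The first feature map gets half of min_ratio; the remaining maps are
    spaced evenly between min_ratio and max_ratio (percent of the image size).
    """
    step = (max_ratio - min_ratio) // (n_maps - 2)
    ratios = [min_ratio // 2] + list(range(min_ratio, max_ratio + 1, step))
    return [image_size * r / 100 for r in ratios]


def clusters_to_pixels(clusters: np.ndarray, image_size: int = 300) -> np.ndarray:
    """Scale normalized (w, h) cluster centers to pixels, sorted by area."""
    pixels = np.asarray(clusters, dtype=float) * image_size
    order = np.argsort(pixels[:, 0] * pixels[:, 1])
    return pixels[order]


if __name__ == '__main__':
    # Hypothetical clustering output (normalized widths and heights).
    clusters = np.array([[0.05, 0.04], [0.03, 0.03], [0.12, 0.10]])
    anchors = clusters_to_pixels(clusters)
    defaults = ssd300_default_min_sizes()

    print('default min sizes:', defaults)
    print('clustered anchors (px):\n', anchors)

    # Count clustered anchors whose longer side is below the smallest default.
    n_small = int((anchors.max(axis=1) < defaults[0]).sum())
    print(f'{n_small} clustered anchors are smaller than {defaults[0]:g} px')
```

Sorting by area makes it easier to hand the smallest boxes to the highest-resolution feature map and the largest boxes to the coarsest one, which is the usual division of labor in SSD.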
