A Step-by-Step Guide to Building a Semi-Automatic Annotation Tool with Python and OpenCV (Detailed Steps + Source Code)

2023-09-27

Overview

    This article walks through building a semi-automatic annotation tool with Python and OpenCV, with detailed steps and source code.




pip install pyOpenAnnotate















import cv2
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['image.cmap'] = 'gray'


stags = cv2.imread('stags.jpg')
boars = cv2.imread('boar.jpg')
berries = cv2.imread('strawberries.jpg')
fishes = cv2.imread('fishes.jpg')
coins = cv2.imread('coins.png')
boxes = cv2.imread('boxes2.jpg')

    Select a color space (grayscale plus the individual RGB and HSV channels are stored in a dictionary, so each one can be tried out easily):

def select_colorsp(img, colorsp='gray'):
    # Convert to grayscale.
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Split BGR. cv2.imread returns channels in blue, green, red order.
    blue, green, red = cv2.split(img)
    # Convert to HSV.
    im_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    # Split HSV.
    hue, sat, val = cv2.split(im_hsv)
    # Store channels in a dict.
    channels = {'gray': gray, 'red': red, 'green': green,
                'blue': blue, 'hue': hue, 'sat': sat, 'val': val}

    return channels[colorsp]

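    A utility function for showing a 1×2 image grid (display() takes two images and plots them side by side; the optional arguments are the plot titles and the figure size):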

def display(im_left, im_right, name_l='Left', name_r='Right', figsize=(10, 7)):
    # Flip channels for display if the image is BGR, as matplotlib expects RGB.
    im_l_dis = im_left[..., ::-1] if len(im_left.shape) > 2 else im_left
    im_r_dis = im_right[..., ::-1] if len(im_right.shape) > 2 else im_right

    plt.figure(figsize=figsize)
    plt.subplot(121)
    plt.imshow(im_l_dis)
    plt.title(name_l)
    plt.axis('off')
    plt.subplot(122)
    plt.imshow(im_r_dis)
    plt.title(name_r)
    plt.axis('off')

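    Thresholding (threshold() takes a single-channel grayscale image, with the threshold defaulting to 127. Inverse thresholding is applied by default, which makes the later contour analysis easier. The function returns a single-channel thresholded image):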

def threshold(img, thresh=127, mode='inverse'):
    im = img.copy()

    if mode == 'direct':
        thresh_mode = cv2.THRESH_BINARY
    else:
        thresh_mode = cv2.THRESH_BINARY_INV

    # Use a separate name for the result so the `thresh` argument is not shadowed.
    ret, thresh_img = cv2.threshold(im, thresh, 255, thresh_mode)

    return thresh_img
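
    To make the two modes concrete, the sketch below mimics on a plain Python list what cv2.THRESH_BINARY and cv2.THRESH_BINARY_INV do per pixel: a pixel strictly above the threshold maps to the maximum value in direct mode and to 0 in inverse mode. The helper name binary_threshold is made up for this illustration and is not part of OpenCV.

```python
def binary_threshold(pixels, thresh=127, max_val=255, mode='inverse'):
    """Per-pixel rule of binary thresholding on a flat list of intensities.

    Hypothetical helper for illustration; cv2.threshold applies the same
    rule to whole arrays.
    """
    out = []
    for p in pixels:
        above = p > thresh  # strictly greater than the threshold
        if mode == 'direct':
            out.append(max_val if above else 0)
        else:
            out.append(0 if above else max_val)
    return out


# A made-up row of pixel intensities.
row = [10, 110, 127, 128, 240]
direct = binary_threshold(row, mode='direct')
inverse = binary_threshold(row, mode='inverse')
print(direct)
print(inverse)
```

    In inverse mode, pixels at or below the threshold come out white (255). cv2.findContours treats non-zero pixels as foreground, so this mode suits dark objects on a brighter background.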





# Select colorspace.
gray_stags = select_colorsp(stags)
# Perform thresholding.
thresh_stags = threshold(gray_stags, thresh=110)

# Display.
display(stags, thresh_stags,
        name_l='Stags original infrared',
        name_r='Thresholded Stags',
        figsize=(20, 14))



def morph_op(img, mode='open', ksize=5, iterations=1):
    im = img.copy()
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (ksize, ksize))

    # Pass `iterations` through so the argument actually takes effect.
    if mode == 'open':
        morphed = cv2.morphologyEx(im, cv2.MORPH_OPEN, kernel, iterations=iterations)
    elif mode == 'close':
        morphed = cv2.morphologyEx(im, cv2.MORPH_CLOSE, kernel, iterations=iterations)
    elif mode == 'erode':
        morphed = cv2.erode(im, kernel, iterations=iterations)
    else:
        morphed = cv2.dilate(im, kernel, iterations=iterations)

    return morphed

# Perform morphological operation.
morphed_stags = morph_op(thresh_stags)

# Display.
display(thresh_stags, morphed_stags,
        name_l='Thresholded Stags',
        name_r='Morphological Operations Result',
        figsize=(20, 14))


bboxes = get_bboxes(morphed_stags)
ann_morphed_stags = draw_annotations(stags, bboxes, thickness=5, color=(0, 0, 255))

# Display.
display(ann_stags, ann_morphed_stags,
        name_l='Annotating Thresholded Stags',
        name_r='Annotating Morphed Stags',
        figsize=(20, 14))


def get_filtered_bboxes(img, min_area_ratio=0.001):
    contours, hierarchy = cv2.findContours(img, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    # Sort the contours according to area, larger to smaller.
    sorted_cnt = sorted(contours, key=cv2.contourArea, reverse=True)
    # Remove max area, outermost contour (slicing is also safe on an empty list).
    sorted_cnt = sorted_cnt[1:]
    # Container to store filtered bboxes.
    bboxes = []
    # Image area.
    im_area = img.shape[0] * img.shape[1]
    for cnt in sorted_cnt:
        x, y, w, h = cv2.boundingRect(cnt)
        # Area of the bounding rectangle, not of the contour itself.
        cnt_area = w * h
        # Remove very small detections.
        if cnt_area > min_area_ratio * im_area:
            bboxes.append((x, y, x + w, y + h))
    return bboxes
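
    Because the min_area_ratio cutoff is relative to the image area, the same value carries over across resolutions. The pure-Python sketch below isolates just that filtering step on made-up (x, y, w, h) rectangles and a hypothetical 1000×800 image; filter_by_area is an illustrative name, not a function from the tool.

```python
def filter_by_area(rects, im_width, im_height, min_area_ratio=0.001):
    """Keep (x, y, w, h) rectangles whose area exceeds a fraction of the image.

    Returns boxes as (x1, y1, x2, y2), the same layout get_filtered_bboxes uses.
    """
    min_area = min_area_ratio * im_width * im_height
    kept = []
    for x, y, w, h in rects:
        if w * h > min_area:
            kept.append((x, y, x + w, y + h))
    return kept


# Hypothetical detections on a 1000x800 image.
# The cutoff is 0.001 * 800000 = 800 square pixels.
rects = [(10, 10, 20, 20), (100, 50, 40, 30), (300, 200, 200, 150)]
kept = filter_by_area(rects, im_width=1000, im_height=800)
print(kept)
```

    The 20×20 rectangle (400 square pixels) falls below the cutoff and is dropped, while the other two are kept.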


bboxes = get_filtered_bboxes(thresh_stags, min_area_ratio=0.001)
filtered_ann_stags = draw_annotations(stags, bboxes, thickness=5, color=(0, 0, 255))

# Display.
display(ann_stags, filtered_ann_stags,
        name_l='Annotating Thresholded Stags',
        name_r='Annotation After Filtering Smaller Boxes',
        figsize=(20, 14))



    Pascal VOC, YOLO, and COCO are three popular annotation formats used in object detection. Let's look at how they are structured.

    I. Pascal VOC stores annotations in XML format.
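
    As a rough illustration of that XML layout, the sketch below builds a minimal Pascal VOC style annotation with Python's standard xml.etree.ElementTree. The element names follow the commonly used VOC layout (annotation, filename, size, object, bndbox). The image size, class label, and the helper name voc_annotation are placeholders chosen for this example.

```python
import xml.etree.ElementTree as ET


def voc_annotation(filename, width, height, boxes, label='object', depth=3):
    """Build a minimal Pascal VOC style XML tree (illustrative helper).

    boxes: iterable of (x1, y1, x2, y2) pixel coordinates.
    """
    root = ET.Element('annotation')
    ET.SubElement(root, 'filename').text = filename

    size = ET.SubElement(root, 'size')
    ET.SubElement(size, 'width').text = str(width)
    ET.SubElement(size, 'height').text = str(height)
    ET.SubElement(size, 'depth').text = str(depth)

    for x1, y1, x2, y2 in boxes:
        obj = ET.SubElement(root, 'object')
        ET.SubElement(obj, 'name').text = label
        bndbox = ET.SubElement(obj, 'bndbox')
        ET.SubElement(bndbox, 'xmin').text = str(x1)
        ET.SubElement(bndbox, 'ymin').text = str(y1)
        ET.SubElement(bndbox, 'xmax').text = str(x2)
        ET.SubElement(bndbox, 'ymax').text = str(y2)
    return root


# Placeholder values: the 640x480 size and the 'stag' label are assumptions.
root = voc_annotation('stags.jpg', 640, 480, [(100, 50, 140, 80)], label='stag')
xml_text = ET.tostring(root, encoding='unicode')
print(xml_text)
```

    Unlike YOLO, the box corners here are stored as absolute pixel coordinates, one object element per bounding box.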

    II. YOLO stores annotations in a text file, with one line per bounding box in the form shown below. The values are normalized by the image width and height.

0 0.0123 0.2345 0.123 0.754
<object-class> <x_centre_norm> <y_centre_norm> <box_width_norm> <box_height_norm>

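    Let the top-left and bottom-right corners of a bounding box be (x1, y1) and (x2, y2), and let W and H be the image width and height. Then:

    x_centre_norm   = (x1 + x2) / (2 * W)
    y_centre_norm   = (y1 + y2) / (2 * H)
    box_width_norm  = (x2 - x1) / W
    box_height_norm = (y2 - y1) / H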
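
    In code, the centre is the midpoint of the two corners, and each value is divided by the matching image dimension. A minimal sketch in plain Python follows; the function name to_yolo, the sample box, and the 400×200 image size are assumptions made for illustration.

```python
def to_yolo(box, img_width, img_height, class_id=0):
    """Convert (x1, y1, x2, y2) pixel corners to a YOLO annotation tuple."""
    x1, y1, x2, y2 = box
    # Order the corners so that (x1, y1) is the top-left.
    x1, x2 = min(x1, x2), max(x1, x2)
    y1, y2 = min(y1, y2), max(y1, y2)

    # Centre of the box in pixels.
    x_centre = (x1 + x2) / 2
    y_centre = (y1 + y2) / 2
    return (class_id,
            x_centre / img_width,
            y_centre / img_height,
            (x2 - x1) / img_width,
            (y2 - y1) / img_height)


# Hypothetical box on a 400x200 image.
ann = to_yolo((100, 50, 300, 150), img_width=400, img_height=200)
line = ' '.join(str(v) for v in ann)
print(line)
```

    Every value after the class id lies between 0 and 1 as long as the box is inside the image, which makes for a quick sanity check on saved annotations.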


    Here the YOLO Darknet format is used as the example (you can of course save in other formats):

def save_annotations(img, bboxes):
    img_height = img.shape[0]
    img_width = img.shape[1]
    with open('image.txt', 'w') as f:
        # Iterate over the `bboxes` argument.
        for box in bboxes:
            x1, y1 = box[0], box[1]
            x2, y2 = box[2], box[3]

            # Make sure (x1, y1) is the top-left corner.
            if x1 > x2:
                x1, x2 = x2, x1
            if y1 > y2:
                y1, y2 = y2, y1

            width = x2 - x1
            height = y2 - y1
            # Box centre in pixels: top-left corner plus half the box size.
            x_centre, y_centre = x1 + width / 2, y1 + height / 2

            norm_xc = x_centre / img_width
            norm_yc = y_centre / img_height
            norm_width = width / img_width
            norm_height = height / img_height

            # Class id 0, followed by the four normalized values.
            yolo_annotations = ['0', ' ' + str(norm_xc),
                                ' ' + str(norm_yc),
                                ' ' + str(norm_width),
                                ' ' + str(norm_height), '\n']

            f.writelines(yolo_annotations)

