What does "cut it out" mean: Cutout examples


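1. Comparison of several data augmentation methods

Mixup: two randomly chosen samples are blended in a given ratio, and the classification label is split in the same ratio.

Cutout: a random region of the sample is cut out and filled with zero pixel values; the classification label is unchanged.

CutMix: a region is cut out, but instead of being filled with zeros it is filled with the pixels of the same region from another randomly chosen training sample; the classification label is split between the two samples in a corresponding ratio.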
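The three operations can be sketched on toy arrays. This is a minimal NumPy illustration rather than code from any of the original papers; the function names `mixup`, `cutout` and `cutmix`, the fixed box, and the toy images are all assumptions made for the example.

```python
import numpy as np

def mixup(img_a, img_b, label_a, label_b, lam):
    """Blend two images pixel-wise; labels are blended with the same ratio."""
    img = lam * img_a + (1.0 - lam) * img_b
    label = lam * label_a + (1.0 - lam) * label_b
    return img, label

def cutout(img, label, box):
    """Zero-fill a rectangular region; the label is unchanged."""
    y1, y2, x1, x2 = box
    out = img.copy()
    out[y1:y2, x1:x2] = 0.0
    return out, label

def cutmix(img_a, img_b, label_a, label_b, box):
    """Paste a region of img_b into img_a; labels are mixed by area ratio."""
    y1, y2, x1, x2 = box
    out = img_a.copy()
    out[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]
    h, w = img_a.shape[:2]
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    return out, lam * label_a + (1.0 - lam) * label_b

# Toy data: an all-ones image and an all-twos image (4x4, single channel).
img_a = np.ones((4, 4), dtype=np.float32)
img_b = np.full((4, 4), 2.0, dtype=np.float32)
label_a = np.array([1.0, 0.0])  # one-hot: class 0
label_b = np.array([0.0, 1.0])  # one-hot: class 1
box = (0, 2, 0, 2)              # top-left 2x2 patch as (y1, y2, x1, x2)

mix_img, mix_label = mixup(img_a, img_b, label_a, label_b, lam=0.75)
co_img, co_label = cutout(img_a, label_a, box)
cm_img, cm_label = cutmix(img_a, img_b, label_a, label_b, box)
```

In this toy setup the 2x2 patch covers a quarter of the image, so CutMix ends up with the same 0.75 / 0.25 label split that Mixup gets from `lam=0.75`, while the pixels are combined very differently.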


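Differences

The three augmentations differ as follows. Cutout and CutMix differ only in what fills the cut region. Mixup and CutMix differ in how two samples are combined: Mixup interpolates the two images pixel by pixel in a given ratio, whereas CutMix cuts out a region and patches it with the other image, which avoids the unnatural look of a blended image.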

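Advantages of CutMix

(1) No uninformative pixels appear during training, which improves training efficiency;

(2) It keeps the advantage of regional dropout: the model is pushed to attend to the non-discriminative parts of the object;

(3) Because the model has to recognize objects from partial views, and the cut region carries information from another sample, the model's localization ability is further strengthened;

(4) The unnatural look of blended images is avoided, which improves classification performance;

(5) Training and inference costs are unchanged.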

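2. What does model learn with CutMix?

The authors answer this with heatmaps. CutMix lets the model recognize two objects from partial views within a single image, which improves training efficiency. The heatmaps show that Cutout makes the model focus on the less discriminative region of the object (the belly), but part of the image carries no information at all, which hurts training efficiency. Mixup uses every pixel, but it introduces very unnatural artifacts.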


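3. A look at the CutMix code

import numpy as np

def rand_bbox(size, lam):
    """Input: the size of the sample batch and the randomly drawn lambda value."""
    # size is the batch shape (N, C, dim2, dim3); W and H simply name the last two axes
    W = size[2]
    H = size[3]
    # 1. Equation 2 in the paper: compute r_w and r_h of box B
    cut_rat = np.sqrt(1. - lam)
    cut_w = int(W * cut_rat)  # np.int was removed in NumPy 1.24, so use the built-in int
    cut_h = int(H * cut_rat)

    # uniform
    # 2. Equation 2 in the paper: sample r_x and r_y of box B (the centre of the bbox)
    cx = np.random.randint(W)
    cy = np.random.randint(H)

    # keep the coordinates inside the sample
    bbx1 = np.clip(cx - cut_w // 2, 0, W)
    bby1 = np.clip(cy - cut_h // 2, 0, H)
    bbx2 = np.clip(cx + cut_w // 2, 0, W)
    bby2 = np.clip(cy + cut_h // 2, 0, H)

    # 3. Return the coordinates of the cropped region B
    return bbx1, bby1, bbx2, bby2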
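Because the box is clipped at the image border, its real area is often smaller than the requested share 1 - lam. The following sketch, which assumes a hypothetical 32x32 batch shape and passes an explicit random generator `rng` (not part of the original function), samples many boxes and measures their area ratio.

```python
import numpy as np

def rand_bbox(size, lam, rng):
    """Same logic as the function above, with an explicit random generator."""
    W, H = size[2], size[3]
    cut_rat = np.sqrt(1.0 - lam)
    cut_w, cut_h = int(W * cut_rat), int(H * cut_rat)
    cx, cy = int(rng.integers(W)), int(rng.integers(H))
    bbx1 = int(np.clip(cx - cut_w // 2, 0, W))
    bby1 = int(np.clip(cy - cut_h // 2, 0, H))
    bbx2 = int(np.clip(cx + cut_w // 2, 0, W))
    bby2 = int(np.clip(cy + cut_h // 2, 0, H))
    return bbx1, bby1, bbx2, bby2

rng = np.random.default_rng(0)
size = (8, 3, 32, 32)  # hypothetical batch shape (N, C, 32, 32)
lam = 0.75             # requested share kept from the original image

ratios = []
for _ in range(1000):
    x1, y1, x2, y2 = rand_bbox(size, lam, rng)
    ratios.append((x2 - x1) * (y2 - y1) / (size[2] * size[3]))
ratios = np.array(ratios)

# An unclipped box covers exactly 1 - lam of the image; clipped boxes cover less.
print("largest ratio:", ratios.max(), "mean ratio:", ratios.mean())
```

The mean ratio falls below 1 - lam, which is why the training loop recomputes lam from the actual box coordinates before weighting the two losses.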

Overall flow

# train.py, lines 220-244
for i, (input, target) in enumerate(train_loader):
    # measure data loading time
    data_time.update(time.time() - end)

    input = input.cuda()
    target = target.cuda()

    r = np.random.rand(1)
    if args.beta > 0 and r < args.cutmix_prob:
        # generate mixed sample
        # 1. Draw lambda from a Beta(beta, beta) distribution
        lam = np.random.beta(args.beta, args.beta)
        # 2. Pair every sample with another one through a random permutation of the batch
        rand_index = torch.randperm(input.size()[0]).cuda()
        target_a = target              # labels of the original batch
        target_b = target[rand_index]  # labels of the shuffled batch
        # 3. Generate the crop region B
        bbx1, bby1, bbx2, bby2 = rand_bbox(input.size(), lam)
        # 4. Replace region B of sample A with region B of sample B
        input[:, :, bbx1:bbx2, bby1:bby2] = input[rand_index, :, bbx1:bbx2, bby1:bby2]
        # adjust lambda to exactly match pixel ratio
        # 5. Recompute lam from the actual box coordinates
        lam = 1 - ((bbx2 - bbx1) * (bby2 - bby1) / (input.size()[-1] * input.size()[-2]))
        # compute output
        # 6. Feed the new training samples to the model
        output = model(input)
        # 7. Weight the two losses by lambda
        loss = criterion(output, target_a) * lam + criterion(output, target_b) * (1. - lam)
    else:
        # compute output
        output = model(input)
        loss = criterion(output, target)
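Step 7 weights two cross-entropy losses by lam. Since cross-entropy is linear in the target distribution, this equals a single cross-entropy against the soft label lam * onehot(a) + (1 - lam) * onehot(b). The NumPy sketch below checks this on random logits; the helper names and the random stand-in for the model output are assumptions for illustration only.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def cross_entropy(logits, targets):
    """Mean cross-entropy for integer class targets."""
    logp = log_softmax(logits)
    return -logp[np.arange(len(targets)), targets].mean()

def soft_cross_entropy(logits, soft_targets):
    """Mean cross-entropy against probability-vector targets."""
    return -(soft_targets * log_softmax(logits)).sum(axis=-1).mean()

rng = np.random.default_rng(0)
num_classes, batch = 5, 4
logits = rng.normal(size=(batch, num_classes))  # stand-in for model(input)
target_a = rng.integers(num_classes, size=batch)
target_b = target_a[rng.permutation(batch)]     # labels of the shuffled batch
lam = 0.7

# Form used in the training loop: a weighted sum of two losses.
loss_two_terms = (lam * cross_entropy(logits, target_a)
                  + (1 - lam) * cross_entropy(logits, target_b))

# Equivalent form: one loss against a mixed (soft) label.
eye = np.eye(num_classes)
soft = lam * eye[target_a] + (1 - lam) * eye[target_b]
loss_soft_label = soft_cross_entropy(logits, soft)
```

The two values agree up to floating-point error, so the loop's two-term loss is just a convenient way to train on mixed labels with a criterion that only accepts class indices.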

3. A look at the Cutout code

import torch
import numpy as np

class Cutout(object):
    """Randomly mask out one or more patches from an image.

    Args:
        n_holes (int): Number of patches to cut out of each image.
        length (int): The length (in pixels) of each square patch.
    """

    def __init__(self, n_holes, length):
        self.n_holes = n_holes
        self.length = length

    def __call__(self, img):
        """
        Args:
            img (Tensor): Tensor image of size (C, H, W).
        Returns:
            Tensor: Image with n_holes of dimension length x length cut out of it.
        """
        h = img.size(1)
        w = img.size(2)

        # 1 keeps a pixel, 0 removes it
        mask = np.ones((h, w), np.float32)

        for n in range(self.n_holes):
            # random patch centre; the patch is clipped at the image border
            y = np.random.randint(h)
            x = np.random.randint(w)

            y1 = np.clip(y - self.length // 2, 0, h)
            y2 = np.clip(y + self.length // 2, 0, h)
            x1 = np.clip(x - self.length // 2, 0, w)
            x2 = np.clip(x + self.length // 2, 0, w)

            mask[y1: y2, x1: x2] = 0.

        # broadcast the (H, W) mask over all channels
        mask = torch.from_numpy(mask)
        mask = mask.expand_as(img)
        img = img * mask

        return img
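To try the idea without PyTorch, the same masking can be written on a NumPy array. This is a sketch under the assumption of a (C, H, W) float array; the function name `cutout_numpy` and the explicit `rng` argument are not part of the class above.

```python
import numpy as np

def cutout_numpy(img, n_holes, length, rng):
    """Zero out n_holes square patches in a (C, H, W) array."""
    _, h, w = img.shape
    mask = np.ones((h, w), dtype=img.dtype)
    for _ in range(n_holes):
        y, x = int(rng.integers(h)), int(rng.integers(w))
        y1, y2 = max(y - length // 2, 0), min(y + length // 2, h)
        x1, x2 = max(x - length // 2, 0), min(x + length // 2, w)
        mask[y1:y2, x1:x2] = 0
    return img * mask  # the (H, W) mask broadcasts over the channel axis

rng = np.random.default_rng(0)
img = np.ones((3, 32, 32), dtype=np.float32)
out = cutout_numpy(img, n_holes=1, length=16, rng=rng)
zeroed = int((out[0] == 0).sum())
print("zeroed pixels per channel:", zeroed)
```

Because the patch centre can land near the border, the number of zeroed pixels varies between a quarter of a full patch and the full `length * length`.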

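4. Mosaic data augmentation

YOLOv4's Mosaic data augmentation builds on CutMix, and the two are similar in principle. CutMix stitches two images together, whereas Mosaic uses four. According to the paper, its major advantage is that it enriches the backgrounds of the detected objects, and batch normalization computes its statistics over four images' worth of data at once.

Implementation approach

1. Read four images at a time.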


2. Flip, rescale and colour-jitter each of the four images, then place them at the four corner positions.


3. Combine the images and combine their bounding boxes.
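The image half of step 3 can be sketched in a few lines. This minimal example assumes four canvases that already have the same size and uses constant-valued arrays as stand-ins for real images; the function name `compose_mosaic` is hypothetical.

```python
import numpy as np

def compose_mosaic(images, cutx, cuty):
    """Stitch four equally sized (H, W, 3) canvases at the cut point (cutx, cuty)."""
    h, w, _ = images[0].shape
    out = np.zeros((h, w, 3), dtype=images[0].dtype)
    out[:cuty, :cutx] = images[0][:cuty, :cutx]  # top-left
    out[cuty:, :cutx] = images[1][cuty:, :cutx]  # bottom-left
    out[cuty:, cutx:] = images[2][cuty:, cutx:]  # bottom-right
    out[:cuty, cutx:] = images[3][:cuty, cutx:]  # top-right
    return out

h, w = 8, 8
# Four constant-valued canvases stand in for the four pre-processed images.
images = [np.full((h, w, 3), v, dtype=np.float32) for v in (0.1, 0.2, 0.3, 0.4)]
mosaic = compose_mosaic(images, cutx=3, cuty=5)
```

Each quadrant of `mosaic` is taken from a different source canvas, in the order top-left, bottom-left, bottom-right, top-right.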


Full code

from PIL import Image, ImageDraw
import numpy as np
from matplotlib.colors import rgb_to_hsv, hsv_to_rgb
import math

def rand(a=0, b=1):
    # uniform random number in [a, b)
    return np.random.rand()*(b-a) + a

def merge_bboxes(bboxes, cutx, cuty):
    # bboxes[i] holds the boxes of image i; clip each box to the quadrant its
    # image occupies and drop boxes that end up narrower or shorter than 5 px
    merge_bbox = []
    for i in range(len(bboxes)):
        for box in bboxes[i]:
            tmp_box = []
            x1,y1,x2,y2 = box[0], box[1], box[2], box[3]

            # image 0: top-left quadrant
            if i == 0:
                if y1 > cuty or x1 > cutx:
                    continue
                if y2 >= cuty and y1 <= cuty:
                    y2 = cuty
                    if y2-y1 < 5:
                        continue
                if x2 >= cutx and x1 <= cutx:
                    x2 = cutx
                    if x2-x1 < 5:
                        continue

            # image 1: bottom-left quadrant
            if i == 1:
                if y2 < cuty or x1 > cutx:
                    continue
                if y2 >= cuty and y1 <= cuty:
                    y1 = cuty
                    if y2-y1 < 5:
                        continue
                if x2 >= cutx and x1 <= cutx:
                    x2 = cutx
                    if x2-x1 < 5:
                        continue

            # image 2: bottom-right quadrant
            if i == 2:
                if y2 < cuty or x2 < cutx:
                    continue
                if y2 >= cuty and y1 <= cuty:
                    y1 = cuty
                    if y2-y1 < 5:
                        continue
                if x2 >= cutx and x1 <= cutx:
                    x1 = cutx
                    if x2-x1 < 5:
                        continue

            # image 3: top-right quadrant
            if i == 3:
                if y1 > cuty or x2 < cutx:
                    continue
                if y2 >= cuty and y1 <= cuty:
                    y2 = cuty
                    if y2-y1 < 5:
                        continue
                if x2 >= cutx and x1 <= cutx:
                    x1 = cutx
                    if x2-x1 < 5:
                        continue

            tmp_box.append(x1)
            tmp_box.append(y1)
            tmp_box.append(x2)
            tmp_box.append(y2)
            tmp_box.append(box[-1])
            merge_bbox.append(tmp_box)
    return merge_bbox

def get_random_data(annotation_line, input_shape, random=True, hue=.1, sat=1.5, val=1.5, proc_img=True):
    '''random preprocessing for real-time data augmentation'''
    h, w = input_shape
    min_offset_x = 0.4
    min_offset_y = 0.4
    scale_low = 1-min(min_offset_x,min_offset_y)
    scale_high = scale_low+0.2

    image_datas = []
    box_datas = []
    index = 0

    # paste position of each image: top-left, bottom-left, bottom-right, top-right
    place_x = [0,0,int(w*min_offset_x),int(w*min_offset_x)]
    # the source had int(w*min_offset_y) as the third entry; a y offset should use h
    place_y = [0,int(h*min_offset_y),int(h*min_offset_y),0]
    for line in annotation_line:
        # split the annotation line
        line_content = line.split()
        # open the image
        image = Image.open(line_content[0])
        image = image.convert("RGB")
        # image size
        iw, ih = image.size
        # box positions
        box = np.array([np.array(list(map(int,box.split(',')))) for box in line_content[1:]])

        # image.save(str(index)+".jpg")
        # randomly flip the image
        flip = rand()<.5
        if flip and len(box)>0:
            image = image.transpose(Image.FLIP_LEFT_RIGHT)
            box[:, [0,2]] = iw - box[:, [2,0]]

        # rescale the input image
        new_ar = w/h
        scale = rand(scale_low, scale_high)
        if new_ar < 1:
            nh = int(scale*h)
            nw = int(nh*new_ar)
        else:
            nw = int(scale*w)
            nh = int(nw/new_ar)
        image = image.resize((nw,nh), Image.BICUBIC)

        # colour-space (HSV) jitter
        # local names keep the hue/sat/val arguments unchanged for the next image
        hue_shift = rand(-hue, hue)
        # the else branches were garbled in the source; 1/rand(1, x) is an assumed reconstruction
        sat_scale = rand(1, sat) if rand()<.5 else 1/rand(1, sat)
        val_scale = rand(1, val) if rand()<.5 else 1/rand(1, val)
        x = rgb_to_hsv(np.array(image)/255.)
        x[..., 0] += hue_shift
        x[..., 0][x[..., 0]>1] -= 1
        x[..., 0][x[..., 0]<0] += 1
        x[..., 1] *= sat_scale
        x[..., 2] *= val_scale
        x[x>1] = 1
        x[x<0] = 0
        image = hsv_to_rgb(x)

        image = Image.fromarray((image*255).astype(np.uint8))
        # paste the image at the position of its quadrant
        dx = place_x[index]
        dy = place_y[index]
        new_image = Image.new('RGB', (w,h), (128,128,128))
        new_image.paste(image, (dx, dy))
        image_data = np.array(new_image)/255

        # Image.fromarray((image_data*255).astype(np.uint8)).save(str(index)+"distort.jpg")

        index = index + 1
        box_data = []
        # rescale, shift and clip the boxes to match the pasted image
        if len(box)>0:
            np.random.shuffle(box)
            box[:, [0,2]] = box[:, [0,2]]*nw/iw + dx
            box[:, [1,3]] = box[:, [1,3]]*nh/ih + dy
            box[:, 0:2][box[:, 0:2]<0] = 0
            box[:, 2][box[:, 2]>w] = w
            box[:, 3][box[:, 3]>h] = h
            box_w = box[:, 2] - box[:, 0]
            box_h = box[:, 3] - box[:, 1]
            box = box[np.logical_and(box_w>1, box_h>1)]
            box_data = np.zeros((len(box),5))
            box_data[:len(box)] = box

        image_datas.append(image_data)
        box_datas.append(box_data)

        # debug view: show each pre-processed image with its boxes
        img = Image.fromarray((image_data*255).astype(np.uint8))
        for j in range(len(box_data)):
            thickness = 3
            left, top, right, bottom = box_data[j][0:4]
            draw = ImageDraw.Draw(img)
            for i in range(thickness):
                draw.rectangle([left + i, top + i, right - i, bottom - i],outline=(255,255,255))
        img.show()

    # cut the four images and stitch them together
    cutx = np.random.randint(int(w*min_offset_x), int(w*(1 - min_offset_x)))
    cuty = np.random.randint(int(h*min_offset_y), int(h*(1 - min_offset_y)))

    new_image = np.zeros([h,w,3])
    new_image[:cuty, :cutx, :] = image_datas[0][:cuty, :cutx, :]
    new_image[cuty:, :cutx, :] = image_datas[1][cuty:, :cutx, :]
    new_image[cuty:, cutx:, :] = image_datas[2][cuty:, cutx:, :]
    new_image[:cuty, cutx:, :] = image_datas[3][:cuty, cutx:, :]

    # clip the boxes to the quadrants
    new_boxes = merge_bboxes(box_datas, cutx, cuty)

    return new_image, new_boxes

def normal_(annotation_line, input_shape):
    '''load one image with its boxes and flip both horizontally'''
    line = annotation_line.split()
    image = Image.open(line[0])
    box = np.array([np.array(list(map(int,box.split(',')))) for box in line[1:]])

    iw, ih = image.size
    image = image.transpose(Image.FLIP_LEFT_RIGHT)
    box[:, [0,2]] = iw - box[:, [2,0]]

    return image, box

if __name__ == "__main__":
    with open("2007_train.txt") as f:
        lines = f.readlines()
    # pick a start index that leaves four consecutive lines (needs at least 4 lines)
    a = np.random.randint(0, len(lines) - 3)

    # index = 0
    # line_all = lines[a:a+4]
    # for line in line_all:
    #     image_data, box_data = normal_(line,[416,416])
    #     img = image_data
    #     for j in range(len(box_data)):
    #         thickness = 3
    #         left, top, right, bottom = box_data[j][0:4]
    #         draw = ImageDraw.Draw(img)
    #         for i in range(thickness):
    #             draw.rectangle([left + i, top + i, right - i, bottom - i],outline=(255,255,255))
    #     img.show()
    #     # img.save(str(index)+"box.jpg")
    #     index = index+1

    line = lines[a:a+4]
    image_data, box_data = get_random_data(line,[416,416])
    img = Image.fromarray((image_data*255).astype(np.uint8))
    for j in range(len(box_data)):
        thickness = 3
        left, top, right, bottom = box_data[j][0:4]
        draw = ImageDraw.Draw(img)
        for i in range(thickness):
            draw.rectangle([left + i, top + i, right - i, bottom - i],outline=(255,255,255))
    img.show()
    # img.save("box_all.jpg")
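The per-quadrant branches of merge_bboxes all do the same thing: intersect a box with the rectangle its image occupies and drop the box if too little remains. A compact way to express that idea is sketched below. It is a simplified reformulation, not a drop-in replacement for merge_bboxes, and the function name `clip_box_to_region` and the sample coordinates are assumptions for illustration.

```python
def clip_box_to_region(box, region, min_size=5):
    """Intersect an (x1, y1, x2, y2, cls) box with a rectangular region.

    Returns the clipped box, or None if fewer than min_size pixels remain
    along either axis.
    """
    x1, y1, x2, y2, cls = box
    rx1, ry1, rx2, ry2 = region
    nx1, ny1 = max(x1, rx1), max(y1, ry1)
    nx2, ny2 = min(x2, rx2), min(y2, ry2)
    if nx2 - nx1 < min_size or ny2 - ny1 < min_size:
        return None
    return [nx1, ny1, nx2, ny2, cls]

w, h = 416, 416
cutx, cuty = 200, 220
# Regions in the same order as image_datas: top-left, bottom-left, bottom-right, top-right.
regions = [
    (0, 0, cutx, cuty),
    (0, cuty, cutx, h),
    (cutx, cuty, w, h),
    (cutx, 0, w, cuty),
]

box = [150, 180, 260, 300, 7]  # straddles both cut lines
clipped = [clip_box_to_region(box, r) for r in regions]

sliver = [198, 10, 230, 50, 1]  # only 2 px wide inside the top-left region
dropped = clip_box_to_region(sliver, regions[0])
```

A box that straddles the cut lines is kept only in the part that belongs to its own image's quadrant, and thin slivers left over after clipping are discarded, which is the same policy as the 5-pixel checks in merge_bboxes.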

Published by: 全栈程序员 (site admin). Please credit the source when reposting: https://javaforall.net/193682.html Original link: https://javaforall.net
