1. Spatial Pyramid Pooling (SPP)
As shown in the figure above:
- Divide the feature map into an $n \times n$ grid and pool each cell; the pooling kernel size equals the cell size, and padding is applied when the width does not divide evenly.
- Repeat step 1 for several different values of $n$.
- Flatten and concatenate all of the resulting outputs into a $C \times N$ feature vector, which can be fed directly into a fully connected layer.

The output size depends only on the chosen $n$ values and the number of channels, not on the spatial shape of the input tensor (as long as the input is not so small that the pooled result collapses to size 0). A small sketch illustrating this shape invariance follows.
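The sketch below is not the author's implementation; it is a minimal illustration that builds the same kind of fixed-length descriptor with PyTorch's nn.AdaptiveMaxPool2d, which computes kernel, stride, and padding internally. The function names adaptive_spp and spp_descriptor_length and the grid sizes [1, 2, 4] are arbitrary choices for demonstration.

import torch
import torch.nn as nn

def spp_descriptor_length(channels: int, grid_sides=(1, 2, 4)) -> int:
    # Flattened length is C * (1*1 + 2*2 + 4*4), independent of H and W.
    return channels * sum(n * n for n in grid_sides)

def adaptive_spp(x: torch.Tensor, grid_sides=(1, 2, 4)) -> torch.Tensor:
    # x: (N, C, H, W) -> (N, C * sum(n*n)), regardless of H and W.
    outs = []
    for n in grid_sides:
        pooled = nn.AdaptiveMaxPool2d(n)(x)        # (N, C, n, n)
        outs.append(pooled.flatten(start_dim=1))   # (N, C * n * n)
    return torch.cat(outs, dim=1)

# Two inputs with different spatial sizes yield descriptors of the same length.
a = adaptive_spp(torch.randn(2, 64, 13, 13))
b = adaptive_spp(torch.randn(2, 64, 25, 37))
assert a.shape == b.shape == (2, spp_descriptor_length(64))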
2. Implementation
Full code: 古承风's Gitee repository.
The core code is shown below.
from math import ceil, sqrt

import torch
import torch.nn as nn

def _spp_layer(self, x: torch.Tensor, mode: str = 'max', grid_nums: list = [16]):
    """Spatial pyramid pooling over a feature map x of shape (N, C, H, W).

    Each entry of grid_nums is the total number of grid cells (e.g. 16 -> a 4x4 grid).
    Steps:
        1. compute the grid side length, sqrt(num)
        2. compute the pooling kernel_size, stride and padding
        3. pool the original x (pyramid pooling)
        4. flatten and concatenate all outputs
    """
    N, C, H, W = x.size()
    for i in range(len(grid_nums)):
        side = int(sqrt(grid_nums[i]))  # number of grid cells per side
        # step 1: kernel size covering one grid cell
        h = ceil(H / side)
        w = ceil(W / side)
        # pad so that side * kernel covers the whole map (split over both sides);
        # PyTorch requires padding <= kernel_size / 2, so the input must not be too small
        h_pad = int((h * side + 1 - H) / 2)
        w_pad = int((w * side + 1 - W) / 2)
        # step 2: build the pooling layer
        if mode == 'max':
            pool = nn.MaxPool2d(kernel_size=(h, w), stride=(h, w), padding=(h_pad, w_pad))
        elif mode == 'avg':
            pool = nn.AvgPool2d(kernel_size=(h, w), stride=(h, w), padding=(h_pad, w_pad))
        else:
            raise ValueError(f"{mode} mode type error, expect 'max' or 'avg'")
        # step 3: pool the original x, which is what makes this a pyramid
        temp = pool(x)
        # step 4: flatten and concatenate, ready for a fully connected layer
        if i == 0:
            output = temp.view(N, -1)
        else:
            output = torch.cat((output, temp.view(N, -1)), -1)
    return output
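For context, here is a minimal usage sketch. The SPPNet wrapper class below is hypothetical (it is not the author's full model from the Gitee repository) and only exists to call the _spp_layer method above; it assumes the code above has already been defined.

class SPPNet(nn.Module):
    # Hypothetical minimal wrapper hosting the _spp_layer method defined above.
    def forward(self, x):
        # Grids of 1x1, 2x2 and 4x4 cells -> C * (1 + 4 + 16) = 21 * C features per sample.
        return _spp_layer(self, x, mode='max', grid_nums=[1, 4, 16])

net = SPPNet()
out_small = net(torch.randn(2, 64, 13, 13))
out_large = net(torch.randn(2, 64, 25, 37))
print(out_small.shape, out_large.shape)  # both torch.Size([2, 1344]), since 64 * 21 = 1344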
