一、算法
1.1 DEFLATE
DEFLATE
是同时使用了LZ77算法与哈夫曼编码
(Huffman Coding)的一个无损数据压缩算法,jdk中对zlib压缩库提供了支持,压缩类Deflater和解压类Inflater,Deflater和Inflater都提供了相应的native方法。
public static byte[] compress(byte input[]) {
ByteArrayOutputStream bos = new ByteArrayOutputStream(); Deflater compressor = new Deflater(1); try {
compressor.setInput(input); compressor.finish(); final byte[] buf = new byte[2048]; while (!compressor.finished()) {
int count = compressor.deflate(buf); bos.write(buf, 0, count); } } finally {
compressor.end(); } return bos.toByteArray(); } public static byte[] uncompress(byte[] input) throws DataFormatException {
ByteArrayOutputStream bos = new ByteArrayOutputStream(); Inflater decompressor = new Inflater(); try {
decompressor.setInput(input); final byte[] buf = new byte[2048]; while (!decompressor.finished()) {
int count = decompressor.inflate(buf); bos.write(buf, 0, count); } } finally {
decompressor.end(); } return bos.toByteArray(); }
1.2 gzip
gzip
的实现算法还是deflate,只是在deflate格式上增加了文件头和文件尾,同样jdk也对gzip提供了支持,分别是GZIPOutputStream和GZIPInputStream类,同样可以发现GZIPOutputStream是继承于DeflaterOutputStream的,GZIPInputStream继承于InflaterInputStream,并且可以在源码中发现writeHeader和writeTrailer方法。
public static byte[] compress(byte srcBytes[]) {
ByteArrayOutputStream out = new ByteArrayOutputStream(); GZIPOutputStream gzip; try {
gzip = new GZIPOutputStream(out); gzip.write(srcBytes); gzip.close(); } catch (IOException e) {
e.printStackTrace(); } return out.toByteArray(); } public static byte[] uncompress(byte[] bytes) {
ByteArrayOutputStream out = new ByteArrayOutputStream(); ByteArrayInputStream in = new ByteArrayInputStream(bytes); try {
GZIPInputStream ungzip = new GZIPInputStream(in); byte[] buffer = new byte[2048]; int n; while ((n = ungzip.read(buffer)) >= 0) {
out.write(buffer, 0, n); } } catch (IOException e) {
e.printStackTrace(); } return out.toByteArray(); }
1.3 bzip2
bzip2
是Julian Seward开发并按照自由软件/开源软件协议发布的数据压缩算法及程序。Seward在1996年7月第一次公开发布了bzip2 0.15版,在随后几年中这个压缩工具稳定性得到改善并且日渐流行,Seward在2000年晚些时候发布了1.0版。bzip2比传统的gzip的压缩效率更高,但是它的压缩速度较慢。jdk中没有对bzip2实现,但是在commons-compress中进行了实现。
<dependency> <groupId>org.apache.commons</groupId> <artifactId>commons-compress</artifactId> <version>1.12</version> </dependency>
代码实现如下:
public static byte[] compress(byte srcBytes[]) throws IOException {
ByteArrayOutputStream out = new ByteArrayOutputStream(); BZip2CompressorOutputStream bcos = new BZip2CompressorOutputStream(out); bcos.write(srcBytes); bcos.close(); return out.toByteArray(); } public static byte[] uncompress(byte[] bytes) {
ByteArrayOutputStream out = new ByteArrayOutputStream(); ByteArrayInputStream in = new ByteArrayInputStream(bytes); try {
BZip2CompressorInputStream ungzip = new BZip2CompressorInputStream( in); byte[] buffer = new byte[2048]; int n; while ((n = ungzip.read(buffer)) >= 0) {
out.write(buffer, 0, n); } } catch (IOException e) {
e.printStackTrace(); } return out.toByteArray(); }
1.4 lzo
LZO
是致力于解压速度的一种数据压缩算法,LZO是Lempel-Ziv-Oberhumer的缩写。这个算法是无损算法,需要引入第三方库。
<dependency> <groupId>org.anarres.lzo</groupId> <artifactId>lzo-core</artifactId> <version>1.0.5</version> </dependency>
实现代码:
public static byte[] compress(byte srcBytes[]) throws IOException {
LzoCompressor compressor = LzoLibrary.getInstance().newCompressor( LzoAlgorithm.LZO1X, null); ByteArrayOutputStream os = new ByteArrayOutputStream(); LzoOutputStream cs = new LzoOutputStream(os, compressor); cs.write(srcBytes); cs.close(); return os.toByteArray(); } public static byte[] uncompress(byte[] bytes) throws IOException {
LzoDecompressor decompressor = LzoLibrary.getInstance() .newDecompressor(LzoAlgorithm.LZO1X, null); ByteArrayOutputStream baos = new ByteArrayOutputStream(); ByteArrayInputStream is = new ByteArrayInputStream(bytes); LzoInputStream us = new LzoInputStream(is, decompressor); int count; byte[] buffer = new byte[2048]; while ((count = us.read(buffer)) != -1) {
baos.write(buffer, 0, count); } return baos.toByteArray(); }
1.5 lz4
LZ4
是一种无损数据压缩算法,着重于压缩和解压缩速度,需要依赖三方库。
<dependency> <groupId>net.jpountz.lz4</groupId> <artifactId>lz4</artifactId> <version>1.2.0</version> </dependency>
实现代码:
public static byte[] compress(byte srcBytes[]) throws IOException {
LZ4Factory factory = LZ4Factory.fastestInstance(); ByteArrayOutputStream byteOutput = new ByteArrayOutputStream(); LZ4Compressor compressor = factory.fastCompressor(); LZ4BlockOutputStream compressedOutput = new LZ4BlockOutputStream( byteOutput, 2048, compressor); compressedOutput.write(srcBytes); compressedOutput.close(); return byteOutput.toByteArray(); } public static byte[] uncompress(byte[] bytes) throws IOException {
LZ4Factory factory = LZ4Factory.fastestInstance(); ByteArrayOutputStream baos = new ByteArrayOutputStream(); LZ4FastDecompressor decompresser = factory.fastDecompressor(); LZ4BlockInputStream lzis = new LZ4BlockInputStream( new ByteArrayInputStream(bytes), decompresser); int count; byte[] buffer = new byte[2048]; while ((count = lzis.read(buffer)) != -1) {
baos.write(buffer, 0, count); } lzis.close(); return baos.toByteArray(); }
1.6 Snappy
Snappy(以前称Zippy)是Google基于LZ77的思路用C++语言编写的快速数据压缩与解压程序库,并在2011年开源。它的目标并非最大压缩率或与其他压缩程序库的兼容性,而是非常高的速度和合理的压缩率。
<dependency> <groupId>org.xerial.snappy</groupId> <artifactId>snappy-java</artifactId> <version>1.1.2.6</version> </dependency>
实现代码:
public static byte[] compress(byte srcBytes[]) throws IOException {
return Snappy.compress(srcBytes); } public static byte[] uncompress(byte[] bytes) throws IOException {
return Snappy.uncompress(bytes); }
二、压力测试
以下对35kb玩家数据进行压缩和解压测试,相对来说35kb数据还是很小量的数据,所有以下测试结果只是针对指定的数据量区间进行测试的结果,并不能说明哪种压缩算法好与不好。
测试环境:
- jdk:1.7.0_79
- cpu:i5-4570@3.20GHz 4 Core
- memory:4G
对35kb数据进行2000次压缩和解压缩测试,测试代码如下:
public static void main(String[] args) throws Exception {
FileInputStream fis = new FileInputStream(new File("player.dat")); FileChannel channel = fis.getChannel(); ByteBuffer bb = ByteBuffer.allocate((int) channel.size()); channel.read(bb); byte[] beforeBytes = bb.array(); int times = 2000; System.out.println("压缩前大小:" + beforeBytes.length + " bytes"); long startTime1 = System.currentTimeMillis(); byte[] afterBytes = null; for (int i = 0; i < times; i++) {
afterBytes = GZIPUtil.compress(beforeBytes); } long endTime1 = System.currentTimeMillis(); System.out.println("压缩后大小:" + afterBytes.length + " bytes"); System.out.println("压缩次数:" + times + ",时间:" + (endTime1 - startTime1) + "ms"); byte[] resultBytes = null; long startTime2 = System.currentTimeMillis(); for (int i = 0; i < times; i++) {
resultBytes = GZIPUtil.uncompress(afterBytes); } System.out.println("解压缩后大小:" + resultBytes.length + " bytes"); long endTime2 = System.currentTimeMillis(); System.out.println("解压缩次数:" + times + ",时间:" + (endTime2 - startTime2) + "ms"); }
三、总结
从结果来看,deflate、gzip和bzip2更关注压缩率,压缩和解压缩时间会更长;lzo,lz4以及snappy这3中压缩算法,均已压缩速度为优先,压缩率会稍逊一筹;lzo,lz4以及snappy在cpu高峰更低一点。
发布者:全栈程序员-站长,转载请注明出处:https://javaforall.net/233420.html原文链接:https://javaforall.net