Python command for decompressing bz2 files: decompressing a .bz2 file in Python


So, this is a seemingly simple question, but I’m apparently very very dull. I have a little script that downloads all the .bz2 files from a webpage, but for some reason the decompressing of that file is giving me a MAJOR headache.

I’m quite a Python newbie, so the answer is probably quite obvious, please help me.

In this bit of the script, I already have the file, and I just want to read it out to a variable, then decompress that. Is that right? I’ve tried all sorts of ways to do this, and I usually get a “ValueError: couldn’t find end of stream” error on the last line of this snippet. I’ve tried to open up the zipfile and write it out to a string in a zillion different ways. This is the latest.

    openZip = open(zipFile, "r")
    s = ''
    while True:
        newLine = openZip.readline()
        if(len(newLine)==0):
            break
        s+=newLine
    print s
    uncompressedData = bz2.decompress(s)

Hi Alex, I should’ve listed all the other methods I’ve tried, as I’ve tried the read() way.

METHOD A:

    print 'decompressing ' + filename
    fileHandle = open(zipFile)
    uncompressedData = ''
    while True:
        s = fileHandle.read(1024)
        if not s:
            break
        print('RAW "%s"', s)
        uncompressedData += bz2.decompress(s)
    uncompressedData += bz2.flush()
    newFile = open(steamTF2mapdir + filename.split(".bz2")[0], "w")
    newFile.write(uncompressedData)
    newFile.close()

I get the error:

    uncompressedData += bz2.decompress(s)
    ValueError: couldn't find end of stream

METHOD B

    zipFile = steamTF2mapdir + filename
    print 'decompressing ' + filename
    fileHandle = open(zipFile)
    s = fileHandle.read()
    uncompressedData = bz2.decompress(s)

Same error:

    uncompressedData = bz2.decompress(s)
    ValueError: couldn't find end of stream

Thanks so much for your prompt reply. I’m really banging my head against the wall, feeling inordinately thick for not being able to decompress a simple .bz2 file.

By the by, used 7zip to decompress it manually, to make sure the file isn’t wonky or anything, and it decompresses fine.

Solution

You’re opening and reading the compressed file as if it was a textfile made up of lines. DON’T! It’s NOT.

    uncompressedData = bz2.BZ2File(zipFile).read()

seems to be closer to what you’re angling for.
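As a minimal sketch of that one-liner in Python 3 (the question's own code is Python 2), the snippet below writes a small sample archive to a temporary file and reads it back through bz2.BZ2File. The helper name read_bz2 and the sample data are illustrative assumptions, not part of the original script.

```python
import bz2
import os
import tempfile


def read_bz2(path):
    """Return the decompressed contents of a .bz2 file as bytes."""
    # BZ2File opens the underlying file in binary mode itself
    # and decompresses transparently on read().
    with bz2.BZ2File(path) as compressed:
        return compressed.read()


# Build a small sample archive so the sketch runs on its own.
original = b"hello, bz2\n" * 1000
fd, sample_path = tempfile.mkstemp(suffix=".bz2")
os.close(fd)
with open(sample_path, "wb") as out:
    out.write(bz2.compress(original))

restored = read_bz2(sample_path)
os.remove(sample_path)
print(restored == original)
```

The with block closes the file handle even if reading fails, which the bare one-liner leaves to the garbage collector.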

Edit: the OP has shown a few more things he’s tried (though I don’t see any notes about having tried the best method — the one-liner I recommend above!) but they seem to all have one error in common, and I repeat the key bits from above:

    opening … the compressed file as if it was a textfile … It’s NOT.

open(filename), and even the more explicit open(filename, 'r'), opens a file for reading as text. A compressed file is a binary file, so in order to read it correctly you must open it with open(filename, 'rb'). (My recommended bz2.BZ2File KNOWS it’s dealing with a compressed file, of course, so there’s no need to tell it anything more.)
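For readers who want to keep the open-then-decompress shape of Methods A and B, here is a hedged Python 3 sketch of both with the 'rb' fix applied. The function names, the chunk size and the temporary sample file are assumptions made for illustration. The chunked variant uses one bz2.BZ2Decompressor for the whole file, because bz2.decompress expects complete data on every call, and it assumes the file holds a single bzip2 stream.

```python
import bz2
import os
import tempfile


def decompress_whole(path):
    # "rb" hands back the raw bytes untouched, which bz2.decompress needs.
    with open(path, "rb") as handle:
        return bz2.decompress(handle.read())


def decompress_in_chunks(path, chunk_size=1024):
    # A single decompressor object keeps its state between chunks.
    decompressor = bz2.BZ2Decompressor()
    pieces = []
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            pieces.append(decompressor.decompress(chunk))
            if decompressor.eof:
                # End of the (single) stream reached; stop feeding data.
                break
    return b"".join(pieces)


# Sample archive with poorly compressible data, so several chunks are read.
original = os.urandom(4096)
fd, sample_path = tempfile.mkstemp(suffix=".bz2")
os.close(fd)
with open(sample_path, "wb") as out:
    out.write(bz2.compress(original))

whole = decompress_whole(sample_path)
chunked = decompress_in_chunks(sample_path)
os.remove(sample_path)
print(whole == original, chunked == original)
```

When writing the result back to disk, the output file should be opened with "wb" for the same reason.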

In Python 2.*, on Unix-y systems (i.e. every system except Windows), you could get away with a sloppy use of open (but in Python 3.* you can’t, as text is Unicode, while binary is bytes — different types).
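The type distinction in Python 3 can be seen in a short sketch: bz2.compress returns bytes, and passing a str to bz2.decompress is rejected outright rather than silently misread. The sample payload and variable names are illustrative assumptions.

```python
import bz2

compressed = bz2.compress(b"some payload")
is_bytes = isinstance(compressed, bytes)

# Decoding as latin-1 maps every byte to a character, giving a str
# that merely looks like the compressed data.
as_text = compressed.decode("latin-1")

try:
    bz2.decompress(as_text)
    rejected = False
except TypeError:
    # Python 3 refuses str where a bytes-like object is required.
    rejected = True

print(is_bytes, rejected)
```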

In Windows (and before then in DOS) it’s always been indispensable to distinguish, as Windows’ text files, for historical reasons, are peculiar (they use two bytes rather than one to end lines and, at least in some cases, treat a byte with the value 0x1A, i.e. '\x1a', as a logical end of file), and so the low-level reading and writing code must compensate.

So I suspect the OP is using Windows and is paying the price for not carefully using the 'rb' option (“read binary”) to the open built-in (though bz2.BZ2File is still simpler, whatever platform you’re using!).
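The failure mode can be imitated on any platform with a small Python 3 sketch: if the reader only ever sees part of the compressed bytes, bz2.decompress raises ValueError because the end-of-stream marker never arrives. Cutting the data in half is a hypothetical stand-in for a text-mode read that stops early, and the message wording in Python 3 differs from the Python 2 text quoted in the question.

```python
import bz2

original = b"map data " * 500
compressed = bz2.compress(original)

# Hypothetical stand-in for a read that stopped early: keep the first half.
truncated = compressed[: len(compressed) // 2]

try:
    bz2.decompress(truncated)
    outcome = "ok"
except ValueError:
    # The stream ended before its end-of-stream marker.
    outcome = "ValueError"

print(outcome)
```

The complete byte string decompresses without error, which matches the observation that 7zip handles the same file fine.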


Published by: 全栈程序员-站长. When reposting, please credit the source: https://javaforall.net/138646.html Original link: https://javaforall.net
