python解压bz2文件命令,在Python中解压缩.bz2文件

python解压bz2文件命令,在Python中解压缩.bz2文件So,thisisaseeminglysimplequestion,butI’mapparentlyveryverydull.Ihavealittlescriptthatdownloadsallthe.bz2filesfromawebpage,butforsomereasonthedecompressingofthatfile…

大家好,又见面了,我是你们的朋友全栈君。

python解压bz2文件命令,在Python中解压缩.bz2文件

So, this is a seemingly simple question, but I’m apparently very very dull. I have a little script that downloads all the .bz2 files from a webpage, but for some reason the decompressing of that file is giving me a MAJOR headache.

I’m quite a Python newbie, so the answer is probably quite obvious, please help me.

In this bit of the script, I already have the file, and I just want to read it out to a variable, then decompress that? Is that right? I’ve tried all sorts of way to do this, I usually get “ValueError: couldn’t find end of stream” error on the last line in this snippet. I’ve tried to open up the zipfile and write it out to a string in a zillion different ways. This is the latest.

openZip = open(zipFile, “r”)

s = ”

while True:

newLine = openZip.readline()

if(len(newLine)==0):

break

s+=newLine

print s

uncompressedData = bz2.decompress(s)

Hi Alex, I should’ve listed all the other methods I’ve tried, as I’ve tried the read() way.

METHOD A:

print ‘decompressing ‘ + filename

fileHandle = open(zipFile)

uncompressedData = ”

while True:

s = fileHandle.read(1024)

if not s:

break

print(‘RAW “%s”‘, s)

uncompressedData += bz2.decompress(s)

uncompressedData += bz2.flush()

newFile = open(steamTF2mapdir + filename.split(“.bz2″)[0],”w”)

newFile.write(uncompressedData)

newFile.close()

I get the error:

uncompressedData += bz2.decompress(s)

ValueError: couldn’t find end of stream

METHOD B

zipFile = steamTF2mapdir + filename

print ‘decompressing ‘ + filename

fileHandle = open(zipFile)

s = fileHandle.read()

uncompressedData = bz2.decompress(s)

Same error :

uncompressedData = bz2.decompress(s)

ValueError: couldn’t find end of stream

Thanks so much for you prompt reply. I’m really banging my head against the wall, feeling inordinately thick for not being able to decompress a simple .bz2 file.

By the by, used 7zip to decompress it manually, to make sure the file isn’t wonky or anything, and it decompresses fine.

解决方案

You’re opening and reading the compressed file as if it was a textfile made up of lines. DON’T! It’s NOT.

uncompressedData = bz2.BZ2File(zipFile).read()

seems to be closer to what you’re angling for.

Edit: the OP has shown a few more things he’s tried (though I don’t see any notes about having tried the best method — the one-liner I recommend above!) but they seem to all have one error in common, and I repeat the key bits from above:

opening … the compressed file as if

it was a textfile … It’s NOT.

open(filename) and even the more explicit open(filename, ‘r’) open, for reading, a text file — a compressed file is a binary file, so in order to read it correctly you must open it with open(filename, ‘rb’). ((my recommended bz2.BZ2File KNOWS it’s dealing with a compressed file, of course, so there’s no need to tell it anything more)).

In Python 2.*, on Unix-y systems (i.e. every system except Windows), you could get away with a sloppy use of open (but in Python 3.* you can’t, as text is Unicode, while binary is bytes — different types).

In Windows (and before then in DOS) it’s always been indispensable to distinguish, as Windows’ text files, for historical reason, are peculiar (use two bytes rather than one to end lines, and, at least in some cases, take a byte worth ‘\0x1A’ as meaning a logical end of file) and so the reading and writing low-level code must compensate.

So I suspect the OP is using Windows and is paying the price for not carefully using the ‘rb’ option (“read binary”) to the open built-in. (though bz2.BZ2File is still simpler, whatever platform you’re using!-).

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请联系我们举报,一经查实,本站将立刻删除。

发布者:全栈程序员-站长,转载请注明出处:https://javaforall.net/138646.html原文链接:https://javaforall.net

(0)
全栈程序员-站长的头像全栈程序员-站长


相关推荐

  • IntelliJ idea 主题包下载以及安装

    IntelliJ idea 主题包下载以及安装IntelliJidea默认的主体只有简单的白和灰,不一定能满足所有人的喜好,所以想要下载一些其它不错的主题包;主题下载地址;部分截图;选择自己喜欢的主题下载,个人还是比较喜欢SublimeText2主题,下载好之后,随意放个找得到的位置(还是放在安装目录下面吧,是个整体嘛),是个jar包。安装:file–>importsetttings–…

    2022年5月6日
    456
  • LVM扩容操作

    LVM扩容操作文章目录一、测试环境二、给lvm分区扩容(加硬盘)1.新增硬盘2.给新的硬盘分区3、Lvm操作查看卷组状态:`vgdisplay`创建物理卷:`pvcreate/dev/sdb1`扩展卷组:`vgextend卷组名物理卷路径`扩展逻辑卷:lvextend拉伸文件系统:xfs_growfs或者resize2fs4、验证结果:参考文档一、测试环境我是在virtualbox上安装的测试环境:centos。其具体硬盘配置如下df-Th磁盘情况:fdisk-l今天主要是分别操作下

    2022年6月20日
    36
  • mysql索引abc,a=1 and c=2是否可使用索引_sql联合索引

    mysql索引abc,a=1 and c=2是否可使用索引_sql联合索引在一次查询中,MySQL只能使用一个索引。在真实项目中,SQL语句中的WHERE子句里通常会包含多个查询条件还会有排序、分组等。若表中索引过多,会影响INSERT及UPDATE性能,简单说就是会影响数据写入性能。因为更新数据的同时,也要同时更新索引。最实际的好处当然是查询速度快,性能好。MYSQL中常用的强制性操作(例如强制索引)https://www.jb51.net/article/49807…

    2022年9月5日
    2
  • java利用异或运算的性质,对几个字符_java位运算符详解

    java利用异或运算的性质,对几个字符_java位运算符详解原标题:干货:Java异或运算符的使用方法做Java这么久,还真的从来没有用到过某些基础的Java知识。今天就遇到了一个:Java的异或运算^,这个小不点“^”就是Java的异或运算符,是不是有点小,再来个大点的看得清楚:真^假=真  假^真=真  假^假=假  真^真=假这四个是在网上copy的例子,但它却是说明了Java异或运算的基本法则,那就是:只要两个条件同时为真或假,其结果都为假(这里要…

    2022年10月4日
    0
  • xgboost入门与实战(原理篇)

    xgboost入门与实战(原理篇)xgboost入门与实战(原理篇)前言:xgboost是大规模并行boostedtree的工具,它是目前最快最好的开源boostedtree工具包,比常见的工具包快10倍以上。在数据科学方面,有大量kaggle选手选用它进行数据挖掘比赛,其中包括两个以上kaggle比赛的夺冠方案。在工业界规模方面,xgboost的分布式版本有广泛的可移植性,支持在YARN,MPI,SungridEn

    2022年4月30日
    54
  • 万能乘法速算法大全_小学数学加减乘除【速算法】都在这里! 寒假让孩子练一练…

    万能乘法速算法大全_小学数学加减乘除【速算法】都在这里! 寒假让孩子练一练…★需要电子版资料可直接拉至文末查看领取方式哈!小果老师说:很多小朋友的寒假生活已经开启啦!寒假的确可以好好玩一玩,但某种程度上该学习还是的学习一些的!因此,今天小果老师要给大家分享的内容是数学速算法,这些内容掌握以后就几乎不用担心那些简便运算没头绪啦!赶紧来看看然后为孩子收藏起来吧!01加法的神奇速算法一、加大减差法口诀前面加数加上后面加数的整数,减去后面加数与整数的差等于和。例题1376+98…

    2022年6月5日
    31

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注

关注全栈程序员社区公众号