Python command to decompress bz2 files: decompressing .bz2 files in Python

So, this is a seemingly simple question, but I’m apparently very very dull. I have a little script that downloads all the .bz2 files from a webpage, but for some reason the decompressing of that file is giving me a MAJOR headache.

I’m quite a Python newbie, so the answer is probably quite obvious; please help me.

In this bit of the script, I already have the file, and I just want to read it into a variable and then decompress that. Is that right? I’ve tried all sorts of ways to do this, and I usually get a “ValueError: couldn’t find end of stream” error on the last line of this snippet. I’ve tried to open the zipfile and write it out to a string in a zillion different ways. This is the latest.

openZip = open(zipFile, "r")
s = ''
while True:
    newLine = openZip.readline()
    if(len(newLine)==0):
        break
    s+=newLine
print s
uncompressedData = bz2.decompress(s)

Hi Alex, I should’ve listed all the other methods I’ve tried, as I’ve tried the read() way.

METHOD A:

print 'decompressing ' + filename
fileHandle = open(zipFile)
uncompressedData = ''
while True:
    s = fileHandle.read(1024)
    if not s:
        break
    print('RAW "%s"', s)
    uncompressedData += bz2.decompress(s)
uncompressedData += bz2.flush()
newFile = open(steamTF2mapdir + filename.split(".bz2")[0],"w")
newFile.write(uncompressedData)
newFile.close()

I get the error:

uncompressedData += bz2.decompress(s)

ValueError: couldn’t find end of stream

METHOD B:

zipFile = steamTF2mapdir + filename
print 'decompressing ' + filename
fileHandle = open(zipFile)
s = fileHandle.read()
uncompressedData = bz2.decompress(s)

Same error:

uncompressedData = bz2.decompress(s)

ValueError: couldn’t find end of stream

Thanks so much for your prompt reply. I’m really banging my head against the wall, feeling inordinately thick for not being able to decompress a simple .bz2 file.

By the by, used 7zip to decompress it manually, to make sure the file isn’t wonky or anything, and it decompresses fine.

Solution

You’re opening and reading the compressed file as if it was a textfile made up of lines. DON’T! It’s NOT.

uncompressedData = bz2.BZ2File(zipFile).read()

seems to be closer to what you’re angling for.
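A minimal sketch of that one-liner in context, written for Python 3 (the question's own code is Python 2). The helper name read_bz2 and the throwaway demo file are hypothetical and exist only to make the example self-contained; bz2.BZ2File is the standard-library class recommended above.

```python
import bz2
import os
import tempfile


def read_bz2(path):
    """Return the decompressed contents of a .bz2 file as bytes."""
    # BZ2File handles the binary read itself and decompresses on read().
    with bz2.BZ2File(path) as f:
        return f.read()


# Build a small compressed file to read back (hypothetical demo data).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bz2")
with open(demo_path, "wb") as f:
    f.write(bz2.compress(b"hello, map data\n"))

data = read_bz2(demo_path)
```

Note that in Python 3 the result is a bytes object, so the output file should also be opened in binary mode ("wb") when writing it back out.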

Edit: the OP has shown a few more things he’s tried (though I don’t see any notes about having tried the best method — the one-liner I recommend above!) but they seem to all have one error in common, and I repeat the key bits from above:

opening … the compressed file as if it was a textfile … It’s NOT.

open(filename), and even the more explicit open(filename, 'r'), opens a text file for reading. A compressed file is a binary file, so in order to read it correctly you must open it with open(filename, 'rb'). (My recommended bz2.BZ2File KNOWS it’s dealing with a compressed file, of course, so there’s no need to tell it anything more.)
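As a rough illustration of the 'rb' fix applied to the question's Method B, here is a Python 3 sketch. The function name decompress_file and the demo payload are assumptions made for the example; open and bz2.decompress are the same calls used in the question.

```python
import bz2
import os
import tempfile


def decompress_file(path):
    """Read the whole compressed file as raw bytes, then decompress it."""
    # 'rb' is the key change: the bytes reach bz2 exactly as stored on disk.
    with open(path, "rb") as f:
        compressed = f.read()
    return bz2.decompress(compressed)


# Demo payload containing bytes that text-mode reading can mangle on Windows.
payload = b"line one\r\nline two\x1a still data\n"
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bz2")
with open(demo_path, "wb") as f:
    f.write(bz2.compress(payload))

restored = decompress_file(demo_path)
```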

In Python 2.*, on Unix-y systems (i.e. every system except Windows), you could get away with a sloppy use of open (but in Python 3.* you can’t, as text is Unicode, while binary is bytes — different types).

In Windows (and before that in DOS) it has always been indispensable to distinguish the two, as Windows text files are, for historical reasons, peculiar: they use two bytes rather than one to end lines and, at least in some cases, treat a byte with value 0x1A ('\x1a', Ctrl-Z) as a logical end of file, so the low-level reading and writing code must compensate.

So I suspect the OP is using Windows and is paying the price for not carefully using the ‘rb’ option (“read binary”) to the open built-in. (though bz2.BZ2File is still simpler, whatever platform you’re using!-).
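For the chunked approach attempted in Method A, binary mode alone is not enough: the module-level bz2.decompress expects a complete stream, so it cannot be called on 1024-byte pieces. One possible sketch, assuming Python 3.3 or later and a file containing a single bz2 stream, uses the standard library's bz2.BZ2Decompressor, which keeps state between calls. The function name decompress_to_file and the demo paths are hypothetical.

```python
import bz2
import os
import tempfile


def decompress_to_file(src_path, dst_path, chunk_size=1024):
    """Stream-decompress src_path into dst_path, chunk_size bytes at a time."""
    decompressor = bz2.BZ2Decompressor()
    # Both files are opened in binary mode, as the answer insists.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            # The decompressor remembers partial input, so a chunk may end
            # anywhere inside the compressed stream.
            dst.write(decompressor.decompress(chunk))
            if decompressor.eof:
                # End of the (single, assumed) bz2 stream was reached.
                break


# Demo: random bytes compress poorly, so the input spans several chunks.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "demo.bz2")
dst_path = os.path.join(workdir, "demo")
payload = os.urandom(5000)
with open(src_path, "wb") as f:
    f.write(bz2.compress(payload))

decompress_to_file(src_path, dst_path)
with open(dst_path, "rb") as f:
    result = f.read()
```

This avoids holding the whole file in memory, which matters little for small files but may help with large ones; for most cases bz2.BZ2File remains the simpler choice.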

Published by: 全栈程序员-站长. When reposting, please cite the source: https://javaforall.net/138646.html. Original link: https://javaforall.net