PyPDF2 encoding problem: PyPDF2.utils.PdfReadError: Illegal character in Name Object

Reference: https://github.com/mstamy2/PyPDF2/issues/438

Merging PDF files with PyPDF2 fails with the following error:

Traceback (most recent call last):
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\generic.py", line 484, in readFromStream
    return NameObject(name.decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcb in position 8: invalid continuation byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\projects\myproject\apps\backstage\views\busi_contract_manage_view.py", line 703, in post
    merge_pdf_result = merge_pdf(final_files, pdf_path)
  File "D:\projects\myproject\apps\utils\doc_convert_util.py", line 86, in merge_pdf
    pdf_writer.write(new_file)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 482, in write
    self._sweepIndirectReferences(externalReferenceMap, self._root)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 556, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, data[i])
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 577, in _sweepIndirectReferences
    newobj = data.pdf.getObject(data)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\pdf.py", line 1611, in getObject
    retval = readObject(self.stream, self)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\generic.py", line 66, in readObject
    return DictionaryObject.readFromStream(stream, pdf)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\generic.py", line 579, in readFromStream
    value = readObject(stream, pdf)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\generic.py", line 60, in readObject
    return NameObject.readFromStream(stream, pdf)
  File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\generic.py", line 492, in readFromStream
    raise utils.PdfReadError("Illegal character in Name Object")
PyPDF2.utils.PdfReadError: Illegal character in Name Object

Locate the file named in the traceback:

File "D:\projects\myproject\venv\lib\site-packages\PyPDF2\generic.py", line 484

Original code at line 484:

try:
    return NameObject(name.decode('utf-8'))
except (UnicodeEncodeError, UnicodeDecodeError) as e:
    # Name objects should represent irregular characters
    # with a '#' followed by the symbol's hex number
    if not pdf.strict:
        warnings.warn("Illegal character in Name Object", utils.PdfReadWarning)
        return NameObject(name)
    else:
        raise utils.PdfReadError("Illegal character in Name Object")

Add the following inside the except branch, so that a name that is not valid UTF-8 is retried as GBK:

return NameObject(name.decode('gbk'))

After the change:

try:
    return NameObject(name.decode('utf-8'))
except (UnicodeEncodeError, UnicodeDecodeError):
    try:
        # Not valid UTF-8: retry as GBK before giving up
        return NameObject(name.decode('gbk'))
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Name objects should represent irregular characters
        # with a '#' followed by the symbol's hex number
        if not pdf.strict:
            warnings.warn("Illegal character in Name Object", utils.PdfReadWarning)
            return NameObject(name)
        else:
            raise utils.PdfReadError("Illegal character in Name Object")
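
The fallback can be tried outside PyPDF2 with a small standalone sketch. The helper name decode_pdf_name and the sample name token are hypothetical, chosen only so that the byte 0xcb lands at position 8 as in the traceback; the actual name in the failing PDF is not known.

```python
def decode_pdf_name(raw: bytes, fallbacks=("gbk",)) -> str:
    """Decode a name token as UTF-8, retrying with fallback codecs."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as first_error:
        for codec in fallbacks:
            try:
                return raw.decode(codec)
            except UnicodeDecodeError:
                continue
        # No fallback codec worked: surface the original UTF-8 error
        raise first_error


# Hypothetical name token: an ASCII prefix followed by GBK-encoded text.
raw_name = b"/ABCDEF+" + "宋体".encode("gbk")

try:
    raw_name.decode("utf-8")
except UnicodeDecodeError as exc:
    print("utf-8 failed at byte offset", exc.start)

print(decode_pdf_name(raw_name))
```

GBK accepts many byte sequences, so a successful decode does not prove that the name was really written in GBK. It only means the bytes can be turned into text without raising.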

After this change an error is still raised, so one more place needs to be modified:

Lib/site-packages/PyPDF2/utils.py, line 238

Original code:

r = s.encode('latin-1')
if len(s) < 2:
    bc[s] = r
return r

Modified code:

try:
    r = s.encode('latin-1')
except UnicodeEncodeError:
    # Characters outside Latin-1 (e.g. Chinese) are written as UTF-8
    r = s.encode('utf-8')
if len(s) < 2:
    bc[s] = r
return r
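
The effect of this second change can be seen in isolation with the sketch below. The function to_bytes is a hypothetical stand-in that mirrors only the patched lines (it leaves out the bc cache), and the sample name is an assumption made for illustration.

```python
def to_bytes(s):
    """Return bytes unchanged; encode text as Latin-1, falling back to UTF-8."""
    if isinstance(s, bytes):
        return s
    try:
        return s.encode("latin-1")
    except UnicodeEncodeError:
        # Text outside Latin-1, such as Chinese characters
        return s.encode("utf-8")


# Hypothetical name as it would look after being decoded from GBK.
name = "/ABCDEF+宋体"
original_bytes = b"/ABCDEF+" + "宋体".encode("gbk")

written_bytes = to_bytes(name)
print(written_bytes)
print(written_bytes == original_bytes)  # False: UTF-8 bytes differ from GBK bytes
```

One consequence worth keeping in mind: with both patches applied, a name that was read as GBK is written back as UTF-8, so its bytes in the merged file are not the same as in the input file. Whether that matters depends on the PDF, so it is worth opening the merged output and checking it.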

 

Published by 全栈程序员-站长. When reposting, please credit the source: https://javaforall.net/152402.html (original link: https://javaforall.net)
