Introduction to RapidXml



Source: http://rapidxml.sourceforge.net/manual.html

RapidXml is an attempt to create the fastest XML DOM parser possible, while retaining usability, portability and reasonable W3C compatibility. It is an in-situ parser written in C++, with parsing speed approaching that of the strlen() function executed on the same data.
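To make "in-situ" concrete: an in-situ parser does not copy names and values out of the source text. It writes terminators directly into the caller's buffer and hands back pointers into that buffer. The sketch below is not RapidXml code. `key_value` and `split_in_situ` are hypothetical names, and the helper only splits a `key=value` string in place, which shows the same idea on a much simpler grammar.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical result type: both pointers refer to the caller's buffer.
struct key_value {
    char* key;
    char* value;
};

// Splits "key=value" in place. The '=' is overwritten with '\0', so no
// memory is allocated and nothing is copied. The buffer must be writable,
// zero-terminated, and must outlive the returned pointers.
inline key_value split_in_situ(char* text) {
    assert(text != nullptr);
    char* eq = std::strchr(text, '=');
    if (eq == nullptr) {
        // No separator: the whole string is the key, the value is the empty tail.
        return {text, text + std::strlen(text)};
    }
    *eq = '\0';            // terminate the key inside the source buffer
    return {text, eq + 1}; // the value starts right after the old '='
}
```

The same trade-off applies to any in-situ parser: the source buffer must be writable and must stay alive for as long as the returned pointers are in use.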

The entire parser is contained in a single header file, so no building or linking is necessary. To use it, copy the rapidxml.hpp file to a convenient place (such as your project directory) and include it where needed. You may also want to use the printing functions contained in the header rapidxml_print.hpp.

1.1 Dependencies And Compatibility

RapidXml has no dependencies other than a very small subset of the standard C++ library (<cassert>, <cstdlib>, <new> and <exception>, unless exceptions are disabled). It should compile on any reasonably conformant compiler, and was tested on Visual C++ 2003, Visual C++ 2005, Visual C++ 2008, gcc 3, gcc 4, and Comeau 4.3.3. Care was taken that no warnings are produced on these compilers, even with the highest warning levels enabled.

1.2 Character Types And Encodings

RapidXml is character type agnostic, and can work with both narrow and wide characters. The current version does not fully support UTF-16 or UTF-32, so support for wide characters is somewhat limited. However, it should successfully parse wchar_t strings containing UTF-16 or UTF-32 if the endianness of the data matches that of the machine. UTF-8 is fully supported, including all numeric character references, which are expanded into the appropriate UTF-8 byte sequences (unless you enable the parse_no_utf8 flag).

Note that RapidXml performs no decoding: strings returned by the name() and value() functions contain text in the same encoding as the source file. RapidXml understands and expands the following character references:
&apos; &amp; &quot; &lt; &gt; &#...; Other character references are not expanded.
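Expanding a numeric character reference such as `&#8364;` comes down to turning a code point into a UTF-8 byte sequence. The function below sketches that encoding step using the standard UTF-8 bit layout. `append_utf8` is a hypothetical name, and this is not RapidXml's internal code.

```cpp
#include <cassert>
#include <string>

// Appends the UTF-8 encoding of `code` to `out`.
// Returns false for values above U+10FFFF, which are not valid code points.
// Sketch only: surrogate code points (U+D800..U+DFFF) are not rejected here.
inline bool append_utf8(unsigned long code, std::string& out) {
    if (code < 0x80) {                        // 1 byte:  0xxxxxxx
        out += static_cast<char>(code);
    } else if (code < 0x800) {                // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (code >> 6));
        out += static_cast<char>(0x80 | (code & 0x3F));
    } else if (code < 0x10000) {              // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (code >> 12));
        out += static_cast<char>(0x80 | ((code >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (code & 0x3F));
    } else if (code <= 0x10FFFF) {            // 4 bytes: 11110xxx and three 10xxxxxx
        out += static_cast<char>(0xF0 | (code >> 18));
        out += static_cast<char>(0x80 | ((code >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((code >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (code & 0x3F));
    } else {
        return false;                         // outside the Unicode range
    }
    return true;
}
```

For example, passing `0x20AC` (the euro sign, decimal 8364) appends the three bytes `E2 82 AC`.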

1.3 Error Handling

By default, RapidXml uses C++ exceptions to report errors. If this behaviour is undesirable, RAPIDXML_NO_EXCEPTIONS can be defined to suppress exception code. See the parse_error class and the parse_error_handler() function for more information.
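As an illustration of exception-based error reporting in a parser, the sketch below shows an exception type that records a message together with a pointer to the offending position in the source text. All names here (`sketch_parse_error`, `require_leading_tag`, `error_offset`) are hypothetical. This is not the library's own parse_error class, only the general pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>

// Hypothetical error type: a message plus a pointer into the source text.
class sketch_parse_error : public std::exception {
public:
    sketch_parse_error(const char* what, const char* where)
        : m_what(what), m_where(where) {}
    const char* what() const noexcept override { return m_what; }
    const char* where() const noexcept { return m_where; }
private:
    const char* m_what;   // expected to be a string literal (not copied)
    const char* m_where;  // position inside the caller's buffer
};

// Toy check: skips leading whitespace and throws unless a '<' follows.
inline void require_leading_tag(const char* text) {
    assert(text != nullptr);
    const char* p = text;
    while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r') ++p;
    if (*p != '<')
        throw sketch_parse_error("expected <", p);
}

// Returns the offset of the error within `text`, or -1 if the check passes.
inline std::ptrdiff_t error_offset(const char* text) {
    try {
        require_leading_tag(text);
        return -1;
    } catch (const sketch_parse_error& e) {
        return e.where() - text;  // pointer difference gives the position
    }
}
```

Carrying a pointer rather than a line number keeps the error path cheap. The caller can compute line and column from the pointer only when it actually needs to display them.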

1.4 Memory Allocation

RapidXml uses a special memory pool object to allocate nodes and attributes, because direct allocation using the new operator would be far too slow. The underlying memory allocations performed by the pool can be customized by use of the memory_pool::set_allocator() function. See class memory_pool for more information.
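To show why a pool is faster than calling new for every node, here is a minimal sketch of the general technique: one up-front allocation, after which each request only advances an offset. `sketch_pool` is a hypothetical class with a fixed capacity chosen for brevity. It is not RapidXml's memory_pool and makes no claim about how that class is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical fixed-capacity pool: a single malloc up front, then each
// allocation just bumps an offset. All memory is released at once.
class sketch_pool {
public:
    explicit sketch_pool(std::size_t capacity)
        : m_begin(static_cast<char*>(std::malloc(capacity))),
          m_used(0),
          m_capacity(m_begin ? capacity : 0) {}
    ~sketch_pool() { std::free(m_begin); }
    sketch_pool(const sketch_pool&) = delete;
    sketch_pool& operator=(const sketch_pool&) = delete;

    // Returns aligned storage of `size` bytes, or nullptr when the pool is full.
    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t start = (m_used + align - 1) / align * align;
        if (start > m_capacity || size > m_capacity - start)
            return nullptr;
        m_used = start + size;
        return m_begin + start;
    }

    // Forgets every allocation; later calls reuse the same memory.
    void clear() { m_used = 0; }

    std::size_t used() const { return m_used; }

private:
    char* m_begin;
    std::size_t m_used;
    std::size_t m_capacity;
};
```

There is no per-object free in this sketch. Objects placed in the pool live until `clear()` or the destructor runs, which suits a DOM tree whose nodes are normally discarded together.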

1.5 W3C Compliance

RapidXml is not a W3C compliant parser, primarily because it ignores DOCTYPE declarations. There are a number of other, minor incompatibilities as well. Still, it can successfully parse and produce complete trees of all valid XML files in the W3C conformance suite (over 1000 files specially designed to find flaws in XML processors). In destructive mode it performs whitespace normalization and character entity substitution for a small set of built-in entities.

1.6 API Design

The RapidXml API is minimalistic, to reduce code size as much as possible and to facilitate use in embedded environments. Additional convenience functions are provided in separate headers: rapidxml_utils.hpp and rapidxml_print.hpp. The contents of these headers are not an essential part of the library and are currently not documented (other than with comments in the code).

1.7 Reliability

RapidXml is very robust and comes with a large harness of unit tests. Special care has been taken to ensure stability of the parser no matter what source text is thrown at it. One of the unit tests produces 100,000 randomly corrupted variants of an XML document which (when uncorrupted) contains all constructs recognized by RapidXml. RapidXml passes this test when it correctly recognizes that errors have been introduced, and does not crash or loop indefinitely.

Another unit test puts RapidXml head-to-head with another, well-established XML parser, and verifies that their outputs match across a wide variety of small and large documents.

Yet another test feeds RapidXml over 1000 test files from the W3C compliance suite, and verifies that correct results are obtained. There are also additional tests that verify each API function separately, and test that the various parsing modes work as expected.

1.8 Acknowledgements

I would like to thank Arseny Kapoulkine for his work on pugixml, which was an inspiration for this project. Additional thanks go to Kristen Wegner for creating pugxml, from which pugixml was derived. Janusz Wohlfeil kindly ran RapidXml speed tests on hardware that I did not have access to, allowing me to expand the performance comparison table.



Reposted from: https://my.oschina.net/zhmsong/blog/5230


Publisher: 全栈程序员 (site administrator). When reposting, please credit the source: https://javaforall.net/160911.html. Original link: https://javaforall.net
