Introduction to RapidXml


Source: http://rapidxml.sourceforge.net/manual.html

RapidXml is an attempt to create the fastest XML DOM parser possible, while retaining usability, portability and reasonable W3C compatibility. It is an in-situ parser written in C++, with parsing speed approaching that of the strlen() function executed on the same data.
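To make the term "in-situ" more concrete, here is a minimal sketch of the general idea: the parser hands back pointers into the source buffer and writes terminators into it, instead of copying strings out. The function `insitu_element_name` is a hypothetical helper written for this illustration; it is not part of RapidXml and handles only a bare opening tag.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper, not part of RapidXml: given text that starts with an
// opening tag such as "<note id='1'>", returns a pointer to the element name
// inside the same buffer and terminates the name in place. No copy is made.
// Returns nullptr if the text does not look like an opening tag.
char* insitu_element_name(char* text) {
    assert(text != nullptr);
    if (*text != '<') return nullptr;

    char* name = text + 1;
    char* end = name;
    // Scan to the first character that cannot belong to the name.
    while (*end != '\0' && *end != '>' && *end != ' ' && *end != '/') ++end;
    if (end == name || *end == '\0') return nullptr;

    *end = '\0';  // destructive step: the source buffer is modified
    return name;
}
```

Because the result points into the caller's buffer, that buffer has to be writable and has to outlive every pointer obtained from it. This is the usual trade-off of in-situ parsing.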

The entire parser is contained in a single header file, so no building or linking is necessary. To use it, copy the rapidxml.hpp file to a convenient place (such as your project directory) and include it where needed. You may also want to use the printing functions contained in the header rapidxml_print.hpp.

1.1 Dependencies And Compatibility

RapidXml has no dependencies other than a very small subset of the standard C++ library (<cassert>, <cstdlib>, <new> and <exception>, unless exceptions are disabled). It should compile on any reasonably conformant compiler, and was tested on Visual C++ 2003, Visual C++ 2005, Visual C++ 2008, gcc 3, gcc 4, and Comeau 4.3.3. Care was taken that no warnings are produced on these compilers, even with the highest warning levels enabled.

1.2 Character Types And Encodings

RapidXml is character type agnostic, and can work with both narrow and wide characters. The current version does not fully support UTF-16 or UTF-32, so use of wide characters is somewhat limited. However, it should successfully parse wchar_t strings containing UTF-16 or UTF-32 if the endianness of the data matches that of the machine. UTF-8 is fully supported, including all numeric character references, which are expanded into the appropriate UTF-8 byte sequences (unless you enable the parse_no_utf8 flag).

Note that RapidXml performs no decoding: strings returned by the name() and value() functions contain text in the same encoding as the source file. RapidXml understands and expands the following character references: &apos; &amp; &quot; &lt; &gt; &#...; Other character references are not expanded.
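As an illustration of what expanding a numeric character reference into a UTF-8 byte sequence involves, the following sketch encodes a single code point according to the standard UTF-8 rules. The function `append_utf8` is a hypothetical helper for this example and is not RapidXml's internal routine.

```cpp
#include <string>

// Hypothetical helper, not part of RapidXml: appends the UTF-8 encoding of
// the code point `code` to `out`. Returns false (and appends nothing) for
// values above U+10FFFF and for surrogates, which are not valid code points
// to encode.
bool append_utf8(std::string& out, unsigned long code) {
    if (code > 0x10FFFF || (code >= 0xD800 && code <= 0xDFFF)) return false;

    if (code < 0x80) {            // 1 byte: 0xxxxxxx
        out += static_cast<char>(code);
    } else if (code < 0x800) {    // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (code >> 6));
        out += static_cast<char>(0x80 | (code & 0x3F));
    } else if (code < 0x10000) {  // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (code >> 12));
        out += static_cast<char>(0x80 | ((code >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (code & 0x3F));
    } else {                      // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (code >> 18));
        out += static_cast<char>(0x80 | ((code >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((code >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (code & 0x3F));
    }
    return true;
}
```

For example, a reference such as `&#x20AC;` (the euro sign, U+20AC) corresponds to the three bytes E2 82 AC under this scheme.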

1.3 Error Handling

By default, RapidXml uses C++ exceptions to report errors. If this behaviour is undesirable, RAPIDXML_NO_EXCEPTIONS can be defined to suppress the exception code. See the parse_error class and the parse_error_handler() function for more information.
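The general pattern of reporting a parse failure through an exception that carries both a message and a position in the source text can be sketched as follows. Everything here is a stand-in written for illustration: `sketch_parse_error`, `sketch_expect_open_tag` and `sketch_error_offset` are hypothetical names, and the real parse_error class should be looked up in the manual rather than inferred from this sketch.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <exception>

// Illustrative stand-in, not RapidXml's class: an exception that remembers
// a message and a pointer to the place in the text where parsing failed.
class sketch_parse_error : public std::exception {
public:
    sketch_parse_error(const char* what, const char* where)
        : what_(what), where_(where) {}
    const char* what() const noexcept override { return what_; }
    const char* where() const noexcept { return where_; }
private:
    const char* what_;
    const char* where_;
};

// Hypothetical check: skips leading whitespace and throws unless the next
// character opens a tag.
inline void sketch_expect_open_tag(const char* text) {
    assert(text != nullptr);
    while (std::isspace(static_cast<unsigned char>(*text))) ++text;
    if (*text != '<') throw sketch_parse_error("expected <", text);
}

// Caller side: turns the exception into an offset into the source text,
// or -1 when no error was reported.
inline std::ptrdiff_t sketch_error_offset(const char* text) {
    try {
        sketch_expect_open_tag(text);
        return -1;
    } catch (const sketch_parse_error& e) {
        return e.where() - text;
    }
}
```

When exceptions are disabled, the same information would have to reach the caller through a user-supplied handler function instead of a catch block.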

1.4 Memory Allocation

RapidXml uses a special memory pool object to allocate nodes and attributes, because direct allocation using the new operator would be far too slow. Underlying memory allocations performed by the pool can be customized by use of the memory_pool::set_allocator() function. See class memory_pool for more information.
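To show why a pool is cheaper than calling new for every node, here is a minimal sketch of a bump-pointer pool, assuming fixed-size blocks and no per-object deallocation. `sketch_pool` is a hypothetical class for illustration only; it does not reproduce the interface or the internals of RapidXml's memory_pool.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical bump-pointer pool, not RapidXml's memory_pool. Memory is
// carved out of large blocks; individual allocations are never freed, and
// everything is released at once when the pool is destroyed.
class sketch_pool {
public:
    explicit sketch_pool(std::size_t block_size = 64 * 1024)
        : block_size_(block_size) {}

    void* allocate(std::size_t size) {
        assert(size > 0);
        // Round up so that every returned pointer stays suitably aligned.
        const std::size_t align = alignof(std::max_align_t);
        size = (size + align - 1) / align * align;

        // Start a new block when the current one cannot hold the request.
        if (blocks_.empty() || used_ + size > capacity_) {
            capacity_ = size > block_size_ ? size : block_size_;
            blocks_.push_back(
                std::unique_ptr<unsigned char[]>(new unsigned char[capacity_]));
            used_ = 0;
        }
        void* p = blocks_.back().get() + used_;
        used_ += size;  // the common case is just this pointer bump
        return p;
    }

    std::size_t block_count() const { return blocks_.size(); }

private:
    std::size_t block_size_;
    std::size_t capacity_ = 0;
    std::size_t used_ = 0;
    std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};
```

In the common case an allocation costs one comparison and one addition, and the general-purpose allocator is only called once per block.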

1.5 W3C Compliance

RapidXml is not a W3C compliant parser, primarily because it ignores DOCTYPE declarations. There are a number of other, minor incompatibilities as well. Still, it can successfully parse and produce complete trees of all valid XML files in the W3C conformance suite (over 1000 files specially designed to find flaws in XML processors). In destructive mode it performs whitespace normalization and character entity substitution for a small set of built-in entities.

1.6 API Design

The RapidXml API is minimalistic, to reduce code size as much as possible and to facilitate use in embedded environments. Additional convenience functions are provided in separate headers: rapidxml_utils.hpp and rapidxml_print.hpp. The contents of these headers are not an essential part of the library, and are currently not documented (other than with comments in the code).

1.7 Reliability

RapidXml is very robust and comes with a large harness of unit tests. Special care has been taken to ensure stability of the parser no matter what source text is thrown at it. One of the unit tests produces 100,000 randomly corrupted variants of an XML document which (when uncorrupted) contains all constructs recognized by RapidXml. RapidXml passes this test when it correctly recognizes that errors have been introduced, and does not crash or loop indefinitely.

Another unit test puts RapidXml head-to-head with another, well established XML parser, and verifies that their outputs match across a wide variety of small and large documents.

Yet another test feeds RapidXml with over 1000 test files from the W3C compliance suite, and verifies that correct results are obtained. There are also additional tests that verify each API function separately, and test that various parsing modes work as expected.

1.8 Acknowledgements

I would like to thank Arseny Kapoulkine for his work on pugixml, which was an inspiration for this project. Additional thanks go to Kristen Wegner for creating pugxml, from which pugixml was derived. Janusz Wohlfeil kindly ran RapidXml speed tests on hardware that I did not have access to, allowing me to expand the performance comparison table.



Reposted from: https://my.oschina.net/zhmsong/blog/5230
