No way to disable datapump estimate?

No way to disable datapump estimate?

昨天晚上去客户现场配合一个导入导出作业;这是一套Solaris 10上的10.2.0.1系统,导出采用expdp数据泵工具,需要导出的数据源是一张大小在120G左右的单表,该表存放了最近一年的数据,实际导出时只需要最近三个月的数据,所以使用了QUERY参数,并指定了并行度为2。 该导出作业之前未经测试过,语句也是临时写的,实际执行导出工作时发现在评估阶段(estimate phase)耗费了大约十多分钟的时间,estimate默认使用blocks模式,即通过计算对象段占用的数据库块来预估dumpfile的大小;此外还有statistics模式通过对象的统计信息来推算导出文件的大小:

The block method of estimate is the least accurate. We investigate how many blocks were allocated for the storage segments for the tables specified. If the table had been recently analyzed, the estimate=statistics would be the most accurate. So remember, it is important to generate statistics before calculating the estimate using estimate=statistics method. Additionally, ESTIMATE=BLOCKS method could be inaccurate when: a) The table was created with a much bigger initial extent size than was needed for the actual table data b) Many rows have been deleted from the table, or a very small percentage of each block is used. # When statistics have been explicitly set via the DBMS_STATS package in order to force certain optimizer plans, instead of being directly collected from the data, ESTIMATE=STATISTICS may not report correct results.

但实际上这2中方式在使用QUERY的情况下都无法正确预估导出文件的大小:

[maclean@rh2 udump]$ expdp maclean/maclean directory=dump  tables=estimate_me query=estimate_me:'"where t1<2000"'  estimate=statistics

Export: Release 10.2.0.4.0 - 64bit Production on Tuesday, 02 November, 2010 21:28:26

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
FLASHBACK automatically enabled to preserve database integrity.
Starting "MACLEAN"."SYS_EXPORT_TABLE_01":  maclean/******** directory=dump tables=estimate_me query=estimate_me:"where t1<2000" estimate=statistics
Estimate in progress using STATISTICS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
.  estimated "MACLEAN"."ESTIMATE_ME"                     104.7 MB
Total estimation using STATISTICS method: 104.7 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
. . exported "MACLEAN"."ESTIMATE_ME"                     32.42 KB    1999 rows
Master table "MACLEAN"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for MACLEAN.SYS_EXPORT_TABLE_01 is:
  /s01/dump/expdat.dmp
Job "MACLEAN"."SYS_EXPORT_TABLE_01" successfully completed at 21:28:29

/* statistics模式使用统计来判断导出文件大小,比较容易控制 */

begin
 dbms_stats.set_table_stats(ownname => 'MACLEAN',tabname => 'ESTIMATE_ME',numrows => 999999999999999,numblks => 99999999999999999999);
  end;
/

PL/SQL procedure successfully completed.

Estimate in progress using STATISTICS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
.  estimated "MACLEAN"."ESTIMATE_ME"                     10244 GB
Total estimation using STATISTICS method: 10244 GB

/* 再次导出时数据泵采用了"作假的"统计信息*/

[maclean@rh2 udump]$ expdp maclean/maclean directory=dump  tables=estimate_me query=estimate_me:'"where t1<2000"'  estimate=blocks

Export: Release 10.2.0.4.0 - 64bit Production on Tuesday, 02 November, 2010 21:31:58

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
FLASHBACK automatically enabled to preserve database integrity.
Starting "MACLEAN"."SYS_EXPORT_TABLE_01":  maclean/******** directory=dump tables=estimate_me query=estimate_me:"where t1<2000" estimate=blocks
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
.  estimated "MACLEAN"."ESTIMATE_ME"                       192 MB
Total estimation using BLOCKS method: 192 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
. . exported "MACLEAN"."ESTIMATE_ME"                     32.42 KB    1999 rows
Master table "MACLEAN"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for MACLEAN.SYS_EXPORT_TABLE_01 is:
  /s01/dump/expdat.dmp
Job "MACLEAN"."SYS_EXPORT_TABLE_01" successfully completed at 21:32:01

/* 可以看到以上BLOCKS method预估的dumpfile大小为192MB,而STATISTICS方式预估为104.7MB;
    然而实际的使用QUERY参数导出的1999行数据只占用32k的空间!
    estimate在这里摆了一个大乌龙
*/

查了一下metalink发现10.2.0.1上存在expdp在estimate评估阶段耗时很久的bug:

EXPDP Slow and Estimate Takes Long Time [ID 822283.1]
Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 10.2.0.3
This problem can occur on any platform.
Symptoms
Data pump Export is very slow and long time is spent in Estimating the data size.
EXPDP is stuck for long at :

Starting "SYSTEM"."SYS_EXPORT_SCHEMA_01": system/******** parfile=dpexp.par
Estimate in progress using BLOCKS method...

AWR report taken when EXPDP is running slow show the below query took long time to complete .

SELECT /*+rule*/ SYS_XMLGEN(VALUE(KU$),
XMLFORMAT.createFormat2('TABLE_DATA_T', '7')), 0 ,KU$.BASE_OBJ.NAME
,KU$.BASE_OBJ.OWNER_NAME ,'TABLE' ,to_char(KU$.BYTES_ALLOC) ,
to_char(KU$.ET_PARALLEL) ,KU$.FGAC ,KU$.NONSCOPED_REF ,KU$.XMLSCHEMACOLS
,KU$.NAME , KU$.NAME ,'TABLE_DATA' ,KU$.PART_NAME ,KU$.SCHEMA_OBJ.OWNER_NAME
,KU$.TS_NAME , KU$.TRIGFLAG
,decode(KU$.SCHEMA_OBJ.TYPE_NUM,2,decode(bitand(KU$.PROPERTY,8192),8192,
'NESTED TABLE','T'),19,'PARTITION',20,'PARTITION','SUBPARTITION')
,to_char(KU$.UNLOAD_METHOD)
FROM SYS.KU$_TABLE_DATA_VIEW KU$
WHERE NOT BITAND(KU$.BASE_OBJ.FLAGS,128)!=0 AND NOT (BITAND
(KU$.BASE_OBJ.FLAGS,16)=16) AND KU$.BASE_OBJ.NAME IN
('PA_DRAFT_INVOICES_ALL') AND KU$.BASE_OBJ.OWNER_NAME IN ('PA') AND NOT
EXISTS (SELECT 1 FROM SYS.KU$NOEXP_TAB A WHERE A.OBJ_TYPE='TABLE' AND
A.NAME=KU$.BASE_OBJ.NAME AND A.SCHEMA=KU$.BASE_OBJ.OWNER_NAME) AND NOT EXISTS
(SELECT 1 FROM SYS.KU$NOEXP_TAB A WHERE A.OBJ_TYPE='SCHEMA' AND
A.NAME=KU$.BASE_OBJ.OWNER_NAME);

Cause
The issue here is with using RBO.
Data pump is not choosing Cost Based Optimizer.
This is because of the 'unpublished' bug 5929373 -DATA PUMP TAKES LONG TIME TO EXPORT DATA

This bug is closed as duplicate of another bug << 5464834 >>

Solution

   1. Upgrade to 10.2.0.4 . The bug is fixed in 10.2.0.4
      Or
   2. Apply the << patch 5464834 >> if available for your platform .

        * Login to metalink
        * Go to 'Patches and Updates'
        * Click on 'Simple Search'
        * Enter the patch number 5464834
        * Select your platform from the dropdown list .
        * Patch,if available , will be displayed .
        * Download and apply the patch .

该Bug可以通过升级版本解决,但远水解不了近火,最初想disable掉estimate功能;但查阅若干资料后发现目前似乎没有能够关掉estimate功能的办法,这个就是悲哀所在了!耐心等estimate阶段完成后,实际数据导出阶段倒也只花了十来分钟,就这一点看来datapump很有中国式的办事风格。 另外当指定的导出文件数小于指定的parallelism,且导出数据量较大时可能出现ORA-39095错误,会导致导出作业意外终止:

39095, 00000, "Dump file space has been exhausted: Unable to allocate %s bytes"
// *Cause:  The Export job ran out of dump file space before the job was
//          completed.
// *Action: Reattach to the job and add additional dump files to the job
//          restarting the job.

EXPDP generating ORA-39095 : " dump file space has been exhausted"
Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 11.2.0.2 - Release: 10.2 to 11.2
Information in this document applies to any platform.
Symptoms
Checked for relevance on 10-22-2010

Errors while performing Datapump Export:

- when exporting a large number of tables then it completely stops
- the jobs may remain in the datapump export tables
- No trace generated at the time of the failure.
ERROR:
ORA - 39095 : " dump file space has been exhausted. Unable to allocate 4096 bytes".jobsystem.sys_export_full_02 stops due to fatal error.

Cause
Customer chose only limited number of files in "dumpfile" parameter with limited "filesize" , i.e.:

   full=Y
   directory=dpump_dir
   dumpfile=full_1.dmp,full_2.dmp,full_3.dmp,full_4.dmp
   filesize=3000000000

This means 4 dump files with filesize 3000000000 byte (2.79 G).

the dumpfile parameter is the issue here.

.
Solution
1- You can use the dynamic format (i.e dumpfile=full_%U.dmp) :

The 'wildcard' specification for the dump file can expand up to 99 files. If 99 files have been generated before the export has completed, it will again return the ORA-39095 error.

2- If this is yet not enough and more files are needed, a workaround would be to speficy a bigger 'filesize' parameter.

3- If this is inconvenient, another option is to use this syntax:

      dumpfile=fullexp%U.dmp, fullexp2_%U.dmp, fullexp3_%U.dmp

which can expand up to 3*99 files.

If encountering problems containing the dump in a single directory using this solution, you may prefer this syntax:

     dumpfile=dmpdir1:fullexp1_%U.dmp, dmpdir2:fullexp2_%U.dmp, dmpdir3:fullexp3_U.dmp

(assuming the 3 directory objects listed above had been already created first).

解决方法很简单就是去掉并行度,或者指定动态的导出文件名。 记以录之,回去补觉了!

  • The block method of estimate is the least accurate. We investigate how many blocks were allocated for the storage segments for the tables specified. If the table had been recently analyzed, the estimate=statistics would be the most accurate. So remember, it is important to generate statistics before calculating the estimate using estimate=statistics method. Additionally, ESTIMATE=BLOCKS method could be inaccurate when:a) The table was created with a much bigger initial extent size than was needed for the actual table data b) Many rows have been deleted from the table, or a very small percentage of each block is used.
  • When statistics have been explicitly set via the DBMS_STATS package in order to force certain optimizer plans, instead of being directly collected from the data, ESTIMATE=STATISTICS may not report correct results.
版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请联系我们举报,一经查实,本站将立刻删除。

发布者:全栈程序员-站长,转载请注明出处:https://javaforall.net/110699.html原文链接:https://javaforall.net

(0)
上一篇 2021年8月8日 下午7:00
下一篇 2021年8月8日 下午8:00


相关推荐

  • 智谱推出全球100B级最强开源多模态模型GLM-4.5V:获41个榜单SOTA

    智谱推出全球100B级最强开源多模态模型GLM-4.5V:获41个榜单SOTA

    2026年3月12日
    2
  • 【web前端】16.实现点击回到页面底部/顶部

    【web前端】16.实现点击回到页面底部/顶部scriptsrc js jquery 3 3 1 min js script document ready function go up click function console log go up script scriptsrc js

    2026年3月17日
    2
  • Delphi考虑sql注入 QuotedStr

    Delphi考虑sql注入 QuotedStr之前只在BS架构的项目中考虑了Sql注入问题,却很少考虑到用了多年的Delphi项目也应该考虑Sql注入的问题,今天做了个实验,成功完成注入,把表里数据全部删除,以后再做Delphi项目还真的考虑这个问题。总体讲,大体知道有两种方式可以避免Delphi中的Sql注入:1、用QuotedStr替代”’进行字符串拼接;2、采用传参数的方式与数据库交互,这种方式哪天再仔细体验一下。 以下为一个小测试,…

    2022年10月18日
    3
  • matplotlib绘图太单调 | 我们可以自定义背景图片

    matplotlib绘图太单调 | 我们可以自定义背景图片读者朋友们好 今天分享 matplotlib 自定义背景图片 本文讲一下如何使用 matplotlib 在自定义背景上进行图像绘制 比如下方的图片就是在找好背景图之后 使用 matplotlib 绘制而成其实呢 使用自定义背景是非常简单的事情 找好你要的背景图 之后使用下面三行代码就能将这张图设置为背景 img plt imread 背景 png fig ax plt subplots figsiz

    2026年3月18日
    2
  • 宝塔安装mysql失败_手机如何卸载宝塔防火墙

    宝塔安装mysql失败_手机如何卸载宝塔防火墙先用宝塔自带卸载宝塔软件环境默认是不卸载得mysql先停止服务,再删除服务apache先停止服务,再删除服务然后安装新的宝塔即可

    2025年9月22日
    5
  • wpf一定比winform好吗_winform转wpf好转吗

    wpf一定比winform好吗_winform转wpf好转吗1、结果来说,属于两套界面渲染方式。一个是对传统windows界面元素的封装,通过gdi绘制。另一个是全新的dx渲染绘制的界面,也脱离了对传统windows控件的依赖,没有历史包袱,理论上可以展现更炫酷的界面。对初级开发人员来说,没太大区别,类似的基本设计器是他们设计界面的主要手段,一样给事件编写代码。对初级以上开发人员来说,wpf需要学习xaml,有全新的ui描述语言,特别是可以通过模板的嵌套…

    2025年7月27日
    4

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注

关注全栈程序员社区公众号