数仓搭建——ODS层

数仓搭建——ODS层数仓搭建 ODS 层用户行为数据和业务数据

1 ODS层(用户行为数据)

1.1 创建日志表ods_log

建表语句

hive (gmall)> drop table if exists ods_log; CREATE EXTERNAL TABLE ods_log (`line` string) PARTITIONED BY (`dt` string) -- 按照时间创建分区 STORED AS -- 指定存储方式,读数据采用LzoTextInputFormat; INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_log' -- 指定数据在hdfs上的存储位置 ; 

数仓搭建——ODS层

 数据装载思路:

数仓搭建——ODS层

 装载语句:

hive (gmall)> load data inpath '/origin_data/gmall/log/topic_log/2020-06-14' into table ods_log partition(dt='2020-06-14'); 

为lzo压缩文件创建索引

[zhang@hadoop102 bin]$ hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/common/hadoop-lzo-0.4.20.jar com.hadoop.compression.lzo.DistributedLzoIndexer /warehouse/gmall/ods/ods_log/dt=2020-06-14

数仓搭建——ODS层

 1.2 ODS层日志表加载数据脚本

在hadoop102的/home/zhang/bin目录下创建脚本

[zhang@hadoop102 bin]$ vim hdfs_to_ods_log.sh

 在脚本中编写如下内容

#!/bin/bash # 定义变量方便修改 APP=gmall # 如果是输入的日期按照取输入日期;如果没输入日期取当前时间的前一天 if [ -n "$1" ] ;then do_date=$1 else do_date=`date -d "-1 day" +%F` fi echo ================== 日志日期为 $do_date ================== sql=" load data inpath '/origin_data/$APP/log/topic_log/$do_date' into table ${APP}.ods_log partition(dt='$do_date'); " hive -e "$sql" hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/common/hadoop-lzo-0.4.20.jar com.hadoop.compression.lzo.DistributedLzoIndexer /warehouse/$APP/ods/ods_log/dt=$do_date 

数仓搭建——ODS层

 增加权限:[zhang@hadoop102 bin]$ chmod +x hdfs_to_ods_log.sh

去datagrip执行以下刚刚写的建表语句

数仓搭建——ODS层

 去执行装载脚本:

[zhang@hadoop102 bin]$ ./hdfs_to_ods_log.sh 2020-06-14

数仓搭建——ODS层

 去datagrip中查看刚刚的数据有没有被写入

数仓搭建——ODS层

 2 ODS层(业务数据)

2.1 27个表

 DROP TABLE IF EXISTS ods_activity_info; CREATE EXTERNAL TABLE ods_activity_info( `id` STRING COMMENT '编号', `activity_name` STRING COMMENT '活动名称', `activity_type` STRING COMMENT '活动类型', `start_time` STRING COMMENT '开始时间', `end_time` STRING COMMENT '结束时间', `create_time` STRING COMMENT '创建时间' ) COMMENT '活动信息表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_activity_info/'; DROP TABLE IF EXISTS ods_activity_rule; CREATE EXTERNAL TABLE ods_activity_rule( `id` STRING COMMENT '编号', `activity_id` STRING COMMENT '活动ID', `activity_type` STRING COMMENT '活动类型', `condition_amount` DECIMAL(16,2) COMMENT '满减金额', `condition_num` BIGINT COMMENT '满减件数', `benefit_amount` DECIMAL(16,2) COMMENT '优惠金额', `benefit_discount` DECIMAL(16,2) COMMENT '优惠折扣', `benefit_level` STRING COMMENT '优惠级别' ) COMMENT '活动规则表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_activity_rule/'; DROP TABLE IF EXISTS ods_base_category1; CREATE EXTERNAL TABLE ods_base_category1( `id` STRING COMMENT 'id', `name` STRING COMMENT '名称' ) COMMENT '商品一级分类表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_base_category1/'; DROP TABLE IF EXISTS ods_base_category2; CREATE EXTERNAL TABLE ods_base_category2( `id` STRING COMMENT ' id', `name` STRING COMMENT '名称', `category1_id` STRING COMMENT '一级品类id' ) COMMENT '商品二级分类表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_base_category2/'; DROP TABLE IF EXISTS ods_base_category3; CREATE EXTERNAL TABLE ods_base_category3( `id` STRING COMMENT ' id', `name` STRING COMMENT '名称', `category2_id` STRING COMMENT '二级品类id' ) COMMENT '商品三级分类表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_base_category3/'; DROP TABLE IF EXISTS ods_base_dic; CREATE EXTERNAL TABLE ods_base_dic( `dic_code` STRING COMMENT '编号', `dic_name` STRING COMMENT '编码名称', `parent_code` STRING COMMENT '父编码', `create_time` STRING COMMENT '创建日期', `operate_time` STRING COMMENT '操作日期' ) COMMENT '编码字典表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_base_dic/'; DROP TABLE IF EXISTS ods_base_province; CREATE EXTERNAL TABLE ods_base_province ( `id` STRING COMMENT '编号', `name` STRING COMMENT '省份名称', `region_id` STRING COMMENT '地区ID', `area_code` STRING COMMENT '地区编码', `iso_code` STRING COMMENT 'ISO-3166编码,供可视化使用', `iso_3166_2` STRING COMMENT 'IOS-3166-2编码,供可视化使用' ) COMMENT '省份表' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_base_province/'; DROP TABLE IF EXISTS ods_base_region; CREATE EXTERNAL TABLE ods_base_region ( `id` STRING COMMENT '编号', `region_name` STRING COMMENT '地区名称' ) COMMENT '地区表' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_base_region/'; DROP TABLE IF EXISTS ods_base_trademark; CREATE EXTERNAL TABLE ods_base_trademark ( `id` STRING COMMENT '编号', `tm_name` STRING COMMENT '品牌名称' ) COMMENT '品牌表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_base_trademark/'; DROP TABLE IF EXISTS ods_cart_info; CREATE EXTERNAL TABLE ods_cart_info( `id` STRING COMMENT '编号', `user_id` STRING COMMENT '用户id', `sku_id` STRING COMMENT 'skuid', `cart_price` DECIMAL(16,2) COMMENT '放入购物车时价格', `sku_num` BIGINT COMMENT '数量', `sku_name` STRING COMMENT 'sku名称 (冗余)', `create_time` STRING COMMENT '创建时间', `operate_time` STRING COMMENT '修改时间', `is_ordered` STRING COMMENT '是否已经下单', `order_time` STRING COMMENT '下单时间', `source_type` STRING COMMENT '来源类型', `source_id` STRING COMMENT '来源编号' ) COMMENT '加购表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_cart_info/'; DROP TABLE IF EXISTS ods_comment_info; CREATE EXTERNAL TABLE ods_comment_info( `id` STRING COMMENT '编号', `user_id` STRING COMMENT '用户ID', `sku_id` STRING COMMENT '商品sku', `spu_id` STRING COMMENT '商品spu', `order_id` STRING COMMENT '订单ID', `appraise` STRING COMMENT '评价', `create_time` STRING COMMENT '评价时间' ) COMMENT '商品评论表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_comment_info/'; DROP TABLE IF EXISTS ods_coupon_info; CREATE EXTERNAL TABLE ods_coupon_info( `id` STRING COMMENT '购物券编号', `coupon_name` STRING COMMENT '购物券名称', `coupon_type` STRING COMMENT '购物券类型 1 现金券 2 折扣券 3 满减券 4 满件打折券', `condition_amount` DECIMAL(16,2) COMMENT '满额数', `condition_num` BIGINT COMMENT '满件数', `activity_id` STRING COMMENT '活动编号', `benefit_amount` DECIMAL(16,2) COMMENT '减金额', `benefit_discount` DECIMAL(16,2) COMMENT '折扣', `create_time` STRING COMMENT '创建时间', `range_type` STRING COMMENT '范围类型 1、商品 2、品类 3、品牌', `limit_num` BIGINT COMMENT '最多领用次数', `taken_count` BIGINT COMMENT '已领用次数', `start_time` STRING COMMENT '开始领取时间', `end_time` STRING COMMENT '结束领取时间', `operate_time` STRING COMMENT '修改时间', `expire_time` STRING COMMENT '过期时间' ) COMMENT '优惠券表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_coupon_info/'; DROP TABLE IF EXISTS ods_coupon_use; CREATE EXTERNAL TABLE ods_coupon_use( `id` STRING COMMENT '编号', `coupon_id` STRING COMMENT '优惠券ID', `user_id` STRING COMMENT 'skuid', `order_id` STRING COMMENT 'spuid', `coupon_status` STRING COMMENT '优惠券状态', `get_time` STRING COMMENT '领取时间', `using_time` STRING COMMENT '使用时间(下单)', `used_time` STRING COMMENT '使用时间(支付)', `expire_time` STRING COMMENT '过期时间' ) COMMENT '优惠券领用表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_coupon_use/'; DROP TABLE IF EXISTS ods_favor_info; CREATE EXTERNAL TABLE ods_favor_info( `id` STRING COMMENT '编号', `user_id` STRING COMMENT '用户id', `sku_id` STRING COMMENT 'skuid', `spu_id` STRING COMMENT 'spuid', `is_cancel` STRING COMMENT '是否取消', `create_time` STRING COMMENT '收藏时间', `cancel_time` STRING COMMENT '取消时间' ) COMMENT '商品收藏表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_favor_info/'; DROP TABLE IF EXISTS ods_order_detail; CREATE EXTERNAL TABLE ods_order_detail( `id` STRING COMMENT '编号', `order_id` STRING COMMENT '订单号', `sku_id` STRING COMMENT '商品id', `sku_name` STRING COMMENT '商品名称', `order_price` DECIMAL(16,2) COMMENT '商品价格', `sku_num` BIGINT COMMENT '商品数量', `create_time` STRING COMMENT '创建时间', `source_type` STRING COMMENT '来源类型', `source_id` STRING COMMENT '来源编号', `split_final_amount` DECIMAL(16,2) COMMENT '分摊最终金额', `split_activity_amount` DECIMAL(16,2) COMMENT '分摊活动优惠', `split_coupon_amount` DECIMAL(16,2) COMMENT '分摊优惠券优惠' ) COMMENT '订单详情表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_order_detail/'; DROP TABLE IF EXISTS ods_order_detail_activity; CREATE EXTERNAL TABLE ods_order_detail_activity( `id` STRING COMMENT '编号', `order_id` STRING COMMENT '订单号', `order_detail_id` STRING COMMENT '订单明细id', `activity_id` STRING COMMENT '活动id', `activity_rule_id` STRING COMMENT '活动规则id', `sku_id` BIGINT COMMENT '商品id', `create_time` STRING COMMENT '创建时间' ) COMMENT '订单详情活动关联表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_order_detail_activity/'; DROP TABLE IF EXISTS ods_order_detail_coupon; CREATE EXTERNAL TABLE ods_order_detail_coupon( `id` STRING COMMENT '编号', `order_id` STRING COMMENT '订单号', `order_detail_id` STRING COMMENT '订单明细id', `coupon_id` STRING COMMENT '优惠券id', `coupon_use_id` STRING COMMENT '优惠券领用记录id', `sku_id` STRING COMMENT '商品id', `create_time` STRING COMMENT '创建时间' ) COMMENT '订单详情活动关联表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_order_detail_coupon/'; DROP TABLE IF EXISTS ods_order_info; CREATE EXTERNAL TABLE ods_order_info ( `id` STRING COMMENT '订单号', `final_amount` DECIMAL(16,2) COMMENT '订单最终金额', `order_status` STRING COMMENT '订单状态', `user_id` STRING COMMENT '用户id', `payment_way` STRING COMMENT '支付方式', `delivery_address` STRING COMMENT '送货地址', `out_trade_no` STRING COMMENT '支付流水号', `create_time` STRING COMMENT '创建时间', `operate_time` STRING COMMENT '操作时间', `expire_time` STRING COMMENT '过期时间', `tracking_no` STRING COMMENT '物流单编号', `province_id` STRING COMMENT '省份ID', `activity_reduce_amount` DECIMAL(16,2) COMMENT '活动减免金额', `coupon_reduce_amount` DECIMAL(16,2) COMMENT '优惠券减免金额', `original_amount` DECIMAL(16,2) COMMENT '订单原价金额', `feight_fee` DECIMAL(16,2) COMMENT '运费', `feight_fee_reduce` DECIMAL(16,2) COMMENT '运费减免' ) COMMENT '订单表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_order_info/'; DROP TABLE IF EXISTS ods_order_refund_info; CREATE EXTERNAL TABLE ods_order_refund_info( `id` STRING COMMENT '编号', `user_id` STRING COMMENT '用户ID', `order_id` STRING COMMENT '订单ID', `sku_id` STRING COMMENT '商品ID', `refund_type` STRING COMMENT '退单类型', `refund_num` BIGINT COMMENT '退单件数', `refund_amount` DECIMAL(16,2) COMMENT '退单金额', `refund_reason_type` STRING COMMENT '退单原因类型', `refund_status` STRING COMMENT '退单状态',--退单状态应包含买家申请、卖家审核、卖家收货、退款完成等状态。此处未涉及到,故该表按增量处理 `create_time` STRING COMMENT '退单时间' ) COMMENT '退单表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_order_refund_info/'; DROP TABLE IF EXISTS ods_order_status_log; CREATE EXTERNAL TABLE ods_order_status_log ( `id` STRING COMMENT '编号', `order_id` STRING COMMENT '订单ID', `order_status` STRING COMMENT '订单状态', `operate_time` STRING COMMENT '修改时间' ) COMMENT '订单状态表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_order_status_log/'; DROP TABLE IF EXISTS ods_payment_info; CREATE EXTERNAL TABLE ods_payment_info( `id` STRING COMMENT '编号', `out_trade_no` STRING COMMENT '对外业务编号', `order_id` STRING COMMENT '订单编号', `user_id` STRING COMMENT '用户编号', `payment_type` STRING COMMENT '支付类型', `trade_no` STRING COMMENT '交易编号', `payment_amount` DECIMAL(16,2) COMMENT '支付金额', `subject` STRING COMMENT '交易内容', `payment_status` STRING COMMENT '支付状态', `create_time` STRING COMMENT '创建时间', `callback_time` STRING COMMENT '回调时间' ) COMMENT '支付流水表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_payment_info/'; DROP TABLE IF EXISTS ods_refund_payment; CREATE EXTERNAL TABLE ods_refund_payment( `id` STRING COMMENT '编号', `out_trade_no` STRING COMMENT '对外业务编号', `order_id` STRING COMMENT '订单编号', `sku_id` STRING COMMENT 'SKU编号', `payment_type` STRING COMMENT '支付类型', `trade_no` STRING COMMENT '交易编号', `refund_amount` DECIMAL(16,2) COMMENT '支付金额', `subject` STRING COMMENT '交易内容', `refund_status` STRING COMMENT '支付状态', `create_time` STRING COMMENT '创建时间', `callback_time` STRING COMMENT '回调时间' ) COMMENT '支付流水表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_refund_payment/'; DROP TABLE IF EXISTS ods_sku_attr_value; CREATE EXTERNAL TABLE ods_sku_attr_value( `id` STRING COMMENT '编号', `attr_id` STRING COMMENT '平台属性ID', `value_id` STRING COMMENT '平台属性值ID', `sku_id` STRING COMMENT '商品ID', `attr_name` STRING COMMENT '平台属性名称', `value_name` STRING COMMENT '平台属性值名称' ) COMMENT 'sku平台属性表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_sku_attr_value/'; DROP TABLE IF EXISTS ods_sku_info; CREATE EXTERNAL TABLE ods_sku_info( `id` STRING COMMENT 'skuId', `spu_id` STRING COMMENT 'spuid', `price` DECIMAL(16,2) COMMENT '价格', `sku_name` STRING COMMENT '商品名称', `sku_desc` STRING COMMENT '商品描述', `weight` DECIMAL(16,2) COMMENT '重量', `tm_id` STRING COMMENT '品牌id', `category3_id` STRING COMMENT '品类id', `is_sale` STRING COMMENT '是否在售', `create_time` STRING COMMENT '创建时间' ) COMMENT 'SKU商品表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_sku_info/'; DROP TABLE IF EXISTS ods_sku_sale_attr_value; CREATE EXTERNAL TABLE ods_sku_sale_attr_value( `id` STRING COMMENT '编号', `sku_id` STRING COMMENT 'sku_id', `spu_id` STRING COMMENT 'spu_id', `sale_attr_value_id` STRING COMMENT '销售属性值id', `sale_attr_id` STRING COMMENT '销售属性id', `sale_attr_name` STRING COMMENT '销售属性名称', `sale_attr_value_name` STRING COMMENT '销售属性值名称' ) COMMENT 'sku销售属性名称' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_sku_sale_attr_value/'; DROP TABLE IF EXISTS ods_spu_info; CREATE EXTERNAL TABLE ods_spu_info( `id` STRING COMMENT 'spuid', `spu_name` STRING COMMENT 'spu名称', `category3_id` STRING COMMENT '品类id', `tm_id` STRING COMMENT '品牌id' ) COMMENT 'SPU商品表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_spu_info/'; DROP TABLE IF EXISTS ods_user_info; CREATE EXTERNAL TABLE ods_user_info( `id` STRING COMMENT '用户id', `login_name` STRING COMMENT '用户名称', `nick_name` STRING COMMENT '用户昵称', `name` STRING COMMENT '用户姓名', `phone_num` STRING COMMENT '手机号码', `email` STRING COMMENT '邮箱', `user_level` STRING COMMENT '用户等级', `birthday` STRING COMMENT '生日', `gender` STRING COMMENT '性别', `create_time` STRING COMMENT '创建时间', `operate_time` STRING COMMENT '操作时间' ) COMMENT '用户表' PARTITIONED BY (`dt` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/warehouse/gmall/ods/ods_user_info/'; 

粘贴到datagrip,并执行当前代码

数仓搭建——ODS层

 数仓搭建——ODS层

 执行完之后就可以做数据的装载了。

装载思路

数仓搭建——ODS层

 2.2  ODS层业务表首日数据装载脚本

编写脚本

在/home/zhang/bin目录下创建脚本hdfs_to_ods_db_init.sh

[zhang@hadoop102 bin]$ vim hdfs_to_ods_db_init.sh

在脚本中填写如下内容

#!/bin/bash APP=gmall if [ -n "$2" ] ;then do_date=$2 else echo "请传入日期参数" exit fi ods_order_info=" load data inpath '/origin_data/$APP/db/order_info/$do_date' OVERWRITE into table ${APP}.ods_order_info partition(dt='$do_date');" ods_order_detail=" load data inpath '/origin_data/$APP/db/order_detail/$do_date' OVERWRITE into table ${APP}.ods_order_detail partition(dt='$do_date');" ods_sku_info=" load data inpath '/origin_data/$APP/db/sku_info/$do_date' OVERWRITE into table ${APP}.ods_sku_info partition(dt='$do_date');" ods_user_info=" load data inpath '/origin_data/$APP/db/user_info/$do_date' OVERWRITE into table ${APP}.ods_user_info partition(dt='$do_date');" ods_payment_info=" load data inpath '/origin_data/$APP/db/payment_info/$do_date' OVERWRITE into table ${APP}.ods_payment_info partition(dt='$do_date');" ods_base_category1=" load data inpath '/origin_data/$APP/db/base_category1/$do_date' OVERWRITE into table ${APP}.ods_base_category1 partition(dt='$do_date');" ods_base_category2=" load data inpath '/origin_data/$APP/db/base_category2/$do_date' OVERWRITE into table ${APP}.ods_base_category2 partition(dt='$do_date');" ods_base_category3=" load data inpath '/origin_data/$APP/db/base_category3/$do_date' OVERWRITE into table ${APP}.ods_base_category3 partition(dt='$do_date'); " ods_base_trademark=" load data inpath '/origin_data/$APP/db/base_trademark/$do_date' OVERWRITE into table ${APP}.ods_base_trademark partition(dt='$do_date'); " ods_activity_info=" load data inpath '/origin_data/$APP/db/activity_info/$do_date' OVERWRITE into table ${APP}.ods_activity_info partition(dt='$do_date'); " ods_cart_info=" load data inpath '/origin_data/$APP/db/cart_info/$do_date' OVERWRITE into table ${APP}.ods_cart_info partition(dt='$do_date'); " ods_comment_info=" load data inpath '/origin_data/$APP/db/comment_info/$do_date' OVERWRITE into table ${APP}.ods_comment_info partition(dt='$do_date'); " ods_coupon_info=" load data inpath '/origin_data/$APP/db/coupon_info/$do_date' OVERWRITE into table ${APP}.ods_coupon_info partition(dt='$do_date'); " ods_coupon_use=" load data inpath '/origin_data/$APP/db/coupon_use/$do_date' OVERWRITE into table ${APP}.ods_coupon_use partition(dt='$do_date'); " ods_favor_info=" load data inpath '/origin_data/$APP/db/favor_info/$do_date' OVERWRITE into table ${APP}.ods_favor_info partition(dt='$do_date'); " ods_order_refund_info=" load data inpath '/origin_data/$APP/db/order_refund_info/$do_date' OVERWRITE into table ${APP}.ods_order_refund_info partition(dt='$do_date'); " ods_order_status_log=" load data inpath '/origin_data/$APP/db/order_status_log/$do_date' OVERWRITE into table ${APP}.ods_order_status_log partition(dt='$do_date'); " ods_spu_info=" load data inpath '/origin_data/$APP/db/spu_info/$do_date' OVERWRITE into table ${APP}.ods_spu_info partition(dt='$do_date'); " ods_activity_rule=" load data inpath '/origin_data/$APP/db/activity_rule/$do_date' OVERWRITE into table ${APP}.ods_activity_rule partition(dt='$do_date');" ods_base_dic=" load data inpath '/origin_data/$APP/db/base_dic/$do_date' OVERWRITE into table ${APP}.ods_base_dic partition(dt='$do_date'); " ods_order_detail_activity=" load data inpath '/origin_data/$APP/db/order_detail_activity/$do_date' OVERWRITE into table ${APP}.ods_order_detail_activity partition(dt='$do_date'); " ods_order_detail_coupon=" load data inpath '/origin_data/$APP/db/order_detail_coupon/$do_date' OVERWRITE into table ${APP}.ods_order_detail_coupon partition(dt='$do_date'); " ods_refund_payment=" load data inpath '/origin_data/$APP/db/refund_payment/$do_date' OVERWRITE into table ${APP}.ods_refund_payment partition(dt='$do_date'); " ods_sku_attr_value=" load data inpath '/origin_data/$APP/db/sku_attr_value/$do_date' OVERWRITE into table ${APP}.ods_sku_attr_value partition(dt='$do_date'); " ods_sku_sale_attr_value=" load data inpath '/origin_data/$APP/db/sku_sale_attr_value/$do_date' OVERWRITE into table ${APP}.ods_sku_sale_attr_value partition(dt='$do_date'); " ods_base_province=" load data inpath '/origin_data/$APP/db/base_province/$do_date' OVERWRITE into table ${APP}.ods_base_province;" ods_base_region=" load data inpath '/origin_data/$APP/db/base_region/$do_date' OVERWRITE into table ${APP}.ods_base_region;" case $1 in "ods_order_info"){ hive -e "$ods_order_info" };; "ods_order_detail"){ hive -e "$ods_order_detail" };; "ods_sku_info"){ hive -e "$ods_sku_info" };; "ods_user_info"){ hive -e "$ods_user_info" };; "ods_payment_info"){ hive -e "$ods_payment_info" };; "ods_base_category1"){ hive -e "$ods_base_category1" };; "ods_base_category2"){ hive -e "$ods_base_category2" };; "ods_base_category3"){ hive -e "$ods_base_category3" };; "ods_base_trademark"){ hive -e "$ods_base_trademark" };; "ods_activity_info"){ hive -e "$ods_activity_info" };; "ods_cart_info"){ hive -e "$ods_cart_info" };; "ods_comment_info"){ hive -e "$ods_comment_info" };; "ods_coupon_info"){ hive -e "$ods_coupon_info" };; "ods_coupon_use"){ hive -e "$ods_coupon_use" };; "ods_favor_info"){ hive -e "$ods_favor_info" };; "ods_order_refund_info"){ hive -e "$ods_order_refund_info" };; "ods_order_status_log"){ hive -e "$ods_order_status_log" };; "ods_spu_info"){ hive -e "$ods_spu_info" };; "ods_activity_rule"){ hive -e "$ods_activity_rule" };; "ods_base_dic"){ hive -e "$ods_base_dic" };; "ods_order_detail_activity"){ hive -e "$ods_order_detail_activity" };; "ods_order_detail_coupon"){ hive -e "$ods_order_detail_coupon" };; "ods_refund_payment"){ hive -e "$ods_refund_payment" };; "ods_sku_attr_value"){ hive -e "$ods_sku_attr_value" };; "ods_sku_sale_attr_value"){ hive -e "$ods_sku_sale_attr_value" };; "ods_base_province"){ hive -e "$ods_base_province" };; "ods_base_region"){ hive -e "$ods_base_region" };; "all"){ hive -e "$ods_order_info$ods_order_detail$ods_sku_info$ods_user_info$ods_payment_info$ods_base_category1$ods_base_category2$ods_base_category3$ods_base_trademark$ods_activity_info$ods_cart_info$ods_comment_info$ods_coupon_info$ods_coupon_use$ods_favor_info$ods_order_refund_info$ods_order_status_log$ods_spu_info$ods_activity_rule$ods_base_dic$ods_order_detail_activity$ods_order_detail_coupon$ods_refund_payment$ods_sku_attr_value$ods_sku_sale_attr_value$ods_base_province$ods_base_region" };; esac 

数仓搭建——ODS层

2.3 ODS层业务表每日数据装载脚本

编写脚本

在/home/zhang/bin目录下创建脚本hdfs_to_ods_db.sh

[zhang@hadoop102 bin]$ vim hdfs_to_ods_db.sh

在脚本中填写如下内容

#!/bin/bash APP=gmall # 如果是输入的日期按照取输入日期;如果没输入日期取当前时间的前一天 if [ -n "$2" ] ;then do_date=$2 else do_date=`date -d "-1 day" +%F` fi ods_order_info=" load data inpath '/origin_data/$APP/db/order_info/$do_date' OVERWRITE into table ${APP}.ods_order_info partition(dt='$do_date');" ods_order_detail=" load data inpath '/origin_data/$APP/db/order_detail/$do_date' OVERWRITE into table ${APP}.ods_order_detail partition(dt='$do_date');" ods_sku_info=" load data inpath '/origin_data/$APP/db/sku_info/$do_date' OVERWRITE into table ${APP}.ods_sku_info partition(dt='$do_date');" ods_user_info=" load data inpath '/origin_data/$APP/db/user_info/$do_date' OVERWRITE into table ${APP}.ods_user_info partition(dt='$do_date');" ods_payment_info=" load data inpath '/origin_data/$APP/db/payment_info/$do_date' OVERWRITE into table ${APP}.ods_payment_info partition(dt='$do_date');" ods_base_category1=" load data inpath '/origin_data/$APP/db/base_category1/$do_date' OVERWRITE into table ${APP}.ods_base_category1 partition(dt='$do_date');" ods_base_category2=" load data inpath '/origin_data/$APP/db/base_category2/$do_date' OVERWRITE into table ${APP}.ods_base_category2 partition(dt='$do_date');" ods_base_category3=" load data inpath '/origin_data/$APP/db/base_category3/$do_date' OVERWRITE into table ${APP}.ods_base_category3 partition(dt='$do_date'); " ods_base_trademark=" load data inpath '/origin_data/$APP/db/base_trademark/$do_date' OVERWRITE into table ${APP}.ods_base_trademark partition(dt='$do_date'); " ods_activity_info=" load data inpath '/origin_data/$APP/db/activity_info/$do_date' OVERWRITE into table ${APP}.ods_activity_info partition(dt='$do_date'); " ods_cart_info=" load data inpath '/origin_data/$APP/db/cart_info/$do_date' OVERWRITE into table ${APP}.ods_cart_info partition(dt='$do_date'); " ods_comment_info=" load data inpath '/origin_data/$APP/db/comment_info/$do_date' OVERWRITE into table ${APP}.ods_comment_info partition(dt='$do_date'); " ods_coupon_info=" load data inpath '/origin_data/$APP/db/coupon_info/$do_date' OVERWRITE into table ${APP}.ods_coupon_info partition(dt='$do_date'); " ods_coupon_use=" load data inpath '/origin_data/$APP/db/coupon_use/$do_date' OVERWRITE into table ${APP}.ods_coupon_use partition(dt='$do_date'); " ods_favor_info=" load data inpath '/origin_data/$APP/db/favor_info/$do_date' OVERWRITE into table ${APP}.ods_favor_info partition(dt='$do_date'); " ods_order_refund_info=" load data inpath '/origin_data/$APP/db/order_refund_info/$do_date' OVERWRITE into table ${APP}.ods_order_refund_info partition(dt='$do_date'); " ods_order_status_log=" load data inpath '/origin_data/$APP/db/order_status_log/$do_date' OVERWRITE into table ${APP}.ods_order_status_log partition(dt='$do_date'); " ods_spu_info=" load data inpath '/origin_data/$APP/db/spu_info/$do_date' OVERWRITE into table ${APP}.ods_spu_info partition(dt='$do_date'); " ods_activity_rule=" load data inpath '/origin_data/$APP/db/activity_rule/$do_date' OVERWRITE into table ${APP}.ods_activity_rule partition(dt='$do_date');" ods_base_dic=" load data inpath '/origin_data/$APP/db/base_dic/$do_date' OVERWRITE into table ${APP}.ods_base_dic partition(dt='$do_date'); " ods_order_detail_activity=" load data inpath '/origin_data/$APP/db/order_detail_activity/$do_date' OVERWRITE into table ${APP}.ods_order_detail_activity partition(dt='$do_date'); " ods_order_detail_coupon=" load data inpath '/origin_data/$APP/db/order_detail_coupon/$do_date' OVERWRITE into table ${APP}.ods_order_detail_coupon partition(dt='$do_date'); " ods_refund_payment=" load data inpath '/origin_data/$APP/db/refund_payment/$do_date' OVERWRITE into table ${APP}.ods_refund_payment partition(dt='$do_date'); " ods_sku_attr_value=" load data inpath '/origin_data/$APP/db/sku_attr_value/$do_date' OVERWRITE into table ${APP}.ods_sku_attr_value partition(dt='$do_date'); " ods_sku_sale_attr_value=" load data inpath '/origin_data/$APP/db/sku_sale_attr_value/$do_date' OVERWRITE into table ${APP}.ods_sku_sale_attr_value partition(dt='$do_date'); " ods_base_province=" load data inpath '/origin_data/$APP/db/base_province/$do_date' OVERWRITE into table ${APP}.ods_base_province;" ods_base_region=" load data inpath '/origin_data/$APP/db/base_region/$do_date' OVERWRITE into table ${APP}.ods_base_region;" case $1 in "ods_order_info"){ hive -e "$ods_order_info" };; "ods_order_detail"){ hive -e "$ods_order_detail" };; "ods_sku_info"){ hive -e "$ods_sku_info" };; "ods_user_info"){ hive -e "$ods_user_info" };; "ods_payment_info"){ hive -e "$ods_payment_info" };; "ods_base_category1"){ hive -e "$ods_base_category1" };; "ods_base_category2"){ hive -e "$ods_base_category2" };; "ods_base_category3"){ hive -e "$ods_base_category3" };; "ods_base_trademark"){ hive -e "$ods_base_trademark" };; "ods_activity_info"){ hive -e "$ods_activity_info" };; "ods_cart_info"){ hive -e "$ods_cart_info" };; "ods_comment_info"){ hive -e "$ods_comment_info" };; "ods_coupon_info"){ hive -e "$ods_coupon_info" };; "ods_coupon_use"){ hive -e "$ods_coupon_use" };; "ods_favor_info"){ hive -e "$ods_favor_info" };; "ods_order_refund_info"){ hive -e "$ods_order_refund_info" };; "ods_order_status_log"){ hive -e "$ods_order_status_log" };; "ods_spu_info"){ hive -e "$ods_spu_info" };; "ods_activity_rule"){ hive -e "$ods_activity_rule" };; "ods_base_dic"){ hive -e "$ods_base_dic" };; "ods_order_detail_activity"){ hive -e "$ods_order_detail_activity" };; "ods_order_detail_coupon"){ hive -e "$ods_order_detail_coupon" };; "ods_refund_payment"){ hive -e "$ods_refund_payment" };; "ods_sku_attr_value"){ hive -e "$ods_sku_attr_value" };; "ods_sku_sale_attr_value"){ hive -e "$ods_sku_sale_attr_value" };; "all"){ hive -e "$ods_order_info$ods_order_detail$ods_sku_info$ods_user_info$ods_payment_info$ods_base_category1$ods_base_category2$ods_base_category3$ods_base_trademark$ods_activity_info$ods_cart_info$ods_comment_info$ods_coupon_info$ods_coupon_use$ods_favor_info$ods_order_refund_info$ods_order_status_log$ods_spu_info$ods_activity_rule$ods_base_dic$ods_order_detail_activity$ods_order_detail_coupon$ods_refund_payment$ods_sku_attr_value$ods_sku_sale_attr_value" };; esac 

数仓搭建——ODS层

授权

数仓搭建——ODS层

 执行一下首日装载脚本,把14号的全量数据装载到ods表的业务表的第一个分区中(每日装载脚本要14号过去,15号之后才能执行)

[zhang@hadoop102 bin]$ ./hdfs_to_ods_db_init.sh all 2020-06-14

数仓搭建——ODS层

 打开datagrip抽查一下是否有数据

数仓搭建——ODS层

 至此,完成了ODS层业务表的创建和数据的装载。。。

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请联系我们举报,一经查实,本站将立刻删除。

发布者:全栈程序员-站长,转载请注明出处:https://javaforall.net/233589.html原文链接:https://javaforall.net

(0)
全栈程序员-站长的头像全栈程序员-站长


相关推荐

  • phpstorm 激活码生成【最新永久激活】2022.02.28

    (phpstorm 激活码生成)2021最新分享一个能用的的激活码出来,希望能帮到需要激活的朋友。目前这个是能用的,但是用的人多了之后也会失效,会不定时更新的,大家持续关注此网站~IntelliJ2021最新激活注册码,破解教程可免费永久激活,亲测有效,下面是详细链接哦~https://javaforall.net/100143.html…

    2022年4月2日
    292
  • CAP原理和数据高可用性

    CAP原理和数据高可用性对网站而言,数据是最宝贵的资源,硬件可以购买,软件可以重构,但是数据(用户数据,交易数据,商品数据)一旦丢失,对网站的打击可以说是毁灭性的。数据高可用性数据高可用性包括如下几个方面的含义数据持久性保证数据可以持久存储,在各种情况下都不会出现数据丢失。为了实现数据持久性,不但在写入数据是需要写入持久性存储,还需要将数据备份到一个或多个副本,存放在不同的物理存储设备上,在某个存储故障发生是,数据不会丢失

    2022年5月12日
    41
  • Git可视化教程——Git Gui的使用[通俗易懂]

    Git可视化教程——GitGui的使用

    2022年4月12日
    1.0K
  • Java标识符命名规则(超详细!)[通俗易懂]

    Java标识符命名规则(超详细!)[通俗易懂]规则1:标识符只能由数字、字母(包括中文)、下划线_、美元符号$组成,不能含有其它符号。规则2:标识符不能以数字开头规则3:关键字不能做标识符。例如:publicclassstaticvoid这些蓝色的字体都是关键字,关键字是不能做标识符的。规则4:标识符是严格区分大小写的。大写A和小写a不一样。规则5:标识符理论上是没有长度限制的。…

    2022年7月7日
    22
  • 常用网络工具

    常用网络工具这里是计算机网络的一些相关工具的用途和使用方法,很简洁。

    2022年6月20日
    22
  • 无人机的电路板和控制器_无人机飞控电路图

    无人机的电路板和控制器_无人机飞控电路图第1步,查看官方ESP-Drone无人机ESP32-S2-WROVER模块的参考设计原理图第二步,用KiCAD绘制ESP32-S2-WROVER模块及周边电路。1、如图2-1所示,从KiCAD的原理图符号库中直接调出ESP32-S2-WROVER的原理图符号。注意,在安装KiCAD软件后,它的原理图符号库中已经有ESP32-S2-WROVER的原理图符号,可以直接调用,对于没有的原理图符号,可以进行自己创建。2、如图2-2所示,给ESP32-S2-WROVER模块…

    2022年8月15日
    3

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注

关注全栈程序员社区公众号