Flink: Building an Enterprise-Grade Risk Control Platform Architecture (Part 01)

Overview

- Real-time risk control solution
- Overall architecture
- First-version requirement implementation

1. Risk Control Background

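In internet services, typical risk control scenarios include registration, login, transaction, and campaign risk control. Risk control works best when it prevents damage before it happens, so of the three approaches (before, during, and after the event), pre-event warning and in-event control are the most effective.
This requires the risk control system to work in real time, so this article focuses on the real-time risk control architecture.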

2. Overall Architecture

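Risk control is a product of business scenarios. The risk control system directly serves the business system, and the penalty system and the analysis system are closely related to it. The relationships and roles of these systems are as follows: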

[Figure: relationships and roles of the business, risk control, penalty, and analysis systems]

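A risk control system can follow two technical routes: rules and models. Rules are simple, intuitive, highly interpretable, and flexible, which is why they have long been a staple of risk control systems. Their weakness is that they are easy to defeat: once fraudsters guess a rule, it stops working. In practice, a rule-based system is therefore usually combined with a model-based stage to make it more robust.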

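A rule is a conditional judgment about some fact. Suppose we define a few rules for registration, login, transactions, and campaigns, for example:
- The user name does not match the name on the ID card;
- An IP has registered more than 10 accounts in the last 1 hour;
- An account has logged in more than 5 times in the last 3 minutes;
- A group of accounts has bought more than 100 discounted items in the last 1 hour;
- An account has claimed more than 3 coupons in the last 3 minutes.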
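Rules can be combined into rule groups. A single rule consists of three parts:
- Fact: the subject being judged and its attribute, such as the account and its login count, or the IP and its registration count in the rules above;
- Condition: the logic of the judgment, such as "some attribute of some fact is greater than some value";
- Threshold: the basis of the judgment, such as the critical value for the login count or for the number of registered accounts.
Rules can be written by operations experts based on experience, or mined from historical data by data analysts. Because any rule can be guessed and defeated in the contest with fraudsters, all of them need to be adjustable dynamically.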
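
The fact / condition / threshold split above can be sketched in plain Java. This is only an illustration: the class name ThresholdRule, the Condition enum, and the fact key "ip.register.count.1h" are hypothetical and do not come from the original project. The threshold is mutable to reflect the point that rules must be adjustable at runtime.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical rule model: a fact name, a comparison condition, and a threshold. */
class ThresholdRule {
    enum Condition { GREATER_THAN, LESS_THAN }

    private final String factName;      // which fact attribute to read (illustrative key)
    private final Condition condition;  // how to compare it
    private volatile double threshold;  // adjustable while the system is running

    ThresholdRule(String factName, Condition condition, double threshold) {
        this.factName = factName;
        this.condition = condition;
        this.threshold = threshold;
    }

    void setThreshold(double threshold) {
        this.threshold = threshold;
    }

    /** Returns true when the rule is hit, i.e. the event looks risky. */
    boolean isHit(Map<String, Double> facts) {
        Double value = facts.get(factName);
        if (value == null) {
            return false; // missing fact: treated as "not hit" in this sketch
        }
        switch (condition) {
            case GREATER_THAN:
                return value > threshold;
            case LESS_THAN:
                return value < threshold;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        Map<String, Double> facts = new HashMap<>();
        facts.put("ip.register.count.1h", 15.0);

        ThresholdRule rule = new ThresholdRule(
                "ip.register.count.1h", Condition.GREATER_THAN, 10);
        System.out.println(rule.isHit(facts)); // true: 15 > 10

        rule.setThreshold(20); // operations staff loosen the rule
        System.out.println(rule.isHit(facts)); // false: 15 > 20 does not hold
    }
}
```

Keeping the threshold as data rather than code is what allows it to be changed from an admin console without redeploying anything.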
Based on the discussion above, we design the risk control system as follows:

[Figure: risk control system design, with its data flows drawn as red, blue, and green lines]

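The system has three data flows:
- Real-time risk control flow: marked in red, called synchronously; this is the core path of a risk control call.
- Near-real-time metric flow: marked in blue, written asynchronously; it prepares metric data for the real-time part.
- Near-real-time/offline analysis flow: marked in green, written asynchronously; it provides data for analyzing how the risk control system performs.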

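Real-Time Risk Control

Pre-filtering
- After a specific event (such as registration, login, placing an order, or joining a campaign) is triggered, the business system synchronously calls the risk control system and passes the relevant context, such as the IP address and the event identifier. The rule evaluation component first checks the admin console configuration to decide whether this event should be evaluated at all. If so, the request goes through blacklist and whitelist filtering, and only after passing both does it move on to the next stage.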
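
A minimal sketch of this pre-filtering step is shown below. The article does not spell out what a whitelist or blacklist hit should do, so the semantics here are assumptions: a disabled event type is skipped, a blacklisted IP is rejected, a whitelisted IP is allowed without further checks, and everything else continues to rule evaluation. The class name PreFilter and the Decision enum are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical pre-filter: config switch first, then blacklist and whitelist. */
class PreFilter {
    enum Decision { SKIP, REJECT, ALLOW, CONTINUE }

    private final Set<String> enabledEvents; // event types switched on in the admin console
    private final Set<String> blacklist;     // IPs that are always rejected (assumed semantics)
    private final Set<String> whitelist;     // IPs that bypass the rules (assumed semantics)

    PreFilter(Set<String> enabledEvents, Set<String> blacklist, Set<String> whitelist) {
        this.enabledEvents = enabledEvents;
        this.blacklist = blacklist;
        this.whitelist = whitelist;
    }

    Decision check(String eventType, String ip) {
        if (!enabledEvents.contains(eventType)) {
            return Decision.SKIP;     // risk control is not configured for this event
        }
        if (blacklist.contains(ip)) {
            return Decision.REJECT;   // known bad actor
        }
        if (whitelist.contains(ip)) {
            return Decision.ALLOW;    // trusted caller, no rule evaluation needed
        }
        return Decision.CONTINUE;     // proceed to data preparation and rule evaluation
    }

    public static void main(String[] args) {
        PreFilter filter = new PreFilter(
                new HashSet<>(Arrays.asList("register", "login")),
                new HashSet<>(Arrays.asList("10.0.0.66")),
                new HashSet<>(Arrays.asList("10.0.0.1")));
        System.out.println(filter.check("login", "10.0.0.66")); // REJECT
        System.out.println(filter.check("login", "10.0.0.8"));  // CONTINUE
    }
}
```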
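Real-time data preparation
- Before a rule can be evaluated, the system must prepare some fact data, for example:
  - Registration scenario: if the rule is "a single IP may register no more than 10 accounts in the last 1 hour", the system looks up in Redis/HBase, by IP address, the number of accounts registered from that IP in the last hour, for example 15;
  - Login scenario: if the rule is "a single account may log in no more than 5 times in the last 3 minutes", the system looks up in Redis/HBase, by account, the number of logins for that account in the last 3 minutes, for example 8.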
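
The two lookups above follow the same pattern: read a precomputed counter by key, then compare it with the rule's limit. The sketch below uses an in-memory map as a stand-in for Redis/HBase so that it runs on its own. The MetricStore interface and the key layout ("register:ip:...:1h") are assumptions made for illustration, not the project's actual schema.

```java
import java.util.HashMap;
import java.util.Map;

class FactLookupSketch {
    /** Stand-in for the Redis/HBase read path; the interface is hypothetical. */
    interface MetricStore {
        long get(String key); // returns 0 when the key is absent
    }

    /** In-memory implementation used only to make the sketch runnable. */
    static class InMemoryMetricStore implements MetricStore {
        private final Map<String, Long> data = new HashMap<>();

        void put(String key, long value) {
            data.put(key, value);
        }

        @Override
        public long get(String key) {
            return data.getOrDefault(key, 0L);
        }
    }

    /** True when the stored counter is above the limit set by the rule. */
    static boolean exceedsLimit(MetricStore store, String key, long limit) {
        return store.get(key) > limit;
    }

    public static void main(String[] args) {
        InMemoryMetricStore store = new InMemoryMetricStore();
        store.put("register:ip:1.2.3.4:1h", 15); // accounts registered from this IP in the last hour
        store.put("login:account:u1001:3m", 8);  // logins by this account in the last 3 minutes

        System.out.println(exceedsLimit(store, "register:ip:1.2.3.4:1h", 10)); // true
        System.out.println(exceedsLimit(store, "login:account:u1001:3m", 5));  // true
    }
}
```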

Rule evaluation

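Near-Real-Time Data Flow
- This part is background logic that serves the risk control system by preparing fact data.
- Data preparation is separated from rule evaluation for the sake of performance and scalability. As mentioned earlier, rule evaluation needs metrics about facts, such as the number of logins or registered accounts in the last hour. These metrics usually span a period of time and represent some state or aggregate, so they are hard to compute from raw data during a real-time risk control call: the rule engine is usually stateless and does not remember earlier results.
- The raw data volume is also large, because raw user activity data all has to be sent over for computation, so this part is usually handled by a streaming big data system.
- The business system sends tracking events to Kafka;
- Flink subscribes to Kafka and aggregates at atomic granularity;
  - Flink only aggregates at atomic granularity because of how rules change dynamically. In the registration scenario, for example, operations staff may, depending on results, want to check the number of accounts an IP registered in the last 1 hour, then the last 3 hours, then the last 5 hours. In other words, the N in "the last N hours" is adjusted dynamically. Flink should therefore only compute the count per 1-hour bucket, and the rule evaluation step should read the last 3 or the last 5 one-hour buckets according to the rule, sum them, and then judge. A Flink job keeps running once submitted; changing its logic means stopping the job, modifying the code, and restarting, which is cumbersome. A restart also raises the question of whether Flink's intermediate state can be reused. If Flink computed the N-hour aggregate directly, every change of N would require repeating all of this, and sometimes backfilling data as well.
- Flink writes the aggregated metrics to Redis or HBase for the real-time risk control system to query. Either works; choose according to the scenario.
- By separating data computation from rule evaluation and introducing Flink, the risk control system can handle a very large user base.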
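
The "atomic granularity" idea can be sketched without Flink: the streaming side only ever increments a per-hour bucket, and the rule side sums however many recent buckets the current rule asks for. In the sketch below an in-memory map stands in for Redis/HBase, and the class and method names (HourlyBuckets, record, countLastHours) are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class HourlyBuckets {
    private static final long HOUR_MS = 3_600_000L;

    // key -> (hour index since the epoch -> event count); stands in for Redis/HBase
    private final Map<String, Map<Long, Long>> buckets = new HashMap<>();

    /** Streaming side: add one event to the 1-hour bucket its timestamp falls into. */
    void record(String key, long eventTimeMs) {
        long hour = eventTimeMs / HOUR_MS;
        buckets.computeIfAbsent(key, k -> new HashMap<>()).merge(hour, 1L, Long::sum);
    }

    /** Rule side: sum the n most recent buckets, where n comes from the rule configuration. */
    long countLastHours(String key, long nowMs, int n) {
        Map<Long, Long> perHour =
                buckets.getOrDefault(key, Collections.<Long, Long>emptyMap());
        long currentHour = nowMs / HOUR_MS;
        long total = 0;
        for (long h = currentHour - n + 1; h <= currentHour; h++) {
            total += perHour.getOrDefault(h, 0L);
        }
        return total;
    }

    public static void main(String[] args) {
        HourlyBuckets store = new HourlyBuckets();
        String key = "register:ip:1.2.3.4";
        store.record(key, 1_800_000L);   // 0.5 h -> bucket 0
        store.record(key, 4_320_000L);   // 1.2 h -> bucket 1
        store.record(key, 6_480_000L);   // 1.8 h -> bucket 1
        store.record(key, 14_760_000L);  // 4.1 h -> bucket 4

        long now = 16_200_000L;          // 4.5 h -> bucket 4
        System.out.println(store.countLastHours(key, now, 1)); // 1
        System.out.println(store.countLastHours(key, now, 5)); // 4
    }
}
```

Changing N from 1 to 5 only changes a read-side parameter; the streaming job and its state are untouched. The trade-off is that buckets are aligned to clock hours, so "the last N hours" is approximated to bucket boundaries rather than being an exact sliding window.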

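Analysis System
Seen statically, what we have so far is a complete risk control system. Seen dynamically, something is missing, not in functionality but in the ability to evolve. From a dynamic point of view, a risk control system needs at least two more things: a way to measure its overall effectiveness, and a basis for upgrading its rules and logic.
To measure overall effectiveness, we need to:
- Determine whether a rule has stopped working, for example when its interception rate drops suddenly;
- Determine whether a rule is redundant, for example when it has never intercepted any event;
- Determine whether the rules have loopholes, for example when the benefits of a promotion or voucher campaign are fully claimed but the campaign does not achieve the expected effect.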
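
The first two checks can be sketched with simple per-rule counters. This is only an illustration under assumptions: the class name RuleStats is hypothetical, and "sudden drop" is modeled as the current interception rate falling below a chosen fraction of the previous period's rate, which is one possible definition rather than the article's.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class RuleStats {
    // ruleId -> {evaluated count, intercepted count}
    private final Map<String, long[]> counters = new TreeMap<>();

    /** Record one evaluation of a rule and whether it intercepted the event. */
    void record(String ruleId, boolean intercepted) {
        long[] c = counters.computeIfAbsent(ruleId, k -> new long[2]);
        c[0]++;
        if (intercepted) {
            c[1]++;
        }
    }

    /** Fraction of evaluations that ended in an interception. */
    double interceptRate(String ruleId) {
        long[] c = counters.get(ruleId);
        return (c == null || c[0] == 0) ? 0.0 : (double) c[1] / c[0];
    }

    /** Candidates for redundant rules: evaluated but never intercepted anything. */
    List<String> neverIntercepted() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, long[]> e : counters.entrySet()) {
            if (e.getValue()[1] == 0) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Possible sign of a defeated rule: the rate fell below ratio * previousRate. */
    static boolean droppedSharply(double previousRate, double currentRate, double ratio) {
        return previousRate > 0 && currentRate < previousRate * ratio;
    }

    public static void main(String[] args) {
        RuleStats stats = new RuleStats();
        stats.record("ip-register-1h", true);
        stats.record("ip-register-1h", false);
        stats.record("coupon-3m", false);
        System.out.println(stats.interceptRate("ip-register-1h")); // 0.5
        System.out.println(stats.neverIntercepted());              // [coupon-3m]
    }
}
```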
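To provide a basis for upgrading rules and logic, we need to:
- Discover global patterns: for example, one person's spending on electronics suddenly grows 100-fold. In isolation this looks suspicious, but overall many people may show the same pattern because Apple has just released a new product.
- Recognize combinations of behaviors: each single action is normal, but the combination is abnormal. Buying a kitchen knife is normal, buying a train ticket is normal, buying rope is normal, and filling up at a gas station is normal, but doing all of these within a short time is not.
- Identify groups: for example, use graph analysis to discover a group and then tag every account in it with a group label, to prevent the situation where each account looks normal on its own but the group as a whole is systematically exploiting promotions.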
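
One simple form of graph-based group identification is finding connected components: accounts that share a device or an IP, directly or through a chain of other accounts, end up in the same group. The sketch below uses a union-find structure for this. It is an assumed illustration of the idea, not the article's method; the class name AccountGroups and the "acct:" / "attr:" key prefixes are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Union-find over accounts and the attributes (device ID, IP, ...) they share. */
class AccountGroups {
    private final Map<String, String> parent = new HashMap<>();

    private String find(String x) {
        parent.putIfAbsent(x, x);
        String root = x;
        while (!root.equals(parent.get(root))) {
            root = parent.get(root);
        }
        // Path compression: point every node on the path directly at the root.
        while (!x.equals(root)) {
            String next = parent.get(x);
            parent.put(x, root);
            x = next;
        }
        return root;
    }

    /** Record that an account was seen with a shared attribute such as a device ID or IP. */
    void link(String account, String sharedAttribute) {
        String a = find("acct:" + account);
        String b = find("attr:" + sharedAttribute);
        if (!a.equals(b)) {
            parent.put(a, b);
        }
    }

    /** The group label is the component's root; all accounts in a group share it. */
    String groupLabel(String account) {
        return find("acct:" + account);
    }

    boolean sameGroup(String account1, String account2) {
        return groupLabel(account1).equals(groupLabel(account2));
    }

    public static void main(String[] args) {
        AccountGroups groups = new AccountGroups();
        groups.link("u1", "device-A");
        groups.link("u2", "device-A");
        groups.link("u2", "ip-1");
        groups.link("u3", "ip-1");
        groups.link("u4", "device-B");
        System.out.println(groups.sameGroup("u1", "u3")); // true: linked through u2
        System.out.println(groups.sameGroup("u1", "u4")); // false
    }
}
```

Once every account carries a group label, metrics such as "items bought by this group in the last hour" can be keyed by the label instead of by the individual account.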
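This is the role of the analysis system. Part of its work is deterministic and part is exploratory. To do this work, the system needs as much supporting data as possible, such as:
- Business system data: tracking data from the business side that records detailed user, transaction, or campaign data;
- Risk control interception data: tracking data from the risk control system itself. For example, a user with certain characteristics is intercepted by a certain rule, and that interception is itself an event.
This is a typical big data analysis scenario, and the architecture is fairly flexible.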

[Figure: analysis system architecture]

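Relatively speaking, this system is the most open-ended: it covers fixed metric analysis, and it can also use machine learning and data analysis techniques to discover new rules or patterns.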

3. First-Version Requirement Development

Data source:

Package: com.star.engine.pojo

Log entity class

package com.star.engine.pojo;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.ToString;

import java.util.Map;

@Data
@NoArgsConstructor
@AllArgsConstructor
@ToString
public class ClientLog {
    private String userNo;      // user ID
    private String userName;    // user name
    private String appId;       // app ID
    private String appVersion;  // app version
    private String addr;        // address
    private String carrier;     // mobile carrier
    private String imei;        // device ID
    private String deviceType;  // device type
    private String ip;          // client IP
    private String netType;     // network type: WIFI, 4G, 5G
    private String osName;      // operating system name
    private String osVersion;   // operating system version
    private String sessionId;   // session ID
    private String detailTime;  // detailed creation time
    private String eventId;     // event ID
    private String eventType;   // event type
    private String createTime;  // creation time
    private String gps;         // longitude/latitude information
    // Event detail properties. The type parameters were lost in the source text;
    // <String, String> is an assumption.
    private Map<String, String> properties;
}

Utility classes

Package: com.star.engine.utils

Constants: constants class

package com.star.engine.utils;

/**
 * Constant values.
 */
public class Constants {
    // Kafka topic
    public static final String CLIENT_LOG = "client_log";
    public static final String HBASE_TABLE = "events_db:users";
    public static final String BROKERS = "star01:9092,star02:9092,star03:9092";
    public static final Integer REDIS_PORT = 6379;
    public static final String HOST = "star01";
    public static final String REDIS_ADDR = "star01";
    public static final String ZOOKEEPER_PORT = "2181";
    public static final String RULE_TYPE_LOGIN = "login";
}

FlinkKafkaUtils: Flink Kafka utility class

package com.star.engine.utils;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

import java.util.Properties;

public class FlinkKafkaUtils {

    public static Properties getProducerProperties(String brokers) {
        Properties properties = getCommonProperties();
        properties.setProperty("bootstrap.servers", brokers);
        // Legacy settings from the old Scala clients; current Kafka clients
        // do not recognize them and ignore them.
        properties.setProperty("metadata.broker.list", brokers);
        properties.setProperty("zookeeper.connect", Constants.HOST + ":" + Constants.ZOOKEEPER_PORT);
        return properties;
    }

    public static Properties getCommonProperties() {
        Properties properties = new Properties();
        properties.setProperty("linger.ms", "100");
        properties.setProperty("retries", "100");
        properties.setProperty("retry.backoff.ms", "200");
        properties.setProperty("batch.size", "100");
        properties.setProperty("compression.type", "snappy");
        // buffer.memory, max.request.size, request.timeout.ms and max.block.ms have
        // empty values in the source. They are left unset here so that Kafka's
        // defaults apply, because an empty string cannot be parsed as a number.
        return properties;
    }

    /** Kafka source for client log events. */
    public static FlinkKafkaConsumer<String> getKafkaEventSource() {
        Properties props = getProducerProperties(Constants.BROKERS);
        props.setProperty("auto.offset.reset", "latest");
        // Topic to consume
        FlinkKafkaConsumer<String> source =
                new FlinkKafkaConsumer<>(Constants.CLIENT_LOG, new SimpleStringSchema(), props);
        return source;
    }

    /** Kafka source for rule updates. */
    public static FlinkKafkaConsumer<String> getKafkaRuleSource() {
        Properties props = getProducerProperties(Constants.BROKERS);
        props.setProperty("auto.offset.reset", "latest");
        FlinkKafkaConsumer<String> source =
                new FlinkKafkaConsumer<>("yinew_drl_rule", new SimpleStringSchema(), props);
        return source;
    }
}

HBaseUtils: HBase utility class

package com.star.engine.utils;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;
import java.util.List;

public class HBaseUtils {

    private static Connection connection;
    private static Configuration configuration;

    static {
        configuration = HBaseConfiguration.create();
        configuration.set("hbase.zookeeper.property.clientPort", Constants.ZOOKEEPER_PORT);
        configuration.set("hbase.zookeeper.quorum", Constants.HOST);
        try {
            connection = ConnectionFactory.createConnection(configuration);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Uses the HTable constructor from the HBase 1.x client API, as in the source.
    public static HTable initHbaseClient(String tableName) {
        try {
            return new HTable(configuration, tableName);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    /**
     * Create an HBase table.
     *
     * @param tableName      table name
     * @param columnFamilies column family names
     * @return false if the table already exists or creation fails
     */
    public static boolean createTable(String tableName, List<String> columnFamilies) {
        try (Admin admin = connection.getAdmin()) {
            if (admin.tableExists(TableName.valueOf(tableName))) {
                return false;
            }
            HTableDescriptor tableDescriptor = new HTableDescriptor(TableName.valueOf(tableName));
            columnFamilies.forEach(columnFamily -> {
                HColumnDescriptor columnDescriptor = new HColumnDescriptor(columnFamily);
                columnDescriptor.setMaxVersions(2);
                tableDescriptor.addFamily(columnDescriptor);
            });
            admin.createTable(tableDescriptor);
            return true;
        } catch (IOException e) {
            e.printStackTrace();
            return false;
        }
    }

    /**
     * Delete an HBase table.
     *
     * @param tableName table name
     */
    public static boolean deleteTable(String tableName) {
        try (Admin admin = connection.getAdmin()) {
            // A table must be disabled before it can be deleted
            admin.disableTable(TableName.valueOf(tableName));
            admin.deleteTable(TableName.valueOf(tableName));
            return true;
        } catch (Exception e) {
            e.printStackTrace();
            return false;
        }
    }

    /**
     * Insert one cell.
     *
     * @param tableName        table name
     * @param rowKey           row key
     * @param columnFamilyName column family
     * @param qualifier        column qualifier
     * @param value            value to store
     */
    public static boolean putRow(String tableName, String rowKey, String columnFamilyName,
                                 String qualifier, String value) {
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            Put put = new Put(Bytes.toBytes(rowKey));
            put.addColumn(Bytes.toBytes(columnFamilyName), Bytes.toBytes(qualifier), Bytes.toBytes(value));
            table.put(put);
            return true;
        } catch (IOException e) {
            e.printStackTrace();
            return false;
        }
    }

    /**
     * Get a whole row by row key.
     *
     * @param tableName table name
     * @param rowKey    row key
     */
    public static Result getRow(String tableName, String rowKey) {
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            Get get = new Get(Bytes.toBytes(rowKey));
            return table.get(get);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    /**
     * Get the latest version of one cell.
     *
     * @param tableName    table name
     * @param rowKey       row key
     * @param columnFamily column family
     * @param qualifier    column qualifier
     */
    public static String getCell(String tableName, String rowKey, String columnFamily, String qualifier) {
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            Get get = new Get(Bytes.toBytes(rowKey));
            if (!get.isCheckExistenceOnly()) {
                get.addColumn(Bytes.toBytes(columnFamily), Bytes.toBytes(qualifier));
                Result result = table.get(get);
                byte[] resultValue = result.getValue(Bytes.toBytes(columnFamily), Bytes.toBytes(qualifier));
                return Bytes.toString(resultValue);
            } else {
                return null;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    /**
     * Scan the whole table. The caller is responsible for closing the scanner.
     *
     * @param tableName table name
     */
    public static ResultScanner getScanner(String tableName) {
        try {
            Table table = connection.getTable(TableName.valueOf(tableName));
            Scan scan = new Scan();
            return table.getScanner(scan);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    /**
     * Scan the table with a filter.
     *
     * @param tableName  table name
     * @param filterList filters to apply
     */
    public static ResultScanner getScanner(String tableName, FilterList filterList) {
        try {
            Table table = connection.getTable(TableName.valueOf(tableName));
            Scan scan = new Scan();
            scan.setFilter(filterList);
            return table.getScanner(scan);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    /**
     * Scan a row key range with a filter.
     *
     * @param tableName   table name
     * @param startRowKey start row key
     * @param endRowKey   stop row key
     * @param filterList  filters to apply
     */
    public static ResultScanner getScanner(String tableName, String startRowKey, String endRowKey,
                                           FilterList filterList) {
        try {
            Table table = connection.getTable(TableName.valueOf(tableName));
            Scan scan = new Scan();
            scan.setStartRow(Bytes.toBytes(startRowKey));
            scan.setStopRow(Bytes.toBytes(endRowKey));
            scan.setFilter(filterList);
            return table.getScanner(scan);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    /**
     * Delete a row.
     *
     * @param tableName table name
     * @param rowKey    row key
     */
    public static boolean deleteRow(String tableName, String rowKey) {
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            Delete delete = new Delete(Bytes.toBytes(rowKey));
            table.delete(delete);
            return true;
        } catch (IOException e) {
            e.printStackTrace();
            return false;
        }
    }

    /**
     * Delete one column of a row.
     *
     * @param tableName  table name
     * @param rowKey     row key
     * @param familyName column family
     * @param qualifier  column qualifier
     */
    public static boolean deleteColumn(String tableName, String rowKey, String familyName, String qualifier) {
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            Delete delete = new Delete(Bytes.toBytes(rowKey));
            delete.addColumn(Bytes.toBytes(familyName), Bytes.toBytes(qualifier));
            table.delete(delete);
            return true;
        } catch (IOException e) {
            e.printStackTrace();
            return false;
        }
    }
}

KafkaProducerUtils: Kafka producer utility class

package com.star.engine.utils;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;

import java.util.Properties;

public class KafkaProducerUtils {

    static Producer<String, String> producer;

    public static void init() {
        Properties props = new Properties();
        // Kafka broker addresses. metadata.broker.list and producer.type are
        // old-client settings that the current KafkaProducer ignores.
        props.put("metadata.broker.list", Constants.BROKERS);
        props.put("bootstrap.servers", Constants.BROKERS);
        // Serializer for values
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Serializer for keys
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("producer.type", "async");
        // The source sets the old-client key request.required.acks to -1;
        // "acks" is the equivalent setting for the current producer.
        props.put("acks", "-1");
        producer = new KafkaProducer<>(props);
    }

    public static synchronized Producer<String, String> getProducer() {
        if (producer == null) {
            init();
        }
        return producer;
    }
}

First-version requirement: 1. Determine whether a login comes from a different location than the previous one.

package com.star.engine.core;

import com.alibaba.fastjson.JSON;
import com.star.engine.pojo.ClientLog;
import com.star.engine.utils.Constants;
import com.star.engine.utils.FlinkKafkaUtils;
import com.star.engine.utils.HBaseUtils;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.hadoop.hbase.client.HTable;
import org.locationtech.spatial4j.distance.DistanceUtils;

import java.text.SimpleDateFormat;
import java.util.Iterator;

/**
 * First-version requirement:
 * 1. Determine whether a login comes from a different location.
 */
public class Processl {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment environment =
                StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());

        // 1. Read events from Kafka
        DataStreamSource<String> source = environment.addSource(FlinkKafkaUtils.getKafkaEventSource());
        SingleOutputStreamOperator<ClientLog> clientlogSource =
                source.map(str -> JSON.parseObject(str, ClientLog.class));

        clientlogSource.keyBy(clientLog -> clientLog.getUserNo())
                // The type parameters were lost in the source; a String output is an assumption.
                .process(new KeyedProcessFunction<String, ClientLog, String>() {
                    // HBase table with user profile data (opened here, not read yet in this version)
                    HTable table;
                    // Keyed state holding the user's previous login event
                    ListState<ClientLog> privousLoginData;

                    @Override
                    public void open(Configuration parameters) throws Exception {
                        table = HBaseUtils.initHbaseClient(Constants.HBASE_TABLE);
                        privousLoginData = getRuntimeContext().getListState(
                                new ListStateDescriptor<>("privoulsLoginData", ClientLog.class));
                    }

                    // Process each event
                    @Override
                    public void processElement(ClientLog clientLog, Context context, Collector<String> out)
                            throws Exception {
                        String eventType = clientLog.getEventType();
                        if (Constants.RULE_TYPE_LOGIN.equals(eventType)) {
                            // User login. Planned checks:
                            //   1. login count limit within 5 minutes (not implemented yet)
                            //   2. login from a different location
                            //   3. login outside the usual region (not implemented yet)
                            // The source computes the result but never emits it; emitting it
                            // here so that print() shows something is an assumption.
                            out.collect(distanceProcess(clientLog, privousLoginData));
                        }
                    }
                }).print();

        environment.execute("Processl");
    }

    /**
     * Different-location login check based on the travel speed implied by two consecutive logins.
     *
     * @param clientLog        current login event
     * @param privousLoginData state holding the previous login event
     * @return "score | reason" (this output format is an assumption)
     */
    public static String distanceProcess(ClientLog clientLog, ListState<ClientLog> privousLoginData) {
        // Assumption: detailTime looks like "2021-01-01 12:00:00"; the source does not state the format.
        SimpleDateFormat simpleDateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        String score = "0";
        String reason = "First login: " + clientLog.getAddr() + " local login time: " + clientLog.getDetailTime();
        try {
            // Check whether there is a previous login
            Iterator<ClientLog> iterator = privousLoginData.get().iterator();
            if (iterator.hasNext()) {
                // Not the first login
                ClientLog privousMessage = iterator.next();
                // Static rules: "speedLimit|score" pairs separated by '@'
                String distanceRule = "500|1@400|2@300|3@200|4@100|5";
                String[] distanceRules = distanceRule.split("@");
                String oldTime = privousMessage.getDetailTime();
                String oldAddr = privousMessage.getAddr();
                // gps is read as "first,second" in the same order as the original call
                String[] newGps = clientLog.getGps().split(",");
                String[] oldGps = privousMessage.getGps().split(",");

                // distHaversineRAD takes radians and returns radians, so convert
                // degrees on the way in and to kilometres on the way out.
                double distanceRad = DistanceUtils.distHaversineRAD(
                        DistanceUtils.toRadians(Double.parseDouble(newGps[0])),
                        DistanceUtils.toRadians(Double.parseDouble(newGps[1])),
                        DistanceUtils.toRadians(Double.parseDouble(oldGps[0])),
                        DistanceUtils.toRadians(Double.parseDouble(oldGps[1])));
                double distanceReal = DistanceUtils.radians2Dist(distanceRad, DistanceUtils.EARTH_MEAN_RADIUS_KM);

                // Time difference in milliseconds, then speed in km/h
                long time = simpleDateFormat.parse(clientLog.getDetailTime()).getTime()
                        - simpleDateFormat.parse(oldTime).getTime();
                double speed = distanceReal / (time / (1000 * 3600.0));

                // Rule matching. There is no break, so the last matching rule determines the score.
                for (String rule : distanceRules) {
                    double speedLimit = Double.parseDouble(rule.split("\\|")[0]);
                    String speedScore = rule.split("\\|")[1];
                    if (speed >= speedLimit) {
                        score = speedScore;
                        reason += "=== speed over the interval: " + speed
                                + " speed limit in rule: " + speedLimit
                                + " current login location: " + clientLog.getAddr()
                                + " previous login location: " + oldAddr;
                    }
                }
            } else {
                // First login
                score = "5";
                reason = "First login: " + clientLog.getAddr() + " login time: " + clientLog.getDetailTime();
            }
            privousLoginData.clear();
            privousLoginData.add(clientLog);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return score + " | " + reason;
    }
}
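
The core of the different-location check is the speed implied by two consecutive logins: great-circle distance divided by elapsed time. The standalone sketch below shows that calculation with only the JDK, assuming coordinates in decimal degrees and timestamps in epoch milliseconds. The class TravelSpeed and its method names are hypothetical, and it uses its own haversine implementation rather than the spatial4j call from the job above.

```java
/** Implied travel speed between two logins, using the haversine formula. */
class TravelSpeed {
    private static final double EARTH_RADIUS_KM = 6371.0088; // mean Earth radius

    /** Great-circle distance in kilometres between two points given in degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Speed in km/h. With no elapsed time, any positive distance counts as infinite speed. */
    static double speedKmh(double distanceKm, long elapsedMs) {
        if (elapsedMs <= 0) {
            return distanceKm > 0 ? Double.POSITIVE_INFINITY : 0.0;
        }
        return distanceKm / (elapsedMs / 3_600_000.0);
    }

    public static void main(String[] args) {
        // Beijing to Shanghai, logins one hour apart
        double km = haversineKm(39.9042, 116.4074, 31.2304, 121.4737);
        double kmh = speedKmh(km, 3_600_000L);
        System.out.println("distance km: " + km);
        System.out.println("implied speed km/h: " + kmh);
        System.out.println("suspicious at a 500 km/h limit: " + (kmh >= 500));
    }
}
```

Handling the zero-elapsed-time case explicitly avoids dividing by zero when two login events carry the same timestamp.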

Published by: 全栈程序员-站长. Please credit the source when reposting: https://javaforall.net/217036.html. Original link: https://javaforall.net
