How AHI Fintech and DataVisor are Securing Data through AI and Big Data

How AHI Fintech and DataVisor are Securing Data through AI and Big Data

大家好,又见面了,我是你们的朋友全栈君。

The field of financial risk control has recently seen a sudden increase in competition over the past year. Several budding enterprises find themselves currently fighting a battle on two fronts—data acquisition capabilities and algorithm technology.

In June 2017, China’s Cyber Security Act was launched. Companies that crawl users’ mobile phone for data without prior authorization may now be facing serious legal implications. This can include a 7-year jail sentence for legal representatives of the convicted company. Furthermore, many businesses in the field of data acquisition and transactional data are now facing thorough investigation.

With the loss of the gray data industry, the risk control industry seems to be facing an opportunity to move toward healthy compliance, despite the technical challenges it still faces.

A Close Race: Algorithms or Data?

Despite the increasingly thorough regulations, cyber-attack rates are still on the rise. This makes user data even more precious, especially for businesses reliant on the integration of external data. Instead of circumventing regulations, enterprises have now shifted their focus on two major aspects of big data analysis – algorithms and modeling. These two areas are crucial in the field of big data as their emergence has given rise to a group of new risk-control companies in China.

Huang Ling, CEO of AHI FinTech and a Ph.D. in computer science at the University of California, Berkeley, and a part-time professor at the Interdisciplinary Information Institute at Tsinghua University described his work as “A global war.” “In the risk control industry, our opponents are huge, and feature worldwide black-market production chain.”

Since the very beginning, risk control has been facing a global opponent in the form of a community of hackers. These hackers invade other people’s phones and computers through malicious software. On the one hand, they can access confidential data, on the other hand, they can use this compromised information to open fake accounts. They also can do all kinds of false social interactions such as leaving reviews or making purchases. They eventually have a seemingly normal account with a lot of friends and good credit history. Ultimately, they use these accounts to apply for a variety of financial products.

Meanwhile, the core of risk control is using relevant data to conduct modeling analysis and eliminate fake users, then provide repayment ability and repayment willingness and risk control evaluations for real users.

The data risk control companies that have begun to pop up over the past few years complete the resources that this work relies on algorithms and data acquisition capabilities.

Currently, in the face of the massive worldwide “gangs” of hackers and an enormous black-market production chain, risk control, and anti-fraud solutions in the market are lagging an obvious step behind regarding of algorithmic technology. Program providers often use device fingerprints, black/white lists, regular systems, or tagged machine learning models to detect fraudulent activities. Some methods only conduct shallow analyses; therefore, it is easy for malicious opponents to circumvent and deceive. Some use machine learning methods but often rely on tagged historical data to train models. This labeled data is often scarce and represents only the fraudulent activities that have occurred in the past. Furthermore, the models trained with this data are not accurate enough to cope with the ever-changing fraudulent practices.

Meanwhile, a vast number of risk control companies are primarily reliant on strong data acquisition and integration capabilities. But by malicious crawling, purchasing of hacked data, and so on, the consolidated data eventually includes a substantial portion of the individual’s confidential data which include ID card, phone number, bank card, savings account, or exact home address. The solutions that currently exist in the industry are extremely dependent on this kind of data. However, this data uses infringes on personal privacy and its legal legitimacy has been the target of heavy criticism. However, the amount of data available within the industry has a profound impact on modeling accuracy, and after the omission of these sensitive data, it will test the algorithms’ ability to perform.

Acquisition of Risk Control Data: Is Magnitude or Scenario More Important?

Will the sophistication of the algorithm make up for the loss of a large amount of sensitive data? Last month, in an exclusive interview with HC Financial Service Group CFO Shen Yutong (Tony Shen) in New York, the Big Data Digest said we should be more cautious when using data that is not directly related to lending behavior or credit behavior.

“Some people frequently shop online, but because of their frequent shopping habits, they find themselves short of cash and in need of a loan. It is necessary to realize that regular online payments do not always mean that the buyer is a good person to give credit to. Therefore, they fail to get loans when applying for greater amounts of credit.”

Social and online-shopping data, while valuable, are not necessarily more useful than data that is directly related to a person’s financial situation. Mr. Shen’s attitude reflects that of traditional financial experts toward Internet risk control—that is one of caution. Furthermore, it embodies a problem currently faced by the industry, when acquiring risk control data, is the volume of data more important than the data scenarios or vice versa?

With regards to this issue, Huang Ling is obviously more supportive of the latter type, “I think it depends on what kind of data we’re talking about and how we use those data. The data which we’re crawling isn’t necessarily going to make much of a difference here. For customer data and application scenarios, it is important to help them mine the data more accurately and closer to the goal of the data.

1

Core customers for financial risk control are Internet companies, Internet financial enterprises, and other financial institutions. What these institutions have in common is a large number of accounts. With accounts at the center, we can acquire a lot of personal information including bank balances, purchasing history, borrowing history, and more. The work of risk control then is to build models around this data and make assessments of the user’s ability and willingness to make repayments, proper loan amount, etc.

Huang Ling believes that desensitizing data for user behavior models can also help realize the goal of risk control fraud prevention. “When executing behavior analysis, we usually look at the person’s social relationships, phone records, and e-commerce behavior. Here, behavior refers to where, when, and on what device the user registers and logs on to the website, what they did after logging in (the pages they visited, the products they purchased, the friends they added, who the spoke to, etc.). Even though this data includes some sensitive information (such as who your friends are) the data is desensitized. This data is then fed into a graphing algorithm and user association analysis is used to identify the hidden information related to the interaction between users.”

Huang Ling and his team at AHI FinTech Quest

“We have almost no sensitive data belonging to users, more important for us is to use non-sensitive data, targeting the client’s behavior data. We then combine it with user use scenarios, use AI and Big Data methods to help the client get value out of their data, then create the most appropriate risk control model for the client’s user scenario. Subsequently, we help them achieve the most optimal testing results on their platform. This way, we can automatically detect abnormal connections among tens of millions of users, produce risk control warnings, and guard against organized and systematic risks. We execute all without infringing on users’ privacy or having insight on the type or characteristics of the attack.”

Redefining Risk Control: Starting from the Data Source

This kind of data acquisition method also poses greater and more serious requirements to companies in the industry.

“When we do risk control modeling, the first thing we look at is the quality of the data, including whether or not the data is complete and whether or not it includes data that is relevant to risk control.”

Huang Ling believes that risk control happens not only during modeling and testing, but on the side of the company, beginning with the collection of data. When dealing with customers, AHI FinTech focuses on helping its clients elevate their related abilities from a service perspective:

First, after a risk control signal is sent to the client’s platform, the platform can then block users with a high-risk value. For users with lower but insignificant risk control values, the system can merge data from other dimensions into their rules and models. It can also perform further processing and refining, and then re-process the user.

2

Furthermore, the system provides feedback during each step of data collection as well as the discovery of fraudulent trends and modeling.

The systems conduct these two aspects in parallel. If collected data does not meet quality standards, then the client must be required to adjust, then provide feedback on its issues in certain aspects. Even if the data cannot be made up for right away, the system has to fix it as soon as possible.”

“We make recommendations to the client on how to collect data according to our own risk control and fraud prevention experience, so working with us is not just a matter of us helping you meet fraud detection requirements, we also give the client a lot of feedback and keep open lines of communication. We offer them comprehensive consulting and service from system applications to data collection and risk control.”

The opponent of financial risk control is the enormous black-market production chain, making the matter extremely complicated. Organizations are introducing innovative technologies like AI and machine learning to the field in droves but using them correctly is certainly not a simple matter.

Most of solutions currently in the market are reliant on collecting massive amounts of data, then using rules systems or supervised machine learning generated models. These solutions harbor an obvious shortcoming: the models are always reliant on training with historical tag data. However, we can produce tags can only after we have suffered a fraud attack. We create them at the cost of our own sweat, blood, and tears. As our goal is for these kinds of attacks to be increasingly rare, we find ourselves lacking data for training models. Models produced by this kind of tag training are never good enough, and they can only represent fraudulent behavior that we’ve seen in the past. When fraudsters invent new methods, our models that are reliant on tag training always have difficulty in quickly and accurately stopping them—often creating massive losses.

Huang Ling’s team uses semi-supervised learning on data with only a few—or even no—tags to generate models, allowing them to significantly reduce the cost of acquiring new tags, increasing data usage, and producing higher quality models. Using an active machine learning platform, massive data processing capabilities provided by a Big Data system that combines organic and artificial intelligence, and the experience of risk control experts to help artificial intelligence automatically learn previously unknown fraud tactics. Additionally, it can track new fraud methods, and constantly adapt to an ever-changing environment to created anti-fraud machine learning models. This makes it significantly difficult for fraudsters to evade detection.

The Black-Market Production Chain in the Risk Control Industry

In addition to AHI FinTech, there is a Silicon Valley company—Datavisor—in the risk control field that takes a similar approach. Beginning in 2014, Huang Ling left his 7-year career as a senior researcher at Intel, becoming a founding member and Director of Data at Datavisor where he hosted the company’s entire machine learning, user behavior analysis, and credit modeling system. Here, he became a party to the next generation in Silicon Valley and became the most well-known expert in using unsupervised risk control methods.

3

Huang Ling has always believed that risk control in China is not particularly comparable to that in Silicon Valley. The black products faced by the anti-fraud industry are an entire production chain, made up of a gang of sorts that is spread out around the globe. This chain stretches from Eastern Europe to America to China to India. Moreover, it includes security attack software at the top to the people who use this software to control people and phones around the world. It also includes people and phones to create fake users who execute all kinds of fraudulent activity and reap the benefits.

Therefore, to a certain extent, you can say that risk control and anti-fraud work are universal. Several Internet companies and financial institutions in China are also facing attacks from abroad, and a lot of attacks perpetrated in America are conducted via China, India, Africa, or one of several South East Asian countries.

As a result, the much significant difference between America and China likely lies in differences in political policy and industry development:

Because the credit system in America is sound, the cost of committing fraud is relatively high. Whether the user is defrauding a bank or an online merchant, these kinds of activities usually affect the user’s credit score through a variety of channels. In China, this system is still under-developed; therefore, in many situations, user’s credit rating according to the central bank does not reflect online financial fraudulent activities. Therefore, the cost of committing fraud is comparatively low. As a result, examples of large-scale fraud are more abundant in China than they are in the States and they tend to be harder to handle effectively.

Furthermore, the development of the industry has been different in China and America. China’s mobile applications and Internet financial industries have grown to be larger than their counterparts in America, so there is more fraudulent activity surrounding these two sectors than in America.

“After coming back, we noticed that in China—especially in fields related to finance—this kind of fraud gang was larger and craftier than they are in America. They also use more real people to commit fraud, making them harder to detect and requiring more machine learning and AI modeling methods.” Huang Ling said.

And on a global battlefield such as this, the addition of experts in artificial intelligence algorithms and security scientists is even more invaluable.

Turning to the entrepreneurial mind, Huang Ling said, “I have been researching and practicing in the fields of artificial intelligence algorithms and Internet security for several years. I hope that the skills and experience I have acquired over this time will be useful in the fields of financial risk control and fraud prevention. I also wish to provide a complete set of systems and services to accompany financial and Internet products and thereby achieve a more secure, honest, and fair industry environment. “

Aside from Huang Ling, the other co-founder of AHI FinTech—chief scientist Xu Wei—also hails from academia. Xu Wei served as a cross-institute adjunct professor at Tsinghua University.

Conclusion

Huang Ling regards the entrepreneurship of AI scientists in the field of risk control as a good thing. This is because they possess the skill and understanding of algorithms. Additionally, he is willing to give them a chance to really participate in the industry, rather than just being a cog in the machine.

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请联系我们举报,一经查实,本站将立刻删除。

发布者:全栈程序员-站长,转载请注明出处:https://javaforall.net/107542.html原文链接:https://javaforall.net

(0)
全栈程序员-站长的头像全栈程序员-站长


相关推荐

  • win10闲置服务如何关闭_任务管理器中服务主机进程有什么用

    win10闲置服务如何关闭_任务管理器中服务主机进程有什么用在使用Windows10系统电脑过程中,一位用户打开任务管理器时发现一些空闲进程会占用比较多的CPU,因此想知道能否将它关闭掉。为此,小编整理了关闭方法,有需要的用户,请来看看win10系统空闲进程占用cpu怎么关闭吧。windows10系统使用过程中,会默认运行很多进程,但有许多是空闲进程,且会占用很多空间,因此win10系统空闲进程占用cpu多最好的解决方法就是关闭空闲进程,如何关闭空闲进程呢…

    2022年10月20日
    3
  • 数据库怎么创建学生表_设计数据库,创建数据库和数据表

    数据库怎么创建学生表_设计数据库,创建数据库和数据表知识点:数据库表的相关概念、创建数据库表的方法、设计数据库表、向数据库表中插入数据、建立不同数据库表之间的关系、删除数据库表。1、数据表相关的一些概念1.1数据库里的数据是如何保存的?数据库到底是怎么存储数据的?比如要把学生信息存储到数据库里,能把学生塞进数据库吗?肯定是把学生的数据信息抽象出来,把一些重要信息以文字或数字的形式保存到数据库中去。…

    2022年9月25日
    1
  • NVIC库函数

    NVIC库函数1.voidNVIC_Init(NVIC_InitTypeDef*NVIC_InitStruct)功能:根据NVIC_InitStruct结构体变量中的参数初始化NVIC外设注释:结构体中的NVIC_IRQChannel成员赋值要到stm32f10x.h中的IRQn_Type(STM32F10x中断数定义)去复制例如:NVIC_Init(&NVIC_InitStructur…

    2022年5月28日
    106
  • 基于cesuim三维框架开发的三维路径分析的实现「建议收藏」

    基于cesuim三维框架开发的三维路径分析的实现「建议收藏」1、可以利用百度地图web服务或者天地图web服务,得到二维的路径分析的经纬度;2、利用cesuim地形数据采样接口:sampleTerrain得到高程,然后就有了三维路径分析的坐标信息;3、然后利用画线的接口,就能完成路径分析;记录一下sanpleTerrain的用法://QuerytheterrainheightoftwoCartographicpositionsva…

    2022年8月24日
    5
  • 颜色校准调整伽马_色彩gamma什么意思

    颜色校准调整伽马_色彩gamma什么意思目录1、色彩矫正(CCM)2、伽马校正(Gamma)1、色彩矫正(CCM)色彩校正(ColorCorrection)是指用相同的方法改变图像中的所有像素的颜色值,以得到不同得显示效果。图像采集系统在获得数字图像时,由于一起或环境光照或人为因素的影响,采集的图像往往与原始图像有很大差别。颜色校正可以在一定程度上减少这种差别。利用RGB颜色模型可以方便地调整图像的RGB分量值,这对校正偏色很有用。色彩校正的基本原理如下:其中,Mij…

    2022年9月16日
    4
  • Django(17)orm查询操作[通俗易懂]

    Django(17)orm查询操作[通俗易懂]前言查找是数据库操作中一个非常重要的技术。查询一般就是使用filter、exclude以及get三个方法来实现。我们可以在调用这些方法的时候传递不同的参数来实现查询需求。在ORM层面,这些查询条件都

    2022年7月28日
    9

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注

关注全栈程序员社区公众号