How AHI Fintech and DataVisor are Securing Data through AI and Big Data



Competition in financial risk control has intensified sharply over the past year. Many young companies now find themselves fighting on two fronts: data acquisition capabilities and algorithm technology.

In June 2017, China’s Cybersecurity Law took effect. Companies that crawl users’ mobile phones for data without prior authorization may now face serious legal consequences, which can include a seven-year jail sentence for the legal representatives of the convicted company. Furthermore, many businesses in data acquisition and data trading are now under thorough investigation.

With the gray-market data industry closed off, the risk control industry has an opportunity to move toward healthy compliance, despite the technical challenges it still faces.

A Close Race: Algorithms or Data?

Despite increasingly thorough regulation, cyber-attack rates are still on the rise. This makes user data even more precious, especially for businesses that rely on integrating external data. Instead of circumventing regulations, enterprises have now shifted their focus to two major aspects of big data analysis: algorithms and modeling. These two areas are crucial in the field of big data, and their emergence has given rise to a group of new risk-control companies in China.

Huang Ling, CEO of AHI FinTech, holds a Ph.D. in computer science from the University of California, Berkeley, and is a part-time professor at the Interdisciplinary Information Institute at Tsinghua University. He describes his work as “a global war”: “In the risk control industry, our opponents are huge, and they include a worldwide black-market production chain.”

From the very beginning, risk control has faced a global opponent in the form of a community of hackers. These hackers break into other people’s phones and computers using malicious software. On the one hand, they can access confidential data; on the other, they can use this compromised information to open fake accounts. They can also fake all kinds of social interactions, such as leaving reviews or making purchases. Eventually they have a seemingly normal account with many friends and a good credit history, which they then use to apply for a variety of financial products.

Meanwhile, the core of risk control is to model the relevant data, weed out fake users, and then assess real users’ ability and willingness to repay along with their overall risk.

The data risk control companies that have sprung up over the past few years compete on the two resources this work relies on: algorithms and data acquisition capabilities.

Currently, in the face of massive worldwide “gangs” of hackers and an enormous black-market production chain, the risk control and anti-fraud solutions on the market lag an obvious step behind in algorithmic technology. Solution providers often use device fingerprints, black/white lists, rule systems, or machine learning models trained on labeled data to detect fraudulent activity. Some of these methods conduct only shallow analyses, so malicious opponents can easily circumvent and deceive them. Others use machine learning but rely on labeled historical data to train their models. This labeled data is often scarce and represents only the fraudulent activities that have occurred in the past. As a result, models trained on it are not accurate enough to cope with ever-changing fraudulent practices.

Meanwhile, a vast number of risk control companies rely primarily on strong data acquisition and integration capabilities. But because of malicious crawling, the purchase of hacked data, and similar practices, the consolidated data ends up including a substantial amount of confidential personal information, such as ID card numbers, phone numbers, bank card numbers, savings account details, and exact home addresses. The solutions that currently exist in the industry are extremely dependent on this kind of data. However, using this data infringes on personal privacy, and its legality has drawn heavy criticism. At the same time, the amount of data available has a profound impact on modeling accuracy, so removing this sensitive data will put the algorithms’ performance to the test.

Acquisition of Risk Control Data: Is Magnitude or Scenario More Important?

Will the sophistication of the algorithm make up for the loss of a large amount of sensitive data? Last month, in an exclusive interview in New York, HC Financial Service Group CFO Shen Yutong (Tony Shen) told Big Data Digest that we should be more cautious when using data that is not directly related to lending behavior or credit behavior.

“Some people frequently shop online, but because of their frequent shopping habits, they find themselves short of cash and in need of a loan. It is necessary to realize that regular online payments do not always mean that the buyer is a good person to give credit to. Therefore, they fail to get loans when applying for greater amounts of credit.”

Social and online-shopping data, while valuable, are not necessarily more useful than data that is directly related to a person’s financial situation. Mr. Shen’s attitude reflects that of traditional financial experts toward Internet risk control: one of caution. It also points to a question the industry currently faces: when acquiring risk control data, is the volume of data more important than the data scenario, or vice versa?

On this issue, Huang Ling clearly leans toward the latter: “I think it depends on what kind of data we’re talking about and how we use those data. The data which we’re crawling isn’t necessarily going to make much of a difference here. For customer data and application scenarios, it is important to help them mine the data more accurately and closer to the goal of the data.”


Core customers for financial risk control are Internet companies, Internet financial enterprises, and other financial institutions. What these institutions have in common is a large number of accounts. With accounts at the center, we can acquire a lot of personal information including bank balances, purchasing history, borrowing history, and more. The work of risk control then is to build models around this data and make assessments of the user’s ability and willingness to make repayments, proper loan amount, etc.

Huang Ling believes that user behavior models built on desensitized data can also achieve the goals of risk control and fraud prevention. “When executing behavior analysis, we usually look at the person’s social relationships, phone records, and e-commerce behavior. Here, behavior refers to where, when, and on what device the user registers and logs on to the website, and what they did after logging in (the pages they visited, the products they purchased, the friends they added, who they spoke to, etc.). Even though this data includes some sensitive information (such as who your friends are), the data is desensitized. This data is then fed into a graph algorithm, and user association analysis is used to identify the hidden information related to the interaction between users.”
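The pipeline Huang Ling describes (desensitize behavior data, then run association analysis on a graph) can be illustrated with a minimal Python sketch. This is not AHI FinTech’s implementation, which the article does not detail. It assumes a hypothetical event format of (account, device, IP) tuples, uses a salted hash as a stand-in for desensitization, and treats accounts that share a device or IP token as linked. The names `SALT`, `desensitize`, and `find_linked_groups` are invented for illustration.

```python
import hashlib
from collections import defaultdict

# Hypothetical salt for illustration only; a real system would manage this secret properly.
SALT = b"example-salt"


def desensitize(value: str) -> str:
    """Replace a raw identifier with a short salted hash token."""
    return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()[:12]


def find_linked_groups(events, min_size=3):
    """Return groups of accounts connected through shared device or IP tokens.

    events: iterable of (account, device, ip) tuples (an assumed format).
    Accounts, devices, and IPs are graph nodes; groups are connected components.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for account, device, ip in events:
        account_node = ("acct", account)
        union(account_node, ("dev", desensitize(device)))
        union(account_node, ("ip", desensitize(ip)))

    groups = defaultdict(set)
    for node in list(parent):
        if node[0] == "acct":
            groups[find(node)].add(node[1])

    large = [sorted(members) for members in groups.values() if len(members) >= min_size]
    return sorted(large, key=lambda members: members[0])


if __name__ == "__main__":
    sample = [
        ("u1", "devA", "1.1.1.1"),
        ("u2", "devA", "2.2.2.2"),
        ("u3", "devB", "2.2.2.2"),
        ("u4", "devC", "3.3.3.3"),
    ]
    print(find_linked_groups(sample))
```

In the sample, u1 and u2 share a device and u2 and u3 share an IP, so the three accounts form one linked group, while u4 stays isolated. A production system would use far richer signals and weighting; the point is only that linkage can be found from hashed tokens without storing the raw identifiers.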

Huang Ling and his team at AHI FinTech Quest

“We hold almost no sensitive user data. What matters more to us is using non-sensitive data, focusing on the client’s behavioral data. We combine it with the client’s user scenarios and apply AI and Big Data methods to help the client get value out of their data, then build the risk control model best suited to that scenario. Subsequently, we help them achieve the best testing results on their platform. This way, we can automatically detect abnormal connections among tens of millions of users, produce risk control warnings, and guard against organized and systematic risks. We do all of this without infringing on users’ privacy and without prior knowledge of the type or characteristics of the attack.”

Redefining Risk Control: Starting from the Data Source

This kind of data acquisition method also places greater and stricter demands on companies in the industry.

“When we do risk control modeling, the first thing we look at is the quality of the data, including whether or not the data is complete and whether or not it includes data that is relevant to risk control.”

Huang Ling believes that risk control happens not only during modeling and testing, but on the side of the company, beginning with the collection of data. When dealing with customers, AHI FinTech focuses on helping its clients elevate their related abilities from a service perspective:

First, after a risk control signal is sent to the client’s platform, the platform can block users with a high risk value. For users whose risk values are lower, the system can merge data from other dimensions into its rules and models, process and refine the result further, and then re-evaluate the user.


Furthermore, the system provides feedback during each step of data collection as well as the discovery of fraudulent trends and modeling.

The system carries out these two tasks in parallel. If the collected data does not meet quality standards, the client is asked to adjust and is given feedback on the specific problems. Even if the data cannot be made up for right away, the system has to fix it as soon as possible.
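The tiered handling of risk signals described above (block high-risk users, enrich and re-evaluate the rest) can be sketched in Python as follows. The thresholds, the 0 to 1 score scale, and the names `route` and `rescore` are assumptions made for illustration; the article gives no numeric values and does not describe the client-side interface.

```python
from dataclasses import dataclass

# Hypothetical thresholds on an assumed 0-1 risk score.
BLOCK_THRESHOLD = 0.8
REVIEW_THRESHOLD = 0.4


@dataclass
class Decision:
    user_id: str
    score: float
    action: str  # "block", "review", or "allow"


def route(user_id: str, score: float) -> Decision:
    """Map a risk score to an action tier."""
    if score >= BLOCK_THRESHOLD:
        action = "block"
    elif score >= REVIEW_THRESHOLD:
        action = "review"
    else:
        action = "allow"
    return Decision(user_id, score, action)


def rescore(score: float, extra_signals: dict) -> float:
    """Fold signals from other data dimensions into the score, clamped to [0, 1].

    A plain sum is used only to keep the sketch short; a real system would
    feed the extra dimensions into its rules and models instead.
    """
    adjusted = score + sum(extra_signals.values())
    return max(0.0, min(1.0, adjusted))


if __name__ == "__main__":
    first = route("u42", 0.55)
    print(first.action)
    if first.action == "review":
        second = route("u42", rescore(first.score, {"shared_device": 0.3}))
        print(second.action)
```

Keeping routing and rescoring separate mirrors the two-step flow in the text: the first pass acts only on clear cases, and the second pass runs after additional dimensions have been merged in.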

“We make recommendations to the client on how to collect data, based on our own risk control and fraud prevention experience. Working with us is not just a matter of us helping you meet fraud detection requirements; we also give the client a lot of feedback and keep open lines of communication. We offer them comprehensive consulting and service, from system applications to data collection and risk control.”

The opponent of financial risk control is the enormous black-market production chain, which makes the matter extremely complicated. Organizations are introducing innovative technologies like AI and machine learning to the field in droves, but using them correctly is certainly not a simple matter.

Most solutions currently on the market rely on collecting massive amounts of data and then applying rule systems or models generated by supervised machine learning. These solutions have an obvious shortcoming: the models always depend on training with labeled historical data. However, we can produce labels only after we have suffered a fraud attack; we create them at the cost of our own blood, sweat, and tears. As our goal is for these kinds of attacks to be increasingly rare, we find ourselves lacking data for training models. Models trained on labels in this way are never good enough, and they can only represent fraudulent behavior that we have seen in the past. When fraudsters invent new methods, label-dependent models have difficulty stopping them quickly and accurately, often resulting in massive losses.

Huang Ling’s team uses semi-supervised learning on data with only a few labels, or even none, to generate models. This significantly reduces the cost of acquiring new labels, increases data utilization, and produces higher-quality models. The approach combines an active machine learning platform, the massive data processing capabilities of a Big Data system, and the experience of risk control experts, pairing human and artificial intelligence so that the models can automatically learn previously unknown fraud tactics. It can also track new fraud methods and constantly adapt to an ever-changing environment to create anti-fraud machine learning models. This makes it significantly more difficult for fraudsters to evade detection.
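To make the idea of learning from only a few labels concrete, here is a minimal Python sketch of label propagation, a standard semi-supervised technique. The article does not say which algorithm Huang Ling’s team uses, so this is a generic stand-in rather than their method. It assumes a similarity graph between accounts is already available; the function name `propagate_labels` and the sample graph are hypothetical.

```python
def propagate_labels(edges, seeds, iterations=20):
    """Spread fraud scores from a few labeled seeds across a similarity graph.

    edges: iterable of (node_a, node_b) pairs linking similar accounts.
    seeds: dict mapping a labeled node to 1.0 (known fraud) or 0.0 (known good).
    Returns a dict of node -> score in [0, 1]; unlabeled nodes start at 0.5.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    scores = {node: seeds.get(node, 0.5) for node in neighbors}
    for _ in range(iterations):
        updated = {}
        for node, linked in neighbors.items():
            if node in seeds:
                updated[node] = seeds[node]  # labeled nodes stay clamped
            else:
                # Unlabeled nodes take the average score of their neighbors.
                updated[node] = sum(scores[m] for m in linked) / len(linked)
        scores = updated
    return scores


if __name__ == "__main__":
    graph = [("a", "b"), ("b", "c"), ("c", "a"), ("x", "y"), ("y", "z")]
    labels = {"a": 1.0, "x": 0.0}  # only two of six accounts are labeled
    for node, score in sorted(propagate_labels(graph, labels).items()):
        print(node, round(score, 3))
```

With just two labeled accounts, the unlabeled accounts connected to the known-fraud seed drift toward a score of 1, and those connected to the known-good seed drift toward 0. This is the sense in which a few labels can be stretched across a much larger unlabeled population.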

The Black-Market Production Chain in the Risk Control Industry

In addition to AHI FinTech, there is a Silicon Valley company in the risk control field, DataVisor, that takes a similar approach. In 2014, Huang Ling left Intel after seven years as a senior researcher to become a founding member and Director of Data at DataVisor, where he led the company’s entire machine learning, user behavior analysis, and credit modeling system. There, he became part of the next generation in Silicon Valley and the most well-known expert in using unsupervised risk control methods.


Huang Ling has always believed that risk control in China is not particularly comparable to that in Silicon Valley. The black market faced by the anti-fraud industry is an entire production chain, run by gangs spread out around the globe. This chain stretches from Eastern Europe to America to China to India. It ranges from attack software at the top, to the people who use this software to control people and phones around the world, to those who use these people and phones to create fake users that carry out all kinds of fraudulent activity and reap the benefits.

Therefore, to a certain extent, you can say that risk control and anti-fraud work are universal. Several Internet companies and financial institutions in China are also facing attacks from abroad, and a lot of attacks perpetrated in America are conducted via China, India, Africa, or one of several South East Asian countries.

As a result, the most significant difference between America and China likely lies in differences in political policy and industry development:

Because the credit system in America is sound, the cost of committing fraud is relatively high. Whether the user is defrauding a bank or an online merchant, these kinds of activities usually affect the user’s credit score through a variety of channels. In China, this system is still underdeveloped; in many situations, a user’s credit rating with the central bank does not reflect fraudulent online financial activity, so the cost of committing fraud is comparatively low. As a result, examples of large-scale fraud are more abundant in China than in the States, and they tend to be harder to handle effectively.

Furthermore, the development of the industry has been different in China and America. China’s mobile applications and Internet financial industries have grown to be larger than their counterparts in America, so there is more fraudulent activity surrounding these two sectors than in America.

“After coming back, we noticed that in China, especially in fields related to finance, these fraud gangs are larger and craftier than they are in America. They also use more real people to commit fraud, making them harder to detect and requiring more machine learning and AI modeling methods,” Huang Ling said.

And on a global battlefield such as this, the addition of experts in artificial intelligence algorithms and security scientists is all the more valuable.

Turning to the entrepreneurial mind, Huang Ling said, “I have been researching and practicing in the fields of artificial intelligence algorithms and Internet security for several years. I hope that the skills and experience I have acquired over this time will be useful in the fields of financial risk control and fraud prevention. I also wish to provide a complete set of systems and services to accompany financial and Internet products and thereby achieve a more secure, honest, and fair industry environment.”

Aside from Huang Ling, the other co-founder of AHI FinTech—chief scientist Xu Wei—also hails from academia. Xu Wei served as a cross-institute adjunct professor at Tsinghua University.

Conclusion

Huang Ling regards the entrepreneurship of AI scientists in the field of risk control as a good thing. This is because they possess the skill and understanding of algorithms. Additionally, he is willing to give them a chance to really participate in the industry, rather than just being a cog in the machine.


Published by 全栈程序员-站长. Please credit the source when reposting: https://javaforall.net/107542.html. Original link: https://javaforall.net
