Python scikit-learn (metrics): difference between r2_score and explained_variance_score?

I noticed that ‘r2_score’ and ‘explained_variance_score’ are both built-in sklearn.metrics methods for regression problems.

I was always under the impression that r2_score is the percent variance explained by the model. How is it different from ‘explained_variance_score’?

When would you choose one over the other?

Thanks!

 

OK, look at this example:

In [123]:
import numpy as np
from sklearn import metrics

# data
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
print(metrics.explained_variance_score(y_true, y_pred))
print(metrics.r2_score(y_true, y_pred))
0.957173447537
0.948608137045
In [124]:
# what explained_variance_score really is: 1 - Var(residuals) / Var(y_true)
1-np.cov(np.array(y_true)-np.array(y_pred))/np.cov(y_true)
Out[124]:
0.95717344753747324
In [125]:
# what r^2 really is: 1 - (sum of squared residuals) / (n * Var(y_true))
1-((np.array(y_true)-np.array(y_pred))**2).sum()/(len(y_true)*np.array(y_true).std()**2)
Out[125]:
0.94860813704496794
In [126]:
# Notice that the mean residual is not 0
(np.array(y_true)-np.array(y_pred)).mean()
Out[126]:
-0.25
In [127]:
# if the predicted values are different, such that the mean residual IS 0:
y_pred=[2.5, 0.0, 2, 7]
(np.array(y_true)-np.array(y_pred)).mean()
Out[127]:
0.0
In [128]:
# The two scores become identical
print(metrics.explained_variance_score(y_true, y_pred))
print(metrics.r2_score(y_true, y_pred))
0.982869379015
0.982869379015

 

So, when the mean residual is 0, the two scores are the same. Which one to choose depends on your needs, that is, whether the mean residual is supposed to be 0.

 

Most of the answers I found (including here) emphasize the difference between R2 and the Explained Variance Score, namely the mean residual (i.e. the mean of the error).

However, an important question is left unanswered: why do I need to consider the mean of the error at all?


Refresher:

R2 is the coefficient of determination, which measures the proportion of variation explained by the (least-squares) linear regression.

You can look at it from a different angle for the purpose of evaluating the predicted values of y like this:

Variance(actual_y) × R2 = Variance(predicted_y)

 

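So intuitively, the closer R2 is to 1, the closer the variance (i.e. the spread) of predicted_y is to that of actual_y. Note that this relation holds for a least-squares fit with an intercept, evaluated on the data it was fitted to; it is not guaranteed for arbitrary predictions.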
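As a rough check of the relation above, here is a minimal pure-Python sketch. The x and y values are made up for illustration, and the helper names (mean, var, r2) are not from any library. It fits a simple least-squares line and compares Variance(actual_y) × R2 with Variance(predicted_y), then repeats the comparison for the hand-written predictions from the first answer, which are not a least-squares fit.

```python
def mean(v):
    return sum(v) / len(v)

def var(v):
    # Population variance (divide by n)
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / len(v)

def r2(y_true, y_pred):
    # 1 - SS_res / SS_tot, the usual coefficient of determination
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Made-up data, used only for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Ordinary least-squares fit of y = intercept + slope * x
mx, my = mean(x), mean(y)
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
intercept = my - slope * mx
ols_pred = [intercept + slope * a for a in x]

# For the least-squares fit the two sides agree up to rounding error
ols_gap = var(y) * r2(y, ols_pred) - var(ols_pred)

# The predictions from the first answer are not a least-squares fit,
# so the two sides differ noticeably
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
other_gap = var(y_true) * r2(y_true, y_pred) - var(y_pred)

print(ols_gap, other_gap)
```

The second gap being far from zero is the reason the relation should be read as a property of least-squares fits rather than of R2 in general.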


As previously mentioned, the main difference is the Mean of Error; and if we look at the formulas, we find that’s true:

R2 = 1 - [(Sum of Squared Residuals / n) / Variance(y_actual)]

Explained Variance Score = 1 - [Variance(y_predicted - y_actual) / Variance(y_actual)]

 

in which:

Variance(y_predicted - y_actual) = (Sum of Squared Residuals / n) - (Mean Error)^2

 

So the only difference is that the Explained Variance Score subtracts the squared Mean Error from the numerator of the first formula. But why?
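The relationship between the two formulas can be checked numerically. The sketch below uses only the data from the first answer and plain Python; the variable names are chosen here for illustration. It verifies that the mean squared residual splits into the residual variance plus the squared mean error, and that the gap between the two scores is exactly the squared mean error divided by the variance of the actual values.

```python
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
n = len(y_true)

residuals = [t - p for t, p in zip(y_true, y_pred)]
mean_error = sum(residuals) / n                                   # -0.25
mse = sum(e ** 2 for e in residuals) / n                          # 0.375
var_resid = sum((e - mean_error) ** 2 for e in residuals) / n     # 0.3125

y_mean = sum(y_true) / n
var_y = sum((t - y_mean) ** 2 for t in y_true) / n                # population variance

r2 = 1 - mse / var_y
explained_variance = 1 - var_resid / var_y

# Mean squared residual = residual variance + squared mean error
print(mse, var_resid + mean_error ** 2)

# Gap between the two scores = squared mean error / Var(y_true)
print(explained_variance - r2, mean_error ** 2 / var_y)
```

When the mean error is zero, the second term vanishes and both scores coincide, which matches the second half of the example session.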


When we compare the R2 score with the Explained Variance Score, we are effectively checking the mean error: if R2 = Explained Variance Score, then the mean error is zero.

The mean error reflects the systematic tendency of our estimator, that is, whether the estimation is biased or unbiased.
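To illustrate the bias point, the following sketch (plain Python, with a hypothetical helper named scores) takes predictions that are wrong only by a constant offset of 1. The Explained Variance Score ignores the offset and reports a perfect 1.0, while R2 is penalized for it.

```python
def scores(y_true, y_pred):
    """Return (r2, explained_variance) computed from their definitions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_error = sum(residuals) / n
    y_mean = sum(y_true) / n
    var_y = sum((t - y_mean) ** 2 for t in y_true) / n
    mse = sum(e ** 2 for e in residuals) / n
    var_resid = sum((e - mean_error) ** 2 for e in residuals) / n
    return 1 - mse / var_y, 1 - var_resid / var_y

y_true = [3, -0.5, 2, 7]
# Every prediction is too high by exactly 1: a purely systematic (biased) error
shifted = [t + 1.0 for t in y_true]

r2_shifted, ev_shifted = scores(y_true, shifted)
print(r2_shifted, ev_shifted)
```

A large gap between the two scores is therefore a quick hint that the model is systematically over- or underestimating.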


In Summary:

If you want an unbiased estimator, so that the model is neither systematically underestimating nor overestimating, take the mean error into account: R2 penalizes a non-zero mean error, while the Explained Variance Score ignores it.

 

Reference: https://stackoverflow.com/questions/24378176/python-sci-kit-learn-metrics-difference-between-r2-score-and-explained-varian
