Python scikit-learn (metrics): difference between r2_score and explained_variance_score?

Python scikit-learn (metrics): difference between r2_score and explained_variance_score?

I noticed that that ‘r2_score’ and ‘explained_variance_score’ are both build-in sklearn.metrics methods for regression problems.

I was always under the impression that r2_score is the percent variance explained by the model. How is it different from ‘explained_variance_score’?

When would you choose one over the other?

Thanks!

 

OK, look at this example:

In [123]:
#data
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
print metrics.explained_variance_score(y_true, y_pred)
print metrics.r2_score(y_true, y_pred)
0.957173447537
0.948608137045
In [124]:
#what explained_variance_score really is
1-np.cov(np.array(y_true)-np.array(y_pred))/np.cov(y_true)
Out[124]:
0.95717344753747324
In [125]:
#what r^2 really is
1-((np.array(y_true)-np.array(y_pred))**2).sum()/(4*np.array(y_true).std()**2)
Out[125]:
0.94860813704496794
In [126]:
#Notice that the mean residue is not 0
(np.array(y_true)-np.array(y_pred)).mean()
Out[126]:
-0.25
In [127]:
#if the predicted values are different, such that the mean residue IS 0:
y_pred=[2.5, 0.0, 2, 7]
(np.array(y_true)-np.array(y_pred)).mean()
Out[127]:
0.0
In [128]:
#They become the same stuff
print metrics.explained_variance_score(y_true, y_pred)
print metrics.r2_score(y_true, y_pred)
0.982869379015
0.982869379015

 

So, when the mean residue is 0, they are the same. Which one to choose dependents on your needs, that is, is the mean residue suppose to be 0?

 

Most of the answers I found (including here) emphasize on the difference between R2 and Explained Variance Score, that is: The Mean Residue (i.e. The Mean of Error).

However, there is an important question left behind, that is: Why on earth I need to consider The Mean of Error?


Refresher:

R2: is the Coefficient of Determination which measures the amount of variation explained by the (least-squares) Linear Regression.

You can look at it from a different angle for the purpose of evaluating the predicted values of y like this:

Varianceactual_y × R2actual_y = Variancepredicted_y

 

So intuitively, the more R2 is closer to 1, the more actual_y and predicted_y will have samevariance (i.e. same spread)


As previously mentioned, the main difference is the Mean of Error; and if we look at the formulas, we find that’s true:

R2 = 1 - [(Sum of Squared Residuals / n) / Variancey_actual]

Explained Variance Score = 1 - [Variance(Ypredicted - Yactual) / Variancey_actual]

 

in which:

Variance(Ypredicted - Yactual) = (Sum of Squared Residuals - Mean Error) / n 

 

So, obviously the only difference is that we are subtracting the Mean Error from the first formula! … But Why?


When we compare the R2 Score with the Explained Variance Score, we are basically checking the Mean Error; so if R2 = Explained Variance Score, that means: The Mean Error = Zero!

The Mean Error reflects the tendency of our estimator, that is: the Biased v.s Unbiased Estimation.


In Summary:

If you want to have unbiased estimator so our model is not underestimating or overestimating, you may consider taking Mean of Error into account.

 

参考链接:https://stackoverflow.com/questions/24378176/python-sci-kit-learn-metrics-difference-between-r2-score-and-explained-varian

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请联系我们举报,一经查实,本站将立刻删除。

发布者:全栈程序员-站长,转载请注明出处:https://javaforall.net/119588.html原文链接:https://javaforall.net

(0)
全栈程序员-站长的头像全栈程序员-站长


相关推荐

  • MessageDigest简要

    MessageDigest简要本文博客原參考文章:http://blog.sina.com.cn/s/blog_4f36423201000c1e.html一、概述java.security.MessageDigest类用于为应用程序提供信息摘要算法的功能,如MD5或SHA算法。简单点说就是用于生成散列码。信息摘要是安全的单向哈希函数,它接收随意大小的数据。输出固定长度的哈希值。关…

    2022年6月29日
    24
  • scope=prototype有什么作用_prototype和单例模式

    scope=prototype有什么作用_prototype和单例模式@Scope(“prototype”)//多例模式

    2022年8月19日
    5
  • Prometheus时序数据库

    Prometheus时序数据库Prometheus时序数据库 一、Prometheus1、Prometheus安装1)源码安装prometheus安装包最新版本下载地址:https://prometheus.io/download/wgethttps://github.com/prometheus/prometheus/releases/download/v2.3.2/prometheus-2….

    2022年9月28日
    2
  • potplayer使用madvr后没有了_potplayer配置

    potplayer使用madvr后没有了_potplayer配置 所需资源下载 Resource Version Download Notes Potplayer   Download 根据系统选择32位或64位安装文件 LAVFilters 0.68.1 Download 32位或64位版本应对应于Potplayer版本 madVR 0.90.24 Download …

    2022年9月14日
    5
  • docker下载安装教程_docker镜像存储位置

    docker下载安装教程_docker镜像存储位置前言Docker提供轻量的虚拟化,你能够从Docker获得一个额外抽象层,你能够在单台机器上运行多个Docker微容器,而每个微容器里都有一个微服务或独立应用,例如你可以将Tomcat运行在一个D

    2022年7月30日
    11
  • tomcat宕机自动重启和每日定时启动tomcat

    tomcat宕机自动重启和每日定时启动tomcat在项目后期维护中会遇到这样的情况,tomcat在内存溢出的时候就出现死机的情况和遇到长时间不响应,需要人工手动关闭和重启服务,针对这样的突发情况,希望程序能自动处理问题而不需要人工关于,所以才有了目前的需求。一、设置tomcat定时启动1,首先将tomcat注册为服务,先打开tomcat的bin目录下service.bat文件,修改下面的值,这是sevvice的注册名称和显

    2022年7月23日
    9

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注

关注全栈程序员社区公众号