Louvain 社团发现算法学习(我的java实现+数据用例)

Louvain 社团发现算法学习(我的java实现+数据用例)为了大家方便 直接把数据放在 github 了 https github com Louvain 算法介绍 Louvain 算法是基于模块度的社区发现算法 该算法在效率和效果上都表现较好 并且能够发现层次性的社区结构 其优化目标是最大化整个社区网络的模块度 社区网络的模块度 Modularity 是评估一个社区网络划分好坏的度量方法 它的含义是社区内节点的连边数与随机情况下的

为了大家方便,直接把数据放在github了:

https://github.com//Louvain

Louvain 算法是基于模块度的社区发现算法,该算法在效率和效果上都表现较好,并且能够发现层次性的社区结构,其优化目标是最大化整个社区网络的模块度。

社区网络的模块度(Modularity)是评估一个社区网络划分好坏的度量方法,它的含义是社区内节点的连边数与随机情况下的边数之差,它的取值范围是 (0,1),其定义如下:

Louvain 社团发现算法学习(我的java实现+数据用例)

上式中,Aij代表结点i和j之间的边权值(当图不带权时,边权值可以看成1)。 ki代表结点i的領街边的边权和(当图不带权时,即为结点的度数)。

模块度的公式定义可以做如下的简化:

Louvain 社团发现算法学习(我的java实现+数据用例)

其中Sigma in表示社区c内的边的权重之和,Sigma tot表示与社区c内的节点相连的边的权重之和。

我们的目标,是要找出各个结点处于哪一个社团,并且让这个划分结构的模块度最大。

Louvain算法的思想很简单:

1)将图中的每个节点看成一个独立的社区,次数社区的数目与节点个数相同;

2)对每个节点i,依次尝试把节点i分配到其每个邻居节点所在的社区,计算分配前与分配后的模块度变化Delta Q,并记录Delta Q最大的那个邻居节点,如果maxDelta Q>0,则把节点i分配Delta Q最大的那个邻居节点所在的社区,否则保持不变;

3)重复2),直到所有节点的所属社区不再变化;

4)对图进行压缩,将所有在同一个社区的节点压缩成一个新节点,社区内节点之间的边的权重转化为新节点的环的权重,社区间的边权重转化为新节点间的边权重;

5)重复1)直到整个图的模块度不再发生变化。

在写代码时,要注意几个要点:

* 第二步,尝试分配结点时,并不是只尝试独立的结点,也可以尝试所在社区的结点数大于1的点,我在看paper时一开始把这个部分看错了,导致算法出问题。

* 第三步,重复2)的时候,并不是将每个点遍历一次后就对图进行一次重构,而是要不断循环的遍历每个结点,直到一次循环中所有结点所在社区都不更新了,表示当前网络已经稳定,然后才进行图的重构。

* 模块度增益的计算,请继续看下文

Louvain 社团发现算法学习(我的java实现+数据用例)

可以看出,louvain是一个启发式的贪心算法。我们需要对模块度进行一个启发式的更新。这样的话这个算法会有如下几个问题:

1:尝试将节点i分配到相邻社团时,如果真的移动结点i,重新计算模块度,那么算法的效率很难得到保证

2:在本问题中,贪心算法只能保证局部最优,而不能够保证全局最优

3:将节点i尝试分配至相邻社团时,要依据一个什么样的顺序

第一个问题,在该算法的paper中给了我们解答。我们实际上不用真的把节点i加入相邻社团后重新计算模块度,paper中给了我们一个计算把结点i移动至社团c时,模块度的增益公式:

Louvain 社团发现算法学习(我的java实现+数据用例)

其中Sigma in表示起点终点都在社区c内的边的权重之和,Sigma tot表示入射社区c内的边的权重之和,ki代表结点i的带权度数和,m为所有边权和。

但是请注意,我们只是想通过模块增益度来判断一个结点i是否能移动到社团c中,而我们实际上是没有必要真正的去求精确的模块度,只需要知道,当前的这步操作,模块度是否发生了增长。

因此,就有了相对增益公式的出现:

Louvain 社团发现算法学习(我的java实现+数据用例)

相对增益的值可能大于1,不是真正的模块度增长值,但是它的正负表示了当前的操作是否增加了模块度。用该公式能大大降低算法的时间复杂度。

第二个问题,该算法的确不能够保证全局最优。但是我们该算法的启发式规则很合理,因此我们能够得到一个十分精确的近似结果。

同时,为了校准结果,我们可以以不同的序列多次调用该算法,保留一个模块度最大的最优结果。

第三个问题,在第二个问题中也出现了,就是给某个结点i找寻領接点时,应当以一个什么顺序?递增or随机or其它规则?我想这个问题需要用实验数据去分析。

在paper中也有提到一些能够使结果更精确的序列。我的思路是,如果要尝试多次取其最优,取随机序列应该是比较稳定比较精确的方式。

说了这么多,下面给上本人的Louvain算法java实现供参考。 本人代码注释比较多,适合学习

我的代码后面是国外大牛的代码,时空复杂度和我的代码一样,但是常数比我小很多,速度要快很多,而且答案更精准(迭代了10次),适合实际运用~

本人的java代码:(时间复杂度o(e),空间复杂度o(e))

Edge.java:

package myLouvain; public class Edge implements Cloneable{ int v; //v表示连接点的编号,w表示此边的权值 double weight; int next; //next负责连接和此点相关的边 Edge(){} public Object clone(){ Edge temp=null; try{ temp = (Edge)super.clone(); //浅复制 }catch(CloneNotSupportedException e) { e.printStackTrace(); } return temp; } } 

Louvain.java:

package myLouvain; import java.io.BufferedReader; import java.io.File; import java.io.FileInputStream; import java.io.InputStreamReader; import java.util.ArrayList; import java.util.Random; public class Louvain implements Cloneable{     int n; // 结点个数     int m; // 边数     int cluster[]; // 结点i属于哪个簇     Edge edge[]; // 邻接表     int head[]; // 头结点下标     int top; // 已用E的个数     double resolution; // 1/2m 全局不变     double node_weight[]; // 结点的权值     double totalEdgeWeight; // 总边权值     double[] cluster_weight; // 簇的权值     double eps = 1e-14; // 误差     int global_n; // 最初始的n     int global_cluster[]; // 最后的结果,i属于哪个簇     Edge[] new_edge;   //新的邻接表     int[] new_head;     int new_top = 0;     final int iteration_time = 3; // 最大迭代次数          Edge global_edge[];   //全局初始的临接表  只保存一次,永久不变,不参与后期运算     int global_head[];     int global_top=0;          void addEdge(int u, int v, double weight) {           if(edge[top]==null)             edge[top]=new Edge();         edge[top].v = v;         edge[top].weight = weight;         edge[top].next = head[u];         head[u] = top++;     }     void addNewEdge(int u, int v, double weight) {         if(new_edge[new_top]==null)             new_edge[new_top]=new Edge();         new_edge[new_top].v = v;         new_edge[new_top].weight = weight;         new_edge[new_top].next = new_head[u];         new_head[u] = new_top++;     }          void addGlobalEdge(int u, int v, double weight) {         if(global_edge[global_top]==null)             global_edge[global_top]=new Edge();         global_edge[global_top].v = v;         global_edge[global_top].weight = weight;         global_edge[global_top].next = global_head[u];         global_head[u] = global_top++;     }     void init(String filePath) {         try {             String encoding = "UTF-8";             File file = new File(filePath);             if (file.isFile() && file.exists()) { // 判断文件是否存在                 InputStreamReader read = new InputStreamReader(new FileInputStream(file), encoding);// 考虑到编码格式                 BufferedReader bufferedReader = new BufferedReader(read);                 String lineTxt = null;                 lineTxt = bufferedReader.readLine();                 // 预处理部分                 String cur2[] = lineTxt.split(" ");                 global_n = n = Integer.parseInt(cur2[0]);                 m = Integer.parseInt(cur2[1]);                 m *= 2;                 edge = new Edge[m];                 head = new int[n];                 for (int i = 0; i < n; i++)                     head[i] = -1;                 top = 0;                                  global_edge=new Edge[m];                 global_head = new int[n];                 for(int i=0;i 
     
       2) {                         curw = Double.parseDouble(cur[2]);                     } else {                         curw = 1.0;                     }                     addEdge(u, v, curw);                     addEdge(v, u, curw);                     addGlobalEdge(u,v,curw);                     addGlobalEdge(v,u,curw);                     totalEdgeWeight += 2 * curw;                     node_weight[u] += curw;                     if (u != v) {                         node_weight[v] += curw;                     }                 }                 resolution = 1 / totalEdgeWeight;                 read.close();             } else {                 System.out.println("找不到指定的文件");             }         } catch (Exception e) {             System.out.println("读取文件内容出错");             e.printStackTrace();         }     }     void init_cluster() {         cluster = new int[n];         for (int i = 0; i < n; i++) { // 一个结点一个簇             cluster[i] = i;         }     }     boolean try_move_i(int i) { // 尝试将i加入某个簇         double[] edgeWeightPerCluster = new double[n];         for (int j = head[i]; j != -1; j = edge[j].next) {             int l = cluster[edge[j].v]; // l是nodeid所在簇的编号             edgeWeightPerCluster[l] += edge[j].weight;         }         int bestCluster = -1; // 最优的簇号下标(先默认是自己)         double maxx_deltaQ = 0.0; // 增量的最大值         boolean[] vis = new boolean[n];         cluster_weight[cluster[i]] -= node_weight[i];         for (int j = head[i]; j != -1; j = edge[j].next) {             int l = cluster[edge[j].v]; // l代表領接点的簇号             if (vis[l]) // 一个領接簇只判断一次                 continue;             vis[l] = true;             double cur_deltaQ = edgeWeightPerCluster[l];             cur_deltaQ -= node_weight[i] * cluster_weight[l] * resolution;             if (cur_deltaQ > maxx_deltaQ) {                 bestCluster = l;                 maxx_deltaQ = cur_deltaQ;             }             edgeWeightPerCluster[l] = 0;         }         if (maxx_deltaQ < eps) {             bestCluster = cluster[i];         }          //System.out.println(maxx_deltaQ);         cluster_weight[bestCluster] += node_weight[i];         if (bestCluster != cluster[i]) { // i成功移动了             cluster[i] = bestCluster;             return true;         }         return false;     }     void rebuildGraph() { // 重构图         /// 先对簇进行离散化         int[] change = new int[n];         int change_size=0;         boolean vis[] = new boolean[n];         for (int i = 0; i < n; i++) {             if (vis[cluster[i]])                 continue;             vis[cluster[i]] = true;             change[change_size++]=cluster[i];         }         int[] index = new int[n]; // index[i]代表 i号簇在新图中的结点编号         for (int i = 0; i < change_size; i++)             index[change[i]] = i;         int new_n = change_size; // 新图的大小         new_edge = new Edge[m];         new_head = new int[new_n];         new_top = 0;         double new_node_weight[] = new double[new_n]; // 新点权和         for(int i=0;i 
      
        [] nodeInCluster = new ArrayList[new_n]; // 代表每个簇中的节点列表         for (int i = 0; i < new_n; i++)             nodeInCluster[i] = new ArrayList 
       
         ();         for (int i = 0; i < n; i++) {             nodeInCluster[index[cluster[i]]].add(i);         }         for (int u = 0; u < new_n; u++) { // 将同一个簇的挨在一起分析。可以将visindex数组降到一维             boolean visindex[] = new boolean[new_n]; // visindex[v]代表新图中u节点到v的边在临街表是第几个(多了1,为了初始化方便)             double delta_w[] = new double[new_n]; // 边权的增量             for (int i = 0; i < nodeInCluster[u].size(); i++) {                 int t = nodeInCluster[u].get(i);                 for (int k = head[t]; k != -1; k = edge[k].next) {                     int j = edge[k].v;                     int v = index[cluster[j]];                     if (u != v) {                         if (!visindex[v]) {                             addNewEdge(u, v, 0);                             visindex[v] = true;                         }                         delta_w[v] += edge[k].weight;                     }                 }                 new_node_weight[u] += node_weight[t];             }             for (int k = new_head[u]; k != -1; k = new_edge[k].next) {                 int v = new_edge[k].v;                 new_edge[k].weight = delta_w[v];             }         }         // 更新答案         int[] new_global_cluster = new int[global_n];         for (int i = 0; i < global_n; i++) {             new_global_cluster[i] = index[cluster[global_cluster[i]]];         }         for (int i = 0; i < global_n; i++) {             global_cluster[i] = new_global_cluster[i];         }         top = new_top;         for (int i = 0; i < m; i++) {             edge[i] = new_edge[i];         }         for (int i = 0; i < new_n; i++) {             node_weight[i] = new_node_weight[i];             head[i] = new_head[i];         }         n = new_n;         init_cluster();     }     void print() {         for (int i = 0; i < global_n; i++) {             System.out.println(i + ": " + global_cluster[i]);         }         System.out.println("-------");     }     void louvain() {         init_cluster();         int count = 0; // 迭代次数         boolean update_flag; // 标记是否发生过更新         do { // 第一重循环,每次循环rebuild一次图         //    print(); // 打印簇列表             count++;             cluster_weight = new double[n];             for (int j = 0; j < n; j++) { // 生成簇的权值                 cluster_weight[cluster[j]] += node_weight[j];             }             int[] order = new int[n]; // 生成随机序列             for (int i = 0; i < n; i++)                 order[i] = i;             Random random = new Random();             for (int i = 0; i < n; i++) {                 int j = random.nextInt(n);                 int temp = order[i];                 order[i] = order[j];                 order[j] = temp;             }             int enum_time = 0; // 枚举次数,到n时代表所有点已经遍历过且无移动的点             int point = 0; // 循环指针             update_flag = false; // 是否发生过更新的标记             do {                 int i = order[point];                 point = (point + 1) % n;                 if (try_move_i(i)) { // 对i点做尝试                     enum_time = 0;                     update_flag = true;                 } else {                     enum_time++;                 }             } while (enum_time < n);             if (count > iteration_time || !update_flag) // 如果没更新过或者迭代次数超过指定值                 break;             rebuildGraph(); // 重构图         } while (true);     } } 
        
       
     

Main.java:

package myLouvain; import java.io.BufferedWriter; import java.io.FileWriter; import java.io.IOException; import java.util.ArrayList; public class Main {     public static void writeOutputJson(String fileName, Louvain a) throws IOException {         BufferedWriter bufferedWriter;         bufferedWriter = new BufferedWriter(new FileWriter(fileName));         bufferedWriter.write("{\n\"nodes\": [\n");         for(int i=0;i 
        
          ();         }         for(int i=0;i 
          
        

用d3绘图工具对上文那张16个结点的图的划分进行louvain算法后的可视化结果:

Louvain 社团发现算法学习(我的java实现+数据用例)

Louvain 社团发现算法学习(我的java实现+数据用例)

//

下面是国外大牛的代码实现(为了大家方便直接贴上了,时间复杂度o(e),空间复杂度o(e)):

ModularityOptimizer.java

import java.io.BufferedReader; import java.io.BufferedWriter; import java.io.FileReader; import java.io.FileWriter; import java.io.IOException; import java.util.Arrays; import java.util.Random; public class ModularityOptimizer { public static void main(String[] args) throws IOException { boolean update; double modularity, maxModularity, resolution, resolution2; int algorithm, i, j, modularityFunction, nClusters, nIterations, nRandomStarts; int[] cluster; double beginTime, endTime; Network network; Random random; String inputFileName, outputFileName; inputFileName = "/home/eason/Louvain/facebook_combined.txt"; outputFileName = "/home/eason/Louvain/answer.txt"; modularityFunction = 1; resolution = 1.0; algorithm = 1; nRandomStarts = 10; nIterations = 3; System.out.println("Modularity Optimizer version 1.2.0 by Ludo Waltman and Nees Jan van Eck"); System.out.println(); System.out.println("Reading input file..."); System.out.println(); network = readInputFile(inputFileName, modularityFunction); System.out.format("Number of nodes: %d%n", network.getNNodes()); System.out.format("Number of edges: %d%n", network.getNEdges() / 2); System.out.println(); System.out.println("Running " + ((algorithm == 1) ? "Louvain algorithm" : ((algorithm == 2) ? "Louvain algorithm with multilevel refinement" : "smart local moving algorithm")) + "..."); System.out.println(); resolution2 = ((modularityFunction == 1) ? (resolution / network.getTotalEdgeWeight()) : resolution); beginTime = System.currentTimeMillis(); cluster = null; nClusters = -1; maxModularity = Double.NEGATIVE_INFINITY; random = new Random(100); for (i = 0; i < nRandomStarts; i++) { if (nRandomStarts > 1) System.out.format("Random start: %d%n", i + 1); network.initSingletonClusters(); //网络初始化,每个节点一个簇 j = 0; update = true; do { if (nIterations > 1) System.out.format("Iteration: %d%n", j + 1); if (algorithm == 1) update = network.runLouvainAlgorithm(resolution2, random); j++; modularity = network.calcQualityFunction(resolution2); if (nIterations > 1) System.out.format("Modularity: %.4f%n", modularity); } while ((j < nIterations) && update); if (modularity > maxModularity) { cluster = network.getClusters(); nClusters = network.getNClusters(); maxModularity = modularity; } if (nRandomStarts > 1) { if (nIterations == 1) System.out.format("Modularity: %.8f%n", modularity); System.out.println(); } } endTime = System.currentTimeMillis(); if (nRandomStarts == 1) { if (nIterations > 1) System.out.println(); System.out.format("Modularity: %.8f%n", maxModularity); } else System.out.format("Maximum modularity in %d random starts: %f%n", nRandomStarts, maxModularity); System.out.format("Number of communities: %d%n", nClusters); System.out.format("Elapsed time: %.4f seconds%n", (endTime - beginTime) / 1000.0); System.out.println(); System.out.println("Writing output file..."); System.out.println(); writeOutputFile(outputFileName, cluster); } private static Network readInputFile(String fileName, int modularityFunction) throws IOException { BufferedReader bufferedReader; double[] edgeWeight1, edgeWeight2, nodeWeight; int i, j, nEdges, nLines, nNodes; int[] firstNeighborIndex, neighbor, nNeighbors, node1, node2; Network network; String[] splittedLine; bufferedReader = new BufferedReader(new FileReader(fileName)); nLines = 0; while (bufferedReader.readLine() != null) nLines++; bufferedReader.close(); bufferedReader = new BufferedReader(new FileReader(fileName)); node1 = new int[nLines]; node2 = new int[nLines]; edgeWeight1 = new double[nLines]; i = -1; bufferedReader.readLine() ; for (j = 1; j < nLines; j++) { splittedLine = bufferedReader.readLine().split(" "); node1[j] = Integer.parseInt(splittedLine[0]); if (node1[j] > i) i = node1[j]; node2[j] = Integer.parseInt(splittedLine[1]); if (node2[j] > i) i = node2[j]; edgeWeight1[j] = (splittedLine.length > 2) ? Double.parseDouble(splittedLine[2]) : 1; } nNodes = i + 1; bufferedReader.close(); nNeighbors = new int[nNodes]; for (i = 0; i < nLines; i++) if (node1[i] < node2[i]) { nNeighbors[node1[i]]++; nNeighbors[node2[i]]++; } firstNeighborIndex = new int[nNodes + 1]; nEdges = 0; for (i = 0; i < nNodes; i++) { firstNeighborIndex[i] = nEdges; nEdges += nNeighbors[i]; } firstNeighborIndex[nNodes] = nEdges; neighbor = new int[nEdges]; edgeWeight2 = new double[nEdges]; Arrays.fill(nNeighbors, 0); for (i = 0; i < nLines; i++) if (node1[i] < node2[i]) { j = firstNeighborIndex[node1[i]] + nNeighbors[node1[i]]; neighbor[j] = node2[i]; edgeWeight2[j] = edgeWeight1[i]; nNeighbors[node1[i]]++; j = firstNeighborIndex[node2[i]] + nNeighbors[node2[i]]; neighbor[j] = node1[i]; edgeWeight2[j] = edgeWeight1[i]; nNeighbors[node2[i]]++; } { nodeWeight = new double[nNodes]; for (i = 0; i < nEdges; i++) nodeWeight[neighbor[i]] += edgeWeight2[i]; network = new Network(nNodes, firstNeighborIndex, neighbor, edgeWeight2, nodeWeight); } return network; } private static void writeOutputFile(String fileName, int[] cluster) throws IOException { BufferedWriter bufferedWriter; int i; bufferedWriter = new BufferedWriter(new FileWriter(fileName)); for (i = 0; i < cluster.length; i++) { bufferedWriter.write(Integer.toString(cluster[i])); bufferedWriter.newLine(); } bufferedWriter.close(); } } 

import java.io.Serializable; import java.util.Random; public class Network implements Cloneable, Serializable { private static final long serialVersionUID = 1; private int nNodes; private int[] firstNeighborIndex; private int[] neighbor; private double[] edgeWeight; private double[] nodeWeight; private int nClusters; private int[] cluster; private double[] clusterWeight; private int[] nNodesPerCluster; private int[][] nodePerCluster; private boolean clusteringStatsAvailable; public Network(int nNodes, int[] firstNeighborIndex, int[] neighbor, double[] edgeWeight, double[] nodeWeight) { this(nNodes, firstNeighborIndex, neighbor, edgeWeight, nodeWeight, null); } public Network(int nNodes, int[] firstNeighborIndex, int[] neighbor, double[] edgeWeight, double[] nodeWeight, int[] cluster) { int i, nEdges; this.nNodes = nNodes; this.firstNeighborIndex = firstNeighborIndex; this.neighbor = neighbor; if (edgeWeight == null) { nEdges = neighbor.length; this.edgeWeight = new double[nEdges]; for (i = 0; i < nEdges; i++) this.edgeWeight[i] = 1; } else this.edgeWeight = edgeWeight; if (nodeWeight == null) { this.nodeWeight = new double[nNodes]; for (i = 0; i < nNodes; i++) this.nodeWeight[i] = 1; } else this.nodeWeight = nodeWeight; } public int getNNodes() { return nNodes; } public int getNEdges() { return neighbor.length; } public double getTotalEdgeWeight() // 计算总边权 { double totalEdgeWeight; int i; totalEdgeWeight = 0; for (i = 0; i < neighbor.length; i++) totalEdgeWeight += edgeWeight[i]; return totalEdgeWeight; } public double[] getEdgeWeights() { return edgeWeight; } public double[] getNodeWeights() { return nodeWeight; } public int getNClusters() { return nClusters; } public int[] getClusters() { return cluster; } public void initSingletonClusters() { int i; nClusters = nNodes; cluster = new int[nNodes]; for (i = 0; i < nNodes; i++) cluster[i] = i; deleteClusteringStats(); } public void mergeClusters(int[] newCluster) { int i, j, k; if (cluster == null) return; i = 0; for (j = 0; j < nNodes; j++) { k = newCluster[cluster[j]]; if (k > i) i = k; cluster[j] = k; } nClusters = i + 1; deleteClusteringStats(); } public Network getReducedNetwork() { double[] reducedNetworkEdgeWeight1, reducedNetworkEdgeWeight2; int i, j, k, l, m, reducedNetworkNEdges1, reducedNetworkNEdges2; int[] reducedNetworkNeighbor1, reducedNetworkNeighbor2; Network reducedNetwork; if (cluster == null) return null; if (!clusteringStatsAvailable) calcClusteringStats(); reducedNetwork = new Network(); reducedNetwork.nNodes = nClusters; reducedNetwork.firstNeighborIndex = new int[nClusters + 1]; reducedNetwork.nodeWeight = new double[nClusters]; reducedNetworkNeighbor1 = new int[neighbor.length]; reducedNetworkEdgeWeight1 = new double[edgeWeight.length]; reducedNetworkNeighbor2 = new int[nClusters - 1]; reducedNetworkEdgeWeight2 = new double[nClusters]; reducedNetworkNEdges1 = 0; for (i = 0; i < nClusters; i++) { reducedNetworkNEdges2 = 0; for (j = 0; j < nodePerCluster[i].length; j++) { k = nodePerCluster[i][j]; // k是簇i中第j个节点的id for (l = firstNeighborIndex[k]; l < firstNeighborIndex[k + 1]; l++) { m = cluster[neighbor[l]]; // m是k的在l位置的邻居节点所属的簇id if (m != i) { if (reducedNetworkEdgeWeight2[m] == 0) { reducedNetworkNeighbor2[reducedNetworkNEdges2] = m; reducedNetworkNEdges2++; } reducedNetworkEdgeWeight2[m] += edgeWeight[l]; } } reducedNetwork.nodeWeight[i] += nodeWeight[k]; } for (j = 0; j < reducedNetworkNEdges2; j++) { reducedNetworkNeighbor1[reducedNetworkNEdges1 + j] = reducedNetworkNeighbor2[j]; reducedNetworkEdgeWeight1[reducedNetworkNEdges1 + j] = reducedNetworkEdgeWeight2[reducedNetworkNeighbor2[j]]; reducedNetworkEdgeWeight2[reducedNetworkNeighbor2[j]] = 0; } reducedNetworkNEdges1 += reducedNetworkNEdges2; reducedNetwork.firstNeighborIndex[i + 1] = reducedNetworkNEdges1; } reducedNetwork.neighbor = new int[reducedNetworkNEdges1]; reducedNetwork.edgeWeight = new double[reducedNetworkNEdges1]; System.arraycopy(reducedNetworkNeighbor1, 0, reducedNetwork.neighbor, 0, reducedNetworkNEdges1); System.arraycopy(reducedNetworkEdgeWeight1, 0, reducedNetwork.edgeWeight, 0, reducedNetworkNEdges1); return reducedNetwork; } public double calcQualityFunction(double resolution) { double qualityFunction, totalEdgeWeight; int i, j, k; if (cluster == null) return Double.NaN; if (!clusteringStatsAvailable) calcClusteringStats(); qualityFunction = 0; totalEdgeWeight = 0; for (i = 0; i < nNodes; i++) { j = cluster[i]; for (k = firstNeighborIndex[i]; k < firstNeighborIndex[i + 1]; k++) { if (cluster[neighbor[k]] == j) qualityFunction += edgeWeight[k]; totalEdgeWeight += edgeWeight[k]; } } for (i = 0; i < nClusters; i++) qualityFunction -= clusterWeight[i] * clusterWeight[i] * resolution; qualityFunction /= totalEdgeWeight; return qualityFunction; } public boolean runLocalMovingAlgorithm(double resolution) { return runLocalMovingAlgorithm(resolution, new Random()); } public boolean runLocalMovingAlgorithm(double resolution, Random random) { boolean update; double maxQualityFunction, qualityFunction; double[] clusterWeight, edgeWeightPerCluster; int bestCluster, i, j, k, l, nNeighboringClusters, nStableNodes, nUnusedClusters; int[] neighboringCluster, newCluster, nNodesPerCluster, nodeOrder, unusedCluster; if ((cluster == null) || (nNodes == 1)) return false; update = false; clusterWeight = new double[nNodes]; nNodesPerCluster = new int[nNodes]; for (i = 0; i < nNodes; i++) { clusterWeight[cluster[i]] += nodeWeight[i]; nNodesPerCluster[cluster[i]]++; } nUnusedClusters = 0; unusedCluster = new int[nNodes]; for (i = 0; i < nNodes; i++) if (nNodesPerCluster[i] == 0) { unusedCluster[nUnusedClusters] = i; nUnusedClusters++; } nodeOrder = new int[nNodes]; for (i = 0; i < nNodes; i++) nodeOrder[i] = i; for (i = 0; i < nNodes; i++) { j = random.nextInt(nNodes); k = nodeOrder[i]; nodeOrder[i] = nodeOrder[j]; nodeOrder[j] = k; } edgeWeightPerCluster = new double[nNodes]; neighboringCluster = new int[nNodes - 1]; nStableNodes = 0; i = 0; do { j = nodeOrder[i]; nNeighboringClusters = 0; for (k = firstNeighborIndex[j]; k < firstNeighborIndex[j + 1]; k++) { l = cluster[neighbor[k]]; if (edgeWeightPerCluster[l] == 0) { neighboringCluster[nNeighboringClusters] = l; nNeighboringClusters++; } edgeWeightPerCluster[l] += edgeWeight[k]; } clusterWeight[cluster[j]] -= nodeWeight[j]; nNodesPerCluster[cluster[j]]--; if (nNodesPerCluster[cluster[j]] == 0) { unusedCluster[nUnusedClusters] = cluster[j]; nUnusedClusters++; } bestCluster = -1; maxQualityFunction = 0; for (k = 0; k < nNeighboringClusters; k++) { l = neighboringCluster[k]; qualityFunction = edgeWeightPerCluster[l] - nodeWeight[j] * clusterWeight[l] * resolution; if ((qualityFunction > maxQualityFunction) || ((qualityFunction == maxQualityFunction) && (l < bestCluster))) { bestCluster = l; maxQualityFunction = qualityFunction; } edgeWeightPerCluster[l] = 0; } if (maxQualityFunction == 0) { bestCluster = unusedCluster[nUnusedClusters - 1]; nUnusedClusters--; } clusterWeight[bestCluster] += nodeWeight[j]; nNodesPerCluster[bestCluster]++; if (bestCluster == cluster[j]) nStableNodes++; else { cluster[j] = bestCluster; nStableNodes = 1; update = true; } i = (i < nNodes - 1) ? (i + 1) : 0; } while (nStableNodes < nNodes); // 优化步骤是直到所有的点都稳定下来才结束 newCluster = new int[nNodes]; nClusters = 0; for (i = 0; i < nNodes; i++) if (nNodesPerCluster[i] > 0) { newCluster[i] = nClusters; nClusters++; } for (i = 0; i < nNodes; i++) cluster[i] = newCluster[cluster[i]]; deleteClusteringStats(); return update; } public boolean runLouvainAlgorithm(double resolution) { return runLouvainAlgorithm(resolution, new Random()); } public boolean runLouvainAlgorithm(double resolution, Random random) { boolean update, update2; Network reducedNetwork; if ((cluster == null) || (nNodes == 1)) return false; update = runLocalMovingAlgorithm(resolution, random); if (nClusters < nNodes) { reducedNetwork = getReducedNetwork(); reducedNetwork.initSingletonClusters(); update2 = reducedNetwork.runLouvainAlgorithm(resolution, random); if (update2) { update = true; mergeClusters(reducedNetwork.getClusters()); } } deleteClusteringStats(); return update; } private Network() { } private void calcClusteringStats() { int i, j; clusterWeight = new double[nClusters]; nNodesPerCluster = new int[nClusters]; nodePerCluster = new int[nClusters][]; for (i = 0; i < nNodes; i++) { clusterWeight[cluster[i]] += nodeWeight[i]; nNodesPerCluster[cluster[i]]++; } for (i = 0; i < nClusters; i++) { nodePerCluster[i] = new int[nNodesPerCluster[i]]; nNodesPerCluster[i] = 0; } for (i = 0; i < nNodes; i++) { j = cluster[i]; nodePerCluster[j][nNodesPerCluster[j]] = i; nNodesPerCluster[j]++; } clusteringStatsAvailable = true; } private void deleteClusteringStats() { clusterWeight = null; nNodesPerCluster = null; nodePerCluster = null; clusteringStatsAvailable = false; } }

最后给上测试数据的链接,里面有facebook,亚马逊等社交信息数据,供大家调试用:

http://snap.stanford.edu/data/#socnets






















版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请联系我们举报,一经查实,本站将立刻删除。

发布者:全栈程序员-站长,转载请注明出处:https://javaforall.net/217594.html原文链接:https://javaforall.net

(0)
上一篇 2026年3月18日 上午9:13
下一篇 2026年3月18日 上午9:13


相关推荐

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注

关注全栈程序员社区公众号