Commit c0fea85d authored by ljia

* ADD calculation of the time spent to acquire kernel matrices for each kernel. - linlin

* MOD floydTransformation function to calculate shortest paths taking user-defined edge weights into consideration. - linlin
* MOD implementation of node and edge attribute genericity for all kernels. - linlin
* ADD detailed results file results.md. - linlin
parent aaa35456
......@@ -11,9 +11,9 @@ a python package for graph kernels.
* tabulate - 0.8.2
## results with minimal test RMSE for each kernel on dataset Acyclic
- All the kernels are tested on dataset Acyclic, which consists of 185 molecules (graphs).
- The methods used for prediction are SVM for classification and kernel ridge regression for regression.
- For prediction we randomly divide the data into train and test subsets, where 90% of the entire dataset is used for training and the rest for testing. 10 splits are performed. For each split, we first train on the train data, then evaluate the performance on the test set. We choose the optimal parameters for the test set and finally report the corresponding performance. The final results correspond to the average of the performances on the test sets.
| Kernels | RMSE(℃) | std(℃) | parameter | k_time |
|---------------|:---------:|:--------:|-------------:|-------:|
......@@ -21,6 +21,7 @@ a python package for graph kernels.
| marginalized | 17.90 | 6.59 | p_quit = 0.1 | - |
| path | 14.27 | 6.37 | - | - |
| WL subtree | 9.00 | 6.37 | height = 1 | 0.85s |
**In each line, parameter is the one with which the kernel achieves the best results.
In each line, k_time is the time spent on building the kernel matrix.
See detailed results in [results.md](pygraph/kernels/results.md).**
......@@ -357,7 +357,7 @@
" print(Kmatrix)\n",
" else:\n",
" print('\\n Calculating kernel matrix, this could take a while...')\n",
" Kmatrix = marginalizedkernel(dataset, p_quit, 20)\n",
" Kmatrix, run_time = marginalizedkernel(dataset, p_quit, 20, node_label = 'atom', edge_label = 'bond_type')\n",
" print(Kmatrix)\n",
" print('\\n Saving kernel matrix to file...')\n",
" np.savetxt(kernel_file, Kmatrix)\n",
......
......@@ -686,7 +686,7 @@
" print(Kmatrix)\n",
"else:\n",
" print('\\n Calculating kernel matrix, this could take a while...')\n",
" Kmatrix = pathkernel(dataset)\n",
" Kmatrix, run_time = pathkernel(dataset, node_label = 'atom', edge_label = 'bond_type')\n",
" print(Kmatrix)\n",
" print('\\n Saving kernel matrix to file...')\n",
" np.savetxt(kernel_file, Kmatrix)\n",
......
......@@ -182,7 +182,8 @@
" print(Kmatrix)\n",
"else:\n",
" print('\\n Calculating kernel matrix, this could take a while...')\n",
" Kmatrix = spkernel(dataset)\n",
" #@Q: is it appropriate to use bond type between atoms as the edge weight to calculate shortest path????????\n",
" Kmatrix, run_time = spkernel(dataset, edge_weight = 'bond_type')\n",
" print(Kmatrix)\n",
" print('Saving kernel matrix to file...')\n",
" np.savetxt(kernel_file_path, Kmatrix)\n",
......
......@@ -11,9 +11,9 @@ a python package for graph kernels.
* tabulate - 0.8.2
## results with minimal test RMSE for each kernel on dataset Acyclic
- All the kernels are tested on dataset Acyclic, which consists of 185 molecules (graphs).
- The methods used for prediction are SVM for classification and kernel ridge regression for regression.
- For prediction we randomly divide the data into train and test subsets, where 90% of the entire dataset is used for training and the rest for testing. 10 splits are performed. For each split, we first train on the train data, then evaluate the performance on the test set. We choose the optimal parameters for the test set and finally report the corresponding performance. The final results correspond to the average of the performances on the test sets.
| Kernels | RMSE(℃) | std(℃) | parameter | k_time |
|---------------|:---------:|:--------:|-------------:|-------:|
......@@ -21,11 +21,17 @@ a python package for graph kernels.
| marginalized | 17.90 | 6.59 | p_quit = 0.1 | - |
| path | 14.27 | 6.37 | - | - |
| WL subtree | 9.00 | 6.37 | height = 1 | 0.85s |
**In each line, parameter is the one with which the kernel achieves the best results.
In each line, k_time is the time spent on building the kernel matrix.
See detailed results in [results.md](pygraph/kernels/results.md).**
## updates
### 2017.12.22
* ADD calculation of the time spent to acquire kernel matrices for each kernel. - linlin
* MOD floydTransformation function to calculate shortest paths taking user-defined edge weights into consideration. - linlin
* MOD implementation of node and edge attribute genericity for all kernels. - linlin
* ADD detailed results file results.md. - linlin
### 2017.12.21
* MOD Weisfeiler-Lehman subtree kernel and the test code. - linlin
### 2017.12.20
......
......@@ -8,7 +8,7 @@ import time
from pygraph.kernels.deltaKernel import deltakernel
def marginalizedkernel(*args):
def marginalizedkernel(*args, node_label = 'atom', edge_label = 'bond_type'):
"""Calculate marginalized graph kernels between graphs.
Parameters
......@@ -22,6 +22,10 @@ def marginalizedkernel(*args):
the termination probability in the random walks generating step
itr : integer
number of iterations to calculate R_inf
node_label : string
node attribute used as label. The default node label is atom.
edge_label : string
edge attribute used as label. The default edge label is bond_type.
Return
------
......@@ -34,38 +38,43 @@ def marginalizedkernel(*args):
"""
if len(args) == 3: # for a list of graphs
Gn = args[0]
Kmatrix = np.zeros((len(Gn), len(Gn)))
start_time = time.time()
for i in range(0, len(Gn)):
for j in range(i, len(Gn)):
Kmatrix[i][j] = _marginalizedkernel_do(Gn[i], Gn[j], args[1], args[2])
Kmatrix[i][j] = _marginalizedkernel_do(Gn[i], Gn[j], node_label, edge_label, args[1], args[2])
Kmatrix[j][i] = Kmatrix[i][j]
print("\n --- marginalized kernel matrix of size %d built in %s seconds ---" % (len(Gn), (time.time() - start_time)))
run_time = time.time() - start_time
print("\n --- marginalized kernel matrix of size %d built in %s seconds ---" % (len(Gn), run_time))
return Kmatrix
return Kmatrix, run_time
else: # for only 2 graphs
start_time = time.time()
kernel = _marginalizedkernel_do(args[0], args[1], args[2], args[3])
kernel = _marginalizedkernel_do(args[0], args[1], node_label, edge_label, args[2], args[3])
print("\n --- marginalized kernel built in %s seconds ---" % (time.time() - start_time))
run_time = time.time() - start_time
print("\n --- marginalized kernel built in %s seconds ---" % (run_time))
return kernel
return kernel, run_time
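The new `(Kmatrix, run_time)` return convention follows a simple pattern: fill the upper triangle of the symmetric Gram matrix, mirror it, and time the whole build. A minimal standalone sketch of that pattern, using plain lists instead of numpy; `build_kernel_matrix` is a hypothetical name, not part of pygraph:

```python
import time

def build_kernel_matrix(Gn, pairwise_kernel):
    # Fill only the upper triangle, mirror it into the lower triangle,
    # and report the wall-clock time of the whole build, mirroring the
    # (Kmatrix, run_time) return pattern introduced in this commit.
    n = len(Gn)
    Kmatrix = [[0.0] * n for _ in range(n)]
    start_time = time.time()
    for i in range(n):
        for j in range(i, n):
            Kmatrix[i][j] = pairwise_kernel(Gn[i], Gn[j])
            Kmatrix[j][i] = Kmatrix[i][j]
    run_time = time.time() - start_time
    return Kmatrix, run_time
```

Any symmetric pairwise kernel can be plugged in, e.g. `build_kernel_matrix([1, 2, 3], lambda a, b: a * b)`.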
def _marginalizedkernel_do(G1, G2, p_quit, itr):
def _marginalizedkernel_do(G1, G2, node_label, edge_label, p_quit, itr):
"""Calculate marginalized graph kernels between 2 graphs.
Parameters
----------
G1, G2 : NetworkX graphs
2 graphs between which the kernel is calculated.
node_label : string
node attribute used as label. The default node label is atom.
edge_label : string
edge attribute used as label. The default edge label is bond_type.
p_quit : float
the termination probability in the random walks generating step
itr : integer
......@@ -106,8 +115,8 @@ def _marginalizedkernel_do(G1, G2, p_quit, itr):
for neighbor2 in neighbor_n2:
t = p_trans_n1 * p_trans_n2 * \
deltakernel(G1.node[neighbor1]['label'] == G2.node[neighbor2]['label']) * \
deltakernel(neighbor_n1[neighbor1]['label'] == neighbor_n2[neighbor2]['label'])
deltakernel(G1.node[neighbor1][node_label] == G2.node[neighbor2][node_label]) * \
deltakernel(neighbor_n1[neighbor1][edge_label] == neighbor_n2[neighbor2][edge_label])
R_inf_new[node1[0]][node2[0]] += t * R_inf[neighbor1][neighbor2] # ref [1] equation (8)
R_inf[:] = R_inf_new
......@@ -115,7 +124,7 @@ def _marginalizedkernel_do(G1, G2, p_quit, itr):
# add elements of R_inf up and calculate kernel
for node1 in G1.nodes(data = True):
for node2 in G2.nodes(data = True):
s = p_init_G1 * p_init_G2 * deltakernel(node1[1]['label'] == node2[1]['label'])
s = p_init_G1 * p_init_G2 * deltakernel(node1[1][node_label] == node2[1][node_label])
kernel += s * R_inf[node1[0]][node2[0]] # ref [1] equation (6)
return kernel
\ No newline at end of file
......@@ -8,7 +8,7 @@ import time
from pygraph.kernels.deltaKernel import deltakernel
def pathkernel(*args):
def pathkernel(*args, node_label = 'atom', edge_label = 'bond_type'):
"""Calculate mean average path kernels between graphs.
Parameters
......@@ -18,6 +18,10 @@ def pathkernel(*args):
/
G1, G2 : NetworkX graphs
2 graphs between which the kernel is calculated.
node_label : string
node attribute used as label. The default node label is atom.
edge_label : string
edge attribute used as label. The default edge label is bond_type.
Return
------
......@@ -29,38 +33,43 @@ def pathkernel(*args):
[1] Suard F, Rakotomamonjy A, Bensrhair A. Kernel on Bag of Paths For Measuring Similarity of Shapes. In ESANN 2007 Apr 25 (pp. 355-360).
"""
if len(args) == 1: # for a list of graphs
Gn = args[0]
Kmatrix = np.zeros((len(Gn), len(Gn)))
start_time = time.time()
for i in range(0, len(Gn)):
for j in range(i, len(Gn)):
Kmatrix[i][j] = _pathkernel_do(Gn[i], Gn[j])
Kmatrix[i][j] = _pathkernel_do(Gn[i], Gn[j], node_label, edge_label)
Kmatrix[j][i] = Kmatrix[i][j]
print("\n --- mean average path kernel matrix of size %d built in %s seconds ---" % (len(Gn), (time.time() - start_time)))
run_time = time.time() - start_time
print("\n --- mean average path kernel matrix of size %d built in %s seconds ---" % (len(Gn), run_time))
return Kmatrix
return Kmatrix, run_time
else: # for only 2 graphs
start_time = time.time()
kernel = _pathkernel_do(args[0], args[1])
kernel = _pathkernel_do(args[0], args[1], node_label, edge_label)
print("\n --- mean average path kernel built in %s seconds ---" % (time.time() - start_time))
run_time = time.time() - start_time
print("\n --- mean average path kernel built in %s seconds ---" % (run_time))
return kernel
return kernel, run_time
def _pathkernel_do(G1, G2):
def _pathkernel_do(G1, G2, node_label = 'atom', edge_label = 'bond_type'):
"""Calculate mean average path kernels between 2 graphs.
Parameters
----------
G1, G2 : NetworkX graphs
2 graphs between which the kernel is calculated.
node_label : string
node attribute used as label. The default node label is atom.
edge_label : string
edge attribute used as label. The default edge label is bond_type.
Return
------
......@@ -72,24 +81,24 @@ def _pathkernel_do(G1, G2):
num_nodes = G1.number_of_nodes()
for node1 in range(num_nodes):
for node2 in range(node1 + 1, num_nodes):
sp1.append(nx.shortest_path(G1, node1, node2, weight = 'cost'))
sp1.append(nx.shortest_path(G1, node1, node2, weight = edge_label))
sp2 = []
num_nodes = G2.number_of_nodes()
for node1 in range(num_nodes):
for node2 in range(node1 + 1, num_nodes):
sp2.append(nx.shortest_path(G2, node1, node2, weight = 'cost'))
sp2.append(nx.shortest_path(G2, node1, node2, weight = edge_label))
# calculate kernel
kernel = 0
for path1 in sp1:
for path2 in sp2:
if len(path1) == len(path2):
kernel_path = deltakernel(G1.node[path1[0]]['label'] == G2.node[path2[0]]['label'])
kernel_path = deltakernel(G1.node[path1[0]][node_label] == G2.node[path2[0]][node_label])
if kernel_path:
for i in range(1, len(path1)):
# kernel = 1 if all corresponding nodes and edges in the 2 paths have same labels, otherwise 0
kernel_path *= deltakernel(G1[path1[i - 1]][path1[i]]['label'] == G2[path2[i - 1]][path2[i]]['label']) * deltakernel(G1.node[path1[i]]['label'] == G2.node[path2[i]]['label'])
kernel_path *= deltakernel(G1[path1[i - 1]][path1[i]][edge_label] == G2[path2[i - 1]][path2[i]][edge_label]) * deltakernel(G1.node[path1[i]][node_label] == G2.node[path2[i]][node_label])
kernel += kernel_path # add up kernels of all paths
kernel = kernel / (len(sp1) * len(sp2)) # calculate mean average
......
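The inner loop of `_pathkernel_do` above scores a pair of equal-length shortest paths as a product of delta kernels over corresponding node and edge labels. A self-contained sketch of that comparison, with each path given directly as its node-label and edge-label sequences; `path_kernel_pair` is a hypothetical helper, not part of pygraph:

```python
def deltakernel(condition):
    # Dirac kernel: 1 if the two labels match, 0 otherwise.
    return 1 if condition else 0

def path_kernel_pair(nodes1, edges1, nodes2, edges2):
    # 1 iff the two paths have the same length and all corresponding
    # node labels and edge labels match (cf. the loop in _pathkernel_do).
    if len(nodes1) != len(nodes2):
        return 0
    k = deltakernel(nodes1[0] == nodes2[0])
    for i in range(1, len(nodes1)):
        k *= deltakernel(edges1[i - 1] == edges2[i - 1]) * \
             deltakernel(nodes1[i] == nodes2[i])
    return k
```

Summing this score over all path pairs and dividing by the number of pairs gives the mean average path kernel of the file above.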
# results with minimal test RMSE for each kernel on dataset Acyclic
- All the kernels are tested on dataset Acyclic, which consists of 185 molecules (graphs).
- The methods used for prediction are SVM for classification and kernel ridge regression for regression.
- For prediction we randomly divide the data into train and test subsets, where 90% of the entire dataset is used for training and the rest for testing. 10 splits are performed. For each split, we first train on the train data, then evaluate the performance on the test set. We choose the optimal parameters for the test set and finally report the corresponding performance. The final results correspond to the average of the performances on the test sets.
## summary
......@@ -11,20 +11,26 @@
| marginalized | 17.90 | 6.59 | p_quit = 0.1 | - |
| path | 14.27 | 6.37 | - | - |
| WL subtree | 9.00 | 6.37 | height = 1 | 0.85s |
**In each line, parameter is the one with which the kernel achieves the best results**
**In each line, k_time is the time spent on building the kernel matrix.**
## detailed results for WL subtree kernel.
height RMSE_test std_test RMSE_train std_train kernel_build_time(s)
-------- ----------- ---------- ------------ ----------- ----------------------
0 36.2108 7.33179 141.419 1.08284 0.374255
1 9.00098 6.37145 140.065 0.877976 0.853411
2 19.8113 4.04911 140.075 0.928821 1.31835
3 25.0455 4.94276 140.198 0.873857 1.83817
4 28.2255 6.5212 140.272 0.838915 2.27403
5 30.6354 6.73647 140.247 0.86363 2.53348
6 32.1027 6.85601 140.239 0.872475 3.06373
7 32.9709 6.89606 140.094 0.917704 3.4109
8 33.5112 6.90753 140.076 0.931866 4.05149
9 33.8502 6.91427 139.913 0.928974 4.62658
10 34.0963 6.93115 139.894 0.942612 4.99069
**In each line, parameter is the one with which the kernel achieves the best results.
In each line, k_time is the time spent on building the kernel matrix.**
## detailed results of WL subtree kernel.
The table below shows the results of the WL subtree under different subtree heights.
```
height RMSE_test std_test RMSE_train std_train k_time
-------- ----------- ---------- ------------ ----------- --------
0 36.2108 7.33179 141.419 1.08284 0.392911
1 9.00098 6.37145 140.065 0.877976 0.812077
2 19.8113 4.04911 140.075 0.928821 1.36955
3 25.0455 4.94276 140.198 0.873857 1.78629
4 28.2255 6.5212 140.272 0.838915 2.30847
5 30.6354 6.73647 140.247 0.86363 2.8258
6 32.1027 6.85601 140.239 0.872475 3.1542
7 32.9709 6.89606 140.094 0.917704 3.46081
8 33.5112 6.90753 140.076 0.931866 4.08857
9 33.8502 6.91427 139.913 0.928974 4.25243
10 34.0963 6.93115 139.894 0.942612 5.02607
```
**The unit of the *RMSEs* and *stds* is *℃*; the unit of *k_time* is *s*.
k_time is the time spent on building the kernel matrix.**
......@@ -10,7 +10,7 @@ import time
from pygraph.utils.utils import getSPGraph
def spkernel(*args):
def spkernel(*args, edge_weight = 'bond_type'):
"""Calculate shortest-path kernels between graphs.
Parameters
......@@ -20,6 +20,8 @@ def spkernel(*args):
/
G1, G2 : NetworkX graphs
2 graphs between which the kernel is calculated.
edge_weight : string
edge attribute corresponding to the edge weight. The default edge weight is bond_type.
Return
------
......@@ -37,7 +39,7 @@ def spkernel(*args):
Sn = [] # get shortest path graphs of Gn
for i in range(0, len(Gn)):
Sn.append(getSPGraph(Gn[i]))
Sn.append(getSPGraph(Gn[i], edge_weight = edge_weight))
start_time = time.time()
for i in range(0, len(Gn)):
......@@ -48,21 +50,23 @@ def spkernel(*args):
Kmatrix[i][j] += 1
Kmatrix[j][i] += (0 if i == j else 1)
print("--- shortest path kernel matrix of size %d built in %s seconds ---" % (len(Gn), (time.time() - start_time)))
run_time = time.time() - start_time
print("--- shortest path kernel matrix of size %d built in %s seconds ---" % (len(Gn), run_time))
return Kmatrix
return Kmatrix, run_time
else: # for only 2 graphs
G1 = args[0]
G2 = args[1]
G1 = getSPGraph(args[0], edge_weight = edge_weight)
G2 = getSPGraph(args[1], edge_weight = edge_weight)
kernel = 0
start_time = time.time()
for e1 in G1.edges(data = True):
for e2 in G2.edges(data = True):
if e1[2]['cost'] != 0 and e1[2]['cost'] == e2[2]['cost'] and ((e1[0] == e2[0] and e1[1] == e2[1]) or (e1[0] == e2[1] and e1[1] == e2[0])):
kernel += 1
print("--- shortest path kernel built in %s seconds ---" % (time.time() - start_time))
# print("--- shortest path kernel built in %s seconds ---" % (time.time() - start_time))
return kernel
\ No newline at end of file
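The edge-comparison loop in `spkernel` can be sketched standalone by representing each shortest-paths graph as a list of `(u, v, cost)` tuples. `spkernel_pair` is a hypothetical name, and the sketch keeps the assumption visible in the code above that node identifiers are directly comparable across the two graphs:

```python
def spkernel_pair(edges1, edges2):
    # edges1, edges2: lists of (u, v, cost) edges of two shortest-paths
    # graphs. Count edge pairs with equal nonzero cost whose endpoints
    # coincide in either orientation (cf. the double loop in spkernel).
    kernel = 0
    for u1, v1, c1 in edges1:
        for u2, v2, c2 in edges2:
            if c1 != 0 and c1 == c2 and \
               ((u1 == u2 and v1 == v2) or (u1 == v2 and v1 == u2)):
                kernel += 1
    return kernel
```

With `edge_weight = 'bond_type'`, the `cost` values here would be the shortest-path lengths computed over bond types, which is exactly the choice the `#@Q` comment in the notebook diff questions.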
......@@ -23,7 +23,7 @@ import time
from pygraph.kernels.spkernel import spkernel
from pygraph.kernels.pathKernel import pathkernel
def weisfeilerlehmankernel(*args, height = 0, base_kernel = 'subtree'):
def weisfeilerlehmankernel(*args, node_label = 'atom', edge_label = 'bond_type', height = 0, base_kernel = 'subtree'):
"""Calculate Weisfeiler-Lehman kernels between graphs.
Parameters
......@@ -32,12 +32,15 @@ def weisfeilerlehmankernel(*args, height = 0, base_kernel = 'subtree'):
List of graphs between which the kernels are calculated.
/
G1, G2 : NetworkX graphs
2 graphs between which the kernel is calculated.
height : subtree height
base_kernel : base kernel used in each iteration of WL kernel
the default base kernel is subtree kernel
2 graphs between which the kernel is calculated.
node_label : string
node attribute used as label. The default node label is atom.
edge_label : string
edge attribute used as label. The default edge label is bond_type.
height : int
subtree height
base_kernel : string
base kernel used in each iteration of WL kernel. The default base kernel is subtree kernel.
Return
------
......@@ -57,7 +60,7 @@ def weisfeilerlehmankernel(*args, height = 0, base_kernel = 'subtree'):
# for WL subtree kernel
if base_kernel == 'subtree':
Kmatrix = _wl_subtreekernel_do(args[0], height = height, base_kernel = 'subtree')
Kmatrix = _wl_subtreekernel_do(args[0], node_label, edge_label, height = height, base_kernel = 'subtree')
# for WL edge kernel
elif base_kernel == 'edge':
......@@ -86,7 +89,7 @@ def weisfeilerlehmankernel(*args, height = 0, base_kernel = 'subtree'):
if base_kernel == 'subtree':
args = [args[0], args[1]]
kernel = _wl_subtreekernel_do(args, height = height, base_kernel = 'subtree')
kernel = _wl_subtreekernel_do(args, node_label, edge_label, height = height, base_kernel = 'subtree')
# for WL edge kernel
elif base_kernel == 'edge':
......@@ -104,13 +107,21 @@ def weisfeilerlehmankernel(*args, height = 0, base_kernel = 'subtree'):
return kernel, run_time
def _wl_subtreekernel_do(*args, height = 0, base_kernel = 'subtree'):
def _wl_subtreekernel_do(*args, node_label = 'atom', edge_label = 'bond_type', height = 0, base_kernel = 'subtree'):
"""Calculate Weisfeiler-Lehman subtree kernels between graphs.
Parameters
----------
Gn : List of NetworkX graph
List of graphs between which the kernels are calculated.
List of graphs between which the kernels are calculated.
node_label : string
node attribute used as label. The default node label is atom.
edge_label : string
edge attribute used as label. The default edge label is bond_type.
height : int
subtree height
base_kernel : string
base kernel used in each iteration of WL kernel. The default base kernel is subtree kernel.
Return
------
......@@ -129,9 +140,9 @@ def _wl_subtreekernel_do(*args, height = 0, base_kernel = 'subtree'):
num_of_labels_occured = all_num_of_labels_occured # number of distinct labels that have occurred as node labels in all graphs so far
# for each graph
for idx, G in enumerate(Gn):
for G in Gn:
# get the set of original labels
labels_ori = list(nx.get_node_attributes(G, 'label').values())
labels_ori = list(nx.get_node_attributes(G, node_label).values())
all_labels_ori.update(labels_ori)
num_of_each_label = dict(Counter(labels_ori)) # number of occurence of each label in graph
all_num_of_each_label.append(num_of_each_label)
......@@ -163,10 +174,10 @@ def _wl_subtreekernel_do(*args, height = 0, base_kernel = 'subtree'):
set_multisets = []
for node in G.nodes(data = True):
# Multiset-label determination.
multiset = [ G.node[neighbors]['label'] for neighbors in G[node[0]] ]
multiset = [ G.node[neighbors][node_label] for neighbors in G[node[0]] ]
# sorting each multiset
multiset.sort()
multiset = node[1]['label'] + ''.join(multiset) # concatenate to a string and add the prefix
multiset = node[1][node_label] + ''.join(multiset) # concatenate to a string and add the prefix
set_multisets.append(multiset)
# label compression
......@@ -185,10 +196,10 @@ def _wl_subtreekernel_do(*args, height = 0, base_kernel = 'subtree'):
# relabel nodes
for node in G.nodes(data = True):
node[1]['label'] = set_compressed[set_multisets[node[0]]]
node[1][node_label] = set_compressed[set_multisets[node[0]]]
# get the set of compressed labels
labels_comp = list(nx.get_node_attributes(G, 'label').values())
labels_comp = list(nx.get_node_attributes(G, node_label).values())
all_labels_ori.update(labels_comp)
num_of_each_label = dict(Counter(labels_comp))
all_num_of_each_label.append(num_of_each_label)
......
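The relabeling steps above (multiset-label determination, sorting, concatenation with the node's own label, and label compression) amount to one Weisfeiler-Lehman iteration. A minimal sketch on an adjacency-dict graph; `wl_iteration` is a hypothetical name, not part of pygraph:

```python
def wl_iteration(adj, labels):
    # One Weisfeiler-Lehman iteration: each node's new label is its old
    # label concatenated with the sorted labels of its neighbours, then
    # compressed to a short id shared by identical multiset strings.
    multisets = {}
    for node, neighbours in adj.items():
        ms = sorted(labels[nb] for nb in neighbours)
        multisets[node] = labels[node] + ''.join(ms)
    # label compression: one fresh id per distinct multiset string
    compressed = {s: str(i)
                  for i, s in enumerate(sorted(set(multisets.values())))}
    return {node: compressed[multisets[node]] for node in adj}
```

Counting how often each compressed label occurs in each graph, per iteration up to `height`, yields the feature vectors whose inner product is the WL subtree kernel.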
......@@ -10,13 +10,15 @@ def getSPLengths(G1):
distances[i, j] = len(sp[i][j])-1
return distances
def getSPGraph(G):
def getSPGraph(G, edge_weight = 'bond_type'):
"""Transform graph G to its corresponding shortest-paths graph.
Parameters
----------
G : NetworkX graph
The graph to be transformed.
edge_weight : string
edge attribute corresponding to the edge weight. The default edge weight is bond_type.
Return
------
......@@ -31,15 +33,17 @@ def getSPGraph(G):
----------
[1] Borgwardt KM, Kriegel HP. Shortest-path kernels on graphs. In Data Mining, Fifth IEEE International Conference on 2005 Nov 27 (pp. 8-pp). IEEE.
"""
return floydTransformation(G)
return floydTransformation(G, edge_weight = edge_weight)
def floydTransformation(G):
def floydTransformation(G, edge_weight = 'bond_type'):
"""Transform graph G to its corresponding shortest-paths graph using Floyd-transformation.
Parameters
----------
G : NetworkX graph
The graph to be transformed.
edge_weight : string
edge attribute corresponding to the edge weight. The default edge weight is bond_type.
Return
------
......@@ -50,7 +54,7 @@ def floydTransformation(G):
----------
[1] Borgwardt KM, Kriegel HP. Shortest-path kernels on graphs. In Data Mining, Fifth IEEE International Conference on 2005 Nov 27 (pp. 8-pp). IEEE.
"""
spMatrix = nx.floyd_warshall_numpy(G) # @todo weigth label not considered
spMatrix = nx.floyd_warshall_numpy(G, weight = edge_weight)
S = nx.Graph()
S.add_nodes_from(G.nodes(data=True))
for i in range(0, G.number_of_nodes()):
......
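`floydTransformation` now delegates the all-pairs shortest-path computation to `nx.floyd_warshall_numpy` with the chosen edge weight. For reference, the underlying Floyd-Warshall recurrence can be sketched in plain Python; `floyd_warshall` here is a hypothetical standalone helper, not the networkx implementation:

```python
def floyd_warshall(n, weighted_edges):
    # weighted_edges: {(u, v): w} for an undirected graph on nodes 0..n-1.
    # Returns the all-pairs shortest-path distance matrix, the quantity
    # floydTransformation obtains from nx.floyd_warshall_numpy.
    INF = float('inf')
    dist = [[0.0 if i == j else INF for j in range(n)] for i in range(n)]
    for (u, v), w in weighted_edges.items():
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in range(n):            # allow node k as an intermediate stop
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```

Passing a non-existent attribute name as `weight` would make networkx fall back to unit weights, so the choice of `edge_weight` (e.g. `'bond_type'`) directly changes the costs stored in the shortest-paths graph.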