The idea of using a hierarchical clustering of nodes in order to guide network embedding is very nice!
### Related work:
Hierarchical clustering for network embedding has also been used in the two following papers:
- [HARP: Hierarchical Representation Learning for Networks](https://papers-gamma.link/paper/109)
- [MILE: A Multi-Level Framework for Scalable Graph Embedding](https://papers-gamma.link/paper/104)
### Scalability:
The largest graph used in the experiments has only 334k edges, and the running time on that network is not reported. In Figure 7, the method takes 2 minutes on a BA network with fewer than 10k nodes and 150k edges.
Reporting the running time on a graph with 1G edges would have given a much better picture of the method's scalability.
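
To make the gap concrete, here is a rough back-of-the-envelope extrapolation. It is only a sketch under strong assumptions that the paper does not confirm: that the total running time grows linearly with the number of edges, that the number of iterations stays the same, and that the Figure 7 graph has about 150k edges. The function name `extrapolate_linear` is purely illustrative.

```python
def extrapolate_linear(seconds_measured: float, edges_measured: int, edges_target: int) -> float:
    """Scale a measured running time to a larger graph, ASSUMING time is linear in the edge count."""
    return seconds_measured * edges_target / edges_measured


# Figure 7 data point as read in this review: about 2 minutes for about 150k edges.
measured_seconds = 2 * 60.0
measured_edges = 150_000
target_edges = 1_000_000_000  # the 1G-edge graph suggested above

estimated_seconds = extrapolate_linear(measured_seconds, measured_edges, target_edges)
estimated_days = estimated_seconds / 86_400  # 86,400 seconds per day

print(f"estimated: {estimated_seconds:.0f} s, i.e. about {estimated_days:.1f} days")
```

Under these assumptions the estimate is on the order of days rather than minutes, which is why a measured running time on a large graph would be informative.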
### Reproducibility:
The implementation does not seem to be publicly available, and since the proposed method is complicated, the results are hard to reproduce.
### Table 3:
It is interesting that the dot-product approach leads to the best results for link prediction in comparison with the four strategies used in node2vec. Usually, Hadamard leads to the best results.
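
For reference, the four node2vec strategies turn a pair of node embeddings into an edge feature vector that is then fed to a classifier, whereas the dot product yields a single score directly. The sketch below uses plain Python lists and made-up toy vectors; the function names are illustrative and are not taken from the paper's code.

```python
from typing import List

Vector = List[float]


def average(u: Vector, v: Vector) -> Vector:
    """node2vec 'Average' operator: element-wise mean."""
    return [(a + b) / 2.0 for a, b in zip(u, v)]


def hadamard(u: Vector, v: Vector) -> Vector:
    """node2vec 'Hadamard' operator: element-wise product."""
    return [a * b for a, b in zip(u, v)]


def weighted_l1(u: Vector, v: Vector) -> Vector:
    """node2vec 'Weighted-L1' operator: element-wise absolute difference."""
    return [abs(a - b) for a, b in zip(u, v)]


def weighted_l2(u: Vector, v: Vector) -> Vector:
    """node2vec 'Weighted-L2' operator: element-wise squared difference."""
    return [(a - b) ** 2 for a, b in zip(u, v)]


def dot_product(u: Vector, v: Vector) -> float:
    """Dot-product score: a single number, usable for ranking without a classifier."""
    return sum(a * b for a, b in zip(u, v))


# Toy embeddings (hypothetical values, for illustration only).
u = [0.5, -1.0, 2.0]
v = [1.0, 0.5, 0.25]

print(hadamard(u, v))     # a feature vector with one entry per dimension
print(dot_product(u, v))  # the sum of the Hadamard entries
```

Note that the dot product is the sum of the Hadamard entries, so a classifier trained on Hadamard features with all weights equal would reproduce it. The difference in Table 3 therefore comes from the learned weights.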

Comments to a comment:
1. Code is available at https://github.com/JianxinMa/jianxinma.github.io/blob/master/assets/nethiex-linux.zip at the moment. It will probably be put on http://nrl.thumedialab.com/ later.
Yeah, the implementation is quite complicated. Still, it is a lot simpler than variational inference for the nested Chinese restaurant process (for which I can't find any implementation).
2. Scalability shouldn't be a problem in theory, since the time complexity is linear per iteration. But the released code is not really optimized (it is not even multi-threaded or distributed).
3. It's possible for Hadamard to be worse than dot product, as it requires training a classifier and can over-fit if not tuned carefully or trained on a small sample.
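
To illustrate point 3, here is a minimal sketch of how dot-product link prediction can be scored with no training step at all: candidate pairs are ranked by their dot product, and AUC is computed from the scores directly. The embeddings, the edge lists, and the helper names are all made up for illustration and do not come from the released code.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Edge = Tuple[str, str]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def auc(pos_scores: List[float], neg_scores: List[float]) -> float:
    """Probability that a random positive pair outscores a random negative pair (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical toy embeddings and held-out pairs.
emb: Dict[str, Vector] = {
    "a": [1.0, 0.0],
    "b": [0.5, 0.5],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
}
true_edges: List[Edge] = [("a", "b"), ("a", "c")]
non_edges: List[Edge] = [("b", "c"), ("a", "d")]

pos = [dot(emb[x], emb[y]) for x, y in true_edges]
neg = [dot(emb[x], emb[y]) for x, y in non_edges]
score = auc(pos, neg)
print(score)
```

Nothing here is fitted to the training edges, so there is nothing to over-fit. The Hadamard variant would add a classifier on top of per-dimension features, and that is where tuning and sample size start to matter.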

## Comments: