Machine Learning/Kaggle Social Network Contest/Features
TODO
- Precisely define the listed features
Possible Features
- Node Features
- nodeid
- outdegree
- indegree
- local clustering coefficient
- inbound reciprocation probability (number of inbound edges that the node returns / number of inbound edges)
- outbound reciprocation probability (number of outbound edges that are returned to the node / number of outbound edges)
- Edge Features
- nodetofollowid
- shortest path distance from nodeid to nodetofollowid
- density? (median path length)
- does reverse edge exist? (aka is nodetofollowid following nodeid?)
- number of common friends
- indegree and outdegree of nodetofollowid
- Network features
- unweighted random walk score
- global clustering coefficient
- Adamic-Adar score
- see original paper
- R igraph: similarity.invlogweighted
- Clustering
- membership of the same strongly connected cluster
- using igraph clusters
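Several of the node and edge features listed above can be sketched directly from a directed edge list. The following is a minimal Python sketch, not the contest code: the function names (`build_graph`, `node_features`, `shortest_distance`, `edge_features`) are hypothetical, and two definitions are assumptions made for illustration because the list does not pin them down. "Common friends" is taken to mean nodes adjacent to both endpoints in either direction, and a reciprocation ratio is set to 0.0 when its denominator is zero.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build successor and predecessor sets from (follower, followed) pairs."""
    succ, pred = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        succ[src].add(dst)
        pred[dst].add(src)
    return succ, pred


def node_features(node, succ, pred):
    """Degrees and reciprocation ratios for a single node."""
    out_nbrs = succ.get(node, set())
    in_nbrs = pred.get(node, set())
    mutual = len(out_nbrs & in_nbrs)  # edges that exist in both directions
    return {
        "outdegree": len(out_nbrs),
        "indegree": len(in_nbrs),
        # Assumption: ratio is 0.0 when the node has no edges of that kind.
        "recip_inbound": mutual / len(in_nbrs) if in_nbrs else 0.0,
        "recip_outbound": mutual / len(out_nbrs) if out_nbrs else 0.0,
    }


def shortest_distance(src, dst, succ):
    """Directed shortest path length via breadth-first search; None if unreachable."""
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in succ.get(node, ()):
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def edge_features(nodeid, nodetofollowid, succ, pred):
    """Features for the candidate edge nodeid -> nodetofollowid."""
    # Assumption: a "friend" is any neighbour, regardless of edge direction.
    nbrs_a = succ.get(nodeid, set()) | pred.get(nodeid, set())
    nbrs_b = succ.get(nodetofollowid, set()) | pred.get(nodetofollowid, set())
    return {
        "shortest_distance": shortest_distance(nodeid, nodetofollowid, succ),
        "reverse_edge_exists": nodeid in succ.get(nodetofollowid, set()),
        "common_friends": len(nbrs_a & nbrs_b),
        "target_indegree": len(pred.get(nodetofollowid, set())),
        "target_outdegree": len(succ.get(nodetofollowid, set())),
    }
```

Calling `build_graph` once and passing the resulting `succ` and `pred` maps to the other functions keeps each feature lookup cheap. On a graph of contest size the breadth-first search would probably need a depth cap, which this sketch omits.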
The response variable is the probability that the edge from nodeid to nodetofollowid will be created in the future.
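The Adamic-Adar score and the local clustering coefficient from the feature list can be sketched in the same way. This is a hedged illustration in plain Python rather than a call into R igraph: it assumes the follow graph is treated as undirected and weights each common neighbour by the inverse log of its degree. The helper names (`undirected_neighbors`, `adamic_adar`, `local_clustering`) are hypothetical, and the results are not guaranteed to match `similarity.invlogweighted` on a directed graph.

```python
import math
from collections import defaultdict
from itertools import combinations


def undirected_neighbors(edges):
    """Collapse directed edges into undirected neighbour sets, ignoring self-loops."""
    nbrs = defaultdict(set)
    for a, b in edges:
        if a != b:
            nbrs[a].add(b)
            nbrs[b].add(a)
    return nbrs


def adamic_adar(a, b, nbrs):
    """Sum of 1 / log(degree) over the common neighbours of a and b."""
    common = nbrs.get(a, set()) & nbrs.get(b, set())
    # Degree-1 neighbours are skipped to avoid dividing by log(1) == 0.
    return sum(1.0 / math.log(len(nbrs[z])) for z in common if len(nbrs[z]) > 1)


def local_clustering(node, nbrs):
    """Fraction of pairs of neighbours of `node` that are themselves connected."""
    neighbors = nbrs.get(node, set())
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumption: undefined cases are reported as 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in nbrs[u])
    return 2.0 * links / (k * (k - 1))
```

A common neighbour with few connections contributes more to the Adamic-Adar score than a hub does, which is why the score is often preferred over a raw count of common friends.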