Editing
Machine Learning/Kaggle Social Network Contest/Network Description
(section)
Jump to navigation
Jump to search
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
== Conectivity == "A digraph is strongly connected if every vertex is reachable from every other following the directions of the arcs. On the contrary, a digraph is weakly connected if its underlying undirected graph is connected. A weakly connected graph can be thought of as a digraph in which every vertex is "reachable" from every other but not necessarily following the directions of the arcs. A strong orientation is an orientation that produces a strongly connected digraph." [http://en.wikipedia.org/wiki/Glossary_of_graph_theory wikipedia] * The Training Graph is '''not''' weakly connected * It contains 27 subgraphs This means that it can be broken down into at least two discrete subgraphs. ** c.f. [http://cneurocvs.rmki.kfki.hu/igraph/doc/R/clusters.html igraph clustering] ** There is one very large cluster containing all but 154 verticies, then 4 with size 10 - 37, 8 sized 3 - 7 and 13 size 2 ** note that igraph seems to create a vertex labelled 0 but the labels in the traindata file range from 1 to 1133547 * I also grabbed the number of strongly connected subgraphs {| border="1" |- !| Cluster Size | 1 | 2 | 3 | 4 | 5 | 9 | 10 | 32464 |- !| freq |1100647 | 162 | 18 |5 |4 | 1 | 1 | 1 |} When I added all of the test data to the graph and then re-ran the cluster analysis it found 22 clusters instead of 27. The largest cluster grew by 72 vertices. {| border="1" |- !| Cluster Size | 1 | 2 | 3 | 4 | 5 | 7 | 10 | 23 | 37 | 1133394 | 1133466 |- !| Train | 1 | 13 | 3 | 2 | 2 | 1 | 1 | 2 | 1 | 1 | 0 |- !| Train + Test | 1 | 13 | 2 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 1 |} Is it more likely that clusters were created by removing nodes or that they merged due to randomly adding nodes? * TODO: figure out probs of adding and removing nodes under different sampling hypotheses. * TODO: identify the edges which are merging the clusters * I'm guessing that the chances of a randomly generated edge joins the small clusters is very low. * Diameter of the directed graph is 14 ** This is the longest of the shortest directed paths between two nodes ** R igraph *** diameter (dg, directed = TRUE, unconnected = TRUE) *** Was taking forever so I aborted (after 34 minutes...) * Total number of direct neighbours out: 7 275 672, in: 508 688, all: 7 473 273 ** For each of our 38k I calculated the number of outbound neighbours and summed it ** R igraph: *** sum([http://cneurocvs.rmki.kfki.hu/igraph/doc/R/neighborhood.html neighborhood.size](dg, 1, nodes=myGuys, mode="out")) *** mode = "in", "out" or "all"
Summary:
Please note that all contributions to Noisebridge are considered to be released under the Creative Commons Attribution-NonCommercial-ShareAlike (see
Noisebridge:Copyrights
for details). If you do not want your writing to be edited mercilessly and redistributed at will, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource.
Do not submit copyrighted work without permission!
To protect the wiki against automated edit spam, we kindly ask you to solve the following CAPTCHA:
Cancel
Editing help
(opens in new window)
Navigation menu
Personal tools
Not logged in
Talk
Contributions
Log in
Request account
Namespaces
Page
Discussion
English
Views
Read
Edit
View history
More
Search
Dig in!
Noisebridge
- Status: MOVED
- Donate
- ABOUT
- Accessibility
- Vision
- Blog
Manual
MANUAL
Visitors
Participation
Community Standards
Channels
Operations
Events
EVENTS
Guilds
GUILDS
- Meta
- Electronics
- Fabrication
- Games
- Music
- Library
- Neuro
- Philosophy
- Funding
- Art
- Crypto
- Documentation/Wiki
Wiki
Recent Changes
Random Page
Help
Categories
(Edit)
Tools
What links here
Related changes
Special pages
Page information