Table Of Contents
Venture Capital – Startup Network
- Initial Report PDF
- Initial input data file: Excel file, CSV file
- Git repo: /home/ec2-user/GitRepo/R/VentureCapital.git/ (also ec2:/home/ec2-user/GitRepo/R/VentureCapital.git, where ec2: is an alias)
- For Firms FirmCluster.json
- For Companies
- For Firms
Command to import data:
- Change directory to mongo bin path.
- ./mongoimport -d ndataconsulting -c VCDeals --type csv --file /home/ntreees/VCdeals.csv --headerline
- The command above imports the CSV file into the VCDeals collection of the ndataconsulting database.
- Indexes single column:
iGraph R Module
- IGraph Community Detection Details http://www.r-bloggers.com/summary-of-community-detection-algorithms-in-igraph-0-6/
- iGraph Documentation
- iGraph Tutorial (http://igraph.sourceforge.net/igraphbook/igraphbook-datamodel.html)
- Drawing Graph (http://horicky.blogspot.com/2012/04/basic-graph-analytics-using-igraph.html)
- Pseudo Inverse
- Get Adjacency Graph (http://stackoverflow.com/questions/14849835/how-to-calculate-adjacency-matrices-in-r)
- layout=layout.fruchterman.reingold: force-based layout implementation
- Drawing Graph
- Degree of Graph
- Laplacian of Graph
- Example to create graph from data http://igraph.sourceforge.net/igraphbook/igraphbook-creating.html
- Integration With R
- Interesting Snippets
Details about the algorithm (Newman-Girvan cohesion-based clustering algorithm):
- It is an algorithm used to find community structure. In a community structure, sets of nodes that are densely connected to each other form groups, and these groups are only sparsely connected to other groups. In other words, two nodes are more likely to be connected if they are in the same community and less likely if they are in different communities.
- The algorithm works by finding the edges that run between communities and removing them, leaving behind only the communities themselves. To find these edges it uses edge betweenness.
- Edge betweenness assigns a high score to an edge that lies on the shortest paths between many pairs of nodes.
- Popular but slow: it takes O(m^2 n) time on a network of n vertices and m edges, which makes it impractical for large networks.
- It focuses on the edges that are least central to communities, that is, the edges that are most "between" communities. The communities are detected by progressively removing these edges from the original graph.
- If a network contains communities or groups that are only loosely connected by a few intergroup edges, then all shortest paths between different communities must go along one of these few edges. Thus, at least one of the edges connecting communities will have high edge betweenness. By removing these edges, the groups are separated from one another and the underlying community structure of the network is revealed.
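
The steps described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the iGraph implementation this project uses. The function names (edge_betweenness, components, split_once) and the toy edge list are hypothetical and chosen only for this example. It assumes an unweighted, undirected graph with no self-loops.

```python
from collections import deque


def edge_betweenness(adj):
    """Edge betweenness for an unweighted, undirected graph.

    adj maps each node to the set of its neighbours (illustrative format).
    """
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for source in adj:
        # Breadth-first search: count shortest paths from source to each node.
        dist = {source: 0}
        paths = {source: 1}
        parents = {source: []}
        order = []
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    paths[nb] = 0
                    parents[nb] = []
                    queue.append(nb)
                if dist[nb] == dist[node] + 1:
                    paths[nb] += paths[node]
                    parents[nb].append(node)
        # Walk back from the farthest nodes, sharing credit along shortest paths.
        credit = {node: 0.0 for node in order}
        for node in reversed(order):
            for parent in parents[node]:
                share = paths[parent] / paths[node] * (1.0 + credit[node])
                score[frozenset((parent, node))] += share
                credit[parent] += share
    # Every pair of nodes was counted once from each endpoint.
    return {edge: value / 2.0 for edge, value in score.items()}


def components(adj):
    """Return the connected components of the graph as a list of node sets."""
    seen = set()
    groups = []
    for start in adj:
        if start in seen:
            continue
        group = set()
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            group.add(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        groups.append(group)
    return groups


def split_once(edges):
    """Remove highest-betweenness edges until the graph gains a component."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    start = len(components(adj))
    while len(components(adj)) == start:
        scores = edge_betweenness(adj)  # recomputed after every removal
        if not scores:
            break  # no edges left to remove
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Toy graph (made up for this example): two triangles joined by the edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
print(sorted(sorted(group) for group in split_once(edges)))
```

In the toy graph every shortest path between the two triangles crosses the edge c-d, so that edge gets the highest betweenness and is removed first, which separates the graph into the groups a, b, c and d, e, f. Recomputing betweenness after each removal is what makes the method slow on large networks.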