Machine Learning: Difference between revisions

From Noisebridge
(22 intermediate revisions by 2 users not shown)
Line 1: Line 1:
=== Join the Mailing List ===
https://www.noisebridge.net/mailman/listinfo/ml
=== Next Meeting===


*When:  
*When: Thursday, January 30, 2014 @ 7:00pm
*Where: 2169 Mission St. (back NE corner, Church classroom)
*Where: 2169 Mission St. (Church classroom)
*Topic:  
*Topic: k-Nearest Neighbors and k-Means Clustering
*Details: Currently on hiatus until somebody decides to pick it back up!
*Details:  
*Who:  
*Who: Mike S


=== Take the Noisebridge ML Survey ===
[http://www.surveymonkey.com/s/W2T9ZB6 Take a survey] and vote for what you want to learn!
=== Crowdsourced Q&A ===
Are you working on a data mining, machine learning, or statistics problem? Do you want some help? Consider sending an email to the [https://www.noisebridge.net/mailman/listinfo/ml mailing list] about it! Also consider setting up a day to come in and talk about the project you're working on and get input from other ML people.
=== About Us ===
We're a loosely knit stochastic federation of people who like Noisebridge and like machine learning. What is machine learning? It's a broad field that typically involves training computer models to solve problems. How can you participate? Join the [https://www.noisebridge.net/mailman/listinfo/ml mailing list], send an email and introduce yourself. Show up to the next meeting, share your thoughts. Participate in projects or start your own. Go to workshops, write code at workshops, learn stuff, give workshops of your own! All are welcome.


=== Talks and Workshops ===
Line 42: Line 40:
*Working with the Kinect
*Computer Vision with OpenCV
=== Mailing List ===
https://www.noisebridge.net/mailman/listinfo/ml


=== Projects ===
Line 68: Line 62:
**Upload your algorithm and objectively compare its performance to other algorithms
*[http://www.ntis.gov/products/ssa-dmf.aspx Social Security Death Master File!]
*[http://www.sipri.org/databases SIPRI Social Databases]
**Wealth of information on international arms transfers and peace missions.
*[http://aws.amazon.com/publicdatasets/ Amazon AWS Public Datasets]
*[http://www.prio.no/Data/Armed-Conflict/ UCDP/PRIO Armed Conflict Datasets]
*[https://opendata.socrata.com/browse Socrata Government Datasets]


=== Software Tools ===
Line 74: Line 73:
*[http://www.cs.waikato.ac.nz/ml/weka/ Weka]
**a collection of data mining tools and machine learning algorithms.
*[http://moa.cs.waikato.ac.nz/ MOA (Massive Online Analysis)]
**Offshoot of Weka; all of its algorithms are online
*[http://scikit-learn.sourceforge.net/ scikits.learn]
**Machine learning Python package
Line 108: Line 105:
*[http://www.mlpack.org/ MLPACK]
**High performance scalable ML Library
*[http://www.torch.ch/ Torch]
**MATLAB-like environment for state-of-the-art ML libraries written in Lua


==== Online ML ====
*[http://moa.cs.waikato.ac.nz/ MOA (Massive Online Analysis)]
**Offshoot of Weka; all of its algorithms are online
*[http://jubat.us/en/ Jubatus]
**Distributed Online ML
*[http://dogma.sourceforge.net/ DOGMA]
**MATLAB-based online learning stuff
*[http://code.google.com/p/libol/ libol]
*[http://code.google.com/p/oll/ oll]
*[http://code.google.com/p/scw-learning/ scw-learning]


==== Graphical Models ====
Line 120: Line 126:
*[http://mc-stan.org/ Stan]
**A graphical model compiler
*[https://github.com/kutschkem/Jayes Jayes]
**Bayesian networks in Java
*[http://tops.sourceforge.net/ ToPS]
**Probabilistic models of sequences


==== Text Stuff ====
Line 126: Line 136:
*[http://www.mlsec.org/sally/ SALLY]
**Tool for embedding strings into vector spaces
*[http://radimrehurek.com/gensim/ Gensim]
**Topic modeling


==== Collaborative Filtering ====
Line 140: Line 152:
*[http://drwn.anu.edu.au/ DARWIN]
**Generic C++ ML and Computer Vision Library
*[http://sourceforge.net/projects/petavision/ PetaVision]
**Developing a real-time, full-scale model of the primate visual cortex.


==== Audio Processing ====
Line 152: Line 166:
*[http://ofer.sci.ccny.cuny.edu/sound_analysis_pro Sound Analysis Pro]
**Tool for analyzing animal sounds
*[http://luscinia.sourceforge.net/ Luscinia]
**Software for archiving, measuring, and analyzing bioacoustic data
*[http://wiki.python.org/moin/PythonInMusic List of Sound Tools for Python]


Line 165: Line 182:
*[http://cytoscape.github.io/cytoscape.js/ Cytoscape]
**A JavaScript graph library for analysis and visualisation
*[https://plot.ly/ plot.ly]
**Web-based plotting


==== Cluster Computing ====
Line 171: Line 190:
*[http://web.mit.edu/star/cluster/ STAR: Cluster]
**Easily build your own Python computing cluster on Amazon EC2
==== Database Stuff ====
*[http://madlib.net/ MADlib]
**Machine learning algorithms for in-database data
*[http://www.joyent.com/products/manta Manta]
**Distributed object storage
==== Neural Simulation ====
*[http://nengo.ca/ Nengo]


==== Other ====

Revision as of 21:42, 23 January 2014

Join the Mailing List

https://www.noisebridge.net/mailman/listinfo/ml

Next Meeting

  • When: Thursday, January 30, 2014 @ 7:00pm
  • Where: 2169 Mission St. (Church classroom)
  • Topic: k-Nearest Neighbors and k-Means Clustering
  • Details:
  • Who: Mike S

Take the Noisebridge ML Survey

Take a survey and vote for what you want to learn!

Talks and Workshops

We've given lots of workshops and talks over the past year or so; here are a few. Many of them recur and will be given again, especially upon request!

Code and SourceForge Site

    git clone git://ml-noisebridge.git.sourceforge.net/gitroot/ml-noisebridge/ml-noisebridge
  • Send an email to the list if you want to become an administrator on the site to get write access to the git repo!

Future Talks and Topics, Ideas

  • Random Forests in R
  • Restricted Boltzmann Machines (Mike S, some day)
  • Analyzing brain cells (Mike S)
  • Deep Nets w/ Stacked Autoencoders (Mike S, some day)
  • Generalized Linear Models (Mike S, Erin L? some day)
  • Graphical Models
  • Working with the Kinect
  • Computer Vision with OpenCV

Projects

Datasets and Websites

Software Tools

Generic ML Libraries

Online ML

Graphical Models

  • BUGS
    • MCMC for Bayesian Models
  • JAGS
    • Hierarchical Bayesian Models
  • Stan
    • A graphical model compiler
  • Jayes
    • Bayesian networks in Java
  • ToPS
    • Probabilistic models of sequences

Text Stuff

Collaborative Filtering

  • PREA
    • Personalized Recommendation Algorithms Toolkit
  • SVDFeature
    • Collaborative Filtering and Ranking Toolkit

Computer Vision

  • OpenCV
    • Computer Vision Library
    • Has ML component (SVM, trees, etc)
    • Online tutorials here
  • DARWIN
    • Generic C++ ML and Computer Vision Library
  • PetaVision
    • Developing a real-time, full-scale model of the primate visual cortex.

Audio Processing

  • Friture
    • Real-time spectrogram generation
  • pyo
    • Real-time audio signal processing
  • PYMir
    • A library for reading MP3s into Python and analyzing them
  • PRAAT
    • Speech analysis toolkit
  • Sound Analysis Pro
    • Tool for analyzing animal sounds
  • Luscinia
    • Software for archiving, measuring, and analyzing bioacoustic data

Data Visualization

  • Orange
    • Strong data visualization component
  • Gephi
    • Graph Visualization
  • ggplot
    • Nice plotting package for R
  • MayaVi2
    • 3D Scientific Data Visualization
  • Cytoscape
    • A JavaScript graph library for analysis and visualisation
  • plot.ly
    • Web-based plotting

Cluster Computing

  • Mahout
    • Hadoop cluster-based ML package.
  • STAR: Cluster
    • Easily build your own Python computing cluster on Amazon EC2

Database Stuff

  • MADlib
    • Machine learning algorithms for in-database data
  • Manta
    • Distributed object storage

Neural Simulation

Other

Presentations and other Materials

Topics to Learn and Teach

NBML Course - Noisebridge Machine Learning Curriculum (work-in-progress)

CS229 - The Stanford Machine Learning Course @ Noisebridge

  • Supervised Learning
    • Linear Regression
    • Linear Discriminants
    • Neural Nets/Radial Basis Functions
    • Support Vector Machines
    • Classifier Combination [1]
    • A basic decision tree builder, recursive and using entropy metrics
  • Reinforcement Learning
    • Temporal Difference Learning
  • Math, Probability & Statistics
    • Metric spaces and what they mean
    • Fundamentals of probabilities
    • Decision Theory (Bayesian)
    • Maximum Likelihood
    • Bias/Variance Tradeoff, VC Dimension
    • Bagging, Bootstrap, Jackknife [2]
    • Information Theory: Entropy, Mutual Information, Gaussian Channels
    • Estimation of Misclassification [3]
    • No Free Lunch Theorem [4]
  • Machine Learning SDKs
    • OpenCV ML component (SVM, trees, etc)
    • Mahout, a Hadoop cluster-based ML package.
    • Weka, a collection of data mining tools and machine learning algorithms.
  • Applications
    • Collective Intelligence & Recommendation Engines

Meeting Notes