Machine Learning
Mschachter
Revision as of 19:48, 21 June 2013
Next Meeting
- When:
- Where: 2169 Mission St. (back NE corner, Church classroom)
- Topic:
- Details: Currently on hiatus until somebody decides to pick it back up!
- Who:
Take the Noisebridge ML Survey
Take a survey and vote for what you want to learn!
Crowdsourced Q&A
Are you working on a data mining, machine learning, or statistics problem? Do you want some help? Consider sending an email to the mailing list about it! Also consider setting up a day to come in and talk about the project you're working on and get input from other ML people.
About Us
We're a loosely-knit stochastic federation of people who like Noisebridge and like machine learning. What is machine learning? It's a broad field that typically involves training computer models to solve problems. How can you participate? Join the mailing list, send an email and introduce yourself. Show up to the next meeting, share your thoughts. Participate in projects or start your own. Go to workshops, write code at workshops, learn stuff, give workshops of your own! All are welcome.
Talks and Workshops
We've given lots of workshops and talks over the past year or so; here are a few. Many of the workshops we've given previously are recurring and will be given again, especially upon request!
- Intro to Machine Learning
- A Brief Tour of Statistics
- Generalized Linear Models
- Neural Nets Workshop
- Support Vector Machines
- Random Forests
- Independent Components Analysis
- Deep Nets
Code and SourceForge Site
- We have a Sourceforge Project
- We have a git repository on the project page, accessible as:
git clone git://ml-noisebridge.git.sourceforge.net/gitroot/ml-noisebridge/ml-noisebridge
- Send an email to the list if you want to become an administrator on the site to get write access to the git repo!
Future Talks and Topics, Ideas
- Random Forests in R
- Restricted Boltzmann Machines (Mike S, some day)
- Analyzing brain cells (Mike S)
- Deep Nets w/ Stacked Autoencoders (Mike S, some day)
- Generalized Linear Models (Mike S, Erin L? some day)
- Graphical Models
- Working with the Kinect
- Computer Vision with OpenCV
Mailing List
https://www.noisebridge.net/mailman/listinfo/ml
Projects
- Small Group Subproblems
- Fundraising
- Noisebridge Machine Learning Course
- Kaggle Social Network Contest
- KDD Competition 2010
- HIV
Datasets and Websites
- UCI Machine Learning Repository
- DataSF.org
- Infochimps
- Face Recognition Databases
- Time Series Data Library
- Data Q&A Forum
- Metaoptimize
- Quora ML Page
- A ton of Weather Data
- MLcomp
- Upload your algorithm and objectively compare its performance to other algorithms
- Social Security Death Master File!
Software Tools
Generic ML Libraries
- Weka
- a collection of data mining tools and machine learning algorithms.
- scikits.learn
- Machine learning Python package
- scikits.statsmodels
- Statistical models to go with scipy
- PyBrain
- Does feedforward, recurrent, SOM, deep belief nets.
- LIBSVM
- C-based SVM package
- PyML
- MDP
- Modular framework, has lots of stuff!
- VirtualBox Virtual Box Image with Pre-installed Libraries listed here
- Theano: Symbolic Expressions and Transparent GPU Integration
- sympy Does symbolic math
- Waffles
- Open source C++ set of machine learning command line tools.
- RapidMiner
- Mobile Robot Programming Toolkit
- nitime
- NeuroImaging in Python, has some good time series analysis stuff and multi-variate response fitting.
- Pandas
- Data analysis workflow in python
- PyTables
- Adds querying capabilities to HDF5 files
- statsmodels
- Regression, time series analysis, statistics stuff for python
- Vowpal Wabbit
- "Intrinsically Fast" implementation of gradient descent for large datasets
- Shogun
- Fast implementations of SVMs
- MLPACK
- High performance scalable ML Library
- Torch
- MATLAB-like environment for state-of-the-art ML libraries written in Lua
Online ML
- MOA (Massive Online Analysis)
- Offshoot of Weka, has all online algorithms
- Jubatus
- Distributed Online ML
- DOGMA
- MATLAB-based online learning stuff
- libol
- oll
- scw-learning
Graphical Models
- BUGS
- MCMC for Bayesian Models
- JAGS
- Hierarchical Bayesian Models
- Stan
- A graphical model compiler
Text Stuff
- Beautiful Soup
- Screen-scraping tools
- SALLY
- Tool for embedding strings into vector spaces
Collaborative Filtering
- PREA
- Personalized Recommendation Algorithms Toolkit
- SVDFeature
- Collaborative Filtering and Ranking Toolkit
Computer Vision
- OpenCV
- Computer Vision Library
- Has ML component (SVM, trees, etc)
- Online tutorials here
- DARWIN
- Generic C++ ML and Computer Vision Library
Audio Processing
- Friture
- Real-time spectrogram generation
- pyo
- Real-time audio signal processing
- PYMir
- A library for reading MP3s into Python and doing analysis
- PRAAT
- Speech analysis toolkit
- Sound Analysis Pro
- Tool for analyzing animal sounds
- List of Sound Tools for Python
Data Visualization
- Orange
- Strong data visualization component
- Gephi
- Graph Visualization
- ggplot
- Nice plotting package for R
- MayaVi2
- 3D Scientific Data Visualization
- Cytoscape
- A JavaScript graph library for analysis and visualisation
- plot.ly
- Web-based plotting
Cluster Computing
- Mahout
- Hadoop cluster based ML package.
- STAR: Cluster
- Easily build your own Python computing cluster on Amazon EC2
Other
Presentations and other Materials
- Awesome Machine Learning Applications -- A list of cool applications of ML
- Hands-on Machine Learning, a presentation jbm gave on 2009-01-07.
- Stanford Machine Learning online course videos (http://www.youtube.com/user/StanfordUniversity#g/c/A89DCFA6ADACE599)
- Media:Brief_statistics_slides.pdf, a presentation given on statistics for the machine learning group
- LinkedIn discussion on good resources for data mining and predictive analytics
- Face Recognition Algorithms
- Max Welling's ML classnotes
Topics to Learn and Teach
NBML Course - Noisebridge Machine Learning Curriculum (work-in-progress)
CS229 - The Stanford Machine Learning Course @ Noisebridge
- Supervised Learning
- Linear Regression
- Linear Discriminants
- Neural Nets/Radial Basis Functions
- Support Vector Machines
- Classifier Combination [1]
- A basic decision tree builder, recursive and using entropy metrics
- Unsupervised Learning
- Hidden Markov Models
- Clustering: PCA, k-Means, Expectation-Maximization
- Graphical Modeling
- Generative Models: Gaussian distribution, multinomial distributions, HMMs, Naive Bayes
- Deep Belief Networks & Restricted Boltzmann Machines
- Reinforcement Learning
- Temporal Difference Learning
- Math, Probability & Statistics
- Metric spaces and what they mean
- Fundamentals of probabilities
- Decision Theory (Bayesian)
- Maximum Likelihood
- Bias/Variance Tradeoff, VC Dimension
- Bagging, Bootstrap, Jackknife [2]
- Information Theory: Entropy, Mutual Information, Gaussian Channels
- Estimation of Misclassification [3]
- No Free Lunch Theorem [4]
- Machine Learning SDKs
- Applications
- Collective Intelligence & Recommendation Engines
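
The "basic decision tree builder, recursive and using entropy metrics" listed under Supervised Learning can be sketched in a few lines of plain Python. This is only a rough illustration, not code from the group's repository: the function names (entropy, best_split, build_tree, predict), the tuple-based tree layout, and the toy weather data are all made up for the example. It assumes categorical features and string labels, and predict will raise a KeyError on a feature value it never saw during training.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def best_split(rows, labels):
    """Return the feature index with the highest information gain, or None."""
    base = entropy(labels)
    best_gain, best_feature = 0.0, None
    for feature in range(len(rows[0])):
        # Group the labels by the value this feature takes in each row.
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        # Weighted entropy that remains after splitting on this feature.
        remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
        gain = base - remainder
        if gain > best_gain:
            best_gain, best_feature = gain, feature
    return best_feature


def build_tree(rows, labels):
    """Recursively build a tree.

    A leaf is a label; an internal node is (feature_index, {value: subtree}).
    (Hypothetical layout chosen for this sketch; labels must not be tuples.)
    """
    if len(set(labels)) == 1:
        return labels[0]
    feature = best_split(rows, labels)
    if feature is None:
        # No split reduces entropy: fall back to the majority label.
        return Counter(labels).most_common(1)[0][0]
    branches = {}
    for value in set(row[feature] for row in rows):
        subset = [(r, l) for r, l in zip(rows, labels) if r[feature] == value]
        sub_rows, sub_labels = zip(*subset)
        branches[value] = build_tree(list(sub_rows), list(sub_labels))
    return (feature, branches)


def predict(tree, row):
    """Walk from the root to a leaf, following the row's feature values."""
    while isinstance(tree, tuple):
        feature, branches = tree
        tree = branches[row[feature]]
    return tree


# Toy data (made up): features are (outlook, temperature), label is "play?".
rows = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "hot"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
tree = build_tree(rows, labels)
print(tree)
print(predict(tree, ("rainy", "hot")))
```

On the toy data only the first feature carries any information, so the builder splits on it once and stops. Real datasets need handling for numeric features, unseen values, and overfitting (depth limits or pruning), which libraries such as Weka or scikits.learn listed above provide.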