Machine Learning Meetup Notes: 2010-06-30

From Noisebridge
Revision as of 22:18, 30 June 2010 by SpammerHellDontDelete (Talk | contribs)

amino acids build proteins; there are 20 standard amino acids

a protein has an amino acid sequence (three bases encode one amino acid)

DNA is comprised of 4 bases: A, T, C, G

RNA is comprised of 4 bases: A, U, C, G

A pairs with T (U in RNA), C pairs with G
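The pairing rule can be written as a lookup table. This is only a minimal sketch; the function names `complement`, `reverse_complement`, and `transcribe` are made up for illustration and are not from the meetup.

```python
# Base-pairing table for DNA: A <-> T, C <-> G.
DNA_COMPLEMENT = str.maketrans("ATCG", "TAGC")

def complement(dna):
    """Return the base-by-base complement of a DNA string."""
    return dna.upper().translate(DNA_COMPLEMENT)

def reverse_complement(dna):
    """Return the complementary strand read in the opposite direction."""
    return complement(dna)[::-1]

def transcribe(dna):
    """Rewrite a DNA string in the RNA alphabet (T becomes U)."""
    return dna.upper().replace("T", "U")

print(complement("ATCG"))
print(reverse_complement("ATCG"))
print(transcribe("ATCG"))
```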

every three bases is a codon

Dave wrote a script that takes the codons and maps them to their amino acids
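A codon-to-amino-acid mapping along these lines might look like the sketch below. This is not Dave's script, just an illustration using the standard genetic code with one-letter amino acid codes (`*` marks a stop codon). The name `translate` and the choice to return `X` for unrecognized codons are assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(sequence):
    """Map each complete codon in a DNA or RNA string to its amino acid.

    Trailing bases that do not fill a codon are ignored; codons with
    unrecognized letters become 'X' (an assumption of this sketch).
    """
    dna = sequence.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )

print(translate("ATGGCCTAA"))
```

A 297-base protease sequence passed to `translate` would come back as 99 letters, one per codon.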

protease - a type of protein (an enzyme) that cleaves other proteins

reverse transcriptase - takes viral RNA and transcribes it into DNA; the resulting (bad) mRNA is sent to the ribosomes. the viruses replicate very fast in your immune cells, and that's how they kill them

99 amino acids in protease (99 x 3 = 297 DNA bases)

reverse transcriptase is not predictable in length - each sequence is a different length

Possible Features:
- for the OR acids, perhaps create all possible combinations and weight them by 1/(number of combinations); normal rows get weight = 1
- find most probable sequences
- correlating permutations
- molecular weight/length
- acidity/charge
- edit distance (differences between the sequences), use to cluster
- list of known resistant mvt sites
- find out which sites are most variable
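Two of these features are easy to sketch: expanding the OR positions into weighted rows, and edit distance between sequences. The sketch assumes an "OR acid" means a position where the data lists more than one possible amino acid, and it represents a sequence as a list of strings with one string of candidate letters per position. Both the representation and the names `expand_ambiguous` and `edit_distance` are hypothetical. The edit distance shown is plain Levenshtein distance, which is one possible choice.

```python
from itertools import product

def expand_ambiguous(positions):
    """Expand OR positions into every concrete sequence.

    positions: list of strings, each holding the candidate amino acids
    for one site, e.g. ["P", "QK", "I"] means site 2 is Q or K.
    Returns (sequence, weight) pairs with weight = 1 / number of
    combinations, so an unambiguous row keeps weight 1.
    """
    combos = ["".join(choice) for choice in product(*positions)]
    weight = 1.0 / len(combos)
    return [(seq, weight) for seq in combos]

def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute or match
            ))
        prev = curr
    return prev[-1]

print(expand_ambiguous(["P", "QK", "I"]))
print(edit_distance("PQI", "PKI"))
```

A pairwise matrix of `edit_distance` values could then be handed to a clustering routine. Note that the number of combinations grows multiplicatively with each OR position, so rows with many ambiguous sites may need a cap.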


for each site, look at the frequency of each amino acid; these could be put into a tree classifier
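A per-site frequency count could be sketched as follows, assuming the sequences are already the same length and aligned (as with the 99-residue protease). Shannon entropy is used here as the measure of how variable a site is. That choice, and the function names, are assumptions rather than anything decided at the meetup.

```python
from collections import Counter
import math

def site_frequencies(sequences):
    """For each site, return {amino_acid: fraction of sequences}."""
    freqs = []
    for column in zip(*sequences):  # one column per site
        counts = Counter(column)
        total = len(column)
        freqs.append({aa: n / total for aa, n in counts.items()})
    return freqs

def site_entropy(freq):
    """Shannon entropy in bits; 0 means the site never varies."""
    return -sum(p * math.log2(p) for p in freq.values())

def most_variable_sites(sequences, top=5):
    """Return 0-based site indices sorted from most to least variable."""
    entropies = [site_entropy(f) for f in site_frequencies(sequences)]
    order = sorted(range(len(entropies)),
                   key=lambda i: entropies[i], reverse=True)
    return order[:top]

toy = ["PQI", "PKI", "PQV", "PKL"]  # made-up toy sequences
print(site_frequencies(toy))
print(most_variable_sites(toy, top=2))
```

The amino acids at the most variable sites could then serve as categorical inputs to a tree classifier.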
