Machine Learning/SqliteImport

From Noisebridge
Revision as of 18:19, 24 May 2010 by SpammerHellDontDelete (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

How to load data into SQLITE

Sqlite is the most widely deployed Database in the world. Its fast and great for prototyping.

In bash, create a new sqlite DB by invoking sqlite with the db file that you wish (file does not exist -> will create new one):

sqlite3 dbKdd.dbev

Create new tables: (these tables are examples for KDD competition data)

CREATE TABLE 'bridge' (
row INT, 
studentid VARCHAR(30), 
problemhierarchy TEXT,
problemname TEXT,
problemview INT,
stepname TEXT,
stepstarttime VARCHAR(30), 
firsttransactiontime VARCHAR(30), 
correcttransactiontime VARCHAR(30), 
stependtime VARCHAR(30), 
stepduration INT,
correctstepduration INT,
errorstepduration INT,
cfa INT,
incorrects INT,
hints INT,
corrects INT,
kcsub TEXT,
opportunitysub TEXT,
kctraced TEXT,
opportunitytraced TEXT
);

Then import the data by typing (note we're only loading test data since the training data file would be too big):

.mode tabs
.import "./bridge_to_algebra_2008_2009_test.txt" bridge

Then you can query the data with the standard SQL commands, for example:

select count(distinct studentid) from bridge;


Analogously to above, here's the table for the algebra data set:

CREATE TABLE 'algebra' (
row INT, 
studentid VARCHAR(30), 
problemhierarchy TEXT,
problemname TEXT,
problemview INT,
stepname TEXT,
stepstarttime VARCHAR(30), 
firsttransactiontime VARCHAR(30), 
correcttransactiontime VARCHAR(30), 
stependtime VARCHAR(30), 
stepduration INT,
correctstepduration INT,
errorstepduration INT,
cfa INT,
incorrects INT,
hints INT,
corrects INT,
kcsub TEXT,
opportunitysub TEXT,
kctraced TEXT,
opportunitytraced TEXT,
kcrules TEXT,
opportunityrules TEXT
);

Import it like this:

.mode tabs
.import ./algebra_2008_2009_test.txt algebra

To exit sqlite type:

.q

Ideally, we'd clean up the data/columns a bit more before importing... but this is a quick 'n dirty tutorial.


Andy