It generates the .out file for each feature and the data_training files in .csv format.
Description
phone_dataset.csv
The source of the raw data, originally taken from here (credit to the owner). The data uses a comma (,) as its delimiter, but some values were shifted at N821, N6060 and O6663 (download the original to see it). We fixed these manually by shifting the values back to the right position, and for the same reason removed an extra newline at the end of the file.
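One way to spot shifted rows in a comma-delimited file is to compare each record's field count against the header's. The sketch below is only an illustration of that idea, not the procedure used for phone_dataset.csv (the fix there was manual). The function name `shiftedRows` and the sample data are hypothetical, and it assumes a shift changes the number of fields in a row, which may not hold for every case.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// shiftedRows returns the 1-based record numbers whose field count differs
// from the header's (the first record). Hypothetical helper for illustration.
func shiftedRows(records [][]string) []int {
	if len(records) == 0 {
		return nil
	}
	want := len(records[0])
	var bad []int
	for i, rec := range records[1:] {
		if len(rec) != want {
			bad = append(bad, i+2) // +2: skip the header and switch to 1-based
		}
	}
	return bad
}

func main() {
	// Hypothetical sample, not taken from phone_dataset.csv.
	sample := "brand,model,battery\nA,one,3000\nB,two,extra,2500\nC,three,4000\n"

	r := csv.NewReader(strings.NewReader(sample))
	r.FieldsPerRecord = -1 // accept rows of any length so they can be inspected
	records, err := r.ReadAll()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(shiftedRows(records)) // prints [3]
}
```

Record numbers match line numbers only when no quoted field spans several lines.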
data_training_1.csv
Result of cleaning data without normalization.
data_training_2.csv
Result of cleaning data with normalization.
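The README does not say which normalization is applied, so the following is only a sketch under the assumption of min-max scaling to the range [0, 1]. The function name `minMaxNormalize` and the sample values are hypothetical and not taken from data_training.go.

```go
package main

import "fmt"

// minMaxNormalize rescales values linearly so the smallest becomes 0 and the
// largest becomes 1. A constant or empty column yields all zeros.
// Assumption: min-max scaling; the actual method may differ.
func minMaxNormalize(values []float64) []float64 {
	out := make([]float64, len(values))
	if len(values) == 0 {
		return out
	}
	lo, hi := values[0], values[0]
	for _, v := range values[1:] {
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}
	if hi == lo {
		return out // avoid dividing by zero
	}
	for i, v := range values {
		out[i] = (v - lo) / (hi - lo)
	}
	return out
}

func main() {
	battery := []float64{2000, 3000, 4000} // hypothetical values
	fmt.Println(minMaxNormalize(battery))  // prints [0 0.5 1]
}
```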
battery, cpu, dimension, etc.
Folders containing the features extracted from the raw data for training. Please see the scratch to understand which features are used. Each folder contains 3 files:
name.go, the script that cleans the data.
This script processes name.in and produces name.out.
name.in, the input data, taken from the raw data.
In some cases the data was partly cleaned by hand before being passed to name.go, because it was too complex. Where that happened, it is noted in name.go along with an example.
name.out, the output data.
Produced purely from name.in as processed by name.go.
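The name.in to name.go to name.out flow described above could look roughly like the sketch below. It is not any of the actual scripts: the cleaning rule (keep the first run of digits on each line), the names `cleanLine` and `process`, and the sample input are all hypothetical. A real script would presumably open name.in with `os.Open` and create name.out with `os.Create` instead of using in-memory data.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"regexp"
	"strings"
)

var firstNumber = regexp.MustCompile(`\d+`)

// cleanLine applies a hypothetical cleaning rule: keep the first run of
// digits in the line, or return an empty string if there is none.
func cleanLine(line string) string {
	return firstNumber.FindString(line)
}

// process reads the input line by line and writes one cleaned line per
// input line, so the output stays aligned with the input.
func process(in io.Reader, out io.Writer) error {
	sc := bufio.NewScanner(in)
	w := bufio.NewWriter(out)
	for sc.Scan() {
		if _, err := fmt.Fprintln(w, cleanLine(sc.Text())); err != nil {
			return err
		}
	}
	if err := sc.Err(); err != nil {
		return err
	}
	return w.Flush()
}

func main() {
	// In-memory stand-in for a name.in file; the values are made up.
	in := strings.NewReader("Li-Ion 3000 mAh\nn/a\nLi-Po 4100 mAh battery\n")
	if err := process(in, os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
	}
}
```

Writing exactly one output line per input line keeps row positions aligned across features, which makes combining the .out files later straightforward.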
data_training.go
Main file to clean, combine, and normalize data.