Command-line flags
 Training (svm_train) flags

kernel_type
: 0: normalized linear
 1: normalized polynomial
 2: RBF Gaussian
 3: Laplacian

rank_ratio
: approximation ratio between 0 and 1. 
hyper_parm
: the C (penalty) parameter of the SVM. This is the same as the libsvm "c" parameter.
gamma
: the gamma value, used with the RBF kernel. This is the same as the libsvm "g" parameter.
poly_degree
: the degree, used with the normalized polynomial kernel. This is the same as the libsvm "d" parameter.
model_path
: the location to save the training model and checkpoints to. Be sure that this path is EMPTY before training a new model: svm_train will interpret any of its checkpoints left in this directory as checkpoints for the current model. 
failsafe
: If failsafe is set to true, the program periodically writes checkpoints to model_path and, if it fails, restarts from the last checkpoint on the next run.
save_interval
: the checkpoint interval used by failsafe. Every save_interval seconds, the program writes a checkpoint. If PSVM fails (for example, because a machine goes down), it restarts from the last checkpoint on the next execution.
surrogate_gap_threshold, feasible_threshold, max_iteration
: Because PSVM uses an Interior Point Method, training takes many iterations. Iteration stops when ((surrogate gap < surrogate_gap_threshold and primal residual < feasible_threshold and dual residual < feasible_threshold) or iterations > max_iteration). The default values handle most cases.
positive_weight, negative_weight
: For unbalanced data, set a weight greater than one on one of the classes. For example, if there are 100 positive samples and 10 negative samples, setting negative_weight to 10 is suggested.

Others: run svm_train without arguments to get a description of each parameter. The remaining flags are rarely needed unless you are familiar with the details of the algorithm.
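The stopping rule and the class-weight suggestion described in the flags above can be written out as a small Python sketch. This is illustrative only and not part of PSVM: the function names `should_stop` and `suggested_weights` are hypothetical, the threshold values used when calling them are up to you (the defaults are not listed here), and the weighting rule simply generalizes the 100-versus-10 example to "weight the smaller class by the count ratio", which is an assumption.

```python
def should_stop(surrogate_gap, primal_residual, dual_residual, iteration,
                surrogate_gap_threshold, feasible_threshold, max_iteration):
    """Evaluate the stopping condition as stated in the flag description.

    Stops when the surrogate gap and both residuals are below their
    thresholds, or when the iteration count exceeds max_iteration.
    """
    converged = (surrogate_gap < surrogate_gap_threshold
                 and primal_residual < feasible_threshold
                 and dual_residual < feasible_threshold)
    return converged or iteration > max_iteration


def suggested_weights(num_positive, num_negative):
    """Return (positive_weight, negative_weight) for unbalanced data.

    Assumption: the smaller class gets weight majority/minority and the
    larger class keeps weight 1, matching the 100-vs-10 example.
    """
    if num_positive <= 0 or num_negative <= 0:
        raise ValueError("both classes need at least one sample")
    if num_positive >= num_negative:
        return 1.0, num_positive / num_negative
    return num_negative / num_positive, 1.0
```

For the example in the text, `suggested_weights(100, 10)` gives a positive weight of 1 and a negative weight of 10.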

 Predicting (svm_predict) flags

model_path
: the path of the model used for prediction.
output_path
: where to output PredictResult

 Selection of rank_ratio

rank_ratio decides the reduced matrix dimension p (p = n*rank_ratio). Higher values yield higher accuracy at the cost of increased memory usage and training time. If you are unsure what value to use, we recommend a value of 1/sqrt(n), where n is the number of training samples. You should also consider the number of machines and the amount of memory available when setting rank_ratio.
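As a quick way to see what the recommendation means in practice, here is a minimal Python sketch that computes the suggested rank_ratio and the resulting reduced dimension p for a given number of training samples. It is not part of PSVM: the function names are hypothetical, and rounding p to the nearest integer is an assumption, since the text only gives p = n*rank_ratio.

```python
import math


def recommended_rank_ratio(n):
    """Return the suggested rank_ratio, 1/sqrt(n), for n training samples."""
    if n <= 0:
        raise ValueError("n must be a positive number of training samples")
    return 1.0 / math.sqrt(n)


def reduced_dimension(n, rank_ratio):
    """Return p = n * rank_ratio as an integer.

    Assumption: p is rounded to the nearest integer and is at least 1;
    PSVM itself may round differently.
    """
    if not 0.0 < rank_ratio <= 1.0:
        raise ValueError("rank_ratio must be in (0, 1]")
    return max(1, round(n * rank_ratio))


if __name__ == "__main__":
    n = 10000
    ratio = recommended_rank_ratio(n)
    print(n, ratio, reduced_dimension(n, ratio))
```

With the recommended value, p works out to roughly sqrt(n), so for 10,000 samples the ratio is 0.01 and p is 100.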