# Examples

Following the Cluster Setup guide, we use **5** machines to build the cluster. Each machine has **4** 3.4 GHz CPU processors and more than 8 GB of memory. To meet the memory requirement, one machine with a 3.4 GHz CPU and 16 GB of memory is chosen as the single-machine baseline for measuring the speedup ratio.

Two datasets, *ijcnn1* and *epsilon*, are used to evaluate the performance of PSVM. Their training set sizes range from 50k to 400k samples. The data files of these datasets are already in psvm's expected format, so you can download them and run the experiments directly, without any conversion.

After downloading and decompressing the datasets into $psvm_dir/data/$name, you can train and test by running the following commands:

(Notes: 1. The **-n** parameter sets the number of processes launched for execution, which should not exceed the total number of processors in your cluster; otherwise two processes will share a single processor, and training time will be bounded by that overloaded processor. Since we have **5** machines with **4** CPU processors each, we set **-n 20**. 2. We set **-rank_ratio** to approximately **1/sqrt(n)**, where n is the number of training samples.)
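As a quick sanity check, the 1/sqrt(n) rule of thumb reproduces the rank_ratio values used in the commands below (a minimal sketch; the function name is illustrative, not part of psvm):

```python
import math

# Rule of thumb from the note above: rank_ratio ~ 1 / sqrt(n),
# where n is the number of training samples.
def suggested_rank_ratio(n_samples):
    return 1.0 / math.sqrt(n_samples)

print(round(suggested_rank_ratio(49990), 4))   # ijcnn1: 0.0045
print(round(suggested_rank_ratio(400000), 4))  # epsilon: 0.0016
```

For ijcnn1 (49,990 samples) this gives 0.0045, exactly the value passed below; for epsilon (400k samples) it gives about 0.0016, which the command below rounds down to 0.001.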

```
mkdir -p models/ijcnn1
mpiexec -n 20 -f machinefile ./svm_train -rank_ratio 0.0045 -kernel_type 2 -hyper_parm 1 \
-gamma 0.01 -model_path models/ijcnn1/ data/ijcnn1/ijcnn1
```

```
mkdir -p models/epsilon
mpiexec -n 20 -f machinefile ./svm_train -rank_ratio 0.001 -kernel_type 2 -hyper_parm 1 \
-gamma 0.01 -model_path models/epsilon/ data/epsilon/epsilon_normalized
```

```
mpiexec -n 20 -f machinefile ./svm_predict -model_path models/ijcnn1/ data/ijcnn1/ijcnn1.t
```

```
mpiexec -n 20 -f machinefile ./svm_predict -model_path models/epsilon/ \
data/epsilon/epsilon_normalized.t
```

The running time, memory usage, and accuracy of PSVM are reported in the tables below. The performance of LIBSVM[1] is included for reference, with c=1 and g=0.01.

**ijcnn1** (training: 49,990 samples; testing: 91,701 samples)

algorithm (machines) | training time (s) | training speedup | training mem (MB) | testing time (s) | testing speedup | testing mem (MB) | accuracy
---|---|---|---|---|---|---|---
LIBSVM (1) | 45 | - | 117 | 34 | - | 2.3 | 0.9066
PSVM (1) | 68 | 1 | 124 | 36 | 1 | 23 | 0.9053
PSVM (5) | 6.8 | 10 | 22 | 4.1 | 9 | 52 | 0.9053

Note: we sorted the ijcnn1 data file so that all samples with label -1 come before all samples with label 1. With models trained on the original (unsorted) file, the prediction accuracy of PSVM(5) is 18% lower than that of PSVM(1): 74% vs. 92%. Sorting ensures that each processor receives a training subset with the same label distribution as the whole set; otherwise, mismatched distributions may affect the accuracy of parallel PSVM.
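If your copy of ijcnn1 is unsorted, a small script along these lines can produce the label-sorted file (a sketch; psvm itself does not ship this helper, and the function name and paths are placeholders):

```python
# Sketch: sort a LIBSVM-format data file by label, so that all samples
# labeled -1 precede all samples labeled +1. Illustrative helper only.
def sort_by_label(in_path, out_path):
    with open(in_path) as f:
        lines = [ln for ln in f if ln.strip()]
    # The label is the first whitespace-separated token on each line.
    lines.sort(key=lambda ln: float(ln.split(None, 1)[0]))
    with open(out_path, "w") as f:
        f.writelines(lines)
```

For example, `sort_by_label("data/ijcnn1/ijcnn1", "data/ijcnn1/ijcnn1.sorted")` would write a sorted copy you can then pass to svm_train.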

**epsilon** (training: 400k samples; testing: 100k samples)

algorithm (machines) | training time (s) | training speedup | training mem (MB) | testing time (s) | testing speedup | testing mem (MB) | accuracy
---|---|---|---|---|---|---|---
LIBSVM (1) | 225240 | - | 12385 | 50967 | - | 6914 | 0.8856
PSVM (1) | 3537 | 1 | 16384 | 73160 | 1 | 16384 | 0.8624
PSVM (5) | 653 | 5 | 4495 | 9981 | 7 | 2470 | 0.8627

Note: a larger rank_ratio yields a more accurate model at the cost of longer running time. For example, with rank_ratio = 0.005 the accuracy of PSVM(5) rises to 0.8855, with training/testing times of 5137s/7979s.