Benchmarks

All benchmarks reported here were run on an Intel i7-7820X CPU. GPU benchmarks were run on an NVIDIA A6000.

Spark Comparison

The benchmark_spark.py script compares the AlternatingLeastSquares model in implicit to the implementation in Spark MLlib.

To run this comparison, you should first compile Spark with native BLAS support.

This benchmark compares the Conjugate Gradient solver in implicit, on both the CPU and the GPU, to the Cholesky solver used in Spark.
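To make the difference between the two solver families concrete, here is a minimal NumPy sketch. It is not the code used by implicit or Spark. It solves one small symmetric positive definite system of the kind an ALS update produces, once exactly with a Cholesky factorization and once approximately with a few conjugate gradient steps. The matrix sizes, the regularization value, and the helper names `solve_cholesky` and `solve_cg` are assumptions made for illustration.

```python
import numpy as np


def solve_cholesky(A, b):
    """Solve A x = b exactly via a Cholesky factorization (A must be SPD)."""
    L = np.linalg.cholesky(A)          # A = L @ L.T
    y = np.linalg.solve(L, b)          # solve L y = b
    return np.linalg.solve(L.T, y)     # solve L.T x = y


def solve_cg(A, b, steps=3):
    """Approximate the solution of A x = b with a few conjugate gradient steps."""
    x = np.zeros_like(b)
    r = b - A @ x                      # residual
    p = r.copy()                       # search direction
    rs_old = r @ r
    for _ in range(steps):
        if rs_old < 1e-20:             # residual is already negligible
            break
        Ap = A @ p
        alpha = rs_old / (p @ Ap)      # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p  # next conjugate direction
        rs_old = rs_new
    return x


rng = np.random.default_rng(0)
Y = rng.normal(size=(50, 8))           # hypothetical item factors: 50 items, 8 factors
A = Y.T @ Y + 0.1 * np.eye(8)          # regularized Gram matrix (SPD)
b = rng.normal(size=8)

x_exact = solve_cholesky(A, b)
x_cg = solve_cg(A, b, steps=8)         # up to n steps for an n x n system
```

Cholesky factorization costs cubic time in the number of factors, while each conjugate gradient step only needs a matrix-vector product, which is one reason a small fixed number of CG steps can be much cheaper at 256 factors.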

The reported times per iteration are averages over 5 iterations.
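The averaging described above could be done along the following lines. This is only a sketch and not taken from benchmark_spark.py: `average_iteration_time` and `dummy_iteration` are hypothetical names, and the dummy workload stands in for one ALS training iteration.

```python
import time


def average_iteration_time(run_iteration, iterations=5):
    """Return the mean wall-clock seconds over `iterations` calls to `run_iteration`."""
    elapsed = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_iteration()
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)


def dummy_iteration():
    # Hypothetical stand-in for one ALS training iteration.
    sum(i * i for i in range(100_000))


avg_seconds = average_iteration_time(dummy_iteration, iterations=5)
print(f"average time per iteration: {avg_seconds:.4f}s")
```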

last.fm 360k dataset

For the last.fm dataset at 256 factors, implicit on the CPU was 30x faster than Spark, and the GPU version of implicit was 260x faster than Spark:

[Figure: last.fm ALS training time]

MovieLens 20M dataset

For the ml20m dataset at 256 factors, implicit on the CPU was 23x faster than Spark while the GPU version was 180x faster than Spark:

[Figure: MovieLens 20M ALS training time]

Note that for all versions this dataset was filtered down to positive ratings (4+ stars) to simulate a truly implicit dataset.
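A filter of this kind might look like the sketch below. The arrays are made-up sample data rather than MovieLens, and replacing the surviving scores with a constant weight of 1 is an assumption about how the filtered data is treated, not something the benchmark description states.

```python
import numpy as np

# Hypothetical (user, item, rating) triples standing in for MovieLens ratings.
users = np.array([0, 0, 1, 2, 2, 3])
items = np.array([10, 11, 10, 12, 13, 11])
ratings = np.array([5.0, 3.0, 4.0, 2.5, 4.5, 1.0])

positive = ratings >= 4.0              # keep only 4+ star ratings

users_kept = users[positive]
items_kept = items[positive]

# Assumption: treat every surviving rating as a plain interaction and drop the score.
confidence = np.ones(int(positive.sum()), dtype=np.float32)
```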