Intel has open-sourced BigDL, a distributed deep learning library that runs on Apache Spark. It uses existing Spark clusters to run deep learning computations and simplifies loading data from large datasets stored in Hadoop.
Tests show a significant speedup on Xeon servers compared with the open source frameworks Caffe, Torch, or TensorFlow. The speed is comparable to that of a mainstream GPU, and BigDL can scale to tens of Xeon servers.
The BigDL library supports Spark versions 1.5, 1.6, and 2.0 and allows deep learning to be embedded in existing Spark-based programs. It contains methods to convert Spark RDDs to a BigDL DataSet and can be used directly with Spark ML Pipelines.
For model training, BigDL applies synchronous mini-batch SGD (Stochastic Gradient Descent), executed in a single Spark job across multiple executors.
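To make the idea of synchronous mini-batch SGD concrete, here is a minimal single-process sketch in plain Python. It is not BigDL's API or implementation: the "workers" are ordinary function calls standing in for Spark executors, the model is a toy linear fit, and every name (`shard`, `local_gradient`, `sync_sgd_step`, `train`) is hypothetical. It only illustrates the pattern: each worker computes a gradient on its slice of the mini-batch, all gradients are combined, and one shared update is applied before the next step begins.

```python
import random


def shard(batch, num_workers):
    """Split one mini-batch into num_workers slices (stand-ins for executors)."""
    return [batch[i::num_workers] for i in range(num_workers)]


def local_gradient(weights, samples):
    """Summed squared-error gradient for y ~ w*x + b over one worker's slice."""
    w, b = weights
    grad_w = grad_b = 0.0
    for x, y in samples:
        err = (w * x + b) - y
        grad_w += 2.0 * err * x
        grad_b += 2.0 * err
    return grad_w, grad_b, len(samples)


def sync_sgd_step(weights, batch, num_workers, lr):
    """One synchronous step: gather every worker's gradient, then update once."""
    results = [local_gradient(weights, s) for s in shard(batch, num_workers)]
    n = sum(r[2] for r in results)          # total samples in the mini-batch
    grad_w = sum(r[0] for r in results) / n  # average over the whole batch
    grad_b = sum(r[1] for r in results) / n
    w, b = weights
    return (w - lr * grad_w, b - lr * grad_b)


def train(data, num_workers=4, batch_size=16, lr=0.05, epochs=200, seed=0):
    """Run mini-batch SGD, reshuffling the data at the start of each epoch."""
    rng = random.Random(seed)
    data = list(data)
    weights = (0.0, 0.0)
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            weights = sync_sgd_step(weights, batch, num_workers, lr)
    return weights


if __name__ == "__main__":
    # Noise-free toy data drawn from y = 3x + 1.
    points = [(i / 32 - 1, 3 * (i / 32 - 1) + 1) for i in range(64)]
    print(train(points))
```

Because the gradients are averaged over the whole mini-batch before the update, the result of a step does not depend on how many workers the batch was split across, apart from floating-point rounding. In a real cluster the gather step is where the communication cost lies.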