Apache Spark is a fast cluster computing system that supports interactive SQL queries, machine learning, and graph computation, all through the Spark API. The Apache Spark Python library (PySpark) lets developers quickly write Python programs that use this unified engine to process large amounts of data. The library is backed by the Apache Software Foundation and is well documented.
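To give a feel for what a Python program against the Spark engine can look like, here is a minimal word-count sketch. It assumes PySpark and a Java runtime are installed and runs Spark in local mode rather than on a cluster. The sample sentences, the `count_words` helper, and the application name are hypothetical and chosen only for illustration.

```python
# Hypothetical sample data; a real job would typically read from files or a data store.
SAMPLE_LINES = [
    "spark makes big data simple",
    "python makes spark simple",
    "big data needs a unified engine",
]


def count_words(lines):
    """Count word occurrences with Spark and return a dict of word -> count."""
    # Imported inside the function so this module still loads where PySpark is absent.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Assumption: local mode using all available cores; a cluster would use another master URL.
    spark = (
        SparkSession.builder
        .master("local[*]")
        .appName("word-count-sketch")
        .getOrCreate()
    )
    try:
        # One row per input line, in a single column named "line".
        df = spark.createDataFrame([(line,) for line in lines], ["line"])
        # Split each line on spaces and emit one row per word.
        words = df.select(F.explode(F.split(F.col("line"), " ")).alias("word"))
        # Aggregate on the engine, then bring the small result back to the driver.
        rows = words.groupBy("word").count().collect()
        return {row["word"]: row["count"] for row in rows}
    finally:
        spark.stop()
```

Calling `count_words(SAMPLE_LINES)` returns an ordinary Python dictionary of word counts. The same DataFrame calls would apply to much larger inputs, with Spark distributing the work across the cluster.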
This article is part of a 10-part series about interesting APIs that were added to our directory during 2016. This segment covers Big Data and Data Analytics APIs. The APIs were selected based on our researchers' picks, popularity as measured by website traffic, and mentions on social media.