Apache Spark is a fast cluster computing system that supports interactive SQL queries, machine learning, and graph computation, all through the Spark API. The Apache Spark Python Library lets developers quickly write Python programs that use a unified engine to process large amounts of data. Supported by the Apache Software Foundation, the Python library is well documented.
Quantified Self, "a collaboration of users and tool makers who share an interest in self knowledge through self-tracking," depends on tapping into massive amounts of data for its very existence. Members meet in an online community and in person, through small groups and a larger conference. The goal of any meeting is to learn, develop, and refine methods for collecting personal data, in the hope of generating an effective response to that data. At Quantified Self forums around the world, "[i]ndividuals tell their stories of using data to better understand who they are and how they interact with the world around them," says Singly.