
A Discussion With Pure Storage’s Brian Gold on Big Data Analytics for Apache Spark

White Paper Published By: Pure Storage
Published:  Jul 03, 2019
Type:  White Paper
Length:  8 pages

Apache® Spark™ has become a vital technology for development teams that want an ultrafast in-memory data engine for big data analytics. Spark is a flexible open-source platform that lets developers write applications in Java, Scala, Python, or R. With Spark, development teams can accelerate analytics applications by orders of magnitude.

The rapid growth of Spark has not been without challenges. Most organizations have relied on sprawling Hadoop Distributed File System (HDFS) deployments, with racks of spinning disks, to meet the capacity and performance demands of data-intensive applications. That is about to change, however.

Pure Storage, a pioneer in block-based flash arrays, has developed a technology called FlashBlade, designed specifically for file and object storage environments. With FlashBlade, IT teams now have a simple-to-manage shared storage solution that delivers the performance and capacity needed to bring Spark deployments on premises.


