Hands-On Big Data Analytics with PySpark

Analyze large datasets and discover techniques for testing, immunizing, and parallelizing Spark jobs

Free shipping!
Ready to ship in 1-2 weeks
€28.99
incl. VAT
Use PySpark to easily crush messy data at scale and discover proven techniques to create testable, immutable, and easily parallelizable Spark jobs.

Key Features:
- Work with large amounts of agile data using distributed datasets and in-memory caching
- Source data from all popular data hosting platforms, such as HDFS, Hive, JSON, and S3
- Employ the easy-to-use PySpark API to deploy big data analytics for production

Book Description:
Apache Spark is an open source parallel-processing framework that has been around for quite some time now. One of the many uses of Apache Spark is for data analyti...