€51.99
incl. VAT
  • Paperback

Product description
Fast data ingestion, serving, and analytics in the Hadoop ecosystem have forced developers and architects to choose solutions using the least common denominator: either fast analytics at the cost of slow data ingestion or fast data ingestion at the cost of slow analytics. There is an answer to this problem. With the Apache Kudu column-oriented data store, you can easily perform fast analytics on fast data. This practical guide shows you how. Begun as an internal project at Cloudera, Kudu is an open source solution compatible with many data processing frameworks in the Hadoop environment. In this book, current and former solutions professionals from Cloudera provide use cases, examples, best practices, and sample code to help you get up to speed with Kudu.
  • Explore Kudu's high-level design, including how it spreads data across servers
  • Fully administer a Kudu cluster, enable security, and add or remove nodes
  • Learn Kudu's client-side APIs, including how to integrate Apache Impala, Spark, and other frameworks for data manipulation
  • Examine Kudu's schema design, including basic concepts and primitives necessary to make your project successful
  • Explore case studies for using Kudu for real-time IoT analytics, predictive modeling, and in combination with another storage engine
About the authors
Jean-Marc Spaggiari, an early adopter of Kudu, works as a Principal Solutions Architect for Cloudera, supporting Hadoop, Kudu, HBase, and other tools through technical support and consulting work. His deep knowledge of HBase and HDFS allows him to better understand Kudu and its applications. Jean-Marc's primary role is to support HBase users with their cluster deployments, upgrades, configuration, and optimization, as well as with HBase-related application development. He is also a very active HBase community member, testing every release for performance and stability. With Kudu poised to gain market share quickly, he will also begin recommending it, building demo applications, and deploying proofs of concept around it. Prior to Cloudera, Jean-Marc worked as a Project Manager and as a Solutions Architect for CGI and insurance companies. He has almost 20 years of Java development experience. In addition to regularly attending Strata+Hadoop World and HBaseCon, he has spoken at various Hadoop User Group meetings and many conferences in North America, usually giving HBase-related presentations and demonstrations. Jean-Marc is also the author of Architecting HBase Applications (O'Reilly).
Mladen Kovacevic comes from a development background in RDBMS technology and sees Kudu as a game changer in the Hadoop ecosystem. He has presented Kudu at several local meetups and presented on the state of Spark on Kudu during its beta, while providing feedback early enough to ensure that Spark with Kudu would be a first-class citizen at launch. He is a contributor to the Apache Kudu and Kite SDK projects, and works as a Solutions Architect at Cloudera. Mladen's experience includes years of RDBMS engine development, systems optimization, performance, and architecture, including optimizing Hadoop on the Power 8 platform while developing IBM's Big SQL technology.
Brock Noland followed Kudu months before the first line of code was written, by following Todd Lipcon's paper-reading habits. Brock is Chief Architect of phData, a pure-play Hadoop Managed Service Provider. Prior to founding phData, Brock spent four years at Cloudera as a Trainer, Solution Architect, Engineer, Sales Engineer, and Engineering Manager. Brock is a co-founder of Apache Sentry and an Apache Project Committee Member on Apache Hive, Parquet, Crunch, Flume, and Incubator. Brock was a mentor to Kudu in the incubator and currently mentors Apache Impala (incubating). In addition, he is a member of the Apache Software Foundation. Brock is a frequent public speaker, having spoken at dozens of conferences, including HBaseCon and numerous Hadoop User Groups.
Ryan Bosshart is a Principal Systems Engineer at Cloudera. Ryan has spent the last 10 years building and architecting distributed systems. At Cloudera, Ryan leads the field storage specialization team, where he focuses on Apache HDFS, HBase, and Kudu. He has worked with many early users of Kudu to build their relational, time-series, IoT, or real-time architectures. He has seen first-hand Kudu's ability to improve performance and simplify architectures. Ryan is a co-chair of the Twin Cities Spark and Hadoop User Group and the author of the training video Getting Started with Kudu (O'Reilly).