Hadoop in Practice
(By: Alex Holmes)
Hadoop is an open-source software framework written in Java, primarily used for distributed storage and processing of very large data sets across clusters of computers built from commodity hardware.
With its increasing use in software development, there are many big data books on the market that teach the basic and advanced features of Hadoop.
One of the prominent books has been written by Alex Holmes, a software engineer, blogger, speaker and author who specializes in large-scale Hadoop projects. This book offers information on the latest features of the Hadoop core architecture.
Who Should Read This Book?
As this book offers updated knowledge about Hadoop, the main readers targeted by the author are developers who already have a fundamental knowledge of the framework. The book is filled with real-life examples that show the practical use of Hadoop in programming.
As per Alex Holmes, readers should have some experience with JSON, XML and Java to take maximum advantage of the book. This is because the software examples given in the book assume an advanced reader and aim to take that reader to the next level of Hadoop programming. The writer has provided eighty-five case studies on the use of Hadoop and has broken them down into a consumable format.
An intermediate knowledge of Java enables readers to follow the flow of the source code given in the book.
What Is Good In This Book
Hadoop has undergone many changes, and the second edition of this book offers comprehensive coverage of the latest features. One of the key features useful for readers is the set of 104 techniques that show how to accomplish important, practical tasks in Hadoop projects.
These techniques cover various aspects of Hadoop such as analysis of real-time streams, moving and securing data, taming big data by using Hadoop, management of large-scale clusters and machine learning, among others.
The reader can pick up techniques for integrating Impala, Kafka and Spark. In addition, the book offers updated techniques for the latest versions of tools such as Mahout, Sqoop and Flume. As software developers work with MapReduce, YARN, Kafka, Parquet, Camus and other technologies during Hadoop projects, this book is a good option for understanding how these tools relate to one another.
The book also offers a reasonably deep overview of the Hadoop software ecosystem, spanning major topics such as the fundamentals, background, big data patterns and data logistics, and then moving beyond MapReduce. It offers detailed information on topics such as how to create YARN applications, working with predictive analytics by using R and Mahout, and the integration of real-time technologies such as Spark, Impala and Storm.
Further, the book provides techniques for leveraging algorithms and data structures at scale. The writer's language is easy to follow, and he uses practical case studies extensively to explain abstract concepts. This is a wonderful addition to any technical library on Hadoop, offering up-to-date information for advanced Hadoop users.
What Is Not Good In This Book
Although this book is lauded by software critics for its depth of knowledge, its language and its easy explanation of abstract concepts, it also has some downsides. As it is primarily targeted towards experienced Hadoop programmers, novice readers cannot use it as a reference unless they have some experience in Hadoop programming.
Moreover, the author does not offer much in-depth background on Hadoop-related topics and assumes the reader has enough experience to understand the technical terms.
One of the popular books on Hadoop in the market, this book offers a lot of updated information for programmers who work with Hadoop. If you have prior experience in large-scale projects, this book is indispensable; even for intermediate users, it offers immense value.