Hadoop: The Definitive Guide [Kindle Edition]
(By: Tom White)
This book is a comprehensive guide to Apache Hadoop and to distributed data storage and processing.
The Hadoop framework offers high reliability and fast processing. For managing data at internet scale, it is an indispensable framework.
Different problems are discussed as detailed case studies, and recent changes in Hadoop have been incorporated and clearly explained in the relevant sections.
About Tom White
Tom has more than 8 years of experience working with Apache Hadoop. He is a member of the Apache Software Foundation and has several years of experience at Cloudera, a company that specializes in training and support for Hadoop. He also works as an independent Hadoop consultant.
Who Should Read The Book?
This book is for programmers doing data analysis, students, and administrators who use or plan to use Hadoop. It is also highly recommended for people planning a career that deals with Big Data, and for beginners who have just completed practical training. It’s a great resource that covers every aspect of Hadoop comprehensively.
Students and working programmers will both gain deep insight into the Hadoop environment, and any programmer or computer professional will benefit from having it in their reference collection or personal library.
Hadoop developers in particular will find it comprehensive, well-organized, and extremely useful. Even for experienced professionals, it’s a great book to refer back to when in doubt about something.
Here’s What Is Best About The Book
- A great book if you’re about to begin with Big Data. It’s fantastic for beginners, and if you’re about to take the CCD-410 certification exam, it’s the perfect start.
- Hadoop is a critically important framework, and getting Cloudera certified is difficult. A comprehensive guide that covers the Hadoop ecosystem in depth improves your chances of doing well in your certification exams and being better at your job.
- The author’s expertise becomes evident as you turn the pages, and his years of experience show in the way the instruction is designed and delivered. Learning depends a lot on instructional delivery, and an expert of Tom’s calibre knows exactly what works and how every small piece fits into the puzzle. This will help beginners avoid confusion and advance in their careers with confidence and clarity.
- Each section has been carefully structured, and practical examples are provided to explain things clearly. All major topics have been broken down systematically into sequential sub-topics, so nothing appears out of place, a problem that does seem to occur with many newer authors.
- A definitive manual must have accurate facts and be a reliable source of information for future reference. This is especially important because a large number of online tutorials and information sources have not been created by certified professionals. Such sources are good secondary resources, but they cannot replace a primary reference manual of your own that can be cited with confidence anywhere. This book is just the right blend of every ingredient, and it has been widely appreciated by readers, developers and programmers.
- This book will be invaluable for a programmer about to analyze large datasets, and also for someone who wants to begin by setting up a Hadoop environment and will eventually need to run their own Hadoop clusters.
- Real-time Big Data management and handling is riddled with unexpected and unavoidable challenges, and you cannot prepare for everything by reading a book. But learning from someone with established credentials, who has done it all with proven efficiency and has solved almost every possible kind of Hadoop-related issue, will immensely improve your performance, save you time and accelerate your learning.
Some Things That Could Have Been Better
- The text refers to the book’s website several times for additional information, yet it never points out a single website as the book’s official one. Such ambiguity could have been avoided.
- MapReduce programming could have been explained in greater detail, although the overview is fantastic and really helpful.
- The earlier edition was great and widely appreciated, but for someone buying the new edition there is a lot of content that hasn’t been changed.
- Newer changes, such as Hadoop 2.0, and other recent information need more content and attention. Hopefully this will be added in the next edition.
This book is by far one of the best books on the market for learning Hadoop and getting a complete overview. It fulfils the needs of every beginner. The previous edition was great and was positively reviewed by almost every reader, and the new edition improves on an already well-written book. If you work with Hadoop or plan to work in a Big Data environment, this is the right book for you.