Data ScienceIT & Software

Hadoop Fundamentals

Description

Apache Hadoop is a freely available open source tool-set that enables big data analysis. This Hadoop Fundamentals LiveLessons tutorial demonstrates the core components of Hadoop including Hadoop Distriuted File Systems (HDFS) and MapReduce. In addition, the tutorial demonstrates how to use Hadoop at several levels including the native Java interface, C++ pipes, and the universal streaming program interface. Examples of how to use high level tools include the Pig scripting language and the Hive ‘SQL like’ interface. Finally, the steps for installing Hadoop on a desktop virtual machine, in a Cloud environment, and on a local stand-alone cluster are presented. Topics covered in this tutorial apply to Hadoop version 2 (i.e., MR2 or Yarn).

Leave a Reply

Your email address will not be published. Required fields are marked *