Big Data Final project


Final Project

Analysis of the Million Songs Data Set:

Summer 2016 – Big Data Final project

The goal of this project assignment is to gain experience in building applications using:

  • the Hadoop open source platform
  • the MapReduce programming framework
  • the Hive Query Language (HiveQL)
  • Pig Latin scripts

Build a Hadoop cluster with more than one instance on any Linux-based platform.

Download the Million Songs metadata from the repository below and load it into HDFS.

One good source for downloading the data set:

  • https://drive.google.com/open?id=0B4qvMVe-iB-eWGI1X29FNDYwVXc

While reading the input source, filter the input data by the letter that has been assigned to you, and use the resulting subset for the project work.

E.g., if the letter ‘K’ has been assigned to you, consider only the input rows where the value in the 2nd column starts with the letter ‘k’.

In this assignment my letter is ‘k’.

The first row of the spreadsheet contains the column names.
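The filtering rule above can be sketched locally before it is ported to Hive, MapReduce, or Pig. This is only an illustration: it assumes the data has been exported as comma-separated text, and the function name `filter_by_letter` is hypothetical. Matching is case-insensitive here, which is an assumption, since the example pairs ‘K’ with ‘k’.

```python
import csv

def filter_by_letter(lines, letter="k", column_index=1):
    """Yield the header row, then every row whose 2nd column starts with `letter`.

    Assumptions: comma-separated input, header in the first row,
    case-insensitive matching. column_index=1 is the 2nd column.
    """
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return  # empty input, nothing to yield
    yield header
    prefix = letter.lower()
    for row in reader:
        # Skip short rows that have no 2nd column at all.
        if len(row) > column_index and row[column_index].strip().lower().startswith(prefix):
            yield row

# Made-up sample rows, for illustration only.
sample = ["id,title,year", "1,Kashmir,1975", "2,Hello,2015", "3,kiss,1986"]
filtered = list(filter_by_letter(sample))
```

Keeping the header row in the output makes the filtered file usable by later steps that look columns up by name.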

Submit the jps output, the ifconfig output, the cluster details, and the total number of files in HDFS.

Once the data has been loaded successfully into HDFS, submit the analytical metrics below using Hive, MapReduce, or Pig Latin.

  1. Analyze the duration of songs for each year.

Submit the calculated results data and also corresponding bar graph / pie chart.
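One possible reading of this metric is the number of songs, total duration, and mean duration per year. The sketch below shows that grouping logic in plain Python; it is not the required Hive, MapReduce, or Pig solution. The column names `year` and `duration` are assumptions about the spreadsheet header, and treating a year of 0 or less as missing is also an assumption.

```python
from collections import defaultdict

def duration_by_year(rows):
    """Group song durations by year and summarize each group.

    `rows` is an iterable of dicts; the keys 'year' and 'duration'
    are assumed column names, not confirmed by the assignment.
    """
    groups = defaultdict(list)
    for row in rows:
        try:
            year = int(row["year"])
            duration = float(row["duration"])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows with missing or malformed values
        if year <= 0:
            continue  # assumption: non-positive year means "unknown"
        groups[year].append(duration)
    return {
        year: {
            "songs": len(values),
            "total": sum(values),
            "mean": sum(values) / len(values),
        }
        for year, values in sorted(groups.items())
    }

# Made-up sample rows, for illustration only.
sample_rows = [
    {"year": "2000", "duration": "100.0"},
    {"year": "2000", "duration": "200.0"},
    {"year": "0", "duration": "50.0"},
    {"year": "1999", "duration": "bad"},
]
summary = duration_by_year(sample_rows)
```

The per-year dictionary maps directly onto a bar graph, with years on one axis and the chosen statistic on the other.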

  2. Analyze the number of songs whose digital IDs end with the same last digit.

Submit the calculated results data and also corresponding bar graph / pie chart.
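Counting songs by the last digit of their digital ID amounts to a ten-bucket histogram. A minimal sketch follows, assuming the IDs are available as strings or integers; which column holds the digital ID is not stated in the assignment, so the caller has to pick it.

```python
from collections import Counter

def count_by_last_digit(ids):
    """Count IDs per final digit ('0' to '9'); non-numeric endings are skipped."""
    counts = Counter()
    for raw in ids:
        text = str(raw).strip()
        if text and text[-1] in "0123456789":
            counts[text[-1]] += 1
    return dict(sorted(counts.items()))

# Made-up IDs, for illustration only.
digit_counts = count_by_last_digit(["1230", "45", "7005", "x", "", 90])
```

In a MapReduce job the same idea would have the mapper emit the last digit as the key and 1 as the value, with the reducer summing the values.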

  3. Analyze the number of artists by the first letter of their name, OR

Analyze the familiarity of songs for each year.

Submit the calculated results data and also corresponding bar graph / pie chart.
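For the first option, a sketch of counting distinct artists per initial letter is given below. It assumes artist names have already been extracted from whichever column holds them, treats names that differ only in letter case as the same artist, and puts names that start with a non-letter into a "#" bucket; all three choices are assumptions.

```python
from collections import defaultdict

def artists_by_first_letter(artist_names):
    """Return {initial: number of distinct artists} for the given names."""
    seen = defaultdict(set)
    for name in artist_names:
        cleaned = name.strip()
        if not cleaned:
            continue  # ignore empty names
        first = cleaned[0].upper()
        bucket = first if first.isalpha() else "#"
        # A set removes duplicates, so each artist is counted once.
        seen[bucket].add(cleaned.lower())
    return {letter: len(names) for letter, names in sorted(seen.items())}

# Made-up names, for illustration only.
letter_counts = artists_by_first_letter(["Kiss", "kiss", "Korn", "ABBA", " 2Pac", ""])
```

Because each artist usually appears on many song rows, counting rows instead of distinct names would overstate the result.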

  4. Analyze the range of tempo or loudness for each year.

Submit the calculated results data and also corresponding bar graph / pie chart.
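If "range" is read as the difference between the largest and smallest value in a year, the computation only needs a running minimum and maximum per year. The sketch below assumes the column names `year`, `tempo`, and `loudness`, and again treats a non-positive year as missing.

```python
def range_by_year(rows, field="tempo"):
    """Return {year: {'min', 'max', 'range'}} for `field` ('tempo' or 'loudness').

    Column names are assumptions about the spreadsheet header.
    """
    bounds = {}
    for row in rows:
        try:
            year = int(row["year"])
            value = float(row[field])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows with missing or malformed values
        if year <= 0:
            continue  # assumption: non-positive year means "unknown"
        low, high = bounds.get(year, (value, value))
        bounds[year] = (min(low, value), max(high, value))
    return {
        year: {"min": low, "max": high, "range": high - low}
        for year, (low, high) in sorted(bounds.items())
    }

# Made-up sample rows, for illustration only.
tempo_rows = [
    {"year": "2001", "tempo": "90.0"},
    {"year": "2001", "tempo": "120.5"},
    {"year": "2002", "tempo": "100.0"},
]
tempo_ranges = range_by_year(tempo_rows)
```

Only two numbers are kept per year, so the same logic works as a reducer without holding every value in memory.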

  5. Analyze the songs that share the same key value.

Submit the calculated results data.
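Grouping songs by key value could look like the sketch below. It assumes columns named `key` and `title` and that the key is stored as a number; neither is confirmed by the assignment text.

```python
from collections import defaultdict

def songs_by_key(rows):
    """Return {key value: sorted list of titles} for rows with a numeric key."""
    groups = defaultdict(list)
    for row in rows:
        try:
            # Accept "7" as well as "7.0"; empty or non-numeric keys are skipped.
            key = int(float(str(row.get("key", "")).strip()))
        except (ValueError, OverflowError):
            continue
        groups[key].append(row.get("title", ""))
    return {key: sorted(titles) for key, titles in sorted(groups.items())}

# Made-up sample rows, for illustration only.
key_rows = [
    {"title": "B", "key": "7"},
    {"title": "A", "key": "7"},
    {"title": "C", "key": "10"},
    {"title": "D", "key": ""},
    {"title": "E", "key": "2.0"},
]
grouped = songs_by_key(key_rows)
```

Converting the key to an integer before sorting keeps 10 after 2, which a plain string sort would not.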
