Notes from Big Data Class-2

  • Big Data = Value
  • Big Data ⇒ Insight ⇒ Action = Data Science
  • Big Data + Question Analysis = Data Product

Amazon:

  • Previous purchase
  • Customer review
  • Recommendation
  1. Historical data + new real-time data = Predicting ⇔ Action

  1. Passion for data
  2. Relate problems to insights
  3. Engineering solutions
  4. Curiosity
  5. Teamwork

Big Data ⇒ Actionable Insight

Big Data Strategy

Plan of action [Overall act]

  • Aim
    • Long Term
    • Short Term
    • Commitment
    • Sponsorships
    • Communication
  • Policy
    • Privacy
    • Lifetime
  • Plan
  • Action

Five P

People & Purpose ⇒ Process ⇒ Platform (this sequence is called data science)

Acquire ⇒ Prepare ⇒ Analyse ⇒ Report ⇒ Act

Assess the Situation

  • Risks
  • Benefits
  • Resources
  • Requirements

Define Goals

  • Objectives
  • Criteria

Data Science Process

  1. Acquire
    • Data Set
    • Retrieve Data
    • Query Data
  2. Prepare
    • Explore and pre-process
    • Clean
    • Integrate
    • Package
  3. Analyze
    • Define Model
    • Build Model
  4. Report
  5. Act
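
The five steps above can be sketched as a chain of small functions. This is only a toy illustration: the readings, the averaging "model", and the threshold are assumptions made up for the example, not material from the class.

```python
from statistics import mean

def acquire():
    # Hypothetical raw readings; None marks a missing value.
    return [12.0, 15.5, None, 14.0, 13.5]

def prepare(raw):
    # Clean: drop missing values before analysis.
    return [x for x in raw if x is not None]

def analyze(clean):
    # A deliberately trivial "model": the average of the clean values.
    return {"mean": mean(clean), "count": len(clean)}

def report(result):
    # Summarize the result in a form a person can read.
    return f"{result['count']} readings, mean {result['mean']:.2f}"

def act(result, threshold=14.0):
    # Turn the insight into a decision (threshold is an assumed value).
    return "scale up" if result["mean"] > threshold else "hold"

result = analyze(prepare(acquire()))
summary = report(result)
decision = act(result)
print(summary)
print(decision)
```

Each function maps to one step, so a real project could swap any stage (for example a database query in `acquire`) without touching the others.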

1)  Acquire Data 

Where's the data? ⇒ Identify suitable data ⇒ Acquire all available data (SQL + NoSQL)
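
A minimal sketch of the SQL side of acquiring data, using Python's built-in `sqlite3` module with an in-memory database. The `purchases` table and its rows are hypothetical; a NoSQL source would need its own client library and is not shown here.

```python
import sqlite3

# In-memory database standing in for a real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, item TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [("ann", "book", 12.5), ("bob", "lamp", 30.0), ("ann", "pen", 2.5)],
)

# Query: total spend per customer.
rows = conn.execute(
    "SELECT customer, SUM(amount) FROM purchases "
    "GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()
print(rows)
```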

2)  Prepare

  1. Explore 
    1. Understand Data
      • Correlation
      • Histograms
      • Outliers
    2. Visualize Data
  2. Histogram (Informal Analysis)
  3. Pre-Process
    1. Clean garbage + transform
      • Inconsistent
      • Missing
      • Outliers
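
A rough sketch of the explore and clean steps using only the standard library. The sample values, the outlier rule (more than 5 median absolute deviations from the median), and the one-unit histogram buckets are all assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt
from statistics import median

raw = [21.0, 22.5, None, 23.0, 22.0, 95.0, 21.5, None, 22.5]

# Missing values: drop them and keep a count.
present = [x for x in raw if x is not None]
n_missing = len(raw) - len(present)

# Outliers: flag values far from the median (assumed rule: > 5 * MAD).
med = median(present)
mad = median([abs(x - med) for x in present])
outliers = [x for x in present if abs(x - med) > 5 * mad]
clean = [x for x in present if abs(x - med) <= 5 * mad]

# Histogram: count values per one-unit bucket and print a text chart.
histogram = Counter(int(x) for x in clean)
for bucket in sorted(histogram):
    print(bucket, "#" * histogram[bucket])

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length lists.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Two perfectly linearly related columns give a correlation close to 1.
corr = pearson([1, 2, 3, 4], [2, 4, 6, 8])
print(n_missing, outliers, round(corr, 3))
```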

3)  Analyse

  • Build model
  • Input data = analyse
  • Classification = Predict a category
  • Clustering = organize similar items into groups
  • Regression = Predict numeric values
  • Graph Analysis = Use graph structures to find connections between entities
  • Association analysis = Find rules to capture association between items
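
Two of the techniques above in miniature: regression as a least-squares line fit, and association analysis as support and confidence for a single rule. The data points and shopping baskets are invented for the example, and `fit_line` and `rule_stats` are hypothetical helper names.

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = slope * x + intercept.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        / sum((x - mx) ** 2 for x in xs)
    )
    return slope, my - slope * mx

def rule_stats(baskets, lhs, rhs):
    # Support: share of baskets containing both items.
    # Confidence: share of baskets with lhs that also contain rhs.
    both = sum(1 for b in baskets if lhs in b and rhs in b)
    lhs_count = sum(1 for b in baskets if lhs in b)
    support = both / len(baskets)
    confidence = both / lhs_count if lhs_count else 0.0
    return support, confidence

# Regression: points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
prediction = slope * 5 + intercept  # predicted numeric value at x = 5

# Association: how often does "bread" imply "milk"?
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
support, confidence = rule_stats(baskets, "bread", "milk")
print(slope, intercept, prediction)
print(support, round(confidence, 2))
```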

4)  Report

  • What to present
  • How to present
  • Visualisation

5)  Act

  • Implementation

 

figure 4 from: https://www.phy.ornl.gov/csep/mc/node13.html