Singapore University of Social Sciences

Big Data Computing in the Cloud (ICT337)

Synopsis

ICT337 Big Data Computing in the Cloud introduces students to the challenges of processing big data. It provides a foundational understanding of how computing clusters can be set up for big data processing. Students learn how to set up computing clusters that manage resources and schedule jobs in the cloud to perform relevant data analytics. Through hands-on training with relevant tools, students develop programs for processing big data. With this knowledge, students are able to develop smart solutions that can handle real-time big data.

Level: 3
Credit Units: 5
Presentation Pattern: Every July

Topics

  • Challenges of big data computing (5 Vs)
  • Big data computing requirements
  • Big data computing framework: Hadoop ecosystem
  • Real-time big data computing: Spark
  • Setup of Spark cluster
  • Job scheduling in Spark cluster
  • Programming tool for big data computing: PySpark
  • Application interface for big data computing: Jupyter
  • Interactive queries of structured data: Spark SQL
  • Fault-tolerant processing of data streams: Spark Streaming
  • Real-time big data computing case studies
  • Integration of big data computing in smart solutions

Learning Outcomes

  • Examine the challenges in big data computing
  • Evaluate big data computing frameworks
  • Design and implement a big data computing cluster in the cloud
  • Plan and execute the deployment of a big data computing cluster in the cloud
  • Formulate effective resource management and job scheduling
  • Organize the operation and maintenance of a big data computing cluster
  • Appraise approaches to resolve big data computing issues in smart solutions
  • Construct smart solutions based on big data computing