Course ID: | MIST 9777. 3 hours. |
Course Title: | Big Data Research |
Course Description: | Helps students incorporate big-data research methods into research projects and gain skills for collecting and analyzing large research datasets. Students are exposed to state-of-the-art computational methods for analyzing structured, sequence, network, textual, and image data. Students also execute a big-data analytics research project and survey emerging big-data topics. |
Oasis Title: | Big Data Research |
Semester Course Offered: | Offered every year. |
Grading System: | A-F (Traditional) |
Course Objectives: | 1. To introduce the big data research paradigm, landscape, and research methods for emerging, high-impact, and diverse business intelligence and data analytics applications
2. To provide a foundation in state-of-the-art data, text, and image mining research, including emerging AI and deep learning methods
3. To help students gain basic hands-on skills and a basic understanding of selected tools for developing, managing, and analyzing large datasets for research |
Topical Outline: | Topic 1: Big Data and Research
* Big data overview
* Business intelligence and analytics
* Big data technologies: Bigtable, Google File System, MapReduce
Topic 2: Structured Data Research
* Explanatory analytics
  * Causal inference
  * Tree-based methods: Decision Trees, Random Forest
  * Generative Bayesian modeling
* Predictive analytics
  * Linear regression
  * Classification: Naïve Bayes, Support Vector Machines, Logistic Regression, K-Nearest Neighbor
  * Ensemble learning: Bagging, Boosting, AdaBoost, Stacking
  * Deep neural networks: Autoencoder, Deep Belief Network
  * Collaborative filtering
  * Policy prediction problems
* Sequence modeling
  * Markov models: Hidden Markov Model, Conditional Random Fields
  * Recurrent neural networks: Gated Recurrent Unit, Long Short-Term Memory, Attention Mechanism
* Network/Graph analysis
  * Graph embedding
  * Graph Neural Network, Graph Convolutional Network
  * Community detection
  * Link prediction
Topic 3: Unstructured Data Research
* Text mining
  * Distributional semantics: Word embedding
  * Language modeling: Transformer, BERT, GPT
  * Information retrieval and information extraction
  * Sentiment analysis
  * Topic modeling: Latent Dirichlet Allocation, Hierarchical Dirichlet Process
* Image mining
  * Convolutional Neural Network
  * Generative Adversarial Network, Variational Autoencoder
  * Image classification
  * Object detection and scene understanding
Topic 4: Emerging Topics in Big Data Research
* Topics to be selected and presented by students include: data privacy, adversarial machine learning, interpretable machine learning, ethics of AI, AI governance, computational psychographics, computational social science, FinTech |