Course Description
Helps students incorporate big-data research methods into research projects and gain skills for collecting and analyzing large research datasets. Students are exposed to state-of-the-art computational methods for analyzing structured, sequence, network, textual, and image data. Students also execute a big-data analytics research project and survey emerging big-data topics.
Athena Title
Big Data Research
Semester Course Offered
Offered every year.
Grading System
A - F (Traditional)
Course Objectives
1. To introduce the big data research paradigm, landscape, and research methods for emerging, high-impact, and diverse business intelligence and data analytics applications
2. To provide the foundation in state-of-the-art data, text, and image mining research, including the emerging AI and deep learning methods
3. To help students gain basic hands-on skills and a basic understanding of selected tools for developing, managing, and analyzing large datasets for research
Topical Outline
Topic 1: Big Data and Research
* Big data overview
* Business intelligence and analytics
* Big data technologies: Bigtable, Google File System, MapReduce

Topic 2: Structured Data Research
* Explanatory analytics
* Causal inference
* Tree-based methods: Decision Trees, Random Forest
* Generative Bayesian modeling
* Predictive analytics
* Linear regression
* Classification: Naïve Bayes, Support Vector Machines, Logistic Regression, K-Nearest Neighbor
* Ensemble learning: Bagging, Boosting, AdaBoost, Stacking
* Deep neural networks: Autoencoder, Deep Belief Network
* Collaborative filtering
* Policy prediction problems
* Sequence modeling
* Markov models: Hidden Markov Model, Conditional Random Fields
* Recurrent neural networks: Gated Recurrent Unit, Long Short-Term Memory, Attention Mechanism
* Network/Graph analysis
* Graph embedding
* Graph Neural Network, Graph Convolutional Network
* Community detection
* Link prediction

Topic 3: Unstructured Data Research
* Text mining
* Distributional semantics: Word embedding
* Language modeling: Transformer, BERT, GPT
* Information retrieval and information extraction
* Sentiment analysis
* Topic modeling: Latent Dirichlet Allocation, Hierarchical Dirichlet Process
* Image mining
* Convolutional Neural Network
* Generative Adversarial Network, Variational Autoencoder
* Image classification
* Object detection and scene understanding

Topic 4: Emerging Topics in Big Data Research
* Topics to be selected and presented by students include: data privacy, adversarial machine learning, interpretable machine learning, ethics of AI, AI governance, computational psychographics, computational social science, FinTech