Data Science

Data Science in Practice
Class: DSCI D590
Section: 33205 (Online)
Instructor: David Wild, Kyle Stirling
Syllabus: View document
Synopsis: This course connects interested data science project sponsors with data science students, so that together both can accomplish something neither could achieve alone. The overarching goal for the course is for the students to experience the real-world work of Data Science and to complete short consulting/technical projects in small teams. This course is for anyone who applies their expertise to the demands of data-driven decision making and analysis. This is a “learning by doing” course on the practice of delivering data science expertise.

Graduate Internship
Class: DSCI D591
Section: 33206, 33207 (Online)
Credits: 0-3
Instructor: Haixu Tang
Synopsis: Students gain professional work experience in an industry or research organization setting, using skills and knowledge acquired in Informatics course work. May be repeated for a maximum of 6 credit hours.

Independent Study
Class: DSCI D699
Section: 33208, 33209 (Online)
Credits: 1-3
Instructor: Haixu Tang
Synopsis: Independent readings and research for M.S. students under the direction of a faculty member, culminating in a written report.


Cloud Computing
Class: ENGR E516
Section: 33909 (Online)
Instructor: Gregor Von Laszewski
Syllabus: View document
Synopsis: This course covers basic concepts of the programming models and tools of cloud computing that support data-intensive science applications. Students will get to know the latest research topics in cloud platforms, parallel algorithms, storage, and high-level languages, building proficiency with a complex ecosystem of tools that span many disciplines.

Intro to High Performance Computing
Class: ENGR E517
Section: 14273 (Online)
Instructor: Thomas Sterling
Syllabus: View document
Synopsis: Students will learn about the development, operation, and application of HPC systems, making them prepared to address future challenges demanding capability and expertise. The course combines critical elements from hardware technology and architecture, system software and tools, and programming models and application algorithms with the cross-cutting theme of performance management and measurement.

Information Visualization
Class: ENGR E583
Section: 33677 (Online)
Instructor: Katy Börner, Michael Ginda
Syllabus: View document
Synopsis: Introduces information visualization, highlighting processes which produce effective visualizations. Topics include perceptual basis of information visualization, data analysis to extract relationships, and interaction techniques.

Information and Library Science

Class: ILS Z534
Section: 33038 (Online)
Instructor: Zheng Gao
Syllabus: View document
Synopsis: The success of commercial search engines shows that information retrieval is key to helping users find the information they seek. This course provides an introduction to information retrieval theories and concepts underlying all search applications. We investigate techniques used in modern search engines and demonstrate their significance via experiment.

Social Media Mining
Class: ILS Z639
Section: 12908 (Online)
Instructor: Vincent Malic
Syllabus: View document
Synopsis: This course provides a graduate-level introduction to social media mining and methods. The course provides hands-on experience mining social data for social meaning extraction (focus on sentiment analysis) using automated methods and machine learning technologies. We will read, discuss, and critique claims and findings from contemporary research related to SMM.


Big Data Software and Projects
Class: INFO I524
Section: 13054 (Online)
Instructor: Gregor Von Laszewski
Syllabus: View document
Synopsis: This course studies software HPC-ABDS used in either High Performance Computing or open source commercial Big Data cloud computing. The student builds analysis systems using this software on clouds and then uses it in a project either chosen by the student or selected from a list given by the instructor. Credit given for only one of INFO-I424 or I524.

Applied Machine Learning
Class: INFO I526
Section: 33910 (Online)
Instructor: James Shanahan
Syllabus: View document
Synopsis: The main aim of the course is to provide skills to apply machine learning algorithms on real applications. We will devote less time to learning algorithms and math/theory, and instead spend more time with hands-on skills required for algorithms to work on a variety of datasets.

Systems & Protocol Security & Info Assurance
Class: INFO I533
Section: 13926 (Online)
Instructor: Steve Myers
Syllabus: View document
Synopsis: This course looks at systems and protocols, how to design threat models for them and how to use a large number of current security technologies and concepts to block specific vulnerabilities. Students will use numerous systems and programming security tools in the laboratories.

Applied Data Science
Class: INFO I590
Section: 11892 (Online)
Instructor: Joanne Luciano
Syllabus: View document
Synopsis: The aim of the Applied Data Science course is to provide the skills needed to apply data science principles on real world applications at every stage in the data science workflow. The course is organized around each stage covering the algorithms, best practices, and evaluation criteria. Both good and bad application examples will be discussed to help the student develop an intuition and deeper understanding of the choice of algorithm for the data, and the development of the best practices and methods for evaluating results of different approaches. Students will learn Tableau and use it to visually analyze and report data.

Data Science On-Ramp
Class: INFO I590
Section: 14150 (Online)
Credits: 1-3
Instructor: Ying Ding
Syllabus: View document
Synopsis: Self-paced modules to build and strengthen core competencies necessary for the Data Science curriculum. Individual lessons vary from beginner to intermediate and will cover C++, MongoDB, R, Java, Python, Tableau, SQL, Hadoop/MapReduce, Spark, Scala, GitHub, Web Scraping, and Text Mining (NLP). If you would like descriptions of each lesson and how these will be mapped to credit, please consult Professor Ying Ding for more information.

Intro to Business Analytics Modeling
Class: INFO I590
Section: 14152 (Online, 2nd 8 Weeks only)
Instructor: Doug Blocher, Rex Cutshall
Syllabus: View document
Synopsis: In this course, we develop analytical models using simulation and optimization to analyze and recommend sound solutions to complex business problems. Models are discussed to solve sophisticated problems using various tools on spreadsheets, including Excel solver for linear, integer and genetic programming problems, probabilistic simulations, and risk analysis including statistical analysis of simulation models.

Network Science
Class: INFO I590
Section: 14149, 32299 (Online)
Instructor: Yong-Yeol Ahn
Syllabus: View document
Synopsis: Networks are everywhere. We can easily find network structure in many complex systems around us: our cells, brains, society, etc. The inherent generality of the network approach has allowed wide applications of network theory to flourish across diverse fields including biology, sociology, and epidemiology. The questions that we will address in the class are the following: Why do networks matter? What are the fundamental theories to understand the structure and dynamics of networks? How have they been applied to other fields? What are the frontiers of the research? We will explore key papers ranging from the fundamental theory to the various applications of network theory. This course will focus more on round-table discussion between students than presentation. Students will work on research projects in groups and finish a paper at the end of the class.

Class: INFO I590
Section: 14262 (Online)
Instructor: Vel Melbasa
Syllabus: View document
Synopsis: This course provides a gentle, yet intense, introduction to programming using Python for students with little or no prior experience in programming. Python, an open-source language that allows rapid application development of both large and small software systems, is object-oriented by design and provides an excellent platform for learning the basics of language programming. The course will focus on planning and organizing programs, and developing high quality, working software that solves real problems.

Real World Data Science
Class: INFO I590
Section: 14530 (Online)
Instructor: Joanne Luciano
Syllabus: View document
Synopsis: The purpose of this course is to provide Data Science graduate students with practical experience applying their data science skillsets to real-world datasets. The first offering of this course in 2017 used a deidentified clinical trials dataset provided by Eli Lilly, but subsequent offerings could include public data or data provided by other industry partners. Students will be led through the full data analysis process of data preparation, model planning, model building, analysis, and communication of results. Students will meet (virtually or physically) daily to devise a plan.

Class: INFO I590
Section: 14149 (Online)
Instructor: Ying Ding
Syllabus: View document
Synopsis: Databases are central to data science as the means of storing and managing data. Relational databases have empowered major industries for decades and are still widely adopted. In our new era of Big Data, the database landscape is undergoing significant change. Many non-relational databases have become an important part of the enterprise data architecture of companies. Relational databases were developed long before the Internet and the Web to tackle the issues of centrally controlled data storage and management. NoSQL databases emerged with the rise of Internet and Web applications to connect companies with customers (i.e., online or mobile) and to develop the agility to adapt faster. The new challenges of being agile and being able to accommodate data variability/data integration drove enterprises to turn to NoSQL database technology. It is important for every data scientist to master the skills of current databases and know about the future of databases in a world of NoSQL. This course aims to provide a basic overview of the current database landscape, starting with relational databases and SQL, and moving to several different NoSQL databases, such as XML databases and MongoDB.

School of Public and Environmental Affairs

Data Analysis and Modeling in Public Affairs
Class: SPEA P507
Section: 13838 (Online)
Instructor: Barry Rubin
Syllabus: View document
Synopsis: Focus on analytical models and their use in solving problems and making decisions in the public sector. Discussion of standard approaches to modeling and estimation of parameters. V507 provides students of public and environmental affairs and related disciplines with a detailed, intermediate-level perspective on statistical concepts and techniques for analyzing and modeling complex systems. The course content includes estimating the parameters of such models based on existing data, testing hypotheses about these systems, and forecasting. The context of the course is the application of these techniques to problems and policies in public and environmental affairs. Multivariate regression analysis is one of the primary tools for statistical modeling for purposes of policy analysis, program evaluation, simulation of systems, and general forecasting. Thus, most of the course is devoted to single equation regression models and the extension of these models to a variety of situations. A prerequisite for the class is a graduate-level, introductory statistics course that includes coverage of the simple (two-variable) regression model and an introduction to multivariate regression.

Statistical Analysis for Effective Decision-Making
Class: SPEA V506
Section: 13839 (Online)
Instructor: Kand McQueen
Syllabus: View document, online version
Synopsis: An introduction to statistics. Nature of statistical data. Ordering and manipulation of data. Measures of central tendency and dispersion. Elementary probability. Concepts of statistical inference and decision: estimation and hypothesis testing. Special topics discussed may include regression and correlation, analysis of variance, and nonparametric methods. This course will provide an introduction to the analysis of quantitative data via statistical analyses. Topics covered include, but are not limited to, descriptive statistics, z-scores, probability, z-tests, t-tests, correlation, and regression. The focus is on the practical interpretation and application of statistics.

School of Public Health

Semiparametric Regression with R
Class: SPH Q650
Section: 33594 (Online)
Instructor: Jaroslaw Harezlak
Syllabus: View document
Synopsis: Semiparametric regression methods build on parametric regression models by allowing more flexible relationships between the predictors and the response variables. Examples of semiparametric regression include generalized additive models, additive mixed models and spatial smoothing. Our goal is to provide an easy-to-follow applied course on semiparametric regression methods using R. There is a vast body of literature on the semiparametric regression methods. However, most of it is geared towards researchers with advanced knowledge of statistical methods. This course explains the techniques and benefits of semiparametric regression in a concise and modular fashion. Spline functions, linear mixed models and hierarchical models are shown to play an important role in semiparametric regression. There will be a strong emphasis on implementation in R with a lot of computing exercises. This course is based on the book ‘Semiparametric Regression with R’ by J. Harezlak, D. Ruppert, and M.P. Wand (Springer).


Introduction to Statistics
Class: STAT S520
Section: 13452 (Online)
Instructor: Jianyu Wang
Syllabus: View document
Synopsis: This course introduces the basic concepts of statistical inference through a careful study of several important procedures. Topics include 1- and 2-sample location problems, the one-way analysis of variance, and simple linear regression. Most assignments involve applying probability models and/or statistical methods to practical situations and/or actual datasets. S320 is the basic version of this course, intended for undergraduates. It is the gateway to more advanced courses offered by the Department of Statistics. S520 is an expanded version of S320 that covers additional material. S520 serves two constituencies: Graduate students in quantitative disciplines who are looking for a solid introduction to statistics and who may want to take additional courses in statistics, and graduate students pursuing an M.S. in Applied Statistics who desire a more gentle introduction to the fundamental principles of statistical inference than is provided in the more theoretical STAT S620.