Data Science Courses 2018-2019

A list of Data Science online courses offered in current/upcoming semesters is available through the tabs below.

Follow this link to register for classes.

Note: Unless otherwise specified, all courses listed are worth 3 credit hours.

Computer Science

Applied Algorithms
Class: CSCI B505
Section: 13926 (online)
Syllabus: View document
Instructor: Funda Ergun
Synopsis: The course studies the design, implementation, and analysis of algorithms and data structures as applied to real-world problems. Topics include divide-and-conquer, optimization, and randomized algorithms applied to problems such as sorting, searching, and graph analysis. Students will learn about trees, hash tables, heaps, and graphs.

Elements of Artificial Intelligence
Class: CSCI B551
Section: 13924 (online)
Syllabus: View Document
Instructor: TBD
Synopsis: Introduction to major issues and approaches in artificial intelligence. Principles of reactive, goal-based, and utility-based agents. Problem-solving and search. Knowledge representation and design of representational vocabularies. Inference and theorem proving, reasoning under uncertainty, and planning. Overview of machine learning.



Data Science

Data Science in Practice
Class: DSCI D590
Section: 31460 (online)
Syllabus:
Instructor: David Wild
Synopsis: This course connects interested data science project sponsors with data science students, so that together both can accomplish something neither could achieve alone. The overarching goal for the course is for the students to experience the real-world work of Data Science and to complete short consulting/technical projects in small teams. This course is for anyone who applies their expertise to the demands of data-driven decision-making and analysis. This is a “learning by doing” course on the practice of delivering data science expertise.

Independent Study in Data Science
Class: DSCI D699
Section: 30784 (online)
Syllabus:
Instructor: Haixu Tang
Synopsis: Independent readings and research for M.S. students under the direction of a faculty member, culminating in a written report. This course requires permission from the Data Science Graduate Office.



Informatics

Security for Networked Systems
Class: INFO I520
Section: 12492 (online)
Syllabus: View Document
Instructor: Raquel Hill
Synopsis: This course is an extensive survey of system and network security. Course materials cover the threats to information confidentiality, integrity and availability and the defense mechanisms that control such threats. The course provides the foundation for more advanced security courses and hands-on experiences through course projects.

Big Data Applications and Analytics
Class: INFO I523
Section: 11858 (online)
Syllabus: View Document
Instructor: Gregor von Laszewski
Synopsis: The Big Data Applications and Analytics course is an overview course in Data Science and covers the applications and technologies (data analytics and clouds) needed to process the application data. It is organized around the rallying cry: Use Clouds running Data Analytics Collaboratively processing Big Data to solve problems in X-Informatics.

Organizational Informatics & Economics of Security
Class: INFO I525
Section: 13931 (online)
Syllabus: View Document
Instructor: Jean Camp
Synopsis: Security technologies make explicit organizational choices that allocate power. Security implementations allocate risk, determine authority, reify or alter relationships, and determine trust extended to organizational participants. The course begins with an introduction to relevant definitions (security, privacy, trust) and then moves to a series of timely case studies of security technologies.

Applied Machine Learning
Class: INFO I526
Section: 35973 (100% Online)
Syllabus: View Document
Credits: 3
Instructor: James "Jimi" Shanahan
Synopsis: The main aim of the course is to provide skills to apply machine learning algorithms on real applications. We will devote less time to learning algorithms and math/theory, and instead spend more time with hands-on skills required for algorithms to work on a variety of datasets.
Entrance Exam: Each prospective student will need to complete an entrance exam for this course. Your entrance exam performance will form an important component in determining your admittance to this course at this time. You may access the exam here.

Data Science OnRamp Basics
Class: INFO I590
Section: 36011 (online)
Syllabus: View Document
Instructor: Ying Ding
Synopsis: Data Science OnRamp Basics contains self-paced modules designed to build and enhance your data science skills within the Data Science program. Specifically, these basic modules cover competencies in web scraping, machine learning, NLP, and Tableau. Each module counts as one credit hour. Each time you enroll, you can select 1-3 credit hours, which means that you can select 1, 2, or 3 mini courses. You are allowed to enroll twice in this course, for a maximum of 6 credit hours total.

Data Science OnRamp Advanced
Class: INFO I590
Section: 36012 (online)
Syllabus: View Document
Instructor: Ying Ding
Synopsis: Data Science OnRamp Advanced contains self-paced modules to further enhance and refine your data science skills, which are oftentimes demanded or desired in data science related jobs. These advanced modules will introduce you to Spark, Scala, Hadoop, Deep Learning, and Kaggle Competitions. Each module counts as one credit hour. Each time you enroll, you can select 1-3 credit hours, which means that you can select 1, 2, or 3 mini courses. You are allowed to enroll twice in this course, for a maximum of 6 credit hours total.

Data Semantics
Class: INFO I590
Section: 12395 (online)
Syllabus: View Document
Instructor: Ying Ding
Synopsis: This course aims to provide a basic overview of the Semantic Web in particular, and data semantics in general, and how they can be applied to enhance data integration and knowledge inference. Ontology is the backbone of the Semantic Web. It models the semantics of data and represents them in markup languages proposed by the World Wide Web Consortium (W3C). W3C plays a significant role in directing major efforts at specifying, developing, and deploying standards for sharing information. Semantically enriched data pave the way for Web functionality and interoperability.

Data Visualization
Class: INFO I590
Section: 12488 (online)
Syllabus: View Document
Instructor: Yong-Yeol Ahn
Synopsis: From dashboards in a car to cutting-edge scientific papers, we extensively use visual representation of data. As our world becomes increasingly connected and digitized and as more decisions are being driven by data, data visualization is becoming a critical skill for every knowledge worker. In this course we will learn fundamentals of data visualization and create visualizations that can provide insights into complex datasets.

Python
Class: INFO I590
Section: 14197
Syllabus: View Document
Instructor: Vel Malbasa
Synopsis: This course provides a gentle yet intense introduction to programming with Python for students who have little or no prior experience in programming. Python, an open-source language that allows rapid application development of both large and small software systems, is object-oriented by design and provides an excellent platform for learning the basics of language programming. The course will focus on planning and organizing programs, and developing high-quality working software that solves real problems.



Information and Library Science

Search
Class: ILS Z534
Section: 12423 (online)
Syllabus: View Document
Instructor: Zheng Gao
Synopsis: The success of commercial search engines shows that Information Retrieval is a key in helping users find the information they seek. This course provides an introduction to information retrieval theories and concepts underlying all search applications. We investigate techniques used in modern search engines and demonstrate their significance by experiment.

Social Media Mining
Class: ILS Z639
Section: 33237 (online)
Syllabus:
Instructor: Ali Ghazinejad
Synopsis: This course provides a graduate-level introduction to social media mining and methods. The course provides hands-on experience mining social data for social meaning extraction (focus on sentiment analysis) using automated methods and machine learning technologies. We will read, discuss, and critique claims and findings from contemporary research related to SMM.



Intelligent Systems Engineering

Machine Learning for Signal Processing
Class: ENGR E511
Section: 35976 (online)
Syllabus: View Document
Instructor: Minje Kim
Synopsis: The course discusses advanced signal processing topics as an application of machine learning. Hands-on signal processing tasks are introduced and tackled in a problem-solving manner, so students can grasp important machine learning concepts. The course can help students learn to build an intelligent signal processing system in a systematic way. Students should be familiar with Calculus, Linear Algebra, Probability Theory, CSCI-B 555, and one of the scientific programming languages: MATLAB, Python, or R.

Engineering Cloud Computing
Class: ENGR E516
Section: 36046 (online)
Syllabus: View Document
Instructor: Gregor von Laszewski
Synopsis: This course covers basic concepts on programming models and tools of cloud computing to support data intensive science applications. Students will get to know the latest research topics of cloud platforms, parallel algorithms, storage and high-level language for proficiency with a complex ecosystem of tools that span many disciplines.

Intro to High Performance Computing
Class: ENGR E517
Section: 36095 (online)
Syllabus:
Instructor: Thomas Sterling
Synopsis: Students will learn about the development, operation, and application of HPC systems, preparing them to address future challenges demanding capability and expertise. The course combines critical elements from hardware technology and architecture, system software and tools, and programming models and application algorithms with the cross-cutting theme of performance management and measurement.



School of Public and Environmental Affairs

Statistical Analysis for Effective Decision-Making
Class: SPEA V506
Section: 9090 (online)
Syllabus: View document
Instructor: Tom Robovsky
Synopsis: An introduction to statistics. Nature of statistical data. Ordering and manipulation of data. Measures of central tendency and dispersion. Elementary probability. Concepts of statistical inference and decision: estimation and hypothesis testing. Special topics discussed may include regression and correlation, analysis of variance, nonparametric methods. This course will provide an introduction to the analysis of quantitative data via statistical analyses. Topics covered include, but are not limited to, descriptive statistics, z-scores, probability, z-tests, t-tests, correlation, regression. The focus is on the practical interpretation and application of statistics.



Statistics

Introduction to Statistics
Class: STAT S520
Section: 14918 (online)
Syllabus: View Document
Instructor: Jianyu Wang
Synopsis: This course introduces the basic concepts of statistical inference through a careful study of several important procedures. Topics include 1- and 2-sample location problems, the one-way analysis of variance, and simple linear regression. Most assignments involve applying probability models and/or statistical methods to practical situations and/or actual data sets. S320 is the basic version of this course, intended for undergraduates. It is the gateway to more advanced courses offered by the Department of Statistics. S520 is an expanded version of S320 that covers additional material (see syllabus below). S520 serves two constituencies: Graduate students in quantitative disciplines who are looking for a solid introduction to statistics and who may want to take additional courses in statistics, and graduate students pursuing an M.S. in Applied Statistics who desire a more gentle introduction to the fundamental principles of statistical inference than is provided in the more theoretical STAT S620.



Computer Science

Advanced Database Concepts
Class: CSCI B561
Section: 29367 (Online)
Instructor: Dirk Van Gucht
Syllabus: Please contact instructor.
Synopsis: Database models and systems: especially relational and object-oriented; relational database design theory; structures for efficient data access; query languages and processing; database applications development; views. Transaction management: concurrency and recovery. It is assumed that you can program, ideally in various programming styles: imperative, functional, and object-oriented.

Computer Vision
Class: CSCI B657
Section: 29368 (Online)
Instructor: David Crandall
Syllabus: Please contact instructor
Synopsis: This is an introductory course in computer vision. We will give a broad overview of the field, with a slight bias towards some topics to reflect current research trends (e.g. object recognition, deep learning). The emphasis will be on algorithms, mathematical models, and techniques that are broadly applicable to many problems not only in vision but also in other areas of AI and CS.



Data Science

Data Science in Practice
Class: DSCI D590
Section: 13057 (Online)
Instructor: David Wild
Syllabus
Synopsis: This course connects interested data science project sponsors with data science students, so that together both can accomplish something neither could achieve alone. The overarching goal for the course is for the students to experience the real-world work of Data Science and to complete short consulting/technical projects in small teams. This course is for anyone who applies their expertise to the demands of data-driven decision-making and analysis. This is a “learning by doing” course on the practice of delivering data science expertise.

Graduate Internship in Data Science
Class: DSCI D591
Section: 13059 (Online)
Instructor: Haixu Tang
Note: To enroll, please contact the Data Science Graduate Office for permission.
Synopsis: Students gain professional work experience in an industry or research organization setting, using skills and knowledge acquired in Informatics course work. May be repeated for a maximum of 6 credit hours. This course requires permission from the Data Science Graduate Office.

Independent Study in Data Science
Class: DSCI D699
Section: 13061 (Online)
Instructor: Haixu Tang
Note: To enroll, please contact the Data Science Graduate Office for permission.
Synopsis: Independent readings and research for M.S. students under the direction of a faculty member, culminating in a written report. This course requires permission from the Data Science Graduate Office.



Informatics

Applied Machine Learning
Class: INFO I526
Section: 33620 (100% Online)
Instructor: James "Jimi" Shanahan
Syllabus
Note: Each prospective student will need to complete an entrance exam for this course. Your entrance exam performance will form an important component in determining your admittance to this course at this time. You may access the exam here.
Synopsis: The main aim of the course is to provide skills to apply machine learning algorithms on real applications. We will devote less time to learning algorithms and math/theory, and instead spend more time with hands-on skills required for algorithms to work on a variety of datasets.

Systems and Protocol Security and Information Assurance
Class: INFO I533
Section: 11110 (Online)
Instructor:
Syllabus:
Synopsis: This course looks at systems and protocols, how to design threat models for them and how to use a large number of current security technologies and concepts to block specific vulnerabilities. Students will use a large number of systems and programming security tools in the laboratories.

Management, Access, and Use of Big and Complex Data
Class: INFO I535
Section: 33094
Instructor:
Syllabus
Synopsis: Data is abundant and its abundance offers potential for new discovery, and economic and social gain. But data can be difficult to use. It can be noisy and inadequately contextualized. The gap from data to knowledge can be too large, or limits in technology or policy can make the data hard to combine with other data. This course will examine the underlying principles and technologies needed to capture data, clean it, contextualize it, store it, access it, and trust it for a repurposed use. Specifically, the course will cover 1) the distributed systems and database concepts underlying NoSQL and graph databases, 2) best practices in data pipelines, 3) foundational concepts in metadata and provenance plus examples, and 4) developing theory in data trust and its role in reuse.

Advanced Data Science OnRamp
Class: INFO I590
Section: 29392 (Online)
Instructor: Ying Ding
Syllabus
Synopsis: This asynchronous course contains mini lessons with the goal to build and enhance your data science skills which are oftentimes demanded or desired in data science related jobs. Each mini lesson will be counted as one credit hour. Each time you enroll, you can select 1-3 credit hours, which means that you can select 1 or 2 or 3 mini lessons.

Applied Data Science
Class: INFO I590
Section: 9772 (Online)
Instructor: Olga Scrivner
Syllabus
Synopsis: The aim of the Applied Data Science course is to provide the skills needed to apply data science principles to real-world applications at every stage in the data science workflow. The course is organized around each stage, covering the algorithms, best practices, and evaluation criteria. Both good and bad application examples will be discussed to help the student develop an intuition and deeper understanding of the choice of algorithm for the data, the development of best practices, and methods for evaluating results of different approaches. Students will learn Tableau and use it to visually analyze and report data.

Basic Data Science OnRamp
Class: INFO I590
Section: 11233 (Online)
Instructor: Ying Ding
Syllabus
Synopsis: This asynchronous course contains mini lessons with the goal to build and enhance your data science skills which are oftentimes demanded or desired in data science related jobs. Each mini lesson will be counted as one credit hour. Each time you enroll, you can select 1-3 credit hours, which means that you can select 1 or 2 or 3 mini lessons.

Introduction to Business Analytics Modeling
Class: INFO I590
Section: 11234 (Online)
Instructor: Doug Blocher & Rex Cutshall
Syllabus
Note: This course has special term dates and does not follow the typical term schedule. Course will start Feb 18, 2019 and end May 9, 2019.
Synopsis: In this course, we develop analytical models using simulation and optimization to analyze and recommend sound solutions to complex business problems. Models are discussed to solve complex problems using various tools on spreadsheets, including Excel Solver for linear, integer, and genetic programming problems, probabilistic simulations, and risk analysis including statistical analysis of simulation models.

Python
Class: INFO I590
Section: 11293 (Online)
Instructor: Vel Malbasa
Syllabus
Synopsis: This course provides a gentle yet intense introduction to programming with Python for students who have little or no prior experience in programming. Python, an open-source language that allows rapid application development of both large and small software systems, is object-oriented by design and provides an excellent platform for learning the basics of language programming. The course will focus on planning and organizing programs, and developing high-quality working software that solves real problems.

SQL and NoSQL
Class: INFO I590
Section: 29709 (Online)
Instructor: Ying Ding
Syllabus
Synopsis: Databases are central to storing and managing data in data science. Relational databases have powered the main industries for decades and are still widely adopted. In the new era of big data, the database landscape is undergoing significant change. Many non-relational databases have become an important part of companies' enterprise data architecture. Relational databases were developed long before the Internet and the Web to tackle the issues of centrally controlled data storage and management. NoSQL databases emerged with the rise of Internet and Web applications to connect companies with customers (i.e., online or mobile) and to develop with agility to adapt to faster changes. The new challenges of being agile and being able to accommodate data variability/data integration drive enterprises to turn to NoSQL database technology. It is important for every data scientist to master current database skills and to know about the future of databases in a world of NoSQL. This course aims to provide a basic overview of the current database landscape, starting with relational databases and SQL, and moving to several different NoSQL databases, such as XML databases, MongoDB, Neo4j, Cassandra, and HBase.

Time Series Analysis
Class: INFO I590
Section: 31863 (Online)
Instructor: Olga Scrivner
Syllabus
Synopsis: Techniques for analyzing data collected at different points in time. Probability models, forecasting methods, analysis in both time and frequency domains, linear systems, state-space models, intervention analysis, transfer function models and the Kalman filter. Topics also include: stationary processes, autocorrelations, partial autocorrelations, autoregressive, moving average, and ARMA processes, spectral density of stationary processes, periodograms and estimation of spectral density.

Network Science
Class: INFO I606
Section: 12760
Instructor: Yong-Yeol Ahn
Syllabus
Synopsis: Networks are everywhere. We can easily find network structure in many complex systems around us: our cells, brains, society, etc. The inherent generality of the network approach has led to wide applications of network theory across diverse fields, including biology, sociology, and epidemiology.



Information and Library Science

Search
Class: ILS Z534
Section: 12990
Instructor: Zheng Gao
Syllabus
Synopsis: The success of commercial search engines shows that Information Retrieval is a key in helping users find the information they seek. This course provides an introduction to information retrieval theories and concepts underlying all search applications. We investigate techniques used in modern search engines and demonstrate their significance by experiment.



Intelligent Systems Engineering

Engineering Cloud Computing
Class: ENGR E516
Section: 13405 (Online)
Instructor: Gregor von Laszewski
Syllabus
Synopsis: This course covers basic concepts on programming models and tools of cloud computing to support data intensive science applications. Students will get to know the latest research topics of cloud platforms, parallel algorithms, storage and high-level language for proficiency with a complex ecosystem of tools that span many disciplines.

Deep Learning Systems
Class: ENGR E533
Section: 29391 (Online)
Instructor: Minje Kim
Syllabus: Please contact instructor
Synopsis: This course teaches the pipeline for building state-of-the-art deep learning-based intelligent systems. It covers general training mechanisms and acceleration options that use GPU computing libraries and parallelization techniques running on high performance computing systems. The course also aims at deploying the networks to low-powered hardware systems.

Information Visualization
Class: ENGR E583
Section: 13312 (Online)
Instructor: Katy Börner
Syllabus:
Synopsis: The visual representation of information requires a deep understanding of human perceptual and cognitive capabilities, computer graphics, interface and interaction design, as well as creativity. Data, such as log files reporting access of webpages or social network data, is typically non-spatial and needs to be mapped into a physical space that represents relationships contained in the information faithfully and efficiently. If done successfully, visualizations can provide a very intuitive and efficient "interface between two powerful information processing systems—the human mind and the modern computer" (Gershom et al., 1998). This course provides an overview of the state of the art in information visualization. It teaches the process of producing effective temporal, geospatial, topical, and network visualizations. Students get the chance to use tools such as Tableau, D3.js, OpenRefine, Gephi, and Plot.ly. Students have the opportunity to collaborate on real-world projects for a variety of clients.

Advanced Cloud Computing
Class: ENGR E616
Section: 33237
Instructor: Gregor von Laszewski
Syllabus: Please contact instructor
Synopsis: This course describes Cloud 3.0, in which DevOps, Microservices, and Function as a Service are added to basic cloud computing. The discussion is centered around the Apache Big Data Stack and a major student project aimed at demonstrating integration of cloud capabilities. Java and Python will be used as the programming languages. ENGR-E516 or an introduction to cloud computing is required.



School of Public and Environmental Affairs (SPEA)

Data Analysis and Modeling in Public Affairs
Class: SPEA P507
Section: 11065 (Online)
Instructor: Barry Rubin
Syllabus
Synopsis: Intermediate-level perspective on statistical concepts and techniques for analyzing and modeling complex systems via regression analysis. Includes estimating the parameters of such models based on existing data, testing hypotheses about these systems, forecasting, correcting for violations of assumptions, and dealing with commonly encountered problems such as near multicollinearity. Primarily focused on single equation regression models and the extension of these models to a variety of situations, but includes an introduction to simultaneous equation models. Application of these techniques to problems and policies in public and environmental affairs, as well as general social sciences.

Statistical Analysis for Effective Decision-Making
Class: SPEA V506
Section: 11066 (Online)
Instructor: Joanna Woronkowicz
Syllabus: Please contact instructor
Synopsis: An introduction to statistics. Nature of statistical data. Ordering and manipulation of data. Measures of central tendency and dispersion. Elementary probability. Concepts of statistical inference and decision: estimation and hypothesis testing. Special topics discussed may include regression and correlation, analysis of variance, nonparametric methods. This course will provide an introduction to the analysis of quantitative data via statistical analyses. Topics covered include, but are not limited to, descriptive statistics, z-scores, probability, z-tests, t-tests, correlation, regression. The focus is on the practical interpretation and application of statistics.



Statistics

Introduction to Statistics
Class: STAT S520
Section: 10859 (Online)
Instructor: Jianyu Wang
Syllabus
Synopsis: This course introduces the basic concepts of statistical inference through a careful study of several important procedures. Topics include 1- and 2-sample location problems, the one-way analysis of variance, and simple linear regression. Most assignments involve applying probability models and/or statistical methods to practical situations and/or actual data sets. S320 is the basic version of this course, intended for undergraduates. It is the gateway to more advanced courses offered by the Department of Statistics. S520 is an expanded version of S320 that covers additional material (see syllabus below). S520 serves two constituencies: Graduate students in quantitative disciplines who are looking for a solid introduction to statistics and who may want to take additional courses in statistics, and graduate students pursuing an M.S. in Applied Statistics who desire a more gentle introduction to the fundamental principles of statistical inference than is provided in the more theoretical STAT S620.

Introduction to Regression Models and Nonparametrics
Class: STAT S681
Section: 30753 (Online)
Instructor: Brad Luen
Syllabus
Synopsis: This course is a survey of statistical methods that do not rely on parametric assumptions. As such, it will review the parametric techniques learned in STAT-S 520 and similar introductory courses, and compare them to nonparametric alternatives to see when one technique outperforms another. The prerequisite to this course is STAT-S 520. You should already know how to calculate probabilities, using software or otherwise, for the fundamental probability distributions like the binomial and the normal. You should know the forms and interpretations of t-tests, confidence intervals, and the simple linear regression line. You should have some experience with R. If not, download R for free from cran.r-project.org, install it, and start playing around with it. The clearest introductory guide to actually doing statistics in R is John Verzani's simpleR at . The material up to chapter 7 (Simulations) will be immediately useful.


