Spring 2018 Term
Class: STAT S520 Introduction to Statistics
Instructor: Jianyu Wang
Synopsis: This course introduces the basic concepts of statistical inference through a careful study of several important procedures. Topics include 1- and 2-sample location problems, the one-way analysis of variance, and simple linear regression. Most assignments involve applying probability models and/or statistical methods to practical situations and/or actual datasets. S320 is the basic version of this course, intended for undergraduates. It is the gateway to more advanced courses offered by the Department of Statistics. S520 is an expanded version of S320 that covers additional material. S520 serves two constituencies: Graduate students in quantitative disciplines who are looking for a solid introduction to statistics and who may want to take additional courses in statistics, and graduate students pursuing an M.S. in Applied Statistics who desire a more gentle introduction to the fundamental principles of statistical inference than is provided in the more theoretical STAT S620.
Department: Information and Library Science
Class: ILS Z534 Search
Instructor: Xiaozhong Liu
Synopsis: The success of commercial search engines shows that Information Retrieval is key in helping users find the information they seek. This course provides an introduction to information retrieval theories and concepts underlying all search applications. We investigate techniques used in modern search engines and demonstrate their significance by experiment.
Class: ILS Z637 Information Visualization
Instructor: Katy Börner, Michael Ginda
Synopsis: Introduces information visualization, highlighting processes which produce effective visualizations. Topics include perceptual basis of information visualization, data analysis to extract relationships, and interaction techniques.
Class: INFO I524 Big Data Software and Projects
Instructor: Gregor Von Laszewski
Synopsis: This course studies the HPC-ABDS software used in either High Performance Computing or open source commercial Big Data cloud computing. The student builds analysis systems using this software on clouds and then uses them in a project either chosen by the student or selected from a list given by the instructor. Credit given for only one of INFO-I424 or I524.
Class: INFO I526 Applied Machine Learning
Instructor: Sriraam Natarajan
Synopsis: The aim of the course is to provide skills in applying machine learning algorithms to real applications. We will focus less on learning algorithms, math and theory, and instead spend more time on the hands-on skills required for algorithms to work on a variety of data sets.
Class: INFO I533 Systems & Protocol Security & Info Assurance
Instructor: Steve Myers
Synopsis: This course looks at systems and protocols, how to design threat models for them and how to use a large number of current security technologies and concepts to block specific vulnerabilities. Students will use numerous systems and programming security tools in the laboratories.
Class: INFO I590 Intro to Business Analytics Modeling
Instructor: Doug Blocher & Rex Cutshall
Synopsis: In this course, we develop analytical models using simulation and optimization to analyze and recommend sound solutions to complex business problems. We discuss models for solving sophisticated problems using various spreadsheet tools, including Excel Solver for linear, integer and genetic programming problems, probabilistic simulations, and risk analysis including statistical analysis of simulation models.
Class: INFO I590 Network Science
Instructor: Yong-Yeol Ahn
Synopsis: Networks are everywhere. We can easily find network structure in many complex systems around us: our cells, brains, society, etc. The inherent generality of the network approach has allowed wide applications of network theory to flourish across diverse fields including biology, sociology, and epidemiology. The questions that we will address in the class are the following: Why do networks matter? What are the fundamental theories for understanding the structure and dynamics of networks? How has network theory been applied to other fields? What are the frontiers of the research? We will explore key papers ranging from the fundamental theory to the various applications of network theory. This course will focus more on round-table discussion among students than on presentation. Students will work on research projects in groups and finish a paper at the end of the class.
Class: INFO I590 Perspectives in Data Science
Instructor: Kyle Stirling
Synopsis: This course will introduce multiple perspectives of the application of data science through recorded interviews with leaders in Silicon Valley companies, and map these to the practical skillsets of the data scientist.
Class: INFO I590 Practice in Data Science
Instructor: Kyle Stirling
Synopsis: This course is for anyone who applies their expertise to the demands of data-driven decision making and analysis. It is less a course on theory than on the practice of delivering Data Science expertise. Even if you don’t call yourself a consultant, when you provide your Data Science expertise you often do not have direct control over how it is used or implemented. This course teaches the skills required to leverage your expertise, deliver the most value, and get your expertise used, offering the tools you need to apply your Data Science skills during every stage of the consulting process. It describes and gives examples of Data Science consulting behavior that is effective in projects.
Class: INFO I590 Real World Data Science
Instructor: Joanne Luciano
Synopsis: The purpose of this course is to provide Data Science graduate students with practical experience applying their data science skill sets to real-world datasets. The first offering of this course in 2017 used a deidentified clinical trials dataset provided by Eli Lilly (agreement already in place with IU), but subsequent offerings could include public data or data provided by other industry partners. Students will be led through the full data analysis process of data preparation, model planning, model building, analysis, and communication of results. Students will meet (virtually or physically) daily to devise a plan.
Class: INFO I590 SQL and NOSQL
Instructor: Ying Ding
Synopsis: Databases are central to data science as the means of storing and managing data. Relational databases have empowered major industries for decades and are still widely adopted. In our new era of Big Data, the database landscape is undergoing significant change. Many non-relational databases have become an important part of the enterprise data architecture of companies. Relational databases were developed long before the Internet and the Web to tackle the issues of centrally controlled data storage and management. NoSQL databases emerged with the rise of Internet and Web applications to connect companies with customers (i.e., online or mobile) and to develop the agility to adapt to faster changes. The new challenges of being agile and being able to accommodate data variability and data integration drove enterprises to turn to NoSQL database technology. It is important for every data scientist to master the skills of current databases and know about the future of databases in a world of NoSQL. This course aims to provide a basic overview of the current database landscape, starting with relational databases and SQL, and moving to several different NoSQL databases, such as XML databases and MongoDB.
Department: School of Public and Environmental Affairs
Class: SPEA P507 Data Analysis and Modeling in Public Affairs
Synopsis: V507 provides students of public and environmental affairs and related disciplines with a detailed, intermediate-level perspective on statistical concepts and techniques for analyzing and modeling complex systems. The course content includes estimating the parameters of such models based on existing data, testing hypotheses about these systems, and forecasting. The context of the course is the application of these techniques to problems and policies in public and environmental affairs. Multivariate regression analysis is one of the primary tools for statistical modeling for purposes of policy analysis, program evaluation, simulation of systems, and general forecasting. Thus, most of the course is devoted to single equation regression models and the extension of these models to a variety of situations. A prerequisite for the class is a graduate-level, introductory statistics course that includes coverage of the simple (two-variable) regression model and an introduction to multivariate regression.