Data Analysis Courses - Page 2

Dealing With Missing Data
This course will cover the steps used in weighting sample surveys, including methods for adjusting for nonresponse and using data external to the survey for calibration. Among the techniques discussed are adjustments using estimated response propensities, poststratification, raking, and general regression estimation. Alternative techniques for imputing values for missing items will be discussed. For both weighting and imputation, the capabilities of different statistical software packages will be covered, including R®, Stata®, and SAS®.
Social Network Analysis
This course is designed to quite literally ‘make a science’ out of something at the heart of society: social networks. Humans are natural network scientists: we compute new network configurations all the time, almost unaware, when thinking about friends and family (which are particular forms of social networks), about colleagues and organizational relations (other, overlapping network structures), and about how to navigate delicate or opportunistic network configurations to safeguard or advance our social standing (with society being one big social network itself). While such network structures have always existed, computational social science has helped to reveal and study them more systematically. In the first part of the course, we focus on network structure. This part looks at static snapshots of networks, which can be intricate and reveal important aspects of social systems. In our hands-on lab, you will also visualize and analyze a network with software yourself, which will help you appreciate the complexity social networks can take on. During the second part of the course, we will look at how networks evolve over time. We ask how we can predict what kind of network will form, and whether and how we could influence network dynamics.
Introduction to EDA in R
Welcome to this project-based course, Introduction to EDA in R. In this project, you will learn how to perform extensive exploratory data analysis on both quantitative and qualitative variables using basic R functions. By the end of this 2-hour project, you will understand how to create different basic plots in R, including plots for categorical variables and for numeric (quantitative) variables. You will also learn how to plot three variables and save your plot as an image in R. Note that you do not need to be a data scientist to be successful in this guided project; familiarity with basic statistics and with using R suffices. If you are not familiar with R and want to learn the basics, start with my previous guided projects titled “Getting Started with R” and “Calculating Descriptive Statistics in R”.
Line Balancing With MILP Optimization In RStudio
By the end of this project, you will learn to use the R package lpSolveAPI. You will learn to: formulate the line balancing problem and determine the objective function; apply constraints on the assignment of tasks to stations; apply the sum-of-durations constraints on tasks; apply task precedence relationship constraints; and run the optimiser, then obtain and analyse the solution.
Visualization for Statistical Analysis
In this project, you will learn about several visualization techniques and their importance for statistical analysis. The project demonstrates different plotting techniques, for example histograms, scatter plots, box-and-whisker plots, violin plots, bar plots, adding a regression line to a scatter plot, and creating a matrix of multiple plots. It also discusses the suitability of each plot according to the data type of the variables and illustrates multiple ways to achieve the desired plots efficiently. The project uses the 'Palmer Penguins' data set for illustration.
Introduction to Microsoft Excel
By the end of this project, you will learn how to create an Excel spreadsheet by using a free version of Microsoft Office Excel. Excel is a spreadsheet that works like a database. It consists of individual cells that can be used to build functions, formulas, tables, and graphs that easily organize and analyze large amounts of information and data. Excel is organized into rows (represented by numbers) and columns (represented by letters) that contain your information. This format allows you to present large amounts of information and data in a concise and easy-to-follow format. Microsoft Excel is the most widely used software within the business community. Whether they are bankers, accountants, business analysts, marketing professionals, scientists, or entrepreneurs, almost all professionals use Excel on a consistent basis. You will learn what an Excel spreadsheet is, why we use it, and the most important keyboard shortcuts, functions, and basic formulas.
Data Analysis Tools
In this course, you will develop and test hypotheses about your data. You will learn a variety of statistical tests, as well as strategies to know how to apply the appropriate one to your specific data and question. Using your choice of two powerful statistical software packages (SAS or Python), you will explore ANOVA, Chi-Square, and Pearson correlation analysis. This course will guide you through basic statistical principles to give you the tools to answer questions you have developed. Throughout the course, you will share your progress with others to gain valuable feedback and provide insight to other learners about their work.
Diabetic Retinopathy Detection with Artificial Intelligence
In this project, we will train a deep neural network model based on Convolutional Neural Networks (CNNs) and Residual Blocks to detect the type of Diabetic Retinopathy from images. Diabetic Retinopathy is the leading cause of blindness in the working-age population of the developed world and is estimated to affect over 347 million people worldwide. Diabetic Retinopathy is a disease that results from complications of type 1 and type 2 diabetes and can develop if blood sugar levels are left uncontrolled for a prolonged period of time. With the power of Artificial Intelligence and Deep Learning, doctors will be able to detect blindness before it occurs.
Materials Data Sciences and Informatics
This course aims to provide a succinct overview of the emerging discipline of Materials Informatics at the intersection of materials science, computational science, and information science. Attention is drawn to specific opportunities afforded by this new field in accelerating materials development and deployment efforts. A particular emphasis is placed on materials exhibiting hierarchical internal structures spanning multiple length/structure scales and the impediments involved in establishing invertible process-structure-property (PSP) linkages for these materials. More specifically, it is argued that modern data sciences (including advanced statistics, dimensionality reduction, and formulation of metamodels) and innovative cyberinfrastructure tools (including integration platforms, databases, and customized tools for enhancement of collaborations among cross-disciplinary team members) are likely to play a critical and pivotal role in addressing the above challenges.