Data Analysis Courses - Page 51

The R Programming Environment
This course provides a rigorous introduction to the R programming language, with a particular focus on using R for software development in a data science setting. Whether you are part of a data science team or working individually within a community of developers, this course will give you the knowledge of R needed to make useful contributions in those settings. As the first course in the Specialization, the course provides the essential foundation of R needed for the following courses. We cover basic R concepts and language fundamentals, key concepts like tidy data and related "tidyverse" tools, processing and manipulation of complex and large datasets, handling textual data, and basic data science tasks. Upon completing this course, learners will have fluency at the R console and will be able to create tidy datasets from a wide range of possible data sources.
Employee Attrition Prediction Using Machine Learning
In this project-based course, we will build, train and test a machine learning model to predict employee attrition using features such as employee job satisfaction, distance from work, compensation and performance. We will explore two machine learning algorithms: (1) a logistic regression classifier and (2) Extreme Gradient Boosted Trees (XGBoost). This project could be effectively applied in any Human Resources department to predict which employees are more likely to quit based on their features. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
Introduction to the C# Type System
By the end of this project, you will use the C# Type System to represent data in a C# program. The C# type system is used to represent data of various types such as decimal, integer, character, and string in an efficient manner.
Performing Network, Path, and Text Analyses in SAS Visual Analytics
In this course, you learn about the data structure needed for network, path, and text analytics and how to create network analysis, path analysis, and text analytics in SAS Visual Analytics.
Hyperparameter Tuning with Neural Network Intelligence
In this 2-hour-long guided project, we will learn the basics of using Microsoft's Neural Network Intelligence (NNI) toolkit and will use it to run a hyperparameter tuning experiment on a neural network. NNI is an open source AutoML toolkit created by Microsoft which can help machine learning practitioners automate feature engineering, hyperparameter tuning, neural architecture search and model compression. Please note that we are going to learn to use the NNI toolkit for hyperparameter tuning, and are not going to implement the tuning algorithms ourselves. We will use the popular MNIST dataset and train a simple neural network to classify images of hand-written digits from the dataset. Once a basic script is in place, we will use the NNI toolkit to run a hyperparameter tuning experiment to find optimal values for batch size, learning rate, choice of activation function for the hidden layer, number of hidden units for the hidden layer, and dropout rate for the dropout layer. To be able to complete this project successfully, you should be familiar with the Python programming language. You should also be familiar with neural networks, TensorFlow and Keras. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
Business Intelligence and Competitive Analysis
By the end of 2019, it is clear that American Airlines (AAL), the world’s largest airline group and an S&P 500 company, is in trouble. With the growth rate of its stock price ranked at the bottom of all major US airlines and moving in the opposite direction from the S&P 500 index, AAL needs to find out what is going on and how to turn the company and its stock price around. Addressing the challenge faced by AAL may well be a large-scale management consulting project. To start, business intelligence and competitive analysis (or competitive intelligence for short) is required to discover the problems and opportunities for the company, which lay the foundation for turnaround strategies. In this course, you will gain the knowledge and skills to combine data, analytics models and visualization tools for effective and efficient competitive intelligence. Upon completion of the course, you should be able to conduct competitive intelligence on companies of your choice as a management consultant. Note: To gain hands-on experience in competitive intelligence via data analytics, you need to get into action. To this end, this course uses a free external website where you can practice the skills learned.
Exploratory vs Confirmatory data analysis using Python
This Guided Project, Exploratory and Confirmatory Data Analysis using Python, is for those who want to learn about different methods of data analysis. In this 2-hour-long project-based course, you will understand and apply Exploratory Data Analysis, build different data visualizations, apply different exploration techniques based on the data at hand, and define and understand the concept of Confirmatory Data Analysis. This project is unique because you will learn how and where to start your data exploration. You will also learn how to implement different data visualizations using Python and when to use them. To be successful in this project, you will need experience with the Python programming language and the Jupyter Notebook environment. Let's get started!
Meaningful Marketing Insights
With marketers poised to be the largest users of data within the organization, there is a need to make sense of the variety of consumer data that the organization collects. Surveys, transaction histories and billing records can all provide insight into consumers’ future behavior, provided that they are interpreted correctly. In Introduction to Marketing Analytics, we introduce the tools that learners will need to convert raw data into marketing insights. The included exercises are conducted using Microsoft Excel, ensuring that learners will have the tools they need to extract information from the data available to them. The course provides learners with exposure to essential tools including exploratory data analysis, as well as regression methods that can be used to investigate the impact of marketing activity on aggregate data (e.g., sales) and on individual-level choice data (e.g., brand choices). To successfully complete the assignments in this course, you will require Microsoft Excel. If you do not have Excel, you can download a free 30-day trial here: https://products.office.com/en-us/try
Explainable Machine Learning with LIME and H2O in R
Welcome to this hands-on, guided introduction to Explainable Machine Learning with LIME and H2O in R. By the end of this project, you will be able to use the LIME and H2O packages in R for automatic and interpretable machine learning, build classification models quickly with H2O AutoML, and explain and interpret model predictions using LIME. Machine learning (ML) models such as Random Forests, Gradient Boosted Machines, Neural Networks, Stacked Ensembles, etc., are often considered black boxes. However, they are more accurate for predicting non-linear phenomena due to their flexibility. Experts agree that higher accuracy often comes at the price of interpretability, which is critical to business adoption, trust, and regulatory oversight (e.g., GDPR, Right to Explanation, etc.). As more industries from healthcare to banking adopt ML models, their predictions are being used to justify the cost of healthcare and for loan approvals or denials. For regulated industries that use machine learning, interpretability is a requirement. As Finale Doshi-Velez and Been Kim put it, interpretability is "The ability to explain or to present in understandable terms to a human." To successfully complete the project, we recommend that you have prior experience with programming in R and basic machine learning theory, and that you have trained ML models in R. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
Measuring Total Data Quality
By the end of this second course in the Total Data Quality Specialization, learners will be able to: 1. Learn various metrics for evaluating Total Data Quality (TDQ) at each stage of the TDQ framework. 2. Create a quality concept map that tracks relevant aspects of TDQ from a particular application or data source. 3. Think through relative trade-offs between quality aspects, relative costs and practical constraints imposed by a particular project or study. 4. Identify relevant software and related tools for computing the various metrics. 5. Understand metrics that can be computed for both designed and found/organic data. 6. Apply the metrics to real data and interpret their resulting values from a TDQ perspective. This specialization as a whole aims to explore the Total Data Quality framework in depth and provide learners with more information about the detailed evaluation of total data quality that needs to happen prior to data analysis. The goal is for learners to incorporate evaluations of data quality into their process as a critical component for all projects. We sincerely hope to disseminate knowledge about total data quality to all learners, such as data scientists and quantitative analysts, who have not had sufficient training in the initial steps of the data science process that focus on data collection and evaluation of data quality. We feel that extensive knowledge of data science techniques and statistical analysis procedures will not help a quantitative research study if the data collected/gathered are not of sufficiently high quality. This specialization will focus on the essential first steps in any type of scientific investigation using data: either generating or gathering data, understanding where the data come from, evaluating the quality of the data, and taking steps to maximize the quality of the data prior to performing any kind of statistical analysis or applying data science techniques to answer research questions. 
Given this focus, there will be little material on the analysis of data, which is covered in myriad existing Coursera specializations. The primary focus of this specialization will be on understanding and maximizing data quality prior to analysis.