
Data Analysis Courses - Page 14

ANOVA and Experimental Design
This second course in statistical modeling will introduce students to the study of the analysis of variance (ANOVA), analysis of covariance (ANCOVA), and experimental design. ANOVA and ANCOVA, presented as a type of linear regression model, will provide the mathematical basis for designing experiments for data science applications. Emphasis will be placed on important design-related concepts, such as randomization, blocking, factorial design, and causality. Some attention will also be given to ethical issues raised in experimentation. This course can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics. Learn more about the MS-DS program at https://www.coursera.org/degrees/master-of-science-data-science-boulder.
Exploratory Data Analysis in R
In this 1-hour, project-based course, you will learn how to do basic exploratory data analysis (EDA) in R, automate your EDA reports, and pick up advanced EDA tips. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
Hypothesis Testing in R
Welcome to the project-based course Hypothesis Testing in R. In this project, you will learn how to perform extensive hypothesis tests for one and two samples in R. By the end of this 2-hour project, you will understand the rationale behind hypothesis testing. You will also learn how to perform hypothesis tests for proportions and means and, by extension, for the means of matched or paired samples in R. Note that you do not need to be a statistical analyst or data scientist to succeed in this guided project; familiarity with basic statistics and with R is sufficient. If you are not familiar with R and want to learn the basics, start with my previous guided projects, “Getting Started with R” and “Calculating Descriptive Statistics in R”. A fundamental prerequisite is a good understanding of the theory of hypothesis testing.
Calculus through Data & Modeling: Applying Differentiation
As rates of change, derivatives give us information about the shape of a graph. In this course, we will apply the derivative to find linear approximations for single-variable and multi-variable functions. This gives us a straightforward way to estimate functions that may be complicated or difficult to evaluate. We will also use the derivative to locate the maximum and minimum values of a function. These optimization techniques are important for all fields, including the natural sciences and data analysis. The topics in this course lend themselves to many real-world applications, such as machine learning, minimizing costs or maximizing profits.
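As a rough illustration of the linear approximation idea described above (not taken from the course materials), the tangent-line formula L(x) = f(a) + f'(a)(x - a) can be sketched in a few lines of Python. The function name `linear_approximation` and the choice of the square root as the example function are assumptions made for this sketch.

```python
import math


def linear_approximation(f, df, a):
    """Return L(x) = f(a) + f'(a) * (x - a), the tangent-line approximation of f at a.

    `f` is the function, `df` its derivative, and `a` the point of tangency.
    (Hypothetical helper, written only for this illustration.)
    """
    fa, slope = f(a), df(a)
    return lambda x: fa + slope * (x - a)


# Approximate sqrt(x) near a = 4, where f(4) = 2 and f'(4) = 1/4.
approx_sqrt = linear_approximation(math.sqrt, lambda x: 0.5 / math.sqrt(x), 4.0)

estimate = approx_sqrt(4.1)              # 2 + 0.25 * 0.1
exact = math.sqrt(4.1)
error = abs(estimate - exact)
print(f"estimate={estimate:.4f}, exact={exact:.4f}, error={error:.6f}")
```

The estimate stays close to the true value only near the point of tangency; the error grows as x moves away from a.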
Databases and SQL for Data Science with Python
Working knowledge of SQL (or Structured Query Language) is a must for data professionals like Data Scientists, Data Analysts and Data Engineers. Much of the world's data resides in databases. SQL is a powerful language used for communicating with and extracting data from databases. In this course you will learn SQL inside out, from the very basics of SELECT statements to advanced concepts like JOINs. You will:
- write foundational SQL statements like SELECT, INSERT, UPDATE, and DELETE
- filter result sets and use WHERE, COUNT, DISTINCT, and LIMIT clauses
- differentiate between DML and DDL
- CREATE, ALTER, DROP and load tables
- use string patterns and ranges, ORDER and GROUP result sets, and use built-in database functions
- build sub-queries and query data from multiple tables
- access databases as a data scientist using Jupyter notebooks with SQL and Python
- work with advanced concepts like Stored Procedures, Views, ACID Transactions, and Inner and Outer JOINs
Through hands-on labs and projects, you will practice building SQL queries, work with real databases on the Cloud, and use real data science tools. In the final project you’ll analyze multiple real-world datasets to demonstrate your skills.
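To give a feel for the kind of SQL-from-Python work listed above, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The course itself works with cloud databases and Jupyter notebooks; the `departments` and `employees` tables and their rows are invented purely for this example.

```python
import sqlite3

# In-memory database, so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define two hypothetical tables.
cur.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT, salary REAL, "
    "dept_id INTEGER REFERENCES departments(id))"
)

# DML: load made-up rows using parameterized INSERT statements.
cur.executemany(
    "INSERT INTO departments VALUES (?, ?)",
    [(1, "Analytics"), (2, "Engineering")],
)
cur.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Ana", 85000, 1), (2, "Ben", 92000, 2),
     (3, "Chen", 78000, 1), (4, "Dee", 60000, None)],
)
conn.commit()

# INNER JOIN with GROUP BY and ORDER BY: headcount per department.
# Employees without a department are dropped by the inner join.
headcount = cur.execute(
    """
    SELECT d.name, COUNT(*) AS n
    FROM employees AS e
    INNER JOIN departments AS d ON e.dept_id = d.id
    GROUP BY d.name
    ORDER BY n DESC
    """
).fetchall()

# Sub-query: employees earning more than the overall average salary.
above_avg = cur.execute(
    """
    SELECT name FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY name
    """
).fetchall()

print(headcount)
print(above_avg)
conn.close()
```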
Bayesian Inference with MCMC
The objective of this course is to introduce Markov Chain Monte Carlo methods for Bayesian modeling and inference. Attendees will start off by learning the basics of Monte Carlo methods. This will be augmented by hands-on examples in Python that illustrate how these algorithms work. This is the second course in a specialization of three courses. Python and Jupyter notebooks will be used throughout this course to illustrate and perform Bayesian modeling with PyMC3. The course website is located at https://sjster.github.io/introduction_to_computational_statistics/docs/index.html. The course notebooks can be downloaded from this website by following the instructions at https://sjster.github.io/introduction_to_computational_statistics/docs/getting_started.html. The instructor for this course is Dr. Srijith Rajamohan.
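For readers new to the topic, the following is a minimal random-walk Metropolis sampler written with the Python standard library only. It is a generic sketch of one MCMC algorithm, not the course's code and not PyMC3; the function name `metropolis`, the standard normal target, the step size, and the burn-in length are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def metropolis(log_density, start, n_samples, step=1.0, seed=0):
    """Draw n_samples from an unnormalized density using random-walk Metropolis.

    `log_density` returns the log of the target density up to a constant.
    (Hypothetical helper for illustration; parameter names are assumptions.)
    """
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        # Propose a move by adding symmetric Gaussian noise to the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        samples.append(x)  # a rejected proposal repeats the current state
    return samples


# Target: standard normal, whose log density is -x^2 / 2 up to a constant.
samples = metropolis(lambda x: -0.5 * x * x, start=0.0, n_samples=20000)

kept = samples[1000:]  # discard an assumed burn-in of 1000 draws
mean = statistics.fmean(kept)
var = statistics.pvariance(kept)
print(f"mean={mean:.3f}, variance={var:.3f}")
```

With enough draws, the sample mean and variance should approach 0 and 1, the moments of the target distribution. Successive draws are correlated, so the estimates converge more slowly than they would with independent samples.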
BigQuery Machine Learning using Soccer Data
This is a self-paced lab that takes place in the Google Cloud console. Learn how to use BigQuery ML with soccer shot data to create and use an expected goals model.
Amazon Echo Reviews Sentiment Analysis Using NLP
In this hands-on project, we will predict customer sentiment using natural language processing techniques. We will build a machine learning model that analyzes thousands of Amazon Echo reviews to predict customer sentiment. Artificial Intelligence and Machine Learning (AI/ML)-based sentiment analysis is crucial for companies that want to automatically predict whether their customers are happy or not. This project is practical and directly applicable to any company that has an online presence. The algorithm could be used to automatically detect customer sentiment. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
Introduction to Distributions in R
This project is aimed at beginners who have a basic familiarity with the statistical programming language R and the RStudio environment, or people with a small amount of experience who would like to review the fundamentals of generating random numerical data from distributions in R.
Publication-Ready Tables in R
Learn how to create publication-ready tables in R for descriptive statistics, contingency tables, correlation tables, model summary tables, and survival probability tables.