Data Analysis Courses - Page 13

Support Vector Machines in Python, From Start to Finish
In this lesson we will build a Support Vector Machine for classification using scikit-learn and the Radial Basis Function (RBF) kernel. Our training data set, from the UCI Machine Learning Repository, contains continuous and categorical data and is used to predict whether or not a patient has heart disease. This course runs on Coursera's hands-on project platform called Rhyme. On Rhyme, you do projects in a hands-on manner in your browser. You will get instant access to pre-configured cloud desktops containing all of the software and data you need for the project. Everything is already set up directly in your Internet browser so you can just focus on learning. For this project, you’ll get instant access to a cloud desktop with software such as Python, Jupyter, and TensorFlow pre-installed. Prerequisites: In order to be successful in this project, you should be familiar with programming in Python and the concepts behind Support Vector Machines, the Radial Basis Function, Regularization, Cross Validation and Confusion Matrices. Notes: - You will be able to access the cloud desktop 5 times. However, you will be able to access the instruction videos as many times as you want. - This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
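For readers unfamiliar with the Radial Basis Function named in the prerequisites, here is a minimal sketch of the kernel itself, written in plain Python rather than scikit-learn. The function name `rbf_kernel`, the `gamma` value, and the sample points are assumptions chosen for illustration; they are not taken from the course.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Return K(x, y) = exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical feature vectors: similarity decays as points move apart.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0])   # identical points
near = rbf_kernel([1.0, 2.0], [1.5, 2.0])   # squared distance 0.25
far = rbf_kernel([1.0, 2.0], [4.0, 6.0])    # squared distance 25.0

print(same, near, far)
```

The kernel equals 1 for identical points and shrinks toward 0 with distance; a larger `gamma` makes that decay faster.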
ANOVA and Experimental Design
This second course in statistical modeling will introduce students to the study of the analysis of variance (ANOVA), analysis of covariance (ANCOVA), and experimental design. ANOVA and ANCOVA, presented as a type of linear regression model, will provide the mathematical basis for designing experiments for data science applications. Emphasis will be placed on important design-related concepts, such as randomization, blocking, factorial design, and causality. Some attention will also be given to ethical issues raised in experimentation. This course can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics. Learn more about the MS-DS program at https://www.coursera.org/degrees/master-of-science-data-science-boulder. Logo adapted from photo by Vincent Ledvina on Unsplash
Exploratory Data Analysis in R
In this 1-hour-long project-based course, you will learn how to do basic exploratory data analysis (EDA) in R, automate your EDA reports, and learn advanced EDA tips. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
Hypothesis Testing in R
Welcome to this project-based course, Hypothesis Testing in R. In this project, you will learn how to perform extensive hypothesis tests for one and two samples in R. By the end of this 2-hour-long project, you will understand the rationale behind performing hypothesis testing. You will also learn how to perform hypothesis tests for proportions and means and, by extension, for the means of matched or paired samples in R. Note that you do not need to be a statistical analyst or data scientist to be successful in this guided project; familiarity with basic statistics and with R suffices. If you are not familiar with R and want to learn the basics, start with my previous guided projects titled "Getting Started with R" and "Calculating Descriptive Statistics in R". A fundamental prerequisite is a good understanding of the theory of hypothesis testing.
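As a rough illustration of a hypothesis test for the means of paired samples, the sketch below computes the paired t statistic, the mean of the differences divided by its standard error. It is written in Python, not the R used in the course, and the function name `paired_t_statistic` and the measurements are hypothetical.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for paired samples of equal length."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample standard deviation
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1


# Hypothetical measurements on the same four subjects.
before = [10, 12, 9, 11]
after = [12, 14, 10, 13]

t_stat, dof = paired_t_statistic(before, after)
print(t_stat, dof)
```

The statistic would then be compared against a t distribution with `dof` degrees of freedom to obtain a p-value.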
Tidy Messy Data using tidyr in R
As data enthusiasts and professionals, our work often requires dealing with data in different forms. In particular, messy data can be a big challenge because the quality of your analysis largely depends on the quality of the data. This project-based course, "Tidy Messy Data using tidyr in R," is intended for beginner and intermediate R users with related experience who want to advance their knowledge and skills. In this course, you will learn practical ways to clean, reshape, and transform data using R. You will learn how to use different tidyr functions like pivot_longer(), pivot_wider(), separate_rows(), separate(), and others to achieve the tidy data principles. By the end of this 2-hour-long project, you will have hands-on experience massaging data into the proper format. By extension, you will learn to create plots using ggplot(). This is a beginner-to-intermediate-level course in R. Therefore, to get the most out of this project, it is essential to have a basic understanding of R. Specifically, you should be able to load data into R and understand how the pipe operator works. It will be helpful to complete my previous project titled "Data Manipulation with dplyr in R."
Calculus through Data & Modeling: Applying Differentiation
As rates of change, derivatives give us information about the shape of a graph. In this course, we will apply the derivative to find linear approximations for single-variable and multi-variable functions. This gives us a straightforward way to estimate functions that may be complicated or difficult to evaluate. We will also use the derivative to locate the maximum and minimum values of a function. These optimization techniques are important for all fields, including the natural sciences and data analysis. The topics in this course lend themselves to many real-world applications, such as machine learning, minimizing costs or maximizing profits.
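A linear approximation estimates a function near a point a with its tangent line, f(x) ≈ f(a) + f'(a)(x - a). The sketch below applies this to the square root near a = 4; the function name `linear_approximation` and the chosen numbers are assumptions for illustration, not course material.

```python
import math


def linear_approximation(f_a, df_a, a, x):
    """Tangent-line estimate of f(x) from the value f(a) and slope f'(a)."""
    return f_a + df_a * (x - a)


a = 4.0
f_a = math.sqrt(a)                 # f(a) = 2
df_a = 1.0 / (2.0 * math.sqrt(a))  # f'(a) = 1 / (2 * sqrt(a)) = 0.25

estimate = linear_approximation(f_a, df_a, a, 4.1)
exact = math.sqrt(4.1)

print(estimate, exact, abs(estimate - exact))
```

The estimate is close because 4.1 is near 4; the error grows as x moves away from a.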
Databases and SQL for Data Science with Python
Working knowledge of SQL (or Structured Query Language) is a must for data professionals like Data Scientists, Data Analysts and Data Engineers. Much of the world's data resides in databases. SQL is a powerful language used for communicating with and extracting data from databases. In this course you will learn SQL inside out, from the very basics of SELECT statements to advanced concepts like JOINs. You will:
- write foundational SQL statements like SELECT, INSERT, UPDATE, and DELETE
- filter result sets, use WHERE, COUNT, DISTINCT, and LIMIT clauses
- differentiate between DML & DDL
- CREATE, ALTER, DROP and load tables
- use string patterns and ranges; ORDER and GROUP result sets, and built-in database functions
- build sub-queries and query data from multiple tables
- access databases as a data scientist using Jupyter notebooks with SQL and Python
- work with advanced concepts like Stored Procedures, Views, ACID Transactions, Inner & Outer JOINs
Through hands-on labs and projects, you will practice building SQL queries, work with real databases on the Cloud, and use real data science tools. In the final project you’ll analyze multiple real-world datasets to demonstrate your skills.
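The foundational statements listed in this description can be tried from Python with a few lines. This is a minimal sketch that assumes the standard-library `sqlite3` module and an in-memory database, which may differ from the database used in the course; the `employees` table and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define the table structure.
cur.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)

# DML: INSERT, UPDATE, and DELETE change the rows.
cur.executemany(
    "INSERT INTO employees (name, dept, salary) VALUES (?, ?, ?)",
    [("Ana", "Data", 100.0), ("Ben", "Data", 90.0),
     ("Cy", "Ops", 80.0), ("Dan", "Ops", 70.0)],
)
cur.execute("UPDATE employees SET salary = salary + 5 WHERE dept = ?", ("Data",))
cur.execute("DELETE FROM employees WHERE name = ?", ("Dan",))
conn.commit()

# Queries: DISTINCT, COUNT with GROUP BY, and ORDER BY with LIMIT.
distinct_depts = [
    row[0]
    for row in cur.execute("SELECT DISTINCT dept FROM employees ORDER BY dept")
]
dept_counts = cur.execute(
    "SELECT dept, COUNT(*) FROM employees GROUP BY dept ORDER BY dept"
).fetchall()
top_earner = cur.execute(
    "SELECT name, salary FROM employees ORDER BY salary DESC LIMIT 1"
).fetchone()

conn.close()
print(distinct_depts, dept_counts, top_earner)
```

The `?` placeholders pass values separately from the SQL text, which avoids building queries by string concatenation.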
Data Analyst Career Guide and Interview Preparation
This course is designed to prepare you to enter the job market as a data analyst. It provides guidance about the regular functions and tasks of data analysts and their place in the data ecosystem, as well as the opportunities of the profession and some options for career development. It explains practical techniques for creating essential job-seeking materials such as a resume and a portfolio, as well as auxiliary tools like a cover letter and an elevator pitch. You will learn how to find and assess prospective job positions, apply to them, and lay the groundwork for interviewing. You will also get inside tips and steps you can use to perform professionally and effectively at interviews. Let seasoned professionals share their experience to help you get ahead of the competition.
Data Analysis with Python
Analyzing data with Python is an essential skill for Data Scientists and Data Analysts. This course will take you from the basics of data analysis with Python to building and evaluating data models. Topics covered include:
- collecting and importing data
- cleaning, preparing & formatting data
- data frame manipulation
- summarizing data
- building machine learning regression models
- model refinement
- creating data pipelines
You will learn how to import data from multiple sources, clean and wrangle data, perform exploratory data analysis (EDA), and create meaningful data visualizations. You will then predict future trends from data by developing linear, multiple, and polynomial regression models and pipelines, and learn how to evaluate them. In addition to video lectures, you will learn and practice using hands-on labs and projects. You will work with several open source Python libraries, including Pandas and NumPy, to load, manipulate, analyze, and visualize cool datasets. You will also work with SciPy and scikit-learn to build machine learning models and make predictions. If you choose to take this course and earn the Coursera course certificate, you will also earn an IBM digital badge.
Excel Time Series Models for Business Forecasting
This course explores different time series business forecasting methods. The course covers a variety of business forecasting methods for different types of components present in time series data — level, trending, and seasonal. We will learn about the theoretical methods and apply these methods to business data using Microsoft Excel. These forecasting methods will be programmed into Microsoft Excel, displayed graphically, and we will optimise these models to produce accurate forecasts. We will compare different models and their forecasts to decide which model best suits our business' needs.
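One common method for the level component is simple exponential smoothing, where each new level is a weighted average of the latest observation and the previous level. The sketch below shows the idea in Python rather than Excel; whether the course uses exactly this method is an assumption, and the function name, the `alpha` value, and the sales figures are hypothetical.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    level_t = alpha * y_t + (1 - alpha) * level_(t-1), starting from the
    first observation. The last level serves as the next-period forecast.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    levels = [series[0]]
    for observation in series[1:]:
        levels.append(alpha * observation + (1 - alpha) * levels[-1])
    return levels


# Hypothetical monthly sales figures.
sales = [10.0, 12.0, 11.0, 13.0]
levels = simple_exponential_smoothing(sales, alpha=0.5)
next_forecast = levels[-1]

print(levels, next_forecast)
```

Choosing `alpha` is the optimisation step: values near 1 follow recent data closely, while values near 0 smooth more heavily.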