Probability And Statistics Courses - Page 4

Developing Data Products
A data product is the production output from a statistical analysis. Data products automate complex analysis tasks or use technology to expand the utility of a data informed model, algorithm or inference. This course covers the basics of creating data products using Shiny, R packages, and interactive graphics. The course will focus on the statistical fundamentals of creating a data product that can be used to tell a story about data to a mass audience.
Bayesian Statistics: Time Series Analysis
This course is for practicing and aspiring data scientists and statisticians. It is the fourth of a four-course sequence introducing the fundamentals of Bayesian statistics. It builds on the courses Bayesian Statistics: From Concept to Data Analysis, Techniques and Models, and Mixture Models. Time series analysis is concerned with modeling the dependency among elements of a sequence of temporally related variables. To succeed in this course, you should be familiar with calculus-based probability, the principles of maximum likelihood estimation, and Bayesian inference. You will learn how to build models that can describe temporal dependencies and how to perform Bayesian inference and forecasting for the models. You will apply what you've learned with the open-source, freely available software R with sample databases. Your instructor Raquel Prado will take you from basic concepts for modeling temporally dependent data to implementation of specific classes of models.
BigQuery Soccer Data Analysis
This is a self-paced lab that takes place in the Google Cloud console. In this lab you will learn fundamentals of sports data science by writing and executing queries against soccer data stored in BigQuery tables. The emphasis of the lab is to illustrate how the database works and to answer some interesting questions about soccer.
Statistics for Data Science with Python
This Statistics for Data Science course is designed to introduce you to the basic principles of statistical methods and procedures used for data analysis. After completing this course you will have practical knowledge of crucial topics in statistics, including data gathering, summarizing data using descriptive statistics, displaying and visualizing data, examining relationships between variables, probability distributions, expected values, hypothesis testing, introduction to ANOVA (analysis of variance), regression, and correlation analysis. You will take a hands-on approach to statistical analysis using Python and Jupyter Notebooks – the tools of choice for Data Scientists and Data Analysts. At the end of the course, you will complete a project that applies concepts from the course to a Data Science problem involving a real-life inspired scenario and demonstrates an understanding of foundational statistical thinking and reasoning. The focus is on developing a clear understanding of the different approaches for different data types, developing an intuitive understanding, making appropriate assessments of the proposed methods, using Python to analyze our data, and interpreting the output accurately. This course is suitable for a variety of professionals and students intending to start their journey in data and statistics-driven roles such as Data Scientists, Data Analysts, Business Analysts, Statisticians, and Researchers. It does not require any computer science or statistics background. We strongly recommend taking the Python for Data Science course before starting this course to get familiar with the Python programming language, Jupyter notebooks, and libraries. An optional refresher on Python is also provided. After completing this course, a learner will be able to:
✔ Calculate and apply measures of central tendency and measures of dispersion to grouped and ungrouped data.
✔ Summarize, present, and visualize data in a way that is clear, concise, and provides practical insight for non-statisticians needing the results.
✔ Identify appropriate hypothesis tests to use for common data sets.
✔ Conduct hypothesis tests, correlation tests, and regression analysis.
✔ Demonstrate proficiency in statistical analysis using Python and Jupyter Notebooks.
Follow a Machine Learning Workflow
Machine learning is not just a single task or even a small group of tasks; it is an entire process, one that practitioners must follow from beginning to end. It is this process—also called a workflow—that enables the organization to get the most useful results out of their machine learning technologies. No matter what form the final product or service takes, leveraging the workflow is key to the success of the business's AI solution. This second course within the Certified Artificial Intelligence Practitioner (CAIP) professional certificate explores each step along the machine learning workflow, from problem formulation all the way to model presentation and deployment. The overall workflow was introduced in the previous course, but now you'll take a deeper dive into each of the important tasks that make up the workflow, including two of the most hands-on tasks: data analysis and model training. You'll also learn about how machine learning tasks can be automated, ensuring that the workflow can recur as needed, like most important business processes. Ultimately, this course provides a practical framework upon which you'll build many more machine learning models in the remaining courses.
Data Science Fundamentals for Data Analysts
In this course we're going to guide you through the fundamental building blocks of data science, one of the fastest-growing fields in the world! With the help of our industry-leading data scientists, we’ve designed this course to build ready-to-apply data science skills in just 15 hours of learning. First, we’ll give you a quick introduction to data science - what it is and how it is used to solve real-world problems. For the rest of the course, we'll teach you the skills you need to apply foundational data science concepts and techniques to solve these real-world problems. By the end of this course, you'll be able to leverage your existing data analysis skills to design, execute, assess, and communicate the results of your very own data science projects.
Managing, Describing, and Analyzing Data
In this course, you will learn the basics of understanding the data you have and why correctly classifying data is the first step to making correct decisions. You will describe data both graphically and numerically using descriptive statistics and R software. You will learn four probability distributions commonly used in the analysis of data. You will analyze data sets using the appropriate probability distribution. Finally, you will learn the basics of sampling error, sampling distributions, and errors in decision-making. This course can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics. Learn more about the MS-DS program at https://www.coursera.org/degrees/master-of-science-data-science-boulder.
Introduction to Statistics & Data Analysis in Public Health
Welcome to Introduction to Statistics & Data Analysis in Public Health! This course will teach you the core building blocks of statistical analysis - types of variables, common distributions, hypothesis testing - but, more than that, it will enable you to take a data set you've never seen before, describe its key features, get to know its strengths and quirks, run some vital basic analyses and then formulate and test hypotheses based on means and proportions. You'll then have a solid grounding to move on to more sophisticated analysis and take the other courses in the series. You'll learn the popular, flexible and completely free software R, used by statistics and machine learning practitioners everywhere. It's hands-on, so you'll first learn about how to phrase a testable hypothesis via examples of medical research as reported by the media. Then you'll work through a data set on fruit and vegetable eating habits: data that are realistically messy, because that's what public health data sets are like in reality. There will be mini-quizzes with feedback along the way to check your understanding. The course will sharpen your ability to think critically and not take things for granted: in this age of uncontrolled algorithms and fake news, these skills are more important than ever. Prerequisites: Some formulae are given to aid understanding, but this is not one of those courses where you need a mathematics degree to follow it. You will need only basic numeracy (for example, we will not use calculus) and familiarity with graphical and tabular ways of presenting results. No knowledge of R or programming is assumed.
Mastering Data Analysis with Pandas: Learning Path Part 4
In this structured series of hands-on guided projects, we will master the fundamentals of data analysis and manipulation with Pandas and Python. Pandas is a super powerful, fast, flexible and easy to use open-source data analysis and manipulation tool. This guided project is the fourth of a series of multiple guided projects (learning path) that is designed for anyone who wants to master data analysis with pandas. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
Importing Data in the Tidyverse
Getting data into your statistical analysis system can be one of the most challenging parts of any data science project. Data must be imported and harmonized into a coherent format before any insights can be obtained. You will learn how to get data into R from commonly used formats and how to harmonize different kinds of datasets from different sources. If you work in an organization where different departments collect data using different systems and different storage formats, then this course will provide essential tools for bringing those datasets together and making sense of the wealth of information in your organization. This course introduces the Tidyverse tools for importing data into R so that it can be prepared for analysis, visualization, and modeling. Common data formats are introduced, including delimited files, spreadsheets and relational databases, and techniques for obtaining data from the web are demonstrated, such as web scraping and web APIs. In this specialization we assume familiarity with the R programming language. If you are not yet familiar with R, we suggest you first complete R Programming before returning to complete this course.