Data Analysis Courses - Page 71

Introduction to Machine Learning in Sports Analytics
In this course students will explore supervised machine learning techniques using the Python scikit-learn (sklearn) toolkit and real-world athletic data to understand both machine learning algorithms and how to predict athletic outcomes. Building on the previous courses in the specialization, students will apply methods such as support vector machines (SVM), decision trees, random forests, linear and logistic regression, and ensembles of learners to examine data from professional sports leagues such as the NHL and MLB as well as wearable devices such as the Apple Watch and inertial measurement units (IMUs). By the end of the course students will have a broad understanding of how classification and regression techniques can be used to enable sports analytics across athletic activities and events.
How to Create a Program Evaluation for Your Non-Profit
In this 1.5-hour project-based course, you will learn how to create a program evaluation plan for your non-profit. By the end of the course, you will understand the importance of program evaluation, how to use Logic Models, how to write SMART goals for your program, and how to formulate good questions to gather the data you need. Learning Objectives: Task 1: Understand why evaluation is valuable for your program or organization. Task 2: Learn how to identify and define the typical components of a Logic Model. Task 3: Write SMART goals for a program outcome. Task 4: Identify different measurement tools and how they will be helpful in the evaluation process. Task 5: Understand what makes a good question, identify different types of questions, and practice writing good questions. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
AI Workflow: Machine Learning, Visual Recognition and NLP
This is the fourth course in the IBM AI Enterprise Workflow Certification specialization. You are STRONGLY encouraged to complete these courses in order, as they are not independent courses but part of a workflow where each course builds on the previous ones. Course 4 covers the next stage of the workflow, setting up models and their associated data pipelines for a hypothetical streaming media company. The first topic is the complex subject of evaluation metrics, where you will learn best practices for a number of different metrics including regression metrics, classification metrics, and multi-class metrics, which you will use to select the best model for your business challenge. The next topics cover best practices for different types of models including linear models, tree-based models, and neural networks. Out-of-the-box Watson models for natural language understanding and visual recognition will be used. There will be case studies focusing on natural language processing and on image analysis to provide realistic context for the model pipelines. By the end of this course you will be able to: discuss common regression, classification, and multilabel classification metrics; explain the use of linear and logistic regression in supervised learning applications; describe common strategies for grid searching and cross-validation; employ evaluation metrics to select models for production use; explain the use of tree-based algorithms in supervised learning applications; explain the use of neural networks in supervised learning applications; discuss the major variants of neural networks and recent advances; create a neural net model in TensorFlow; create and test an instance of Watson Visual Recognition; create and test an instance of Watson NLU. Who should take this course?
This course targets existing data science practitioners who have expertise building machine learning models and want to deepen their skills in building and deploying AI in large enterprises. If you are an aspiring Data Scientist, this course is NOT for you, as you need real-world expertise to benefit from the content of these courses. What skills should you have? It is assumed that you have completed Courses 1 through 3 of the IBM AI Enterprise Workflow specialization and that you have a solid understanding of the following topics prior to starting this course: fundamentals of linear algebra; sampling, probability theory, and probability distributions; descriptive and inferential statistical concepts; machine learning techniques and best practices in general; practical use of Python and the packages commonly used in data science (NumPy, Pandas, matplotlib, scikit-learn); IBM Watson Studio; and the design thinking process.
AI Workflow: AI in Production
This is the sixth course in the IBM AI Enterprise Workflow Certification specialization. You are STRONGLY encouraged to complete these courses in order, as they are not independent courses but part of a workflow where each course builds on the previous ones. This course focuses on models in production at a hypothetical streaming media company. There is an introduction to IBM Watson Machine Learning. You will build your own API in a Docker container and learn how to manage containers with Kubernetes. The course also introduces several other tools in the IBM ecosystem designed to help deploy or maintain models in production. The AI workflow is not a linear process, so some time is dedicated to the most important feedback loops in order to promote efficient iteration on the overall workflow. By the end of this course you will be able to: 1. Use Docker to deploy a Flask application 2. Deploy a simple UI to integrate the ML model, Watson NLU, and Watson Visual Recognition 3. Discuss basic Kubernetes terminology 4. Deploy a scalable web application on Kubernetes 5. Discuss the different feedback loops in the AI workflow 6. Discuss the use of unit testing in the context of model production 7. Use IBM Watson OpenScale to assess bias and performance of production machine learning models. Who should take this course? This course targets existing data science practitioners who have expertise building machine learning models and want to deepen their skills in building and deploying AI in large enterprises. If you are an aspiring Data Scientist, this course is NOT for you, as you need real-world expertise to benefit from the content of these courses. What skills should you have?
It is assumed that you have completed Courses 1 through 5 of the IBM AI Enterprise Workflow specialization and that you have a solid understanding of the following topics prior to starting this course: fundamentals of linear algebra; sampling, probability theory, and probability distributions; descriptive and inferential statistical concepts; machine learning techniques and best practices in general; practical use of Python and the packages commonly used in data science (NumPy, Pandas, matplotlib, scikit-learn); IBM Watson Studio; and the design thinking process.
Validity and Bias in Epidemiology
Epidemiological studies can provide valuable insights about the frequency of a disease, its potential causes and the effectiveness of available treatments. Selecting an appropriate study design can take you a long way when trying to answer such a question. However, this is by no means enough. A study can yield biased results for many different reasons. This course offers an introduction to some of these factors and provides guidance on how to deal with bias in epidemiological research. In this course you will learn about the main types of bias and what effect they might have on your study findings. You will then focus on the concept of confounding and you will explore various methods to identify and control for confounding in different study designs. In the last module of this course we will discuss the phenomenon of effect modification, which is key to understanding and interpreting study results. We will finish the course with a broader discussion of causality in epidemiology and we will highlight how you can utilise all the tools that you have learnt to decide whether your findings indicate a true association and if this can be considered causal.
Memorystore: Qwik Start
This is a self-paced lab that takes place in the Google Cloud console. In this lab you will create a Memorystore instance and leverage its basic capabilities.
Machine Learning: Clustering & Retrieval
Case Studies: Finding Similar Documents. A reader is interested in a specific news article and you want to find similar articles to recommend. What is the right notion of similarity? Moreover, what if there are millions of other documents? Each time you want to retrieve a new document, do you need to search through all other documents? How do you group similar documents together? How do you discover new, emerging topics that the documents cover? In this third case study, finding similar documents, you will examine similarity-based algorithms for retrieval. In this course, you will also examine structured representations for describing the documents in the corpus, including clustering and mixed membership models, such as latent Dirichlet allocation (LDA). You will implement expectation maximization (EM) to learn the document clusterings, and see how to scale the methods using MapReduce. Learning Outcomes: By the end of this course, you will be able to: -Create a document retrieval system using k-nearest neighbors. -Identify various similarity metrics for text data. -Reduce computations in k-nearest neighbor search by using KD-trees. -Produce approximate nearest neighbors using locality sensitive hashing. -Compare and contrast supervised and unsupervised learning tasks. -Cluster documents by topic using k-means. -Describe how to parallelize k-means using MapReduce. -Examine probabilistic clustering approaches using mixture models. -Fit a mixture of Gaussians model using expectation maximization (EM). -Perform mixed membership modeling using latent Dirichlet allocation (LDA). -Describe the steps of a Gibbs sampler and how to use its output to draw inferences. -Compare and contrast initialization techniques for non-convex optimization objectives. -Implement these techniques in Python.
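As a rough illustration of the first two learning outcomes (retrieval by k-nearest neighbors and a similarity metric for text), the following is a minimal sketch and not material from the course itself. It assumes raw word counts as the document representation and cosine similarity as the metric; the function names and the three-document corpus are hypothetical, and the search is brute force, so it does not use the KD-trees or locality sensitive hashing the course covers for scaling.

```python
import math
from collections import Counter


def word_counts(text):
    """Represent a document as a bag of lowercase word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_documents(query, corpus, k=2):
    """Return the indices of the k corpus documents most similar to the query.

    Brute-force search: every document is compared with the query.
    """
    q = word_counts(query)
    scored = [(cosine_similarity(q, word_counts(doc)), i)
              for i, doc in enumerate(corpus)]
    # Highest similarity first; ties broken by lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


# Hypothetical toy corpus, for illustration only.
corpus = [
    "the team won the hockey game",
    "stock markets fell sharply today",
    "the hockey team lost the final game",
]

print(nearest_documents("hockey game tonight", corpus, k=2))
```

With this toy corpus the two hockey articles rank ahead of the finance article. Comparing the query against every document is the cost that the description's question about millions of documents refers to.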
Working with Datasets
By the end of this project, you will use Python to wrangle two datasets and visualize the relationships among the data. Datasets are collections of data that may exist in a database, a CSV file, or an ordinary file. Python is a popular language for working with datasets. It has tools to read data in various formats, and libraries to visualize the datasets.
Text Retrieval and Search Engines
Recent years have seen dramatic growth in natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data are unique in that they are usually generated directly by humans rather than a computer system or sensors, and are thus especially valuable for discovering knowledge about people’s opinions and preferences, in addition to many other kinds of knowledge that we encode in text. This course will cover search engine technologies, which play an important role in any data mining application involving text data for two reasons. First, while the raw data may be large for any particular problem, it is often a relatively small subset of the data that is relevant, and a search engine is an essential tool for quickly discovering a small subset of relevant text data in a large text collection. Second, search engines are needed to help analysts interpret any patterns discovered in the data by allowing them to examine the relevant original text data to make sense of any discovered pattern. You will learn the basic concepts, principles, and the major techniques in text retrieval, which is the underlying science of search engines.
Implementing Connected Planning
In this course, you’ll learn how to make Connected Planning a reality in an organization. Organizations need a vision supported by a focused effort to move from traditional planning and static business modeling to a Connected Planning approach, where data, people, and plans are linked throughout the organization. Creating a digital ecosystem using a Connected Planning technology platform is a big part of the process, but it’s not the only thing needed. In this course, we’ll explore the role of corporate culture and sponsorship, process redesign, master data management, and change management in successfully implementing Connected Planning. You’ll be challenged to examine your own organization’s readiness to undertake the Connected Planning journey. If you’re not ready yet, you’ll identify areas where you need to focus your efforts. If you are ready, you’ll have the framework you need to get started on the path to make Connected Planning a reality in your organization. By the end of this course, you’ll be able to: • Identify challenges to a Connected Planning implementation stemming from issues with people, data, and planning processes • Articulate the benefits of Connected Planning over traditional planning and static business modeling • Explain why corporate culture, data management, and change management are critical to successful Connected Planning execution • Drive or constructively participate in the implementation of Connected Planning in your organization. This course is presented by Anaplan, provider of a leading technology platform that is purpose-built for Connected Planning.