Data Analysis Courses - Page 85

AI Workflow: Data Analysis and Hypothesis Testing
This is the second course in the IBM AI Enterprise Workflow Certification specialization. You are STRONGLY encouraged to complete these courses in order as they are not individual independent courses, but part of a workflow where each course builds on the previous ones. In this course you will begin your work for a hypothetical streaming media company by doing exploratory data analysis (EDA). Best practices for data visualization, handling missing data, and hypothesis testing will be introduced to you as part of your work. You will learn techniques of estimation with probability distributions and how to extend these estimates to apply null hypothesis significance tests. You will apply what you learn through two hands-on case studies: data visualization and multiple testing using a simple pipeline. By the end of this course you should be able to: 1. List several best practices concerning EDA and data visualization 2. Create a simple dashboard in Watson Studio 3. Describe strategies for dealing with missing data 4. Explain the difference between imputation and multiple imputation 5. Employ common distributions to answer questions about event probabilities 6. Explain the investigative role of hypothesis testing in EDA 7. Apply several methods for dealing with multiple testing Who should take this course? This course targets existing data science practitioners who have expertise building machine learning models and want to deepen their skills in building and deploying AI in large enterprises. If you are an aspiring Data Scientist, this course is NOT for you, as you need real-world expertise to benefit from the content of these courses. What skills should you have?
It is assumed that you have completed Course 1 of the IBM AI Enterprise Workflow specialization and have a solid understanding of the following topics prior to starting this course: Fundamental understanding of Linear Algebra; Understand sampling, probability theory, and probability distributions; Knowledge of descriptive and inferential statistical concepts; General understanding of machine learning techniques and best practices; Practiced understanding of Python and the packages commonly used in data science: NumPy, Pandas, matplotlib, scikit-learn; Familiarity with IBM Watson Studio; Familiarity with the design thinking process.
Introduction to Accounting Data Analytics and Visualization
Accounting has always been about analytical thinking. From the earliest days of the profession, Luca Pacioli emphasized the importance of math and order for analyzing business transactions. The skillset that accountants have needed to perform math and to keep order has evolved from pencil and paper, to typewriters and calculators, then to spreadsheets and accounting software. A new skillset that is becoming more important for nearly every aspect of business is that of big data analytics: analyzing large amounts of data to find actionable insights. This course is designed to help accounting students develop an analytical mindset and prepare them to use data analytic programming languages like Python and R. We’ve divided the course into three main sections. In the first section, we bridge accountancy to analytics. We identify how tasks in the five major subdomains of accounting (i.e., financial, managerial, audit, tax, and systems) have historically required an analytical mindset, and we then explore how those tasks can be completed more effectively and efficiently by using big data analytics. We then present a FACT framework for guiding big data analytics: Frame a question, Assemble data, Calculate the data, and Tell others about the results. In the second section of the course, we emphasize the importance of assembling data. Using financial statement data, we explain desirable characteristics of both data and datasets that will lead to effective calculations and visualizations. In the third and largest section of the course, we demonstrate and explore how Excel and Tableau can be used to analyze big data. We describe visual perception principles and then apply those principles to create effective visualizations. We then examine fundamental data analytic tools, such as regression, linear programming (using Excel Solver), and clustering, in the context of point-of-sale data and loan data.
We conclude by demonstrating the power of data analytic programming languages to assemble, visualize, and analyze data. We introduce Visual Basic for Applications as an example of a programming language, and the Visual Basic Editor as an example of an integrated development environment (IDE).
Modeling Time Series and Sequential Data
In this course you learn to build, refine, extrapolate, and, in some cases, interpret models designed for a single, sequential series. There are three modeling approaches presented. The traditional Box-Jenkins approach for modeling time series is covered in the first part of the course. This presentation moves students from models for stationary data, or ARMA, to models for trend and seasonality, ARIMA, and concludes with information about specifying transfer function components in an ARIMAX, or time series regression, model. A Bayesian approach to modeling time series is considered next. The basic Bayesian framework is extended to accommodate autoregressive variation in the data as well as dynamic input variable effects. Machine learning algorithms for time series form the third approach. Gradient boosting and recurrent neural network algorithms are particularly well suited for accommodating nonlinear relationships in the data. Examples are provided to build intuition on the effective use of these algorithms. The course concludes by considering how forecasting precision can be improved by combining the strengths of the different approaches. The final lesson includes demonstrations on creating combined (or ensemble) and hybrid model forecasts. This course is appropriate for analysts interested in augmenting their machine learning skills with analysis tools that are appropriate for assaying, modifying, modeling, forecasting, and managing data that consist of variables that are collected over time. This course uses a variety of different software tools. Familiarity with Base SAS, SAS/ETS, SAS/STAT, and SAS Visual Forecasting, as well as open-source tools for sequential data handling and modeling, is helpful but not required. The lessons on Bayesian analysis and machine learning models assume some prior knowledge of these topics.
One way that students can acquire this background is by completing these SAS Education courses: Bayesian Analyses Using SAS and Machine Learning Using SAS Viya.
Algebra: Elementary to Advanced - Functions & Applications
After completing this course, students will be able to apply functions to model different data and real-world occurrences. This course reviews the concept of a function and then provides multiple examples of common and uncommon types of functions used in a variety of disciplines. Formulas, domains, ranges, graphs, intercepts, and fundamental behavior are all analyzed using both algebraic and analytic techniques. From this core set of functions, new functions are created by arithmetic operations and function composition. These functions are then applied to solve real-world problems. The ability to picture many different types of functions will help students learn how and when to apply these functions, as well as give students the geometric intuition to understand the algebraic techniques. The skills and objectives from this course improve problem-solving abilities.
Practical Predictive Analytics: Models and Methods
Statistical experiment design and analytics are at the heart of data science. In this course you will design statistical experiments and analyze the results using modern methods. You will also explore the common pitfalls in interpreting statistical arguments, especially those associated with big data. Collectively, this course will help you internalize a core set of practical and effective machine learning methods and concepts, and apply them to solve some real world problems. Learning Goals: After completing this course, you will be able to: 1. Design effective experiments and analyze the results 2. Use resampling methods to make clear and bulletproof statistical arguments without invoking esoteric notation 3. Explain and apply a core set of classification methods of increasing complexity (rules, trees, random forests), and associated optimization methods (gradient descent and variants) 4. Explain and apply a set of unsupervised learning concepts and methods 5. Describe the common idioms of large-scale graph analytics, including structural query, traversals and recursive queries, PageRank, and community detection
Documentation and Usability for Cancer Informatics
Introduction: Cancer datasets are plentiful, complicated, and hold information that may be critical for the next research advancements. In order to use these data to their full potential, researchers are dependent on the specialized data tools that are continually being published and developed. Bioinformatics tools can often be unfriendly to their users, who often have little to no background in programming (Bolchini et al. 2008). The usability and quality of the documentation of a tool can be a major factor in how efficiently a researcher is able to obtain useful findings for the next steps of their research. Increasing the usability and quality of documentation for a tool is not only helpful for the researcher users, but also for the developers themselves: the many hours of work put into the product will have a higher impact if the tool is usable by the target user community. 70% of bioinformatics tools surveyed by Duck et al. (2016) were not reused beyond their introductory publication. Even the most well-programmed tool will be overlooked by the user community if there is little to no user-friendly documentation or if it was not designed with the user in mind. Target Audience: The course is intended for cancer informatics tool developers, particularly those creating tools as a part of the Informatics Technology for Cancer Research. Learning Objectives: 1. Understanding why usability and documentation are vital 2. Identifying your user community 3. Building documentation and tutorials to maximize the usability of developed tools 4. Obtaining feedback from your users Curriculum: This course will demonstrate how to understand why usability and documentation are vital, identify your user community, build documentation and tutorials to maximize the usability of developed tools, and obtain feedback from your users. The course includes hands-on exercises with templates for building documentation and tutorials for cancer informatics tools.
Individuals who take this course are encouraged to use these templates as they follow along with the course material to help increase the usability of their informatics tool. This course is part of a series of courses for the Informatics Technology for Cancer Research (ITCR) called the Informatics Technology for Cancer Research Education Resource. This material was created by the ITCR Training Network (ITN) which is a collaborative effort of researchers around the United States to support cancer informatics and data science training through resources, technology, and events. This initiative is funded by the following grant: National Cancer Institute (NCI) UE5 CA254170. Our courses feature tools developed by ITCR Investigators and make it easier for principal investigators, scientists, and analysts to integrate cancer informatics into their workflows. Please see our website at www.itcrtraining.org for more information.
Get Started with Microsoft Forms
Microsoft's online tools have many uses. In this project you will explore Microsoft Forms. You will be able to create surveys, polls, quizzes, and invitations for free using the web-based Microsoft Forms application. You can tailor the form that you make by adding special fonts, various answer choices, and images. The program makes data collection easy, and you can even track your responses and perform data analytics on your results.
The Data Scientist’s Toolbox
In this course you will get an introduction to the main tools and ideas in the data scientist's toolbox. The course gives an overview of the data, questions, and tools that data analysts and data scientists work with. There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools that will be used in the program, such as version control, Markdown, Git, GitHub, R, and RStudio.
Visualizing Data & Communicating Results in R with RStudio
Code and run your first R program in minutes without installing anything! This course is designed for learners with limited coding experience, providing foundational knowledge of data visualizations and R Markdown. The modules in this course cover different types of visualization models such as bar charts, histograms, and heat maps as well as R Markdown. Completion of the previous course (Data Analysis in R with RStudio & Tidyverse) in this specialization or similar experience is recommended. To allow for a truly hands-on, self-paced learning experience, this course is video-free. Assignments contain short explanations with images and runnable code examples with suggested edits to explore code examples further, building a deeper understanding by doing. You’ll benefit from instant feedback from a variety of assessment items along the way, gently progressing from quick understanding checks (multiple choice, fill in the blank, and un-scrambling code blocks) to small, approachable coding exercises that take minutes instead of hours. Finally, a cumulative lab at the end of the course will provide you an opportunity to apply all learned concepts within a real-world context.
Data Processing with Azure
This Azure training course is designed to equip students with the knowledge needed to process, store, and analyze data for making informed business decisions. Through this Azure course, the student will understand what big data is along with the importance of big data analytics, which will improve the student's mathematical and programming skills. Students will learn the most effective methods of using essential analytical tools such as Python, R, and Apache Spark.