Data Analysis Courses - Page 34

Showing results 331-340 of 998
CASL Programming for Distributed Computing in SAS® Viya®
Welcome to the CASL Programming for Distributed Computing in SAS Viya course. SAS Viya is an AI, analytic and data management platform running on a scalable, distributed, cloud-native architecture. In this course you will learn how to use the native CAS programming language (CASL) to leverage SAS Cloud Analytics Services (CAS), the high-performance, in-memory analytics and distributed computing engine in SAS Viya. You will learn how to use CASL to access, explore, prepare, analyze, and summarize data in the CAS server's massively parallel processing environment. This is an advanced course, intended for learners with at least one year of programming experience with a modern language (SAS, R, Python, SQL, and so on), and at least one year of experience working with data. To be successful in this course, you should have a general understanding of fundamental computer programming concepts and the data analytics lifecycle. By the end of the course, you will be able to:
- Understand and use various SAS Viya servers.
- Connect to the CAS server to access and manage data.
- Use CASL to explore, prepare and analyze data.
- Create reports and visualizations using SAS Viya.
Doing More with SAS Programming
This course is for business analysts and SAS programmers who want to learn data manipulation techniques using the SAS DATA step and procedures to access, transform, and summarize data. The course builds on the concepts that are presented in the Getting Started with SAS Programming course and is not recommended for beginning SAS software users. In this course you learn how to understand and control DATA step processing, create an accumulating column and process data in groups, manipulate data with functions, convert column type, create custom formats, concatenate and merge tables, process repetitive code, and restructure tables. This course addresses Base SAS software. Before attending this course, you should be able to write DATA step code to access data, subset rows and columns, compute new columns, and process data conditionally. You should also be able to sort tables using the SORT procedure and apply SAS formats.
Introduction to Spreadsheets and Models
The simple spreadsheet is one of the most powerful data analysis tools that exists, and it’s available to almost anyone. Major corporations and small businesses alike use spreadsheet models to determine where key measures of their success are now, and where they are likely to be in the future. But in order to get the most out of a spreadsheet, you need the know-how to use it. This course is designed to give you an introduction to basic spreadsheet tools and formulas so that you can begin to harness the power of spreadsheets to map the data you have now and to predict the data you may have in the future. Through short, easy-to-follow demonstrations, you’ll learn how to use Excel or Sheets so that you can begin to build models and decision trees in future courses in this Specialization. Basic familiarity with, and access to, Excel or Sheets is required.
Business Statistics and Analysis Capstone
The Business Statistics and Analysis Capstone is an opportunity to apply the skills developed across the four courses in the specialization to real-life data. The Capstone, in collaboration with an industry partner, uses publicly available ‘Housing Data’ to pose the kinds of questions a client would typically pose to a data analyst. Your job is to do the relevant statistical analysis and report your findings in response to the questions in a way that anyone can understand. Please remember that this is a Capstone, and has a higher degree of difficulty and ambiguity than the previous four courses. The aim is to mimic a real-life application as closely as possible.
Machine Learning Introduction for Everyone
This three-module course introduces machine learning and data science for everyone with a foundational understanding of machine learning models. You’ll learn about the history of machine learning, applications of machine learning, the machine learning model lifecycle, and tools for machine learning. You’ll also learn about supervised versus unsupervised learning, classification, regression, evaluating machine learning models, and more. Our labs give you hands-on experience with these machine learning and data science concepts. You will develop concrete machine learning skills as well as create a final project demonstrating your proficiency. After completing this program, you’ll be able to realize the potential of machine learning algorithms and artificial intelligence in different business scenarios. You’ll be able to identify when to use machine learning to explain certain behaviors and when to use it to predict future outcomes. You’ll also learn how to evaluate your machine learning models and to incorporate best practices. In addition to receiving a certificate from Coursera, you'll also earn an IBM Badge to help you share your accomplishments with your network and potential employers. This course is part of multiple programs. You can also leverage the learning from the program to complete the remaining two courses of the six-course IBM Machine Learning Professional Certificate and power a new career in the field of machine learning.
COVID19 Data Analysis Using Python
In this project, you will learn how to preprocess and merge datasets to calculate the measures you need and prepare them for analysis. We are going to work with the COVID19 dataset published by Johns Hopkins University, which contains the cumulative number of confirmed cases, per day, in each country. We also have another dataset consisting of various life factors, scored by the people living in each country around the globe. We are going to merge these two datasets to see if there is any relationship between the spread of the virus in a country and how happy the people living in that country are. Note: This project works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
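As a rough illustration of the workflow described above (derive a measure from cumulative case counts, merge it with per-country life scores, then check for a relationship), here is a minimal sketch in plain Python. The country names, numbers, the choice of "largest daily increase" as the measure, and all function names are assumptions made for illustration; they are not taken from the project or its datasets.

```python
from math import sqrt

# Hypothetical toy data: cumulative confirmed cases per day, keyed by country.
cumulative_cases = {
    "A": [1, 3, 10, 12],
    "B": [2, 4, 5, 9],
    "C": [0, 1, 2, 8],
}
# Hypothetical life-factor scores; country "D" has no case data.
life_scores = {"A": 7.1, "B": 5.4, "C": 6.0, "D": 4.2}


def max_daily_increase(series):
    """Largest day-over-day jump in a cumulative series (an assumed measure)."""
    return max(later - earlier for earlier, later in zip(series, series[1:]))


def inner_merge(left, right):
    """Keep only the keys present in both dicts, pairing their values."""
    return {key: (left[key], right[key]) for key in left if key in right}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Step 1: compute the measure for each country.
measure = {name: max_daily_increase(s) for name, s in cumulative_cases.items()}
# Step 2: merge the two datasets on country.
merged = inner_merge(measure, life_scores)
# Step 3: check for a linear relationship between the two columns.
r = pearson([m for m, _ in merged.values()], [s for _, s in merged.values()])
print(merged)
print(round(r, 3))
```

With real data, a dataframe library would typically handle the merge and correlation steps; the dictionaries here only keep the sketch self-contained. A correlation on three made-up points says nothing about the real datasets.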
Get Started with R Markdown
Welcome to this project-based course, Get Started with R Markdown. This course is for people who are learning R and want useful ways to organize their work in R. We will start this hands-on project with an overview of the project; then, we will get familiar with the RStudio interface and install the rmarkdown package. Rest assured that you will learn a great deal here. In this project, you will learn about R Markdown and its usefulness to you as an R user. By the end of this 2-hour-long project, you will be able to create an R Markdown file, understand the different components of the file, knit the file as an HTML document or a PDF document, and write some R Markdown commands. By extension, we will learn how to publish the knitted document on RPubs. This project is aimed at learners looking to get started using the R programming language to create reproducible documents. There are no hard prerequisites, and any competent computer user should be able to complete the project successfully.
Algorithms for DNA Sequencing
We will learn computational methods -- algorithms and data structures -- for analyzing DNA sequencing data. We will learn a little about DNA, genomics, and how DNA sequencing is used. We will use Python to implement key algorithms and data structures and to analyze real genomes and DNA sequencing datasets.
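The description above does not name specific algorithms. As one illustrative example of the kind of Python code such work involves, here is a minimal sketch of naive exact matching of a short read against a reference string, plus a reverse-complement helper. The function names and the toy sequences are assumptions for illustration, not course material.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A<->T, C<->G)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def naive_exact_match(pattern, text):
    """Return every offset in text where pattern occurs exactly."""
    occurrences = []
    for start in range(len(text) - len(pattern) + 1):
        # Compare the pattern against the window beginning at this offset.
        if text[start:start + len(pattern)] == pattern:
            occurrences.append(start)
    return occurrences


# Toy reference and read, made up for this example.
genome = "ACGTTGCAACGT"
read = "ACGT"

forward_hits = naive_exact_match(read, genome)
reverse_hits = naive_exact_match(reverse_complement("AACG"), genome)
print(forward_hits)
print(reverse_hits)
```

Naive matching checks every offset, so it is simple but slow on long genomes; index-based approaches trade memory for speed.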
Data-Driven Process Improvement
By the end of this course, learners are empowered to implement data-driven process improvement objectives at their organization. The course covers: the business case for IoT (Internet of Things), the strategic importance of aligning operations and performance goals, best practices for collecting data, and facilitating a process mapping activity to visualize and analyze a process’s flow of materials and information. Learners are prepared to focus efforts around business needs, evaluate what the organization should measure, discern between different types of IoT data and collect key performance indicators (KPIs) using IoT technology. Learners have the opportunity to implement process improvement objectives in a mock scenario and consider how the knowledge can be transferred to their own organizational contexts. Material includes online lectures, videos, demos, project work, readings and discussions. This course is ideal for individuals keen on developing a data-driven mindset that derives powerful insights useful for improving a company’s bottom line. It is helpful if learners have some familiarity with reading reports, gathering and using data, and interpreting visualizations. It is the first course in the Data-Driven Decision Making (DDDM) specialization. To learn more about the specialization, check out a video overview at https://www.youtube.com/watch?v=Oi4mmeSWcVc&list=PLQvThJe-IglyYljMrdqwfsDzk56ncfoLx&index=11.
Clinical Natural Language Processing
This course teaches you the fundamentals of clinical natural language processing (NLP). In this course you will learn the basic linguistic principles underlying NLP, as well as how to write regular expressions and handle text data in R. You will also learn practical techniques for text processing so that you can extract information from clinical notes. Finally, you will have a chance to put your skills to the test with a real-world practical application where you develop text processing algorithms to identify diabetic complications from clinical notes. You will complete this work using a free, online computational environment for data science hosted by our Industry Partner Google Cloud.