
Data Analysis Courses - Page 29

Showing results 281-290 of 998
Digital Marketing Analytics in Theory
Successfully marketing brands today requires a well-balanced blend of art and science. This course introduces students to the science of web analytics while casting a keen eye toward the artful use of numbers found in the digital space. The goal is to provide the foundation needed to apply data analytics to the real-world challenges marketers confront daily. Digital Analytics for Marketing Professionals: Marketing Analytics in Theory is the first in a two-part series of complementary courses and focuses on the background information and frameworks analysts need to be successful in today's digital business world. You will be able to:
- Identify the web analytics tool that is right for your specific needs
- Understand valid and reliable ways to collect, analyze, and visualize data from the web
- Utilize data in decision making for agencies, organizations, or clients
This course is part of Gies College of Business’ suite of online programs, including the iMBA and iMSM. Learn more about admission into these programs and explore how your Coursera work can be leveraged if accepted into a degree program at https://degrees.giesbusiness.illinois.edu/idegrees/.
Distributed Computing with Spark SQL
This course is all about big data. It’s for students with SQL experience who want to take the next step on their data journey by learning distributed computing with Apache Spark. Students will gain a thorough understanding of this open-source standard for working with large datasets, along with the fundamentals of data analysis using SQL on Spark, setting the foundation for combining data with advanced analytics at scale and in production environments. The four modules build on one another, and by the end of the course you will understand: the Spark architecture, queries within Spark, common ways to optimize Spark SQL, and how to build reliable data pipelines.
The first module introduces Spark and the Databricks environment, including how Spark distributes computation, and Spark SQL. Module 2 covers the core concepts of Spark such as storage vs. compute, caching, partitions, and troubleshooting performance issues via the Spark UI. It also covers new features in Apache Spark 3.x such as Adaptive Query Execution. The third module focuses on Engineering Data Pipelines, including connecting to databases, schemas and data types, file formats, and writing reliable data. The final module covers data lakes, data warehouses, and lakehouses; students build production-grade data pipelines by combining Spark with the open-source project Delta Lake. By the end of this course, students will have honed their SQL and distributed computing skills, becoming more adept at advanced analysis and setting the stage for a transition to more advanced analytics as Data Scientists.
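For a concrete picture of what "queries within Spark" look like, here is a minimal PySpark sketch (not taken from the course materials); the file name and column names are placeholders.

# Minimal Spark SQL sketch; "sales.csv", "region", and "amount" are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-sql-demo").getOrCreate()

# Read a CSV file into a DataFrame and register it as a temporary view.
df = spark.read.csv("sales.csv", header=True, inferSchema=True)
df.createOrReplaceTempView("sales")

# Cache the view so repeated queries avoid re-reading the source file.
spark.sql("CACHE TABLE sales")

# An ordinary SQL query, executed in a distributed fashion by Spark.
top_regions = spark.sql("""
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    ORDER BY total DESC
""")
top_regions.show()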
Setting Up Cost Control with Quota
In this lab you will query a large dataset, update the BigQuery API quota, and then optimize your query to run within quota.
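As a rough sketch of the cost-control idea (not the lab's actual steps), the BigQuery Python client can cap how many bytes a query is allowed to scan, so an over-budget query fails fast instead of running; the byte limit and choice of public sample table below are just for illustration.

# Sketch: cap query cost with the BigQuery Python client (illustrative limit and table).
from google.cloud import bigquery

client = bigquery.Client()

# Refuse to run the query if it would scan more than roughly 1 GB.
job_config = bigquery.QueryJobConfig(maximum_bytes_billed=10**9)

query = """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    GROUP BY name
    ORDER BY total DESC
    LIMIT 10
"""
results = client.query(query, job_config=job_config).result()
for row in results:
    print(row.name, row.total)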
SAS® Programming for Distributed Computing in SAS® Viya®
Welcome to the SAS Programming for Distributed Computing in SAS Viya course. SAS Viya is an AI, analytics, and data management platform running on a scalable, distributed, cloud-native architecture. In this course you will learn how to modify existing Base SAS programs to execute in SAS Viya. The programs you create will leverage the power of SAS Cloud Analytic Services (CAS) to access, manage, and analyze in-memory tables. This is an advanced course, intended for learners with SAS programming experience. To be successful, you should be able to access data via SAS libraries, read and prepare data with the DATA step, query data using PROC SQL, and summarize data with the MEANS and FREQ procedures. This foundational knowledge can be acquired in the Coursera SAS Programmer specialization. By the end of the course, you will be able to:
- Load data into SAS Cloud Analytic Services
- Modify DATA step and SQL procedure code to execute in CAS
- Use CAS-enabled procedures
- Write CASL code to execute CAS actions
Basic Recommender Systems
This course introduces you to the leading approaches in recommender systems. The techniques described cover both collaborative and content-based approaches and include the most important algorithms used to provide recommendations. You'll learn how they work, how to use them, and how to evaluate them, pointing out the benefits and limits of different recommender system alternatives. After completing this course, you'll be able to describe the requirements and objectives of recommender systems based on different application domains. You'll know how to distinguish recommender systems according to their input data, their internal working mechanisms, and their goals. You'll have the tools to measure the quality of a recommender system and to incrementally improve it with the design of new algorithms. You'll also learn how to design recommender systems tailored to new application domains, considering surrounding social and ethical issues such as identity, privacy, and manipulation. Providing affordable, personalised, and high-quality recommendations is always a challenge! This course also leverages two important EIT Overarching Learning Outcomes (OLOs), related to creativity and innovation skills. In trying to design a new recommender system, you need to think beyond boundaries and figure out how you can improve the quality of the predictions. You should also be able to use knowledge, ideas, and technology to create new or significantly improved recommendation tools that support choice-making processes and strategies in different and innovative scenarios, for a better quality of life.
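To make the collaborative approach concrete, here is a generic item-based sketch (not an algorithm taken from the course) that scores an unseen item from a tiny, invented user-item rating matrix.

# Tiny item-based collaborative filtering sketch; all ratings are invented.
import numpy as np

# Rows = users, columns = items; 0 means "not rated".
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# Cosine similarity between item columns.
norms = np.linalg.norm(ratings, axis=0)
sim = (ratings.T @ ratings) / (np.outer(norms, norms) + 1e-9)

# Predict user 0's score for item 2 as a similarity-weighted average
# of the items that user has already rated.
user = ratings[0]
rated = user > 0
pred = (sim[2, rated] @ user[rated]) / (sim[2, rated].sum() + 1e-9)
print(round(pred, 2))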
Importing Data in the Tidyverse
Getting data into your statistical analysis system can be one of the most challenging parts of any data science project. Data must be imported and harmonized into a coherent format before any insights can be obtained. You will learn how to get data into R from commonly used formats and how to harmonize different kinds of datasets from different sources. If you work in an organization where different departments collect data using different systems and different storage formats, this course will provide essential tools for bringing those datasets together and making sense of the wealth of information in your organization. This course introduces the Tidyverse tools for importing data into R so that it can be prepared for analysis, visualization, and modeling. Common data formats are introduced, including delimited files, spreadsheets, and relational databases, and techniques for obtaining data from the web, such as web scraping and web APIs, are demonstrated. In this specialization we assume familiarity with the R programming language. If you are not yet familiar with R, we suggest you first complete R Programming before returning to complete this course.
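For comparison only, here is a loose Python/pandas analogue of the import-then-harmonize workflow described above; the course itself uses R and the Tidyverse, and the file and column names below are placeholders.

# Rough pandas analogue of importing and combining data sources (placeholder names).
import pandas as pd

# A delimited file and a spreadsheet are two common sources;
# a JSON web API could similarly be loaded with pd.read_json(url).
orders = pd.read_csv("orders.csv")
customers = pd.read_excel("customers.xlsx")

# Harmonize column names before joining the sources.
customers = customers.rename(columns={"Customer ID": "customer_id"})
combined = orders.merge(customers, on="customer_id", how="left")
print(combined.head())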
Learning SAS: History and SAS Studio
In this 1.25-hour-long project-based course, you will learn to explain the highlights of the history of SAS, how to access and explore SAS Studio, and how to transfer a Notepad text file into SAS Studio. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
K-Means Clustering 101: World Happiness Report
In this case study, we will train an unsupervised machine learning algorithm to cluster countries based on features such as economic production, social support, life expectancy, freedom, absence of corruption, and generosity. The World Happiness Report assesses the state of global happiness. The happiness scores and rankings are collected by asking individuals to rate their lives on a scale from 0 (worst possible life) to 10 (best possible life).
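A minimal scikit-learn sketch of the kind of clustering the case study walks through, using invented values for a few of the listed features (the actual World Happiness Report data is not reproduced here).

# Toy k-means clustering sketch over invented happiness-style features.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Columns: economic production, social support, life expectancy (made-up values).
X = np.array([
    [1.4, 1.5, 0.98],
    [1.3, 1.4, 0.95],
    [0.6, 0.8, 0.55],
    [0.5, 0.7, 0.50],
    [1.0, 1.1, 0.75],
    [0.9, 1.0, 0.70],
])

# Standardize features so no single scale dominates the distance metric.
X_scaled = StandardScaler().fit_transform(X)

# Group the "countries" into 3 clusters.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)
print(labels)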
Predictive Analytics and Data Mining
This course introduces students to the science of business analytics while casting a keen eye toward the artful use of numbers found in the digital space. The goal is to provide businesses and managers with the foundation needed to apply data analytics to real-world challenges they confront daily in their professional lives. Students will learn to identify the ideal analytic tool for their specific needs; understand valid and reliable ways to collect, analyze, and visualize data; and utilize data in decision making for their agencies, organizations or clients.
Supervised Machine Learning: Classification
This course introduces you to one of the main types of modeling families of supervised Machine Learning: Classification. You will learn how to train predictive models to classify categorical outcomes and how to use error metrics to compare across different models. The hands-on section of this course focuses on using best practices for classification, including train/test splits and handling data sets with unbalanced classes. By the end of this course you should be able to:
- Differentiate uses and applications of classification and classification ensembles
- Describe and use logistic regression models
- Describe and use decision tree and tree-ensemble models
- Describe and use other ensemble methods for classification
- Use a variety of error metrics to compare and select the classification model that best suits your data
- Use oversampling and undersampling as techniques to handle unbalanced classes in a data set
Who should take this course? This course targets aspiring data scientists interested in acquiring hands-on experience with Supervised Machine Learning Classification techniques in a business setting.
What skills should you have? To make the most out of this course, you should have familiarity with programming in a Python development environment, as well as a fundamental understanding of Data Cleaning, Exploratory Data Analysis, Calculus, Linear Algebra, Probability, and Statistics.
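As a generic illustration of the practices listed above (not course code), the sketch below trains a logistic regression with a train/test split, uses class weighting to handle an unbalanced data set, and compares two error metrics on synthetic data.

# Sketch: classification with a train/test split, imbalance handling, and error metrics.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score

# Synthetic, deliberately unbalanced two-class data (90% / 10%).
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# class_weight="balanced" reweights the minority class during training.
model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(X_train, y_train)

pred = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, pred))
print("f1:", f1_score(y_test, pred))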