
Data Science Courses - Page 103

Showing results 1021-1030 of 1407
Building Data Visualization Tools
The data science revolution has produced reams of new data from a wide variety of new sources. These new datasets are being used to answer new questions in ways never before conceived. Visualization remains one of the most powerful ways to draw conclusions from data, but the influx of new data types requires the development of new visualization techniques and building blocks. This course provides you with the skills to create those new visualization building blocks. We focus on the ggplot2 framework and describe how to use and extend the system to suit the specific needs of your organization or team. Upon completing this course, learners will be able to build the tools needed to visualize a wide variety of data types and will have the fundamentals needed to address new data types as they arise.
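To give a flavor of what one such building block can look like, here is a minimal sketch, not taken from the course, of a reusable ggplot2 theme that a team could standardize on; the theme name and styling choices are illustrative assumptions:

    library(ggplot2)

    # A reusable theme: one simple kind of visualization "building block"
    # (theme_team and its styling choices are hypothetical examples)
    theme_team <- function(base_size = 12) {
      theme_minimal(base_size = base_size) +
        theme(
          panel.grid.minor = element_blank(),            # drop minor grid lines
          plot.title       = element_text(face = "bold"),
          legend.position  = "bottom"
        )
    }

    # Any plot in the organization can now opt in to the shared style
    ggplot(mtcars, aes(wt, mpg)) +
      geom_point() +
      labs(title = "Fuel efficiency vs. weight") +
      theme_team()

Packaging a theme like this as a function is the simplest case; the course goes further, into extending ggplot2 itself.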
Big Data Science with the BD2K-LINCS Data Coordination and Integration Center
The Library of Integrative Network-based Cellular Signatures (LINCS) is an NIH Common Fund program. The idea is to perturb different types of human cells with many different types of perturbations, such as drugs and other small molecules; genetic manipulations such as knockdown or overexpression of single genes; manipulation of the extracellular microenvironment, for example growing cells on different surfaces; and more. These perturbations are applied to various types of human cells, including induced pluripotent stem cells from patients differentiated into various lineages such as neurons or cardiomyocytes. Then, to better understand the molecular networks affected by these perturbations, changes in the levels of many different variables are measured, including mRNAs, proteins, and metabolites, as well as cellular phenotypic changes such as changes in cell morphology. The BD2K-LINCS Data Coordination and Integration Center (DCIC) is commissioned to organize, analyze, visualize, and integrate these data with other publicly available relevant resources. In this course we briefly introduce the DCIC and the various centers that collect data for LINCS. We then cover metadata and how metadata is linked to ontologies, and present data processing and normalization methods used to clean and harmonize LINCS data. This is followed by a discussion of how the data is served through RESTful APIs. Most importantly, the course covers computational methods including data clustering, gene-set enrichment analysis, interactive data visualization, and supervised learning. Finally, we introduce crowdsourcing/citizen-science projects in which students can work together in teams to extract expression signatures from public databases and then query such collections of signatures against LINCS data to predict small molecules as potential therapeutics.
Factorial and Fractional Factorial Designs
Many experiments in engineering, science, and business involve several factors. This course is an introduction to these types of multifactor experiments. The appropriate experimental strategy for these situations is based on the factorial design, a type of experiment where factors are varied together. This course focuses on designing these types of experiments and on using the analysis of variance (ANOVA) to analyze the resulting data. These experiments often include nuisance factors, and the blocking principle can be used in factorial designs to handle them. As the number of factors of interest grows, full factorial designs become too expensive, and fractional versions of the factorial design become useful. This course will cover the benefits of fractional factorials, along with methods for constructing and analyzing the data from these experiments.
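For a sense of the basic workflow, a two-factor factorial with replication can be set up and analyzed with ANOVA in a few lines of R; the effect sizes below are simulated for illustration, not course data:

    # Simulated 2x2 factorial with 3 replicates per treatment combination
    d <- expand.grid(A = c("low", "high"), B = c("low", "high"), rep = 1:3)
    set.seed(1)
    d$y <- 10 +
      2.0 * (d$A == "high") -                      # main effect of A
      1.5 * (d$B == "high") +                      # main effect of B
      0.5 * (d$A == "high") * (d$B == "high") +    # AB interaction
      rnorm(nrow(d))

    # ANOVA partitions the variation into main effects and the interaction
    fit <- aov(y ~ A * B, data = d)
    summary(fit)

Because both factors are varied together, the same 12 runs estimate the A effect, the B effect, and their interaction.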
Excel Regression Models for Business Forecasting
This course allows learners to explore regression models in order to utilise them for business forecasting. Unlike time series models, regression models are causal models, in which we identify certain variables in our business that influence other variables. Regressions model this causality, and we can then use these models to forecast and plan for our business's needs. We will explore simple regression models, multiple regression models, dummy variable regressions, seasonal variable regressions, and autoregressions. Each of these is a different form of regression model, tailored to a unique business scenario, in order to forecast and generate business intelligence for organisations.
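The course itself works in Excel, but the underlying model is the same in any tool; purely as a sketch, here is a seasonal dummy-variable regression in R, with made-up quarterly numbers:

    # Quarterly sales with a trend and a seasonal pattern (simulated)
    trend   <- 1:12
    quarter <- factor(rep(1:4, times = 3))
    set.seed(2)
    sales <- 100 + 5 * trend + c(10, -5, 0, 20)[quarter] + rnorm(12, sd = 3)

    # Trend plus quarterly dummy variables: a "seasonal variable regression"
    fit <- lm(sales ~ trend + quarter)
    summary(fit)

    # Forecast the next four quarters from the fitted causal model
    predict(fit, newdata = data.frame(trend = 13:16, quarter = factor(1:4)))

The dummy variables let each quarter carry its own seasonal adjustment on top of the shared trend.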
Inventory Management
Inventory is a strategic asset for organizations. The effective management of inventory can minimize a company’s spending while dramatically increasing its profit. In this course, we will explore how to use data science to manage inventory in uncertain environments, how to set inventory levels based on customer service requirements, and how to calculate inventory for products that have short sales cycles.
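One standard way to set inventory levels from a customer service requirement is a safety-stock calculation under normally distributed demand; whether the course uses exactly this model is an assumption, and all numbers below are illustrative:

    # Reorder point = expected demand over the lead time + safety stock,
    # where safety stock = z * sigma_demand * sqrt(lead_time)
    mu_d    <- 50    # mean daily demand (units), hypothetical
    sigma_d <- 10    # standard deviation of daily demand
    L       <- 4     # replenishment lead time in days
    service <- 0.95  # target cycle service level

    z <- qnorm(service)                    # z-score for the service level
    safety_stock  <- z * sigma_d * sqrt(L)
    reorder_point <- mu_d * L + safety_stock
    round(c(safety_stock = safety_stock, reorder_point = reorder_point))

Raising the service level raises z, and therefore the safety stock, which is exactly the spending-versus-service trade-off the course explores.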
Translate Text with the Cloud Translation API
This is a self-paced lab that takes place in the Google Cloud console. The Cloud Translation API allows you to translate an arbitrary string into any supported language. In this hands-on lab you’ll learn how to use the Cloud Translation API to translate text and to detect the language of text when the language is unknown.
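The lab itself runs in the Google Cloud console; purely as an illustration of the two API calls involved (translate and detect), here is the equivalent from R using the googleLanguageR package, which is an assumption on our part and not the lab's own tooling (the key file path is hypothetical):

    library(googleLanguageR)

    # Authenticate with a service-account key (hypothetical path)
    gl_auth("my-service-account.json")

    # Translate an arbitrary string into a supported target language
    gl_translate("The weather is lovely today.", target = "es")

    # Detect the language of text when the language is unknown
    gl_translate_detect("¿Cómo estás?")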
Custom Prediction Routine on Google AI Platform
Please note: you will need a Google Cloud Platform account to complete this course. Your GCP account will be charged as per your usage. Please make sure that you are able to access Google AI Platform within your GCP account. You should be familiar with Python programming and Google Cloud Platform before starting this hands-on project. Please also ensure that you have access to the custom prediction routine feature in Google AI Platform. In this 2-hour long project-based course, you will learn how to deploy and use a model on Google’s AI Platform. Normally, any model trained with the TensorFlow framework is quite easy to deploy: you can simply upload a SavedModel to Google Cloud Storage and create an AI Platform model with it. But in practice we may not always use TensorFlow. Fortunately, the AI Platform allows for custom prediction routines as well, and that’s what we are going to focus on. Instead of converting a Keras model to a TensorFlow SavedModel, we will use the h5 file as is. Additionally, since we will be working with image data, we will use this opportunity to look at encoding byte data into strings for data transmission, and then decoding the received data in our custom prediction routine on the AI Platform before using it with our model. This course runs on Coursera's hands-on project platform called Rhyme. On Rhyme, you do projects in a hands-on manner in your browser. You will get instant access to pre-configured cloud desktops containing all of the software and data you need for the project. Everything is already set up directly in your Internet browser so you can just focus on learning. For this project, you’ll get instant access to a cloud desktop with software such as Python, Jupyter, and TensorFlow pre-installed. Note: this course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
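The project is done in Python, but the byte-to-string step it describes is easy to see in miniature. As a sketch of the same round trip in R (the file name and JSON field names are hypothetical), using the base64enc and jsonlite packages:

    library(base64enc)
    library(jsonlite)

    # Client side: read raw image bytes and encode them as a string
    # so they can travel inside a JSON prediction request
    img_bytes <- readBin("cat.jpg", what = "raw",
                         n = file.info("cat.jpg")$size)
    request <- toJSON(
      list(instances = list(list(image_bytes = base64encode(img_bytes)))),
      auto_unbox = TRUE
    )

    # Server side: decode the string back into the original bytes
    # before handing them to the model
    decoded <- base64decode(fromJSON(request)$instances$image_bytes[[1]])
    identical(decoded, img_bytes)  # TRUE: the round trip is lossless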
Using Advanced Formulas and Functions in Excel
In this project, you will learn advanced formulas and functions in Excel to perform analysis on several different topics. First, we will review basic formulas and functions and take a tour through the many choices on the Excel Formulas tab. Then we will explore advanced financial formulas and functions, followed by logic and text functions. Last, we will learn many ways to perform lookup- and reference-type queries, for example with VLOOKUP and INDEX/MATCH. Throughout the project, you will work through examples that show you how to apply the formulas and functions you have learned.
Data Visualization using dplyr and ggplot2 in R
Welcome to this project-based course, Data Visualization using dplyr and ggplot2 in R. In this project, you will learn how to manipulate data with the dplyr package and create beautiful plots using the ggplot2 package in R. By the end of this 2-hour long project, you will understand how to use dplyr verbs such as select, filter, arrange, mutate, summarize, and group_by to manipulate the gapminder dataset. You will also learn how to use the ggplot2 package to render plots from the data these verbs return. Note that this is a follow-up to the project Data Manipulation with dplyr in R; I recommend that you take that project first, as it will give you a better experience working on this one.
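Here is a taste of the verb-then-plot pattern the project teaches, using the gapminder dataset; this is a minimal sketch, not the project's own solution code:

    library(dplyr)
    library(ggplot2)
    library(gapminder)  # provides the gapminder dataset

    # dplyr verbs to manipulate the data ...
    plot_data <- gapminder %>%
      filter(year == 2007) %>%                    # filter verb
      group_by(continent) %>%                     # group_by verb
      summarize(med_life = median(lifeExp)) %>%   # summarize verb
      arrange(desc(med_life))                     # arrange verb

    # ... then ggplot2 to render the result
    ggplot(plot_data, aes(x = continent, y = med_life)) +
      geom_col() +
      labs(x = NULL, y = "Median life expectancy (2007)")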
Simulation of Inventory Replenishment Using R Simmer
By the end of this project, you will have gained introductory knowledge of discrete event simulation and inventory replenishment, and you will be able to use RStudio and the simmer library, create the statistical variables required for a simulation, define a process trajectory, define and assign resources, define arrivals (e.g. incoming customers or work units), run a simulation in R, store the results in data frames, plot charts, and interpret the results.
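To make those ingredients concrete (trajectory, resources, arrivals, run, results), here is a minimal simmer sketch; the rates and resource names are illustrative assumptions, not the project's replenishment model:

    library(simmer)
    set.seed(42)

    env <- simmer("inventory_sim")

    # Process trajectory: each arriving order seizes a picker,
    # takes some service time, then releases the resource
    order <- trajectory("order") %>%
      seize("picker", 1) %>%
      timeout(function() rexp(1, rate = 2)) %>%   # service time
      release("picker", 1)

    # Define and assign resources, define arrivals, then run
    env %>%
      add_resource("picker", capacity = 1) %>%
      add_generator("order", order, function() rexp(1, rate = 1)) %>%
      run(until = 100)

    # Store the results in a data frame for plotting and interpretation
    arrivals <- get_mon_arrivals(env)
    head(arrivals)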