Data Management Courses - Page 5

Linux on LinuxONE
This course is for Linux Systems Administrators, Architects and Developers who are already familiar with Linux components and everyday tasks, but need a primer on how to best take advantage of the LinuxONE platform. This includes working with the hardware, software, facilities, and processes unique to LinuxONE. It consists of videos, links to online resources, and a final test for a badge.
Tidy Messy Data using tidyr in R
As data enthusiasts and professionals, our work often requires dealing with data in different forms. Messy data in particular can be a big challenge, because the quality of your analysis largely depends on the quality of the data. This project-based course, "Tidy Messy Data using tidyr in R," is intended for beginner and intermediate R users with some related experience who want to advance their knowledge and skills. In this course, you will learn practical ways to clean, reshape, and transform data using R. You will learn how to use tidyr functions such as pivot_longer(), pivot_wider(), separate_rows(), separate(), and others to apply the tidy data principles. By the end of this 2-hour project, you will have hands-on experience massaging data into the proper format. By extension, you will learn to create plots using ggplot(). Because the project is pitched at a beginner to intermediate level, a basic understanding of R is essential to get the most out of it. Specifically, you should be able to load data into R and understand how the pipe operator works. It will be helpful to complete my previous project titled "Data Manipulation with dplyr in R."
Databases and SQL for Data Science with Python
Working knowledge of SQL (or Structured Query Language) is a must for data professionals like Data Scientists, Data Analysts and Data Engineers. Much of the world's data resides in databases. SQL is a powerful language used for communicating with and extracting data from databases. In this course you will learn SQL inside out, from the very basics of SELECT statements to advanced concepts like JOINs. You will:
- write foundational SQL statements like SELECT, INSERT, UPDATE, and DELETE
- filter result sets and use WHERE, COUNT, DISTINCT, and LIMIT
- differentiate between DML and DDL
- CREATE, ALTER, DROP and load tables
- use string patterns and ranges, ORDER and GROUP result sets, and use built-in database functions
- build sub-queries and query data from multiple tables
- access databases as a data scientist using Jupyter notebooks with SQL and Python
- work with advanced concepts like Stored Procedures, Views, ACID Transactions, and Inner and Outer JOINs
Through hands-on labs and projects, you will practice building SQL queries, work with real databases on the Cloud, and use real data science tools. In the final project you’ll analyze multiple real-world datasets to demonstrate your skills.
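To give a flavor of the statements this course description lists, here is a minimal sketch that runs a few of them from Python. It is not taken from the course materials: it assumes Python's standard-library sqlite3 module with an in-memory database, and the `employees` table, its columns, its rows, and the `run_demo` helper are all hypothetical.

```python
import sqlite3


def run_demo():
    """Run a few basic SQL statements against a throwaway in-memory database."""
    # Hypothetical schema and data, used only for illustration.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # DDL: define the table structure.
    cur.execute(
        "CREATE TABLE employees ("
        "id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
    )

    # DML: INSERT rows, using placeholders rather than string formatting.
    cur.executemany(
        "INSERT INTO employees (name, dept, salary) VALUES (?, ?, ?)",
        [
            ("Ana", "Data", 90000.0),
            ("Ben", "Data", 80000.0),
            ("Cy", "Ops", 70000.0),
            ("Di", "Ops", 60000.0),
        ],
    )

    # DML: UPDATE one row and DELETE another, each filtered with WHERE.
    cur.execute("UPDATE employees SET salary = salary + 5000 WHERE name = ?", ("Ben",))
    cur.execute("DELETE FROM employees WHERE name = ?", ("Di",))
    conn.commit()

    results = {}
    # COUNT: number of rows left after the DELETE.
    results["count"] = cur.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
    # DISTINCT: unique department names, sorted alphabetically.
    results["depts"] = [
        row[0]
        for row in cur.execute("SELECT DISTINCT dept FROM employees ORDER BY dept")
    ]
    # WHERE + ORDER BY + LIMIT: the two highest salaries above 60000.
    results["top_two"] = cur.execute(
        "SELECT name, salary FROM employees "
        "WHERE salary > 60000 ORDER BY salary DESC LIMIT 2"
    ).fetchall()

    conn.close()
    return results


if __name__ == "__main__":
    print(run_demo())
```

SQLite is used here only because it ships with Python and needs no server; the course itself refers to databases on the Cloud, which would need a different connection setup.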
Publication-Ready Tables in R
Learn how to create publication-ready tables in R for descriptive statistics, contingency tables, correlation tables, model summary tables, and survival probability tables.
Designing and Querying Bigtable Schemas
This is a self-paced lab that takes place in the Google Cloud console. In this lab, you explore a Bigtable instance and use the Bigtable CLI (cbt CLI) to query data in Bigtable. You also design a table schema and row key using best practices for Bigtable.
Monitoring and Managing Bigtable Health and Performance
This is a self-paced lab that takes place in the Google Cloud console. In this lab, you monitor disk and CPU usage in a Bigtable instance, update an existing cluster to apply node autoscaling, implement replication in an instance, and back up and restore data in Bigtable.
Processing Data with Google Cloud Dataflow
This is a self-paced lab that takes place in the Google Cloud console. In this lab, you will simulate a real-time, real-world data set from a historical data set. The simulated data set will be processed from a set of text files using Python and Google Cloud Dataflow, and the resulting simulated real-time data will be stored in Google BigQuery.
Creating Permanent Tables and Access-Controlled Views in BigQuery
This is a self-paced lab that takes place in the Google Cloud console. This lab focuses on how to create new permanent reporting tables and logical views from an existing ecommerce dataset.
Visualizing static networks with R
In daily life, our connections with family and friends form our social networks. Across the country, roads between different places form transportation networks. In research, collaborations among researchers form research collaboration networks. Visible or invisible, networks exist in many aspects of our lives. Being able to visualize networks helps us understand the relationships embedded in complicated network information. In this project, learners will visualize various types of static networks of Marvel heroes using the igraph package and base R plot functions. Static networks are easy to use in reports and presentations. A good grasp of this method will help learners, from both academia and industry, quickly express informative relationships and connections among different variables.
Cloud Spanner: Qwik Start
This is a self-paced lab that takes place in the Google Cloud console. This lab shows how to perform basic operations in Cloud Spanner using the Google Cloud Platform Console. Watch the short video "Get a Highly Consistent, Scalable Database Service with Cloud Spanner."