
Data Analysis Courses - Page 37

Showing results 361-370 of 998
Convolutions for Text Classification with Keras
Welcome to this hands-on, guided introduction to Text Classification using 1D Convolutions with Keras. By the end of this project, you will be able to apply word embeddings for text classification, use 1D convolutions as feature extractors in natural language processing (NLP), and perform binary text classification using deep learning. As a case study, we will work on classifying a large number of Wikipedia comments as either toxic or not (i.e., comments that are rude, disrespectful, or otherwise likely to make someone leave a discussion). This issue is especially important given the conversations the global community and tech companies are having about content moderation, online harassment, and inclusivity. The dataset we will use comes from the Toxic Comment Classification Challenge on Kaggle. To complete this guided project, we recommend that you have prior experience in Python programming and deep learning theory, and have used either TensorFlow or Keras to build deep learning models. We assume you have this foundational knowledge and want to learn how to use convolutions in NLP tasks such as classification. Note: This course works best for learners based in the North America region. We’re currently working on providing the same experience in other regions.
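To give a concrete flavor of the approach, here is a minimal sketch of the kind of model such a project builds: word embeddings feeding a 1D convolution for binary classification. The vocabulary size, sequence length, and layer widths are illustrative assumptions, not the course's actual architecture.

```python
# Hedged sketch: embeddings + Conv1D for binary text classification in Keras.
# vocab_size, seq_len, and all layer sizes are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers

vocab_size = 20000  # assumed vocabulary size after tokenization
seq_len = 200       # assumed padded comment length

model = tf.keras.Sequential([
    layers.Input(shape=(seq_len,)),
    layers.Embedding(vocab_size, 128),        # word embeddings
    layers.Conv1D(64, 5, activation="relu"),  # 1D convolution as a feature extractor
    layers.GlobalMaxPooling1D(),              # collapse the sequence dimension
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),    # binary output: toxic vs. not toxic
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```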
Clinical Data Models and Data Quality Assessments
This course aims to teach the concepts of clinical data models and common data models. Upon completion of this course, learners will be able to interpret and evaluate data model designs using Entity-Relationship Diagrams (ERDs), differentiate between data models and articulate how each is used to support clinical care and data science, and create SQL statements in Google BigQuery to query the MIMIC3 clinical data model and the OMOP common data model.
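As an illustration of the kind of query work involved, here is a minimal sketch of running SQL against MIMIC-III tables in BigQuery from Python. The table path follows the public PhysioNet layout on BigQuery, but treat the exact names as assumptions rather than course material.

```python
# Hedged sketch: querying a clinical data model in Google BigQuery from Python.
# The table path `physionet-data.mimiciii_clinical.patients` follows the public
# PhysioNet layout on BigQuery; treat it as an assumption.
from google.cloud import bigquery

client = bigquery.Client()  # authenticates with your default GCP credentials

sql = """
SELECT gender, COUNT(*) AS n_patients
FROM `physionet-data.mimiciii_clinical.patients`
GROUP BY gender
"""
for row in client.query(sql).result():
    print(row.gender, row.n_patients)
```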
Generate an Invoice with LibreOffice Base
By the end of this project, you will have used a LibreOffice Base query to retrieve data from a database and used the query results to build an invoice with the LibreOffice Base reporting feature. An invoice is a typical document used by many organizations to bill customers for products or services. Creating the invoice is a two-step process: retrieve the data and display it as an invoice. While retrieving the correct data is an essential skill for a database application developer, arranging and presenting the data in a format that a user finds useful is just as important. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
Interactive Geospatial Visualization: Kepler GL & Jupyter Lab
In this 1-hour long project-based course, you will learn how to easily create beautiful data visualizations with Kepler inside Jupyter Notebooks and effectively design different kinds of geospatial data visualizations.
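As a taste of the workflow, here is a minimal sketch of rendering a Kepler.gl map in a notebook with the keplergl Python package; the sample coordinates below are made up for illustration.

```python
# Hedged sketch: an interactive Kepler.gl map inside a Jupyter notebook.
# The point data is made up for illustration.
import pandas as pd
from keplergl import KeplerGl

points = pd.DataFrame({
    "latitude":  [37.77, 37.78, 37.76],
    "longitude": [-122.42, -122.41, -122.43],
    "value":     [10, 25, 40],
})

m = KeplerGl(height=500)                # create the map widget
m.add_data(data=points, name="points")  # Kepler.gl auto-detects lat/lon columns
m  # displaying the widget as the last expression renders the interactive map
```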
Data Processing and Feature Engineering with MATLAB
In this course, you will build on the skills learned in Exploratory Data Analysis with MATLAB to lay the foundation required for predictive modeling. This intermediate-level course is useful to anyone who needs to combine data from multiple sources or time periods and has an interest in modeling. These skills are valuable for those who have domain knowledge and some exposure to computational tools, but no programming background. To be successful in this course, you should have some background in basic statistics (histograms, averages, standard deviation, curve fitting, interpolation) and have completed Exploratory Data Analysis with MATLAB. Throughout the course, you will merge data from different data sets and handle common scenarios, such as missing data. In the last module of the course, you will explore special techniques for handling textual, audio, and image data, which are common in data science and more advanced modeling. By the end of this course, you will be able to visualize your data, clean it up and arrange it for analysis, and identify the qualities necessary to answer your questions. You will be able to visualize the distribution of your data and use visual inspection to address artifacts that affect accurate modeling.
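The core scenario here, merging sources and handling missing data, is language-agnostic; the course teaches it in MATLAB, but a minimal sketch of the same pattern in Python/pandas (with made-up data) looks like this:

```python
# Language-agnostic sketch of merging two data sources and handling missing
# values, shown in Python/pandas; the course itself teaches these steps in MATLAB.
import pandas as pd

# Hypothetical sensor readings and station metadata.
readings = pd.DataFrame({"station": ["A", "B", "C"], "temp": [21.5, None, 19.8]})
stations = pd.DataFrame({"station": ["A", "B"], "region": ["north", "south"]})

merged = readings.merge(stations, on="station", how="left")    # combine the sources
merged["temp"] = merged["temp"].fillna(merged["temp"].mean())  # impute missing values
print(merged)
```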
Hypothesis Testing with Python and Excel
In today's job market, leaders need to understand the fundamentals of data to be competitive. Hypothesis testing is an essential procedure for understanding business and analytics. This short course, designed by Tufts University expert faculty, will teach the fundamentals of hypothesis testing of a population mean and a population proportion, using Excel and Python for calculations. You'll also discover the central limit theorem, which is essential for hypothesis testing. To conclude the course, you will apply your newfound skills by creating a plan for an experiment in your own workplace that uses hypothesis testing.
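As a flavor of the Python side, here is a minimal sketch of a one-sample test of a population mean with SciPy; the data and the 5% significance level are illustrative assumptions.

```python
# Hedged sketch: one-sample t-test of a population mean in Python.
# The sample data and the 5% significance level are made-up assumptions.
from scipy import stats

sample = [12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9]  # hypothetical measurements
mu_0 = 12.0  # null hypothesis: the population mean equals 12.0

t_stat, p_value = stats.ttest_1samp(sample, popmean=mu_0)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("Reject H0 at the 5% level.")
else:
    print("Fail to reject H0 at the 5% level.")
```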
Data Science with R - Capstone Project
In this capstone course, you will apply various data science skills and techniques that you have learned as part of the previous courses in the IBM Data Science with R Specialization or IBM Data Analytics with Excel and R Professional Certificate. For this project, you will assume the role of a Data Scientist who has recently joined an organization, and you will be presented with a challenge that requires data collection, analysis, basic hypothesis testing, visualization, and modeling to be performed on real-world datasets. You will collect and understand data from multiple sources, conduct data wrangling and preparation with Tidyverse, perform exploratory data analysis with SQL, Tidyverse, and ggplot2, model data with linear regression, create charts and plots to visualize the data, and build an interactive dashboard. The project will culminate in a presentation of your data analysis report, with an executive summary for the various stakeholders in the organization.
Simulation of KANBAN Production Control Using R Simmer
In this project, you will understand the Kanban production control model, build discrete-event simulations using R Simmer, capture simulation data, plot charts, and interpret the results.
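The course builds its simulations in R Simmer; purely as a language-neutral illustration of the discrete-event idea, here is a sketch of a kanban-limited workstation in Python's simpy, with made-up rates and capacities.

```python
# Hedged sketch: a single-stage, kanban-limited workstation as a discrete-event
# simulation. The course uses R Simmer; this uses Python's simpy instead, and
# all rates and capacities are made-up assumptions.
import random
import simpy

KANBAN_SLOTS = 3  # work-in-progress limit enforced by kanban cards

def job(env, name, kanban, done):
    with kanban.request() as card:                  # wait for a free kanban card
        yield card
        yield env.timeout(random.expovariate(1.0))  # processing time (assumed rate)
        done.append((name, env.now))

def source(env, kanban, done):
    for i in range(10):
        env.process(job(env, f"job-{i}", kanban, done))
        yield env.timeout(random.expovariate(2.0))  # inter-arrival time (assumed rate)

random.seed(42)
env = simpy.Environment()
kanban = simpy.Resource(env, capacity=KANBAN_SLOTS)
done = []
env.process(source(env, kanban, done))
env.run()
print(f"{len(done)} jobs completed; last finished at t={done[-1][1]:.2f}")
```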
Communicating Data Science Results
Important note: The second assignment in this course covers the topic of Graph Analysis in the Cloud, in which you will use Elastic MapReduce and the Pig language to perform graph analysis over a moderately large dataset of about 600 GB. In order to complete this assignment, you will need to make use of Amazon Web Services (AWS). Amazon has generously offered to provide up to $50 in free AWS credit to each learner in this course to allow you to complete the assignment. Further details regarding the process of receiving this credit are available in the welcome message for the course, as well as in the assignment itself. Please note that Amazon, the University of Washington, and Coursera cannot reimburse you for any charges if you exhaust your credit. While we believe that this assignment contributes to an excellent learning experience, we understand that some learners may be unable or unwilling to use AWS. We are unable to issue Course Certificates for learners who do not complete the assignment that requires use of AWS. As such, you should not pay for a Course Certificate in this course if you are unable or unwilling to use AWS, as you will not be able to complete the course successfully without doing so.

Making predictions is not enough! Effective data scientists know how to explain and interpret their results, and communicate findings accurately to stakeholders to inform business decisions. Visualization is the field of research in computer science that studies effective communication of quantitative results by linking perception, cognition, and algorithms to exploit the enormous bandwidth of the human visual cortex. In this course you will learn to recognize, design, and use effective visualizations.

Just because you can make a prediction and convince others to act on it doesn’t mean you should. In this course you will explore the ethical considerations around big data and how these considerations are beginning to influence policy and practice. You will learn the foundational limitations of using technology to protect privacy and the codes of conduct emerging to guide the behavior of data scientists. You will also learn the importance of reproducibility in data science and how the commercial cloud can help support reproducible research even for experiments involving massive datasets, complex computational infrastructures, or both.

Learning Goals: After completing this course, you will be able to:
1. Design and critique visualizations.
2. Explain the state of the art in privacy, ethics, and governance around big data and data science.
3. Use cloud computing to analyze large datasets in a reproducible way.
Introduction to Statistics
Stanford's "Introduction to Statistics" teaches you statistical thinking concepts that are essential for learning from data and communicating insights. By the end of the course, you will be able to perform exploratory data analysis, understand key principles of sampling, and select appropriate tests of significance for multiple contexts. You will gain the foundational skills that prepare you to pursue more advanced topics in statistical thinking and machine learning. Topics include Descriptive Statistics, Sampling and Randomized Controlled Experiments, Probability, Sampling Distributions and the Central Limit Theorem, Regression, Common Tests of Significance, Resampling, Multiple Comparisons.