Data Science Courses - Page 10

Applied Text Mining in Python
This course will introduce the learner to text mining and text manipulation basics. The course begins with an understanding of how text is handled by Python, the structure of text both to the machine and to humans, and an overview of the NLTK framework for manipulating text. The second week focuses on common manipulation needs, including regular expressions (searching for text), cleaning text, and preparing text for use by machine learning processes. The third week will apply basic natural language processing methods to text and demonstrate how text classification is accomplished. The final week will explore more advanced methods for detecting the topics in documents and grouping them by similarity (topic modelling). This course should be taken after: Introduction to Data Science in Python, Applied Plotting, Charting & Data Representation in Python, and Applied Machine Learning in Python.
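As a rough illustration of the cleaning step described for the second week, the sketch below uses only Python's standard `re` module. The function names `clean_text` and `tokenize` and the specific cleaning rules are assumptions chosen for this example; they are not taken from the course materials.

```python
import re


def clean_text(text):
    """Lowercase the text, drop URLs and punctuation, collapse whitespace."""
    text = text.lower()
    # Replace URLs with a space (a simple pattern, assumed for illustration)
    text = re.sub(r"https?://\S+", " ", text)
    # Replace everything except ASCII letters, digits, and whitespace
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Collapse runs of whitespace into single spaces
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split cleaned text into whitespace-separated tokens."""
    return clean_text(text).split()


if __name__ == "__main__":
    print(tokenize("Visit https://example.com NOW!!  Text-mining is fun."))
```

Token lists like this are a typical input to the feature extraction that precedes text classification.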
Introduction to Machine Learning
This course will give you a foundational understanding of machine learning models (logistic regression, multilayer perceptrons, convolutional neural networks, natural language processing, etc.) and demonstrate how these models can solve complex problems in a variety of industries, from medical diagnostics to image recognition to text prediction. In addition, we have designed practice exercises that will give you hands-on experience implementing these data science models on data sets. These practice exercises will teach you how to implement machine learning algorithms with PyTorch, an open source library used by leading tech companies in the machine learning field (e.g., Google, NVIDIA, CocaCola, eBay, Snapchat, Uber and many more).
Object Detection with Amazon Sagemaker
Please note: You will need an AWS account to complete this course. Your AWS account will be charged as per your usage. Please make sure that you are able to access Sagemaker within your AWS account. If your AWS account is new, you may need to ask AWS support for access to certain resources. You should be familiar with Python programming and AWS before starting this hands-on project. We use a Sagemaker P type instance in this project; if you don't have access to this instance type, please contact AWS support and request access. In this 2-hour long project-based course, you will learn how to train and deploy an object detector using Amazon Sagemaker. Sagemaker provides a number of ready-to-use machine learning algorithms for a range of tasks. We will use the SSD Object Detection algorithm from Sagemaker to create, train and deploy a model that will be able to localize faces of dogs and cats from the popular Oxford-IIIT Pet Dataset. Since this is a practical, project-based course, we will not dive into the theory behind deep learning based SSD or Object Detection, but will focus purely on training and deploying a model with Sagemaker. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
Regression Models
Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models. Special cases of the regression model, ANOVA and ANCOVA, will be covered as well. Analysis of residuals and variability will be investigated. The course will cover modern thinking on model selection and novel uses of regression models including scatterplot smoothing.
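To make the least squares idea concrete, here is a minimal sketch of fitting a line with one predictor and computing its residuals. It is written in plain Python as an assumption for this listing (the course description does not name a language), and the helper names `fit_line` and `residuals` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the centered cross-product sum over the centered sum of squares
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line passes through the point of means
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(xs, ys, intercept, slope):
    """Observed outcome minus fitted value for each data point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


if __name__ == "__main__":
    xs = [1, 2, 3, 4]
    ys = [2, 4, 5, 8]
    b0, b1 = fit_line(xs, ys)
    print(b0, b1, residuals(xs, ys, b0, b1))
```

When the model includes an intercept, the residuals sum to zero up to floating-point error, which is one of the checks used in residual analysis.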
Python Project for Data Engineering
This mini-course is intended to apply foundational Python skills by implementing different techniques to collect and work with data. Assume the role of a Data Engineer and extract data from multiple file formats, transform it into specific datatypes, and then load it into a single source for analysis. Continue with the course and test your knowledge by implementing web scraping and extracting data with APIs, all with the help of multiple hands-on labs. After completing this course you will have acquired the confidence to begin collecting large datasets from multiple sources and transforming them into one primary source, or to begin web scraping to gain valuable business insights, all with the use of Python. PRE-REQUISITE: **Python for Data Science, AI and Development** course from IBM is a pre-requisite for this project course. Please ensure that before taking this course you have either completed the Python for Data Science, AI and Development course from IBM or have equivalent proficiency in working with Python and data. NOTE: This course is not intended to teach you Python and does not contain much instructional content. It is intended for you to apply prior Python knowledge.
Linear Regression and Multiple Linear Regression in Julia
This guided project is for those who want to learn how to use Julia for linear regression and multiple linear regression. You will learn what linear regression is, how to build linear regression models in Julia and how to test the performance of your model. While you are watching me code, you will get a cloud desktop with all the required software pre-installed. This will allow you to code along with me. After all, we learn best with active, hands-on learning. Special Features: 1) Work with real-world stock market data. 2) Best practices and tips are provided. 3) You get a copy of the Jupyter notebook that you create, which acts as a handy reference guide. Please note that the version of Julia used is 1.0.4. Note: This project works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
Spatial Data Science and Applications
Spatial (map) data is considered core infrastructure of the modern IT world, as shown by the business transactions of major IT companies such as Apple, Google, Microsoft, Amazon, Intel, and Uber, and even motor companies such as Audi, BMW, and Mercedes. Consequently, these companies are bound to hire more and more spatial data scientists. Based on this business trend, the course is designed to give learners who have a basic knowledge of data science and data analysis a firm understanding of spatial data science, and ultimately to differentiate their expertise from that of other data scientists and data analysts. Additionally, this course can help learners realize the value of spatial big data and the power of open source software for dealing with spatial data science problems. In the first week, the course starts by defining spatial data science and answering why spatial is special from three different perspectives - business, technology, and data. In the second week, four disciplines related to spatial data science - GIS, DBMS, Data Analytics, and Big Data Systems - and the related open source software - QGIS, PostgreSQL, PostGIS, R, and Hadoop tools - are introduced together. During the third, fourth, and fifth weeks, you will learn the four disciplines one by one, from principles to applications. In the final week, five real world problems and the corresponding solutions are presented with step-by-step procedures in an open source software environment.
Probability Theory: Foundation for Data Science
Understand the foundations of probability and its relationship to statistics and data science. We’ll learn what it means to calculate a probability, independent and dependent outcomes, and conditional events. We’ll study discrete and continuous random variables and see how they fit with data collection. We’ll end the course with Gaussian (normal) random variables and the Central Limit Theorem and understand its fundamental importance for all of statistics and data science. This course can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics. Learn more about the MS-DS program at https://www.coursera.org/degrees/master-of-science-data-science-boulder Logo adapted from photo by Christopher Burns on Unsplash.
Preparing Data for Machine Learning Models
By the end of this project, you will be able to extract color pixels from images as a training dataset and put them into a form that you can feed to your machine learning model using NumPy arrays. In this project we will work with images, and you will be introduced to basic computer vision concepts. You will also learn to handle arrays properly and to preprocess and label your training dataset. Extracting features and preparing data is a crucial task because it influences your model. You will therefore start by learning the basics of getting data into a format that a machine learning algorithm accepts as a training dataset.
How to Build a BI Dashboard Using Google Data Studio and BigQuery
This is a self-paced lab that takes place in the Google Cloud console. Learn how to build a BI dashboard with Data Studio as the front end, powered by BigQuery on the back end.