Data Science Courses - Page 11

Analyzing Big Data with SQL
In this course, you'll get an in-depth look at the SQL SELECT statement and its main clauses. The course focuses on the big data SQL engines Apache Hive and Apache Impala, but most of the information also applies to SQL with traditional RDBMSs; the instructor explicitly addresses differences for MySQL and PostgreSQL. By the end of the course, you will be able to • explore and navigate databases and tables using different tools; • understand the basics of SELECT statements; • understand how and why to filter results; • explore grouping and aggregation to answer analytic questions; • work with sorting and limiting results; and • combine multiple tables in different ways. To use the hands-on environment for this course, you need to download and install a virtual machine and the software on which to run it. Before continuing, be sure that you have access to a computer that meets the following hardware and software requirements: • Windows, macOS, or Linux operating system (iPads and Android tablets will not work) • 64-bit operating system (32-bit operating systems will not work) • 8 GB RAM or more • 25 GB free disk space or more • Intel VT-x or AMD-V virtualization support enabled (on Mac computers with Intel processors, this is always enabled; on Windows and Linux computers, you might need to enable it in the BIOS) • For Windows XP computers only: You must have an unzip utility such as 7-Zip or WinZip installed (Windows XP’s built-in unzip utility will not work)
Big data and Language 1
In this course, students will learn about the characteristics of language through big data. Students will learn how to collect and analyze big data and how to find linguistic features in the data. A number of approaches to the linguistic analysis of written and spoken texts will be discussed. Each week, the class will consist of lecture videos of approximately 1 hour and a quiz. There will be a final project which requires students to conduct research on text data and language.
Applied Text Mining in Python
This course will introduce the learner to text mining and text manipulation basics. The course begins with an understanding of how text is handled by Python, the structure of text both to the machine and to humans, and an overview of the NLTK framework for manipulating text. The second week focuses on common manipulation needs, including regular expressions (searching for text), cleaning text, and preparing text for use by machine learning processes. The third week will apply basic natural language processing methods to text and demonstrate how text classification is accomplished. The final week will explore more advanced methods for detecting the topics in documents and grouping them by similarity (topic modelling). This course should be taken after: Introduction to Data Science in Python, Applied Plotting, Charting & Data Representation in Python, and Applied Machine Learning in Python.
Introduction to Machine Learning
This course will provide you with a foundational understanding of machine learning models (logistic regression, multilayer perceptrons, convolutional neural networks, natural language processing, etc.) and demonstrate how these models can solve complex problems in a variety of industries, from medical diagnostics to image recognition to text prediction. In addition, we have designed practice exercises that will give you hands-on experience implementing these data science models on data sets. These practice exercises will teach you how to implement machine learning algorithms with PyTorch, an open-source library used by leading tech companies in the machine learning field (e.g., Google, NVIDIA, Coca-Cola, eBay, Snapchat, Uber and many more).
Object Detection with Amazon Sagemaker
Please note: You will need an AWS account to complete this course. Your AWS account will be charged as per your usage. Please make sure that you are able to access SageMaker within your AWS account. If your AWS account is new, you may need to ask AWS support for access to certain resources. You should be familiar with Python programming and have some experience with Amazon Web Services (AWS) before starting this hands-on project. We use a SageMaker P type instance in this project; if you don't have access to this instance type, please contact AWS support and request access. In this 2-hour project-based course, you will learn how to train and deploy an object detector using Amazon SageMaker. SageMaker provides a number of ready-to-use machine learning algorithms for a range of tasks. We will use the SSD Object Detection algorithm from SageMaker to create, train and deploy a model that will be able to localize faces of dogs and cats from the popular Oxford-IIIT Pet Dataset. Since this is a practical, project-based course, we will not dive into the theory behind deep-learning-based SSD or object detection, but will focus purely on training and deploying a model with SageMaker. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
Regression Models
Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models. Special cases of the regression model, ANOVA and ANCOVA, will be covered as well. Analysis of residuals and variability will be investigated. The course will cover modern thinking on model selection and novel uses of regression models including scatterplot smoothing.
Modeling Data in the Tidyverse
Developing insights about your organization, business, or research project depends on effective modeling and analysis of the data you collect. Building effective models requires understanding the different types of questions you can ask and how to map those questions to your data. Different modeling approaches can be chosen to detect interesting patterns in the data and identify hidden relationships. This course covers the types of questions you can ask of data and the various modeling approaches that you can apply. Topics covered include hypothesis testing, linear regression, nonlinear modeling, and machine learning. With this collection of tools at your disposal, as well as the techniques learned in the other courses in this specialization, you will be able to make key discoveries from your data for improving decision-making throughout your organization. In this specialization we assume familiarity with the R programming language. If you are not yet familiar with R, we suggest you first complete R Programming before returning to complete this course.
Database Design and Basic SQL in PostgreSQL
In this course you will learn more about the historical design of databases and the use of SQL in the PostgreSQL environment. Using SQL techniques and common commands (INSERT INTO, WHERE, ORDER BY, ON DELETE CASCADE, etc.) will enable you to create tables, specify column types, and define the schema of your data in PostgreSQL. You will learn about data modeling and how to represent one-to-many and many-to-many relationships in PostgreSQL. Students will do hands-on assignments creating tables, inserting data, designing data models, creating relational structures, and inserting and querying relational data in tables.
Precalculus: Mathematical Modeling
This course helps to build the foundation for using mathematics as a tool to model, understand, and interpret the world around us. This is done through studying functions, their properties, and applications to data analysis. Concepts of precalculus give the beginning student the set of tools needed to start their scientific career, preparing them for future science and calculus courses. This course is designed for all students, not just those interested in further mathematics courses. Students interested in the natural sciences, computer sciences, psychology, sociology, or similar will genuinely benefit from this introductory course, applying the skills learned to their discipline to analyze and interpret their subject material. Students will be presented with not only new ideas, but also new applications of an old subject. Real-life data, exercise sets, and regular assessments help to motivate and reinforce the content in this course, leading to learning and mastery.
Wrangling Data for Data Analysts with Python
By the end of this project, you will be able to analyze data and answer three different questions through data wrangling, the process of gathering, selecting, and transforming data to answer an analytical question, using Python. In this project, you will gather data for the whole of 2020 by querying the Quandl website through its API. Quandl is a free website for dummy data. You will convert the returned JSON data into a Python dictionary and analyze it to calculate the highest and lowest prices in this period, the biggest change based on the High and Low prices during the year, and finally the average makeover during the year. This guided project is for people in the field of business and data analysis who want to wrangle data, answer business questions, clarify the use case, and predict the relations between the source data. It provides you with important steps toward becoming a data analyst. Moreover, it equips you with knowledge of Python's native data structures. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
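To give a rough sense of the kind of wrangling this project describes, here is a minimal sketch that parses a JSON string into a Python dictionary and computes the highest High, the lowest Low, and the largest single-day High-Low range. The payload layout (`dataset_data`, `column_names`, `data`), the `summarize` helper, and the sample values are all assumptions made up for illustration; they are not real market data and may not match what the Quandl API actually returns.

```python
import json

# Made-up sample, shaped like a generic time-series API response.
# The field names and numbers are illustrative assumptions only.
RAW = """
{
  "dataset_data": {
    "column_names": ["Date", "Open", "High", "Low", "Close"],
    "data": [
      ["2020-01-02", 10.0, 12.0, 9.5, 11.0],
      ["2020-01-03", 11.0, 15.0, 10.0, 14.0],
      ["2020-01-06", 14.0, 14.5, 13.0, 13.5]
    ]
  }
}
"""


def summarize(raw_json):
    """Hypothetical helper: parse the JSON text and compute simple price statistics."""
    payload = json.loads(raw_json)["dataset_data"]  # JSON text -> Python dict
    columns = payload["column_names"]
    # Turn each row (a list) into a dict keyed by column name.
    rows = [dict(zip(columns, row)) for row in payload["data"]]
    return {
        "highest": max(r["High"] for r in rows),
        "lowest": min(r["Low"] for r in rows),
        # Largest difference between High and Low within a single day.
        "biggest_range": max(r["High"] - r["Low"] for r in rows),
    }


summary = summarize(RAW)
print(summary)
```

In the actual project the JSON text would come from an API request rather than a hard-coded string; the sketch only covers the steps after the response has been received.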