Data Analysis Courses - Page 7

How to Use Lookup Reference Math and Text Functions in Excel
By the end of this project, you will know how to use lookup and reference, math, and text functions in an Excel spreadsheet, using a free version of Microsoft Office Excel.
Analyzing Big Data with SQL
In this course, you'll get an in-depth look at the SQL SELECT statement and its main clauses. The course focuses on the big data SQL engines Apache Hive and Apache Impala, but most of the information also applies to SQL with traditional RDBMSs; the instructor explicitly addresses differences for MySQL and PostgreSQL.
By the end of the course, you will be able to
• explore and navigate databases and tables using different tools;
• understand the basics of SELECT statements;
• understand how and why to filter results;
• explore grouping and aggregation to answer analytic questions;
• work with sorting and limiting results; and
• combine multiple tables in different ways.
To use the hands-on environment for this course, you need to download and install a virtual machine and the software on which to run it. Before continuing, be sure that you have access to a computer that meets the following hardware and software requirements:
• Windows, macOS, or Linux operating system (iPads and Android tablets will not work)
• 64-bit operating system (32-bit operating systems will not work)
• 8 GB RAM or more
• 25 GB free disk space or more
• Intel VT-x or AMD-V virtualization support enabled (on Mac computers with Intel processors, this is always enabled; on Windows and Linux computers, you might need to enable it in the BIOS)
• For Windows XP computers only: You must have an unzip utility such as 7-Zip or WinZip installed (Windows XP’s built-in unzip utility will not work)
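As a rough illustration of the clauses listed above (filtering, grouping and aggregation, sorting, limiting, and combining tables), here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in rather than Hive or Impala, and the `customers` and `orders` tables, their columns, and the helper names are hypothetical, not taken from the course.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
        INSERT INTO orders VALUES
            (1, 1, 20.0), (2, 1, 5.0), (3, 2, 40.0), (4, 3, 15.0), (5, 2, 10.0);
    """)
    return conn


def top_customers(conn, min_amount, limit):
    """Return (name, total) rows for the highest-spending customers."""
    query = """
        SELECT c.name, SUM(o.amount) AS total        -- aggregate per group
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id  -- combine two tables
        WHERE o.amount >= ?                          -- filter individual rows
        GROUP BY c.name                              -- one group per customer
        ORDER BY total DESC                          -- sort by the aggregate
        LIMIT ?                                      -- keep only the top rows
    """
    return conn.execute(query, (min_amount, limit)).fetchall()


if __name__ == "__main__":
    print(top_customers(build_demo_db(), min_amount=10, limit=2))
```

The same SELECT statement shape carries over to other engines, though details such as parameter placeholders and type handling differ between systems.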
Big data and Language 1
In this course, students will understand characteristics of language through big data. Students will learn how to collect and analyze big data, and find linguistic features from the data. A number of approaches to the linguistic analysis of written and spoken texts will be discussed. The class will consist of lecture videos, which are approximately 1 hour long, and a quiz for each week. There will be a final project which requires students to conduct research on text data and language.
Applied Text Mining in Python
This course will introduce the learner to text mining and text manipulation basics. The course begins with an understanding of how text is handled by Python, the structure of text both to the machine and to humans, and an overview of the NLTK framework for manipulating text. The second week focuses on common manipulation needs, including regular expressions (searching for text), cleaning text, and preparing text for use by machine learning processes. The third week will apply basic natural language processing methods to text, and demonstrate how text classification is accomplished. The final week will explore more advanced methods for detecting the topics in documents and grouping them by similarity (topic modelling). This course should be taken after: Introduction to Data Science in Python, Applied Plotting, Charting & Data Representation in Python, and Applied Machine Learning in Python.
Regression Models
Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models. Special cases of the regression model, ANOVA and ANCOVA, will be covered as well. Analysis of residuals and variability will be investigated. The course will cover modern thinking on model selection and novel uses of regression models including scatterplot smoothing.
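To make "least squares" and "analysis of residuals" concrete, here is a minimal sketch of a one-predictor least-squares fit in plain Python. The function name `fit_line` and the sample data are illustrative assumptions, not material from the course.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, residuals), where each residual is the
    observed y minus the fitted value.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return slope, intercept, residuals


if __name__ == "__main__":
    slope, intercept, residuals = fit_line([0, 1, 2, 3], [1.1, 2.9, 5.2, 6.8])
    print(slope, intercept, residuals)
```

With an intercept in the model, the residuals always sum to zero (up to floating-point error), which is a quick sanity check on any fit.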
Python Project for Data Engineering
This mini-course is intended to apply foundational Python skills by implementing different techniques to collect and work with data. Assume the role of a Data Engineer and extract data from multiple file formats, transform it into specific datatypes, and then load it into a single source for analysis. Continue with the course and test your knowledge by implementing web scraping and extracting data with APIs, all with the help of multiple hands-on labs. After completing this course you will have acquired the confidence to begin collecting large datasets from multiple sources and transform them into one primary source, or begin web scraping to gain valuable business insights, all with the use of Python. PRE-REQUISITE: **Python for Data Science, AI and Development** course from IBM is a pre-requisite for this project course. Please ensure that before taking this course you have either completed the Python for Data Science, AI and Development course from IBM or have equivalent proficiency in working with Python and data. NOTE: This course is not intended to teach you Python and does not have much instructional content. It is intended for you to apply prior Python knowledge.
Spatial Data Science and Applications
Spatial (map) data is considered a core infrastructure of the modern IT world, as substantiated by the business transactions of major IT companies such as Apple, Google, Microsoft, Amazon, Intel, and Uber, and even motor companies such as Audi, BMW, and Mercedes. Consequently, they are bound to hire more and more spatial data scientists. Based on this business trend, this course is designed to give a firm understanding of spatial data science to learners who have a basic knowledge of data science and data analysis, and eventually to differentiate their expertise from that of other nominal data scientists and data analysts. Additionally, this course could make learners realize the value of spatial big data and the power of open source software for dealing with spatial data science problems. In the first week, this course will start by defining spatial data science and answering why spatial is special from three different perspectives: business, technology, and data. In the second week, four disciplines related to spatial data science (GIS, DBMS, Data Analytics, and Big Data Systems) and the related open source software (QGIS, PostgreSQL, PostGIS, R, and Hadoop tools) are introduced together. During the third, fourth, and fifth weeks, you will learn the four disciplines one by one, from principles to applications. In the final week, five real-world problems and the corresponding solutions are presented with step-by-step procedures in an open source software environment.
Probability Theory: Foundation for Data Science
Understand the foundations of probability and its relationship to statistics and data science. We’ll learn what it means to calculate a probability, independent and dependent outcomes, and conditional events. We’ll study discrete and continuous random variables and see how this fits with data collection. We’ll end the course with Gaussian (normal) random variables and the Central Limit Theorem and understand its fundamental importance for all of statistics and data science. This course can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics. Learn more about the MS-DS program at https://www.coursera.org/degrees/master-of-science-data-science-boulder Logo adapted from photo by Christopher Burns on Unsplash.
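The Central Limit Theorem mentioned above can be checked with a small simulation. This is a minimal sketch, assuming samples drawn from a uniform distribution on [0, 1); the function name `sample_means` and the chosen sizes are illustrative, not part of the course.

```python
import random
import statistics


def sample_means(n, trials, seed=0):
    """Return `trials` means, each computed from `n` uniform(0, 1) draws."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    return [
        statistics.fmean(rng.random() for _ in range(n))
        for _ in range(trials)
    ]


if __name__ == "__main__":
    n = 30
    means = sample_means(n, trials=2000)
    # A uniform(0, 1) variable has mean 1/2 and variance 1/12, so the CLT
    # predicts sample means centred near 0.5 with spread sqrt(1/12) / sqrt(n).
    predicted_sd = (1 / 12) ** 0.5 / n ** 0.5
    print("mean of sample means:", statistics.fmean(means))
    print("observed sd:", statistics.stdev(means), "predicted sd:", predicted_sd)
```

Although each individual draw is uniform rather than Gaussian, a histogram of the sample means would look approximately bell-shaped, which is the point of the theorem.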
Preparing Data for Machine Learning Models
By the end of this project, you will be able to extract color pixels as a training dataset, in a form you can feed to your machine learning model using NumPy arrays. In this project we will work with images, and you will be introduced to basic computer vision concepts. Moreover, you will be able to properly handle arrays, preprocess your training dataset, and label it. Extracting features and preparing data is a crucial task because it influences your model. So you will start by learning the basics of getting the data into a format that a machine learning algorithm will accept as a training dataset.
How to Build a BI Dashboard Using Google Data Studio and BigQuery
This is a self-paced lab that takes place in the Google Cloud console. Learn how to build a BI dashboard with Data Studio as the front end, powered by BigQuery on the back end.