Data Science Courses - Page 135

Big Data Modeling and Management Systems
Once you’ve identified a big data issue to analyze, how do you collect, store and organize your data using Big Data solutions? In this course, you will experience various data genres and management tools appropriate for each. You will be able to describe the reasons behind the evolving plethora of new big data platforms from the perspective of big data management systems and analytical tools. Through guided hands-on tutorials, you will become familiar with techniques using real-time and semi-structured data examples. Systems and tools discussed include: AsterixDB, HP Vertica, Impala, Neo4j, Redis, SparkSQL. This course provides techniques to extract value from existing untapped data sources and to discover new data sources. At the end of this course, you will be able to: * Recognize different data elements in your own work and in everyday life problems * Explain why your team needs to design a Big Data Infrastructure Plan and Information System Design * Identify the frequent data operations required for various types of data * Select a data model to suit the characteristics of your data * Apply techniques to handle streaming data * Differentiate between a traditional Database Management System and a Big Data Management System * Appreciate why there are so many data management systems * Design a big data information system for an online game company This course is for those new to data science. Completion of Intro to Big Data is recommended. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments. Refer to the specialization technical requirements for complete hardware and software specifications. Hardware Requirements: (A) Quad Core Processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB disk free.
How to find your hardware information: (Windows): Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac): Open Overview by clicking on the Apple menu and clicking “About This Mac.” Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high-speed internet connection because you will be downloading files up to 4 GB in size. Software Requirements: This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge (except for data charges from your internet provider). Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+ or CentOS 6+; VirtualBox 5+.
Finding and Preparing for the Right Job
Finding and preparing for the right job in the DS/AI field can be tricky. In this course, we will explore how the job market has vastly different descriptions for the same job title, how to identify what a company is really looking for, and how to search the “hidden” job market. We will also survey the major skill areas experts recommend revisiting before applying for jobs in this field, how to tailor your resume to catch the eye of a DS/AI hiring manager, and how to create a stellar portfolio. Finally, we will discuss the importance of marketing yourself and share tips and tricks for doing it well. By the end of this course, students will be able to: • Decipher job descriptions with the same titles to discern the different skill sets needed. • Recall the major skill areas experts recommend revisiting and identify which skills to refresh in preparation for DS/AI applications and interviews. • Make their portfolio and resume stand out by applying tips specific to the field. • Recognize how to market themselves and how career fairs, connecting with recruiters, and networking can help. • Describe what kind of networking is beneficial in this field.
Clustering analysis and techniques
In this 2-hour project-based course, you will learn how to perform clustering (one of the core pillars of unsupervised learning) and why it matters in machine learning, set up the PyCaret Clustering module, and create, visualize, and compare clustering algorithms, all with just a few lines of code.
Designing, Running, and Analyzing Experiments
You may never be sure whether you have an effective user experience until you have tested it with users. In this course, you’ll learn how to design user-centered experiments, how to run such experiments, and how to analyze data from these experiments in order to evaluate and validate user experiences. You will work through real-world examples of experiments from the fields of UX, IxD, and HCI, understanding issues in experiment design and analysis. You will analyze multiple data sets using recipes given to you in the R statistical programming language -- no prior programming experience is assumed or required, but you will be required to read, understand, and modify code snippets provided to you. By the end of the course, you will be able to knowledgeably design, run, and analyze your own experiments that give statistical weight to your designs.
Human Factors in AI
This third and final course of the AI Product Management Specialization by Duke University's Pratt School of Engineering focuses on the critical human factors in developing AI-based products. The course begins with an introduction to human-centered design and the unique elements of user experience design for AI products. Participants will then learn about the role of data privacy in AI systems, the challenges of designing ethical AI, and approaches to identify sources of bias and mitigate fairness issues. The course concludes with a comparison of human intelligence and artificial intelligence, and a discussion of the ways that AI can be used both to automate and to assist human decision-making. At the conclusion of this course, you should be able to: 1) Identify and mitigate privacy and ethical risks in AI projects 2) Apply human-centered design practices to design successful AI product experiences 3) Build AI systems that augment human intelligence and inspire model trust in users
Unsupervised Learning, Recommenders, Reinforcement Learning
In the third course of the Machine Learning Specialization, you will: • Use unsupervised learning techniques, including clustering and anomaly detection. • Build recommender systems with a collaborative filtering approach and a content-based deep learning method. • Build a deep reinforcement learning model. The Machine Learning Specialization is a foundational online program created in collaboration between DeepLearning.AI and Stanford Online. In this beginner-friendly program, you will learn the fundamentals of machine learning and how to use these techniques to build real-world AI applications. This Specialization is taught by Andrew Ng, an AI visionary who has led critical research at Stanford University and groundbreaking work at Google Brain, Baidu, and Landing.AI to advance the AI field. This 3-course Specialization is an updated and expanded version of Andrew’s pioneering Machine Learning course, rated 4.9 out of 5 and taken by over 4.8 million learners since it launched in 2012. It provides a broad introduction to modern machine learning, including supervised learning (multiple linear regression, logistic regression, neural networks, and decision trees), unsupervised learning (clustering, dimensionality reduction, recommender systems), and some of the best practices used in Silicon Valley for artificial intelligence and machine learning innovation (evaluating and tuning models, taking a data-centric approach to improving performance, and more). By the end of this Specialization, you will have mastered key concepts and gained the practical know-how to quickly and powerfully apply machine learning to challenging real-world problems. If you’re looking to break into AI or build a career in machine learning, the new Machine Learning Specialization is the best place to start.
Prediction Models with Sports Data
In this course the learner will be shown how to generate forecasts of game results in professional sports using Python. The main emphasis of the course is on teaching the method of logistic regression as a way of modeling game results, using data on team expenditures. The learner is taken through the process of modeling past results, and then using the model to forecast the outcome of games not yet played. The course will show the learner how to evaluate the reliability of a model using data on betting odds. The analysis is applied first to the English Premier League, then the NBA and NHL. The course also provides an overview of the relationship between data analytics and gambling, its history and the social issues that arise in relation to sports betting, including the personal risks.
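To make the idea of modeling game results with logistic regression concrete, here is a minimal sketch in plain Python. It is not taken from the course materials: the data values are invented for illustration, the single feature (the log of the home team's spending relative to the away team's) is an assumption, and the helper names `fit_logistic` and `predict_win_probability` are hypothetical. A real analysis would more likely use a statistics library and actual league data.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(win) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors for the current parameters.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict_win_probability(w, b, x):
    """Forecast the home-win probability for a game not yet played."""
    return sigmoid(w * x + b)


# Invented illustrative data (not real results):
# x = log(home team spending / away team spending), y = 1 if the home team won.
log_spend_ratio = [-1.2, -0.8, -0.5, -0.1, 0.2, 0.4, 0.9, 1.3]
home_win = [0, 0, 1, 0, 1, 0, 1, 1]

w, b = fit_logistic(log_spend_ratio, home_win)
forecast = predict_win_probability(w, b, 0.5)
print(f"slope={w:.3f}, intercept={b:.3f}, forecast={forecast:.3f}")
```

The fitted slope is positive for this toy data, so the forecast probability rises as the home team's relative spending rises. Comparing such forecasts with probabilities implied by betting odds is one way to check a model's reliability, as the course description mentions.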
Data Visualization in R with ggplot2
Data visualization is a critical skill for anyone who routinely uses quantitative data in his or her work - which is to say that data visualization is a tool that almost every worker needs today. One of the critical tools for data visualization today is the R statistical programming language. Especially in conjunction with the tidyverse software packages, R has become an extremely powerful and flexible platform for making figures, tables, and reproducible reports. However, R can be intimidating for first-time users, and there are so many resources online that it can be difficult to sort through without guidance. This course is the second in a specialization in Data Visualization offered by Johns Hopkins. It is intended for learners who either have some experience with R and data wrangling in the tidyverse or have taken the previous course in the specialization. The focus in this course is learning to use ggplot2 to make a variety of visualizations and to polish those visualizations using tools within ggplot as well as vector graphics editing software. The course will not go into detail about how the data management works behind the scenes.
Fundamentals of Database Systems
In this project you will learn to identify the components of a database system, also sometimes referred to as an information system. As you examine a database system and diagram a database, you will gain an understanding of how those components interact and fit together. The overall purpose of the database system is to store and provide access to secure, relevant, timely, accurate data which can be presented as information used for making business decisions. Whether you are in Information Technology or an end user, understanding how data is used by your organization makes you a more valuable employee. This project now has an optional challenge activity and an optional capstone activity to give you opportunities for extra review and practice!
Introduction to the Tidyverse
This course introduces a powerful set of data science tools known as the Tidyverse. The Tidyverse has revolutionized the way in which data scientists do almost every aspect of their job. We will cover the simple idea of "tidy data" and how this idea serves to organize data for analysis and modeling. We will also cover how non-tidy data can be transformed to tidy data, the data science project life cycle, and the ecosystem of Tidyverse R packages that can be used to execute a data science project. If you are new to data science, the Tidyverse ecosystem of R packages is an excellent way to learn the different aspects of the data science pipeline, from importing the data and tidying it into a format that is easy to work with to exploring and visualizing the data and fitting machine learning models. If you are already experienced in data science, the Tidyverse provides a powerful system for streamlining your workflow in a coherent manner that can easily connect with other data science tools. In this course it is important that you be familiar with the R programming language. If you are not yet familiar with R, we suggest you first complete R Programming before returning to complete this course.
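As a rough illustration of what transforming non-tidy data to tidy data means, the sketch below reshapes a "wide" table (one column per year) into a tidy one (one row per observation). It is written in plain Python only to show the idea and is not the Tidyverse API; the course itself uses R packages. The function name `wide_to_tidy`, the column names, and the numbers are all hypothetical.

```python
def wide_to_tidy(rows, id_column, variable_name, value_name):
    """Return one row per (id, variable) pair instead of one column per variable."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every tidy row, not melted
            tidy.append({
                id_column: row[id_column],
                variable_name: column,
                value_name: value,
            })
    return tidy


# Invented example: each year is its own column, which is not tidy
# because the column headers are values of a variable ("year").
wide = [
    {"country": "A", "2019": 10, "2020": 12},
    {"country": "B", "2019": 7, "2020": 9},
]

tidy = wide_to_tidy(wide, "country", "year", "cases")
for record in tidy:
    print(record)
```

In the tidy result every variable (country, year, cases) has its own column and every observation its own row, which is the layout the description refers to as easy to work with for analysis and modeling.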