Data Science Courses - Page 102

Showing results 1011-1020 of 1407
Working with Datasets
By the end of this project, you will have used Python to wrangle two datasets and visualize the relationships in the data. Datasets are collections of data that may exist in a database, a CSV file, or an ordinary file. Python is a popular language for working with datasets. It has tools to read data in various formats and libraries to visualize it.
Creating a Wordcloud using NLP and TF-IDF in Python
By the end of this project, you will have learned how to create a professional-looking wordcloud from a text dataset in Python. You will use an open-source dataset containing Christmas recipes and create a wordcloud of the most important ingredients used in these recipes. I will teach you how to load a JSON dataset, clean it by removing encodings and unwanted characters, and lemmatize it. I will also teach you how to calculate TF-IDF weights for the words in your dataset and use these weights to create a wordcloud. You will create a ready-to-use Jupyter notebook for creating a wordcloud from any text dataset. Lemmatization is the process of removing only inflectional endings to return the base or dictionary form of a word, which is known as the lemma. TF-IDF stands for term frequency-inverse document frequency. It gives each word a weight that indicates how important that term is to a document relative to the rest of the dataset. Using both lemmatization and TF-IDF, one can find the important words in a text dataset and use them to create the wordcloud. For example, the dataset could be customer complaints, and the business could focus on the most important issues its customers are facing. A wordcloud is a powerful resource that can be used in reports and presentations. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
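As a rough illustration of the TF-IDF weighting described above, here is a minimal pure-Python sketch. It assumes the plain textbook formula (term frequency times the log of the inverse document frequency); the function name `tf_idf` and the tiny ingredient lists are made up for illustration and are not the course's dataset or code. Library implementations often use smoothed variants, so their numbers may differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumes the unsmoothed textbook definition:
    weight = (count / document length) * log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-lemmatized ingredient lists (illustrative only).
recipes = [
    ["sugar", "butter", "flour", "cinnamon"],
    ["sugar", "butter", "egg"],
    ["sugar", "cranberry", "orange"],
]
weights = tf_idf(recipes)
```

Under this definition, a term that appears in every document (here "sugar") gets a weight of zero, while a term unique to one document (here "cinnamon") gets the highest weight in that document, which is why such terms would dominate a wordcloud.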
Computational Social Science Capstone Project
CONGRATULATIONS! Not only have you completed our intellectual tour de force, but you now also have all the skills required to execute a comprehensive multi-method workflow of computational social science. We will put these skills to work in this final integrative lab, where we bring it all together. We scrape data from a social media site (drawing on the skills obtained in the 1st course of this specialization). We then analyze the collected data by visualizing the resulting networks (building on the skills obtained in the 3rd course). We analyze some key aspects of the data in depth, using machine-learning-powered natural language processing (putting to work the insights obtained during the 2nd course). Finally, we use a computer simulation model to explore possible generative mechanisms and to scrutinize aspects that we did not find in our empirical reality, but that could help us improve this aspect of society (drawing on the skills obtained during the 4th course of this specialization). The result is a first glimpse at a new way of doing social science in a digital age: computational social science. Congratulations! Having done all of this yourself, you can consider yourself a fledgling computational social scientist!
Aggregate Data with LibreOffice Base Queries
By the end of this project, you will have written LibreOffice Base queries to retrieve and aggregate data from a Sales database. Using both the Query Design tool and the SQL View, you will group and summarize data using functions such as Sum, Average, Count, Min, and Max. Aggregating (grouping and summarizing) data can significantly increase its value to users who analyze it. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
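To give a sense of what grouping and summarizing looks like in SQL, here is a small sketch using Python's built-in `sqlite3` module as a stand-in. The `sales` table, its columns, and the rows are hypothetical, and LibreOffice Base's own SQL dialect may differ in details from SQLite.

```python
import sqlite3

# In-memory database with a hypothetical sales table (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100.0), ("East", 300.0), ("West", 50.0)],
)

# Group rows by region and summarize each group with aggregate functions.
rows = conn.execute(
    """
    SELECT region, SUM(amount), AVG(amount), COUNT(*), MIN(amount), MAX(amount)
    FROM sales
    GROUP BY region
    ORDER BY region
    """
).fetchall()
conn.close()
```

Each tuple in `rows` holds one region followed by its sum, average, count, minimum, and maximum, so three detail rows collapse into two summary rows.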
Practical Python for AI Coding 2
Introduction video: https://youtu.be/TRhwIHvehR0 This course is for complete novices in Python coding, so no prior knowledge or experience in software coding is required. This course selects, introduces, and explains the Python syntax, functions, and libraries that are frequently used in AI coding. In addition, it explains the complementary relationship among NumPy, Pandas, and TensorFlow, so it is helpful even for seasoned Python users. The course starts by building a working AI coding environment on learners’ desktop or notebook computers, so that they can start AI modeling and coding with Scikit-learn, TensorFlow, and Keras upon completing the course. Because learners will have an AI coding environment on their own computers, they can start AI coding without needing to join or use cloud-based services.
Build your first Machine Learning Pipeline using Dataiku
As part of this guided project, you will build your first machine learning pipeline using the Dataiku tool without writing a single line of code. You will build a prediction model that takes daily COVID count data from across the world as input and predicts COVID fatalities. Dataiku is a low-code/no-code platform that is gaining traction with citizen data scientists who want to quickly build and deploy their models.
SARS-CoV-2 Protein Modeling and Drug Docking
In this 1-hour long project-based course, you will construct a 3D structure of a SARS-CoV-2 protein sequence using homology modeling, perform molecular docking of drugs against this protein molecule, and infer protein-drug interactions. We will accomplish this by completing each task in the project: modeling protein structures from sequence data, processing proteins and ligands for the docking procedure, and molecular docking of drugs against protein molecules. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
CUDA Advanced Libraries
This course will complete the GPU specialization, focusing on the leading libraries distributed as part of the CUDA Toolkit. Students will learn how to use cuFFT and linear algebra libraries to perform complex mathematical computations. The Thrust library’s capabilities in representing common data structures and associated algorithms will be introduced. Using cuDNN and cuTENSOR, students will be able to develop machine learning applications that help with object detection, human language translation, and image classification.
Database Architecture, Scale, and NoSQL with Elasticsearch
In this final course, you will explore database architecture, PostgreSQL, and various scalable deployment configurations. You will see how PostgreSQL implements basic CRUD operations and indexes, and review how transactions and the ACID (Atomicity, Consistency, Isolation, Durability) requirements are implemented. You’ll learn to use Elasticsearch, a common NoSQL database that supplements a relational database with high-speed search and indexing. We will examine Elasticsearch as an example of a BASE-style (Basic Availability, Soft State, Eventual Consistency) database approach, and compare and contrast the advantages and challenges associated with ACID and BASE databases.
High Throughput Databases with Microsoft Azure Cosmos DB
By the end of this guided project, you will have successfully created an Azure account, logged into the Azure Portal, and created and configured a new Azure Cosmos DB account. You will have created a Cosmos DB database and containers, imported sample data, and tested the import by running queries against the database using a Cosmos DB Notebook. You will also have configured both manual and autoscale throughput for databases and individual containers, and configured request units (RUs). Having successfully configured throughput, you will have made the database globally accessible by creating a read replica in a different region, providing users with nearby access to the data as well as high availability. You will complete the project by deleting any resources that are no longer required, to keep costs to a minimum. If you enjoy this project, we recommend taking the Microsoft Azure Data Fundamentals DP-900 Exam Prep Specialization: https://www.coursera.org/specializations/microsoft-azure-dp-900-data-fundamentals