Tag: data visualization
-
Titanic Benchmark Hypothesis Testing in Disaster Risk Management: (Auto)EDA, ML, HPO & SHAP

This project applies the Titanic benchmark to hypothesis testing in disaster risk management. Using the Titanic dataset from Kaggle, a Machine Learning (ML) analysis was performed to test for a statistically significant relationship between a passenger’s death and their passenger class, age, sex, and port of embarkation. The project involved comprehensive ML pipeline implementation…
-
The 5-Step GCP IoT Device-to-Report via AI Roadmap

The Internet of Things (IoT) helps improve processes and enables new scenarios through network-connected devices. IoT is recognized as a driver of the Fourth Industrial Revolution, and its applications include predictive maintenance, industrial safety, automation, remote monitoring, asset tracking, and fraud detection. Advancements in cloud IoT architectures over recent years have enabled efficient data ingestion,…
-
Plotly Dash TA Stock Market App

The post explains how to deploy a Plotly Dash stock market app in Python with a dashboard of prices for a user-defined stock. This includes technical indicators such as volume, MACD, and the stochastic oscillator. The steps include selecting a stock ticker symbol (NVDA), retrieving stock data from the yfinance API, adding Moving Averages, saving the stock chart in HTML form,…
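As a rough illustration of the Moving Averages and MACD mentioned in the excerpt above, the following minimal sketch computes them in plain Python. The function names `sma`, `ema`, and `macd` and the sample prices are hypothetical, and the 12/26/9 spans are the conventional MACD defaults, assumed here rather than taken from the post. The post itself works with yfinance data and Plotly Dash, which this sketch does not use.

```python
from typing import List, Optional, Tuple


def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until a full window is available."""
    out: List[Optional[float]] = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window : i + 1]) / window)
    return out


def ema(values: List[float], span: int) -> List[float]:
    """Exponential moving average, seeded with the first value."""
    if not values:
        return []
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(
    prices: List[float], fast: int = 12, slow: int = 26, signal: int = 9
) -> Tuple[List[float], List[float], List[float]]:
    """Return (macd_line, signal_line, histogram) for a price series."""
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    line = [f - s for f, s in zip(fast_ema, slow_ema)]
    sig = ema(line, signal)
    hist = [m - s for m, s in zip(line, sig)]
    return line, sig, hist


if __name__ == "__main__":
    closes = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]  # made-up closing prices
    print(sma(closes, 3))
    macd_line, signal_line, histogram = macd(closes)
    print(macd_line[-1], signal_line[-1], histogram[-1])
```

In a steadily rising series like the sample, the fast average sits above the slow one, so the MACD line is positive. A flat series gives a MACD line of zero.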
-
Low-Code AutoEDA of Dutch eHealth Data in Python

The article details the use of low-code AutoEDA in Python to examine the Dutch Healthcare Authority’s eHealth data. Using Python libraries such as D-Tale and SweetViz, the study aims to understand the key features of the healthcare data and prepare it for AI techniques. The motivations include the Dutch government’s support for digital healthcare applications, especially amidst the recent…
-
Wind Energy ML Prediction & Turbine Power Control

This post presents a detailed project on modeling the power curve of a wind turbine, which is crucial in wind energy management and forecasting. By using machine learning techniques such as Random Forest and Gradient Boosting Regressors, and validating with real-world SCADA data from a Turkish wind farm, the project shows it’s possible to create…
-
Morocco Earthquake EDA

Featured design via Canva. Clickable Table of Contents. Basic Installations and Imports: let’s set the working directory YOURPATH, then install and import the following libraries. Download Earthquake Input Data: for this project, we’ll use a dataset that contains all seismic events over the last seven days that have a magnitude of 1.0 or greater. Output:…
-
NLP & Stock Impact of ChatGPT-Related Tweets

This Python project extends a recent study on half a million tweets about OpenAI’s language model, ChatGPT. It uncovers public sentiment about this rapidly growing app and examines its impact on the future of AI-powered LLMs, including stock influences. The project uses data analysis techniques such as text processing, sentiment analysis, identification of key influencers,…
-
An Overview of Video Games in 2023: Trends, Technology, and Market Research

The gaming industry is rapidly growing, projected to reach a revenue of $365.6 billion in 2023. Major trends include Web3 gaming, AI integration, and a push for consolidation. Fashion brands collaborate for virtual sales, and advances in gaming technology, such as AR/VR and cloud-based gaming, promise an even more immersive experience for gamers.
-
A Comparison of Automated EDA Tools in Python: Pandas-Profiling vs SweetViz

Exploratory Data Analysis (EDA) is an important part of data science projects, designed to identify patterns, anomalies, and relationships. It can employ univariate, bivariate, and multivariate data analytics, and can be accelerated using automated EDA tools. The article discusses Python libraries such as Pandas-Profiling and SweetViz for automating EDA and demonstrates their application to improve…
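To make the univariate and bivariate steps in the excerpt above concrete, here is a minimal standard-library sketch of what an automated EDA tool computes at its simplest: per-column summary statistics and a pairwise Pearson correlation. It is a plain-Python stand-in and does not use the Pandas-Profiling or SweetViz APIs. The function names `univariate_summary` and `pearson` and the sample columns are hypothetical.

```python
import math
import statistics
from typing import Dict, List


def univariate_summary(values: List[float]) -> Dict[str, float]:
    """Basic descriptive statistics for a single numeric column."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two equal-length columns."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("columns must have equal length of at least 2")
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (sx * sy)


if __name__ == "__main__":
    # Made-up columns, for illustration only.
    age = [22.0, 38.0, 26.0, 35.0, 54.0]
    fare = [7.25, 71.28, 7.92, 53.10, 51.86]
    print(univariate_summary(age))
    print(round(pearson(age, fare), 3))
```

Multivariate analysis extends the same idea to a full correlation matrix over every pair of columns, which is the kind of repetitive work the automated tools take over.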
-
NLP of Restaurant Guest Reviews on Tripadvisor

This is a comprehensive study examining restaurant reviews on TripAdvisor across 31 major European cities. The research, based on a dataset scraped from TripAdvisor, aims to perform a sentiment analysis of reviews, exploring average ratings per city, vegetarian-friendly cities, and how local cuisine compares to foreign food. The analysis is carried out using Python, demonstrating…
-
Improved Multiple-Model ML/DL Credit Card Fraud Detection: F1=88% & ROC=91%

In 2023, the global card industry is projected to suffer $36.13 billion in fraud losses. This has made better credit card fraud detection a priority for banks and financial organizations. AI-based techniques are making fraud detection easier and more accurate, with models able to recognize unusual transactions and fraud. The post discusses a…
-
Datapane Stock Screener App from Scratch

This post provides a quick guide for value investors to using the Datapane stock screener API in Python. It includes instructions for installation, importing standard libraries, setting the stock ticker, downloading the stock’s Adj Close price, and creating visualizations. The post also describes how to build a powerful report using Datapane’s layout components.
-
Unsupervised ML, K-Means Clustering & Customer Segmentation

Table of Clickable Contents: Motivation, Methods, Open-Source Datasets. This file contains the basic information (ID, age, gender, income, and spending score) about the customers. Online Retail is a transnational data set that contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retailer. The company mainly sells unique all-occasion…
-
Dabl Auto EDA-ML

Dabl, short for Data Analysis Baseline Library, is a high-level data exploration library in Python that automates repetitive data wrangling tasks in the early stages of supervised machine learning model development. Developed by Andreas Mueller and the scikit-learn community, it facilitates data preprocessing, advanced integrated visualization, exploratory data analysis (EDA), and ML model development, demonstrated…
-
Joint Analysis of Bitcoin, Gold and Crude Oil Prices

The post presents a comprehensive joint time-series analysis of Bitcoin, Gold and Crude Oil prices from 2021 to 2023. It covers data processing and exploratory data analysis, followed by a range of statistical tests, ARIMA model fitting, and finally Markowitz portfolio optimization. It then presents a detailed analysis, including data…
-
Video Game Sales Data Exploration

The post explores the gaming industry’s size and state, highlighting a potential market value of $314bn by 2027. It emphasizes the industry’s three main subsectors: console, PC, and smartphone gaming. Moreover, the post conducts extensive data analysis on video game sales data, using Python to examine aspects such as genre profitability, platform sales prices, and…
-
Using AI/ANN AUC>90% for Early Diagnosis of Cardiovascular Disease (CVD)

The project applies AI to cardiovascular medicine, focusing on early diagnosis of heart disease using Artificial Neural Networks (ANN). Aiming to improve early detection of heart issues, the project processed a dataset of 303 patients using Python libraries and conducted extensive exploratory data analysis. A Sequential ANN model was subsequently built, revealing excellent performance…
-
Overview of AWS Tech Portfolio 2023

This summary focuses on the extensive capabilities of Amazon Web Services (AWS) as of 2023, highlighting its 27% year-on-year growth and an increase in net sales to $127.1 billion. AWS emerges as the top cloud service provider, offering over 200 services including compute, storage, databases, networking, AI, and machine learning. It is constantly expanding operations, having opened…

