Tag: Kaggle data
-
Malware Detection & Interpretation – PCA, T-SNE & ML

This post applies PCA, t-SNE, and supervised ML algorithms to malware detection on a benchmark dataset. Logistic Regression, SVC, KNN, and XGBoost classifiers are implemented and achieve high performance metrics. The results show the potential of ML to improve malware detection, reduce false positives, and strengthen cyber defense.
-
H2O AutoML Malware Detection

This study explores AI-powered malware detection using the H2O AutoML algorithm for effective and rapid classification of PE files. The optimized Stacked Ensemble model achieved high precision, recall, and F1 score. The research validates the H2O AutoML workflow’s accurate malware identification and supports related R&D products and solutions in the field of information security.
-
An Implemented Streamlit Crop Prediction App

We implement a Streamlit crop prediction app for precision agriculture (smart farming). The ML-driven app takes a trained model as input.
-
Time Series Forecasting of Hourly U.S.A. Energy Consumption – PJM East Electricity Grid

Table of Contents: PJME Data; set the working directory YOURPATH and import the key libraries; read the input csv file from the working directory; plot the time series; Data Preparation, with output shapes (113926, 1, 9), (113926,), (31439, 1, 9), (31439,); LSTM TSF; plot the LSTM train/test val_loss history. Output: MSE: 1811223.125…
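The output shapes above have the three-dimensional layout (samples, timesteps, features) that LSTM layers expect, here with 1 timestep and 9 features per sample. The sketch below shows one way such an input could be built using only the standard library. The excerpt does not say which 9 features the post uses, so the calendar features, the helper names `calendar_features`, `to_lstm_input`, and `shape`, and the 48-hour sample range are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def calendar_features(ts):
    """Return 9 hypothetical calendar features for one hourly timestamp."""
    return [
        ts.hour,                    # hour of day, 0-23
        ts.weekday(),               # day of week, Monday = 0
        (ts.month - 1) // 3 + 1,    # quarter, 1-4
        ts.month,                   # month, 1-12
        ts.year,                    # calendar year
        ts.timetuple().tm_yday,     # day of year, 1-366
        ts.day,                     # day of month
        ts.isocalendar()[1],        # ISO week number
        int(ts.weekday() >= 5),     # weekend flag (assumed ninth feature)
    ]


def to_lstm_input(rows):
    """Wrap each feature row in a single timestep: (n, 9) -> (n, 1, 9)."""
    return [[row] for row in rows]


def shape(nested):
    """Report the dimensions of a non-empty, regular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Hypothetical sample: 48 consecutive hourly timestamps.
start = datetime(2018, 1, 1, 0)
timestamps = [start + timedelta(hours=h) for h in range(48)]
X = to_lstm_input([calendar_features(t) for t in timestamps])
print(shape(X))  # (48, 1, 9)
```

With an array library, the same step is usually a single reshape of the 2D feature matrix to (n, 1, 9).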
-
Wind Energy ML Prediction & Turbine Power Control

This text presents a detailed project on modeling the power curve of a wind turbine, which is crucial in wind energy management and forecasting. By using machine learning techniques such as Random Forest and Gradient Boosting Regressors, and validating with real-world SCADA data from a Turkish wind farm, the project shows it’s possible to create…
-
Robust Fake News Detection: NLP Algorithms for Deep Learning and Supervised ML in Python

The project aims at setting up a robust system for fake news detection using Python. The system adopts a hybrid framework, leveraging Natural Language Processing (NLP) techniques to classify text-based fake vs real news. Involving exploratory data analysis, multi-model training, testing, validation, and performance metrics comparison, it assesses different Deep Learning, Supervised Machine Learning, and…
-
Supervised ML Room Occupancy IoT

The article presents a study on applying machine learning (ML) to IoT sensor data for workspace occupancy detection. Comparing 14 popular scikit-learn classifiers, the ML systems built use the gathered IoT sensor data to predict room occupancy with high certainty. The results suggest temperature and light are the significant factors affecting occupancy detection. The study…
-
WA House Price Prediction: EDA-ML-HPO

A predictive model of house sale prices in King County, Washington, was developed using multiple supervised machine learning (ML) regression models, including LinearRegression, SGDRegressor, RandomForestRegressor, XGBRegressor, and AdaBoostRegressor. The best-performing model, XGBRegressor, explained 90.6% of the price variance, with an RMSE of $18,472.7. These results, valuable to local realtors, indicate houses with a waterfront are…
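The two figures quoted above are the coefficient of determination (the share of price variance explained) and the root mean squared error. A minimal sketch of how both are computed is given here, using only the standard library. The prices are made-up illustrative values, not data from the post, and the function names `rmse` and `r2_score` are chosen for this sketch.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target (dollars)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(y_true, y_pred):
    """Share of the target's variance explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical sale prices and model predictions (illustration only).
actual = [300_000, 450_000, 500_000, 750_000]
predicted = [310_000, 440_000, 520_000, 730_000]

print(f"RMSE: ${rmse(actual, predicted):,.1f}")
print(f"R^2:  {r2_score(actual, predicted):.3f}")
```

An R^2 of 0.906 corresponds to the "90.6% of the price variance" wording, while the RMSE gives the typical size of a prediction error in dollars.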
-
NLP & Stock Impact of ChatGPT-Related Tweets

This Python project extends a recent study on half a million tweets about OpenAI’s language model, ChatGPT. It uncovers public sentiment about this rapidly growing app and examines its impact on the future of AI-powered LLMs, including stock influences. The project uses data analysis techniques such as text processing, sentiment analysis, identification of key influencers,…
-
An Overview of Video Games in 2023: Trends, Technology, and Market Research

The gaming industry is rapidly growing, projected to reach a revenue of $365.6 billion in 2023. Major trends include Web3 gaming, AI integration, and a push for consolidation. Fashion brands collaborate for virtual sales, and advances in gaming technology, such as AR/VR and cloud-based gaming, promise an even more immersive experience for gamers.
-
Customer Reviews NLP Spacy Analysis and ML/AI Demand Forecasting of the Steam PC Video Game Service

Steam, a leading digital distribution platform for PC gaming, has seen over 6000 new games released in 2022, averaging over 34 games each day. This post aims to conduct comprehensive customer reviews NLP sentiment analysis and ML/AI demand forecasting using public-domain datasets. It covers EDA, NLP Spacy analysis, ML/AI pipeline, model validation, word clouds, and…
-
NLP of Restaurant Guest Reviews on Tripadvisor

This is a comprehensive study examining restaurant reviews on TripAdvisor across 31 major European cities. The research, based on a dataset scraped from TripAdvisor, aims to perform a sentiment analysis of reviews, exploring average ratings per city, vegetarian-friendly cities, and how local cuisine compares to foreign food. The analysis is carried out using Python, demonstrating…
-
Improved Multiple-Model ML/DL Credit Card Fraud Detection: F1=88% & ROC=91%

In 2023, the global card industry is projected to suffer $36.13 billion in fraud losses. This has necessitated a priority focus on enhancing credit card fraud detection by banks and financial organizations. AI-based techniques are making fraud detection easier and more accurate, with models able to recognize unusual transactions and fraud. The post discusses a…
-
Video Game Sales Data Exploration

The post explores the gaming industry’s size and state, highlighting a potential market value of $314bn by 2027. It emphasizes the industry’s three main subsectors: console, PC, and smartphone gaming. Moreover, the post conducts extensive data analysis on video game sales data, using Python to examine aspects such as genre profitability, platform sales prices, and…
-
Using AI/ANN AUC>90% for Early Diagnosis of Cardiovascular Disease (CVD)

The project utilizes AI-driven cardiovascular medicine with a focus on early diagnosis of heart disease using Artificial Neural Networks (ANN). Aiming to improve early detection of heart issues, the project processed a dataset of 303 patients using Python libraries and conducted extensive exploratory data analysis. A Sequential ANN model was subsequently built, revealing excellent performance…
-
About Face Recognition ML Algorithms

Facial Recognition (FR) involves mapping an individual’s facial features mathematically and storing the data as a faceprint. This case study outlines the process of Exploratory Data Analysis (EDA) and performance QC analysis for ML/AI workflows using public-domain datasets and real-time webcam GUI. The study includes the use of SVM for FR, dataset splitting, ML model…
-
Multi-Label Keras CNN Image Classification of MNIST Fashion Clothing

Machine learning and deep learning are invaluable in optimizing supply chain operations in fashion retail. Even smaller retailers are leveraging ML algorithms to meet customer demands. Neural network models, particularly Convolutional Neural Networks (CNNs), are used to classify clothing images, such as those in the Fashion-MNIST dataset, with high accuracy. Hyperparameter optimization using GridSearchCV and Nadam optimizer are…
-
Breast Cancer ML Classification – Logistic Regression vs Gradient Boosting with Hyperparameter Optimization (HPO)

Breast Cancer (BC) is a leading cause of cancer death among women worldwide. The present study optimizes the use of supervised Machine Learning (ML) algorithms for detecting, analyzing, and classifying BC. We compare Logistic Regression (LR) against Gradient Boosting (GB) Classifier within the Hyperparameter Optimization (HPO) loop given by GridSearchCV. We use the publicly available BC dataset…
-
AI-Powered Customer Churn Prediction

Churn is a good indicator of growth potential. Churn rates track lost customers, and growth rates track new customers. Comparing and analyzing both of these metrics shows how much your business is growing over time. In this project, we explored the churn rate in depth and examined an example implementation of…
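The comparison of churn and growth rates described above comes down to simple arithmetic. The sketch below assumes both rates are measured against the customer count at the start of the period, which is one common convention. The function names and the customer counts are hypothetical and not taken from the post.

```python
def churn_rate(customers_at_start, customers_lost):
    """Fraction of the starting customer base lost during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


def growth_rate(customers_at_start, customers_gained):
    """New customers in the period as a fraction of the starting base."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_gained / customers_at_start


def net_growth_rate(customers_at_start, customers_gained, customers_lost):
    """Growth rate minus churn rate: the net change in the customer base."""
    return (growth_rate(customers_at_start, customers_gained)
            - churn_rate(customers_at_start, customers_lost))


# Hypothetical period: 1,000 customers at the start, 150 gained, 50 lost.
start, gained, lost = 1000, 150, 50
print(f"churn:  {churn_rate(start, lost):.1%}")
print(f"growth: {growth_rate(start, gained):.1%}")
print(f"net:    {net_growth_rate(start, gained, lost):.1%}")
```

A positive net rate means the business gained more customers than it lost in the period; a negative one means churn outpaced acquisition.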
