Tag: Machine Learning

  • An Intro to Graph Algorithms in R

    This tutorial introduces Graph Algorithms (GA) in R, focusing on Graph Theory and GA-R. It covers graph theory fundamentals, network analysis in R, and the deployment funnel. It also showcases examples of GA-R applications such as the igraph in R demo and the spatial co-location of employees in a workplace network. Industries and companies involved…

  • Python Data Science for Real Estate & REIT Amsterdam: (Auto) EDA, NLP, Maps & ML

    The Amsterdam real estate market has experienced a significant resurgence, with property prices increasing by double digits annually since 2013. Data science is being used to analyze the city’s housing and rental markets, revealing insights on the impact of Airbnb and empowering communities with the necessary information. Comprehensive data analysis and machine learning techniques are…

  • Time Series Data Imputation, Interpolation & Anomaly Detection

    The post compares popular time series data imputation, interpolation, and anomaly detection methods. It explores the challenges of missing data and their impact on data processing, analysis, and model accuracy. The study performs data-centric experiments to benchmark optimal methods and highlights the importance of imputation for time series forecasting. It provides practical strategies and techniques for…
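
    As a minimal sketch of one of the simplest methods in this family, linear interpolation, the snippet below fills interior gaps in an evenly spaced series. The function name `interpolate_linear` and the sample values are illustrative assumptions, not taken from the post.

```python
def interpolate_linear(values):
    """Fill interior None gaps by linear interpolation between known neighbours.

    Assumes evenly spaced samples. Leading and trailing gaps stay None
    because they have only one known neighbour.
    """
    filled = list(values)
    # Indices of the observations that are actually present.
    known = [i for i, v in enumerate(filled) if v is not None]
    # Walk over consecutive pairs of known points and fill what lies between.
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            filled[i] = filled[left] + weight * (filled[right] - filled[left])
    return filled


# Hypothetical series with two interior gaps and one trailing gap.
series = [1.0, None, None, 4.0, None, 6.0, None]
print(interpolate_linear(series))
```

    The trailing gap is deliberately left unfilled: extrapolating past the last observation is a separate modelling decision.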

  • Uber’s Orbit Full Bayesian Time Series Forecasting & Inference

    This article introduces Orbit, an open-source Python framework by Uber for full Bayesian time series forecasting and inference. It supports models like Exponential Smoothing, Local Global Trend, and Kernel Time-based Regression, along with methods like Markov-Chain Monte Carlo and Variational Inference. Orbit captures uncertainty in time-series data, allowing credible probabilistic forecasts with confidence intervals. The…

  • Kalman-Based Object Tracking with Low Signal/Noise Ratio

    This study focuses on real-time object tracking with low signal/noise ratios using Kalman Filter (KF) algorithms. The study covers 1D, 2D, and 3D motion analysis, and explores the impact of noise on the accuracy of object tracking. The accuracy of the KF algorithms in estimating the object’s position and speed in real-time scenarios is evaluated…

  • 100 Basic Python Codes

    Source: PYPL Popularity of Programming Language, Feb 2024. Table of Contents: Setting Up Your Environment; Download Datasets; Initial Pandas Data QC; Displaying Pandas Data Types; Showing Descriptive Statistics; Exploring the Dataset; Email Slicer; User Input & Type Conversion; Working with Lists; Practicing Loops; Calculator; Temperature Conversion; ADC Temperature Sensor; Sorting Numpy Arrays; Story Generator; Display…

  • Malware Detection & Interpretation – PCA, T-SNE & ML

    This post discusses the application of PCA, T-SNE, and supervised ML algorithms for malware detection using a benchmark dataset. Techniques such as Logistic Regression, SVC, KNN, and XGBoost are implemented, achieving high performance metrics. Results show potential for improving malware detection using ML while reducing false positives and enhancing cyber defense.

  • Retail Sales, Store Item Demand Time-Series Analysis/Forecasting: AutoEDA, FB Prophet, SARIMAX & Model Tuning

    This study compares and evaluates various forecasting models to predict sales and demand for retail businesses. The focus is on Time Series Analysis (TSA) methods such as FB Prophet and SARIMAX. The final FB Prophet model yields MAE=4.252 and MAPE=0.168, while the best-performing SARIMAX variant achieves MAE=6.285 and MAPE=0.213. The study emphasizes the importance…
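
    For reference, the two metrics quoted above are conventionally defined as in the sketch below. The sample numbers are made up for illustration, and MAPE is returned as a fraction rather than a percentage, which is an assumption consistent with values such as 0.168.

```python
def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction.

    Undefined when any actual value is zero (division by zero).
    """
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical demand values and forecasts.
actual = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 40.0]
print(f"MAE  = {mae(actual, predicted):.3f}")
print(f"MAPE = {mape(actual, predicted):.3f}")
```

    MAE is expressed in the units of the target, while MAPE is scale-free, which is why the two are often reported together.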

  • H2O AutoML Malware Detection

    This study explores AI-powered malware detection using the H2O AutoML algorithm for effective and rapid classification of PE files. The optimized Stacked Ensemble model achieved high precision, recall, and F1 score. The research validates the H2O AutoML workflow’s accurate malware identification and supports related R&D products and solutions in the field of information security.

  • Kalman-Based Target Trajectory Tracking Performance QC Analysis

    Photo by Kelly on Pexels. Table of Contents: The Kalman Filter Intuition; Formulation of a Problem; Linear Position-Time Path; Parabolic Position-Time Path; Extended Kalman Filter (EKF); Tracking the Bike’s Path; Unscented Kalman Filter (UKF); 1. Prediction Step; 2. Correction Step; Industry Application in Dynamic Positioning System; Smoothed Position and Speed Estimates; Radar EKF Trajectory; Conclusions…

  • Anatomy of the Robust 1D Kalman Filter

    The Kalman Filter (KF) is a powerful tool for tracking, navigation, and data prediction tasks. It assumes linear dynamics and Gaussian noise, which lets it iteratively update its predictions as new measurements arrive. The article outlines a simplified implementation of KF in Python, with examples demonstrating its effectiveness in handling noisy measurements. It also…
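
    The predict/update cycle of a scalar KF can be sketched as follows. This is not the article's implementation: it assumes a constant-state (random-walk) model, and the function name `kalman_1d`, its parameters, and the measurement values are hypothetical.

```python
def kalman_1d(measurements, meas_var, process_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a state assumed to stay roughly constant.

    meas_var    -- variance of the measurement noise (R)
    process_var -- variance of the process noise (Q)
    x0, p0      -- initial state estimate and its variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is carried over, its uncertainty grows by Q.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


# Hypothetical noisy readings of a quantity whose true value is near 5.
readings = [5.3, 4.6, 5.1, 4.9, 5.4, 4.7, 5.0, 5.2]
for z, x in zip(readings, kalman_1d(readings, meas_var=0.1, process_var=1e-4, p0=100.0)):
    print(f"measured {z:.2f} -> estimate {x:.3f}")
```

    A large `p0` tells the filter to trust the first measurements more than the initial guess; a larger `process_var` makes it track changes faster at the cost of a noisier estimate.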

  • Leveraging Predictive Uncertainties of Time Series Forecasting Models

    Featured Image via Canva. Table of Contents: Introduction; Random Simulation Tests; TSLA Stock 43 Days; TSLA Stock 300 Days; Housing in the United States; Industrial Production; Federal Funds Rate Data; S&P 500 Absolute Returns; Number of Airline Passengers - 1. Holt-Winters; Number of Airline Passengers - 2. Prophet; Average Temperature in India; Monthly Sales Data Analysis QC…

  • Real-Time Stock Sentiment Analysis w/ NLP Web Scraping

    Stock sentiment analysis is gaining popularity as a technique to understand public opinions on specific assets. This study uses NLP web scraping in Python to extract stock sentiments from financial news headlines on FinViz. The sentiment analysis can help determine investor opinions and potential impacts on stock prices, though it is not a standalone predictor.

  • Sales Forecasting: tslearn, Random Walk, Holt-Winters, SARIMAX, GARCH, Prophet, and LSTM

    The data science project involves evaluating various sales forecasting algorithms in Python using a Kaggle time-series dataset. The forecasting algorithms include tslearn, Random Walk, Holt-Winters, SARIMA, GARCH, Prophet, LSTM and Di Pietro’s Model. The goal is to predict next month’s sales for a list of shops and products, which slightly changes every month. The best…

  • Dividend-NG-BTC Diversify Big Tech

    Can dividends, natural gas, and crypto diversify Big Tech? Ultimately, we need to answer the following fundamental question: can Dividend Kings, NGUSD, and BTC-USD diversify growth tech assets? Dividends are very popular among investors, especially those who want a steady stream of income from their investments. Some companies choose to share their profits…

  • Returns-Volatility Domain K-Means Clustering and LSTM Anomaly Detection of S&P 500 Stocks

    This study aims to implement and evaluate the K-means algorithm for ranking/clustering S&P 500 stocks based on average annualized return and volatility. The second goal is to detect anomalies in the best performing S&P 500 stocks using the Isolation Forest algorithm. Additionally, anomalies in the S&P 500 historical stock price time series data will be…

  • NVIDIA Rolling Volatility: GARCH & XGBoost

    This post examines the prediction of NVIDIA stock volatility using two models: Generalized Autoregressive Conditional Heteroscedasticity (GARCH) and Extreme Gradient Boosting (XGBoost). The two models are compared in terms of MSE and MAPE. The post finds that the machine learning-based XGBoost model outperforms the GARCH model in NVDA volatility forecasting, showing the effectiveness of…

  • An Implemented Streamlit Crop Prediction App

    Precision agriculture (smart farming): we implement a Streamlit crop prediction app. The app is ML-driven and requires a trained model as input.

  • Time Series Forecasting of Hourly U.S.A. Energy Consumption – PJM East Electricity Grid

    Table of Contents: PJME Data. Let’s set the working directory YOURPATH and import the following key libraries. Let’s read the input CSV file in our working directory. Let’s plot the time series. Data Preparation. Output: (113926, 1, 9) (113926,) (31439, 1, 9) (31439,). LSTM TSF. Let’s plot the LSTM train/test val_loss history. Output: MSE: 1811223.125…

  • Supervised ML Room Occupancy IoT

    The article presents a study on applying machine learning (ML) to IoT sensor data for workspace occupancy detection. Comparing 14 popular scikit-learn classifiers, the ML systems built use the gathered IoT sensor data to predict room occupancy with high certainty. The results suggest temperature and light are the significant factors affecting occupancy detection. The study…