Tag: statistics
-
Python Data Science for Real Estate & REIT Amsterdam: (Auto) EDA, NLP, Maps & ML
The Amsterdam real estate market has experienced a significant resurgence, with property prices increasing by double digits annually since 2013. Data science is being used to analyze the city’s housing and rental markets, revealing insights on the impact of Airbnb and empowering communities with the necessary information. Comprehensive data analysis and machine learning techniques are…
-
Titanic Benchmark Hypothesis Testing in Disaster Risk Management: (Auto)EDA, ML, HPO & SHAP
This project aims to apply the Titanic benchmark to hypothesis testing in disaster risk management. Using the Titanic dataset on Kaggle, a Machine Learning (ML) analysis was performed to determine whether there is a statistically significant relationship between a person’s death and their passenger class, age, sex, and port of embarkation. The project involved comprehensive ML pipeline implementation…
-
Uber’s Orbit Full Bayesian Time Series Forecasting & Inference
This article introduces Orbit, an open-source Python framework by Uber for full Bayesian time series forecasting and inference. It supports models like Exponential Smoothing, Local Global Trend, and Kernel Time-based Regression, along with methods like Markov-Chain Monte Carlo and Variational Inference. Orbit captures uncertainty in time-series data, allowing credible probabilistic forecasts with confidence intervals. The…
-
100 Basic Python Codes
Source: PYPL Popularity of Programming Language, Feb 2024. Table of Contents: Setting Up Your Environment; Download Datasets; Initial Pandas Data QC; Displaying Pandas Data Types; Showing Descriptive Statistics; Exploring the Dataset; Email Slicer; User Input & Type Conversion; Working with Lists; Practicing Loops; Calculator; Temperature Conversion; ADC Temperature Sensor; Sorting Numpy Arrays; Story Generator; Display…
-
Retail Sales, Store Item Demand Time-Series Analysis/Forecasting: AutoEDA, FB Prophet, SARIMAX & Model Tuning
This study compares and evaluates various forecasting models to predict sales and demand for retail businesses. The focus is on Time Series Analysis (TSA) methods such as FB Prophet and SARIMAX. The final FB Prophet model yields MAE=4.252 and MAPE=0.168, while SARIMAX models’ best performing variant achieves MAE=6.285 and MAPE=0.213. The study emphasizes the importance…
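For readers unfamiliar with the two error metrics quoted above, here is a minimal sketch of how MAE and MAPE are commonly computed. It assumes MAPE is reported as a fraction rather than a percentage, which matches the form of the quoted values but is not confirmed by the excerpt. The function names and sample numbers are illustrative only and do not come from the study.

```python
def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (assumes no zero actuals)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical sales figures, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

mae_value = mae(actual, predicted)
mape_value = mape(actual, predicted)
print(mae_value, mape_value)
```

MAE is expressed in the units of the series, while MAPE is scale-free, which is why both are often reported when comparing models such as Prophet and SARIMAX.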
-
Sales Forecasting: tslearn, Random Walk, Holt-Winters, SARIMAX, GARCH, Prophet, and LSTM
The data science project involves evaluating various sales forecasting algorithms in Python using a Kaggle time-series dataset. The forecasting algorithms include tslearn, Random Walk, Holt-Winters, SARIMA, GARCH, Prophet, LSTM and Di Pietro’s Model. The goal is to predict next month’s sales for a list of shops and products, which slightly changes every month. The best…
-
A Balanced Mix-and-Match Time Series Forecasting: ThymeBoost, Prophet, and AutoARIMA
The post evaluates the performance of popular Time Series Forecasting (TSF) methods, namely AutoARIMA, Facebook Prophet, and ThymeBoost on four real-world time series datasets: Air Passengers, U.S. Wholesale Price Index (WPI), BTC-USD price, and Peyton Manning. Each TSF model uses historical data to identify trends and make future predictions. Studies indicate that ThymeBoost, which combines…
-
NVIDIA Rolling Volatility: GARCH & XGBoost
This post examines the prediction of NVIDIA stock volatility using two models: Generalized Autoregressive Conditional Heteroscedasticity (GARCH) and Extreme Gradient Boosting (XGBoost). The two models are compared in terms of MSE and MAPE. The post finds that the machine learning-based XGBoost model outperforms the GARCH model in NVDA volatility forecasting, showing the effectiveness of…
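As a rough illustration of what "rolling volatility" usually means, the sketch below computes the sample standard deviation of log returns over a sliding window. This is an assumed, common definition; the excerpt does not state which definition, window length, or annualization the post uses. The function names and prices are hypothetical.

```python
import math
import statistics


def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def rolling_volatility(returns, window):
    """Sample standard deviation of returns over each full sliding window."""
    return [
        statistics.stdev(returns[i - window:i])
        for i in range(window, len(returns) + 1)
    ]


# Hypothetical closing prices, for illustration only.
prices = [100.0, 101.0, 99.0, 102.0, 103.0, 101.0]
vol = rolling_volatility(log_returns(prices), window=3)
print(vol)
```

A series like this is the kind of target a GARCH or XGBoost model would be trained to forecast.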
-
NLP of Restaurant Guest Reviews on Tripadvisor
This is a comprehensive study examining restaurant reviews on TripAdvisor across 31 major European cities. The research, based on a dataset scraped from TripAdvisor, aims to perform a sentiment analysis of reviews, exploring average ratings per city, vegetarian-friendly cities, and how local cuisine compares to foreign food. The analysis is carried out using Python, demonstrating…
-
Joint Analysis of Bitcoin, Gold and Crude Oil Prices
This post presents a joint time-series analysis of Bitcoin, Gold and Crude Oil prices from 2021 to 2023. It covers data processing and exploratory data analysis, followed by a range of statistical tests, ARIMA model fitting, and finally Markowitz portfolio optimization. It then presents a detailed analysis, including data…
-
Top 6 Reliability/Risk Engineering Learnings
The content provides a review of Eric Marsden’s e-learning Python courseware on risk engineering, loss prevention and safety management. It includes discussions of various topics such as the failure of light bulbs, electronic components, large computing facility maintenance, and oil field pumps. The content also delves into stock market risk analysis like Value at Risk…
-
Portfolio Optimization of 20 Dividend Growth Stocks
The post discusses implementing a stochastic optimization algorithm to create a balanced portfolio of 20 dividend growth stocks for maximum return within defined risk tolerance. By analyzing daily stock and benchmark data, the algorithm optimizes the portfolio to outperform the benchmark index and achieve desired risk-reward outcomes. The results facilitate spreading investment capital across diverse…
-
SARIMAX Crude Oil Prices Forecast – 2. Brent
This study focuses on validating the EIA energy forecast for the 2023 Brent crude oil spot price using SARIMAX time-series cross-validation. It includes prerequisites, data loading, ETS decomposition, ADF test, SARIMAX modeling, predictions, model evaluation, and summary. The predictions align with the EIA forecast, with discrepancies within predicted confidence intervals.
-
SARIMAX Crude Oil Prices Forecast – 1. WTI
The post presents a detailed forecast of Brent and WTI oil prices for 2023, using Python, SARIMAX and Time Series Analysis. The data indicates volatility in the oil market starting in 2023, with prices set to decrease from 2022 levels. Experts also warn of a potential US recession in 2023, which could further impact the oil…
-
SARIMAX Forecasting of Online Food Delivery Sales
This article provides a beginner-friendly guide to understanding and evaluating ARIMA-based time-series forecasting models such as SARIMA and SARIMAX. It focuses on a QC-optimized SARIMA(X) model to forecast the e-commerce sales of a food delivery company. The post covers essential concepts, data processing, model comparisons, and insights. It also includes a comparison between SARIMA and…
-
ANOVA-OLS Prediction of Surgical Volumes
Operating rooms (ORs) are some of the most valuable hospital assets, generating a large part of hospital revenue. Statistical models have been developed using datasets to predict daily surgical volumes weeks in advance. We focus on the VUMC dataset for evaluation of our statistical models. We use the ANOVA null-hypothesis test for the total number…
-
Stock Forecasting with FBProphet
Prophet from Meta (Facebook) is a procedure for forecasting time series data such as stocks. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.