Tag: technology

  • Python Data Science for Real Estate & REIT Amsterdam: (Auto) EDA, NLP, Maps & ML

    The Amsterdam real estate market has experienced a significant resurgence, with property prices increasing by double digits annually since 2013. Data science is being used to analyze the city’s housing and rental markets, revealing insights on the impact of Airbnb and empowering communities with the necessary information. Comprehensive data analysis and machine learning techniques are…

  • 100 Basic Python Codes

    Source: PYPL Popularity of Programming Language, Feb 2024. Table of Contents: Setting Up Your Environment; Download Datasets; Initial Pandas Data QC; Displaying Pandas Data Types; Showing Descriptive Statistics; Exploring the Dataset; Email Slicer; User Input & Type Conversion; Working with Lists; Practicing Loops; Calculator; Temperature Conversion; ADC Temperature Sensor; Sorting Numpy Arrays; Story Generator; Display…

  • Anatomy of the Robust 1D Kalman Filter

    The Kalman Filter (KF) is a powerful tool for tracking, navigation, and data prediction tasks. It is based on the assumption of linearity and Gaussian noise, enabling it to iteratively update predicted models. The article outlines a simplified implementation of KF using Python commands, with examples demonstrating its effectiveness in handling noisy measurements. It also…
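
The predict/update cycle described in this excerpt can be sketched as follows. This is a minimal illustration, not the article's implementation: it assumes a scalar state that is nearly constant over time, and the function name `kalman_1d`, the noise variances, and the synthetic measurements are all hypothetical choices made for the example.

```python
import random


def kalman_1d(measurements, process_var=1e-4, meas_var=0.25, x0=0.0, p0=1.0):
    """Filter noisy scalar measurements of a (nearly) constant quantity.

    x is the state estimate and p its variance. All default values are
    illustrative assumptions, not settings taken from the article.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the assumed model is x_k = x_{k-1}, so the estimate is
        # unchanged and only its uncertainty grows by the process variance.
        p = p + process_var
        # Update: the Kalman gain weighs the measurement against the prediction.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


# Synthetic demo data: a true value of 10.0 observed with Gaussian noise.
random.seed(0)
true_value = 10.0
measurements = [true_value + random.gauss(0.0, 0.5) for _ in range(200)]
estimates = kalman_1d(measurements)
print(f"last measurement: {measurements[-1]:.3f}, last estimate: {estimates[-1]:.3f}")
```

A small `process_var` relative to `meas_var` makes the filter trust its own prediction and smooth heavily; raising it makes the estimate follow the measurements more closely.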

  • Sales Forecasting: tslearn, Random Walk, Holt-Winters, SARIMAX, GARCH, Prophet, and LSTM

    This data science project evaluates several sales forecasting algorithms in Python on a Kaggle time-series dataset. The algorithms include tslearn, Random Walk, Holt-Winters, SARIMA, GARCH, Prophet, LSTM and Di Pietro’s Model. The goal is to predict next month’s sales for a list of shops and products that changes slightly every month. The best…

  • Prediction of NASA Turbofan Jet Engine RUL: OLS, SciKit-Learn & LSTM

    We predict the Remaining Useful Life (RUL) of NASA turbofan jet engines, comparing statsmodels OLS and scikit-learn regression models against a Keras LSTM in Python. The input dataset is the Kaggle version of NASA's public dataset for asset degradation modeling. It includes simulated run-to-failure data from turbofan jet engines.

  • The 5-Step GCP IoT Device-to-Report via AI Roadmap

    The Internet of Things (IoT) aids in the improvement of processes and enables new scenarios through network-connected devices. Recognized as a driver of the Fourth Industrial Revolution, IoT applications include predictive maintenance, industry safety, automation, remote monitoring, asset tracking, and fraud detection. Advancements in cloud IoT architectures over recent years have enabled efficient data ingestion,…

  • Health Insurance Cross Sell Prediction with ML Model Tuning & Validation

    This post discusses the use of AI and Machine Learning (ML) for insurance cross-selling. It covers data preparation, model training with different algorithms, parameter optimization, and model evaluation. The study showcases the ability of ML models (HGBM, XGBoost, Random Forest) to predict cross-sell customers in the insurance sector, providing potential for improved…

  • Anomaly Detection using the Isolation Forest Algorithm

    The post describes the application of Isolation Forest, an unsupervised anomaly detection algorithm, to identify abnormal patterns in financial and taxi ride data. The challenge is to accurately distinguish normal and abnormal data points for fraud detection, fault diagnosis, and outlier identification. Using real-world datasets of financial transactions and NYC taxi rides, the algorithm successfully…
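
As a rough illustration of the technique named in this excerpt, the sketch below fits scikit-learn's `IsolationForest` to synthetic one-dimensional data. The data, the `contamination` value, and the variable names are assumptions made for the example; they are not the financial or NYC taxi datasets or the settings used in the post.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data (an assumption for illustration): 500 "normal" values
# around 50, plus three obvious outliers appended at the end.
rng = np.random.default_rng(42)
normal = rng.normal(loc=50.0, scale=5.0, size=(500, 1))
outliers = np.array([[150.0], [-40.0], [200.0]])
X = np.vstack([normal, outliers])

# contamination is the assumed share of anomalies; it sets the score threshold.
model = IsolationForest(n_estimators=100, contamination=0.01, random_state=42)
labels = model.fit_predict(X)        # +1 = inlier, -1 = anomaly
scores = model.decision_function(X)  # lower score = more anomalous

anomalies = X[labels == -1].ravel()
print("flagged values:", np.sort(anomalies))
```

Because the method is unsupervised, `contamination` is a judgment call: set too high, it flags ordinary points at the edge of the distribution; set too low, it misses genuine anomalies.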

  • Oracle Monte Carlo Stock Simulations

    Oracle Corporation’s significant developments in Generative AI have led to lucrative partnerships with Nvidia and Elon Musk’s xAI. Having secured contracts exceeding $4 billion for its Generation 2 Cloud designed for AI model training, Oracle’s earnings doubled in Q4 2023. Monte Carlo simulations align with Zacks Rank 3-Hold for ORCL, implying bullish potential with projected…

  • NVIDIA Rolling Volatility: GARCH & XGBoost

    This post examines the prediction of NVIDIA stock volatility using two models: Generalized Autoregressive Conditional Heteroscedasticity (GARCH) and Extreme Gradient Boosting (XGBoost). The two models are compared in terms of MSE and MAPE. The post finds that the machine learning-based XGBoost model outperforms the GARCH model in NVDA volatility forecasting, showing the effectiveness of…
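
The two comparison metrics mentioned in this excerpt, MSE and MAPE, can be computed as in the sketch below. The volatility values and the labels `model_a` and `model_b` are made-up placeholders for illustration; they are not results from the post.

```python
def mse(actual, predicted):
    """Mean squared error: the average of squared forecast errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * (
        sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)
    )


# Hypothetical realized volatilities and two hypothetical sets of forecasts.
actual = [0.20, 0.25, 0.40, 0.30]
model_a = [0.22, 0.24, 0.33, 0.31]
model_b = [0.21, 0.25, 0.38, 0.29]

for name, forecast in [("model_a", model_a), ("model_b", model_b)]:
    print(f"{name}: MSE={mse(actual, forecast):.5f}, MAPE={mape(actual, forecast):.2f}%")
```

MSE penalizes large misses quadratically and depends on the scale of the series, while MAPE is scale-free but becomes unstable when actual values are close to zero, which is one reason to report both.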

  • Machine Learning-Based Crop Yield Prediction, Classification, and Recommendations

    We have implemented a machine learning-based decision support tool for crop yield prediction. It also supports decisions on which crops to grow and what to do during the growing season.

  • Practical SQL Queries, Cheat Sheets, and Interview Q&A for Data Scientists

    Professionals aspiring to a career in data science must master SQL. This comprehensive SQL server tutorial includes practical exercises, cheat sheets, interview Q&A tailored to data scientists, and installation requirements. From RDBMS basics to advanced concepts for data science interviews, this resource emphasizes the significance of SQL in database operations.

  • Multiple-Criteria Technical Analysis of Blue Chips in Python

    Blue-chip stocks are the stocks of well-known, high-quality companies. We demonstrate that multiple-criteria technical analysis can help optimize blue-chip portfolios.

  • Time Series Forecasting of Hourly U.S.A. Energy Consumption – PJM East Electricity Grid

    Table of Contents: PJME Data. Let’s set the working directory YOURPATH and import the following key libraries. Let’s read the input csv file in our working directory. Let’s plot the time series. Data Preparation. Output: (113926, 1, 9) (113926,) (31439, 1, 9) (31439,). LSTM TSF. Let’s plot the LSTM train/test val_loss history. Output: MSE: 1811223.125…

  • An Overview of Video Games in 2023: Trends, Technology, and Market Research

    The gaming industry is rapidly growing, projected to reach a revenue of $365.6 billion in 2023. Major trends include Web3 gaming, AI integration, and a push for consolidation. Fashion brands collaborate for virtual sales, and advances in gaming technology, such as AR/VR and cloud-based gaming, promise an even more immersive experience for gamers.

  • Customer Reviews NLP Spacy Analysis and ML/AI Demand Forecasting of the Steam PC Video Game Service

    Steam, a leading digital distribution platform for PC gaming, saw over 6000 new games released in 2022, averaging over 34 games each day. This post conducts a comprehensive NLP sentiment analysis of customer reviews and ML/AI demand forecasting using public-domain datasets. It covers EDA, NLP Spacy analysis, ML/AI pipeline, model validation, word clouds, and…

  • A Comparison of Automated EDA Tools in Python: Pandas-Profiling vs SweetViz

    Exploratory Data Analysis (EDA) is an important part of data science projects, designed to identify patterns, anomalies, and relationships. It can employ univariate, bivariate, and multivariate data analytics, and can be accelerated using automated EDA tools. The article discusses Python libraries such as Pandas-Profiling and SweetViz for automating EDA and demonstrates their application to improve…

  • Top Fast-Growing Apps in 2023

    The Okta Business at Work report and blogs by Leon Zucchini discuss the fastest-growing and new app categories. Key trends include the growth of collaboration, communication, and travel apps, and the adoption of multi-cloud. Ten notable growing apps are Kandji, Grammarly, Bob, Notion, Prisma Access, Navan, GitLab, Ironclad, Terraform Cloud, and Figma. Emerging apps include…

  • Early Heart Attack Prediction using ECG Autoencoder and 19 ML/AI Models with Test Performance QC Comparisons

    Table of Contents. Embed Socials: ECG Autoencoder. Let’s set the working directory YOURPATH: import os; os.chdir('YOURPATH'); os.getcwd(). Then import the following libraries: import tensorflow as tf; import matplotlib.pyplot as plt; import numpy as np; import pandas as pd; from tensorflow.keras import layers, losses; from sklearn.model_selection import train_test_split; from tensorflow.keras.models import Model. Let’s read the input dataset: df = pd.read_csv('ecg.csv', header=None). Let’s…

  • Risk-Aware Strategies for DCA Investors

    Dollar-Cost Averaging (DCA) is an investment approach that involves investing a fixed amount regularly, regardless of market price. It offers benefits such as risk reduction and market downturn resilience. It’s useful for beginners and can be combined with other strategies for a disciplined investment approach. References include Investopedia and Yahoo Finance.
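
The arithmetic behind DCA can be shown with a short worked example. The prices, the amount per period, and the function name `dollar_cost_average` are hypothetical and chosen only to make the numbers easy to follow; fees, taxes, and fractional-share limits are ignored.

```python
def dollar_cost_average(prices, amount_per_period):
    """Invest a fixed amount at each price; return (shares, invested, avg_cost)."""
    shares = sum(amount_per_period / price for price in prices)
    invested = amount_per_period * len(prices)
    return shares, invested, invested / shares


# Hypothetical prices over three periods, investing 100 each time.
prices = [100.0, 50.0, 100.0]
shares, invested, avg_cost = dollar_cost_average(prices, 100.0)
mean_price = sum(prices) / len(prices)

print(f"shares bought: {shares}, invested: {invested}")
print(f"average cost per share: {avg_cost}, mean market price: {mean_price:.2f}")
```

Because a fixed amount buys more shares when the price is low, the average cost per share (75.0 here) is the harmonic mean of the prices and never exceeds their arithmetic mean (about 83.33 here). That does not by itself guarantee a profit, since the final price can still end below the average cost.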