Category: Data-Driven Tech

  • Improved Multiple-Model ML/DL Credit Card Fraud Detection: F1=88% & ROC=91%

    Photo by CardMapr.nl on Unsplash.

    Data Preparation & Exploratory Analysis

    Let’s set the working directory

        import os
        os.chdir('YOURPATH')
        os.getcwd()

    and import the necessary packages

        import pandas as pd
        import numpy as np
        import matplotlib.pyplot as plt
        import seaborn as sns
        %matplotlib inline
        sns.set_style("whitegrid")

    Let’s load the dataset from the CSV file using Pandas: data =…

  • Unsupervised ML Clustering, Customer Segmentation, Cohort, Market Basket, Bank Churn, CRM, ABC & RFM Analysis – A Comprehensive Guide in Python

    Contents: Motivation, Methods, Open-Source Datasets.

    This file contains the basic information (ID, age, gender, income, and spending score) about the customers. Online Retail is a transnational data set that contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based, registered non-store online retailer. The company mainly sells unique all-occasion…

  • Top Fast-Growing Apps in 2023

    This post was inspired by the recent Okta Businesses at Work report and the follow-up Medium blogs by Leon Zucchini published on Apr 7 and Apr 24.

    Table of Contents: Popular App Categories; 10 Fastest-Growing Apps; 10 Hottest New Apps; Summary.

    The following trends are set to shake up the (mobile) app industry: A few of…

  • Early Heart Attack Prediction using ECG Autoencoder and 19 ML/AI Models with Test Performance QC Comparisons

    ECG Autoencoder

    Let’s set the working directory YOURPATH

        import os
        os.chdir('YOURPATH')
        os.getcwd()

    and import the following libraries

        import tensorflow as tf
        import matplotlib.pyplot as plt
        import numpy as np
        import pandas as pd
        from tensorflow.keras import layers, losses
        from sklearn.model_selection import train_test_split
        from tensorflow.keras.models import Model

    Let’s read the input dataset

        df = pd.read_csv('ecg.csv', header=None)

    Let’s…

  • Risk-Aware Strategies for DCA Investors

    Let’s look at the Dollar-Cost Averaging (DCA) investment approach, which involves investing the same amount of money in a target security at regular intervals over a certain period of time, regardless of price. Automating purchases in this way can make uncertain markets easier to deal with. It also supports an investor’s effort to invest…
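The fixed-amount rule described above can be illustrated with a minimal sketch. The function name `dca_summary` and the price series are hypothetical, chosen only for illustration, and the sketch ignores fees, taxes, and fractional-share limits.

```python
def dca_summary(prices, amount_per_period):
    """Invest the same amount at every price; return (units, invested, average_cost)."""
    if not prices or any(p <= 0 for p in prices):
        raise ValueError("prices must be a non-empty sequence of positive numbers")
    # A fixed amount buys more units when the price is low and fewer when it is high.
    units = sum(amount_per_period / p for p in prices)
    invested = amount_per_period * len(prices)
    # Average cost per unit is the harmonic mean of the purchase prices.
    return units, invested, invested / units


# Hypothetical period-end prices of the target security (not real market data).
prices = [100.0, 80.0, 125.0, 100.0]
units, invested, avg_cost = dca_summary(prices, 100.0)
mean_price = sum(prices) / len(prices)
print(f"units={units:.2f}, invested={invested:.2f}, "
      f"avg_cost={avg_cost:.2f}, mean_price={mean_price:.2f}")
```

Because each purchase spends the same amount, the average cost per unit never exceeds the simple average of the purchase prices; that is a property of the arithmetic, not a guarantee of profit.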

  • A Closer Look at the Azure Cloud Portfolio – 3. Azure DevOps Boards

    1. Getting Started with AB

    You need a Microsoft account to start AB. Choose the Start free option, then choose the Public option. Click Advanced and select Git version control (there could be code versions or file management) and the Basic work item process. Of the two version control options, the first is distributed and the second is centralized. Project management process: Agile, Basic, CMMI, Scrum. Capability Maturity…

  • An Interactive GPT Index and DeepLake Interface – 1. Amazon Financial Statements

    Let’s set the working directory YOURPATH

        import os
        os.chdir('YOURPATH')
        os.getcwd()

    and install the key libraries

        !pip install llama-index
        !pip install deeplake

    Let’s import the libraries

        from llama_index import (
            SimpleDirectoryReader,
            GPTDeepLakeIndex,
            GPTSimpleKeywordTableIndex,
            Document,
            LLMPredictor,
            ServiceContext,
            download_loader,
        )
        from langchain.chat_models import ChatOpenAI
        from typing import List, Optional, Tuple
        import requests
        import tqdm
        import os
        from pathlib import Path

    Let’s define the PDF file reader

        PDFReader = download_loader("PDFReader")
        loader = PDFReader()…

  • Effective 2D Image Compression with K-means Clustering

    Performance Test

    Let’s set the working directory YOUR PATH and import the key Python libraries

        import os
        os.chdir('YOUR PATH')
        os.getcwd()

        import pandas as pd
        import numpy as np
        import matplotlib as mpl
        import matplotlib.pyplot as plt
        from scipy.io import loadmat
        from sklearn.cluster import KMeans
        from sklearn.preprocessing import StandardScaler
        from scipy import linalg

        pd.set_option('display.notebook_repr_html', False)
        pd.set_option('display.max_columns', None)
        pd.set_option('display.max_rows', 150)
        pd.set_option('display.max_seq_items', None)

        %matplotlib inline
        import seaborn…

  • The $0 MarTech Stack for Small Business

    Automation

    10 Best Free Marketing Automation Software: Freshmarketer, HubSpot Marketing, Ortto, Omnisend, EngageBay, Zoho Campaigns, MailChimp, Drip, SendinBlue Email.

    Each tool in this list can be used individually or combined with three additional products: Phantombuster, Zapier, and Customers.ai.

    We are now firmly in the age of automation. As big data trends show the rise of…

  • Dealing with Imbalanced Data in HealthTech ML/AI – 1. Stroke Prediction

    Specifically, we will compare (1) the SMOTE-balanced Torch NN (viz. cross-entropy loss with the Adam optimizer) against (2) Sinnott’s Python algorithm from scikit-learn, validating both with various scikit-learn metrics such as AUC, precision, recall, F-measure, and accuracy.

    Our Jupyter notebook and the entire Python project will be stored in the working directory…

  • A Closer Look at the Azure Cloud Portfolio – 2. From VMs to Web Servers

    In this post, you’ll read about creating virtual machines (VMs) and deploying your web servers from Azure. Read more here. Courtesy of Mario Ferraro.

    Prerequisites: An active Azure subscription. If you don’t have an Azure subscription yet, create an Azure Free Trial before following this guide.

    Step 1: Create a Resource Group

    Step…

  • Working with FRED API in Python: U.S. Recession Forecast & Beyond

    Featured Photo by Lukas on Pexels. FRED (Federal Reserve Economic Data) is a database of economic time series aggregated from many sources, which makes it a good place to find financial data. You can visit the FRED website to search for a data series or use the Python fredapi to download data…

  • Advanced Integrated Data Visualization (AIDV) in Python – 2. Dabl Auto EDA & ML

    First, let’s install dabl

        !pip install dabl

    and set the working directory DIR

        import os
        os.chdir('DIR')
        os.getcwd()

    The Digits Classification Dataset

    Let’s run dabl.SimpleClassifier() as follows

        import dabl
        from sklearn.model_selection import train_test_split
        from sklearn.datasets import load_digits
        X, y = load_digits(return_X_y=True)
        X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
        sc = dabl.SimpleClassifier().fit(X_train, y_train)

        Running DummyClassifier()
        accuracy: 0.106 recall_macro: 0.100…

  • A Closer Look at the Azure Cloud Portfolio – 1. Essentials

    Azure Cloud Concepts (source: 2023 TomTom)

    Azure packaged software, IaaS, PaaS, and SaaS

    Azure Synapse SQL Pool: Learn more about Polybase here.

    Azure DevOps Boards: Capability Maturity Model Integration (CMMI) is a process-level improvement training and appraisal program. Add new items and divide work into time slots called sprints. Learn more about…

  • Joint Analysis of Bitcoin, Gold and Crude Oil Prices with Optimized Risk/Return in 2023

    Referring to the recent fintech R&D study in Python, let’s discuss a joint time-series analysis of Bitcoin (BTC), Gold (GC=F), and Crude Oil (CL=F) prices in 2021-23, followed by Markowitz portfolio optimization of these three assets in 2023.

    Input Data

    Let’s set the working directory

        import os
        os.chdir('PORTFOLIORISK')
        os.getcwd()

    and import the following…
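As a rough illustration of one point on the Markowitz frontier, the sketch below computes global minimum-variance weights for three assets from a covariance matrix. The excerpt does not show the post's own optimization method, so this closed-form approach is a substitute chosen for brevity; the covariance values are hypothetical (not estimated from BTC, gold, or crude oil data), and `min_variance_weights` is an illustrative name.

```python
import numpy as np


def min_variance_weights(cov):
    """Global minimum-variance weights: w = inv(C) @ 1 / (1' @ inv(C) @ 1).

    Weights sum to 1; short positions (negative weights) are allowed.
    """
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)  # solves C @ raw = 1 without forming inv(C)
    return raw / raw.sum()


# Hypothetical annualized covariance matrix for three assets (illustrative only).
cov = np.array([
    [0.40, 0.02, 0.05],
    [0.02, 0.03, 0.01],
    [0.05, 0.01, 0.15],
])
weights = min_variance_weights(cov)
portfolio_vol = float(np.sqrt(weights @ cov @ weights))
print("weights:", np.round(weights, 3))
print("portfolio volatility:", round(portfolio_vol, 4))
```

A full Markowitz optimization would also take expected returns into account and usually adds constraints such as long-only weights; this sketch covers only the risk side.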

  • Video Game Sales Data Visualization, Wrangling and Market Analysis in Python

    Featured Photo by Element5 Digital on Pexels.

    Import Modules

    Let’s set the working directory

        import os
        os.chdir('VIDEOGAMES')
        os.getcwd()

    and import the necessary modules/libraries

        import pandas as pd
        import numpy as np
        import matplotlib.pyplot as plt
        import seaborn as sns
        %matplotlib inline
        sns.set_style('darkgrid')

    Input Dataset

    Let’s read the dataset

        df = pd.read_csv('vgsales.csv')
        df.head()

    Dataset shape

        df.shape
        (16598, 11)

    Dataset type

        df.dtypes
        Rank int64…

  • Advanced Integrated Data Visualization (AIDV) in Python – 1. Stock Technical Indicators

    Featured Photo by Monstera on Pexels. In this project, we will implement the following Technical Indicators in Python: Conventionally, we will look at the following three main groups of technical indicators:

    Input Stock Data

    Let’s set the working directory VIZ

        import os
        os.chdir('VIZ')
        os.getcwd()

    and import the key libraries

        import datetime as dt
        import pandas as pd
        import…

  • Using AI/ANN AUC>90% for Early Diagnosis of Cardiovascular Disease (CVD)

    Featured Photo by Karolina Grabowska on Pexels.

    Data Preparation

    Let’s set the working directory HEART23

        import os
        os.chdir('HEART23')
        os.getcwd()

    and import the libraries

        import pandas as pd
        import numpy as np
        import matplotlib.pyplot as plt
        import seaborn as sns
        sns.set()
        from scipy.stats import skew
        from sklearn.preprocessing import StandardScaler
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import accuracy_score, roc_curve, roc_auc_score, precision_score, recall_score
        import scikitplot…

  • Overview of AWS Tech Portfolio 2023

    This article provides an overview of 50+ Amazon Web Services (AWS) offerings in 2023. AWS is the leading vendor of cloud services and infrastructure, dominating the cloud computing market: Amazon net sales increased by 15% to $127.1 billion in Q3 2022, compared with $110.8 billion in Q3 2021. AWS segment sales increased by 27% year-over-year to reach…

  • Gold ETF Price Prediction using the Bayesian Ridge Linear Regression

    Featured Photo by Pixabay.

    Let’s set the working directory GOLD

        import os
        os.chdir('GOLD')
        os.getcwd()

    and import the following libraries

        from sklearn.linear_model import LinearRegression
        import pandas as pd
        import numpy as np
        import matplotlib.pyplot as plt
        %matplotlib inline
        plt.style.use('seaborn-darkgrid')
        import yfinance as yf

    Let’s read the data

        Df = yf.download('GLD', '2022-01-01', '2023-03-25', auto_adjust=True)
        Df = Df[['Close']]
        Df = Df.dropna()

    Let’s…