Tag: data-driven technology
-
Improved Multiple-Model ML/DL Credit Card Fraud Detection: F1=88% & ROC=91%
Photo by CardMapr.nl on Unsplash. Data Preparation & Exploratory Analysis: Let's set the working directory with import os; os.chdir('YOURPATH'); os.getcwd() and import the necessary packages (import pandas as pd; import numpy as np; import matplotlib.pyplot as plt; import seaborn as sns), then enable %matplotlib inline and call sns.set_style("whitegrid"). Let's load the dataset from the CSV file using Pandas: data =…
-
Unsupervised ML Clustering, Customer Segmentation, Cohort, Market Basket, Bank Churn, CRM, ABC & RFM Analysis – A Comprehensive Guide in Python
Contents: Motivation, Methods, Open-Source Datasets. This file contains the basic information (ID, age, gender, income, and spending score) about the customers. Online Retail is a transnational data set that contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based, registered non-store online retailer. The company mainly sells unique all-occasion…
-
Risk-Aware Strategies for DCA Investors
Let's look at the Dollar-Cost Averaging (DCA) investment approach, which involves investing the same amount of money in a target security at regular intervals over a certain period of time, regardless of price. It can make it easier to deal with uncertain markets by making purchases automatic. It also supports an investor's effort to invest…
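The DCA mechanics described in this excerpt can be illustrated with a minimal sketch. It is not taken from the post itself: the function name `dollar_cost_average`, the price series, and the 100-per-period amount are all hypothetical, and fees, taxes and dividends are ignored.

```python
def dollar_cost_average(prices, amount_per_period):
    """Simulate investing a fixed cash amount at each period's price.

    prices: purchase prices, one per regular interval (hypothetical data).
    amount_per_period: fixed cash amount invested at every interval.
    """
    if amount_per_period <= 0:
        raise ValueError("amount_per_period must be positive")
    if not prices or any(p <= 0 for p in prices):
        raise ValueError("prices must be a non-empty list of positive numbers")

    units = 0.0
    invested = 0.0
    for price in prices:
        # The same cash amount buys more units when the price is low
        # and fewer units when the price is high.
        units += amount_per_period / price
        invested += amount_per_period

    return {
        "units": units,
        "invested": invested,
        "average_cost": invested / units,          # cost per unit actually paid
        "mean_price": sum(prices) / len(prices),   # simple average of the prices
        "final_value": units * prices[-1],
    }


# Illustrative prices only, not market data.
result = dollar_cost_average([100.0, 80.0, 125.0, 100.0], 100.0)
print(result)
```

Because a fixed amount buys more units at lower prices, the average cost per unit equals the harmonic mean of the purchase prices, which never exceeds their simple average.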
-
Advanced Integrated Data Visualization (AIDV) in Python – 2. Dabl Auto EDA & ML
First, let's install dabl (!pip install dabl) and set the working directory DIR: import os; os.chdir('DIR'); os.getcwd(). The Digits Classification Dataset: Let's run dabl.SimpleClassifier() as follows: import dabl; from sklearn.model_selection import train_test_split; from sklearn.datasets import load_digits; X, y = load_digits(return_X_y=True); X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1); sc = dabl.SimpleClassifier().fit(X_train, y_train). Output: Running DummyClassifier() accuracy: 0.106 recall_macro: 0.100…
-
A Closer Look at the Azure Cloud Portfolio – 1. Essentials
Azure Cloud Concepts (Source: 2023 TomTom). Azure packaged software, IaaS, PaaS, and SaaS. Azure Synapse SQL Pool: Learn more about Polybase here. Azure DevOps Boards: Capability Maturity Model Integration (CMMI) is a process-level improvement training and appraisal program. Add new items and divide work into time slots called sprints. Learn more about…
-
Joint Analysis of Bitcoin, Gold and Crude Oil Prices with Optimized Risk/Return in 2023
Referring to the recent fintech R&D study in Python, let's discuss joint time-series analysis of Bitcoin (BTC), Gold (GC=F) and Crude Oil (CL=F) prices in 2021-23, followed by Markowitz portfolio optimization of these three assets in 2023. Input Data: Let's set the working directory with import os; os.chdir('PORTFOLIORISK'); os.getcwd() and import the following…
-
Video Game Sales Data Visualization, Wrangling and Market Analysis in Python
Featured Photo by Element5 Digital on Pexels. Import Modules: Let's set the working directory with import os; os.chdir('VIDEOGAMES'); os.getcwd() and import the necessary modules/libraries (import pandas as pd; import numpy as np; import matplotlib.pyplot as plt; import seaborn as sns), then enable %matplotlib inline and call sns.set_style('darkgrid'). Input Dataset: Let's read the dataset with df = pd.read_csv('vgsales.csv'); df.head(). Dataset shape: df.shape gives (16598, 11). Dataset types: df.dtypes gives Rank int64…
-
Overview of AWS Tech Portfolio 2023
This article provides an overview of 50+ Amazon Web Services (AWS) in 2023. AWS is the leading vendor of cloud services and infrastructure, dominating the cloud computing market: Amazon net sales increased by 15% to $127.1 billion in Q3 2022, compared with $110.8 billion in Q3 2021. AWS segment sales increased by 27% year-over-year to reach…
-
Predicting the JPM Stock Price and Breakouts with Auto ARIMA, FFT, LSTM and Technical Trading Indicators
Featured Photo by Pixabay. In this post, we will look at the JPM stock price and relevant breakout strategies for 2022-23. Referring to the previous case study, our goal is to combine the Auto ARIMA, FFT and LSTM models with Technical Trading Indicators (TTIs) in a single framework that exploits the advantages of each. Specifically, we will…
-
Towards Max(ROI/Risk) Trading in Q1 2023
In this post, we will compare the 1Y ROI/Risk of selected stocks vs ETFs using a set of basic stock analyzer functions. The post consists of three parts. Looking at the closing price of a stock over time is a good way to track its performance. We combine the risk and return metrics into…
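One possible way to compute a return-to-risk ratio from closing prices is sketched below. This is an illustration under stated assumptions, not the post's own analyzer functions: the name `roi_and_risk` is hypothetical, risk is assumed to be the annualized standard deviation of simple daily returns, 252 trading days per year is assumed, and the closing prices are made up.

```python
import math
import statistics


def roi_and_risk(closes, periods_per_year=252):
    """Return (roi, risk, roi_per_risk) for a series of closing prices.

    roi:  total simple return over the whole series.
    risk: annualized standard deviation of period-to-period simple returns
          (an assumed risk definition, chosen for illustration).
    """
    if len(closes) < 3:
        raise ValueError("need at least three closing prices")

    # Simple return between consecutive closes.
    returns = [closes[i] / closes[i - 1] - 1 for i in range(1, len(closes))]

    roi = closes[-1] / closes[0] - 1
    risk = statistics.stdev(returns) * math.sqrt(periods_per_year)
    ratio = roi / risk if risk > 0 else float("nan")
    return roi, risk, ratio


# Made-up closing prices for illustration only.
roi, risk, ratio = roi_and_risk([100.0, 102.0, 101.0, 105.0, 110.0])
print(f"ROI={roi:.2%}  risk={risk:.2%}  ROI/risk={ratio:.2f}")
```

Applying the same function to a stock and to an ETF over the same window gives two ratios that can be compared directly; a different risk measure (for example, maximum drawdown) could be swapped in without changing the structure.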
-
SARIMAX X-Validation of EIA Crude Oil Prices Forecast in 2023 – 2. Brent
Based on our previous study, today's focus is on SARIMAX time-series X-validation of the Brent crude oil spot price (USD/b). The goal is to verify the following EIA energy forecast for 2023: according to EIA, the Brent spot price will average $83.63/b in 2023. Prerequisites: In this study we will be…
-
Trending YouTube Video Data Science, NLP Predictions & Sentiment Analysis
Global YT WordCloud: Let's begin with the Kaggle YT TextHero dataset containing 3599 rows and 4 columns. Let's set the working directory YOURPATH with import os; os.chdir('YOURPATH'); os.getcwd() and import all necessary modules: from wordcloud import WordCloud, STOPWORDS; import matplotlib.pyplot as plt; import pandas as pd. Let's read the input dataset: df = pd.read_csv(r"youtube0.csv", encoding="latin-1")…
-
SARIMAX X-Validation of EIA Crude Oil Prices Forecast in 2023 – 1. WTI
Featured Photo by Pixabay. Let's perform SARIMAX X-validation of the EIA WTI and Brent oil price forecasts for the 2nd half of 2023. Recall that SARIMAX (Seasonal Autoregressive Integrated Moving Average with eXogenous factors) is an extension of the ARIMA model for time series forecasting. SARIMAX is a seasonal equivalent to SARIMA…
-
Top E-Commerce Trends in 2023
Featured Photo by PhotoMIX Company on Pexels. Best E-Commerce Platforms (January 2023): Top e-commerce platforms make it both easy and affordable to build a successful online store. Of course, with so many good options on the market, choosing the right system for your needs can be a challenge. To help, we put together this list…
-
SARIMAX-TSA Forecasting, QC and Visualization of E-Commerce Food Delivery Sales
Featured Photo by Ella Olsson on Pexels. Inspired by the recent TSA e-commerce use case, this article is a beginner-friendly guide to help you understand and evaluate ARIMA-based time-series forecasting models such as SARIMA and SARIMAX. Objective: To understand the basic concepts of ARIMA, SARIMA and SARIMAX in terms of Time Series Forecasting QC. Application: We will…
-
NETFLIX Interactive Visualization with Plotly
Featured Photo by Roberto Nickson on Pexels. This project implements Python 3 Exploratory Data Analysis (EDA), streaming data visualization and a highly interactive Plotly UI for reviewing Netflix movies and TV shows. Objectives: The end-to-end workflow aims to help movie enthusiasts discover the Netflix content presented in…
-
COVID-19 Data Visualization, Impact and Vaccine Sentiment Analysis
The coronavirus COVID-19 pandemic is the defining global health crisis of our time and the greatest challenge we have faced since World War Two. After over two years of living with Covid-19, we are learning to adapt to a world with this disease. 2022 ends with looming risk of a new coronavirus variant, health experts…
-
99% Accurate Breast Cancer Classification using Neural Networks in TensorFlow 2.11.0
Workflow: The entire workflow is as follows: Prerequisites: We need to install the following libraries: !pip install --user tensorflow. BC Dataset: In this study, we use the BC Wisconsin (Diagnostic) Dataset to predict whether the BC is benign or malignant. Model features are computed from a digitized image of a fine needle aspirate (FNA) of…