Featured Photo by Harsch Shivam
The discipline of Data Science (DS) sits at the interface between technology, the quantitative sciences (such as mathematics, statistics and computer science) and engineering, across various business applications and sectors. This page aims to review new methods, research findings, opinions, hypothesis articles and poster presentations on all relevant aspects of DS.
As DS bridges data analytics, statistics, business intelligence (BI), artificial intelligence (AI)-powered technology and data engineering, the page is focused on applying advanced predictive analytics techniques and scientific principles to extract valuable information from data for business decision-making, strategic planning and other uses. It’s increasingly critical to businesses: The insights that data science generates help organizations increase operational efficiency, identify new business opportunities and improve marketing and sales programs, among other benefits. Ultimately, they can lead to competitive advantages over business rivals.
An effective DS team may include the following specialists: data engineer, data analyst, machine learning (ML) engineer, data visualization analyst, data translator and data architect.
The following most recent business applications drive a wide variety of DS use cases in organizations globally:
- HealthTech
- E-Commerce
- Customer experience
- Risk management
- FinTech
- Stock trading
- Digital marketing
- Industrial IoT applications
- Logistics & supply chain management
- Image/Speech Recognition
- Cybersecurity
- LegalTech


Non-Linear Regression Analysis
Nonlinear regression is a form of regression analysis in which data is fit to a model and then expressed as a mathematical function. Simple linear regression relates two variables (X and Y) with a straight line (y = mx + b), while nonlinear regression relates the two variables in a nonlinear (curved) relationship.
Let’s learn about non-linear regression by considering a few examples in Python. The scikit-learn library includes a simple example of 1D regression using linear, polynomial and RBF kernels. As a real-world example, we fit a non-linear model to the data points corresponding to China’s GDP from 1960 to 2014.
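The kernel-based 1D regression mentioned above can be sketched roughly as follows. This is a minimal illustration built on scikit-learn's SVR class, not a copy of the library's own example; the synthetic sine data, the random seed and the hyperparameters are all assumptions chosen for demonstration.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1D data (an assumption for illustration): a lightly perturbed sine curve.
rng = np.random.default_rng(0)
X = np.sort(5 * rng.random((40, 1)), axis=0)
y = np.sin(X).ravel()
y[::5] += 1.0 * (0.5 - rng.random(8))  # add noise to every fifth target

# One regressor per kernel; the hyperparameters are illustrative, not tuned.
models = {
    "linear": SVR(kernel="linear", C=100),
    "poly": SVR(kernel="poly", C=100, degree=3, gamma="auto", epsilon=0.1, coef0=1),
    "rbf": SVR(kernel="rbf", C=100, gamma=0.1, epsilon=0.1),
}

scores = {}
for name, model in models.items():
    model.fit(X, y)
    scores[name] = model.score(X, y)  # R^2 on the training data
    print(f"{name:>6}: R^2 = {scores[name]:.3f}")
```

On curved data like this, the non-linear kernels would be expected to follow the shape of the curve more closely than the straight-line fit of the linear kernel.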



China’s GDP Example
Let’s use the China GDP dataset from Kaggle to test the non-linear regression algorithm.
Import and install libraries
import numpy as np
import pandas as pd
!pip install wget
Read the csv file
df = pd.read_csv("YourPath/china_gdp.csv")
df.head(10)

Let’s plot the data
import matplotlib.pyplot as plt
%matplotlib inline
plt.figure(figsize=(8,5))
x_data, y_data = (df["Year"].values, df["Value"].values)
plt.plot(x_data, y_data, 'ro')
plt.ylabel('GDP')
plt.xlabel('Year')
plt.show()

Let’s introduce the non-linear sigmoid function
X = np.arange(-5.0, 5.0, 0.1)
Y = 1.0 / (1.0 + np.exp(-X))
plt.plot(X,Y)
plt.ylabel('Dependent Variable')
plt.xlabel('Independent Variable')
plt.show()

def sigmoid(x, Beta_1, Beta_2):
    y = 1 / (1 + np.exp(-Beta_1*(x-Beta_2)))
    return y
beta_1 = 0.10
beta_2 = 1990.0
Evaluate the logistic function with these initial parameters
Y_pred = sigmoid(x_data, beta_1, beta_2)
Let’s plot the initial prediction against the data points
plt.plot(x_data, Y_pred*15000000000000.)
plt.plot(x_data, y_data, 'ro')

Let’s normalize our data
xdata = x_data/max(x_data)
ydata = y_data/max(y_data)
Let’s perform non-linear curve fitting
from scipy.optimize import curve_fit
popt, pcov = curve_fit(sigmoid, xdata, ydata)
And print the final parameters
print("beta_1 = %f, beta_2 = %f" % (popt[0], popt[1]))
beta_1 = 690.451712, beta_2 = 0.997207
x = np.linspace(1960, 2015, 55)
x = x/max(x)
plt.figure(figsize=(8,5))
y = sigmoid(x, *popt)
plt.plot(xdata, ydata, 'ro', label='data')
plt.plot(x, y, linewidth=3.0, label='fit')
plt.legend(loc='best')
plt.ylabel('GDP')
plt.xlabel('Year')
plt.show()
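The curve-fitting step can also be given an explicit starting point and checked for parameter uncertainty. The sketch below is a hedged illustration on synthetic data standing in for the normalized GDP series; the "true" parameters, the noise level and the initial guess p0 are assumptions. It relies on two documented behaviours of scipy.optimize.curve_fit: when p0 is omitted every parameter starts at 1, and the diagonal of the returned covariance matrix holds the estimated parameter variances.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, beta_1, beta_2):
    """Logistic curve: beta_1 sets the steepness, beta_2 the midpoint."""
    return 1.0 / (1.0 + np.exp(-beta_1 * (x - beta_2)))

# Synthetic data (an assumption): a logistic curve on [0, 1] with small noise.
rng = np.random.default_rng(42)
x = np.linspace(0.0, 1.0, 60)
true_beta = (25.0, 0.6)  # hypothetical "true" parameters
y = sigmoid(x, *true_beta) + rng.normal(0.0, 0.01, x.size)

# p0 is the optimizer's starting point; a rough guess near the expected
# midpoint usually helps the fit converge.
popt, pcov = curve_fit(sigmoid, x, y, p0=[10.0, 0.5])

# One-standard-deviation errors from the diagonal of the covariance matrix.
perr = np.sqrt(np.diag(pcov))
print("beta_1 = %.3f +/- %.3f" % (popt[0], perr[0]))
print("beta_2 = %.3f +/- %.3f" % (popt[1], perr[1]))
```

Large standard errors relative to the fitted values would suggest that the data do not constrain the parameters well.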

Let’s split data into the train and test sets
msk = np.random.rand(len(df)) < 0.8
train_x = xdata[msk]
test_x = xdata[~msk]
train_y = ydata[msk]
test_y = ydata[~msk]
Let’s build the model using the train set
popt, pcov = curve_fit(sigmoid, train_x, train_y)
Predict using test set
y_hat = sigmoid(test_x, *popt)
Perform evaluation
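# Mean absolute error between the predictions and the held-out targets
print("Mean absolute error: %.2f" % np.mean(np.absolute(y_hat - test_y)))
# Mean of the squared residuals, i.e. the mean squared error (MSE)
print("Mean squared error (MSE): %.2f" % np.mean((y_hat - test_y) ** 2))
from sklearn.metrics import r2_score
# r2_score expects the true values first, then the predictions
print("R2-score: %.2f" % r2_score(test_y, y_hat))
Sample output (values vary with the random split): Mean absolute error: 0.03 Mean squared error (MSE): 0.00 R2-score: 0.96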
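The three metrics above can be cross-checked against scikit-learn's built-in functions. The following is a small sketch on made-up arrays (the data and seed are assumptions, not the GDP series). It also shows that R2 depends on the argument order, since the score is normalized by the variance of the first argument (the true values).

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical true values and noisy predictions, for illustration only.
rng = np.random.default_rng(0)
y_true = np.linspace(0.0, 1.0, 20)
y_pred = y_true + rng.normal(0.0, 0.05, y_true.size)

mae = mean_absolute_error(y_true, y_pred)  # mean of |y_pred - y_true|
mse = mean_squared_error(y_true, y_pred)   # mean of (y_pred - y_true)^2
r2 = r2_score(y_true, y_pred)              # 1 - SS_res / SS_tot of y_true

# Swapping the arguments normalizes by the variance of the predictions instead.
r2_swapped = r2_score(y_pred, y_true)

print("MAE: %.4f, MSE: %.4f" % (mae, mse))
print("R2 (true, pred): %.4f, R2 (pred, true): %.4f" % (r2, r2_swapped))
```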

Posts of Interest

XebiaLabs to Update Periodic Table of DevOps Tools
#ContinuousDelivery #XLPeriodicTable #DevOps
Version 4 of the industry's most popular DevOps market landscape tool, the Periodic Table of DevOps Tools. Selected vendors: Snowflake, Moogsoft, Instana, DataDog, GitLab, among others.

The Content Marketing (CM) in a Nutshell
CM is a strategic marketing approach focused on creating and distributing valuable, relevant, and consistent content to attract and retain a clearly defined audience and, ultimately, to drive profitable customer action.
Benefits of CM:
* Grow brand awareness
* Drive organic visitors
* Generate sales leads
* Build trust
* Earn customer loyalty
* Create demand
The 3 key elements of effective CM are:
- Move your audience
- Earn your audience's attention
- Have a spark


A Start-Up Marketing Plan


Investor FAQ
The Calmar Ratio (CR), or drawdown ratio, is a risk-adjusted key performance metric for mutual funds, hedge funds and commodity trading. It measures the return per unit of risk and lets the investor decide whether a given return is worth the given level of risk. Calmar is short for California Managed Accounts Report, and the ratio is very similar to the MAR ratio. The CR was first published in 1991. Its calculation is most similar to that of the Sterling ratio: it takes the average annual compounded rate of return and divides it by the maximum drawdown for the same time period, usually 3 years. The higher the CR, the better: anything over 0.5 or close to 1.0 is good (Amber), and CR = 3-5 is really good (Green).
Let's consider a fund started 3 years ago. It reached a peak value of $2 mln but then fell as low as $1.5 mln. Over this period the average annual return was 10%. In this example, the CR should help in evaluating whether the fund is worth investing in. The maximum drawdown is (2 - 1.5) / 2 = 25%, so CR = 10% / 25% = 0.4.

It appears that the risk-adjusted ratio is 0.4. If the investor has a criterion of a minimum CR of 0.5 [21], then the fund is not worth investing in (Red). Further, we may compare CR to another fund which has CR>0.5 or CR~1.0, and therefore has a higher risk-adjusted return and should be selected over this fund.
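The arithmetic behind this example can be sketched in a few lines of Python. The helper names max_drawdown and calmar_ratio are hypothetical, introduced only for illustration, and the sketch assumes the $2 mln peak came before the $1.5 mln low.

```python
def max_drawdown(peak_value, trough_value):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    return (peak_value - trough_value) / peak_value

def calmar_ratio(annual_return, drawdown):
    """Average annual return divided by the maximum drawdown."""
    return annual_return / drawdown

# Figures from the example: peak $2 mln, low $1.5 mln, 10% average annual return.
dd = max_drawdown(2.0e6, 1.5e6)
cr = calmar_ratio(0.10, dd)
print(f"Max drawdown: {dd:.0%}, Calmar ratio: {cr:.2f}")
# prints "Max drawdown: 25%, Calmar ratio: 0.40"
```

With a minimum criterion of 0.5, a result of 0.40 falls short, which matches the conclusion in the text.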
The CR is an improvement of both the Sharpe and Sterling Ratios in that it provides an up-to-date appraisal of commodity trading advisor (CTA) performance.
Advanced ML/AI: BTC-USD Price Prediction with LSTM Keras



Cloud-Native Tech Autumn 2022 Fair

