# Frontiers

The discipline of Data Science (DS) sits at the intersection of technology, the quantitative sciences (mathematics, statistics, computer science) and engineering, with applications across many business sectors. This page reviews new methods, research findings, opinions, hypothesis articles and poster presentations on all relevant aspects of DS.

As DS bridges data analytics, statistics, business intelligence (BI), artificial intelligence (AI)-powered technology and data engineering, this page focuses on applying advanced predictive analytics techniques and scientific principles to extract valuable information from data for business decision-making, strategic planning and other uses. DS is increasingly critical to businesses: the insights it generates help organizations increase operational efficiency, identify new business opportunities and improve marketing and sales programs, among other benefits. Ultimately, these insights can lead to competitive advantages over business rivals.

An effective DS team may include the following specialists: data engineer, data analyst, machine learning (ML) engineer, data visualization analyst, data translator, and data architect.

Organizations globally apply DS across a wide variety of use cases, most recently in the following business areas:

• HealthTech
• E-Commerce
• Customer experience
• Risk management
• FinTech
• Digital marketing
• Industrial IoT applications
• Logistics & supply chain management
• Image/Speech Recognition
• Cybersecurity
• LegalTech

## Non-Linear Regression Analysis

Nonlinear regression is a form of regression analysis in which observed data are fit to a model expressed as a nonlinear mathematical function. Simple linear regression relates two variables (X and Y) with a straight line (y = mx + b), while nonlinear regression relates the two variables through a nonlinear (curved) relationship.

Let’s learn about non-linear regressions by considering a few examples in Python. The scikit-learn library includes a simple example of 1D regression using linear, polynomial and RBF kernels, as shown below. As a real-world example, we then fit a non-linear model to the datapoints corresponding to China’s GDP from 1960 to 2014.
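As a minimal sketch of that scikit-learn comparison, the snippet below fits support vector regression (SVR) models with linear, polynomial and RBF kernels to a toy 1D dataset (the noisy sine data and the hyperparameter values are assumptions for illustration, not the China GDP data):

```python
import numpy as np
from sklearn.svm import SVR

# toy 1D dataset: a noisy sine curve (assumed for illustration)
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(40, 1), axis=0)
y = np.sin(X).ravel()
y[::5] += 3 * (0.5 - rng.rand(8))  # add noise to every 5th sample

# fit SVR models with three different kernels
svr_rbf = SVR(kernel='rbf', C=100, gamma=0.1, epsilon=0.1)
svr_lin = SVR(kernel='linear', C=100)
svr_poly = SVR(kernel='poly', C=100, degree=3)

for name, model in [('RBF', svr_rbf), ('Linear', svr_lin), ('Polynomial', svr_poly)]:
    y_fit = model.fit(X, y).predict(X)
    print(name, 'kernel produced', y_fit.shape[0], 'fitted values')
```

The RBF kernel typically tracks the curved sine shape most closely, which is the point of the comparison: only a nonlinear kernel can capture a nonlinear relationship.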

### China’s GDP Example

Let’s consider the China GDP dataset from Kaggle to test the non-linear regression algorithm.

Import and install the required libraries:

```python
import numpy as np
import pandas as pd

!pip install wget
```

Read the CSV file into a DataFrame `df`.
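A minimal sketch of this step: in the original notebook the dataset would presumably be downloaded first (hence the `wget` install above), but the file name `china_gdp.csv` and its URL are assumptions; here a tiny inline sample with the same `Year`/`Value` schema stands in for the full 1960–2014 dataset.

```python
import io
import pandas as pd

# Tiny inline sample standing in for the full dataset; the real call
# would be pd.read_csv("china_gdp.csv") (file name is an assumption).
csv_data = io.StringIO(
    "Year,Value\n"
    "1960,59184116488.98\n"
    "1961,49557050182.87\n"
    "1962,46685178504.32\n"
)
df = pd.read_csv(csv_data)
print(df.head())
```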

Let’s plot the data:

```python
import matplotlib.pyplot as plt
%matplotlib inline

plt.figure(figsize=(8, 5))
x_data, y_data = (df["Year"].values, df["Value"].values)
plt.plot(x_data, y_data, 'ro')
plt.ylabel('GDP')
plt.xlabel('Year')
plt.show()
```

Let’s introduce the non-linear sigmoid (logistic) function:

```python
X = np.arange(-5.0, 5.0, 0.1)
Y = 1.0 / (1.0 + np.exp(-X))

plt.plot(X, Y)
plt.ylabel('Dependent Variable')
plt.xlabel('Independent Variable')
plt.show()

def sigmoid(x, Beta_1, Beta_2):
    y = 1.0 / (1.0 + np.exp(-Beta_1 * (x - Beta_2)))
    return y
```

```python
beta_1 = 0.10
beta_2 = 1990.0

# logistic function with initial parameter guesses
Y_pred = sigmoid(x_data, beta_1, beta_2)
```

#### Let’s plot initial prediction against datapoints

```python
plt.plot(x_data, Y_pred * 15000000000000.)
plt.plot(x_data, y_data, 'ro')
```

#### Let’s normalize our data

```python
xdata = x_data / max(x_data)
ydata = y_data / max(y_data)
```

Let’s perform non-linear curve fitting:

```python
from scipy.optimize import curve_fit

popt, pcov = curve_fit(sigmoid, xdata, ydata)
```

#### And print the final parameters

```python
print("beta_1 = %f, beta_2 = %f" % (popt[0], popt[1]))
```

`beta_1 = 690.451712, beta_2 = 0.997207`

```python
x = np.linspace(1960, 2015, 55)
x = x / max(x)
plt.figure(figsize=(8, 5))
y = sigmoid(x, *popt)
plt.plot(xdata, ydata, 'ro', label='data')
plt.plot(x, y, linewidth=3.0, label='fit')
plt.legend(loc='best')
plt.ylabel('GDP')
plt.xlabel('Year')
plt.show()
```

#### Let’s split the data into train and test sets

```python
msk = np.random.rand(len(df)) < 0.8
train_x = xdata[msk]
test_x = xdata[~msk]
train_y = ydata[msk]
test_y = ydata[~msk]
```

#### Let’s build the model using the train set

```python
popt, pcov = curve_fit(sigmoid, train_x, train_y)
```

#### Predict using test set

```python
y_hat = sigmoid(test_x, *popt)
```

#### Perform evaluation

```python
from sklearn.metrics import r2_score

print("Mean absolute error: %.2f" % np.mean(np.absolute(y_hat - test_y)))
print("Residual sum of squares (MSE): %.2f" % np.mean((y_hat - test_y) ** 2))
print("R2-score: %.2f" % r2_score(test_y, y_hat))
```

```
Mean absolute error: 0.03
Residual sum of squares (MSE): 0.00
R2-score: 0.96
```

## XebiaLabs to Update Periodic Table of DevOps Tools

XebiaLabs has released Version 4 of the Periodic Table of DevOps Tools, the industry’s most popular DevOps market landscape tool. Selected vendors include Snowflake, Moogsoft, Instana, DataDog and GitLab, among others.


## The Content Marketing (CM) in a Nutshell

CM is a strategic marketing approach focused on creating and distributing valuable, relevant, and consistent content to attract and retain a clearly defined audience — and, ultimately, to drive profitable customer action.

#### The 3 key elements of effective CM are:

• Move your audience
• Earn your audience’s attention
• Have a spark

# Investor FAQ

The Calmar Ratio (CR), also called the drawdown ratio, is a risk-adjusted key performance metric for mutual funds, hedge funds and commodity trading. It measures the return per unit of risk and lets the investor decide whether a given amount of return is worth the given level of risk. Calmar is short for California Managed Accounts Report; the ratio was first published in 1991 and is very similar to the MAR ratio. Its calculation most closely resembles the Sterling ratio: it takes the average annual compounded rate of return and divides it by the maximum drawdown over the same time period, usually 3 years. The higher the CR the better: anything over 0.5 or close to 1.0 is good (amber), while a CR of 3–5 is really good (green).

Let’s consider a fund started 3 years ago. It reached a value of \$2 mln but fell as low as \$1.5 mln, and over this period the average annual return was 10%. In this example, the CR should help in evaluating whether the fund is worth investing in. The outcome is given below:
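The calculation behind this outcome can be sketched as follows, using the peak and trough values from the example above:

```python
# figures from the example: 10% average annual return,
# peak value $2.0 mln, trough value $1.5 mln
annual_return = 0.10
peak, trough = 2.0, 1.5

# maximum drawdown: largest peak-to-trough decline as a fraction of the peak
max_drawdown = (peak - trough) / peak          # (2.0 - 1.5) / 2.0 = 0.25

# Calmar Ratio: annual return divided by maximum drawdown
calmar_ratio = annual_return / max_drawdown    # 0.10 / 0.25
print("Calmar Ratio = %.1f" % calmar_ratio)    # Calmar Ratio = 0.4
```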

It appears that the risk-adjusted ratio is 0.4. If the investor has a criterion of a minimum CR of 0.5, then the fund is not worth investing in (red). Further, we may compare this CR to that of another fund with CR > 0.5 or CR ≈ 1.0, which has a higher risk-adjusted return and should be selected over this fund.

The CR is an improvement of both the Sharpe and Sterling Ratios in that it provides an up-to-date appraisal of commodity trading advisor (CTA) performance.