Wind Energy ML Prediction & Turbine Power Control

Featured Photo by Sam Forson on Pexels

  • Wind turbine power curve modeling plays an important role in wind energy management and power forecasting.
  • The goals of traditional wind turbine control systems are maximizing energy production while protecting the wind turbine components.
  • Wind power curves mainly contribute to wind power forecasting wind turbine condition monitoring, estimation of potential wind energy and wind turbine selection.
  • Accurate models of power curves can play an important role in improving the performance of wind energy based systems. 
  • In this project, we will invoke several linear and non-linear regression power curve models under different environmental conditions.
  • The current focus is on examining plots of wind farm total energy and loss vs wind direction as well as comparing theoretical and observed power curves.
  • The final goal is to produce reliable wind power curves from raw wind data due to the presence of outliers formed in unexpected conditions, e.g., wind curtailment and blade damage. 

Table of Contents

  1. The Wind Turbine Scada Dataset
  2. Automated Exploratory Data Analysis (EDA)
  3. ML-Based Predictions of Wind Turbine Power Production
  4. Wind Turbine Power Curve Simulations and Control
  5. Summary
  6. Explore More

The Wind Turbine Scada Dataset

  • In wind turbines, the Scada system measures actual wind speed and direction vs LV Active Power (kW) for 10 minutes intervals.
  • In addition to the observed power curve, the theoretical power curve is provided by the turbine manufacturer.
  • In this study, the input public-domain T1 dataset was taken from the Scada system that is working and generating wind power in Turkey.

Let’s set the working directory YOURPATH

import os
os.chdir('YOURPATH')    # Set working directory
os. getcwd()

Let’s import the key libraries

import matplotlib
import matplotlib.pyplot as plt
import matplotlib.style as style
import numpy as np
import pandas as pd
import plotly.express as px
import seaborn as sns
from scipy import stats
import warnings
warnings.filterwarnings("ignore")

and read the T1 data

data = pd.read_csv('T1.csv')

Let’s examine the shape and descriptive statistics of this dataset

data.shape

Output: (50530, 5)

data.describe().T
Input data descriptive statistics

Automated Exploratory Data Analysis (EDA)

Let’s generate the Pandas Profiling HTML report

import pandas as pd
from pandas_profiling import ProfileReport

profile = ProfileReport(data, title="Pandas Scada Data Report")
profile.to_file("reportscada.html")

Overview:

Variable types

Categorical1
Numeric4

Dataset statistics:

Number of variables5
Number of observations50530
Missing cells0
Missing cells (%)0.0%
Duplicate rows0
Duplicate rows (%)0.0%
Total size in memory1.9 MiB
Average record size in memory40.0 B

Alerts:

Pandas Profiling Alerts

Variables:

LV ActivePower (kW)

LV ActivePower (kW) statistics

Interactions:

Interactions LV Active Power vs Wind Speed
Interactions LV Active Power vs Wind Direction

Correlations:

Correlation matrix

Sample

First rows

Data sample first rows

We can also generate the SweetViz report

# importing sweetviz
import sweetviz as sv
#analyzing the dataset
advert_report = sv.analyze(data)
#display the report
advert_report.show_html('scada_sweetviz.html')
SweetViz dashboard

Associations:

SweetViz associations

ML-Based Predictions of Wind Turbine Power Production

Let’s compare theoretical and observed power curves

matplotlib.rcParams.update({'font.size': 22})

data = data
exp = data['LV ActivePower (kW)']
the = data['Theoretical_Power_Curve (KWh)']
plt.figure(figsize=(12,6)) 
plt.plot(data['Wind Speed (m/s)'], data['LV ActivePower (kW)'], 'o', label='Real Power')
plt.plot(data['Wind Speed (m/s)'], data['Theoretical_Power_Curve (KWh)'], '.', label='theoretical_power_curve (kwh)')
plt.xlabel('wind speed (m/s)', size=22)
plt.ylabel('Power Production (kw)', size=22)
plt.title('Wind Turbine Power Production Prediction')
plt.legend(loc ="upper left",fontsize=15)
plt.show()
Observed vs theoretical power curves

Let’s import ML libraries and perform data pre-processing:

from sklearn import metrics
from sklearn import model_selection
from sklearn import preprocessing
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet
from sklearn.linear_model import Lasso
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import Ridge
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_squared_error, mean_absolute_error

from sklearn.model_selection import cross_val_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler, Normalizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import DecisionTreeRegressor
from xgboost import XGBClassifier
from xgboost.sklearn import XGBRegressor
import xgboost as xgb
def outlier_remover(dat, prop, min, max):
    d = dat
    q_low = d[prop].quantile(min)
    q_hi  = d[prop].quantile(max)
    return d[(d[prop] < q_hi) & (d[prop] > q_low)]
# Create Sub-DataFrames
d = {}
step = 50
i = 1
for x in range(20, 3400, step):
    d[i] = data.iloc[((data['LV ActivePower (kW)']>=x)&((data['LV ActivePower (kW)']<x+step))).values]
    #print(d[i])
    i = i + 1
print("There are in total of {} DataFrames".format(i-1))

Output: There are in total of 68 DataFrames

d[69] = data.iloc[(data['LV ActivePower (kW)']>=3300).values]
# Remove outlier
for x in range(1, 70):
    if x <= 3:
        F = 0.95
    elif ((x > 3) and (x <= 10)):
        F = 0.9
    elif ((x > 10) and (x <= 20)):
        F = 0.92
    elif ((x > 20) and (x < 30)):
        F = 0.96
    else:
        F = 0.985
    d[x] = outlier_remover(d[x], 'Wind Speed (m/s)', 0.0001, F)
df=pd.DataFrame()
for infile in range(1,70):
    data = d[infile]
    df=df.append(data,ignore_index=True)
df.shape

Output: (37803, 5)

Let’s compare theoretical and observed power curves after editing of observed data

matplotlib.rcParams.update({'font.size': 22})

data = df
exp = data['LV ActivePower (kW)']
the = data['Theoretical_Power_Curve (KWh)']
plt.figure(figsize=(12,6)) 
plt.plot(data['Wind Speed (m/s)'], data['LV ActivePower (kW)'], 'o', label='Real Power')
plt.plot(data['Wind Speed (m/s)'], data['Theoretical_Power_Curve (KWh)'], '.', label='theoretical_power_curve (kwh)')
plt.xlabel('wind speed (m/s)', size=22)
plt.ylabel('Power Production (kw)', size=22)
plt.title('Wind Turbine Power Production Prediction')
plt.legend(fontsize=15,loc ="lower right")
plt.show()
Observed vs theoretical power curves after observed data editing

Let’s introduce the following functions needed for ML training

ftrain = ['LV ActivePower (kW)', 'Wind Speed (m/s)', 'Wind Direction (°)']

def Definedata():
    # define dataset
    data2 = df[ftrain]
    X = data2.drop(columns=['LV ActivePower (kW)']).values
    y = data2['LV ActivePower (kW)'].values
    return X, y
def Models(models):
    
    model = models
    X, y = Definedata()
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 125)
    model.fit(X_train,y_train)
    y_pred = model.predict(X_test)
    y_total = model.predict(X)
    
    print("\t\tError Table")
    print('Mean Absolute Error      : ', metrics.mean_absolute_error(y_test, y_pred))
    print('Mean Squared  Error      : ', metrics.mean_squared_error(y_test, y_pred))
    print('Root Mean Squared  Error : ', np.sqrt(metrics.mean_squared_error(y_test, y_pred)))
    print('Accuracy on Training set   : ', model.score(X_train,y_train))
    print('Accuracy on Testing set  : ', model.score(X_test,y_test))
    return y_total, y

def Featureimportances(models):
    model = models
    model.fit(X_train,y_train)
    importances = model.feature_importances_
    features = df_test.columns[:9]
    imp = pd.DataFrame({'Features': ftest, 'Importance': importances})
    imp['Sum Importance'] = imp['Importance'].cumsum()
    imp = imp.sort_values(by = 'Importance')
    return imp

def Graph_prediction(y_actual, y_predicted):
    matplotlib.rcParams.update({'font.size': 22})
    y = y_actual
    y_total = y_predicted
    TP = df['Theoretical_Power_Curve (KWh)']
    number = len(df['Wind Speed (m/s)'])
    aa=[x for x in df['Wind Speed (m/s)']]
    plt.figure(figsize=(12,6)) 
    plt.plot(aa, y[:number], 'o', label='Real Power')
    plt.plot(aa, y_total[:number], 'x', label='Predicted Power')
    plt.plot(aa, TP[:number], '.', label='theoretical_power_curve (kwh)')
    
    plt.xlabel('wind speed (m/s)', size=22)
    plt.ylabel('Power Production (kw)', size=22)
    plt.title('Wind Turbine Power Production Prediction')
    plt.legend(fontsize=15,loc ="lower right")
    plt.show()

KNeighborsRegressor (KNN)

y_predicted, y_actual = Models(KNeighborsRegressor())
Graph_prediction(y_actual, y_predicted)
Error Table 

Mean Absolute Error : 84.9877243162573 

Mean Squared Error : 15990.241699106758 

Root Mean Squared Error : 126.45252745242682 

Accuracy on Training set : 0.9936902804656327 

Accuracy on Testing set : 0.9899973865944977
Power curve predicted by DecisionTreeRegressor() vs observed data and theoretical power curve

DecisionTreeRegressor (DT)

y_predicted, y_actual = Models(DecisionTreeRegressor())
Graph_prediction(y_actual, y_predicted)
Error Table 

Mean Absolute Error : 97.4102137065757 

Mean Squared Error : 22350.459953375295 

Root Mean Squared Error : 149.50070218355262 

Accuracy on Training set : 1.0 

Accuracy on Testing set : 0.9860187847966511
Power curve predicted by ExtraTreesRegressor() vs observed data and theoretical power curve

ExtraTreesRegressor (ET)

y_predicted, y_actual = Models(ExtraTreesRegressor())
Graph_prediction(y_actual, y_predicted)
Error Table 

Mean Absolute Error : 79.52242681022484 

Mean Squared Error : 14759.43257128833 

Root Mean Squared Error : 121.48840508990284 

Accuracy on Training set : 1.0 

Accuracy on Testing set : 0.9907673129103843
Power curve predicted by KNeighborsRegressor() vs observed data and theoretical power curve

RandomForestRegressor (RF)

y_predicted, y_actual = Models(RandomForestRegressor(n_estimators=350,min_samples_split=2,min_samples_leaf=1,max_features='sqrt',max_depth=25))
Graph_prediction(y_actual, y_predicted)
Error Table
Mean Absolute Error      :  76.44924832499083
Mean Squared  Error      :  12987.579768252423
Root Mean Squared  Error :  113.96306317510259
Accuracy on Training set   :  0.9988794773674271
Accuracy on Testing set  :  0.9918756863129712
Power curve predicted by RandomForestRegressor() vs observed data and theoretical power curve

GradientBoostingRegressor (GB)

y_predicted, y_actual = Models(GradientBoostingRegressor(random_state=21, n_estimators=2000))
Graph_prediction(y_actual, y_predicted)
Error Table
Mean Absolute Error      :  75.85085775100555
Mean Squared  Error      :  12444.035639985173
Root Mean Squared  Error :  111.55283788405015
Accuracy on Training set   :  0.9949454619292175
Accuracy on Testing set  :  0.9922156975452086
Power curve predicted by GradientBoostingRegressor() vs observed data and theoretical power curve

Let’s create the following ML performance summary bar chart

import pandas as pd

matplotlib.rcParams.update({'font.size': 22})



plotdata = pd.DataFrame({

    "MAE":[84.98,97.70,79.46,76.39,75.85],

    "RMSE":[126.45,150.26,121.75,113.96,111.55]},

    index=['KNN','DT',"ET", "RF", "GB"])

plotdata.plot(kind="bar",figsize=(15, 8))

plt.title("ML Model Regression Metrics MAE vs RMSE")

plt.xlabel("Regression Metrics",size=22)

plt.legend(fontsize=22,loc ="upper right")
ML Model Regression Metrics MAE vs RMSE for 5 ML models

Polynomial Regression

Let’s test the polynomial regression algorithm

from sklearn.preprocessing import PolynomialFeatures
poly_regressor = PolynomialFeatures(degree=4)
X, y = Definedata()
X_columns = poly_regressor.fit_transform(X)
from sklearn.linear_model import LinearRegression
model = LinearRegression()

model.fit(X_columns, y)

pred = model.predict(poly_regressor.fit_transform(X))

Let’s look at the scatter X-plot

plt.scatter(y,pred)
xx=y
yy=pred
m, b = np.polyfit(xx, yy, 1)

plt.plot(xx, yy, 'yo', xx, m*xx+b, '--r')

matplotlib.rcParams.update({'font.size': 22})
plt.xlabel("Observed",size=22)
plt.ylabel("Predicted",size=22)

plt.show()
Predicted vs Observed data X-plot of the 4th order polynomial regression

The corresponding power curves are as follows

y = y
    y_total = pred
    TP = df['Theoretical_Power_Curve (KWh)']
    number = len(df['Wind Speed (m/s)'])
    aa=[x for x in df['Wind Speed (m/s)']]
    matplotlib.rcParams.update({'font.size': 22})
    plt.figure(figsize=(12,6)) 
    plt.plot(aa, y[:number], 'o', label='Real Power')
    plt.plot(aa, y_total[:number], 'x', label='Predicted Power')
    plt.plot(aa, TP[:number], '.', label='theoretical_power_curve (kwh)')
    
    plt.xlabel('wind speed (m/s)', size=22)
    plt.ylabel('Power Production (kw)', size=22)
    plt.title('Wind Turbine Power Production Prediction')
    plt.legend(fontsize=15)
    plt.show()
Power curve predicted by the 4th order polynomial regression vs observed data and theoretical power curve

Wind Turbine Power Curve Simulations and Control

Let’s turn our attention to the T1 power curve control:

  • Import libraries
import numpy as np 
import pandas as pd 
import seaborn as sns
import matplotlib.pyplot as plt
from collections import Counter
%matplotlib inline
import openpyxl
import chart_studio.plotly
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)
import plotly.graph_objs as go
  • Data preparation
def find_month(x):
    if " 01 " in x:
        return "Jan"
    elif " 02 " in x:
        return "Feb"
    elif " 03 " in x:
        return "March"    
    elif " 04 " in x:
        return "April"    
    elif " 05 " in x:
        return "May"    
    elif " 06 " in x:
        return "June"    
    elif " 07 " in x:
        return "July"    
    elif " 08 " in x:
        return "August"    
    elif " 09 " in x:
        return "Sep"    
    elif " 10 " in x:
        return "Oct"    
    elif " 11 " in x:
        return "Nov"    
    else:
        return "Dec" 
def mean_speed(x):
    list=[]
    i=0.25
    while i<=25.5:
        list.append(i)
        i+=0.5
        
    for i in list:
        if x < i:
            x=i-0.25
            return x
def mean_direction(x):
    list=[]
    i=15
    while i<=375:
        list.append(i)
        i+=30
        
    for i in list:
        if x < i:
            x=i-15
            if x==360:
                return 0
            else:
                return x
def find_direction(x):
    if x==0:
        return "N"
    if x==30:
        return "NNE"
    if x==60:
        return "NEE" 
    if x==90:
        return "E" 
    if x==120:
        return "SEE" 
    if x==150:
        return "SSE" 
    if x==180:
        return "S" 
    if x==210:
        return "SSW" 
    if x==240:
        return "SWW" 
    if x==270:
        return "W" 
    if x==300:
        return "NWW" 
    if x==330:
        return "NNW"
  
data_T_start=pd.read_csv("T1.csv")
turbine_no="T1" 
data1_T=data_T_start.copy()
data1_T["Month"]=data1_T["LV ActivePower (kW)"]+data1_T["Wind Speed (m/s)"]
data1_T.rename(columns={'LV ActivePower (kW)':'ActivePower(kW)',"Wind Speed (m/s)":"WindSpeed(m/s)","Wind Direction (°)":"Wind_Direction"},
                inplace=True)
data1_T.rename(columns={'Date/Time':'Time'},inplace=True)
data1_T.Month=data1_T.Time.apply(find_month)
data1_T["mean_WindSpeed"]=data1_T["WindSpeed(m/s)"].apply(mean_speed)
data1_T["mean_Direction"]=data1_T["Wind_Direction"].apply(mean_direction)
data1_T["Direction"]=data1_T["mean_Direction"].apply(find_direction)
  • Data cleaning

Remove the data that wind speed is <3.5 and >25.5.

Eliminate data where wind speed is >3.5 and active power =0.

data2_T=data1_T[(data1_T["WindSpeed(m/s)"]>3.5) & (data1_T["WindSpeed(m/s)"]<=25.5)]
data3_T=data2_T[((data2_T["ActivePower(kW)"]!=0)&(data2_T["WindSpeed(m/s)"]>3.5)) | (data2_T["WindSpeed(m/s)"]<=3.5)]
  • Calculate the mean value of Powercurve(kW) when mean_WindSpeed is 5.5
data3_T["Theoretical_Power_Curve (KWh)"][data3_T["mean_WindSpeed"]==5.5].mean()

Output: 472.09575192642797

  • Loss Calculations
data_T_clean=data3_T.sort_values("Time")
data_T_clean["Loss_Value(kW)"]=data_T_clean["Theoretical_Power_Curve (KWh)"]-data_T_clean["ActivePower(kW)"]
data_T_clean["Loss(%)"]=data_T_clean["Loss_Value(kW)"]/data_T_clean["Theoretical_Power_Curve (KWh)"]*100
#round the values to 2 digit
data_T_clean=data_T_clean.round({'ActivePower(kW)': 2, 'WindSpeed(m/s)': 2, 'Theoretical_Power_Curve (KWh)': 2,
                                   'Wind_Direction': 2, 'Loss_Value(kW)': 2, 'Loss(%)': 2})
  • Create the final summary DataFrame
DepGroupT_speed = data_T_clean.groupby("mean_WindSpeed")
data_T_speed=DepGroupT_speed.mean()
#remove the unnecessary columns.
data_T_speed.drop(columns={"WindSpeed(m/s)","Wind_Direction","mean_Direction"},inplace=True)
#create a windspeed column from index values.
listTspeed_WS=data_T_speed.index.copy()
data_T_speed["WindSpeed(m/s)"]=listTspeed_WS
#change the place of columns.
data_T_speed=data_T_speed[["WindSpeed(m/s)","ActivePower(kW)","Theoretical_Power_Curve (KWh)","Loss_Value(kW)","Loss(%)"]]
#change the index numbers.
data_T_speed["Index"]=list(range(1,len(data_T_speed.index)+1))
data_T_speed.set_index("Index",inplace=True)
#del data_T_speed.index.name
#round the values to 2 digit
data_T_speed=data_T_speed.round({"WindSpeed(m/s)": 1, 'ActivePower(kW)': 2, 'Theoretical_Power_Curve (KWh)': 2, 'Loss_Value(kW)': 2, 'Loss(%)': 2})
#create a count column that shows the number of wind speed from clean data.
data_T_speed["count"]=[len(data_T_clean["mean_WindSpeed"][data_T_clean["mean_WindSpeed"]==i]) 
                        for i in data_T_speed["WindSpeed(m/s)"]]
DepGroupT_direction = data_T_clean.groupby("Direction")
data_T_direction=DepGroupT_direction.mean()
#remove the unnecessary columns.
data_T_direction.drop(columns={"WindSpeed(m/s)","Wind_Direction"},inplace=True)
#create a column from index.
listTdirection_Dir=data_T_direction.index.copy()
data_T_direction["Direction"]=listTdirection_Dir
#change the name of mean_WindSpeed column as  WindSpeed.
data_T_direction["WindSpeed(m/s)"]=data_T_direction["mean_WindSpeed"]
data_T_direction.drop(columns={"mean_WindSpeed"},inplace=True)
#change the place of columns.
data_T_direction=data_T_direction[["Direction","mean_Direction","ActivePower(kW)","Theoretical_Power_Curve (KWh)","WindSpeed(m/s)",
                                     "Loss_Value(kW)","Loss(%)"]]
#change the index numbers.
data_T_direction["Index"]=list(range(1,len(data_T_direction.index)+1))
data_T_direction.set_index("Index",inplace=True)
#del data_T_direction.index.name
#create a count column that shows the number of directions from clean data.
data_T_direction["count"]=[len(data_T_clean["Direction"][data_T_clean["Direction"]==i]) 
                        for i in data_T_direction["Direction"]]
#round the values to 2 digit
data_T_direction=data_T_direction.round({'WindSpeed(m/s)': 1,'ActivePower(kW)': 2, 'Theoretical_Power_Curve (KWh)': 2,
                                           'Loss_Value(kW)': 2, 'Loss(%)': 2})
#sort by mean_Direction
data_T_direction=data_T_direction.sort_values("mean_Direction")
data_T_direction.drop(columns={"mean_Direction"},inplace=True)
  • Mean Power vs Direction
#Drawing graph of mean powers according to wind direction.
def bar_graph():
    fig = plt.figure(figsize=(15,6))
    matplotlib.rcParams.update({'font.size': 22})
    plt.bar(data_T_direction["Direction"],data_T_direction["Theoretical_Power_Curve (KWh)"],label="Theoretical Power Curve",align="edge",width=0.3)
    plt.bar(data_T_direction["Direction"],data_T_direction["ActivePower(kW)"],label="Actual Power Curve",align="edge",width=-0.3)
    plt.xlabel("Wind Direction",size=22)
    plt.ylabel("Power (kW)",size=22)
    plt.title("Wind Farm {} Mean Power Values vs Direction".format(turbine_no))
    plt.legend(fontsize=15)
    plt.show()
bar_graph()
T1 mean power vs direction
  • Total energy generation vs direction
#create summary direction total dataframe from direction data.
data_T_direction_total=data_T_direction.copy()
#remove the unnecessary columns.
data_T_direction_total.drop(columns={"count","ActivePower(kW)","Theoretical_Power_Curve (KWh)","Loss_Value(kW)","Loss(%)"},inplace=True)
#calculate the total values from direction data.
data_T_direction_total["Total_Generation(MWh)"]=data_T_direction["ActivePower(kW)"]*data_T_direction["count"]/6000
data_T_direction_total["Theoretical_PC_Total_Generation(MWh)"]=data_T_direction["Theoretical_Power_Curve (KWh)"]*data_T_direction["count"]/6000
data_T_direction_total["Total_Loss(MWh)"]=data_T_direction_total["Theoretical_PC_Total_Generation(MWh)"]-data_T_direction_total["Total_Generation(MWh)"]
data_T_direction_total["Loss(%)"]=data_T_direction_total["Total_Loss(MWh)"]/data_T_direction_total["Theoretical_PC_Total_Generation(MWh)"]*100
#round the values to 2 digit
data_T_direction_total=data_T_direction_total.round({'WindSpeed(m/s)': 1,'Total_Generation(MWh)': 2, 'Theoretical_PC_Total_Generation(MWh)': 2,
                                           'Total_Loss(MWh)': 2, 'Loss(%)': 2})
#change the place of columns.
data_T_direction_total=data_T_direction_total[["Direction","Total_Generation(MWh)","Theoretical_PC_Total_Generation(MWh)","WindSpeed(m/s)",
                                     "Total_Loss(MWh)","Loss(%)"]]
#Drawing graph of total generations according to wind direction.
def bar_graph():
    fig = plt.figure(figsize=(15,6))
    matplotlib.rcParams.update({'font.size': 22})
    plt.bar(data_T_direction_total["Direction"],data_T_direction_total["Theoretical_PC_Total_Generation(MWh)"],label="Theoretical Power Curve",align="edge",width=0.3)
    plt.bar(data_T_direction_total["Direction"],data_T_direction_total["Total_Generation(MWh)"],label="Actual Power Curve",align="edge",width=-0.3)
    plt.xlabel("Wind Direction",size=22)
    plt.ylabel("Energy Generation (MWh)",size=22)
    plt.title("Wind Farm {} Total Energy Generation Values vs Direction".format(turbine_no))
    plt.legend(fontsize=15)
    plt.show()
bar_graph()
T1 total energy generation vs direction
  • Total energy loss vs direction
ef bar_graph():
    fig = plt.figure(figsize=(15,6))
    matplotlib.rcParams.update({'font.size': 22})
    plt.bar(data_T_direction_total["Direction"],data_T_direction_total["Total_Loss(MWh)"],
            label="Total_Loss(MWh)",align="center",width=0.5, color="red",picker=5)
    plt.xlabel("Wind Direction",size=22)
    plt.ylabel("Total Loss (MWh)",size=22)
    plt.title("Wind Farm {} Total Loss Values vs Direction".format(turbine_no))
    plt.legend(fontsize=15)
    plt.show()
bar_graph()
Total loss vs direction
  • Create the summary DataFrame for all directions
list_data=[]
list_yon=["N","NNE","NEE","E","SEE","SSE","S","SSW","SWW","W","NWW","NNW"]
for i in range(0,12):
    data1T_A=data_T_clean[data_T_clean["Direction"]==list_yon[i]]
    #
    DepGroup_A = data1T_A.groupby("mean_WindSpeed")
    data_T_A=DepGroup_A.mean()
    #
    data_T_A.drop(columns={"WindSpeed(m/s)","Wind_Direction","mean_Direction"},inplace=True)
    #
    listTA_WS=data_T_A.index.copy()
    data_T_A["WindSpeed(m/s)"]=listTA_WS
    #
    data_T_A=data_T_A[["WindSpeed(m/s)","ActivePower(kW)","Theoretical_Power_Curve (KWh)","Loss_Value(kW)","Loss(%)"]]
    #
    data_T_A["Index"]=list(range(1,len(data_T_A.index)+1))
    data_T_A.set_index("Index",inplace=True)
#    del data_T_A.index.name
    #
    data_T_A=data_T_A.round({'ActivePower(kW)': 2, 'Theoretical_Power_Curve (KWh)': 2, 'Loss_Value(kW)': 2, 'Loss(%)': 2})
    #
    data_T_A["count"]=[len(data1T_A["mean_WindSpeed"][data1T_A["mean_WindSpeed"]==x]) 
                            for x in data_T_A["WindSpeed(m/s)"]]
    list_data.append(data_T_A)
    
data_T_N=list_data[0]
data_T_NNE=list_data[1]
data_T_NEE=list_data[2]
data_T_E=list_data[3]
data_T_SEE=list_data[4]
data_T_SSE=list_data[5]
data_T_S=list_data[6]
data_T_SSW=list_data[7]
data_T_SWW=list_data[8]
data_T_W=list_data[9]
data_T_NWW=list_data[10]
data_T_NNW=list_data[11]
  • Plotting the T1 power curve for all directions
def graph_WT():
    fig = plt.figure(figsize=(12,6))
    matplotlib.rcParams.update({'font.size': 22})
    plt.plot(data_T_speed["WindSpeed(m/s)"],data_T_speed["Theoretical_Power_Curve (KWh)"],label="Theoretical Power Curve",
             marker="o",markersize=10,linewidth = 5)
    plt.plot(data_T_speed["WindSpeed(m/s)"],data_T_speed["ActivePower(kW)"],label="Actual Power Curve",
             marker="o",markersize=10,linewidth = 5)
    plt.xlabel("Wind Speed (m/s)",size=22)
    plt.ylabel("Power (kW)",size=22)
    plt.title("Wind Farm {} Power Curve".format(turbine_no))
    plt.legend()
    plt.show()
graph_WT()
T1 power curve for all directions
  • Plotting the T1 power curve vs direction bin
list_table=[data_T_N,data_T_NNE,data_T_NEE,data_T_E,data_T_SEE,data_T_SSE,data_T_S,
            data_T_SSW,data_T_SWW,data_T_W,data_T_NWW,data_T_NNW]

list_tableName=["N","NNE","NEE","E","SEE","SSE","S","SSW","SWW","W","NWW","NNW"]

def graph_T(i):
    fig = plt.figure(figsize=(12,6))  
    matplotlib.rcParams.update({'font.size': 22})
    plt.plot(list_table[i]["WindSpeed(m/s)"],list_table[i]["Theoretical_Power_Curve (KWh)"],label="Theoretical Power Curve",
             marker="o",markersize=10,linewidth = 5)
    plt.plot(list_table[i]["WindSpeed(m/s)"],list_table[i]["ActivePower(kW)"],label="Actual Power Curve",
             marker="o",markersize=10,linewidth = 5)
    plt.xlabel("Wind Speed (m/s)",size=22)
    plt.ylabel("Power (kW)",size=22)
    plt.title("Wind Farm {} Power Curve According to {} Wind".format(turbine_no,list_tableName[i]))
    plt.legend(fontsize=15)
    plt.show()
for i in range(0,12):
    graph_T(i)
T1 power curve according to N wind
T1 power curve according to NNE wind
T1 power curve according to NEE wind
T1 power curve according to E wind
T1 power curve according to SEE wind
T1 power curve according to SSE wind
T1 power curve according to S wind
T1 power curve according to SSW wind
T1 power curve according to SWW wind
T1 power curve according to W wind
T1 power curve according to NWW wind
T1 power curve according to NNW wind

Summary

  • A wind turbine is characterized by a power curve representing the relation between the wind speed and the amount of generated electrical power.
  • We have conducted this study to find the best performing regression algorithm to model power curves. 
  • We have tested and validated our models using real Scada data from a wind farm. 
  • The study showed that very accurate power curves can be created based upon supervised ML techniques such as Random Forest and Gradient Boosting Regressors.
  • Using observed winds and power production at an operational site, we have shown that both wind speed and direction affect the average energy generation and loss.
  • The final Python workflow is straightforward to implement and does not rely on assumptions about the wind or stability.
  • Industry applications include wind turbine selection, capacity factor estimation, wind energy assessment and forecasting, and condition monitoring.

Explore More


Go back

Your message has been sent

Warning

One-Time
Monthly
Yearly

Make a one-time donation

Make a monthly donation

Make a yearly donation

Choose an amount

€5.00
€15.00
€100.00
€5.00
€15.00
€100.00
€5.00
€15.00
€100.00

Or enter a custom amount


Your contribution is appreciated.

Your contribution is appreciated.

Your contribution is appreciated.

DonateDonate monthlyDonate yearly

Discover more from Our Blogs

Subscribe to get the latest posts sent to your email.

Leave a comment

Discover more from Our Blogs

Subscribe now to keep reading and get access to the full archive.

Continue reading