Real-Time Anomaly Detection of NAB Ambient Temperature Readings using the TensorFlow/Keras Autoencoder

  • Today we discuss anomaly detection in time series data using autoencoders. In this approach, anomalies are data points with large reconstruction errors, i.e. points that the trained autoencoder reproduces poorly.
  • In the context of predictive maintenance, a time series anomaly may point to an impending equipment failure that can be addressed before it causes significant downtime or safety concerns.
  • The Numenta Anomaly Benchmark (NAB) provides the input dataset ambient_temperature_system_failure.csv, which contains labeled, real-world time-series data of ambient temperature readings from a system that experienced a failure.
  • Recall that NAB is the first temporal benchmark designed to evaluate real-time anomaly detectors.
  • The entire Python workflow consists of the following three steps:
  1. Importing Libraries and Input Dataset
  2. Anomaly Detection using Autoencoder
  3. Visualizing the Anomaly
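
The core idea from the first bullet, flagging points whose reconstruction error is unusually large, can be illustrated without any neural network. The snippet below is a minimal sketch and not the workflow used in this post: it assumes a synthetic series and a stand-in "reconstruction" (the series mean), and the helper name flag_by_reconstruction_error is hypothetical.

```python
import numpy as np


def flag_by_reconstruction_error(values, reconstructions, quantile=0.99):
    """Return (errors, threshold, flags) for two 1-D arrays of equal length."""
    values = np.asarray(values, dtype=float)
    reconstructions = np.asarray(reconstructions, dtype=float)
    # Squared error per data point
    errors = (values - reconstructions) ** 2
    # Points whose error exceeds the chosen quantile are flagged
    threshold = np.quantile(errors, quantile)
    flags = errors > threshold
    return errors, threshold, flags


# Synthetic series (an assumption for illustration): noise around 70
# with five injected spikes.
rng = np.random.default_rng(0)
series = 70.0 + rng.normal(0.0, 1.0, size=1000)
spike_idx = np.array([100, 300, 500, 700, 900])
series[spike_idx] += 15.0

# Stand-in "reconstruction": the mean of the series.
recon = np.full_like(series, series.mean())
errors, threshold, flags = flag_by_reconstruction_error(series, recon)
```

Note that a quantile-based threshold flags a fixed share of the points (about 1% at the 0.99 quantile) whether or not they are true anomalies, so the quantile is a tuning choice rather than a detection guarantee.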

Importing Libraries and Input Dataset

import pandas as pd
import tensorflow as tf
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.metrics import precision_recall_fscore_support
import matplotlib.pyplot as plt
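# Load the NAB dataset; parse the timestamp column as datetimes
# so that the time axis is plotted correctly later on
data = pd.read_csv(
    'https://raw.githubusercontent.com/numenta'
    '/NAB/master/data/realKnownCause/ambient'
    '_temperature_system_failure.csv',
    parse_dates=['timestamp'])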
  
# Exclude datetime column
data_values = data.drop('timestamp',
                        axis=1).values
  
# Convert data to float type
data_values = data_values.astype('float32')
  
# Create new dataframe with converted values
data_converted = pd.DataFrame(data_values,
                              columns=data.columns[1:])
  
# Add back datetime column
data_converted.insert(0, 'timestamp',
                      data['timestamp'])
data_converted = data_converted.dropna()
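
The readings are passed to the network unscaled in this post. A common preprocessing step, which is not part of the workflow above, is to rescale the values to the [0, 1] range before training and to map reconstructions back afterwards. The sketch below shows one way to do that with NumPy; the helper names min_max_scale and min_max_unscale and the sample readings are hypothetical.

```python
import numpy as np


def min_max_scale(values):
    """Scale a 1-D array to [0, 1]; return the scaled array and (min, max)."""
    values = np.asarray(values, dtype=np.float32)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # A constant series has no range; map it to zeros.
        return np.zeros_like(values), (lo, hi)
    return (values - lo) / (hi - lo), (lo, hi)


def min_max_unscale(scaled, bounds):
    """Invert min_max_scale using the stored (min, max) bounds."""
    lo, hi = bounds
    return np.asarray(scaled, dtype=np.float32) * (hi - lo) + lo


# Made-up readings, for illustration only
readings = np.array([68.0, 70.5, 72.0, 86.0], dtype=np.float32)
scaled, bounds = min_max_scale(readings)
restored = min_max_unscale(scaled, bounds)
```

If scaling were used, the reconstruction error would be measured in scaled units, so the anomaly threshold would need to be computed on the scaled errors as well.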

Anomaly Detection using Autoencoder

# Exclude datetime column again
data_tensor = tf.convert_to_tensor(data_converted.drop(
    'timestamp', axis=1).values, dtype=tf.float32)
  
# Define the autoencoder model
input_dim = data_converted.shape[1] - 1
encoding_dim = 10
  
input_layer = Input(shape=(input_dim,))
encoder = Dense(encoding_dim, activation='relu')(input_layer)
decoder = Dense(input_dim, activation='relu')(encoder)
autoencoder = Model(inputs=input_layer, outputs=decoder)
  
# Compile and fit the model
autoencoder.compile(optimizer='adam', loss='mse')
autoencoder.fit(data_tensor, data_tensor, epochs=50,
                batch_size=32, shuffle=True)
  
# Calculate the reconstruction error for each data point
reconstructions = autoencoder.predict(data_tensor)
mse = tf.reduce_mean(tf.square(data_tensor - reconstructions),
                     axis=1)
anomaly_scores = pd.Series(mse.numpy(), name='anomaly_scores')
anomaly_scores.index = data_converted.index
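# Flag the top 1% of reconstruction errors as anomalies.
# A quantile threshold flags about 1% of the points by construction.
threshold = anomaly_scores.quantile(0.99)
anomalous = anomaly_scores > threshold

# NOTE: binary_labels is derived from the predictions themselves, not from
# NAB's ground-truth labels, so the metrics below equal 1.0 by construction
# and do not measure detection quality.
binary_labels = anomalous.astype(int)
precision, recall,\
    f1_score, _ = precision_recall_fscore_support(
        binary_labels, anomalous, average='binary')
test = data_converted['value'].values
predictions = anomaly_scores.values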
  
print("Precision: ", precision)
print("Recall: ", recall)
print("F1 Score: ", f1_score)
Precision:  1.0
Recall:  1.0
F1 Score:  1.0
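
The scores above compare the predicted flags with labels derived from those same flags, so they are 1.0 by construction. For a meaningful evaluation, the flags would need to be compared with ground-truth labels; the NAB repository publishes labeled anomaly windows for its datasets (labels/combined_windows.json). The sketch below assumes those windows have already been loaded as (start, end) timestamp pairs. The helper names, the timestamps, the window, and the detector output are all made up for illustration and are not NAB's values.

```python
import numpy as np


def labels_from_windows(timestamps, windows):
    """Mark each timestamp that falls inside any [start, end] window."""
    ts = np.asarray(timestamps, dtype="datetime64[s]")
    labels = np.zeros(ts.shape, dtype=bool)
    for start, end in windows:
        labels |= (ts >= np.datetime64(start)) & (ts <= np.datetime64(end))
    return labels


def precision_recall_f1(y_true, y_pred):
    """Point-wise precision, recall, and F1 for boolean arrays."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return float(precision), float(recall), float(f1)


# Made-up hourly timestamps and one made-up anomaly window
timestamps = np.arange("2020-01-01T00", "2020-01-01T10",
                       dtype="datetime64[h]")
windows = [("2020-01-01T03:00:00", "2020-01-01T05:00:00")]
y_true = labels_from_windows(timestamps, windows)

# Hypothetical detector output: hours 4, 5 and 8 are flagged
y_pred = np.zeros(len(timestamps), dtype=bool)
y_pred[[4, 5, 8]] = True

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

This is a simple point-wise comparison. NAB itself defines a window-based scoring scheme that rewards early detection, which this sketch does not reproduce.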

Visualizing the Anomaly

# Plot the data with anomalies marked in red
plt.figure(figsize=(16, 8))
plt.plot(data_converted['timestamp'],
         data_converted['value'])
plt.plot(data_converted['timestamp'][anomalous],
         data_converted['value'][anomalous], 'ro')
plt.title('Anomaly Detection')
plt.xlabel('Time')
plt.ylabel('Value')
plt.show()
Real-time anomaly detection: NAB time series.
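
When many neighbouring points are flagged, individual red markers can overlap in the plot. One option, sketched below under the assumption that the flags are available as a boolean array (such as anomalous.values), is to group consecutive flagged points into runs. The helper name flagged_runs and the example flags are hypothetical.

```python
import numpy as np


def flagged_runs(flags):
    """Return inclusive (start, end) index pairs for each run of True values."""
    flags = np.asarray(flags, dtype=bool)
    # Pad with False on both sides so every run has a rising and a falling edge
    padded = np.concatenate(([False], flags, [False]))
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1) - 1
    return list(zip(starts.tolist(), ends.tolist()))


example_flags = [False, True, True, False, False,
                 True, False, True, True, True]
runs = flagged_runs(example_flags)
```

Each run could then be drawn as a shaded band, for example with plt.axvspan using the timestamps at the start and end indices, instead of as separate markers.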

Summary

  • We imported the key libraries required to implement anomaly detection with an autoencoder.
  • We loaded the NAB dataset, which contains time-series data of ambient temperature readings from a system that experienced a failure.
  • We defined the autoencoder model and fitted it with Keras on the cleaned NAB data.
  • We set the anomaly threshold at the 99th percentile of the reconstruction error and computed precision, recall, and F1 score. Because the reference labels in this workflow are derived from the model's own predictions, these scores equal 1.0 by construction; a meaningful evaluation would compare the flags against NAB's labeled anomaly windows.
  • The final plot shows the original time series with the anomalies identified by the trained autoencoder highlighted in red.
