#### TimeSeries / ForecastEvaluation / 0.3.3

## Overview

This algorithm evaluates time series / sequential forecasting algorithms for performance, accuracy, and precision. It can also serve as an important component in an unsupervised anomaly detection process!

## CSV Format

This algorithm takes evaluation data in CSV form; it expects each variable to be delimited with a comma and each timestep to be on a separate line. For more information, take a look at our univariate and bivariate examples.
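Concretely, each line holds one timestep and each comma-separated field holds one variable. Parsing a small bivariate sample with Python's standard `csv` module might look like this (the header row and the column names are illustrative assumptions, not part of the format description above):

```python
import csv
import io

# A small bivariate sample: two comma-delimited variables per line,
# one timestep per line. The header and names are made up for illustration.
raw = """requests,errors
120.0,3.0
135.5,1.0
128.25,2.0
"""

reader = csv.reader(io.StringIO(raw))
header = next(reader)                                 # variable names
rows = [[float(v) for v in line] for line in reader]  # one list per timestep

print(header)   # ['requests', 'errors']
print(rows[0])  # [120.0, 3.0]
```

A univariate file is the same layout with a single column.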

## Algorithms FAQ

As each algorithm is different, this section will define any caveats or interesting attributes for our supported algorithms.

### TimeSeries/Forecast

This algorithm uses a univariate polynomial approximation function to replicate any trend and periodicity in the training data to assist in forecasting.

#### What kind of data does this algorithm expect?

This algorithm expects a univariate (single-variable) dataset, such as the one in the univariate example.

#### Does this algorithm use model files?

No; it is not a machine learning based algorithm, which means there are no RNN model files. Ensure that your `eval_percentage` variable reflects how much data you wish to use to create your polynomial approximation function.

#### If I want to know more about this algorithm, where should I look?

For more information, check out its algorithm page here: TimeSeries/Forecast.

#### Any special caveats of this algorithm?

This algorithm is well suited to business use cases with regular seasonality, trends, and the like. It does struggle with other types of sequential datasets.

### TimeSeries/GenerativeForecast

This algorithm uses an LSTM neural network architecture that is capable of forecasting nonlinear and complex trends. *When using custom data, you must build a model first. Read the algorithm documentation for more info.*

#### Does this algorithm use model files?

Yes, it does; this algorithm uses its model files as built-in memory to remember and preserve data it has already seen!

#### What kind of data does this algorithm expect?

This algorithm was designed to be robust and flexible in its data inputs; it can take any CSV file with continuous variables. However, because it uses a checkpoint model, your evaluation data should proceed directly after the data used to train or update your checkpoint model.
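A minimal sketch of what "proceed directly after" means: if the series used to train or update the checkpoint model ends at timestep t, the evaluation file should begin at t + 1, with no gap and no shuffling. The helper below is purely illustrative and not part of the algorithm:

```python
def contiguous_split(series, train_fraction=0.85):
    """Split a sequence into a training slice and an evaluation slice
    that begins immediately after it -- no gap, no reshuffling."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

series = list(range(100))            # stand-in for one continuous variable
train, eval_part = contiguous_split(series)

# The evaluation slice starts exactly where the training slice ended.
assert eval_part[0] == train[-1] + 1
```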

#### Any special caveats of this algorithm?

Yes: the algorithm makes really great graphs of its forecasts! After performing an evaluation in advanced mode, note the UUID associated with any interesting forecast and check your TimeSeries/GenerativeForecasting temp collection to access your graph images.

## Examples

### Example 1

Sine wave, timeseries/generativeforecast, simple mode

#### Input

```
{
  "num_of_evals": 5,
  "data_path": "data://timeseries/generativeforecasting/sinewave_v1.0_t1.csv",
  "forecast_length": 35,
  "algorithm": "timeseries/generativeforecast",
  "model_path": "data://timeseries/generativeforecasting/sinewave_v1.0_t0.t7"
}
```

#### Output

```
{
  "error": {
    "max": 0.00857300213419307,
    "mean": 0.00299875514566992,
    "min": 0.0004365765109465927
  }
}
```

### Example 2

API requests, timeseries/generativeforecast, advanced mode

#### Input

```
{
  "algorithm": "timeseries/generativeforecast",
  "model_path": "data://timeseries/generativeforecasting/apidata_v0.2.5_t0.t7",
  "advanced_mode": "true",
  "data_path": "data://timeseries/generativeforecasting/apidata_v0.2.5_t1.csv",
  "num_of_evals": 10,
  "forecast_length": 35
}
```

#### Output

```
{
  "complete_data_path": "data://.algo/temp/57001458-c046-4de5-a031-a3fd6e3f6338.json",
  "summary": {
    "error": {
      "max": {
        "id": "29e3db37-cf39-4388-996d-271bcd15781d",
        "value": 0.02067989932028304
      },
      "mean": 0.009846650607575207,
      "min": {
        "id": "8043e3db-2043-4800-9f5c-94b080508e14",
        "value": 0.002214220459913783
      },
      "std": 0.0057925733397795574
    },
    "exec_time": {
      "max": 31.954874515533447,
      "mean": 23.33813188076019,
      "min": 14.700079441070557,
      "std": 5.356676354101113
    }
  }
}
```

### Example 3

API requests, timeseries/forecast, advanced mode with custom output

#### Input

```
{
  "algorithm": "timeseries/forecast",
  "model_path": "data://timeseries/generativeforecasting/apidata_v0.2.5_t0.t7",
  "advanced_mode": "true",
  "data_path": "data://timeseries/generativeforecasting/apidata_v0.2.5_t1.csv",
  "num_of_evals": 10,
  "forecast_length": 35,
  "data_output_path": "data://.my/example_collection/apidata_v0.2.5_eval.json"
}
```

#### Output

```
{
  "complete_data_path": "data://.my/example_collection/apidata_v0.2.5_eval.json",
  "summary": {
    "error": {
      "max": {
        "id": "34d5eadf-b112-48aa-955f-078b5796e64b",
        "value": 0.019366085544574863
      },
      "mean": 0.008697436763929537,
      "min": {
        "id": "32bd3a8e-c588-4d53-8a92-fb28be16192b",
        "value": 0.0007278606148913183
      },
      "std": 0.0058660767961214375
    },
    "exec_time": {
      "max": 1.7069735527038574,
      "mean": 0.9620259523391724,
      "min": 0.23518633842468264,
      "std": 0.5176264802321938
    }
  }
}
```

## Error Formula

We calculate error by first normalizing the evaluation data; this enables comparisons across datasets with potentially wildly diverging ranges.
Once the data is normalized, we calculate the Mean Absolute Error (MAE) for each forecast. Taken together with the normalization step, this metric is called the `Normal Mean Absolute Error`. In most cases the resulting error value will be less than 1, though in some cases it may be greater.
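The exact normalization step isn't spelled out above; one common reading is min-max scaling of the evaluation window followed by a plain MAE, sketched here under that assumption:

```python
def normalized_mae(actual, forecast):
    """Min-max normalize using the actual data's range, then take the
    mean absolute error between forecast and target. This is one common
    reading of 'normal mean absolute error'; the algorithm's exact
    scaling may differ."""
    lo, hi = min(actual), max(actual)
    span = (hi - lo) or 1.0                    # guard against a flat series
    a = [(x - lo) / span for x in actual]
    f = [(x - lo) / span for x in forecast]
    return sum(abs(x - y) for x, y in zip(a, f)) / len(a)

# A perfect forecast scores 0; a forecast far outside the observed
# range can push the value above 1.
print(normalized_mae([0, 5, 10], [0, 5, 10]))  # 0.0
```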

## Complete evaluation data

You might be asking: what are those `complete_data_path` files, and what do they look like? During an evaluation we make many forecast algorithm requests; we do this to flatten out the variance from any single good or bad forecast. But what do we do with the rest of the algorithm data that is returned? *We save all of it!* When using advanced mode we return all forecast data in a single JSON object, which allows you to dive even further into the evaluation!
Let's take a quick look at an example we already looked at:

```
{
  "algorithm": "timeseries/generativeforecast",
  "model_path": "data://timeseries/generativeforecasting/apidata_v0.2.5_t0.t7",
  "advanced_mode": "true",
  "data_path": "data://timeseries/generativeforecasting/apidata_v0.2.5_t1.csv",
  "num_of_evals": 2,
  "forecast_length": 5
}
```

As you might have noticed, we never defined a `data_output_path` variable, which means the complete evaluation data lands in our algorithm temp collection. Let's open up that file and see what it looks like:

```
[
  {
    "error": 0.0018789254473265983,
    "exec_time": 26.187873363494873,
    "id": "f5daf027-1c95-4d8b-bd54-040bc7b16dee",
    "algo_response": {
      "envelope": [
        {
          "second_deviation": {
            "lower_bound": [
              -760.3936167986036,
              -1095.5086156265447,
              -939.5766748371902,
              -1167.3063141199827,
              -702.9676420146652
            ],
            "upper_bound": [
              308.5193675310255,
              454.2168492691229,
              402.373616975862,
              742.9122223231078,
              955.3420385295578
            ]
          },
          "mean": [
            -225.93712463378907,
            -320.64588317871096,
            -268.6015289306641,
            -212.1970458984375,
            126.18719825744628
          ],
          "standard_deviation": [
            267.2282460824073,
            387.43136622391694,
            335.48757295326305,
            477.55463411077267,
            414.57742013605576
          ],
          "first_deviation": {
            "lower_bound": [
              -493.1653707161963,
              -708.0772494026279,
              -604.0891018839271,
              -689.7516800092102,
              -288.3902218786095
            ],
            "upper_bound": [
              41.29112144861821,
              66.78548304520598,
              66.88604402259898,
              265.35758821233514,
              540.764618393502
            ]
          },
          "variable": "Requests made"
        }
      ],
      "saved_graph_path": "data://.algo/temp/f5daf027-1c95-4d8b-bd54-040bc7b16dee.png"
    },
    "index": 1277
  },
  {
    "error": 0.007211677972411254,
    "exec_time": 6.704785346984863,
    "id": "bc1b3da1-f29c-4e0b-b33e-4279a2c21cd4",
    "algo_response": {
      "envelope": [
        {
          "second_deviation": {
            "lower_bound": [
              502.85361303671993,
              912.1484019093724,
              1704.9310144197402,
              2221.680316049337,
              2480.419836612773
            ],
            "upper_bound": [
              1049.152685791405,
              1847.28668109844,
              2413.6306066740094,
              3213.890973013163,
              3823.720007137227
            ]
          },
          "mean": [
            776.0031494140625,
            1379.7175415039062,
            2059.280810546875,
            2717.78564453125,
            3152.069921875
          ],
          "standard_deviation": [
            136.57476818867127,
            233.78456979726695,
            177.17489806356724,
            248.05266424095652,
            335.8250426311135
          ],
          "first_deviation": {
            "lower_bound": [
              639.4283812253911,
              1145.9329717066394,
              1882.1059124833075,
              2469.7329802902937,
              2816.2448792438868
            ],
            "upper_bound": [
              912.5779176027338,
              1613.502111301173,
              2236.455708610442,
              2965.8383087722063,
              3487.8949645061134
            ]
          },
          "variable": "Requests made"
        }
      ],
      "saved_graph_path": "data://.algo/temp/bc1b3da1-f29c-4e0b-b33e-4279a2c21cd4.png"
    },
    "index": 1162
  }
]
```

Looks like a bunch of valuable information. If you compare the above with the GenerativeForecast algorithm's output schema, there is something extra here: the `index` variable. It defines the location in our evaluation data array that the forecast operation uses as a break point; we pass all data to the algorithm *up to this point, and tell it to predict the next N steps*. This means that if you want to replicate any particular forecast, you just need the index and the algorithm.
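For example, replicating a forecast boils down to cutting the evaluation series at the record's `index` and forecasting the next `forecast_length` steps. The records and series below are trimmed stand-ins for the real file, not its actual contents:

```python
# Trimmed stand-ins for two records from a complete-data file.
records = [
    {"id": "f5daf027-...", "index": 1277, "error": 0.00188},
    {"id": "bc1b3da1-...", "index": 1162, "error": 0.00721},
]
series = list(range(1500))   # stand-in for one column of the evaluation CSV
forecast_length = 5

# Pick the worst forecast and rebuild exactly what it saw and predicted.
worst = max(records, key=lambda r: r["error"])
cut = worst["index"]
history = series[:cut]                       # all data passed to the forecaster
target = series[cut:cut + forecast_length]   # the steps it had to predict

print(worst["id"], len(history), target)
```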

## IO

Need more info? Below are our API docs!

### Input

Parameter | Description | Type |
---|---|---|
`data_path` | A data collection URI pointing to a properly formatted sequential CSV file. | String |
`algorithm` | The name of the algorithm you wish to evaluate; see the Algorithms FAQ above for the list of currently supported algorithms. | String |
`model_path` | If the chosen algorithm utilizes checkpoint models, this must be provided in your input. | String |
`num_of_evals` | The total number of independent evaluations to perform. Each evaluation uses a different forecast point, taken at random from the dataset provided with `data_path`; higher numbers yield more reliable results. Defaults to 10. | Int |
`forecast_length` | The number of steps into the future to evaluate each forecast; choose a number that makes sense for your algorithm. Defaults to 25. | Int |
`advanced_mode` | If you want all available information about your evaluation, set this to `"true"`; otherwise you will receive a much simpler output. Defaults to `"false"`. | String |
`eval_percentage` | The percentage of the provided data to use for evaluation. Lower values expose the forecasting algorithm to more data before evaluating, which is important for algorithms without checkpoint models. Defaults to 0.85. | Float |

### Output

This algorithm returns two different types of output depending on mode: simple and advanced.

#### Simple Output

```
{
  "error": {
    "max": 0.00857300213419307,
    "mean": 0.00299875514566992,
    "min": 0.0004365765109465927
  }
}
```

Parameter | Description | Type |
---|---|---|
error | The error object wrapper; contains basic error info. | Object |
max | The maximum detected error across all forecasts. | Float |
mean | The mean or average error across all forecasts. | Float |
min | The minimum detected error across all forecasts. | Float |

#### Advanced Output

```
{
  "complete_data_path": "data://.my/example_collection/apidata_v0.2.5_eval.json",
  "summary": {
    "error": {
      "max": {
        "id": "34d5eadf-b112-48aa-955f-078b5796e64b",
        "value": 0.019366085544574863
      },
      "mean": 0.008697436763929537,
      "min": {
        "id": "32bd3a8e-c588-4d53-8a92-fb28be16192b",
        "value": 0.0007278606148913183
      },
      "std": 0.0058660767961214375
    },
    "exec_time": {
      "max": 1.7069735527038574,
      "mean": 0.9620259523391724,
      "min": 0.23518633842468264,
      "std": 0.5176264802321938
    }
  }
}
```

Parameter | Description | Type |
---|---|---|
complete_data_path | The data collection URI pointing to where the `Complete Evaluation Data` file is located. | String |
summary | The forecast summary object. | Object |
summary/error | This algorithm measures the normal mean absolute error between forecasts and their expected targets. | Object |
summary/error/mean | The mean or average error across all forecasts. | Float |
summary/error/std | The standard deviation of the error across all forecasts; a larger than normal std might hint at an anomaly in the data. | Float |
summary/error/(max or min)/id | The UUID of the forecast where the (max or min) error was detected. | String |
summary/error/(max or min)/value | The (max or min) error value for the evaluation; refer to the UUID to find the related forecast in the `Complete Evaluation Data` file. | Float |
summary/exec_time | This algorithm measures algorithm `execution time`, which can be incredibly useful as a metric for comparing performance between algorithms. | Object |
summary/exec_time/max | The maximum execution time measured across all forecast algorithm requests. | Float |
summary/exec_time/mean | The mean or average execution time measured across all forecast algorithm requests. | Float |
summary/exec_time/min | The minimum execution time measured across all forecast algorithm requests. | Float |
summary/exec_time/std | The standard deviation of the execution time measured across all forecast algorithm requests. | Float |