[Regretfully, the screenshots as well as the web application discussed here have been lost in a botched site migration. Apologies. -Editor]
We’ve been hard at work on a new tool, which we are proud to announce: the Altometrics Dynamic Thresholding Tool.
Many data sets have periodic patterns. For example, in network management, the network usage (in bytes per second) varies greatly by the time of day. When, say, everybody is at work, the usage goes up. When everybody leaves, usage goes down.
Trying to create notification thresholds for this sort of data is frustrating. If you set a static threshold, it’s prone to trigger false alarms during high-traffic periods, when even slightly more traffic than normal exceeds it. Conversely, even a pretty serious anomaly might not trigger an alarm if it happens in the middle of the night, when there is typically little traffic.
Clearly, we need thresholds that vary by the time of day. But we shouldn’t burden the network managers with creating such complex thresholds. Instead, we can do that automatically.
Other dynamic thresholding tools use probability to decide the thresholds. The problem with this approach is that it encodes an expectation that a certain percentage of your data is anomalous, which is, we think, a fairly odd assumption to make. Just because a measurement was unusually large or small doesn’t necessarily mean it was anomalous.
Our approach is different. We compare periods of time with selected historical data to determine threshold values. The default settings compare each hour of the day to the same hour from previous days. In this arrangement, we call the period being analyzed (here, an hour) the window and the historical selections the context. The screenshot below shows the window (highlighted in grey) and its context (highlighted in yellow and red).
Within the context, we compare each of the windows to the rest to determine whether there are any anomalous windows. If there are, like the red window in the screenshot above, we exclude them. From the remaining windows, we simply take the minimum and maximum data points as the minimum and maximum thresholds. Thus, the assumption we encode is that if a window is normal compared to similar windows, then all the data in that window can also be considered normal, and it can be relied upon to predict future behavior. We think this is a much more reasonable approach for most applications. It helps that, for reasons I hope to expound in a future post, our anomaly detection methods are excellent.
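To make the scheme concrete, here is a minimal sketch of the exclude-then-min/max step. Everything here is illustrative: the post does not specify the actual anomaly test, so this sketch substitutes a simple z-score on each context window’s mean, and the function name and cutoff are invented.

```python
import numpy as np

def dynamic_thresholds(context_windows, exclusion_z=1.5):
    # Each row is one context window (e.g. the same hour from a
    # previous day). The z-score-of-means anomaly test below is a
    # stand-in; the tool's real anomaly detection method is not
    # described in the post.
    data = np.asarray(context_windows, dtype=float)
    means = data.mean(axis=1)
    z = np.abs(means - means.mean()) / (means.std() or 1.0)
    kept = data[z < exclusion_z]          # drop anomalous windows
    # Min/max across the surviving windows become the thresholds.
    return kept.min(), kept.max()
```

For example, three quiet hours plus one spike-laden hour would exclude the spike, yielding thresholds that span only the quiet data.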
There are several other features worth mentioning:
- The drill-down charts provide detail about threshold bounds.
- The tool is configurable to work with different sorts of data sets.
- The charts update themselves in real-time when new data arrives.
The detail charts, shown below, appear when a particular window is selected. They provide more information about the threshold calculation for the given window. The top chart overlays the data from all the relevant hours, including the window’s hour (in black), the included context windows (in green), and the excluded context windows (in red). The bottom chart shows the same data, but laid out side by side. Hovering over a window in the bottom chart also highlights that context window’s anomaly score in the top chart. The top chart can also display its data as CDFs; sorting each window’s data this way makes it easy to see, for example, what percentage of a given window’s data fell below the bottom threshold.
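The CDF view reduces to a rank computation over each window’s sorted samples. A minimal sketch, with illustrative function names (the tool’s internals are not published):

```python
def empirical_cdf(samples):
    # Sort the window's samples; each point's CDF value is its rank
    # divided by the sample count.
    s = sorted(samples)
    n = len(s)
    return [(x, (i + 1) / n) for i, x in enumerate(s)]

def fraction_below(samples, threshold):
    # The quantity the CDF view makes easy to read off: what share
    # of a window's data fell below a given threshold.
    return sum(x < threshold for x in samples) / len(samples)
```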
So far, we’ve been talking about hours and days, but these are only the default values of settings that you can change. You can control how long a window is and how often one begins, what the context for each window should be, and even how anomalous a window has to be before it is excluded. Those settings are not available in the online demo version, however.
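Those knobs might look something like the following. The field names and defaults are hypothetical, not the tool’s actual API; they mirror only what the post describes (hour-long windows compared to the same hour on previous days).

```python
from dataclasses import dataclass
from datetime import timedelta

@dataclass
class ThresholdSettings:
    # All names and defaults below are illustrative assumptions.
    window_length: timedelta = timedelta(hours=1)   # how long a window is
    window_stride: timedelta = timedelta(hours=1)   # how often one begins
    context_stride: timedelta = timedelta(days=1)   # same hour, previous days
    context_size: int = 7                           # historical windows compared
    exclusion_score: float = 1.5                    # anomaly cutoff for exclusion
```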
One other feature, useful for many modern applications but not obvious in the demo, is that everything updates itself whenever new data arrives. In other words, it is streaming-enabled, and can be used as a dashboard to monitor the ongoing status of a metric.
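Conceptually, a streaming consumer like this appends each newly arrived window to a rolling history for its time slot and recomputes. The sketch below is our own simplification with invented names; it omits the anomaly-exclusion step described earlier.

```python
from collections import deque

def stream_update(history, new_window, max_windows=7):
    # `history` is a deque of recent windows for one time slot.
    history.append(new_window)
    if len(history) > max_windows:
        history.popleft()        # age out the oldest window
    # Recompute thresholds from everything retained
    # (no anomaly exclusion in this simplified sketch).
    flat = [x for w in history for x in w]
    return min(flat), max(flat)
```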
Try it out
You can try it out online. We’ve included five public datasets that you can play around with, including the search volume for the AOL search dataset, two datasets of electric power consumption, and two datasets of network traffic. If you want to use this tool with your own data, please contact us about pricing and terms. Email email@example.com. In fact, please let us know what you think about the tool in general, even if you aren’t interested in buying the full version.