How to Set Up open-appsec for the Best Threat Prevention Results from the Contextual Machine Learning Engine

Introduction

open-appsec WAF is based on a machine learning engine, which allows it to prevent both known and unknown attacks without requiring any signatures. Traditional WAF solutions, by contrast, are built on static signatures and are therefore unable by design to prevent zero-day attacks.

open-appsec’s “contextual machine learning engine” uses two machine learning models:

A "Supervised Model" that was trained offline based on millions of requests, both malicious and benign.
An "Unsupervised Model" that is being built in real-time in the protected environment. This model continuously learns based on the actual traffic in the protected environment.

In this blog, we will look at a few best practices for configuring open-appsec WAF so that the contextual machine learning engine delivers the best threat prevention results and the lowest false positive rate for the ingress traffic to your web applications and web APIs.

Top 5 best practices to ensure the best possible open-appsec contextual ML engine results

Let’s have a look at the important configurations and processes that allow open-appsec WAF to prevent threats in the best possible way while significantly reducing false positives.

Even though open-appsec WAF already provides strong protection out-of-the-box, these easy steps are important for reaching ideal results and effectiveness.

All of this configuration is easily done via the central Web UI, but it can also be done from the CLI using the open-appsec-tuning-tool if you are running a local, standalone deployment of open-appsec.

1. Switch from the basic to the advanced machine learning model

open-appsec “Community Edition” users:

When using the open-appsec WAF “Community Edition”, the “Basic Model” is deployed by default for the contextual machine learning engine. It is provided as part of the GitHub repository and default installations. This model is only recommended for use in monitor-only and test environments.

Instead, for the best possible results, make sure to deploy the "Advanced Model", which can be downloaded from the open-appsec Web UI portal and applied to installations of open-appsec “Community Edition”. This model is more accurate and therefore recommended for production use. The download includes the latest installation instructions for all supported platforms: Docker, Kubernetes and Linux embedded.

In the open-appsec Web UI you can find the download here:

For more details on the “Advanced ML model” see the relevant docs:

Using the Advanced Machine Learning Model

open-appsec “Premium Edition” users:

As a subscriber to the open-appsec “Premium Edition”, you already have the “Advanced Model” included by default once you deploy or upgrade to the “Premium Edition”.

No additional steps are required here.

2. Define separate assets for your protected web applications and web APIs

In open-appsec, your protected web applications and web APIs are represented as “Assets” in the Web UI (or specific rules when using local, declarative management).

Instead of using only one default “wildcard” asset with e.g. https://*:* in the configuration for the web application’s/web API’s URL, make sure to configure individual assets for your different web applications and web APIs if their typical usage and traffic patterns differ significantly.

This provides you with the following benefits:

Granular configuration options per asset (e.g. enforcement mode, minimum confidence level for prevention, separate logging configuration, and much more).
Separate learning of the online machine learning model per asset for better accuracy, including separate configuration suggestions based on the current learning progress.
Separate tuning suggestions, specific to each configured asset, which users can review and approve or reject.
Separate log view per configured asset.
Option to configure custom rules and exceptions for each asset separately if required.

For example, when you have an API exposed at www.example.com/api, make sure to create a separate asset for it to improve learning and threat prevention results. You could then apply a separate rate limiting setting for that specific API, or, if you are subscribed to open-appsec “Premium Edition”, upload an OpenAPI schema to enforce the API structure in traffic reaching this asset.

If you are using declarative configuration for local management of open-appsec WAF, see the following documentation. In the local declarative configuration file, assets are configured as “specific rules”:
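
To illustrate why a dedicated asset for www.example.com/api is useful, here is a minimal Python sketch of "most specific asset wins" matching. It is not open-appsec's actual matching logic or configuration schema; the asset names, the pattern syntax and the `match_asset` helper are all hypothetical and only show how traffic for the API could be separated from a wildcard asset.

```python
from fnmatch import fnmatchcase

# Hypothetical asset table (NOT open-appsec's real schema): name -> URL patterns.
ASSETS = {
    "default-wildcard": ["*"],
    "example-web": ["www.example.com/*"],
    "example-api": ["www.example.com/api", "www.example.com/api/*"],
}


def match_asset(host_and_path: str) -> str:
    """Return the asset whose matching pattern is the longest (most specific)."""
    best_name, best_len = "", -1
    for name, patterns in ASSETS.items():
        for pattern in patterns:
            # fnmatchcase treats '*' as "any characters", including '/'.
            if fnmatchcase(host_and_path, pattern) and len(pattern) > best_len:
                best_name, best_len = name, len(pattern)
    return best_name


if __name__ == "__main__":
    for url in ("www.example.com/api/v1/users", "www.example.com/index.html"):
        print(url, "->", match_asset(url))
```

With a split like this, API requests and regular page requests end up in different buckets, which is the precondition for per-asset learning, logging and tuning described in the list above.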

Configuration Using Local Policy File (Linux)

Local Policy File (Advanced)

For more details on “Additional Assets” configuration in the WebUI see the following docs:

Protect Additional Assets

3. Configure the correct source identity for your protected assets

Make sure to select the correct source identifier for each of your protected assets in the asset configuration:

E.g., if another proxy in front of your open-appsec WAF deployment hides the original source IP and adds it to the X-Forwarded-For (XFF) header, then select the “X-Forwarded-For Header in HTTP Requests” option from the dropdown list.

In this specific case of using the XFF header as the source identifier, make sure to add the IP addresses of your previous proxy hops here:

Options range from simply using the source IP of the HTTP request or the XFF header to more advanced options that identify the user based on, e.g., a cookie’s content, a key in the header, or a JWT key.

Setting this correctly is important to allow the online ML model to properly learn and distinguish the behavior of different external users.
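
To show why the addresses of the previous proxy hops matter when XFF is the source identifier, here is a minimal Python sketch of a common approach: walk the XFF chain from the hop nearest to the WAF outward and take the first address that is not a known proxy. This is an illustration under assumptions, not open-appsec's implementation; `TRUSTED_PROXY_HOPS`, `source_from_xff` and the example addresses are hypothetical.

```python
from ipaddress import ip_address, ip_network

# Assumed example values: networks of the proxies that sit in front of the WAF.
TRUSTED_PROXY_HOPS = [ip_network("10.0.0.0/8"), ip_network("192.0.2.10/32")]


def is_trusted_hop(addr: str) -> bool:
    """True if addr belongs to one of the configured proxy networks."""
    ip = ip_address(addr)  # raises ValueError on malformed input
    return any(ip in net for net in TRUSTED_PROXY_HOPS)


def source_from_xff(xff_header: str, peer_ip: str) -> str:
    """Return the first address, seen from the WAF outward, that is not a known proxy."""
    chain = [part.strip() for part in xff_header.split(",") if part.strip()]
    chain.append(peer_ip)  # the directly connected peer is the nearest hop
    for addr in reversed(chain):
        if not is_trusted_hop(addr):
            return addr
    return chain[0]  # every hop was a known proxy; fall back to the left-most entry


if __name__ == "__main__":
    print(source_from_xff("203.0.113.7, 10.1.2.3", "192.0.2.10"))
```

Reading the chain from the right means that an entry a client prepends to the header itself is ignored, because the walk stops at the first address that is not one of your own proxies.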

For more details on “Source Identity” see the relevant docs:

Configure Contextual Machine Learning for Best Accuracy

In case you are using declarative configuration for local management of open-appsec WAF see the following docs to learn how to configure Source Identifiers declaratively:

Local Policy File (Advanced)

4. Define trusted sources

The following setting is critical for minimizing the possibility of false positives occurring:

Always specify at least 3 trusted sources in the configuration of each asset (see screenshot below). The online machine learning model will then consider traffic originating from these users very unlikely to be malicious, allowing it to learn these users’ traffic as benign.

Note that one or two of those trusted sources might still be the origin of some malicious traffic. To guard against this, open-appsec’s online ML model will only learn traffic as benign once it has been observed from at least 3 different sources (this value can also be configured to be higher if required, see screenshot below).
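
The threshold rule described above can be sketched in a few lines of Python. This is only a conceptual illustration, not the actual open-appsec learning algorithm: the `BenignLearner` class, the notion of a traffic "pattern" as a plain string, and the choice to count only configured trusted sources are assumptions made for this example.

```python
from collections import defaultdict

MIN_TRUSTED_SOURCES = 3  # the minimum described above; assumed configurable upward


class BenignLearner:
    """Hypothetical model: a pattern counts as benign only after enough distinct trusted sources sent it."""

    def __init__(self, trusted_sources, min_sources=MIN_TRUSTED_SOURCES):
        self.trusted = set(trusted_sources)
        self.min_sources = min_sources
        self._seen = defaultdict(set)  # pattern -> trusted sources that sent it

    def observe(self, source_id: str, pattern: str) -> None:
        # Traffic from sources that are not configured as trusted is ignored here.
        if source_id in self.trusted:
            self._seen[pattern].add(source_id)

    def is_learned_benign(self, pattern: str) -> bool:
        # Repeated requests from the same source only count once (set semantics).
        return len(self._seen.get(pattern, ())) >= self.min_sources


if __name__ == "__main__":
    learner = BenignLearner({"alice", "bob", "carol"})
    for source in ("alice", "bob", "alice", "mallory"):
        learner.observe(source, "POST /api/report")
    print(learner.is_learned_benign("POST /api/report"))
    learner.observe("carol", "POST /api/report")
    print(learner.is_learned_benign("POST /api/report"))
```

Because distinct sources are counted rather than requests, a single compromised trusted source cannot push a pattern over the threshold by repeating it.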

For more details on “Trusted Sources” see the relevant docs:

Configure Contextual Machine Learning for Best Accuracy

In case you are using declarative configuration for local management of open-appsec WAF see the following docs to learn how to configure Trusted Sources declaratively:

Local Policy File (Advanced)

5. Review and confirm/reject learning suggestions

The contextual machine learning model may ask you to review certain events, also called “Tuning Suggestions”. Providing feedback on these suggestions is not mandatory, as the engine is capable of learning by itself. However, doing so gives the machine learning engine human guidance, which allows it to reach a higher maturity level, and therefore better accuracy, faster.

Review potential tuning suggestions in the “Learn” tab within each configured asset in the WebUI.

Then provide feedback on the proposed Tuning Suggestions by selecting “Malicious” or “Benign”. To make an educated, secure decision, review the historic logs that led to each tuning suggestion being proposed (click on “View Logs”).

Once you do this, the “Tuning Suggestion” will become a “Tuning Decision” and be visible in the “Tuning Decisions” list.

For more details on “Tuning Suggestions” see the relevant docs:

Configure Contextual Machine Learning for Best Accuracy

In case you are using declarative configuration for local management of open-appsec WAF see the following docs about how to use the open-appsec-tuning-tool:

Track Learning and Local Tuning in Standalone Deployments

We hope you found the steps provided in this blog useful to get the best possible accuracy of the open-appsec contextual machine learning engine!

If you have any questions, feedback, or suggestions, feel free to contact us via the chat on our website or email us at info@openappsec.io.

To learn more about how open-appsec works, see this White Paper and the in-depth Video Tutorial. You can also experiment with deployment in the free Playground.
