ADR-0013: Autodiscovery v2


Status:	DRAFT
Date:	2022-09-01
Author(s):	Max Maass max.maass@iteratec.com

Context

The autodiscovery feature allows automatically discovering services and container images inside a Kubernetes cluster, to automatically trigger scans for them. At the moment, the functionality is limited to triggering a single scan per discovered service / container. We want to build it out to make it more flexible. The first step is to allow more than one scan to be triggered based on a detected service or container. A bonus objective would be to support non-HTTP services. There are different options for how this can be achieved.

Option 1: Base Scan creation on the Cascading Scans functionality

In this scenario, the operator would emit events for detected resources, which would then be handled by the cascading scan functionality.

Advantages

High flexibility: The emitted event could contain the service name, open ports, and any other pertinent metadata, and the scheduled scan rules could decide if they should trigger any scans
Reuse existing functionality instead of a custom solution with its own configuration syntax.
Could support non-HTTP services

Drawbacks

Due to the way the system is built at the moment, such events would be emitted very frequently, as the autodiscovery operator isn't stateful, so it doesn't know if a service has changed. This would thus require implementing one of two options
- Either we port the cascading rule handling to Go code, execute the cascading rules every time, and then check if the relevant ScheduledScans have been created (which would require a complete rewrite of the scheduledScan mechanism)
- Or we create a stub scantype that can be set up as a scheduled scan, the same way the autodiscovery currently does this. The stub scan would do nothing except emit a finding and shut down. This would remove the requirement for a cascading scan rewrite, but would be a "hacky" solution to the problem.
Large change that is not backwards-compatible - requires rewriting existing configurations as cascading rules
Introduces a new concept of flow control via events, which is different from the previous flow control and makes things harder to understand and debug.

Option 2: Specify List of Scans in Autodiscovery Scan Configuration Format

At the moment, the autodiscovery scan configuration expects only a single scan to be specified. We could switch this to specifying a list of scans instead, which would allow triggering more than one scan using the autodiscovery.

Current syntax:

    # ...
    serviceAutoDiscovery:
      passiveReconcileInterval: 120s
      scanConfig:
        scanType: zap-advanced-scan
        parameters:
          - "--target"
          - "{{ .Host.Type }}://{{ .Service.Name }}.{{ .Service.Namespace }}.svc:{{ .Host.Port }}"
          - "--context"
          - "{{ default `scb-bap-context` (index .Service.Annotations `auto-discovery.securecodebox.io/zap-advanced-context`) }}"

Proposed new syntax (as an example, naming may vary):

    # ...
    serviceAutoDiscovery:
      passiveReconcileInterval: 120s
      scanConfigs:
        - scanType: zap-advanced-scan
          parameters:
            - "--target"
            - "{{ .Host.Type }}://{{ .Service.Name }}.{{ .Service.Namespace }}.svc:{{ .Host.Port }}"
            - "--context"
            - "{{ default `scb-bap-context` (index .Service.Annotations `auto-discovery.securecodebox.io/zap-advanced-context`) }}"
        - scanType: nmap
          parameters:
            - "{{ .Service.Name }}.{{ .Service.Namespace }}.svc"

Advantages

The change is fairly small, and, while breaking, would not require a lot of work for existing users of the Autodiscovery functionality to migrate (basically only adding a single - and some indenting in the config file)
It is thus a low-effort change that would allow triggering multiple scans without resorting to hacks like triggering an initial nmap from the autodiscovery and then cascading from that.

Drawbacks

If we only switch the format and nothing else, this would not allow us to support non-HTTP ports / differentiate which scans to trigger based on the open ports.

Option 3: Also Allow Filtering based on Port in the Autodiscovery Scan Configuration

This option builds on top of option 2, but adds additional fields to the scan specification for the service autodiscovery. These additional fields would allow specifying which ports to match with a specific scan (based on port number and name). That way, we could support non-HTTP services (e.g., SSH, or others, if we ever add support for scanners that connect directly to other protocols, like database engines).

Old Syntax:

    # ...
    serviceAutoDiscovery:
      passiveReconcileInterval: 120s
      scanConfig:
        scanType: zap-advanced-scan
        parameters:
          - "--target"
          - "{{ .Host.Type }}://{{ .Service.Name }}.{{ .Service.Namespace }}.svc:{{ .Host.Port }}"
          - "--context"
          - "{{ default `scb-bap-context` (index .Service.Annotations `auto-discovery.securecodebox.io/zap-advanced-context`) }}"

Proposed new syntax (as an example, naming may vary):

    # ...
    serviceAutoDiscovery:
      passiveReconcileInterval: 120s
      scanConfigs:
        - scanType: zap-advanced-scan
          parameters:
            - "--target"
            - "{{ .Host.Type }}://{{ .Service.Name }}.{{ .Service.Namespace }}.svc:{{ .Host.Port }}"
            - "--context"
            - "{{ default `scb-bap-context` (index .Service.Annotations `auto-discovery.securecodebox.io/zap-advanced-context`) }}"
          matches:
            anyOf:
              # Syntax below here subject to change - may want to support lists of ports, RegEx matching for the port name, etc.
              - Port: 80
              - PortName: "http"

Open question: would something like that also make sense for the container autodiscovery?

Advantages

All advantages of option 2
Non-HTTP services can be supported (they are currently ignored, with no way to override that behavior)
Approaches the flexibility of Option 1, without the large overhead in implementation
If matching is built in a sufficiently modular way, it could also be used without large changes in future autodiscovery implementations for other kinds of resources (AWS autodiscovery?)

Disadvantages

More complex solution, requires more changes to the operator
Less backwards compatibility (although you could set the default behavior to the old list of ports and service names, if none are specified)

ADR-0013: Autodiscovery v2

Context

Option 1: Base Scan creation on the Cascading Scans functionality

Advantages

Drawbacks

Option 2: Specify List of Scans in Autodiscovery Scan Configuration Format

Advantages

Drawbacks

Option 3: Also Allow Filtering based on Port in the Autodiscovery Scan Configuration

Advantages

Disadvantages

Decision

Consequences

ADR-0013: Autodiscovery v2

Context​

Option 1: Base Scan creation on the Cascading Scans functionality​

Advantages​

Drawbacks​

Option 2: Specify List of Scans in Autodiscovery Scan Configuration Format​

Advantages​

Drawbacks​

Option 3: Also Allow Filtering based on Port in the Autodiscovery Scan Configuration​

Advantages​

Disadvantages​

Decision​

Consequences​

Context

Option 1: Base Scan creation on the Cascading Scans functionality

Advantages

Drawbacks

Option 2: Specify List of Scans in Autodiscovery Scan Configuration Format

Advantages

Drawbacks

Option 3: Also Allow Filtering based on Port in the Autodiscovery Scan Configuration

Advantages

Disadvantages

Decision

Consequences