Tightening Pod/Container Security for All Workloads #100

Open
alix-graylog wants to merge 2 commits into main from security-contexts-2

alix-graylog commented Jun 26, 2026

Collaborator

Summary

Locks down the default pod and container security contexts as far as the application currently allows, and exposes the configuration in values.yaml.

Details

  • Updated the security contexts for the Graylog application and its init containers and moved them into user-configurable values
  • Updated the security contexts for the Graylog datanode and its init containers and moved them into user-configurable values
  • Updated the GeoIP security contexts

What this does

Moves security contexts out of hardcoded template values and into values.yaml so they're configurable, and tightens the defaults for both Graylog and Datanode.

  • podSecurityContext and containerSecurityContext are now values-driven for both workloads
  • Graylog init containers get their own initContainerSecurityContext — they don't need NET_BIND_SERVICE since they never start the JVM
  • copy-plugin-* init containers get readOnlyRootFilesystem: true (they only write to an emptyDir)
  • Datanode gets a data-chown init container (busybox) that handles ownership setup on the PVC before the main container starts. The image is configurable via datanode.initContainerImage for private registry support
  • seccompProfile: RuntimeDefault applied across all containers
  • allowPrivilegeEscalation: false and capabilities: drop: ALL on everything
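
As a rough illustration of the values-driven layout described in the list above, a values.yaml fragment might look like the following. The key names come from this PR description and the reviewed snippets; the exact defaults shipped in the chart may differ, and the registry path and tag shown for the init container image are hypothetical.

```yaml
# Sketch only: key names follow this PR's description; values are assumptions.
graylog:
  podSecurityContext:
    seccompProfile:
      type: RuntimeDefault
  # Init containers never start the JVM, so NET_BIND_SERVICE is not added here.
  initContainerSecurityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop:
        - ALL

datanode:
  # Image used by the data-chown init container.
  initContainerImage:
    repository: registry.example.com/library/busybox  # hypothetical private registry path
    tag: "1.36"                                        # hypothetical pinned tag
    imagePullPolicy: IfNotPresent
```

Pinning the init container image to a specific tag in a private registry is one way to use the datanode.initContainerImage setting in restricted environments.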

Why some things couldn't be locked down further

Graylog — NET_BIND_SERVICE has to stay. The JDK binary in the image has cap_net_bind_service=ep set as a file capability. With no_new_privs=1 (which allowPrivilegeEscalation: false sets), executing a binary that would gain capabilities not already in the process's permitted set fails with EPERM. Pre-adding NET_BIND_SERVICE to the container's permitted set means the file capability doesn't grant anything new, so the exec works.
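
A minimal sketch of what the resulting Graylog main-container context could look like, assuming the containerSecurityContext key described in this PR. The actual default in the chart may carry additional fields.

```yaml
# Sketch only: mirrors the reasoning above, not necessarily the chart's exact default.
graylog:
  containerSecurityContext:
    allowPrivilegeEscalation: false   # this is what sets no_new_privs=1
    capabilities:
      drop:
        - ALL
      add:
        - NET_BIND_SERVICE            # kept so exec of the JDK binary with its file capability succeeds
```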

Datanode has to start as root. This is common with OpenSearch in Kubernetes. The entrypoint unconditionally runs chown -R graylog:graylog /var/lib/graylog-datanode and then exec setpriv --reuid=graylog --regid=graylog --init-groups, which requires CHOWN, DAC_OVERRIDE, SETUID, and SETGID. We tried moving the chown to the init container to strip those capabilities from the main container, but the entrypoint still runs its own chown -R on startup, and root without CAP_CHOWN can't chown files owned by uid 999. We also tried setting runAsUser: 999 directly, but the container runtime clears the effective capability set when it switches from root to non-root, so SETGID ends up in the bounding set but not in the effective set, and setpriv --init-groups still fails. After setpriv runs, the process has uid 999 and no capabilities, so the running application is unprivileged; only the startup window needs the extra capabilities.
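
The capability set this implies for the Datanode main container can be sketched as follows. The four added capabilities are the ones named above; the explicit runAsUser: 0 is an assumption for illustration, since the chart may instead rely on the image's default user.

```yaml
# Sketch only: capability list taken from the explanation above.
datanode:
  containerSecurityContext:
    runAsUser: 0                      # assumption: start as root so the entrypoint can chown and drop privileges
    allowPrivilegeEscalation: false
    capabilities:
      drop:
        - ALL
      add:
        - CHOWN                       # entrypoint runs chown -R on the data directory
        - DAC_OVERRIDE
        - SETUID                      # setpriv --reuid=graylog
        - SETGID                      # setpriv --regid=graylog --init-groups
```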

readOnlyRootFilesystem is not set on the main containers. Both the Graylog JVM and the OpenSearch-based datanode write to the container filesystem at runtime, so enabling it safely would need a full audit plus emptyDir mounts for the writable paths.

Opting out

If the security contexts cause issues in your environment, each field is fully overridable. To disable them entirely:

graylog:
  podSecurityContext: {}
  initContainerSecurityContext: {}
  containerSecurityContext: {}

datanode:
  podSecurityContext: {}
  initContainerSecurityContext: {}
  containerSecurityContext: {}
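
For a narrower change than a full opt-out, a single field can be overridden on its own. The sketch below assumes seccompProfile sits under podSecurityContext, as the reviewed values.yaml snippet suggests. Helm merges user-supplied maps into chart defaults key by key, so the sibling keys should keep their default values.

```yaml
# Sketch only: overrides one field and leaves the other defaults in place.
graylog:
  podSecurityContext:
    seccompProfile:
      type: Unconfined   # example override for clusters where RuntimeDefault causes problems
```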

Linked issues

PR Checklist

Please check the items that apply to your change.

  • Tests added/updated
  • Documentation updated
  • This PR includes a new feature
  • This PR includes a bugfix
  • This PR includes a refactor

Testing Checklist

Static Validation

  • Linter check passes: helm lint ./charts/graylog
  • Helm renders the local template successfully: helm template graylog ./charts/graylog --validate

Installation

  • Fresh installation completes successfully: helm install graylog ./charts/graylog
  • All pods reach Running state: kubectl rollout status statefulset/graylog
  • Helm tests pass: helm test graylog

Functional (if applicable)

  • Web UI accessible and login works
  • DataNodes visible in System > Cluster Configuration
  • Inputs can be created and receive data

Upgrade (if applicable)

  • Upgrade from previous release succeeds
  • Scaling up/down works correctly
  • Configuration changes apply correctly

Specific to this PR

  • Datanode workloads started successfully
  • Graylog application workloads started successfully
  • GeoIP workloads started successfully

Notes for reviewers

  • Verify all applicable tests above pass
  • Validate that the linked issues are no longer reproducible, if applicable
  • Sync up with the author before merging
  • The commit history should be preserved - use rebase-merge or standard merge options when applicable

alix-graylog marked this pull request as ready for review June 26, 2026 20:37

Copilot AI left a comment

Pull request overview

This PR tightens the default Pod/Container security contexts for Graylog, DataNode, and GeoIP update jobs, and makes those security settings configurable via values.yaml so operators can adjust or opt out as needed.

Changes:

  • Added values-driven podSecurityContext, initContainerSecurityContext, and containerSecurityContext defaults for Graylog and DataNode.
  • Updated StatefulSet templates to render the new values-driven security contexts, and added a DataNode data-chown init container.
  • Updated GeoIP CronJob/Job rendering to accept Pod/Container security contexts via the shared helper.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| charts/graylog/values.yaml | Introduces configurable default security contexts (and initContainer image config) for Graylog and DataNode. |
| charts/graylog/templates/workload/statefulsets/graylog.yaml | Renders new values-driven pod/container/init-container security contexts for Graylog workload and plugin-copy init containers. |
| charts/graylog/templates/workload/statefulsets/datanode.yaml | Adds values-driven pod/container/init-container security contexts plus a data-chown init container for PVC preparation. |
| charts/graylog/templates/workload/cronjobs/geoip.yaml | Passes security contexts into the GeoIP JobSpec helper. |
| charts/graylog/templates/_helpers.tpl | Extends GeoIP JobSpec helper to optionally render pod/container security contexts. |

Comment on lines +98 to +101
{{- with merge (dict "readOnlyRootFilesystem" true) $.Values.graylog.initContainerSecurityContext }}
securityContext:
{{- toYaml . | nindent 12 }}
{{- end }}

Comment on lines +67 to +71
mkdir -p \
  {{ $dataMountPath }}/opensearch/config \
  {{ $dataMountPath }}/opensearch/data \
  {{ $dataMountPath }}/opensearch/logs
chown -R 999:999 {{ $dataMountPath }}

Comment on lines +292 to +295
initContainerImage:
  repository: "busybox"
  tag: "latest"
  imagePullPolicy: IfNotPresent

Comment on lines +184 to +187
fsGroup: 1100
seccompProfile:
  type: RuntimeDefault
initContainerSecurityContext: