Permission denied on sqlite database when use TLS #114

Open
slefol opened this issue Oct 13, 2023 · 11 comments
Labels: needs/kind (Kind label required), needs/triage (Needs triage)

slefol commented Oct 13, 2023

Hi,
I have noticed an issue with the dashboard when tls.enabled is set to true.

Environment

Helm chart : crowdsec
Helm chart version : 0.9.9

$ helm install \
    crowdsec crowdsec/crowdsec \
    --create-namespace \
    --namespace crowdsec \
    -f crowdsec-values.yaml

crowdsec-values.yaml:

container_runtime: containerd
tls:
  enabled: true
  bouncer:
    reflector:
      namespaces: ["traefik"]
agent:
  tolerations:
    - key: node-role.kubernetes.io/control-plane
      operator: Equal
      effect: NoSchedule
  # Specify each pod whose logs you want to process
  acquisition:
    # The namespace where the pod is located
    - namespace: traefik
      # The pod name
      podName: traefik-*
      # as in crowdsec configuration, we need to specify the program name to find a matching parser
      program: traefik
  env:
    - name: PARSERS
      value: "crowdsecurity/cri-logs"
    - name: COLLECTIONS
      value: "crowdsecurity/traefik"
    # When testing, allow bans on private networks
    - name: DISABLE_PARSERS
      value: "crowdsecurity/whitelists"
  persistentVolume:
    config:
      enabled: false
lapi:
  dashboard:
    enabled: true
    ingress:
      host: dashboard.local
      enabled: true
  persistentVolume:
    config:
      enabled: false
  env:
    # For an internal test, disable the Online API
    - name: DISABLE_ONLINE_API
      value: "true"
Issue

In dashboard > Browse data > Crowdsec > Alerts
[SQLITE_CANTOPEN] Unable to open the database file (unable to open database file)

Investigation

In Admin Settings > Databases > Crowdsec > Save changes
/metabase-data/crowdsec.db (Permission denied)

The dashboard runs as the metabase user, which has no read permission on the database file.

$ kubectl -n crowdsec exec -it crowdsec-lapi-7c79988958-q89ln -c dashboard -- sh
/ # ps faux
PID   USER     TIME  COMMAND
    1 metabase  1:30 java -XX:+IgnoreUnrecognizedVMOptions -Dfile.encoding=UTF-8 -Dlogfile.path=target/log -XX:+CrashOnOutOfMemoryError -server -jar /app/metabase.jar

/ # readlink /metabase-data/crowdsec.db
/var/lib/crowdsec/data/crowdsec.db

/ # ls -lh /var/lib/crowdsec/data/crowdsec.db
-rw-r-----    1 root     root       84.0K Oct 13 07:19 /var/lib/crowdsec/data/crowdsec.db

Changing the group ownership of the database file fixes the issue:

/ # chown :metabase /var/lib/crowdsec/data/crowdsec.db
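
The check behind this diagnosis can be sketched as a small helper: a process in a given group can read the file only if the file belongs to that group and the group-read bit is set. This is a minimal sketch, not part of the chart; `gid_can_read` is a hypothetical name, and the demo uses a throwaway file rather than the real database so it runs without root.

```shell
# Succeed only if FILE is owned by group GID and has the group-read bit set.
# Parsing `ls -ln` (numeric ids) avoids depending on a specific stat(1) flavour.
gid_can_read() {
    ls -ln "$1" | awk -v want="$2" '{ exit !($4 == want && substr($1, 5, 1) == "r") }'
}

# Demo on a temporary file with the same mode (640) as the database in the report.
demo_dir=$(mktemp -d)
demo_db="$demo_dir/crowdsec.db"
: > "$demo_db"
chmod 640 "$demo_db"

own_gid=$(ls -ln "$demo_db" | awk '{ print $4 }')   # group that actually owns the file
other_gid=$((own_gid + 1))                           # any other group, like metabase vs root

if gid_can_read "$demo_db" "$own_gid"; then owner_result=readable; else owner_result=denied; fi
if gid_can_read "$demo_db" "$other_gid"; then other_result=readable; else other_result=denied; fi

echo "owning group: $owner_result"
echo "other group:  $other_result"
rm -rf "$demo_dir"
```

Inside the dashboard container the same helper could be pointed at /var/lib/crowdsec/data/crowdsec.db with the metabase group ID.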

slefol commented Jul 12, 2024

Hi,
I installed Helm chart version 0.11.0 and the issue is still present.
Can you please look into this issue?

github-actions bot added the needs/triage (Needs triage) and needs/kind (Kind label required) labels Jul 12, 2024

LaurenceJJones commented Jul 16, 2024

Hey 👋🏻

I don't think it has anything to do with TLS. By default the database is owned by root:root and the permissions are updated each time. Could you try defining a GID variable of 2000 in the lapi environment? That is the metabase group ID inside the container.

lapi:
  # -- replicas for local API
  replicas: 1
  # -- environment variables from crowdsecurity/crowdsec docker image
  env: []
    # by default disable the agent because it only needs the local API.
    #- name: DISABLE_AGENT
    #  value: "true"
  # Allows you to load environment variables from kubernetes secret or config map
  envFrom: []
  # - secretRef:
  #     name: env-secret
  # - configMapRef:
  #     name: config-map
  # -- Enable ingress lapi object

So, for example:

 lapi: 
   # -- replicas for local API 
   replicas: 1 
   # -- environment variables from crowdsecurity/crowdsec docker image 
   env:
     - name: GID
       value: "2000"
     # by default disable the agent because it only needs the local API. 
     #- name: DISABLE_AGENT 
     #  value: "true" 
   # Allows you to load environment variables from kubernetes secret or config map 
   envFrom: [] 
     # - secretRef: 
     #     name: env-secret 

https://github.com/crowdsecurity/crowdsec/blob/c4bfdf19914a88671663f8caae5a5ea849c1b3a6/docker/docker_start.sh#L334-L341

/ # cat /etc/group
root:x:0:root
bin:x:1:root,bin,daemon
daemon:x:2:root,bin,daemon
sys:x:3:root,bin,adm
adm:x:4:root,adm,daemon
tty:x:5:
disk:x:6:root,adm
lp:x:7:lp
mem:x:8:
kmem:x:9:
wheel:x:10:root
floppy:x:11:root
mail:x:12:mail
news:x:13:news
uucp:x:14:uucp
man:x:15:man
cron:x:16:cron
console:x:17:
audio:x:18:
cdrom:x:19:
dialout:x:20:root
ftp:x:21:
sshd:x:22:
input:x:23:
at:x:25:at
tape:x:26:root
video:x:27:root
netdev:x:28:
readproc:x:30:
squid:x:31:squid
xfs:x:33:xfs
kvm:x:34:kvm
games:x:35:
shadow:x:42:
cdrw:x:80:
www-data:x:82:
usb:x:85:
vpopmail:x:89:
users:x:100:games
ntp:x:123:
nofiles:x:200:
smmsp:x:209:smmsp
locate:x:245:
abuild:x:300:
utmp:x:406:
ping:x:999:
nogroup:x:65533:
nobody:x:65534:
metabase:x:2000:metabase
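
Rather than scanning the whole listing, the group ID can be pulled out directly. The following is a minimal sketch: `group_gid` is a hypothetical helper, and the sample file holds only three of the lines shown above so the snippet runs anywhere. Inside the container the second argument would be /etc/group.

```shell
# Print the numeric GID of a named group from a group(5)-format file.
group_gid() {
    awk -F: -v name="$1" '$1 == name { print $3; exit }' "$2"
}

# Sample data: a few lines copied from the listing above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:root
nobody:x:65534:
metabase:x:2000:metabase
EOF

metabase_gid=$(group_gid metabase "$sample")
root_gid=$(group_gid root "$sample")
echo "metabase gid: $metabase_gid"
echo "root gid:     $root_gid"
rm -f "$sample"
```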


slefol commented Jul 17, 2024

@LaurenceJJones Thank you for your interest in my request.

I defined a GID variable of 2000 in the lapi environment, but I get this error:
ComparisonError: error calculating structured merge diff: error building typed value from config resource: .spec.template.spec.containers[name="crowdsec-lapi"].env: duplicate entries for key [name="GID"]
(The Helm chart is deployed by ArgoCD.)

Indeed, the variable is already defined in the template:
https://github.com/crowdsecurity/helm-charts/blob/main/charts/crowdsec/templates/lapi-deployment.yaml#L87-L90
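
The error comes from two env entries sharing the name GID: one from the chart template and one from values.yaml. A quick way to spot such a collision before deploying is to list the env names of the rendered container and look for repeats. This is a minimal sketch; `duplicate_names` is a hypothetical helper, and the name list is illustrative rather than the chart's actual rendered output.

```shell
# Print every name that occurs more than once in a newline-separated list on stdin.
duplicate_names() {
    sort | uniq -d
}

# Illustrative merged list: a template-defined GID, then the user-supplied entries.
merged_env='GID
DISABLE_ONLINE_API
GID'

dupes=$(printf '%s\n' "$merged_env" | duplicate_names)
echo "duplicate env names: $dupes"
```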

LaurenceJJones commented

Can you share your full values.yaml please?

LaurenceJJones commented

Because if you are using the official metabase image, it should use the same MGID:

          - name: MGID
            value: "1000"


slefol commented Jul 17, 2024

My values.yaml:

  container_runtime: containerd
  tls:
    enabled: true
    bouncer:
      reflector:
        namespaces: ["traefik"]
  agent:
    # Specify each pod whose logs we want to process
    acquisition:
      # The namespace where the pod is located
      - namespace: traefik
        # The pod name
        podName: traefik-*
        # as in crowdsec configuration, we need to specify the program name to find a matching parser
        program: traefik
    # Those are ENV variables
    env:
      - name: PARSERS
        value: "crowdsecurity/cri-logs"
      - name: COLLECTIONS
        value: "crowdsecurity/traefik"
      - name: DISABLE_PARSERS
        value: "crowdsecurity/whitelists"
    persistentVolume:
      config:
        enabled: false
  lapi:
    dashboard:
      enabled: true
      ingress:
        host: dashboard.local
        enabled: false
    persistentVolume:
      config:
        enabled: false
    env:
      # If it's a test, we don't want to share signals with CrowdSec so disable the Online API.
      - name: DISABLE_ONLINE_API
        value: "true"
      - name: GID
        value: "2000"


LaurenceJJones commented Jul 17, 2024

You can remove the GID setting; I didn't know we set it for both containers. Once the LAPI has started, if you exec into the container, don't you see these permissions?

Defaulted container "crowdsec-lapi" out of: crowdsec-lapi, dashboard, fetch-metabase-config (init)
# ls -la /var/lib/crowdsec/data 
total 104
drwxrwxrwx    3 root     root          4096 Jul 17 12:19 .
drwxr-xr-x    3 root     root          4096 Jun  5 14:15 ..
lrwxrwxrwx    1 root     root            48 Jul 17 12:18 GeoLite2-ASN.mmdb -> /staging/var/lib/crowdsec/data/GeoLite2-ASN.mmdb
lrwxrwxrwx    1 root     root            49 Jul 17 12:18 GeoLite2-City.mmdb -> /staging/var/lib/crowdsec/data/GeoLite2-City.mmdb
-rw-r-----    1 root     1000         94208 Jul 17 12:19 crowdsec.db
drwx------    2 root     root          4096 Jul 17 12:18 trace


LaurenceJJones commented Jul 17, 2024

Okay, I managed to reproduce it: enabling TLS does in fact prevent the permissions from being updated, which is really odd as nothing there depends on TLS. It must be a race condition where the database is not there yet while the chown runs.

/var/lib/crowdsec/data # ls -la
total 104
drwxrwxrwx    1 root     root           102 Jul 17 13:54 .
drwxr-xr-x    1 root     root             8 Jun  5 14:15 ..
lrwxrwxrwx    1 root     root            48 Jul 17 13:50 GeoLite2-ASN.mmdb -> /staging/var/lib/crowdsec/data/GeoLite2-ASN.mmdb
lrwxrwxrwx    1 root     root            49 Jul 17 13:50 GeoLite2-City.mmdb -> /staging/var/lib/crowdsec/data/GeoLite2-City.mmdb
-rw-r-----    1 root     root         94208 Jul 17 13:54 crowdsec.db
drwx------    1 root     root             0 Jul 17 13:50 trace

Running kubectl delete pods -n crowdsec crowdsec-lapi-bdc4d8cff-bgxd6 after init does get the permissions updated, but that's not a good solution 🤷🏻

Let me see if there is a way around it.


slefol commented Jul 17, 2024

No, the permissions are:

kubectl -n crowdsec exec -it crowdsec-lapi-98f7577d6-bv4tb -- sh
Defaulted container "crowdsec-lapi" out of: crowdsec-lapi, dashboard, fetch-metabase-config (init)
/ # ls -la /var/lib/crowdsec/data
total 100
drwxr-xr-x    3 root     root            89 Jul 17 13:33 .
drwxr-xr-x    3 root     root            18 Jun  5 14:15 ..
lrwxrwxrwx    1 root     root            48 Jul 17 12:58 GeoLite2-ASN.mmdb -> /staging/var/lib/crowdsec/data/GeoLite2-ASN.mmdb
lrwxrwxrwx    1 root     root            49 Jul 17 12:58 GeoLite2-City.mmdb -> /staging/var/lib/crowdsec/data/GeoLite2-City.mmdb
-rw-r-----    1 root     root        102400 Jul 17 13:33 crowdsec.db
drwx------    2 root     root             6 Jul 17 12:58 trace

And sorry to insist, but the problem does not occur when TLS is not enabled.

A new install with the same values.yaml, except tls.enabled: false:

kubectl -n crowdsec exec -it crowdsec-lapi-944958666-jrw45 -- sh
Defaulted container "crowdsec-lapi" out of: crowdsec-lapi, dashboard, fetch-metabase-config (init)
/ # ls -la /var/lib/crowdsec/data
total 70580
drwxr-xr-x    2 root     root            76 Jul 17 13:48 .
drwxr-xr-x    3 root     root            18 Apr 18 13:37 ..
-rw-------    1 root     root       8404553 Jul 17 13:47 GeoLite2-ASN.mmdb
-rw-------    1 root     root      63771586 Jul 17 13:47 GeoLite2-City.mmdb
-rw-r-----    1 root     1000         94208 Jul 17 13:48 crowdsec.db
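
The difference between the two listings is only the group column of crowdsec.db. A minimal sketch for extracting it, using the two lines from the listings above as input; `db_group` is a hypothetical helper name. The value 1000 appears to match the MGID quoted earlier in the thread.

```shell
# Print the group column of the crowdsec.db entry from `ls -la` output on stdin.
db_group() {
    awk '$NF == "crowdsec.db" { print $4 }'
}

# The crowdsec.db lines from the two listings above.
tls_on_line='-rw-r-----    1 root     root        102400 Jul 17 13:33 crowdsec.db'
tls_off_line='-rw-r-----    1 root     1000         94208 Jul 17 13:48 crowdsec.db'

group_tls_on=$(printf '%s\n' "$tls_on_line" | db_group)
group_tls_off=$(printf '%s\n' "$tls_off_line" | db_group)
echo "tls.enabled: true  -> group $group_tls_on"
echo "tls.enabled: false -> group $group_tls_off"
```

Against a live pod, the output of `ls -la /var/lib/crowdsec/data` could be piped into the same function.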

LaurenceJJones commented

Okay, we tracked down the issue, and it's not a race condition. When using TLS we don't need to add the machines to the database, since authenticating with mTLS automatically creates the database entry. Because we don't interact with the database, there is no database file while the chown command runs. (Deleting the pod works because the file still exists by then.) So for now we will update crowdsec to run cscli machines list and redirect the output to /dev/null. This is a hack, but it would fix the issue; there is no real way to fix it otherwise, since it's true that there should be no DB at that point.

So you were right about the TLS part; it was just a mess to find exactly where the problem was. We will merge and fix this for 1.6.3, as it's a very minor change.
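
The ordering problem described in this comment can be illustrated with a minimal sketch. It is not the actual startup script: `update_db_group` is a hypothetical stand-in for the chown step, an empty file stands in for the database that `cscli machines list` would cause to be created, and the current user's own group is used so the snippet runs without root.

```shell
# Stand-in for the startup step that hands the database to another group.
# Returns non-zero when the file is missing, which is the failure mode described above.
update_db_group() {
    [ -e "$1" ] || return 1
    chgrp "$2" "$1"
}

work_dir=$(mktemp -d)
db_path="$work_dir/crowdsec.db"
target_gid=$(id -g)   # assumption: the caller's own group, so no root is needed

# 1. TLS-style start: nothing has touched the database yet, so the file is absent.
if update_db_group "$db_path" "$target_gid"; then before=updated; else before=missing; fi

# 2. Workaround: make sure the database exists before the ownership step.
#    In the thread this is `cscli machines list > /dev/null`; here an empty file stands in.
: > "$db_path"
if update_db_group "$db_path" "$target_gid"; then after=updated; else after=missing; fi

echo "before workaround: $before"
echo "after workaround:  $after"
rm -rf "$work_dir"
```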

LaurenceJJones added this to the 1.6.3 milestone Jul 17, 2024