[🐛 Bug]: Memory leak in grid components #1694

LonelyDaoist · 2022-10-07T12:57:33Z

What happened?

Context

We are running Selenium Grid in distributed mode on a docker swarm with 1 manager (8GB, 4CPUs), 2 workers (16GB, 8CPUs) and 2 workers (8GB, 4CPUs). We deployed the router, distributor, session map, session queue and event bus on the manager, while on the 4 workers we deployed 45 nodes (15 each for Firefox, Edge and Chrome).

Bug description

We usually run large test suites overnight, and ever since the switch to Grid 4 we've been facing huge performance issues. First, using hub and node, the hub often went down due to OOM errors. We also tried distributed mode on a single VM (16GB, 8CPUs), and similarly the router went down frequently. Hence we moved the grid to a docker swarm, just to slightly alleviate the huge memory consumption. We haven't deployed the current docker swarm to production yet, so it's still functioning fairly well, but from cAdvisor metrics we see that the router keeps accumulating memory without releasing it, even when no tests are running, so it is highly likely we will face the same issue eventually.

Command used to start Selenium Grid with Docker

version: "3.6"

x-common-props: &common_props
  depends_on:
    - selenium-event-bus     
  logging:
    driver: "json-file"      
    options:
      max-file: "5"
      max-size: "10m"
  volumes:
    - type: tmpfs
      target: /dev/shm
      tmpfs:
        size: 4096000000
  deploy:
      placement:
        constraints: [node.role == worker]

x-router-image: &router_image selenium/router:4.5.0-20221004 
x-distributor-image: &distributor_image selenium/distributor:4.5.0-20221004 
x-sessions-image: &sessions_image selenium/sessions:4.5.0-20221004 
x-session_queue-image: &session_queue_image selenium/session-queue:4.5.0-20221004 
x-event_bus-image: &event_bus_image selenium/event-bus:4.5.0-20221004 

x-chrome-image: &chrome_image private/custom-chrome:v1.0.0
x-firefox-image: &firefox_image private/custom-firefox:v1.0.0
x-edge-image: &edge_image private/custom-edge:v1.0.0

networks:
  default:
    driver: overlay
    driver_opts:
      com.docker.network.driver.mtu: 1400

services:
  selenium-event-bus:
    image: *event_bus_image
    ports:
      - "4442:4442"
      - "4443:4443"
      - "5557:5557"
    volumes:
      - type: tmpfs
        target: /dev/shm
        tmpfs:
          size: 4096000000
    deploy:
      placement:
        constraints: [node.role == manager]

  selenium-sessions:
    image: *sessions_image
    ports:
      - "5556:5556"
    depends_on:
      - selenium-event-bus
    environment:
      - SE_EVENT_BUS_HOST=selenium-event-bus
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
    volumes:
      - type: tmpfs
        target: /dev/shm
        tmpfs:
          size: 4096000000
    deploy:
      placement:
        constraints: [node.role == manager]

  selenium-session-queue:
    image: *session_queue_image
    ports:
      - "5559:5559"
    depends_on:
      - selenium-event-bus
    environment:
      - SE_EVENT_BUS_HOST=selenium-event-bus
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
    volumes:
      - type: tmpfs
        target: /dev/shm
        tmpfs:
          size: 4096000000
    deploy:
      placement:
        constraints: [node.role == manager]

  selenium-distributor:
    image: *distributor_image
    ports:
      - "5553:5553"
    depends_on:
      - selenium-event-bus
      - selenium-sessions
      - selenium-session-queue
    environment:
      - SE_EVENT_BUS_HOST=selenium-event-bus
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_SESSIONS_MAP_HOST=selenium-sessions
      - SE_SESSIONS_MAP_PORT=5556
      - SE_SESSION_QUEUE_HOST=selenium-session-queue
      - SE_SESSION_QUEUE_PORT=5559
    volumes:
      - type: tmpfs
        target: /dev/shm
        tmpfs:
          size: 4096000000
    deploy:
      placement:
        constraints: [node.role == manager]

  selenium-router:
    image: *router_image
    ports:
      - "4444:4444"
    depends_on:
      - selenium-distributor
      - selenium-sessions
      - selenium-session-queue
    environment:
      - SE_DISTRIBUTOR_HOST=selenium-distributor
      - SE_DISTRIBUTOR_PORT=5553
      - SE_SESSIONS_MAP_HOST=selenium-sessions
      - SE_SESSIONS_MAP_PORT=5556
      - SE_SESSION_QUEUE_HOST=selenium-session-queue
      - SE_SESSION_QUEUE_PORT=5559
    volumes:
      - type: tmpfs
        target: /dev/shm
        tmpfs:
          size: 4096000000
    deploy:
      placement:
        constraints: [node.role == manager]

  chrome:
    image: *chrome_image
    << : *common_props
    environment:
      - SE_EVENT_BUS_HOST=selenium-event-bus
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_NODE_MAX_SESSIONS=1
      - VNC_NO_PASSWORD=1
      - SCREEN_WIDTH=1700
      - SCREEN_HEIGHT=1300
      - SE_NODE_PORT=6028
      - EXTRAS_PORT=8013
    ports:
      - "6028:6028"
      - "6373:5900"
      - "8013:8013"

  firefox:
    image: *firefox_image
    << : *common_props
    environment:
      - SE_EVENT_BUS_HOST=selenium-event-bus
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_NODE_MAX_SESSIONS=1
      - VNC_NO_PASSWORD=1
      - SCREEN_WIDTH=1700
      - SCREEN_HEIGHT=1300
      - SE_NODE_PORT=6038
      - EXTRAS_PORT=8023
    ports:
      - "6038:6038"
      - "6383:5900"
      - "8023:8023"

  edge:
    image: *edge_image
    << : *common_props
    environment:
      - SE_EVENT_BUS_HOST=selenium-event-bus
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_NODE_MAX_SESSIONS=1
      - VNC_NO_PASSWORD=1
      - SCREEN_WIDTH=1700
      - SCREEN_HEIGHT=1300
      - SE_NODE_PORT=6054
      - EXTRAS_PORT=8039
    ports:
      - "6054:6054"
      - "6399:5900"
      - "8039:8039"
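
For anyone reproducing this, a minimal sketch of how the `selenium-router` service in the compose file above could be given an explicit memory ceiling, so that growth shows up as a bounded, observable limit rather than an OOM on the manager host. This is not part of the original setup. It assumes the router image passes `SE_JAVA_OPTS` through to the JVM, as the docker-selenium README describes, and the `1g` and `2G` values are purely illustrative.

```yaml
services:
  selenium-router:
    environment:
      # Assumption: the image forwards SE_JAVA_OPTS to the java process.
      # -Xmx caps the JVM heap; 1g is an illustrative value, not a recommendation.
      - SE_JAVA_OPTS=-Xmx1g
    deploy:
      resources:
        limits:
          # Hard container memory limit enforced by swarm; illustrative value.
          # Kept above the heap cap to leave room for non-heap JVM memory.
          memory: 2G
```

A cap like this does not fix a leak. It only makes the failure mode predictable and easier to correlate with the cAdvisor metrics.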

Relevant log output

2022-10-06 12:06:59,260 INFO Included extra file "/etc/supervisor/conf.d/selenium-grid-router.conf" during parsing
2022-10-06 12:06:59,265 INFO RPC interface 'supervisor' initialized
2022-10-06 12:06:59,265 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2022-10-06 12:06:59,266 INFO supervisord started with pid 8
2022-10-06 12:07:00,268 INFO spawned: 'selenium-grid-router' with pid 10
Starting Selenium Grid Router...
2022-10-06 12:07:00,279 INFO success: selenium-grid-router entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
12:07:00.836 INFO [LoggingOptions.configureLogEncoding] - Using the system default encoding
12:07:00.847 INFO [OpenTelemetryTracer.createTracer] - Using OpenTelemetry for tracing
12:07:02.182 INFO [RouterServer.execute] - Started Selenium Router 4.5.0 (revision fe167b119a): http://10.0.0.84:4444

Operating System

RHEL 7.6; Docker version 18.03.0-ce, build 0520e24

Docker Selenium version (tag)

4.5.0-20221004

diemol · 2023-08-01T21:33:05Z

Some improvements made for 4.11.0 are out. Can you please try again?

diemol · 2024-01-04T14:46:47Z

Closing as we did not get more information.

github-actions · 2024-02-04T00:18:43Z

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

LonelyDaoist added the needs-triaging label Oct 7, 2022

diemol added S-needs-investigation and removed needs-triaging labels Oct 12, 2022

VietND96 added R-awaiting-answer and removed S-needs-investigation labels Dec 2, 2023

diemol closed this as not planned Jan 4, 2024

github-actions bot locked and limited conversation to collaborators Feb 4, 2024
