ELOG station logic weird and wrong for RIX #369

Open
ZLLentz opened this issue Apr 24, 2023 · 6 comments

@ZLLentz
Member

ZLLentz commented Apr 24, 2023

Expected Behavior

Each hutch should be able to configure their daq station/elog appropriately.
E.g. if they are running on daq 2, they should post to elog 2, etc.

Current Behavior

Only elog stations 0 and 1 are allowed
RIX uses station 2

$ cat conf.yml
hutch: rix

# Locate happi database
db: /reg/g/pcds/pyps/apps/hutch-python/device_config/db.json

# Hutch-specific imports
load: rix.beamline

# DAQ interface configuration
daq_type: lcls2

daq_host: drp-srcf-cmp004

daq_platform:
  default: 2

The elog station is picked based on whether or not the default daq platform is active
I think we put this in for CXI like a million years ago
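
A minimal sketch of the current behavior as described above, not the actual hutch-python code. It assumes that the keys of daq_platform other than default are hostnames, and that "the default is active" means the current host has no entry of its own. The function name legacy_station and the hostname rix-daq are hypothetical.

```python
def legacy_station(daq_platform: dict, hostname: str) -> int:
    """Pick the elog station the way this issue describes the current logic.

    Assumption: the default platform is "active" when the host has no
    entry of its own in daq_platform.
    """
    default_active = hostname not in daq_platform
    # Only stations 0 and 1 can ever come out of this logic
    return 0 if default_active else 1


# daq_platform section of the RIX conf.yml shown above
rix_platform = {"default": 2}

# "rix-daq" is a hypothetical hostname used for illustration
print(legacy_station(rix_platform, "rix-daq"))
```

Under these assumptions RIX ends up on station 0, while its elog lives on station 2.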

Possible Solution

Use the config-file provided daq platform to select the elog station when available
Otherwise, fall back to old behavior

Context

RIX elog wasn't loading correctly in hutch python (ever?)

Your Environment

pcds-5.7.1

@ZLLentz
Member Author

ZLLentz commented Apr 24, 2023

This is a bit complicated because here's what cxi's config looks like:

$ cat cxi/conf.yml
hutch: cxi

# Locate happi database
db: /reg/g/pcds/pyps/apps/hutch-python/device_config/db.json

# Hutch-specific imports
load: cxi.beamline

# Set platform information
daq_platform:
    default: 4
    cxi-monitor: 3

And their expected behavior is:

  • cxi-daq should use daq platform 4, elog 0
  • cxi-monitor should use daq platform 3, elog 1

ugh
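
The conflict can be sketched as follows. This is a minimal illustration that assumes the proposed solution is applied naively, with the daq platform number reused directly as the elog station. The function names are hypothetical, rix-daq is a made-up hostname, and cxi-daq is assumed to have no entry of its own, so it gets the default platform.

```python
def platform_for_host(daq_platform: dict, hostname: str) -> int:
    """Return the host-specific daq platform, or the default one."""
    return daq_platform.get(hostname, daq_platform["default"])


def naive_station(daq_platform: dict, hostname: str) -> int:
    """Naive proposal: reuse the daq platform number as the elog station."""
    return platform_for_host(daq_platform, hostname)


# daq_platform sections of the two conf.yml files quoted in this thread
rix = {"default": 2}
cxi = {"default": 4, "cxi-monitor": 3}

# (config, hostname, elog station expected in this thread)
cases = [
    (rix, "rix-daq", 2),  # hypothetical hostname
    (cxi, "cxi-daq", 0),
    (cxi, "cxi-monitor", 1),
]
for conf, host, expected in cases:
    got = naive_station(conf, host)
    status = "ok" if got == expected else "MISMATCH"
    print(f"{host}: station {got}, expected {expected} ({status})")
```

RIX comes out right, but both CXI hosts would get stations 4 and 3 instead of 0 and 1, so the platform number alone cannot drive the station for every hutch.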

@jyotiphy

@ZLLentz I don't remember seeing an elog error in hutch-python before, so I would say it's a new issue.

@ZLLentz
Member Author

ZLLentz commented Apr 24, 2023

I think there's something strange going on: maybe some new behavior, but there was also older failing behavior.

It seems to me like this error happened about half of the time historically (from searching through the logs).
About half of all logfiles have the same error as we saw today. Some of the others failed for other reasons, and some succeeded using the same HTTP request that fails today.

Here's the error:

Failed to gather current experiment information from Web Service, HTTP status_code: 500

Here's what the logs look like when it fails:

2023-04-24 15:05:33 - PID 3793           pswww.py: 159 get_experiment_logbook DEBUG    - Requesting current experiment for RIX
2023-04-24 15:05:33 - PID 3793  connectionpool.py: 1003 _new_conn          DEBUG    - Starting new HTTPS connection (1): pswww.slac.stanford.edu:443
2023-04-24 15:05:33 - PID 3793  connectionpool.py: 456 _make_request      DEBUG    - https://pswww.slac.stanford.edu:443 "GET /ws-auth/lgbk/lgbk/ws/activeexperiment_for_instrument_station?instrument_name=RIX HTTP/1.1" 500 59
2023-04-24 15:05:33 - PID 3793           utils.py: 66  safe_load          ERROR    - Failed to load elog after 0.08 s
2023-04-24 15:05:33 - PID 3793           utils.py: 67  safe_load          DEBUG    - Failed to gather current experiment information from Web Service, HTTP status_code: 500
Traceback (most recent call last):
  File "/u1/rixopr/conda_envs/pcds-5.7.1/lib/python3.9/site-packages/hutch_python/utils.py", line 60, in safe_load
    yield
  File "/u1/rixopr/conda_envs/pcds-5.7.1/lib/python3.9/site-packages/hutch_python/load_conf.py", line 430, in load_conf
    cache(elog=HutchELog.from_conf(hutch.upper(), **kwargs))
  File "/u1/rixopr/conda_envs/pcds-5.7.1/lib/python3.9/site-packages/elog/elog.py", line 224, in from_conf
    return cls(*args, user=user, pw=pw, **kwargs)
  File "/u1/rixopr/conda_envs/pcds-5.7.1/lib/python3.9/site-packages/elog/elog.py", line 143, in __init__
    exp_id = self.service.get_experiment_logbook(instrument,
  File "/u1/rixopr/conda_envs/pcds-5.7.1/lib/python3.9/site-packages/elog/pswww.py", line 172, in get_experiment_logbook
    raise Exception('Failed to gather current experiment information '
Exception: Failed to gather current experiment information from Web Service, HTTP status_code: 500

Here's what a successful request log entry looked like historically:

2021-11-30 21:49:01 - PID 11861           pswww.py: 155 get_experiment_logbook DEBUG    - Requesting current experiment for RIX
2021-11-30 21:49:01 - PID 11861  connectionpool.py: 971 _new_conn          DEBUG    - Starting new HTTPS connection (1): pswww.slac.stanford.edu:443
2021-11-30 21:49:01 - PID 11861  connectionpool.py: 452 _make_request      DEBUG    - https://pswww.slac.stanford.edu:443 "GET /ws-auth/lgbk/lgbk/ws/activeexperiment_for_instrument_station?instrument_name=RIX HTTP/1.1" 200 381

Here's the slightly modified logged request that works today:

2023-04-24 15:34:14 - PID 12711           pswww.py: 159 get_experiment_logbook DEBUG    - Requesting current experiment for RIX
2023-04-24 15:34:14 - PID 12711  connectionpool.py: 1003 _new_conn          DEBUG    - Starting new HTTPS connection (1): pswww.slac.stanford.edu:443
2023-04-24 15:34:14 - PID 12711  connectionpool.py: 456 _make_request      DEBUG    - https://pswww.slac.stanford.edu:443 "GET /ws-auth/lgbk/lgbk/ws/activeexperiment_for_instrument_station?instrument_name=RIX&station=2 HTTP/1.1" 200 568
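
The difference between the failing and the working request is only the station query parameter. Here is a minimal sketch of building both URLs with the standard library. The helper name active_experiment_url is hypothetical and this is not the elog package's own code; the host and path are copied from the log lines above. No request is sent.

```python
from urllib.parse import urlencode

# Host and path as they appear in the logged requests
BASE_URL = (
    "https://pswww.slac.stanford.edu"
    "/ws-auth/lgbk/lgbk/ws/activeexperiment_for_instrument_station"
)


def active_experiment_url(instrument, station=None):
    """Build the query URL, adding the station only when one is given."""
    params = {"instrument_name": instrument}
    if station is not None:
        params["station"] = station
    return f"{BASE_URL}?{urlencode(params)}"


# Request shape that returned 500 today
print(active_experiment_url("RIX"))
# Request shape that returned 200 today
print(active_experiment_url("RIX", station=2))
```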

Full error count stats:

  • 1121 logfiles in total
  • 843 where the elog failed to load
  • 678 where the elog failed to load for this reason (HTTP 500)
  • 278 where the elog loaded successfully
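
A count like this can be reproduced with a small script. The following is a minimal sketch, not the script that produced the numbers above. It assumes that a logfile without the "Failed to load elog" message counts as a successful load; the category names are hypothetical, and the two marker strings are taken from the log excerpt earlier in this comment.

```python
from collections import Counter

# Marker strings copied from the failing log excerpt
FAILED_MARKER = "Failed to load elog"
HTTP_500_MARKER = (
    "Failed to gather current experiment information "
    "from Web Service, HTTP status_code: 500"
)


def classify(log_text: str) -> str:
    """Sort one logfile's text into one of three categories."""
    if FAILED_MARKER not in log_text:
        # Assumption: no failure message means the elog loaded
        return "loaded"
    if HTTP_500_MARKER in log_text:
        return "failed_http_500"
    return "failed_other"


def tally(log_texts) -> Counter:
    """Count how many logfiles fall into each category."""
    return Counter(classify(text) for text in log_texts)
```

To run it over real files, pass in the text of each logfile, for example `tally(p.read_text() for p in log_dir.glob("*"))` with `log_dir` a `pathlib.Path` pointing at the log directory. With the counts listed above, 678 of the 1121 logfiles is about 60%.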

@ZLLentz
Member Author

ZLLentz commented Apr 25, 2023

So I think the responses we get from the pswww server are inconsistent.

@ZLLentz
Member Author

ZLLentz commented Apr 25, 2023

Silke noticed and fixed this in the get_info/get_expname scripts back in 2021 (for rix):
pcdshub/engineering_tools@3786c27

@klauer
Contributor

klauer commented May 8, 2023

Seeing the above reminds me of an existing draft PR... Tangential perhaps, but is it time to revisit the centralization of "get_info" tooling in pcdshub/pcdsutils#51?
