Dataset summary in testbed instance does not show valid info for validFileOnly flag #96

Open
vkuznet opened this issue Mar 23, 2023 · 6 comments

@vkuznet

vkuznet commented Mar 23, 2023

Here are three different outputs from the testbed instance:

scurl "https://cmsweb-testbed.cern.ch:8443/dbs/int/global/DBSReader/filesummaries/?dataset=/ParkingBPH1/Run2018D-05May2019promptD-v1/AOD"
[
{"file_size":461981610599423,"max_ldate":1631703471,"median_cdate":null,"median_ldate":1631703471,"num_block":578,"num_event":1619891480,"num_file":100955,"num_lumi":112461}
]

# use validFileOnly=1
scurl "https://cmsweb-testbed.cern.ch:8443/dbs/int/global/DBSReader/filesummaries/?dataset=/ParkingBPH1/Run2018D-05May2019promptD-v1/AOD&validFileOnly=1"
[
{"file_size":0,"max_ldate":null,"median_cdate":null,"median_ldate":null,"num_block":578,"num_event":0,"num_file":0,"num_lumi":0}
]

# use validFileOnly=0
scurl "https://cmsweb-testbed.cern.ch:8443/dbs/int/global/DBSReader/filesummaries/?dataset=/ParkingBPH1/Run2018D-05May2019promptD-v1/AOD&validFileOnly=0"
[
{"file_size":0,"max_ldate":null,"median_cdate":null,"median_ldate":null,"num_block":578,"num_event":0,"num_file":0,"num_lumi":0}
]

while in DBS production we have

scurl "https://cmsweb.cern.ch:8443/dbs/prod/global/DBSReader/filesummaries/?dataset=/ParkingBPH1/Run2018D-05May2019promptD-v1/AOD"
[
{"file_size":461981610599423,"max_ldate":1570182728,"median_cdate":null,"median_ldate":1564996283,"num_block":578,"num_event":1619891480,"num_file":100955,"num_lumi":112461}
]

# use validFileOnly=1

scurl "https://cmsweb.cern.ch:8443/dbs/prod/global/DBSReader/filesummaries/?dataset=/ParkingBPH1/Run2018D-05May2019promptD-v1/AOD&validFileOnly=1"
[
{"file_size":461921120376671,"max_ldate":1570089906,"median_cdate":null,"median_ldate":1564996283,"num_block":578,"num_event":1619710110,"num_file":100943,"num_lumi":112453}
]

# use validFileOnly=0
scurl "https://cmsweb.cern.ch:8443/dbs/prod/global/DBSReader/filesummaries/?dataset=/ParkingBPH1/Run2018D-05May2019promptD-v1/AOD&validFileOnly=0"
[
{"file_size":461921120376671,"max_ldate":1570089906,"median_cdate":null,"median_ldate":1564996283,"num_block":578,"num_event":1619710110,"num_file":100943,"num_lumi":112453}
]
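
To make the discrepancy easier to see, the two validFileOnly=1 records above can be compared field by field. This is only a minimal Python sketch: the helper name `diff_summaries` is hypothetical, and the records are copied from the outputs above rather than fetched from the servers.

```python
import json

# Records copied verbatim from the validFileOnly=1 outputs above
# (testbed first, production second); nothing is fetched over the network.
TESTBED = json.loads(
    '{"file_size":0,"max_ldate":null,"median_cdate":null,"median_ldate":null,'
    '"num_block":578,"num_event":0,"num_file":0,"num_lumi":0}'
)
PROD = json.loads(
    '{"file_size":461921120376671,"max_ldate":1570089906,"median_cdate":null,'
    '"median_ldate":1564996283,"num_block":578,"num_event":1619710110,'
    '"num_file":100943,"num_lumi":112453}'
)


def diff_summaries(a, b):
    """Hypothetical helper: return {field: (a_value, b_value)} for differing fields."""
    fields = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in fields if a.get(k) != b.get(k)}


if __name__ == "__main__":
    for field, (testbed, prod) in diff_summaries(TESTBED, PROD).items():
        print(f"{field}: testbed={testbed} prod={prod}")
```

Only num_block and median_cdate agree between the two instances; every file-level quantity is zero or null in testbed.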
@vkuznet

vkuznet commented Mar 23, 2023

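The actual issue lies in the DBS database itself rather than in the code. The same query returns 100943 files in the production schema and 0 in the testbed (k8s) schema: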

select count(f.file_id)  from cms_dbs3_prod_global_owner.files f
  join cms_dbs3_prod_global_owner.datasets d on d.DATASET_ID = f.dataset_id
  JOIN cms_dbs3_prod_global_owner.DATASET_ACCESS_TYPES DT ON  DT.DATASET_ACCESS_TYPE_ID = D.DATASET_ACCESS_TYPE_ID
  where d.dataset='/ParkingBPH1/Run2018D-05May2019promptD-v1/AOD'
  and f.is_file_valid = 1 and DT.DATASET_ACCESS_TYPE in ('VALID', 'PRODUCTION')


COUNT(F.FILE_ID)
----------------
          100943

select count(f.file_id)  from cms_dbs3_k8s_global_owner.files f
  join cms_dbs3_k8s_global_owner.datasets d on d.DATASET_ID = f.dataset_id
  JOIN cms_dbs3_k8s_global_owner.DATASET_ACCESS_TYPES DT ON  DT.DATASET_ACCESS_TYPE_ID = D.DATASET_ACCESS_TYPE_ID
  where d.dataset='/ParkingBPH1/Run2018D-05May2019promptD-v1/AOD'
  and f.is_file_valid = 1 and DT.DATASET_ACCESS_TYPE in ('VALID', 'PRODUCTION')

COUNT(F.FILE_ID)
----------------
               0

So, we need to check a few things in the DBS database using the int account:

  • do the files have the is_file_valid field filled
  • do we have proper foreign keys in place for file and dataset_access_type,
    etc.
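
Both checks can be folded into one breakdown query that counts files per is_file_valid value (including NULL) and uses a LEFT JOIN so that a broken access-type reference shows up as a NULL access type. The sketch below runs that query against a toy in-memory SQLite database, not against DBS: the table layout is a simplified assumption based only on the columns used in the queries above, the rows and the dataset name `/A/B/AOD` are made up, and the function names are hypothetical.

```python
import sqlite3

# Toy stand-in for the DBS tables. Only the columns referenced by the queries
# in this thread are modelled; this is an assumption, not the real schema.
SCHEMA = """
CREATE TABLE dataset_access_types (
    dataset_access_type_id INTEGER PRIMARY KEY,
    dataset_access_type TEXT
);
CREATE TABLE datasets (
    dataset_id INTEGER PRIMARY KEY,
    dataset TEXT,
    dataset_access_type_id INTEGER
);
CREATE TABLE files (
    file_id INTEGER PRIMARY KEY,
    dataset_id INTEGER,
    is_file_valid INTEGER
);
"""

# Count files per (is_file_valid, access type). The LEFT JOIN keeps files whose
# dataset points at a missing access type, which then appears as NULL.
BREAKDOWN = """
SELECT f.is_file_valid, dt.dataset_access_type, COUNT(f.file_id)
  FROM files f
  JOIN datasets d ON d.dataset_id = f.dataset_id
  LEFT JOIN dataset_access_types dt
         ON dt.dataset_access_type_id = d.dataset_access_type_id
 WHERE d.dataset = ?
 GROUP BY f.is_file_valid, dt.dataset_access_type
"""


def demo_connection():
    """Build the in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO dataset_access_types VALUES (1, 'VALID')")
    conn.execute("INSERT INTO datasets VALUES (10, '/A/B/AOD', 1)")
    conn.executemany(
        "INSERT INTO files VALUES (?, 10, ?)",
        [(1, 0), (2, 0), (3, None), (4, 1)],
    )
    return conn


def breakdown(conn, dataset):
    """Return rows of (is_file_valid, dataset_access_type, file_count)."""
    return conn.execute(BREAKDOWN, (dataset,)).fetchall()


if __name__ == "__main__":
    for row in breakdown(demo_connection(), "/A/B/AOD"):
        print(row)
```

Against the real database, the same SELECT would need the schema-qualified table names used in the queries above.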

@vkuznet

vkuznet commented Mar 23, 2023

And, upon further investigation I see the following in testbed DB:

select *
from
(
select f.file_id, f.is_file_valid, dt.dataset_access_type
  from cms_dbs3_k8s_global_owner.files f
  join cms_dbs3_k8s_global_owner.datasets d on d.DATASET_ID = f.dataset_id
  JOIN cms_dbs3_k8s_global_owner.DATASET_ACCESS_TYPES DT ON  DT.DATASET_ACCESS_TYPE_ID = D.DATASET_ACCESS_TYPE_ID
  where d.dataset='/ParkingBPH1/Run2018D-05May2019promptD-v1/AOD'
)
where ROWNUM <= 5;

   FILE_ID IS_FILE_VALID
---------- -------------
DATASET_ACCESS_TYPE
--------------------------------------------------------------------------------
 508165177             0
VALID

 508165178             0
VALID

 508165179             0
VALID

So, the dataset is valid, but its files have is_file_valid set to 0. We need to understand why that is the case.

@amaltaro

FYI. I have made the same observation around a month ago:
dmwm/WMCore#11414 (comment)

Data that we had been using from the DBS testbed for the last year or so suddenly changed there, and its files got marked as invalid, which also broke a few of the test workflow templates that we use in WMCore.

@vkuznet

vkuznet commented Mar 23, 2023

@d-ylee, I suggest that you investigate further with Kate (and/or Yuyi), because this seems to be a DB issue rather than a server one (regardless of implementation): the root of the problem shows up in the queries I have shown.

@vkuznet

vkuznet commented Mar 23, 2023

One particular suggestion (to discuss with Kate) is to repopulate the DBS testbed database from the production one, so that both have the same content.

@amaltaro

+1 for (re)populating testbed with a current dump of the production server.

@d-ylee d-ylee removed their assignment Jun 11, 2024