Introduction to DSLParser
mgqa34 edited this page Feb 20, 2020 · 1 revision
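The DSLParser module parses the inference file generated by offline training and constructs the DSL modeling DAG (flow graph) that is supplied to the online inference process. The steps are described below.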
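*1 Deserialize the input serialized object dsl_json to obtain the dsl json object.
*2 Get the component relation dictionary from the dsl json object. An example of its format: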
{'dataio_0': {
    'CodePath': 'federatedml/util/data_io.py/DataIO',
    'input': {'data': {'data': ['args.eval_data']},
              'model': ['pipeline.dataio_0.dataio']},
    'module': 'DataIO',
    'output': {'data': ['train']}},
 'hetero_feature_binning_0': {
    'CodePath': 'federatedml/feature/hetero_feature_binning/hetero_binning_guest.py/HeteroFeatureBinningGuest',
    'input': {'data': {'data': ['dataio_0.train']},
              'model': ['pipeline.hetero_feature_binning_0.binning_model']},
    'module': 'HeteroFeatureBinning',
    'output': {'data': ['transform_data']}},
 'hetero_feature_selection_0': {
    'CodePath': 'federatedml/feature/hetero_feature_selection/feature_selection_guest.py/HeteroFeatureSelectionGuest',
    'input': {'data': {'data': ['hetero_feature_binning_0.transform_data']},
              'model': ['pipeline.hetero_feature_selection_0.selected']},
    'module': 'HeteroFeatureSelection',
    'output': {'data': ['train']}},
 'one_hot_0': {
    'CodePath': 'federatedml/feature/one_hot_encoder.py/OneHotEncoder',
    'input': {'data': {'data': ['hetero_feature_selection_0.train']},
              'model': ['pipeline.one_hot_0.one_hot_encoder']},
    'module': 'OneHotEncoder',
    'output': {'data': ['output_data']}},
 'hetero_lr_0': {
    'CodePath': 'federatedml/linear_model/logistic_regression/hetero_logistic_regression/hetero_lr_guest.py/HeteroLRGuest',
    'input': {'data': {'eval_data': ['one_hot_0.output_data']},
              'model': ['pipeline.hetero_lr_0.hetero_lr']},
    'module': 'HeteroLR',
    'output': {'data': ['train']}}}
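The key information is { "component name": "algorithm module name", "input": {} }. Each node is represented by the pair ("component name", "algorithm module name"), and the "input" information defines the dependency edges of the graph. For a more detailed description of the dsl, see the FATE repository.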
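Steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not the DSLParser implementation: the top-level "components" key, the helper name extract_nodes, and the trimmed two-component input are assumptions made for the example.

```python
import json

# Hypothetical serialized DSL. The "components" key is an assumption;
# only the fields needed for node extraction are kept.
serialized_dsl = json.dumps({
    "components": {
        "dataio_0": {
            "module": "DataIO",
            "input": {"data": {"data": ["args.eval_data"]}},
        },
        "hetero_feature_binning_0": {
            "module": "HeteroFeatureBinning",
            "input": {"data": {"data": ["dataio_0.train"]}},
        },
    }
})


def extract_nodes(serialized):
    """Deserialize the DSL (step 1) and list its nodes (step 2)."""
    dsl = json.loads(serialized)
    components = dsl["components"]
    # A node is the pair (component name, algorithm module name).
    return [(name, spec["module"]) for name, spec in components.items()]


nodes = extract_nodes(serialized_dsl)
print(nodes)
# [('dataio_0', 'DataIO'), ('hetero_feature_binning_0', 'HeteroFeatureBinning')]
```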
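*3 From the component relation dictionary obtained in step 2, initialize one graph node per "component name". At the same time, parse the "data" key of "input" and extract the upstream dependencies from it to build the set of directed edges (upstream -> self). Once the node set and edge set are built, run a topological sort to obtain topoRankComponent, the array of modeling-flow nodes in topological order.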
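*4 In addition, initialize componentModuleMap, which holds the algorithm module name of each component, and upInputs, which holds each component's upstream inputs.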
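Steps 3 and 4 can be sketched in Python along the following lines. This is an illustration under stated assumptions, not the actual DSLParser code: the function name parse_components is hypothetical, Kahn's algorithm is used as one possible topological sort, upstream references are assumed to have the form "component.output_name", and references such as "args.eval_data" that do not name a component are assumed to add no edge. The exact shape of upInputs in the real module is not given in this page; here it is a map from component name to a list of upstream component names.

```python
from collections import deque

# Trimmed component dictionary, deliberately declared out of order.
components = {
    "hetero_lr_0": {
        "module": "HeteroLR",
        "input": {"data": {"eval_data": ["hetero_feature_binning_0.transform_data"]}},
    },
    "dataio_0": {
        "module": "DataIO",
        "input": {"data": {"data": ["args.eval_data"]}},
    },
    "hetero_feature_binning_0": {
        "module": "HeteroFeatureBinning",
        "input": {"data": {"data": ["dataio_0.train"]}},
    },
}


def parse_components(components):
    """Return (topo_rank_component, component_module_map, up_inputs)."""
    component_module_map = {n: spec["module"] for n, spec in components.items()}

    # Collect upstream component names from input["data"] (edges: upstream -> self).
    up_inputs = {name: [] for name in components}
    for name, spec in components.items():
        for refs in spec.get("input", {}).get("data", {}).values():
            for ref in refs:
                upstream = ref.split(".", 1)[0]
                # Assumption: non-component sources like "args" add no edge.
                if upstream in components and upstream not in up_inputs[name]:
                    up_inputs[name].append(upstream)

    # Kahn's algorithm: repeatedly emit nodes whose upstreams are all emitted.
    in_degree = {name: len(ups) for name, ups in up_inputs.items()}
    downstream = {name: [] for name in components}
    for name, ups in up_inputs.items():
        for up in ups:
            downstream[up].append(name)

    queue = deque(name for name, deg in in_degree.items() if deg == 0)
    topo_rank_component = []
    while queue:
        node = queue.popleft()
        topo_rank_component.append(node)
        for nxt in downstream[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)

    if len(topo_rank_component) != len(components):
        raise ValueError("DSL contains a dependency cycle")
    return topo_rank_component, component_module_map, up_inputs


topo, module_map, up_inputs = parse_components(components)
print(topo)
# ['dataio_0', 'hetero_feature_binning_0', 'hetero_lr_0']
```

Because every component appears after all of its upstream components, walking the resulting array from front to back gives an order in which each component's input data is already available.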