
feat: add section about Filter-based feature selection #398

Closed
wants to merge 9 commits

Conversation

@sebffischer (Member) commented Aug 16, 2022

  • Fixes a typo
  • Adds a brief section about how to do filter-based feature selection

TODO:
* [ ] Mention filter-based feature selection in the previous sections that list the different methods (is already mentioned)

@sebffischer requested review from mllg and pat-s on August 16, 2022 11:32
@sebffischer (Member Author) commented:

ping

@pat-s (Member) commented Aug 22, 2022

Thanks, I'll try to work it in this week.

@pat-s (Member) left a comment:


Intro

A broader, more general introduction is needed which explains what filters and their "scores" are and how they differ from wrappers (one paragraph).

Conclusion

The following points should be briefly mentioned (maybe even in the intro, depends how you structure the section):

  • Filters reduce the feature space and thereby make models "simpler"
  • They can be integrated into the tuning layer of the learner
  • They can make use of caching and only need to be calculated once (large advantage compared to wrappers)

We should also mention ensemble filters, even though they are not yet available (only in the old mlr).

### Filter-based Feature Selection

A common use case for filters is to conduct feature selection based on the filter scores.
This can be achieved using `r ref("mlr_pipeops_filter", text = "PipeOpFilter")`.

Using filters in pipelines is an option, but we should probably start out with a generic intro, i.e. showing some "plain" examples of how to calculate filter scores, and move the pipelines handling below it.

A common use case for filters is to conduct feature selection based on the filter scores.
This can be achieved using `r ref("mlr_pipeops_filter", text = "PipeOpFilter")`.
This PipeOp takes as input a Task, applies the filter, and selects the features based on the calculated scores.
The method how the features are subset can be defined in four different ways, each corresponding to a different parameter:

The terms "method" and "parameter" clash a bit here as they refer to the same thing context-wise in this sentence in its current form.

Suggestion for structuring and wording:

Method: "subset based on the 'best' X features"
Parameter: filter.nfeat

Comment on lines 1111 to 1112
We will first subset the features based on the `r ref("mlr3filters::mlr_filters_information_gain", text = "FilterInformationGain")` and then fit a `r ref("mlr3learners::mlr_learners_classif.lda", text = "LDA")`.
We will tune the parameter `filter.frac` with a simple grid search and visualize the classification error for the different fractions.
At this point it would be good to mention that the tuning of filter hyperparameters can be fused with learner hyperparameter tuning. This is in fact one of the big advantages of filters: they don't need an extra layer of tuning (as wrappers do).

Maybe even worth putting this into a "tip" block.

)
```

We can see that using 70% - 100% of the features seems to produce fairly similar results.

There's some more beefy content required here, this does not help the user much.
Maybe 2-3 sentences describing the results and then 1-2 interpreting/discussing them (e.g. "the learner seems to produce the best results with most features present, which means that there is only little noise in the data and most features are important") etc.

It might be worth searching for a task in which feature selection actually has an effect on the performance (and the feature space), otherwise users might think: "what do I need it for, it does not make a difference".
This might not be easy though, and if we keep it like this, we need to explain to the user what the actual point is (and why we see 70-100% here and what that means). (Note: it also highly depends on the learner.)

(I know it's an example and you know that, but in the book we should address this in a somewhat scientific detail, maybe even with some references :) )

sebffischer and others added 2 commits August 24, 2022 10:58
Co-authored-by: Patrick Schratz <[email protected]>
Co-authored-by: Patrick Schratz <[email protected]>
@sebffischer (Member Author) commented Aug 24, 2022

I don't think we should mention the caching, because afaik we don't cache intermediate results in pipelines.
The other things you suggest are partly already mentioned in the intro.
In general I do not want to overdo it here, because it is a tutorial on usage.
Also I don't think we need to mention ensemble filters in the mlr3 book if they are not available.

@pat-s (Member) commented Aug 24, 2022

> I don't think we should mention the caching, because afaik we don't cache intermediate results in pipelines.
> The other things you suggest are partly already mentioned in the intro.
> In general I do not want to overdo it here, because it is a tutorial on usage.
> Also I don't think we need to mention ensemble filters in the mlr3 book if they are not available.

This sounds like you don't really want to change anything substantial.

> I don't think we should mention the caching, because afaik we don't cache intermediate results in pipelines.

Apparently we don't have it yet in mlr3pipelines (mlr-org/mlr3pipelines#16), which is somewhat of a pity as it is very important for filters.
This is one of the large benefits of using filters over wrappers.
AFAIR I didn't implement the same logic back then as in {mlr} because we wanted to do it in pipelines, but it looks like this will never happen.

> The other things you suggest are partly already mentioned in the intro.

This is quite a general reply. I wouldn't have raised these points if I felt they were appropriately covered.

> In general I do not want to overdo it here, because it is a tutorial on usage.

Sorry to be frank here, but to me this sounds like an excuse for low motivation to tackle moderately complicated change requests ;)

> Also I don't think we need to mention ensemble filters in the mlr3 book if they are not available.

They belong to the topic and are available in {mlr}, hence one sentence about them (whether they exist and/or why not) is certainly of interest to the reader.


I stand by my opinion that important information/content is missing or should be restructured, and I would not be happy to see this merged in its current form. If you do, I might want to rewrite it afterwards.

Please don't get this wrong, it's great that we finally get a section about filters (and I could have done so earlier!), but if we have one, I'd like it to be high quality and would rather spend a bit more time on it.

@sebffischer (Member Author) commented:

We discussed this a little today, and it was suggested to move the feature selection out of the optimization chapter.

@RaphaelS1 (Contributor) commented:

@sebffischer @be-marc how does this PR fit in with #412 ?

@sebffischer (Member Author) commented:

> @sebffischer @be-marc how does this PR fit in with #412 ?

Thanks for the reminder. I moved the section into the feature-selection chapter. I guess the part is not perfect, but Marvin will now rework this chapter. Maybe we should just merge it and he can decide which parts to keep and which not?

@RaphaelS1 (Contributor) commented:

I think everything here is covered now in Chapters 5 and 6

@RaphaelS1 closed this Dec 19, 2022