Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when Using ROC test for Marker Genes #12

Open
davisidarta opened this issue Mar 12, 2020 · 1 comment
Open

Error when Using ROC test for Marker Genes #12

davisidarta opened this issue Mar 12, 2020 · 1 comment

Comments

@davisidarta
Copy link

Hi @romanhaa !

Thanks again for the great app! It's truly useful.

I'm having a issue when trying to use ROC test to get marker genes:

Error: Can't subset columns that don't exist.
x The column `p_val` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.

WIth rlang::last_error():


<error/vctrs_error_subscript_oob>
Can't subset columns that don't exist.
x The column `p_val` doesn't exist.
Backtrace:
  1. cerebroApp::getMarkerGenes(...)
 32. vctrs:::stop_subscript_oob(...)
 33. vctrs:::stop_subscript(...)
Run `rlang::last_trace()` to see the full context.

With rlang::last_trace():

1. ├─cerebroApp::getMarkerGenes(...)
 2. │ └─`%>%`(...)
 3. │   ├─base::withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
 4. │   └─base::eval(quote(`_fseq`(`_lhs`)), env, env)
 5. │     └─base::eval(quote(`_fseq`(`_lhs`)), env, env)
 6. │       └─cerebroApp:::`_fseq`(`_lhs`)
 7. │         └─magrittr::freduce(value, `_function_list`)
 8. │           └─function_list[[i]](value)
 9. │             ├─dplyr::select(...)
10. │             └─dplyr:::select.data.frame(...)
11. │               └─tidyselect::vars_select(tbl_vars(.data), !!!enquos(...))
12. │                 └─tidyselect:::eval_select_impl(...)
13. │                   ├─tidyselect:::with_subscript_errors(...)
14. │                   │ ├─base::tryCatch(...)
15. │                   │ │ └─base:::tryCatchList(expr, classes, parentenv, handlers)
16. │                   │ │   └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
17. │                   │ │     └─base:::doTryCatch(return(expr), name, parentenv, handler)
18. │                   │ └─tidyselect:::instrument_base_errors(expr)
19. │                   │   └─base::withCallingHandlers(...)
20. │                   └─tidyselect:::vars_select_eval(...)
21. │                     └─tidyselect:::walk_data_tree(expr, data_mask, context_mask)
22. │                       └─tidyselect:::eval_c(expr, data_mask, context_mask)
23. │                         └─tidyselect:::reduce_sels(node, data_mask, context_mask, init = init)
24. │                           └─tidyselect:::walk_data_tree(new, data_mask, context_mask)
25. │                             └─tidyselect:::eval_c(expr, data_mask, context_mask)
26. │                               └─tidyselect:::reduce_sels(node, data_mask, context_mask, init = init)
27. │                                 └─tidyselect:::walk_data_tree(new, data_mask, context_mask)
28. │                                   └─tidyselect:::as_indices_sel_impl(...)
29. │                                     └─tidyselect:::as_indices_impl(x, vars, strict = strict)
30. │                                       └─tidyselect:::chr_as_locations(x, vars)
31. │                                         └─vctrs::vec_as_location(x, n = length(vars), names = vars)
32. └─vctrs:::stop_subscript_oob(...)
33.   └─vctrs:::stop_subscript(...)

I think this arises as new names are given for the p_value column, and getMarkerGenes still tries to make a table with the original names Seurat outputs when using Wilcox.

I want to use the ROC test because I'm actually looking for specific and exclusive cell markers, instead of differentially expressed genes, and ROC performs better for that purpose when compared to Wilcox. This would really make a difference in our analysis.

Any further insights? How can we fix this?

@romanhaa
Copy link
Owner

Hi @davisidarta!

You are bringing up another (current) limitation of Cerebro :) And you suspect correctly that a very rigid structure for tables (e.g. marker genes) is expected. I have been thinking about a generalised structure for these types of results for a while, and in fact also sent out an email to the Bioconductor developer mailing list to get their feedback. It looks like I'll have to implement my own method for processing the output of different DGE tools for visualization in Cerebro. Not sure when I'll be able to put it into releasable form though.

A short term workaround would be to rename columns so that they meet the current format (and add fake columns if there aren't enough. At the moment, the format looks like this:

# A tibble: 1,181 x 8
   cluster gene          p_val avg_logFC pct.1 pct.2 p_val_adj on_cell_surface
   <fct>   <chr>         <dbl>     <dbl> <dbl> <dbl>     <dbl> <lgl>          
 1 1       S100A12    1.22e-88      2.87 0.993 0.106  1.34e-84 FALSE          
 2 1       AC020656.1 2.04e-86      1.79 0.961 0.075  2.24e-82 FALSE          
 3 1       CD14       1.25e-85      1.94 0.948 0.06   1.37e-81 TRUE           
 4 1       VCAN       5.27e-85      2.47 0.987 0.121  5.77e-81 FALSE          
 5 1       FCN1       8.92e-83      2.27 1     0.161  9.77e-79 FALSE          
 6 1       CSTA       2.05e-82      1.64 0.974 0.089  2.25e-78 FALSE          
 7 1       MS4A6A     4.16e-82      1.86 0.954 0.089  4.56e-78 FALSE          
 8 1       MNDA       2.31e-81      2.15 1     0.19   2.53e-77 FALSE          
 9 1       FGL2       6.20e-76      1.69 0.974 0.141  6.78e-72 FALSE          
10 1       CSF3R      2.87e-74      1.38 0.908 0.095  3.15e-70 FALSE          
# … with 1,171 more rows

The on_cell_surface column is optional.

Do you think that could work?

Sorry that I don't have an easier solution this time.

Best,
Roman

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants