Hi, I would like to know what nan means in the data set.
At present, I guess it has one of two meanings:
- There is no information because the drug has not been tested on the cell line.
- There is no information because the cell line does not respond to the drug.
In the portal, we try to consistently use nan to represent “holes” in the data.
Because IC50 is a fit parameter, yes, it won’t exist if a cell line didn’t respond to the drug. That is one of the reasons why AUC can be a better metric if you want to look at patterns of response across a panel of lines.
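To make that concrete, here is a rough Python sketch of why a non-responding line ends up with no IC50 while its AUC is still defined. It is only an illustration: the doses and viability values are made up, the function names are hypothetical, and the IC50 here comes from simple interpolation rather than the curve fit a real pipeline would use.

```python
import math

def ic50_by_interpolation(doses, viability):
    """Dose at which viability first drops below 0.5, or NaN if it never does."""
    points = list(zip(doses, viability))
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        if v0 >= 0.5 > v1:
            # Interpolate linearly on the log10(dose) axis between the two points.
            frac = (v0 - 0.5) / (v0 - v1)
            log_d = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_d
    return float("nan")  # the curve never crosses 50%, so there is no IC50

def normalized_auc(doses, viability):
    """Trapezoidal area under viability vs. log10(dose), divided by the log-dose range."""
    logs = [math.log10(d) for d in doses]
    area = sum(
        (logs[i + 1] - logs[i]) * (viability[i] + viability[i + 1]) / 2
        for i in range(len(logs) - 1)
    )
    return area / (logs[-1] - logs[0])

# Made-up example data: fraction of viable cells at each dose.
doses = [0.01, 0.1, 1.0, 10.0]
responder = [1.0, 0.9, 0.4, 0.1]
non_responder = [1.0, 0.98, 0.95, 0.9]

print(ic50_by_interpolation(doses, responder))      # a finite dose
print(ic50_by_interpolation(doses, non_responder))  # nan
print(normalized_auc(doses, responder))
print(normalized_auc(doses, non_responder))         # still a number, close to 1
```

The non-responder has no IC50 at all, but its AUC is a perfectly usable number, so the two lines can still be compared.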
So, off the top of my head, a nan IC50 could mean:
- The line doesn’t respond to the drug.
- The drug+line combination was not tested.
- The data or curve fit failed QC, so no IC50 was reported.
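If you want to narrow down which case applies, one rough heuristic is to check whether an AUC exists for the same drug+line pair. The sketch below assumes the data also carries an AUC value per pair; the record layout, field names, and values are hypothetical and not the portal's actual schema. Note that nan never compares equal to itself, so use `math.isnan` rather than `==`.

```python
import math

# Hypothetical records; field names and values are made up for illustration.
records = [
    {"cell_line": "LINE_A", "drug": "DRUG_X", "ic50": 0.63, "auc": 0.62},
    {"cell_line": "LINE_B", "drug": "DRUG_X", "ic50": float("nan"), "auc": 0.96},
    {"cell_line": "LINE_C", "drug": "DRUG_X", "ic50": float("nan"), "auc": float("nan")},
]

def describe_missing_ic50(record):
    """Give a rough reading of a record, based on which values are nan."""
    if not math.isnan(record["ic50"]):
        return "ic50 reported"
    if math.isnan(record["auc"]):
        # Nothing at all for this pair: probably not tested, or dropped by QC.
        return "no data: not tested or removed by QC"
    # An AUC exists, so the pair was measured but produced no IC50.
    return "tested, no ic50: no response or fit rejected"

labels = {r["cell_line"]: describe_missing_ic50(r) for r in records}
for cell_line, label in labels.items():
    print(cell_line, "->", label)
```

This cannot separate the cases completely, because a nan alone does not record why the value is missing.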