PRISM compound and cell line count

selina · May 10, 2025, 7:02pm

Hello Community,

I'm not sure whether this exact question has been asked before. I went through some posts but could not find an answer, which is why I'm asking here.

The main question is: How many compounds have been tested, and how many cell lines are included in the dataset? Or, put differently: How do you count the compounds and cell lines?
The description provided for the data (specifically the file ‘Repurposing_Public_24Q2_Extended_Primary_Data_Matrix.csv’) says that 906 cell lines were tested against (if I'm understanding correctly) 4518 compounds from the primary PRISM screen, 1280 from REP1M, and 234 from REP300. In total, this would make 6032 compounds. I am using the ‘Repurposing_Public_24Q2_Extended_Primary_Compound_List.csv’ file to get the metadata of the compounds (the names are of main interest).
I understand that some compounds have multiple BRD IDs due to, for example, different vendors involved.
I counted the names of all compounds and found 6504 drug names, which still does not match the provided numbers. (I even retrieved the SMILES and InChIs, but always ended up with more than 6032 compounds.)
Further, the ‘Repurposing_Public_24Q2_Extended_Primary_Data_Matrix.csv’ file contains 920 unique cell line identifiers. How does that match the 906 cell lines stated in the description? Could you also explain how this number is calculated?
I'm trying to understand the data as well as possible.
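
To make the counting question concrete, here is a minimal sketch of the kind of comparison described above. It is not the official counting method. The column names `IDs` and `Drug.Name`, the BRD ID layout, and the sample rows are assumptions made up for illustration and should be checked against the actual compound list file. It counts unique full IDs, unique ID "cores" (to collapse entries that may differ only by vendor or batch), and unique normalized names, and compares them with the documented total.

```python
import csv
import io


def count_unique(rows, column, normalize=lambda s: s.strip().lower()):
    """Count distinct non-empty values in `column` after normalization."""
    seen = set()
    for row in rows:
        value = normalize(row.get(column) or "")
        if value:
            seen.add(value)
    return len(seen)


def brd_core(brd_id):
    """Keep the first two dash-separated fields of an ID.

    Assumption: IDs look like 'BRD-K00000001-001-01-1', where the trailing
    fields distinguish batches or vendors of the same compound.
    """
    return "-".join(brd_id.strip().split("-")[:2])


# Made-up sample standing in for the compound list CSV (hypothetical columns).
sample = io.StringIO(
    "IDs,Drug.Name\n"
    "BRD-K00000001-001-01-1,DrugA\n"
    "BRD-K00000001-003-02-5,druga \n"
    "BRD-K00000002-001-01-1,DrugB\n"
)
rows = list(csv.DictReader(sample))

n_ids = count_unique(rows, "IDs", normalize=str.strip)   # full IDs
n_cores = count_unique(rows, "IDs", normalize=brd_core)  # collapsed IDs
n_names = count_unique(rows, "Drug.Name")                # case/space-insensitive names

# Total stated in the data description: primary screen + REP1M + REP300.
documented_total = 4518 + 1280 + 234

print(n_ids, n_cores, n_names, documented_total)
```

With the real file, `io.StringIO(...)` would be replaced by `open(path, newline="")`. The unique cell line identifiers in the data matrix could be counted the same way, once it is clear whether they appear as row labels or as column headers.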
Thank you very much for reading and answering my question.

Kind regards,
Selina
