Usage¶
First, download the reference files from Zenodo.
Then extract the archives with tar xvzf and pass the extracted folders to the SCRIP config function.
SCRIP includes 5 main commands.
usage: SCRIP [-h] [--version] {enrich,impute,target,config,index} ...
SCRIP
positional arguments:
{enrich,impute,target,config,index}
enrich Main function.
impute Imputation Factor function.
target Calculate targets based on factor peak count.
config Configuration.
index Build index with custom intervals.
optional arguments:
-h, --help show this help message and exit
--version show program's version number and exit
For command line options of each command, type: SCRIP COMMAND -h
Simple usage¶
SCRIP enrich -i {peak_count.h5} -s hs -p {result_SCRIP_path} -t 32
SCRIP impute -i {peak_count.h5} -s hs -p {result_SCRIP_path} -f h5ad --factor {factor}
SCRIP target -i {result_SCRIP_path}/imputation/{factor}/imputed_{factor}.h5ad -s hs -o {result_SCRIP_path}/target/{factor}_target.h5ad
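The three commands above are linked through the project folder: impute writes its result under {result_SCRIP_path}/imputation/{factor}/, and target reads that file. The Python sketch below only assembles the command lines shown above so that the path relationship is explicit; it does not run SCRIP. The helper name build_pipeline, its parameters, and the factor "GATA1" in the demo are illustrative assumptions, not part of SCRIP.

```python
import shlex
from pathlib import PurePosixPath


def build_pipeline(peak_count, project, factor, species="hs", threads=32):
    """Return the enrich, impute and target command lines as strings.

    Hypothetical helper: it mirrors the 'Simple usage' commands and
    does not call SCRIP itself.
    """
    project = PurePosixPath(project)
    # Path layout taken from the 'Simple usage' example.
    imputed = project / "imputation" / factor / f"imputed_{factor}.h5ad"
    target_out = project / "target" / f"{factor}_target.h5ad"

    commands = [
        ["SCRIP", "enrich", "-i", peak_count, "-s", species,
         "-p", str(project), "-t", str(threads)],
        ["SCRIP", "impute", "-i", peak_count, "-s", species,
         "-p", str(project), "-f", "h5ad", "--factor", factor],
        ["SCRIP", "target", "-i", str(imputed), "-s", species,
         "-o", str(target_out)],
    ]
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    # "GATA1" is only an example factor name.
    for line in build_pipeline("peak_count.h5", "result_SCRIP", "GATA1"):
        print(line)
```

Printing the commands before running them is a cheap way to confirm that the file passed to target is the one impute is expected to produce.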
Detailed usage of each command is described below.
SCRIP enrich¶
This function takes a peak count matrix in H5 or MTX format, along with basic quality-control parameters. It writes an output folder containing the following files:
beds: bed files of all cells
ChIP_result: txt files of Giggle search results
qpeaks_length.txt: total peak length of each cell
SCRIP_enrichment.txt: the SCRIP score results
dataset_overlap_df.pk: the raw number of overlaps between each cell and each dataset
dataset_cell_norm_df.pk: normalized scores
dataset_score_source_df.pk: matched reference datasets
tf_cell_score_df.pk: the same table as SCRIP_enrichment.txt, but untransposed and in pickle format
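After a run, it can be useful to confirm that the expected outputs are present. The following is a minimal Python sketch that checks a folder against the names listed above. It assumes all entries sit directly inside the folder you pass in; the helper name missing_outputs is illustrative and not part of SCRIP.

```python
from pathlib import Path

# Names taken from the output list above.
EXPECTED_OUTPUTS = [
    "beds",
    "ChIP_result",
    "qpeaks_length.txt",
    "SCRIP_enrichment.txt",
    "dataset_overlap_df.pk",
    "dataset_cell_norm_df.pk",
    "dataset_score_source_df.pk",
    "tf_cell_score_df.pk",
]


def missing_outputs(result_dir):
    """Return the expected entries that are absent from result_dir.

    Assumption: every output lives directly in result_dir.
    """
    result_dir = Path(result_dir)
    return [name for name in EXPECTED_OUTPUTS
            if not (result_dir / name).exists()]
```

If the run used --clean, the temporary bed files and search results are deleted, so beds and ChIP_result may legitimately be reported as missing.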
usage: SCRIP enrich [-h] -i FEATURE_MATRIX -s {hs,mm} [-p PROJECT] [--min_cells MIN_CELLS] [--min_peaks MIN_PEAKS] [--max_peaks MAX_PEAKS]
[-t N_CORES] [-m {max,mean}] [-y] [--clean]
optional arguments:
-h, --help show this help message and exit
Input files arguments:
-i FEATURE_MATRIX, --input_feature_matrix FEATURE_MATRIX
A cell by peak matrix . REQUIRED.
-s {hs,mm}, --species {hs,mm}
Species. "hs"(human) or "mm"(mouse). REQUIRED.
Output arguments:
-p PROJECT, --project PROJECT
Project name, which will be used to generate output files folder. DEFAULT: Random generate.
Preprocessing paramater arguments:
--min_cells MIN_CELLS
Minimal cell cutoff for features. Auto will take 0.05% of total cell number.DEFAULT: "auto".
--min_peaks MIN_PEAKS
Minimal peak cutoff for cells. Auto will take the mean-3*std of all feature number (if less than 500 is 500). DEFAULT: "auto".
--max_peaks MAX_PEAKS
Max peak cutoff for cells. This will help you to remove the doublet cells. Auto will take the mean+5*std of all feature number. DEFAULT: "auto".
Other options:
-t N_CORES, --thread N_CORES
Number of cores use to run SCRIP. DEFAULT: 16.
-m {max,mean}, --mode {max,mean}
Deduplicate strategy. DEFAULT: max.
-y, --yes Whether ask for confirmation. DEFAULT: False.
--clean Whether delete tmp files(including bed and search results) generated by SCRIP. DEFAULT: False.
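The "auto" cutoffs described in the help text above can be reproduced approximately by hand. The Python sketch below follows the stated formulas (0.05% of the cell count, mean-3*std with a floor of 500, and mean+5*std). It reads "feature number" as the number of peaks detected per cell, and it uses the population standard deviation without rounding; these details are assumptions and may differ from what SCRIP does internally. The function name auto_cutoffs is illustrative.

```python
import statistics


def auto_cutoffs(peaks_per_cell):
    """Estimate the 'auto' QC cutoffs from per-cell peak counts.

    Assumptions (not confirmed by the help text): population standard
    deviation, no rounding, and one entry per cell in the input.
    """
    n_cells = len(peaks_per_cell)
    mean = statistics.fmean(peaks_per_cell)
    std = statistics.pstdev(peaks_per_cell)

    return {
        # 0.05% of the total cell number
        "min_cells": 0.0005 * n_cells,
        # mean - 3*std, raised to 500 when it falls below 500
        "min_peaks": max(500.0, mean - 3 * std),
        # mean + 5*std
        "max_peaks": mean + 5 * std,
    }
```

Computing these values for your own matrix gives a rough idea of how many cells the automatic filters would keep before committing to a full run.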
SCRIP impute¶
This function takes a scATAC-seq peak count matrix in H5 or MTX format and a TR or HM of interest, along with basic quality-control parameters. It outputs a pseudo-ChIP-seq peak matrix in H5AD or MTX format, which can be used as input to the SCRIP target function.
usage: SCRIP impute [-h] -i FEATURE_MATRIX -s {hs,mm} [-p PROJECT] [-f {h5ad,mtx}] --factor FACTOR [--ref_baseline REF_BASELINE] [--remove_others] [--min_cells MIN_CELLS] [--min_peaks MIN_PEAKS] [--max_peaks MAX_PEAKS] [-t N_CORES]
optional arguments:
-h, --help show this help message and exit
Input files arguments:
-i FEATURE_MATRIX, --input_feature_matrix FEATURE_MATRIX
A cell by peak matrix. h5 or h5ad supported. REQUIRED.
-s {hs,mm}, --species {hs,mm}
Species. "hs"(human) or "mm"(mouse). REQUIRED.
Output arguments:
-p PROJECT, --project PROJECT
Project name, which will be used to generate output files folder. DEFAULT: Random generate.
-f {h5ad,mtx}, --format {h5ad,mtx}
Format generate for output peak count. DEFAULT: h5ad.
Peak imputation paramater arguments:
--factor FACTOR The factor you want to impute. REQUIRED.
--ref_baseline REF_BASELINE
Remove dataset which peaks number less than this value. DEFAULT: 500.
--remove_others Remove signal not from best match. DEFAULT: False.
Other options:
--min_cells MIN_CELLS
Minimal cell cutoff for features. Auto will take 0.05% of total cell number.DEFAULT: "auto".
--min_peaks MIN_PEAKS
Minimal peak cutoff for cells. Auto will take the mean-3*std of all feature number (if less than 500 is 500). DEFAULT: "auto".
--max_peaks MAX_PEAKS
Max peak cutoff for cells. This will help you to remove the doublet cells. Auto will take the mean+5*std of all feature number. DEFAULT: "auto".
-t N_CORES, --thread N_CORES
Number of cores use to run SCRIP. DEFAULT: 16.
SCRIP target¶
This function takes a scATAC-seq peak count matrix in H5 format or a scChIP-seq peak count matrix. It outputs the regulatory potential (RP) matrix in H5AD format, which can be used to determine direct target genes.
usage: SCRIP target [-h] -i FEATURE_MATRIX -s {hs,mm} [-o OUTPUT] [-d DECAY] [-m MODEL]
optional arguments:
-h, --help show this help message and exit
Input files arguments:
-i FEATURE_MATRIX, --input_feature_matrix FEATURE_MATRIX
A cell by peak matrix. h5 or h5ad supported. REQUIRED.
-s {hs,mm}, --species {hs,mm}
Species. "hs"(human) or "mm"(mouse). REQUIRED.
Output arguments:
-o OUTPUT, --output OUTPUT
output h5ad file. DEFAULT: RP.h5ad
Other options:
-d DECAY, --decay DECAY
Range to the effect of peaks. DEFAULT: auto.
-m MODEL, --model MODEL
RP model chosen. DEFAULT: simple.
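The help text above does not spell out how the RP (regulatory potential) model uses the decay distance. As a rough illustration only, the Python sketch below assumes an exponential-decay form in which a peak's contribution halves every decay base pairs away from the gene's transcription start site. The formula, the function name simple_rp, and the 10,000 bp default are assumptions made for this example and are not confirmed to match SCRIP's implementation.

```python
def simple_rp(peaks, tss, decay=10_000):
    """Sum peak signals weighted by 2 ** (-distance / decay).

    peaks: iterable of (position, signal) pairs on the same chromosome
    as tss. Assumed form, for illustration only.
    """
    return sum(signal * 2 ** (-abs(pos - tss) / decay)
               for pos, signal in peaks)
```

Under this assumed form, a larger decay value lets distal peaks contribute more to a gene's score, while a smaller value concentrates the score on peaks near the gene.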
SCRIP config¶
This function configures the reference files that SCRIP uses. The reference files can be downloaded from Zenodo. Each index path should point to the folder produced by extracting the corresponding archive.
usage: SCRIP config [-h] [--show] [--human_tf_index HUMAN_TF_INDEX] [--human_hm_index HUMAN_HM_INDEX] [--mouse_tf_index MOUSE_TF_INDEX] [--mouse_hm_index MOUSE_HM_INDEX]
optional arguments:
-h, --help show this help message and exit
--show
--human_tf_index HUMAN_TF_INDEX
--human_hm_index HUMAN_HM_INDEX
--mouse_tf_index MOUSE_TF_INDEX
--mouse_hm_index MOUSE_HM_INDEX
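Since each index option expects an extracted folder, it can help to validate the paths before calling SCRIP config. The Python sketch below builds the command line from a mapping of option names to folders and rejects anything that is not one of the four options listed above or not an existing folder. It does not run SCRIP; the helper name build_config_command is illustrative.

```python
import shlex
from pathlib import Path

# Options listed in the SCRIP config usage above.
INDEX_FLAGS = ("human_tf_index", "human_hm_index",
               "mouse_tf_index", "mouse_hm_index")


def build_config_command(index_dirs):
    """Build a 'SCRIP config' command line from {option_name: folder}.

    Raises ValueError for unknown option names or paths that are not
    existing folders.
    """
    argv = ["SCRIP", "config"]
    for name, folder in index_dirs.items():
        if name not in INDEX_FLAGS:
            raise ValueError(f"unknown index option: {name}")
        if not Path(folder).is_dir():
            raise ValueError(f"not an extracted folder: {folder}")
        argv += [f"--{name}", str(folder)]
    return shlex.join(argv)
```

After configuring, SCRIP config --show can be used to review the stored settings.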
SCRIP index¶
This function builds a SCRIP index from your own peak files.
usage: SCRIP index [-h] -i INPUT -o OUTPUT
optional arguments:
-h, --help show this help message and exit
-i INPUT, --input INPUT
Path to the folder that includes all your bed files. The bed files should be named in "TRName_ID.bed", e.g. "AR_1.bed".
-o OUTPUT, --output OUTPUT
Path to the output folder.
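Because the input bed files must follow the "TRName_ID.bed" naming pattern, a quick check of the folder before indexing can save a failed run. The Python sketch below splits each file name on its last underscore and reports names that do not fit. That the ID part contains no underscore is an assumption based on the "AR_1.bed" example; the helper name check_bed_names is illustrative.

```python
import re
from pathlib import Path

# "TRName_ID.bed": split on the last underscore. That the ID contains no
# underscore is an assumption based on the "AR_1.bed" example.
BED_NAME = re.compile(r"^(?P<tr>.+)_(?P<id>[^_]+)\.bed$")


def check_bed_names(folder):
    """Return (valid, invalid) lists of .bed file names found in folder."""
    valid, invalid = [], []
    for path in sorted(Path(folder).glob("*.bed")):
        (valid if BED_NAME.match(path.name) else invalid).append(path.name)
    return valid, invalid
```

Files in the invalid list can be renamed before passing the folder to SCRIP index with -i.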