Integrative analytical pipeline for defining high-confidence RNA sequences/motifs bound by RBP groups. a In total, 84 CLIP-seq datasets (including PAR-CLIP and HITS-CLIP) for 48 human RBPs in HEK293/HEK293T cell lines were collected. b Several peak-calling methods (e.g. Piranha and PARalyzer) were used to call peaks from raw reads for each RBP, and the peaks obtained from different methods and biological replicates were overlapped. c The binding sites of all 48 RBPs were then merged into a single set of binding sites. RNA-sequencing (RNA-seq) data from the corresponding cell lines were used to normalize the occupancy of each binding site. d An occupancy profile matrix V (N × M) was then generated, in which each entry represents the binding affinity of one RBP (column) at one binding site (row). e The occupancy profile matrix V was decomposed into a basis matrix W (N × R) and a coefficient matrix H (R × M), where N denotes the number of binding sites, M the number of RBPs, and R the number of groups. f The coefficient matrix H was used to define the RBP components of each group and their weights. The basis matrix W was used to define the group-related binding sites (motifs) and their binding affinities.
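
The decomposition in panels e and f can be illustrated with a minimal sketch. It assumes that V ≈ WH is a non-negative matrix factorization fitted with multiplicative updates; the caption does not name the algorithm, so this is an assumption rather than the authors' method. The function name `nmf_multiplicative`, the toy dimensions, and the random toy matrix are hypothetical and stand in for the real occupancy data.

```python
import numpy as np

def nmf_multiplicative(V, R, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative V (N x M) as W (N x R) @ H (R x M).

    Assumption: Frobenius-norm multiplicative updates; the source
    caption does not state which factorization algorithm was used.
    """
    rng = np.random.default_rng(seed)
    N, M = V.shape
    W = rng.random((N, R)) + eps  # basis matrix: binding sites x groups
    H = rng.random((R, M)) + eps  # coefficient matrix: groups x RBPs
    for _ in range(n_iter):
        # Element-wise updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy occupancy matrix: 30 binding sites, 6 RBPs, 2 groups.
rng = np.random.default_rng(1)
N, M, R = 30, 6, 2
V = rng.random((N, R)) @ rng.random((R, M))

W, H = nmf_multiplicative(V, R)

# Relative reconstruction error of V by W @ H.
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Panel f, coefficient matrix: per-group RBP weights (each row sums to 1).
rbp_weights = H / H.sum(axis=1, keepdims=True)

# Panel f, basis matrix: the group with the largest loading for each site.
site_group = W.argmax(axis=1)
```

Here each row of `rbp_weights` gives the relative contribution of the RBPs to one group, and `site_group` assigns every binding site to the group with its largest loading in W. Taking the largest loading is one simple assignment rule chosen for illustration; a thresholded or weighted assignment would be an equally plausible reading of panel f.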