Frequently Asked Questions
Find answers to common questions about PalmLab tools and data.
FAQ Categories
Data Sources & Quality
Q: What are the different types of palmitoylation data sources in PalmLab?
PalmLab integrates data from multiple sources with different confidence levels:
- Experimental: Direct experimental validation from literature (highest confidence)
- Database: Curated data from SwissPalm, CysModDB, DBPTM, and PTMD
- Prediction: Computationally predicted sites with three confidence levels:
  - High Prediction - Strong computational evidence
  - Medium Prediction - Moderate confidence
  - Low Prediction - Weak evidence, requires validation
We recommend prioritizing experimental and high-confidence prediction sites for critical analyses.
Q: How is the tissue/cell line expression data curated, and what does the TSI represent?
Our tissue and cell line expression data undergo rigorous quality control:
- Data Integration: Compiled from multiple proteomics studies and public databases
- Normalization: All data normalized to enable cross-study comparisons
- Quality Filtering: Low-quality samples and outliers are removed
TSI (Tissue Specificity Index): Measures how specific a protein's palmitoylation is to particular tissues:
- High TSI (>0.8): Protein is palmitoylated in very few tissues (tissue-specific)
- Medium TSI (0.3-0.8): Moderate tissue specificity
- Low TSI (<0.3): Widely palmitoylated across many tissues (ubiquitous)
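The three TSI bands above can be expressed as a small helper. This is only a sketch: the function name `classify_tsi` is hypothetical, and treating the boundary values 0.3 and 0.8 as "medium" is an assumption based on how the ranges are written here.

```python
def classify_tsi(tsi: float) -> str:
    """Map a TSI value to the FAQ's three tissue-specificity bands."""
    if tsi > 0.8:
        return "high"    # palmitoylated in very few tissues
    if tsi < 0.3:
        return "low"     # widely palmitoylated
    return "medium"      # 0.3 to 0.8 inclusive (assumed boundary handling)


for value in (0.92, 0.55, 0.10):
    print(value, classify_tsi(value))
```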
Q: What do the conservation scores (phyloP and phastCons) indicate about palmitoylation sites?
Conservation scores help identify functionally important palmitoylation sites:
phyloP Conservation:
- Positive scores: Evolutionary conservation (higher = more conserved)
- Negative scores: Accelerated evolution
- Near zero: Neutral evolution
phastCons Conservation:
- 0-1 scale: Probability of being in conserved element
- Close to 1: Highly conserved, likely functional
- Close to 0: Fast evolving
Search & Browse Functions
Q: What input formats are supported for protein searches?
PalmLab supports multiple input formats for maximum flexibility:
Supported Identifiers:
- UniProt IDs: P01116, P04637, P35579
- Gene Symbols: KRAS, TP53, MYH9, BRAF
- Protein Names: "GTPase KRas", "Cellular tumor antigen p53"
Input Methods:
- Single entry: P01116
- Multiple entries: P01116 KRAS TP53
- Separators: Spaces, commas, tabs, or newlines
- Mixed input: UniProt IDs and gene symbols can be combined in one query, e.g. "P01116 KRAS TP53"
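If you prepare queries programmatically, the separator rules above can be mimicked with a short pre-processing step. The sketch below is not PalmLab's own parser: the function names are hypothetical, and the regular expression is the accession pattern published by UniProt. Multi-word protein names would be split on their spaces by this approach, so it suits IDs and gene symbols only.

```python
import re

# UniProt's published accession pattern (6- or 10-character accessions).
UNIPROT_RE = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def split_identifiers(raw: str) -> list[str]:
    """Split on spaces, commas, tabs, or newlines; drop empties and repeats."""
    unique = []
    for token in re.split(r"[\s,]+", raw.strip()):
        if token and token not in unique:
            unique.append(token)
    return unique


def looks_like_uniprot(token: str) -> bool:
    """True if the token has the shape of a UniProt accession."""
    return UNIPROT_RE.match(token.upper()) is not None


query = "P01116, KRAS\tTP53\nP01116"
for token in split_identifiers(query):
    kind = "UniProt ID" if looks_like_uniprot(token) else "gene symbol or name"
    print(token, "->", kind)
```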
Q: How should I interpret the color coding in protein sequence annotations?
The color coding system helps quickly identify different types of palmitoylation evidence:
Single Evidence Types:
Combined Evidence (more reliable):
Recommendation: Sites with multiple supporting evidence types (especially experimental) are more reliable for downstream analysis.
Analysis Tools
Q: What is the difference between co-occurrence and mutual exclusion in network analysis?
These relationships reveal different biological patterns:
Co-occurrence (Positive Association):
- Definition: Proteins tend to be palmitoylated together in the same samples
- Statistical indicator: odds ratio (OR) > 1, Jaccard index > 0
- Biological implication: May indicate:
  - Functional cooperation
  - Same pathway membership
  - Protein complex formation
  - Coordinated regulation
Mutual Exclusion (Negative Association):
- Definition: Proteins rarely palmitoylated together in same samples
- Statistical indicator: OR < 1, Jaccard ≈ 0
- Biological implication: May indicate:
  - Functional redundancy
  - Different cellular states
  - Compensatory mechanisms
  - Mutually exclusive pathways
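As a rough illustration of the two indicators, the sketch below computes an odds ratio and a Jaccard index for two proteins from binary palmitoylation calls over the same samples. The function name, the input encoding (1 = palmitoylated), and the 0.5 continuity correction for empty cells are assumptions made for this example; PalmLab's own computation may differ.

```python
def association(a: list[int], b: list[int]) -> tuple[float, float]:
    """Return (odds_ratio, jaccard) for two equal-length 0/1 vectors."""
    if len(a) != len(b):
        raise ValueError("both proteins must be scored on the same samples")
    both = sum(1 for x, y in zip(a, b) if x and y)
    only_a = sum(1 for x, y in zip(a, b) if x and not y)
    only_b = sum(1 for x, y in zip(a, b) if y and not x)
    neither = sum(1 for x, y in zip(a, b) if not x and not y)

    cells = [both, only_a, only_b, neither]
    if 0 in cells:
        # Assumed Haldane-Anscombe correction to avoid division by zero.
        cells = [c + 0.5 for c in cells]
    odds_ratio = (cells[0] * cells[3]) / (cells[1] * cells[2])

    union = both + only_a + only_b
    jaccard = both / union if union else 0.0
    return odds_ratio, jaccard


# Co-occurring pair: OR > 1 and Jaccard well above 0.
print(association([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 0, 1, 1]))
# Mutually exclusive pair: OR < 1 and Jaccard of 0.
print(association([1, 1, 0, 0], [0, 0, 1, 1]))
```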
Q: How should I interpret the results from Hotspot Mutation Analysis?
The mutation analysis provides statistical evidence for associations between palmitoylation and mutations:
Key Statistical Metrics:
- adjust P (Logit) < 0.05: Statistically significant after multiple testing correction
- Coef_logit > 0: Palmitoylation is associated with a higher probability of mutation
- Coef_logit < 0: Palmitoylation is associated with a lower probability of mutation
- Color coding: Green = significant positive association, Red = significant negative association
Sample Count Interpretation:
| Variable | Description | Biological Meaning |
|---|---|---|
| n1 | Mutated & palmitoylated | Samples with both features |
| n2 | Mutated & non-palmitoylated | Mutated without palmitoylation |
| m1 | Wildtype & palmitoylated | Wildtype with palmitoylation |
| m2 | Wildtype & non-palmitoylated | Wildtype without palmitoylation |
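To see how the four counts relate to the direction of an association, they can be tallied from per-sample calls and summarized as an unadjusted log odds ratio. This is a sketch only: the function names are hypothetical, the 0.5 correction for empty cells is an assumption, and the value will generally differ from the reported Coef_logit, which comes from a regression that also adjusts for cell line background.

```python
import math


def tally(samples):
    """Count n1, n2, m1, m2 from (mutated, palmitoylated) pairs."""
    n1 = n2 = m1 = m2 = 0
    for mutated, palmitoylated in samples:
        if mutated and palmitoylated:
            n1 += 1
        elif mutated:
            n2 += 1
        elif palmitoylated:
            m1 += 1
        else:
            m2 += 1
    return n1, n2, m1, m2


def crude_log_odds_ratio(n1, n2, m1, m2):
    """Unadjusted log odds ratio of mutation given palmitoylation."""
    cells = [n1, n2, m1, m2]
    if 0 in cells:
        cells = [c + 0.5 for c in cells]  # assumed continuity correction
    n1, n2, m1, m2 = cells
    return math.log((n1 * m2) / (n2 * m1))


samples = ([(True, True)] * 3 + [(True, False)] * 1
           + [(False, True)] * 2 + [(False, False)] * 4)
counts = tally(samples)
print(counts, crude_log_odds_ratio(*counts))
```

A positive value means mutation is more frequent among palmitoylated samples, and a negative value means it is less frequent.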
Q: How is the logistic regression performed in Hotspot Mutation Analysis?
The Hotspot Mutation Analysis uses Firth logistic regression to examine the association between palmitoylation status and gene mutations while controlling for cell line background effects.
Statistical Model:
The regression formula is: Mutation_status ~ Palmitoylation_status + Cell_line_background
Key Components:
- Dependent Variable: Mutation_status (binary: mutated or wildtype)
- Primary Predictor: Palmitoylation_status (binary: palmitoylated or non-palmitoylated)
- Covariates: Cell_line_background (to control for cell line-specific effects)
Cell Line Background Calculation:
To account for cell line heterogeneity, we perform Principal Component Analysis (PCA) on the mutation data across all samples. The first two principal components (PCA1 and PCA2) are extracted and included as covariates in the regression model to control for cell line-specific mutation patterns.
Analysis Procedure:
- For each protein-gene pair, collect sample-level data on protein palmitoylation status and corresponding gene mutation status
- Perform Firth logistic regression to test the association between palmitoylation and mutation
- Calculate statistical significance (P-value) to identify significant associations
- Calculate regression coefficients (Coef_logit) to determine the direction of association
Interpretation Guidelines:
- Significant P value or adjusted P value: Indicates a statistically significant association between palmitoylation and mutation
- Positive Coefficient: Palmitoylation is associated with increased probability of mutation
- Negative Coefficient: Palmitoylation is associated with decreased probability of mutation
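When post-processing exported results, the guidelines above reduce to a threshold on the adjusted P value plus the sign of the coefficient. The sketch below is illustrative only. PalmLab does not state here which multiple testing correction it applies, so the Benjamini-Hochberg procedure is used purely as a common example, and the function names and the 0.05 default are assumptions.

```python
def benjamini_hochberg(pvals: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def label_association(adj_p: float, coef_logit: float, alpha: float = 0.05) -> str:
    """Combine significance and coefficient sign into a readable label."""
    if adj_p >= alpha:
        return "not significant"
    if coef_logit > 0:
        return "significant positive"
    if coef_logit < 0:
        return "significant negative"
    return "significant, zero coefficient"


raw_p = [0.01, 0.04, 0.03, 0.005]
coefs = [1.2, -0.4, 0.8, -2.1]
for adj_p, coef in zip(benjamini_hochberg(raw_p), coefs):
    print(round(adj_p, 4), coef, label_association(adj_p, coef))
```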
Q: What is the recommended workflow for Multi-Protein Expression Pattern Analysis?
For optimal results, follow this workflow:
Step 1: Input Selection
- Small-scale exploration: Start with 5-10 proteins to understand patterns
- Pathway-based analysis: Use pre-defined cancer pathways for hypothesis-driven research
- Validation: Check the analysis summary to ensure all proteins were found
Step 2: Pattern Identification
- Heatmap patterns: Look for vertical (sample clusters) and horizontal (protein co-expression) patterns
- UMAP clusters: Identify sample groups with similar expression profiles
- Tissue specificity: Note tissue-colored labels for context
Step 3: Biological Interpretation
- Co-expression clusters: May indicate functional modules
- Tissue-specific patterns: Reveal context-dependent regulation
- Outliers: Unique samples may represent special conditions or errors
Q: How do I interpret motif discovery results and E-values?
Motif analysis identifies conserved sequence patterns around palmitoylation sites:
Key Interpretation Points:
- E-value: Expected number of equally strong motifs that would arise by chance (lower = more significant)
  - E-value < 0.05: Statistically significant
  - E-value < 0.01: Highly significant
  - E-value < 0.001: Very strong evidence
- Sequence Logo: Height indicates information content (conservation)
- Consensus Sequence: Most common amino acids at each position
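To make the sequence logo and consensus concepts concrete, the sketch below derives both from a set of aligned peptides. It assumes equal-length windows around each site, a 20-letter amino acid alphabet, and no small-sample correction, so the heights it reports may not match those drawn by a particular logo tool. The function name and the example peptides are made up for illustration.

```python
import math
from collections import Counter


def column_stats(peptides: list[str]) -> list[tuple[str, float]]:
    """Per-position (consensus residue, information content in bits)."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("peptides must be aligned to equal length")
    max_bits = math.log2(20)  # maximum for a 20-letter alphabet
    stats = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total)
                       for c in counts.values())
        consensus = counts.most_common(1)[0][0]
        stats.append((consensus, max_bits - entropy))
    return stats


windows = ["ACCK", "GCCR", "ACSK", "LCCK"]  # illustrative peptides only
stats = column_stats(windows)
print("consensus:", "".join(residue for residue, _ in stats))
for position, (residue, bits) in enumerate(stats, start=1):
    print(position, residue, round(bits, 3))
```

A fully conserved position reaches the maximum of log2(20), about 4.32 bits, while a position with evenly mixed residues approaches 0.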
Technical Issues
Q: What should I do if my protein search returns "Not Found" or validation errors?
Common issues and solutions:
Error Types:
- Not Found - Identifier not in database
- Species Mismatch - Wrong species selected
- No Expression Data - No data for selected conditions
Solutions:
- Check spelling: Verify gene symbols and UniProt IDs
- Species selection: Ensure correct species is selected
- Alternative identifiers: Try different naming conventions
- Browse function: Use browse to find correct identifiers
Q: Why are some analysis results limited or unavailable for certain proteins?
Data availability depends on several factors:
- Experimental coverage: Not all proteins have been studied for palmitoylation
- Tissue specificity: Some analyses require data from specific tissues
- Statistical power: Analyses may require minimum sample sizes
- Species limitations: Some tools are human-specific
Q: How current is the data in PalmLab and how often is it updated?
PalmLab maintains a regular update schedule:
- Major updates: Quarterly releases with new data and features
- Literature curation: Continuous addition of new experimental data
- Database synchronization: Monthly updates from external databases
- Bug fixes: Ongoing maintenance and improvements
Check the About page for the most recent update information and version details.