Frequently Asked Questions
General
What is dbLEP?
What datasets (projects) are in dbLEP now?
How is data organized in dbLEP?
How to contact us?
How to use dbLEP
How to find a project in dbLEP?
What information is provided for each project?
What is the customized filtered set?
How to customize filter criteria to get an identification set of interest?
What are PFF, PMF and Combine identification?
What detailed information is provided for each protein?
What are the protein identification parameters?
What detailed information is provided for each peptide?
What are the peptide identification parameters?
Why do some peptides have no score or XCorr?
What detailed information is provided for each spectrum?
How to search for a protein in dbLEP?
How to BLAST a protein in dbLEP?
What is IPI match according to IPI history?
What can be downloaded from the dbLEP FTP site?
General
What is dbLEP?
The Liver Expression Profile database (dbLEP) aims to be an information center for liver protein expression profiles. dbLEP currently contains three datasets, and we plan to provide more in the future. For each dataset, dbLEP provides all identification results, including the non-redundant identified proteins, all possible identified proteins, peptides and their spectra. Detailed annotation is also provided for each identified protein. With a large body of intact data, abundant links and flexible search functions, researchers can retrieve information on the proteins they are interested in by text query or by sequence similarity comparison. Besides judging the quality of identified proteins from their identified peptides and spectra, researchers can analyze the data using the annotation information. We hope dbLEP can help you step from data to knowledge.
What datasets (projects) are in dbLEP now?
The following datasets are in dbLEP now:
1. Human Fetal Liver dataset: provided by BPRC.
2. French liver dataset from the HLPP project: provided by BPRC.
3. Chinese liver dataset from the HLPP project: provided by BPRC.
The Liver Organelles dataset will come online in the future.
How is data organized in dbLEP?
dbLEP contains data on non-redundant proteins, all possible proteins, peptides and spectra. The non-redundant proteins are the results integrated by a parsimony method. All possible proteins are all proteins matched to any MSMS peptide, or to any set of peptides for PMF-containing data.
You can explore dbLEP by browsing the identified proteins or peptides, or by running BLAST or a search against all identified proteins. When browsing the data, you can obtain a filtered set by setting some parameters (see the Customized Filtered Set section).
Detailed information is available for each protein, peptide and spectrum, and can be used to judge the quality of the identification.
The data structure of dbLEP, and how users can reach each kind of information in it, is shown in the following picture.
How to contact us?
Postal Address: 33, Life Science Park Road, Changping District, Beijing, 102206, China
Telephone: 86-010-80727777
Fax: 86-010-80705155
If you have any problems or suggestions, please do not hesitate to contact us at any time.
How to use dbLEP
How to find a project in dbLEP?
There are two ways to find a project of interest in dbLEP:
1. All projects are listed on the home page, each with its summary. Follow the link on a project's name to open it.
2. All projects are listed by name only under the “Projects” link in the menu. Follow the link on a project's name to open it.
What information is provided for each project?
For each project, dbLEP provides identification result lists and key explanatory information.
The project detail page offers three kinds of identification lists: the list of all identified proteins, the peptide list and the non-redundant protein list. Follow the corresponding link to view each list in detail. The method used to derive the non-redundant proteins is described in the project's processing methods. So that researchers can extract the subset of identification results that meets a specific standard, dbLEP also provides a filtered search through the “customized filtered set” link for each identification list.
dbLEP provides the following explanatory information for each project:
General Information: title, start and end time, and director of the project.
Sample Information: detailed information on the sample, including species, organ, tissue, cell line, cell type, subcellular location, disease state, etc.
Method Description: description of the methods used to obtain the identifications, including the name and version of the search database and the threshold used to obtain the final identifications. The processing method describes in detail how quality is controlled, how all results are evaluated, how redundancy is removed, how representative proteins are chosen, and so on.
Statistical Data of Identification:
Protein Count: how many proteins have been identified in this project.
Unique Peptide Count: how many unique peptides have been identified in this project. Unique means that only peptides with different sequences are counted.
Unique MSMS Peptide Count: how many unique MS/MS peptides have been identified in this project.
Spectrum Count: how many spectra, including MS and MS/MS spectra, have identified peptides or proteins in the final identification result.
Non-Redundant Protein Count: how many non-redundant proteins have been identified, based on the redundancy removal method described for this project.
Non-Redundant Protein Count With Two Or More Unique Identifications: for a given identified protein, the unique identification count is the number of different peptide sequences identified by MSMS. If the protein is also identified by PMF-containing data, 1 is added to the unique identification count.
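The counting rule for unique identifications can be illustrated with a minimal Python sketch. The function and field names (`unique_identification_count`, `msms_peptides`, `has_pmf`) are hypothetical and chosen for illustration only; they are not taken from dbLEP itself.

```python
def unique_identification_count(msms_peptides, has_pmf):
    """Count unique identifications for one protein.

    msms_peptides: peptide sequences identified by MSMS (may contain repeats).
    has_pmf: True if the protein is also identified by PMF-containing data.
    """
    distinct_sequences = set(msms_peptides)  # repeated sequences count once
    return len(distinct_sequences) + (1 if has_pmf else 0)


def count_with_two_or_more(proteins):
    """Count proteins whose unique identification count is at least 2."""
    return sum(
        1
        for protein in proteins
        if unique_identification_count(protein["msms_peptides"], protein["has_pmf"]) >= 2
    )
```

Under this rule, a protein with one distinct MSMS peptide sequence plus a PMF identification reaches a count of 2, while a protein with the same peptide sequence identified many times by MSMS alone stays at 1.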
What is the customized filtered set?
The identification results are selected using the threshold described for each project. However, to assess the credibility of identifications, researchers sometimes need to set their own standards and check which identification results meet them. To meet this need, dbLEP provides a customized filter search. Researchers can set several filter conditions and obtain the customized filtered set after submission.
How to customize filter criteria to get an identification set of interest?
Follow the “customized filtered set” link under each identification list on the project detail page. You can customize filter criteria for the identified protein list, the peptide list and the non-redundant protein list by choosing identification parameters, relational operators and logical operators, and by entering the corresponding values. Given the identification parameters provided, only integer and float values are valid input.
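As an illustration of how such criteria combine, here is a minimal Python sketch. It assumes that conditions are evaluated left to right; how dbLEP itself handles operator precedence is not stated here. The parameter names in the usage note and all function names are hypothetical.

```python
import operator

# Relational operators a filter condition may use (assumed set).
RELATIONS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
}


def parse_value(text):
    """Accept only integer or float input; anything else raises ValueError."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def matches(record, criteria):
    """Evaluate criteria left to right against one record.

    criteria: list of (logical_op, parameter, relation, value_text).
    The logical_op of the first condition is ignored.
    """
    result = None
    for logical_op, parameter, relation, value_text in criteria:
        outcome = RELATIONS[relation](record[parameter], parse_value(value_text))
        if result is None:
            result = outcome
        elif logical_op == "AND":
            result = result and outcome
        elif logical_op == "OR":
            result = result or outcome
        else:
            raise ValueError(f"unknown logical operator: {logical_op}")
    return True if result is None else result


def filtered_set(records, criteria):
    """Return the records that satisfy the criteria."""
    return [record for record in records if matches(record, criteria)]
```

For example, the criteria `[(None, "coverage", ">=", "20"), ("AND", "unique_peptide_count", ">=", "2")]` would keep only records with at least 20 percent coverage and at least two unique peptides.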
What are PFF, PMF and Combine identification?
According to the level of the source spectrum, identifications are classified into PFF (Peptide Fragment Fingerprint), PMF (Peptide Mass Fingerprint), and Combine, the combination of PMF and PFF.
What detailed information is provided for each protein?
dbLEP provides three kinds of information for each protein: general information, identification information and protein annotation.
In the general information, the protein sequence is shown with the identified peptides marked in grey.
In the identification information, the statistical data and the detailed PFF and PMF/Combine identification information are provided in three separate tabs. Several protein and peptide identification parameters are provided here. Following the link on a peptide sequence takes you to the peptide detail page, where you can find information on this peptide, such as which other proteins have been identified by the same peptide. Following the spectrum link takes you to the spectrum detail page, where you can check the spectrum graph with b and y ions marked.
In the protein annotation, in addition to basic annotation such as function and family, many links to related information in other databases are provided.
What are the protein identification parameters?
dbLEP provides the following identification parameters for each protein:
Rank: from the results of Mascot for PMF or Combine identification.
Protein group: all proteins that match the peptides from the same MS spectrum.
MS Protein Confidence: confidence of the protein identification from the mass spectrum, for PMF or Combine identification.
Expect: from the results of Mascot for PMF or Combine identification.
Score: from the results of Mascot for PMF or Combine identification.
Rescore: a new score assigned to the identified results after recalibration, as described in the project's processing methods, for PMF or Combine identification.
Spectrum: a link to the spectrum detail page.
Coverage: percentage of amino acids in the protein that are covered by its identified peptides.
Coverage By MSMS Peptide: percentage of amino acids in the protein that are covered by its identified MSMS peptides.
Unique Peptide Count: number of peptides with different sequences.
MSMS Peptide Count: number of MSMS peptides.
Unique MSMS Peptide Count: number of MSMS peptides with different sequences.
PMF/Combine Count: number of PMF/Combine identified results.
Spectrum Count: number of MSMS spectra in PFF identification and MS spectra in PMF identification.
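The Coverage parameter can be illustrated with a minimal Python sketch. It assumes that peptides are located by exact substring matching and that residues covered by overlapping peptides are counted once; the exact procedure used by dbLEP is not described here, and the function name is hypothetical.

```python
def coverage_percent(protein_sequence, peptides):
    """Percentage of residues in protein_sequence covered by the peptides."""
    if not protein_sequence:
        return 0.0
    covered = [False] * len(protein_sequence)
    for peptide in peptides:
        if not peptide:
            continue  # an empty peptide covers nothing
        # Mark every occurrence of the peptide, including overlapping ones.
        start = protein_sequence.find(peptide)
        while start != -1:
            for position in range(start, start + len(peptide)):
                covered[position] = True
            start = protein_sequence.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein_sequence)
```

Passing only the MSMS peptides to the same function gives the corresponding value for Coverage By MSMS Peptide.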
What detailed information is provided for each peptide?
First, dbLEP lists the proteins identified by this peptide, separately for PFF and PMF/Combine identification. Next, it reports how many times the peptide is identified in PFF and in PMF/Combine identification. Finally, the detailed identifications are listed with several identification parameters, for PFF identification only.
What are the peptide identification parameters?
dbLEP provides the following identification parameters for each peptide:
Peptide Confidence: calculated confidence for the peptide.
Search Engine: SEQUEST or Mascot.
XCorr: from the results of SEQUEST.
Score: from the results of Mascot.
Delta-CN: from the results of SEQUEST.
Charge: from the results of SEQUEST.
MH+: from the results of SEQUEST.
Diff(MH+): from the results of SEQUEST.
Rank: from the results of SEQUEST/Mascot.
SP: from the results of SEQUEST.
RSP: from the results of SEQUEST.
Ions: from the results of SEQUEST.
Spectrum: a link to the spectrum detail page.
Modification: variable post-translational modifications of the identified peptide.
PFF Count: number of identifications by MSMS.
PMF/Combine Count: number of PMF/Combine identified results.
Top score: the top score of a given peptide across all its identifications from the Mascot search engine.
Top XCorr/Charge1: the top XCorr of a given peptide with charge 1 across all its identifications from the SEQUEST search engine.
Top XCorr/Charge2: the top XCorr of a given peptide with charge 2 across all its identifications from the SEQUEST search engine.
Top XCorr/Charge3: the top XCorr of a given peptide with charge 3 across all its identifications from the SEQUEST search engine.
Why do some peptides have no score or XCorr?
A peptide in PMF identification has neither XCorr nor score. Only peptides in PFF identification from Mascot have the score parameter, and only peptides in PFF identification from SEQUEST have the XCorr parameter.
What detailed information is provided for each spectrum?
On the spectrum page, dbLEP provides the spectrum graph, with the relatively high peaks labelled with intensity, m/z, and y and b ions. dbLEP also provides related explanatory information, including MS level, precursor, instrument, spectrum processing method, search engine settings and the technical workflow of the corresponding experiment.
How to search for a protein in dbLEP?
Follow the “Protein Search” link to reach the protein search page.
“Protein Blast” takes you to the local wwwblast server, where you can search for proteins by sequence similarity.
In “Protein Search”, you can set search conditions including IPI id, protein name, Swiss-Prot id, subcellular location and project. Only one IPI id is allowed in this condition; to search for proteins with several IPI ids, use “Protein Batch Search”. You can choose a subcellular location of interest from the drop-down menu. The blank entry in this menu means the subcellular location is not restricted. The conditions are combined with a logical “AND”.
In “Protein Batch Search”, you can enter IPI ids separated by blanks, tabs, commas or carriage returns to retrieve several proteins of interest at one time.
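To illustrate the separators accepted by the batch search, the following minimal Python sketch splits an input string into individual IPI ids. The function name is hypothetical, and dropping repeated ids is an assumption made for this sketch, not documented dbLEP behaviour.

```python
import re


def parse_ipi_ids(text):
    """Split batch input on blanks, tabs, commas, carriage returns or newlines."""
    tokens = re.split(r"[\s,]+", text.strip())
    ids = []
    seen = set()
    for token in tokens:
        # Skip empty tokens and ids already seen (assumed deduplication).
        if token and token not in seen:
            seen.add(token)
            ids.append(token)
    return ids
```

An input such as `"IPI00385715, IPI00398991"` and the same two ids separated by a tab or a line break would all yield the same list of two ids.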
How to BLAST a protein in dbLEP?
“Protein Blast” takes you to the local wwwblast server, where you can search for proteins by sequence similarity.
1. Go to the protein search page through the “Protein Search” link in the menu.
2. Click the “Protein Blast” link to go to the wwwblast home page, where you can run BLAST as usual.
What is IPI match according to IPI history?
As research progresses, IPI ids do not stay fixed. In a new version, an IPI id may be propagated to another IPI id. The IPI database therefore provides IPI history files that record this information.
When a protein is searched by IPI id, dbLEP first finds all related IPI ids in the searched dataset based on the IPI propagation information, then runs the search with these ids, and finally returns all corresponding records. If such a match occurs, the IPI match information based on IPI history is shown above the search result under the name “IPI ID Match Information”. The “Input IPI ID” column displays the IPI id entered by the user. The “Matched IPI ID in Databse (based on IPI History)” column displays the IPI id actually searched, based on IPI history.
For example, when you search IPI00385715 in HLPP French project, the “IPI ID Match Information” tells you that IPI00398991 is actually used for the search, because IPI00385715 had been propagated to IPI00398991 in version 3.10 of the IPI database and IPI00398991 is identified in Human Fetal liver.
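The lookup described above can be sketched in Python as follows. This is an illustration only: it assumes the IPI history has already been loaded into a dictionary mapping each old id to the id it was propagated to, and that propagation chains are followed forward. The real IPI history file format and the actual dbLEP implementation are not described here, and the function name is hypothetical.

```python
def resolve_ipi_ids(input_id, history, dataset_ids):
    """Return the ids to search for: the input id and the ids it was
    propagated to, restricted to ids present in the searched dataset.

    history: dict mapping an old IPI id to the id it was propagated to (assumed form).
    dataset_ids: set of IPI ids identified in the searched dataset.
    """
    related = []
    seen = set()
    current = input_id
    # Follow the propagation chain; the seen set guards against cycles.
    while current is not None and current not in seen:
        seen.add(current)
        related.append(current)
        current = history.get(current)
    return [ipi_id for ipi_id in related if ipi_id in dataset_ids]
```

With a history containing the propagation of IPI00385715 to IPI00398991, and a dataset in which only IPI00398991 is identified, a search for IPI00385715 resolves to IPI00398991.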
What can be downloaded from the dbLEP FTP site?
You can reach the FTP site through the Download link in the menu. The FTP site is organized by project. For each project, you can download all identification results, the non-redundant protein list and a FASTA file of all identified proteins. A summary of the technical workflow used in each project is also provided. The file FileTile.txt in the root directory explains what each column means in the identification result file and in the non-redundant protein list file.
Beijing Proteome Research Center
Copyright 2007 All Rights Reserved