Search FAQs

Q: What type of searching does Harbor Consulting offer?

A: Our experienced staff offers biosequence, chemical structure, and full text searching of various patent and non-patent databases.

Q: What databases do we use?

A: After a detailed review of all the available patent and non-patent databases, we determined that the proprietary STN platform of databases, supplemented with NCBI searches, is the most comprehensive:

Biosequence Databases:

Chemical Structure Databases:


Q: Why do we use STN/NCBI?

A: We use the proprietary STN platform of databases because it contains approximately 50% more data than other private and public databases. In addition, the STN platform has several value-added features, including manual indexing of certain records not found in other databases. We supplement STN with NCBI searches because, although public, NCBI occasionally contains newer information that has not yet been incorporated into STN.

Q: Why don’t we use only free, public biosequence databases?

A: The free, public databases, such as GenBank, EMBL, SWISSPROT, and NCBI, contain approximately 50% less sequence data than STN, and they depend on the public to diligently and accurately upload and update sequence data. STN, on the other hand, includes several value-added features, such as inclusion of sequences that are referenced but not literally disclosed in patent applications, sequences associated with accession numbers disclosed within patent applications, sequences disclosed in patent applications that do not make their way into a Sequence Listing, and record annotation for enhanced searching capabilities.

Q: Why don't we use other private databases (Genome Quest, PatBase, SQIP, etc.)?

A: We do not use other private databases because, although their interfaces are generally user-friendly, their data is simply reconstituted public data: the same data retrievable from NCBI, GenBank, etc.

Q: What is your cost structure/how do you bill for performing searches?

A: All searches are quoted and authorized before we begin work on any particular project.  We base our searching cost structure(s) on a combination of database fees, online time, and tech charges to perform a search, all of which we estimate prior to commencement of a project.  Biosequence, chemical structure, and full text searches are all estimated differently due to varying database fees and the amount of tech time required. 

Q: What is your general turnaround time for performing searches?

A: We treat every search with the highest priority, and you will generally receive the preliminary results within a couple of days.

Q: In what format do I receive my results/what do my results look like?

A: We provide you with a Word document containing a summary table of all records pulled, including a hyperlink to the patent family information. The raw data, whether biosequence, chemical structure, or full text, is also included, with various value-added annotations.

Q: What type of information can we retrieve?

A: Whether for biosequence, chemical structure, or full text results, we can retrieve most bibliographic information, including the record description (patent number, applicant, priority information, journal name, etc.) and patent claims. With biosequence searches, we also retrieve the homology score, sequence alignment, and patent sequence location. For chemical structure searches, we include the Markush structure.

Q: Can we search RNA?

A: Yes; however, the database will change every uracil to thymine in the sequence, because RNA sequences are indexed as DNA sequences in the various biosequence databases.
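As a rough illustration of that conversion, the sketch below normalizes an RNA query to the DNA alphabet. It is not the database's own code; the function name `rna_to_dna_query` and the sample sequence are made up for this example.

```python
def rna_to_dna_query(sequence: str) -> str:
    """Replace uracil (U) with thymine (T), preserving letter case.

    Hypothetical helper for illustration only.
    """
    return sequence.translate(str.maketrans("Uu", "Tt"))


# Made-up RNA query sequence.
print(rna_to_dna_query("AUGGCUUAA"))  # ATGGCTTAA
```

Results for an RNA query will therefore display T wherever the original sequence had U.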

Q: Can we search with limitations and/or variability?

A: Yes, we can search sequences with various limitations and/or variability including, but not limited to, date, sequence length, genus/species, keywords, and base/residue substitution. Often, it is best to perform the initial search with no limitations and then narrow large result sets with relevant keywords.

Q: Can you search for consecutive base/residue groupings (e.g., 1-19, 2-20, 3-21, etc., out of a 26-mer)?

A: Yes, but it is necessary to manually review the initial results for such consecutive groupings.
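To make the groupings concrete, here is a minimal sketch that enumerates every consecutive 19-residue window of a 26-mer. The function name `consecutive_groupings` and the example sequence are hypothetical, and this only lists the windows a reviewer would look for; it does not reproduce how any database performs the search.

```python
def consecutive_groupings(sequence: str, window: int) -> list[tuple[int, int, str]]:
    """Return (start, end, subsequence) for each consecutive window.

    Positions are 1-based and inclusive, matching the 1-19, 2-20, ... notation.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        (i + 1, i + window, sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


query = "ACGTACGTACGTACGTACGTACGTAC"  # made-up 26-mer
for start, end, sub in consecutive_groupings(query, 19):
    print(f"{start}-{end}: {sub}")
```

For a 26-mer and a window of 19, this yields eight groupings, from 1-19 through 8-26.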

Q: How will recent EP sequence listing interpretations affect the way data is indexed and the search results?

A: The recent change in the interpretation of WIPO ST.25, paragraph 18, affects only protein sequences. The EP now requires that single-position variable amino acids be disclosed as “one defined” residue, with an annotation that the position may be selected from a specific group of residues. For example, if position 2 of a sequence may be Lys, Ser, Gly or Ala, Lys would be “literally” disclosed, and the position would be annotated as replaceable by Ser, Gly or Ala. Because a constant residue (Lys) is required at this position, rather than a variable (Xaa) defined as any of the four residues, sequences indexed in various databases will be overly limited, and this will undoubtedly affect search results. We believe that the new requirement is a misinterpretation of the last sentence of paragraph 18 (see below), which refers to the use of one Xaa at one position, and not to residue variation at a single position. In other words, that sentence says only that one Xaa may not encompass 1-5 positions/amino acids.

18. Modified and unusual amino acids shall be represented as the corresponding unmodified amino acids or as “Xaa” in the sequence itself if the modified amino acid is one of those listed in Appendix 2, Table 4, and the modification shall be further described in the feature section of the sequence listing, using the codes given in Appendix 2, Table 4. These codes may be used in the description or the feature section of the sequence listing but not in the sequence itself (see also paragraph 32). The symbol “Xaa” is the equivalent of only one unknown or modified amino acid.
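The indexing effect described in the answer above can be sketched with a simplified comparison. This is an illustration under stated assumptions, not how STN or any other database scores hits: real biosequence searches use homology scoring rather than the exact position-by-position matching shown here, and the three-residue sequences, the `matches` function, and the `xaa_definitions` mapping are all hypothetical. Only the set of residues allowed at position 2 comes from the example in the answer.

```python
# Residues allowed at position 2, taken from the example in the answer.
ALLOWED_AT_2 = {"Lys", "Ser", "Gly", "Ala"}

# Record as it would be indexed with one defined residue: Lys is literal.
fixed_record = ["Met", "Lys", "Thr"]
# Record with a variable position: Xaa defined as any of the four residues.
variable_record = ["Met", "Xaa", "Thr"]
xaa_definitions = {1: ALLOWED_AT_2}  # key is the 0-based index of the Xaa


def matches(query, record, definitions=None):
    """Exact position-by-position match (a deliberate simplification).

    An Xaa in the record matches any query residue in its defined set.
    """
    definitions = definitions or {}
    if len(query) != len(record):
        return False
    for i, (q, r) in enumerate(zip(query, record)):
        if r == "Xaa":
            if q not in definitions.get(i, set()):
                return False
        elif q != r:
            return False
    return True


query = ["Met", "Ser", "Thr"]  # Ser is one of the permitted alternatives
print(matches(query, fixed_record))                      # False
print(matches(query, variable_record, xaa_definitions))  # True
```

Under this simplified model, a query carrying one of the permitted alternatives finds the Xaa-indexed record exactly but misses the record indexed with the literal Lys.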