Yahoo! Search Marketing

Getting Started Guide: KeywordResearchService Overview

KeywordResearchService provides operations for generating keyword recommendations based on advertiser bidded keyword data.

About Keyword Research

KeywordResearchService generates keyword recommendations by mining advertiser bidded keyword data. The service offers several ways for you to fine-tune your results:

  • Relevance Feedback: Interactively accept or reject recommended keywords.
  • Subphrase Filters: Include only keywords that contain specified phrases, or exclude keywords that contain them (for example, products or brands).
  • Excluded Keywords: Exclude an explicit list of keywords (for example, keywords already in an ad group).
  • Single-Page Crawl: Extract keywords from a specific web page.
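
To make the difference between subphrase filters and excluded keywords concrete, here is a minimal local sketch. The service applies its own filtering on the server side; the apply_filters helper, its parameter names, and the case-insensitive substring matching are assumptions made only for illustration, not the service's actual behavior.

```python
from typing import Iterable, List


def apply_filters(
    keywords: Iterable[str],
    include_phrases: Iterable[str] = (),
    exclude_phrases: Iterable[str] = (),
    excluded_keywords: Iterable[str] = (),
) -> List[str]:
    """Illustrative filter: hypothetical helper, not part of the service."""
    include = [p.casefold() for p in include_phrases]
    exclude = [p.casefold() for p in exclude_phrases]
    excluded = {k.casefold() for k in excluded_keywords}

    result = []
    for keyword in keywords:
        text = keyword.casefold()
        # Excluded keywords: drop exact matches from an explicit list.
        if text in excluded:
            continue
        # Include filter: keep only keywords containing at least one phrase.
        if include and not any(p in text for p in include):
            continue
        # Exclude filter: drop keywords containing any listed phrase.
        if any(p in text for p in exclude):
            continue
        result.append(keyword)
    return result


suggestions = ["cheap dvds", "dvd player", "blu-ray player", "dvd box set"]
filtered = apply_filters(
    suggestions,
    include_phrases=["dvd"],
    exclude_phrases=["player"],
    excluded_keywords=["cheap dvds"],
)
print(filtered)  # ['dvd box set']
```

In this sketch, an excluded keyword must match a suggestion exactly, while a subphrase filter matches any suggestion that contains the phrase.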

KeywordResearchService returns suggested keywords and additional data such as the search canonical and phrase canonical forms of the keywords and the forecasted search volume for the next 30 days (expressed as a range).

All the values for a specific range data point are considered and split into buckets (typically 5 to 10 buckets per data point). Buckets are calculated per market. Use the getRangeDefinitions operation to determine the lower and upper bounds of each bucket.

When the service returns a range value with a suggested keyword (for example, forecasted search volume), it returns the bucket number along with the min and max values for that bucket. You can also obtain the full description of bucket ranges for a specific data point and market.
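
As a rough illustration of how bucketed ranges work, the sketch below models a range definition as a list of buckets and looks up the bucket for a raw value. The Bucket type, the find_bucket helper, and the boundary values are all hypothetical; actual boundaries are calculated per market and come from getRangeDefinitions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Bucket:
    number: int      # bucket number returned with a suggested keyword
    min_value: int   # lower bound of the bucket (inclusive)
    max_value: int   # upper bound of the bucket (inclusive)


# Hypothetical boundaries for the "Searches" range in a single market.
SEARCHES_BUCKETS: List[Bucket] = [
    Bucket(1, 0, 100),
    Bucket(2, 101, 1_000),
    Bucket(3, 1_001, 10_000),
    Bucket(4, 10_001, 100_000),
    Bucket(5, 100_001, 1_000_000),
]


def find_bucket(value: int, buckets: List[Bucket]) -> Optional[Bucket]:
    """Return the bucket whose [min_value, max_value] contains value, or None."""
    for bucket in buckets:
        if bucket.min_value <= value <= bucket.max_value:
            return bucket
    return None


bucket = find_bucket(2_500, SEARCHES_BUCKETS)
print(bucket.number, bucket.min_value, bucket.max_value)  # 3 1001 10000
```

Because the service reports a bucket rather than an exact figure, a keyword in bucket 3 of this made-up definition would be described only as falling between 1,001 and 10,000 searches.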

Range Definitions

To obtain the range definition, use the getRangeDefinitions operation. A range definition represents a distinct set of data that is made available to the getPageRelatedKeywords and getRelatedKeywords operations.

Currently, the KeywordResearchService supports a single range definition, Searches, which represents search volume data for bidded keywords. Future versions of the service may include other range definitions.

Keyword Suggestions

The GetPageRelatedKeywords and GetRelatedKeywords operations generate keyword suggestions for advertisers. Both operations use the same suggestion algorithm, but they differ in where the "seed keywords" come from:

  • As "seed keywords" for the algorithm, GetPageRelatedKeywords takes your input phrases (positive keywords, negative keywords, excluded keywords, and so on) and a URL that is used to crawl and extract terms relevant to the concept of the web page (but unrelated to your input phrases). The KeywordResearchService then uses your input phrases and the extracted terms to generate suggestions.
  • As "seed keywords" for the algorithm, GetRelatedKeywords takes only your input phrases and uses them to generate suggestions.

To obtain the most relevant suggestion set, take advantage of the iterative nature of the KeywordResearchService. Once either operation returns a keyword set, you can "accept" and "reject" keywords based on their conceptual relevance. This refinement is highly effective, and you should use it to produce the highest-quality suggestion set.

Given this, use the GetPageRelatedKeywords operation to return the initial suggestion set, and then use the GetRelatedKeywords operation as often as needed to refine it. In general, do not use GetPageRelatedKeywords to refine the keyword set after the initial request has been made, for two reasons:

  • Once GetPageRelatedKeywords has crawled a page and extracted the terms, repeating this effort is costly and of little use; a re-crawl will produce the same results (except for web pages with rapidly changing content).
  • Because a re-crawl produces the same results, using GetPageRelatedKeywords a second time will dilute your input phrases with terms you may have already rejected as not relevant to your business.
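
The recommended call pattern can be sketched as two request builders: one for the initial crawl-based request and one for every later refinement. The dictionaries below are plain illustrations, not the service's actual request format; the field names (url, positiveKeywords, negativeKeywords) and the choice to carry accepted keywords forward as positive keywords are assumptions.

```python
from typing import Dict, Iterable


def build_initial_request(url: str, positive: Iterable[str]) -> Dict[str, object]:
    """First call: crawl the page once and seed it with positive keywords."""
    return {
        "operation": "GetPageRelatedKeywords",
        "url": url,  # hypothetical field name
        "positiveKeywords": list(positive),  # hypothetical field name
    }


def build_refinement_request(
    positive: Iterable[str],
    accepted: Iterable[str],
    rejected: Iterable[str],
) -> Dict[str, object]:
    """Later calls: no URL is sent, so the page is not crawled again."""
    return {
        "operation": "GetRelatedKeywords",
        # Assumption: accepted suggestions are fed back as positive keywords.
        "positiveKeywords": list(positive) + list(accepted),
        # Rejected suggestions are sent as negative keywords.
        "negativeKeywords": list(rejected),
    }


initial = build_initial_request("www.myUrl.example.com", ["DVD", "Cheap DVDs"])
refinement = build_refinement_request(
    positive=["DVD", "Cheap DVDs"],
    accepted=["DVD sale"],
    rejected=["DVD player", "Electronics"],
)
print(initial["operation"])     # GetPageRelatedKeywords
print(refinement["operation"])  # GetRelatedKeywords
print("url" in refinement)      # False
```

Keeping the URL out of the refinement request is the point of the pattern: the extracted terms cannot re-enter the seed set once you have rejected them.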

Example

Suppose you run a web site that sells DVDs. Your web page may also contain keywords like "DVD player" or "electronics". Let's say you make a request using GetPageRelatedKeywords with these inputs:

URL: www.myUrl.example.com

Positive Keywords

  • "DVD"
  • "DVD for sale"
  • "Cheap DVDs"

First, the service crawls and extracts terms from the web page. Possible results might be:

Extracted Terms (treated as positive keywords)

  • "DVD"
  • "DVD player"
  • "Electronics"
  • "Free shipping"
  • "DVD sale"

Next, the service appends the extracted terms to your positive keywords and submits them as "seed keywords" for the algorithm.

Seed Keywords (positive keywords and extracted terms)

  • "DVD"
  • "DVD for sale"
  • "Cheap DVDs"
  • "DVD player"
  • "Electronics"
  • "Free shipping"
  • "DVD sale"
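
The merge shown above can be reproduced with a short sketch. The merge_seed_keywords helper is hypothetical, and the case-insensitive de-duplication rule is an assumption; the example only shows that "DVD", which appears in both lists, ends up in the seed list once.

```python
from typing import Iterable, List


def merge_seed_keywords(positive: Iterable[str], extracted: Iterable[str]) -> List[str]:
    """Append extracted terms to the positive keywords, skipping duplicates."""
    seen = set()
    seeds = []
    for keyword in list(positive) + list(extracted):
        key = keyword.casefold()  # assumed: duplicates are compared case-insensitively
        if key not in seen:
            seen.add(key)
            seeds.append(keyword)
    return seeds


positive_keywords = ["DVD", "DVD for sale", "Cheap DVDs"]
extracted_terms = ["DVD", "DVD player", "Electronics", "Free shipping", "DVD sale"]

seeds = merge_seed_keywords(positive_keywords, extracted_terms)
print(seeds)
# ['DVD', 'DVD for sale', 'Cheap DVDs', 'DVD player', 'Electronics', 'Free shipping', 'DVD sale']
```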

The generated suggestion set will most likely include the keywords "DVDs" and "DVD players". But you sell DVDs, not electronics, so you decide to refine your keyword set by taking advantage of the iterative nature of the KeywordResearchService. Let's say you make a second "refinement" request using GetPageRelatedKeywords with these inputs:

URL: www.myUrl.example.com

Positive Keywords

  • "DVD"
  • "DVD for sale"
  • "Cheap DVDs"

Negative Keywords

  • "DVD player"
  • "Electronics"

If you submit these keywords along with your original URL, the KeywordResearchService will needlessly crawl that same web page again, extract the same terms, and reintroduce them as positive keywords, including terms that you have already determined are not relevant to your business. This needless re-crawling incurs additional overhead, increases response times, and dilutes the effectiveness of the algorithm by submitting the same terms as both positive and negative keywords.
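
The conflict is easy to see if you compare the re-extracted terms against your negative keywords. The sketch below uses the lists from this example; the conflicting_terms helper and its case-insensitive comparison are illustrative assumptions, not part of the service.

```python
from typing import Iterable, List


def conflicting_terms(
    positive: Iterable[str],
    extracted: Iterable[str],
    negative: Iterable[str],
) -> List[str]:
    """Return seed terms that would also be submitted as negative keywords."""
    negatives = {keyword.casefold() for keyword in negative}
    seeds = list(positive) + list(extracted)
    return [keyword for keyword in seeds if keyword.casefold() in negatives]


positive_keywords = ["DVD", "DVD for sale", "Cheap DVDs"]
negative_keywords = ["DVD player", "Electronics"]
# Terms a second crawl of the same page would extract again.
re_extracted_terms = ["DVD", "DVD player", "Electronics", "Free shipping", "DVD sale"]

conflicts = conflicting_terms(positive_keywords, re_extracted_terms, negative_keywords)
print(conflicts)  # ['DVD player', 'Electronics']
```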

If, however, you make your second "refinement" request using GetRelatedKeywords, with only the positive and negative keywords as input, you avoid the costly overhead of re-crawling the page to extract identical content, shorten the overall response time, and generate a much more tightly related set of keyword suggestions.