URL Query String parameters
- What are URL parameters
- Why URL parameters must be managed for proper SEO
- Typical functions of URL parameters
- Put in order
- To specify
- Number the pages
- To translate
- Track visits
- Multiple parameters
- How parameters are managed
- Rel canonical
- Google Search Console
- Rel nofollow
- Disallow robots.txt
- URL Fragment (#) Hash
- SEO guidelines
What are URL parameters
Short answer: a URL parameter is the part of the URL that follows the question mark.
Long answer: on the web, a query string (also called a search string or URL parameter) is the part of a URL (Uniform Resource Locator) that carries data that does not fit conveniently into a hierarchical path structure such as /folder_1/folder_2/.
The query string is made of two fields: the parameter and its value. These fields are added to a base URL, for example when a completed form is submitted or a filter is applied to a product listing on an eCommerce site.
A web server can handle an HTTP request by reading a file from its file system based on the URL path, or it can handle the request using the query string (the parameters) together with the preceding part of the URL.
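The syntax of the query string is not formally standardized, but the following scheme is a de facto standard because browsers and scripting languages implement it: parameter=value pairs appended to the base URL after a question mark, for example http://www.example.com/scarpe?colore=green.
The full URL consists of the following elements: the protocol (http://), the domain (www.example.com), the path (/scarpe), the question mark that opens the query string, and the parameter (colore) joined to its value (green) by the “=” character.
Multiple parameters can be concatenated with the “&” character.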
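As a minimal sketch of the structure just described, Python’s standard urllib.parse module can split a URL into its elements and decode the query string. The helper name describe_url and the second parameter page are assumptions added for illustration only.

```python
from urllib.parse import urlsplit, parse_qs


def describe_url(url):
    """Split a URL into its main elements and decode the query string."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.netloc,
        "path": parts.path,
        # parse_qs maps each parameter name to the list of its values
        "parameters": parse_qs(parts.query),
    }


# Hypothetical URL with two parameters joined by "&"
info = describe_url("http://www.example.com/scarpe?colore=green&page=2")
for name, value in info.items():
    print(name, "=", value)
```

Each parameter maps to a list because the same name may legally appear more than once in a query string.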
Why URL parameters must be managed for proper SEO
Search engine indexing is always based on the individual URL. Any URL can potentially be indexed.
The address http://www.example.com/scarpe is different from http://www.example.com/scarpe?colore=green. If Googlebot finds links to both URLs while crawling the site, it may decide to index both pages. That is not a good result, because we would end up with two identical pages in the index.
Parameters are used for many different functions within a website. Depending on the specific function of a parameter, you need to evaluate whether the generated URL should be indexed or excluded from crawling and indexing. More precisely, you have to decide which parameters to keep in the rel canonical tag.
For good SEO it is important to manage parameters in order to improve indexing and crawl budget, and so prevent search engines from indexing unwanted and duplicate pages. The crawl budget is then spent on canonical (and useful) pages.
Typical functions of URL parameters
URL parameters are used by webmasters and developers for a variety of purposes. Each CMS uses parameters in its own way, so it is your job to understand the function of every parameter you encounter during your SEO audits.
Let’s look in detail at the possible functions of a query string, and at which ones should be indexed and which excluded.
Put in order
One of the most common functions of a parameter is sorting the results on a listing page, such as sorting products by price.
Considerations: the goal is to avoid clogging the Google index with copies of the same page, which add no value to what is already indexed.
A category of an eCommerce site always contains the same items, whether or not the products are sorted by price. It would not make sense to have the page indexed with the parameter, as it would result in a duplicate page. For this reason, sorting parameters are usually not indexed, so they must be excluded from the rel canonical tag.
To narrow down
Parameters can be used to narrow down results, for example to a specific category or characteristic.
In a blog, this filter could select all posts from a given year. On an eCommerce site, the filter could limit the products to one color or one brand.
Considerations: narrowing parameters are very important, and in many cases it is preferable to use a URL folder structure instead of a parameter. When this type of URL produces content that is useful and sought after by users, it is advisable to have it indexed.
URLs with a narrowing parameter should be indexed when the filtered page corresponds to frequent online searches, for example “nike shoes”. It is important that the title tag, meta description and supporting text of these pages are specific to the value of the active filter.
To ensure that the page is indexed, the rel canonical tag must include the parameter in the URL.
Do not have URLs indexed when their parameters generate duplicate pages or do not match frequent user searches. For example, a parameter that sets the shoe size, as in http://www.example.com/shoes?number=37, would generate a page that is unlikely to attract organic traffic. Do you search for “nike shoes size 42”? I don’t: I choose the size on the product page.
To avoid indexing, the rel canonical tag must not include the parameter in the URL.
It might make sense to index selections of particular sizes of hard-to-find footwear, such as very large shoes. As you can see, there are no fixed rules: it is important to evaluate the function of the parameter and think about the content it generates. If the pages create value it is advisable to have them indexed, otherwise it is better not to.
To specify
Another common use for parameters is to specify an item to display, which can be a product sheet, blog post, category, user profile, or anything else.
Considerations: parameters that specify an item are generally indexed, since they uniquely identify an element. Verify that the title tag and meta description are unique for each id value. If id specifies the shoe model, the whole page must be dedicated to that model, meta tags and content included.
In some cases they are not indexed, for example when alternative friendly URLs are available to canonicalize to, or when the parameter generates low-quality content. Again, analysis and reasoning will help you find the best way to manage the parameter.
To translate
Not everyone uses dedicated ccTLDs, or a gTLD with a folder URL structure, to manage a multi-language site. Some sites use URL parameters. What is the best way to manage these translation parameters for SEO?
Considerations: translation parameters must be indexed if you want the multi-language site to be indexed. The rel canonical tag must include the translation parameter, and the same logic must be followed in the alternate hreflang tags.
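A minimal sketch of that rule, assuming a hypothetical translation parameter named lang and a base URL without any other query string. The helper language_head_tags is illustrative and not taken from any particular CMS.

```python
from urllib.parse import urlencode


def language_head_tags(base_url, languages, current, param="lang"):
    """Build the canonical and hreflang tags for one language version."""

    def localized(lang):
        # The translation parameter is kept in every generated URL
        return f"{base_url}?{urlencode({param: lang})}"

    # The canonical points at the current language version, parameter included
    tags = [f'<link rel="canonical" href="{localized(current)}">']
    # One alternate tag per available language, the current one included
    for lang in languages:
        tags.append(
            f'<link rel="alternate" hreflang="{lang}" href="{localized(lang)}">'
        )
    return tags


for tag in language_head_tags("http://www.example.com/scarpe", ["it", "en"], current="en"):
    print(tag)
```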
Number the pages
Parameters can also be used to manage the pagination of an archive.
Considerations: pagination parameters must be indexed, otherwise search engines would lose the pages of the archive.
Track visits
Parameters are often used to track visits in web analytics platforms, for example when content is shared on social networks, by email or in PPC campaigns.
Considerations: tracking parameters, such as those generated with Google’s Campaign URL Builder, do not change the content the user sees, so these parameterized URLs should not be indexed by search engines. Also make sure that this type of parameter is not included in the rel canonical tag.
Multiple parameters
In some cases, two or more parameters may be active at the same time. As seen above, the webmaster must decide which parameters to keep in the rel canonical tag and which to exclude, and in this case also the priority order of the parameters, that is, the order in which they are appended to the URL.
In the example I removed the utm_source tracking parameter from the rel canonical tag, then placed the brand parameter first and the page parameter second. Once the priorities have been decided, they must be kept the same in all URLs on the site. Any reversal can create duplicates in the search engine index.
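The decision described above can be sketched as a small function that builds the canonical URL from any parameterized URL. The allow-list CANONICAL_PARAMS, the function name canonical_url and the sample URL are assumptions for illustration. The policy itself (drop utm_source, keep brand before page) follows the text.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical policy: parameters kept in the canonical URL, in priority order.
# Anything not listed here (tracking, sorting, ...) is dropped.
CANONICAL_PARAMS = ["brand", "page"]


def canonical_url(url):
    """Return the URL to place in the rel canonical tag."""
    parts = urlsplit(url)
    # If a parameter is repeated, the last value wins in this sketch
    params = dict(parse_qsl(parts.query))
    kept = [(name, params[name]) for name in CANONICAL_PARAMS if name in params]
    # The fragment is dropped: it is never part of a canonical URL
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(canonical_url("http://www.example.com/scarpe?page=2&utm_source=ppc&brand=nike"))
```

Because the order comes from the allow-list rather than from the incoming URL, two links that list the same parameters in a different order produce the same canonical URL.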
SEO methods to manage parameters, with pros and cons
The best way to handle parameters is to implement the rel canonical tag correctly. I’ll also show you other methods to reduce the impact of unwanted parameters, but they are secondary and should never be preferred over the rel canonical tag.
1. Rel canonical
Rel canonical is a tag placed in the <head> section of the HTML. It defines the unique reference URL, that is, the URL search engines should use when indexing and ranking the resource that contains it.
When Googlebot crawls a URL such as http://www.example.com/scarpe?utm_source=ppc and finds a rel canonical tag pointing to http://www.example.com/scarpe, it assigns indexing and ranking to the canonical URL. The URL with the parameter that is absent from the rel canonical tag will not be indexed, but bots can still crawl it.
The rel canonical tag is accepted by all major search engines and should be the first way to manage your website’s parameters.
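To check what a page actually declares, a short script can read the rel canonical tag from its HTML. This is a sketch using only Python’s standard html.parser; the class CanonicalFinder, the function find_canonical and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel may hold several space-separated tokens; compare case-insensitively
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html):
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup served at http://www.example.com/scarpe?utm_source=ppc
PAGE = """<html><head>
<title>Scarpe</title>
<link rel="canonical" href="http://www.example.com/scarpe">
</head><body></body></html>"""

print(find_canonical(PAGE))
```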
2. Google Search Console
Google Search Console has a section dedicated to parameter management under “Crawl > URL Parameters”. With this tool you can instruct Googlebot by explaining the function of each parameter it has encountered while crawling the website.
To instruct Googlebot, click “Edit” next to each parameter identified.
- If the parameter is to be indexed, select “Yes: Changes, reorders, or narrows page content”, then select the function of the parameter (sorts, narrows, specifies, translates, paginates, …) and finally check the option “Every URL” to ask Googlebot to crawl all URLs with that parameter.
- If the parameter is not to be indexed, select “No: Doesn’t affect page content”. Alternatively, choose “Yes: Changes, reorders, or narrows page content”, define a function and then select “No URLs”.
This method only affects Googlebot’s crawling, so it has no effect on Bing, Yahoo or any other search engine. For this reason, I always recommend setting the parameters in Google Search Console only after the rel canonical tag has been implemented correctly. That way you get all the benefits of the rel canonical tag, plus the benefit of saving crawl budget.
3. Rel nofollow
The rel nofollow attribute applied to an internal link asks search engines not to follow that link and not to pass PageRank, so the destination URL is not crawled through that link. However, if that URL is linked elsewhere without nofollow, or is listed in sitemap.xml, it can still be crawled and indexed.
For parameter handling, nofollow should be used only when there are no other alternatives.
4. Disallow robots.txt
The robots.txt Disallow directive asks search engine spiders not to crawl the specified folder or URL. Disallow does not prevent indexing; it only affects crawling.
As with rel nofollow, this method shouldn’t be the first choice for handling URL parameters.
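To see which parameterized URLs a Disallow rule would block, here is a rough sketch of the path matching that Google documents for robots.txt, where “*” matches any sequence of characters and a trailing “$” anchors the end of the URL. The rule /*?*sort= and the parameter name sort are hypothetical, and the function is a simplification, not a full robots.txt parser.

```python
import re


def rule_matches(pattern, path_and_query):
    """Return True if a robots.txt path pattern matches the URL path plus query.

    '*' matches any sequence of characters; a trailing '$' anchors the end.
    Without '$', the pattern only has to match a prefix of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path_and_query) is not None


# Hypothetical rule in robots.txt:  Disallow: /*?*sort=
RULE = "/*?*sort="
for url in ["/scarpe", "/scarpe?sort=price", "/scarpe?brand=nike&sort=price"]:
    print(url, "blocked" if rule_matches(RULE, url) else "allowed")
```

Keep in mind that a URL blocked this way is not fetched at all, so any rel canonical tag on it cannot be read by the crawler.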
5. URL Fragment (#) Hash
The last solution for managing filters is to use a URL fragment, i.e. to place them in the URL after the hash.
A fragment is an internal page reference, sometimes called a named anchor. It appears at the end of a URL and begins with the hash (#) character followed by an identifier. It refers to a section within a web page.
In HTML documents, the browser looks for an element whose id matches the fragment, or for an anchor tag with a matching name attribute.
There are a few things to know about fragments, the most important being that they are not sent in HTTP request messages.
Example of URL fragment used as a filter:
Googlebot generally ignores the fragment: whatever follows the hash is not treated as a separate URL to crawl and index. This method therefore helps manage the crawl budget and avoids duplicate content.
Pros:
- Saves crawl budget.
- Avoids duplication.
- The URL generated by the filter is shareable.
Cons:
- None.
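The fact that fragments never reach the server can be illustrated with a small sketch that computes the request target (the path and query a browser puts in the HTTP request) for a URL. The function name request_target and the sample filter colore=green placed after the hash are assumptions for illustration.

```python
from urllib.parse import urldefrag, urlsplit


def request_target(url):
    """Return the path and query sent to the server for this URL."""
    # The fragment is stripped by the client before the request is made
    without_fragment, _fragment = urldefrag(url)
    parts = urlsplit(without_fragment)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


# Same filter expressed as a query parameter and as a fragment
print(request_target("http://www.example.com/scarpe?colore=green"))
print(request_target("http://www.example.com/scarpe#colore=green"))
```

With the fragment version the server always receives the same request, so the filter has to be applied in the browser, typically with JavaScript.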
SEO guidelines
What should you consider when developing a website that uses parameters to filter a listing page? Let’s look at a practical example.
Let’s start with a hypothetical base URL of a listing page that lists the profiles of tire specialists in Italy:
The filters let you narrow the workshops by different characteristics, such as location, services offered, the networks they belong to, and more.
Sort the parameters
The filters applied to the URL should follow alphabetical order, that is, the parameter order=x is placed earlier in the URL than, for example, typology=z. This prevents the same URL from appearing twice with the parameters inverted (which would mean double the crawling work for the spiders).
It would not be optimal to have a situation where the same content is returned on different URLs:
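One way to enforce the alphabetical rule is to normalize every generated link before it is written into the page. The sketch below assumes hypothetical parameters order and typology and a made-up listing URL; sort_parameters is an illustrative helper, not part of any framework.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def sort_parameters(url):
    """Rewrite a URL so that its parameters appear in alphabetical order."""
    parts = urlsplit(url)
    # Pairs are sorted by parameter name first, then by value
    pairs = sorted(parse_qsl(parts.query))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(pairs), parts.fragment)
    )


# Two hypothetical links to the same content with inverted parameters
a = sort_parameters("http://www.example.com/listing?typology=z&order=x")
b = sort_parameters("http://www.example.com/listing?order=x&typology=z")
print(a)
print(a == b)
```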
Canonicalize the parameters
Now we apply the network, location and service filters to our listing page:
The rel canonical tag points to the parent listing page, because the filtered page would duplicate the parent’s content.
Later on, you could add some parameters to the canonical URL so that they get indexed. For example, if you notice interesting traffic on queries that combine type and network (such as “gommista Magneti Marelli”, where gommista is Italian for tire specialist), it would make sense to index the network parameter:
In this case it would be advisable for the network parameter to change the meta tags (title and description) and perhaps add a specific intro text for “tire specialists of the Magneti Marelli network”.
Multiple choice? No thanks
Another thing to evaluate is how many choices to allow for each type of parameter. For example, it would be ideal to allow only one network to be active at a time, and to prevent both the user and the system from enabling multiple choices.
Why? Because multiple choice would generate hundreds of thousands of URL combinations that Google would then crawl. Limiting each parameter to a single choice limits the number of URL combinations.
Correct example with a single choice per parameter:
Suboptimal example with multiple choice for the same parameter:
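A rough count shows how quickly multiple choice inflates the number of crawlable URLs. The filter names and the number of values per filter below are invented for illustration. With a single choice each filter is either unset or set to one value, while with multiple choice any subset of its values can be active.

```python
from math import prod

# Hypothetical filters and the number of values each one offers
FILTER_VALUES = {"network": 10, "location": 20, "service": 8}


def single_choice_urls(filters):
    """Each filter is either absent or set to exactly one of its values."""
    return prod(count + 1 for count in filters.values())


def multiple_choice_urls(filters):
    """Each filter can hold any subset of its values (including none)."""
    return prod(2 ** count for count in filters.values())


print("single choice:  ", single_choice_urls(FILTER_VALUES))
print("multiple choice:", multiple_choice_urls(FILTER_VALUES))
```

Both counts include the unfiltered listing page and assume a fixed parameter order; allowing parameters in any order would multiply the totals further.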
I think it is important to consider these aspects when developing a website that uses listings and filters based on URL parameters.
Avoid capital letters and non-ASCII characters
One mistake I often see developers make is mixing upper and lower case in URLs and parameters.
The best approach is to always use lowercase letters and no special (non-ASCII) characters. Using only lowercase letters simplifies the technical management of the site. For example, I sometimes find URLs and links that use uppercase letters while their canonical tags use only lowercase letters. These misconfigurations create SEO problems and should be avoided.
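A check like this is easy to automate during an audit. The sketch below flags URLs that contain uppercase letters or non-ASCII characters; the function name url_issues and the sample URLs are assumptions for illustration.

```python
def url_issues(url):
    """List the naming problems found in a URL (an empty list means none)."""
    issues = []
    if not url.isascii():
        issues.append("non-ASCII characters")
    # Note: percent-encoded sequences written with uppercase hex digits
    # (for example %C3%A8) would also be flagged by this simple check
    if any(ch.isupper() for ch in url):
        issues.append("uppercase letters")
    return issues


for url in [
    "http://www.example.com/scarpe?colore=green",
    "http://www.example.com/Scarpe?Colore=green",
    "http://www.example.com/scarpe?città=roma",
]:
    print(url, "->", url_issues(url) or "ok")
```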
In this guide I have explained how to manage parameters for proper SEO. Now crawl your site, find all the parameters and define the function of each one. Some functions should be indexed and others should be canonicalized. With this information you can see for yourself whether the correct pages are being indexed, or whether you can help search engines avoid confusion. Happy crawling!