What is a URL?
URL is an abbreviation for Uniform Resource Locator. A URL is nothing more than the address of a particular, unique resource on the web. In principle, every valid URL should refer to a unique resource. These resources can be HTML pages, CSS documents, images, videos, PDFs, etc. Because the web server controls both the resource and the URL that points to it, the owner of the web server is responsible for managing both.
In simple terms, imagine a city where every person owns only one house. Think of the city as the web and each person as a resource; the address of that person's house is then the URL.
Anatomy of a URL
Let me explain the complete anatomy of a URL using the URL of one of my blog posts. It might look a bit more complicated than the URLs you usually see, but going through it in detail will help you understand complex URL structures easily.
https://ankitchauhanseo.com/how-does-google-search-work?key1=xyz&key2=abc#crawling
I will break down the parts of the URL above and explain them one by one:
- https – This part is called the scheme. The scheme tells the browser which protocol to use to request the resource. HTTPS is the encrypted version of HTTP, the protocol used to transfer web resources between a server and a browser.
- ankitchauhanseo.com – This is called the domain name. A domain name is a human-readable name that stands in for the IP address of a web server.
Every web server is located at a numeric IP address, for example 203.0.113.8 (a made-up address used here only for illustration). Because such numbers are hard for humans to remember, the Domain Name System (DNS) lets the browser look up the name ankitchauhanseo.com and find the IP address behind it.
- how-does-google-search-work – This part is called the path. The path represents the specific location of the web document in the domain. This can be a page, post, image, video, PDF or any other document.
- key1=xyz&key2=abc – This part of the URL is called the parameters. It starts after the ? symbol and is also called the ‘query string’. The parameters give the server additional information to use when returning the resource. Each parameter is a key=value pair, and the pairs are separated from each other by the & symbol.
- #crawling – This part is called the anchor, also known as the fragment. The anchor points to a specific location within the document/resource requested by the browser. It is usually used to jump to a particular section of the content on the web page. The browser handles the anchor itself and does not send it to the server.
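To see the same breakdown in code, here is a minimal sketch using Python's standard urllib.parse module. It simply splits the example URL above into the parts just described. Note that Python calls the domain part "netloc", the parameters "query" and the anchor "fragment".

```python
from urllib.parse import urlsplit, parse_qs

url = "https://ankitchauhanseo.com/how-does-google-search-work?key1=xyz&key2=abc#crawling"

# urlsplit breaks the URL into scheme, netloc, path, query and fragment
parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.netloc)    # ankitchauhanseo.com
print(parts.path)      # /how-does-google-search-work
print(parts.query)     # key1=xyz&key2=abc
print(parts.fragment)  # crawling

# parse_qs turns the query string into a dict; each value is a list
# because the same key may appear more than once in a URL
params = parse_qs(parts.query)
print(params)          # {'key1': ['xyz'], 'key2': ['abc']}
```

The path comes back with its leading slash, and the ? and # separators are not part of the query or the fragment.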
What are the best practices for a URL structure?
Google has repeated numerous times that the content of your webpage should be written for users and not for manipulating the search engines.
The same goes for the URL structure. Try to keep the URL structure:
- As simple as possible
- Logically constructed
- Easily comprehensible to humans
Here are the recommended URL structure guidelines to follow:
- Keep the URLs simple and descriptive: The URLs should make sense to the users. They should also be descriptive enough that the user gets an idea of what the web page is about just by seeing the URL.
For example, the URL https://ankitchauhanseo.com/how-does-google-search-work/ belongs to a blog post. Its structure is very simple, yet it gives an idea about the content of the page.
- Country-specific TLDs: If you are targeting a specific country, it is recommended to use that country’s country-code top-level domain (ccTLD). For example, if you are focussing only on the United Kingdom then it is better to use https://example.uk instead of https://example.com.
- Country-specific subdirectory with gTLD: If you have a multiregional site on a generic top-level domain (gTLD) such as .com and want to target each country separately, this method is recommended. For example, if you are targeting both the UK and India, the URLs can look like this:
https://example.com/uk/
https://example.com/in/
- Use hyphens to separate keywords in URL: To make the URL clearer and more readable to users and search engines, it is recommended to separate the keywords in the URL path using a hyphen ( - ).
- Always use HTTPS protocol: Using HTTPS makes your website more credible and trustworthy in the eyes of the user. Google also uses it as a ranking signal. So a padlock icon in the URL bar is a must.
- Use lowercase letters: The path of a URL can be case-sensitive, so /Page and /page may be treated as two different URLs. Sticking to lowercase avoids this confusion.
- Eliminate stop words: This is not a necessity, but a URL without stop words (a, an, the, if, is, and, etc.) looks cleaner. If possible, avoid using stop words.
- Use the right keywords: Try to use the keywords in the URL path that actually describe what the content on the page is rather than stuffing the keywords. Remember, you are writing for users first.
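Several of these guidelines (lowercase letters, hyphens between keywords, optional removal of stop words) can be applied automatically when a URL path is generated from a page title. The sketch below is only an illustration: the function name slugify and the STOP_WORDS set are my own assumptions, not part of any standard or Google tool.

```python
import re

# Illustrative stop-word list taken from the examples above; an assumption,
# not an official or complete set
STOP_WORDS = {"a", "an", "the", "if", "is", "and"}


def slugify(title, drop_stop_words=False):
    """Turn a page title into a lowercase, hyphen-separated path segment."""
    # Lowercase the title, then collect runs of ASCII letters and digits
    words = re.findall(r"[a-z0-9]+", title.lower())
    if drop_stop_words:
        kept = [w for w in words if w not in STOP_WORDS]
        # Fall back to all words if every word was a stop word
        words = kept or words
    return "-".join(words)


print(slugify("How Does Google Search Work?"))          # how-does-google-search-work
print(slugify("What is a URL?", drop_stop_words=True))  # what-url
```

This simple version drops punctuation and any non-ASCII characters, so titles in other scripts would need a different approach. Always read the result yourself to check that it still makes sense to a user.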
Here are the URL structure practices you should AVOID:
- Avoid using unreadable and long ID numbers in the URL.
Example: https://example.com/category/?id_ion=360&sid=2a5ebc568f41da1
- Avoid joining keywords in the URL path together.
Example: https://ankitchauhanseo.com/howdoesgooglesearchwork
- Avoid using underscores ( _ ) to separate keywords in the URL.
- Avoid using multiple and complex URL parameters. They can create a large number of URLs that point to the same or similar content, which makes Googlebot consume far more bandwidth than necessary. This may even lead Googlebot to completely leave the site pages out of the index.
- Avoid using session IDs in URLs.
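- Avoid dynamically generated URLs that may overuse the bandwidth of Googlebot. Use pattern matching in the robots.txt file to block a large set of dynamically generated URLs. Note that robots.txt does not support full regular expressions; it supports only the * wildcard and the $ end-of-URL marker.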
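Some of the practices above can be checked mechanically. The following is a rough sketch of a URL checker, not a definitive tool: the function name check_url, the SESSION_KEYS set and the MAX_PARAMS threshold are all hypothetical choices made for this example, and Google does not publish any such limits.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical parameter names that often carry session IDs (an assumption)
SESSION_KEYS = {"sid", "sessionid", "phpsessid", "jsessionid"}
# Arbitrary threshold chosen for this sketch
MAX_PARAMS = 2


def check_url(url):
    """Return a list of warnings for URL practices worth avoiding."""
    parts = urlsplit(url)
    warnings = []
    if parts.scheme != "https":
        warnings.append("not served over HTTPS")
    if "_" in parts.path:
        warnings.append("underscores in path")
    if parts.path != parts.path.lower():
        warnings.append("uppercase letters in path")
    # Split the query string into (key, value) pairs
    params = parse_qsl(parts.query, keep_blank_values=True)
    if len(params) > MAX_PARAMS:
        warnings.append("many query parameters")
    if any(key.lower() in SESSION_KEYS for key, _ in params):
        warnings.append("possible session ID in query string")
    return warnings


print(check_url("https://example.com/category/?id_ion=360&sid=2a5ebc568f41da1"))
print(check_url("https://ankitchauhanseo.com/how-does-google-search-work"))
```

The first call flags a possible session ID, while the second returns an empty list. A check like this cannot tell whether keywords are joined together or whether the URL reads well, so it is a supplement to reading the URL yourself, not a replacement.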
You need not memorize all these rules to create a good URL structure. Just make sure you follow a URL structure that you would love to read if you were the user. Look at it from the user’s perspective and make it the best possible.