

  • Introduction to PageRank for SEO

When Google was launched back in 1998, they introduced a mechanism for ranking web pages that was radically different from how the established search engines at the time worked. Up to then, most search engines relied exclusively on content and meta data to determine if a webpage was relevant for a given search. Such an approach was easily manipulated, and it resulted in pretty poor search results where the top ranked pages tended to have a lot of keywords stuffed into the content.

Google radically shook things up by introducing PageRank as a key ranking factor. Content still mattered to Google, of course, but rather than just look at which webpage had the keyword included most often, Google looked at how webpages linked to one another to determine which page should rank first. Google’s theory was that a link from one webpage to another counted as a ‘vote’, a recommendation from that webpage for the page that was linked to. And the more ‘votes’ a webpage had – the more links that pointed to it – the more Google felt it could trust that page to be sufficiently good and authoritative. Therefore, pages with the most links deserved to rank the highest in Google’s results.

It’s interesting to note that the PageRank concept was heavily inspired by similar technology developed two years earlier by Robin Li, who later went on to co-found the Baidu search engine. (Thanks to Andreas Ramos for pointing that out to me!)

More than two decades later, Google still relies heavily on PageRank to determine rankings. For a long time, Google allowed us to see an approximation of a webpage’s PageRank through their browser toolbar, which included a PageRank counter that showed the current webpage’s PageRank as an integer between 0 and 10. This Toolbar PageRank (TBPR) was a very rough approximation of the actual PageRank that Google had calculated, and was a linear representation of the logarithmic scale that Google used internally to determine PageRank. So going from TBPR 1 to 2 required roughly a tenfold growth in underlying PageRank, and going from TBPR 2 to 3 another tenfold growth on top of that – and so on, all the way to the mythical and almost unachievable TBPR 10.

The problem with TBPR was that folks in the SEO industry obsessed over it, to the detriment of all other areas of SEO (like, you know, publishing content that actually deserved to be read rather than just serve as a platform to manipulate PageRank). Danny Sullivan (now employed by Google as their search liaison but at that time still working for Search Engine Land as their chief search blogger) wrote an obituary for Toolbar PageRank which explains rather well why it was such a terrible metric.

So Google stopped showing Toolbar PageRank. First the Toolbar scores were no longer updated from 2013 onwards, and thus ceased to accurately reflect a webpage’s PageRank. Then in 2016 Google retired the Toolbar PageRank icon entirely. This retirement of TBPR led a huge contingent of SEOs to believe Google had stopped using its internal PageRank metric. You can forgive the SEO community for this conclusion, because Google also stopped talking about PageRank in favour of more conceptual terms like ‘link value’, ‘trust’, and ‘authority’. Moreover, the original patent for PageRank that Google filed in 1998 wasn’t renewed and expired in 2018.

But PageRank never went away. We just stopped being able to see it.
Also in 2018, a former Google engineer admitted on a Hacker News thread that the original PageRank algorithm had been replaced internally by Google in 2006 by a new approach to evaluating links. This new patent can easily be seen as the official successor to the original PageRank. In fact, I’d highly recommend you read Bill Slawski’s analysis of it, as no one does a better job of analysing Google patents than him.

In this blog post I don’t want to go down the patent analysis route, as that’s firmly Bill’s domain. Instead I want to make an attempt at explaining the concept of PageRank in its current form in such a way that makes the theory applicable to the day-to-day work of SEOs, and hopefully clears up some of the mysticism around this crucial ranking factor. Others have done this before, and I hope others will do it after me, because we need more perspectives and opinions and we shouldn’t fear to retread existing ground if we believe it helps the industry’s overall understanding of the topic. Hence this 4000-word blog post about a 22-year-old SEO concept.

Note that I am not a computer scientist, I have never worked at Google, and I’m not an expert at analysing patents and research papers. I’m just someone who’s worked in the trenches of SEO for quite a while, and has formed a set of views and opinions over the years. What I’m about to share is very likely wrong on many different levels. But I hope it may nonetheless be useful.

The Basic Concept of PageRank

At its core, the concept of PageRank is fairly simple: page A has a certain amount of link value (PageRank) by virtue of links pointing to it. When page A then links to page B, page B gets a dose of the link value that page A has. Of course, page B doesn’t get the same PageRank as page A already has. While page A has inbound links that give it a certain amount of PageRank, in my example page B only gets PageRank through one link from page A. So page B cannot be seen as equally valuable as page A. Therefore, the PageRank that page B gets from page A needs to be less than 100% of page A’s PageRank. This is called the PageRank Damping Factor.

In the original paper that Google published to describe PageRank, they set this damping factor to 0.85. That means the PageRank of page A is multiplied by 0.85 to give the PageRank of page B. Thus, page B gets 85% of the PageRank of page A, and 15% of the PageRank is dissolved. If page B were then to have a link to page C, the damping factor would apply again. The PageRank of page B (85% of page A’s PageRank) is multiplied by 0.85, and so page C gets 72.25% of page A’s original PageRank. And so on, and so forth, as pages link to one another and PageRank distributes through the entire web.

That’s the basic idea behind PageRank: pages link to one another, link value flows through these links and loses a bit of potency with every link, so webpages get different amounts of PageRank from every link that points to them. Pages that have no links at all get a basic starting amount of PageRank of 0.15, as extrapolated from the original PageRank calculation, so that there’s a jumping-off point for the analysis and we don’t begin with zero (because that would lead to every webpage having zero PageRank).
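To make that arithmetic concrete, here’s a minimal Python sketch of link value decaying along a chain of single links, using the 0.85 damping factor described above. It’s a toy model of the concept only – the function and the numbers are illustrative, not Google’s implementation.

```python
# Toy illustration of the damping factor described above - not Google's actual
# algorithm. Each hop in a chain of single links multiplies the passed-on
# PageRank by the damping factor.
DAMPING = 0.85

def pagerank_after_hops(source_pagerank, hops):
    """PageRank arriving after a chain of 'hops' single links from the source page."""
    return source_pagerank * (DAMPING ** hops)

# Page A -> page B -> page C, with page A starting at a PageRank of 1.0
print(pagerank_after_hops(1.0, 1))  # 0.85    -> 85% reaches page B
print(pagerank_after_hops(1.0, 2))  # 0.7225  -> 72.25% reaches page C
```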
Damping Factor Modifiers

The above all makes sense if a webpage has just one link pointing to another page. But most webpages will have multiple links to other pages. Does that mean those links all get 0.85 of the starting page’s PageRank?

In its original form, PageRank would distribute evenly across all those links. So if you had ten links to different pages from page A, that 85% of link value would be evenly shared across all of those links so that each link would get 8.5% of page A’s link value (1/10th of 85%). The more links you had on your page, the less PageRank each linked page would receive.

This led to a lot of SEOs adopting a practice called ‘PageRank Sculpting’ which involved hiding links from Google (with nofollow attributes or other mechanisms) to ensure more PageRank would flow through the links you did want Google to count. It was a widespread practice for many years and seems to have never really gone away.

Reasonable Surfer

But this even distribution of PageRank across all links on a page was a short-lived aspect of PageRank. The engineers at Google quickly realised that letting link value flow evenly across all links on a page didn’t make a lot of sense. A typical webpage has links that are quite discreet and not likely to be clicked on by actual visitors of the page (such as links to privacy policies or boilerplate content), so why should these links get the same PageRank as a link that is very prominent on a page as part of the main content?

So in 2004 Google introduced an improved mechanism for distributing PageRank across multiple links. This is called the ‘Reasonable Surfer’ model, and it shows that Google started assigning different weights to links depending on the likelihood of a link being clicked on by a ‘reasonable surfer of the web’ – i.e. an average person browsing webpages. Basically, Google modified the PageRank Damping Factor depending on whether a link was actually likely to be used by an average person. If a link was very prominent and there was a good chance a reader of the webpage would click on it, the damping factor stayed low and a decent chunk of PageRank would flow through that link. But if a link was hidden or discreetly tucked away somewhere on the page, such as in the footer, it would get a much higher damping factor and so would not get a lot of value flowing through it. Pages linked from such hidden links would not receive much PageRank, as their inbound links were unlikely to be clicked on and so were subjected to very high damping factors.

This is a key reason why Google wants to render pages as it indexes them. Just looking at the HTML source code of a page doesn’t necessarily reveal the visual importance of a link. Links could easily be hidden with some CSS or JavaScript, or inserted as a page is rendered. By looking at a completely rendered version of a webpage, Google is better able to accurately evaluate the likelihood of a link being clicked, and so can assign proper PageRank values to each link. This aspect of Google’s PageRank system is still being updated and refined, so it’s safe to assume Google kept using it and improving it.

For me, the Reasonable Surfer approach also makes PageRank Sculpting obsolete. With Reasonable Surfer, the number of links on a page is not a determining factor of how much PageRank each link gets. Instead, the visual prominence of a link is the key factor that decides how much PageRank flows through that link. So I don’t believe you need to ‘hide’ links from Google in any way. Links in the footer of a page are likely to be ignored anyway for PageRank purposes as users aren’t likely to click on them, so you don’t need to add ‘nofollow’ attributes or hide them another way.
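To illustrate the difference between the original even split and a Reasonable Surfer-style split, here’s a small Python sketch. The click-likelihood weights are invented for the example; how Google actually estimates link prominence is not public.

```python
# Illustrative comparison of an even PageRank split versus a click-likelihood
# weighted split in the spirit of Reasonable Surfer. The weights are made up.
DAMPING = 0.85

def distribute_evenly(page_rank, outlinks):
    """Original model: the damped value is shared equally across all outlinks."""
    share = DAMPING * page_rank / len(outlinks)
    return {link: share for link in outlinks}

def distribute_by_prominence(page_rank, link_weights):
    """Weighted model: each link's share is proportional to its estimated click likelihood."""
    total = sum(link_weights.values())
    return {link: DAMPING * page_rank * weight / total
            for link, weight in link_weights.items()}

outlinks = ["/main-article", "/related-post", "/privacy-policy"]
print(distribute_evenly(1.0, outlinks))
# A prominent in-content link now gets far more than a footer link:
print(distribute_by_prominence(1.0, {"/main-article": 0.70,
                                     "/related-post": 0.25,
                                     "/privacy-policy": 0.05}))
```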
Internal versus External Links

What follows is purely speculation on my end, so take it with a grain of salt. But I believe that, in addition to a link’s prominence, the applied PageRank Damping Factor also varies depending on whether the link points to an external site or an internal page. I believe that internal links to pages within the same website have a lower damping factor – i.e. send more PageRank to the target page – than links that point to external websites.

This belief is supported by anecdotal evidence where improvements in internal linking have a profound and immediate impact on the rankings of affected webpages on a site. With no changes to external links, improving internal linking can help pages perform significantly better in Google’s search results. I strongly suspect that PageRank flowing within a website diminishes more gradually, whereas PageRank that is sent to other websites dissipates more quickly due to a larger damping factor.

To give it some numbers (which I randomly pulled out of thin air for the purpose of this example so please don’t take this as any sort of gospel), when a webpage links to an external website Google may apply a damping factor of 0.75 which means 25% of PageRank is lost and only 75% arrives at the destination page. Whereas if that same webpage links to another page on the same site, the damping factor may be 0.9 so that the internal target page receives 90% of PageRank, and only 10% is lost.

Now if we accept this as plausible, it raises an interesting question: when is a link considered ‘internal’ versus ‘external’? Or, more simply put, what does Google consider to be a ‘website’?

Subdomains vs Subfolders

This simple question may have a complicated answer. Take a website that uses multiple technology platforms, such as a WordPress site that also uses another CMS to power a forum. If both the WordPress site and the forum exist on the same overall domain, such as www.polemicdigital.com with a /forum/ subfolder, I’m pretty confident that Google will interpret it as just one website and links between the WordPress pages and the forum pages will be seen as internal links.

But what if the forum exists on a subdomain, like forum.polemicdigital.com? The forum behaves very differently from the main WordPress site on the www subdomain, with a different technology stack and different content. So in that scenario, I strongly suspect Google will treat the forum.polemicdigital.com subdomain as a separate website from the www.polemicdigital.com WordPress site, and any links between them will be seen as external links.

For me, this lies at the heart of the subdomain vs subfolder debate that has raged within the SEO industry for many years. Hosting a section of your site on a subdomain makes it more likely it’ll be interpreted as a separate site, so I believe links from your main site to the subdomain will be seen as external links and be subjected to higher damping factors. Thus your subdomain’s ranking potential is diminished because it receives less PageRank from your main site. Whereas if you put extra features of your site, such as a forum or a blog, in a subfolder on the same domain as your main site, it’s likely Google will simply see this as extra pages on your site and links pointing to these features will be internal links and send more PageRank to those areas.

This is why I recommend that my clients never put crucial rankable resources like blog articles and user generated content on a subdomain, unless there’s really no other choice. If you do have to use subdomains, you should try to use the same technology stack as your main site where possible, with the same design, boilerplate content, and page resources (images, CSS, JavaScript). This will increase the chances that Google will interpret the subdomain as being a part of the main domain’s website.
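Purely to visualise the internal-versus-external speculation above, here’s a small Python sketch that applies a gentler damping factor to same-site links. The 0.9 and 0.75 figures are the made-up example numbers from earlier in this section, not anything Google has confirmed.

```python
# Speculative illustration only: treat same-host links as 'internal' and give
# them a gentler damping factor. The 0.90 / 0.75 figures are invented examples.
from urllib.parse import urlparse

INTERNAL_DAMPING = 0.90
EXTERNAL_DAMPING = 0.75

def passed_pagerank(source_url, target_url, source_pagerank):
    same_site = urlparse(source_url).hostname == urlparse(target_url).hostname
    damping = INTERNAL_DAMPING if same_site else EXTERNAL_DAMPING
    return source_pagerank * damping

print(passed_pagerank("https://www.polemicdigital.com/blog/",
                      "https://www.polemicdigital.com/services/", 1.0))  # 0.9
print(passed_pagerank("https://www.polemicdigital.com/blog/",
                      "https://forum.polemicdigital.com/", 1.0))         # 0.75
```

Note that a naive hostname comparison like this also treats a subdomain as a separate site, which is exactly the ambiguity the subdomain vs subfolder debate is about.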
Redirects and PageRank

A long-standing question in SEO is “how much PageRank is lost through a redirect?” When you change a webpage’s URL and redirect the old URL to the new location, do you lose some of the original URL’s PageRank in that redirect? Over the years the answers from Googlers have varied a bit.

Historically, Google has confirmed that the amount of PageRank lost through a redirect is the same as through a link. This means that a redirect from page A to page B counts as a link from page A to page B, and so the PageRank Damping Factor applies and page B receives less PageRank (by about 15% or whatever damping factor Google chooses to apply in that specific context). This is done to prevent redirects from being used to artificially manipulate PageRank.

However, more recently some Googlers have said that redirects do not necessarily cause PageRank loss. In a 2018 Webmaster Hangout, John Mueller emphasises how PageRank is consolidated on a canonical URL, and that redirects serve as a canonicalisation signal. This would imply that there is no PageRank loss in a redirect, but that the redirect tells Google that there is a canonical URL that all the relevant ranking signals (including PageRank) should be focused on.

Nonetheless, whenever a website goes through an effort to minimise redirects from internal links and ensures all its links point directly to final destination URLs, we tend to see an uplift in rankings and traffic as a result. This may be due less to reduced PageRank loss and more to optimised crawling and indexing, but it’s an interesting correlation nonetheless. Because redirects result in extra crawl effort, and there is a chance that some redirects still cause PageRank loss, I would always recommend that websites minimise internal redirects as much as possible. It’s also good practice to avoid chaining multiple redirects due to the extra crawl effort, and Google tends not to crawl beyond a maximum of five chained redirects.
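As a practical companion to that advice, here’s a small Python sketch that follows a URL’s redirect chain and reports how many hops it takes. It assumes the requests library is installed; the URL is just a placeholder.

```python
# Small helper to inspect a URL's redirect chain. Assumes 'requests' is installed.
import requests

def report_redirect_chain(url, max_hops=5):
    response = requests.get(url, allow_redirects=True, timeout=10)
    hops = response.history  # every intermediate redirect response, in order
    for hop in hops:
        print(f"{hop.status_code}  {hop.url}")
    print(f"{response.status_code}  {response.url}  (final destination)")
    if len(hops) > max_hops:
        print(f"Warning: {len(hops)} hops - Google may stop following before the end.")

report_redirect_chain("http://www.example.com/old-page")
```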
PageRank Over Time

Due to the volatile nature of the web, a webpage’s PageRank is never a static number. Webpages come and go, links appear and disappear, pages get pushed down deeper into a website, and things are constantly in flux. So Google has to recalculate PageRank all the time.

Anecdotally, many SEOs believe links lose value over time. A newly published link tends to have a stronger positive ranking effect than a link that was published years ago. If we accept the relevant anecdotal evidence as true, it leads to questions about applied PageRank damping factors over time. One possibility is that Google applies higher damping factors to older links, which means those old links pass less PageRank as time goes on. Another possibility is that the webpages that contain those old links tend to get buried deeper and deeper on a website as new content is published, so there are more layers of clicks that each siphon off a bit of PageRank. That means a link from page A to page B passes less PageRank not because of a higher damping factor, but because page A receives less PageRank itself as it sinks into the website’s archive.

Fact is, we don’t really know what Google does with PageRank from historic links. All we know is that links do tend to lose value over time, which is why we constantly need to get new links pointing to a website to maintain rankings and traffic.

URL Seed Set

There’s one more aspect of PageRank we need to talk about, which the updated PageRank patent mentions but the original didn’t. The updated PageRank patent frequently mentions a ‘seed set of pages’. This refers to a starting point of URLs to calculate PageRank from. I suspect that this was introduced as a way to better calculate PageRank by starting from webpages that are known and understood to be trusted and reliable, such as Wikipedia articles or high authority news websites. As per the patent, seed sets “… are specially selected high-quality pages which provide good web connectivity to other non-seed pages.”

What makes this seed set especially interesting is how it’s used to modify a webpage’s PageRank based on distance, i.e. how many clicks it takes to get from a seed URL to the webpage. As per the patent, “… shortest distances from the set of seed pages to each given page in the link-graph are computed,” and “[t]he computed shortest distances are then used to determine the ranking scores of the associated pages.”

This is entirely separate from how PageRank flows through links. Rather than counting a webpage’s cumulative PageRank as it flows through links, the patent explicitly states that it’s about the ‘shortest distance’ from the seed set to the webpage. So it’s not about an accumulation of PageRank from one or more links, but a singular number that shows how many clicks it would take to get to the webpage from any URL in the seed set. So by all appearances, it looks like Google uses the number of clicks from a seed URL to a webpage as a PageRank modifier, where fewer clicks means higher PageRank.
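That ‘shortest distance from a seed set’ idea maps neatly onto a standard breadth-first search over a link graph. Here’s a minimal Python sketch of that computation; the link graph and seed URL are invented purely for illustration.

```python
# Minimal sketch: shortest click distance from a set of seed pages to every
# reachable page in a link graph, via breadth-first search. Data is illustrative.
from collections import deque

def distances_from_seeds(links, seeds):
    """links: dict mapping each page to the pages it links to; seeds: iterable of seed pages."""
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in distance:          # first visit = shortest path to it
                distance[target] = distance[page] + 1
                queue.append(target)
    return distance

link_graph = {
    "wikipedia.org/wiki/SEO": ["news-site.com/seo-guide"],
    "news-site.com/seo-guide": ["myshop.com/blog"],
    "myshop.com/blog": ["myshop.com/product"],
}
print(distances_from_seeds(link_graph, seeds=["wikipedia.org/wiki/SEO"]))
# {'wikipedia.org/wiki/SEO': 0, 'news-site.com/seo-guide': 1, 'myshop.com/blog': 2, 'myshop.com/product': 3}
```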
PageRank and Crawling

So far we’ve talked about PageRank exclusively as a ranking factor. But this is just part of the picture. There is another important effect that PageRank has on a URL: it helps determine how often it is crawled by Googlebot. While Google endeavours to crawl the entire web regularly, it’s next to impossible to actually do this. The web is huge and growing at an exponential rate. Googlebot couldn’t possibly keep up with all the newly published content while also keeping track of any changes made to existing webpages. So Googlebot has to decide which known URLs to recrawl to find updated content and new links.

PageRank feeds into this decision. Basically, the more PageRank a URL has, the more often Googlebot will crawl it. A page that has a lot of links pointing to it will be seen as more important for Google to crawl regularly. And the opposite also applies – pages with very low PageRank are seen as less important, so Google will crawl them less often (or not at all). Note that PageRank is only part of that equation, but it’s good to keep in mind when you talk about optimising a site’s crawling. All this and more is explained elegantly by Dawn Anderson in her Search Engine Land article about crawl budget, which is definitely worth a read.

What This Means For SEO

Understanding all of the above, what does this mean for SEOs? How can you apply this theory to your daily work? We can distil the theory to a few clear and concise recommendations for improving a website’s use of PageRank:

1. Links are a vital ranking factor

So far nothing has managed to replace PageRank as a reliable measure of a site’s trust and authority. However, Google is very good at ignoring links it doesn’t feel are ‘natural’, so not all links will pass PageRank. In fact, according to Paul Madden from link analysis software Kerboo, as many as 84% of all links on the web pass little to no value. In a nutshell, it’s not about how many links you have, but how much value a link could pass to your site. Which brings me to the second point:

2. Prominent links carry more weight

The most valuable type of link you can get is one that a user is likely to click on. A prominent link in the opening paragraph of a relevant piece of good content is infinitely more valuable than a link hidden in the footer of a website. Optimise for visually prominent, clickable links.

3. Internal links are golden

It’s easy to obsess over getting more links from external sites, but often there’s just as much gain to be had – or more – from optimising how link value flows through your website. Look at your internal link structure and how PageRank might flow through your site. Start with the webpages that have the most inbound links (often your homepage and some key pieces of popular content) and find opportunities to send PageRank to the URLs that you want to boost rankings for. This will also help with optimising how Googlebot crawls your site.

4. Links from seed URLs are platinum

We don’t know which URLs Google uses for its seed set, but we can make some educated guesses. Some Wikipedia URLs are likely part of the seed set, as are news publishers like the New York Times and the BBC. And if they’re not directly part of the seed set, they’re likely to be only one or two clicks from the actual seed URLs. So getting a link from those websites is immensely valuable – and typically very hard.

5. Subfolders are usually superior

In almost any given context, content placed in a subfolder on a site will perform better than content hosted on a subdomain. Try to avoid using subdomains for rankable content unless you really don’t have a choice. If you’re stuck with subdomains and can’t migrate the content to a subfolder, do your best to make the subdomain look and feel like an integral part of the main site.

6. Minimise redirects

While you can’t avoid redirects entirely, try to minimise your reliance on them. All your internal links should point directly to the destination page with no redirect hops of any kind. Whenever you migrate URLs and have to implement redirects, make sure they’re one-hop redirects with no chaining. You should also look at pre-existing redirects, for example from older versions of the website, and update those where possible to point directly to the final destination URL in only one redirect hop.

Wrapping Up

There’s a lot more to be said about PageRank and various different aspects of SEO, but the above will hopefully serve as a decent enough introduction to the concept. You may have many more questions about PageRank and link value, for example about links in images or links with the now ambiguous ‘nofollow’ attribute. Perhaps a Google search can help you on your way, but you’re also welcome to leave a comment or get in touch with me directly. I’ll do my best to give you a helpful answer if I can.

[Toolbar PageRank image credit: Search Engine Land]

  • Technical SEO is absolutely necessary

Note: This post was originally published on LearnInbound.com in 2015. I've republished it here, with some minor cleaning up, as I believe its core lessons are still applicable today.

Once in a while you hear these murmurs about how SEO is not about the technology anymore, how great content and authority signals will be sufficient to drive stellar growth in organic search. And the people who say this can often show formidable success from content and links, so it would be easy to conclude that they’re right.

But they’re not. For them to claim that their efforts in content and social have made technical SEO unnecessary is a bit like a Formula 1 driver claiming he won the race purely due to his own driving, ignoring the effort that has gone into the car he’s been whizzing around the track. This is insulting to the Formula 1 engineers that have enabled the racing driver to win, just as claiming that technical SEO is unnecessary is insulting – and dangerously short-sighted – to the folks who build and optimise the platforms that enable all your content and social efforts.

The thing is, technical SEO is not ‘sexy’. In the early days of SEO, pretty much all practitioners were coders and IT geeks, and the industry was primarily about using clever technologies to game the algorithm. Nowadays SEO has evolved to such an extent that it aligns a lot with classic marketing, and this is reflected in the backgrounds of the industry’s professionals – marketers, copywriters, journalists, and designers are now more common in SEO than computer science graduates. And for these people it’s very comforting to hear that technical SEO is unnecessary, because it’s an aspect they’re not comfortable with and can’t easily navigate. It’s a nice message for them, putting them at ease and allowing them to remain confident that their content strategies will continue to drive results.

Until they start working on a website that is technically deficient. Then the trouble starts.

For most small to medium-sized websites, technical SEO is indeed not a huge priority. Especially if a website is built on a popular CMS like Wix or WordPress, there’s usually not a lot of technical SEO work that needs to be done to get the website performing optimally. But for larger websites, it’s an entirely different story. The more complex a website is, the higher the chances that some aspect of its functionality will interfere with SEO, with potentially catastrophic results. There are countless ways that a website’s technical foundation can go wrong and prevent search engines from crawling and indexing the right content. It takes someone skilled in technical SEO to identify, prevent, or fix these problems. Anything from simple things like slightly inaccurate blocking rules in robots.txt and faulty international targeting tags, to major issues like spider traps, automatic URL rewrites with the wrong status codes, or incorrect canonical implementations, can wreak havoc with a website’s performance in organic search results. And if you as an SEO practitioner are not familiar with the ins and outs of technical SEO, you couldn’t even begin to diagnose the problem, let alone fix it.

So let’s be clear once and for all: technical SEO is absolutely necessary. Now if you’re an SEO with a non-technical background, there’s no need to panic. You can have a very successful career in SEO with limited technical know-how. But you do need to know and accept your limitations, and be able to call on expert help when you need it.
However, I do recommend that every SEO practitioner develop at least a rudimentary understanding of technical SEO. I don’t think this is optional. You need to know the basics, if only to enable you to recognise when something technical might cause SEO problems and you need to call in further support. If you have zero technical SEO knowledge, you’re not going to be able to recognise technical issues when they arise, and that’s a dangerous position to be in.

Learning Technical SEO

So now that we’ve established that a baseline of technical SEO know-how is valuable, how do you go about achieving that? I wish I could tell you that all you need to do is read a Moz starter’s guide and some blog posts and you’re done. But if I did that, I’d be lying. The truth is, learning technical SEO is not easy, especially if you have no technical background at all. This stuff can be challenging to wrap your head around. But, if you use the right approach and try to learn it one step at a time, you will be amazed at how quickly you’ll become sufficiently proficient in the technical side of things. And, more than that, you’ll find yourself applying that technical know-how across many other digital marketing channels as well. This is because, at its core, understanding technical SEO is about understanding how the web works, and understanding how search engines work. And by knowing these things – especially the first one – you will become a more informed, more effective, and more successful digital marketer.

How The Web Works

The internet has become such an integral part of our daily lives that few of us ever stop to think about how it all works. It’s something we just take for granted. But when you make a living from the internet, it pays to know how it functions under the hood.

Every year I give a guest lecture for an MSc programme at a local university. The MSc is in digital communications, and the lecture is an introduction to coding on the web: explaining the differences between static markup like HTML and XML, dynamic code like JavaScript, and server-side code like PHP and ASP.NET. The first time I gave that lecture, it quickly became apparent that we should have started at an even more basic level. The students, most of whom had no technical background, couldn’t really grasp the differences between the different types of code, because they didn’t understand the basic client-server architecture of the internet. It was really eye-opening for me and the academics organising the course. We had to go back to the drawing board and revise the module to ensure that we started with the basic structure of the internet before we delved into explaining code. We also had to emphasise the difference between the web and the internet, something that gets muddled too easily in many people’s vocabulary.

Learning Internet Networking

Unfortunately, there’s no ‘Internet Architecture For Dummies’ book I can recommend to get you up to speed. The client-server structure of the internet is usually taught as part of A-level computing, and is assumed to be basic knowledge when someone starts a degree in computer science or a related technical field. Most marketers, of course, don’t necessarily have this foundational knowledge. So if this is unknown territory for you, I would advise you to read up on client-server models and basic internet networking, as you will need at least a rudimentary understanding of these concepts later down the line.
Learning Code

The next step is then to learn the basics of the various types of code that enable the internet – and the web specifically – to function. Now I’m not one of those SEO guys that believes we all need to learn how to code. I don’t think being able to code is a crucial skill. In fact, I consider myself an expert technical SEO guy and I couldn’t string together a coherent {if, then, else} statement if my life depended on it. Rather, I see coding skills as a gateway to a much more important skillset: problem solving. A lot of technical SEO is about troubleshooting and problem solving. Learning to code is a means to acquire those skills, but by no means the only way. It’s about logic and reason, and knowing where to look.

While you don’t need to be able to code, I do think it’s important that you understand markup and are able to troubleshoot it. HTML and XML are two markup languages (the ‘ML’ in their acronyms means exactly that) which are very important for SEO. It’s incredibly useful to be able to look at the source code of a webpage or sitemap, and have a pretty good understanding of what each line of markup actually does. The good news here is that there are plenty of online courses and books to help you get started with HTML and XML. A simple Google search will yield literally thousands of results – plus, there’s a section on Google’s own web.dev site dedicated to teaching HTML. So just get started with it – build your own webpage from scratch and learn the various different HTML tags, what they do, and how they combine to format a webpage. Once you’ve grasped the basics of HTML, the intricacies of XML will come easy to you. It doesn’t hurt to have a basic understanding of CSS as well, though I don’t consider that as crucial to technical SEO.

Google Developers

Because Google’s platforms rely on third-party developers to code for them, Google has an entire website devoted to educating developers on best practices for many of its platforms. The Google Developers site has a wealth of resources for a huge range of different development environments. For SEOs, the most useful section there is the Search area, which contains loads of useful tips on how to build websites in such a way that Google’s search engine can work with them. Additionally, the web.dev site devotes entire sections to optimising the load speed of your site, the basics of web security, and much more. If you’re comfortable with HTML, it’s definitely worth looking through the web.dev site and learning how Google wants us to build and optimise websites.

Learning Webservers

Lastly, to round off your understanding of the web, you need to understand how webservers work. More importantly, you’ll want to come to grips with the basic configuration options of some of the most common webserver platforms. Personally, I’m reasonably proficient with the Apache webserver, as it’s such a widely-used platform; many websites that use open source software like WordPress and Magento will run on an Apache webserver. Get yourself a cheap hosting environment, upload your hand-coded HTML pages to it, and start experimenting with server settings and configuration files.

How Search Engines Work

Learning the intricacies of how the web works will give you a great basic skillset for technical SEO. However, it’s only part of the picture. Next, you’ll want to learn how search engines work – specifically, how search engines interact with websites to extract, interpret, and rank their content.
Whenever I teach SEO, I always start with a brief high-level overview of how search engines work. While search engines are vastly complex pieces of software, at their core they’re made up of three distinct processes. Each process handles a different aspect of web search, and together they combine to provide relevant search results:

Crawling: this process is the spider – Googlebot – that crawls the web, follows links, and downloads content from your website.

Indexing: the indexer – Google’s is called Caffeine – then takes the content from the spider and analyses it. It also looks at the links retrieved by the spider and maps out the resulting link graph.

Ranking: this is the front-end of the search engine, where search queries are processed and interpreted, and results are shown according to hundreds of ranking factors.

I give a more in-depth explanation of each of the three distinct processes in this post for State of Digital: The Three Pillars of SEO. When it comes to technical SEO, crawling and indexing are the two most important processes you want to understand (though the third one, ranking, also touches on some aspects of technical SEO). At its core, I believe that technical SEO is primarily about optimising your website so that it can be crawled and indexed efficiently by search engines. This means ensuring the right content can easily be found by Google’s spiders, removing obstacles that prevent efficient crawling, and limiting the amount of effort for search engines to index your content.

It’s important to know that search engines are, in effect, an applied science – namely, the science of Information Retrieval, which is part of the Computer Science field. If you’re serious about developing your technical SEO skills, you’ll want to invest some time in studying the basics of information retrieval. Stanford University (where Google’s founders studied) has put their entire Introduction to Information Retrieval course online, and I highly recommend it. It won’t teach you the ins and outs of how Google works, but it gives you a clearer understanding of the field and its associated lingo. This will make interpreting the various search engine patents, as often analysed by the incomparable Bill Slawski, much more fun and illuminating.

Go Forth and Learn!

I hope all that hasn’t discouraged you from embarking on your own journey of learning technical SEO. As I said, it won’t be easy, but I guarantee that it’ll make you a more effective overall digital marketer – not to mention help you become a truly great SEO. You’ll also find that experienced technical SEOs are very willing to help out and provide answers, guidance, and mentorship. Over the years I’ve learned so much from experts like David Harry, Alan Bleiweiss, Aaron Bradley, Rishi Lakhani, Bill Slawski, and so many others I always forget to mention. I make a point of trying to pass this on to the next batch of technical SEO experts-to-be. So if you come across a technical SEO conundrum and you need a bit of help, don’t hesitate to get in touch. If you get demotivated and downbeat, just remember that learning technical SEO is no different from most things worth doing in life: perseverance and ambition will get you there in the end. Good luck!

  • Choosing The Right Domain Name

The domain name where your website resides is an important part of your online identity. It represents your business in a vital way. Your domain name can have a big impact on the overall success of your website, so it pays off to think hard before you choose one. Ideally your domain name should reflect your business’s core activities, but just having your brand name as a domain name sometimes isn’t enough. Here I will outline some things to keep in mind when you venture out and register your first domain name.

1. Pick the right extension for your primary domain name. If you run a local business that’s not in the USA, it pays off to have a domain name with your country’s local extension. So if you’re a Dutch business primarily active in the Netherlands, stick with a .nl domain name. If you do business internationally or primarily in the USA, always go for a .com domain. Be sure to register your domain on multiple extensions to protect your online brand. Don’t rely on the .com only; also register .net, .org, .biz and .info. If you’re a European company, register the .eu as well. Register the .mobi and .asia just to be safe. You want to protect your domain name and make sure someone else doesn’t grab your domain with a different extension and start competing with you. Have all these extra domain names point to your primary domain name where your website resides.

2. Make it easy to spell. If you have a very complicated domain name that’s hard to type, users will mistype it and get either an error or, if you’re unlucky, reach a different site entirely. So make sure your domain name is easy to type. This also helps when you have to spell it out on the phone to contacts and potential customers. If your company has a difficult name that won’t be easily spelled as a domain, consider registering a different domain name to put your website on. You should still register your brand name as a domain, but it might be a good idea to not have that be the primary address of your website.

3. Make it easy to remember. Try to pick a domain name that’s catchy and memorable so it will stick in people’s heads. If people can’t remember your domain they might just try to find your business by searching in Google, which is likely to lead them to your competitors’ websites.

4. Don’t hyphenate unless you really have to. Hyphens, like numbers, make it more difficult for you to spell out your domain name on the phone, and make your domain more difficult to remember. Only use hyphens to prevent confusion.

5. Make it relevant. If you’re not using your brand name as your domain, make sure your domain name applies to your business. Try to use SEO keywords that describe your core business, but be sure to keep it short. Long domain names often don’t pass criteria 2 and 3.

6. Never use a subdomain or domain provided by your ISP. Always register your own domain name and arrange your own hosting environment. Nothing looks more amateurish than a business website on an ISP’s domain name like http://members.megaisp.com/~user101. A domain name with basic hosting is not expensive, and it shows that you’ve at least put some effort into your website.

7. Register alternatives and misspellings. People make mistakes and will misremember or mistype your domain. If you own that mistyped version, you can still get that visitor to your site. Try to determine the most common variations and misspellings of your domain name and register those domains as well.

8. Avoid existing brand names or competitors.
Many lawsuits have been fought over domain name ownership, and if you pick a domain name that resembles an existing business or brand too much you’re likely to find yourself on the wrong end of a legal settlement.

Your domain name is not the be-all and end-all of your online presence, but it’s a big factor and deserves proper consideration. Also make sure you check out this great guide from FirstSiteGuide.com on how to choose the best domain name.

  • Are You Happy With Your Website?

    Many small business owners that have a poor website don’t realize they have a poor website. This is often the case when: A) they’ve put a lot of time and effort into building their website themselves, or… B) a friend or family member claiming to be a web expert has built the site for them, or… C) they spent a lot of money on a ‘professional’ web developer. Due to the investment of time and money and emotional attachment they’re unable to look objectively at their website and see its flaws. When asked about the success of their website they usually answer that they’re quite pleased with it. Sometimes I get the uncomfortable task of explaining to one of them that no, really, your site could do with some improvement. I usually try to drill down to the core of the issue by asking them what the purpose of their website is. What is the goal? What is your website supposed to be doing for you? Many business owners haven’t set any goals for their website, they’re just happy to be online. Very few look at their web statistics regularly. Even fewer act upon the information. A typical conversation on this topic goes something like this: “So, what is the goal of your website? What do you want to accomplish with it?” “Well, I want my information to be out there, to show my products online.” “Okay, so why do you want people to look at your products?” “To make them aware of what I can do for them.” “I see. So when they see your products on your website, what do you want them to do with that information?” “Well, I’d like them to pick up the phone or send me an email or whatever. Get in touch.” “So your website is meant to generate leads, is it?” “Yeah, basically, that’s it.” “And how many leads are you getting from your website every week?” “… I don’t know. One or two I think, I’m not sure.” “Okay, let’s assume it’s two leads a week. How many people visit your website every week?” “Well, eh, I’d have to look that up.” “I see here on your web statistics package that you get about six hundred visitors a week.” “Oh, well, that’s good. That sounds like a lot.” “Only two of them are getting in touch with you. Two out of six hundred, that’s not a very high percentage, is it?” “… Eh, I suppose not, no.” That’s the beginning of a slow, painful path from ignorance to denial, anger, frustration, and awareness. Ideally this ends in acceptance of the inconvenient truth that a good website requires constant attention. If you want to make the most of your online presence, you need to commit a certain amount of time and money to it. The good news is that it doesn’t require countless hours and infinitely deep pockets to maintain a solid website. A lot of things can be done easily and cheaply and can yield tremendous results. So ask yourself, are you happy with your website? Should you be?

  • The Dangers Of Corporate Jargon

Their main mistake was that in their eagerness to differentiate, they adopted a different type of jargon. Instead of calling their products ‘barcode scanners’ they began referring to them as ‘image scanners’ and, even worse, ‘imaging devices’. This lingo was soon used extensively in their print material and on their website. The problem is that nobody without intimate knowledge of the barcode industry knows what an ‘imaging device’ is. Your potential customers usually aren’t intimately involved in your industry, so you have to assume they haven’t learned the lingo. Keep this in mind when you write copy for your website. Whenever you use a term or phrase that’s common in your industry, ask yourself whether an average 15-year-old (if such a creature exists) would understand what it means. If the answer is no, try a different way of wording it. Sometimes you can’t escape using corporate lingo and industry jargon on your website, but it pays to keep it to a minimum. After all, few people will type “imaging device” into a Google search box. Most will type “barcode scanner”.

  • Keep Your Forms Short And Simple

As a result users are less inclined to type a lot of information into a website’s form. Whether it’s a contact form or an order form, users will be reluctant to give you their information. Many research studies show that elaborate web forms turn users away. Every field you add to a form will make it more likely a user will not fill it in and simply go somewhere else. Form fields like address and phone number especially throw up barriers for users who are concerned about their privacy. It’s therefore important to keep the forms on your website as short and simple as possible.

A mistake I often see is that companies base their forms on their own internal wish-list of customer information. Sales people especially want to have as much information on their customers as they can get their hands on. This usually leads to long forms that request a lot of information from users, often with little to no reward for the user when they fill it all in. It’s necessary to use forms on your website, as a form makes it easier for a user to get in touch with you. But when you ask for much more information in the form than what you’d ask for if the customer simply phoned you, you’re not likely to get a lot of submitted forms.

Whenever you create a form for your website, keep these guidelines in mind to ensure your visitors will feel comfortable filling it in and giving you their information.

Only ask for the absolute bare minimum. For generic contact forms the name, email address and message fields are enough. For online order forms, only ask for the minimum information you need to properly complete the order process. Any additional field risks a potential customer turning away and going to a competitor.

Reward your users for giving you their information. If you really, really need to ask a lot of information from your users, give them a reward that fits the amount of information you’ve requested. This reward can be in the form of a free downloadable ebook or white paper, a chance to win a prize like an MP3 player, or another reward that fits with your target group. Make sure this reward is clearly indicated on the form itself.

Give your form proper context and explanation. Don’t just put a form up on a web page without any explanation. The best forms are those that are short and simple and clearly indicate to the user what happens with their submitted information.

Encourage your users to submit the form. By using action words such as “submit now”, “learn more”, and “sign up today” you encourage your users to fill in the form and will make them feel good about doing so.

Include a privacy policy. Link to your privacy policy and be sure that it states you will never give your users’ information to any third party. Your privacy policy needs to be in plain language as well – hiding your intent behind cryptic legalese will not engender any trust. It also helps to state clearly on the form itself that you won’t share your users’ information.

Use a “thank you” page. When a user submits the form, send them to a “thank you” page where you confirm what you will do with their information, such as replying to the customer’s inquiry, giving them the link to the downloadable reward, enrolling them in the prize draw, etc.

Measure the submission rate. Track how many submissions you receive compared to how many page views the form itself gets. If the submission rate is very low, you’ll need to tweak your form even more.
A submission rate of 20% is a very good figure for a generic contact form, so don’t be surprised if your form does a lot worse than that.

Use a simple CAPTCHA to ensure your submitted forms actually come from humans instead of automated spam robots. Some CAPTCHAs are overly complex and difficult to read even for humans, which leads to real people abandoning your forms instead of just spambots.

Simple forms pay off in the long run. You may generate some additional work for yourself or your sales people with the limited information you receive, but it will result in many more contact moments with your clients and eventually in more paying customers.

  • Fix Your Broken Links

As your website grows it gets harder to keep track of all its links. A change you make on your site may break an internal link, and you can’t control what other sites do that may invalidate your links to them. But there are several ways of finding broken links so you can fix them.

For internal links you can look at your web statistics and find the 404 error pages. These errors occur when someone clicks a link that points to a webpage that doesn’t exist. A link can become broken if you deleted a page, moved it to a different URL, or typed in a wrong link. Make sure to regularly check your statistics for 404 errors occurring on your website.

Another way to find broken links is to use a link checker. You can use Google’s Webmaster Tools to quickly identify many errors on your site, including broken links. Another tool, and a long-time favorite of mine, is Xenu’s Link Sleuth, a small Windows program that spiders any site you point it at and returns extensive reports on all it has found, including broken links. An added benefit of Xenu is that it can also check external links. By running Xenu regularly on your site you can keep track of your outgoing links and correct them if one suddenly stops working.

However, such automated tools have their limitations. Nothing can replace a set of human eyeballs when it comes to checking if the pages you send your website’s visitors to still have the content you want them to see. So go through your site once in a while and click on every link and button to make sure it all works as it should.
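If you’re comfortable with a little scripting, a basic link check is also easy to roll yourself. Here’s a minimal Python sketch that extracts the links from a single page and reports any that don’t return a 200 status; it assumes the requests library is installed and is nowhere near as thorough as a dedicated crawler like Xenu.

```python
# Minimal sketch of a broken-link check for a single page - not a full site
# crawler. Assumes the 'requests' library is installed.
from html.parser import HTMLParser
from urllib.parse import urljoin

import requests

class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and not href.startswith(("#", "mailto:", "javascript:")):
                self.links.append(href)

def check_links(page_url):
    parser = LinkExtractor()
    parser.feed(requests.get(page_url, timeout=10).text)
    for link in parser.links:
        full_url = urljoin(page_url, link)          # resolve relative links
        try:
            status = requests.head(full_url, allow_redirects=True, timeout=10).status_code
        except requests.RequestException:
            status = None
        if status != 200:
            print(f"Possible broken link: {full_url} (status {status})")

check_links("https://www.example.com/")
```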

  • Check Your HTML and CSS Code

HTML stands for HyperText Markup Language and is the granddaddy of internet codes. It describes the basic markup of your content. From bold and italics text to displaying images and laying out tables, HTML does it all. HTML does have its limitations, which is why CSS was invented. CSS stands for Cascading Style Sheets and it does a lot of what HTML does, and then some. The main advantage of CSS is that you can use it to separate the content of the site from the look & feel. This means that with a proper implementation of CSS you can change the way your website looks without having to make any changes to the content.

Having good code is an important aspect of a good website. The HTML and CSS code that generates your website’s look and feel needs to work properly. If it doesn’t, several things might happen:

Some visitors using different web browsers might not be able to use your website. Different web browsers handle code differently and can show a user very different things based on the same HTML and CSS code. With shoddy code your website might look fine in one browser, but might be hideous in another. Or worse, it might not work at all.

Search engines like Google that visit your site to look at your content might stumble over bad code. Search engines use little automatic programs called web crawlers or spiders that roam around on the internet to supply the search engines with data, which is then indexed; this happens all the time. If your website uses poor HTML code, web crawlers might not index all your website’s content. This will harm your rankings in search engine results.

So how do you know when your website uses bad code? Fortunately there are many ways to check your HTML and CSS code. The World Wide Web Consortium (W3C for short), the international organization that sets the standards for the Internet, has online validation tools that check your website’s code for potential problems.

For your HTML code you can use the W3C Markup validator: http://validator.w3.org/

For your CSS code you can use the W3C CSS validator: http://jigsaw.w3.org/css-validator/

These validators are very thorough and will almost always give you a list of errors. Don’t worry, not all of these errors are critical. Issues indicated by yellow or blue exclamation marks are warnings that are usually not critical, while errors shown with red crosses are more serious and should be fixed where possible.

Good HTML and CSS code for your site has many benefits, but most of all it will help your visitors receive the website experience you want them to.
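If you want to check pages in bulk rather than one at a time in the browser, the W3C’s Nu HTML Checker can also return its results as machine-readable JSON. Here’s a rough Python sketch of that, assuming the requests library is installed and that the public checker still accepts the doc and out=json parameters it documents at the time of writing.

```python
# Rough sketch: ask the W3C Nu HTML Checker to validate a page and print the
# messages it returns. Assumes the public checker accepts ?doc=...&out=json;
# check the W3C documentation if the endpoint has changed.
import requests

def validate_page(page_url):
    response = requests.get(
        "https://validator.w3.org/nu/",
        params={"doc": page_url, "out": "json"},
        headers={"User-Agent": "simple-site-audit-script"},  # identify yourself politely
        timeout=30,
    )
    for message in response.json().get("messages", []):
        # 'error' entries are the ones worth fixing first; 'info' is mostly warnings
        print(f"{message.get('type', 'info').upper()}: {message.get('message')}")

validate_page("https://www.example.com/")
```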

  • Write Good Titles For Your Web Pages

The title of a page, included in your web page’s HTML code between the <title> tags, is an important yet often overlooked part of your website. The title isn’t an obvious and clearly visible part of a page, so it’s easy to think it’s not something you need to put a lot of effort into. That would be a wrong assumption to make, because the titles of the pages on your website are very important:

– The title determines how your page is listed in search engine results.
– The title is the name a page is saved under when someone bookmarks it.
– The title is the first thing a search engine spider looks at.

So page titles are important and you need to make sure every page on your website has a good title. What makes a good title? There are many different ways of writing good titles for your pages. Here are my tips:

Be descriptive and meaningful. Make sure a page’s title is a reflection of its content. When that page shows up in a search engine results list, a user needs to be able to quickly see if your page’s content is relevant. A good, meaningful title will encourage users to click on that result and you’ll get additional visitors on your site. Write your title in the form of a short sentence that accurately describes what that page’s content is.

Keep it short. Search engines limit the amount of text they show in a search result page, and long titles also get cut off in bookmark lists. Try to keep your title at 63 characters or less, as Google will cut off the text there in the search result list.

Include your site name at the end of the title if there’s space for it. By making your site’s name a part of the title you help visitors who bookmark your site to find that bookmark again. However, if your title becomes too long with your site name included, leave it out.

Capitalize Your Words. Make sure every new word starts with a capital letter. This helps with readability and makes your title stand out a little bit more in a long list of search engine results. DON’T WRITE IN ALL CAPS as that is the internet-equivalent of screaming, and people don’t like being screamed at.

Segment your title. You can use your title to reflect the structure of your site, and this helps in giving that page proper context and relevancy. An example:

Linear Retail Scanner – Handheld Barcode Scanners – BarcodeShop

The first part of the title shows that this page is about linear scanners for retail, and the second part makes clear it falls under the category of handheld barcode scanners. The title finishes with the website’s name (which is fictional by the way). All this takes up no more than 63 characters, so it won’t be cut off in search engine results.

Don’t overuse special characters like asterisks, hashes, dots or exclamation marks. This trick is often used to attract attention to a page, but it usually ends up annoying your users and risks making your site look like an amateurish spam site.

Writing good titles for every page of your website takes some effort, but it definitely pays off in the end. It will help you get found through search engines, and that traffic will be relevant for your site.
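If you want to audit existing pages against these tips, a short script can pull each page’s title and flag the ones that are missing or too long. Here’s a minimal Python sketch, assuming the requests library is installed; the 63-character limit is simply the figure suggested above.

```python
# Quick sketch: pull the <title> from a page and flag it if it's missing or
# longer than the ~63 characters suggested above. Assumes 'requests' is installed.
import re
import requests

def check_title(page_url, max_length=63):
    html = requests.get(page_url, timeout=10).text
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    if not match:
        print(f"{page_url}: no <title> tag found")
        return
    title = " ".join(match.group(1).split())  # collapse stray whitespace
    verdict = "OK" if len(title) <= max_length else f"too long ({len(title)} characters)"
    print(f"{page_url}: '{title}' - {verdict}")

check_title("https://www.example.com/")
```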

  • Keep Your Content Fresh

    Copy you wrote for your website six months ago may not be as accurate and relevant anymore as when you first published it. It’s necessary to regularly review your website’s content and rewrite parts of it where necessary. This keeps your content fresh and accurate. Updated content has several advantages: Accurate content shows your visitors you keep up with the latest developments in your field. This will help establish your reputation as a competent organization. Search engines like to see new content. It tells them that your website is being kept up to speed. Search engines prefer to send their users to active websites. By regularly checking your content for required updates you will also spot errors and inaccuracies. This will enhance the overall quality of your website. Don’t let your website’s content get stale. Keep your site alive and fresh by regularly updating your content and maintaining the accuracy and relevancy of your website’s copy.

  • Make Your Legal Statements Human-Readable

    But I do know that, as an intensive user of the Internet, I get bombarded by them. End-user license agreements, privacy statements, legal disclaimers, copyright notices, they’re overwhelmingly abundant. And, as I’m not a lawyer, I don’t understand most of them. These legal statements may be necessary, but that doesn’t mean your customers will read them. For some companies this is exactly the intention as they hide oppressive terms and conditions in cryptic legalese. But if you do business fairly, you may want to consider putting a human-readable version of your legal agreement on your website as well. Creative Commons is a prime example of how you can make a legal agreement easy, even pleasant, to read. For example the Creative Commons agreement for this blog is easily read and understood. There’s also a more traditional version which you’ll agree is much harder to comprehend. So treat your customers respectfully and tell them in plain language what their rights and obligations are. Not only will this eliminate frustration at yet another incomprehensible legal statement, it’s likely to make your customers feel more confident about doing business with you.

  • Using Widgets on Your Website

Web widgets – little pieces of code, often JavaScript – can add a lot of fun features to your website. Widgets come in all sizes and shapes; sports widgets showing the latest scores and standings, weather widgets with accurate forecasts, widgets that connect to social networks like Facebook, the list goes on. You can find them all over the web. Nearly every major content provider offers widgets that allow you to show their content on your website automatically.

Sometimes a widget can really add useful information and functionality to your website. For example, if you have a physical store, a route planner widget will help website visitors find your store’s location. If you organize travel trips around sports events, a widget showing the upcoming fixtures will save you the trouble of maintaining this information yourself.

But often widgets are detrimental to the quality of your website. Widgets, especially when used in abundance, tend to make your site look amateurish and cobbled-together. Be wary of this when you see a cool widget and decide it would look awesome on your own website. Widgets should be used sparingly and only when they add genuine value to your website experience. Don’t just put widgets on your website for the sake of it. For every widget ask yourself if it’s just something you think is cool, or if a user can really benefit from its functionality. And most importantly, a widget needs to be relevant to your website. Showing weather forecasts is pretty useless if your website is about providing interim management. However, for a website about day trips, a local weather forecast in the day trip area could be handy – though if the weather there is consistently bad, it might actually hurt your business.

In short, if you want to use widgets on your website, do so with caution and awareness.
