Category Archives: Advice

Use Google Alerts & Crawl Logs to Help SEO

April 19th, 2012

In keeping with my self-imposed blogging regime, there have been some definite benefits over this last week from being so strict on myself. In particular, I wanted to cover 2 main points: the way search engines crawl websites and the difference between that and indexing.

The catalyst for this post came from Google themselves yesterday. I received an email from Google Alerts (I completely forgot that I had them set up for this website) saying that it had something new in its index from this website. The new URL it had discovered was the post just before this one about successful hobby blogs. This particular alert caught my interest because this was not the first post of my blog reactivation – that happened a few posts (and about a week) before.

So I got to thinking that maybe Google might have noticed the jump in new content on this website and for one reason or another I have stepped through some kind of filter/trigger that allows my website to be taken a little bit more seriously.

So I fired up my terminal and grabbed the server access log for this domain to see if there had been any changes. The good news is I had logs starting from about Nov 2011, so I had a good data set to analyse and see if there had been any noticeable increases in crawling (note the word used).

I ran a little Python script with some extremely basic logic to make the whole process go a little faster. What I didn’t count on (and I have no idea why it went like this) was that the timestamps for the last couple of weeks (from the beginning of April until now) were not in chronological order. I am going to check with my server guy for some ideas as to why this happened, but it is extremely unusual behaviour. I overcame this setback with some manual changes to get it right, as opposed to building some logic around these edge cases. I will be keeping an eye on this and hope to have some answers for you at a later date. Or better still – if you know why, please leave me a comment or email me and I will share it with everyone.
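For anyone who wants to try the same thing, here is a minimal sketch of that kind of script. It is not the actual script used for this post. It assumes the server writes the common "combined" access log format, it trusts the user-agent string, and the sample log lines are made up for illustration. Sorting the daily totals by date means the order of lines in the log file no longer matters.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the date part of a combined-format timestamp such as
# [03/Apr/2012:09:15:02 +0000]. The log format is an assumption.
TIMESTAMP_RE = re.compile(
    r"\[(\d{2}/[A-Za-z]{3}/\d{4}):\d{2}:\d{2}:\d{2} [+-]\d{4}\]"
)


def googlebot_requests_per_day(lines):
    """Return a date-sorted list of (date, request_count) for Googlebot hits."""
    counts = Counter()
    for line in lines:
        # Crude filter: looks for "googlebot" anywhere in the line, so it
        # trusts the user-agent string and does not verify the requester.
        if "googlebot" not in line.lower():
            continue
        match = TIMESTAMP_RE.search(line)
        if match is None:
            continue  # skip lines without a recognisable timestamp
        # %b expects English month abbreviations (default C locale).
        day = datetime.strptime(match.group(1), "%d/%b/%Y").date()
        counts[day] += 1
    # Sorting here makes the result independent of the order in the file.
    return sorted(counts.items())


# Hypothetical sample lines, deliberately out of chronological order.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LINES = [
    '192.0.2.10 - - [03/Apr/2012:09:15:02 +0000] "GET /blog/ HTTP/1.1" 200 5120 "-" "%s"' % GOOGLEBOT_UA,
    '192.0.2.10 - - [01/Apr/2012:22:40:11 +0000] "GET / HTTP/1.1" 200 4096 "-" "%s"' % GOOGLEBOT_UA,
    '198.51.100.4 - - [01/Apr/2012:23:01:45 +0000] "GET / HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (Macintosh)"',
    '192.0.2.10 - - [03/Apr/2012:09:15:09 +0000] "GET /about/ HTTP/1.1" 200 2048 "-" "%s"' % GOOGLEBOT_UA,
]

for day, count in googlebot_requests_per_day(SAMPLE_LINES):
    print(day.isoformat(), count)
```

To run it against a real log, pass an open file object instead of `SAMPLE_LINES`; the function only needs an iterable of lines.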

Anyway, I have attached the graph of the requests, and as you can see, over the period there was not enough of a difference in the number of crawl requests by GoogleBot to explain the Google Alert coming through.

Graph of GoogleBot Requests

What this tells me is that there is not a direct correlation between crawling and indexing. So whilst Google does go around crawling URLs with its spider, how often your site gets crawled does not improve your chance of getting pages into the index. As a point of reference, my other recent posts were definitely crawled even though I never got an alert for them.

Now off the top of my head, I do not know which index Google sends out its alerts from, but I am guessing it’s close to the primary/secondary level rather than something older.

I know from speaking to bloggers as well that blogger rhythm is often said to be much more important than the amount of content that gets generated. I am subscribing to this theory as well from here on in and I will update in a month or so about whether or not crawling/indexing behaviour has changed due to my “rhythm.”

How can you use this information for your SEO strategy?

Simple.

  1. Set up Google Alerts for your domain (not keywords), using a query such as inurl:bentortora.com
  2. Grab your crawl logs and grep them for GoogleBot requests
  3. Record all the new/updated pages/URLs on your website for the same period as your log files
  4. Record the date you received a Google Alert for the created/updated page
  5. Compare with when it was crawled
  6. Figure out the pattern of crawl rates and content releases to make sure that you release on a day that Google will visit
  7. Figure out the average time difference between crawl and index, so that you can give some rough timelines to clients/managers on turnaround times for your SEO campaign.
  8. Take the opportunity to clean up a lot of dud requests coming into the server (for instance, I noticed Google spending some time in my JavaScript folder and requesting those scripts – I don’t need it there at all).
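Steps 4, 5 and 7 boil down to a bit of date arithmetic. Here is a rough sketch, assuming you have already recorded, per URL, the date GoogleBot first crawled it and the date the Google Alert arrived. The function name, URLs and dates are all hypothetical.

```python
from datetime import date
from statistics import mean


def crawl_to_alert_lags(first_crawled, alerted):
    """Days between the first crawl and the Google Alert, per URL.

    Only URLs that appear in both mappings are compared, so pages that
    were crawled but never triggered an alert are left out.
    """
    return {
        url: (alerted[url] - first_crawled[url]).days
        for url in first_crawled
        if url in alerted
    }


# Hypothetical records: URL -> date of first GoogleBot request (from the logs)
first_crawled = {
    "/post-a/": date(2012, 4, 10),
    "/post-b/": date(2012, 4, 12),
    "/post-c/": date(2012, 4, 17),
}
# Hypothetical records: URL -> date the Google Alert email arrived
alerted = {
    "/post-a/": date(2012, 4, 13),
    "/post-c/": date(2012, 4, 18),
}

lags = crawl_to_alert_lags(first_crawled, alerted)
average_lag_days = mean(lags.values())
print(lags, average_lag_days)
```

With only a handful of data points the average is a very rough guide, so treat it as a conversation starter with clients rather than a promise.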

So I hope everyone took something away from this. I am extremely unwell today, so please forgive the bad writing. I should give this post a re-edit when feeling better (I will also fix up my Python script and release it too).

How to Make This Blog Succeed

April 17th, 2012

Another day has gone by where I have not done 2 things: blog and make time to blog. I really am pushing myself this week to do something every couple of days that motivates me and others, as opposed to filler. The inspiration for this post was an article on Mashable the other day called How a Sports Fanatic Turned His Blog Hobby.

This article appealed to me because most days I offer advice to many companies about blogging, content generation, content distribution and analysis of the content’s performance. One of the main contributing factors to the success of all the advice is that the implementation workload is spread across a team of people.

However, after a full day of consulting, pitching and analysing, the last thing that I really want to do when I get home is fire up the MacBook Air and repeat this all over again, by myself, when I haven’t even caught the night’s rerun of Seinfeld.

And that is why all my hobby blogs (including this one – I’m classifying it as a hobby at the moment) have failed. There is no other word for it.

So I was pretty excited about seeing the video from Matthew Cerrone about how he was so dedicated to stick out his hobby and make it his day job. Whilst I was watching it, I flagged what he has done correctly, contrasting it to my wrong choices:

  1. He picked a subject he loved – One of my biggest failures was a mortgage website, started primarily because I knew the returns from affiliates were so awesome. Bad move.
  2. Dedication – I am not sure how much time he spent in the early days when he was holding down a day job, but the fact that after work he was willing to regularly clock some minutes on it has motivated me to be consistent here.
  3. Focus – One blog, not 5 (I kid you not, I was trying to maintain 5 blogs and a day job for a while). Stupidity.
  4. Courage – To realise you are on a good thing like he did and take the risk to shift the entire focus to the project is definitely something I have lacked. I guess it is the conservativeness inside of me that prevents true success from blogging full-time.

So the takeaway for this post is quite simple: I am more motivated than ever to make this blog (BenTortora.com) my primary focus and to make it a success more than any other attempt at blogging before. And I feel I owe it to myself since this website, more so than any other, carries my own name.

Taking Your Website MultiNational & MultiLingual

November 28th, 2011

Recently I sat down with long-time friend Mike Casey, who has been running a graduate jobs website in Australia for a number of years and is now expanding into international markets. He was after some advice on expanding his websites into multiple countries, with the condition that each country would be able to handle multiple languages. Some countries also needed the ability to serve multiple languages inside the same website. So what is a guy to do in this situation?

Well we both did some research and found some really interesting links and information on how to run multinational, multilingual websites. We will share our thoughts here for all so others can learn from what we discovered because finding good information was harder than we first thought.

Taking a Website Multinational

We won’t go into the technologies and code required to replicate a website across countries, because that is beyond the scope of this post. We assume you have your platform sorted in this case and just want to know how you need to modify templates and URL structures.

Get the Right Top Level Domain (TLD)

The first thing you need to do is to make sure you have bought all the country specific domain names for your organisation. In a perfect case scenario you would go for a URL that was character for character identical. For example, Mike had made sure he bought www.gradconnection.co.uk and not www.grad-connection.co.uk (the character change is a notable difference in the eyes of search engines).

Pick a Default Language to Display

Even if your new market speaks multiple languages in the same country, your website needs to default to one in particular. This becomes a business decision and should be done on what is best for the users in the region. For example, in Hong Kong, the vast majority of websites default to English and not Cantonese.

Meta Geo-Location Data Tags Mean Nothing

In the footer I have referenced links from Google’s Webmaster Blog that state that meta geo-location tags do not help identify a website with a particular country. So don’t waste your time.

Setting up a MultiLingual Website

This is where things get really tricky, so take notes, there is a lot of work to do here.

The HTML lang Attribute

One thing we discovered during our research is that the HTML lang attribute is fairly important in a multilingual website. So for example:

<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">

means that this page is being served in the English language.

So when it comes to serving the same website in multiple languages, each and every page needs to have the correct attribute value in place so search engines and screen readers know what to do with the words on the page. HTML lang attribute – remember it!

How do you know what the right value for the attribute is? After searching high and low, we found this link that provides all the language identifiers (RFC 3066, which has since been superseded by RFC 5646, also known as BCP 47). Have a search through that page and find the relevant code to serve in the markup of your page.

How you flip the attribute value in your template files is completely up to you. Talk to your developers and explain the importance of having it there.
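As one possible way to flip the value, here is a small sketch using Python's built-in string templates. The function and template names are hypothetical and your own templating system will have its own way of doing this; the point is simply that the language code is a variable, not hard-coded markup.

```python
from html import escape
from string import Template

# Same opening tag as the example above, with the language code as a
# placeholder. The template name is hypothetical.
HTML_OPEN_TAG = Template(
    '<html xmlns="http://www.w3.org/1999/xhtml" lang="$lang" xml:lang="$lang">'
)


def html_open_tag(language_code):
    """Render the opening <html> tag for the given language code."""
    # Escape the value so a malformed code cannot break out of the attribute.
    return HTML_OPEN_TAG.substitute(lang=escape(language_code, quote=True))


print(html_open_tag("en"))
print(html_open_tag("es"))
```

Each language version of a page would call this with its own code, so the English page gets `lang="en"` and the Spanish page gets `lang="es"`.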

Language Context & Grammar Need to be Correct

Running your website through Google Translate is not an effective or sensible multilingual strategy. If you do not want your website being flagged as spam, your only option is to get it hand translated. Period.

On a different Google Webmaster Blog post (referenced in the footer), they even recommend that all boilerplate text, template buttons, navigation, etc. be translated. Not only will this benefit your users, but it will also add to the authenticity and trust-factor of your website.

Always use UTF-8 Encoding

This is a simple fix: make sure the following piece of HTML appears in all your templates (note: there is different markup for HTML5 and the previous HTML standards):

HTML5

<meta charset="UTF-8" />

Previous Versions of HTML

<meta http-equiv="content-type" content="text/html; charset=utf-8"/>

Every Page in Every Language Needs a Unique URL

The 2 options available to the development team are:

Once again this becomes a business decision, along with a chat with your developers. Subdomains might cause problems with mobile websites and other network setups. Whatever option you choose to go with, stick to it and implement the required URL routes.

Use Sessions to Set User Language Choice

Once a user arrives at your website and makes a conscious decision to switch the language, the best place to store that is in a session for the duration of the visit. If you want to remember that preference, store it in a cookie. Do not do server-side redirects based on IP location, browser settings, etc. – Google doesn’t recommend it.
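To make the order of precedence concrete, here is a framework-free sketch. It assumes the session and cookies behave like plain dictionaries, and the key name, language list and function names are all hypothetical. Note that it never looks at the IP address or browser language settings.

```python
# Hypothetical configuration for a site that defaults to English.
SUPPORTED_LANGUAGES = ("en", "zh-HK")
DEFAULT_LANGUAGE = "en"
LANGUAGE_KEY = "lang"  # key used in both the session and the cookie


def resolve_language(session, cookies):
    """Pick the language to serve: session first, then cookie, then default.

    Deliberately ignores IP location and browser language settings.
    """
    for source in (session, cookies):
        choice = source.get(LANGUAGE_KEY)
        if choice in SUPPORTED_LANGUAGES:
            return choice
    return DEFAULT_LANGUAGE


def set_language(session, choice):
    """Store an explicit choice made by the user for this visit."""
    if choice not in SUPPORTED_LANGUAGES:
        raise ValueError("unsupported language: %r" % (choice,))
    session[LANGUAGE_KEY] = choice


session, cookies = {}, {}
print(resolve_language(session, cookies))  # nothing stored yet, so the default
set_language(session, "zh-HK")
print(resolve_language(session, cookies))  # the user's explicit choice
```

Writing the cookie itself is left out because that part depends entirely on your web framework.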

Also, an interesting fact we discovered: GoogleBot does not send any language data in its requests to your server. So don’t try to get tricky with that either.

Same Language Linking

For maximum SEO impact, common sense as well as a couple of blog posts suggest that links pointing to a specific language page have the most impact when the link is marked up in the same language. So links coming from a Spanish website should have Spanish words in the anchor text and point to the Spanish version of the page on the website, not the English version or the homepage.

Conclusion & References

So after implementing all of the above points, you are well on track to having a fully effective multinational and multilingual website. And if your platform can’t handle any of the above changes, then you might have to reconsider before you go worldwide.

If anyone else has some good recommendations about going international, please leave a note in the comments!

References

Clean Out Your Referrer Traffic

August 10th, 2011

If you have a collection of websites that you manage, it is really just a matter of time before you get spammed by bots, scrapers, and dodgy URL shorteners. Most of these poorly built spam bots will make some badly coded requests that will trigger your Google Analytics code. This then fills your referral traffic reports with redundant numbers if you do not take one of 2 options:

This morning I spent 40 minutes blocking domains and IPs because I feel it is the best way to tackle it. Reasons being:

Sure it was time consuming and I will never win the war, but at least now both my server and my Google Analytics reports love me that little bit more.
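For illustration only, here is a sketch of the kind of check that sits behind a block like this. It is not my actual server configuration: the blocklists, domain and IP addresses are placeholders, and in practice you would more likely put the rules in your web server config than in application code.

```python
from urllib.parse import urlsplit

# Placeholder blocklists; the domain and IP are reserved example values.
BLOCKED_REFERRER_DOMAINS = {"spam-example.test"}
BLOCKED_IPS = {"203.0.113.7"}


def response_status(referrer, remote_ip):
    """Return 403 for blocked IPs or referrer domains, otherwise 200."""
    if remote_ip in BLOCKED_IPS:
        return 403
    # hostname is None for an empty referrer or one without a scheme.
    host = (urlsplit(referrer).hostname or "").lower()
    for domain in BLOCKED_REFERRER_DOMAINS:
        # Block the domain itself and any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return 403
    return 200


print(response_status("http://www.spam-example.test/offer", "198.51.100.1"))
print(response_status("http://example.org/", "198.51.100.1"))
```

Matching on the parsed hostname rather than a raw substring avoids accidentally blocking a legitimate referrer that merely mentions a spam domain in its path.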

If anyone knows of a tool that takes a list of referring URLs, grabs the IP for each one and can tell whether it is part of a spam/traffic-faking network, please, please, please let me know in the comments!

Happy 403ing!
