Properly track Google Images traffic (keywords)
Hannes Johnson
04-05-2007, 02:07 AM
This might also be categorized as a bug...
It looks like there is a character limit to the referring URL.
This can be a problem for example when you're getting traffic from http://images.google.com - because then the referring URL is cut off and you can't always see the referring keywords (what people typed in to find your site/image).
Here is an example of a link I'm seeing in my logs:
http://images.google.com/imgres?imgurl=http://www.my-domain-name.com/articles/a-pretty-long-name-of-the-image.jpg&imgrefurl=http://www.my-domain-name.c
I can't even see what the landing page is...
Do you plan on fixing this? It shouldn't be difficult to fix this, right?
- Hannes
webado
04-05-2007, 07:27 AM
Errr ... I think it is rather difficult, as it would require expanding the database field to allow a much longer URL, and there seems to be no limit to how long they can be.
Also, the entire query string structure is very different from that of a regular search engine, so parsing it for keywords would be difficult.
However, when you say you can't see the landing page ... you know the landing page, since that is the page that recorded the hit.
jeroen1973
05-15-2007, 06:16 AM
This is a good feature request.
I have several hits on my website through images.google.com, but unfortunately I cannot see what keywords they are using.
Regards, Jeroen
www.akershoek.com (http://www.akershoek.com/)
-Ash-
08-12-2007, 12:25 PM
I'm also getting an abnormal number of hits from images.google.com, and it's annoying because I can't see where they have come from (clicking on the link just sends me to my own page, with the Google image info in the top part of the page).
I've even taken the picture and link out of the page that's getting these hits, and I'm still getting hits (is this because it's from images.google.com?).
webado
08-12-2007, 05:33 PM
Yes, the images are already indexed in Google Images. It will take a long time for them to drop out of that index after you have made changes to the pages where the images are being displayed.
You can block Googlebot Images using the robots.txt file if you don't want images to be indexed.
-Ash-
08-13-2007, 07:39 PM
You can block Googlebot Images using the robots.txt file if you don't want images to be indexed.
Nice, thanks. I'm setting that up now.
Hannes Johnson
06-29-2008, 05:53 PM
I just wanted to revisit this thread to see if StatCounter plans on adding this feature. Because a competing service (extremetracking.com) supports this - there I can see what keywords people are using in Google Images to find my images/webpages. The only reason I'm still using eXTReMe Tracking is because they have this feature - if StatCounter would support Google Images tracking I could drop eXTReMe and focus more on StatCounter ;)
- Hannes
lnieves
07-02-2008, 10:52 AM
I second this request, although I understand the potential complication with the URL length. This of course would not be a problem if Google were to place the q= bit at the beginning and not at the end of the full query string.
TheRain
07-19-2008, 04:59 AM
You can block Googlebot Images using the robots.txt file if you don't want images to be indexed.
Is there a good guide somewhere on how to do this? I've been getting inundated with Google Images links in my "Recent Came From" log lately, and I'd love a way to screen them out.
webado
07-19-2008, 05:54 AM
The robots.txt can be used to disallow access to your site to Googlebot Images.
User-agent: Googlebot-Image
Disallow: /
It doesn't mean that all images that have already been indexed will suddenly get deindexed. It takes time.
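For anyone who wants to check what those two lines do before uploading them, here is a minimal sketch using Python's standard urllib.robotparser. The example.com image URL is a placeholder, not anything from this thread, and Python's parser only approximates how a real crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# The directives from the post above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL (assumption, for illustration only).
image_url = "http://www.example.com/photos/cat.jpg"

# The image crawler is shut out of the whole site...
image_bot_blocked = not is_allowed(ROBOTS_TXT, "Googlebot-Image", image_url)
# ...while the regular web-search crawler is unaffected by this rule.
web_bot_allowed = is_allowed(ROBOTS_TXT, "Googlebot", image_url)

print("Googlebot-Image blocked:", image_bot_blocked)
print("Googlebot allowed:", web_bot_allowed)
```

This only confirms how the rules are read; it says nothing about how quickly already-indexed images drop out.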
tosommerfugle
07-19-2008, 01:58 PM
I've just registered, and came to request useful handling of Google Image Search links, too :-)
The database design issue of insufficient field length could be circumvented by rearranging the URL to place the most interesting keywords first. Specifically I'm thinking of the "prev" keyword having the information about the query run by the user.
While rearranging the URL keywords is a deviation from the concept of simply logging what's coming, it should not be difficult to do, and I feel convinced that quite a few StatCounter users would accept this with thanks ;-)
The next logical step would of course be parsing of the query arguments contained within the URL encoded "prev" value, to provide image search stats (including keyword handling) similar to regular Google Search.
Please? :-)
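To make the idea concrete, here is a rough sketch of pulling the search keywords out of the URL-encoded "prev" value with Python's standard urllib.parse. The referrer below is made up for illustration (example.com, and a "prev" value assumed to hold an /images?q=... path); this is not StatCounter's code, just one way the parsing could look.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def image_search_keywords(referrer: str) -> Optional[str]:
    """Return the search phrase hidden in the 'prev' parameter, if any."""
    outer = parse_qs(urlsplit(referrer).query)
    prev = outer.get("prev")
    if not prev:
        return None  # e.g. the referrer was truncated before 'prev'
    # parse_qs has already URL-decoded 'prev', so it now looks like
    # "/images?q=...&hl=..." and can be parsed a second time.
    inner = parse_qs(urlsplit(prev[0]).query)
    q = inner.get("q")
    return q[0] if q else None


# Hypothetical referrer; every value here is an assumption for illustration.
REFERRER = (
    "http://images.google.com/imgres"
    "?imgurl=http://www.example.com/photos/cat.jpg"
    "&imgrefurl=http://www.example.com/photos/"
    "&prev=/images%3Fq%3Dtabby%2Bcat%26hl%3Den"
)

keywords = image_search_keywords(REFERRER)
# Cut off before 'prev', as happens with a short database field.
truncated_keywords = image_search_keywords(REFERRER[:80])

print(keywords)
print(truncated_keywords)
```

The second call shows the problem discussed in this thread: once the referrer is cut off before "prev", there is nothing left to parse.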
webado
07-19-2008, 03:50 PM
The problem with that is that such parsing and rearranging of query string parameters would have to be done at logging time, as opposed to simply logging what is there and parsing at reporting time.
It would result in a much higher server load.
tosommerfugle
07-19-2008, 04:56 PM
I know about programming. If done with performance in mind, using a few simple string manipulations (as opposed to full parsing/reconstruction), it would not cause a lot of server load.
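As a rough illustration of what "a few simple string manipulations" might mean, the sketch below moves any "prev=" parameter to the front of the query string before truncating, without decoding anything. MAX_LEN is a made-up field width (the real limit is not stated in this thread), and the referrer is a hypothetical example, so treat this as a sketch under those assumptions rather than a proposal for the actual logging code.

```python
MAX_LEN = 150  # hypothetical field width; the real limit is not known


def prev_first(referrer: str, max_len: int = MAX_LEN) -> str:
    """Move 'prev=' parameters to the front of the query, then truncate."""
    base, sep, query = referrer.partition("?")
    if not sep:
        return referrer[:max_len]
    params = query.split("&")
    # 'prev=' parameters first, everything else in its original order.
    front = [p for p in params if p.startswith("prev=")]
    rest = [p for p in params if not p.startswith("prev=")]
    return (base + "?" + "&".join(front + rest))[:max_len]


# Hypothetical referrer with a long image name, so 'prev' lands past MAX_LEN.
REFERRER = (
    "http://images.google.com/imgres"
    "?imgurl=http://www.example.com/photos/"
    + "a-pretty-long-image-name-" * 4
    + ".jpg&imgrefurl=http://www.example.com/photos/index.html"
    "&h=600&w=800"
    "&prev=/images%3Fq%3Dtabby%2Bcat%26hl%3Den"
)

plain = REFERRER[:MAX_LEN]      # what plain truncation would store
stored = prev_first(REFERRER)   # what the rearranged version would store

print("prev survives plain truncation:", "prev=" in plain)
print("prev survives after rearranging:", "prev=" in stored)
```

The work per hit is one partition, one split and one join on a short string. Whether that is acceptable at millions of hits is exactly the trade-off being argued here.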
webado
07-19-2008, 07:23 PM
Times millions of hits?
I know about programming too. I am a programmer. When you run an application like this, which you cannot throttle in any way, you have to keep it down to bare bones.
Sure, a database change to increase or remove the limit of the field may be in order. Just the conversion would be a long process I'm sure. One day, maybe it will get done.
TheRain
07-28-2008, 04:26 PM
The robots.txt can be used to disallow access to your site to Googlebot Images.
It doesn't mean that all images that have already been indexed will suddenly get deindexed. It takes time.
So can I take the code that you wrote in the earlier post and just drop it into my template?
The problem I'm having here is that I don't understand what robots.txt is.
The problem I'm having here is that I don't understand what robots.txt is.
http://www.robotstxt.org/
webado
07-29-2008, 12:15 AM
So can I take the code that you wrote in the earlier post and just drop it into my template?
The problem I'm having here is that I don't understand what robots.txt is.
Oh no. That's a file that you do not manage through any web builder program or system.
Notepad only. And you upload it to the server by using an FTP program or, if you have one, the file manager in your website control panel.
The robots.txt file is a plain text file of robots directives. Well-behaved robots try to access this file and, if it's present on the site, will observe the directives.
Directives are usually to indicate what is disallowed and for which robot (by user agent).
It has to be scrupulously correct, otherwise you can run into serious problems.
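One example of the kind of serious problem meant here, sketched with Python's standard urllib.robotparser: writing * instead of Googlebot-Image on the User-agent line turns "keep my images out of image search" into "keep every compliant robot off my site". The URL is a placeholder, and Python's parser is only an approximation of how real crawlers read the file.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str,
            url: str = "http://www.example.com/index.html") -> bool:
    """Return True if robots_txt lets user_agent fetch url (placeholder URL)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# What was intended: block only the image crawler.
INTENDED = "User-agent: Googlebot-Image\nDisallow: /\n"
# A one-character slip: the wildcard applies to every robot.
MISTAKEN = "User-agent: *\nDisallow: /\n"

intended_blocks_web_search = not allowed(INTENDED, "Googlebot")
mistaken_blocks_web_search = not allowed(MISTAKEN, "Googlebot")

print("intended file blocks Googlebot:", intended_blocks_web_search)
print("mistaken file blocks Googlebot:", mistaken_blocks_web_search)
```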