Why is it so hard to measure website traffic?
Today’s topic is a little geeky, but it’s something I’m interested in, and so are a lot of other bloggers, so here goes.
The question recently arose concerning how many visitors and unique visitors come to my blog on a monthly basis. If you’ve ever tried to penetrate the arcane mysteries of this issue, you know it’s an alligator-infested swamp.
There are different reasons why bloggers want to know how many people visit their site. If they’re accepting advertising, the number of visitors is crucial, since advertisers pay by the eyeball. More viewers, higher rates: same as it’s ever been or, to coin a phrase, as it is in print, so it will be online.
I don’t take advertising, but I’m interested in numbers. Everybody who puts a product out there wants to know if people like it or not. Movie producers check the box office; authors check their book sales; and bloggers like to know their visitor stats. You can call that egotistic, but it’s merely human nature.
But how do you get statistics? This is where we step through the looking-glass and enter Weirdville.
There are many services that purport to reliably measure such things as visitors and unique visitors to a website. (“Visitors” is anybody who goes to your site in a given time period. “Unique visitors” is individual people who go to your site in a time period. If I, Steve, go to your site 30 times in a month, I will count as 30 Visitors but only 1 Unique Visitor. This is my understanding of it, anyhow, and I’m sure people will correct me if I’m wrong!)
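To make the distinction concrete, here is a minimal Python sketch. The visit log and the names in it are made up for illustration. Real services typically identify a visitor by a cookie or an IP address rather than a name.

```python
# Hypothetical visit log: one entry per visit, labeled by who visited.
# "steve" visits 30 times in the month; "joe" visits twice.
visits = ["steve"] * 30 + ["joe"] * 2

total_visitors = len(visits)         # every visit counts
unique_visitors = len(set(visits))   # each person counts once

print(total_visitors, unique_visitors)  # 32 2
```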
Most blogs are hosted, for a subscription fee, by a web server that also includes, as part of the package, a statistical service. For example, my web host is a company called Newtek Web Hosting. I can log into their password-protected area and access many different levels of statistics, including my daily and monthly visitors. And it’s been enormously gratifying to see that ever since I signed up with Newtek, more than two years ago, my numbers go in one direction only: straight up. I’m not going to get specific, but let’s just say that for December, 2010 (the most recent month available), my stats were beyond anything I’d ever imagined when I started my little blog in May, 2008.
Keep in mind, I pay for Newtek; their data is private, so I’m the only one who sees the numbers. Most of us bloggers have private data we pay for. But how do you access someone else’s website traffic? There are a number of free sites where you can do that. One of the most popular is compete.com, which allows you, for free, to compare up to 5 websites at a time and see what their unique visitor numbers are. (The free service is not particularly timely; right now, on Jan. 5, I can only get numbers through the end of November. If you actually pay a subscription fee, I think you get more updated reports.) Now, according to compete.com, through the end of November, 2010, my unique visitors were about 1/12th of what Newtek reports.
One-twelfth! That’s a huge discrepancy, and a troubling one for anyone attempting to get to the truth. How do you account for such variation in numbers? That’s what I meant about the swamp.
But wait, there’s more! You can also register with Google Analytics for free to measure your website’s traffic. I just did this for the first time starting Jan. 1, 2011–in other words, five days ago. During that period Google Analytics is showing a number for my Unique Visitors (they call them “Absolute Unique Visitors”) that, extrapolated to a 30-day month, works out to a number that’s almost exactly halfway between Newtek’s and compete.com’s. So what’s going on? Are my unique visitors X, 12 x X, or 6 x X? If you were taking a basic arithmetic test in sixth grade and your results varied by that much, you’d flunk.
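The arithmetic behind that comparison can be sketched in a few lines of Python. Every figure below is hypothetical, chosen only to mirror the ratios described above, and the helper name `extrapolate_to_month` is made up for this example. Keep in mind that scaling five days up to thirty is only a rough guess: unique visitors don't add up linearly, because someone who visits in week one and again in week three is still one unique visitor for the month.

```python
def extrapolate_to_month(count, days_observed, days_in_month=30):
    """Scale a count seen over a few days up to a full month (a rough linear guess)."""
    return count * days_in_month / days_observed

# Hypothetical figures, chosen only to mirror the ratios in the post.
compete_monthly = 1_000      # "X"
host_monthly = 12_000        # about 12 times X
analytics_five_days = 1_080  # made-up five-day Google Analytics reading

analytics_monthly = extrapolate_to_month(analytics_five_days, days_observed=5)
midpoint = (compete_monthly + host_monthly) / 2

print(analytics_monthly)                    # 6480.0
print(midpoint)                             # 6500.0
print(analytics_monthly / compete_monthly)  # 6.48
```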
There are many reports of compete.com’s inaccuracy. This one, which explains how compete.com “gets data from the surfing habits of two million people”–in other words, from a selected random sampling–also describes how “horribly wrong” compete.com has been proven to be. After interviewing technical experts at compete.com, the author (Matt Marshall) wrote this telling statement: “The Compete guys told us they are more interested in serving general consumer sites (their main customers), and don’t really care about niche sites,” which, of course, is what every wine blog is: a niche site.
Complaints about compete.com’s data litter the Internet, most commonly along these lines: “Compete and similar sites can only estimate traffic and they are pretty much always inaccurate. I personally know websites that receive 10x more traffic than Compete says.”
Another widespread complaint about compete.com is that the free data they provide is only a tease to persuade site owners to subscribe for more extensive, and possibly more accurate, data (which may be why compete.com pitches itself to larger, consumer-oriented websites with bigger budgets). This has led critics to lodge complaints like this one: “Let me be clear here, ALL http://WWW.COMPETE.COM SELLS IS DATA and thus if the data is inaccurate then all they are selling is junk. To pour salt in the wounds of a dissatisfied client….the site has endless links to “upgrade” the use for few hundred dollars. This is a classic bait & switch,” especially if “upgrade” is interpreted as “more accurate measuring.”
After several days of online research, I can tell you that the biggest complaint about compete.com is that their traffic reports are consistently about 1/10th those of other metric providers. As one writer stated, “What’s the best way to determine a site’s popularity? Ask the site’s owner.”
This only makes sense, because a site owner will be getting his data directly from the host, which is the only entity that has full access to the site’s server. A host, such as Newtek, doesn’t have to rely on estimates or polls or tracking or projections or algorithms; it can reliably and constantly track real traffic in real time as it pours into the server.
The reason this is important isn’t just to satisfy bloggers’ egos, or even so they can attract advertising. It’s because if there’s no accepted way to measure how many people are going to a site, then the development of online will be stymied, as publishers, content providers and advertisers stand back, waiting for true metrics to emerge. Measuring website traffic will continue to be what it is now: a mystery wrapped in a riddle inside an enigma.
It astonishes me that we–the online world–have gotten to the year 2011, but still have no way of agreeing to something as basic as “How many people visit my website?” This is clearly unacceptable, and a disgrace.
Amen, Brother Steve…
Ah, Steve, as you know well, I also suffer from the same frustrations.
Here’s my take on this, for what it’s worth:
All of these measurement systems use differing methodology, and none will ever really “agree” for that same reason – it’s fundamental, like the current differences between tea party conservatives and liberals.
Google Analytics is not perfect, but it’s based on technology that was becoming the standard in its time and when combining that with Google’s ubiquity, it’s more-or-less the standard now. Which is to say, it offers one of the lesser of all of the evils, and it gives us a bit more common ground on which to talk when numbers are needed. The downside is that it’s private and therefore comparisons between sites’ traffic stats are difficult or impossible.
As I tell many, many other bloggers, I think it’s best to have your own view of success with those numbers, and to not get too caught up in short-term peaks or valleys. Over a year, two years, three years, are you meeting your personal goals? If so, don’t sweat the small stuff with the numbers – blogging is really a marathon run, so the short term data is less important than the overall trends.
I only tend to worry about the numbers when I am approached about advertising, because it becomes more important in that context (and even then its importance in a niche topic like wine is amazingly overblown – better to reach a small, concentrated and passionate audience willing to buy than tons of people who aren’t as willing!).
Wine as you point out is a niche topic, and so expectations for traffic numbers need to be adjusted accordingly. And for those thinking about starting a general-purpose wine blog now – good luck, you will need it, as the field is kind of maxed out (I was lucky and got in early enough that I can still be general-purpose). So newer blogs are better off being a niche within a niche, which means even less traffic (but more influence within that niche-niche!)…
Cheers!
Joe–
You have just said about four mouthfuls.
The first and foremost is this: unless you are blogging for the purposes of making money by acquiring a large audience, spending time worrying about numbers instead of simply blogging because you have something to say is a waste of energy.
Yes, we all blog because we think we have something to say, and, yes, many young bloggers hope that blogging will lead to something more permanent and financially rewarding than blogging just for the sake of seeing one’s words in print. But the fact is that the cost of entry to the blogosphere is next to nothing, and most bloggers start blogging because they think it will be enjoyable and rewarding to the soul. Measuring their efforts by numbers alone is to change the success measure from content to eyeballs.
Secondly, Joe, you and Steve and I and others like us fall into a different category. It is not to be construed as a better category or a category of higher standing, but we do hold ourselves out as professionals. And as such, the first measure of our success has to be the feedback we get from our readership. If we are doing a good job for the readers, regardless of whether they amount to one thousand uniques or five thousand uniques per day, week or month, then we are successful at the most basic level.
The idea that blogging is a marathon run confuses me a bit. For you, that might be true, but, even for someone like Steve, blogging is a means to an end, and often those ends are multifaceted and occasionally even in conflict with one another. Numbers may be a long-run measure of success, and if that is the game, then, yes, one needs to be ready for the marathon. But if success is more than a number, then success can be much more immediate if one feels that the efforts of writing are being met by the acceptance and support of that writing.
Finally, something you said about trying to start a general blog at this late date has struck home to me. When I started my print publication, Connoisseurs’ Guide to California Wine, I came into a space that was not previously occupied. Being first in is a giant advantage. My blog is among the last in. It comes into a space filled with successful blogs like Vinography, Steve Heimoff, 1WineDude, Dr. Vino and others plus dozens of great niche blogs like The Wine Economist and a host of other pros trying to make a living through Palate Press, Zester Daily and Wine Review Online. Being the last one into the pool has clearly not been as quickly rewarding as being first into the pool. But I am very happy with the content that is appearing in the Connoisseurs’ Blog, and I feel like it is adding to the knowledge base.
What you are suggesting is that the blogosphere is becoming more mature. It has settled in and the rate of change has slowed. I see that as a good thing in that there is sure to be a consolidation phase of some sort, and possibly out of that phase will come new and better rewards for folks whose efforts today are largely rewarded with intellectual stimulus. However, I will be surprised if very many bloggers ever make a financial go out of blogging. It is a form of work, but it has yet to become an enterprise.
Charlie – wise words (as always, and I would expect nothing less from my adopted papa!).
What I meant by blogging being a marathon is that gaining success (measured in reader interaction, traffic, anything external really) doesn’t come overnight, it’s the result of consistent hard work and quality. I think this holds true even for guys like you and Steve, who are coming to blogging from a different place, having already made names for yourselves in print.
Because you have names for yourselves already, you can actually be general-purpose blogs and have success because you are speaking to an audience that you have already helped to build via print. Bloggers without that background are in for a longer haul and are less likely to be successful, I think, unless they choose their niche and audience very carefully.
Cheers!
I have been monitoring website traffic for several years. I have tested many programs and compared results. Compete.com is a complete waste of time and does not have the correct information. When I compared Google Analytics to an on-server program, the Google program was accurate. For Google to work, you need to place the code on each and every page you want to track.
The numbers I have seen on your web traffic were consistent with Google’s numbers.
Google’s approach is pretty sound — they embed a small javascript program in your Web page, which means that it tends to count real visitors (as opposed to robots).
Depending on what analytics program your host provides, it may be more or less accurate, but there’s a lot of science to getting good data out of a raw web server log.
The sample-based sites like compete.com are at the mercy of their “Nielsen households,” i.e. the people who they use as their “representative sample.” I would expect them to under-report.
I strongly recommend that winery (and wine) bloggers use Google, unless they are themselves tech geeks with an interest in the details.
(Also, your readership is complicated by people [like me] who subscribe via RSS [e.g. Google Reader], since we don’t actually visit your site — a service like FeedBurner [also a Google product] will give you insight into that readership).
“It astonishes me that we–the online world–have gotten to the year 2011, but still have no way of agreeing to something as basic as “How many people visit my website?” This is clearly unacceptable, and a disgrace.”
I understand where you’re coming from Steve, but I’m not sure you’re clear on the reasons why you can’t get an exact number. Some reasons may be tech-related for sure, but others have more to do with personal privacy.
Some people prefer to visit anonymously, so they run programs that block things like Google Analytics (Firefox’s NoScript, for example) and therefore are not tracked. Apparently switching off cookies can also affect Analytics data.
Not sure how this affects your server’s own tracking system, but perhaps the results would be similar.
The question becomes: what’s more important, the individual’s right to privacy or the web owner’s right to know who’s reading the website?
Even though I’m a web owner myself, I still believe the individual should have the right to visit a website anonymously. There are programs out there that you can purchase that force readers to sign up (even if it’s for free) before they can read the content, which will help you track exactly how many people read your blog posts.
Personally, I think that kind of defeats the purpose. If I lose track of a few readers/website visitors because they want to remain anonymous, so be it. I’d much rather that than to institute a system of precise accountability. I’d lose a lot more readers that way…
~Graham
Numbers are an interesting thing. Because I run the Ad Network, I see numbers for over a hundred different wine websites, and have a pretty good idea, not of exact numbers, but of comparative ones.
First, the question of unique visitors. They can be measured in different ways. Steve, you suggested that if you visit a site once a day for a month, you are just one unique. That is only true if the unique counter is set for thirty days. If, on the other hand, it is set for 24 hours, you show up as 30 uniques. Even more interesting, if it is set for 1 hour and you visit three times a day, 90 uniques it is. Do that with three different websites, one set for an hour, one set for a day, and one set for a month, and exactly the same behavior creates three different “unique” counts.
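Here is a minimal Python sketch of that effect. It assumes the counter works in fixed time buckets, which is a simplification: real analytics packages may use a rolling cookie expiry instead. The function name `count_uniques` and the visit data are made up for this example.

```python
def count_uniques(visits, window_hours):
    """Count distinct (visitor, time window) pairs.

    visits: iterable of (visitor_id, hour_of_visit) tuples, with hours
    counted from the start of the month. A visitor seen again inside the
    same window is not counted twice.
    """
    return len({(visitor, hour // window_hours) for visitor, hour in visits})

# One hypothetical reader who visits at 9:00, 13:00 and 20:00 every day for 30 days.
visits = [("steve", day * 24 + hour) for day in range(30) for hour in (9, 13, 20)]

print(count_uniques(visits, window_hours=1))        # 90
print(count_uniques(visits, window_hours=24))       # 30
print(count_uniques(visits, window_hours=24 * 30))  # 1
```

Same reader, same behavior, three different "unique" counts, depending only on the window.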
Google Analytics, and the others as well, work on JavaScript. If the viewer blocks it, the visit does not count, as noted above.
It is very hard to get solid numbers, but that does not mean it is impossible to get pretty good ones. One way to do it is to load Google Analytics, StatCounter, and Quantcast, and compare them. Where they are consistent, they are reliable. Where they differ, try to figure out why. Do they count “uniques” on a different timer? Is one script blocked while another is not? Is one number a relatively consistent factor of another, perhaps indicating you have accidentally (or intentionally) double-loaded one piece of code?
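That kind of side-by-side check can be sketched in Python. The monthly counts below are invented, the helper names are hypothetical, and the 5% tolerance is an arbitrary assumption. The idea is simply to express each tracker as a ratio of one baseline and flag any ratio that sits close to a whole multiple, which might hint at double-loaded tracking code.

```python
def compare_trackers(counts, baseline):
    """Return each tracker's count as a ratio of the baseline tracker's count."""
    base = counts[baseline]
    return {name: value / base for name, value in counts.items()}

def looks_double_loaded(ratio, tolerance=0.05):
    """Flag ratios close to a whole multiple (2x, 3x, ...) of the baseline."""
    nearest = round(ratio)
    return nearest >= 2 and abs(ratio - nearest) <= tolerance

# Made-up monthly unique-visitor counts for one site.
counts = {"Google Analytics": 5000, "StatCounter": 5150, "Quantcast": 9900}

ratios = compare_trackers(counts, baseline="Google Analytics")
for name, ratio in ratios.items():
    note = "possible double-loaded code" if looks_double_loaded(ratio) else ""
    print(f"{name}: {ratio:.2f}x {note}")
```

A flag here is only a prompt to go look at the page source, not proof of anything.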
The best you can do is approximate traffic, but it is possible to at least get a reasonable picture of trends, numbers estimates, and comparisons to other sites.
David, thank you. I have been told that compete.com is used by most ad agencies. If this is true, and if it’s true that compete.com is ridiculously off from the truth, then the situation is serious and needs to be repaired.
Graham, is there a way for a visitor to remain anonymous, but still be counted? You know, we have such a system in America. It’s called “voting.” You do it so that your vote counts, but no one knows who you voted for.
Compete.com is only accurate if you put their token on your web page, same as Google. Ad agencies only rely on sites with tokens.
A lot has been said in this comment stream already. Joe R. had some good points. I completely agree with @Charlie that blogging is about good content, but I differ in that I don’t believe numbers are a waste of time for a blogger. Web analytics are a beautiful thing. I monitor a dozen or so news sites and blogs through Google Analytics (which is the way to go) and without looking at these numbers we can never gauge what content is working and who is reading it. The attitude that blogging is all about creating great content is right, but the definition of “great content” can be different to the blogger than to the readers. And without readers and a growing audience, isn’t blogging fairly useless? Why not just keep a journal? So, if I can put my two cents in, a serious blogger should always keep regular tabs on their web analytics to gauge how their content is received, who is coming to their site, and if their audience is growing. As to the how – Google Analytics should be the go-to for a blogger. Compete screws up everybody’s numbers. As does Quantcast (although perhaps slightly less). I think it’s totally acceptable and industry standard to put forth your Google Analytics numbers for advertising sales purposes.
Steve,
Your post made me laugh AND feel old at the same time. I remembered how giddy I felt when the counter on my self-built website hit 2000 after about two years (it took me months back in 1999, trying to figure out Front Page so I could get it to actually work). I am no tech guy but I look at our site’s Google Analytics once in a while, and it occurred to me at some point that I have no way of really knowing if this data is true or not. I can only imagine how frustrating this might be for one who depends on web traffic for a living.
Alvin Toffler said: “One of the definitions of sanity is the ability to tell real from unreal. Soon we’ll need a new definition.”
I say he is spot on!
Terry, thanks for your comment. I completely agree with everything you said.
Hi Steve,
I’m not that deep into the technical side. I do know that Google Analytics, for example, runs a JavaScript program, and NoScript disallows it. So in that case, anonymity is simply a by-product. (People run NoScript to prevent downloading of harmful viruses to their computers; to be safe, they stop all of them.)
However to continue your metaphor: although modern democracies use secret ballot, the voting itself is not done anonymously. The polling station still can tell from the roll if you’ve voted, i.e. “visited the polling station” — they have to track that so people don’t vote multiple times.
In that case though, the information theoretically could be kept semi-secret — swipe an ID card or something that’s stored in a database that no human eyes look at. That way you don’t have people physically crossing your name off the roll, but you can still track things like voter turnout. The problem seems to be though that the more technology we place between people and voting, the more chance of mistakes.
Back to websites: I agree with all above that Google Analytics is probably the industry standard these days. Like democracy (ironically), it’s not perfect, but it’s the best thing we have.
~Graham
Steve,
It is possible to absolutely and comprehensively measure web site traffic. To those who don’t make their living designing, building, and maintaining web sites for large corporations, it can seem like a mystery, but it is not.
Google Analytics is the best measurement package available for those who don’t want to spend a hundred thousand dollars on Omniture or CoreMetrics, the leading commercial analytics packages.
The combination of Google Analytics and hosting your RSS feed with FeedBurner will give you an exact picture of how much traffic you get.
The main problem, which I assume that you and many other bloggers suffer from, is that you don’t own your own web site. It’s hosted by a hosted service like wordpress.com or typepad.com, and therefore you’re at the mercy of those services and their (lack of) transparency and ability for you to fully manage your junk.
Alder
Alder, they can manage my junk as long as they don’t touch my junk! But seriously, why would my web host’s numbers not be accurate — in fact, more accurate than Google’s?
Steve,
Just FYI — your functionality to notify commenters that there are follow up comments isn’t working for me. I never saw that you posted this reply.
Your web host’s numbers would be wildly different from Google’s because the way they are “counting” is based on different data, and the methodology is probably different from Google’s.
Your host may be using server log data, whereas Google uses page load events. Your host may not distinguish between automated bots and spiders and real people (Google does). Etc. etc.
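As a rough illustration of why bot filtering matters when counting from server logs, here is a minimal Python sketch. It assumes the host writes access logs in the common "combined" format, uses invented sample lines, and treats any user agent containing "bot", "spider" or "crawler" as automated. That marker list is a simplification made up for this example, not what any particular host or Google actually does.

```python
import re

# Matches the client IP and the final quoted field (the user agent) of a
# combined-format access log line.
LOG_LINE = re.compile(r'^(\S+) .* "([^"]*)"$')

# Simplified, assumed list of substrings that mark automated clients.
BOT_MARKERS = ("bot", "spider", "crawler")

def unique_ips(lines, skip_bots):
    """Count distinct client IPs, optionally ignoring requests from bots."""
    ips = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not look like log entries
        ip, agent = match.groups()
        if skip_bots and any(marker in agent.lower() for marker in BOT_MARKERS):
            continue
        ips.add(ip)
    return len(ips)

# Invented sample log lines: one human making two requests, plus two bots.
sample = [
    '203.0.113.5 - - [05/Jan/2011:10:00:00 -0800] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    '203.0.113.5 - - [05/Jan/2011:10:05:00 -0800] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    '198.51.100.7 - - [05/Jan/2011:10:06:00 -0800] "GET / HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.44 - - [05/Jan/2011:10:07:00 -0800] "GET /feed HTTP/1.1" 200 900 "-" "ExampleSpider/1.0"',
]

print(unique_ips(sample, skip_bots=False))  # 3
print(unique_ips(sample, skip_bots=True))   # 1
```

Note that this counts IP addresses, not people, which is one more reason log-based and script-based numbers drift apart.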
Alder
Alder, the notification kicks in after I approve the initial comment. You’re in there now.