Eurica!

numbers for the people.

Multiple languages in Accept-Language header

One of the use cases I wanted to support in OrganicYak was multilingual users: when we detect that a user speaks both a language that your site is available in and one that it isn’t, we could still show some text in their native language.

When your browser requests a webpage, it sends an Accept-Language header that tells the server which languages you’re able to accept content in, and in what order of preference. Mine on this machine is:


ACCEPT_LANGUAGE] => en-US,en;q=0.8,en-GB;q=0.6,fr-CA;q=0.4,fr;q=0.2

Namely, I’d prefer American English (no q value given, which means q=1), then any English (q=0.8), then British English (q=0.6); failing that, give me Canadian French (q=0.4) or any French at all (q=0.2). The values are set automatically (in Chrome at least) from the language settings on my machine, which has a US keyboard but both English and French language settings. Honestly, how exactly this value gets set is a bit of a mystery to me, since IE, Firefox and Safari seem to accept only English.
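To make the ordering concrete, here is a minimal Python sketch that reads a header like mine into a ranked list. The function name `parse_accept_language` is made up for illustration, and treating a malformed q value as 0 is my own assumption rather than anything a particular server does.

```python
def parse_accept_language(header):
    """Return a list of (language_tag, q) pairs, highest preference first."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # a tag with no q parameter defaults to q=1
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q as "not acceptable"
        entries.append((tag.strip(), q))
    # sorted() is stable, so tags with equal q keep their header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


header = "en-US,en;q=0.8,en-GB;q=0.6,fr-CA;q=0.4,fr;q=0.2"
for tag, q in parse_accept_language(header):
    print(tag, q)
```

For my header this prints en-US at 1.0, then en at 0.8, en-GB at 0.6, fr-CA at 0.4 and fr at 0.2.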

Google Analytics only tracks one language per visitor, but I can’t be the only one out there using the Accept-Language header the way it was intended. So as part of some work I’m doing on what percentage of people block ads, I recorded the full language string for a thousand visitors (this is a pretty small sample, so I wouldn’t call it statistically significant, but I’ll try to update it with more data in the future).

Roughly 3% of the browsers in this sample (31 of 1000) advertised multiple languages.
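I haven't spelled out how I counted, so take this as one possible way to get such a figure from a pile of logged headers, not the exact method. The sketch assumes "multiple languages" means more than one distinct primary subtag (so `en-GB,en-US,en` counts as English only, which is stricter than the list further down). The function names and the sample headers are illustrative.

```python
def primary_languages(header):
    """Return the set of primary subtags (e.g. 'en' for 'en-US') in a header."""
    languages = set()
    for part in header.split(","):
        tag = part.split(";")[0].strip().lower()
        if tag and tag != "*":
            languages.add(tag.split("-")[0])
    return languages


def multilingual_share(headers):
    """Fraction of headers that name more than one distinct primary language."""
    if not headers:
        return 0.0
    multi = sum(1 for h in headers if len(primary_languages(h)) > 1)
    return multi / len(headers)


# Illustrative sample only, not the logged data.
sample = [
    "en-US,en;q=0.8",
    "en-US,en;q=0.8,pl;q=0.6",
    "fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4",
    "en-GB,en-US;q=0.8,en;q=0.6",
]
print(multilingual_share(sample))
```

Two of the four sample headers name a second language, so the share here is 0.5.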

This no doubt underestimates how many visitors can read multiple languages. My site is all in English, which may depress these numbers, but ever since I became famous the traffic has come from all over the world. Really, the Accept-Language header has never gotten the love it deserves: it’s hard to set, and lots of sites ignore it (for instance, Canada Post will ask for your language on a splash screen even if your browser very clearly states that you speak only one of the two official languages).
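As a rough sketch of what the splash-screen case could look like instead: if the header names exactly one of the languages a site offers, use it and skip the question. This is a hypothetical helper, not how Canada Post or any real site works, and it deliberately ignores q values (including q=0, which strictly means "not acceptable").

```python
def unambiguous_language(header, offered=("en", "fr")):
    """Return the only offered language named in the header, or None to ask."""
    named = set()
    for part in header.split(","):
        tag = part.split(";")[0].strip().lower()
        primary = tag.split("-")[0]  # 'fr-ca' -> 'fr'
        if primary in offered:
            named.add(primary)
    # Exactly one match: no need for a splash screen.
    # Zero or several matches: fall back to asking the visitor.
    return named.pop() if len(named) == 1 else None


print(unambiguous_language("en-US,en;q=0.8"))                        # English only
print(unambiguous_language("fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3"))   # both offered
```

A site that wanted to go further could rank the offered languages by q value when several match, rather than asking.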

For reference, here are the multi-language Accept-Language strings I found among the first 1000 visitors:

  • de-de,de;q=0.8,en-us;q=0.5,en;q=0.3
  • de-DE,de;q=0.8,en-US;q=0.6,en;q=0.4
  • en-gb,en-us;q=0.7,de-ch;q=0.3
  • en-GB,en-US;q=0.8,en;q=0.6
  • en-GB,en-US;q=0.8,en;q=0.6
  • en-us, en;q=1.0,fr-ca, fr;q=0.5,pt-br, pt;q=0.5,es;q=0.5
  • en-US,de-DE;q=0.5
  • en-US,en;q=0.8,en-GB;q=0.6,fr-CA;q=0.4,fr;q=0.2
  • en-us,en;q=0.8,es;q=0.5,es-mx;q=0.3
  • en-US,en;q=0.8,es;q=0.6
  • en-US,en;q=0.8,es-419;q=0.6
  • en-US,en;q=0.8,pl;q=0.6
  • en-US,en;q=0.8,pl;q=0.6
  • en-US,en;q=0.9,fr;q=0.8,de;q=0.7,id;q=0.6
  • en-US,en;q=0.9,ja;q=0.8,fr;q=0.7,de;q=0.6,es;q=0.5,it;q=0.4,nl;q=0.3,sv;q=0.2,nb;q=0.1
  • es-es,es;q=0.8,en-us;q=0.5,en;q=0.3
  • fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
  • fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
  • fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
  • fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
  • fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
  • fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4
  • fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4
  • fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4
  • it-IT,it;q=0.8,en-US;q=0.6,en;q=0.4
  • it-IT,it;q=0.9,en;q=0.8
  • ja,en-us;q=0.7,en;q=0.3
  • nl,en-us;q=0.7,en;q=0.3
  • pl,en-us;q=0.7,en;q=0.3
  • ru-RU,ru;q=0.9,en;q=0.8
  • zh-tw,en-us;q=0.7,en;q=0.3

Comment by Gordon P. Hemsley (http://gphemsley.wordpress.com/):

    Gecko (which powers Firefox, etc.) calculates the q values by counting up the number of language tags and dividing it into 1, so that each q value is a given step down from the previous one. (The HTTP standard clamps the q values to 3 decimal places, though; not sure if that ever comes into play.) Judging from your example, it appears that Chrome does something similar.

    The code powering the Accept-Language header (and its associated UI) has been essentially unchanged since the Netscape days, though I’m spearheading an effort to bring BCP 47 support to Gecko. (Most people who care nowadays just hand-edit the ‘intl.accept_languages’ preference in about:config. See if you can dig up my setting. :) )
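The scheme described in the comment above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (a step of 1/n per tag), not Gecko's actual code. Rounding half up to one decimal place is an assumption, chosen because it reproduces the lowercase strings in the list such as `fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3`. The function name is made up.

```python
def stepped_header(tags):
    """Build an Accept-Language value where q drops by 1/len(tags) per tag."""
    n = len(tags)
    parts = []
    for i, tag in enumerate(tags):
        if i == 0:
            parts.append(tag)  # first tag carries no q, i.e. q=1
            continue
        q = (n - i) / n
        # Assumption: round half up to one decimal (0.75 -> 0.8, 0.25 -> 0.3).
        tenths = int(q * 10 + 0.5)
        parts.append(f"{tag};q={tenths / 10:.1f}")
    return ",".join(parts)


print(stepped_header(["fr", "fr-fr", "en-us", "en"]))
print(stepped_header(["ja", "en-us", "en"]))
```

The Chrome-style strings in the list (q=0.8, 0.6, 0.4) do not come out of this formula for four tags, so Chrome's steps appear to be chosen differently.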
