If you saw my presentation about Varnish at the PHPBenelux Conference 2011, you already know that Varnish is a really great reverse proxy caching system that can boost your website performance massively. A somewhat lesser-known feature of Varnish is that its VCL configuration is very powerful. So powerful, in fact, that we can easily add, replace or modify headers going to and coming from Varnish. In my presentation I showed, for instance, that you can send an X-Cache-Hit or X-Cache-Miss header to show whether a request was served from the Varnish cache or retrieved from the back-end.

After my presentation, Andreas Creten approached me, asking about mobile device detection in Varnish. He had trouble on a website keeping up with all the types of mobile devices that need to be detected correctly. That question didn’t result in a direct answer, but it did trigger something…

What if Varnish could figure out whether the visitor is using a mobile browser and, if so, set a flag for the back-end servers so they can display a mobile site? Impossible? No. Actually, it’s quite easy… Word of caution: we are going to leave the PHP domain and enter C. Varnish lets you embed C code inside the Varnish configuration file (yes, you can literally program inside the configuration file).

Meet WURFL: Wireless Universal Resource FiLe

It would be really nice if a browser just told us whether it is mobile or not, since we could then check that flag inside our PHP code and be done with it. Unfortunately, this is not the case. But what we CAN check is the user-agent that a client sends. If we can find out whether the user-agent came from a browser on a mobile device, we can set that “mobile” flag ourselves.

WURFL is a big data file with all the user-agents on mobile systems. We can safely assume that if the user-agent is defined in WURFL, it’s a mobile browser. WURFL has got loads of extra data like how big the mobile screen is, if it has wifi etc. Check out the possibilities on the WURFL site.

The plan

  1. Load WURFL XML into memory
  2. Get the user-agent from the incoming Varnish request
  3. Check if user-agent is inside the WURFL file
  4. Set X-Mobile flag in the Varnish response (either to yes or no)
  5. Profit!

Now, the obvious approach would be to implement this completely inside the VCL configuration (/etc/varnish/default.vcl for instance), which is possible since you can use plain C inside it. However, that cramps our style a bit, since we don’t have full control over everything. So we use another way: we create our own detection library (a dynamic shared object) and load that from the VCL.

Creating the .so:

Even if you don’t know C that well, the main idea is simple: load the XML (through an external library), build an “xpath” query and see if we get any result. The function only returns 1 or 0 (or a negative number on error), so we can use that as a boolean value.

and you need a header file (more or less, an interface)

Compilation should be done by:

gcc -fPIC -c -o wurfl.o wurfl.c -I/usr/include/libxml2
gcc -shared -Wl,-soname,libwurfl.so.1 -o libwurfl.so.1.0.1 wurfl.o -lxml2

This creates the shared object, which uses libxml2 (so make sure you’ve got the libxml2 libraries AND development files on your system).

Implementing in Varnish

Now that we have installed our shared object, implementation is a breeze.

Not really that difficult, but let’s explain. First of all, we need to include our header file so the C compiler knows about the wurfl_* functions, and we define a global “is_mobile” variable. Then we create a subroutine that fetches the user-agent header from the request through the VRT_GetHdr call. Notice the \013 in front of User-Agent:, which is the length of the header name we want to fetch, in OCTAL: \013 means 11 in decimal. If you make a mistake here, Varnish crashes on the request with a “Condition(l == strlen(hdr + 1)) not true.” error.
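The length prefix is easy to check in plain C. The sketch below is not Varnish code: hdr_prefix_ok is a hypothetical helper that simply mirrors the condition from the error message above, comparing the first byte of the string with the length of the header name that follows it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Varnish: returns 1 when the first byte
 * of hdr equals the length of the header name that follows it, 0 otherwise. */
static int hdr_prefix_ok(const char *hdr)
{
    assert(hdr != NULL);
    /* Same comparison as in the quoted error: l == strlen(hdr + 1). */
    return (size_t)(unsigned char)hdr[0] == strlen(hdr + 1);
}
```

Called with "\013User-Agent:" (11 characters, 013 in octal) or "\011X-Mobile:" (9 characters, 011 in octal) the helper returns 1. With a mismatched prefix such as "\013X-Mobile:" it returns 0, which is the situation that trips the check in Varnish.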

Once we have fetched the user-agent, we can set the is_mobile variable (we use that later) and set the header on the request that goes to the backend. In our case, we set the X-Mobile: header (mind the string length again) and put “yes” or “no” behind it. Note that we need to add vrt_magic_string_end to close the string.
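Why a terminator is needed at all can be illustrated outside Varnish. This is a minimal sketch, assuming a hypothetical join_parts function and a hypothetical PARTS_END sentinel (neither is part of the Varnish API). A variadic C function cannot know how many arguments it received, so the caller marks the end of the list with a special pointer, which is the role vrt_magic_string_end plays in the call described above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sentinel: only its address matters, never its value. */
static const char parts_end_marker = 0;
#define PARTS_END (&parts_end_marker)

/* Concatenate string parts into out (capacity cap) until PARTS_END is seen.
 * Returns 0 on success, -1 when the result does not fit. */
static int join_parts(char *out, size_t cap, const char *first, ...)
{
    va_list ap;
    const char *p = first;
    size_t used = 0;

    assert(out != NULL && cap > 0);
    out[0] = '\0';
    va_start(ap, first);
    while (p != PARTS_END) {
        size_t n = strlen(p);
        if (used + n >= cap) {          /* no room for part plus NUL */
            va_end(ap);
            return -1;
        }
        memcpy(out + used, p, n + 1);   /* copy including the NUL */
        used += n;
        p = va_arg(ap, const char *);
    }
    va_end(ap);
    return 0;
}
```

With a 32-byte buffer, join_parts(buf, sizeof buf, "X-Mobile: ", "yes", PARTS_END) leaves "X-Mobile: yes" in buf. Leaving out the sentinel would make the loop read past the real arguments, which is why the terminator is mandatory.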

Varnish has 3 different hooks that are processed before entering a backend. We need to hook our functionality to all of them. Finally, we want to display the resulting x-mobile flag to the user as well, so the vcl_deliver will add the header to the output that is returned to the visitor.

Testing the setup

So let’s try out the complete setup. I just use a simple PHP file that does nothing but CHECK the mobile flag:

We run this on the standard port 80. Next up: we start Varnish. We need some extra startup parameters to make sure Varnish actually links against our WURFL library:

  varnishd -s malloc,32M -a 0.0.0.0:81 -f /etc/varnish/default.vcl \
  -p 'cc_command=exec cc -fpic -shared -Wl,-x -lwurfl -o %o %s'

I also use a small 32M malloc’ed buffer, and use port 81 to accept incoming connections. Now, when we direct the browser to port 81, we get something like this:

As you can see, the output has an X-Mobile response header, and the PHP code can actually use HTTP_X_MOBILE. Since we use Firefox’s default user-agent (Mozilla 5.0 etc…), X-Mobile is set to “no”.

Now when we change the user-agent (with a simple Firefox extension) to an iPhone 3.0 user-agent, this is our result:

All the mobile flags are set to “yes” since it detected the iPhone user-agent. It works… sweet!

Why not do this in PHP?

Two reasons. First of all, it’s much quicker to do it outside PHP, since you don’t have all the PHP/Zend overhead. Second, either Varnish or the web server behind Varnish might redirect the request to a dedicated web server that only serves the mobile site, and maybe that website isn’t programmed in PHP but in something else (god forbid :p). In that case you would need to program the same logic twice (or maybe even more often), and you can redirect more quickly when you let Varnish handle the detection.

Todo’s

This blog post is just a proof of concept. We haven’t run any benchmarks yet on how fast or slow it is, and there is lots of room for improvement. For instance: instead of loading a big XML file, why not flatten the XML file to only (md5) hashes of the user-agents and compare those with the md5 hash of the incoming user-agent? This would remove the bulky XML overhead, which saves time and memory, but I will leave this exercise up to you. :)
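One way the flattening idea could look, as a rough sketch: hash every known mobile user-agent once, keep the hashes in a sorted array, and look up the hash of the incoming user-agent with a binary search. To stay free of dependencies the sketch swaps md5 for the much simpler 32-bit FNV-1a hash, and the names ua_hash, ua_table_build and ua_is_mobile are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* 32-bit FNV-1a hash of a NUL-terminated string (stand-in for md5). */
static uint32_t ua_hash(const char *s)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}

/* Three-way comparison of two uint32_t values for qsort() and bsearch(). */
static int cmp_u32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Fill table with the hashes of n user-agents and sort it for bsearch(). */
static void ua_table_build(uint32_t *table, const char *const *agents, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        table[i] = ua_hash(agents[i]);
    qsort(table, n, sizeof table[0], cmp_u32);
}

/* Returns 1 when the hash of ua is present in the sorted table, else 0. */
static int ua_is_mobile(const uint32_t *table, size_t n, const char *ua)
{
    uint32_t key = ua_hash(ua);
    return bsearch(&key, table, n, sizeof table[0], cmp_u32) != NULL;
}
```

The table would be built once at startup from the flattened list, after which every request costs one hash and a binary search. A 32-bit hash can collide, so a non-mobile user-agent could occasionally be reported as mobile. A longer hash such as md5 makes that far less likely, at the cost of a bigger table.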

Big credits also to my colleague Joshua for helping me with this proof-of-concept!

About Jeroen van Dijk

Jeroen van Dijk is a technical consultant and evangelist at Enrise and co-founder of 4worx, the predecessor of Enrise. For complex problems he devises technical, creative solutions that excel in beauty and simplicity.

Zend Certified Engineer (PHP5), Zend Framework Certified Engineer

32 thoughts on “Mobile device detection with WURFL and Varnish”

  1. Pingback: Tweets that mention Mobile device detection with WURFL and Varnish - Enrise -- Topsy.com

    • The possibilities are pretty much endless. If that kind of data is available, you can pass it back to varnish. The whole blogpost is more or less a proof-of-concept on how to setup such a system, but you can expand it with whatever you like.

  2. Your solution is really interesting, and something I would like to try out with Tera-Wurfl if possible. I am having a slight issue, though, with some beginner’s C: if I set is_mobile to 1 in the VCL it is fine, but the second I run the function is_mobile = wurfl_ismobile();

    I get an error along the lines of:
    undefined symbol: wurfl_ismobile

    Is this something you have come across and may be able to advise on? My files are all very simple at present, so it is possible I have over-simplified somewhere, but I still have the *.c and *.h files with wurfl_ismobile in both.

    • It looks like the wurfl shared object is not loaded by varnish. Did you use the “varnishd” commandline with the “cc_command” to test it? Otherwise varnish does not know anything about wurfl_* functionality.

  3. Pingback: undefined symbol: wurfl_ismobile

  4. I’d like to offer an alternative to the way you’re hooking your code into varnish.

    Instead of making an object file that you link in by editing the cc_command parameter, which is a pretty dirty hack, you should probably:

    a) Write your inline code in a separate VCL, and include that
    or b) Compile your code to a shared library and load that from your VCL

    See for instance plugin.vcl in http://www.varnish-cache.org/trac/raw-attachment/wiki/GeoipUsingInlineC/GeoIP-plugin-2.tar.gz

    Hacking up your cc_command parameter just leads to situations where admins forget they have to change it on new installs. Or, if varnishd’s default value changes because of some feature, this parameter has to be adjusted likewise. And it just goes against the KISS principle. :D

    • Basically, the main modification would be that instead of pre-linking against the shared object, you dynamically load it at runtime, which is a viable alternative. True enough, that would save you from modifying the cc_command parameter. I’m curious (I should test it) how Varnish would react when the dlopen/dlsym fails. Would Varnish still run, or halt?

      • The way the GeoIP plugin VCL is written, it doesn’t halt or crash varnish, it just doesn’t add the country code to the hash anymore.

        However, it would probably be misleading in the “if it starts everything is OK” paradigm, so an exit() is probably in order.

        In related news, Varnish 3.0 will add vmods to the mix. It will allow the writing of various plugins for Varnish that do not require ugly and semi-complex inline-C to enable. They’re available for testing in trunk right now.

  5. Any idea how to resolve this?

    cakeboy@devbox:/home/john/varnish# gcc -shared -Wl,-soname,libwurfl.so.1 -o libwurfl.so.1.0.1 wurfl.o -lxml2
    /usr/bin/ld: wurfl.o: relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
    wurfl.o: could not read symbols: Bad value
    collect2: ld returned 1 exit status
    cakeboy@devbox:/home/john/varnish# gcc -shared -Wl,-soname,libwurfl.so.1 -o libwurfl.so.1.0.1 wurfl.o -lxml2 -fPIC
    /usr/bin/ld: wurfl.o: relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
    wurfl.o: could not read symbols: Bad value
    collect2: ld returned 1 exit status

  6. Another implementation problem, this time:

    test@dev:/home/test/varnish# /usr/sbin/varnishd -s malloc,32M -a 0.0.0.0:81 -f /home/test/varnish/default.vcl \
    > -p 'cc_command=exec cc -fpic -shared -Wl,-x -lwurfl -o %o %s'
    storage_malloc: max size 32 MB.
    Message from C-compiler:
    In file included from ./vcl.OaGbHZXp.c:820:
    /var/test/varnish/wurfl.h:6:22: warning: no newline at end of file
    /usr/bin/ld: cannot find -lwurfl
    collect2: ld returned 1 exit status
    Running C-compiler failed, exit 1
    VCL compilation failed

  7. Pingback: Mobile phone detection with Varnish – Quick getting started guide | John McLear's School Technology

    • That’s a shame…. maybe will have a look later to see what caching options there are to speed up the process

    • The code is a proof of concept. If you look at the code, you’ll notice that the is_mobile() function must run an xpath query through the whole XML file. This takes time (a LOT!). There are a few ways to speed up the whole process:

      1. Remove all unnecessary tags from the XML. Since we only need the “mobile” information, remove everything that is not related to mobile. This would mean that we can have an XML file with only the user-agent and “mobile” flag. It will speed up the xpath comparisons and reduce processor usage.

      2. Do not use XML. Create a flatfile with again the useragent and mobile-flag. Reading the flatfile consumes less memory and can be parsed quicker.

      To speed it up even more, create a btree-like structure from the CRCs of the user-agents. This requires some more preprocessing, but it is the quickest way I can think of to find out whether a user-agent is mobile or not.

  8. Pingback: A day in the life of… » Varnish in non-compiler environments

  9. Nice tute ;)
    Is it possible to use the is_mobile variable in vcl_hash so that a separate hash can be created for a mobile rendered version on the same domain?

    • You could create logic like this:


      sub detect_mobile {
      ...
      VRT_SetHdr(sp, HDR_REQ, "\011X-Mobile:", (is_mobile == 1)?"yes":"no", vrt_magic_string_end);
      ...
      }
      sub vcl_hash {
      set req.hash += req.http.X-Mobile;
      }

      The hash subroutine can only read the client request, not the backend request. So we add the X-Mobile header to the client request and add it to the hash.

      I haven’t tested it yet!

  10. Hi guys. Luca Passani of WURFL here.

    interesting but way too error-prone. This will fail on all androids and iPhones simply because the language substring is different from what is recorded in WURFL.

    Also, they use some libraries to do XPath. I am sure it is fast, but I doubt that grabbing through those thousands of UA strings is fast enough.

    Question: how much interest is there out there for a Varnish WURFL plugin of some kind? if someone from the varnish team wants to contact me offline, I would be happy to talk.

    Takk

    Luca

    • @Luca: There’s definitely interest :) I’m developing a mobile site where this functionality is sorely needed. Any news on this front?

  11. Hi,

    Nice code, but my server doesn’t like to parse a > 16MB xml file.
    Is it not better to convert the wurfl.xml to only user-agent strings?

    sed -n -e 's/^.*id="\(.*\)".*user_agent="\(.*\)"\sfall.*$/\2/p' wurfl.xml |sort|uniq > moblist.clean (around 829K)

    And do a check on this file?

  12. Hi Luca Passani, creator of WURFL and CTO at ScientiaMobile here.

    I had commented a few months back that the approach presented in this article is not very scalable.
    The team at ScientiaMobile and I are happy to announce that, today, we have released a beta version of our Apache, NGINX and Varnish-Cache modules. The modules utilize our own standard WURFL API under the hood, so there is no need to go out of one’s way to analyze the wurfl.xml in funny ways :)

    The following blog post has more details:

    http://www.scientiamobile.com/blog/post/view/id/25/title/HTTP-and-Mobile%3A-The-Missing-Header-

    Because of the investment required to develop the modules, they are being offered under commercial terms exclusively.

    Thank you

    Luca Passani
    CTO @ScientiaMobile

  13. Hello,

    There are at least three other ways to achieve something pretty similar in Varnish:

    * The VCL way
    It uses a simple, yet powerful, VCL-based device_detection subroutine which is in use by several sites (and accepts patches from users):
    https://github.com/varnish/varnish-devicedetect/

    * The commercial version from the makers of Varnish
    If you need speed, a commercial-grade product, warranties and support, then the DeviceAtlas VMOD is something for you. It is made and supported by Varnish Software and uses dotMobi’s award-winning C++ API:
    https://www.varnish-cache.org/vmod/deviceatlas-mobile-detection

    * The FLOSS version using OpenDDR + dClass
    There is a project that provides you speed through a VMOD which uses the dClass API (made in C, supporting also Java through JNI) and the OpenDDR database (they have also an nginx version):
    https://github.com/TheWeatherChannel/dClass

    We hope you enjoy detection and serving mobile content at wire speed with our software.

    • Hello Ruben,

      You are absolutely right. This blogpost is already over 2 years old and time hasn’t stood still since then.
      Your examples are indeed more applicable nowadays. The same goes for the ScientiaMobile version.

      Anyway, thanks from a happy Varnish user for providing your comments.
