"Curiosity is the very basis of education and if you tell me that curiosity killed the cat, I say only the cat died nobly." - Arnold Edinborough

Edit: Part 2 is now available.

This is the first entry in a short series I’ll do on caching in PHP. During this series I’ll explore some of the options that exist for caching PHP output and provide a unique (I think) solution that I feel achieves high performance without sacrificing real-time data.

Caching in PHP is usually done on a per-object basis: people cache a query result or some CPU-intensive calculation to avoid redoing the work. This can get you a long way. I have an old site which uses this method and gets 105 requests per second on really old hardware.

An alternative, used for example in the Super Cache WordPress plug-in, is to cache the full-page output. This essentially means that you generate a page only once. It introduces the problem of stale data, which people usually solve by checking whether the data is still valid or by using a TTL caching mechanism and accepting some staleness.

The method I propose is a spin on full-page caching. I’m a big fan of nginx and I tend to use it to solve a lot of my problems; this case is no exception. Nginx has a built-in Memcached module, which lets us store a page in Memcached and have nginx serve it – thus never touching PHP at all. This essentially turns this:

Concurrency Level:      50
Time taken for tests:   2.443 seconds
Complete requests:      5000
Failed requests:        0
Write errors:           0
Total transferred:      11020000 bytes
HTML transferred:       10210000 bytes
Requests per second:    2046.32 [#/sec] (mean)
Time per request:       24.434 [ms] (mean)
Time per request:       0.489 [ms] (mean, across all concurrent requests)
Transfer rate:          4404.39 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       2
Processing:     6   22  19.7     20     225
Waiting:        5   20   2.6     20      40
Total:          6   22  19.7     20     225

Percentage of the requests served within a certain time (ms)
  50%     20
  66%     21
  75%     22
  80%     22
  90%     24
  95%     26
  98%     29
  99%     39
 100%    225 (longest request)

Into this:

Concurrency Level:      50
Time taken for tests:   0.414 seconds
Complete requests:      5000
Failed requests:        0
Write errors:           0
Total transferred:      11024350 bytes
HTML transferred:       10227760 bytes
Requests per second:    12065.00 [#/sec] (mean)
Time per request:       4.144 [ms] (mean)
Time per request:       0.083 [ms] (mean, across all concurrent requests)
Transfer rate:          25978.27 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   0.1      1       2
Processing:     1    3   0.3      3       5
Waiting:        1    1   0.3      1       4
Total:          2    4   0.3      4       7

Percentage of the requests served within a certain time (ms)
  50%      4
  66%      4
  75%      4
  80%      4
  90%      4
  95%      4
  98%      5
  99%      5
 100%      7 (longest request)

What’s important to note here is how these figures scale. To get these numbers I developed a very simple proof-of-concept news script; all it does is fetch and show data from two MySQL tables: news and comments. A more complicated application might manage only 100 requests per second, or in the case of something like WordPress or Magento as low as 20 requests per second! The good thing is that with full-page caching the time required to fetch and display the data depends only on the size of the cached data. Therefore if your application is written to do full-page caching it will always enjoy low latency and high concurrency.
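The nginx side of this setup can be sketched as follows. The fullpage: key scheme, addresses and the fallback location are my assumptions, not a configuration from the article; the PHP application has to store each rendered page under the same key it will be looked up with.

```nginx
# Sketch only: key scheme, addresses and paths are assumptions.
server {
    listen       80;
    server_name  example.com;

    location / {
        # Look the whole page up in Memcached by URI.
        set $memcached_key "fullpage:$request_uri";
        memcached_pass     127.0.0.1:11211;
        default_type       text/html;
        # Cache miss (404) or Memcached trouble falls through to PHP.
        error_page 404 502 504 = @php;
    }

    location @php {
        include        fastcgi_params;
        fastcgi_param  SCRIPT_FILENAME /var/www/index.php;
        fastcgi_pass   127.0.0.1:9000;
    }
}
```

On a hit, nginx answers straight from Memcached without ever touching PHP; a miss falls through to the PHP backend, which renders the page, stores it under the same key, and serves it once.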

The Complications

Full-page caching does introduce some complications, though. As mentioned earlier, the goal is to make nginx serve the cached pages, so we cannot perform any logic while serving a page. This means we need to handle invalidation of cached pages when the data they use is updated.

To be able to invalidate pages it’s important that we understand what data we have to work with and how it relates not only to our pages but also to our code. We will be using a framework, so we can create a few rules that will help us understand the whole system.

  • The framework uses a three-tiered setup of controllers, libraries and templates.
  • Controllers will dictate how to handle a request defined by the URI.
  • Libraries will be used to access all data.

This is how most frameworks work; a few of the big ones use an MVC pattern, but such a setup is largely the same. From these rules we can determine what the relationship between data, controllers and pages will be.

  • All data will need an identifier. For instance if you have a news script you’ll need an identifier for “news” and “comments”.
  • All controllers must specify which data they use by referencing the identifier.
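As a sketch of the second rule, controllers could declare their data identifiers through an interface. The names here (CacheableController, dataIdentifiers) are made up for illustration, not from any real framework.

```php
<?php
// Sketch only: interface and class names are hypothetical.

// Rule: every controller declares the data identifiers it uses.
interface CacheableController
{
    /** @return string[] Data identifiers, e.g. ['news', 'comments'] */
    public function dataIdentifiers(): array;
}

class NewsController implements CacheableController
{
    public function dataIdentifiers(): array
    {
        return ['news', 'comments'];
    }
}

class RssController implements CacheableController
{
    public function dataIdentifiers(): array
    {
        return ['news']; // the feed uses news but not comments
    }
}
```

Two controllers can freely share an identifier, which is exactly what makes the one-to-many relationships below possible.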

So to recap: the goal is to invalidate the correct pages, and to do this we need to know which pages use what data. This gives us three important parts.

  • The library that handles the editing of data, and therefore the invalidation triggering.
  • The controller handles the requests based on the URI and therefore relates to the cached pages.
  • The actual cached pages.

Finally, we’re unlikely to have only one of each; for instance, multiple controllers will often use the same data. To continue our news script example, we have a controller to fetch the news and a controller to generate an RSS feed of the news. Similarly, a controller might generate multiple pages, for instance one page per news post to display the comments. Therefore we also need to consider the inter-data relationships.

  • One-to-many relationship between invalidated data and controllers.
  • One-to-many relationship between controllers and pages.

Data & Controllers

Earlier we defined a rule that all controllers must specify which data they use. This is useful as it means we can create a dependency list between data and controllers. When data is invalidated we can do a lookup in the dependency list and see which controllers we need to tell about the invalidated data.
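The dependency list itself can be sketched as a simple inverted index. The identifiers, controller names and the controllersFor helper below are all hypothetical.

```php
<?php
// Sketch: invert "controller => data identifiers" declarations into a
// "data identifier => controllers" dependency list. Names are made up.

$declarations = [
    'NewsController' => ['news', 'comments'],
    'RssController'  => ['news'],
];

$dependencyList = [];
foreach ($declarations as $controller => $identifiers) {
    foreach ($identifiers as $identifier) {
        $dependencyList[$identifier][] = $controller;
    }
}

// When a library updates data, it triggers an invalidation and we look
// up which controllers need to be told about it.
function controllersFor(array $dependencyList, string $identifier): array
{
    return $dependencyList[$identifier] ?? [];
}
```

In this sketch the list is rebuilt from scratch; in practice it would be generated once at deploy time, which is why the deployment caveat below matters.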

This solves the problem elegantly, and with OOP we can define interfaces to force controllers to implement the required methods. If a controller doesn’t implement them, we can set a flag that prevents its output from being cached, and it will continue to work normally.

One possible downside to this is that you can no longer edit files on the fly. If you change the way data is used you will most likely need to regenerate the dependency list, so it becomes critical that you have a deployment process in place for all code changes. Personally I think this is required anyway, so it does not cause me any problems; however, it is something that has to be considered.

Controllers & Pages

Websites are by their nature diverse. In this framework all requests are passed to a controller along with the URI, and the controller then uses the URI to determine what data to use to generate the output. The problem here is that there is a huge range of options for how a controller might look and behave. It would be really difficult to define something like a dependency list here, as a controller might use multiple data sources which update dynamically. This would require the dependency list to be updated every time new data was added, which is not really a feasible solution.

The easy scenario is where the page URI is directly related to the data. For example in our news script the URI /news/4/ might show the news post with ID 4. If a comment is added to this news post we trigger an invalidation on the comments data identifier. The library that inserts the data will know to insert to news post 4, therefore it can also pass this along when triggering the invalidation. This allows the controller to determine that the page /news/4/ needs to be invalidated.
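For the easy scenario a sketch might look like this. The fullpage: key prefix and the list of affected URIs are my assumptions; they have to match whatever key scheme nginx uses for lookups.

```php
<?php
// Sketch: map an invalidated news post ID to the full-page cache keys
// that must be deleted. The "fullpage:" prefix and the URI list are
// assumptions, not from the article's framework.

function keysToInvalidate(int $postId): array
{
    $uris = [
        "/news/$postId/", // the post's own page (and its comments)
        '/news/',         // the news front page listing this post
    ];
    return array_map(fn ($uri) => 'fullpage:' . $uri, $uris);
}

// Each returned key would then be passed to Memcached::delete().
```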

The bigger problem is when data is used as part of a set defined by something not related to the updated data itself. A simple example here would be a search function. You have the search controller and the keyword “PHP” being searched for; the URI for this would most likely be /search/PHP/. When a news post is updated we pass its ID along to the controller, but we have no way to determine which URIs actually use said news post, and keeping track of each search term is not feasible. There are a few options here, but none that are really perfect.

  • Don’t cache at all; data will always be current, but requests might be CPU intensive.
  • Increase caching granularity: pass each request to PHP, but cache the IDs of the matching news posts and fetch the current data for them.
  • Cache the full page using a time-to-live value. This means we serve stale data for a while, but we keep high performance.
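The second option, increased caching granularity, can be sketched like this. The searchNewsIds function and its parameters are illustrative; a real implementation would keep the ID cache in Memcached rather than an in-memory array.

```php
<?php
// Sketch of increased caching granularity: cache only the expensive part
// (the matching IDs for a search term) and fetch current row data on
// every request. All names are illustrative.

function searchNewsIds(string $keyword, callable $runSearch, array &$idCache): array
{
    if (!isset($idCache[$keyword])) {
        // Expensive part: the search query itself. Cache its result.
        $idCache[$keyword] = $runSearch($keyword);
    }
    // The rows for these IDs are then fetched fresh, so the displayed
    // data is current even though the result set may be slightly stale.
    return $idCache[$keyword];
}
```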

Ultimately it depends on your situation and what fits best. I imagine I’d most often choose TTL caching, or increased caching granularity in cases where I need current data.

This covers the overall system; next time I’ll talk about how I’ve chosen to implement it.

  • Herbert

    Posted: September 20, 2010


    Please tell me, were those tests run from the same server that nginx was running on, or were they run from a different PC on a different network? Because I'll be happy just getting the stats you got before you switched to using the nginx cache.

    We also use memcached for our pages, and these are the stats I'm getting for a page that's fully cached. We use Apache 2.1 and PHP 5.1.

    Stats when I run ab on a different datacenter
    Concurrency Level: 50
    Time taken for tests: 424.595 seconds
    Complete requests: 50000
    Failed requests: 0
    Write errors: 0
    Total transferred: 1882973672 bytes
    HTML transferred: 1868221017 bytes
    Requests per second: 117.76 [#/sec] (mean)
    Time per request: 424.595 [ms] (mean)
    Time per request: 8.492 [ms] (mean, across all concurrent requests)
    Transfer rate: 4330.81 [Kbytes/sec] received

    Connection Times (ms)
    min mean[+/-sd] median max
    Connect: 78 82 3.4 80 104
    Processing: 318 342 47.3 340 1626
    Waiting: 81 91 47.9 87 1384
    Total: 396 424 47.1 422 1705

    Percentage of the requests served within a certain time (ms)
    50% 422
    66% 423
    75% 424
    80% 425
    90% 427
    95% 428
    98% 432
    99% 435
    100% 1705 (longest request)


    Stats when run on the web server


    Concurrency Level: 50
    Time taken for tests: 12.615 seconds
    Complete requests: 50000
    Failed requests: 0
    Write errors: 0
    Total transferred: 1882737406 bytes
    HTML transferred: 1867987111 bytes
    Requests per second: 3963.61 [#/sec] (mean)
    Time per request: 12.615 [ms] (mean)
    Time per request: 0.252 [ms] (mean, across all concurrent requests)
    Transfer rate: 145750.81 [Kbytes/sec] received

    Connection Times (ms)
    min mean[+/-sd] median max
    Connect: 0 1 0.8 1 6
    Processing: 3 12 2.9 11 27
    Waiting: 2 9 4.1 7 24
    Total: 6 13 2.4 12 28

    Percentage of the requests served within a certain time (ms)
    50% 12
    66% 13
    75% 14
    80% 15
    90% 16
    95% 17
    98% 18
    99% 19
    100% 28 (longest request)


    • fjordvald

      Posted: September 20, 2010


      I did run these tests on the same machine as the code was running on as I did not want the network to factor into the equation. When I ran it from another server in the same LAN segment it was slightly lower due to the network overhead, but it was very close.

      When you run AB from a completely external location you're essentially benchmarking how much data you can move between the two locations, not how much data the server can generate. In a real world scenario where you have this much traffic you won't have just one connection going but many, many thousands. This means that your bottleneck is not likely to be the bandwidth between location A and B.


      • Justin

        Posted: February 15, 2011


        Was this Wordpress? Because getting 2046.32 requests a second without caching or optimization on Wordpress from apache bench is a damn miracle. If so, would love to see your nginx config.


        • fjordvald

          Posted: February 15, 2011


          No. It was a pretty generic news posting script. I had some networked database interaction but nothing overly complicated. Wordpress gives me around 200 requests/sec using the standard W3 Total Cache plugin with memcached caching. I could make it use static files, but even with all the traffic ycombinator gave me my blog seems to have held up just fine.


      • Ian Chilton

        Posted: February 15, 2011


        Hi,

        What are the specs of the server?

        Thanks,

        Ian


        • fjordvald

          Posted: February 15, 2011


          It’s a dedicated box with an i7 860 @ 2.8GHz CPU and 6 GB of RAM.


  • vincent

    Posted: February 15, 2011


    Why don't you use SSI (Server Side Includes)? It can be a good complement.


    • fjordvald

      Posted: February 15, 2011


      SSI is a bit simple in its support. Nginx doesn't really have the required logic to use SSI as a viable caching strategy. I intend to explore edge side includes in an eventual 3rd part of this series. Will probably use Varnish unless Nginx has gotten some better ESI support by then.


      • vincent

        Posted: February 15, 2011


        http://kovyrin.net/2007/08/05/using-nginx-ssi-and-memcache-to-make-your-web-applications-faster/


        • fjordvald

          Posted: February 15, 2011


          The problem with that approach is that you don't gain anything with it. You tell Nginx to include a file, that's totally fine, but you still need something to process it, you can proxy pass or fastcgi or whatever, but *something* needs to generate it. If you do decide to pass it to PHP then you're already slowed down loads as simply having PHP echo something like "hi" is rather slow.


          • vincent

            Posted: February 15, 2011



            Nginx doesn't only include a file; it makes the subrequest like any other normal request. You can cache those in memcached too.

            One page = multiple cached parts
            Delete one key, doesn't affect others.


          • fjordvald

            Posted: February 15, 2011


            Certainly. But you'll end up doing multiple cache gets in one request. That's not necessarily bad if it allows you to cache a page you normally wouldn't be able to cache, but in the examples used in the article you linked he uses it to include a login page. I see absolutely no point in that as there's no dynamic content there. SSI (or ESI) is definitely a concept I want to explore further, but one I'm going to be careful about.


  • T

    Posted: February 15, 2011


    welcome to 2007. glad you made it ;)

    (still, nice article!)


    • fjordvald

      Posted: February 15, 2011


      Hah, yeah I realize the concept is known, but I don't actually know of any popular PHP framework which centres around full page caching with smart invalidation. Usually they provide methods to cache based on TTL, but that leads to stale data and is less than optimal.


  • T

    Posted: February 15, 2011


    btw, you turned off keep-alive, right?


    • fjordvald

      Posted: February 15, 2011


      No. Keep alive is turned on in Nginx, Nginx handles TIME_WAIT connections really well so I see no reason to not have them on, I would in any real world case. Furthermore, I also had keep-alive on connections between Nginx and Memcached. I'll detail this in the next part.

      As far as I remember I did not use the -k switch, though.


  • Herberth Amaral

    Posted: February 15, 2011


    I wish I had some more information, like the whole nginx config. I have a Webbynode VPS with 4 cores available and I couldn't make it faster than 6k req/second with static files. It has very low memory, but this process is basically CPU bound as I observed.


    • fjordvald

      Posted: February 15, 2011


      I will expand on the entire setup in part 2, both on how the actual PHP implementation and the Nginx implementation are handled.


  • Peter Bengtsson

    Posted: February 15, 2011


    Did you use -k on these tests? With 10,000 requests across 50 concurrent users I can't get much over 10,000 requests/second unless I use -k. Then it becomes around 13,000-14,000 requests/second.


    • fjordvald

      Posted: February 15, 2011


      As far as I remember I did not use the -k flag in ab. I did however use keep-alive between Nginx and Memcached, which did increase requests per second some.


  • James Cleveland

    Posted: February 15, 2011


    What command did you run AB with?


    • fjordvald

      Posted: February 15, 2011


      You can sort of tell by AB output, but something like:

      ab -c 50 -n 5000 http://url.com/


  • James

    Posted: February 15, 2011


    Also, with full page caching, how would you propose handling things that are genuinely dynamic, like login boxes, forum posts, etc?


    • fjordvald

      Posted: February 15, 2011


      You obviously cannot cache a POST request as you need to take some sort of action. You can still cache things that are fully dynamic; the difficulty is figuring out when to invalidate the cache when the data is updated. The more complex your application is, the more complex the invalidation logic becomes. The framework I use has some methods for keeping track of it, but it obviously becomes complex over time. I'll provide more details in part 2.


  • James

    Posted: February 15, 2011


    Finally, what about specs of the server etc?

    I'm running on a small Linode VPS and I get ~2000 req/sec for a script that simply echoes "hi".


    • fjordvald

      Posted: February 15, 2011


      It's a dedicated box with an i7 860 @ 2.8GHz CPU and 6 GB of RAM.


  • Moe

    Posted: February 15, 2011


    I wonder what the number would look like if you ran

    :%s/nginx/lighttpd


    • fjordvald

      Posted: February 15, 2011


      I honestly cannot say. I used Lighttpd before Nginx but back then the memory leaks were so bad it was useless. Today I just don't see anything that would entice me to switch back.


  • Darren Odden

    Posted: February 16, 2011


    Do you have information on how you configured your nginx/PHP/memcached? Did you do anything special in the configurations? Any considerations on bypassing Apache and serving PHP directly?


    • fjordvald

      Posted: February 16, 2011


      I'm working on a follow-up blog post right now and will provide all the details there, plus a working framework that has an implementation of the smart invalidation.
      As for bypassing Apache: I've actually been doing that for 2 years now. PHP-FPM is extremely stable and extremely awesome, so there's absolutely no need for Apache for me; it'd just be another layer of complexity at no benefit.


  • James

    Posted: February 16, 2011


    Thanks for your responses. Have you tried with HTTPerf? I seem to get much higher results with it, not sure why. For my scripts that are using memcached at nginx level, on my Linode 512 (so a long way off of a dedicated i7!), I get something like:



    james@li140-209:~$ httperf --hog --num-conns 10000 --num-calls 10000 --burst-length 20 --port 80 --rate 10000 --server 0xf.nl --uri=/
    httperf --hog --client=0/1 --server=0xf.nl --port=80 --uri=/ --rate=10000 --send-buffer=4096 --recv-buffer=16384 --num-conns=10000 --num-calls=10000 --burst-length=20
    Maximum connect burst length: 2824

    Total: connections 2546 requests 47244 replies 1981 test-duration 1.316 s

    Connection rate: 1934.7 conn/s (0.5 ms/conn, <=1022 concurrent connections)
    Connection time [ms]: min 0.9 avg 546.8 max 792.8 median 575.5 stddev 126.6
    Connection time [ms]: connect 156.6
    Connection length [replies/conn]: 1.000

    Request rate: 35901.2 req/s (0.0 ms/req)
    Request size [B]: 59.0

    Reply rate [replies/s]: min 0.0 avg 0.0 max 0.0 stddev 0.0 (0 samples)
    Reply time [ms]: response 259.7 transfer 0.0
    Reply size [B]: header 143.0 content 2066.0 footer 0.0 (total 2209.0)
    Reply status: 1xx=0 2xx=1981 3xx=0 4xx=0 5xx=0

    CPU time [s]: user 0.12 system 1.12 (user 9.1% system 84.9% total 94.1%)
    Net I/O: 5316.0 KB/s (43.5*10^6 bps)

    Errors: total 10000 client-timo 0 socket-timo 0 connrefused 0 connreset 2546
    Errors: fd-unavail 7454 addrunavail 0 ftab-full 0 other 0
    james@li140-209:~$


  • Daniel

    Posted: February 17, 2011


    Great blog! Do you have a Twitter account?


    • fjordvald

      Posted: February 17, 2011


      Thank you. I do not have a Twitter account, though.


  • Nicolò Martini

    Posted: February 17, 2011


    Great post, but when will you publish the one about the implementation?


    • fjordvald

      Posted: February 17, 2011


      Got the code all tidied up and packaged, so just have to write the actual blog post now. So probably tomorrow.


      • Nicolò Martini

        Posted: February 18, 2011


        Ok, thank you a lot, this study is what I was looking for.


  • Goran

    Posted: February 18, 2011


    Great article. I'm creating a hi-performance API server using PHP and Apache. The server is CentOS on the Amazon EC2 platform. For static .html files I get 6000 req/sec (50 concurrent) without any optimization, but when I execute a simple echo the number drops to 3000 req/sec. When I put a big comment inside I get 2200 req/sec, and when I use a simple "include" with a small file I get 1200 req/sec. Our PHP RESTful API application gets only 100 req/sec.

    Can someone explain why is this happening and how can we increase the requests per second for our application? Will Nginx help? How to create a hi-performance API web server?

    Thanks for your replies in advance.


  • Ryan Dahl

    Posted: June 15, 2012


    @Goran - "creating hi-performance API server by using PHP and Apache" - this is a contradiction in terms. Apache is anything but hi in terms of performance.

    Have you ever heard of Nginx?


  • Goran

    Posted: June 17, 2012


    @Ryan, yes, I'm mentioning nginx in the above post. Today we're using nginx to serve static files, but it never became part of our API server configuration, since we're using small EC2 instances which have moderate network I/O and the network is the bottleneck. We've achieved high performance with several DNS round-robin load balancers and memcached.


  • Siripala

    Posted: August 20, 2012


    I too tried to achieve 12,000 requests, but without success. The AB tool shows that 12,000 requests were reached, but at the same time PHP crashed. Running the particular web page from a browser after running AB gives a bad gateway error.

    http://stackoverflow.com/questions/3616191/nginx-php-fpm-502-bad-gateway


    • mfjordvald

      Posted: August 20, 2012


      You have a segmentation fault in your PHP, that has nothing to do with caching PHP pages in memcached... You're looking in the way wrong location.


