Category: Nginx

Yesterday Nginx Inc announced that it had taken $3 million USD in funding. No one deserves this more than Igor Sysoev and it’s hard to believe that Nginx wasn’t commercialized sooner. Well deserved or not, though, whether this funding is good for Nginx is up for debate.

To understand the whole aspect of the deal I’ll first cover the worst-case scenario that people might fear happening. I’ll later on cover why this case is unlikely, so please do finish reading before considering me a moron.

The FUD Aspect

Getting funded means a business person has seen potential and decided to invest money to get a return. There’s really no way to deny this; philanthropy simply does not happen in the start-up world unless you’re being funded by your rich but slightly senile aunt. Eventually this investor will want a return on the investment, and this means Nginx Inc will have to become profitable. How does an open source project become profitable, though?

  • Going closed source and commercializing the product.
  • Creating a closed source enterprise version to develop alongside the open source version.
  • Keeping the core product open and developing commercial extensions of that product.
  • Keeping the product open source and selling support, training and resources.

There might be a few more options that I haven’t thought of, but these are the most commonly seen ones. Based on the press release and statements made to the press we know that Nginx Inc plans to release a commercial version of Nginx for paying customers. To quote Andrew Alexeev:

“we think that it’s the most valuable approach for open source projects to be open core, in order to provide the commercial features that are really needed”

So that leaves us with an open core and most likely commercial modules for enterprise customers. Modules, perhaps such as high availability, proper load balancing or actual backend monitoring. Things normal people obviously do not need.

I’ll be the first to admit that the slippery slope argument is not a proper argument; it cannot be used as evidence of Nginx going in the wrong direction. Nevertheless, it is still a fun thought experiment. For Nginx Inc to be profitable it’s in their interest to get as many people as possible on their paid plans. As such, it is in their interest to keep the functionality in the free version limited to just enough that they can keep attracting new users.

They might promise not to upsell users; however, we all know how much a promise is worth when it comes to making money. If the commercial modules fail then a commercial version is introduced, then the free version is scrapped and eventually you’ve got a new Oracle on your hands. Business people are running Nginx Inc now and the death of Nginx as open source might be coming.

The Rational Aspect

The above is, of course, pure FUD. There’s no evidence that actually points to such a scenario happening and it is merely the worst-case scenario I could think up. So what do we know? What are the actual facts about this move?

  • Nginx Inc is getting new offices in San Francisco.
  • Nginx Inc will release a commercial product based on the open source Nginx core. Whether that is a full version or just modules is not known.

We can infer another fact based on this, namely that Nginx Inc will hire new people. Before Nginx Inc formed as a company back in July, Nginx was largely a one-man project. If you followed the development, it was Igor writing code with a few rare patches from third parties. Mostly, other developers were told to develop modules.

Today Nginx has 3 full-time developers working on the code instead of just Igor working after hours. This alone is a win for everyone who uses Nginx. I think it’s safe to say that the pace of development on the Nginx core should increase even if they only dedicate a single person to working on it.

Having a resourceful company behind Nginx is also a plus as it allows enterprise customers to be confident in using Nginx to power their infrastructure. They’ll be able to get support and know that the product isn’t a fly-by-night operation. More companies using Nginx means an increased need for people familiar with Nginx and that might increase the value of people with Nginx as a skill set.

The Rational Worst-Case Scenario

Let’s assume for a second that the FUD aspect holds true and Nginx becomes a closed source project, or even that the open source version is crippled to the point where it’s just a bare-bones httpd which even lighttpd outshines.

Nginx Inc actually has very little control over the entire infrastructure that is Nginx. In fact, the only two things controlled by Nginx Inc are the Nginx domain (and product) and the mailing list. For the longest time Nginx support has been handled by Igor on the mailing list and by the community everywhere else. The IRC channel, which these days has 300+ people idling, is controlled by community volunteers, the Nginx wiki is controlled by the same community volunteers and the Nginx forum is controlled by Jim Ohlstein, who has no connection to the Nginx company.

All this means that should the worst case scenario happen with Nginx Inc blinking and suddenly having dollar signs appear in their eyes, then the community can pull an OpenSSH and fork Nginx due to it using the BSD 2-clause license. If the community so desires the documentation and support structure can follow along.

Of course, it’s important to note that this scenario is far-fetched and that forking software is a last resort. I don’t see it happening.

My Personal Thoughts

I’m personally not too concerned at this point. Nginx has a long history of being open source and while it’s going open-core now I still feel confident that the core will not be neglected or crippled in favour of making money. On the other hand, I don’t know how much ownership Igor had to give up, nor do I know how strong of a leader/owner he’s going to be. At this point I’m positive about the funding. Extra developers mean good things, and until I see signs otherwise I really have no reason to panic. Should my FUD scenario ever come true I’m also pretty confident we’ll see an Nginx fork with a lot of the support structure migrating over. This of course makes it in the best interest of Nginx Inc to continue working closely with the community which has supported Nginx for so long.

I would like to see a more open development approach, though. A road map of planned features and more details on what exactly they plan to offer in their commercial version would be very welcome and would allow people to know how to react.

I have previously talked about some of the most common Nginx questions, and one such question is how to optimize Nginx. This is not surprising since most new Nginx users are migrating over from Apache and thus are used to having to tweak settings and perform voodoo magic to ensure that their servers perform as well as possible.

Well, I’ve got some bad news for you: you can’t really optimize Nginx very much. There are no magic settings that will reduce your load by half or make PHP run twice as fast. Thankfully, the good news is that Nginx is already optimized out of the box. The biggest optimization happened when you decided to use Nginx and ran that apt-get install, yum install or make install. (Please note that repositories are often out of date. The wiki install page usually has a more up-to-date repository.)

That said, there are a lot of options in Nginx that affect its behaviour and not all of their default values are completely optimized for high traffic situations. We also need to consider the platform that Nginx runs on and optimize our OS, as there are limitations in place there as well.

To summarize, while we cannot optimize the load time of individual connections we can ensure that Nginx has the ideal environment for handling high traffic situations. Of course, by high traffic I mean several hundred requests per second, so the vast majority of people don’t need to mess around with this, but if you are curious or want to be prepared then read on.

First of all we need to consider the platform to use, as Nginx is available on Linux, MacOS, FreeBSD, Solaris and Windows as well as some more esoteric systems. They all implement high performance event-based polling methods; sadly, Nginx only supports 4 of them. I tend to favour FreeBSD out of the four, but you should not see huge differences and it’s more important that you are comfortable with your OS of choice than that you get the absolutely most optimized OS.

In case you hadn’t guessed it already, the odd one out is Windows. Nginx on Windows is really not an option for anything you’re going to put into production. Windows has a different way of handling event polling and the Nginx author has chosen not to support this; as such it falls back to using select(), which isn’t overly efficient, and your performance will suffer quite quickly as a result.
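The same kind of per-platform choice shows up outside Nginx. As a loose analogy only (this is Python's standard library, not Nginx code), the sketch below prints which polling mechanism Python would pick on the machine it runs on; select() is the lowest common denominator that is always available.

```python
import selectors

# DefaultSelector is an alias for the most efficient implementation
# available on the platform this script runs on.
chosen = selectors.DefaultSelector.__name__
print(f"Python would use: {chosen}")

# The select()-based implementation exists on every platform.
fallback = selectors.SelectSelector.__name__
print(f"Always available: {fallback}")
```

Run it on a few different operating systems and the first line changes while the second stays the same.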

Read More »

This is part two in my caching series. Part one covered the concept behind full-page caching as well as potential problems to keep in mind. This part will focus on implementing the concept in actual PHP code. By the end of this you’ll have a working implementation that can cache full pages and invalidate them intelligently when an update happens.

Requirements

I’ll provide a fully functional framework with the simple application I used to get my benchmark figures. You’ll need the following software to be able to run it.

  • Nginx. I’m not sure which exact version but I generally use and recommend the latest development version.
  • PHP 5.3.0. I recommend at least 5.3.3 so you’ll have PHP-FPM for your fastcgi process management.
  • MySQL
  • Memcached

The Framework

You can download the framework here: Evil Genius Framework. I’ll be referencing code in the files instead of pasting it in this post to keep the size down, so you will probably want to download it.

Read More »

The Big Picture

So you’ve finally decided to make the switch from Apache to Nginx. You most likely did this for performance reasons; perhaps all those blogs have been writing about how fast Nginx is or perhaps your webmaster friends have been raving about how they can now handle a lot more traffic without spending money on hardware.

This is usually all true, but why exactly is Nginx so much faster than the typical Apache setup of the prefork MPM and mod_php? The technical explanation is that Nginx is a non-blocking event based architecture while Apache is a blocking process based architecture. To simplify it heavily the theory is like this:

Apache Prefork Processes:

  • Receive PHP request, send it to a process.
  • Process receives the request and passes it to PHP.
  • Receive an image request, see process is busy.
  • Process finishes PHP request, returns output.
  • Process gets image request and returns the image.

While the process is handling the request it is not capable of serving another request. This means the number of requests you can handle simultaneously is directly proportional to the number of processes you have running. Now, if a process took up just a small bit of memory that would not be too big of an issue as you could run a lot of processes. However, a typical Apache + PHP setup has the PHP binary embedded directly into the Apache processes. This means Apache can talk to PHP incredibly fast and without much overhead, but it also means that each Apache process is going to be 25-50MB in size, not just for PHP requests but also for all static file requests. This is because the processes keep PHP embedded at all times due to the cost of spawning new processes. This effectively means you will be limited by the amount of memory you have, as you can only run a small number of processes and a lot of image requests can quickly make you hit your maximum number of processes.
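To put rough numbers on that memory limit, here is a minimal Python sketch. The 25-50MB per-process figure comes from the paragraph above; the 2048MB of available RAM and the function name are made up purely for illustration.

```python
def max_prefork_processes(available_mb: int, process_mb: int) -> int:
    """Upper bound on simultaneous requests when each one needs a whole process."""
    if process_mb <= 0:
        raise ValueError("process_mb must be positive")
    return available_mb // process_mb


# Hypothetical server with 2048MB of RAM left over for Apache.
for size_mb in (25, 50):
    limit = max_prefork_processes(2048, size_mb)
    print(f"{size_mb}MB per process -> at most {limit} simultaneous requests")
```

With these assumed numbers, somewhere between 40 and 81 simultaneous requests, images included, is all it takes to occupy every process.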

Compare this to the Nginx event based method.

Read More »

“No input file specified” is one of the most frequently encountered issues in Nginx. People on serverfault and in the #nginx IRC channel ask for help with this so often that this post is mostly to allow me to be lazy and not have to type up the same answer every time.

This is actually an error from PHP and due to display_errors being 0 people will often just get a blank page with no output. In a typical setup PHP will then send the error to stderr or stdout and Nginx will pick up on it and log it in the Nginx error log file. Thus people spend a ton of time trying to figure out why Nginx isn’t working.

The root cause of the error is that PHP cannot find the file Nginx is telling it to look for, and there are two common cases that cause this. Either you’re not giving PHP the right path to the file or your file permissions are incorrect.

Wrong Path Sent to PHP

The most common reason at the time of writing is that a user follows a horrible tutorial found via Google instead of actually understanding Nginx. Reading my primer will equip you to actually solve this on your own, but since this post is dedicated to the error I’ll cheat this once and allow you to be lazy by just giving you the full solution.

Nginx tells PHP about the file to execute via the SCRIPT_FILENAME fastcgi_param value. Most examples in the wiki should define this as $document_root$fastcgi_script_name. The horrible tutorials will often hardcode the path value but this is not desirable as we don’t want to duplicate information and invite future screw ups. So you’ve gone with the $document_root$fastcgi_script_name option and suddenly it’s no longer working.

This happens because Nginx has 3 levels of inheritance, commonly referred to as blocks: http, server and location, each being a sub-block of its parent. Directives in Nginx inherit downwards but never up or across, so if you define something in one location block it will never be applied in any other location block under any circumstance.

Typically users define their index and root directives in location / because a tutorial told them to. So when they then define SCRIPT_FILENAME using $document_root in a separate PHP location, the root directive from location / does not apply there. $document_root therefore does not point at the site’s directory and PHP ends up looking for the file in the wrong place.

The simple solution here is to just define the root directive in your server block (or even the http block!). Generally, the higher up you can define a directive the fewer duplicate directives you’ll need.
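The "downwards but never across" rule can be sketched as a toy model in Python. This is only an illustration of the lookup order, not how Nginx is implemented; the Block class, the location names and the /var/www path are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Block:
    """A toy stand-in for an http, server or location block."""
    name: str
    parent: Block | None = None
    directives: dict[str, str] = field(default_factory=dict)

    def lookup(self, directive: str) -> str | None:
        # Walk upwards through the parents only; sibling blocks are never consulted.
        block: Block | None = self
        while block is not None:
            if directive in block.directives:
                return block.directives[directive]
            block = block.parent
        return None


http = Block("http")
server = Block("server", parent=http)

# Broken layout: root lives in "location /", a sibling of the PHP location.
location_root = Block("location /", parent=server, directives={"root": "/var/www"})
location_php = Block("location ~ \\.php$", parent=server)
print(location_php.lookup("root"))  # the sibling's root is not visible here

# Fixed layout: root is defined once in the server block.
server.directives["root"] = "/var/www"
print(location_php.lookup("root"))  # both locations now see /var/www
```

Moving the one directive up a level is the whole fix; nothing in the PHP location itself has to change.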

Incorrect File Permissions

Most people don’t really believe me when I tell them their file permissions are incorrect. They’re looking at the damn permissions and the PHP user can read the file just fine! Sadly, this shows a lack of understanding of Unix user permissions. Being able to read a file is not enough; a user must also be able to traverse to the file.

This effectively means that not only should the file have read permission, but the entire directory structure should have execute permission so that the PHP user can traverse the path. An example of this:

Say you have an index.php file in /var/www. /var/www/index.php must have read permission and both /var and /var/www must have execute permissions!
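A quick way to check the whole chain is a small script like the sketch below. It is a minimal example, not part of any Nginx or PHP tooling: the function names are made up, and os.access only answers for the user running the script, so it would need to be run as the user PHP runs as.

```python
from __future__ import annotations

import os
from pathlib import PurePosixPath


def path_chain(path: str) -> list[str]:
    """Every directory leading to the file, followed by the file itself."""
    target = PurePosixPath(path)
    parents = [str(p) for p in reversed(target.parents)]
    return parents + [str(target)]


def check_access(path: str) -> list[tuple[str, bool]]:
    """Check, for the current user, each step needed to reach the file."""
    chain = path_chain(path)
    results = []
    for directory in chain[:-1]:
        # Directories need execute permission to be traversed.
        results.append((directory, os.access(directory, os.X_OK)))
    # The file itself needs read permission.
    results.append((chain[-1], os.access(chain[-1], os.R_OK)))
    return results


for step, ok in check_access("/var/www/index.php"):
    print(f"{'ok  ' if ok else 'FAIL'} {step}")
```

The first line marked FAIL is the directory or file whose permissions need fixing.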

If you’ve corrected both things and still have this issue then please leave a comment so I can look into it. As far as I know there should be no other reasons for this error.

Edit: Part 2 is now available.

This is the first entry in a short series I’ll do on caching in PHP. During this series I’ll explore some of the options that exist when caching PHP code and provide a unique (I think) solution that I feel works well to gain high performance without sacrificing real-time data.

Caching in PHP is usually done on a per-object basis. People will cache a query or some CPU-intensive calculations to avoid redoing these expensive operations. This can get you a long way. I have an old site which uses this method and gets 105 requests per second on really old hardware.

An alternative that is used, for example in the Super Cache WordPress plug-in, is to cache the full-page data. This essentially means that you create a page only once. This introduces the problem of stale data, which people usually solve by checking whether data is still valid or by using a TTL caching mechanism and accepting stale data.
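The TTL variant and its stale-data trade-off can be sketched in a few lines of Python. This is a toy illustration, not the Super Cache implementation: the class name is made up, an in-memory dict stands in for a real cache store, and the clock is injectable only so the example can fast-forward time.

```python
import time
from typing import Callable


class TTLPageCache:
    """Full-page cache that serves a stored page until its TTL expires."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._pages = {}  # url -> (time stored, rendered page)

    def get(self, url: str, render: Callable[[], str]) -> str:
        now = self.clock()
        cached = self._pages.get(url)
        if cached is not None and now - cached[0] < self.ttl_seconds:
            return cached[1]  # may be stale if the data changed since it was stored
        page = render()
        self._pages[url] = (now, page)
        return page


# Fake clock and fake data so the behaviour is easy to follow.
fake_now = [0.0]
data = {"title": "old"}
cache = TTLPageCache(ttl_seconds=60, clock=lambda: fake_now[0])


def render_home() -> str:
    return f"<h1>{data['title']}</h1>"


first = cache.get("/", render_home)   # rendered and stored
data["title"] = "new"
stale = cache.get("/", render_home)   # still the old page, within the TTL
fake_now[0] = 61.0
fresh = cache.get("/", render_home)   # TTL expired, rendered again
print(first, stale, fresh)
```

The middle request is the cost of this approach: the data has changed but visitors keep getting the old page until the TTL runs out.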

The method I propose is a spin on full-page caching. I’m a big fan of Nginx and I tend to use it to solve a lot of my problems; this case is no exception. Nginx has a built-in Memcached module. With this we can store a page in Memcached and have Nginx serve it, thus never touching PHP at all. This essentially turns this:

Read More »