At our company we develop a Learning Management System (LMS), and a couple of years ago we decided to rewrite the whole LMS from scratch, taking everything we had learned along the way and making the best of it.
We reconsidered everything: the choice of web server, scripting language, database backend, client proxy caching (which had caused us problems before), and so on.
This is a guide based on our experiences, from nothing to version 2.0 of our LMS.
Our servers run Linux, so some years ago the choice of web server was always Apache. But since about 75% of the content our servers send is static files, we decided to give nginx a try. Nginx is an asynchronous, event-driven web server that serves files with very little memory footprint or load on the system. Nginx cannot execute PHP itself the way Apache does with mod_php; instead you run PHP scripts through FastCGI. The downside of Apache is that every request gets its own process, and every process loads mod_php even when it is only serving a static file, so every request uses a lot of memory for no benefit. Nginx (when configured correctly) only proxies to the FastCGI backend when it actually needs to execute PHP code.
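As a sketch of that split (the document root and backend address are placeholders, not our actual configuration), an nginx server block that serves static files straight from disk and hands only PHP requests to FastCGI might look like:

```nginx
server {
    listen 80;
    root /var/www/lms/public;   # placeholder document root
    index index.php;

    # Static files are served directly by nginx itself.
    location / {
        try_files $uri $uri/ /index.php;
    }

    # Only requests for PHP scripts are proxied to the FastCGI backend.
    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_pass 127.0.0.1:9000;   # php-fpm listening here
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    }
}
```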
So the choice of nginx required us to execute PHP through FastCGI, and fortunately PHP now ships with PHP-FPM, a FastCGI process manager for PHP.
Setting up php-fpm is basically like configuring Apache's process model: you set the number of startup, minimum, and maximum processes.
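A php-fpm pool configuration covering those process settings might look like this (the numbers are illustrative, not our production values):

```ini
[www]
listen = 127.0.0.1:9000

pm = dynamic             ; spawn and retire workers on demand
pm.start_servers = 4     ; processes started at launch
pm.min_spare_servers = 2 ; keep at least this many idle workers
pm.max_spare_servers = 8 ; retire idle workers above this count
pm.max_children = 32     ; hard upper limit on worker processes
```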
At first I was a bit worried that executing PHP through nginx/FastCGI/php-fpm would hurt performance, but after setting it up and benchmarking some PHP scripts, we found that performance was about the same as with Apache/mod_php.
We also chose XCache as our opcode cache, set up and tuned for the 4 cores we have on the web server. Tuning XCache for multiple cores with the xcache.count parameter noticeably reduces contention on the shared-memory lock.
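The relevant XCache settings, as a sketch (the cache size here is made up for illustration; only the count matches our 4 cores):

```ini
xcache.count = 4    ; one cache slot per core -> less lock contention
xcache.size  = 64M  ; total opcode cache, split across the 4 slots
```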
Problems with web-server/PHP:
When we started to benchmark our LMS we ran into some important problems.
- The first problem was that PHP leaks memory: after running the web server for a couple of hours we noticed that the php-fpm processes kept getting bigger and bigger, until eventually the server started swapping, which affected the whole system.
The solution was easy: just set the php-fpm parameter "pm.max_requests" to something like 20, which means each process serves only 20 requests and then shuts down. Php-fpm handles this gracefully and starts a new process whenever too few are running.
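In the pool configuration this is a single line (the value 20 is from our setup; tune it to your own leak rate):

```ini
pm.max_requests = 20   ; recycle each worker after 20 requests
```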
- When benchmarking under high load, we started getting a lot of PHP errors that turned out to be database problems: PHP could no longer connect to the database, returning an undocumented error code. We use PHP's PDO to connect to a MySQL server running on a different machine, and the error turned out to mean that there were no available ports left.
The solution again was very easy: just turn on persistent connections to the database. Without them, the PHP script connects to the database, makes its requests, and closes the connection, but a normal system still keeps that TCP port unavailable for another 60 seconds, and since only a few thousand ports are dedicated to this, you eventually run out of ports.
With persistent connections turned on, the connection to the database stays open until the process is killed, which with our configuration happens only every 20 requests.
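With PDO, enabling persistent connections is a one-line change: pass PDO::ATTR_PERSISTENT in the driver options. A minimal sketch (the DSN and credentials are placeholders, not our real ones):

```php
<?php
// Placeholder host, database name, and credentials for illustration.
$pdo = new PDO(
    'mysql:host=db.example.com;dbname=lms',
    'lms_user',
    'secret',
    [PDO::ATTR_PERSISTENT => true]  // reuse the TCP connection across requests
);
```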
When we developed the new version of our LMS we made some very good decisions based on previous experience.
We see our static files in two different categories:
- Uploaded files: in an LMS, the administrators upload all the content, such as documents and SCORM packages.
- Shared files: libraries like jQuery that are part of the application itself.
Ground rule: NEVER UPDATE STATIC FILES!!
This means that whenever we update any of the shared files, we publish a new revision under a new path. For instance, upgrading jQuery moves us from http://example.com/shared/r100/jquery.js to http://example.com/shared/r101/jquery.js.
If the administrators upload any new content, we always give it a new path, e.g. http://example.com/data/documents/1/mydoc.pdf becomes http://example.com/data/documents/2/mydoc.pdf.
If we follow this rule, we can set up nginx with "expires max" on all our static files. This basically means that once a browser has downloaded a file, it will never even check whether the file has been updated; when we do update a file, its path changes and the browser sees it as a new file to download.
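In nginx this is a couple of lines on the static locations (the path pattern is a placeholder matching the example URLs above):

```nginx
# Far-future expiry: browsers and proxies may cache these forever,
# which is safe because every change to a file gets a new path.
location ~ ^/(shared|data)/ {
    expires max;
}
```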
This also takes care of any misconfigured proxy server that fails to check for updated files as it should. Proxy servers work very well with this approach, since they can take over the load of serving static files.
Rule: Compress appropriate static files.
We have also turned on nginx's "gzip on" for our PHP responses.
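A typical nginx gzip setup for compressible responses might look like this (the compression level and type list are illustrative; images and PDFs are already compressed, so they are left out):

```nginx
gzip on;
gzip_comp_level 5;
# text/html is compressed by default; add other compressible types.
gzip_types text/css text/plain application/x-javascript application/json;
```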
Our web application is very fast. We have a well-normalized, well-indexed database running on one backend server, and our web app is built on Zend Framework. When we load-test the LMS, the bottleneck actually isn't the database, as you might normally predict; instead PHP is the main bottleneck. At maximum load on the web server (4 cores), the database server (1 core) carries only about a quarter of the web server's load. We are looking at different ways to optimize this, such as tuning the autoloader or maybe even using the HipHop PHP compiler. The problem is not memory on the web server; there is plenty of memory left, and the maximum number of php-fpm processes is never reached under maximum load.
After optimizing our application, we score 91/100 on the Page Speed Firebug plugin, and we have no problems at all with high load on our servers. But it's always fun to optimize web applications further, since I'm a bit allergic to slow web apps :)
Any suggestions on how to optimize this further are appreciated.