Derick RethansNatural Language Sorting with MongoDB (27.8.2014, 08:33 UTC)

Natural Language Sorting with MongoDB

Arranging English words in order is simple—well, most of the time. You simply arrange them in alphabetical order. However sorting a set of German words, or French words with all their accents, or Chinese with their different characters is a lot harder than it looks. Sorting rules are specified through "locales", which determine how accents are sorted, in which order the letters are in and how to do case-insensitive sorts. There is a good set of those sorting rules available through CLDR, and there is a neat example to play with all kinds of sorting at ICU's demo site. If you really want to know how the algorithms work, have a look at the Unicode Consortium's report on the Unicode Collation Algorithm.

Right now, MongoDB does not support indexes or sorting on anything but Unicode Code Points. Basically, that means, that it can't sort anything but English. There is a long standing issue, SERVER-1920, that is at the top of the priority list, but is not scheduled to be added to a future release. I expect this to be addressed at a point in the near future. However, with some tricks there is a way to solve the sorting problem manually.

Many languages, have their own implementation of the Unicode Collation Algorithm, often implemented through ICU. PHP has an ICU based implementation as part of the intl extension. And the class to use is the Collator class.

The Collator class encapsulates the Collation Algorithm to allow you to sort an array of text yourself, but it also allows you extract the "sort key". By storing this generated sort key in a separate field in MongoDB, we can sort by locale—and even multiple locales.

Take for example the following array of words:

$words = [
        'bailey', 'boffey', 'böhm', 'brown', 'серге́й', 'сергій', 'swag',

Which we can turn into sort keys with a simple PHP script like:

$collator = new Collator( 'en' );
foreach ( $words as $word )
        $sortKey = $collator->getSortKey( $word );
        echo $word, ': ', bin2hex( $sortKey ), "\n";

We create a collator object for the en locale, which is generic English. When running the script, the output is (after formatting):

bailey: 2927373d2f57010a010a
boffey: 294331312f57010a010a
böhm:   2943353f01859d060109
brown:  294943534101090109
серге́й: 5cba34b41a346601828d05010b
сергій: 5cba34b41a6066010a010a
swag:   4b53273301080108
svere:  4b512f492f01090109

Those sort keys can be used to then sort the array of names. In PHP, that would be:

$collator->sort( $words );
print_r( $words );

Which returns the following list:

[0] => bailey
[1] => boffey
[2] => böhm
[3] => brown
[4] => svere
[5] => swag
[6] => серге́й
[7] => сергій

We can extend this script, to use multiple collations, and import each word including its sort keys into MongoDB.

Below, we define the words we want to sort on, and the collations we want to compare. They are in order: English, German with phone book sorting, Norwegian, Russian and two forms of Swedish: "default" and "standard":

<?php $words = [
        'bailey', 'boffey', 'böhm', 'brown', 'серге́й', 'сергій',
        'swag', 'svere'
$collations = [
        'en', 'de_DE@collation=phonebook', 'no', 'ru',
        'sv', 'sv@collation=standard',

Make the connection to MongoDB and clean out the collection:

$m = new MongoClient;
$d = $m->demo;
$c = $d->collate;

Create the Collator objects for each of our collations:

$collators = [];

foreach ( $collations as $collation )
        $c->createIndex( [ $collation => 1 ] );
        $collators[$collation] = new Collator( $collation );

Loop over all the words, and for each collation we have define, use the created Collator object to generate the sort key. We encode the sort key with bin2hex() because sort keys are binary data, and MongoDB requires UTF-8 for strings. M

Truncated by Planet PHP, read more at the original (another 4683 bytes)

Matthew Weier O'PhinneyDeployment with Zend Server (Part 1 of 8) (26.8.2014, 20:15 UTC)

I manage a number of websites running on Zend Server, Zend's PHP application platform. I've started accumulating a number of patterns and tricks that make the deployments more successful, and which also allow me to do more advanced things such as setting up recurring jobs for the application, clearing page caches, and more.

Yes, YOU can afford Zend Server

"But, wait, Zend Server is uber-expensive!" I hear some folks saying.

Well, yes and no.

With the release of Zend Server 7, Zend now offers a "Development Edition" that contains all the features I've covered here, and which runs $195. This makes it affordable for small shops and freelancers, but potentially out of the reach of individuals.

But there's another option, which I'm using, which is even more intriguing: Zend Server on the Amazon Web Services (AWS) Marketplace. On AWS, you can try out Zend Server free for 30 days. After that, you get charged a fee on top of your normal AWS EC2 usage. Depending on the EC2 instance you choose, this can run as low as ~$24/month (this is on the t1.micro, and that's the total per month for both AWS and Zend Server usage). That's cheaper than most VPS hosting or PaaS providers, and gives you a full license for Zend Server.

Considering Zend Server is available on almost every PaaS and IaaS offering available, this is a great way to try it out, as well as to setup staging and testing servers cheaply; you can then choose the provider you want based on its other features. For those of you running low traffic or small, personal or hobbyist sites, it's an inexpensive alternative to VPS hosting.

So... onwards with my first tip.

Tip 1: zf-deploy

My first trick is to use zf-deploy. This is a tool Enrico and I wrote when prepping Apigility for its initial stable release. It allows you to create deployment packages from your application, including zip, tarball, and ZPKs (Zend Server deployment packages). We designed it to simplify packaging Zend Framework 2 and Apigility applications, but with a small amount of work, it could likely be used for a greater variety of PHP applications.

zf-deploy takes the current state of your working directory, and clones it to a working path. It then runs Composer (though you can disable this), and strips out anything configured in your .gitignore file (again, you can disable this). From there, it creates your package.

One optional piece is that, when creating a ZPK, you can tell it which deployment.xml you want to use and/or specify a directory containing the deployment.xml and any install scripts you want to include in the package. This latter is incredibly useful, as you can use this to shape your deployment.

As an example, on my own website, I have a CLI job that will fetch my latest GitHub activity. I can invoke that in my post_stage.php script:

if (! chdir(getenv('ZS_APPLICATION_BASE_DIR'))) {
  throw new Exception('Unable to change to application directory');

$php = '/usr/local/zend/bin/php';

$command = $php . ' public/index.php githubfeed fetch';
echo "\nExecuting `$command`\n";

One task I always do is make sure my application data directory is writable by the web server. This next line builds on the above, in that it assumes you've changed to your application directory first:

$command = 'chmod -R a+rwX ./data';
echo "\nExecuting `$command`\n";

Yes, PHP has a built-in for chmod, but it doesn't act recursively.

For ZF2 and Apigility applications, zf-deploy also allows you to specify a directory that contains the *local.php config scripts for your config/autoload/ directory, allowing you to merge in configuration specific for the deployment environment. This is a fantastic capability, as I can keep any private configuration separate from my main repository.

Deployment now becomes:

$ vendor/bin/zfdeploy.php --configs=../ --zpk=zpk

and I now have a ZPK ready to push to Zend Server.

In sum: zf-deploy simplifies ZPK creation, and allows you to add deployment scripts that let you perform other tasks on the server.

Next time...

I've got a total of 8 tips queued up, including this one, and will be publishing on Tuesdays and Thursdays; I'll update each post to link to the others in the series. Next tip: creating scheduled Job Queue jobs, à la cronjobs.

Other articles in the series

I will update this post to link to each article as it releases.
labs @ Qandidate.comBringing CQRS and Event Sourcing to PHP. Open sourcing Broadway! (26.8.2014, 14:00 UTC)

Last week we open sourced our toggle library, API and GUI, see our announcements here and here. Today we open source Broadway! Broadway is a project providing infrastructure and testing helpers for creating CQRS and event sourced applications. Broadway tries hard to not get in your way. The project contains several loosely coupled components that can be used together to provide a full CQRS\ES experience.

∞ labs @ Permalink

Matthew Weier O'PhinneyDeployment with Zend Server (Part 2 of 8) (26.8.2014, 13:30 UTC)

This is the second in a series of eight posts detailing tips on deploying to Zend Server. The previous post in the series detailed getting started with Zend Server on the AWS marketplace and using zf-deploy to create ZPK packages to deploy to Zend Server.

Today, I'm looking at how to created scheduled/recurring jobs using Zend Server's Job Queue; think of this as application-level cronjobs.

Tip 2: Recurring Jobs

I needed to define a few recurring jobs on the server. In the past, I've used cron for this, but I've recently had a slight change of mind on this: if I use cron, I have to assume I'm running on a unix-like system, and have some sort of system access to the server. If I have multiple servers running, that means ensuring they're setup on each server. It seems better to be able to define these jobs at the applicaton level.

Since Zend Server comes with Job Queue, I decided to try it out for scheduling recurring jobs. This is not terribly intuitive, however. The UI allows you to define scheduled jobs... but only gives options for every minute, hour, day, week, and month, without allowing you to specify the exact interval (e.g., every day at 20:00).

The PHP API, however, makes this easy. I can create a job as follows:

$queue = new ZendJobQueue();
$queue->createHttpJob('/jobs/github-feed.php', [], [
  'name'       => 'github-feed',
  'persistent' => false,
  'schedule'   => '5,20,35,40 * * * *',

Essentially, you provide a URL to the script to execute (Job Queue "runs" a job by accessing a URL on the server), and provide a schedule in crontab format. I like to give my jobs names as well, as it allows me to search for them in the UI, and also enables linking between the rules and the logs in the UI. Marking them as not persistent ensures that if the job is successful, it will be removed from the events list.

The question is, where do you define this? I decided to do this in my post_activate.php deployment script. However, this raises two new problems:

  • Rules need not just a path to the script, but also the scheme and host. You _can_ omit those, but only if the script can resolve them via $_SERVER... which it cannot due during deployment.
  • Each deployment adds the jobs you define... but this does not overwrite or remove the jobs you added in previous deployments.

I solved these as follows:

$server = '';

// Remove previously scheduled jobs:
$queue = new ZendJobQueue();
foreach ($queue->getSchedulingRules() as $job) {
    if (0 !== strpos($job['script'], $server)) {
        // not one we're interested in

    // Remove previously scheduled job

$queue->createHttpJob($server . '/jobs/github-feed.php', [], [
  'name'       => 'github-feed',
  'persistent' => false,
  'schedule'   => '5,20,35,40 * * * *',

So, in summary:

  • Define your rules with names.
  • Define recurring rules using the schedule option.
  • Define recurring rules in your deployment script, during post_activate.
  • Remove previously defined rules in your deployment script, prior to defining them.

Next time...

The next tip in the series is a short one, perfect for following the US Labor Day weekend, and details something I learned the hard way from Tip 1 when setting up deployment tasks.

Other articles in the series

I will update this post to link to each article as it releases.
Matthias NobackDecoupling your (event) system (25.8.2014, 22:00 UTC)

About interface segregation, dependency inversion and package stability

You are creating a nice reusable package. Inside the package you want to use events to allow others to hook into your own code. You look at several event managers that are available. Since you are somewhat familiar with the Symfony EventDispatcher component already, you decide to add it to your package's composer.json:

    "name": "my/package"
    "require": {
        "symfony/event-dispatcher": "~2.5"

Your dependency graph now looks like this:

Dependency graph

Introducing this dependency is not without any problem: everybody who uses my/package in their project will also pull in the symfony/event-dispatcher package, meaning they will now have yet another event dispatcher available in their project (a Laravel one, a Doctrine one, a Symfony one, etc.). This doesn't make sense, especially because event dispatchers all do (or can do) more or less the same thing.

Though this may be a minor inconvenience to most developers, having just this one extra dependency may cause some more serious problems when Composer tries to resolve version constraints. Maybe a project or one of its dependencies already has a dependency on symfony/event-dispatcher, but with version constraint >=2.3,<2.5...

Some drawbacks of the Symfony EventDispatcher

On top of these usability issues, when it comes to design principles, depending on a concrete library like the Symfony EventDispatcher is not such a good choice either. The EventDispatcherInterface which you will likely use in your code is quite bloated:

namespace Symfony\Component\EventDispatcher;

interface EventDispatcherInterface
    public function dispatch($eventName, Event $event = null);

    public function addListener($eventName, $listener, $priority = 0);
    public function addSubscriber(EventSubscriberInterface $subscriber);
    public function removeListener($eventName, $listener);
    public function removeSubscriber(EventSubscriberInterface $subscriber);
    public function getListeners($eventName = null);
    public function hasListeners($eventName = null);

The interface basically violates the Interface Segregation Principle, which means that it serves too many different types of clients: most clients will only use the dispatch() method to actually dispatch an event, and other clients will only use addListener() and addSubscriber(). The remaining methods are probably not even used by any of the clients, or only by clients that help you to debug your application (a quick search in the Symfony code-base confirms this suspicion).

Personally, I think it's not really nice that events need to be objects extending the Event class. I understand why Symfony does it (it's basically because that class has the stopPropagation() and isPropagationStopped() methods which enables event listeners to stop the dispatcher from notifying the remaining listeners. I've never liked that, nor wanted it to happen in my code. Just like the original Observer pattern prescribes, all listeners (observers) should be able to respond to the current situation.

I also don't like the fact that the same event class can be used for different events (differentiated by just their name, which is the first argument of the dispatch() method). I prefer each type of event to have its own class. Therefore, to me it would make sense to pass just an event object to the dispatch() method, allowing it to return its own name, which will always be the same anyway. The name returned by the event itself can then be used to determine which listeners need to be notified.

According to these objections to the design of the Symfony EventDispatcher classes, we'd be better off with the following clean interfaces:

namespace My\Package;

interface Event
    public function getName();

interface EventDispatcher
    public function dispatch(Event $event);

It would be really great to be able to just use these two interface in my/package!

Symfony EventDispatcher: the good parts

Still, we also want to use the Symfony E

Truncated by Planet PHP, read more at the original (another 6031 bytes)

SitePoint PHPPINQ – Querify Your Datasets – Faceted Search (25.8.2014, 16:00 UTC)

In part 1, we briefly covered the installation and basic syntax of PINQ, a PHP LINQ port. In this article, we will see how to use PINQ to mimic a faceted search feature with MySQL.

We are not going to cover the full aspect of faceted search in this series. Interested parties can refer to relevant articles published on Sitepoint and other Internet publications.

A typical faceted search works like this in a website:

  • A user provides a keyword or a few keywords to search for. For example, “router” to search for products which contain “router” in the description, keyword, category, tags, etc.
  • The site will return the products matching the criteria.
  • The site will provide some links to fine tune the search. For example, it may prompt that there are different brands for a router, and there may be different price ranges and different features.
  • The user can further screen the results by clicking the different links provided and eventually gets a more customized result set.

Faceted search is so popular and powerful and you can experience it in almost every e-Commerce site.

Unfortunately, faceted search is not a built-in feature provided by MySQL yet. What can we do if we are using MySQL but also want to provide our users with such a feature?

With PINQ, we’ll see there is an equally powerful and straightforward approach to achieving this as when we are using other DB engines - at least in a way.

Extending Part 1 Demo

NOTE: All code in this part and the part 1 demo can be found in the repo.

In this article, we will extend the demo we have shown in Part 1 and add in some essential faceted search features.

Let’s start with index.php by adding the following few lines:

$app->get('demo2', function () use ($app)
    global $demo;
    $test2 = new pinqDemo\Demo($app);
    return $test2->test2($app, $demo->test1($app));

$app->get('demo2/facet/{key}/{value}', function ($key, $value) use ($app)
    global $demo;
    $test3 = new pinqDemo\Demo($app);
    return $test3->test3($app, $demo->test1($app), $key, $value);

Truncated by Planet PHP, read more at the original (another 1415 bytes)

PHP ClassesThe False Decline of PHP - Lately in PHP podcast episode 50 (25.8.2014, 09:07 UTC)
By Manuel Lemos
It seems the popularity of PHP continues to upset people that do not sympathize with the language for some reason. The release of yet another article claiming the PHP decline was one of the main topics discussed by Manuel Lemos and Arturs Sosins in the episode 50 of the Lately in PHP podcast.

They also discussed the latest decisions about the features being planned for PHP 7, the last release of PHP 5.3 and PHP 5.4 going into security releases mode, and the (Facebook) PHP language specification effort.

Now listen to the podcast, or watch the hangout video or read the transcript to learn more about the details of these interesting PHP discussions.
Remi ColletPHP 5.6 as Software Collection (25.8.2014, 08:46 UTC)

RPM of upcoming new major version of PHP 5.6, are available in remi repository for Fedora 19, 20 and Enterprise Linux 6, 7 (RHEL, CentOS, ...) in a fresh new Software Collection (php56) allowing its installation beside the system version.

As I strongly believe in SCL potential to provide a simple way to allow installation of various versions simultaneously, and as I think it is useful to offer this feature to allow developers to test their applications, to allow sysadmin to prepare a migration or simply to use this version for some specific application, I decide to create this new SCL.

Installation :

yum --enablerepo=remi install php56

emblem-important-2-24.pngTo be noticed:

  • as the SCL is independant from the system, and doesn't alter it, this SCL is available in remi repository
  • installation is under the /opt/remi tree
  • the Apache module, php56-php, is available, but of course, only one mod_php can be used (so you have to disable or uninstall any other, the one provided by the default "php" package still have priority)
  • the FPM service (php56-php-fpm) is available, it listens on default port 9000, so you have to change the configuration if you want to use various FPM services simultaneously.
  • the php56 command give a simple access to this new version, however the scl command is still the recommended way.
  • for now, the collection provides 5.6.0RC4, but stable version should be released soon.
  • more PECL extensions will be progressively are also available.
  • only x86_64, no plan for other arch.

emblem-notice-24.pngAlso read other entries about SCL.

$ scl enable php56 'php -v'
PHP 5.6.0 (cli) (built: Aug 28 2014 08:14:49)
Copyright (c) 1997-2014 The PHP Group
Zend Engine v2.6.0, Copyright (c) 1998-2014 Zend Technologies
with Zend OPcache v7.0.4-dev, Copyright (c) 1999-2014, by Zend Technologies
with Xdebug v2.2.5, Copyright (c) 2002-2014, by Derick Rethans

As always, your feedback is welcome, a SCL dedicated forum is open.

tillWhat's wrong with composer and your .travis.yml? (24.8.2014, 00:19 UTC)

I'm a huge advocate of CI and one service in particular called Travis-Ci.

Travis-CI runs a continuous integration platform for both open source and commercial products. In a nutshell: Travis-CI listens for a commit to a Github repository and runs your test suite. Simple as that, no Jenkins required.

At Imagine Easy we happily take advantage of both. :)

So what's wrong?

For some reason, every other open source project (and probably a lot of closed source projects), use Travis-CI wrong in a way, that it will eventually break your builds.

When exactly? Whenever Travis-CI promotes composer to be a first-class citizen on the platform and attempts to run composer install automatically for you.

There may be breakage, but there may also be slowdown because by then you may end up with not one, but two composer install runs before your tests actually run.

Here's what needs fixing

A lot of projects use composer like so:

language: php
before_script: composer install --dev
script: phpunit

Here's what you have to adjust

language: php
install: composer install --dev
script: phpunit

install vs. before_script

I had never seen the install target either. Not consciously at least. And since I don't do a lot of CI for Ruby projects, I wasn't exposed to it either. On a Ruby build, Travis-CI will automatically run bundler for you, using said install target.

order of execution

In a nutshell, here are the relevant targets and how the execute:

  1. before_install
  2. install
  3. before_script
  4. script

The future

The future is that Travis-CI will do the following:

  1. before_install will self-update the composer(.phar)
  2. install will run composer install
  3. There is also the rumour of a composer_opts (or similar) setting so you can provide something like --prefer-source to the install target, without having to add an install target


Is any of this your fault? I don't think so, since the documentation leaves a lot to be desired. Scanning it while writing this blog post, I can't find a mention of install target on the pages related to building PHP products.

Long story short: go update your configurations now! ;)

I've started with doctrine/cache and doctrine/dbal, and will make it a habit to send a PR each time I see a configuration which is not what it should be.

Matthias NobackSymfony2: Event subsystems (23.8.2014, 22:00 UTC)

Recently I realized that some of the problems I encountered in the past could have been easily solved by what I'm about to explain in this post.

The problem: an event listener introduces a circular reference

The problem is: having a complicated graph of service definitions and their dependencies, which causes a ServiceCircularReferenceException, saying 'Circular reference detected for service "...", path: "... -> ... -> ...".' Somewhere in the path of services that form the circle you then find the event_dispatcher service. For example: event_dispatcher -> your_event_listener -> some_service -> event_dispatcher.

Your event listener is of course a dependency of the event_dispatcher service, but one of the dependencies of your event listener needs the event_dispatcher itself! So you accidentally introduced a cycle in the dependency graph and now you need to dissolve it.

The wrong solution: injecting the service container

Assuming there is no way to redesign these services in a better way, you finally decide to make the event listener "container-aware". You will fetch some_service directly from the service container, which gets injected as a constructor argument:

use Symfony\Component\DependencyInjection\ContainerInterface;

class YourEventListener
    private $container;

    public function __construct(ContainerInterface $container)
        $this->container = $container;

    public function onSomeEvent()
        $someService = $this->container->get('some_service');


This will break the cycle since you depend on something else entirely: event_dispatcher -> your_event_listener -> service_container. However, it also introduces a nasty dependency on the service container, making your class coupled to the framework, the service definitions themselves, and making it less explicit about its actual dependencies. Besides, I really think that a dependency issue at configuration level should not affect a class this much.

The solution

There is an entirely different and much better way to break the cycle, which is to avoid any dependency on the main event_dispatcher in your own services at all. Let me explain this to you.

In the past, whenever I introduced a new event in my code, I used the application's main event_dispatcher service to dispatch the event and I also registered any listeners for the new event to the same event_dispatcher, using the kernel.event_listener service tag like this:

            - @event_dispatcher

        class: ...
            - { name: kernel.event_listener, event: my_event, method: onMyEvent }

However, the application's main event dispatcher depends on so many services (via the event listeners registered to it), that sooner or later the dependency mess will start to show cycles. Still, to work with events, we need an event dispatcher (or event emitter, event manager, etc.). Since our code is already written with the Symfony EventDispatcherInterface in mind, we just need to work around the main event_dispatcher service.

Event subsystems

We can avoid the event_dispatcher altogether by introducing event subsystems.

In the old situation we depended on the event_dispatcher. In the new situation, each set of custom events comes with its own event subsystem. This subsystem consists of a dedicated event dispatcher, and an easy way to register event listeners for the custom events.

Since Symfony 2.3 it's really easy to set up your own event dispatcher and to conveniently register event listeners and subscribers in the same way as before.

Example: a subsystem for domain events

A highly relevant example would be an event subsystem for domain events. You don't want those events to be dispatched by the same event dispatcher that Symfony2 uses for its core events.

To be able to use the domain event subsystem, we first need to define a new event dispatcher service:

        class: Symfony\Component\EventDispatcher\ContainerAwareEventDispatcher
            - @service_container

Now we want to register event listeners for the domain event subsystem in the following, familiar way:

        class: ...
            - {
                name: domain_event_listener,
                event: my_domain_event,

Truncated by Planet PHP, read more at the original (another 2237 bytes)

LinksRSS 0.92   RDF 1.
Atom Feed   100% Popoon
PHP5 powered   PEAR
ButtonsPlanet PHP   Planet PHP
Planet PHP