We have moved our forum to GitHub Discussions. For questions about Phalcon v3/v4/v5 you can visit here and for Phalcon v6 here.

Raw query taking too long

I've been fighting this issue in different ways for a while. Right now I have cron generate a MySQL view of a joined query processing 300,000 results with a limit of 1000. This query runs very fast.

Here's the query that generates the view;

    $sql = "
            DROP VIEW IF EXISTS active_users_list;
            CREATE VIEW active_users_list AS
            SELECT roles_users.*, users.* FROM roles_users
            LEFT JOIN users ON users.id = roles_users.user_id
            WHERE roles_users.role_id = '2'
            AND users.f_name IS NOT NULL
            AND users.city IS NOT NULL
            AND users.gender IS NOT NULL
            AND users.date_of_birth IS NOT NULL
            ORDER BY users.last_active DESC
            LIMIT 1000;
        ";
        $con = \Phalcon\DI::getDefault()->getShared('db');
        if($con->query($sql)) {
            echo "Active user list generated\n";
        }

Then the following code is used to query the view of 1000 results in a paginated way;

        $sql = "
        select * from active_users_list
        LIMIT $offset, $limit
        ";
         $con = \Phalcon\DI::getDefault()->getShared('db');
         $data = $con->query($sql);
         $data->setFetchMode(\Phalcon\Db::FETCH_OBJ);
         $results = (object) $data->fetchAll();

This query can be very slow and can take 6 seconds or more!

I originally had the main query running here, paginated, with all 300,000 results, but as you got further to the end of pagination it would get rediculously slow.

It seems to be the LIMIT $offset, $limit which slows it down. I was under the impression the above was a raw SQL query, so would avoid Phalcon's limitations when working with large result sets. Is the speed issue here still caused by Phalcon?

This sounds more like an SQL issue rather than Phalcon. Rewriting the join to inner will give you some performance boost:

SELECT roles_users.*, users.* FROM users INNER JOIN roles_users ON users.id=roles_users.user_id WHERE roles_users.role_id=2

You may use a PHQL query builer with Phalcon\Paginator\Adapter\QueryBuilder as your paginator and still get the same performance as with raw SQL, since the builder will automatically insert the LIMIT command. Check the docs: https://docs.phalcon.io/en/latest/reference/pagination.html

And yes, your last snippet will execute a raw SQL command using the PDO adapter, so it should have the same performance as running it from console for eg.

Try benchmarking your query first, so you can pinpoint the performance bottleneck. Here's a nice tutorial, although not exhaustive: https://www.digitalocean.com/community/tutorials/how-to-measure-mysql-query-performance-with-mysqlslap

edited Apr '16

Rewrite your view to remove OFFSET and use LIMIT and WHERE greather then something clause instead(im guessing you have user id maybe).

On a big data set you should avoid offset.

PS: Views won't reduce execution time (99% of the times). Oh, and I do hope you've indexed your columns ;]

Best case to check if it's the phalcon or sql just test it in sql(for example in phpmyadmin) or just use plain php.



20.4k

Rewrite your view to remove OFFSET and use LIMIT and WHERE greather then something clause instead(im guessing you have user id maybe).

On a big data set you should avoid offset.

Thanks for the pointers. Yes there are user IDs and roles_users has foreign key on user_id. I will try using LIMIT with WHERE and see how that goes.



20.4k

PS: Views won't reduce execution time (99% of the times). Oh, and I do hope you've indexed your columns ;]

Yes I'm using foreign keys. The intention isn't really to use the view. I'm only using it at the moment to try to debug the speed issues.

But firstly check it on raw query. Maybe it's really phalcon's problem.



20.4k
edited Apr '16

But firstly check it on raw query. Maybe it's really phalcon's problem.

No, it's the query. I spent the day reading up on it, and it's suprisingly tricky. I remember discovering the same issue in the past too. Using offset on a large data set is the issue. Doing something like WHERE id BETWEEN $offset AND $last_id is the only way this seems possible, but I am ordering by last_active (DateTime) and not ID. I have had to rediseign the way the app works. This link is helpful for anyone who comes across this issue. https://www.percona.com/files/presentations/ppc2009/PPC2009_mysql_pagination.pdf <- this crappy forum won't link it. Use underscores "_" in place of %5