Hi Mike! Thanks for your interest. I really feel better knowing I'm not the only one who has had pain with this subject!
Your code is exactly what I have in my project, and it works only for the first test method called.
I investigated: when the first test runs, DI::getDefault() in setUp returns an object,
but when the second test method is called, DI::getDefault() returns null.
[Edit] I solved this problem: it was caused by code I had inserted in my parent's parent test class to work around another issue. That other issue is a missing feature in the Phalcon library, and I will open a new topic on it. In short, there is no way to reset shared services dynamically in Phalcon.
The only solution that has worked is the one I had when I started the topic: instantiating the DI in the setUp method of the parent test case class, so that a new DI is created for each test method run.
Obviously this solution was not compatible with the session service initialization, because PHPUnit runs all tests in the same context, and when session->start is called multiple times in the same context it raises errors. So I fixed this by creating a test session class that stores values in a temp file, and by replacing the standard session class (I use the Redis adapter) with the test session class in the test setUp method. It's not very elegant, and I think it could turn the test project into something horrible and unmaintainable, but for now it works.
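To give an idea of what I mean by a test session class, here is a minimal sketch in plain PHP. It is not my actual class and not part of Phalcon: the name FileTestSession and all of its methods are made up for illustration, and a real replacement would probably also have to implement whatever session interface your Phalcon version expects. The point is only that values go to a temp file and start() never touches PHP's native session, so calling it in every test is harmless.

```php
<?php

// Hypothetical stand-in for the session service, for tests only.
// Values are serialized to a temp file instead of PHP's native session.
final class FileTestSession
{
    /** @var string Path of the file that holds the serialized values. */
    private $file;

    public function __construct(?string $file = null)
    {
        if ($file === null) {
            $file = tempnam(sys_get_temp_dir(), 'test_session_');
            if ($file === false) {
                throw new RuntimeException('Could not create a temp file');
            }
        }
        $this->file = $file;
    }

    // No-op: nothing is started, so repeated calls cannot raise errors.
    public function start(): bool
    {
        return true;
    }

    public function set(string $key, $value): void
    {
        $data = $this->load();
        $data[$key] = $value;
        $this->save($data);
    }

    public function get(string $key, $default = null)
    {
        $data = $this->load();
        return array_key_exists($key, $data) ? $data[$key] : $default;
    }

    public function has(string $key): bool
    {
        return array_key_exists($key, $this->load());
    }

    public function remove(string $key): void
    {
        $data = $this->load();
        unset($data[$key]);
        $this->save($data);
    }

    // Deletes the backing file; call it from tearDown to leave no residue.
    public function destroy(): void
    {
        if (is_file($this->file)) {
            unlink($this->file);
        }
    }

    private function load(): array
    {
        if (!is_file($this->file)) {
            return [];
        }
        $raw = file_get_contents($this->file);
        if ($raw === false || $raw === '') {
            return [];
        }
        $data = unserialize($raw);
        return is_array($data) ? $data : [];
    }

    private function save(array $data): void
    {
        file_put_contents($this->file, serialize($data));
    }
}
```

In setUp you would then register an instance of it as the shared "session" service on the freshly created DI, in place of the Redis adapter, and call destroy() in tearDown so each test method starts with an empty session.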
The real problem with sessions and other request-context-related services is that PHPUnit runs all tests one after another in the same context, and, especially with shared services, there is no real way to reset everything for each method without resorting to workarounds and strategies that increase code complexity. (I'm one of those who think that code, especially testing code, must be kept simple.)
As emerged in other topics I've joined, I think the best solution for testing controller actions is currently to use acceptance tests, and to reserve unit tests for services/libraries only, where the developer has more control over requirements and can mock everything in the test methods.
Acceptance tests are slower, but each one runs in its own isolated context and they emulate user behavior well.