Hi, i'm __edorian.
I'm a magic method. Apart from that a php/mysql/web guy and a gamer... among other stuff
What does the word UNIT in unit testing stand for?
Think of an answer and read on!
*Photo credit – Wikipedia – Free content licence
So? Did you say “A method! Because we test methods!”?
If so let me offer another perspective.
Wikipedia tells us that the answer is:
A unit is the smallest testable part of an application
But what does that mean? Is “a method” the right answer because it is the smallest testable part?
To be more precise, using my own words:
Unit testing, in PHP, is about testing the observable behaviors of a class!
Observable from the outside! Nobody cares about the internal state of a class if it never changes the outcome of a method call.
But let’s take a step back.
A method on its own is hardly testable. It might seem counterintuitive at first, but let us look at some examples before I try to make a general point:
private $count;

public function getCount() {
    return $this->count;
}
This is maybe the most obvious case but it shows that without the ability to manipulate $this->count from the outside writing a test is pointless.
We could make $count accessible (using reflection), change its value and test whether the getter works, but then all we did was test the implementation of the getter method, not that “it does the right thing”.
If you later change the function to:
public function getCount() { return count($this->list); }
our test would fail even though the class might still work as expected!
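To make that concrete, here is a minimal sketch in plain PHP (no PHPUnit, so it runs on its own). The class name ItemList and its add() method are made up for illustration. The check only uses the public API, so it keeps passing whether getCount() reads a counter field or counts a list internally.

```php
<?php
// Hypothetical class for illustration; ItemList and add() are assumptions.
class ItemList
{
    private $list = array();

    public function add($item)
    {
        $this->list[] = $item;
    }

    public function getCount()
    {
        // Implementation detail: this could equally return a private $count field.
        return count($this->list);
    }
}

// Behavior under test: "after adding two items, the count is two".
$items = new ItemList();
$items->add('apple');
$items->add('pear');
$observedCount = $items->getCount();

echo $observedCount === 2 ? "ok\n" : "failed\n";
```

Nothing in the check reaches into the object, so swapping the internals does not require touching the test.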
Another example
public function setValue($value) {
    $this->value = $value;
}

public function execute() {
    if (!$this->value) {
        throw new Exception("No Value, no good");
    }
    return $this->value * 10; // business logic
}
The example may be a little constructed but it should show you that testing “execute” can’t be done “in isolation” as we need another method.
Well DO'H!
It sounds trivial, but the distinction is important and easier to overlook when looking at bigger classes.
What do we test there?
So we are testing two behaviors of your class and not the methods in isolation!
This is very important as the “test methods” mindset might lead you to adding a getValue function just for the tests so you have “at least those methods covered” but you end up with rather worthless tests that don’t tell you if the class actually works!
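As a rough sketch, the two behaviors of the setValue/execute example could be checked like this in plain PHP. The class name Calculation is an assumption (the example above does not name its class), and a real suite would use PHPUnit test methods instead of bare variables.

```php
<?php
// Hypothetical class name; the two methods mirror the example above.
class Calculation
{
    private $value;

    public function setValue($value)
    {
        $this->value = $value;
    }

    public function execute()
    {
        if (!$this->value) {
            throw new Exception("No Value, no good");
        }
        return $this->value * 10; // business logic
    }
}

// Behavior 1: executing without a value fails with an exception.
$failedWithoutValue = false;
try {
    $unconfigured = new Calculation();
    $unconfigured->execute();
} catch (Exception $e) {
    $failedWithoutValue = true;
}

// Behavior 2: with a value set, execute() returns ten times that value.
$calculation = new Calculation();
$calculation->setValue(4);
$result = $calculation->execute();
```

Note that neither check needs a getValue() method. Both go through the same public calls a real caller would use.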
OOP is about passing messages between objects
When I call a method on an object I ask it to do something. Another way of saying that is “I tell an object to exhibit a certain behavior”.
So what ‘behaviors’ does a class have?
What we care about when unit testing
Return values – The answer we get to our questions!
$this->assertSame($expectedResult, $object->methodCall());
Calls to other methods – Does it pass the message along?
$loggerMock->expects($this->once())->method('log')->with('ERROR');
We don’t really care about testing global state. We try to avoid it inside the application, and for testing the database interactions we use integration or system tests.
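The two kinds of observable behavior, return values and calls to collaborators, can be sketched without PHPUnit by using a hand-written recording double. Everything here (Logger, RecordingLogger, Importer, import()) is a hypothetical name chosen for the example. With PHPUnit you would use a mock object as in the snippet above.

```php
<?php
interface Logger
{
    public function log($level);
}

// Hand-written test double that records every message it receives.
class RecordingLogger implements Logger
{
    public $received = array();

    public function log($level)
    {
        $this->received[] = $level;
    }
}

// Hypothetical class under test.
class Importer
{
    private $logger;

    public function __construct(Logger $logger)
    {
        $this->logger = $logger;
    }

    public function import(array $rows)
    {
        if (count($rows) === 0) {
            $this->logger->log('ERROR'); // passes the message along
            return 0;
        }
        return count($rows);
    }
}

$logger = new RecordingLogger();
$importer = new Importer($logger);
$returned = $importer->import(array());
```

After the call, $returned answers "what did it tell me?" and $logger->received answers "what did it tell others?".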
If you want one sentence out of this: Having one test case per method is usually a bad thing!
What you want is: One test case per behavior
You want to have a list of “If you do this, it should react like that” sentences in executable form.
Why? Because there is value in specifying how a class should react to certain inputs. It’s why we do testing all along and we should express that as clearly as possible to make writing tests worthwhile.
When you change a class and the tests fail you should be sure that it now actually works differently and if that wasn’t your intention (because you were just refactoring) then you just saved yourself from a bug.
When your tests fail but the class “still works” and you need to “fix the tests”, then your tests are worth a lot less, as they don’t really give you the cozy safety net that they should provide.
Private methods don’t have any observable behavior. You test them implicitly through the public API of the class.
They are an implementation detail and it is not important if you have 0 or 100 private methods! For the test cases this should make no difference at all. Nobody cares how the work gets done just that the result is right.
I hope this helped a little to clarify what I mean by saying
We don’t test methods, we test classes/behaviors!
and that it helps you create more meaningful test suites.
Maybe you have read it maybe you didn’t:
Some days ago there was a blog post regarding php exception performance in 5.4 and the numbers got reported all over the place.
The actual numbers are secondary. The main point is: Don’t trust “random” stuff on the Internet when thinking about improving your application performance. You always need to measure things for yourself and take care doing so!
I initially trusted the benchmark myself and dismissed the whole post, saying: “Well yes, exceptions are slower than if statements, but nice that they got faster”.
That changed after a small remark by Nikita Popov during a chat session we had, and his comment on the blog post. Quoting:
A quick tip for the next time when you do “benchmarking”: Try to understand the reason behind it.
For example here the reason is that you did your PHP 5.3 benchmarking with xDebug enabled, and the PHP 5.4 benchmarking with XDebug disabled.
With XDebug (5.3.8) the Exception code needs 1.2 seconds on my machine.
Without it (still 5.3.8) it needs only 0.14 seconds. On 5.4RC3 (without XDebug) it needs 0.09. So the improvement is actually quite minimal (only factor of two).
So, next time you do benchmarks, don’t forget to disabled XDebug
So I copied the script to look at the numbers myself (the script is attached at the bottom, with 1 million runs instead of 100k to get nicer numbers).
/opt/php5.3.8/bin/php a.php
odds: 500000, evens 500000Array
(
[memory] => 72.665935516357
[microtime] => 1.4795081615448
)
odd: 500000, evens 500000Array
(
[memory] => -7.62939453125E-6
[microtime] => 0.42677807807922
)
/opt/php5.4.0RC5/bin/php a.php
odds: 500000, evens 500000Array
(
[memory] => 72.665348052979
[microtime] => 1.0610599517822
)
odd: 500000, evens 500000Array
(
[memory] => 0
[microtime] => 0.47566914558411
)
Showing an improvement from 1.5 seconds to 1.0 seconds.
/opt/php5.3.8/bin/php a.php
odds: 500000, evens 500000Array
(
[memory] => 72.666839599609
[microtime] => 5.929713010788
)
odd: 500000, evens 500000Array
(
[memory] => -0.0006256103515625
[microtime] => 0.6544771194458
)
/opt/php5.4.0RC5/bin/php a.php
odds: 500000, evens 500000Array
(
[memory] => 72.665969848633
[microtime] => 5.3529570102692
)
odd: 500000, evens 500000Array
(
[memory] => 6.866455078125E-5
[microtime] => 0.65565991401672
)
Showing 5.9 seconds versus 5.3 seconds.
Running stuff with debugging tools is slower than not doing that. That’s why we don’t use xDebug in production :)
More seriously: When measuring take care, when believing measurements take care too :)
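One cheap way to take that care is to let the benchmark script report its own environment. The following is only a sketch. The function name benchmarkEnvironmentWarnings() is made up, and it checks nothing beyond whether the xdebug extension is loaded.

```php
<?php
// Returns a list of warnings about the environment a benchmark runs in.
// Hypothetical helper; only checks whether the xdebug extension is loaded.
function benchmarkEnvironmentWarnings()
{
    $warnings = array();
    if (extension_loaded('xdebug')) {
        $warnings[] = 'xdebug is loaded; timings will be inflated';
    }
    return $warnings;
}

// Print the warnings next to the PHP version so results can be compared fairly.
foreach (benchmarkEnvironmentWarnings() as $warning) {
    fwrite(STDERR, "WARNING: $warning\n");
}
echo 'PHP ' . PHP_VERSION . "\n";
```

Putting something like this at the top of a.php would have made the difference between the two sets of numbers visible right away.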
// Variant 1: exceptions as control flow
error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();
$even = $odd = array();
foreach (range(1, 1000000) as $i) {
    try {
        if ($i % 2 == 0) {
            throw new Exception("even number");
        } else {
            $odd[] = $i;
        }
    } catch (Exception $e) {
        $even[] = $i;
    }
}
echo "odds: " . count($odd) . ", evens " . count($even);
print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'microtime' => microtime(TRUE) - $time));
// Variant 2: plain if/else
error_reporting(-1);
$time = microtime(TRUE);
$mem = memory_get_usage();
$even = $odd = array();
foreach (range(1, 1000000) as $i) {
    if ($i % 2 == 0) {
        $even[] = $i;
    } else {
        $odd[] = $i;
    }
}
echo "odd: " . count($odd) . ", evens " . count($even);
print_r(array('memory' => (memory_get_usage() - $mem) / (1024 * 1024), 'microtime' => microtime(TRUE) - $time));
I’ve met one of the authors, Lorna, at a few (4?) PHP conferences. She has been very supportive, friendly, honest and welcoming. That and her contributions to the PHP community made me check out the book.
The book is solid, well written and covers the most important topics that people need to think about when starting off with PHP. It is one of the few PHP books on the market that you can pass on to your trainees/junior developers without having to “unteach” them half of the taught bad practices afterwards. This is a great achievement in my mind and I’d definitely recommend checking it out and passing it on to your trainees and ‘junior developers’ … maybe read it first yourself and rip out a few pages in chapter 4.
The book: PHP Masters – Write Cutting Edge Code.
First off: I really like the flow, structure and scope of the book. It touches upon a lot of important points for current php development while keeping it simple enough so that it can serve as an introductory book.
I’ve tried to write a more fluent list of impressions but it always came back to a list of points about the single chapters so that’s what I’m going with.
The book starts off describing the very basics of OOP and mixes in the special points to consider for PHP. The decision to use __autoload instead of the more current spl_autoload_register for most of the book strikes me as a little odd but I assume it was done for the sake of simplicity. In my opinion it could have been put in the appendix altogether though.
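For readers who have only seen __autoload, here is a small sketch of what spl_autoload_register looks like. The class name DemoService and the eval() call are purely illustrative so the example fits in one file. A real loader would map the class name to a file and require it.

```php
<?php
// Records every class name the loader is asked for (for demonstration only).
$requested = array();

spl_autoload_register(function ($class) use (&$requested) {
    $requested[] = $class;
    // A real loader would do something like:
    //   require __DIR__ . '/src/' . str_replace('\\', '/', $class) . '.php';
    // Here we define a stub inline to keep the sketch self-contained.
    if ($class === 'DemoService') {
        eval('class DemoService {}');
    }
});

$service = new DemoService();      // unknown class, so the loader is called
$known = class_exists('Missing');  // loader is called, class stays undefined
```

Unlike __autoload, several loaders can be registered this way and they are asked one after another, which is why libraries prefer it.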
The chapter continues describing namespaces, statics, inheritance, type hinting, polymorphism, visibility, references, interfaces and exceptions.
The whole package is written in a very compact but understandable fashion that is hard to find elsewhere. If I really had to I could nitpick a lot, but in general I’ve only seen worse in print, which is one of the points that makes this book so great for beginners. But the one paragraph that explained statics and started off with “When to use a static method is mainly a point of style.” made me cringe and wish that either a more verbose discussion had been included or that it had just been left out. That might as well just be my pet peeve, as “a little static never hurt anyone” is what I find prohibits people from reaping the benefits of good OO.
Apart from that it was a really nice read summing up a lot of the complexity a grown language like PHP brings to the table while being concise and readable.
The book chose PDO as the API to explain and it is, imho, the best choice for a book like this. Personally, I prefer mysqli over PDO and a DB API abstraction over both, but for the book PDO made the most sense.
I like the explanation of the PDO API a lot, and the book took the time to touch upon the finer points, for that scope, of MySQL like EXPLAIN, foreign keys, m:n tables, inner vs left join and normalization. Those can be hard to tackle the first time, and just mentioning them can save the reader a lot of trouble as they now know the concepts to look into.
APIs as in web services, to be more precise, but that gets pretty clear on the first page. I was initially wondering if that would cover class design, naming methods and parameters and so on, as it was pretty early in the book and before ‘Patterns’, but this chapter is talking about “Web APIs”.
That’s a big one. I’m glad it is in the book as API design is getting more important nowadays and explains all the basic terms everybody should know quite well.
Starting off with a small explanation of what SOA meant before it became an enterprise-bs-bingo-word, it explains the common data formats, XML and JSON, and common service types like RPC, SOAP and REST quite well, showing nice examples for each one.
The thing I like the most here is that JSON and XML were explained and that it was shown how to use them from PHP. While I don’t like choosing SimpleXml over Dom for any task, I can understand that it might be more readable for the examples. Calling it “Simpler” doesn’t match my experience though, but that’s a minor issue. (Did you know that if you rearrange the letters of “Simple” you can get “Broken” if you also swap out 5 of them.)
Of the service types the first thing explained is HTTP, the second is cURL and php streams. In my opinion that’s a great introduction as it first gives people a basic understanding of what is going on and then gives them a tool to play with and make sure everything works and everything is reproducible!
Explaining status codes, headers and HTTP verbs on just a few pages isn’t easy, but it came out well, so everyone should be able to quickly follow along without needing to read an extra book on the subject.
With a sidestep into HTTP debugging going over RPC and AJAX, showing nice examples, the book is stepping towards the glory of REST!
It is prefixed with a, maybe questionable, disclaimer that REST is ‘quite academic’, which I choose to read as ‘quite hard to do right’ and then agree with. Due to the scope of the book it is understandable (even preferable) that it falls a little short explaining REST in all its glory, touching upon HATEOAS (calling it url design) but not going further.
All things considered I can’t argue much against anything in that chapter as it delivers a good overview of all current possibilities for creating web services!
sigh .. this is the only chapter I found myself disagreeing with heavily. Please don’t let that deter you in any way from buying the book. I liked every following chapter and while I’m really not happy with the choices in this chapter I’m sure they put in a lot more thought into this than I did for the review and decided that it was important to explain even the patterns that have a lot of issues associated with them.
You can skip this part of the review without missing out too much, but I didn’t want to skip writing it.
The chapter starts off nicely describing what patterns are and why they are useful but then proceeds to slap me in the face by starting with Singleton.
I’m all for explaining Singleton as an example of “that’s what the people that came before you really fucked up” but the pages read like there could be a use case for a Singleton in modern PHP. Really, there isn’t. The big disclaimer after the pattern description helps a lot, but starting off with THE ONE MOST HURTFUL PATTERN that is common in PHP really annoyed the heck out of me.
I know/assume the order of patterns was chosen deliberately to show off the evolution of concepts; I just don’t agree with the order or the conclusion it leads to.
While I could come to terms with the decision to structure it like that turning the page made me believe I’ve picked up a fake “let’s fuck with the file sharers” copy that ended up in print by accident. I’d like to say this more nicely but this really was my honest response to seeing a SingletonTrait.
The whole chapter about Traits is a short explanation “This is a sample Singleton DB Trait”. We used to joke that the first thing traits would be abused for was to implement a Singleton with it and my own attempt at it was, rightfully, disregarded as ‘trolling’ by everyone I sent it to.
Then we get into the Registry pattern, the next most hurtful pattern down the line, to continue the journey from Singleton to DI. Registry is only a little less broken than Singleton and breaks with way too many OO concepts to be endorsed. The sample even goes into late static binding, which again is just breaking with proper OO practices.
With a short side note on Factory there is the agreeable section about Iterator, in which I was only annoyed by one point that was made:
“it is not uncommon to have an object that represents both the business logic—for example, basic CRUD (create, read, update, and delete, the four fundamental database interaction functions)—and storage of a dataset”
It is not uncommon, yes, that is true. It’s a bad idea though and CRUD is not a good example for business logic. People have enough trouble not doing MVC completely wrong and that sidenote didn’t help that case at all!
Continuing with a small section about Observer, a pattern I don’t find any use cases for in PHP but that is very valid here, we continue to Dependency Injection.
I wish I could end this chapter here but I can’t. It starts off with
“Dependency injection is one of the simplest patterns”
and while it can be quite easy I always found that Inversion of Control is one of the hardest to grasp OO concepts.
The very next sentence states
“For each dependency, you specify a setter method (and it’s nice if you add a getter too!) that will accept an argument that’s able to fulfil the dependency requirement.”
and if you explain DI to me like that in an interview that is the point where I will ask one or two clarifying questions and then ask you to leave! Not having any clue about DI is fine, especially for ‘junior developers’ but that explanation is beyond good and evil. For me that is completely unacceptable and hurtful to the readers. Especially the suggestion to even include a GETTER…
DI is a massive subject with many variants and setter injection is the most useless one. The sample even combines it with the Registry just to make it completely unusable. It doesn’t show off any benefit it has for separation of concerns and overarching application architecture.
The impact of using DI on the design of individual classes is way greater than the decision to use MVC and I feel that was cut horribly short. It should have been explained in more detail or left out completely if no somewhat acceptable explanation was possible in the scope of the book. Maybe I’m being too pedantic here, but I feel that if the focus of the book here was to teach people about the existence of patterns, one sample just showing off constructor injection in combination with a factory, and not the mostly pointless setter injection, could have sufficed.
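To illustrate what I mean by constructor injection in combination with a factory, here is a minimal sketch. It is not taken from the book, and all names (Mailer, InMemoryMailer, Signup, SignupFactory) are invented for the example.

```php
<?php
interface Mailer
{
    public function send($to, $text);
}

// Stand-in implementation that just remembers what it was asked to send.
class InMemoryMailer implements Mailer
{
    public $sent = array();

    public function send($to, $text)
    {
        $this->sent[] = array($to, $text);
    }
}

class Signup
{
    private $mailer;

    // The dependency is required up front: no setter, no getter.
    public function __construct(Mailer $mailer)
    {
        $this->mailer = $mailer;
    }

    public function register($email)
    {
        $this->mailer->send($email, 'Welcome!');
    }
}

// The factory is the one place that knows how objects are wired together.
class SignupFactory
{
    private $mailer;

    public function __construct(Mailer $mailer)
    {
        $this->mailer = $mailer;
    }

    public function create()
    {
        return new Signup($this->mailer);
    }
}

$mailer = new InMemoryMailer();
$factory = new SignupFactory($mailer);
$factory->create()->register('someone@example.org');
```

A Signup object can never exist in a half-configured state, and swapping the mailer (for a test or for a different transport) only touches the wiring code.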
MVC
I’ve spent way too much time on that chapter, so I’ll keep this one short.
After ordering a couple of copies for other developers I feel that I should cut out that chapter (and the last one) or at least put a big disclaimer in it. Thankfully the rest of the book is good enough to make up for everything I disagree with here.
*Keep in mind that this is just my take on that chapter
There is no ONE TRUE WAY of writing PHP code. “What works for us” is a very good choice and please don’t take offense at the way I’ve expressed my opinion here :)
Amazing!
Everything that needed saying was said. The structure (Problem, Sample attack, Fix) is really great. While there surely can be nitpicks it tells developers everything that they need to hear before deploying their first websites. It’s a great read with nice thought out attacks and fixes.
If now we could get people to just stop using
<form action="<?php echo htmlentities($_SERVER['PHP_SELF']); ?>">
and use
<form action="?">
or the related methods ;)
Yay! No micro optimization but showing what matters!
AB and JMeter get us started continuing with APC and Session Storage options leading to xDebug and XHProf delivering a very nice chapter that I enjoyed.
The testing section starts off well, showing basic PHPUnit usage and suggesting an ok folder layout. I missed a “test suite organization through the phpunit.xml file” section, as this is something people tend to struggle with, but all in all it’s a very solid start.
It goes into “testing views and controllers”, and while I wish there wasn’t any need for that section, a lot of frameworks struggle with making their controllers testable. They are just normal classes; they should be testable like everything else… well :)
I don’t draw the same conclusions as the book, but it’s nothing so far off, impractical or hurtful that I feel the need to go into it. It’s a good read, rounded off by all the integration testing options that exist, showing tools for front end, DB and even load testing.
Showing off most of the phpqatools, I found myself agreeing with pretty much everything that was said. Especially the suggestions that having a coding standard is more important than which one, and that it’s not hard to build your own standard, are very close to my heart.
Talking about SCM with svn AND git is a very welcome addition to the book that I didn’t expect at first, and I was very happy that it got included. It makes it even more useful to pass on to trainees.
Finishing up with automated deployment, this chapter of the book touches many, many important points of a project’s life cycle.
Icing on the cake. Using PEAR and PECL is easy once you know what to do but can be a major pain if nobody ever showed you. I was very happy to find this included in the book. It goes deep enough to even show how to create PEAR packages and compile extensions by hand.
The SPL section is solid and the suggested “next steps” are great!
I can only recommend the book to everyone. Tread with care in the OO sections, but apart from that this is one of the few PHP books that should be read by everyone.
For the lack of a better title. What I’m trying to describe is that familiar feeling…
It just doesn’t look right
Without being able to pinpoint a problem, I guess every one of us programmers has opened a source file and sighed at some class or method. Maybe the problem was solved correctly, maybe it’s just complicated.. but sometimes there is that strange feeling that an underlying issue was covered up or that the issue was solved at the wrong level of abstraction or something.
But what does “looks wrong” actually tell us? Is there an issue? Do we just not understand the author’s intent? Did they violate some design rule we adhere to, and is that really an issue?
Let’s talk about those rules first.
There are a lot of ‘rules’ or guidelines people tell us to follow like:
Those rules have emerged for a reason, but blindly following them without understanding why they came to be tends to lead to more trouble than not caring at all.
Everyone can break down code into little functions, but when those are named stuff1(&$context) through stuff125(&$context) nothing was achieved. Likewise, it is extremely easy to get 100% code coverage by writing everything in one line.. but hopefully that is so obvious a hack that nobody does that.
Understanding those rules, when to apply them and why they exist helps creating “better” code.
But what to do when you can’t achieve those goals?
“Well 100% code coverage is only a goal to strive for and it’s not important anyways.”
“Well static methods just are more pragmatic here”
“But I need to test that protected method! It has all the logic!”
In the last year I’ve gotten really fluent with PHPUnit’s mocking API and I haven’t encountered an issue that I couldn’t mock around for quite some time now, but usually those 40 line mock constructs “just look darn ugly”. After finishing up a complex set of mock objects I look at the test and think “If that is what it takes to get my problem solved then maybe it’s not worth solving” or something like that.
What then tends to happen is that, upon closer inspection, I find out that I’ve crammed a lot of stuff into one class that should be split up into smaller pieces. For example file access, parsing, business rules and then persisting the results.
The original issue “my mocks look ugly” was not due to the design or capabilities of the mocking API but showed me that I needed to do A LOT of stuff to get that one test to execute. The solution was not in trying to solve the problem on the test level but to step back and see that the class was just way too powerful.
Without the test getting really, really ugly I wouldn’t have noticed that there was an issue with my piece of code, until I forced myself to go through the methods and see why there were 5 lines in those methods that my tests didn’t cover. They were just exceptions, file checks and so on, but testing them led me to discover that the class was touching way too much stuff in the system and knowing/expecting way too much about its environment.
Metrics are self evaluation tools not a set of arbitrary numbers or guidelines that exist for the sake of it.
If you struggle to achieve one of those goals don’t discard it easily saying “Well I’m just a little off here” before making sure you’ve taken a step back and looked at the problem on a broader level. Maybe from time to time you will discover a bigger, underlying, issue that would have bitten you down the road.
I’m not sure where I’m going with this, but it’s something I’ve noticed quite a lot of times and that I’m still having trouble explaining to people, so it seems appropriate to write about it.
The last time that came up for me was after seeing what Laura / @elblinkin had written on unit testing and her phpunit phpcs standard. The first thought was not “Oh look, shiny new rules to impose” but “Maybe this set of guidelines can teach me something about how my tests and my production code should be structured”, and that is how many of those guidelines should be looked at, in my humble opinion.
If you happen to share my feeling there and have a nice name or some samples, I’d be happy to hear about them.
Three weeks ago PHPUnit 3.6 was released and it has a little new feature you might have missed until now.
PHPUnit can now show you code coverage information on the command line
phpunit --coverage-text
[...]
Code Coverage Report for "BankAccount"
  YYYY-MM-DD HH:II:SS

 Summary:
  Classes: 90.91% (20/22)
  Methods: 94.74% (36/38)
  Lines:   98.38% (182/185)

@bankaccount.controller::BankAccountController
  Methods: 100.00% ( 2/ 2)   Lines: 100.00% ( 13/ 13)
@bankaccount.controller::BankAccountListController
  Methods: 100.00% ( 1/ 1)   Lines: 100.00% ( 2/ 2)
@bankaccount.framework::ControllerFactory
[...]
While practicing TDD and while working on new classes I usually have my test suite running on one of the other screens using an advanced version of “watch -n1 phpunit” so I don’t have to press a button to see the testing status.
To make sure I always stay at 100% test coverage, which is not hard when doing TDD, I was using --coverage-html and a browser window with auto refresh, but this was a rather cumbersome approach so I wanted something smaller and faster.
Another nice thing is that it got really easy to get an overview of a project’s code coverage status without needing to generate a set of HTML files first.
You should have figured out by now that this type of reporting is pretty pointless for anything except very small projects when used on the whole code base.
For anything a little bigger this can still be used in conjunction with the --filter option.
watch -n1 phpunit --filter MyNewClass --coverage-text
keeps you updated on the class you are currently working on and shows its code coverage information.
This feature is of course also documented in the PHPUnit documentation.
If people are interested in using PHPUnit this way one possible addition would be to list the methods of a class if it is the only one with coverage information as this would help a little when working with single classes but for any more advanced information the HTML report will still be the place to look.
Give it a try and let me know if you like it.
The Code Coverage reports PHPUnit can generate for you can tell you one important thing:
What parts of your code you definitely have not tested yet!
It can’t tell you what part of your code base you have tested. Everything that is green (“covered”) only shows you that that code was executed. If you were to write a small piece of code running your bootstrap and executing all your controllers in a loop, then you could probably get 80% of your code to execute, but you would have tested 0% of it.
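The difference between “executed” and “tested” fits in a few lines. This is a contrived sketch in plain PHP. The function sum() and its deliberate bug are made up for the example, and both “tests” would produce exactly the same green coverage for it.

```php
<?php
// Hypothetical function with a deliberate bug: it should add, but subtracts.
function sum($a, $b)
{
    return $a - $b;
}

// "Test" 1: runs every line of sum(), so coverage is 100%, yet it checks nothing.
function executesOnly()
{
    sum(2, 2);
    return true; // always "passes"
}

// Test 2: same coverage, but it makes a statement about the outcome.
function checksBehavior()
{
    return sum(2, 2) === 4;
}

$executionOnlyPassed = executesOnly();
$behaviorCheckPassed = checksBehavior();
```

Only the second check notices the bug, although a coverage report cannot tell the two apart.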
One of the goals of your test suite and the coverage report is to make you trust in your code base and to remove the fear of changing something that needs to be changed. If you look at your coverage report and you see that one class or one module has 100% coverage (which is doable and the only goal to strive for that makes sense.. but that’s for another post) you should trust your tests and your code to work properly. You shouldn’t think “Well yes, that’s 100%, but a lot of that just comes from that big integration test and I don’t know if the class is really tested!”.
Thankfully PHPUnit offers a way to drastically increase your confidence in what you actually have tested.
By using the @covers annotation above each test case you can tell PHPUnit which methods you are actually testing in that test case. That means you don’t generate code coverage information by accident. Additionally, with the --strict switch no coverage is generated for test cases that don’t make any assertions. Of course you can “work around” that, but as long as you don’t lie to yourself your test suite won’t either.
You can find a basic example of how the @covers annotation works in the PHPUnit documentation.
The great thing about @covers is that now you have the knowledge that your tests really go into each line of your code and most likely make assertions about the outcome. When it comes to trusting your own test suite this is a big improvement.
For public methods usually one test tests one method. Your testAddingStuffFailsWhenIInvalidProductIsPassed covers your addStuff, your removeStuffIAdded mainly covers your removeStuff, as the addStuff method should already be tested before.
While you could also “@covers” your addStuff method when testing the removeStuff method I’d say usually that isn’t needed and you should only generate coverage for the methods that correlate with your assertions. Not the methods you call to “set up” your object.
When it comes to protected methods there are two ways you can go
In my opinion the second one makes more sense. First off, you don’t have to write 10+ lines of @covers annotations when you have many small protected methods in your class, and secondly I think it goes better along with the “We don’t test protected methods because they are an implementation detail” way of thinking. I don’t want to adjust my @covers annotations every time I refactor. This can be achieved quite easily by adding @covers ClassName::<!public> to the test class’s doc block.
/**
 * @covers Calculator::<!public>
 */
class CalculatorTest extends PHPUnit_Framework_TestCase
{
    protected function setUp()
    {
        /* ... */
    }

    /**
     * @covers Calculator::add
     */
    public function testAddTwoIntegers()
    {
        /* ... */
    }

    /**
     * @covers Calculator::multiply
     */
    public function testMultiplyTwoIntegers()
    {
        /* ... */
    }
}
The @covers annotation helps you increase the quality of your test suite by making sure that your unit tests really only test one class, by only generating code coverage for that class. When you have this knowledge you can be a lot more confident that your tests really cover every class of your project, and especially that the one you are currently working on is properly tested.
While there are other ways to achieve that, for example running only one test case at a time for each of your classes and then looking at that coverage, the @covers annotation offers a nice, clean and fast way to improve your tests. Try it!
Recently I had the chance to discuss a coding standard with someone. It’s way more important to have and follow one than what it contains, but I enjoy those discussions, and we got to one point that I haven’t talked to anyone about for years: using the _underscore to denote private methods and variables.
Let’s have a short look at PHPs recent history.
PHP 5.4 is just around the corner, 5.3 is alive and kicking and our beloved “I’m still stuck with it but it was the first 5.x that really worked well” PHP 5.2 is not supported any more.
PHP 5 gave us proper OO (again) and now that we have namespaces and closures you could start calling PHP a proper language ;)
We are deep into 5.x now and it’s time to get rid of your PHP 4 legacy. The last release was 3 years ago, the number of people that still admit to using it in production dropped below the care ratio and all major frameworks and libs migrated.
… and we needed to walk 20 miles through snow to start the interpreter but that’s another story…
So, just to remind everyone: what was right one day isn’t necessarily right another day (many years later).
PHP 4
class Foo {
var $_a;
var $_b;
function myStuff() {}
function _myHelper() {}
}
Back in the day we did this to say “Don’t touch my privates bro!” because we had no other way of doing that.. well except @phpdoc and ASCII art.
It worked kinda sorta well and it did its job well enough that we didn’t mind.
Then came PHP 5
class Foo {
private $a;
private $b;
public function myStuff() {}
protected function myHelper() {}
}
and we could get rid of all the underscores.
Even with PHP 4 people didn’t use them for variables all the time, as you didn’t expect people to fiddle around in your members anyway, but now ‘they’ gave us visibility and we got a clear way of expressing what one was supposed to call and what was considered an implementation detail.
Of course there are older projects that transitioned from PHP 4, and while migrating one had more issues than fiddling with underscores, and that’s fine. I’m not suggesting to change everything this instant! There are BC reasons to keep names and so on.
Projects starting nowadays shouldn’t adopt this practice!
It doesn’t offer any benefit and should be considered legacy that once was useful but is superfluous and redundant nowadays.
Code like:
private function _myHelper() {}
just tells me twice that it’s a private method and if I have to choose I’ll gladly take the one that reads better and is commonly used throughout the language.
Modern IDEs are able to only show me the public API anyways and while working from within the class it doesn’t really matter all that much if i call another private or public.
From a refactoring standpoint it doesn’t really matter all that much but it makes promoting a function to public easier when i don’t have to change the name to do that. Although even search & replace can usually take care of that.
I just don’t see any reason to keep this practice around in PHP 5+ so my suggestion is to remove PEAR_Sniffs_NamingConventions_ValidVariableNameSniff from your phpcs.xml for any upcoming projects.
They are a pain to work with. PHP uses copy on write and $x = $y = $z = str_repeat("a", 10000); only stores 10.000 Chars not 30.000 so there is no performance gain in 99.999% of the cases.
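A quick way to see copy on write in action is to watch memory_get_usage() around the assignment. This is only a rough sketch: the exact byte counts depend on the PHP version and its allocator, so treat the numbers it prints as illustrative.

```php
<?php
// Measure how much memory three variables holding "the same" string need.
$before = memory_get_usage();
$x = $y = $z = str_repeat("a", 10000);
$shared = memory_get_usage() - $before;

// Only a write to one of the variables forces PHP to create a real copy.
$y .= "b";
$afterWrite = memory_get_usage() - $before;

echo "three variables, one string: {$shared} bytes\n";
echo "after modifying \$y: {$afterWrite} bytes\n";
```

The first number should be in the region of one 10.000 character string, not three, and only the write to $y makes it grow.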
Even though not every PHP developer knows WHY we don’t use references, pretty much every core function and every somewhat modern framework avoids them, so people adopted this best practice. The leftovers in the PHP core, like sort() or str_replace(), are exceptions to the rule.
So if the common consensus is, or at least ‘should be’, that we should not use references then maybe we should start looking for places where they hurt and how we could fix them?
I really like prepared statements. I found them to be the “default secure” way of creating queries. (Yes, I’m aware that they were not designed as a security feature but it helps so let’s just do that, ok? :) ) When using normal queries you have to remember to use escaping every single time in order to be safe; with prepared statements you just have to remember to use them at all. That’s a lot easier and way less error prone.
I’m not going into detail about the why and when not to use them for now. I’m more interested in talking about the one thing that, imho, makes it REALLY hard to sell this way of querying to PHP developers.
Just look at the API and let’s compare that to a normal MySqli Query
Samples mostly copied from the PHP manual
A normal query
$query = "SELECT Name, CountryCode FROM City WHERE id = 1";
$result = $mysqli->query($query);
while ($array = $result->fetch_array()) {
    // do stuff with
    // array("Name" => "...", "CountryCode" => "...");
}
The same thing with a prepared statement
$query = "SELECT Name, CountryCode FROM City WHERE id = ?";
$statement = $mysqli->prepare($query);
$id = 1;
$statement->bind_param('i', $id);
$statement->execute();
$statement->bind_result($name, $countryCode); // Look ma, I can create variables out of thin air!
while ($statement->fetch()) {
    // you think we are done here?
    // we just have two variables, not an associative array!
    $array = array("Name" => $name, "CountryCode" => $countryCode);
    // ^^ hard-coding the names? No.. that can't be right!
    // ------
    // There is another way!
    // ------
    $fieldnames = $statement->result_metadata()->fetch_fields();
    $fieldsAsArray = array();
    foreach ($fieldnames as $field) {
        $fieldsAsArray[] = $field->name;
    }
    $array = array_combine($fieldsAsArray, array($name, $countryCode));
    // Writing this makes me quite sick.. I hope you can stand reading it
}
Counting lines:
Normal query: 3 lines
Prepared statement: 6 lines to get to the while and another 6 lines to get a ‘proper’ array
So even without the comments the prepared statement way of doing things is A LOT longer and doesn’t look very nice. Most IDEs and PHP_CodeSniffer will also bug you about a “use of undeclared variable” in the ->bind_result line even though the code is technically correct. I’d say it just looks very unfamiliar to many PHP programmers.
But who uses that database directly anyways? Well if you don’t use an ORM you usually use some sort of DB Abstraction or write your own. While you might be able to use mysql(i)_* functions natively I’d like to make the case that:
At least I can’t. You can make a case for NOT NEEDING THAT ARRAY ANYMORE and that’s OK if none of your classes deal with those types of arrays, but PHP is (or at least was for many years) all about hash map (array) based programming, in the C code as well as in the userland PHP code. Having 1 or 2 dimensional arrays with your data can work out quite well for many use cases, and more OOP-style data structures can be YAGNI’d away if you know when you can get away with using that data structure.
For example they can be a pretty nice way of getting rows from a database. (No, stdClass doesn’t count as an OOP-Array, [] or -> access isn’t enough of a difference ;) ).
I challenge you to look into the Zend_MySqli driver and examine their statement implementation. It’s vomit inducing and it’s not even their fault. I’ve spent quite some time asking people whether there would be a better way to achieve their goals and nobody could come up with anything… well, anything except questions like “oh my god why/what on earth are you doing there $randomSwearWords”
My very simple requirements
I want a ->fetchAll($query, array $params) function that returns a 2D array with assoc arrays for each row.
That’s it! Let’s write that!
This is quite a lot of pretty ugly code so I’m skipping some steps like creating the connection itself.
Our ->fetchAll function in one big pile of code
public function fetchAll($query, $arguments) {
    $statement = $this->mysqli->prepare($query);
    // Skipped error handling for readability
    $argumentCount = count($arguments);
    if ($statement->param_count !== $argumentCount) {
        // fail
    }
    // Now we need to call 'bind_param'
    // 'bind_param' is a procedure and the only way to call a procedure
    // with a variable number of arguments is call_user_func_array
    // BUT WE NEED TO CALL IT WITH REFERENCES!
    $callArgs = array();
    foreach ($arguments as $index => $arg) {
        $callArgs[$index] = &$arguments[$index]; // :(
    }
    // Assume all parameters to be strings, works quite well apparently
    array_unshift($callArgs, str_repeat("s", count($arguments)));
    // Now bind the parameters
    call_user_func_array(array($statement, 'bind_param'), $callArgs);
    // Now we can execute the statement, finally
    $statement->execute(); // again, error handling skipped
    // Now the RESULTS!
    // The fieldnames
    $fields = $statement->result_metadata()->fetch_fields();
    $fieldnames = array();
    foreach ($fields as $field) {
        $fieldnames[] = $field->name;
    }
    // Now we need a CONTAINER where the results get fetched into!
    $resultRow = array_fill(0, count($fieldnames), null);
    // Oh and btw.. THOSE FIELDS NEED! TO BE REFERENCES!
    foreach ($resultRow as $index => &$value) {
        // ^^ Don't try this at home! foreach & references are evil!
        $resultRow[$index] = &$value;
    }
    // Break the leftover reference, otherwise the by-value foreach
    // further down would overwrite the last column on every row
    unset($value);
    call_user_func_array(array($statement, 'bind_result'), $resultRow);
    // All preparations done! Let's fetch!
    $result = array();
    while ($statement->fetch()) {
        // THIS IS WHERE IT GETS REALLY UGLY!
        // we need to dereference the result values since we don't want
        // to return references to the user
        // Doing so would break in very hard to debug ways!
        $deref = array();
        foreach ($resultRow as $value) {
            $deref[] = $value; // This is not a copy on write, this hurts!
        }
        $result[] = array_combine($fieldnames, $deref);
    }
    // You are still reading this? Thanks :)
    return $result; // Done!
}
Fetching data this way uses a lot(!) more memory than it should, and from heavy production use I’ve benchmarked that in the average case around 20% of all query execution time is spent dereferencing the return values. That of course heavily depends on the queries and the amount of data returned, but that’s time I waste pandering to an API just to make it somewhat usable!
Again:
20% of the time it takes to run $result = $db->fetchAll("...", ...); is spent moving around stuff in PHP memory for no reason whatsoever.
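To show why the dereferencing step can’t simply be dropped, here is a small stand-alone sketch without any database. The variable $bound is made up and only stands in for the statement holding on to the bound column; the assignment to it plays the role of fetch().

```php
<?php
// A one-column "row" whose element is bound by reference,
// similar to what bind_result() needs.
$resultRow = array(null);
$bound = &$resultRow[0];

$rows = array();
$derefRows = array();
foreach (array('first', 'second') as $fetched) {
    $bound = $fetched;       // stand-in for fetch(): overwrite the bound variable

    $rows[] = $resultRow;    // copies the array, but the element stays a reference

    $deref = array();
    foreach ($resultRow as $value) {
        $deref[] = $value;   // plain value, detached from the reference
    }
    $derefRows[] = $deref;
}

var_dump($rows[0][0]);       // string(6) "second": the first row got overwritten
var_dump($derefRows[0][0]);  // string(5) "first"
```

Copying an array keeps the references inside it, so every "row" stored without dereferencing ends up pointing at whatever was fetched last.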
I hope after reading this you can agree that this needs to be fixed!
The ‘easiest’ way:
$statement->fetchArray()
That would take away the need to ‘bind_result’ and while the other stuff isn’t really nice I can live with it quite well. It doesn’t cost that much performance. Being able to pass the parameters directly to ->fetchArray would be even nicer, but maybe a rather strange API.. not too sure :)
Maybe the proper way?
Give me a way to get a MySqli_Result from a prepared statement execution so I can use its fetch methods.
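For what it’s worth, PHP builds that use the mysqlnd driver (PHP 5.3 and later) provide mysqli_stmt::get_result(), which returns exactly such a result object. The following is a minimal sketch under a few assumptions: PHP 5.6+ for the "..." argument unpacking, mysqli built against mysqlnd, and all parameters treated as strings like in the fetchAll() above. The function name fetchAllAssoc is made up for illustration.

```php
<?php
/**
 * Hypothetical helper (not part of mysqli): run a prepared statement and
 * return every row as an associative array.
 * Assumes PHP 5.6+ and a mysqli extension built against mysqlnd.
 */
function fetchAllAssoc(mysqli $mysqli, $query, array $params = array())
{
    $statement = $mysqli->prepare($query);
    if ($statement === false) {
        throw new RuntimeException($mysqli->error);
    }

    if ($params) {
        // Bind everything as a string; unpacking takes care of the references.
        $statement->bind_param(str_repeat('s', count($params)), ...$params);
    }

    if (!$statement->execute()) {
        throw new RuntimeException($statement->error);
    }

    // get_result() hands back a regular mysqli_result (mysqlnd only).
    $result = $statement->get_result();
    if ($result === false) {
        throw new RuntimeException($statement->error);
    }

    $rows = $result->fetch_all(MYSQLI_ASSOC);
    $result->free();
    $statement->close();

    return $rows;
}
```

Called as fetchAllAssoc($mysqli, "SELECT Name, CountryCode FROM City WHERE id = ?", array(1)), this would return the 2D array asked for above without bind_result and without the manual dereferencing. Whether it is available depends on how your PHP was built.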
Use PDO?
PDO doesn’t do “real” prepared statements but client side escaping/expanding and that creates a whole lot more problems than I’m trying to solve. No thanks, I’d rather do the escaping myself and use sprintf before going with PDO.
Do you know of a better way of doing so?
I’d be pretty interested and I guess the Zend Framework guys would be too. So please share!
For a complete, working implementation of this sample either contact me (for my ~600 loc implementation that’s a little cleaner and has more features) or look at the Zend Framework DB package. The really ugly parts are the same :)
And on your way out take a cookie for reading the whole post!
I just hate talking about frameworks!
But it seems not many people share that feeling, so this is an attempt to write a rather short and linkable post on how I approach a new framework and by what standards I judge it.
Over the past 2 months I’ve spent a lot of time in various chats talking about PHP. In that time over 15 PHP frameworks have been introduced to me and I’m kinda done explaining why I do or (mostly) don’t like a certain framework.
I’m not going to call any names in this post so no need to grab your pitchforks. (For some reason people seem to get really upset when you tell them you don’t like the framework they use)
That’s pretty much the only question I ask myself when reviewing a framework.
If their code looks like crap I don’t really care; chances are that I don’t understand it or that I just picked “the wrong file”. If the API is a mess that’s a different topic, but how everything is implemented usually doesn’t concern me at all.
What’s really important to me is that the code I write using that framework is something I want to maintain. For me that takes care of at least 90% of the frameworks I’ve seen.
Those are my standards and your mileage may vary so let me explain…
The front page usually will tell me that the framework is: “Fast, secure, flexible, easy, small, elegant, efficient, robust, MVC, simple and that kittens will die when I use something else for my next project”
The number of “positive adjectives” usually is already a good indicator of where the ride will take me. The fewer the better, and bonus points if the framework tells me what it is aiming to do and what it is NOT good at. A clear vision is a much more valid excuse for implementing something in a way I don’t like than “because it’s better this way” ever can be.
If the front page mentions MVC (and at least 90% of the frameworks do) I usually look up the framework’s definition of MVC and compare it to their implementation and to further links explaining MVC.
Actually that part can be quite fun sometimes:
I’m aware that word is becoming more of an empty marketing term that lost all its original meaning in the php framework world and has become a shorthand for “Don’t put sql in your template” but sometimes a framework is actually aware of what it does and gives it a proper name and provides a solid explanation for its approach.
Then I’m browsing the sample code in the docs and maybe the sample projects they include. The question I’m asking myself there is “Is that code i want to have in my project” and “Does it work like I’d expect it to work by just reading the class names”
After doing so I’ll get to the main question…
Everything mentioned above can be boiled down to a very basic question:
I’d just like to be able to write unit tests for the code I WRITE. That is a rather specific statement but it has many implications about the architecture of the framework, so in reality there is more to it than testing, but I found that most of those things can be reduced to that question.
I don’t care about the framework’s own tests. It’s nice if they have them because then they might know what testable code looks like, but it’s not a problem if the framework has no tests at all. I just expect it to work; how the people creating the framework ensure that is not my business.
Every other aspect of the framework is negotiable but not being able to write tests means that I (personally) have no way of making sure that the code i write actually works and is structured in a way that i know i can adapt when requirements change.
Additionally: If I’m able to mock out every part of the framework while i write tests there is also a relatively big chance I’m also able to replace a part of the framework (like the authentication) in case i, for whatever reason, need to do so.
Testable code is not a boolean so let me draw you a little scale:
From “oh dear…” to “hell yeah!”
To write tests for the code I write using the framework I need to…
I usually spend an hour or two with every php framework i encounter before passing personal judgment and usually i don’t get very far down that list.
That might not be a problem for you!
Your values in code might differ and you might not need that for your project. I tend to focus on very long running projects that have a long maintenance period, and testing is the best way I know of to make my code future proof.
So please don’t be mad if I don’t like the framework you are using, it just might not work for me :)
Let’s look at the console output!
phpunit [...] Return code: 139 [...] Build FAILED!
sigh. Not again!
About half the “Build failed” mails I’ve gotten from Jenkins in the last two weeks were not due to me breaking the tests but just PHPUnit segfaulting. Wait! I know, “PHPUnit can’t segfault!”, only PHP itself can.
And it does, quite often. For some reason, which probably has to do with using PHP 5.2.OLD, it doesn’t survive generating the clover.xml file or the HTML report about 20% of the time it’s being run.
This probably could be solved by upgrading PHP but as long as that hasn’t happened on the production servers I don’t want to do that for CI either, and the production env. is on that old version because $randomLameExcuse.
So for now I’d like Jenkins to not send me a mail when that has happened. I don’t want failed builds because of that either. I played around with several ideas for how I could handle that and discovered that every one of them brought its share of new problems. Telling ant to ignore the segfault, for example, of course leads to Jenkins complaining about the clover.xml not being valid, and so on. So I resorted to something pretty simple.
For now I’ll just rerun PHPUnit until it gets there. The script below does, for now, a pretty good job at that.
It wraps the “phpunit” call, passing through all parameters. Should PHP segfault, PHPUnit is run again until something else happens. That PHPUnit return code is then returned by the script.
Sometimes the build takes a little longer (if it runs twice) but so far i haven’t seen it run more than those two times. Even if it should get stuck in an endless loop adding a counter to the script seems pretty trivial. Well, as is the whole problem but it generates useless mails and wastes time, so away with it!
phpunit-segfault-wrapper
#!/usr/bin/env bash
returnCode=139; # Segfault
echo "Starting PHPUnit Segfault Wrapper";
echo
while [ $returnCode -eq 139 ]
do
phpunit "$@"
returnCode=$?
done
echo
echo "Done with PHPUnit";
echo
exit $returnCode
Placing that file next to your build.xml and changing the “phpunit” call to a “phpunit-segfault-wrapper” call is all there is left to do.
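The counter mentioned earlier could look roughly like the following. It is sketched in PHP rather than bash to stay in one language, and both the function name runWithRetry and the default of five attempts are assumptions made for illustration, not part of the wrapper above.

```php
<?php
/**
 * Hypothetical helper: call $run until it returns something other than
 * $retryCode (139 = segfault), but give up after $maxAttempts tries.
 *
 * @param callable $run         Runs the command and returns its exit code
 * @param int      $maxAttempts Upper bound on the number of runs
 * @param int      $retryCode   Exit code that triggers another run
 * @return int The exit code of the last run
 */
function runWithRetry($run, $maxAttempts = 5, $retryCode = 139)
{
    $code = $retryCode;
    for ($attempt = 0; $attempt < $maxAttempts && $code === $retryCode; $attempt++) {
        $code = $run();
    }
    return $code;
}
```

A wrapper script would pass in a closure that runs phpunit, for example via passthru() with its second argument capturing the exit code, and then exit with whatever runWithRetry() returns. If PHP segfaults on every attempt the build still fails, just not forever.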
Chris Cornutt over at phpdeveloper.org pretty much nailed it:
If something was seriously broken, this could cause all sorts of problems, but in theory it’s a simple hack that gets the job done.