Sometimes it’s useful to quickly peek at which keys memcache is storing and how old they are. A good use case is checking whether something is cached, or whether keys expire as they should.
At first I found a way to dump memcache keys through telnet. However, if a memcache instance is fairly large, with a lot of slabs and thousands of keys, doing it manually becomes impractical.
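For reference, the manual telnet approach boils down to running "stats items" and then "stats cachedump <slab id> <limit>" for each slab, which returns lines of the form "ITEM <key> [<size> b; <timestamp> s]". The following is only a minimal sketch of parsing such a response. The function name parseCachedump and the sample values are made up for illustration, and what the timestamp means depends on the memcached version.

```php
<?php
// Hypothetical helper (not part of any library): turns raw "stats cachedump"
// output into an array keyed by cache key.
function parseCachedump(string $raw): array
{
    $items = array();
    foreach (preg_split('/\r?\n/', trim($raw)) as $line) {
        // Expected line shape: ITEM <key> [<size> b; <timestamp> s]
        if (preg_match('/^ITEM (\S+) \[(\d+) b; (\d+) s\]$/', $line, $m)) {
            $items[$m[1]] = array('size' => (int) $m[2], 'time' => (int) $m[3]);
        }
    }
    return $items;
}

// Sample response as it might appear in a telnet session (made-up values).
$sample = "ITEM 1_uk_xml [3178 b; 1290000000 s]\r\n"
        . "ITEM 2_uk_xml [3178 b; 1290000003 s]\r\n"
        . "END\r\n";
print_r(parseCachedump($sample));
```

Doing this by hand for every slab is exactly the tedious part, which is why a script is more practical.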
I wrote a simple utility that helps me find keys across all memcache slabs.
#!/usr/bin/php
<?php
// Defaults; each can be overridden from the command line.
$host = "127.0.0.1";
$port = 11211;
$lookupKey = "";
$limit = 10000;
$time = time();

// Parse "-flag value" pairs; a flag without a following value is ignored.
foreach ($argv as $key => $arg) {
    if (!isset($argv[$key + 1])) {
        continue;
    }
    switch ($arg) {
        case '-h':
            $host = $argv[$key + 1];
            break;
        case '-p':
            $port = (int) $argv[$key + 1];
            break;
        case '-s':
            $lookupKey = $argv[$key + 1];
            break;
        case '-l':
            $limit = (int) $argv[$key + 1];
            break;
    }
}

$memcache = memcache_connect($host, $port);
if (!$memcache) {
    fwrite(STDERR, "Could not connect to {$host}:{$port}\n");
    exit(1);
}

$list = array();
$allSlabs = $memcache->getExtendedStats('slabs');

// Walk every slab on every server and collect the keys it holds.
foreach ($allSlabs as $server => $slabs) {
    foreach ($slabs as $slabId => $slabMeta) {
        // Slab ids are numeric; skip summary entries.
        if (!is_numeric($slabId)) {
            continue;
        }
        $cdump = $memcache->getExtendedStats('cachedump', (int) $slabId, $limit);
        foreach ($cdump as $dumpServer => $entries) {
            if (!$entries) {
                continue;
            }
            foreach ($entries as $eName => $eData) {
                $list[$eName] = array(
                    'key' => $eName,
                    'slabId' => $slabId,
                    'size' => $eData[0],
                    'age' => $eData[1]
                );
            }
        }
    }
}
ksort($list);

if ($lookupKey !== "") {
    echo "Searching for keys that contain: '{$lookupKey}'\n";
} else {
    echo "Printing out all keys\n";
}
foreach ($list as $row) {
    // With no search string every key is printed.
    if ($lookupKey === "" || strpos($row['key'], $lookupKey) !== FALSE) {
        echo "Key: {$row['key']}, size: {$row['size']}b, age: ", ($time - $row['age']), "s, slab id: {$row['slabId']}\n";
    }
}
This script accepts 4 parameters:
-h host
-p port
-s partial search string
-l a limit of how many keys to dump from a single slab (default 10,000)
The easiest way to use it:
./membrowser.php -s uk
Searching for keys that contain: 'uk'
Key: 1_uk_xml, size: 3178b, age: 1728s, slab id: 17
Key: 2_uk_xml, size: 3178b, age: 1725s, slab id: 17
Key: 3_uk_xml, size: 3178b, age: 1721s, slab id: 17
Download memcache keys dump script.
P.S. Some of the code I’ve copied from a 100days.de blog post.
Optimizing MySQL on Ubuntu 10.10 Maverick
5 Comments | Posted by Žilvinas Šaltys in Linux, MySQL, Optimization
Since Ubuntu 9.10 Karmic Koala, Ubuntu has shipped with EXT4 as the default file system (9.04 Jaunty Jackalope already offered it as an install option). Surprisingly, it makes MySQL writes extremely slow. This post is targeted at developers who work on Linux using MySQL and who would like to optimize MySQL performance.
Disk Performance Tuning
Start by tuning your disk performance. To do that you’ll have to sacrifice data consistency for write speed. First enable journal_data_writeback on your partition. This allows data to be written to disk before the EXT4 journal is updated. If your box crashes before the journal is updated you might lose new data, or some deleted data might reappear.
sudo tune2fs -o journal_data_writeback /dev/sda1 (use the right partition)
Next step is editing your /etc/fstab to change ext4 mounting options. My fstab file looks something like this:
UUID=irrelevant / ext4 errors=remount-ro,noatime,nodiratime,data=writeback,barrier=0,nobh,commit=100,nouser_xattr 0 1
There are a few non-default options added to favour write performance over consistency. Journal data writeback is enabled by data=writeback. The main thing slowing down MySQL is write barriers, which barrier=0 disables. You could actually change this single option and MySQL write performance would increase dramatically. Disabling barriers makes your new data less safe when a system crash happens. Option nobh tries to avoid associating buffer heads and offers a minor performance improvement. Another option, commit=100, says that your updates are synced to disk every 100 seconds. The default is 5 seconds. If your machine crashes you’re likely to lose up to 100 seconds of updates. Large commit values like 100 provide big performance improvements. And the last option, nouser_xattr, disables user extended attributes on your filesystem and provides a minor performance boost.
Double check your /etc/fstab syntax and reboot.
Tuning MySQL configuration
MySQL configuration settings depend on what database engines you’re using. The most common ones are MyISAM and InnoDB. I will assume that you use both.
Warning! Some of the configuration changes will or might make your database inaccessible. Therefore backup all your databases by dumping them to SQL to a safe location. Make sure to include triggers and stored procedures. Double check that you will be able to reimport your backups and only then proceed further. Some options will make your InnoDB database stop working. I’ll mark those. Also backup your MySQL configuration. Just in case.
MySQL settings depend on how much memory you have. I will assume a normal working station will have 4GB of RAM. Open your MySQL configuration file which on Ubuntu is located at /etc/mysql/my.cnf and set the following options.
transaction-isolation = READ-COMMITTED
As a developer you will probably not have transactions running in parallel. If you don’t care about transactions and still use InnoDB, set the isolation level to READ-COMMITTED. Your transactions will then only see committed data, but phantom rows are not prevented. Setting it to READ-COMMITTED will also improve performance.
key_buffer = 512M
By far the most important option for MyISAM. MyISAM indexes are cached in the key buffer. It’s usually a good bet to set it to between 25% and 40% of available memory. As a developer you might not need that much, but do not leave it at the default.
query_cache_size = 256M
Caches query results. Especially useful if your applications don’t have caching.
innodb_buffer_pool_size = 1024M (requires a backup and an import)
InnoDB buffer pool size is the most important option for InnoDB. If your whole database is InnoDB you can try and fit your whole database in memory. If you don’t have that much memory you can generally set 70% – 80% of memory available. On a development box you will probably want to have extra RAM for things like Gnome or your IDE.
innodb_additional_mem_pool_size = 32M
innodb_log_buffer_size = 4M
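innodb_log_file_size = 128M (requires a clean shutdown and removing the old ib_logfile* files before restarting, otherwise InnoDB will not start)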
innodb_flush_log_at_trx_commit = 2
Setting this option to 2 tells InnoDB to write the log at every commit but flush it to disk only about once per second. On development machines you can go further and set it to 0, which also writes the log only about once per second, because the only risk is losing the last second or so of transactions during a crash. If your development machine crashes you probably won’t care about lost transactions. Experiment!
innodb_flush_method = O_DIRECT
This option tells InnoDB to skip the filesystem cache and write straight to disk, since InnoDB already has its own cache: the buffer pool. You save yourself some RAM.
table_cache = 1024
Caches open tables. Might not be very useful on a single dev box but useful in general on any database server.
myisam_use_mmap = 1
Mmap is a new MyISAM feature available with MySQL 5.1. Should improve MyISAM write/read performance ~6%.
To sum up all the settings on a 4GB work environment:
transaction-isolation = READ-COMMITTED
key_buffer = 512M
query_cache_size = 256M
innodb_buffer_pool_size = 1024M
innodb_additional_mem_pool_size = 32M
innodb_log_buffer_size = 4M
innodb_log_file_size = 128M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
table_cache = 1024
myisam_use_mmap = 1
Buy an SSD disk
This is by far the best upgrade you can do. An SSD has no moving mechanical parts, so a random read or write is about as fast as a sequential one. My work laptop, a Lenovo T400, can push 3.5 MB/s with random writes, 35 MB/s with sequential writes, 2.6 MB/s with random reads and 38 MB/s with sequential reads. The same test with an SSD can push 220 MB/s random writes and 330 MB/s random reads, with similar numbers for sequential reads and writes. So for IO access you can expect a 10 to 100 times performance difference.
Summary
It’s easy to squeeze some extra performance out of your development environment by sacrificing data safety. In my case these changes made our database integration test suites run a lot quicker. So far I haven’t experienced any downsides from the above settings, though you have to accept that one day you most likely will. Most of the database settings I’ve mentioned are the ones most often considered when tuning production database servers. My final advice is to take everything you read here with a pinch of salt, as I am by far not an expert in these matters and everything listed here is gathered from various resources online.
Resources
InnoDB performance optimization basics
Tuning MySQL server after installation
MyISAM MMAP feature
MySQL transaction isolation levels
Why you should ignore key cache hit ratio
Tweaks to boost EXT4 performance
SSD Benchmarks
In the past I wrote about what CQRS is and now I am adding a list of available CQRS resources known to me. If you come by any other CQRS resources online please post a comment with your link. Thank you.
Video Presentations / Interviews
Greg Young on Unshackle Your Domain
Udi Dahan on CQRS, DDD, NServiceBus
Udi Dahan on CQRS and Domain Models
Greg Young on Architectural Innovation, Eventing and Event Sourcing
Greg Young on CQRS and Event Sourcing: The Business Perspective
Udi Dahan on CQRS
Udi Dahan on CQRS, Race Conditions, Sagas
Articles / Blogs / Blog Posts
CQRS information website
Greg Young’s Blog – a lot of posts on CQRS and related topics.
Think Before Coding – blog posts on CQRS and related topics
CQRS isn’t the answer by Udi Dahan.
Clarified CQRS by Udi Dahan
CQRS a la Greg Young by Mark Nijhof
Brownfield CQRS by Richard Dingwall.
Transitioning from DDD lite by Julien Letrouit
Why I Love CQRS
CQRS on Cloud by Rinat Abdullin
Frameworks, Code Examples
C# CQRS Example by Mark Nijhof
C# CQRS Framework
JAVA Axon Framework
Lokad CQRS Framework
NCQRS Framework
Kitchen Example
Other
CQRS mailing list
DDD Mailing List – Usually lots of conversations on CQRS
CQRS is a software architecture pattern which stands for Command Query Responsibility Segregation. The author of the pattern name CQRS is Greg Young who first described it in his blog:
I am going to throw out a quick pattern name for this and call it Command and Query Responsibility Segregation or CQRS as we are in fact simply taking what would usually be one object and splitting its responsibilities into two objects.
At the time of writing CQRS does not have an official definition. It’s difficult to define CQRS in a way that is both simple and useful. To describe CQRS at an object level I’ve come up with a definition which is just a reworded sentence from Greg Young’s blog post:
Command Query Responsibility Segregation or CQRS is the creation of two objects where there was previously one. The separation occurs based upon whether the methods are a command or a query.
CQRS can also be defined at a higher level. Greg Young was kind to provide a definition:
Command Query Responsibility Segregation or CQRS is the recognition that there are differing architectural properties when looking at the paths for reads and writes of a system. CQRS allows the specialization of the paths to better provide an optimal solution.
CQRS pattern is similar to CQS by Meyer but is also different. CQS separates command methods that change state from query methods that read state. CQRS goes further and separates the command methods that change state and query methods that read into two different objects.
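The difference can be sketched in a few lines of PHP. This is only an illustration under assumed names (CustomerService, CustomerCommands, CustomerQueries and the shared ArrayObject storage are all hypothetical), not a reference implementation of either pattern.

```php
<?php
// CQS: a single object whose methods are each either a command or a query.
class CustomerService
{
    private $names = array();

    // Command: changes state, returns nothing.
    public function rename(int $id, string $name): void
    {
        $this->names[$id] = $name;
    }

    // Query: reads state, changes nothing.
    public function nameOf(int $id): ?string
    {
        return isset($this->names[$id]) ? $this->names[$id] : null;
    }
}

// CQRS: the same two responsibilities split into two objects.
class CustomerCommands
{
    private $storage;

    public function __construct(ArrayObject $storage)
    {
        $this->storage = $storage;
    }

    public function rename(int $id, string $name): void
    {
        $this->storage[$id] = $name;
    }
}

class CustomerQueries
{
    private $storage;

    public function __construct(ArrayObject $storage)
    {
        $this->storage = $storage;
    }

    public function nameOf(int $id): ?string
    {
        return $this->storage->offsetExists($id) ? $this->storage[$id] : null;
    }
}

// Both sides share one storage here for simplicity.
$storage = new ArrayObject();
$commands = new CustomerCommands($storage);
$queries = new CustomerQueries($storage);

$commands->rename(1, 'Alice');
echo $queries->nameOf(1), "\n";
```

In this sketch both objects still talk to the same storage. Once they are separate objects, nothing stops each side from later getting its own storage tuned for its own job.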
Benefits of CQRS
- The simplest benefit of CQRS is that it simplifies read and write models by separating them. The write model no longer contains queries and developers can focus directly on domain model behaviours. What otherwise could have been a repository with hundreds of different read methods mixed with different lazy loading, pre-fetch and paging strategies can now be hidden away in a separate read model.
- Another reason is Divergent Change. Divergent change occurs when one class is commonly changed in different ways for different reasons. You might be modifying queries more often than commands, which might break not only your read queries but your commands as well. By having them separated you minimise the risk of both being broken.
- The single most important benefit of CQRS is that by separating read and write models you can make different choices on different models. For example you may optimize your write model for write performance and your read model for read performance.
- Another nice feature of CQRS is the option to easily distribute work across separate teams. For example the read part of a web e-shop application can be outsourced to less expensive developers offshore.
- Event sourcing is a different pattern which shares a strong symbiotic relation with CQRS. Once your system reaches an architectural level where you may need multiple data models, it probably will introduce synchronization issues. When the models disagree it is then impossible to say which one is incorrect. In an event centric system, where commands are translated into events by the domain model, these events can be used as the primary data model. This not only solves data synchronization issues, but also significantly improves testing by making it possible to test for “what didn’t happen”, and opens easy doors for integration with other systems, since other systems can now listen to the events published by the domain model.
- Eventual Consistency. In very simple terms, eventual consistency can be thought of as a form of caching. In event centric systems it is possible to delay the handling of published domain model events and handle them in a different thread or process. This makes the write and read data models temporarily inconsistent, but it might significantly improve the performance of your commands.
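The event centric idea from the last two points can be sketched roughly as follows. All names here (ProductRenamed, EventStore, ProductNameView) are invented for the example, and events are delivered synchronously to keep it short. Delaying that delivery to another process is what would make the read model eventually consistent.

```php
<?php
// Hypothetical event: a plain record of something that happened.
class ProductRenamed
{
    public $productId;
    public $newName;

    public function __construct(int $productId, string $newName)
    {
        $this->productId = $productId;
        $this->newName = $newName;
    }
}

// Append-only event log acting as the primary data model.
class EventStore
{
    private $events = array();
    private $listeners = array();

    public function subscribe(callable $listener): void
    {
        $this->listeners[] = $listener;
    }

    public function append($event): void
    {
        $this->events[] = $event;
        // Delivered immediately here; a real system could queue this instead.
        foreach ($this->listeners as $listener) {
            $listener($event);
        }
    }

    public function all(): array
    {
        return $this->events;
    }
}

// Read model built purely from events, so it can always be regenerated.
class ProductNameView
{
    private $names = array();

    public function apply(ProductRenamed $event): void
    {
        $this->names[$event->productId] = $event->newName;
    }

    public function nameOf(int $productId): ?string
    {
        return isset($this->names[$productId]) ? $this->names[$productId] : null;
    }
}

$store = new EventStore();
$view = new ProductNameView();
$store->subscribe(array($view, 'apply'));

$store->append(new ProductRenamed(7, 'Kettle'));
$store->append(new ProductRenamed(7, 'Electric kettle'));
echo $view->nameOf(7), "\n";

// If a read model is ever suspected to be wrong, replay the log into a fresh one.
$rebuilt = new ProductNameView();
foreach ($store->all() as $event) {
    $rebuilt->apply($event);
}
```

Because the event log is the source of truth, a disagreement between read models is settled by replaying the events rather than guessing which model is right.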
In Conclusion
CQRS is a very interesting pattern. By some it may even be considered to be the silver bullet. It isn’t. Like all patterns CQRS has tradeoffs. It may be difficult to sell CQRS to management since it’s not a well known classic approach to software architecture. The tools and technologies around it are also less well known. As an example, in the PHP world there are currently no mature service buses comparable to NServiceBus in the .NET world. Migrating legacy apps to CQRS is almost impossible, or more often than not simply not worth the return on investment.
I’ve finished reading Expert Python Programming written by Tarek Ziade. This book is written for Python developers who wish to go further in mastering Python. Expert Python Programming covers a range of topics such as generators, meta programming, naming standards, packaging, continuous integration, writing documentation, test driven development, optimizations and design patterns. Even non Python developers will find this book useful since it covers best practices which are well suited to other programming languages.
There’s a sample chapter available which covers the topic of documentation. We all know how frustrating it is to write documentation. It’s boring, often it feels pointless and it tends to get out of date. The 7 rules of technical writing presented in the book changed my mind. It’s actually one of my personal favourite chapters in the book.
The first chapter of the book is very friendly and covers installation of many Python flavours, packaging tools such as EasyInstall and setuptools, prompt customization and choices of editors.
While the first chapter is very easy going, the second chapter dives deep into the syntactic intricacies of Python with its iterators, generators, decorators and context providers. If the second chapter doesn’t make your head spin then the third one, on class level Python best practices, certainly will. The author does a great job of explaining the pitfalls of multiple inheritance, inconsistent super usage, Python’s method resolution order and finally meta programming, which allows changing classes’ and objects’ definitions on the fly.
The rest of the book is a lot less confusing but nonetheless rewarding. Chapter four gives some very good advice on naming standards, building APIs and tools that might help along the way. Chapter five explains how to create Python packages, distribute and deploy them.
What I really like in every book is examples. One example can explain more than a thousand words could. The examples in the second and third chapters are very valuable and help greatly to understand the concepts explained. The book goes even further and provides a complete example of a small application called Atomisator. This example is implemented following the best practices of previous five chapters.
Chapters eight and nine, which explain distributed version control systems such as Mercurial, continuous integration and managing software in an iterative way, will be very interesting to team leads.
Another very important topic, Test Driven Development or TDD, is presented in chapter eleven. I cannot emphasize enough how valuable test driven development is, though even today it’s neither a widely adopted practice nor a well understood one. This book will try to convince you why you should be doing TDD, and if you’re already convinced it will present you with tools that you can use to do TDD. I was very interested to find out about the available unit testing framework alternatives. Further on, an interesting idea on doc testing is described which, while it seems a little exotic, may be a very efficient way to keep your documentation up to date.
Reading further there’s a great chapter on optimization which describes general principles of optimization and various profiling techniques. Measuring performance may prove difficult on different hardware such as local development machines and stage servers. I was very intrigued to find out about pystones and the general concept behind it which helps to deal with the problem described.
Together with optimization techniques, various profiler tools which you never knew of, the book describes some generic optimization solutions available. Some are well known such as the Big-O notation, some are less known such as Cyclomatic Complexity. I think this book explains the concepts behind multi threading, multi processing and caching very well. Making an informed decision whether to use threads or multi processes for your Python application may as well mean if it’s going to be successful or not.
And finally the last chapter talks of design patterns. While it’s not the most mind blowing chapter of the book it provides some very interesting details why Python doesn’t have interfaces or how certain GoF patterns can be implemented in a Python specific way.
Conclusion
Should you read this book? My answer is yes. Especially if Python earns your bread and butter. Not only will you learn the syntactic intricacies of Python, the book will also introduce you to many must know concepts of software development. Even if you’re not a day to day Python developer but you do write an occasional Python script or application, by all means read the book, or at least the first six chapters. I will go even further and recommend this book to non Python developers, simply because it explains concepts that every developer should understand. And as an extra it is always interesting to learn new ideas and to see how things can be done differently.
DDD Resources / Papers / Presentations
2 Comments | Posted by Žilvinas Šaltys in DDD, Model, Patterns
Recently I wrote about what Domain Driven Design is which only scratches the surface of DDD. I’ve decided to put a list of DDD resources available. If you come by any other DDD resources online please post a comment with your link.
Books / Papers
- Domain-Driven Design: Tackling Complexity in the Heart of Software
- DDD Pattern Summaries (Free)
- Domain Driven Design Quickly (Free E-book)
- Domain Driven Design Step by Step (Free E-book)
Presentations / Videos / Interviews
- Vaughn Vernon on RESTful SOA or Domain-Driven Design – A compromise?
- Greg Young on 5 Reasons Why DDD Projects Fail
- Eric Evans on the State of DDD
- Greg Young on State Transitions in Domain-Driven Design
- Eric Evans on Domain Driven Design
- Eric Evans on What he’s learned about DDD since the book
- Eric Evans on DDD Emerging Themes
- Eric Evans on Folding together DDD & Agile
- Eric Evans on Strategic Design
- Eric Evans on Putting The Model to Work
- Jimmy Nilsson on Is Domain-Driven Design More than Entities and Repositories?
- Dan North on BDD & DDD
Websites and Blogs
What is DDD or Domain Driven Design?
1 Comment | Posted by Žilvinas Šaltys in Model, Patterns, Uncategorized
Domain Driven Design can be described as a philosophy based on domain modelling. More accurately it may be described as a very large body of patterns and a pattern language in its own right. The term Domain Driven Design or DDD was coined by Eric Evans, the author of the book Domain-Driven Design: Tackling Complexity in the Heart of Software, also known in the DDD community as the “blue book”.
Understanding the DDD philosophy
The Domain Driven Design philosophy states:
- Most software projects should focus on business domain
- Complex domain designs should be based on a model
To understand the meaning of these statements one has to understand the meaning of domain and model.
Domain is a sphere of knowledge, influence or activity. The subject area to which the user applies a program is the domain of the software. In other terms if you work for a bank then banking is your domain.
Model is a system of abstractions that describes selected aspects of a domain and can be used to solve problems related to that domain. For example a map is a model designed to solve a specific problem. A treasure map shows how to find a treasure, a political map shows the borders of countries. A model is a simplification. It is an interpretation of reality that focuses on the problem at hand and ignores the extraneous detail.
Models are designed to be useful for solving domain specific problems. For example in the past the universe was viewed in a geocentric way, where the universe revolves around the Earth. The heliocentric model is another astronomical model, in which the Earth and planets revolve around a stationary Sun at the centre of the universe. Even though the geocentric model is not realistic, it is a valid model in its own right, designed to solve a problem: the human desire to be in the centre of everything. It’s not a useful model when it’s used to compare planet movements.
Domain Driven Design advocates designing software systems to reflect the domain model in a very literal way, so that the mapping is obvious, also revising the model continuously and modifying it to be implemented more naturally in software. To tie the implementation to a model well requires tools and languages that support a modelling paradigm, such as object-oriented programming.
A well mapped implementation of a model usually expresses an object model that incorporates both behaviour and data. A decomposed domain model consists of common building blocks: entities, aggregates, value objects, services and factories.
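To make two of these building blocks concrete, here is a small sketch using the banking domain mentioned earlier. The Money value object and Account entity are illustrative names only, and amounts are assumed to be stored in minor units such as pence.

```php
<?php
// Value object: defined only by its values and never modified after creation.
class Money
{
    private $amount;   // minor units, e.g. pence (an assumption of this sketch)
    private $currency;

    public function __construct(int $amount, string $currency)
    {
        $this->amount = $amount;
        $this->currency = $currency;
    }

    public function add(Money $other): Money
    {
        if ($other->currency !== $this->currency) {
            throw new InvalidArgumentException('Currency mismatch');
        }
        // Returns a new object instead of changing this one.
        return new Money($this->amount + $other->amount, $this->currency);
    }

    public function equals(Money $other): bool
    {
        return $this->amount === $other->amount
            && $this->currency === $other->currency;
    }
}

// Entity: has an identity that stays the same while its state changes.
class Account
{
    private $id;
    private $balance;

    public function __construct(string $id, Money $openingBalance)
    {
        $this->id = $id;
        $this->balance = $openingBalance;
    }

    // Behaviour lives next to the data it changes.
    public function deposit(Money $amount): void
    {
        $this->balance = $this->balance->add($amount);
    }

    public function id(): string
    {
        return $this->id;
    }

    public function balance(): Money
    {
        return $this->balance;
    }
}

$account = new Account('ACC-1', new Money(100, 'GBP'));
$account->deposit(new Money(50, 'GBP'));
var_dump($account->balance()->equals(new Money(150, 'GBP')));
```

Two Money objects with the same amount and currency are interchangeable, while two accounts with the same balance are still different accounts. That distinction is what separates a value object from an entity.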
Essential Principles of DDD
The greatest value of a domain model is that it provides a ubiquitous language that ties domain experts and technologists together. Ubiquitous language is a language structured around the domain model and used by all team members to connect all activities of the team with the software. It’s a shared, versatile language between team members and domain experts. A well designed model speaks to the developers through the ubiquitous language. It’s important to understand that a change in the model is a change in the language and vice versa.
When multiple models are in play on a large project it’s beneficial to define bounded contexts where these models apply. A bounded context is a linguistic boundary marking the applicability of distinct models. Usually a subsystem or work owned by another team. For example in a typical e-shop web application a sales reporting application could be defined as a separate bounded context.
Every domain consists of subdomains. For example a very common subdomain is billing. Such a subdomain is usually not the driving part of the domain and therefore not as important. It is a harsh reality that not all parts of the design are going to be equally refined, therefore priorities must be set. DDD suggests distilling the core domain by distinguishing it from other generic subdomains and applying the top talent to work on it.
Conclusion
DDD helps projects to develop a strong internal language, define clear context boundaries, and focus on the core domain. Domain Driven Design brings structure and cohesion into domain modelling, which are much appreciated features in any software project. The blue book was released six years ago and has influenced many developers since. Yet I feel it hasn’t reached its full momentum. One can only hope it will reach widespread adoption.
Update: I’ve added a list of available DDD resources such as papers and video presentations.
PHPUnit email integration testing using Sendmail
5 Comments | Posted by Žilvinas Šaltys in PHP, Testing
One of the problems when doing functional or integration testing is testing that emails are being sent out with a correct header and body. One such scenario could be a controller action which sends a password reset confirmation email and redirects to another action.
A common way to solve such a problem is to configure the local MTA to store the test emails on the file system. The following shows how this could be done using sendmail. First create a sendmail alias by editing the file located at /etc/mail/aliases and adding a line below the other aliases:
test-mail: "| cat > /tmp/test-mail"
This tells sendmail that all incoming emails to test-mail will be written (not appended) to /tmp/test-mail. Sendmail needs to be restarted for the changes to take effect.
sudo /etc/init.d/sendmail restart
Depending on the situation it may be necessary to add the user who is going to be reading emails (for example apache) to the mail group.
sudo /usr/sbin/usermod -a -G mail apache
Now using PHP it should be possible to do this:
$ok = mail('test-mail', 'Hello world!', 'I am an email.');
var_dump($ok);
echo file_get_contents('/tmp/test-mail');
Further PHPUnit could be extended to add the following method to the base test case class:
public function assertEmail($attributes, $emailFilePath,
    $message = '', $delta = 0, $maxDepth = 10,
    $canonicalizeEol = FALSE, $ignoreCase = FALSE)
{
    $mailParser = new Company_Product_MailParser;
    $mailData = $mailParser->parseFile($emailFilePath);
    foreach ($attributes as $attribute => $value) {
        // Build the constraint from the expected value and check the parsed value against it.
        $constraint = new PHPUnit_Framework_Constraint_IsEqual(
            $value, $delta, $maxDepth, $canonicalizeEol, $ignoreCase
        );
        $this->assertThat($mailData[$attribute], $constraint, $message);
    }
    // Remove the file so the next test starts clean.
    if (is_file($emailFilePath) && is_writable($emailFilePath)) {
        unlink($emailFilePath);
    }
}
The mail parser class name explains itself:
class Company_Product_MailParser
{
    public function parseFile($mailFilePath)
    {
        $emailBody = file_get_contents($mailFilePath);
        $attributes = array(
            'to' => '',
            'from' => '',
            'date' => '',
            'subject' => '',
            'body' => ''
        );
        foreach (array_keys($attributes) as $attribute) {
            if ($attribute == 'body') {
                // The body starts after the first blank line.
                if (preg_match("/\n\n(.*)/", $emailBody, $matches, PREG_OFFSET_CAPTURE)) {
                    $offset = $matches[1][1];
                    $attributes[$attribute] = quoted_printable_decode(substr($emailBody, $offset));
                }
            } else {
                // Anchor at line start so "To:" does not match inside "Reply-To:".
                if (preg_match("/^" . ucfirst($attribute) . ": (.*)\n/m", $emailBody, $matches)) {
                    $attributes[$attribute] = $matches[1];
                }
            }
        }
        return $attributes;
    }
}
Important notice. Sendmail may not deliver the email immediately, and it may take a few seconds for the file to appear, so you may need to sleep for a few seconds before reading it. If you find a way to make sendmail deliver an email immediately please let me know.
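One way to avoid a fixed sleep is to poll for the file with a timeout. The helper below is a hypothetical sketch (waitForFile is not a PHPUnit or PHP built-in). It only checks that the file exists and is non-empty, so it cannot tell whether sendmail has finished writing it.

```php
<?php
// Hypothetical helper: waits until $path exists and is non-empty,
// or gives up once the timeout expires.
function waitForFile(string $path, float $timeoutSeconds = 5.0, int $pollMicroseconds = 100000): bool
{
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        // PHP caches stat results, so clear them before every check.
        clearstatcache(true, $path);
        if (is_file($path) && filesize($path) > 0) {
            return true;
        }
        usleep($pollMicroseconds);
    } while (microtime(true) < $deadline);

    return false;
}

// Demonstration with a temporary file standing in for /tmp/test-mail.
$path = tempnam(sys_get_temp_dir(), 'mail');
file_put_contents($path, "Subject: Hello world!\n\nI am an email.");
var_dump(waitForFile($path, 1.0));
unlink($path);
```

In a test this could be called just before the email assertion, failing the test when it returns false instead of reading a file that is not there yet.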
Skinny Controllers and Fat Models
3 Comments | Posted by Žilvinas Šaltys in Domain Model, Model, Patterns
Most modern web application frameworks follow the MVC design pattern. It’s probably one of the most misunderstood design patterns in existence. There are a lot of discussions about what kind of responsibilities each letter holds. The most common misinterpretation in MVC concerns the letter M.
The Model should be understood as a domain model, meaning a collection of domain objects. Usually an application has one model, and that is the domain model. Models are often mistakenly referred to as singular domain entities, for example an Order, a User or an Account. This leads unwary developers to common application design problems.
It’s common to see a web application with a directory named “models” with class files inside it. Upon closer inspection one can often find that those classes are the nouns of the application. For example those nouns could be a User, an Order or a Product. In this scenario the MVC Model stands for singular application entities.
Problems start to surface when an application developer has to create reports, do input validation or implement an ACL. These kinds of problems don’t naturally fit into entities. For example getting a report of the top 10 products doesn’t naturally fit into any entity. Validating a complex search filter made out of multiple input fields also doesn’t fit into any of the entities.
It’s common to see developers adding logic that doesn’t fit anywhere naturally to controller classes or to somewhat related entities. For example adding a top 10 products report to an Order entity class as a static method call. Or validating complicated search filters inside controller actions.
In time this steadily leads to bloated controller and entity classes that later on become fat spaghetti dishes: controller classes containing thousands of lines of code with more private methods than public ones, entity classes with few state changing methods, and SQL report queries hundreds of lines long with joins to 10 tables.
To prevent this from happening it is crucial to understand what controller and model stand for. A controller’s responsibility is only to receive input and initiate a response for the view by making calls on model objects. This means that controllers should be very light and skinny, usually doing nothing more than instantiating classes, getting data from the domain objects and passing it to the view. The Model is not a singular entity and can consist of an entire web of interconnected domain objects. A fat model means having as many domain objects as needed, be it reports, validators, filters, entities, strategies and so on.
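As a rough illustration, the top 10 products report from the example above could live in its own domain object, leaving the controller with almost nothing to do. The class names and the in-memory sales figures are assumptions made for this sketch. A real report would read from the database.

```php
<?php
// Domain object that is not an entity: a report.
class TopProductsReport
{
    private $unitsSoldByProduct; // product name => units sold

    public function __construct(array $unitsSoldByProduct)
    {
        $this->unitsSoldByProduct = $unitsSoldByProduct;
    }

    // Returns the best selling products, highest first.
    public function top(int $limit): array
    {
        $sorted = $this->unitsSoldByProduct;
        arsort($sorted);
        return array_slice($sorted, 0, $limit, true);
    }
}

// Skinny controller: asks the model for data and hands it to the view.
class ReportController
{
    private $report;

    public function __construct(TopProductsReport $report)
    {
        $this->report = $report;
    }

    public function topProductsAction(): array
    {
        // The returned array stands in for the variables passed to a view.
        return array('products' => $this->report->top(10));
    }
}

$controller = new ReportController(
    new TopProductsReport(array('kettle' => 12, 'toaster' => 30, 'mug' => 7))
);
print_r($controller->topProductsAction());
```

The report can be tested without a controller or a request, and the Order entity stays free of reporting code.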
There’s a lot of confusion over pair programming. It’s been widely known for a long time and there are a lot of famous companies such as ThoughtWorks actively using it, but on the other side there are still a lot of people who don’t know exactly what pair programming is, how it works, and what its benefits and downsides are. The best resource on the matter that I’ve read so far is Stuart Wray’s paper for the January 2010 edition of IEEE Software Magazine entitled “How Pair Programming Really Works”. I really enjoyed reading this article because of its scientific approach to the problem.
The main benefits of pair programming are these:
- Communication. While developers explain software problems to each other they often suddenly experience enlightenment and find the solution they were looking for.
- Noticing details. Experiments prove that focused people can miss an elephant in the room. Pair programming partners are usually very helpful to notice various details. For example noticing typos in the code.
- Following code standards. Developers tend to follow best practices more when they work in pairs.
- Expertise judgement. Working with another person in pair is one of the best ways to judge expertise and productivity.
The downside of pair programming is that developers get burnt out. On one hand it forces developers to keep working instead of reading blogs and emails, but after a while developers might get mentally tired and become counter productive. It’s important to allow developers to have some “slack time” if they need to and do some work solo.
ThoughtWorks made a great presentation on how they use pair programming on one of their projects. I highly recommend watching it.
