<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>2Paths &#187; database</title>
	<atom:link href="http://www.2paths.com/tag/database/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.2paths.com</link>
	<description>Custom Software Technical Architecture, Design and Development in Vancouver, BC, Canada</description>
	<lastBuildDate>Mon, 27 Sep 2010 01:15:46 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>memcached and Grails</title>
		<link>http://www.2paths.com/2009/07/16/memcached-and-grails/</link>
		<comments>http://www.2paths.com/2009/07/16/memcached-and-grails/#comments</comments>
		<pubDate>Fri, 17 Jul 2009 01:16:30 +0000</pubDate>
		<dc:creator>Tim</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Under the hood]]></category>
		<category><![CDATA[Utilities]]></category>
		<category><![CDATA[continuous integration]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Grails]]></category>
		<category><![CDATA[memcached]]></category>
		<category><![CDATA[scalability]]></category>

		<guid isPermaLink="false">http://www.2paths.com/?p=1122</guid>
		<description><![CDATA[I put together a Grails app to try out memcached. Like a lot of things in Grails, it was quite simple to integrate memcached once I found the right magic words. hibernate-memcached works nicely and has clear setup instructions.
Install memcached, which is simple if you use a Mac and have MacPorts installed.
$ sudo port install [...]]]></description>
			<content:encoded><![CDATA[<p>I put together a <a href="http://www.grails.org/">Grails</a> app to try out <a href="http://www.danga.com/memcached/">memcached</a>. Like a lot of things in Grails, it was quite simple to integrate memcached once I found the right magic words. <a href="http://code.google.com/p/hibernate-memcached/">hibernate-memcached</a> works nicely and has clear setup instructions.</p>
<p>Install memcached, which is simple if you use a Mac and have <a href="http://www.macports.org/">MacPorts</a> installed.</p>
<pre>$ sudo port install memcached</pre>
<p>Get the <a href="http://spymemcached.googlecode.com/files/memcached-2.3.1.jar">memcached jar</a>, the <a href="http://code.google.com/p/hibernate-memcached/downloads/list">hibernate-memcached jar</a>, and the <a href="http://jdbc.postgresql.org/">PostgreSQL JDBC driver jar</a>, then add them to the project.</p>
<pre>$ cp memcached-2.3.1.jar hibernate-memcached-1.2.jar postgresql-8.4-701.jdbc3.jar &lt;project_dir&gt;/lib/</pre>
<p>Configure memcached as the second-level cache by adding something like this to your DataSource.groovy.</p>
<pre><span class="pln">hibernate </span><span class="pun">{</span><span class="pln">
    cache</span><span class="pun">.</span><span class="pln">use_second_level_cache </span><span class="pun">=</span><span class="pln"> </span><span class="kwd">true</span><span class="pln">
    cache</span><span class="pun">.</span><span class="pln">use_query_cache </span><span class="pun">=</span><span class="pln"> </span><span class="kwd">true</span><span class="pln">
    cache</span><span class="pun">.</span><span class="pln">provider_class </span><span class="pun">=</span><span class="pln"> </span><span class="str">'com.googlecode.hibernate.memcached.MemcachedCacheProvider'</span><span class="pln">
    memcached </span><span class="pun">{</span><span class="pln">
        servers </span><span class="pun">=</span><span class="pln"> </span><span class="str">"localhost:11211"</span><span class="pln">
    </span><span class="pun">}</span><span class="pln">
</span><span class="pun">}</span></pre>
<p>I wrote a simple test to generate a whole bunch of domain objects and shove them in the database, then go back and fetch them all again. I set the test to run the insert/load loops in batches of 1, 10, 100, 1,000, 2,000, 4,000, 7,000, and 10,000 objects. I expected that inserting the objects would take a little longer because of the memcached overhead but that loading the objects would be faster because they could be loaded from memcached instead of going out to the database.</p>
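<p>A minimal sketch of that kind of insert/load timing loop, written in Python rather than as a Grails test. The <code>thing</code> table, the <code>run_batch</code> and <code>run_all</code> helpers, and the in-memory SQLite store are all assumptions standing in for the real domain class and backends; only the batch sizes come from the post.</p>

```python
import sqlite3
import time

# Batch sizes taken from the post.
BATCH_SIZES = [1, 10, 100, 1000, 2000, 4000, 7000, 10000]


def run_batch(conn, count):
    """Insert `count` rows, read them all back, and time both phases."""
    conn.execute("DELETE FROM thing")  # start each batch from an empty table

    start = time.perf_counter()
    conn.executemany(
        "INSERT INTO thing (name) VALUES (?)",
        [("thing-%d" % i,) for i in range(count)],
    )
    conn.commit()
    insert_seconds = time.perf_counter() - start

    start = time.perf_counter()
    rows = conn.execute("SELECT id, name FROM thing").fetchall()
    load_seconds = time.perf_counter() - start

    return {"count": count, "loaded": len(rows),
            "insert_s": insert_seconds, "load_s": load_seconds}


def run_all(batch_sizes=BATCH_SIZES):
    # Hypothetical stand-in store: an in-memory SQLite table, not GORM.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE thing (id INTEGER PRIMARY KEY, name TEXT)")
    try:
        return [run_batch(conn, n) for n in batch_sizes]
    finally:
        conn.close()


if __name__ == "__main__":
    for result in run_all():
        print(result)
```

<p>Timing the insert and load phases separately is what makes it possible to plot them as two separate charts.</p>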
<p>I tested three persistence backend configurations.</p>
<ul>
<li><a href="http://en.wikipedia.org/wiki/HSQLDB">HSQLDB</a>, the default for new Grails projects and when running unit/integration tests. This is an in-memory database only, so data aren&#8217;t persisted permanently.</li>
<li><a href="http://en.wikipedia.org/wiki/PostgreSQL">PostgreSQL</a> for a full-blown RDBMS setup. I set dbCreate to &#8220;create-drop&#8221; in <a href="http://docs.codehaus.org/display/GRAILS/Quick+Start">DataSource.groovy</a> so that each test started with an empty database.</li>
<li>PostgreSQL with memcached enabled. I should point out that in this test I had both PostgreSQL and memcached running on the same machine (my laptop) as the application, so it&#8217;s really not taking advantage of memcached&#8217;s parallelism strengths.</li>
</ul>
<p>My expectations held true for the inserts. The memcached case is a little slower but not that much.</p>
<div id="attachment_1123" class="wp-caption alignnone" style="width: 806px"><img class="size-full wp-image-1123 " title="db-compare-inserts" src="http://www.2paths.com/wp-content/uploads/2009/07/db-compare-inserts.png" alt="db-compare-inserts" width="796" height="516" /><p class="wp-caption-text">Comparison of object insert times using HSQL, PostgreSQL, and memcached+PostgreSQL for persistence backend</p></div>
<p>When I tested the load times, I was surprised. It looked like memcached wasn&#8217;t speeding things up at all. I was also surprised to see that it took less time to load 10,000 objects from the data store than 7,000 objects. I ran the test a few times to rule out the possibility of a load spike from something else running at the same time, but I got similar results every time.</p>
<div id="attachment_1124" class="wp-caption alignnone" style="width: 806px"><img class="size-full wp-image-1124 " title="db-compare-loads" src="http://www.2paths.com/wp-content/uploads/2009/07/db-compare-loads.png" alt="Comparison of object load times using HSQL, PostgreSQL, and memcached+PostgreSQL for persistence backend" width="796" height="516" /><p class="wp-caption-text">Comparison of object load times using HSQL, PostgreSQL, and memcached+PostgreSQL for persistence backend</p></div>
<p>I then tried to improve my test setup by running memcached on a second machine (my desktop, connected by gigabit ethernet to the laptop). Using the PostgreSQL + memcached configuration for each test this time, I looked at three more scenarios.</p>
<ul>
<li>PostgreSQL running locally on the laptop, memcached running locally on the laptop.</li>
<li>PostgreSQL running locally on the laptop, memcached running remotely on the desktop.</li>
<li>PostgreSQL running locally on the laptop, memcached running on both the laptop and the desktop.</li>
</ul>
<p>There&#8217;s some overhead in going out to the network to retrieve data instead of fetching it from a process running locally, so I expected the memcached local instance to be faster than the memcached remote instance. I expected the third test, with two instances of memcached running in parallel, to be fastest because it splits the load between the two instances and frees up some CPU time on the laptop. Things didn&#8217;t play out that way, though.</p>
<div id="attachment_1125" class="wp-caption alignnone" style="width: 806px"><img class="size-full wp-image-1125" title="local-remote-inserts" src="http://www.2paths.com/wp-content/uploads/2009/07/local-remote-inserts.png" alt="Comparison of object insert times for various memcache setups" width="796" height="516" /><p class="wp-caption-text">Comparison of object insert times for various memcache setups</p></div>
<div id="attachment_1126" class="wp-caption alignnone" style="width: 806px"><img class="size-full wp-image-1126" title="local-remote-loads" src="http://www.2paths.com/wp-content/uploads/2009/07/local-remote-loads.png" alt="Comparison of object load times for various memcache setups" width="796" height="516" /><p class="wp-caption-text">Comparison of object load times for various memcache setups</p></div>
<p>I have to be honest that these were not great tests. For one, I should really have run those insert/load loops in parallel instead of sequentially. I&#8217;m guessing that the test spent a lot of time stalled waiting for data when it could have been sending out more requests. I would also like to test with a large group of memcached servers and a remote database, perhaps even a cluster of app servers with a load balancer to try the full meal deal. The <a href="http://code.google.com/p/memcached/wiki/FAQ#Memcached_is_not_faster_than_my_database._Why?">memcached FAQ</a> explains that running everything on one machine will not show off memcached&#8217;s scalability.</p>
<p>However, this gave me the opportunity to get my hands dirty with memcached and learn a few things.</p>
<ul>
<li>Integrating memcached is so simple and the performance penalty of just running it locally during development is so small that there&#8217;s no reason not to enable it.</li>
<li>It&#8217;s dead simple to install and set up. We could, for example, easily install it on a bunch of test servers and run multiple instances on different ports &#8211; one for each developer (to prevent us from clobbering each other&#8217;s data).</li>
<li>Measuring performance and scalability realistically is tricky. It&#8217;s one thing to hand wave and say that running multiple instances of a database or a cache will speed things up, but it takes some work to set up a realistic simulation of high traffic application performance.</li>
<li>Adding memcached doesn&#8217;t automatically turn performance up to <a href="http://www.youtube.com/watch?v=EbVKWCpNFhY">11</a>. I need to learn more about what&#8217;s going on under the hood with my domain objects and how they&#8217;re being persisted through memcached.</li>
</ul>
<p>I look forward to using memcached more in the future and learning how to really take advantage of it.</p>
<p><strong>Update (2009-07-24)</strong>: Ray pointed out in the <a href="#comment-8792">comments</a> that I should have enabled <a href="http://www.grails.org/GORM+-+Mapping+DSL">caching</a> in my domain objects under test. I added &#8220;static mapping = { cache true }&#8221; to my domain class definition and re-ran the tests to compare the effect under 4 scenarios.</p>
<ul>
<li>no domain object caching (as before)</li>
<li>domain object caching with memcached running locally</li>
<li>domain object caching with memcached running remotely</li>
<li>domain object caching with memcached running both locally and remotely</li>
</ul>
<p>Not much difference with the inserts.</p>
<div id="attachment_1145" class="wp-caption alignnone" style="width: 806px"><img class="size-full wp-image-1145" title="cache-true-inserts" src="http://www.2paths.com/wp-content/uploads/2009/07/cache-true-inserts.png" alt="Comparison of object insert times with domain object caching enabled and disabled" width="796" height="516" /><p class="wp-caption-text">Comparison of object insert times with domain object caching enabled and disabled</p></div>
<p>However, the load times are slightly better with memcached enabled during heavy activity! The advantage disappears when I add the overhead of traversing the network, but it&#8217;s a start. Thanks for the pointer, Ray!</p>
<div id="attachment_1146" class="wp-caption alignnone" style="width: 806px"><img class="size-full wp-image-1146" title="cache-true-loads" src="http://www.2paths.com/wp-content/uploads/2009/07/cache-true-loads.png" alt="Comparison of object load times with domain object caching enabled and disabled" width="796" height="516" /><p class="wp-caption-text">Comparison of object load times with domain object caching enabled and disabled</p></div>
]]></content:encoded>
			<wfw:commentRss>http://www.2paths.com/2009/07/16/memcached-and-grails/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Getting [On&#124;Off] the CouchDB?</title>
		<link>http://www.2paths.com/2009/06/30/getting-onoff-the-couchdb/</link>
		<comments>http://www.2paths.com/2009/06/30/getting-onoff-the-couchdb/#comments</comments>
		<pubDate>Wed, 01 Jul 2009 00:13:39 +0000</pubDate>
		<dc:creator>Garrett</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[couchdb]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[schema-free]]></category>

		<guid isPermaLink="false">http://www.2paths.com/?p=1066</guid>
		<description><![CDATA[Based on my recent experimenting with CouchDB I have conflicted feelings of whether to get on, or stay off the couch. I am intrigued by the notion of a schema-free document-based database, and how applications could be created to leverage such a technology. I am also reserved about its benefits to my current work environment. [...]]]></description>
			<content:encoded><![CDATA[<p>Based on my recent experimenting with <a href="http://couchdb.apache.org/" target="_blank">CouchDB</a> I have conflicted feelings of whether to get on, or stay off the couch. I am intrigued by the notion of a schema-free document-based database, and how applications could be created to leverage such a technology. I am also reserved about its benefits to my current work environment. Most of the relational data storage I&#8217;ve dealt with has been handled by Spring/Hibernate and now Grails/Hibernate. Does CouchDB buy anything that isn&#8217;t already easy to do with methodology like <a href="http://www.grails.org/GORM" target="_blank">GORM</a>? Let me attempt to dissect some of the pros and cons of CouchDB.</p>
<h3><em>PROS</em></h3>
<p>Probably the number one mentioned feature by anyone talking about CouchDB is that it&#8217;s schema-free. That means that each document is its own structure, that everything is based on a key-value pairing, where the value is typically a document, but can also be a base type. This is pretty cool when dealing with describing real world objects, as one can be more descriptive and hierarchical in storing the data. Also, updating your application&#8217;s domain model is much like evolution, as soon as you need something added to the model, just add it.</p>
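<p>As a rough illustration of what schema-free means in practice, here is a small Python sketch. The document ids and fields are made up for the example, and a plain dictionary stands in for the database.</p>

```python
import json

# Two hypothetical documents stored side by side; neither follows a fixed schema.
documents = {
    "book-1": {"type": "book", "title": "Relax", "authors": ["A", "B"]},
    "person-1": {"type": "person", "name": "Sam", "address": {"city": "Vancouver"}},
}

# Evolving the model: add a field to one document without touching the others.
documents["book-1"]["tags"] = ["database", "schema-free"]

# Each document serializes to JSON, nesting and all.
for doc_id, doc in documents.items():
    print(doc_id, json.dumps(doc, sort_keys=True))
```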
<p>CouchDB&#8217;s RESTful HTTP service implementation is great for the many applications being created for today&#8217;s web. Send JavaScript queries/requests to the service, and get JSON-formatted data in return. This is great, since many front-end technologies can decipher JSON, and the format itself is even human-readable if need be.</p>
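<p>A hedged sketch of what talking to that HTTP interface could look like from Python. It only builds the requests and does not send them, so no server is needed; the <code>blog</code> database name, the document id, and the helper names are hypothetical, and the base URL assumes a local server on CouchDB&#8217;s default port.</p>

```python
import json
import urllib.request

# Assumed local server; 5984 is CouchDB's default port.
BASE_URL = "http://localhost:5984"


def put_document(db, doc_id, doc):
    """Build (but do not send) a PUT request that stores `doc` under `doc_id`."""
    return urllib.request.Request(
        url="%s/%s/%s" % (BASE_URL, db, doc_id),
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def get_document(db, doc_id):
    """Build a GET request for a single document."""
    return urllib.request.Request(
        url="%s/%s/%s" % (BASE_URL, db, doc_id), method="GET")


if __name__ == "__main__":
    request = put_document("blog", "post-1", {"title": "Hello"})
    print(request.get_method(), request.full_url)
    # With a server running, urllib.request.urlopen(request) would send it
    # and the response body would be JSON.
```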
<p>There is support for interacting with CouchDB data offline. One can access and edit data while not connected, and when back online can merge and update changes made while offline. This kind of support out of the box makes CouchDB a prime candidate for web-based applications like blogs, wikis, bug reporting, CRM, etc.</p>
<p>Another quality of CouchDB is its MapReduce implementation for handling large data sets. I&#8217;m going to redirect you to the author&#8217;s notes for this one: <a href="http://damienkatz.net/2008/02/incremental_map.html" target="_blank">MapReduce</a> and <a href="http://labs.google.com/papers/mapreduce.html" target="_blank">Google&#8217;s definition</a>.</p>
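<p>For readers who want the general shape of the idea before following those links, here is a toy map/reduce in Python that counts documents per tag. It is only an illustration: CouchDB views are written as JavaScript functions and are updated incrementally, whereas this sketch recomputes everything on each call, and the function names and sample documents are invented.</p>

```python
from collections import defaultdict


def map_doc(doc):
    """Map step: emit one (key, value) pair per tag on a document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def reduce_values(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def map_reduce(docs):
    """Group the emitted pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {key: reduce_values(key, values) for key, values in grouped.items()}


if __name__ == "__main__":
    docs = [
        {"title": "memcached and Grails", "tags": ["database", "Grails"]},
        {"title": "Getting on the couch", "tags": ["database", "couchdb"]},
        {"title": "untagged"},
    ]
    print(map_reduce(docs))
```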
<h3><em>CONS</em></h3>
<p>Transaction management is non-existent. Relational database management systems (RDBMS) provide mechanisms for things like transaction management, particularly unique constraints and locking. CouchDB is based on being &#8216;eventually consistent&#8217; and &#8216;available&#8217; (See the <a href="http://books.couchdb.org/relax/eventual-consistency" target="_blank">CAP Theorem section</a> for more), which although great for many things, doesn&#8217;t lend itself to transaction management.</p>
<p>Data security is also a concern. A somewhat old article describes this issue quite well: <a href="http://www.automatthew.com/2008/01/planned-security-model-for-couchdb.html" target="_blank">Planned Security Model For CouchDB</a>. The CouchDB team has been working on <a href="http://jchrisa.net/drl/_design/sofa/_show/post/couchdb_edge__security_and_vali" target="_blank">security and validation</a> but are still early in their development.</p>
<p>To a less important degree, but still worth mentioning, is that personally I really don&#8217;t find the out-of-the-box UI for CouchDB to be all that useful. It feels clunky to navigate your key/document pairs, and too feature-light for managing the database.</p>
<h3><em>Conclusion</em></h3>
<p>If you&#8217;re looking to create a document based web application, then this may be a worthwhile technology to use. CouchDB has a number of very impressive features. However, for now at least, I believe I&#8217;ll be staying off the couch.</p>
<hr />
<h5><em>Aside</em>:</h5>
<p>When testing out CouchDB I went through the process of manually installing the application. I had to resolve the dependencies, and deal with compilation errors on my own. I kind of wish I&#8217;d found <a href="http://jan.prima.de/~jan/plok/archives/142-CouchDBX-Revival.html" target="_blank">this site</a> earlier. Follow that link if you&#8217;re a MacOSX user and want an installer for CouchDB.</p>
<p>For me, this was pretty much the only useful tutorial on getting started with CouchDB: <a href="http://jan.prima.de/~jan/plok/archives/108-Programming-CouchDB-with-Javascript.html" target="_blank">Programming CouchDB with Javascript</a>.</p>
<hr />
<h5><em>Resources</em>:</h5>
<ul>
<li><a href="http://couchdb.apache.org/" target="_blank">http://couchdb.apache.org/</a></li>
<li><a href="http://books.couchdb.org/relax/" target="_blank">http://books.couchdb.org/relax/</a></li>
<li><a href="http://www.eflorenzano.com/blog/tag/couchdb/" target="_blank">http://www.eflorenzano.com/blog/tag/couchdb/</a></li>
<li><a href="http://labs.google.com/papers/mapreduce.html" target="_blank">http://labs.google.com/papers/mapreduce.html</a></li>
<li><a href="http://damienkatz.net/2008/02/incremental_map.html" target="_blank">http://damienkatz.net/2008/02/incremental_map.html</a></li>
<li><a href="http://jan.prima.de/~jan/plok/archives/142-CouchDBX-Revival.html" target="_blank">http://jan.prima.de/~jan/plok/archives/142-CouchDBX-Revival.html</a></li>
<li><a href="http://www.automatthew.com/2008/01/planned-security-model-for-couchdb.html" target="_blank">http://www.automatthew.com/2008/01/planned-security-model-for-couchdb.html</a></li>
<li><a href="http://jchrisa.net/drl/_design/sofa/_show/post/couchdb_edge__security_and_vali" target="_blank">http://jchrisa.net/drl/_design/sofa/_show/post/couchdb_edge__security_and_vali</a></li>
<li><a href="http://jan.prima.de/~jan/plok/archives/108-Programming-CouchDB-with-Javascript.html" target="_blank">http://jan.prima.de/~jan/plok/archives/108-Programming-CouchDB-with-Javascript.html</a></li>
<li><a href="http://www.infoq.com/CouchDB" target="_blank">http://www.infoq.com/CouchDB</a></li>
<li><a href="http://www.infoq.com/news/2007/11/the-rdbms-is-not-enough" target="_blank">http://www.infoq.com/news/2007/11/the-rdbms-is-not-enough</a></li>
<li><a href="http://www.infoq.com/news/2008/11/Database-Martin-Fowler" target="_blank">http://www.infoq.com/news/2008/11/Database-Martin-Fowler</a></li>
<li><a href="http://code.google.com/p/couchdb-lounge/wiki/SettingUpTwoCouchInstances" target="_blank">http://code.google.com/p/couchdb-lounge/wiki/SettingUpTwoCouchInstances</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.2paths.com/2009/06/30/getting-onoff-the-couchdb/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>LiquiBase-ifying your Grails Application</title>
		<link>http://www.2paths.com/2008/12/23/liquibase-ifying-your-grails-application/</link>
		<comments>http://www.2paths.com/2008/12/23/liquibase-ifying-your-grails-application/#comments</comments>
		<pubDate>Tue, 23 Dec 2008 21:46:27 +0000</pubDate>
		<dc:creator>Lorill</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Under the hood]]></category>
		<category><![CDATA[Utilities]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Grails]]></category>
		<category><![CDATA[LiquiBase]]></category>

		<guid isPermaLink="false">http://www.2paths.com/?p=401</guid>
		<description><![CDATA[At 2Paths we&#8217;ve got some pretty good processes in place: we practice agile software development and scrum, have all our projects set up in Continuous Integration. We try to do test-driven development where at all possible. One area that has slipped through the cracks though is database change management. What company hasn&#8217;t run into the [...]]]></description>
			<content:encoded><![CDATA[<p>At 2Paths we&#8217;ve got some pretty good processes in place: we practice agile software development and scrum, and have all our projects set up in Continuous Integration. We try to do test-driven development where at all possible. One area that has slipped through the cracks though is database change management. What company hasn&#8217;t run into the problem of ensuring all database environments in a project (dev, staging, production, test) are in sync and in source control? Without this, it becomes an onerous if not impossible task to roll back a set of databases to a known state, or to recreate one from scratch to a known state.</p>
<p>Enter <a href="http://liquibase.org">LiquiBase</a>, an open source database-agnostic tool for tracking, managing and applying database changes. We recently worked on a small Grails project and decided to give this tool a try and I was duly impressed.</p>
<p>LiquiBase functionality is built around a main changelog.xml file containing changesets representing incremental database changes to be applied to a database. LiquiBase manages which changesets have been run through a DATABASECHANGELOG table it creates in each database. </p>
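<p>The bookkeeping idea can be sketched in a few lines of Python. This is a toy model, not LiquiBase&#8217;s implementation: a set of (id, author) pairs stands in for the rows of the DATABASECHANGELOG table, and the function names and sample changesets are hypothetical.</p>

```python
def pending_changesets(changelog, already_run):
    """Return changesets from the changelog that are not recorded as run.

    `changelog` is an ordered list of (id, author, sql) tuples and
    `already_run` is a set of (id, author) pairs, a toy stand-in for the
    rows of the DATABASECHANGELOG table.
    """
    return [cs for cs in changelog if (cs[0], cs[1]) not in already_run]


def migrate(changelog, already_run, execute):
    """Run each pending changeset in order and record it as run."""
    for cs_id, author, sql in pending_changesets(changelog, already_run):
        execute(sql)
        already_run.add((cs_id, author))


if __name__ == "__main__":
    changelog = [
        ("1", "dev", "CREATE TABLE person (id INTEGER)"),
        ("2", "dev", "ALTER TABLE person ADD COLUMN name TEXT"),
    ]
    already_run = {("1", "dev")}
    # Changeset 1 is already recorded, so only changeset 2 is executed.
    migrate(changelog, already_run, execute=print)
```

<p>Because each database keeps its own record, the same changelog can be applied to dev, test, and staging, and each one only runs what it is missing.</p>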
<p>We used the <a href="http://www.liquibase.org/manual/grails">grails plugin</a> which gave us most of the full LiquiBase functionality, albeit a little less than mature.  There were a couple of gotchas and bugs with the plugin, but nothing we couldn&#8217;t work around.</p>
<p>LiquiBase buys us the ability to store database changes in source control, easily sync databases in multi-environments in an automated and controlled fashion, tag database states upon iteration-end releases, auto-generate rollback sql to tagged states, diff databases, and much more.</p>
<h2>LiquiBase-ifying Your Application</h2>
<p>With your database in a known state, you can use the grails plugin to create your changelog.xml. First you need to install the grails LiquiBase plugin:<br />
<code>grails install-plugin liquibase</code><br />
Once installed simply run this from the root of your grails app:<br />
<code>grails generate-changelog grails-app/changelog.xml</code></p>
<p>This will generate the changelog.xml from your development database (specified in DataSource.groovy) and write it to the path specified in the command.  <code>grails-app/changelog.xml</code> is the default path where the grails plugin will be expecting the changelog to be. If you need to read from a different database environment like staging for example, just add the -Dgrails.env=staging option to the command.</p>
<p>To propagate this changelog to other databases (say, test), run<br />
<code>grails -Dgrails.env=test migrate</code>. This runs all the sql necessary to update the database to match the changelog. Any new changes from here on will be appended as new changesets, and can be migrated with the same command as above.  You can also use the migrate-sql command instead if you want to generate the sql and run it yourself.  Most LiquiBase commands come in pairs: one to generate the sql for you and one to just run the sql directly against your database.</p>
<p>If you&#8217;re integrating LiquiBase mid-project and already have all your databases set up, you&#8217;ll need to run sql against them to update the LiquiBase DATABASECHANGELOG table to show all the changes as run.  First, make sure that the databases are all in sync. To do this, you can use the handy LiquiBase diff tool. Unfortunately, the grails plugin for this diff tool is not very robust and will only diff your development database against your test database &#8211; the plugin has those two environments hard-coded. You can either mess with your DataSource.groovy db environments to do the diff setting the two dbs in question to dev and test temporarily, or <a href="http://www.liquibase.org/download">install LiquiBase</a> itself and run the <a href="http://www.liquibase.org/manual/diff">diff</a> passing the db parameters to the diff command. Using the grails plugin you would run:<br />
<code>grails db-diff</code> which will spit to the screen any differences as changesets to be applied. Strangely, there is no documentation for this command in the <a href="http://www.liquibase.org/manual/grails">LiquiBase Grails plugin page</a>.</p>
<p>Once your databases are in sync, run<br />
<code>grails changelog-sync-sql</code> with the appropriate <code>-Dgrails.env</code> switch to generate the sql to update the DATABASECHANGELOG table, then run the sql in your database.</p>
<h2>LiquiBase and Continuous Integration</h2>
<p>We use Hudson at 2Paths for our CI, and have added the simple <code>grails -Dgrails.env=test migrate</code> command as part of our build process to migrate the test database upon every checkin. This ensures that the test db is always the most up-to-date.</p>
<h2>Rolling LiquiBase into our Dev Process</h2>
<p>We&#8217;ve adopted on a trial basis the following process for database change management as part of our agile software development, taking into consideration Grails development (which uses hibernate), which can auto-generate schemas based on domain objects:</p>
<ol>
<li>Generate changelog from initial schema and commit to svn</li>
<li>Rollout schema to other databases by migrating from changelog</li>
<ol>
<li>if the schema already exists and it needs &#8220;LiquiBase-ifying&#8221;, generate changelog-sync-sql and run it</li>
</ol>
<li>Add every schema change via new changeset from hereon in by generating changelogs via grails <a href="http://www.liquibase.org/manual/diff">db-diff</a></li>
<ol>
<li>add / change domain objects in project</li>
<li>set hibernate ddl mode to update</li>
<li>start app and grails will automatically update the db</li>
<li>generate new changelogs using db-diff (against the test db)</li>
<li>append changelogs to changelog.xml</li>
<li>run <code>grails changelog-sync-sql</code> to generate the sql that will mark all these new changes as run on your db, then run the sql</li>
<li>checkin changes. this will be applied to test</li>
</ol>
<li>provide explicit rollback sql for any custom sql</li>
<li>tag each iteration end with a &#8220;tagDatabase&#8221; changelog in the changelog.xml</li>
</ol>
<h2>Gotchas</h2>
<p>We ran into some gotchas with LiquiBase. One of the first things we started doing before fully understanding how LiquiBase worked was to update existing changesets when changing the schema. LiquiBase generates checksums for each changeset to ensure they don&#8217;t change, and altering existing changesets will cause future migrations to fail. Even though LiquiBase gives you some tools to get around this, it&#8217;s generally a better practice to just add a new changeset for every database change. There is a good blog post on the LiquiBase site explaining <a href="http://blog.liquibase.org/2008/10/dealing-with-changing-changesets.html">how to deal with changing changesets</a>.</p>
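<p>To see why editing a changeset in place trips the check while appending a new one does not, here is a small Python sketch of checksum comparison. It is an illustration only: SHA-256 is an arbitrary choice for the sketch rather than a claim about which hash LiquiBase uses, and the function names and sample changesets are hypothetical.</p>

```python
import hashlib


def checksum(changeset_body):
    """Fingerprint a changeset's content (SHA-256 chosen for the sketch)."""
    return hashlib.sha256(changeset_body.encode("utf-8")).hexdigest()


def find_modified(changelog, recorded):
    """Return ids of changesets whose content no longer matches the stored checksum.

    `changelog` maps changeset id to its current body; `recorded` maps
    changeset id to the checksum stored when it was first run.
    """
    return [cs_id for cs_id, body in changelog.items()
            if cs_id in recorded and recorded[cs_id] != checksum(body)]


if __name__ == "__main__":
    original = {"1": "CREATE TABLE person (id INTEGER)"}
    recorded = {cs_id: checksum(body) for cs_id, body in original.items()}

    # Changeset 1 has been edited in place; changeset 2 is a new addition.
    edited = {"1": "CREATE TABLE person (id BIGINT)",
              "2": "ALTER TABLE person ADD COLUMN name TEXT"}
    print(find_modified(edited, recorded))
```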
<p>Another gotcha is that the rollback-sql command won&#8217;t magically generate rollback scripts for hand-coded sql (obviously!). You must write your own rollback sql for these custom sql tags, otherwise not only will you not get rollback sql for those particular changesets, but the rollback functionality won&#8217;t spit out sql for any other changeset either until you do.</p>
<p>Tagging was a little fussy as well. The command line <code>grails tag</code> can only tag a given state once &#8211; future tags will overwrite the earlier ones unless you&#8217;re at a different changeset. I found it better to add a tag changeset explicitly in the changelog.xml. The Grails plugin for tagging also seems to be problematic with spaces in the tag name. We decided to just use underscores as a convention.</p>
<p>We also ran across some broken functionality:</p>
<ul>
<li>The <code>grails generate-changelog</code> command generates some extraneous information that breaks sql for data type numeric(19,0) if that type has auto-increment on.  It generates this in the changelog:<br />
&lt;column autoIncrement=&quot;true&quot; name=&quot;id&quot; type=&quot;numeric()(19,0)&quot;&gt; (notice the extra empty brackets).  We needed to manually remove these to get it to work.</li>
<li>The <code>grails rollback-to-date-sql</code> command writes to a file with the datestamp instead of to console like the other rollbacks do, even though the docs say it writes to STDOUT (<code>rollback-count-sql</code> also has issues, but <code>rollback-sql</code> for tags works just fine). The code for these plugin commands doesn&#8217;t seem to do the right thing with the args. Time permitting, we may provide a patch for this.</li>
<li>db-doc generation doesn&#8217;t work with a large number of columns in a table because it uses the table description including column names for the html file name.  Too many db columns make the file names too long.</li>
</ul>
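<p>The manual cleanup for the first issue could be scripted. The Python sketch below is one possible approach, not something from the plugin: the <code>fix_numeric_types</code> helper is hypothetical, and the pattern assumes the broken attribute always looks like the example above.</p>

```python
import re

# Matches the stray empty brackets in type="numeric()(19,0)".
BROKEN_NUMERIC = re.compile(r'type="numeric\(\)\((\d+),\s*(\d+)\)"')


def fix_numeric_types(changelog_xml):
    """Rewrite numeric()(p,s) column types as numeric(p,s)."""
    return BROKEN_NUMERIC.sub(r'type="numeric(\1,\2)"', changelog_xml)


if __name__ == "__main__":
    attributes = 'autoIncrement="true" name="id" type="numeric()(19,0)"'
    print(fix_numeric_types(attributes))
```

<p>Since the generated changelog is plain text, the same function can be run over the whole file contents before committing it.</p>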
<h2>Moving Forward</h2>
<p>Although there&#8217;s a bit of grumbling amongst the developers here that using LiquiBase with the above process is a little onerous, it buys us a lot more certainty for knowing what state our databases are in, making sure changes are in source control, and knowing what updates have or haven&#8217;t been run. It&#8217;s a whole lot easier than hand-coding a bunch of sql, and easily accommodates the possibility of migrating to other DBMS&#8217;s in the future without having to re-write DBMS-specific sql. There&#8217;s even talk of trying to go a step or two further to automate even more of the process so a developer doesn&#8217;t have to think about generating changesets upon changing of a grails domain model. We&#8217;ll see how that goes.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.2paths.com/2008/12/23/liquibase-ifying-your-grails-application/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Can Database Refactoring be Agile?</title>
		<link>http://www.2paths.com/2008/11/11/can-database-refactoring-be-agile/</link>
		<comments>http://www.2paths.com/2008/11/11/can-database-refactoring-be-agile/#comments</comments>
		<pubDate>Wed, 12 Nov 2008 01:49:53 +0000</pubDate>
		<dc:creator>Lorill</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[agile]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[much ado about agile]]></category>
		<category><![CDATA[refactoring]]></category>

		<guid isPermaLink="false">http://www.2paths.com/?p=259</guid>
		<description><![CDATA[I was fortunate enough to partially attend the Much Ado About Agile Conference held recently in Vancouver, and was immediately drawn to the session &#8220;Database Development in Agile World&#8221; led by Marc Munro. This was a very timely session as we have recently begun a project that involves refactoring a legacy database and developing a [...]]]></description>
			<content:encoded><![CDATA[<p>I was fortunate enough to partially attend the <a href="http://www.agilevancouver.ca/?p2=/modules/agilevancouver/conference.jsp&amp;id=3">Much Ado About Agile Conference</a> held recently in Vancouver, and was immediately drawn to the session &#8220;Database Development in Agile World&#8221; led by Marc Munro. This was a very timely session as we have recently begun a project that involves refactoring a legacy database and developing a new application to replace the legacy system. A data migration from the legacy system to the new system would also of course be necessary.  We had only been working on the new application for a week or so but had already been running into problems and I was looking for some advice. We do things in an agile way here at 2Paths, and we were struggling with how to coordinate the database refactor / data migration with the creation of a new application. </p>
<p>After some discussion of whether we should start development on the database or application first, we decided to tackle the database first.  We had started out by re-designing the database to allow for some sorely-needed normalization of some relationships. We came up with a new normalized schema through a mixture of requirements gathering with the client and reverse-engineering the legacy database.  We focussed on areas where we could generalize and abstract out relationships and left the details for last.  With our new schema, we then worked on migrating the legacy data into the new schema to ensure everything was accommodated. </p>
<p>Using our new schema we began application development with Grails. We soon ran into difficulties: the application was changing rapidly, and each change required corresponding changes to the schema and the data migration script. The schema changes and the application changes kept falling out of step.  It was at this point that I attended Marc&#8217;s session.</p>
<p>The main point that he drove home, which we already knew too well, was that database refactoring is expensive.  Databases have <strong>no source code</strong>, databases resist change, databases need to be coordinated with corresponding applications, and teams usually have many databases to coordinate (e.g. dev, staging, production).  Marc stressed the first point, the lack of source code, as one of the major problems. Database creation and maintenance are usually done with &#8216;patch scripts&#8217;.</p>
<p>Enter <a href="http://www.liquibase.org/">LiquiBase</a>. Marc neglected to mention this very handy library that we decided to try out with our new project. LiquiBase is DBMS-agnostic and allows us to manage database change much more cleanly than ever before. I haven&#8217;t explored all of the LiquiBase functionality yet, but so far I am very pleased with what I have seen. There is a LiquiBase Grails plugin and also a LiquiBase IntelliJ IDEA plugin that we&#8217;ve been using.</p>
<p>Back to Marc&#8217;s presentation. Because we&#8217;re already convinced that database refactoring is very expensive, he made a case for doing Big Picture database modeling up front, most importantly capturing all the areas where generalizations and abstractions can be made.  It is much more expensive to refactor something specific into something general than to refactor something general into something specific.  Adding objects rarely if ever affects existing functionality, but removing them usually does.</p>
<p>Start by modeling the entities and relationships only.  Don&#8217;t worry about the attributes or details &#8211; these can be fleshed out during implementation.</p>
<p>Marc had some Dogmas to impart: </p>
<ul>
<li>Use subtypes: they&#8217;re great for recording extra knowledge about a logical model</li>
<li>Hate NULLs: allow no optional relationships. This usually requires using a linked entity, which may seem uglier but makes a whole lot more sense</li>
<li>Allow no implied participants in transactions</li>
</ul>
<p>And then there was Pragma:</p>
<ul>
<li>Don&#8217;t invent when you can copy</li>
<li>Use standard models. He highly recommended Len Silverston&#8217;s Data Model Resource Books, which we also refer to here at 2Paths</li>
<li>Reuse models when you can</li>
</ul>
<p>So far in the presentation, it didn&#8217;t seem like we were in an &#8220;Agile&#8221; session, but finally Marc roped us back into the subject at hand.  Just because we&#8217;ve done up-front design doesn&#8217;t mean we have to implement everything at once.  With the up-front design in hand we can do iterative development.  The up-front model doesn&#8217;t need to be 100% correct &#8211; we can refine it iteratively as the project progresses.</p>
<p>For an agile implementation, revisit and refine the model at the start of each iteration. Explicitly document each design decision so that, down the road, it&#8217;s known why it was made.</p>
<p>In summary, Marc reiterated that because databases are so expensive to refactor, it&#8217;s important to get as much right the first go around as possible. Getting it right first will make agile development and subsequent refactors much easier.  Design for change by generalizing and abstracting relationships.</p>
<p>Relating Marc&#8217;s advice back to our project, I realized that we hadn&#8217;t done too badly with our process after all.  We had made generalizations and abstractions where we could, getting the Big Picture logical design done up front.  We then began application development based only on the stories included in the current iteration. The problems we were having in syncing the application to the database stemmed mainly from physical design issues, not logical design issues.</p>
<p>In hindsight, it may have been easier if we had taken the logical model that we had designed, implemented it in Grails first, and seen what schema Grails produced from that model.  We could then have tweaked this schema and the associated mappings into the physical implementation we wanted, instead of producing the physical model in the database first.</p>
<p>When we started our up-front design we had many discussions about whether this was the right thing to do, and whether it was really agile.  It was good to have confirmation that others have thought about this too and that it IS okay to do Big Picture up-front design. This will hopefully save us some pain and suffering in the future.  Hopefully we&#8217;ve made our future database refactoring less expensive.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.2paths.com/2008/11/11/can-database-refactoring-be-agile/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Persistence</title>
		<link>http://www.2paths.com/2008/04/11/persistence/</link>
		<comments>http://www.2paths.com/2008/04/11/persistence/#comments</comments>
		<pubDate>Sat, 12 Apr 2008 00:24:34 +0000</pubDate>
		<dc:creator>Lorill</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[newbies]]></category>

		<guid isPermaLink="false">http://blog.2paths.com/persistence.html</guid>
		<description><![CDATA[Any developers working on projects involving databases will need to be aware of persistence strategies, and take these into consideration at design time. Persistence strategies are tailored to specific projects depending on a variety of circumstances such as whether they are read-only or read/write, how important the timeliness of data is within the application, and [...]]]></description>
			<content:encoded><![CDATA[<p>Any developers working on projects involving databases will need to be aware of persistence strategies, and take these into consideration at design time. Persistence strategies are tailored to specific projects depending on a variety of circumstances, such as whether they are read-only or read/write, how important the timeliness of data is within the application, and how likely data collisions are. If coupled units of work involve writes to various tables, the units of work will need to be wrapped in transactions to avoid data corruption, with proper rollback strategies in place. Optimistic or pessimistic locking strategies need to be put in place where data collisions are possible.  Appropriate isolation levels need to be associated with the transactions.</p>
<p>These concepts are all database-agnostic, but there are specific tools for use with Java, Hibernate, and Spring to assist in persistence strategy integration. For more information, see the wiki here:<br />
<a href="https://dev.2paths.com/wiki/display/2pathsTECH/Persistence">https://dev.2paths.com/wiki/display/2pathsTECH/Persistence</a></p>
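<p>To make the optimistic locking idea concrete, here is a minimal sketch in plain Java. It is an illustration only and does not use the Hibernate or Spring APIs: the <code>OptimisticLockSketch</code> class, its <code>Row</code> type, and the <code>updateIfUnchanged</code> method are hypothetical names invented for this example, and the in-memory object merely stands in for a table row with a version column. A write succeeds only if the version the writer originally read is still current; otherwise a collision is reported so the caller can re-read and retry.</p>
<pre><code>public class OptimisticLockSketch {

    // Hypothetical in-memory stand-in for a table row with a version column.
    static final class Row {
        private String value;
        private int version;

        Row(String value) {
            this.value = value;
            this.version = 0;
        }

        synchronized int version() {
            return version;
        }

        synchronized String value() {
            return value;
        }

        // Applies the write only if the caller read the current version.
        // Mirrors "UPDATE ... SET version = version + 1 WHERE version = ?".
        synchronized boolean updateIfUnchanged(int expectedVersion, String newValue) {
            if (version != expectedVersion) {
                return false; // someone else wrote first: a data collision
            }
            value = newValue;
            version = version + 1;
            return true;
        }
    }

    public static void main(String[] args) {
        Row row = new Row("initial");

        // Two writers read the same version before either one writes.
        int readByA = row.version();
        int readByB = row.version();

        boolean aSucceeded = row.updateIfUnchanged(readByA, "written by A");
        boolean bSucceeded = row.updateIfUnchanged(readByB, "written by B");

        // Writer B holds a stale version, so its write is rejected.
        System.out.println(aSucceeded + " " + bSucceeded + " " + row.value());
    }
}
</code></pre>
<p>Pessimistic locking would instead hold a lock on the row for the whole unit of work, which avoids the retry at the cost of making other writers wait.</p>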
]]></content:encoded>
			<wfw:commentRss>http://www.2paths.com/2008/04/11/persistence/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

