<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>SDLC Blog &#187; Data modelling</title>
	<atom:link href="http://www.rodenas.org/blog/category/data-modelling/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.rodenas.org/blog</link>
	<description>Software Development Life Cycle: Methodologies and Tools for the Enterprise</description>
	<lastBuildDate>Tue, 04 May 2010 22:21:09 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Reverse engineer the WordPress database with RDA</title>
		<link>http://www.rodenas.org/blog/2007/01/22/reverse-engineer-the-wordpress-database-with-rda/</link>
		<comments>http://www.rodenas.org/blog/2007/01/22/reverse-engineer-the-wordpress-database-with-rda/#comments</comments>
		<pubDate>Mon, 22 Jan 2007 00:39:49 +0000</pubDate>
		<dc:creator>Ferdy</dc:creator>
				<category><![CDATA[Data modelling]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[Wordpress]]></category>

		<guid isPermaLink="false">http://www.rodenas.org/blog/2007/01/22/reverse-engineer-the-wordpress-database-with-rda/</guid>
		<description><![CDATA[I was evaluating some enterprise data modelling tools when I realized that one of them, the new version 7.01 of Rational Data Architect (RDA), now supports MySQL. At the same time, I was working on a personal project to analyse the WordPress database, which relies on MySQL, in order to understand how it works. So I [...]]]></description>
			<content:encoded><![CDATA[<p>I was evaluating some enterprise data modelling tools when I realized that one of them, the new version 7.01 of <a href="http://www-306.ibm.com/software/data/integration/rda/">Rational Data Architect</a> (RDA), now supports <a href="http://www.mysql.com">MySQL</a>. At the same time, I was working on a personal project to analyse the <a href="http://wordpress.org">WordPress</a> database, which relies on MySQL, in order to understand how it works. So I decided to combine both projects and reverse engineer the WordPress MySQL database with RDA to obtain the WordPress data model.</p>
<p>Based on this experience, I wrote a short tutorial for beginners that explains how to reverse engineer a MySQL database with RDA. It doesn&#8217;t try to cover every aspect of the product; that is the job of the reference manuals. If you want to go deeper into Rational Data Architect&#8217;s functionality, you will find some <a href="#references">references</a> at the bottom of this article.</p>
<p>So here is the tutorial, along with my conclusions. I hope you find it useful.</p>
<p><span id="more-77"></span></p>
<h3>Prerequisites</h3>
<ol>
<li>A <a href="http://wordpress.org/download/">WordPress</a> installation configured and running (so I assume that you also have a <a href="http://dev.mysql.com/downloads/mysql/5.0.html">MySQL</a> instance running). In this tutorial we are going to use a WordPress 2.0.7 database and MySQL 5.0 (even though RDA only supports MySQL up to version 4.1).</li>
<li>Rational Data Architect V7.01. If you don&#8217;t have a license and you are just evaluating the product, you can download a <a href="http://www.ibm.com/developerworks/downloads/r/rda/?S_TACT=105AGX28&#038;S_CMP=TRIALS">trial</a>. RDA can be installed on top of an existing <a href="http://www.eclipse.org/downloads/">Eclipse 3.2</a> environment, or it can install its own Eclipse 3.2 instance.</li>
<li>A MySQL JDBC driver. You can use the official <a href="http://www.mysql.com/products/connector/j/">MySQL Connector/J</a>.</li>
</ol>
<h3>Creating a new data design project</h3>
<p>A data design project is primarily used to store modelling objects, including logical and physical data models, <acronym title="Data Definition Language">DDL</acronym> scripts, mapping models, and more. To create a data design project:</p>
<ol>
<li>On the main menu bar, select <b>File &gt; New &gt; Project</b>. Alternatively, right-click any blank space in the Package Explorer (if you are using the RDA plug-in on an existing Eclipse) or in the Data Project Explorer (if you are using RDA with its bundled Eclipse) and select <b>New &gt; Project</b>. The New Project wizard opens.</li>
<li>Select <b>Data Design Project</b>, under the Data folder.</li>
<li>Name the Project <b>WordPress</b> and select <b>Finish</b>.</li>
</ol>
<h3>Creating a new physical data model</h3>
<p>A physical data model (PDM) is a database-specific model that represents relational data objects, such as tables, columns, primary and foreign keys. A PDM can be used to generate DDL statements that can then be deployed to a database server.</p>
<p>You can use the New Physical Data Model wizard to create a physical data model:</p>
<ol>
<li>Select <b>File &gt; New &gt; Physical Data Model</b> from the main menu bar. The New Physical Data Model wizard opens.</li>
<li>On the first page of the wizard, change the file name of the model to <b>WordPress PDM</b>, the selected database to <b>MySQL</b>, the selected version to <b>4.1</b>, and check the <b>Create from reverse engineering</b> check box. Then select <b>Next</b>.</li>
<p><center><a class="imagelink" href="http://www.rodenas.org/blog/wp-content/files/2007/01/figure-1.jpg" title="Creating a new physical data model" rel="lightbox[77]"><img id="image84" src="http://www.rodenas.org/blog/wp-content/files/2007/01/figure-1.thumbnail.jpg" alt="Creating a new physical data model" /></a></center><br />
</p>
<li>On the second page, note that the <b>Create a new connection</b> option is checked. Leave it as it is and select <b>Next</b>.</li>
<li>On the third panel, specify:</li>
<ul>
<li>Database: The name of the WordPress database, in this case <b>wordpress</b></li>
<li>JDBC driver class: <b>com.mysql.jdbc.Driver</b></li>
<li>Class location: Browse to the location of the MySQL JDBC driver file <b>mysql-connector-java-5.0.4-bin.jar</b></li>
<li>Connection URL: <b>jdbc:mysql://<i>host</i>:<i>port</i>/</b>, where <b><i>host</i></b> is the name of the system where MySQL is installed, in this case <b>localhost</b>, and <b><i>port</i></b> is the port on which the MySQL instance listens for client connections, in this case <b>3306</b></li>
<li>User and Password: type your <b>user ID</b> and <b>password</b></li>
</ul>
<p><center><a class="imagelink" href="http://www.rodenas.org/blog/wp-content/files/2007/01/figure-2.jpg" title="Connection Parameters" rel="lightbox[77]"><img id="image85" src="http://www.rodenas.org/blog/wp-content/files/2007/01/figure-2.thumbnail.jpg" alt="Connection Parameters" /></a></center><br />
</p>
<li>Select <b>Test Connection</b> and if the connection is successful then select <b>Next</b>.</li>
<li>On the fourth panel, select the wordpress schema to reverse engineer and then select <b>Next</b>.</li>
<li>On the fifth panel, select the database elements to reverse engineer and then select <b>Next</b>.</li>
<li>On the sixth panel, check the <b>Generate Overview diagram</b> option and then select <b>Next</b>.</li>
<li>Select <b>Finish</b>.</li>
</ol>
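<p>The connection URL entered on the third panel can be illustrated with a small sketch. The Python function below is not part of RDA; <code>build_jdbc_url</code> and its parameters are hypothetical names, used only to show how the host, the port and an optional database name fit together in a MySQL JDBC URL.</p>

```python
def build_jdbc_url(host="localhost", port=3306, database=""):
    """Return a MySQL JDBC URL of the form jdbc:mysql://host:port/database.

    The database part is optional; the wizard in this tutorial asks for
    the database name in a separate field, so it may be left empty.
    """
    if not host:
        raise ValueError("host must not be empty")
    if int(port) not in range(1, 65536):
        raise ValueError("port must be between 1 and 65535")
    return "jdbc:mysql://{0}:{1}/{2}".format(host, int(port), database)


# Values used in this tutorial: local MySQL server on the default port.
print(build_jdbc_url("localhost", 3306))
print(build_jdbc_url("localhost", 3306, "wordpress"))
```

<p>With the values used in this tutorial, the first call returns <code>jdbc:mysql://localhost:3306/</code>; passing <code>wordpress</code> as the database appends it after the final slash.</p>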
<p>The PDM is created and displayed in the Data Models folder under the WordPress data design project (the .dbm extension identifies it as a physical data model).</p>
<p>If you expand WordPress PDM.dbm and the wordpress schema, you will see the database elements we have reverse engineered. Double-click the wordpress diagram, in the Diagrams folder under the wordpress schema, to see the newly generated diagram. In the Properties tab, you can choose which elements appear in the diagram.<br />
<center><a class="imagelink" href="http://www.rodenas.org/blog/wp-content/files/2007/01/figure-3.jpg" title="Using the diagram, palette, and properties view to design a PDM" rel="lightbox[77]"><img id="image86" src="http://www.rodenas.org/blog/wp-content/files/2007/01/figure-3.thumbnail.jpg" alt="Using the diagram, palette, and properties view to design a PDM" /></a></center>
</p>
<h3>Creating the foreign key relationships</h3>
<p>As the WordPress database doesn&#8217;t use foreign keys, we must create them manually in order to see the relationships between tables. To create a foreign key relationship from a parent table to a child table in a physical data model diagram:</p>
<ol>
<li>Select a type of relationship in the palette.</li>
<li>Select the parent table that has the primary key.</li>
<li>Drag to the child table. Depending on the type of relationship you are creating, a pop-up window might open asking you to specify relationship options.</li>
</ol>
<p>Be aware that the key from the parent table is migrated to the child table as a new column. Because WordPress doesn&#8217;t use the parent&#8217;s key name for the corresponding column in the child table, we must delete this new column from the child table and manually map the primary key in the parent table to the existing foreign key column in the child table:</p>
<ol>
<li>In the Data Project Explorer view, select the <b>WordPress PDM</b> model and then the <b>wordpress</b> schema.</li>
<li>Select the <b>child table</b> to modify.</li>
<li>Select the <b>Foreign key relation</b> to modify.</li>
<li>Select <b>Details</b> under the <b>Properties</b> tab.</li>
<li>Select the appropriate <b>column</b> in the child table.</li>
<li>Delete the generated column in the child table.</li>
</ol>
<p><center><a class="imagelink" href="http://www.rodenas.org/blog/wp-content/files/2007/01/figure-5.jpg" title="Foreign keys details" rel="lightbox[77]"><img id="image87" src="http://www.rodenas.org/blog/wp-content/files/2007/01/figure-5.thumbnail.jpg" alt="Foreign keys details" /></a></center><br />
</p>
<p>So, in order to establish all the relations, we must manually create the following relationships:</p>
<ul>
<li>wp_linkcategories[cat_id] -&gt; wp_links[link_category]</li>
<li>wp_categories[cat_ID] -&gt; wp_post2cat[category_id]</li>
<li>wp_categories[cat_ID] -&gt; wp_categories[category_parent]</li>
<li>wp_posts[ID] -&gt; wp_post2cat[post_id]</li>
<li>wp_posts[ID] -&gt; wp_comments[comment_post_ID]</li>
<li>wp_posts[ID] -&gt; wp_posts[post_parent]</li>
<li>wp_posts[ID] -&gt; wp_postmeta[post_id]</li>
<li>wp_comments[comment_ID] -&gt; wp_comments[comment_parent]</li>
<li>wp_users[ID] -&gt; wp_links[link_owner]</li>
<li>wp_users[ID] -&gt; wp_posts[post_author]</li>
<li>wp_users[ID] -&gt; wp_comments[user_id]</li>
<li>wp_users[ID] -&gt; wp_usermeta[user_id]</li>
</ul>
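<p>The relationships above can also be written down as data, which makes them easy to review or to turn into DDL. The following Python sketch is only an illustration and not something RDA generates: the constraint naming scheme (<code>fk_</code> plus child table and column) is an assumption, and the generated statements are meant as documentation. MySQL enforces foreign keys only on InnoDB tables, and existing rows may not satisfy the constraints, so try them on a copy of the database first.</p>

```python
# (parent table, parent column, child table, child column),
# copied from the list of relationships in this tutorial.
RELATIONSHIPS = [
    ("wp_linkcategories", "cat_id", "wp_links", "link_category"),
    ("wp_categories", "cat_ID", "wp_post2cat", "category_id"),
    ("wp_categories", "cat_ID", "wp_categories", "category_parent"),
    ("wp_posts", "ID", "wp_post2cat", "post_id"),
    ("wp_posts", "ID", "wp_comments", "comment_post_ID"),
    ("wp_posts", "ID", "wp_posts", "post_parent"),
    ("wp_posts", "ID", "wp_postmeta", "post_id"),
    ("wp_comments", "comment_ID", "wp_comments", "comment_parent"),
    ("wp_users", "ID", "wp_links", "link_owner"),
    ("wp_users", "ID", "wp_posts", "post_author"),
    ("wp_users", "ID", "wp_comments", "user_id"),
    ("wp_users", "ID", "wp_usermeta", "user_id"),
]


def foreign_key_ddl(parent, parent_column, child, child_column):
    """Build one ALTER TABLE statement for a single relationship."""
    # Hypothetical naming convention: fk_<child table>_<child column>.
    name = "fk_{0}_{1}".format(child, child_column)
    template = (
        "ALTER TABLE {child} ADD CONSTRAINT {name} "
        "FOREIGN KEY ({ccol}) REFERENCES {parent} ({pcol});"
    )
    return template.format(
        child=child, name=name, ccol=child_column,
        parent=parent, pcol=parent_column,
    )


statements = [foreign_key_ddl(*rel) for rel in RELATIONSHIPS]
for statement in statements:
    print(statement)
```

<p>Keeping the list in one place also gives you a checklist to compare against the diagram once all twelve relationships have been drawn.</p>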
<p>And you will get this final diagram:<br />
<center><a class="imagelink" href="http://www.rodenas.org/blog/wp-content/files/2007/01/figure-6.jpg" title="Wordpress Physical Data Model" rel="lightbox[77]"><img id="image88" src="http://www.rodenas.org/blog/wp-content/files/2007/01/figure-6.thumbnail.jpg" alt="Wordpress Physical Data Model" /></a></center>
</p>
<p>You can also define the referential integrity constraints, but that is beyond the scope of this tutorial.</p>
<h3>Publishing the data model</h3>
<p>Finally, you can publish the data model outside of the modelling tool, as an HTML page or as a PDF file. To create a PDF report:</p>
<ol>
<li>In the Data Project Explorer view, select the <b>WordPress PDM</b> model on which to create a report.</li>
<li>On the main menu bar, select <b>Data &gt; Publish &gt; Report</b>.</li>
<li>In the Generate Report window, select the <b>Diagram Report for Physical Data Model</b>. Each row contains information on the type of file to be generated, the name of the report, and a description.</li>
<li>Type an <b>output file name</b> in the Select the file name for the generated report field.</li>
<li>Select <b>OK</b> to publish the model report.</li>
</ol>
<h3>Summary</h3>
<p>In this tutorial you learned how to create a new physical data model by reverse engineering an existing MySQL database. You created foreign key relationships from parent tables to child tables, and you modified the columns involved in each relationship. Finally, you published the data model outside of the modelling tool, as a PDF file.</p>
<h3>My opinion</h3>
<p>Rational Data Architect supports many relational data sources (<a href="http://www-306.ibm.com/software/data/cloudscape/">Cloudscape</a>, <a href="http://www-306.ibm.com/software/data/db2/">DB2 Universal Database (UDB)</a>, <a href="http://www-03.ibm.com/servers/eserver/iseries/db2/">DB2 UDB iSeries</a>, <a href="http://www-306.ibm.com/software/data/db2/zos/">DB2 UDB zSeries</a>, <a href="http://db.apache.org/derby/">Derby</a>, <a href="http://www-306.ibm.com/software/data/informix/">Informix</a>, <a href="http://www.mysql.com">MySQL</a>, <a href="http://www.oracle.com/database/index.html">Oracle</a>, <a href="http://www.microsoft.com/sql/default.mspx">Microsoft® SQL Server</a> and <a href="http://www.sybase.com/">Sybase</a>), which is great.</p>
<p>You can work on the logical model and the physical model separately, and you can transform a logical model into a physical model or vice versa. This sounds good, because it lets you separate roles while reusing earlier work: analysts design the application&#8217;s logical model, and developers implement the physical model for the target <acronym title="Database Management System">DBMS</acronym>, building on the analyst&#8217;s work. The only problem is that the two models are not kept synchronized, so if the analyst changes an entity in the logical model, the change is not propagated to the physical model, and this could become a mess.</p>
<p>It&#8217;s very easy to reverse engineer a database, as I have shown in this tutorial. The biggest challenge in obtaining a data model through reverse engineering is discovering relationships. This is hard work, because many applications&#8217; database designs don&#8217;t use foreign keys (WordPress is one example) and tend not to use consistent column names. RDA has a discover function that can help you find the matching elements automatically so that you don&#8217;t have to specify them manually. I need to look deeper into this functionality, but I don&#8217;t expect it to succeed completely, as many relationships are defined in the application code, not in the database.</p>
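<p>To show why name-based discovery can only go so far, here is a deliberately naive Python sketch. It is not RDA&#8217;s discover function, whose algorithm I have not examined; the heuristic (a child column whose name ends with the singular parent table name plus <code>_id</code>) and the function names are my own assumptions. The table and column names are a subset of the relationships listed in the tutorial.</p>

```python
# Primary keys and candidate child columns, taken from the
# relationships listed in the tutorial (a subset, for brevity).
PRIMARY_KEYS = {
    "wp_posts": "ID",
    "wp_comments": "comment_ID",
    "wp_users": "ID",
    "wp_categories": "cat_ID",
}
COLUMNS = {
    "wp_posts": ["post_author", "post_parent"],
    "wp_comments": ["comment_post_ID", "comment_parent", "user_id"],
    "wp_post2cat": ["post_id", "category_id"],
    "wp_postmeta": ["post_id"],
    "wp_usermeta": ["user_id"],
}


def singular(table, prefix="wp_"):
    """Naive singular form: drop the table prefix and a trailing 's'."""
    name = table[len(prefix):] if table.startswith(prefix) else table
    return name[:-1] if name.endswith("s") else name


def candidate_foreign_keys(primary_keys, columns):
    """Guess relationships purely from column names."""
    found = []
    for parent in sorted(primary_keys):
        suffix = singular(parent) + "_id"
        for child in sorted(columns):
            for column in columns[child]:
                if column.lower().endswith(suffix):
                    found.append((parent, primary_keys[parent], child, column))
    return found


found = candidate_foreign_keys(PRIMARY_KEYS, COLUMNS)
for parent, key, child, column in found:
    print("{0}[{1}] -- {2}[{3}]".format(parent, key, child, column))
```

<p>On this sample the heuristic finds the <code>post_id</code>, <code>comment_post_ID</code> and <code>user_id</code> columns, but it misses <code>post_author</code>, <code>post_parent</code>, <code>comment_parent</code> and <code>category_id</code>, which is exactly the kind of relationship that only the application code knows about.</p>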
<p>It would be very useful if, when you create a new relationship between two entities/tables, you could also specify the child column. In the tutorial example, I had to delete the column created automatically in the child table and manually assign the relationship between the primary key in the parent table and the foreign key in the child table. In my opinion, these steps could be avoided if the tool let you specify the target child column.</p>
<p>RDA can be installed on top of an existing Eclipse 3.2 environment, or it can install its own Eclipse 3.2 instance. I didn&#8217;t find any list of plug-in dependencies in the installation guide, so I tried to install RDA on top of my existing Eclipse 3.2 (with BIRT, CDT, DTP, PHP, TPTP and WTP). The plug-in worked fine, except for a few functions such as report generation and the XML Schema Validator. This is a problem that I have had with lots of Eclipse plug-ins. If you don&#8217;t want to deal with plug-in dependency problems, my recommendation is to install RDA with its own Eclipse instance.</p>
<p>I have also found some bugs:</p>
<ul>
<li>You cannot specify a precision for the BIGINT numeric type, even though MySQL lets you specify one.</li>
<li>RDA doesn&#8217;t support the MySQL <b>ENUM</b> type. ENUM is not part of the SQL standard (the portable alternatives are a separate table that maps the allowed values, or a check constraint), but if the RDA brochure says it supports MySQL, then it should support the ENUM type.</li>
<li>When the physical model contains the previous problem, RDA doesn&#8217;t generate any DDL, even though no error is shown in the Problems tab and no message appears in the Generate DDL wizard. This drove me crazy until I found the cause. A model validation utility would be very useful.</li>
</ul>
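<p>The check-constraint alternative to ENUM mentioned above can be sketched as follows. This is a hypothetical helper, not RDA output: the function name <code>enum_to_check</code>, the column name and its values are examples chosen for illustration. Note also that MySQL 5.0 parses CHECK clauses but does not enforce them, so the result is mainly useful in the model or on other database servers.</p>

```python
def enum_to_check(column, values):
    """Rewrite an ENUM column as VARCHAR plus a CHECK constraint.

    The VARCHAR length is the length of the longest allowed value.
    Single quotes inside values are doubled, as SQL string literals require.
    """
    if not values:
        raise ValueError("an ENUM needs at least one value")
    longest = max(len(value) for value in values)
    quoted = ", ".join("'" + value.replace("'", "''") + "'" for value in values)
    return "{0} VARCHAR({1}) CHECK ({0} IN ({2}))".format(column, longest, quoted)


# Illustrative column and values only.
print(enum_to_check("post_status", ["draft", "publish"]))
```

<p>For the sample column this produces <code>post_status VARCHAR(7) CHECK (post_status IN ('draft', 'publish'))</code>, a definition that RDA can model with standard types.</p>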
<p>Despite these bugs, Rational Data Architect made a very good impression on me.</p>
<h3><a id="references">References</a></h3>
<ul>
<li>&#8220;<a href="http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0701liu/">Use Rational Data Architect to define and enforce data object naming standards</a>&#8221; (developerWorks, Jan. 2007) examines the features of IBM Rational Data Architect that enable users to define and implement object naming standards, and then demonstrates with a real-world example.</li>
<li>RDA skills series at <a href="http://www-128.ibm.com/developerworks">developerWorks</a>:</li>
<ul>
<li>&#8220;<a href="http://www-128.ibm.com/developerworks/edu/ar-dw-ar-rdamap.html">Rational Data Architect skills series, Part 3: Discover schema relationships with Rational Data Architect</a>&#8221; (developerWorks, Dec. 2006) describes how to create schema mappings semi-automatically.</li>
<li>&#8220;<a href="http://www-128.ibm.com/developerworks/edu/dm-dw-dm-0609bittner-i.html">Rational Data Architect skills series, Part 2: Generate SQL/XML queries with Rational Data Architect</a>&#8221; (developerWorks, Sep. 2006) describes how to transform data from relational data sources into XML format.</li>
<li>&#8220;<a href="http://www-128.ibm.com/developerworks/edu/ar-dw-ar-wbirda.html">Rational Data Architect skills series, Part 1: Access and integrate enterprise metadata with Rational Data Architect</a>&#8221; (developerWorks, Jul. 2006) describes how to create a unified view across heterogeneous data sources.</li>
</ul>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.rodenas.org/blog/2007/01/22/reverse-engineer-the-wordpress-database-with-rda/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
