<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Data Warehouse Library</title>
	<atom:link href="http://dwlib.com/feed" rel="self" type="application/rss+xml" />
	<link>http://dwlib.com</link>
	<description>Efficiently improve relevant decision making</description>
	<lastBuildDate>Sun, 30 Oct 2011 04:43:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Naming Convention</title>
		<link>http://dwlib.com/archives/163</link>
		<comments>http://dwlib.com/archives/163#comments</comments>
		<pubDate>Sat, 24 Sep 2011 21:37:29 +0000</pubDate>
		<dc:creator>Arnoud</dc:creator>
				<category><![CDATA[Patterns]]></category>
		<category><![CDATA[abbreviation]]></category>
		<category><![CDATA[convention]]></category>
		<category><![CDATA[name]]></category>
		<category><![CDATA[naming]]></category>
		<category><![CDATA[standards]]></category>

		<guid isPermaLink="false">http://dwlib.com/?p=163</guid>
		<description><![CDATA[A system needs to find and address different objects efficiently and accurately while remaining easy to understand, develop and keep up. Names are primarily a handle for people, as systems do not value the difference between OBJ1234 and CUST_NM. So, &#8230; <a href="http://dwlib.com/archives/163">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A system needs to find and address different objects efficiently and accurately while remaining easy to understand, develop and maintain.</p>
<p><div class="problem" style="">You need a scheme that allows you to uniquely reference each object in a way that is both machine and human readable. The scheme describes each object within the business domain the system supports.</div><span id="more-163"></span></p>
<div class="solution" style="">Come up with a naming convention that names objects for their content, and possibly their role, within the business domain. Avoid naming based on internal workings of the object, such as SLS_ORDR_<em>VW</em>, <em>CRC32</em>_CHKSUM, or <em>int</em>PersonWeight. <a  title="Namespace - Wikipedia" href="http://en.wikipedia.org/wiki/Namespace">Namespaces</a> (database, schema, table, etc.) provide context to underlying objects; do not cram that context into an object&#8217;s name. Unit of measure is not an internal aspect of an object, because it changes the meaning of the content the object represents.</p>
<p><a  href="http://dwlib.com/wp-content/plugins/ebnfer/images/04408112764c3386b23b80583ba8b69a.png"><img title="Name Syntax Example" alt="ColumnName = (DescriptionContext) Description (DomainContext) Domain" src="http://dwlib.com/wp-content/plugins/ebnfer/images/04408112764c3386b23b80583ba8b69a.png" /></a></p>
<p>Decide on a name syntax for objects; it describes how to form names. While different object types can have different name syntaxes, keep the number of syntaxes small: strive for three or fewer. Also choose an abbreviation scheme where necessary. Finally, and most importantly, <span class="pullquote" style="">apply your naming convention consistently everywhere</span>.</div>
<p>Names are primarily a handle for people, as systems do not value the difference between OBJ1234 and CUST_NM. So, good names instantly convey meaning that otherwise is only available through tacit knowledge or locked away in a document somewhere.</p>
<p>A naming convention that changes names when the implementation changes demands that everyone referring to an object dances to its tune. It creates unnecessary <a  title="Coupling (computer programming) - Wikipedia" href="http://en.wikipedia.org/wiki/Coupling_(computer_programming)">coupling</a> and makes the object name a &#8220;kitchen sink&#8221; of different meanings. This includes less obvious cases such as _DIM and _FACT suffixes, as Jeff Kanel commented in the <a  title="TDWI's Business Intelligence and Data Warehousing Discussion Group - LinkedIn" href="http://www.linkedin.com/groups?home=&#038;gid=45685&#038;trk=anet_ug_hm">TDWI BI and DW Discussion group on LinkedIn</a>. For example: Customer is hardly a fact, just as Sales Order is hardly a dimension. Sales Order Status? Clearly a dimension.</p>
<p>If _KEY means <a  title="Surrogate Key - Wikipedia" href="http://en.wikipedia.org/wiki/Surrogate_key">surrogate key</a>, that indicates content; if _KEY means primary key, that indicates function. The sole purpose of a surrogate key is to substitute for a primary key, so content follows function and the separation becomes difficult to see.</p>
<p>If namespaces do not give enough context to properly name the object, it might become necessary to <span class="pullquote" style="">add the role to the name, <em>not</em> the function</span>. Imagine an employee table with an EMPLOYEE_KEY that needs a self-referencing foreign key to document the organizational structure. Name it MANAGER_EMPLOYEE_KEY to show its role, rather than EMPLOYEE_FKEY, which shows its function, or MANAGER_KEY, where the manager role replaces the employee meaning.</p>
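<p>The employee example can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module; the EMPLOYEE_NM column and the sample rows are assumptions added for the example, not part of the pattern.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (
        EMPLOYEE_KEY          INTEGER PRIMARY KEY,
        EMPLOYEE_NM           TEXT NOT NULL,  -- hypothetical name column
        -- The role (manager) prefixes the meaning (employee key);
        -- nothing in the name says "foreign key".
        MANAGER_EMPLOYEE_KEY  INTEGER REFERENCES EMPLOYEE (EMPLOYEE_KEY)
    );
    INSERT INTO EMPLOYEE VALUES (1, 'Ada', NULL);
    INSERT INTO EMPLOYEE VALUES (2, 'Grace', 1);
    INSERT INTO EMPLOYEE VALUES (3, 'Edgar', 1);
""")

# The join condition reads as a sentence: the manager's employee key
# of one row equals the employee key of another row.
reports = conn.execute("""
    SELECT e.EMPLOYEE_NM, m.EMPLOYEE_NM
    FROM EMPLOYEE AS e
    JOIN EMPLOYEE AS m ON m.EMPLOYEE_KEY = e.MANAGER_EMPLOYEE_KEY
    ORDER BY e.EMPLOYEE_KEY
""").fetchall()
print(reports)  # [('Grace', 'Ada'), ('Edgar', 'Ada')]
```

<p>Because the column keeps the EMPLOYEE_KEY meaning in its name, the join partner is obvious without consulting a data dictionary.</p>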
<p>As a last comment, consistency and good naming patterns allow logic that can make decisions based on the object name. Your solutions can become more flexible at the cost of a more disciplined naming convention.</p>
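<p>As a rough sketch of such name-driven logic, the snippet below picks a treatment for a column from the trailing token of its name. The abbreviations (KEY, NM, AMT, DT) and the treatments are hypothetical; a real convention defines its own list.</p>

```python
# Hypothetical domain abbreviations and what a tool might do with them.
DOMAIN_RULES = {
    "KEY": "join column, exclude from measures",
    "NM": "descriptive text, show as label",
    "AMT": "additive measure, aggregate with SUM",
    "DT": "date, offer calendar filters",
}


def domain_of(column_name: str) -> str:
    """Return the trailing token of an underscore-separated name."""
    return column_name.upper().rsplit("_", 1)[-1]


def treatment_for(column_name: str) -> str:
    """Pick a treatment from the name alone; unknown domains are flagged."""
    return DOMAIN_RULES.get(domain_of(column_name), "unknown domain, review name")


for name in ("CUST_NM", "MANAGER_EMPLOYEE_KEY", "SLS_ORDR_AMT", "OBJ1234"):
    print(name, "->", treatment_for(name))
```

<p>The lookup only works when every name follows the syntax, which is the discipline the flexibility is paid for with.</p>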
<p>
<table id="wp-table-reloaded-id-2-no-1" class="wp-table-reloaded wp-table-reloaded-id-2">
<thead>
	<tr class="row-1 odd">
		<th colspan="3" class="column-1 colspan-3"><strong>Principle Trade-offs</strong></th>
	</tr>
</thead>
<tbody class="row-hover">
	<tr class="row-2 even">
		<td class="column-1"><a  href="http://dwlib.com/principles#Extend_your_understanding-1">Extend your understanding</a></td><td class="column-2"><strong>&lt;</strong></td><td class="column-3"><a  href="http://dwlib.com/principles#Know_your_limits-2">Know your limits</a></td>
	</tr>
	<tr class="row-3 odd">
		<td class="column-1"><a  href="http://dwlib.com/principles#Stay_relevant-3">Stay relevant</a></td><td class="column-2"><strong>&lt;</strong></td><td class="column-3"><a  href="http://dwlib.com/principles#Build_enduring-4">Build enduring</a></td>
	</tr>
	<tr class="row-4 even">
		<td class="column-1"><a  href="http://dwlib.com/principles#Do_right-5">Do right</a></td><td class="column-2"><strong>=</strong></td><td class="column-3"><a  href="http://dwlib.com/principles#Share-6">Share</a></td>
	</tr>
</tbody>
</table>
</p>]]></content:encoded>
			<wfw:commentRss>http://dwlib.com/archives/163/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Period</title>
		<link>http://dwlib.com/archives/161</link>
		<comments>http://dwlib.com/archives/161#comments</comments>
		<pubDate>Thu, 15 Sep 2011 04:25:23 +0000</pubDate>
		<dc:creator>Arnoud</dc:creator>
				<category><![CDATA[Patterns]]></category>
		<category><![CDATA[date]]></category>
		<category><![CDATA[interval]]></category>
		<category><![CDATA[period]]></category>
		<category><![CDATA[range]]></category>
		<category><![CDATA[time]]></category>

		<guid isPermaLink="false">http://dwlib.com/?p=161</guid>
		<description><![CDATA[A business needs to track periods of time. This is a special case of the Range pattern, where the data values are date and time related. A period can be constructed with a single boundary and an interval. As most &#8230; <a href="http://dwlib.com/archives/161">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A business needs to track periods of time. This is a special case of the <a title="Range" href="http://dwlib.com/archives/118">Range</a> pattern, where the data values are date and time related.</p>
<p>A period can be constructed with a single boundary and an interval. As most actions with periods rely on testing their boundaries, this approach would require recalculating the opposite boundary every time. Conversely, the interval can always be calculated from the two period boundaries and is used less often.</p>
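<p>A minimal sketch of a period stored as two boundaries, with the interval derived on demand. The class and attribute names are illustrative assumptions; the inclusive lower and exclusive upper boundary follow the Range pattern's [closed-open) convention.</p>

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Period:
    since: datetime  # inclusive lower boundary
    until: datetime  # exclusive upper boundary

    def __post_init__(self):
        if self.until <= self.since:
            raise ValueError("until must be later than since")

    @property
    def interval(self) -> timedelta:
        # Derived only when asked for; no stored value to keep in sync.
        return self.until - self.since

    def contains(self, moment: datetime) -> bool:
        # Boundary test needs no calculation of the opposite boundary.
        return self.since <= moment < self.until


september = Period(datetime(2011, 9, 1), datetime(2011, 10, 1))
print(september.interval.days)                                 # 30
print(september.contains(datetime(2011, 9, 30, 23, 59, 59)))   # True
print(september.contains(datetime(2011, 10, 1)))               # False
```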
<p><span id="more-161"></span></p>
<h2>Range pattern</h2>
<p><p>A business needs to bucket sets of consecutive, <a  title="Level of Measurement - Ordinal scale" href="http://en.wikipedia.org/wiki/Level_of_measurement#Ordinal_scale">ordinal</a>, data for aggregation purposes like age group, salary band, <a  title="Period" href="http://dwlib.com/archives/161">period</a>, life cycle phase, and so on. This is different from grouping non-consecutive, <a  title="Level of Measurement - Nominal Scale" href="http://en.wikipedia.org/wiki/Level_of_measurement#Nominal_scale">nominal</a>, data into buckets, such as hierarchies.</p>
<p><div class="problem" style="">You need to store a complete definition of all buckets and the values they contain, while allowing for minimum and maximum values and gaps between buckets. Bucket boundaries can change regularly, while the precision and scale of boundaries changes at a slower rate.</div><!--more--></p>
<div class="solution" style="">Create <span class="pullquote" style="">a range with an inclusive lower boundary and an exclusive upper boundary</span>, also known as [closed-open). Two boundaries allow you to handle extremities and gaps between ranges. System maximums for the extremities can cause overflow errors when used in calculations. Negative and positive <a  title="Infinity" href="http://en.wikipedia.org/wiki/Infinity">infinities</a> need special values; check whether your chosen infinity values require <a  title="Three-valued logic" href="http://en.wikipedia.org/wiki/Three-valued_logic">three-valued logic</a>, i.e. <a  title="NULL (SQL)" href="http://en.wikipedia.org/wiki/Null_(SQL)">NULL</a>s.</p>
<p>Modify each boundary name to show the inclusive or exclusive state, e.g. since/until for periods and from/up to for numbers. Boundaries hold real data values or the data's sorting values: e.g. January through December are ordinal if you use numbers 1 through 12 to sort them. Identify the correct data type, and its precision and scale, or grain.</p>
<p>Enforce uniqueness if overlap between ranges cannot be tolerated; this is <em>not</em> easy.</div>
<p>The exclusive upper boundary removes +/-1 calculations, except for ranges the size of a single data point (the grain). This allows data types to change to a finer grain, for example DECIMAL(6,2) becoming DECIMAL(8,4), without having to update code or values, because <a  title="Snodgrass (1999, p.91). Developing Time-Oriented Database Applications in SQL. San Francisco, California: Morgan Kaufmann Publishers." href="http://www.cs.arizona.edu/people/rts/tdbbook.pdf">the niggling +/-1 is obsolete</a><a  class="fn-ref-mark" href="#footnote-1" id="refmark-1"><sup>[1]</sup></a>.</p>
<p>The <span class="pullquote" style="">Range pattern cannot use the <acronym title="American National Standards Institute">ANSI</acronym>-<acronym title="Structured Query Language">SQL</acronym> BETWEEN operator</span>, which compares a single data point against a [closed-closed] range. Instead, write out the greater-than-or-equal-to and less-than logic; most <acronym title="Relational DataBase Management System">RDBMS</acronym>s also allow <a  title="Function Overloading" href="http://en.wikipedia.org/wiki/Function_overloading">overloaded</a>, <a  title="Deterministic Algorithm" href="http://en.wikipedia.org/wiki/Deterministic_algorithm">deterministic</a> functions as an alternative. Three-valued logic, if needed, would invalidate BETWEEN as well.</p>
<p>As a final consequence, joins between consecutive ranges are straight <a  title="Join (SQL)" href="http://en.wikipedia.org/wiki/Join_(SQL)">equi-joins</a> without calculations, so both sides of the join can benefit from indexes. And you can query a continuous chain of ranges with a recursive <acronym title="Common Table Expression  or WITH clause">CTE</acronym>.<a  class="fn-ref-mark" href="#footnote-2" id="refmark-2"><sup>[2]</sup></a></p>
<p>
<table id="wp-table-reloaded-id-1-no-1" class="wp-table-reloaded wp-table-reloaded-id-1">
<thead>
	<tr class="row-1 odd">
		<th colspan="3" class="column-1 colspan-3"><strong>Principle Trade-offs</strong></th>
	</tr>
</thead>
<tbody class="row-hover">
	<tr class="row-2 even">
		<td class="column-1"><a  href="http://dwlib.com/principles#Extend_your_understanding-1">Extend your understanding</a></td><td class="column-2"><strong>></strong></td><td class="column-3"><a  href="http://dwlib.com/principles#Know_your_limits-2">Know your limits</a></td>
	</tr>
	<tr class="row-3 odd">
		<td class="column-1"><a  href="http://dwlib.com/principles#Stay_relevant-3">Stay relevant</a></td><td class="column-2"><strong>&lt;</strong></td><td class="column-3"><a  href="http://dwlib.com/principles#Build_enduring-4">Build enduring</a></td>
	</tr>
	<tr class="row-4 even">
		<td class="column-1"><a  href="http://dwlib.com/principles#Do_right-5">Do right</a></td><td class="column-2"><strong>=</strong></td><td class="column-3"><a  href="http://dwlib.com/principles#Share-6">Share</a></td>
	</tr>
</tbody>
</table>
</p>
<div id="footnote-list" style="display:inherit"><span id=fn-heading>Footnotes</span> &nbsp;&nbsp;&nbsp;(&uarr; returns to text)
<ol>
<li id="footnote-1" class="fn-text">If you like <a  href="http://www.amazon.com/gp/product/1558604367?ie=UTF8&#038;tag=datawarelibr-20&#038;linkCode=as2&#038;camp=1634&#038;creative=6738&#038;creativeASIN=1558604367">a paper copy</a> better…<a  href="#refmark-1">&uarr;</a></li>
<li id="footnote-2" class="fn-text">See also <a  title="Common Table Expressions" href="http://en.wikipedia.org/wiki/Common_table_expressions">Common Table Expressions</a> and <a  title="Hierarchical Query" href="http://en.wikipedia.org/wiki/Hierarchical_query">Hierarchical Query</a><a  href="#refmark-2">&uarr;</a></li>
</ol>
</div></p>]]></content:encoded>
			<wfw:commentRss>http://dwlib.com/archives/161/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Range</title>
		<link>http://dwlib.com/archives/118</link>
		<comments>http://dwlib.com/archives/118#comments</comments>
		<pubDate>Sun, 04 Sep 2011 23:27:49 +0000</pubDate>
		<dc:creator>Arnoud</dc:creator>
				<category><![CDATA[Patterns]]></category>
		<category><![CDATA[band]]></category>
		<category><![CDATA[bucket]]></category>
		<category><![CDATA[period]]></category>
		<category><![CDATA[range]]></category>

		<guid isPermaLink="false">http://dwlib.com/?p=118</guid>
		<description><![CDATA[A business needs to bucket sets of consecutive, ordinal, data for aggregation purposes like age group, salary band, period, life cycle phase, and so on. This is different from grouping non-consecutive, nominal, data into buckets, such as hierarchies. The exclusive &#8230; <a href="http://dwlib.com/archives/118">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A business needs to bucket sets of consecutive, <a  title="Level of Measurement - Ordinal scale" href="http://en.wikipedia.org/wiki/Level_of_measurement#Ordinal_scale">ordinal</a>, data for aggregation purposes like age group, salary band, <a  title="Period" href="http://dwlib.com/archives/161">period</a>, life cycle phase, and so on. This is different from grouping non-consecutive, <a  title="Level of Measurement - Nominal Scale" href="http://en.wikipedia.org/wiki/Level_of_measurement#Nominal_scale">nominal</a>, data into buckets, such as hierarchies.</p>
<p><div class="problem" style="">You need to store a complete definition of all buckets and the values they contain, while allowing for minimum and maximum values and gaps between buckets. Bucket boundaries can change regularly, while the precision and scale of boundaries changes at a slower rate.</div><span id="more-118"></span></p>
<div class="solution" style="">Create <span class="pullquote" style="">a range with an inclusive lower boundary and an exclusive upper boundary</span>, also known as [closed-open). Two boundaries allow you to handle extremities and gaps between ranges. System maximums for the extremities can cause overflow errors when used in calculations. Negative and positive <a  title="Infinity" href="http://en.wikipedia.org/wiki/Infinity">infinities</a> need special values; check whether your chosen infinity values require <a  title="Three-valued logic" href="http://en.wikipedia.org/wiki/Three-valued_logic">three-valued logic</a>, i.e. <a  title="NULL (SQL)" href="http://en.wikipedia.org/wiki/Null_(SQL)">NULL</a>s.</p>
<p>Modify each boundary name to show the inclusive or exclusive state, e.g. since/until for periods and from/up to for numbers. Boundaries hold real data values or the data's sorting values: e.g. January through December are ordinal if you use numbers 1 through 12 to sort them. Identify the correct data type, and its precision and scale, or grain.</p>
<p>Enforce uniqueness if overlap between ranges cannot be tolerated; this is <em>not</em> easy.</div>
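<p>One way to at least detect overlap is a check in application code before ranges are stored. The sketch below is an assumption about how such a check could look, not a database constraint; the function name and the age bands are made up for illustration.</p>

```python
def find_overlaps(ranges):
    """Return the closed-open (lower, upper) ranges that start inside an
    earlier range. Touching ranges (upper == next lower) and gaps are fine."""
    offenders = []
    highest_upper = None
    for lower, upper in sorted(ranges):
        # Strictly less-than: the upper boundary is exclusive, so a range
        # may start exactly where the previous one ends.
        if highest_upper is not None and lower < highest_upper:
            offenders.append((lower, upper))
        if highest_upper is None or upper > highest_upper:
            highest_upper = upper
    return offenders


age_bands = [(0, 18), (18, 35), (35, 65)]
print(find_overlaps(age_bands))               # []
print(find_overlaps(age_bands + [(30, 40)]))  # [(30, 40), (35, 65)]
```

<p>Tracking the highest upper boundary seen so far, rather than only the previous range, also catches a short range nested inside a long one.</p>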
<p>The exclusive upper boundary removes +/-1 calculations, except for ranges the size of a single data point (the grain). This allows data types to change to a finer grain, for example DECIMAL(6,2) becoming DECIMAL(8,4), without having to update code or values, because <a  title="Snodgrass (1999, p.91). Developing Time-Oriented Database Applications in SQL. San Francisco, California: Morgan Kaufmann Publishers." href="http://www.cs.arizona.edu/people/rts/tdbbook.pdf">the niggling +/-1 is obsolete</a><a  class="fn-ref-mark" href="#footnote-1" id="refmark-1"><sup>[1]</sup></a>.</p>
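<p>To illustrate the grain argument, the sketch below compares two adjacent bands stored as [closed-closed] and as [closed-open) once a value at a finer grain shows up. The band values are invented for the example, and Python's Decimal stands in for the SQL DECIMAL type.</p>

```python
from decimal import Decimal


def in_closed_closed(value, lower, upper):
    return lower <= value <= upper


def in_closed_open(value, lower, upper):
    return lower <= value < upper


# Two adjacent bands defined at grain 0.01, as with DECIMAL(6,2).
closed_closed = [
    (Decimal("0.00"), Decimal("9.99")),
    (Decimal("10.00"), Decimal("19.99")),
]
closed_open = [
    (Decimal("0.00"), Decimal("10.00")),
    (Decimal("10.00"), Decimal("20.00")),
]

# A value that only exists at the finer grain 0.0001, as with DECIMAL(8,4).
value = Decimal("9.9950")

hits_cc = [r for r in closed_closed if in_closed_closed(value, *r)]
hits_co = [r for r in closed_open if in_closed_open(value, *r)]
print(len(hits_cc))  # 0: the value falls between 9.99 and 10.00
print(len(hits_co))  # 1: the unchanged boundaries still cover it
```

<p>The [closed-closed] bands would need every upper boundary rewritten to 9.9999 and 19.9999; the [closed-open) bands need no change.</p>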
<p>The <span class="pullquote" style="">Range pattern cannot use the <acronym title="American National Standards Institute">ANSI</acronym>-<acronym title="Structured Query Language">SQL</acronym> BETWEEN operator</span>, which compares a single data point against a [closed-closed] range. Instead, write out the greater-than-or-equal-to and less-than logic; most <acronym title="Relational DataBase Management System">RDBMS</acronym>s also allow <a  title="Function Overloading" href="http://en.wikipedia.org/wiki/Function_overloading">overloaded</a>, <a  title="Deterministic Algorithm" href="http://en.wikipedia.org/wiki/Deterministic_algorithm">deterministic</a> functions as an alternative. Three-valued logic, if needed, would invalidate BETWEEN as well.</p>
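<p>The difference shows up on a boundary value. The sketch below runs both predicates in SQLite through Python's sqlite3 module; the AGE_GROUP table, its column names and its rows are assumptions made for the example.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AGE_GROUP (
        AGE_GROUP_NM  TEXT PRIMARY KEY,
        AGE_FROM      INTEGER NOT NULL,  -- inclusive
        AGE_UP_TO     INTEGER NOT NULL   -- exclusive
    );
    INSERT INTO AGE_GROUP VALUES ('Minor', 0, 18);
    INSERT INTO AGE_GROUP VALUES ('Adult', 18, 65);
    INSERT INTO AGE_GROUP VALUES ('Senior', 65, 150);
""")

age = 18  # sits exactly on a shared boundary

# BETWEEN treats both boundaries as inclusive.
with_between = conn.execute(
    "SELECT AGE_GROUP_NM FROM AGE_GROUP "
    "WHERE ? BETWEEN AGE_FROM AND AGE_UP_TO ORDER BY AGE_FROM",
    (age,),
).fetchall()

# Written-out closed-open test.
closed_open = conn.execute(
    "SELECT AGE_GROUP_NM FROM AGE_GROUP "
    "WHERE ? >= AGE_FROM AND ? < AGE_UP_TO ORDER BY AGE_FROM",
    (age, age),
).fetchall()

print(with_between)  # [('Minor',), ('Adult',)]
print(closed_open)   # [('Adult',)]
```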
<p>As a final consequence, joins between consecutive ranges are straight <a  title="Join (SQL)" href="http://en.wikipedia.org/wiki/Join_(SQL)">equi-joins</a> without calculations, so both sides of the join can benefit from indexes. And you can query a continuous chain of ranges with a recursive <acronym title="Common Table Expression  or WITH clause">CTE</acronym>.<a  class="fn-ref-mark" href="#footnote-2" id="refmark-2"><sup>[2]</sup></a></p>
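<p>Both consequences can be tried out in SQLite, which supports recursive CTEs. The sketch below is illustrative only: the SALARY_BAND table, its columns and its rows are assumptions, and the gap before band E is deliberate.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SALARY_BAND (
        BAND_NM       TEXT PRIMARY KEY,
        SALARY_FROM   INTEGER NOT NULL,  -- inclusive
        SALARY_UP_TO  INTEGER NOT NULL   -- exclusive
    );
    INSERT INTO SALARY_BAND VALUES ('A', 0, 30000);
    INSERT INTO SALARY_BAND VALUES ('B', 30000, 60000);
    INSERT INTO SALARY_BAND VALUES ('C', 60000, 90000);
    INSERT INTO SALARY_BAND VALUES ('E', 120000, 150000);
""")

# Consecutive bands: the upper boundary of one equals the lower boundary
# of the next, so the join is a plain equality without +/-1.
neighbours = conn.execute("""
    SELECT cur.BAND_NM, nxt.BAND_NM
    FROM SALARY_BAND AS cur
    JOIN SALARY_BAND AS nxt ON nxt.SALARY_FROM = cur.SALARY_UP_TO
    ORDER BY cur.SALARY_FROM
""").fetchall()

# Walk the unbroken chain that starts at band A; it stops at the gap.
chain = conn.execute("""
    WITH RECURSIVE band_chain (BAND_NM, SALARY_FROM, SALARY_UP_TO) AS (
        SELECT BAND_NM, SALARY_FROM, SALARY_UP_TO
        FROM SALARY_BAND
        WHERE BAND_NM = 'A'
        UNION ALL
        SELECT nxt.BAND_NM, nxt.SALARY_FROM, nxt.SALARY_UP_TO
        FROM SALARY_BAND AS nxt
        JOIN band_chain ON nxt.SALARY_FROM = band_chain.SALARY_UP_TO
    )
    SELECT BAND_NM FROM band_chain ORDER BY SALARY_FROM
""").fetchall()

print(neighbours)  # [('A', 'B'), ('B', 'C')]
print(chain)       # [('A',), ('B',), ('C',)]
```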
<p>
<table id="wp-table-reloaded-id-1-no-2" class="wp-table-reloaded wp-table-reloaded-id-1">
<thead>
	<tr class="row-1 odd">
		<th colspan="3" class="column-1 colspan-3"><strong>Principle Trade-offs</strong></th>
	</tr>
</thead>
<tbody class="row-hover">
	<tr class="row-2 even">
		<td class="column-1"><a  href="http://dwlib.com/principles#Extend_your_understanding-1">Extend your understanding</a></td><td class="column-2"><strong>></strong></td><td class="column-3"><a  href="http://dwlib.com/principles#Know_your_limits-2">Know your limits</a></td>
	</tr>
	<tr class="row-3 odd">
		<td class="column-1"><a  href="http://dwlib.com/principles#Stay_relevant-3">Stay relevant</a></td><td class="column-2"><strong>&lt;</strong></td><td class="column-3"><a  href="http://dwlib.com/principles#Build_enduring-4">Build enduring</a></td>
	</tr>
	<tr class="row-4 even">
		<td class="column-1"><a  href="http://dwlib.com/principles#Do_right-5">Do right</a></td><td class="column-2"><strong>=</strong></td><td class="column-3"><a  href="http://dwlib.com/principles#Share-6">Share</a></td>
	</tr>
</tbody>
</table>
</p>
<div id="footnote-list" style="display:inherit"><span id=fn-heading>Footnotes</span> &nbsp;&nbsp;&nbsp;(&uarr; returns to text)
<ol>
<li id="footnote-1" class="fn-text">If you like <a  href="http://www.amazon.com/gp/product/1558604367?ie=UTF8&#038;tag=datawarelibr-20&#038;linkCode=as2&#038;camp=1634&#038;creative=6738&#038;creativeASIN=1558604367">a paper copy</a> better…<a  href="#refmark-1">&uarr;</a></li>
<li id="footnote-2" class="fn-text">See also <a  title="Common Table Expressions" href="http://en.wikipedia.org/wiki/Common_table_expressions">Common Table Expressions</a> and <a  title="Hierarchical Query" href="http://en.wikipedia.org/wiki/Hierarchical_query">Hierarchical Query</a><a  href="#refmark-2">&uarr;</a></li>
</ol>
</div>]]></content:encoded>
			<wfw:commentRss>http://dwlib.com/archives/118/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

