<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress.com" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>data-structures &amp;laquo; WordPress.com Tag Feed</title>
	<link>http://en.wordpress.com/tag/data-structures/</link>
	<description>Feed of posts on WordPress.com tagged "data-structures"</description>
	<pubDate>Fri, 27 Nov 2009 22:13:22 +0000</pubDate>

	<generator>http://en.wordpress.com/tags/</generator>
	<language>en</language>

<item>
<title><![CDATA[Linked list implementation in C]]></title>
<link>http://ragsagar.wordpress.com/2009/11/26/linked-list-implementation-in-c/</link>
<pubDate>Thu, 26 Nov 2009 13:16:09 +0000</pubDate>
<dc:creator>Rag Sagar.V രാഗ് സാഗര്‍.വി</dc:creator>
<guid>http://ragsagar.wordpress.com/2009/11/26/linked-list-implementation-in-c/</guid>
<description><![CDATA[Just thought of sharing the code I wrote to learn linked list implementation the day before my dat]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>Just thought of sharing the code I wrote to learn linked list implementation the day before my data structures model practical exam.</p>
<pre class="brush: python;">
/*
 *      linkedlist.c
 *
 *      Copyright 2009 Rag Sagar.V &#60;ragsagar@gmail.com&#62;
 *
 *      This program is free software; you can redistribute it and/or modify
 *      it under the terms of the GNU General Public License as published by
 *      the Free Software Foundation; either version 2 of the License, or
 *      (at your option) any later version.
 *
 *      This program is distributed in the hope that it will be useful,
 *      but WITHOUT ANY WARRANTY; without even the implied warranty of
 *      MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *      GNU General Public License for more details.
 *
 *      You should have received a copy of the GNU General Public License
 *      along with this program; if not, write to the Free Software
 *      Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
 *      MA 02110-1301, USA.
 */

#include &#60;stdio.h&#62;
#include &#60;stdlib.h&#62;

typedef struct list
{
	int data;
	struct list *next;
}LIST;
LIST *ptr,*temp,*start=NULL;
int insert_after(int ,int );	/* returns 1 on success, 0 if nothing was inserted */
int remove_item(int );		/* returns 1 if an item was removed, 0 otherwise */
void display(void);
int count=0;
int main()
{
	int item,opt,dat;
	system(&#34;clear&#34;);
	ptr=NULL;
	do
	{
		printf(&#34;\n########## MENU ##########\n&#34;);
		printf(&#34;1.Insert\n2.Remove\n3.Display\n4.Exit\n&#34;);
		printf(&#34;Enter your option : &#34;);
		if(scanf(&#34;%d&#34;,&#38;opt)!=1)
			break;	/* stop on EOF or non-numeric input instead of looping forever */
		switch(opt)
		{
			case 1:
			printf(&#34;Enter the data to insert &#34;);
			scanf(&#34;%d&#34;,&#38;item);
			if(count==0)
			{
				/* empty list: the new node becomes the head */
				ptr = (LIST *)malloc(sizeof(LIST));
				if(ptr==NULL)
				{
					printf(&#34;\nOut of memory\n&#34;);
					break;
				}
				ptr-&#62;next = NULL;
				ptr-&#62;data = item;
				start = ptr;
				count++;
			}
			else
			{
				printf(&#34;Enter the item after which you have to insert new element : &#34;);
				scanf(&#34;%d&#34;,&#38;dat);
				/* count only grows when a node was really linked in */
				if(insert_after(dat,item))
					count++;
				else
					printf(&#34;\nCould not insert after %d (item not found or out of memory)\n&#34;,dat);
			}
			break;
			case 2:
			if(count==0)
			{
				printf(&#34;\nList is empty\n&#34;);
				break;
			}
			printf(&#34;Enter the item to remove : &#34;);
			scanf(&#34;%d&#34;,&#38;item);
			/* count only shrinks when a node was really removed */
			if(remove_item(item))
				count--;
			else
				printf(&#34;\nItem %d not found\n&#34;,item);
			break;
			case 3:
			if(count==0)
			{
				printf(&#34;\nList is empty\n&#34;);
			}
			else
			{
				printf(&#34;List elements are \n&#34;);
				display();
			}
			break;
			case 4: break;
		}
	}while(opt!=4);
	return 0;
}

/* Inserts item after the first node holding data. */
int insert_after(int data, int item)
{
	ptr=start;
	while(ptr!=NULL)
	{
		if(ptr-&#62;data==data)
		{
			/* allocate only once the anchor is found, so nothing leaks on a miss */
			temp=(LIST *)malloc(sizeof(LIST));
			if(temp==NULL)
				return 0;
			temp-&#62;data=item;
			temp-&#62;next=ptr-&#62;next;
			ptr-&#62;next=temp;
			return 1;
		}
		ptr=ptr-&#62;next;
	}
	return 0;
}

/* Removes the first node holding item. */
int remove_item(int item)
{
	ptr=start;
	if(ptr==NULL)
		return 0;
	if(ptr-&#62;data == item)
	{
		start = ptr-&#62;next;
		free(ptr);
		return 1;	/* ptr is dangling now, so do not fall through to the loop */
	}
	while(ptr-&#62;next!=NULL)
	{
		if((ptr-&#62;next)-&#62;data==item)
		{
			temp=ptr-&#62;next;
			ptr-&#62;next=temp-&#62;next;
			free(temp);
			return 1;
		}
		ptr=ptr-&#62;next;
	}
	return 0;
}

void display()
{
	ptr = start;
	while(ptr!=NULL)
	{
		printf(&#34;%d -&#62; &#34;,ptr-&#62;data);
		ptr=ptr-&#62;next;
	}
}			
</pre>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Nostalgia.....and a long awaited comeback]]></title>
<link>http://z0ltan.wordpress.com/2009/11/17/nostalgia-and-a-long-awaited-comeback/</link>
<pubDate>Tue, 17 Nov 2009 06:39:30 +0000</pubDate>
<dc:creator>z0ltan</dc:creator>
<guid>http://z0ltan.wordpress.com/2009/11/17/nostalgia-and-a-long-awaited-comeback/</guid>
<description><![CDATA[Got bored, wanted to test my memory (and latent C skills): #include typedef struct node { char* data]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>Got bored, wanted to test my memory (and latent C skills):</p>
<pre>
#include &#60;stdio.h&#62;
#include &#60;stdlib.h&#62;

typedef struct node {
        char* data;
        struct node* next;
}* NODE;

int main()
{

        NODE root = NULL;
        NODE tail = NULL;

        root = (NODE) malloc(sizeof(struct node));
        tail = (NODE) malloc(sizeof(struct node));

        root-&#62;data = "hello";
        tail-&#62;data = "world";

        root-&#62;next= tail;
        tail-&#62;next = NULL;

        // display the contents of the linked list
        NODE cur = root;
        for(;cur != NULL; cur= cur-&#62;next)
        {
                printf("%s ", cur-&#62;data);
        }

        free(root);
        free(tail);

        printf("\n");

        return 0;
}
</pre>
<p>And it worked. First time around. Nice! <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[This is the post I was planning to write Tuesday]]></title>
<link>http://apriltuesday.wordpress.com/2009/11/13/this-is-the-post-i-was-planning-to-write-tuesday/</link>
<pubDate>Sat, 14 Nov 2009 00:15:17 +0000</pubDate>
<dc:creator>April</dc:creator>
<guid>http://apriltuesday.wordpress.com/2009/11/13/this-is-the-post-i-was-planning-to-write-tuesday/</guid>
<description><![CDATA[But then I got distracted so I&#8217;m writing it now instead, and it necessarily will be a very dif]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>But then I got distracted so I&#8217;m writing it now instead, and it necessarily will be a very different post than it would&#8217;ve been had I gone through with my original plan, because you know, life happens, and by life I mean Wednesday and Thursday and a good bit of Friday too.</p>
<p>So!  First order of business is my health.  I woke up Monday feeling AWESOME BEYOND ALL BELIEF, but in the reality outside my own delusional reality, I didn&#8217;t actually feel fully up to par till, um, Thursday-ish.  Ergo, I suspect it was the sunshine and cupcakes and my burning desire to leave my room that truly inspired my AWESOME BEYOND ALL BELIEF feeling Monday morning.</p>
<p>[Fun fact!  The instant I finished typing the word "sunshine," the chapel bells started playing "Here Comes the Sun."]</p>
<p>Turns out I made an excellent call not going to the health center over the weekend, for a number of reasons.  A number of reasons equaling two.  Reason one!  They apparently stopped actually diagnosing people with the swine, and instead have been just deporting everyone who walks in there with a fever and a bad mood.  Not cool, health services people, not cool.</p>
<p>Reason two!  Because of this, the quarantine mansion, AKA Mt. Hope, AKA Mt. DOOM, is full, so people within like driving distance are getting sent home.  Imagine how that return would go: &#8220;Hey mom!  Nice to see you for the first time since September!  Now care for me, because I have swine flu!&#8221;</p>
<p>[I'm in an all-caps, exclamation point sort of mood.  Could you tell?]</p>
<p>Anyway, this whole hype-enhanced epidemic means that a few of my classes are emptier than usual. This is particularly noticeable in classes like CS, where there are usually only 13 students… four of them girls… three of them absent today… Let&#8217;s just say I felt awfully female. At least my prof&#8217;s female.</p>
<p>Our CS lab for the last couple weeks before Thanksgiving is funny to me, because it&#8217;s basically the infamous GridWorld case study from AP CS, except we&#8217;re writing the code and not just understanding it, and&#8211; well, there are lots of other technical differences that are unimportant because hey, it&#8217;s a bunch of squares and little creatures moving through them and we get to make them battle and that&#8217;s FUN.</p>
<p>Speaking of battling, video games have been introduced to the Pratt 3 common room.  Sleeping and study habits have declined proportionally.  Swearing and machismo have increased proportionally.</p>
<p>Also!  Ben apparently has the magic touch with regards to my computer, because he broke it (i.e. incurred the spontaneous apparition of a black box at various places, plus other bizarre bugs) and fixed it&#8211; all just by hitting random keys.  And it FREAKS ME OUT.</p>
<p>By the way, this is in fact related to the previous discussion of video games, but unfortunately for you who were not involved in last night and its fairly ridiculous discussions, the relation is not topical but temporal.</p>
<p>Also!  Alex and Ben have a way of shouting my name that prevents me from distinguishing the two of them at all, which kind of messes with my mind.  They also have a knack for bothering me about math homework, although their ways of doing so are more distinguishable.  Last night I had the realization of epiphanic proportions that I&#8217;m going to be in linguistics class with <em>both</em> of them next semester.  It will be interesting.</p>
<p>I am not proceeding through my week in an at all efficient (let alone chronological) manner.</p>
<p>Because I should really be wrapping up, but there&#8217;s so much stuff I&#8217;ve missed.  Like my CS prof&#8217;s [AGE REDACTED] birthday party!  And the utter failure of Pratt 3 to produce a birthday party for our man Ian!  And the amazingly hilarious debate between Adams and Garrity about whether the derivative or the integral is superior (somehow mimes and cavemen were involved)!  And how Adams still calls me Emily but at least has apologized, and how I literally ran into Burger after the debate (sorry), and other things relating to <a href="http://apriltuesday.wordpress.com/2009/10/24/fun-story-about-morning/" target="_blank">this day</a>!  And CHEESE!  And LLAMAS!  And BULLDOZERS!</p>
<p>See, this is why I need to blog more frequently.  You need to remind me to blog more frequently.  Yes, YOU do.</p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Is Python Slow?]]></title>
<link>http://hbfs.wordpress.com/2009/11/10/is-python-slow/</link>
<pubDate>Tue, 10 Nov 2009 14:33:29 +0000</pubDate>
<dc:creator>Steven Pigeon</dc:creator>
<guid>http://hbfs.wordpress.com/2009/11/10/is-python-slow/</guid>
<description><![CDATA[Python is a programming language that I learnt somewhat recently (something like 2, 3 years ago) and]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p><a href="http://www.python.org/" target="_blank">Python</a> is a programming language that I learnt somewhat recently (something like 2 or 3 years ago) and that I like very much. It is simple, to the point, and has several <a href="http://en.wikipedia.org/wiki/Functional_programming" target="_blank">functional</a>-like constructs that I am already familiar with. But Python is <em>slow</em> compared to other programming languages. Until now, though, it was unclear to me just how slow; it just felt slow.</p>
<p><a href="http://en.wikipedia.org/wiki/File:Lewis_chess_queen_.jpg"><img src="http://hbfs.wordpress.com/files/2009/09/lewis_chess_queen_.jpg?w=265" alt="Lewis_chess_queen_" title="Lewis_chess_queen_" width="265" height="300" class="aligncenter size-medium wp-image-1670"></a></p>
<p>So I have decided to investigate by comparing the implementation of a simple, compute-bound problem, the <a href="http://en.wikipedia.org/wiki/Eight_queens_puzzle" target="_blank">eight queens puzzle</a> generalized to any board dimensions. This puzzle is most easily solved using, as <a href="http://en.wikipedia.org/wiki/Edsger_Dijkstra" target="_blank">Dijkstra</a> did, a depth-first <a href="http://en.wikipedia.org/wiki/Backtracking" target="_blank">backtracking</a> program, using bitmaps to test rapidly whether or not a square is free of attack<sup><a href="#1">1</a></sup>. I implemented the same program in <a href="http://en.wikipedia.org/wiki/C%2B%2B" target="_blank">C++</a>, Python, and <a href="http://en.wikipedia.org/wiki/Bash" target="_blank">Bash</a>, and got help from friends for the <a href="http://en.wikipedia.org/wiki/C_Sharp_(programming_language)" target="_blank">C#</a> and <a href="http://en.wikipedia.org/wiki/Java_(programming_language)" target="_blank">Java</a> versions<sup><a href="#2">2</a></sup>. I then compared the resulting speeds.</p>
<p><!--more--></p>
<p>The eight queens puzzle consists in finding a way to place 8 queens on an 8&#215;8 chessboard so that no queen checks another. By &#8220;checking&#8221;, we mean that if the other queen were of the opposing camp, you could capture it on your turn. This means that no queen is on the same row, column, or diagonal as another queen. Using a real chessboard and pawns in lieu of queens, you can easily find a solution in a few seconds. Now, to make things more interesting, we might want to enumerate <em>all</em> solutions (neglecting, for the time being, to check whether a solution is a rotated or mirrored version of another solution).</p>
<p>The basic algorithm to solve the Eight queens puzzle is a relatively simple recursive algorithm. First, we place a queen somewhere on the first row and we mark the row it occupies as menaced. We also do so with the column and two diagonals. We then try to place a second queen somewhere on the second row, on a square that is menace-free, and mark the row, column, and diagonals of the new queen as menaced. And we proceed in the same fashion for other queens. But suppose that at the <img src='http://l.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' />th stage, we cannot find a menace-free square, preventing us from placing the <img src='http://l.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' />th queen. If the situation arises, we give up for the <img src='http://l.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' />th queen and <em>backtrack</em> to the <img src='http://l.wordpress.com/latex.php?latex=%28n-1%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(n-1)' title='(n-1)' class='latex' />th queen. We try a new (and never tried before) location for the <img src='http://l.wordpress.com/latex.php?latex=%28n-1%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(n-1)' title='(n-1)' class='latex' />th queen and we go forth trying to place the <img src='http://l.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' />th queen. 
It may happen that we go all the way back to the <img src='http://l.wordpress.com/latex.php?latex=%28n-k%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(n-k)' title='(n-k)' class='latex' />th queen because there are no other solutions for the <img src='http://l.wordpress.com/latex.php?latex=%28n-1%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(n-1)' title='(n-1)' class='latex' />th queen, which requires the <img src='http://l.wordpress.com/latex.php?latex=%28n-2%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(n-2)' title='(n-2)' class='latex' />th queen to be moved, which can also result in a dead-end; and so forth all the way down to the <img src='http://l.wordpress.com/latex.php?latex=%28n-k%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(n-k)' title='(n-k)' class='latex' />th queen.</p>
<p>Because the algorithm proceeds depth-first and can rear back quite a bit in looking for new solutions, it is called a <em>backtracking</em> algorithm.  Backtracking algorithms are key to many <a href="http://en.wikipedia.org/wiki/Artificial_intelligence" target="_blank">artificial intelligence</a> systems like, well, <a href="http://en.wikipedia.org/wiki/Chess_engine" target="_blank">chess programs</a>.</p>
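<p>The procedure described above can be sketched in a few lines. This is only an illustration written for Python 3, not one of the benchmarked programs: the names <tt>count_queens</tt> and <tt>place</tt> are made up for the example, and the board state is kept in three integer bitmaps (columns and the two diagonal directions) in the spirit of the bitmap trick mentioned earlier.</p>

```python
def count_queens(n):
    """Count all placements of n non-attacking queens on an n x n board."""
    full = (1 << n) - 1  # one bit per column; all bits set means every row is filled

    def place(cols, diag_l, diag_r):
        # cols, diag_l, diag_r: bitmaps of the squares menaced in the current row
        if cols == full:
            return 1  # a queen stands in every column, so this is a solution
        total = 0
        free = full & ~(cols | diag_l | diag_r)  # menace-free squares in this row
        while free:
            bit = free & -free  # lowest free square
            free ^= bit         # never try this square again for this row
            # Place the queen, shift the diagonal menaces by one row, and recurse.
            # Returning from the recursive call is the backtracking step.
            total += place(cols | bit,
                           ((diag_l | bit) << 1) & full,
                           (diag_r | bit) >> 1)
        return total

    return place(0, 0, 0)


if __name__ == "__main__":
    for size in (4, 6, 8):
        print(size, count_queens(size))
```

<p>When a row has no menace-free square, the <tt>while</tt> loop simply does not run and the call returns 0, which sends the search back to the previous queen exactly as described in the text.</p>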
<p>So I wrote the program for the <img src='http://l.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' /> queens puzzle for an <img src='http://l.wordpress.com/latex.php?latex=n%5Ctimes%7B%7Dn&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n\times{}n' title='n\times{}n' class='latex' /> board in C quite some time ago (circa 1995), then ported it to C++ more recently (2006). And just for kicks, I decided to port it to different languages, with help from friends for Java and C# (of which I know about zilch). The C++ versions consist of a generic version that accepts the board size as a command-line argument and a <a href="http://en.wikipedia.org/wiki/Constant_folding" target="_blank">constant-propagated</a> version where the board size is determined at compile-time. The Python versions, as there are two of them also, differ in how <em>pythonesque</em> they are. The first variant is a rather literal translation of the C++ program. The second uses pythonesque idioms such as sets rather than bitmaps and turns out to be quite a bit faster, about 40% faster in fact. The Bash version is necessarily rather <em>bash-esque</em> as Bash does not offer anything much more sophisticated than arrays in terms of data structures.</p>
<p>All compiled implementations were built with all optimizations enabled (<tt>-O3</tt>, inlining, interprocedural optimizations, etc.). For C++, I used g++ 4.2.4, for C#, gmcs 1.2.6.0, for Java, gcj 4.2.4, then Python 2.6.2, and, finally, Bash 3.2.39. These are all the latest versions for Ubuntu 8.04 LTS, which doesn&#8217;t mean that they&#8217;re really the latest versions.</p>
<p>So I ran all seven implementations with board sizes ranging from 1 to 15 on an AMD64 4000+ CPU (the numbers are arbitrary; I wanted to get good data but also limit the CPU time spent as the time increases <a href="http://en.wikipedia.org/wiki/Factorial" target="_blank">factorially</a> in board size!). At first, the results are not very informative:</p>
<div id="attachment_1674" class="wp-caption aligncenter" style="width: 310px"><a href="http://hbfs.wordpress.com/files/2009/09/results-linear-scale1.png"><img src="http://hbfs.wordpress.com/files/2009/09/results-linear-scale1.png?w=300" alt="Results, Time, Linear Scale" title="results-linear-scale" width="300" height="251" class="size-medium wp-image-1674" /></a><p class="wp-caption-text">Results, Time, Linear Scale</p></div>
<p>As expected, all times shoot up quite fast, with, unsurprisingly, Bash shooting up fastest, followed by the two Python implementations. At this scale, Java, C#, C++, and C++-fixed (the constant-propagated version) all seem to be more or less similar. To remove the factorial growth (as even with a log plot the data remains rather unclear), I scaled all performances relative to the C++ version. The C++ version is therefore 1.0 regardless of actual run-time; for board size <img src='http://l.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' />, the time needed by Bash, for example, was divided by the time needed by C++ for board size <img src='http://l.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' />. We now get:</p>
<div id="attachment_1687" class="wp-caption aligncenter" style="width: 310px"><a href="http://hbfs.wordpress.com/files/2009/09/results-ratio-linear-scale2.png"><img src="http://hbfs.wordpress.com/files/2009/09/results-ratio-linear-scale2.png?w=300" alt="Speed Ratio, Linear Scale" title="results-ratio-linear-scale" width="300" height="251" class="size-medium wp-image-1687" /></a><p class="wp-caption-text">Speed Ratio, Linear Scale</p></div>
<p>We can use a log-scale for the <img src='http://l.wordpress.com/latex.php?latex=y&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='y' title='y' class='latex' /> axis to better separate the similar results:</p>
<div id="attachment_1678" class="wp-caption aligncenter" style="width: 310px"><a href="http://hbfs.wordpress.com/files/2009/09/results-ratio-log-scale.png"><img src="http://hbfs.wordpress.com/files/2009/09/results-ratio-log-scale.png?w=300" alt="Speed Ratio, Log Scale" title="results-ratio-log-scale" width="300" height="251" class="size-medium wp-image-1678" /></a><p class="wp-caption-text">Speed Ratio, Log Scale</p></div>
<p>We see three very strange things. The first is Bash shooting up wildly. The second is that the C# relative time <em>goes down</em> with the increasing board size. The third is that before <img src='http://l.wordpress.com/latex.php?latex=n%3D8&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n=8' title='n=8' class='latex' />, the results are fluctuating. The first is easily explained: Bash is incredibly slow, about 10,000 times as slow as the optimized compiled C++ version. It is so slow that the last two data points are <em>estimated</em>, as it would have taken an extra week or so to get them. The second anomaly needs a bit more analysis. Looking at the raw timing data, we can see that the C# version seems to need an extra 7ms regardless of board size. I do not know where that comes from, as the timings do not include things such as program load and initialization, but only the solving of the puzzle itself. It may well be that the JIT takes a while to figure out that the recursive function is expensive, and that the run-time compilation takes 7ms. Anyway, were we to remove this mysterious extra 7ms, the odd behaviour of the C# implementation would vanish. The third anomaly is easily explained by the granularity of the timer and the operating system; below a puzzle size of 8, the times are well under 100 &#956;s for the fastest implementations (C++ and Java). It suffices to move the mouse and generate a couple of interrupts to throw timing off considerably.</p>
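<p>To make the C# argument concrete, here is a small check written for Python 3. It assumes the extra cost is a flat 0.007 s (the constant <tt>OFFSET</tt> is an assumption taken from the &#8220;extra 7ms&#8221; remark, not a measured value) and uses the C++ and C# timings for board sizes 9 to 15 copied from the raw data table further down.</p>

```python
OFFSET = 0.007  # assumed constant C# overhead in seconds (the "extra 7ms")

# board size -> (C++ seconds, C# seconds), copied from the raw data table
timings = {
    9: (0.000569, 0.008452),
    10: (0.002973, 0.012481),
    11: (0.016647, 0.033548),
    12: (0.080705, 0.154490),
    13: (0.448602, 0.877194),
    14: (2.830424, 5.444645),
    15: (18.648028, 35.769138),
}


def ratios(offset):
    """C# time relative to C++ after subtracting a fixed offset from C#."""
    return {n: (cs - offset) / cpp for n, (cpp, cs) in timings.items()}


raw = ratios(0.0)
corrected = ratios(OFFSET)

for n in sorted(timings):
    print("n=%2d  raw ratio %6.2f  corrected ratio %5.2f" % (n, raw[n], corrected[n]))
```

<p>With the offset removed, the ratio no longer starts far above the others at small board sizes, which is consistent with a fixed start-up cost rather than a slower solver.</p>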
<p>Removing the small board sizes yields:</p>
<div id="attachment_1680" class="wp-caption aligncenter" style="width: 310px"><a href="http://hbfs.wordpress.com/files/2009/09/results-ratio-log-scale-wo-outliers.png"><img src="http://hbfs.wordpress.com/files/2009/09/results-ratio-log-scale-wo-outliers.png?w=300" alt="Speed Ratios, Log Scale, Without Outliers" title="results-ratio-log-scale-wo-outliers" width="300" height="251" class="size-medium wp-image-1680" /></a><p class="wp-caption-text">Speed Ratios, Log Scale, Without Outliers</p></div>
<p>which shows that the respective implementations are well behaved (except for C# and its most probably JIT-related extra 7ms).</p>
<p>So we see that Bash is about 10,000 times slower than the optimized C++ version. The constant-propagated version is only a bit (~5%) faster than the generic C++ version. The Java version takes about 1.5&#215; the time of the C++ version, which is way better than I expected; don&#8217;t forget that this is a native-code version of the Java program, which doesn&#8217;t run on the JVM. The C# version is twice as slow as the C++ version, which is somewhat disappointing but not terribly so. The Python versions are 120&#215; (for the more pythonesque version) and 200&#215; slower. That&#8217;s unacceptably slow, especially since the Python programs aren&#8217;t particularly fancy or complex. We do see that using pythonesque idioms yields a nice performance improvement (40%), but that&#8217;s still nowhere near useful.</p>
<p align="center">*<br />*&#8195;*</p>
<p>So what does this tell us? For one thing, that Bash is slow. But that Bash is slow even when it doesn&#8217;t use any external command should not come as a surprise. From what I understand of Bash, data structures are limited to strings and arrays. Lists and strings are the same. Basically, a list is merely a string with items separated by the <tt>IFS</tt> character(s), which causes all array-like accesses to lists to be performed in linear time, as the string is reinterpreted each time given the current <tt>IFS</tt>. So even though a construct such as <tt>${x[i]}</tt> looks like random access, it is not. As for explicitly constructed arrays (as opposed to lists), there seems to be a real random-access capability, but it&#8217;s still very slow. I do not think that Bash uses something like a virtual machine and an internal tokenizer to speed up script interpretation. Maybe that&#8217;d be something to put on the to-do list for Bash 5? In any case, I also learnt that Bash is a lot more generic than I thought.</p>
<p>The other thing is that Python is not a programming language for <a href="http://en.wikipedia.org/wiki/CPU_bound" target="_blank">compute-bound</a> problems. This makes me question how far projects such as <a href="http://www.pygame.org/" target="_blank">Pygame</a> (which aims at developing a cromulent gaming framework for Python) can go. While all of the graphics and sound processing can be off-loaded to third party libraries written in C or assembly language and interacting with the platform-specific drivers, the central problem of driving the game itself remains untouched. How do you provide strong non-player characters/opponents when everything is 100&#215; slower than the equivalent C or C++ program? What about strategy games? How can you build an <a href="http://en.wikipedia.org/wiki/Massively_multiplayer_online_role-playing_game" target="_blank">MMORPG</a> with a Python engine? Is an MMORPG I/O or compute bound? Could you write a championship-grade chess engine in Python?</p>
<p>My guess is that you just can&#8217;t.</p>
<p>And that&#8217;s quite sad because I <em>like</em> Python as a programming language. I used it in a number of (<a href="http://en.wikipedia.org/wiki/I/O_bound" target="_blank">I/O-bound</a>) mini-projects and each time I was delighted with the ease of coding compared to C or C++ (for those specific tasks). It pains me that Python is just too slow to be of any use whatsoever in scientific/high-performance computing. I wish for Python 4 to have a new virtual machine and interpreter to bring Python on par with Java and C#, performance-wise. Better yet, why not have a true native compiler like <tt>gcj</tt> for Python?</p>
<p align="center">*<br />*&#8195;*</p>
<p>I am fully aware that the <img src='http://l.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' /> queens on an <img src='http://l.wordpress.com/latex.php?latex=n%5Ctimes%7B%7Dn&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n\times{}n' title='n\times{}n' class='latex' /> board puzzle is a toy problem of sorts. But its extreme simplicity and its somewhat universal backtracking structure make it an especially adequate toy problem. If a language can&#8217;t handle such a simple problem very well, how can we expect it to be able to scale to much more complex problems like, say, a championship-level chess engine?</p>
<p align="center">*<br />*&#8195;*</p>
<p>The raw data, by board size (do not forget that the last two timings for Bash are estimated). All times are in seconds.</p>
<table border="1">
<tr>
<td>Language</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
<td>5</td>
<td>6</td>
<td>7</td>
<td>8</td>
<td>9</td>
<td>10</td>
<td>11</td>
<td>12</td>
<td>13</td>
<td>14</td>
<td>15</td>
</tr>
<tr>
<td>C++</td>
<td>0.000000</td>
<td>0.000002</td>
<td>0.000003</td>
<td>0.000046</td>
<td>0.000005</td>
<td>0.000012</td>
<td>0.000036</td>
<td>0.000126</td>
<td>0.000569</td>
<td>0.002973</td>
<td>0.016647</td>
<td>0.080705</td>
<td>0.448602</td>
<td>2.830424</td>
<td>18.648028</td>
</tr>
<tr>
<td>C++-fixed</td>
<td>0.000000</td>
<td>0.000001</td>
<td>0.000001</td>
<td>0.000007</td>
<td>0.000005</td>
<td>0.000011</td>
<td>0.000033</td>
<td>0.000104</td>
<td>0.000500</td>
<td>0.002369</td>
<td>0.011468</td>
<td>0.065407</td>
<td>0.462506</td>
<td>2.765098</td>
<td>18.032639</td>
</tr>
<tr>
<td>C#</td>
<td>0.007315</td>
<td>0.007260</td>
<td>0.008356</td>
<td>0.007302</td>
<td>0.007864</td>
<td>0.007451</td>
<td>0.007421</td>
<td>0.008287</td>
<td>0.008452</td>
<td>0.012481</td>
<td>0.033548</td>
<td>0.154490</td>
<td>0.877194</td>
<td>5.444645</td>
<td>35.769138</td>
</tr>
<tr>
<td>Java</td>
<td>0.000001</td>
<td>0.000002</td>
<td>0.000002</td>
<td>0.000003</td>
<td>0.000005</td>
<td>0.000015</td>
<td>0.000049</td>
<td>0.000198</td>
<td>0.000895</td>
<td>0.004244</td>
<td>0.022947</td>
<td>0.128498</td>
<td>0.692349</td>
<td>4.457763</td>
<td>29.321677</td>
</tr>
<tr>
<td>Python</td>
<td>0.000012</td>
<td>0.000029</td>
<td>0.000054</td>
<td>0.000136</td>
<td>0.000424</td>
<td>0.001708</td>
<td>0.006078</td>
<td>0.026174</td>
<td>0.115218</td>
<td>0.546922</td>
<td>2.822554</td>
<td>15.405020</td>
<td>91.170068</td>
<td>581.983903</td>
<td>3762.739785</td>
</tr>
<tr>
<td>Python-2</td>
<td>0.000012</td>
<td>0.000028</td>
<td>0.000045</td>
<td>0.000109</td>
<td>0.000318</td>
<td>0.001275</td>
<td>0.004125</td>
<td>0.017448</td>
<td>0.077854</td>
<td>0.348823</td>
<td>1.701767</td>
<td>9.349661</td>
<td>55.532111</td>
<td>339.244132</td>
<td>2259.010794</td>
</tr>
<tr>
<td>Bash</td>
<td>0.003054</td>
<td>0.010938</td>
<td>0.006067</td>
<td>0.011355</td>
<td>0.026913</td>
<td>0.091869</td>
<td>0.347451</td>
<td>1.472102</td>
<td>6.483076</td>
<td>30.813085</td>
<td>163.061341</td>
<td>891.828031</td>
<td>5031.663741</td>
<td>31746.000000</td>
<td>209162.000000</td>
</tr>
</table>
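<p>As a rough cross-check, the speed ratios quoted earlier can be recomputed from the table. The sketch below (Python 3) uses the column for board size 12, which is an arbitrary choice, and a helper named <tt>relative_to</tt> that is made up for this example.</p>

```python
# Timings in seconds for board size 12, copied from the table above.
timings_n12 = {
    "C++": 0.080705,
    "C++-fixed": 0.065407,
    "C#": 0.154490,
    "Java": 0.128498,
    "Python": 15.405020,
    "Python-2": 9.349661,
    "Bash": 891.828031,
}


def relative_to(baseline, timings):
    """Divide every timing by the baseline's timing, so the baseline becomes 1.0."""
    base = timings[baseline]
    return {name: t / base for name, t in timings.items()}


ratios = relative_to("C++", timings_n12)

# Print from fastest to slowest.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print("%-10s %10.1f" % (name, ratio))
```

<p>At this board size the two Python versions come out at roughly 190&#215; and 115&#215; the C++ time and Bash at roughly 11,000&#215;, in line with the rounded figures given in the text.</p>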
<p>The more pythonesque version:</p>
<pre class="brush: python;">
#!/usr/bin/python -O
# -*- coding: utf-8 -*-

import sys
import time

########################################
##
##   (c) 2009 Steven Pigeon (pythonesque version)
##

diag45=set()
diag135=set()
cols=set()
solutions=0

########################################
##
## Marks occupancy
##
def mark(k,j):
    global cols, diag45, diag135
    cols.add(j);
    diag135.add(j+k)
    diag45.add(32+j-k)

########################################
##
## unmarks occupancy
##
def unmark(k,j):
    global cols, diag45, diag135
    cols.remove(j);
    diag135.remove(j+k)
    diag45.remove(32+j-k)

########################################
##
## Tests if a square is menaced
##
def test(k,j):
    global cols, diag45, diag135
    return not((j in cols) or \
        ((j+k) in diag135) or \
        ((32+j-k) in diag45))

########################################
##
## Backtracking solver
##
def solve( niv, dx ):
    global solutions
    if niv &#62; 0 :
        for i in xrange(0,dx):
            if test(niv,i) == True:
                mark ( niv, i )
                solve( niv-1, dx)
                unmark ( niv, i )
    else:
        for i in xrange(0,dx):
            if (test(0,i) == True):
                solutions += 1

########################################
##
## usage message
##
def usage( progname ):
    print &#34;usage: &#34;, progname, &#34; &#60;size&#62;&#34;
    print &#34;size must be 1..32&#34;

########################################
##
## c/c++-style main function
##
def main():
    if len(sys.argv) &#60; 2:
        usage(sys.argv[0])
    else:
        try:
            size = int(sys.argv[1])
        except:
            usage(sys.argv[0])
            return

        if (size &#60;= 32) and (size &#62; 0):

            start = time.time()
            solve(size-1,size)
            elapsed = time.time()-start

            print &#34;%s %0.6f&#34; % (solutions,elapsed)

        else:
            usage(sys.argv[0])
#
if __name__ == &#34;__main__&#34;:
    main()
</pre>
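<p>A quick way to check the indexing used by <tt>mark</tt>, <tt>unmark</tt> and <tt>test</tt>: all squares on one diagonal share the same <tt>j+k</tt>, and all squares on the other diagonal share the same <tt>j-k</tt>. The <tt>32</tt> offset is presumably carried over from the array-based versions, where it keeps the index non-negative; with sets it is harmless. The sketch below is written for Python 3, and <tt>diagonal_ids</tt> is a hypothetical helper introduced only for this illustration, not part of the original program.</p>
<pre class="brush: python;">
def diagonal_ids(k, j):
    """Return the (diag135, diag45) indices that mark() stores for row k, column j."""
    return j + k, 32 + j - k


# (row 1, col 4) and (row 3, col 2) share the same diag135 index.
print(diagonal_ids(1, 4)[0] == diagonal_ids(3, 2)[0])   # True

# (row 0, col 0) and (row 5, col 5) share the same diag45 index.
print(diagonal_ids(0, 0)[1] == diagonal_ids(5, 5)[1])   # True

# (row 0, col 0) and (row 1, col 2) have no index in common,
# so two queens on those squares do not attack each other diagonally.
print(diagonal_ids(0, 0), diagonal_ids(1, 2))           # (0, 32) (3, 33)
</pre>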
<p>The other versions can be found here in <a href="http://www.stevenpigeon.org/blogs/hbfs/super_reines.zip">super_reines.zip</a>. The Bash and C++ versions are somewhat Linux-specific.</p>
<hr align="left" width="30%">
<p><sup><a name="1">1</a></sup>&#160;A trick explained in detail in Brassard and Bratley, <i>Introduction à l&#8217;algorithmique</i>, Presses de l&#8217;Université de Montréal. Translated to English: <a href="http://www.amazon.com/gp/product/0133350681?ie=UTF8&#38;tag=hardbettfasts-20&#38;linkCode=xm2&#38;camp=1789&#38;creativeASIN=0133350681" target="_blank"><i>Fundamentals of Algorithmics</i></a> (at Amazon.com)</p>
<p><sup><a name="2">2</a></sup>&#160;Frédéric Marceau translated the program from C++ to C#. <a href="http://lostwebsite.wordpress.com/" target="_blank">François-Denis Gonthier</a> translated from C++ to Java.</p>
<p align="center">*<br />*&#8195;*</p>
<p>So I added a third Python version to the archive. I also benchmarked it. It is quite a bit faster than the original one, and even than the &#8216;pythonesque&#8217; version. For example:</p>
<pre class="brush: bash;">
$ super-reines.py 12
14200 15.966019
$ super-reines-2.py 12
14200 9.478235
$ super-reines-3.py 12
14200 6.845071
</pre>
<p>So a more &#8216;functional&#8217; version performs about 2.3&#215; faster than a direct translation from C++. For the same problem size, however, the C++ (fixed) version takes 0.069s&#8230; Roughly 100&#215; faster than the 3rd version.</p>
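<p>For readers who want to redo the arithmetic, here is a small Python 3 sketch that recomputes the two ratios from the timings quoted above. The dictionary labels and the <tt>speedup</tt> helper are names made up for this illustration; the numbers are the ones from the runs above.</p>
<pre class="brush: python;">
# Wall-clock times in seconds for board size 12, copied from the runs above.
timings = {
    "super-reines.py": 15.966019,
    "super-reines-2.py": 9.478235,
    "super-reines-3.py": 6.845071,
    "C++ (fixed)": 0.069,
}


def speedup(slower, faster):
    """Return how many times faster the second entry ran than the first."""
    return timings[slower] / timings[faster]


print("3rd vs 1st Python version: %.1fx" % speedup("super-reines.py", "super-reines-3.py"))  # 2.3x
print("C++ vs 3rd Python version: %.0fx" % speedup("super-reines-3.py", "C++ (fixed)"))      # 99x
</pre>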
<div id="attachment_1899" class="wp-caption aligncenter" style="width: 310px"><a href="http://hbfs.wordpress.com/files/2009/11/results-ratio-log-scale-wo-outliers-2.png"><img src="http://hbfs.wordpress.com/files/2009/11/results-ratio-log-scale-wo-outliers-2.png?w=300" alt="results-ratio-log-scale-wo-outliers-2" title="results-ratio-log-scale-wo-outliers-2" width="300" height="251" class="size-medium wp-image-1899" /></a><p class="wp-caption-text">Speed Ratios, Log Scale, Without Outliers, with 3rd Python version</p></div> 
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Problems with RIA Services (Feedback for July 2009 CTP)]]></title>
<link>http://dvanderboom.wordpress.com/2009/11/09/problems-with-ria-services-feedback-for-july-2009-ctp/</link>
<pubDate>Tue, 10 Nov 2009 02:35:45 +0000</pubDate>
<dc:creator>Dan Vanderboom</dc:creator>
<guid>http://dvanderboom.wordpress.com/2009/11/09/problems-with-ria-services-feedback-for-july-2009-ctp/</guid>
<description><![CDATA[RIA Services (new home page) is a collection of tools and libraries for making Rich Internet Applica]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>RIA Services (<a href="http://silverlight.net/riaservices/">new home page</a>) is a collection of tools and libraries for making Rich Internet Applications, especially line of business applications, easier to develop.&#160; Brad Abrams did a <a href="http://videos.visitmix.com/MIX09/t40f">great presentation</a> of RIA Services at MIX 2009 that touches on querying, validation, authentication, and how to share logic between the server and client sides.&#160; Brad also has a huge series of articles (26 as I write this) on using Silverlight and RIA Services to build a realistic application.</p>
<p>I love the concept of RIA Services.&#160; Brad and his team have done a fantastic job of identifying the critical issues for LOB systems and have the right idea to simplify those common data access tasks through the whole pipeline from database to UI controls, using libraries, Visual Studio tooling, or whatever it takes to get the job done.</p>
<p>So before I lay down some heavy criticism of RIA Services, take into consideration that it’s still a CTP and that my scenario pushes the boundaries of what was likely conceived of for this product, at least for such an early stage.</p>
<h2>Shared Data Model with WPF &#38; Silverlight Clients</h2>
<p>The cause of so much of my grief with RIA Services has been my need to share a data model, and access to a shared database, across WPF as well as Silverlight client applications.&#160; Within the constraints of this situation, I keep running into problem after problem while trying to use RIA Services productively.</p>
<p><u>The intuitive thing to do is</u>: define a single data model project that compiles to a single assembly, and then reference that in my Silverlight and non-Silverlight projects.&#160; This would be a 100% full-fidelity shared data model.&#160; As long as the code I wrote was a subset of both Silverlight and normal .NET Frameworks (an intersection), we could share identical types and write complex validation and model manipulation logic, all without having to constrain ourselves to work within the limitations of a convoluted code generation scheme.&#160; Back when I wrote Compact Framework applications, I did this with great success despite the platform gap, and I didn’t have anything like RIA Services to help.</p>
<h3>Incompatible Assemblies</h3>
<p>Part of the problem arises because Silverlight assemblies are incompatible with non-Silverlight assemblies.&#160; A lot of what RIA Services is doing is trying to find a way around this limitation: picking up attributes and code files from one project and inserting that code into the Silverlight project with a build action.&#160; This Visual Studio “magic” has been criticized for its weakness in dealing with multiple-solution systems where Visual Studio can’t update the client because it’s not loaded, and I’ve heard there’s work being done to address this, but for my current needs, this magic aspect of it isn’t a problem.&#160; The specifics of how it works, however, are.</p>
<h3>Different Data Access APIs</h3>
<p>Accessing entities requires a different API in Silverlight via RiaContextBase versus ObjectContext elsewhere.&#160; Complex logic in the model (for validation and other actions against the model) requires access to other entities and therefore access to the current object context, but the context APIs for Silverlight and WPF are very different.&#160; Part of this has to do with Silverlight’s inability to make synchronous calls to the server.</p>
<p>In significantly large systems that I build, I use validation logic such as “this entity is valid if it’s pointing to an entity of a different type that contains a PropertyX value of Y”.&#160; One of my tables stores a tree of data, so I have methods for loading entire subtrees and ensuring that no circular references exist.&#160; For these kinds of tasks, I need access to the data context in basic validation methods.&#160; When I delete nodes from a tree, I need to delete child nodes, so update logic is part of the model that needs to be the same in every client.&#160; I don’t want to define that multiple times for multiple clients.&#160; I like to program very <a href="http://en.wikipedia.org/wiki/Don%27t_repeat_yourself">DRY</a>.&#160; In other words, I find myself in need of a shared model.</p>
<p>RIA Services doesn’t provide anything like type equivalence for a shared model, however.&#160; Data model classes in Silverlight inherit from Entity, but EntityObject in WPF.&#160; In the RIA Services domain context, we RaiseDataMemberChanged, but in a normal EF object context, we need to ReportPropertyChanged.&#160; In WPF, I can call MyEntity.Load(MergeOption.PreserveChanges), but in Silverlight there&#8217;s no Load method on the entity and no MergeOption enum.&#160; In WPF I can query against context.SomeEntitySet, but in Silverlight you would query against context.GetSomeEntitySetQuery() and then execute the query with another method call.</p>
<p>This chasm of disparity makes all but the simplest shared model logic impractical and frustrating.&#160; The code generation technique, though good in principle, keeps getting in the way.&#160; For example, I have both parameterless and parameterized constructors in my entity classes.&#160; This works great in my WPF client, but when this code is synchronized to my Silverlight client, I get an error because the Silverlight-side entity class is generated in two parts: in the hidden partial class, a parameterless constructor is generated which calls partial method OnCreated; and in the visible partial class, the constructor method I defined on the server is dumped into another file, so I have duplicate constructors.&#160; If I remove the parameterless constructor from the server side, I get an error because my entity class requires a parameterless constructor (and defining a non-default constructor effectively removes the default one from the resulting type unless it’s explicitly defined).&#160; I thought I could define the partial method OnCreated and put my construction logic in there, but the partial method is only defined on the client side.&#160; That means sharing construction logic consists of copying and pasting the OnCreated method across the various clients—far from an ideal solution.</p>
<h3>Entity Data Model Required to be in Web Project</h3>
<p>Another strategy I attempted was to define the .edmx file and my partial class extensions in a class library, and then reference that from the web project.&#160; I could define the LinqToEntitiesDomainService&#60;MyDataContext&#62;, but sharing entity class code (by generating code in the Silverlight project) isn’t possible unless the .edmx file and partial class extensions are defined in the web project itself.&#160; This would mean that my WPF client would have to reference a web project for data access, which by itself seems wrong.&#160; (Or making a copy of the data model, which is worse.)&#160; It would be better for the WPF client to talk to the same domain service as the Silverlight client, but RIA Services doesn’t give you an option to link that web project to a non-Silverlight project, so again I ran into a brick wall.</p>
<h3>So Don’t Do That</h3>
<p>The kind of advice I’m getting for this is, “so don’t do that”.&#160; In other words, don’t write complex validation logic in the model or otherwise try to access the data context; don’t write parameterized constructors; don’t aim for 100% type fidelity across all endpoints of a system; don’t try to share data models with Silverlight and non-Silverlight projects, etc.&#160; But I see the potential for RIA Services, so I have to push for these things unless I hear really convincing arguments against them (or compelling alternatives).</p>
<h2>Conclusion</h2>
<p>The fact that there are different data contexts and data item definitions within those contexts imposes a burden on the developer to use different techniques for each environment, and creates challenges for centralizing data model logic and reusing equivalent logic across different kinds of clients.&#160; My gut feeling is that RIA Services in its current form has some fundamental design flaws that will need to be addressed, taking into consideration systems with a mix of Silverlight, WPF, and other clients, before it becomes a truly robust data access platform.</p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Happy Technical Difficulty Day!]]></title>
<link>http://apriltuesday.wordpress.com/2009/10/30/happy-technical-difficulty-day/</link>
<pubDate>Fri, 30 Oct 2009 20:48:04 +0000</pubDate>
<dc:creator>April</dc:creator>
<guid>http://apriltuesday.wordpress.com/2009/10/30/happy-technical-difficulty-day/</guid>
<description><![CDATA[Somewhat ironically, we underwent some serious technical problems in CS today.  At least more seriou]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>Somewhat ironically, we underwent some serious technical problems in CS today.  At least more serious than the ones we typically undergo, which involve Jeannie&#8217;s computer being unhappy and everyone complaining about the wireless network.  (Purple Air, why do you hate my iPhone?)</p>
<p>Today&#8217;s problems involved the projector, which refused to project anything but a blue screen of death.  We looked through the menus and were particularly attracted to one option that depicted a smiling face gazing serenely at a screen.  But alas, it did no good.</p>
<p>See, being in CS classes does NOT imply technical know-how.  The only thing that implies technical know-how is being on the staff in OIT, whom we called for assistance.  Fortunately, with their help everything was running smoothly within 20 minutes, so Jeannie could show us photos of cacti and kings of England as was her plan.</p>
<p>After that was calculus, where our professor is diligently training us to become gamblers.</p>
<p>English, however, would have to be my favorite class of the day.  Because we listened to rap and watched music videos and got to hear our very academic and soft-spoken professor say &#8220;shit&#8221; (&#8220;Yes, I <em>do</em> know that word&#8221;).</p>
<p>But moreover, because it took a good portion of the hour to figure out how to operate the projector/screen/computer/magical techiness system in the lecture hall&#8211; which admittedly was a little complicated but really shouldn&#8217;t have required the full mental capacities of ten college students working on it.</p>
<p>&#8230; But okay, mostly because we transitioned from Whitman to Jay-Z in a single hour.  I later told Alex about this unusual class and he called me a &#8220;closet thug,&#8221; which I think is a description that anyone who truly knows me would have to agree with.</p>
<p>In other news, yay it&#8217;s the weekend!</p>
<p>This post was made possible by two Milky Way bars and one Snickers.</p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Cargo Cult Programming (part 1)]]></title>
<link>http://hbfs.wordpress.com/2009/10/13/cargo-cult-programming-part-1/</link>
<pubDate>Tue, 13 Oct 2009 10:47:54 +0000</pubDate>
<dc:creator>Steven Pigeon</dc:creator>
<guid>http://hbfs.wordpress.com/2009/10/13/cargo-cult-programming-part-1/</guid>
<description><![CDATA[Programmers aren&#8217;t always the very rational beings they please themselves to believe. Very oft]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>Programmers aren&#8217;t always the rational beings they like to believe they are. Very often, we close our eyes and make decisions based on what we <em>think</em> we know, and on what we have been told by more or less reliable sources: for example, choosing <a href="http://en.wikipedia.org/wiki/Red-black_tree" target="_blank">red-black</a> trees rather than <a href="http://en.wikipedia.org/wiki/AVL_tree" target="_blank">AVL</a> trees because they are faster, without being able to justify in detail why it must be so. I call programming driven by this kind of decision <em>cargo cult programming</em>.</p>
<p><a href="http://en.wikipedia.org/wiki/File:CMA_CGM_Balzac.jpg"><img src="http://hbfs.wordpress.com/files/2009/08/cargo.jpg?w=300" alt="cargo" title="cargo" width="400" height="179" class="aligncenter size-medium wp-image-1539" /></a></p>
<p>Originally, I wanted to talk about red-black <em>vs.</em> AVL trees and how they compare, but I&#8217;d rather talk about the STL <tt>std::map</tt>, which is implemented using red-black trees in G++ 4.2, and <tt>std::unordered_map</tt>, a <a href="http://en.wikipedia.org/wiki/Hash_table" target="_blank">hash-table</a> based container introduced in <a href="http://en.wikipedia.org/wiki/C%2B%2B_Technical_Report_1" target="_blank">TR1</a>.</p>
<p><!--more--></p>
<p>TR1 <tt>std::unordered_map</tt> is a map that does not maintain any particular order between the keys of the data it contains. It can therefore be implemented as a <a href="http://en.wikipedia.org/wiki/Hash_table" target="_blank">hash table</a>. The <tt>std::map</tt> is implemented, at least in G++ 4.2, as a <a href="http://en.wikipedia.org/wiki/Red-black_tree" target="_blank">red-black tree</a>. Self-balancing trees change their shape with each insertion in order to maintain most leaves at an equal depth, ensuring an <img src='http://l.wordpress.com/latex.php?latex=O%28%5Clg+n%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(\lg n)' title='O(\lg n)' class='latex' /> access time. The difference between an AVL tree and a red-black tree is the way the tree is rebalanced, and the red-black tree does about half as many operations as an AVL tree on insertion, making it somewhat faster; how much faster remains to be quantified.</p>
<p>So I ran a few simple experiments to compare <tt>std::unordered_map</tt> and <tt>std::map</tt>. First, I got hold of the Zingarelli word list, containing some 585,000 Italian words (I can&#8217;t find a working URL, but I got the list a few years ago on a <a href="http://en.wikipedia.org/wiki/Scrabble" target="_blank">Scrabble</a>-related page). I ran three tests: insertion, successful search, and failed search. The list is broken into two parts: 90% of the words are inserted, and the remaining 10% are kept for the unsuccessful search test.</p>
<p>The insertion test consisted of inserting all of Zingarelli&#8217;s list into both data structures, in the most simplistic way: scanning the list sequentially and inserting items one by one.</p>
<table align="center" border="1">
<tr>
<td></td>
<td>wall time</td>
<td>op/s</td>
</tr>
<tr>
<td><tt>std::map</tt></td>
<td>0.75s</td>
<td>~699000</td>
</tr>
<tr>
<td><tt>std::unordered_map</tt></td>
<td>0.76s</td>
<td>~690000</td>
</tr>
</table>
<p>Insertion times are about the same; so we cannot conclude anything special here except maybe that the insertion time is largely dominated by memory allocation and copy. For the successful search (10,000 tries in each case):</p>
<table align="center" border="1">
<tr>
<td></td>
<td>wall time</td>
<td>op/s</td>
</tr>
<tr>
<td><tt>std::map</tt></td>
<td>0.047s</td>
<td>~209000</td>
</tr>
<tr>
<td><tt>std::unordered_map</tt></td>
<td>0.006s</td>
<td>~1.6&#215;10<sup>6</sup></td>
</tr>
</table>
<p>Map look-up is <em>immensely</em> faster with <tt>std::unordered_map</tt>, a ratio of about 8:1! Failed searches exhibit the same behavior. For ~60,000 failed searches:</p>
<table align="center" border="1">
<tr>
<td></td>
<td>wall time</td>
<td>op/s</td>
</tr>
<tr>
<td><tt>std::map</tt></td>
<td>0.263s</td>
<td>~37700</td>
</tr>
<tr>
<td><tt>std::unordered_map</tt></td>
<td>0.037s</td>
<td>~268000</td>
</tr>
</table>
<p>We see the same kind of differences here again, but the failed searches are much slower than the successful searches.</p>
<p>When we repeat the experiment with integers (with the same kind of numbers; 500,000 integers of which 10% are randomly chosen for the failed searches), we get essentially the same picture but a massive speedup as strings are rather costly to copy. Indeed, for the insertion:</p>
<table align="center" border="1">
<tr>
<td></td>
<td>wall time</td>
<td>op/s</td>
</tr>
<tr>
<td><tt>std::map</tt></td>
<td>0.19s</td>
<td>2.3&#215;10<sup>6</sup></td>
</tr>
<tr>
<td><tt>std::unordered_map</tt></td>
<td>0.10s</td>
<td>4.5&#215;10<sup>6</sup></td>
</tr>
</table>
<p>Now we see the algorithmic difference between the two data structures, since the time spent allocating and copying strings is eliminated. For successful and failed searches, we get (again for 10,000 look-ups):</p>
<table align="center" border="1">
<tr>
<td></td>
<td>wall time</td>
<td>op/s</td>
</tr>
<tr>
<td><tt>std::map</tt></td>
<td>0.015s</td>
<td>0.7&#215;10<sup>6</sup></td>
</tr>
<tr>
<td><tt>std::unordered_map</tt></td>
<td>0.005s</td>
<td>2.0&#215;10<sup>6</sup></td>
</tr>
</table>
<p>and:</p>
<table align="center" border="1">
<tr>
<td></td>
<td>wall time</td>
<td>op/s</td>
</tr>
<tr>
<td><tt>std::map</tt></td>
<td>0.082s</td>
<td>0.1&#215;10<sup>6</sup></td>
</tr>
<tr>
<td><tt>std::unordered_map</tt></td>
<td>0.009s</td>
<td>1.1&#215;10<sup>6</sup></td>
</tr>
</table>
<p>The <tt>std::unordered_map</tt> therefore seems to be much faster than <tt>std::map</tt>. What have we lost in order to gain this speed? First, a lot of memory, as hash tables must retain a certain sparseness to sustain their average constant-time look-ups; second, the ordering of keys. Enumerating the items stored in a hash map won&#8217;t produce an ordered list, but rather an arbitrarily ordered version of the list. <tt>std::map</tt>, on the other hand, allows lexicographic enumeration of its contents.</p>
<p align="center">*<br />*&#8195;*</p>
<p>So you&#8217;re reading this and are thinking, <em>mmppfh, of course, it&#8217;s a hash table, you nitwit</em>. Well, yes. Maybe so.</p>
<p>But here&#8217;s what prompted me to do the test. Red-black trees offer <img src='http://l.wordpress.com/latex.php?latex=O%28%5Clg+n%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(\lg n)' title='O(\lg n)' class='latex' /> access. But that&#8217;s assuming (and quite wrongly) that comparing keys can be performed in constant time. This may be true for simple keys, such as machine-size integers (known in C and C++ as <tt>int</tt>) but not so for more complex data, like strings and other structures. For strings, for example, comparison may still be very fast because the cost of comparing two strings depends only on the length of their longest common prefix; if two long strings differ in the first few characters, comparison terminates rapidly and the cost is moderate. If, on the other hand, the strings share a very long prefix, then the comparison algorithm must scan both strings until the end is reached or a difference is found; this can take a long time. So, let <img src='http://l.wordpress.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p' title='p' class='latex' /> be the average common prefix length. The expected search time is now <img src='http://l.wordpress.com/latex.php?latex=O%28p+%5Clg+n%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(p \lg n)' title='O(p \lg n)' class='latex' /> which can grow large if <img src='http://l.wordpress.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p' title='p' class='latex' /> is large.</p>
<p>For a hash table look-up, you must first compute the hash key, which can at best be done in time linear in the average key length. In our first case, this length is the average string length. Let this average length be <img src='http://l.wordpress.com/latex.php?latex=h&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h' title='h' class='latex' />. The number of probes is <img src='http://l.wordpress.com/latex.php?latex=c_n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='c_n' title='c_n' class='latex' />, a small constant that depends on the sparseness of the table and the number of items, <img src='http://l.wordpress.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' />, it contains. For each of those <img src='http://l.wordpress.com/latex.php?latex=c_n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='c_n' title='c_n' class='latex' /> probes, a string comparison is performed, leading to an expected complexity of <img src='http://l.wordpress.com/latex.php?latex=O%28c_n+h+p%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(c_n h p)' title='O(c_n h p)' class='latex' />.</p>
<p>Now, it is not clear when <img src='http://l.wordpress.com/latex.php?latex=p+%5Clg+n+%5Cgeqslant+c_n+h+p&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p \lg n \geqslant c_n h p' title='p \lg n \geqslant c_n h p' class='latex' />. Both <img src='http://l.wordpress.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p' title='p' class='latex' /> cancel, which boils down to <img src='http://l.wordpress.com/latex.php?latex=%5Clg+n+%5Cgeqslant+c_n+h&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\lg n \geqslant c_n h' title='\lg n \geqslant c_n h' class='latex' />. So, in <em>some</em> conditions, it may be possible to make the tree appear faster than the hash table.</p>
<p>On the average, however, the hash map should win because we expect <img src='http://l.wordpress.com/latex.php?latex=c_n+%26%2360%3B%26%2360%3B+%5Clg+n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='c_n &lt;&lt; \lg n' title='c_n &lt;&lt; \lg n' class='latex' />.</p>
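<p>The effect of the common prefix length on an ordered search can be observed without any timing at all, simply by counting the characters examined. The Python 3 sketch below is only an illustration under stated assumptions: a sorted list searched by bisection stands in for the balanced tree, the keys are synthetic, and <tt>compare</tt> and <tt>search</tt> are hypothetical helpers, not the code used for the measurements above.</p>
<pre class="brush: python;">
from operator import lt  # lt(a, b) means "a is less than b"


def compare(a, b, counter):
    """Three-way string comparison that counts the characters it examines."""
    for x, y in zip(a, b):
        counter[0] += 1
        if x != y:
            return -1 if lt(x, y) else 1
    return len(a) - len(b)  # equal up to the length of the shorter string


def search(sorted_keys, target):
    """Bisection search; return (found, number of characters examined)."""
    counter = [0]
    lo, hi = 0, len(sorted_keys)
    while lo != hi:
        mid = (lo + hi) // 2
        order = compare(target, sorted_keys[mid], counter)
        if order == 0:
            return True, counter[0]
        if lt(order, 0):
            hi = mid
        else:
            lo = mid + 1
    return False, counter[0]


# 1024 sorted keys of equal length; only the position of the varying digits changes.
short_prefix = ["%04d" % i + "x" * 40 for i in range(1024)]  # keys differ early
long_prefix = ["x" * 40 + "%04d" % i for i in range(1024)]   # keys differ late

print("short common prefixes:", search(short_prefix, short_prefix[700]))
print("long common prefixes :", search(long_prefix, long_prefix[700]))
</pre>
<p>Both searches visit the same sequence of positions, but with long shared prefixes every comparison has to walk through the 40 identical characters before it can decide, so the character count is several times larger.</p>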
<p align="center">*<br />*&#8195;*</p>
<p>A <a href="http://en.wikipedia.org/wiki/Cargo_cult" target="_blank">cargo cult</a> is described as:</p>
<blockquote><p>
A cargo cult is a type of religious practice that may appear in primitive tribal societies in the wake of interaction with technologically advanced, non-native cultures. The cults are focused on obtaining the material wealth of the advanced culture through magical thinking, religious rituals and practices, believing that the wealth was intended for them by their deities and ancestors.
</p></blockquote>
<p></p>
<p>Except for the part about ancestor spirits, and especially when considering <a href="http://en.wikipedia.org/wiki/Magical_thinking" target="_blank">magical thinking</a>, cargo culting applies very often to how programmers write code and make decisions about data structures and algorithms. Choosing a red-black tree over an AVL, or over a <a href="http://en.wikipedia.org/wiki/Splay_tree" target="_blank">splay</a> tree, because we think that it is somehow always better, sometimes because Ancestor X (a more or less reliable source or authority, such as a more experienced programmer, a teacher, or some Internet <a href="http://en.wikipedia.org/wiki/Dude" target="_blank">dude</a>) said so, is a form of cargo cult where the programmer does not use rationality to its full extent to make a decision.</p>
<p>When choosing data structures, one must be fully aware of the dual cost of data structures. The first cost is the run-time cost. Theoretical complexity and actual implementation-dependent run-times may be quite different. The second cost is memory usage. Memory is large on modern systems but not infinite. A particularly wasteful method that offers constant-time access to the data may use as much as, say, ten times the memory used by a method that gives you <img src='http://l.wordpress.com/latex.php?latex=O%28%5Clg+n%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(\lg n)' title='O(\lg n)' class='latex' /> access. If for a small data set this 10&#215; memory usage poses no particular problem, it may be quite different with a largish data set. It&#8217;s all a very delicate balancing act between run-time performance and scalability.</p>
<p>It&#8217;s very hard to trust one&#8217;s gut feelings about a data structure and the data put in. Combined with the access patterns, the data structure may yield very counter-intuitive performance. Rather than giving in to magical-thinking cargo cult programming, you should <em>always</em> take a little time to validate your assumptions and hypotheses about the data, the data structure, and the access patterns, as data structure behavior is clearly <em>not</em> independent of the data and the access patterns.</p>
<p>Consider this very simple example. We have a simple binary tree and a list of strings. Binary trees offer <img src='http://l.wordpress.com/latex.php?latex=O%28%5Clg+n%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(\lg n)' title='O(\lg n)' class='latex' /> insertion and access times, so it is, <i>a priori</i>, a good choice. Now, the list of strings is composed of words, but it so happens it&#8217;s already sorted. If we insert the strings in the order they are in the list, we get a degenerate binary tree that&#8217;s in fact a list! Indeed so: insertions are always performed at the far right of the tree, causing the tree to degenerate into something we could call a <em>vine</em> and then all operations degenerate to linear time. If we randomize the list and insert the words into the tree in that randomized order, we get a ragged tree, but about equally deep everywhere, leading to the expected <img src='http://l.wordpress.com/latex.php?latex=O%28%5Clg+n%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(\lg n)' title='O(\lg n)' class='latex' /> access and insertion time.</p>
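<p>The degenerate case is easy to reproduce. The following Python 3 sketch builds a plain, non-balancing binary search tree from the same keys in sorted and in shuffled order and reports the height of each tree. Everything in it (the <tt>Node</tt> class, the <tt>insert</tt> and <tt>height_after_inserting</tt> helpers, the 1000 synthetic keys) is made up for the illustration; insertion is iterative so that the sorted case does not run into the recursion limit.</p>
<pre class="brush: python;">
import random
from operator import lt  # lt(a, b) means "a is less than b"


class Node:
    """One node of a plain (non-balancing) binary search tree."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key iteratively; return (root, depth of the new node)."""
    if root is None:
        return Node(key), 1
    node, depth = root, 1
    while True:
        depth += 1
        side = "left" if lt(key, node.key) else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))
            return root, depth
        node = child


def height_after_inserting(keys):
    """Build a tree from keys in the given order and return its height."""
    root, height = None, 0
    for key in keys:
        root, depth = insert(root, key)
        height = max(height, depth)
    return height


words = ["w%04d" % i for i in range(1000)]  # already sorted, like the word list
shuffled = list(words)
random.Random(1).shuffle(shuffled)          # fixed seed, so the run is repeatable

print("sorted order  :", height_after_inserting(words))     # 1000: a vine
print("shuffled order:", height_after_inserting(shuffled))  # a few dozen at most
</pre>
<p>With sorted input every key goes to the right of the previous one, so the height equals the number of keys; with shuffled input the height stays within a small multiple of lg n.</p>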
<p>The sorted list / binary tree example is an over-simplistic one, I admit, but it gets to the point: the programmer used a data structure cargo-culted to offer <img src='http://l.wordpress.com/latex.php?latex=O%28%5Clg+n%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='O(\lg n)' title='O(\lg n)' class='latex' /> access time, but due to his incomplete comprehension of the data (the <em>sorted</em> list) and of the access pattern (inserting items sequentially from the list), the result was disastrous.</p>
<p>But the thing is, we <em>all</em> do that to a certain extent!</p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[CSE225-Data Structures]]></title>
<link>http://csemarmara.wordpress.com/2009/10/07/cse225-data-structures/</link>
<pubDate>Wed, 07 Oct 2009 15:08:59 +0000</pubDate>
<dc:creator>caglauskent</dc:creator>
<guid>http://csemarmara.wordpress.com/2009/10/07/cse225-data-structures/</guid>
<description><![CDATA[The slides our instructor Borahan follows in class (as PDF): http://ifile.it/wsjzg4r The course textbook, Mc]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>The slides our instructor Borahan follows in class (as PDF):</p>
<p><a href="http://ifile.it/wsjzg4r">http://ifile.it/wsjzg4r</a></p>
<p>The course textbook, McGraw-Hill &#8211; Introduction to Algorithms 2ed Cormen (it will also be used in the Analysis of Algorithms course):</p>
<p><a href="http://ifile.it/023p51u">http://ifile.it/023p51u</a></p>
<p>Old exam questions and quizzes:</p>
<p><a href="http://ifile.it/76yupo0">http://ifile.it/76yupo0</a></p>
<p>First project (due October 16):</p>
<p><a href="http://ifile.it/tokur2p">http://ifile.it/tokur2p</a></p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Umbrella fail]]></title>
<link>http://apriltuesday.wordpress.com/2009/09/28/umbrella-fail/</link>
<pubDate>Mon, 28 Sep 2009 20:27:15 +0000</pubDate>
<dc:creator>April</dc:creator>
<guid>http://apriltuesday.wordpress.com/2009/09/28/umbrella-fail/</guid>
<description><![CDATA[I&#8217;m currently hanging in Paresky till the rain lets up. Goddammit. I had my umbrella with me a]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>I&#8217;m currently hanging in Paresky till the rain lets up. Goddammit. I had my umbrella with me all day to ensure it wouldn&#8217;t rain, and Emily had hers too so the weather would in fact be absolutely lovely.  (Apparently, the effects of umbrellas on the weather are cumulative.)  And so it came to pass.</p>
<p>But it was so nice and sunny when I headed out to practice piano that I figured surely, surely I wouldn&#8217;t need an umbrella. Will I ever learn…</p>
<p>Anyway, I&#8217;d like to preface the remainder of this post with the disclaimer that not all my days are so <a href="http://apriltuesday.wordpress.com/2009/09/24/the-first-three-hours-of-today/" target="_blank">thoroughly wonderful</a> as you may believe from reading this blog; I just always feel like writing when I&#8217;ve had a thoroughly wonderful day.</p>
<p>So in CS I was telling Katie a little about <a href="http://apriltuesday.wordpress.com/2009/09/27/next-time-need-bloggin-hiatus/" target="_blank">my program</a>, including my hopes to have it generate compilable code from its own code and thus evolve into sentience. Katie apparently found this worth drawing our professor&#8217;s attention to with the following exclamation: &#8220;Hey Jeannie, we&#8217;ve got a megalomaniac here!&#8221;</p>
<p>I protested that it&#8217;s almost certainly every programmer&#8217;s dream to write a program that could eventually surpass its creator in intelligence and consequently (of course) take over the world. But evidently that&#8217;s not the case.  What?  What&#8217;s everyone else doing in CS classes then?</p>
<p>Then there was calculus. Multivariable thus far has been a kind of hybrid of precalc, calc, and linear (a little). If I had a professor less fantastic than Adams&#8211; or alternatively, if I remembered more than the scantest remnants of precalc, calc, or linear&#8211; this would be torture.</p>
<p>Fortunately, I have a professor who can <em>core an apple with his bare hands</em>, AKA the most badass math professor in the history of the profession.  I mean, I knew Adams was friendly and funny and brilliant, but I had no idea he was so fucking badass.  I think he might be superhuman.</p>
<p>The other thing worth mentioning at this point is the puppy chow that Taylor, Annie, and I made for snacks last night.  I hadn&#8217;t heard of puppy chow before this, so let me explain. Take Chex cereal, melted chocolate, peanut butter, and powdered sugar. Put them in a garbage bag.  Shake violently.</p>
<p>Bring trash bag to common room and invite entry to reach in and see what mysteries await them. Be amused at their skepticism.  Feel happy when they realize just how awesome the contents are.  Snack on the plentiful leftovers whenever you go into the common room.  Bring a bowl of leftovers to your room to snack on intermittently (or rather constantly). Decide this is either the best or the worst idea you&#8217;ve ever had.</p>
<p>Ooh, rain&#8217;s stopped.  Briefly.</p>
<p><a href="http://www.zooomr.com/photos/apriltuesday/8245560/"><img src="http://static.zooomr.com/images/8245560_ae548e6b2d.jpg" alt="IMG_0327" width="450" /></a></p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Decouple Case Execution From Switch By Enum]]></title>
<link>http://mf9it.wordpress.com/2009/09/22/decouple-case-execution-from-switch/</link>
<pubDate>Tue, 22 Sep 2009 19:27:40 +0000</pubDate>
<dc:creator>MF</dc:creator>
<guid>http://mf9it.wordpress.com/2009/09/22/decouple-case-execution-from-switch/</guid>
<description><![CDATA[This is one of the posts on Extend Switch Functionality With Enum. The idea is already in use. In f]]></description>
<content:encoded><![CDATA[This is one of the posts on Extend Switch Functionality With Enum. The idea is already in use. In f]]></content:encoded>
</item>
<item>
<title><![CDATA[HashMap Switch By Enum]]></title>
<link>http://mf9it.wordpress.com/2009/09/22/hashmap-switch/</link>
<pubDate>Tue, 22 Sep 2009 19:24:47 +0000</pubDate>
<dc:creator>MF</dc:creator>
<guid>http://mf9it.wordpress.com/2009/09/22/hashmap-switch/</guid>
<description><![CDATA[This is one of the posts on Extend Switch Functionality With Enum. The HashMap-Switch implementation]]></description>
<content:encoded><![CDATA[This is one of the posts on Extend Switch Functionality With Enum. The HashMap-Switch implementation]]></content:encoded>
</item>
<item>
<title><![CDATA[String Switch By Enum]]></title>
<link>http://mf9it.wordpress.com/2009/09/22/string-switch/</link>
<pubDate>Tue, 22 Sep 2009 19:21:06 +0000</pubDate>
<dc:creator>MF</dc:creator>
<guid>http://mf9it.wordpress.com/2009/09/22/string-switch/</guid>
<description><![CDATA[This is one of the posts on &#8220;Extend Switch Functionality With Enum&#8221;. I have found a post]]></description>
<content:encoded><![CDATA[This is one of the posts on &#8220;Extend Switch Functionality With Enum&#8221;. I have found a post]]></content:encoded>
</item>
<item>
<title><![CDATA[Extend Switch Functionality With Enum]]></title>
<link>http://mf9it.wordpress.com/2009/09/22/extend-switch-functionality-with-enum/</link>
<pubDate>Tue, 22 Sep 2009 16:49:55 +0000</pubDate>
<dc:creator>MF</dc:creator>
<guid>http://mf9it.wordpress.com/2009/09/22/extend-switch-functionality-with-enum/</guid>
<description><![CDATA[The Java switch statement does not operate directly on a String or on any other non-primitive type. A switc]]></description>
<content:encoded><![CDATA[The Java switch statement does not operate directly on a String or on any other non-primitive type. A switc]]></content:encoded>
</item>
<item>
<title><![CDATA[Navigating/traversing a data tree in C#, C and Java, II]]></title>
<link>http://aldosalzberg.wordpress.com/2009/09/19/navigatingtraversing-a-data-tree-in-c-sharp-c-and-java-2/</link>
<pubDate>Sat, 19 Sep 2009 18:41:49 +0000</pubDate>
<dc:creator>Aldo Salzberg</dc:creator>
<guid>http://aldosalzberg.wordpress.com/2009/09/19/navigatingtraversing-a-data-tree-in-c-sharp-c-and-java-2/</guid>
<description><![CDATA[I&#8217;m now ready to start defining a &#8220;navigator&#8221; or traversal routine. I assume that ]]></description>
<content:encoded><![CDATA[I&#8217;m now ready to start defining a &#8220;navigator&#8221; or traversal routine. I assume that ]]></content:encoded>
</item>
<item>
<title><![CDATA[Excuse me while I radiate joy in a vaguely obnoxious manner]]></title>
<link>http://apriltuesday.wordpress.com/2009/09/11/excuse-me-while-i-radiate-joy-in-a-vaguely-obnoxious-manner/</link>
<pubDate>Fri, 11 Sep 2009 23:37:36 +0000</pubDate>
<dc:creator>April</dc:creator>
<guid>http://apriltuesday.wordpress.com/2009/09/11/excuse-me-while-i-radiate-joy-in-a-vaguely-obnoxious-manner/</guid>
<description><![CDATA[Because after classes today (more on that in a moment), I walked all the way back to the 1914 Librar]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><p>Because after classes today (more on that in a moment), I walked all the way back to the 1914 Library, in the rain, to get the books I&#8217;d bought with the voucher.  And when I got there, not only was the line negligible, but on a whim with no expectation of satisfaction, I asked if there were any <a href="http://apriltuesday.wordpress.com/2009/09/09/in-which-i-repay-a-debt/" target="_blank">multivariable calc</a> texts that just magically materialized during the past couple days.</p>
<p>And I GOT ONE.  It was a miracle of epic proportions.  I walked back to my dorm with a heavy, heavy bag.  A bag heavy with KNOWLEDGE.  Sweet, papery knowledge.  Which sounds like some kind of pastry.</p>
<p>Okay, time to talk about classes.</p>
<p><strong>Intro to Logic and Semantics:</strong> My first college class (besides Linear Algebra at Cornell, which I&#8217;m going to discount because it kind of sucked) could not have been better.  It was like <a href="http://apriltuesday.wordpress.com/2008/07/30/how-brilliant-is-this-book/" target="_blank">Steven Pinker</a> in a class form.</p>
<p>I hope the full and incredible import of this fact does not escape you.  Formal logic and linguistics are things that I read about on my own <em>for fun</em>.  The notion of being able to take a <em>class</em> on these subjects and get tests and grades and credits for it is not something that our public school system prepared me for.  Of course in high school we could choose some classes, and of course I honestly loved many classes I took in high school.  But this is totally, completely, and wonderfully different.</p>
<p>I guess what I&#8217;m saying is, if you&#8217;re not yet in college, get there ASAP because it rocks.</p>
<p>It happens that I know tons of people in my class, where &#8220;tons&#8221; is equivalent to three (and counting).  Hey, that&#8217;s a lot if you&#8217;re a frosh.  Professor Sanders is also hilarious and amazing, but I knew that already because I sat in on one of his courses back in <a href="http://apriltuesday.wordpress.com/2008/09/26/impressions-of-williams/" target="_blank">September &#8217;08</a>.  Which reminds me: I need to send a wedding gift for the chair and his intelligence, plus some dandelions for the unicorn in his garden.</p>
<p><strong>Data Structures and Advanced Programming:</strong> I was actually fairly scared walking into this class because of my deep dark secret: I have done absolutely no programming since the topological hell of yesteryear.  I know, it&#8217;s shameful.  Fortunately we&#8217;re starting with some Java review, although what we really started with today was playing Boggle.  Fun times.</p>
<p>Jeannie is a very straightforward and cool professor, about whom I&#8217;ve heard only good things.  Which is pretty typical for Williams profs, as the same applies to Sanders and Adams (more about the latter anon).  When she was taking attendance and came to my name, she asked if I was &#8220;Antal&#8217;s buddy,&#8221; which I found highly amusing.  Antal is ubiquitously known in the CS department here, and I&#8217;m not surprised.</p>
<p><strong>Multivariable Calculus:</strong> Ultimately, I&#8217;m really glad I got kicked out of Cornell&#8217;s multivariable class, even though it sucked at the time.  Because my professor here is <a href="http://en.wikipedia.org/wiki/Colin_Adams_%28mathematician%29" target="_blank">Colin Adams</a>.  And multivariable is, according to him, the funnest class offered at Williams.</p>
<p>I&#8217;m going to try really, really, really hard to not be a giddy fangirl, because dude, Adams wrote <em>The Knot Book</em>!  Plus other books!  He spent our first class talking about mathematical beauty and then knots, which is basically what guys should talk about if they want me to fall in love with them.  (Bonus points if they mention Gödel.)  (Or muffins.)</p>
<p>The class is huge (liberal arts college-huge, that is, so like 40-50 people), but I have a couple friends there, and I don&#8217;t think I&#8217;ll get lost in the crowd too much.  And if I feel at risk of doing so, I&#8217;ll just go to <em>all</em> of my professor&#8217;s office hours and talk to him about knots, and maybe calculus too because that&#8217;s also fun.</p>
<p>Quite frankly, the math department is probably the number one biggest reason I chose Williams.  Needless to say, I am excited about multivariable.</p>
<p><strong>Poetry and the City: </strong>I actually could still get kicked out of this class, because it&#8217;s a 200-level gateway English class and I&#8217;m a first-year who&#8217;s probably not going to be an English major.</p>
<p>But my silly AP Lit exam score means I basically can&#8217;t take any 100-level English classes (they&#8217;re mostly full at this point anyway), and I do think it&#8217;s important for me to take a writing class my first semester here.  Who would&#8217;ve thought doing well on an AP would cause problems?</p>
<p>The professor here does not radiate enthusiasm and awesomeness like my others do, but I can tell she&#8217;s still really good, knows her stuff, and is impressively open to having students talk to her outside of class.  Her office hours actually take place in Paresky snack bar.</p>
<p>One slight downside to this class is that I know absolutely no one in it, but it&#8217;s very discussion-based so I&#8217;m sure I&#8217;ll get to know people.  And even though I&#8217;m not a very English-y person, I actually really enjoy talking about poetry, so I think I&#8217;m going to like this class.  In addition to <em>all</em> my other ones.</p>
<p>In other news, this is the first time it&#8217;s rained since I got to Williams.</p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[Better Tool Support for .NET]]></title>
<link>http://dvanderboom.wordpress.com/2009/09/07/better-tool-support-for-net/</link>
<pubDate>Mon, 07 Sep 2009 16:01:46 +0000</pubDate>
<dc:creator>Dan Vanderboom</dc:creator>
<guid>http://dvanderboom.wordpress.com/2009/09/07/better-tool-support-for-net/</guid>
<description><![CDATA[Productivity Enhancing Tools Visual Studio has come a long way since its debut in 2002.&#160; With t]]></description>
<content:encoded><![CDATA[<div class='snap_preview'><h2>Productivity Enhancing Tools</h2>
<p>Visual Studio has come a long way since the debut of Visual Studio .NET in 2002.&#160; With the imminent release of 2010, we’ll see a desperately needed overhaul of the archaic COM extensibility mechanisms (to support the <a href="http://msdn.microsoft.com/en-us/library/bb166360.aspx">Managed Package Framework</a>, as well as MEF, the <a href="http://www.codeplex.com/MEF">Managed Extensibility Framework</a>) and a redesign of the user interface in WPF that I’ve been pushing for and predicted as inevitable quite some time ago.</p>
<p>For many alpha geeks, the Visual Studio environment has been extended with excellent third-party, productivity-enhancing tools such as <a href="http://www.devexpress.com/Products/Visual_Studio_Add-in/Coding_Assistance/">CodeRush</a> and <a href="http://www.jetbrains.com/resharper/">Resharper</a>.&#160; I personally feel that the Visual Studio IDE team has been slacking in this area, providing only very weak support for refactorings, code navigation, and better Intellisense.&#160; While I understand their desire to avoid stepping on partners’ toes, this is one area I think makes sense for them to be deeply invested in.&#160; In fact, I think a new charter for a Developer Productivity Team is warranted (or an expansion of their team if it already exists).</p>
<p>It’s unfortunately a minority of .NET developers who know about and use these third-party tools, and the .NET community as a whole would without a doubt be significantly more productive if these tools were installed in the IDE from day one.&#160; It would also help to overcome resistance from development departments in larger organizations that are wary of third-party plug-ins, due perhaps to the unstable nature of many of them.&#160; Microsoft should consider purchasing one or both of them, or paying a licensing fee to include them in every copy of Visual Studio.&#160; Doing so, in my opinion, would make them heroes in the eyes of the overwhelming majority of .NET developers around the world.</p>
<p>It’s not that I mind paying a few hundred dollars for these tools.&#160; Far from it!&#160; The tools pay for themselves very quickly in time saved.&#160; The point is to make them ubiquitous: to make high-productivity coding a standard of .NET development instead of a nice add-on that is only sometimes accepted.</p>
<p>Consider just from the perspective of watching speakers at conferences coding up samples.&#160; How many of them don’t use such a tool in their demonstration simply because they don’t want to confuse their audience with an unfamiliar development interface?&#160; How many more demonstrations could they be completing in the limited time they have available if they felt more comfortable using these tools in front of the masses?&#160; You know you pay good money to attend these conferences.&#160; Wouldn’t you like to cover significantly more ground while you’re there?&#160; This is only likely to happen when the tool’s delivery vehicle is Visual Studio itself.&#160; Damon Payne <a href="http://www.damonpayne.com/2009/08/16/WhyMEFMatters.aspx">makes a similar case</a> for the inclusion of the Managed Extensibility Framework in .NET Framework 4.0: build it into the core and people will accept it.</p>
<h2>The Gorillas in the Room</h2>
<p>CodeRush and Resharper have both received recent mention in the Hanselminutes podcast (<a href="http://hanselminutes.com/default.aspx?showID=196">episode 196 with Mark Miller</a>) and in the Deep Fried Bytes podcast (<a href="http://deepfriedbytes.com/podcast/episode-35-why-comments-are-evil-and-pair-programming-with-corey-haines/">episode 35 with Corey Haines</a>).&#160; If you haven’t heard of CodeRush, I recommend watching these videos on their use.</p>
<ul>
<li><a href="http://www.dnrtv.com/default.aspx?showNum=143">CodeRush Express</a> (which is free) </li>
<li><a href="http://www.dnrtv.com/default.aspx?showNum=107">CodeRush with Refactor!</a> </li>
<li><a href="http://www.dnrtv.com/default.aspx?showNum=5">DXCore</a> (the library you can use to build your own add-ins, and on which CodeRush and Refactor! are built) </li>
</ul>
<p>For secondary information on CodeRush, DXCore, and the principles with which they were designed, I recommend these episodes of DotNetRocks:</p>
<ul>
<li><a href="http://www.dotnetrocks.com/default.aspx?showNum=80">Tool Development</a> </li>
<li><a href="http://www.dotnetrocks.com/default.aspx?showNum=185">Discoverability</a> </li>
<li><a href="http://www.dotnetrocks.com/default.aspx?showNum=338">The Science of Good UI</a> </li>
</ul>
<p>I don’t mean to be so biased toward CodeRush, but this is the tool I’m personally familiar with; it has a broader range of functionality and seems to get the majority of press coverage.&#160; However, those who do talk about Resharper speak highly of it, so I recommend you check out both of them to see which one works best for you.&#160; But above all: <strong>go check them out!</strong></p>
<h2>Refactor – Rename</h2>
<p>Refactoring code is something we should all be doing constantly to avoid the accumulation of <a href="http://blogs.construx.com/blogs/stevemcc/archive/2007/11/01/technical-debt-2.aspx">technical debt</a> as software projects and the requirements on which they are based evolve.&#160; There are many refactorings in Visual Studio for C#, and many more in third-party tools for several languages, but I’m going to focus here on what I consider to be the most important refactoring of them all: <strong>Rename</strong>.</p>
<p>Why is <strong>Rename</strong> so important?&#160; Because it’s so commonly used, and it has such far-reaching effects.&#160; It is frequently the case that we give poor names to identifiers before we clearly understand their role in the “finished” system, and even more frequently the case that an item’s role changes as the software evolves.&#160; Failure to rename items to accurately reflect their current purpose is a recipe for code rot and greater code maintenance costs, developer confusion, and therefore buggy logic (with its associated support costs).</p>
<p>When I rename an identifier with a refactoring tool, all of the references to that identifier are also updated.&#160; There might be hundreds of references.&#160; In the days before refactoring tools, one would accomplish this with Find-and-Replace, but this is dangerous.&#160; Even with options like “match case” and “match whole word”, it’s easy to rename the wrong identifiers, rename pieces of string literals, and so on; and if you forget to set these options, it’s worse.&#160; You can go through each change individually, but that can take a very long time with hundreds of potential updates and is a far cry from a truly intelligent update.</p>
<p>Ultimately, the intelligence of the <strong>Rename </strong>refactoring provides safety and confidence for making far-reaching changes, encouraging more aggressive refactoring practices on a more regular basis.</p>
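<p>To make the Find-and-Replace hazard concrete, here is a minimal sketch of a purely textual rename.&#160; The <strong>NaiveRename</strong> helper and the sample source text are hypothetical, made up for illustration; the only library call is the standard <strong>Regex.Replace</strong>.&#160; Even with “match case” and “match whole word” semantics, a textual replace cannot tell Customer’s field from Product’s, or from a string literal:</p>
<pre class="code">using System;
using System.Text.RegularExpressions;

public static class TextualRename
{
    // Hypothetical helper: a case-sensitive, whole-word textual replace,
    // roughly what Find-and-Replace does with both options switched on.
    public static string NaiveRename(string source, string oldName, string newName)
    {
        return Regex.Replace(source, @"\b" + Regex.Escape(oldName) + @"\b", newName);
    }

    public static void Main()
    {
        // Made-up sample code; the intent is to rename only Customer.Name.
        string source =
            "class Customer { public string Name; }\n" +
            "class Product  { public string Name; }\n" +
            "string caption = \"Name\";\n";

        // All three occurrences change, including Product's field and the string literal.
        Console.WriteLine(NaiveRename(source, "Name", "FullName"));
    }
}</pre>
<p>A refactoring tool works from resolved symbols rather than raw text, which is what lets it leave Product’s field and the string literal alone.</p>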
<h2>Abolishing Magic Strings</h2>
<p>I am intensely passionate about any tool <u>or coding practice</u> that encourages refactoring and better code hygiene.&#160; One example of such a coding practice is the use of lambda expressions to select identifiers instead of using evil “magic strings”.&#160; As described in <a href="http://dvanderboom.wordpress.com/2009/08/20/strongly-typed-dynamic-linq-order-operator/">my article on dynamically sorting Linq queries</a>, relying on “magic strings” would force me to write something like this to sort a Linq query dynamically:</p>
<p><font size="2">Customers = Customers.Order(<span style="color:#a31515;">&#34;LastName&#34;</span>).Order(<span style="color:#a31515;">&#34;FirstName&#34;</span>, <span style="color:#2b91af;">SortDirection</span>.Descending);</font></p>
<p>The problem here is that “LastName” and “FirstName” are oblivious to the <strong>Rename </strong>refactoring.&#160; Using the refactoring tool might give me a false sense of security in thinking that all of my references to those two fields have been renamed, leading me to The Pit of Despair.&#160; Instead, I can define a function and use it like the following:</p>
<pre class="code"><font size="2"><span style="color:blue;">public static </span><span style="color:#2b91af;">IOrderedEnumerable</span>&#60;T&#62; Order&#60;T&#62;(<span style="color:blue;">this </span><span style="color:#2b91af;">IEnumerable</span>&#60;T&#62; Source,
    <span style="color:#2b91af;">Expression</span>&#60;<span style="color:#2b91af;">Func</span>&#60;T, <span style="color:blue;">object</span>&#62;&#62; Selector, <span style="color:#2b91af;">SortDirection </span>SortDirection)
{
    <span style="color:blue;">return </span>Order(Source, (Selector.Body <span style="color:blue;">as </span><span style="color:#2b91af;">MemberExpression</span>).Member.Name, SortDirection);
}</font></pre>
<pre class="code"><font size="2">Customers = Customers.Order(c =&#62; c.LastName).Order(c =&#62; c.FirstName, <span style="color:#2b91af;">SortDirection</span>.Descending);</font></pre>
<p>This requires a little understanding of the structure of expressions to implement, but the benefit is huge: I can now use the refactoring tool with much greater confidence that I’m not introducing subtle reference bugs into my code.&#160; For such a simple example, the benefit is dubious, but multiply this by hundreds or thousands of magic string references, and the effort involved in refactoring quickly becomes overwhelming.</p>
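<p>One detail of that expression structure is worth spelling out: when the selected member is a value type (an int, say), the compiler boxes it to object and wraps the member access in a Convert node, so a bare cast of the lambda body to MemberExpression yields null.&#160; The following is a minimal sketch of a name extractor that unwraps that node first.&#160; It is not the Order method itself; the <strong>MemberNames.Of</strong> helper and the <strong>Customer</strong> class are hypothetical names used only for illustration.</p>
<pre class="code">using System;
using System.Linq.Expressions;

// Hypothetical model class, used only for this illustration.
public class Customer
{
    public string LastName { get; set; }
    public int Age { get; set; }
}

public static class MemberNames
{
    // Returns the name of the property or field picked out by the lambda.
    public static string Of&#60;T&#62;(Expression&#60;Func&#60;T, object&#62;&#62; selector)
    {
        Expression body = selector.Body;

        // A value-type member such as an int is boxed to object, so the
        // compiler wraps the member access in a Convert node; unwrap it.
        UnaryExpression unary = body as UnaryExpression;
        if (unary != null &#38;&#38; unary.NodeType == ExpressionType.Convert)
            body = unary.Operand;

        MemberExpression member = body as MemberExpression;
        if (member == null)
            throw new ArgumentException(
                "Expected a simple member access such as c =&#62; c.LastName.", "selector");

        return member.Member.Name;
    }

    public static void Main()
    {
        Console.WriteLine(Of&#60;Customer&#62;(c =&#62; c.LastName)); // prints LastName
        Console.WriteLine(Of&#60;Customer&#62;(c =&#62; c.Age));      // prints Age
    }
}</pre>
<p>Throwing on anything other than a plain member access keeps a mistyped selector from silently turning into a null reference later on.</p>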
<p>Coding in this style is most valuable when it’s a solution-wide convention.&#160; So long as you have code that strays from this design philosophy, you’ll find yourself grumbling and reaching for the inefficient and inelegant Find-and-Replace tool.&#160; The only time it really becomes an issue, then, is when accessing libraries that you have no control over, such as Linq-to-Entities and the Entity Framework, which make extensive use of magic strings.&#160; In the case of EF, this is mitigated somewhat by your ability to regenerate the code it uses.&#160; In other libraries, it may be possible to write extension methods like the Order method shown above.</p>
<p>It’s my earnest hope that library and framework authors such as the .NET Framework team will seriously consider alternatives to, and an abolition of, “magic strings” and other coding practices that frustrate otherwise-powerful refactoring tools.</p>
<h2>Refactoring Across Languages</h2>
<p>A tool is only as valuable as it is practical.&#160; The <strong>Rename</strong> refactoring is more valuable when coding practices don’t frustrate it, as explained above.&#160; Another barrier to the practical use of this tool is the prevalence of multiple languages within and across projects in a Visual Studio solution.&#160; The definition of a project as a single-language container is dubious when you consider that a C# or VB.NET project may also contain HTML, ASP.NET, XAML, or configuration XML markup.&#160; These are all languages with their own parsers and other language services.</p>
<p>So what happens when identifiers are shared across languages and a Rename refactoring is executed?&#160; It depends on the languages involved, unfortunately.</p>
<p>When refactoring a C# class in Visual Studio, the XAML’s x:Class value is also updated.&#160; What we’re seeing here is cross-language refactoring, but unfortunately it only works in one direction.&#160; There is no refactor command to update the x:Class value from the XAML editor, so manually changing it causes my C# class to become sadly out of sync.&#160; Furthermore, this seems to be XAML specific.&#160; If I refactor the name of an .aspx.cs class, the <strong>Inherits</strong> attribute of the <strong>Page</strong> directive in the .aspx file doesn’t update.</p>
<p>How often do you think someone would want to change a code-behind file for an ASP.NET page, and yet would not want to change the Inherits attribute?&#160; Probably not very often (okay, probably NEVER).&#160; This is a matter of having sensible defaults.&#160; When you change an identifier name in this way, the development environment does not respond in a sensible way by default, forcing the developer to do extra work and waste time.&#160; This is a failure in UI design for the same reason that Intellisense has been such a resounding success: Intellisense anticipates our needs and works with us; the failure to keep identifiers in sync by default is diametrically opposed to this intelligence.&#160; This represents a fragmented and inconsistent design for an IDE to possess, thus my hope that it will be addressed in the near future.</p>
<p>The problem should be recognized as systemic, however, and addressed in a generalized way.&#160; Making individual improvements in the relationships between pairs of languages has been almost adequate, but I think it would behoove us to take a step back and take a look at the future family of languages supported by the IDE, and the circumstances that will quickly be upon us with Microsoft’s <a href="http://msdn.microsoft.com/en-us/oslo/default.aspx">Oslo</a> platform, which enables developers to more easily build tool-supported languages (especially DSLs, <a href="http://en.wikipedia.org/wiki/Domain-specific_language">Domain Specific Languages</a>).&#160; </p>
<p>Even without Oslo, we have seen a proliferation of languages: IronRuby, IronPython, F#, and the list goes on.&#160; A refactoring tool that is hard-coded for specific languages will be unable to keep pace with the growing family of .NET and markup languages, and certainly unable to deal with the demands of every DSL that emerges in the next few years.&#160; If instead we had a way to identify our code identifiers to the refactoring tool, and indicate how they should be bound to identifiers in other languages in other files, or even other projects or solutions, the tools would be able to make some intelligent decisions without understanding each language ahead of time.&#160; Each language’s language service could supply this information.&#160; For more information on Microsoft Oslo and its relationship to a world of many languages, see <a href="http://dvanderboom.wordpress.com/2009/01/17/why-oslo-is-important/">my article on Why Oslo Is Important</a>.</p>
<p>Without this cross-language identifier binding feature, we’ll remain in refactoring hell.&#160; I <a href="https://connect.microsoft.com/oslo/feedback/ViewFeedback.aspx?FeedbackID=404898">offered a feature suggestion</a> to the Oslo team regarding this multi-master synchronization of a model across languages that was rejected, much to my dismay.&#160; I’m not sure if the Oslo team is the right group to address this, or if it’s more appropriate for the Visual Studio IDE team, so I’m not willing to give up on this yet.</p>
<h2>A Default of Refactor-Rename</h2>
<p>The next idea I’d like to propose here is that the <strong>Rename</strong> refactoring is, in fact, a sensible default behavior.&#160; In other words, when I edit an identifier in my code, I more often than not want all of the references to that identifier to change as well.&#160; This is based on my experience in invoking the refactoring explicitly countless times, compared to the relatively few times I want to “break away” that identifier from all the code that references it.</p>
<p>Think about it: if you have 150 references to variable <strong>Foo</strong>, and you change <strong>Foo</strong> to <strong>FooBar</strong>, you’re going to have 150 broken references.&#160; Are you going to create a new <strong>Foo</strong> variable to replace them?&#160; That workflow doesn’t make any sense.&#160; Why not just start editing the identifier and have the references update themselves implicitly?&#160; If you want to be aware of the change, it would be trivial for the IDE to indicate the number of references that were updated behind the scenes.&#160; Then, if for some reason you really did want to break the references, you could explicitly launch a refactoring tool to “break references”, allowing you to edit that identifier definition separately.</p>
<p>The challenge that comes to mind with this default behavior concerns code that spans solutions that aren’t loaded into the IDE at the same time.&#160; In principle, this could be dealt with by logging the refactoring in a location that all of the solutions involved can access and that gets checked into source control.&#160; The next time the other solutions are loaded, the log is loaded and the identifiers are renamed as specified.</p>
<h2>Language Property Paths</h2>
<p>If you’ve done much development with Silverlight or WPF, you’ve probably run into the PropertyPath class when using data binding or animation.&#160; PropertyPath objects represent a traversal path to a property such as “Company.CompanyName.Text”.&#160; The travesty is that they’re always “magic strings”.</p>
<p>My argument is that the property path is such an important construct that <u>it deserves to be a core part of language syntax</u> instead of just a type in some UI-platform-specific library.&#160; I created a data binding library for Windows Forms for which I created my own property path syntax and type, and there are countless non-UI scenarios in which this construct would also be incredibly useful.</p>
<p>The advantage of having a language like C# understand property path syntax is that you avoid a whole class of problems that developers have used “magic strings” to solve.&#160; The compiler can then make intelligent decisions about the correctness of paths, and errors can be identified very early in the cycle.</p>
<p>Imagine being able to pass property paths to methods or return them from functions as first-class citizens.&#160; Instead of writing this:</p>
<p>Binding NameTextBinding = new Binding(&#34;Name&#34;) { Source = customer1 };</p>
<p>… we could write something like this, have access to the <strong>Rename</strong> refactoring, and even get Intellisense support when hitting the dot (.) operator:</p>
<p>Binding NameTextBinding = new Binding(@Customer.Name) { Source = customer1 };</p>
<p>In this code example, I use the fictitious @ operator to inform the compiler that I’m specifying a property path and not trying to reference a static property called Name on the Customer class.</p>
<p>With property paths in the language, we could solve our dynamic Linq sort problem cleanly, without using lambda expressions to hack around the problem:</p>
<p><font size="2">Customers = Customers.Order(@Customer.LastName).Order(@Customer.FirstName, <span style="color:#2b91af;">SortDirection</span>.Descending);</font></p>
<p>That looks and feels right to me.&#160; How about you?</p>
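<p>Until a language offers something like that, a nested member-access lambda can stand in for a property path.&#160; Here is a minimal sketch under stated assumptions: the <strong>PropertyPaths.For</strong> helper and the <strong>Customer</strong> and <strong>Company</strong> classes are hypothetical, and the helper only understands plain chains of properties or fields rooted at the lambda parameter.&#160; It walks the expression tree from the outermost member inward and joins the names with dots.</p>
<pre class="code">using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical model classes, used only for this illustration.
public class Company { public string CompanyName { get; set; } }
public class Customer { public Company Company { get; set; } }

public static class PropertyPaths
{
    // Builds a dotted path such as "Company.CompanyName" from a member chain.
    public static string For&#60;T&#62;(Expression&#60;Func&#60;T, object&#62;&#62; selector)
    {
        Expression current = selector.Body;

        // Unwrap the boxing Convert node that appears for value-type members.
        UnaryExpression unary = current as UnaryExpression;
        if (unary != null &#38;&#38; unary.NodeType == ExpressionType.Convert)
            current = unary.Operand;

        // The outermost member comes first in the tree, so push names onto
        // a stack and read them back in reverse to get root-to-leaf order.
        Stack&#60;string&#62; names = new Stack&#60;string&#62;();
        MemberExpression member = current as MemberExpression;
        while (member != null)
        {
            names.Push(member.Member.Name);
            current = member.Expression;
            member = current as MemberExpression;
        }

        // The chain must contain at least one member and end at the lambda parameter.
        if (names.Count == 0 || !(current is ParameterExpression))
            throw new ArgumentException(
                "Expected a member chain such as c =&#62; c.Company.CompanyName.", "selector");

        return string.Join(".", names.ToArray());
    }

    public static void Main()
    {
        // prints Company.CompanyName
        Console.WriteLine(For&#60;Customer&#62;(c =&#62; c.Company.CompanyName));
    }
}</pre>
<p>The resulting string can be handed to any API that still expects a path string, such as the Binding constructor shown earlier, while the lambda itself stays visible to the <strong>Rename</strong> refactoring.</p>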
<h2>Summary</h2>
<p>There are many factors of developer productivity, and I’ve established refactoring as one of them.&#160; In this article I discussed tooling and coding practices that support or frustrate refactoring.&#160; We took a deep look into the most important refactoring we have at our disposal, <strong>Rename</strong>, and examined how to get the greatest value out of it in terms of personal habits, as well as long-term tooling vision and language innovation.&#160; I proposed including property paths in language syntax due to their general usefulness and their ability to solve a whole class of problems that have traditionally been solved using problematic “magic strings”.</p>
<p>It gives me hope to see the growing popularity of <a href="http://en.wikipedia.org/wiki/Fluent_interface">Fluent Interfaces</a> and the use of lambda expressions to provide coding conventions that can be verified by the compiler, and a growing community of bloggers (such as <a href="http://weblogs.asp.net/podwysocki/archive/2009/03/19/functional-net-lose-the-magic-strings.aspx">here</a> and <a href="http://handcraftsman.wordpress.com/2008/11/11/how-to-get-c-property-names-without-magic-strings/">here</a>) writing about the abolition of “magic strings” in their code.&#160; We can only hope that Microsoft program managers, architects, and developers on the Visual Studio and .NET Framework teams are listening.</p>
</div>]]></content:encoded>
</item>
<item>
<title><![CDATA[F#: Nested Data Structures, Enumeration and Sequence Comprehensions]]></title>
<link>http://stevehorsfield.wordpress.com/2009/09/02/f-nested-data-structures-enumeration-and-sequence-comprehensions/</link>
<pubDate>Wed, 02 Sep 2009 08:54:04 +0000</pubDate>
<dc:creator>Steve Horsfield</dc:creator>
<guid>http://stevehorsfield.wordpress.com/2009/09/02/f-nested-data-structures-enumeration-and-sequence-comprehensions/</guid>
<description><![CDATA[I have been working on a business data model for my WPF demo application (which I hope to begin desc]]></description>
<content:encoded><![CDATA[I have been working on a business data model for my WPF demo application (which I hope to begin desc]]></content:encoded>
</item>

</channel>
</rss>
