Thursday, 25 September 2008

MySQL - CouchDB performance comparison

Here are the results of our benchmarks!


Wednesday, 24 September 2008

CouchDB performance testing: first results

We are testing CouchDB's performance by measuring the time it takes to execute a query (that is, a view) under different conditions.

We used a modified version of the default test suite. I'm not posting it now because the code isn't very readable, but if you need it, just ask and I'll clean it up.

In this version we added some functions that create a named view and execute it with a single argument. We then ran the test suite from the Futon utility, collected the execution times it reported, and put them into a Google document.

The results are more bad than good: the database is fast when executing queries that are already indexed, but it's really slow at creating indexes.

We are working to extend this test to different sample-data sizes. So stay tuned!

Here is a mind map I made to explain the differences between CouchDB views. Thanks to the FreeMind application!

Yesterday... and plans for today

Yesterday we completed two user stories of our project in Java using couchdb4j:

- add a custom field and be able to see it in listings;
- find records by a specific field.

Then we switched to a more difficult task: evaluating the performance differences between CouchDB and MySQL. It's difficult because the two databases are logically very different, and it's not easy to find common operations to test and compare.

For example, the "find" operation in MySQL can be done with a query; it can be a compiled query or an interpreted one.
In CouchDB we need to create a view function and execute it; we can have "named" views, which resemble MySQL compiled queries, and temporary views, which are executed on demand.

In the CouchDB Futon utility (the web application bundled with CouchDB) it is possible to run a suite of tests and see their execution times.
By editing the file /usr/local/share/couchdb/www/script/couch_tests.js it is possible to modify the default tests or add new ones, so we quickly wrote a new JavaScript function to check the execution time of a named view.

Our results were really poor for the first access, when the view is created: around 1 second to create a view on a single field, for a database with only 5000 records. After the view was created, access was much faster, in the range of 5 to 15 ms.
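
The original measurement was a JavaScript function inside Futon's test suite, which is not shown here. As a rough Java sketch of the same idea, the helper below times repeated calls of one operation and keeps the first (cold) call separate from the later (warm) ones. The class and method names are hypothetical, and the operation being timed is passed in from outside.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helper: times repeated calls of one query. */
final class ViewTimer {

    /** Runs the query `runs` times; element i is the elapsed time of call i in ms. */
    static List<Double> timeCalls(Supplier<?> query, int runs) {
        List<Double> millis = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.get();
            millis.add((System.nanoTime() - start) / 1_000_000.0);
        }
        return millis;
    }

    /** Average of every call except the first one (the "warm" accesses). */
    static double warmAverage(List<Double> millis) {
        return millis.stream()
                .skip(1)
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(Double.NaN); // fewer than two samples: no warm average
    }
}
```

The supplier would wrap whatever call executes the named view. Element 0 of the result then corresponds to the access that builds the index, and warmAverage summarizes the accesses that reuse it.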

I don't know if this problem is related to an error we made while writing the test, but I'm pretty sure I can rule that out. Maybe there is another way to improve the performance of the standard JavaScript view server; I'll try to find out today. Otherwise we'll have to change our project, because such poor performance would make CouchDB unusable in any production environment.

Stay tuned! Andrea

Monday, 22 September 2008

What a day of work! :)

Today's work has been really intense. You can also tell from the time of this post: it's 8 pm here, and for the first time I took some work home.

Today we started working on how CouchDB handles custom fields. In our application (you can see the link in the previous post) we had already written the Create, Read and Delete parts of CRUD; we still needed the Update functionality.
At first we wrote our code keeping the previous structure, with fixed fields in the class Person. This did not suit the CouchDB approach: when we had to insert a custom field, we ran into the limitation of the fixed fields.
So we decided to make a major design change in our application: we switched from the standard class design to a more flexible one, where instead of fixed named attributes we store them in a HashMap.
This way we could handle variable-length lists of attributes, and also attributes with variable names; soon afterwards we were able to handle the update part successfully.
We had some problems because part of our application was still written in the old style and we forgot to convert it. When we realized that, we felt really stupid, because we had been going crazy looking for the cause of some missing functionality.
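
A minimal sketch of that design change, assuming string-valued fields. The class name FlexiblePerson and its methods are made up for illustration and are not taken from the actual project code.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical record whose fields live in a map instead of fixed attributes. */
final class FlexiblePerson {
    private final Map<String, String> fields = new LinkedHashMap<>();

    /** Adds a field or overwrites an existing one (covers custom fields and updates). */
    void set(String name, String value) {
        fields.put(name, value);
    }

    /** Returns the value, or null when the record has no such field. */
    String get(String name) {
        return fields.get(name);
    }

    /** Read-only view in insertion order, e.g. for listings. */
    Map<String, String> asMap() {
        return Collections.unmodifiableMap(fields);
    }
}
```

Because adding a field and updating one are the same put operation, a custom field needs no change to the class itself.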

Friday, 19 September 2008

Today's work

Today we worked on the project addressbookcouchdb4j. It's an example project for the couchdb4j library, and I hope it will prove useful for anybody trying to work with CouchDB in Java.

I ran into some difficulties in this work. Some were due to my imperfect knowledge of Java; others to the lack of documentation for both CouchDB and the library. However, I contacted Marcus Breeze, the developer of couchdb4j, and he explained something we had missed. Maybe we could join forces to help the CouchDB user base: thanks a lot, Marcus!

The current state of our work can be seen easily at the following address:

http://www.bitbucket.org/metalelf0/addressbookcouchdb4j/overview/


Feel free to clone the hg repository if you want to take a peek at the code ;)

Currently our code can insert, remove and list elements in a simple CouchDB database. It's an address book: hehe, not so original, but it was the simplest test case that came to mind. We'll add some more functions in the next few days, so stay tuned!

Thursday, 18 September 2008

What we did in the last two days

This is a summary of the work I did with my colleague, Simone Albertini, in the last two days.

Tuesday, September 16

  • Study of CouchDB on the official site. Read the introduction, technical overview and wiki. Studied the difference between relational and non-relational DBs, and the concepts of document and view.
  • Local installation of a working CouchDB on a Linux VM.
  • Work on the local DB. Experimented with CouchDB's HTTP communication system. Tried the basic Ruby access library.
  • Started using CouchDB in a Java environment. Installation and tryout of the couchdb4j library and its dependencies.

Wednesday, September 17

  • Getting deep into CouchDB.
  • Brainstorming about possible techniques we can use to reimplement a relational DB as a document-oriented one.
  • Downloaded and studied the source files of couchdb4j, ran the included tests, and saw they didn't pass.
  • Partial reimplementation of the provided tests, to make them pass and to understand the library's behaviour.
  • Read some introductions to Lucene and its areas of application.

Wednesday, 17 September 2008

Relational to document based: what to do?

I'm trying to figure out how to "convert" a relational database to a document-based one. Specifically, an existing (legacy) relational database.

As an example, think about a DB where the record can be represented by the following Java class:

import java.util.Date;

public class Person {
    String name;
    String surname;
    Date registration_date;
}

If I want to write an instance of Person to a document-based DB such as CouchDB, I can use the following approaches.

The simplest one that may come to mind is making Person implement the Serializable interface, serializing the instance, and then writing the resulting bytes to a document. However, this would make the DB completely useless: it would contain only opaque bytes, and I wouldn't be able to retrieve data without the Java serializer/deserializer. I would be using the DB as a file system, so this approach is definitely bad.

Another idea could be: for each field, write it to the document in a form like

{ field.key , field.value.toString() }

However, this would work without problems only for the first two fields. When we write registration_date.toString() to the document, we would have some problems reconstructing the date from the document.
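
One possible way around the date problem, offered as a sketch and not as the solution the project ended up with: Date.toString() produces text like "dow mon dd hh:mm:ss zzz yyyy", which drops the milliseconds and relies on a time zone abbreviation, whereas an ISO-8601 UTC string can be parsed back exactly. The class and method names below are hypothetical, and the sketch assumes Java 8 or later for java.time.

```java
import java.time.Instant;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that flattens Person-like data into string pairs. */
final class PersonFields {

    /** The date is stored as an ISO-8601 UTC string instead of Date.toString(). */
    static Map<String, String> toFields(String name, String surname, Date registrationDate) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", name);
        fields.put("surname", surname);
        fields.put("registration_date", registrationDate.toInstant().toString());
        return fields;
    }

    /** Rebuilds the Date from the stored string. */
    static Date readRegistrationDate(Map<String, String> fields) {
        return Date.from(Instant.parse(fields.get("registration_date")));
    }
}
```

The stored value stays human-readable inside the document, and the round trip keeps the millisecond precision that java.util.Date carries.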

I'm currently looking for a solution to this. Stay tuned!

CouchDB accessing issues

I installed the CouchDB database on my local system without any problem. Then I tried the given Ruby code to access the DB. It's really easy, as CouchDB has no query language and instead uses HTTP messages as directives; so, as I expected, the Ruby code worked flawlessly.

THEN, I tried the Java code. I'm still trying it. Unluckily, as you can see on this page,

http://couchdb4j.googlecode.com/svn/trunk/javadoc/index.html


the documentation is lacking. I'll work on this during the next days to make it better.

Tuesday, 16 September 2008

CouchDB

CouchDB is a document-based database developed by the Apache Software Foundation. Its primary features are eventual consistency, high availability and extreme scalability. It's written in Erlang. Here's the related link on Wikipedia: http://en.wikipedia.org/wiki/CouchDB.

Here's a link to a good installation guide on Ubuntu Hardy 8.04:

http://barkingiguana.com/2008/06/28/installing-couchdb-080-on-ubuntu-804

It also works perfectly with the latest version of CouchDB (currently 0.8.1).

Monday, 15 September 2008

15/09/2008: Erlang

Today I'm continuing my Friday work on Erlang. Sorry for not posting about that, but we had some network problems and I forgot to update during the weekend.

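Here is some sample code to explain tail recursion in Erlang.

-module(sum).
-export([sum/1, sum_acc_caller/1, sum_acc/2]).

%% Plain recursion: the addition runs after the recursive call returns.
sum([H|T]) -> H + sum(T);
sum([]) -> 0.

%% Tail recursion: the running total travels in the accumulator X.
sum_acc([H|T], X) -> sum_acc(T, X+H);
sum_acc([], X) -> X.

%% Entry point that starts the accumulator at 0.
sum_acc_caller(H) -> sum_acc(H, 0).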

The basic problem we face is the sum of the integers in a list. In this code you can see two functions:

- the first one is sum/1. This is the basic, non-tail-recursive function. It calls itself n times, where n is the size of the list, and only then performs the additions as the calls return, yielding the expected value.
- the second one is sum_acc/2. This is the tail-recursive function: it uses an accumulator variable X to keep the value of the sum while traversing the list. It keeps adding to this variable until the remaining part of the list (tail) has size 0; then it returns the sum.
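
To make the difference concrete, here is a transliteration of the two functions into Java, with the evaluation of a three-element list traced in the comments. It is only meant to show the shape of the two recursions: unlike Erlang, the JVM does not guarantee tail-call elimination, so in Java both versions still use one stack frame per element. The class and method names are made up for this sketch.

```java
import java.util.List;

/** Hypothetical Java counterpart of the Erlang sum module. */
final class ListSum {

    /**
     * Like sum/1: the addition is pending until the recursive call returns.
     * sum([1,2,3]) = 1 + sum([2,3]) = 1 + (2 + sum([3])) = 1 + (2 + (3 + 0)) = 6
     */
    static long sum(List<Integer> xs) {
        if (xs.isEmpty()) return 0;
        return xs.get(0) + sum(xs.subList(1, xs.size()));
    }

    /**
     * Like sum_acc/2: the recursive call is the last thing the method does.
     * sumAcc([1,2,3], 0) = sumAcc([2,3], 1) = sumAcc([3], 3) = sumAcc([], 6) = 6
     */
    static long sumAcc(List<Integer> xs, long acc) {
        if (xs.isEmpty()) return acc;
        return sumAcc(xs.subList(1, xs.size()), acc + xs.get(0));
    }

    /** Like sum_acc_caller/1: starts the accumulator at 0. */
    static long sumAccCaller(List<Integer> xs) {
        return sumAcc(xs, 0);
    }
}
```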

Thursday, 11 September 2008

Using JIRA in an agile environment


Although JIRA is not explicitly designed for use in an agile environment, its flexibility allows agile developers to adapt it to their needs, making it a good complementary tool for project management.

After a first look, there are some obstacles in JIRA issue management that we’ll need to work around:

  • there is no “difficulty point” variable for the issues;
  • there’s no “iteration” concept: we’ll need to use the “version” variable to represent the iteration;
  • the available reports and graphs are based on the concept of tasks per version. However, in an agile environment this information is not important, because the difficulty of the tasks matters much more than their number.

Finding a solution to these problems is not easy. Anyway, there are some plugins, both commercial and freeware, that can help us improve our JIRA experience. Let’s see the most important ones.


GreenHopper

GreenHopper by GreenPepper Software is at the moment the most used plugin for agile developers using JIRA. It’s a stable project with very frequent updates (they claim one release every 2.5 weeks). But it’s not a free plugin: the licence prices range from 350 Canadian dollars (234 €) to 1150 Canadian dollars (768 €) per production server. They offer a free licence too, but only for managers of open-source projects.

GreenHopper is a full implementation of the Agile planning wall in a JIRA plugin.

It represents each of your JIRA issues as an index card; different types of issues have different colors. Each card displays the Summary and the other information crucial to planning, like estimates and time remaining. It hides all of the other information (full description, comments, etc.), but it is all just a click away. You can drag and re-order the tasks. You can edit them on the fly, to add comments or log work.

The Planning Board helps you schedule many issues over an entire release cycle. GreenHopper also offers a Task Board to show you the work-queue for a version or component. And there is a Chart Board which will show you the burn-down chart for a release.

The chart board offers the following graphs:

  • Display of the burndown curve
  • Display of the team effort curve
  • Display of the estimation accuracy curve
  • Burndown chart based on a custom field
  • Burnup chart based on a custom field
  • Value chart based on a custom field
  • Issue filtering
  • Configurable start date and end date

The most evident downsides of this plugin are that the graphs it produces are limited to a single iteration (or version), and that they can be seen only after the iteration is closed.

This problem is mostly due to JIRA’s internal structure: as noted above, JIRA lacks an “iteration” concept. The team effort curve for a single iteration, telling how many story points are being completed every day, doesn’t give any information about the overall project.

The Planning board, instead, is a good way to keep track of the story cards of the project. However many agile developers still prefer to have physical papers, sticky notes and a real blackboard.

Link: GreenHopper plugin

CFO Approval Required

CFO Approval Required is a plugin developed by Go2Group. Its original purpose is different from what an agile team would use it for, but it may still find its place. It’s a free plugin, so try it if you need it. It has not been maintained since the original 1.0 version, released in April 2007.

This plug-in is designed to allow a CFO or manager to approve issues based on the amount of a requested purchase. The issue is then forwarded for approval to a second user, such as a Controller or Project Manager.

This plug-in is also used for any generic approval process such as travel request, vacation request, Feature Request, Test Requests, PO, etc.

This plug-in can be useful in scenarios such as a developer submitting a changeset (a set of associated changed source code files) to a team member for verification; if the team member approves it, the changeset is then marked as ready for test. This scenario has proven useful to shops working in extreme programming, scrum and agile environments.

In my opinion, this plugin can be useful only in big teams, or in outsourced development teams. Agile practices treat communication as the most important value, so decisions are usually made together in the open space. However, in different environments this plugin could help a lot.

Link: CFO Approval Required plugin

Custom Issue Order

This freeware plugin is very simple, yet it can be very useful. It has not been updated since November 2007.

It adds a custom field to order issues. With this plugin, any issue list can be ordered in a custom way. Useful for work queues or fine-grained prioritizing.

It can be a cheap alternative (even if less eye-pleasing) to the GreenHopper planning board: if your only need is to sort user stories, and you don’t care about colors and such, this one is for you. The custom field, however, cannot solve the big problems noted before: it’s tied to the single issue, not to iterations (= versions).

Link: Custom Issue Order

Laughing Panda JIRA Agile Report Plugins

This free plugin currently offers a single graph: a workload burndown chart (in hours, or from a custom field like story points). Again, it suffers from the same problems as its bigger brother GreenHopper: graphs are tied to the single iteration, so they don’t give any “wide view” of the project.

Link: Laughing Panda

Agile Velocity Tracking plugin

This free plugin offers a report that will generate velocity tracking charts over versions. The report displays charts for Velocity Points tracking. Versions are treated as iterations. The points are gathered from a custom field which must be manually added. Iterations can be of variable length. Forecasting is currently done on “yesterday’s weather”.
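
As a worked example of what “yesterday’s weather” usually means in agile planning, the sketch below forecasts the next iteration's velocity as the points completed in the most recent finished iteration. Whether the plugin computes it exactly this way is an assumption, and the class and method names are hypothetical.

```java
import java.util.List;

/** Hypothetical "yesterday's weather" forecast; not the plugin's actual code. */
final class YesterdaysWeather {

    /** Forecast for the next iteration = points completed in the latest finished one. */
    static int forecastNext(List<Integer> completedPointsPerIteration) {
        if (completedPointsPerIteration.isEmpty()) {
            throw new IllegalArgumentException("need at least one finished iteration");
        }
        return completedPointsPerIteration.get(completedPointsPerIteration.size() - 1);
    }

    /** Iterations needed to finish the remaining points at that velocity, rounded up. */
    static int iterationsLeft(int remainingPoints, int forecastVelocity) {
        if (forecastVelocity <= 0) {
            throw new IllegalArgumentException("velocity must be positive");
        }
        return (remainingPoints + forecastVelocity - 1) / forecastVelocity;
    }
}
```

With completed points of 18, 22 and 20 over three iterations, the forecast is 20 points, and 45 remaining points would need 3 more iterations.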

This plugin offers something really different from the previous ones: the chance to analyze more than a single iteration. The information gathered from its graphs is usually more important in the agile developer’s eyes: points over stories, points burndown.

We could argue that this plugin needs a couple of iterations completed before proving its strength, as it doesn’t give any information about the current iteration; the predictions made by this plugin are not based on a very complex model, so they could be inaccurate and should be treated carefully.

Development of this plugin has been inactive since August 2007.

Link: Agile Velocity Tracking Plugin

Agile Wall Plugin

This is another free plugin, offering a display of a Project Wall.

Report for viewing issue statuses in the same way as in agile project progress walls (issues in three columns: to do/in progress/done). Many agile teams use task walls to represent the current status of the sprint (or project). Tasks are grouped by their status into three columns: Not started, In progress and Done. Tasks are usually also sorted so that top priority issues are at the top and low priority issues are at the bottom.

The Agile Wall Report plugin tries to mimic this view in JIRA so that team members can use the same kind of view for the project version currently being developed.

Note that the current version uses default JIRA statuses. Issues with “In Progress” status are rendered in the middle and issues with “Closed” status are rendered as done. Other status types are rendered in the Not started column.
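
The status-to-column rule just described can be written down as a tiny function. This is a sketch of the rule as stated in the plugin description, not the plugin's source; the class, enum and method names are hypothetical.

```java
/** Hypothetical restatement of the described status-to-column rule. */
final class AgileWallColumns {

    enum Column { NOT_STARTED, IN_PROGRESS, DONE }

    static Column columnFor(String status) {
        if ("In Progress".equals(status)) return Column.IN_PROGRESS;
        if ("Closed".equals(status)) return Column.DONE;
        // Every other status, custom ones included, falls through to here.
        return Column.NOT_STARTED;
    }
}
```

Under this rule any status other than the two named ones, for instance a custom status, ends up in the Not started column.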

The biggest limitation of this plugin is that custom issue statuses (e.g. “waiting for approval”, test-related statuses and so on) cannot be used. However, it’s still a good plugin, allowing a visual representation of a chosen sprint (or version) to be seen in JIRA. As with the GreenHopper plugin, many agile developers prefer a real board, so the real need for this plugin is subjective; but this one is free, so you can try it and decide for yourself.

The latest release was made on 17 July 2007.

Link: Agile Wall Plugin

11/9/2008

Morning task: try to figure out how i18n works.



Setting a text variable in a .properties file and referencing it in the view is working.

For example, if the file is called assigned_portlet.properties, and its path in the plugin dir is

src/etc/com/atlassian/jira/plugin/portlets/example/assigned/assigned_portlet.properties

it should be referenced in the atlassian-plugin.xml with a line like this one:

<resource type="i18n" name="i18n" location="com.atlassian.jira.plugin.portlets.example.assigned.assigned_portlet" />
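
For readers unfamiliar with .properties files, this is roughly how a key declared in such a file turns into display text in plain Java. How JIRA itself resolves these keys inside its templates is not shown in this post, so treat the sketch as background only; the class name, the method name and the key portlet.title are invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical lookup over .properties content; not JIRA's own mechanism. */
final class I18nLookup {

    /** Parses .properties text and returns the value for key, or the key itself if missing. */
    static String text(String propertiesContent, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key, key);
    }
}
```

For instance, with the content "portlet.title=Assigned to me", looking up portlet.title returns "Assigned to me", while an unknown key comes back unchanged.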

Now, I have to understand which variables or properties are not explicitly being assigned by the user.



Update: I think those values are being set in another file; the problem is that I have only the jar file and can't see its source code. As a developer, trying to evaluate a product like JIRA without being able to see its code is a really tough task.

In the afternoon I switched to a simpler task. I'm writing a paper about the JIRA plugins available for agile developers.

Wednesday, 10 September 2008

10/9/2008

Task: Try to write a more complicated Jira plugin.



Again, the JIRA developer resources SUCK. They provide a developer toolkit without any readme file. They provide plugin examples that don't build. They don't provide any step-by-step tutorial or explanation about how to do things. It's like trying to study marine biology by fishing from a lonely ship in the middle of the ocean.

The provided helper command "mvn eclipse:eclipse" should generate an Eclipse project file for a new plugin. OK, every time I launch it, it downloads more than a hundred MB from the web. No matter if the files are already in the local repository: it *has* to download them again. I'm getting sick of wasting my time waiting for a damn helper tool to finish.

Update: reading some forum comments (yeah, not the official documentation, comments!) I found out that JIRA/Atlassian plugins have to be built with version 1.x of Maven. Amazing. The newest version, 2.0.x, doesn't work for this; it's practically a completely different application. This should be written in capital letters in a huge font on EVERY developer page. Instead, it's in a forum comment. Great work, Atlassian guys! What wonderful documentation you have!

Then I tried to build a plugin, this time successfully. However, I can't understand the parameter-passing format of the Velocity templates. There's a variable being printed correctly without being assigned anywhere, and manually assigning a value to a variable doesn't work. Still searching for a solution. This should be an i18n (internationalization) variable, and I can't find any documentation about it.



Still stuck on the i18n issue. Posted a question about this on the atlassian jira forum:

http://forums.atlassian.com/thread.jspa?threadID=28566&tstart=0


Tuesday, 9 September 2008

9/9/2008

Writing a sample Jira plugin:



Found a great tutorial for a "hello, world" servlet plugin: Sample hello world plugin tutorial
However, be careful if you copy and paste from this tutorial, because the blog platform has messed up the syntax a little.


Maven sucks!
At compile time it doesn't check whether the XML files provided are valid. It says everything is OK; then, when you start the Tomcat server, you have to find out by yourself what is stopping it from loading by searching the startup log. What a PITA!
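
One way to catch at least the simplest mistakes before starting the server is to run the descriptor through a standard XML parser. The sketch below uses the JDK's built-in parser and only checks well-formedness (balanced tags, quoted attributes); it does not validate against any plugin schema. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

/** Hypothetical pre-deploy check; well-formedness only, no schema validation. */
final class XmlCheck {

    /** Returns null when the XML parses, otherwise a description of the parser error. */
    static String wellFormednessError(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return null;
        } catch (Exception e) {
            // SAXException for malformed input; other failures are reported the same way.
            return e.toString();
        }
    }
}
```

A tag that is opened but never closed, for example, is reported here instead of surfacing later in the server's startup log.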

After 4 pomodori I finally managed to get a working plugin. This whole thing is a mess. Just to get a "Hello, world!" string on my JIRA front page I had to write around 70 lines of code in many different files. OK, it's only my second day of work on this subject, but the lack of information (the *simple* kind) is driving me mad.

Here is the link to the jar file I generated. Install it as usual in your jira installation to try it.

8/9/2008

- Jira installation on local machine and exploration of the application:



- JIRA and XP: plugins, graphs. Looked for information about JIRA usage in XP environments.



- Example JIRA plugin. Installed the plugin developer toolkit and tried to understand the basic structure of a plugin. Tried to write a sample plugin, without success.



Note: simply installing the developer resources took 2 pomodori: the download process of the needed files was very long.