Good ‘help’ for ‘less’

Usually, for any Unix command line tools, I go for the man pages. And I did the same thing for less for a few years. I have never looked into less --help until now.  The help is easy-to-read and contains vast information you would ever need. So, at least for the less utility, consider using the command-line help itself instead of searching the manuals.

Happy Programming.

 

ETL Tools – A misunderstanding

ETL Tools

They got their name based on the operations they are intended to do. Extract, Transform and Load are their primary operations.  They make our life easier by automating most of the things in creating or running the data warehouses. There are several such software that exist in market. Talend is an open source one whereas Informatica and Data Stage are leaders in this market.

Now.  What is the misunderstanding here?

  • Extract – This operation is supposed to be carried out on several source systems.
  • Transform – This operation is supposed to happen at the ETL tool itself.
  • Load – Obviously, this is supposed to be done at the target data warehouse.

 

But I have seen some systems that do the following.

  • extract the existing data from the target data warehouse (not from the source)
  • add the daily incremental data
  • do some transformation
  • truncate the existing data
  • load the new data back to the target

 

What is wrong here?

So many GBs of data are extracted from and loaded back into the data warehouse. Do you see the huge number of Disk I/Os here?  If we are able to load the incremental data and process the data within the data warehouse or better say the database itself, we can save the time spent for the disk input/output operations.  We do not need the ETL tool here. The database engine should be able to apply the transformation logic.  Database should not be used just for storing the data.  Database has powerful processing engines SQL or stored procedures.  We should know to use them correctly.

I wonder what would have made them to choose such a design approach here?

 

Happy Programming!

 

Binary to Decimal – Doubling

I really didn’t know that there is an easier way of converting from binary to decimal as shown in the wikihow article. The main advantage of this method is we don’t have to compute the powers of 2 (as shown in the positional notation method of the given article). Another advantage is that this method is generic and can be used for any base to decimal conversion.

Let us jump to the code now.

use strict;
use warnings;

sub bin2dec_doubling {
    my ($bin, $sum) = @_;
    $sum = 0     if not defined $sum;
    return $sum  if $bin eq '';
    bin2dec_doubling((substr $bin, 1), 2 * $sum + (substr $bin, 0, 1));
}

print bin2dec_doubling($ARGV[0]), "\n";

The code doesn’t deal with positional notation method but only with the doubling method. Also, it is pretty much easy to write the logic using procedural language like Perl (hardly 4 lines). What about SQL?

with recursive
binnum as (select '1011001'::text as str),

tmp as (
select ''::text as curdigit, cast(0 as numeric) as sum, 0 as level
 union all
select substr(binnum.str, level + 1, 1), sum(2 * sum + cast(substr(binnum.str, level + 1, 1) as int)) over (), level + 1
  from tmp cross join binnum
 where substr(binnum.str, level + 1, 1) <> ''
)

select max(sum) from tmp;

The above SQL snippet is tested on PostgreSQL 9.4 and yes it is not pretty formatted.
Hope you get some idea from this.

Happy Programming!

After a long time …

It has been almost a year since my last post. So many things have happened these days. My laptop OS (Ubuntu) got corrupted (so many things lost). That could be also be one of the reasons I didn’t blog. On my personal side, I got married.

Anyway, nowadays, I am working in Netezza database. Not much technically; it is just a plain old SQL queries; not even stored procedures. Apart from Netezza, I learn some other stuff like a bit of Cognos, a bit of Control-M, a bit of Agile, etc. and these things make me not too bored of my work.

Now, I’d like to add a minimum technical value to this post.

In Netezza, unlike Oracle or PostgreSQL, we can refer to the computed columns at the same level of the SELECT/FROM clause.

For e.g. in Netezza:


select date'2015-05-11' as dt, dt + 1 as dt2;

The same code, in Oracle, should be written as:


select dt, dt + 1 as dt2 from (select date'2015-05-11' as dt from dual);

This feature can be helpful some times. Happy Programming!

http://exercism.io

http://exercism.io

This is really a very good site where you can learn new languages, and learn new ideas from others’ codes, review and appreciate each others’ code, etc. This is really a great effort and I enjoy it. I am stuck with the initial exercise named “Bob” because it involves unicode as well. But Perl5 and Go languages do not have “Bob” as the 1st exercise. So, as of now, I have some breathing space. Let us see.