nice to know – Page 6 – Data, data en nog eens data

Reading and writing in Java

Jun 22, 2015

—

by

Reading from and writing to files is not easy in Java. This is already apparent if one simply googles “Java FileReader problem”: that search generated 477,000 hits. Apparently, reading (and writing) is not trivial. I wrote a small program that is able to read a file and copy its contents to another file.…
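To give an idea of what such a copy program can look like, here is a minimal sketch. It is not the program from the post itself: the class name `FileCopySketch`, the method `copyLines` and the choice of UTF-8 are assumptions made for illustration only.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical class name; a sketch, not the program from the original post.
class FileCopySketch {

    // Copies a text file line by line and returns the number of lines copied.
    // UTF-8 is an assumption; a file in another encoding needs another Charset.
    static int copyLines(Path source, Path target) throws IOException {
        int count = 0;
        // try-with-resources closes both files, also when an exception occurs.
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java FileCopySketch <source> <target>");
            return;
        }
        int copied = copyLines(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Copied " + copied + " lines");
    }
}
```

Most of the usual trouble sits in two places: forgetting to close the files and relying on the platform default encoding. The sketch avoids both by using try-with-resources and by naming the character set explicitly.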

Another set of keys and values

Jun 16, 2015

—

by

tom

in nice to know

Another example of how mappers and reducers are used in a Hadoop context is given below. This programme is created as three classes. One class is an overall class that calls two other classes: a mapper class and a reducer class. The mapper class reads a file and creates a series of words. In the first…
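The division of labour between a mapper and a reducer can be sketched in plain Java, without any Hadoop classes. This is only an illustration of the idea, not the programme from the post: the class `WordCountSketch`, its method names and the choice of counting words are assumptions. In a real Hadoop job the framework groups the pairs by key between the two steps; here the reduce step does that grouping itself.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class; plain Java that mimics the two steps, no Hadoop API involved.
class WordCountSketch {

    // Map step: split one line into words and emit a (word, 1) pair per word.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce step: add up the values that belong to the same key.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        pairs.addAll(map("data data en nog eens data"));
        pairs.addAll(map("nog meer data"));
        System.out.println(reduce(pairs));
    }
}
```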

Hive – connecting from SQL Developer

Apr 28, 2015

—

by

tom

in nice to know

My impression is that the big development now taking place in the world of Big Data is the creation of connectors. Such connectors enable us to continue using standard tools (R, for example) with the data being stored in Hadoop. I am very much impressed with Hive. Hive allows us to access data being stored…

Use Case, Business Events and Time Events

Oct 5, 2014

—

by

tom

in nice to know

In a previous post, I showed the context diagramme. I then continued by saying that each of the arrows that flow to and from the bubble in the middle can be translated into a use case. But one may take a slightly different view: each of these arrows is either a business event or a time event. A business…

Access ODBC

Sep 21, 2014

—

by

tom

in nice to know

Just a little note. I discovered that one may access the 32-bit ODBC drivers on a 64-bit platform via C:\Windows\SysWOW64\odbcad32.exe and the 64-bit ODBC drivers via C:\Windows\System32\odbcad32.exe. This may be handy if one must access a 32-bit application (like Access 2007) via ODBC in a 64-bit environment. The standard ODBC…

Teradata: what columns to choose as Primary Index?

Aug 26, 2014

—

by

tom

in nice to know

A major question with Teradata tables is which columns to choose when a primary index must be created. I understand that four different arguments might be used: 1: Is the column often accessed? If it is often accessed, the best use is made of the distribution of records over the different AMPs. 2: Is the column…

Teradata and fall back

Aug 19, 2014

—

by

tom

in nice to know

I understand Teradata has a so-called fall back option. The idea is that data are stored twice, with each record being stored on two different AMPs. I saw a nice picture that describes the situation. Each record (Row 1, Row 2 etc.) is linked to an AMP. It is shown in the scheme below as yellow boxes.…

Dynamic SQL

Jul 17, 2014

—

by

tom

in nice to know

I must confess I had never heard of Dynamic SQL. But suddenly everyone around me started talking about Dynamic SQL. What is dynamic SQL? Dynamic SQL is a SQL statement that is created at runtime. Such is the definition. I realised that I had already created quite some dynamic SQL without realising that such construction is…
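The definition can be illustrated with a few lines of Java that assemble a statement from values only known at runtime. This is a minimal sketch under assumptions: the class `DynamicSqlSketch`, the method `buildSelect` and the table and column names are all made up for the example, and the statement is only built as text, not executed.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class; it only builds the statement text, it does not run it.
class DynamicSqlSketch {

    // Accept plain identifiers only, so that runtime input cannot smuggle in SQL.
    static boolean isIdentifier(String name) {
        return name != null && name.matches("[A-Za-z_][A-Za-z0-9_]*");
    }

    // Builds a SELECT statement from a table name and column names given at runtime.
    static String buildSelect(String table, List<String> columns) {
        if (!isIdentifier(table) || columns.isEmpty()) {
            throw new IllegalArgumentException("invalid table or empty column list");
        }
        for (String column : columns) {
            if (!isIdentifier(column)) {
                throw new IllegalArgumentException("invalid column: " + column);
            }
        }
        return "SELECT " + String.join(", ", columns) + " FROM " + table;
    }

    public static void main(String[] args) {
        // Table and column names are illustrative only.
        System.out.println(buildSelect("customers", Arrays.asList("id", "name")));
    }
}
```

Table and column names cannot be passed as bind parameters, which is why the sketch checks them against a strict pattern before concatenating. Values in a WHERE clause are better supplied through parameters than glued into the text.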

Duplicate records in Teradata tables

Jul 3, 2014

—

by

tom

in nice to know

Teradata offers the user the choice whether or not a check is made on duplicate records. Let’s first look at some code that allows duplicate records to be inserted. The code below has two elements that enable duplicate records: it contains a primary index that is not unique, and the table is a multiset table. CREATE…

Different codepages in SQL Server

Jun 23, 2014

—

by

tom

in nice to know

It is possible in SQL Server to create columns in a table that have different code pages. This can be shown in the table below. That table has two attributes: char_unicode, with a Unicode code page, and char_latin, which has a Latin code page with West European characters only. The char_latin column cannot contain characters like ł, İ or…
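The same limitation can be made visible outside the database. The sketch below asks a Java character set encoder whether a piece of text fits. It is an illustration under assumptions: the class `CodePageSketch` is made up, and ISO-8859-1 is used as a stand-in for a West European code page, which is not necessarily the exact code page that a given SQL Server collation uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical class; ISO-8859-1 stands in for a West European code page.
class CodePageSketch {

    // Returns true when every character of the text can be encoded in the charset.
    static boolean fits(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String polishL = "\u0142";   // ł
        String dottedI = "\u0130";   // İ
        String eAcute = "\u00e9";    // é

        for (String s : new String[] {polishL, dottedI, eAcute}) {
            System.out.println(s
                    + " fits Latin: " + fits(s, StandardCharsets.ISO_8859_1)
                    + ", fits Unicode: " + fits(s, StandardCharsets.UTF_16LE));
        }
    }
}
```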
