Author: tom
-
Oracle OBIEE
Recently, I came across OBIEE. This is an Oracle tool created to support OLAP-type investigation of a database. I have to confess: I am impressed with it. The first great thing for me as an analyst is that it is closely integrated with the existing Oracle tools. One may continue to use the…
-
Generate table in Oracle
It regularly happens to me that I want to generate a random set of records in an Oracle table. That could happen if we want to assess the performance of a certain procedure or, as another example, if we want to estimate the size of a table. A great thing about Oracle is the random function where…
-
Sqoop and Hive
It is possible to use Sqoop to load data directly from an RDBMS into Hive. This opens interesting possibilities. Data that are stored in an RDBMS and that need to be analysed on a cheaper platform can be migrated via Sqoop to a Hadoop platform. Sqoop is generally seen as a reliable medium to undertake such…
-
Hive – MapReduce extension
It is good to realise that Hive is built upon a MapReduce framework. Hive was developed by Facebook to facilitate analysis of Hadoop files. It is possible to use a kind of SQL dialect instead of a Python or Java programme to do your analysis. When a Hive…
-
Python: yet another way to implement map/reduce
In this blog, I will discuss the word count problem as done with Python. It is often used to show how MapReduce works. In most examples, it is developed within the context of a Java programme. The idea is that the programme is split into two stages. In one stage, calculations are made on…
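To make the two-stage idea concrete, here is a minimal sketch of a word count in plain Python. The names `mapper`, `reducer` and `word_count` are assumptions chosen for illustration, and the sketch runs in a single process rather than on a Hadoop cluster, so it only shows the shape of the two stages.

```python
from collections import defaultdict


def mapper(lines):
    """Map stage: emit a (word, 1) pair for every word in the input lines."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce stage: sum the emitted counts per word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


def word_count(lines):
    """Chain the two stages; on a cluster a shuffle/sort step would sit in between."""
    return reducer(mapper(lines))


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "The fox"]
    print(word_count(sample))
```

On a real Hadoop platform the two functions would typically live in separate scripts that read from standard input and write to standard output, with the framework grouping the pairs by word between the stages.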
-
ODBC and Hive
In my view, the new development that we see now is the building of links to a Hadoop platform. One such development is building ODBC drivers that allow Windows tools to access a Hadoop platform. As an example, one may think of Excel accessing tables on Hive. Think for a second about the possibilities: one may…
-
Pig revisited
Recently, I revisited Pig. Pig is a language that allows you to analyse data sets in a Hadoop environment. It was created at Yahoo to circumvent the technicalities of creating a MapReduce Java job. Yahoo claims that most of its queries on a Hadoop platform can be replaced by a Pig script. As Pig is…
-
Oops, how much tablespace is left?
A few days ago, I was asked to load some tables in Oracle. A rather trivial request, but I wasn’t sure if enough tablespace was left. From the table definition, I found out which tablespace was used. After that, I ran the query below to see how much tablespace was actually left. I want to…
-
Sqoop
Sqoop is a tool that allows you to ship data from an RDBMS to a Hadoop platform. Let us take an example to clarify this. One may have some data in a MySQL table persons, within database thom. This database is stored on server 62.51.51.999. The data can be accessed with the knowledge of the…
-
Pushing files via Netcat
Netcat is a Unix utility to investigate network connections. It has now been ported to Windows, which allows us to query network connections on a Windows platform with netcat (nc). A nice possibility is to push files via nc from one machine to another. Assume for the moment that both machines have netcat…