Tuesday, January 25, 2011

The Quick & Dirty Guide to Oracle Warehouse Builder 11g

Last week I found myself getting defensive with a colleague of mine in a conversation about the value of system prototyping. My colleague suggested that there's a difference between IT people and everyone else and that it's that two IT guys can look at a diagram and see how a system will work while everyone else needs to see the system in action. There is perhaps an element of truth to this but like most things in life I think it depends on the people involved (which is why I leapt to the defence of IT people everywhere as I don't like generalisations being made that somehow mark us out as different). Sometimes I can look at a diagram and make the imaginative leap to how a system might work. There are other times when I like to play with a piece of software or whatever to get to know it, and at the end of the day there's nothing like using a thing to understand how it operates.

Also last week I was discussing Oracle Warehouse Builder with someone. OWB (as it's known) is one of those systems that I've experienced but would like to get more experience of and it offers some functionality that might solve a nasty little problem many businesses suffer from which has bothered me for a while.

Speaking of nasty little business problems...

If you maintain any sort of business system, like ERP or CRM, then the chances are that over time all sorts of data quality problems occur in the database. Data duplication, dirty data, and difficulties with uploads from legacy systems are all variations on a theme of a data quality that plague businesses that maintain any kind of database system over an extended period of time. Most smaller organisations tend not to deal with these problems the way they should for a variety of reasons, mostly relating to cost.

When you look into dealing with data quality the costs can quickly mount up. Most firms would rather not have to deal with this rather messy business themselves so they look to outsourcing the problem which doesn't come cheap. If you decide to take a swing at the issue yourself then the cost of getting some heavy hitting software will put you off quite quickly. Dealing with data quality yourself is also going to take some time and needs quite a bit of expertise. That said, if you have the time and the inclination, there are some really good software options available to you.

Oracle Warehouse Builder is, as the name suggests, software for the purpose of creating and maintaining a data warehouse. OWB comes with all the tools needed to deal with the Extract Transform and Load (ETL) routines and can therefore deal with the data quality and mapping issues that come with warehouse building. These tools are useful not only for building warehouses but also for sorting out the poor data quality problems that haunt businesses everywhere and have been causing me grief of late.

I've been reading up on Oracle Warehouse Builder but reading about it only goes so far and I wanted to try it out, so it was time for a Quick & Dirty installation!


The Quick & Dirty approach to problems in IT is well established and considered such a valid process that it evolved into the Rapid Application Development methodology. In the case of trying out OWB I went for the quickest and dirtiest way of doing things.

Step 1: Operating System
I installed a Windows XP guest onto a VirtualBox virtual machine running on Windows XP. Doesn't get much more straightforward then that, and I wanted to see OWB in action, not an operating system so I decided to leave off the Solaris 11 or Linux installs for another time.

Step 2: Oracle 11g
Oracle Warehouse Builder comes bundled with the Oracle database, though it is licensed separately, so to install OWB you install Oracle. I downloaded a copy of Oracle Database 11g from the Oracle Technology Network and installed the Enterprise Edition just to be sure that I'd installed everything I needed. The OTN license allows for non-production development use of Oracle products so it's always the first port of call when you want to try out an Oracle product. However, a full (pricey) license is needed if that system is to be used commercially.

Step 3: OWB Schema Preparation
Oracle installs a couple of users for OWB but they're locked by default as a security precaution. In order to get OWB up and running the user accounts OWBSYS and OWBSYS_AUDIT, need to be unlocked and have their passwords set, a task best achieved via SQL Plus. There are also a handful of SQL scripts available that prepare the objects needed in the schema to run the OWB workspaces which can be run in SQL Plus while you're there.

Step 4: Repository Assistant
With the users operational the Repository Assistant can be run. The Repository Assistant is a GUI application that populates the repository and creates a workspace for you to conduct your Warehouse Building in as well as offering the ability to create users who own workspaces. The wizard is powerful and allows for all sorts of settings to be tweaked. I accepted the default of installing everything and found the process to be quite simple.

Step 5: Design Center
With OWB installed and configured its primary user interface is the Design Center application. In Design Center you can connect to sources of data and targets for that data. It's also used to configure Data Profiles and mappings that can transform data into the format you desire for your target database, usually the Data Warehouse. I set about a simple connection and loading exercise where I took data from a basic data source, an Excel file, and loaded into an Oracle table. Now, this is far from the best way of doing such a simple task in Oracle but I wanted to see OWB in action and I was interested in seeing how it would handle connecting to data in a format common to the smaller enterprise.

OWB comes with the ability to connect to a range of different databases via its built-in JDBC drivers. Connecting to an Oracle database is, as expected, a simple task and there are options for SQL Server, Sybase, and Informix systems, as well as an ODBC option which piqued my interest. The OWB Sources & Targets guide offered details on how to connect to ODBC options like Excel and while the procedure follows the standard Oracle approach to doing things (lots of file editing and connections into connections into the database you're trying to get to) it does make sense and isn't too hard to follow. It didn't take long to get to a point where data could be extracted from Excel, transformed into a format of my choosing, and loaded into an Oracle database.

My quick & dirty installation performed admirably and has provided not only an insight into how to setup OWB and begin construction of a Data Warehouse (albeit without the massive amounts of designing that would be needed a real world situation) but has also given me a better understanding of what can be done to reign in common data quality problems as the OWB Design Center provides tools for dealing with duplications, common Address formats, and for writing transformation rules that enable you to fix and map practically any data problem.

Data Warehouses in their truest form only really come into their own when there are a large number of systems in use all collecting data, preferably from all across a large organisation and the warehouse and its associated tools can consolidate that data into a meaningful store that can then be analysed as a whole. In the case of a smaller organisation the issues relate more to how to migrate data from legacy systems into new ones and how to deal with data quality problems, as ideally the number of core systems in use in the business should be relatively limited. For all the power that OWB can bring to solve these issues it's likely that the cost of licensing the software will be prohibitive for the types of companies that could get the most value from it.

While I had gotten a little defensive during that conversation last week, I was in favour of the idea to do some prototyping and at the end of a Quick & Dirty installation you realise just how much can be gained from the exercise, and how much can be lost if anyone forgets that the Quick & Dirty way is not the right way.

Links:
Quick & Dirty in IT: http://en.wikipedia.org/wiki/Quick-and-dirty
Oracle Warehouse Builder Installation and Admin Guide: http://download.oracle.com/docs/cd/B28359_01/owb.111/b31280/reqs01.htm#i1139371
Financial and Management Reporting: Data Warehouse vs ERP System: http://www.glassoc.com/resource_center_white_papers.php?id=4&msg=1


Monday, January 17, 2011

Really Remote Live Internet Video Streaming on a Budget

The I.T. industry is made up of lots of companies who declare themselves to be solution providers of varying sorts but I wonder how many times a prospective customer really goes to one of these firms and says "I have a problem" and how many times the provider actually has to go off and come up with a solution.

Not too long ago I was presented with an interesting problem that really needed a solution. And an inexpensive one at that.

A good friend of mine asked for my assistance with an I.T. problem he had. He farms for a living and was wondering if some modern technology could take some of the hassle out of his job, in particular the need to be out late or very early checking on livestock during lambing/calving season. He recently built a new shed for baby animal production and wanted to setup a CCTV type system that would let him keep an eye on the beasts over the Internet. We got to talking about his requirements and it quickly became obvious that an IP camera accessed via the Internet was the best fit for his needs. Of course, it's not as easy as simply buying one of these and turning it on, and due to the agricultural nature of the environment there were some unique challenges to be dealt with that you wouldn't encounter in an office setting.

Problem Number One: No Power.
That's right, the first issue to be addressed was that the shed is quite remote and used to house farm animals so there's no electricity supply. Any equipment that was to be installed needed to get power from somewhere therefore some research had to be done. I had a feeling that the answer lay in the caravan/motor home world so I took a look at how electrical power is delivered in those situations. A car battery or similar is obvious, but you don't just remove the plug from your kit and wire the battery up directly, instead a piece of equipment called an inverter (or more accurately, a DC/AC inverter) is attached to the battery and then provides power via a standard 3 pin plug. Inverters are used in caravans and the like to power electrical items like TV's, DVD players, lights, and so on, and are able to provide a fair bit of power over extended periods; running a router and a camera for days and days without charging is no problem to a setup like this.

Problem Number Two: No Internet
Like I said, the shed is off the beaten path so there's no phone line to provide nice and easy broadband Internet access. The quick solution to this is mobile broadband from one of the mobile companies and luckily 3G coverage on O2 and 3 Ireland was available. To be able to use a Mobile USB Broadband setup required a router with a USB port and capabilities to use 3G to access the Internet. The Zoom 4501 router was chosen as it is relatively inexpensive and has quite a good compatibility list for USB modems. 3 Ireland offer a Huawei modem that's on the list and it connected up the first time it was tried, though it did present a sneaky little difficulty later on.

Problem Number Three: DHCP
In my previous posting I discussed the problems with DHCP and broadband and accessing equipment like cameras over the internet without a static IP address. It was working on this project that I really learned about Dynamic DNS and its capabilities and for the most part DDNS worked perfectly. The only problem was with the default configuration of the USB modem which threw off the DDNS routing as it had its APN set to one of two options, and it was the second one I needed. A quick call to the good folk over at 3 Ireland tech support sorted this quickly.

With these challenges sorted out the solution was implemented. The equipment list is actually quite short and as solutions go it's simple, which is always the best kind of solution. A large car battery is connected to a DC/AC inverter to provide power to a router that uses a USB modem to connect to the Internet. Into the LAN port of the router is connected an IP Camera (with pan, tilt, and night vision capabilities) that also gets its power from the battery. The camera has a small webserver on board to enable access to the imagery, controls, and settings. The router has port forwarding enabled so that requests from the Internet are serviced by the camera. The router is also connected to a DDNS provider so that, even though the WAN IP address changes, users can access the camera via a URL.

Doing a small project like this is an absolute joy as there's a surprising amount to be learned along the way and there's a great kick to be gotten from seeing it work - which it does! I was concerned that there'd be major problems or that the end result would be of too low a quality to be useful but luckily it turned out fine.

Like any first attempt at something there are some things that can and will be improved. On the hardware side the camera is an entry level model so it would be the first target for an upgrade in order to improve quality and add in the ability to zoom (a function that adds quite a bit to the price) especially as now I know how the whole thing goes together and therefore the risk of failure is greatly reduced. I'd also change the model of router to one that had the 3G modem built in so all that was needed was a SIM from a broadband provider. The things I'd add in terms of hardware is a small solar panel to charge the battery continuously and a decent IR lamp to better illuminate the shed for night vision. In terms of software, or more accurately configuration, I'm planning on looking at what benefits can be gained from tweaking QoS settings on the router.

All too often I.T. ends up being an over-engineered solution to a non-existent problem. Thankfully this time the problem was real (and therefore worth solving) and the solution simple and inexpensive.

Thursday, January 13, 2011

Dynamic DNS or "New Adventures in Old Routers"

Paracetamol is a wonder drug. Whenever I get the man flu (which can kill!) I have found that plain old paracetamol works far better than the various cold and flu remedies offered at outrageous prices at the local pharmacy. The way that particular medicine can deal with a man flu and the crippling symptoms that come with it is truly amazing. While many think that the age of wonder gave way to the age of reason a long time ago, I still find wonder in how things work, be that painkillers or computer networking systems.

Dynamic Host Configuration Protocol (DHCP) is a wonder. I've been impressed with it since I was a young I.T. Technician putting together my first Windows NT 4.0 network back in the day. DHCP overcomes the hassle of having to manually assign IP addresses to every node on a TCP/IP network by doing it automatically and provides a handy way of telling those nodes (computers, printers, or any networked device) various details of the network, like how to get out on to the Internet or the addresses of certain servers or whatever. DHCP is a neat solution to a big problem for Systems Administrators.

As networking has developed to the point where we have broadband speeds and routers instead of modems in our homes DHCP has once again stepped up to the mark and is providing broadband service users with IP addresses that identify them on the Internet. In this way the broadband companies are able to service all their customers even though they have a limited number of public IP addresses at their disposal by only assigning an IP address to an active router and only for a limited amount of time; every now and then your router will need to renew the lease of its IP address or be assigned a new one.

DHCP has been used by broadband providers as the number of IP addresses available to them is nearly exhausted and they are unable to give a unique IP address to every router on their network. Using DHCP in this way has the effect of changing the IP address of your router every now and then which is normally not a problem, but what if you wanted to take advantage of those broadband speeds to do something other than surf the Web? What if you wanted to access your computer remotely, or what if you wanted to share some files with friends or colleagues via a personal web server or ftp server, or (and this is increasingly common) what if you wanted to set up a web cam to keep an eye on something at home which you could access from work? How do you access your home or other broadband system if you don't have a fixed IP address and how could you ever know what the address will be from day to day?

There is a way, and it is called Dynamic DNS. Most routers come with the facility to use Dynamic DNS (DDNS) built in and it is this system that allows you to access your network via a standard broadband setup without the need to know the IP address at any given time. I've been playing with DDNS for a while now and let me tell you, if DHCP is a wonder then Dynamic DNS is some kind of sorcery!

The simplicity of how DDNS works is what makes it so great. The standard flavour of DNS (Domain Name System) is used to translate URL's (like hackerscoven.net) to IP addresses and it does this by maintaining a database of IP addresses and URLs. When you request a website via its URL, some DNS server somewhere checks its database, retrieves the IP address, and then sends you there. What DDNS does is map a URL to an IP address in the same way but also provides a mechanism to regularly update the IP address whenever it's changed by your broadband company.

Most routers come equipped with the facility to register with a DDNS provider, one of the most popular being DynDNS.com who offer a free DDNS service, and when I set this up in order to access a test webserver of mine I was surprised not only by how easy it was to get going but also by how fast the service operates, my little webserver responds nicely whenever I access it, though to be fair it's under no load at all really as it's only me using it.

DHCP solved a big problem for networking people and it did it well. DDNS solves a major problem for anyone with a need to access a computer remotely in an easy manner and the chances are the equipment you already have is able to use it. For testing webservers or sharing files, DDNS offers all the functionality without the cost of domain name registration, though the security burden is yours alone and can be quite high.

Tuesday, January 11, 2011

Hotmail & the iPhone

When I was in college our Programming lecturer cautioned against a career in I.T. due to the need to keep up with the constantly changing technologies such a career entails. Personally, I thought that sounds brilliant, always something new to learn and play with, bring it on, I said. Later on in my education I encountered other lecturers who said that yes, keeping up can be difficult as well as fun but there are some fundamental skills you can learn that transcend specific technologies (which is why good developers are able to learn several languages as long as they have a good grasp of the principles of programming).

I.T. is constantly changing and with that pace of change some technologies come into and slip out of fashion quite quickly. As technologies like smartphones and other internet devices become accepted by wider audiences than traditional I.T. people, fashion plays an increasingly important role. For example, the iPhone.

I've been using the iPhone for over two years now, having started out on a 3G and moving to the 3GS (I'm now waiting to see what Apple do next before having an "accident" that will justify a new phone). I do not feel like much of an Apple fanboy and have endeavoured to be quite critical of the iPhone in order to ensure that I wasn't just following the trend but was actually getting good tech for my money. My critical analysis of the iPhone has left me feeling that the best has yet to come especially as there's no escaping the fact that, really, it's not a great phone (that is, a device for making phone calls), but that it does do a lot of different things well that when put together make it the best choice.

I've looked at and used other smartphones, having been a Blackberry user before the iPhone, and I've implemented HTC handsets at work as there are some serious concerns about using iTunes (a necessity for the iPhone) in a multi-national corporate environment. I've seen a nice Android handset recently and I would like to see what Android can do differently/better then the iPhone. But for the time being I'm sticking with the boys from Cupertino.

As I'm on the 3GS I find myself increasingly getting into conversations with iPhone 4 users about the merits of that version of the phone and the challenges it has faced since its launch (the dodgy reception problem that's become known as Antennagate for example). 4 users, perhaps subconsciously, sometimes look down on 3GS users, and we in turn sneer at 3G users, and they mock Nokia users. This has little to do with technology, often nothing, as the users are rarely techies. This is powerful marketing at work that has made the latest slick device from Apple a fashionable thing to own.

Fashion in technology manifests itself quite graphically in email providers too. While I was at college, trying to ignore lecturers giving careers advice, I set up my first free email account with Hotmail. I've had a Hotmail account since February 1997, which was before Microsoft owned it and back when it ran on Unix servers. There have been other email addresses but for the sake of history and to maintain some "really early adopter" cred I've held on to the account.

There is a prejudice against certain email addresses. Obviously, addresses that are made up of some nickname are not going to fly if you're trying to pass yourself off professionally but the provider you choose, it is claimed, reveals something about you too. Worryingly it can be people who might be hiring you for a job that can subscribe to this notion and may automatically reject you for a high-tech position if you come with an old email address, or so it is suggested on some websites.

http://lifehacker.com/5445607/whats-your-email-address-prejudice
http://lifehacker.com/5447335/know-what-your-email-address-says-about-you
http://theoatmeal.com/comics/email_address
http://articles.chicagotribune.com/2010-01-10/news/1001090192_1_e-mails-aol-gmail-account

This is a tragedy as far as I'm concerned as I like my Hotmail address and have found the latest Hotmail integration with the iPhone to be excellent. Up to now getting email on the phone hasn't been a problem, but I really wanted push notifications (easy-peasy for Hotmail) and Calendars/To Do reminders to synchronise with the phone and the account, which are now working perfectly since you can configure the Hotmail account to act like an Exchange account on the iPhone.

The debate about email addresses and what they say about you leads me back to what my slightly more enlightened lecturers taught about good fundamentals. While @hotmail.com might not be the most desirable address in the world, and while Microsoft hasn’t had the best track record running Hotmail, as an email platform it's not bad and it does allow for good integration of other accounts. So, while I might be rocking a Gmail account and will certainly sign up to the next trendy service provider who comes along, I can manage it all from my dinosaur of a Hotmail account.... and access it from my iPhone (whenever I get a signal).