Big Data, Big Problems: Leveraging Informatica 9.5 to Build an Effective Data Governance Strategy to Meet the Big Data Challenge

Clients continually ask me about big data as the subject keeps gaining press. I often approach the topic from the angle that bigger data volumes result in bigger data problems. Although this seems like a logical premise, what it really means to an organization, and how to plan accordingly, is often overlooked. Rather than solve the problem in this blog post, I want to focus on two key considerations from a data governance standpoint, and discuss why SSG sees the Informatica 9.5 Platform as a core component of a sound data governance strategy that can support an organization’s business decision-making success.

Regardless of the amount of data within an organization, the same types of problems arise around the integrity of the data, and the same need exists to put policies, procedures and an organizational structure in place to address data governance. The difference is that the size and scope of data issues (and how to resolve them) will become magnified as the volumes of data increase. In addition, the type of data often varies in a big data scenario, with a significant amount of unstructured data being available for consumption into an organization.

When discussing the importance of data governance with clients, as well as the impact big data can have on this strategy, I often start with two core tenets:

  1. What is your starting point? – Before you can determine where you would like to go with a data governance initiative, you need to understand the starting point. The first key question is: what is the business objective or goal for considering a governance initiative? If you don’t have a clear purpose, the initiative is likely to lose steam over time. Once you understand the objective, other questions to ask include: Do you have an understanding of the quality of data within your organization? Are tools in place to resolve data quality issues and report on them? What level of support do you have from senior management to embark on a data governance initiative? The intention of these questions is to determine whether data governance is just a buzzword or something that could gain traction if a plan is put into place within the organization.
  2. Don’t Boil the Ocean – This phrase refers to having a realistic game plan for the governance initiative. Although data governance should be adopted throughout the entire organization, starting at a project level will allow you to pilot the processes, organizational changes and/or technologies that have been selected as a part of the data governance initiative. This also goes back to point #1, as we need to understand our starting point in order to set realistic expectations when building a data governance organization based on the current maturity around the topic within the company.

How does the Informatica 9.5 Platform help?

Informatica 9.5 is focused on maximizing return on data within an organization. If we consider some of the capabilities provided in Informatica 9.5 along with the need to develop a data governance strategy within an organization, we can make sure your company is ready to address the big data challenge. Just remember my earlier point: the concepts around data governance are the same; only the data volumes and types of data are going to be different.

With Informatica 9.5, companies can increase the value of data to their organization while also lowering the cost of data. If we consider these points from a data governance standpoint and the big data challenge, some of the benefits that Informatica 9.5 provides to an organization include:

Increasing the Value of Data:

  • Relevant data: The larger the volumes of data, the more important it is to determine which information is relevant to your business. Just having the ability to access Facebook and Twitter data doesn’t make it relevant, as you need to have a purpose for this information. Having data governance policies and standards in place is critical, as you need to determine how you plan to use the additional data feeds and who will be responsible for making decisions based on the new information. Informatica 9.5 provides new capabilities such as Natural Language Processing (NLP) and Social MDM, which help organizations further analyze and integrate social data. Data governance policies provide the framework to decide how this information can be leveraged in your organization’s business activities.
  • Timely data: Having policies in place that determine how you will bring real-time data feeds into the organization and who will be responsible for using the data (and the timeframe for leveraging it) will be the responsibility of the data governance organization. From a software perspective, data streaming capabilities in Informatica 9.5 will help your organization capture data in a timely fashion. The data governance organization needs to ensure you have the processes/policies in place to do something with it before the data is no longer relevant.

Lower Cost of Data:

  • Business costs considerations: I have seen a number of organizations make bad decisions as they do not have the right software technology and policies in place within the organization to know that their data is “bad”. Through the use of Informatica 9.5, your organization will have increased visibility into data issues through data discovery, and the ability to resolve the issues in a more governed process through data stewardship workflows.
  • Labor costs considerations: A typical scenario within organizations involves teams of people performing manual intervention to resolve data issues. By leveraging Informatica 9.5, you can automate the discovery of data issues across hundreds of tables or sources at once. As you migrate from a manual process of data review, the team members within the organization previously responsible for these activities can be leveraged in other areas more critical to the success of the business. In addition, from a governance perspective, the ability to catalog issues and build business glossaries will provide reusable templates and rules that can further reduce costs within the organization.

As your organization looks to address the big data challenge, you need to ensure that you have the foundation in place for ongoing success. Although a data governance strategy addresses the people, processes and policies required, you also need a software platform that can enable these processes. Informatica 9.5 provides the software platform required to quickly jumpstart your efforts in order to ensure your data governance strategies are executed as planned. Remember this mantra – think big (data) but start small to build the data governance framework needed for ongoing success.


Handling Special Characters (i.e. Foreign Languages) in BRM

Some BRM installations require special characters to be displayed on invoices and for other purposes.  This can pose additional problems for those not familiar with the different character sets.

This can lead to all kinds of confusing issues as any of the following may be using differing character sets:

  • BRM Database
  • Unix Shell
  • SSH Program used to connect to Unix OS
  • Windows Operating System

By default, Solaris and Windows will use a variant of the ISO-8859 character set, while the BRM database is typically UTF-8.

This will lead to problems if you try to copy/paste text with special characters from your Windows box to a file that will be loaded into the database, such as testnap files or localized string configurations.  If you load ISO-8859 strings into a UTF-8 database, any non-ASCII characters will not display correctly.

Once you recognize the differences between the character sets, you can make accommodations to translate between them when needed.

Unix systems provide a command line application called “iconv” that can be used to translate files between different character sets.  For example, to convert a testnap script file from ISO-8859-1 to UTF-8, the following command could be used:

iconv -f ISO-8859-1 -t UTF-8 iso8859_file.nap > utf8_file.nap
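
Converting a file that is already UTF-8 as if it were ISO-8859-1 would double-encode its special characters, so it can help to check the file first. The following is only a sketch: it assumes an iconv that exits with a non-zero status on invalid input, and the function names is_utf8 and to_utf8 are made up for this example. Note that a short ISO-8859-1 file can occasionally pass as valid UTF-8 by coincidence, so treat the check as a hint rather than proof.

```shell
#!/bin/sh
# is_utf8 FILE: succeeds if FILE decodes cleanly as UTF-8.
# (Hypothetical helper; relies on iconv failing on invalid byte sequences.)
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# to_utf8 SRC DEST: copy SRC unchanged if it is already valid UTF-8,
# otherwise convert it from ISO-8859-1 to UTF-8.
to_utf8() {
    if is_utf8 "$1"; then
        cp "$1" "$2"
    else
        iconv -f ISO-8859-1 -t UTF-8 "$1" > "$2"
    fi
}
```

With these defined, `to_utf8 iso8859_file.nap utf8_file.nap` gives the same result as the command above for an ISO-8859-1 file, and leaves a file that is already UTF-8 untouched.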

Once the file is in UTF-8 format, its special characters will not display correctly in a console that uses ISO-8859-1 as its character set.  To view UTF-8 correctly, you must switch both your shell and your SSH program to UTF-8.

In Solaris 10 this is done by setting the LC_ALL environment variable as such:

LC_ALL=en_US.UTF-8
export LC_ALL
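
To confirm that the shell picked up the new setting, you can query the active character map. This is a minimal sketch, assuming the standard `locale` utility is available and reports `UTF-8` for UTF-8 locales; the function name shell_is_utf8 is made up for this example.

```shell
#!/bin/sh
# shell_is_utf8: succeeds when the current locale's character map is UTF-8.
# (Hypothetical helper; a missing locale command is treated as "not UTF-8".)
shell_is_utf8() {
    [ "$(locale charmap 2>/dev/null)" = "UTF-8" ]
}

if shell_is_utf8; then
    echo "shell locale is UTF-8"
else
    echo "shell locale is not UTF-8; check LC_ALL and the installed locales"
fi
```

This only checks the shell side. The SSH program has its own character set setting, which still has to be changed separately.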

Setting up BRM Shared Memory Segments on Solaris 10

This blog entry deals with setting up the shared memory and other system resource pools required by BRM when installing on the Solaris 10 platform.

The PIN user on the operating system must be set up to use operating system shared memory above the Solaris 10 default settings.

There are 2 ways to achieve this:

  • Set shared memory parameters in /etc/system (deprecated)
  • Use of Solaris Projects

Modifying /etc/system

This method requires a reboot of the operating system once the changes have been made.

The following lines should be added to /etc/system:

set shmsys:shminfo_shmmax=1073741824
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmmni=250
set shmsys:shminfo_shmseg=100
set semsys:seminfo_semmni=750
set semsys:seminfo_semmns=75000
set semsys:seminfo_semmsl=100
set semsys:seminfo_semmap=75000
set semsys:seminfo_semmnu=75000
set msgsys:msginfo_msgmap=75000
set msgsys:msginfo_msgmax=6144
set msgsys:msginfo_msgmni=640
set msgsys:msginfo_msgssz=64
set msgsys:msginfo_msgtql=640
set msgsys:msginfo_msgseg=32768

Solaris Projects

Solaris projects are the intended replacement for the /etc/system modifications going forward.  Shared memory for a given user may be controlled via projects without a system reboot, but the user must log out and log back in for any changes to take effect.

projadd -c "pin" 'user.pin'
projmod -s -K "process.max-sem-nsems=(privileged,256,deny)" 'user.pin'
projmod -s -K "project.max-shm-ids=(privileged,256,deny)" 'user.pin'
projmod -s -K "project.max-shm-memory=(privileged,4294967296,deny)" 'user.pin'

Once the project has been added, log in as the pin user to make sure the changes have taken effect:

prctl -i project user.pin
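
The byte values used for shared memory are easier to sanity-check when derived from a size in GiB: 1073741824 (the shmmax value in /etc/system) is 1 GiB, and 4294967296 (the project.max-shm-memory value) is 4 GiB. Here is a small sketch of that arithmetic, assuming a shell with POSIX arithmetic expansion such as ksh or bash (the legacy Solaris /bin/sh does not support it); the function name gib_to_bytes is made up for this example.

```shell
#!/bin/sh
# gib_to_bytes N: print N GiB expressed in bytes.
# (Hypothetical helper; needs a shell with $(( )) arithmetic.)
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 1   # prints 1073741824
gib_to_bytes 4   # prints 4294967296
```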

Installing BRM Client Tools on Linux x64 Platforms

Recently, I have been setting up a new Fedora 16 x64 BRM/Web development environment (I need more than 4 GB of RAM!).  I ran into an issue while installing the DeveloperCenter application provided for Linux platforms.  Included here are the hoops I had to jump through to get it working.

Install the client tools as per the BRM documentation.

By default the application will be installed in /opt/portal/$PORTAL_VERSION/$APP_NAME.

Each application comes with its own JRE version, which is 32 bit and does not work with x64 libraries (X11, etc).

The following error will occur if you do not update the JRE:

[allan@fed64-devel ~]$ Exception in thread "main" java.lang.UnsatisfiedLinkError: /opt/portal/7.4/DevCenter/jre/lib/i386/xawt/libmawt.so: libXext.so.6: cannot open shared object file: No such file or directory
    at java.lang.ClassLoader$NativeLibrary.load(Native Method)
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1751)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1647)
    at java.lang.Runtime.load0(Runtime.java:769)
    at java.lang.System.load(System.java:968)
    at java.lang.ClassLoader$NativeLibrary.load(Native Method)
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1751)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1668)
    at java.lang.Runtime.loadLibrary0(Runtime.java:822)
    at java.lang.System.loadLibrary(System.java:993)
    at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:50)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.awt.NativeLibLoader.loadLibraries(NativeLibLoader.java:38)
    at sun.awt.DebugHelper.<clinit>(DebugHelper.java:29)
    at java.awt.Component.<clinit>(Component.java:545)

In order to prevent this, we must remove the installed JRE and provide a symbolic link to our platform specific 64 bit JRE.

For this example, JAVA_HOME is

JAVA_HOME=/usr/java/jdk1.6.0_30

The JRE provided with this JDK is $JAVA_HOME/jre.  Therefore to solve this problem:

su - root
... (enter password, etc)
cd /opt/portal/$PORTAL_VERSION/$APP_NAME
mv jre jre.old
ln -s /usr/java/jdk1.6.0_30/jre jre
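
If you have several client tools to fix, the swap can be wrapped in a small function. This is only a sketch: the function name swap_jre is made up for this example, and it assumes each application keeps its bundled JRE in a directory named jre and that the replacement lives at $JAVA_HOME/jre as described above.

```shell
#!/bin/sh
# swap_jre APP_DIR JAVA_HOME
# Moves the bundled JRE in APP_DIR aside and links APP_DIR/jre to JAVA_HOME/jre.
# (Hypothetical helper; run it as a user with write access to APP_DIR.)
swap_jre() {
    app_dir=$1
    java_home=$2

    # Refuse to continue if the replacement JRE is missing.
    if [ ! -d "$java_home/jre" ]; then
        echo "no JRE found under $java_home" >&2
        return 1
    fi

    # Nothing to do if the bundled JRE has already been replaced by a link.
    if [ -L "$app_dir/jre" ]; then
        return 0
    fi

    # Keep the original as jre.old so the change can be undone.
    mv "$app_dir/jre" "$app_dir/jre.old" &&
    ln -s "$java_home/jre" "$app_dir/jre"
}
```

For the DeveloperCenter example this would be called as `swap_jre /opt/portal/7.4/DevCenter /usr/java/jdk1.6.0_30`.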

At this point when you start the application, it should work. Now to make the application easier to start, we can add a symbolic link from /usr/local/bin to the start script for the application:

cd /usr/local/bin
ln -s /opt/portal/$PORTAL_VERSION/$APP_NAME/$START_SCRIPT.sh $APP_NAME

An example for the DeveloperCenter would be:

ln -s /opt/portal/7.4/DevCenter/DevCenter.sh devcenter

SSG on Tap event at Oracle Open World 2011

Steve Steinheimer and SSG held our second annual SSG on Tap event at Open World 2011. This event was held at the fabulous Larkspur Hotel in San Francisco.  Steve explained the complexities of each beer, and food that complemented it was served with each tasting.  The following combinations were served:

Pyramid, Haywire Hefeweizen:  Paired with Roasted Beets and Goat Cheese

Blue Moon:  Paired with Roasted Beets and Goat Cheese

Pilsner Urquell: Paired with Crostini with Tuna Tartare

Sam Adams Oktoberfest: Paired with Spiedini of Jerk Chicken

Mendocino Brewing, White Hawk I.P.A: Paired with Pizza with Peppers and Italian Sausage

Sierra Nevada Stout: Paired with Slider of BBQ Pulled Pork

Big thanks to all in attendance and to the Hotel staff for a wonderful evening.


Communications and Media Reception

The OOW2011 Communications and Media reception was held on Monday night and fun was had by all, as you can see in this picture of Jason Anderson and Steve Steinheimer!  We think Jason’s light-up glasses suit him. What do you think?


SSG Makes Lanyards Cool at OOW11

SSG had a great time at Open World 2011 this year.  If you were there, you may have noticed the “Talk nerdy to me” lanyards that we passed out. There were winners of a Kindle, Flip cameras, Starbucks gift cards and an Amazon gift card, and all they had to do was wear a cool lanyard!

Jon and Jason modeled their lanyards at the welcome reception.

 

Our lanyard models

Informatica World 2010 – A Quick Recap

Paul Scott and I recently attended the Informatica World Conference in Washington DC.   Here are some highlights from the event.

This was the first Informatica World conference since 2008.  There were about 1,300 attendees, with 48 countries represented.

The keynote speaker was Richard Clarke, the former White House Counterterrorism Czar, who served under three different presidents.  He spoke about Cyber War, Cyber Crime and Cyber Espionage.  The last, espionage, is the biggest threat and the least detected: about 70% of espionage goes undetected.

Clarke also mentioned the Stuxnet worm and its implications.  Since then there has also been a news story about Cyber Espionage, which makes his points even more relevant!

There were dozens of great sessions offered at the conference by Informatica, partners and customers.  Some of my favorites were:

  • Master Your Data and Master Your Business using MDM
  • Informatica Data Quality v9
  • What’s New in PowerCenter and PowerExchange v9
  • Lean Integration
  • Velocity Methodology: Best Practices
  • Top 10 Implementation Best Practices for MDM

My unscientific poll indicates that the most popular acronyms were MDM, ILM, DQ, B2B, CEP, SaaS, EDM, SOA, ERP and CRM.

I also learned some new terms at the conference: Hadoop, CEP, Lean Integration and Ultra Messaging.

Several new products and topics were popular at the conference!

We both took advantage of the Hands-On-Labs that were offered for new products.  And lastly, we were also able to meet with several key members of Informatica’s executive team and were invited to participate in an Informatica event scheduled for January.

To wrap up, here’s  a nice graphic showing Informatica’s Platform as it stands now.


Our Focus on Data Management and Data Integration

Many of you know that SSG is an Oracle partner and that we support many clients who rely on Oracle’s Billing and Revenue Management (BRM) product.  We have worked with BRM for over 13 years and have some of the most knowledgeable BRM experts in the industry.

What you may not know is that we are also an Informatica partner and that we have a growing Data Integration and Data Management practice based on years of experience with Enterprise data.  Our focus is different from the typical Business Intelligence (BI) services that are common in the marketplace.  Even at SSG we’ve struggled to define our overall focus in this area, since BI is only a portion of what we do.

The term “BI” focuses on building and managing Data Warehouses or Data Marts.  We can do BI, of course, but there are many other data management needs that we focus on, and which are just as strategic.  All of them have been driven by challenges that our clients have  faced.

Some of those challenges have been driven by these questions.  How well are we:

  • Sharing data between our systems in a timely manner? (Data Integration)
  • Ensuring we have all the information about Customer/Vendors/Products/Accounts in one place? (Master Data Management)
  • Monitoring the quality of our business data so it can be trusted? (Data Quality)
  • Sharing the right data with the right permissions with partners, vendors and customers? (Data Federation)
  • Having an Enterprise Strategy for managing our data more cost effectively? (Data Governance)
  • Managing the growing volume of historical data? (Information Life Cycle Management)

For almost 20 years, our staff has worked with client data in all of these situations.  Even before the acronyms and buzzwords, we were providing organizations with solutions to help them better manage their critical data to grow their businesses and reduce costs.

With that in mind we have now strategically partnered with Informatica to extend our capabilities.  (Ironically, we have leveraged Informatica in the past to implement solutions but never took the next step to partner with them…)

We are excited about this because Informatica helps us better serve our clients who want to get the most from their data and control costs.  Companies spend more than they realize managing their data (or suffering from the lack of data management) and there are great solutions to remedy that.

Informatica is laser-focused on Data Management and all of its complexities.  Regardless of where your data resides, they help streamline the discovery, assessment, cleanup, organization and distribution of your information using the KISS principle.

In my next few posts I’ll review the data management challenges, listed above, along with how SSG can address them more effectively with Informatica solutions.


OOW 2010

Wow! There are 41,000 attendees at Oracle OpenWorld this year. And I think they’re all trying to attend my sessions.
