Monday, June 30, 2008

Google Trends for Websites and Ad Planner

Google has recently introduced two new tools in the traffic measurement and advertisement planning / targeting space: Trends for Websites and Ad Planner. Ad Planner is currently invitation-only for agencies and advertisers, but it is going to be free. TechCrunch has covered Ad Planner here.

There is a lot of speculation about how and where Google is getting this data. Traffic analysis bucketed by things like gender, age, household income and education background is a potential gold mine for advertisers. Google claims it is using the following sources of information:
- Aggregated Google search data
- Opt-in anonymous Google Analytics data
- Opt-in external consumer panel data
- Other third-party market research

Google Toolbar is conspicuously missing from the above list.

Anil Batra writes about some of these issues in his Web Analysis (Analytics), Online Advertising and Behavioral Targeting blog. I asked him a couple of questions there that he has replied to. Worth reading if this topic interests you.

Wednesday, June 18, 2008

John McCain

Would you really want to have a President in the White House who has not cared to learn how to use a computer?

http://www.motherjones.com/mojoblog/archives/2008/03/7743_john_mccain_doe.html

Thursday, May 29, 2008

MapReduce vs. RDBMS vs. 'new'DBMS

The MapReduce vs. RDBMS vs. newDBMS debate rages on ...

See my comment on Anand Rajaraman's blog here

Tuesday, May 13, 2008

Mozilla Firefox plugins (Ubuntu)

Last year, I started to use Ubuntu 7.10 on a Dell Inspiron 1420. I recently ran into problems with Firefox plugins and thought it would be useful to share these experiences here.

1) First of all, there are a few different folders for the plugins (depending on how Mozilla and Firefox were installed):

- /usr/lib/mozilla/plugins
- /usr/lib/mozilla-firefox/plugins
- /usr/lib/firefox/plugins

It is important to know which folder your browser uses. One way to tell is by typing about:plugins in the browser address bar and checking which plugins are listed (they should correspond to the .so files in the folder). In my case it was /usr/lib/firefox/plugins
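To compare against what about:plugins reports, it can help to print the .so files in each candidate folder. The following is only a minimal sketch: list_plugins is a hypothetical helper name, and the three paths are the ones listed above, which may differ on other installs.

```shell
#!/bin/sh
# list_plugins is a hypothetical helper, not part of Firefox or Ubuntu.
# For every folder given, print the *.so files it contains (symlinks included).
list_plugins() {
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "== $dir (not present)"
            continue
        fi
        echo "== $dir"
        for f in "$dir"/*.so; do
            # Skip the unexpanded glob when the folder has no .so files;
            # the -L test keeps dangling symlinks in the listing.
            [ -e "$f" ] || [ -L "$f" ] || continue
            echo "   $(basename "$f")"
        done
    done
}

# The three folders mentioned in this post.
list_plugins /usr/lib/mozilla/plugins \
             /usr/lib/mozilla-firefox/plugins \
             /usr/lib/firefox/plugins
```

The folder whose file names match the entries shown by about:plugins is most likely the one the browser is loading from.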

2) Secondly, I was unable to run certain Java applets (especially the ones for web conferencing from WebEx, Yugma etc.). Their system requirements indicated that I should have JRE >=1.5 installed as a plugin for Firefox. I did that by doing the following:

cd /usr/lib/firefox/plugins

sudo ln -s /usr/lib/jvm/java-1.5.0-sun/jre/plugin/i386/ns7/libjavaplugin_oji.so .

and I then restarted the browser.

However, these applets still refused to work (bad error reporting though: they kept saying Java was not "enabled" even though it was enabled in the Edit/Preferences/Content tab of the browser preferences).

I then noticed that there was a link to another Java plugin sitting in that folder: libgcjwebplugin.so. I removed that link, restarted the browser and was able to run the applets I wanted to run!
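A quick way to spot this kind of conflict is to print every symlinked plugin together with its target. This is a sketch under assumptions: show_plugin_links is a hypothetical helper name, and the rm line is one plausible way to remove the link, since the exact command I used is not recorded here.

```shell
#!/bin/sh
# show_plugin_links is a hypothetical helper, not a standard tool.
# Print each symlinked *.so in a folder together with the path it points to.
show_plugin_links() {
    dir=$1
    for f in "$dir"/*.so; do
        # Skip regular files and the unexpanded glob of an empty folder.
        [ -L "$f" ] || continue
        echo "$(basename "$f") -> $(readlink "$f")"
    done
}

show_plugin_links /usr/lib/firefox/plugins

# If libgcjwebplugin.so is listed next to libjavaplugin_oji.so, removing the
# link (not the file it points to) is one way to do what is described above:
# sudo rm /usr/lib/firefox/plugins/libgcjwebplugin.so
```

The removal is left commented out so that running the snippet only inspects the folder.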

Wednesday, April 2, 2008

Advances in Data Warehousing

As data volumes grow and enterprises expect deeper analysis of larger volumes of historical data, there is a clear need in the market for solutions that improve efficiency across several dimensions of Data Warehousing implementations:
1) Speed to query the data
2) Space to store the data
3) Hardware costs and reliability
etc.

Netezza has addressed the speed issue by throwing (mostly) hardware and hardware architectures at the problem.
Then there is another model - that of "cloud computing" - which utilizes parallel computing across commodity hardware. Google Map/Reduce has embraced it, and other open source implementations of Map/Reduce such as Hadoop are following suit.

However, I've been intrigued by the advances in the fundamental data storage and data access technologies for OLAP / data warehousing style applications - especially around "column based databases". Column based databases are not a new concept (Sybase has been offering Sybase IQ for years now) - but some of the upstarts such as Vertica are showing how this fundamental database technology can be packaged and marketed to change the data warehousing and BI technology landscape altogether.

Vertica's database solution seems promising and the following features stand out:
- Column based storage (faster query processing compared to row-based query processing - at least 20x improvement for selects across standard star schemas with fat fact tables).
- Compression of the data (10x compression is easily achievable).
- Shared Nothing architecture (built-in redundancy and parallelism) across commodity hardware nodes - with any kind of storage such as SAN etc.
- Continuous loading architecture. One can write into the warehouse while one is reading from it concurrently (as long as the current epoch is not being read).
- Applications can use SQL (no MDX required).

Wednesday, March 19, 2008

Flex Builder

Adobe recently released Flex 3 with some interesting twists:
- Flex 3 SDK itself has been open sourced and is also free.
- Flex Builder (the Eclipse-based IDE) is not free and comes in 2 flavors: Standard and Professional.

The Standard version costs $249 (reasonable) but the Professional version costs significantly more: $699 ($450 more than the Standard version!!). The Professional version gets you the features from the Standard version, along with the datagrid and charting components for data visualization, plus the testing and profiling tools.

I like that the Flex SDK has been open sourced. Adobe is definitely embracing the developer community and will certainly reap the rewards. However, I don't agree with how Adobe has bundled features (components and controls) along with the profiling and testing tools into a single product. Why should I have to buy the profiling and testing tools if I just want to use the controls? I understand that the richer controls are more complex and therefore should be priced higher, but why couldn't they give me just the data visualization controls for an additional $99 instead of making me pay $450 more for tools I am not going to need? In this day and age, when the industry is talking SaaS and paying by the feature, this is clearly a setback.

BTW, Flex Builder for Linux is on a different schedule and track and is still in alpha but (thankfully) is available as a public alpha at Adobe Labs. No word on how they are going to price this yet.

Tuesday, March 18, 2008

YUI overlays on Flash under Firefox/Linux

I recently ran into a known problem: DHTML cannot be overlaid on top of a Flash object in a web page. This problem was fixed in the Flash player by the introduction of the window mode (wmode), which is described in detail here. The fix involves changes to both the Flash player and the browser. However, this fix is currently not available in Firefox on Linux - the environment I am currently using and the one I need to support.

One of the ways to work around this is by placing the div on top of an iframe in the z-index order and keeping z-index of the iframe positive. This is illustrated in this example here. Use view source to take a look at the source.

The same idea has been implemented nicely in the YUI toolkit's Overlay widget.

This example does not work properly in Firefox on Linux when there is a Flash movie under the calendar widget (when the button is clicked, the calendar does not overlay on top of the movie but goes under the movie instead).

I was able to fix this issue by adding the following calls inside the YAHOO.util.Event.onDOMReady handler of this example (after the call to calendar.render()):
oCalendarMenu.cfg.setProperty("iframe", true);
oCalendarMenu.cfg.setProperty("zIndex", 10);
oCalendarMenu.stackIframe();
oCalendarMenu.showIframe();

These methods (plus more) essentially let one control the iframe that is dynamically created inside the overlay widget at any specified z-index.
Sweet!

Hello World

Ok. So this is my first post... hence the title!

So, why start now?

One of the reasons is that blogging has clearly emerged as one of the most efficient ways to share knowledge, wisdom, experiences and opinions and therefore to give back to the community.

During the last several years, we all have witnessed fundamental changes that the search engines have brought to the way we all look for and consume information on the internet. I've been mostly on the consuming end in this ecosystem, and I must say I have been feeling somewhat guilty lately. Furthermore, like Avinash Kaushik (a must read blogger btw if you are into web analytics), I hope to follow one of Guy Kawasaki's principles: "Eat like a bird, and poop like an elephant".

Some of the guidelines that I've promised myself include:
  • These posts must provide some value to someone. So no "I walked my dog last night" types of posts (they belong on Twitter).
  • These posts must not simply be links to other articles and posts elsewhere. They must have their own value (with possible references to other relevant articles of course).

These posts are mostly going to be about software and programming and about technologies, tech companies and such topics. The focus might shift at times, as I dabble in other areas or have something interesting to say about something else. I'll be "tagging" these posts appropriately in order to categorize them into these areas.

Hope you all will enjoy and benefit from these as I have from several of yours.