Re: [xml-dev] What is the general direction you are seeing these days to store and query lots of large complex XML?
Steve hits on an important part of my reasoning. For example, you can take something like Hadoop and run variations of an analysis iteratively. So let's say you're doing a (now classic) friends-of-friends analysis, which is known to have polynomial complexity as you increase the relationship depth. For a given set of users that depth can vary considerably depending on how far away any given person is from a "super node" or other data patterns. Set an upper bound on execution time and start running the analysis, continually increasing the depth until you hit that bound. You're going to pull out way more interesting data; things like there being a 40% chance of knowing somebody who knows somebody who knows Kevin Bacon, and a 70% chance of knowing someone at 4 steps, etc. If you're dealing in statistical analysis then the algorithms are already coded up for many common analyses and it's just a case of configuring them for a given use case. Yes, you are talking about entirely new sets of infrastructure and skills for many organizations, but the gain is the ability to perform many orders of magnitude more analysis tasks, perform them many orders of magnitude faster, and perform them over many orders of magnitude more data.
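To make the "increase the depth until you hit the time bound" idea concrete, here's a minimal single-machine sketch (not a Hadoop job; the graph, function names, and time budget are all illustrative assumptions, not anything from the original post). It runs a breadth-first friends-of-friends reachability pass at increasing depth and reports what fraction of the network is reachable at each depth, stopping when the time budget runs out or everyone has been reached:

```python
# Sketch of iterative-deepening friends-of-friends analysis with a time budget.
# All names and the toy network below are illustrative, not from the post.
import time
from collections import deque

def reachable_within(graph, start, depth):
    """Return the set of nodes within `depth` hops of `start` (BFS)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue  # don't expand past the current depth limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, d + 1))
    return seen

def deepen_until(graph, start, time_budget_s):
    """Increase depth until the time budget is spent; report coverage per depth."""
    results = {}
    deadline = time.monotonic() + time_budget_s
    depth = 1
    while time.monotonic() < deadline:
        # Fraction of the rest of the network reachable at this depth.
        reached = reachable_within(graph, start, depth) - {start}
        results[depth] = len(reached) / max(len(graph) - 1, 1)
        if results[depth] == 1.0:
            break  # everyone reached; deeper passes add nothing
        depth += 1
    return results

# Tiny toy network: an A-B-C-D chain plus a "super node" E linked to everyone.
toy = {
    "A": ["B", "E"], "B": ["A", "C", "E"], "C": ["B", "D", "E"],
    "D": ["C", "E"], "E": ["A", "B", "C", "D"],
}
print(deepen_until(toy, "A", time_budget_s=1.0))
```

Even on this toy graph you can see the super-node effect the post mentions: the presence of E means everyone is reachable from A by depth 2, whereas the bare chain would need depth 3. The per-depth coverage figures are exactly the kind of "X% chance at k steps" statistics described above.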
Having said all that, I do have to qualify it: I don't know the business domain, I don't know your organization and I don't know your organization's technical capabilities. I'm making this recommendation based purely on two things: you have a huge volume of data and you tell us you want to feed something into SAS and SPSS. I'm assuming that this is part of a larger, ongoing set of analyses and that it is worth some considerable investment to build a tool set to get the benefits I describe above....