Another reason to use R

by Kieran Healy on February 15, 2008

The wacky world of software licensing visits my inbox:

The newest version of SPSS cannot leave the country according to our current licensing agreement and US Export laws. Additionally, graduate students are not legally allowed to work on laptops (regardless of ownership) that utilize the university site license. As a result, we are imposing a hiatus on SPSS installations on laptops and on any system that will leave the country until this can be resolved. Anyone who is leaving the country with a UA laptop, please contact us to remove the software before you leave to ensure software licensing and export conditions are met.

They’re trying to fix this absurd state of affairs, but the Contracting Office apparently signed off on the original site-license agreement. If you’re using SPSS in the first place you need to reconsider your plan for your life, but still.

{ 34 comments }

1

mq 02.15.08 at 5:35 pm

Do people think it’s worth learning R if you already use STATA?

2

Kieran Healy 02.15.08 at 5:38 pm

Stata is good.

“Do people think it’s worth learning R if you already use STATA?”

Probably in the general sense that it’s worth learning new languages or applications so as not to get too rusty.

3

Barry 02.15.08 at 6:05 pm

R is on my list of things to get up on (after I de-rust my SAS; I’ve been using SPSS (!) for the past few years).

The biggest thing that I see, aside from costs[1], is that R seems to be the primary R&D language of statistics. New methods will roll out in R and later be copied into conventional stat software.

[1] If you’re not purchasing on an academic license, or being furnished with software by your employer, statistical software starts at expensive, and works up from there. Usually with annual fees.

4

David Kane 02.15.08 at 6:26 pm

R is the future. If you are not using it now, you will be someday.

Is the wonderfulness of R a topic that can unite the community of Crooked Timber readers?

5

jdkbrown 02.15.08 at 6:38 pm

Re: 4.

And you don’t even have to know anything about statistics!

6

Grand Moff Texan 02.15.08 at 6:48 pm

This reminds me of a bit of hard encryption software that the US government was trying to keep to itself over a decade ago. You couldn’t take it out of the country.

One of my more political friends bought a T-shirt that had the code spelled out and even bar-coded so that it could be scanned quickly (assuming you had the gear).

Above all the lines of code were the words “THIS T-SHIRT IS A MUNITION.” I believe she committed a felony by wearing it in front of a Brit. Oh, well.

Since I’m not a programmer, it was just another excuse to stare at her tits.

7

Walt 02.15.08 at 6:57 pm

This comment thread has inspired me to learn R.

8

SamChevre 02.15.08 at 7:04 pm

A real question: is R better than APL (which is my fallback all-purpose language)?

9

bi 02.15.08 at 7:06 pm

What’s the precise issue that’s making SPSS unexportable?

10

Kieran Healy 02.15.08 at 7:13 pm

“A real question: is R better than APL (which is my fallback all-purpose language).”

Dude. Here are my views on APL.

11

ben wolfson 02.15.08 at 7:15 pm

And how does R compare to J?

12

Kieran Healy 02.15.08 at 7:15 pm

“What’s the precise issue that’s making SPSS unexportable?”

I don’t know. Maybe it’s a health & safety initiative.

13

Sherman Dorn 02.15.08 at 7:30 pm

I’d gladly pick up R, if someone can point me to a tutorial for someone who knows the statistical-procedure side of SAS fairly well to ease the learning curve…

14

Sherman Dorn 02.15.08 at 7:34 pm

Hey! If someone’s going to mention APL or J, why not UML? (Not that this has anything to do with R, mind you, but …)

15

Bill Gardner 02.15.08 at 7:58 pm

“Do people think it’s worth learning R if you already use STATA?”

STATA is great… unless you need to write more than a very simple script or program. Then it is awkward, like the MACRO language in SAS (although people accomplish great things with these tools). R is a wonderful programming language. Particularly if you deal with n-dimensional arrays for data, for n > 2.

16

nick s 02.15.08 at 8:30 pm

“What’s the precise issue that’s making SPSS unexportable?”

Presumably that these copies of SPSS are used under educational licenses, which are negotiated on a country-by-country basis, with institutions paying a site license fee. (I don’t think it’s that SPSS can be used by oh noes teh terrists!1!)

“This reminds me of a bit of hard encryption software that the US government was trying to keep to itself over a decade ago.”

PGP? Yeah, that was a funny situation: it was legal to export books containing printouts of the source code, so people abroad scanned it in with OCR and built the executable. And there was a weird situation in the early days of the web when browsers with 128-bit encryption support for SSL couldn’t legally be made available outside the US, and 40-bit versions were provided instead. In reality, you could always find an FTP file with the samizdat 128-bit version.
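To put the gap between 40-bit and 128-bit keys in numbers, here is a small Python sketch of the key-space arithmetic. The guess rate is a made-up figure chosen only for illustration, not a measurement of any real hardware, and the function names are hypothetical.

```python
# Key-space arithmetic for the 40-bit and 128-bit key sizes mentioned above.
# GUESSES_PER_SECOND is a hypothetical rate picked for illustration only.
GUESSES_PER_SECOND = 10**9
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def keyspace(bits: int) -> int:
    """Number of distinct keys of the given length."""
    return 2 ** bits


def years_to_exhaust(bits: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Years needed to try every key at the assumed rate."""
    return keyspace(bits) / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(keyspace(40))                   # 1099511627776
    print(keyspace(128) // keyspace(40))  # 2**88 times as many keys
    print(years_to_exhaust(40))           # a small fraction of a year
    print(years_to_exhaust(128))          # astronomically many years
```

Every extra bit doubles the number of keys, so the 128-bit key space is 2**88 times the 40-bit one, whatever rate is assumed.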

Is there any good documentation on migrating from SPSS to R? ‘The SPSS user’s guide to R’, or similar?

17

Michael Froomkin 02.15.08 at 8:32 pm

Unless export control laws relating to encryption (which I’m assuming is the issue here?) have changed in the last three years, and I suppose they may have since I last looked, they don’t ban you from taking your software abroad.

When last I checked, you just couldn’t share the software with foreigners, you couldn’t leave it abroad, and you — not your employer — had to document your “export” and “re-import” on some form you had to keep handy in case Officer Plod ever wanted to see it.

So the reference to law may be a bit of a red herring. More likely the license agreement is the issue (and it may have scary language about export control compliance, which is why the university is worried).

18

nick s 02.15.08 at 8:49 pm

“So the reference to law may be a bit of a red herring.”

It may, since this is the PDF form for UofA faculty. But it may also be the client-server element that uses strong (SSL) encryption, or perhaps even the license verification code.

19

nick s 02.15.08 at 10:41 pm

Ah, here’s a link to a guide to R for SPSS/SAS users. Anyone got anything else to add?

20

vivian 02.16.08 at 2:05 am

Stata rocks, with a remarkable number of flexible built-in functions so that it’s possible to do some very fancy modeling without “programming” in it. (Back in version 4, when I last tried programming in it, it was noticeably harder than Gauss or C, and less rewarding.) But even better is their customer support – they support old versions, and actually use the stuff themselves. And it’s priced reasonably – not as cheap as R but the manuals and phone support come with it, saving a boatload of money on how-to books. Even as a grad student, Stata was a better deal than R, imo.

Now that I’ve been using SAS for work, while it feels old-fashioned, two things impress the hell out of me. (1) Boy is it fast, even on massive files. (2) When it chunks away for twenty or forty minutes on a large file, and then runs out of memory, it says so and asks you to delete some stuff to make room. Then it RESUMES! Boy was I a happy little nerd that week. But there’s no built-in I can find for, say, using log scales on graph axes.

Matlab has remarkably good free, online manuals these days. Haven’t done more than dabble though – anyone here like using it?

21

Hugh 02.16.08 at 3:15 am

I’m curious how R (or J for that matter) compares with L?

22

Tom Ames 02.16.08 at 4:51 am

A great R resource is Maindonald’s “Data Analysis and Graphics Using R” (“DAAG”).

R is quite unlike SAS and SPSS, but that’s one of its strengths. Don’t expect to just pick it up though: it takes a bit of a mindset readjustment.

But I’d rather work in R than in just about any other programming language.

23

CB 02.16.08 at 7:13 am

@vivian,
Matlab has become very common amongst the kind of economists that used to use Gauss.

I take this as a BAD sign. These are the same people who used a clunky package on the grounds of sunk costs.

24

SG 02.16.08 at 7:33 am

Regarding R, I just want to say it’s really frustrating.

I have to use it a lot, and here are my problems so far:

1) it doesn’t handle vectors and matrices the same way, so I have to convert vectors into matrices in any function where the row dimension of the input varies
2) sometimes it calculates matrix products as 1×1 matrices, which it does not treat as scalars
3) when I turn it on, the Japanese language comes out as gibberish so I have to go and change the gui preferences first thing – even though I’m using the Japanese language version
4) It frequently doesn’t give warnings within functions when small errors occur, instead just returning NAs
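Points 1 and 2 are about shapes rather than arithmetic. The sketch below shows the same distinction in NumPy, not R, purely as an illustration: a plain vector, a one-column matrix, and a 1×1 matrix product are three different shapes, and code that mixes them has to convert explicitly. The variable names are made up for the example.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])   # plain vector, shape (3,)
col = v.reshape(-1, 1)          # same values as a 3x1 matrix
row = v.reshape(1, -1)          # same values as a 1x3 matrix

inner_vec = v @ v               # vector times vector gives a scalar
inner_mat = row @ col           # 1x3 times 3x1 gives a 1x1 matrix

print(np.shape(inner_vec))      # ()
print(inner_mat.shape)          # (1, 1)

# Pull the single entry out explicitly before using it as a scalar.
as_scalar = inner_mat.item()
print(as_scalar == inner_vec)   # True
```

In R the corresponding tools are `drop()`, which strips dimensions of extent one, and the `drop = FALSE` argument to `[`, which keeps a one-row or one-column result as a matrix.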

I spent all last weekend helping a fellow member of my lab implement Cox proportional hazards modelling using the Newton-Raphson method, allowing for ties, in R. A large part of my code is peppered with methods to handle these idiosyncrasies in matrix mathematics.

Also a proper object browser would be nice.

I have been having equally frustrating problems with the Japanese-language port of LaTeX. Free software may be a cute idea, but all too often it is really, really frustrating.

25

sv 02.16.08 at 4:20 pm

re: #20: MATLAB is really powerful, with tons of libraries or ‘toolboxes’ for specific applications like image processing, statistics, etc. it’s relatively easy for a non-programmer to use. i’m an engineer and i’ve never done anything beyond simple statistical stuff in it, so i dont know how it compares to R on that front.

i hadn’t heard of R, but reading some basic stuff about it, it looks like it would be better for statistics as long as you’re comfortable with object-oriented programming. MATLAB is more shell/scripting style.

26

Barry 02.16.08 at 5:37 pm

“A real question—is R better than APL (which is my fallback all-purpose language).”
Posted by SamChevre

APL would have been a cool, useful and very powerful thing to use back in the 1980s (I encountered it in 1990). It’s obsolete now, given all-purpose statistical software, MATLAB, R, Maple and Mathematica.

27

Barry 02.16.08 at 5:39 pm

About MATLAB – the University of Michigan is having problems because MATLAB decided to replace departmental licenses (buy rights to run N copies for one code and one line-item charge) with individual copies. This means that the software licensing folks were swamped with trying to process a very large number of individual licenses at the last minute.

People like that are begging to scr*w you over.

28

tina 02.16.08 at 10:47 pm

Plus, you get to talk like a pirate every time you do stats: “Arrrrr!”

29

yabonn 02.16.08 at 11:35 pm

Matlab spares you lots of the hassle. It’s not stat-centric, though: you work on matrices. Heaven for the code line/evaluate line kind of people.

But the editor is bad, and you can’t change it if you want to debug too (there’s an emacs mode, but yuck). Also, it’s more of a prototype thing – it’s not easy to deploy your stuff. I blame, in both cases, the fact that it’s a commercial product.

So, these days I’m thinking of moving to the Python-based alternatives (Numpy/Pylab) or to Scilab, a nice, free clone.

30

vivian 02.17.08 at 4:13 am

hmm. Doesn’t sound like enthusiasm to me. So are we still bashing SPSS or is it economists now?

31

GreatZamfir 02.17.08 at 12:38 pm

Yabonn, I switched from Matlab to Numpy some time ago. Object-oriented programming is really heaven after Matlab, and while its array syntax is less polished than Matlab’s, for the rest Numpy is just as good and fast, plus it has zero-based indexing. And f2py allows you to use Fortran and C routines seamlessly within your Python code.
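For readers who have not seen it, here is a minimal sketch of what NumPy array syntax and zero-based indexing look like. The array contents are arbitrary and the variable names are made up; the Matlab equivalents in the comments are there only for comparison.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # 3x4 array holding 0..11

first_row = a[0, :]       # Matlab: a(1, :)
last_col = a[:, -1]       # Matlab: a(:, end)
doubled = 2 * a           # elementwise, no loop needed
col_sums = a.sum(axis=0)  # sum down each column

print(first_row)          # [0 1 2 3]
print(last_col)           # [ 3  7 11]
print(col_sums)           # [12 15 18 21]
```

Indices start at 0 and negative indices count from the end, which is the main habit to unlearn when coming from Matlab.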

A big downside is that a working Numpy/Scipy system consists of lots of separate packages that all have to be installed and updated correctly. If you’re on Windows, use the “Enthon” package that does this for you.

Another downside: no user-friendly 3-D plotting.

A good extra is Mlabwrapper (not in Enthon, but easy to install). If you import this module, it will start up Matlab in the background and you can seamlessly use any Matlab command in Python. I have heard something similar exists for R.

Don’t switch to Scilab, unless the money/open source is your reason. It’s a clone, without benefits over Matlab, only drawbacks.

32

eszter 02.17.08 at 6:22 pm

Thanks for this post (although I would retitle it “Another reason to use Stata”). I can’t wait to forward it to everybody in my department! (My students are the only ones who know anything other than SPSS. For now they might hate me for it, but hopefully they’ll understand one day.)

Why people insist on spelling Stata with all caps is a mystery to me.

33

drew 02.17.08 at 7:38 pm

I think the compulsion to capitalize all programming languages comes from people used to older languages like LISP, FORTRAN, COBOL, and PROLOG. (Although these days you often see them written in normal capitalization.) My undergrad CS advisor was a former COBOL programmer and always wrote Python as PYTHON.

34

engels 02.18.08 at 1:59 pm

“I’m curious how R (or J for that matter) compares with L?”

But what about Q?
