Today, doing a little task in R, I had cause to look up the following “warning” that appeared after running a script:
Warning message:
In readLines(file) : incomplete final line found
I couldn’t figure out what this warning meant, because the script ran fine, so I did a web search and came across this perfect example of why working with R really sucks: the help files are completely useless, the warning messages are cryptic and meaningless, the inbuilt editor is broken, there is no standardization of externally-developed editors, and the people who provide help online are some of the rudest people you will ever meet in computer science. This simple warning shows it all at once. I’ve complained about the dangers of R’s cryptic and meaningless warning messages before, but this example should really serve to show how they also cry wolf in a really unhelpful way.
The linked page is a message board of some kind (I think a reproduction of the “official” R boards on another site) where a person called Xiaobo.Gu has posted a request for help in decoding the above warning message. The request is polite enough though not voluminous, asking “Can you help with this?” but the first response (from someone with 7328 posts on this board!) consists entirely of the following:
Help with what? You got a warning. And it had information that should
tell you how to edit the file if the warning bothers you.
What is the point of a reply this rude and dismissive? This person actually took the time to reply to a post, in order simply to say “I won’t help you.” On a message board explicitly intended to help resolve problems with R. In addition to being rude it’s arrogant: there is no information about how to edit the file, just a pointer to the final line. We will shortly see the cause of the warning, and it should be clear that no one in their right mind would consider the warning to have provided “information” of any form.
The next reply admonishes the original poster for failing to follow the posting rules (though doesn’t say how they were breached – so is essentially another contentless reply!) and then includes a little sneering aside about the way Windows encodes ASCII text that makes me think the developers of R have an elitist refusal to engage with Windows’s flaws. It then reveals that the warning is harmless and only appears in R version 2.14.0 (unpatched).
Why bother putting such a warning into a program? Whose idea was it to put a harmless warning in a single version of R, and why and how can a warning be a warning and also be harmless? Either something risky is going on, or it’s not. If it’s not, don’t waste my time with red text.
Finally another person comes along to sneeringly answer the question and provide actual information:
A warning message such as this could not be clearer.
It means that the last line of the file does not end with a <newline> sequence ==> the final line of the file is incomplete. In an editor go to the end of that line and press <Enter> or <Return>
And save. Alternatively configure your editor to always terminate the last line of a file with a <newline> sequence.
This is a sparkling gem of passive-aggressive “help.” I can see a simple way in which the warning could be “clearer:” It could say “you did not press enter or return.” Then, it would be clearer. As it is, there is no information about what is missing in the final line: it just says it is “incomplete.” How can anyone claim that a warning such as this could not be clearer?
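For what it’s worth, the condition being complained about is simple enough to sketch. The following is a minimal illustration in Python rather than R, and the function names (`ends_with_newline`, `ensure_final_newline`) and the demo file name are made up for this example; it is not how R itself performs the check, just the same idea of looking at the last byte of the file and adding a newline if one is missing.

```python
import os
import tempfile


def ends_with_newline(path):
    """Return True if the file is empty or its last byte is a newline."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        if handle.tell() == 0:
            return True  # an empty file has no final line to terminate
        handle.seek(-1, os.SEEK_END)
        return handle.read(1) == b"\n"


def ensure_final_newline(path):
    """Append a newline if it is missing; return True if the file changed."""
    if ends_with_newline(path):
        return False
    with open(path, "ab") as handle:
        handle.write(b"\n")
    return True


# Demonstration on a throwaway file (hypothetical name and contents).
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo_script.R")
    with open(demo, "wb") as handle:
        handle.write(b"x <- 1\nprint(x)")  # no newline after the last line
    before = ends_with_newline(demo)       # the "incomplete final line" case
    changed = ensure_final_newline(demo)   # appends the missing newline
    after = ends_with_newline(demo)
```

Note that this sketch always appends a bare `\n`, so on a file with Windows-style line endings it would leave the last line terminated differently from the rest; that seems to be harmless for this purpose but is an assumption on my part.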
But then, just to top it off, this commenter has suggested that the poster configure their editor to “always terminate the last line of a file with a <newline> sequence.” This might seem to be reasonable advice, except that I get this warning in every script I write and I am using the built-in editor! This means that some muppet at C-RAN shipped a version of R with an editor configured to write scripts in such a way that they would trigger a warning. By default. Then, the very first patch they released got rid of the warning. wtf!? Is this what passes for quality control at C-RAN?
This is why wherever possible I use Stata for my work. I need software I can trust to produce the same results every time I run it, that isn’t going to waste my time with meaningless warnings and threats in glaring red, that isn’t configured to do things wrong by default, and that performs all calculations correctly. In order to trust that my stats software will perform all calculations correctly, I really need to know that the designers have some degree of basic quality control. When I see stuff like this – simple programmatic failings in things like the default settings of the script editor – I find it really hard to believe that the correct attention has been paid to, say, the way that the program performs adaptive Gaussian quadrature.
I also expect that the people who design this stuff will be polite when answering questions. I don’t need some passive-aggressive guy on the internet telling me off for failing to understand an extremely vague warning message that is only troubling me because C-RAN don’t have adequate quality control. The replies on that thread should have been polite requests for more information followed by an apology and a promise to fix this problem – or, if these people aren’t directly involved in C-RAN (and we know one of them is … one of R’s designers is on that thread) then a suggestion about how to alert the developers to the problem. Sneering and bullying – no thanks. I don’t get that when I contact Mathworks for help with Matlab, no matter how stupid my request.
This is why when I teach my students about stats packages I tell them a) you can’t trust R and b) it has a nasty community. I teach them its value for automation and experimental stats, and warn them away from using it for anything that has to be published in serious journals.
I think R is just another example of how dangerous it is to run your business on open source software, though I’m sure there are times when it’s safe. And I think it would be fascinating to see a detailed textual analysis comparing the message boards of an open source community (linux, R, latex) with a proprietary product like Stata, because in my experience there’s a world of difference between the two communities. Why that difference exists would not only be a fascinating anthropological study, but would no doubt be of relevance to the scientific study of neckbeard behavior, because I have a strong suspicion that neckbeards are the dominant species in the open source world. Will an anthropologist somewhere take on the task?
June 17, 2012 at 8:18 pm
As a data point, I am a software developer and I instantly knew what that warning meant. I am definitely not saying that your experience was invalid or that all the other stuff you said was not correct, but there is a subset of people (and not just experienced R users) for whom that warning makes sense.
June 17, 2012 at 10:05 pm
Well and good, Fanguad, but R is not a serious programming language and its users are intended to be statisticians and researchers, not programmers. I’ve never received formal training in programming and very few people who use R have – it’s intended as a tool, not a development package. Which is all the more reason not to swan in with nasty, sneering “advice” when someone asks a question about something that is supposed to be “obvious” – the someone in question could be a psychologist, for example, who has no prior experience in programming and is training themselves in R so that they can implement some experimental statistical method or automate a multivariate statistical process for a large dataset. That person is a legitimate user of R, but they can’t be expected to understand why not putting a carriage return at the end of a program should invoke a warning (which, btw, for my own edification, why should I give a toss?)
Also, regardless of its viability as a warning in programming languages in general, it obviously isn’t very important in R since the replies dismissed it as “harmless” and observed that it only existed in one version of R and disappeared after patching. And whether it’s important or not, telling someone to configure their editor when the basic built-in editor for the software package does it, is … rich, to say the least. And doing all that in a rude tone as if to suggest that the problem is with the user rather than the extremely idiosyncratic decisions of the developers? That’s priceless behaviour, that is.
June 17, 2012 at 10:06 pm
That use of “swan in” was about the people in the thread, not your contribution here (which I don’t think is rude or sneering in any way, just in case you were confused by my reply).
June 18, 2012 at 12:21 pm
It’s possible that the (nasty) nature of support for open source software is due to the relationship of the people supporting it to the software itself. For Open Source software, the people supporting it frequently had a hand in building it. That makes them invested in the program, and questions about it are regarded as an assault on them.
By contrast, professional software houses are more likely to have separate development and support teams. The Support team is far less likely to take a question about the software as an insult to them and it’s also much more likely to be trained and experienced in dealing with “dumb” user questions.
The final aspect that applies for both groups is I can assure you that many questions about software really are blindly stupid and that tends to make people doing support pretty dismissive towards their users. There’s a reason that “Is your computer plugged in” is the first question support lines ask – it eliminates a significant portion of the problems [1]. This doesn’t excuse the attitude, but does help highlight that “Not being snarky to your users” is actually a critical skill to train your support staff in.
[1] For the system I work with, most queries are “I submitted X and it isn’t working” leading to a conversation of “We’ve told you (personally) that doesn’t work a couple of times now. I’ll cancel off this one. Please don’t do it again.”
November 9, 2013 at 8:32 am
I agree that the warnings and error messages are usually really bad in R and this one is a good example. Even computer literate people would expect something like “file did not terminate with newline” or something less convoluted than “incomplete final line”, especially if they’re aiming at a non-programming savvy public (which is a substantial amount of their users, as someone mentioned here).
However, what you say about open source software is completely wrong. Many excellent open source languages such as Perl, Python, PHP have excellent community support, not to mention software such as Linux itself (Ubuntu, especially).
R is infamous for the nasty people who reply to user questions, but it’s far from the norm of open source software. Plus, users often times do ask stupid questions and that might have become really annoying to the highly knowledgeable R developers.
There are tons of other reasons why I dislike R (why so many data structures in a high-level language, why are they not standardized, etc.?), but being open source and free is definitely not one of them.
November 9, 2013 at 6:39 pm
Thanks for commenting seamonkey. You might have noticed that I have a thing against open source software, but I’m sure you’re right that there are some that work well.
It’s funny you mention Python because I and a student have been trying to get python to work so we can use some software designed for it, and it’s been an absolute disaster. Just installing the package we want to use in python requires hours of work, chasing down dependencies, uninstalling and reinstalling packages, identifying missing packages, and changing parameters and variables that just don’t work. In the end my student had to get an engineer friend to come in and write some code. We don’t need it for deep underlying programming reasons, we just want to use the statistical software. This is an absolutely ridiculous level of work just to install something. Why? I can install Stata just by sticking the disc in the drive, it doesn’t demand that I spend hours cruising crazy web forums learning an entire new language just so I can install the software. Back when I used SAS the first thing it would do was identify stuff I needed to install and tell me where to get it. Not so with this package for python. Why?
So far in my life the open source software I have encountered has been: R, Python, Linux, and LaTeX. LaTeX is good, but largely because it’s trivial, and even then I avoid it as much as possible and it has basically no productivity enhancements: most particularly, it doesn’t have any kind of track changes functionality, which makes it impossible to use in collaboration with others. I now use mathtypeML and write formulae using the LaTeX language – in Word. Vastly superior, and I can change margins without having to get a PhD in typesetting. Linux is, of course, a disaster.
I don’t dislike R because it is open source and free. I dislike it because it can be incredibly frustrating to use, it has nonsensical error messages, and its help is terrible. I think this is because it is open source. Any business that actually had to make money from this level of service would be on its arse in a matter of months. I do think the only reason R is successful is that it is free, and that it has good automation. But that doesn’t make it good software – just sometimes useful.
February 22, 2014 at 7:18 am
So, to sum up… you got an error message which you didn’t find sufficiently helpful, then people were rude on an unofficial Q&A site, so you concluded that R “sucks” and you tell your students not to use it?
1) I find R’s help files to be quite good actually, at least relative to Stata (though this varies across packages) and 2) Stata is not devoid of incomprehensible errors. For example, I got this one the other day: “Release flags of 103 to 114 are supported. You had -> 115
Requested Input File Is Invalid” What?
February 25, 2014 at 10:15 am
Jason, this comment is strangely defensive, and also wrong. I make clear in the post how I teach my students to use R, and why I don’t trust it. I also make clear that this error isn’t just not “sufficiently helpful,” but arises from flawed program design rather than user error. And there is no such thing as an “official” Q&A site for R, as you surely know.
Actually this post turned out to be prophetic: sometime after I wrote it I discovered that R does, indeed, have a poor implementation of adaptive quadrature, to the extent that any work done using R on a model with more than two levels and binary responses is likely wrong. It takes a long time to find these things out – by which time you have published R’s mistakes in a serious journal. You might say that experienced statisticians should know this but you would be surprised how many don’t – as the Gelman blog post makes clear. Would you recommend working with a software package that has defaults that don’t work for any sophisticated problem, and that doesn’t allow you to change them and doesn’t give any warnings?
I agree with you that Stata’s help files could be better, and its error messages could also be a lot better. Also the GLLAMM package produces different errors to the basic GLM-type packages for the same problem, which is something you only learn from experience. But Stata at least have some quality control. R, not so much.
May 4, 2014 at 7:02 pm
If you can’t figure out R then you certainly don’t have to use it. It was the first language I learned in depth but have since moved on to the greener pastures of python for my data science needs. R is sub par for any sort of programming in terms of both syntax and capabilities but as a data science sandbox it’s unmatched. The beauty of R IS the open source community and the sheer quantity of packages that are available, both tried and true and cutting edge. Interestingly enough, the documentation for all of the disparate packages is relatively standardized and usually very thorough. In my opinion it boils down to this. R is clunky but very well supported and gets the job done in a prototyping setting when you want to explore every option you can without having to build them all from scratch. For ease of use, power and flexibility you have python. There may be a higher initial learning curve on the installation side but after you get the hang of it your productivity will be off the charts. The world is your oyster my frustrated friend.
May 5, 2014 at 12:59 pm
Thanks for commenting Chris. I agree with much of this perspective on R, though I have to say I don’t accept the idea that any part of R is “tried and true” – even basic packages like glm have problems that don’t occur in other software with better quality control. As an example, yesterday I was programming R on windows for the first time in a long time, and I was copying and pasting file directory paths from the properties window of the file into the script window, because file paths were too long for me to bother remembering directly. To my amazement R doesn’t recognize those file paths because of the backslash – you have to use mac OS style path names just to get this to work. How stupid is that?
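For anyone hitting the same thing: as far as I can tell the cause is that R, like many languages, treats the backslash inside a quoted string as an escape character, so a pasted Windows path gets mangled unless the backslashes are doubled or swapped for forward slashes (which R accepts on Windows). Here is a minimal sketch of the same behaviour in Python, which has the same escape convention; the path is a made-up example, and this only illustrates the idea rather than anything R-specific.

```python
from pathlib import PureWindowsPath

# Hypothetical path, as it might be pasted from a Windows properties window.
# The r"" prefix makes it a raw string, so the backslashes stay literal.
copied = r"C:\new_data\table.csv"

# Without the raw prefix, "\n" becomes a newline and "\t" becomes a tab,
# so the string no longer names the file that was intended.
mangled = "C:\new_data\table.csv"

# Converting to forward slashes gives a form that is safe to paste into code.
forward = PureWindowsPath(copied).as_posix()

# A plain string replacement does the same job without pathlib.
forward_simple = copied.replace("\\", "/")
```

Here `forward` and `forward_simple` both come out as `C:/new_data/table.csv`, which is the style of path that works when pasted into a script.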
R remains my go-to package for things like automation though. It’s much more flexible than Stata in those circumstances. I haven’t got any familiarity with python, though I work with people who use it. I am thinking of picking up on that this year…
December 13, 2014 at 1:02 am
Late to the game here, but I agree wholeheartedly with this entire post. R’s documentation and community are extremely unhelpful. I’ll also add that I *do* have past experience as a developer, working with languages from Pascal to C++ to Java at various points, and it is grossly apparent that the designers of R are not proper coders. It is a hideously counter-intuitive language, cobbled together from popsicle sticks and Play-Doh, and any prior programming experience pretty much needs to be jettisoned from one’s mind to work with it.
My advice to people working with it: don’t try too hard to understand it…doing so will corrupt your approach to real languages. Just get your stuff working and move on. With R, asking “why” will just lead you down a path of justified incredulity. The reason things in R don’t make sense is because the people who made it had no sense.
Thankfully, some aspects of R make it incredibly useful, and make slogging through the nonsense worthwhile.
January 9, 2015 at 2:49 pm
You’re right about R sucking, but not quite right about the reasons. I’m an old programmer with a science background, and when I recently started learning R I had flashbacks of mag-tape drives and line printers. It has that early 1980’s feel to it. The syntax is absolutely cryptic, what we call “write once, read never” which goes hand in hand with those warning messages you saw. That’s the way things were done back then. R makes Perl seem elegant, and Perl, if you don’t know, is a warthog of a language.
I’m sorry to hear you had issues with Linux and Python. These have been my go-to OS and language for a long time. Just as another data point, there are open source communities where the members are absolute tools, very much the big fish in the small pond, and there are communities where the members go out of their way to help newbies. I’ve found the Ubuntu community to be pretty friendly, FWIW. Good luck on your attempt at python. Remember that whitespace is important, which is sort of a silly attempt to enforce coding etiquette.
August 28, 2015 at 12:54 pm
Amen brother. Interesting how little self-insight the defensive defenders of R have–I’m looking at you Jason and Chris k. The immediate response to most questions is that the help file is self explanatory (it’s not), and that you’re an idiot for asking the question (I’m not). But unfortunately given that it’s free and ubiquitous, R is a necessary but unpleasant skill to acquire. I’m okay at Stata and a newbie at R, but if Stata’s user communities were as snarky and unpleasant as R’s, I would definitely not be ok at stata. What a pain in the buttocks. To rant, it really is incredible how much of their own time R defenders waste by posting completely useless incorrect defenses of how great R is–and which I then have to wade through–and how foolish you are to be asking an obvious question. Dang, what a shitshow.
October 16, 2016 at 9:08 am
R sucks if you don’t have the gumption to code. Stick to minitab if you can’t handle thinking for yourself.
The people who complain that R (or any other real tool) sucks, are usually just upset that they are expected to handle their own problems. It’s the equivalent of a snotty little kid crying about how math sucks after failing a test.
Maybe, just maybe, the fault isn’t with the most popular statistical programming language, maybe it’s with you.
October 16, 2016 at 9:10 am
Also, lol @ the people who denounce R as a bad programming language, yet then go on to complain that they have never programmed before. Grow up. You’re too old to act like entitled brats.
October 23, 2016 at 6:35 am
I HATE R!!! My mentor forces me to use it. I have been reshaping data for two months, instead of running analyses and I STILL am getting error messages every time. The message boards are utterly unhelpful. No one has a clue. I don’t know why people choose to use R. It’s horrible. I’d rather use SAS or ANYTHING ELSE. Sadly, I have no choice.
February 9, 2017 at 2:00 am
[…] is that the support community sometimes has the social graces of Unix-bearded cheese graters, as this example illustrates. Python takes days to comfortably grasp; R can take […]
July 27, 2017 at 11:23 pm
Hello,
I am using R extensively, and I sure have much to say about some design choices. But the worst problem with R is the community: not very kind, to say the least! I have experienced it, and I saw many other comments about it saying the same.
However, R is pretty good for statistics, at least among open source software.
I also agree that answers are nicer on commercial software forums. I have had a much better experience on Intel compilers forums, for instance. But it’s really not a problem with open source. It’s a problem with the R community. The same can be said about the Lisp community by the way. And R is derived from Scheme, a Lisp dialect. The apple doesn’t fall far from the tree…
March 17, 2021 at 4:57 am
Totally agree. In particular Data Camp’s https://www.rdocumentation.org/ is so poorly designed that it’s a very annoying joke because it comes up as a top choice in any search for a function. One can only see a third of the text on the left side of the screen; the right side is a pointless vertical bar, and there is no way to see the rest without zooming way out.
March 18, 2021 at 3:36 pm
Thanks for commenting Yetta. I guess since your comment comes 4 years after the previous comment on this thread, the generic experience of R help hasn’t improved in the last 4 years … how disappointing. Is your problem with rdocumentation a browser problem? It seems to work fine in mine (Opera on Mac) but I only looked at one page. Perhaps in a few minutes a serious R defender will be along to tell you that the problem is you … anyway, good luck with your R programming!