Thursday, June 20, 2013

Just grumbling a bit about a few things

Yes, I am aware that the following is nothing of importance, but there are a few things in science that have annoyed me recently.

1. People constantly writing things like "X has got 250 papers, several of them in Science or Nature" or "Y has an h-index of 18 despite his young age". Yeah, sure. But quite apart from the fact that throwing around numbers does not, on its own, prove anything whatsoever, just look at the publication lists in question. Did X write those 250 papers on her own? Does Y have that h-index based on single-author papers? Of course not. We are talking about papers with more than 20 authors in many cases, especially the high-impact ones. So how about we divide the number of papers and the citations per paper by the number of co-authors? Does it still look that impressive compared to others in the same area whose output seems less copious because they don't get to undeservedly tack their name onto papers to which they contributed perhaps 4% of the work? Thought so.
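To make the arithmetic concrete, here is a minimal sketch of the kind of adjustment I mean. It assumes equal credit per co-author, which is itself debatable, and the `Paper` class, the function names and the example numbers are all made up for illustration; they do not describe any real publication list.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    citations: int   # total citations of the paper
    n_authors: int   # number of co-authors, including the person in question


def fractional_paper_count(papers):
    """Each paper counts as 1 / (number of authors) instead of 1."""
    return sum(1 / p.n_authors for p in papers)


def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def fractional_h_index(papers):
    """h-index computed on citations divided by the number of co-authors."""
    return h_index(p.citations / p.n_authors for p in papers)


# Hypothetical publication list: three big consortium papers, two small ones.
example = [
    Paper(citations=400, n_authors=25),
    Paper(citations=120, n_authors=20),
    Paper(citations=60, n_authors=30),
    Paper(citations=30, n_authors=1),
    Paper(citations=10, n_authors=2),
]

print(len(example), round(fractional_paper_count(example), 2))
print(h_index(p.citations for p in example), fractional_h_index(example))
```

In this invented list the single-author paper ends up ranking first once citations are shared out, which is rather the point.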

2. It seems to get ever more difficult to submit DNA sequences to GenBank (which we have to do so that they are publicly available and our studies are reproducible). Years ago they accepted nearly everything; today it is standard procedure to end up in a long e-mail conversation with their staff, who complain that you haven't annotated the sequences correctly. See, as phylogeneticists we often use intergenic spacer regions for our analyses, and the sequences we generate may have a few bases of coding region on their left and on their right, which are completely irrelevant for our purposes, but GenBank wants them correctly annotated. A noble goal, and so far so good.

The problem is that I do not have the foggiest idea how I can really be sure, just from my PCR products, where exactly the coding region ends and the intergenic spacer starts, and vice versa at the other end. (And again, I don't really care about those ca. twenty base pairs anyway: as coding regions they are usually so conserved that they don't contain any phylogenetically useful information, and they are also too short a fragment to be useful to anybody who wants to do something with the gene.) What resources do I have to find out? The only guide I have is the annotations of the same regions previously submitted to GenBank by other researchers.

Now, remember the sentence above, the one with "years ago they accepted nearly everything"? So I look up how others have annotated their DNA before, all of which was accepted by GenBank, and they have genes not starting with start codons, or not ending with stop codons, or three different submitters have annotated the end of the same gene in three different places. The only people I can rely on turn out to be as clueless as I am. Yippee.
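About the only sanity check I can think of is to test whether an annotated gene boundary at least coincides with a plausible codon. A rough sketch, under loud assumptions: forward strand only, the standard genetic code, 1-based inclusive coordinates, and a toy sequence I invented for the purpose. Real organellar genes may use other start codons or RNA editing, so a failed check is a hint, not proof, and the function names are mine, not part of any GenBank tool.

```python
# Assumption: standard genetic code, forward strand only.
START_CODONS = {"ATG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def ends_with_stop(sequence, gene_end):
    """True if the three bases ending at 1-based position gene_end form a stop codon."""
    if gene_end < 3 or gene_end > len(sequence):
        return False
    return sequence[gene_end - 3:gene_end].upper() in STOP_CODONS


def begins_with_start(sequence, gene_start):
    """True if the three bases starting at 1-based position gene_start form a start codon."""
    if gene_start < 1 or gene_start + 2 > len(sequence):
        return False
    return sequence[gene_start - 1:gene_start + 2].upper() in START_CODONS


# Invented PCR product: end of an upstream gene, a spacer, start of a downstream gene.
upstream_end = "GGAATTCTAA"      # positions 1-10, ends in the stop codon TAA
spacer = "ACGTACGTTTGACC"        # positions 11-24
downstream_start = "ATGGCTAAA"   # positions 25-33, begins with ATG
pcr_product = upstream_end + spacer + downstream_start

print(ends_with_stop(pcr_product, 10), begins_with_start(pcr_product, 25))
```

Running the same two checks over the boundaries that three different submitters chose for the same gene would at least show which of them are internally consistent.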

3. Over the last few years, an increasing number of buildings, research groups and even faculty positions seem to be named after people, generally famous scientists of the past, although I assume that the odd sponsor may also be among them. At the same time, research groups are more and more often named after the professor leading them instead of after what they actually do. Now I see how that may flatter the people these things are named after, but practical it isn't.

How about "Institute of Microbiology" instead of "Eliza Helms Institute"? How about writing "Research Group for Applied Genomics" on a door instead of merely "Smith lab"? The first of each pair would, you know, provide some useful clue as to what these institutions are actually for. Can't have that, I guess. And seriously, is an official job title like "Edward and Regina Whittaker Assistant Professor of Ecology and Evolutionary Biology" supposed to look good on a business card, as opposed to conceited? What are the people who come up with these thinking?

(The specific names were chosen arbitrarily and do not refer to any real ones that I am aware of. Any similarity to existing institutions is purely accidental.)
