Tuesday, 13 January 2009

Who-o-o are you? Who who? Who who?

Image: Identity Card - National Registration, by Danny McL via Flickr

There has been quite a lot of discussion lately about author identification: Raf Aerts’ correspondence piece in Nature (doi:10.1038/453979b), discussions on FriendFeed, ... The issue is that it can be hard to identify who the actual author of a paper is if their name is very common. If your name is Gudmundur Thorisson (“hi, mummi”) you’re in luck. But if you are a Li Y, Zhang L or even an Aerts J it’s a bit harder. Searching PubMed for “Aerts J” returns 299 papers. I surely don’t remember writing that many. I wish… So if a future employer searched PubMed for my name they would not get a list of my papers, but a list of papers by authors who share my name. Also, some of my papers mention jan.aerts@bbsrc.ac.uk as the contact email. If you try to reach me there, you’re out of luck, I’m afraid: that email address doesn’t exist anymore because I changed jobs.

The idea is to create a unique ID for each author, similar to the doi (“digital object identifier”) for a paper. Thomson Reuters have created ResearcherID, but that is run by a commercial company. Because doi’s are handled through the not-for-profit CrossRef, let’s imagine a similar non-commercial author ID and call it a dsi (“digital scientist identifier”). This dsi can then be used by that scientist to identify himself or herself wherever needed.

Here I’ll try to explain how I think this could work.

But first of all: what are the prerequisites for a dsi-based environment? First, the journals would need to request the dsi of authors on submission rather than just their names and email addresses. They would be able to look up names and email addresses through the dsi. Second, we need a service that assigns dsi’s and where scientists can update their details and add information.

The service/website

Let there be a website (for argument’s sake http://www.dsi.org) that assigns new dsi’s to new authors (only one dsi per author). So I could for example be dsi.12345. This service should have additional functionality such as a list of contributions, curriculum vitae, contact details and network. It should also provide a homepage or profile page for each scientist listing at least the name, affiliation and literature list (i.e. what you would get from a PubMed search). So if you went to http://www.dsi.org/dsi.12345 you’d see at least my name, the address of the institute where I work and a list of papers that I co-authored.

Getting a dsi

It’s critical that each researcher gets only one dsi. This is less than straightforward, because I believe many researchers will not be interested enough in the whole identity story to even remember whether they already have a dsi or not. So if I were to go to the dsi website and request an ID, the website would ask for my name first. It’d also ask if I used different names in author lists (e.g. I’m a woman, got married and started using my married name instead of my maiden name). Using that information the service would then search PubMed for papers that are authored by someone with my name (who might be me). It could present that list to me and ask, for each paper, whether I’m actually that author or not. This way we’d build up a minimal list of papers. That minimal list would then be checked against the dsi database to see whether someone with my name has already claimed these papers. Logically that person would be me, and it would appear that I already have a dsi. If no existing dsi has this name and these papers associated with it, the new dsi can be assigned.
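
As a rough sketch of that duplicate check, here is how it could look in Python. Everything in it is an assumption for illustration: the `DsiRegistry` class, the in-memory storage, the paper identifiers and the matching rule (a shared name plus at least one shared paper). The PubMed search and the "is this paper yours?" dialogue are left out; the code starts from the list of papers the applicant has already confirmed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Researcher:
    dsi: str
    names: set[str]                                  # all name variants used in author lists
    papers: set[str] = field(default_factory=set)    # identifiers of claimed papers


class DsiRegistry:
    """Hypothetical in-memory stand-in for the dsi database."""

    def __init__(self) -> None:
        self._researchers: dict[str, Researcher] = {}
        self._next_number = 1

    def find_existing(self, names: set[str], papers: set[str]) -> Optional[Researcher]:
        """Return a researcher who shares a name variant and at least one claimed paper."""
        wanted = {name.casefold() for name in names}
        for researcher in self._researchers.values():
            known = {name.casefold() for name in researcher.names}
            if wanted & known and papers & researcher.papers:
                return researcher
        return None

    def register(self, names: set[str], confirmed_papers: set[str]) -> tuple[Researcher, bool]:
        """Return (researcher, created). An existing dsi is reused, never duplicated."""
        existing = self.find_existing(names, confirmed_papers)
        if existing is not None:
            return existing, False
        dsi = f"dsi.{self._next_number}"
        self._next_number += 1
        researcher = Researcher(dsi, set(names), set(confirmed_papers))
        self._researchers[dsi] = researcher
        return researcher, True


registry = DsiRegistry()
# "paper-A" and "paper-B" are made-up identifiers standing in for confirmed papers.
first, created = registry.register({"Aerts J"}, {"paper-A", "paper-B"})
again, created_again = registry.register({"Aerts J", "Aerts Jan"}, {"paper-B"})
print(first.dsi, created)          # dsi.1 True
print(again.dsi, created_again)    # dsi.1 False
```

Note that under this rule an applicant with no confirmed papers always gets a new dsi, so a real service would need something extra (an email check, say) to catch duplicates among researchers who have not published yet.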


A central service like this would be ideal for collaborators and possible employers to find out about the contributions of a specific researcher to science. Instead of asking for author names and emails (the latter change over time anyway), a journal would ask for the dsi of all authors. If the paper gets accepted, that journal would notify the dsi service to add that paper to each researcher’s publication list. But it goes further than just the papers. It’s a shame that researchers virtually only get marks for their published papers (Publish or Perish) and not for other contributions to scientific research. What about people who submit data to genome annotation databases? What about contributions to discussion in comments to blog posts, FriendFeed, ...? Setting up public databases? Writing APIs for scientific data? Think of a browser button with which you could sign certain contributions anywhere. Signing a contribution would add a link to your list of contributions in the dsi system.

It should obviously be possible to log into the dsi system and edit or remove contributions that you made. That one little API you wrote 5 years ago seemed so important then but you’ve come to see it as insignificant now, for example.
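
To make the signing and removing of contributions a bit more concrete, here is a minimal sketch. The class names, the `kind` labels and the example.org URLs are all hypothetical; a real service would sit behind a login and an API rather than a Python object.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Contribution:
    kind: str   # e.g. "paper", "blog comment", "database", "api"
    url: str    # link back to the contribution itself


@dataclass
class ContributionList:
    dsi: str
    items: list[Contribution] = field(default_factory=list)

    def sign(self, kind: str, url: str) -> Contribution:
        """Add a link to a contribution; signing the same thing twice has no effect."""
        contribution = Contribution(kind, url)
        if contribution not in self.items:
            self.items.append(contribution)
        return contribution

    def remove(self, url: str) -> bool:
        """Drop every contribution with this URL; return True if anything was removed."""
        before = len(self.items)
        self.items = [c for c in self.items if c.url != url]
        return len(self.items) < before


mine = ContributionList("dsi.12345")
mine.sign("paper", "https://example.org/papers/1")
mine.sign("api", "https://example.org/old-api")
mine.sign("api", "https://example.org/old-api")    # duplicate, ignored
mine.remove("https://example.org/old-api")         # that API no longer seems important
print([c.kind for c in mine.items])                # ['paper']
```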

Contact details

People change employer, email, address and even name. So there’s a problem inherent in only listing email address and institute on a paper. Using the unique dsi for the authors would always point to that researcher, no matter how many times he or she changed jobs or contact information. When a researcher’s contact details change, he or she would log onto the dsi service (we’ll come to this later) and update them. Other people would then see those details on the researcher’s dsi page (http://www.dsi.org/dsi.12345) or, if the researcher wants to keep them hidden, send a message through the dsi service itself. The researcher’s email address does not have to be visible to the outside world.


Even though you might not want to make your email address visible to the whole world, you wouldn’t mind if the people you know could see it. Your network. I think that a dsi service should contain capabilities like those of LinkedIn. You should be able to build a trusted network (with people that you know well). This network is another important pillar of your contribution to science.

There would ideally be different personas you could set for your profile. The default would for example be that your profile page only shows your name and papers. But you might also have a full profile that is only visible to researchers who are logged into the service and are no further than two steps away in your network. That extended profile might show your contact details (including email), contributions outside of papers (e.g. comments on blog posts) and curriculum vitae.
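
The "two steps away" rule is easy to sketch as a breadth-first search over the network. The code below is only an illustration under my own assumptions: the network is a plain dictionary mapping each dsi to the set of dsi’s that person is directly connected to, connections are mutual, and the profile fields and example.org address are made up.

```python
from collections import deque


def network_distance(network, start, target, max_steps):
    """Return the number of steps from start to target, or None if it exceeds max_steps."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, steps = queue.popleft()
        if steps == max_steps:
            continue  # do not look beyond the allowed distance
        for contact in network.get(person, ()):
            if contact == target:
                return steps + 1
            if contact not in seen:
                seen.add(contact)
                queue.append((contact, steps + 1))
    return None


def visible_profile(profile, network, viewer, max_steps=2):
    """Return the default persona, or the full one for logged-in viewers close enough."""
    default = {"name": profile["name"], "papers": profile["papers"]}
    if viewer is None:  # not logged in
        return default
    if network_distance(network, profile["dsi"], viewer, max_steps) is None:
        return default
    return dict(profile)


network = {
    "dsi.12345": {"dsi.2"},
    "dsi.2": {"dsi.12345", "dsi.3"},
    "dsi.3": {"dsi.2", "dsi.4"},
    "dsi.4": {"dsi.3"},
}
profile = {
    "dsi": "dsi.12345",
    "name": "Aerts J",
    "papers": ["paper-A"],
    "email": "someone@example.org",
}
print("email" in visible_profile(profile, network, "dsi.3"))   # True: two steps away
print("email" in visible_profile(profile, network, "dsi.4"))   # False: three steps away
print("email" in visible_profile(profile, network, None))      # False: not logged in
```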


The above explains how I would like to see the issue of identification solved. But there is also the problem of authentication. How do I prove that I am dsi.12345? Ideally the dsi service would be an OpenID provider so that it lets me prove that I own http://www.dsi.org/dsi.12345. Hopefully more and more websites (biomedcentral, nature, ...) would allow logging in using OpenID.

Apart from serving as an OpenID provider, the dsi service should obviously also be an OpenID consumer so I don’t have to remember another username and password but can use http://jandot.myopenid.com or http://saaientist.blogspot.com to log in.

I hope this gives some idea of the environment I’d like us to move to. Any comments welcome. Any progress even more…



  1. I've thought about this before and I think it's a great idea, and imperative that it's API-accessible, open, and not owned by a commercial company like ResearcherID is.

    Two things:

    (1) Maybe 'digital researcher identifier'?

    No point excluding the historians, economists, philosophers, linguists, mathematicians etc. who might also benefit.

    (2) Not sure about LinkedIn-style functionality being a core feature. That'd be like conflating DOI and a search engine.

    But as long as you have a decent API and authentication mechanism, there's no reason why sites like LinkedIn couldn't cross-refer to it.

  2. Andrew, you're right with your second point. I guess I was more talking about the whole picture. However, the dsi (or rather dri) assigner should still store some information as it needs to check if a scientist applying for a dsi/dri already has one or not. For that it needs to store at least the name and probably a list of references as well, I think.

  3. Great outline of the problem and the benefits of uniquely identifying researchers. A little while back I blogged about my thoughts on ResearcherID. I still think that uniquely identifying people on the internet is something that everyone is interested in: artists, musicians, scientists, etc. I think wider adoption of OpenID is still the best bet for getting this done.

  4. Love The Who reference...

    I really wanna know!

  5. Exactly. See also my follow-up correspondence "Digital identifiers could keep up with authors' moves." (DOI:10.1038/454575c).