Judy Schmidt, our designer/developer, and I have a new paper, "Looking before Leaping: Creating a Software Registry," in the Journal of Open Research Software. The article is open access and can be found here: http://doi.org/10.5334/jors.bv
When I started work on the ASCL in 2010, I wanted to understand why the original ASCL -- started in 1999 -- and other previous similar resources had not reached critical mass. I looked at these resources, what they offered, and how they were structured, and for some of them, talked with the people who had started them, to see what I could learn from their experiences. In addition, Robert Nemiroff and I have had many conversations about the early days of the ASCL, and I also talked with researchers who used some of these services. The lessons from this look back have informed our work on the ASCL. My background in change management has also been helpful in determining the ASCL's path forward. In the paper, we share not only some of what was learned, but also specific steps we've taken, why we've taken them, how the ASCL has changed over time, and some of our future plans.
The first version of this paper was accepted for the 2nd Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2), which took place in New Orleans in November 2014, and was later revised for publication.
WSSSPE2 blog post
3rd Workshop on Sustainable Software for Science: Practice and Experiences
On Tuesday, October 27, the ASCL held a Birds of a Feather session at ADASS on Improving Software Citation and Credit. The session was opened with a brief presentation by Bruce Berriman, who reported on a Software Publishing Special Interest Group meeting held at the January 2015 AAS meeting and the ongoing work that has come out of that. I followed with a quick overview of other efforts to improve software credit and citation, not just in astronomy but across disciplines, after which Keith Shortridge moderated a lively discussion among the forty people present. The slides Bruce and I presented are now available online.
Previously, we shared resources for the session and the Google doc created during the session to capture some of the main points from the discussion.
The ASCL has organized a Birds of a Feather (BoF) session at ADASS, to be held on Tuesday, October 27, to discuss improving software citation and credit; the following links may be helpful for the discussion.
Astronomy-specific
Astronomy software citation examples and ideas (working [Google] document arising from AAS SPSIG discussion)
Astronomy software indexing workshop
Cross-disciplinary
Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE)
Force11 Software Citation Working Group (Mission statement, member list, timeline, communications plan, etc. on GitHub)
Center for Open Science's Transparency and Openness Promotion (TOP) Guidelines
Google doc created during the BoF session; anyone with the link can comment.
The ASCL is at .Astronomy ("dot astronomy"), which officially starts tomorrow morning. Three days of unconference, discussion, sessions, hacking, ideas, collaborating, fun, cool buttons, and, rumor has it, Belgian chocolate, in the magnificent city of Chicago. Follow along on Twitter, hashtag #dotastro!
I'm delighted to offer the following guest post by Jonathan Petters, Data Management Consultant, Johns Hopkins Data Management Services, and thank him very much for it!
In a recent discussion on preservation and sharing of research data, a few participants expressed their concern (paraphrased here) that “My research community doesn't know how to create a quality data management plan” and “We don't know how to evaluate data management plans.” The astronomy community explicitly requested a little guidance. We in Johns Hopkins University Data Management Services have developed a few resources, described below, of use in both developing and evaluating data management plans within all research disciplines, including astronomy.
Funding agencies have long encouraged and expected that data and code used in the course of funded research be made available to those in the research discipline. NSF is an important funder of astronomical research that has such expectations (and the agency I will focus on here). A few years ago NSF began requiring data management plans as part of research proposals, in part to aid in the dissemination and sharing of research data and code. Following a February 2013 Office of Science and Technology Policy memo, other US funding agencies, including the Department of Energy's Office of Science, are expected to follow suit with similar data management plan requirements.
What does NSF say about writing and evaluating quality data management plans? A good overview of NSF data policies relevant for the AST community can be found in these slides from Daniel Katz (NSF). In general, NSF states that data management will be defined by “the communities of interest.” The NSF AST-specific policy further states “MPS Divisions will rely heavily on the merit review process in this initial phase to determine those types of plan that best serve each community and update the information accordingly.” Neither statement is especially prescriptive, and both can leave researchers unclear as to what they should do.
Creating a plan
While effective research data management certainly has community- and discipline-specific attributes, there ARE aspects of effective data management that are generalizable across research disciplines. It is around these general aspects that we in Johns Hopkins University Data Management Services (JHUDMS) devised our Data Management Planning Questionnaire. We work through this questionnaire with researchers at Johns Hopkins to help them create effective data management plans.
The Questionnaire is designed to comprehensively hit upon the important aspects of effective research data management (e.g. data inputs/outputs in the research, ethical/legal compliance, standards and formats used, intended sharing and preservation, PI restrictions on the use of the data). By answering the applicable questions in the document, removing the questions/front matter and connecting the answers in each section into paragraphs, a researcher would be well on their way to a quality, well thought-out data management plan.
Two relevant side-notes:
1.) For the Questionnaire we consider code and software tools as one 'kind' of research data; thus analysis or simulation codes used in the course of your proposed research should be included as a Data Product. While research code and research data generated or processed by code are clearly NOT the same, there are many similarities in managing the two. In both cases effective management should include consideration of documentation, licensing, formats, associated metadata, and upon what platform(s) the data or code could be shared.
2.) Astronomy, like other disciplines, conducts a substantial amount of research through large collaborations (e.g. surrounding HST or SDSS data). In these cases it is typical for investments in research data infrastructure to be made, and for data policies/practices to be defined for those working with the data. Citing those policies and practices in a data management plan would be appropriate.

Evaluating a plan
To help researchers evaluate data management plans for their quality, my colleagues developed the Reviewer Guide and Worksheet for Data Management Plans (dotx). This Guide and Worksheet is a complement to our Questionnaire; it is a handy checklist by which a grant reviewer can determine whether a researcher thoroughly considered the important aspects of research data management.
For those researchers saying to themselves, “The Questionnaire and Reviewer Guide are nice, but PLEASE just tell me what to do!!!”, I found two tweets from the code sharing session at the latest (223rd) AAS meeting in January to be quite relevant (h/t August Muench and Lucianne Walkowicz):
I wholeheartedly agree with both tweets. It is up to the research community members to police and enforce the data management and sharing practices they would like to see in their community. That’s how peer review works! So the next time you review astronomical research proposals, look over the data management plans carefully and bring up relevant thoughts and concerns to the review panel.
Summing up
I hope the Data Management Planning Questionnaire and Reviewer Guide and Worksheet for Data Management Plans help you and other researchers in the astronomy community more fully develop expectations for data management and sharing practices. It’s likely your institution also has research data management personnel (like the JHUDMS at Hopkins) who are more than happy to help!
Mozilla Science Lab, GitHub and Figshare team up to fix the citation of code in academia
The Mozilla Science Lab, GitHub and Figshare – a repository where academics can upload, share and cite their research materials – is starting to tackle the problem. The trio have developed a system so researchers can easily sync their GitHub releases with a Figshare account. It creates a Digital Object Identifier (DOI) automatically, which can then be referenced and checked by other people.
Discussion of the above article on YCombinator
...it always make me cringe when privately held companies want to define an "open standard" for scientific citations that (surprise!) relies completely on their proprietary infrastructure. I still remember the case of Mendeley, which promised to build an open repository for research documents, and which is now a subsidiary of Elsevier, an organization that does not really embrace "open science", to put it mildly.
Tool developed at CERN makes software citation easier
Researchers working at CERN have developed a tool that allows source code from the popular software development site GitHub to be preserved and cited through the CERN-hosted online repository Zenodo....
Now, people working on software in GitHub will be able to ensure that their code is not only preserved through Zenodo, but is also provided with a unique digital object identifier (DOI), just like an academic paper.
Webcite
WebCite is an on-demand archiving system for webreferences (cited webpages and websites, or other kinds of Internet-accessible digital objects), which can be used by authors, editors, and publishers of scholarly papers and books, to ensure that cited webmaterial will remain available to readers in the future.
DOIs unambiguously and persistently identify published, trustworthy, citable online scholarly literature. Right?
So DOIs unambiguously and persistently identify published, trustworthy, citable online scholarly literature. Right? Wrong.
The examples above are useful because they help elucidate some misconceptions about the DOI itself, the nature of the DOI registration agencies and, in particular issues being raised by new RAs and new DOI allocation models.
The ASCL has 779 codes in it now, some of which date back to the 1990s. With the speed at which both the web and code authors (often grad students or post docs) move, links to some code sites are bound to go bad over time. We use a checker regularly to test links to ensure we're not pointing to dead links; when we do find a broken link (defined as one we haven't been able to reach for at least 2 weeks), we look for a new one and, if that doesn't work, email the code author(s) to ask where the code has moved.
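The two-week rule described above could be expressed roughly as follows. This is only a minimal sketch, not the checker the ASCL actually uses; the function names, the `first_failure` bookkeeping dictionary, and the use of an HTTP HEAD request are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from http.client import HTTPException
from urllib.request import Request, urlopen

# A link counts as broken only after it has been unreachable this long.
BROKEN_AFTER = timedelta(weeks=2)


def is_reachable(url, timeout=10):
    """Return True if the URL answers a HEAD request without an error status."""
    try:
        request = Request(url, method="HEAD")
        with urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError, HTTPException):
        # Covers HTTP errors, DNS/connection failures, and malformed URLs.
        return False


def update_status(first_failure, url, reachable, now):
    """Record one check result; return True if the link now counts as broken.

    first_failure (hypothetical bookkeeping) maps each failing URL to the
    time at which its current run of failures began.
    """
    if reachable:
        first_failure.pop(url, None)  # a success resets the clock
        return False
    started = first_failure.setdefault(url, now)
    return now - started >= BROKEN_AFTER


def check_links(urls, first_failure, now=None, probe=is_reachable):
    """Check every URL once and return those unreachable for two weeks or more."""
    now = now or datetime.now()
    return [u for u in urls if update_status(first_failure, u, probe(u), now)]
```

Run on a schedule with the same `first_failure` dictionary saved between runs, `check_links` would return only the links that have failed every check for at least two weeks, which are the ones worth searching for or emailing authors about.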
We can't always find a good link, and code authors sometimes don't reply to our emails. Currently, eight codes -- 1% of our entries -- have bad links. For half of these, we either cannot find the code author or the author has not replied to numerous emails.
What else can we do?
I assume that some code authors forget their codes. Having moved on perhaps to another institution and other work, they have neither the time nor the incentive to create a new web home for a code they wrote some years ago. That's understandable, but then the code, a unique solution to a problem, an artifact of astrophysics research, a method used in research, is lost.
We'd like to save the codes (Save the Codes! I may have to put that on glow-in-the-dark pencils); here are a few ideas for authors who no longer want to maintain a site for their codes:
I don't know about option 4, but options 1-3 should take 15 minutes or less. Surely a code is worth that little bit of extra time to make it available to others even if you don't want to be bothered with it anymore.
Please save your code; don't let it go bad!
"...some of the greatest artifacts of the [astronomy] community’s creative problem-solving are at risk of being lost."
I believe this; a good thing, since this is what Peter Teuben and I wrote in We didn’t see this coming: Our unexpected roles as software archivists and what we learned at Preserving.exe, one of three participant reports in "Preserving.exe: Toward a National Strategy for Software Preservation."
This report arose from a summit held at the Library of Congress on May 20-21, 2013 by the National Digital Information Infrastructure and Preservation Program. Our piece discusses the summit itself, some of what we learned there, and its impact on the way we think about the ASCL and our work. Among the ideas raised at the summit was that of software as a cultural artifact. We wrote:
The Summit broadened our view and appreciation for software as a cultural artifact and as a method of capturing creativity in problem-solving.
Now we see the loss of computational methods that result in research as a loss of part of astronomy’s cultural heritage. This isn’t happening just for astronomy, of course; the Summit made clear that it is happening for everything. With so much rendered digitally, whether born that way or migrated to a digital medium, without preserving the digital artifacts and the software (and sometimes hardware) to lift these artifacts from their digital storage, we risk losing our art, our music, our games, our prose, our data, and our histories, of daily life and activities, of solutions to scientific problems, of popular pastimes and play experiences, and even knowledge of our computer worries and angst.
More on what we learned at the summit is available in the full report, which includes excellent pieces by participants Henry Lowood, Stanford University (The Lures of Software Preservation) and Matthew Kirschenbaum, University of Maryland (An Executable Past: The Case for a National Software Registry), an introduction by Trevor Owens, Library of Congress, and interviews of Doug White of the National Institute of Standards and Technology's National Software Reference Library and Michael Mansfield from the Smithsonian American Art Museum.
PreservingEXE: Toward a National Strategy for Software Preservation
It's not just astrophysics; other sciences are also grappling with issues surrounding software release, transparency of research, and collaboratively sharing codes.
The challenge of software licensing came up in the AAS 223 Special Session on code sharing; ASCL advisor Bruce Berriman followed up on this issue with a post on Astronomy Computing Today, and I've recently run across A Quick Guide to Software Licensing for the Scientist-Programmer, which also offers some guidance on this important issue.
The code sharing crowd took over the AAS Twitter feed, it seems, during the Special Session on code sharing at AAS 223. Bottom up is the best way to read these, as the most recent tweet is on the top, and please note they aren't strictly in order of occurrence and I likely missed some (there were so many!). I'm happy to add those I missed if someone tells me about them. Thanks to all those who tweeted throughout the session!
ASCL
Lucianne Walkowicz
August Muench
Nuria Lorente
Zach Pace
Nuria Lorente
Chrissy Madison
Ben Thompson
Lucianne Walkowicz
Adrian Price-Whelan
Lucianne Walkowicz
Lucianne Walkowicz
August Muench
August Muench
Lucianne Walkowicz
Lucianne Walkowicz
Christopher Hanley
August Muench
Lia Corrales
Lucianne Walkowicz
Ben Cook
Kelle Cruz
Lucianne Walkowicz
Kelle Cruz
Alexa Villaume
Meredith Rawls
Lucianne Walkowicz
August Muench
August Muench
Lucianne Walkowicz
Lucianne Walkowicz
August Muench
Meredith Rawls
Meredith Rawls
Lucianne Walkowicz
Meredith Rawls
Lucianne Walkowicz
Lucianne Walkowicz
Lucianne Walkowicz
Lia Corrales
Lucianne Walkowicz
Lucianne Walkowicz
Lucianne Walkowicz
Lucianne Walkowicz
Lucianne Walkowicz
Lucianne Walkowicz
Ben Thompson
Lucianne Walkowicz
Lucianne Walkowicz
Lucianne Walkowicz
Alexa Villaume
Laura Watkins
Ian Paul Freeley
Ben Thompson
Meredith Rawls
August Muench
Lucianne Walkowicz
Lucianne Walkowicz
Lucianne Walkowicz
Matthew Turk
Lucianne Walkowicz
Laura Watkins
August Muench
August Muench
Lucianne Walkowicz
Lucianne Walkowicz
Lucianne Walkowicz
Lucianne Walkowicz
Lucianne Walkowicz
August Muench
August Muench
Ben Thompson
Kelle Cruz
Lucianne Walkowicz
Lucianne Walkowicz
Lucianne Walkowicz
Ben Thompson
Lucianne Walkowicz
August Muench
August Muench
David Morrison
Lucianne Walkowicz
August Muench
Ian Paul Freeley
Alex Parker
Dr Chris Tibbs
Kelle Cruz
Alexa Villaume
August Muench
Meredith Rawls
Lucianne Walkowicz
Timothy Pickering
Lucianne Walkowicz
Kelle Cruz
Kelle Cruz
Lucianne Walkowicz
Ben Thompson
August Muench
August Muench
Lucianne Walkowicz
August Muench
Michelle Collins
Laura Watkins
Kelle Cruz
Laura Watkins
Meredith Rawls
Michelle Collins
Ben Thompson
Erik Tollerud
Benjamin Weiner
ADASS
Benjamin Weiner
Astropy @astropy 6 Jan
At the #aas223? Don't miss Tuesday's 14:00-15:30 session on code sharing - including a talk by @eteq about @astropy!
David W. Hogg