One of the aims of the Research Software Encyclopedia is to bring attention to libraries that we take for granted. The ones that easily come to mind are git and Linux, but today’s showcase honors another open source library that you’ve probably used and taken for granted - curl! Some might argue that this library is not qualified to be called research software, but we could probably agree that it plays an important role in the research ecosystem, which is why we want to showcase it today.
Are you already familiar with this software? We encourage you to contribute to the Research Software Encyclopedia and annotate the repository. Otherwise, keep reading!
If you’ve ever written any kind of shell script that interacts with a web resource such as an API, you’ve very likely used curl. Curl is, by its own definition on the GitHub repository:
A command line tool and library for transferring data with URL syntax, supporting DICT, FILE, FTP, FTPS, GOPHER, GOPHERS, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, MQTT, POP3, POP3S, RTMP, RTMPS, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET and TFTP. libcurl offers a myriad of powerful features
Yes, holy cow, that is a lot of protocols supported! You might call it the bread and butter of requests.
If I’m not using Python or some other higher level language to make a request, curl is my go-to tool.
And you can definitely look at the man pages to see a wide variety of examples.
I use it most often for debugging the GitHub REST API. For example, here is how I might open a pull request from a particular $BRANCH for the Research Software Encyclopedia software database:
curl -X POST -H "Accept: application/vnd.github.v3+json" -H "Authorization: token $GITHUB_TOKEN" https://api.github.com/repos/rseng/software/pulls -d '{"head":"'"${BRANCH}"'", "base":"master", "body": "Fixes https://github.com/rseng/software/issues/160", "title": "Update from annotate/taxonomy-vsoch-2020-12-09/22-13-20"}'
In the above, we see that we are adding special headers (-H), and then providing an entire body of data to be parsed into some kind of dictionary or hash with values. But this is just a simple example - curl can do so much more than standard GET and POST!
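One detail worth checking when a JSON body is written inline for -d is shell quoting: anything inside single quotes is passed through literally, so a variable such as ${BRANCH} only expands if the single quotes are closed around it. Here is a minimal sketch of the difference. The branch name is a made-up placeholder, and nothing is sent over the network; the two payloads are only printed so you can compare them.

```shell
#!/bin/sh
# Hypothetical branch name, used only for illustration.
BRANCH="update/example-branch"

# Entirely inside single quotes: the shell does NOT expand ${BRANCH},
# so the API would receive the literal text ${BRANCH}.
literal='{"head":"${BRANCH}"}'

# Close the single quotes, expand the variable inside double quotes,
# then reopen the single quotes for the rest of the JSON.
expanded='{"head":"'"${BRANCH}"'", "base":"master"}'

printf '%s\n' "$literal"
printf '%s\n' "$expanded"
```

The second string is the kind of value you would hand to curl with -d. Printing the payload first like this is a cheap way to confirm what the server will actually see before making a real request.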
To really see its influence, not just for research software but the entire open source ecosystem, just search GitHub for curl. You’ll see over 17K repositories, 23 million code snippets, and usage across many different languages.
What does it take to maintain such a library? Check out this post written by the creator and (still) maintainer Daniel Stenberg, who started the code base as early as 1996 but really established it in 1998. He tells a story of how he thought about licensing, copyright, and balancing development of the library with sustaining his own livelihood. Given that it’s been more than 2 decades and the project is going strong, I highly recommend reading the post to learn some of his insights.
Curl is exactly the kind of library that you would use and not think to cite in a paper. Because of this, it’s likely that citation (or providing a DOI) is not a priority for the authors. Aside from GitHub releases and the repository, I don’t see any strong indication that there is a preference or way to easily cite it. If you know of one, please open an issue!
Curl is used widely enough that there are many online resources! Here are a few:
And another fun way to learn is just looking at examples in the wild!
Or read more about annotation here. You can clone the software repository to do bulk annotation, or annotate any repository in the software database. We want annotation to be fun, straightforward, and easy, so we will be showcasing one repository to annotate per week. If you’d like to request annotation of a particular repository (or addition to the software database) please don’t hesitate to open an issue or even a pull request.
You might find these other resources useful:
For any resource, you are encouraged to give feedback and contribute!